kanejaku.org

Minimp3 Wasm Without Emscripten

8 Feb 2020

I created a web-based MP3 decoder using minimp3. The decoder is a tiny WebAssembly module compiled with clang 9 (without Emscripten). After applying wasm-opt, the decoder binary is just 21 KB.

Here is a demo page and the repository.

This article describes how I created the decoder.

minimp3 has very few dependencies. It just requires memcpy(3), memmove(3), and memset(3). We can easily find naive implementations of these functions on the Web.
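
For illustration, naive versions of these three functions might look like the sketch below. This is not the code from the repository. The naive_ prefix is a hypothetical naming choice so that the snippet can also be compiled on a desktop without clashing with libc; in a freestanding Wasm build the functions would be named memcpy, memmove, and memset.

```c
#include <stddef.h>

/* Byte-by-byte copy. The two regions must not overlap. */
void *naive_memcpy(void *dst, const void *src, size_t n) {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;
  while (n--) *d++ = *s++;
  return dst;
}

/* Like memcpy, but safe for overlapping regions: copy forward when the
   destination is below the source, backward when it is above. */
void *naive_memmove(void *dst, const void *src, size_t n) {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;
  if (d < s) {
    while (n--) *d++ = *s++;
  } else if (d > s) {
    d += n;
    s += n;
    while (n--) *--d = *--s;
  }
  return dst;
}

/* Fill n bytes with the low 8 bits of c. */
void *naive_memset(void *dst, int c, size_t n) {
  unsigned char *d = (unsigned char *)dst;
  while (n--) *d++ = (unsigned char)c;
  return dst;
}
```

These loops are slow compared to an optimized libc, but they are small, which matters more for a 21 KB binary.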

An interesting part is memory allocation. To decode an MP3 file, we need to pass the MP3 data to minimp3 and allocate memory for the decoded PCM data. Both require dynamic memory allocation.

One might think of using malloc(3). This is the approach I took at first, but after reading Surma’s article describing wasm-ld’s memory layout, I came up with an idea: use a fixed memory layout and allow dynamic memory allocation only for the input (MP3 data) and the output (PCM data). This approach removed most of the complexity of memory allocation. I no longer needed a generic malloc(3) implementation, which eliminated the need for Emscripten.

Here is the memory layout.

We put the input MP3 data at the beginning of the heap (that is, at __heap_base). The size of the input MP3 data is reported by mp3_data_size(). The decoded PCM data follows it, and its size is reported by pcm_data_size(). Assuming that the Wasm module is used from a single thread, decoder state such as minimp3’s context (mp3dec_t), the number of samples, etc. is placed in statically allocated memory.
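
The offset arithmetic behind this layout can be sketched as a small pure function. The names layout_t and compute_layout are hypothetical and exist only for this illustration. The sketch also makes one assumption beyond the description above: it rounds the PCM offset up to an even address, because a JavaScript Int16Array view requires a byte offset that is a multiple of 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the heap layout, as byte offsets
   into the Wasm linear memory. */
typedef struct {
  size_t mp3_offset; /* start of the input MP3 bytes (the heap base) */
  size_t pcm_offset; /* start of the decoded 16-bit samples */
  size_t end;        /* first byte past the PCM data */
} layout_t;

layout_t compute_layout(size_t heap_base, size_t mp3_size, size_t pcm_size) {
  layout_t l;
  l.mp3_offset = heap_base;
  /* Assumption: round up to a multiple of 2 so that the PCM region
     can be viewed as an Int16Array from JavaScript. */
  l.pcm_offset = (heap_base + mp3_size + 1) & ~(size_t)1;
  l.end = l.pcm_offset + pcm_size;
  return l;
}
```

For example, with a heap base of 1024 and 5 bytes of MP3 data, the PCM data would start at offset 1030 rather than 1029.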

This memory layout doesn’t allow us to free the input MP3 data after decoding. This is the trade-off of not having a generic memory allocator. In practice this isn’t a problem, because we need to allocate memory for the PCM data before decoding and WebAssembly doesn’t provide a way to shrink memory. If we want to free the input data, we can copy the decoded PCM data somewhere else and discard the Wasm instance.

With this memory layout, decoding MP3 data looks like this:

// decoder.js

const { instance } = await WebAssembly.instantiate(...);
const wasm = instance.exports;
const mp3Data = /* Some Uint8Array */;

// Set MP3 data into Wasm memory.
wasm.set_mp3_data_size(mp3Data.byteLength);
const mp3DataInWasm = new Uint8Array(
  wasm.memory.buffer,
  wasm.mp3_data_offset(),
  wasm.mp3_data_size());
mp3DataInWasm.set(mp3Data);

// Decode.
wasm.decode();
const pcm = new Int16Array(
  wasm.memory.buffer,
  wasm.pcm_data_offset(),
  wasm.pcm_data_size() / 2);

// Use `pcm`.

// decoder.c

#include <stddef.h>
#include <stdint.h>

extern unsigned char __heap_base; // Defined by the linker (wasm-ld).

// Sizes are kept in statics. The accessor functions below keep the
// names that decoder.js calls, so the variables use shorter names.
static size_t mp3_size;
static size_t pcm_size;

const uint8_t *mp3_data_offset(void) {
  return &__heap_base;
}

size_t mp3_data_size(void) {
  return mp3_size;
}

void set_mp3_data_size(size_t size) {
  mp3_size = size;
  // Grow memory if needed to set MP3 data.
}

const uint8_t *pcm_data_offset(void) {
  return mp3_data_offset() + mp3_size;
}

size_t pcm_data_size(void) {
  return pcm_size;
}

void decode(void) {
  pcm_size = /* Compute PCM size using minimp3 */;
  // Grow memory if needed for PCM data.

  const uint8_t *mp3 = mp3_data_offset();
  // The PCM region is written by the decoder, so drop the const here.
  int16_t *pcm = (int16_t *)pcm_data_offset();
  // Decode `mp3` using minimp3 and put samples into `pcm`.
}
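
The "grow memory if needed" steps in decoder.c are left as comments above. A minimal sketch of how they could work is shown below, assuming clang's __builtin_wasm_memory_size and __builtin_wasm_memory_grow builtins. The helper names pages_to_grow and ensure_memory are hypothetical, and this is not necessarily how the repository does it. Wasm memory grows in 64 KiB pages, so the first step is to convert the required end address into a page count.

```c
#include <stddef.h>

#define WASM_PAGE_SIZE 65536u /* Wasm linear memory grows in 64 KiB pages. */

/* Number of additional pages needed so that the memory covers
   addresses [0, required_bytes). Returns 0 if it is already big enough. */
size_t pages_to_grow(size_t current_pages, size_t required_bytes) {
  size_t required_pages = required_bytes / WASM_PAGE_SIZE +
                          (required_bytes % WASM_PAGE_SIZE != 0);
  return required_pages > current_pages ? required_pages - current_pages : 0;
}

#ifdef __wasm__
/* Only compiled when targeting Wasm, where the builtins are available.
   Returns 1 on success, 0 if the engine refuses to grow the memory. */
int ensure_memory(size_t required_bytes) {
  size_t current = __builtin_wasm_memory_size(0);
  size_t delta = pages_to_grow(current, required_bytes);
  if (delta == 0) return 1;
  return __builtin_wasm_memory_grow(0, delta) != (size_t)-1;
}
#endif
```

With helpers like these, set_mp3_data_size() could call ensure_memory() with the address just past the MP3 data, and decode() could call it with the address just past the PCM data. Growing the memory detaches the old ArrayBuffer on the JavaScript side, which is why decoder.js reads wasm.memory.buffer again after each call instead of caching it.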

A tricky part is using wasm-ld’s __heap_base. By specifying the linker option -Wl,--export-all I was able to use __heap_base in my code, but the option also exported functions that weren’t needed, slightly bloating the Wasm binary.

What I needed was -Wl,--export=__heap_base. --export is an lld option that isn’t specific to WebAssembly, which is why it isn’t listed in the wasm-ld documentation.

The remaining parts of the implementation aren’t specific to Wasm. The GitHub repository has the full implementation.

Side notes

We don’t need a custom MP3 decoder unless we need to decode large MP3 files. BaseAudioContext.decodeAudioData() is enough for most MP3 files, but it didn’t work well for my use case. I want to decode large MP3 files such as episodes of the ATP podcast, and decoding such files consumes a lot of memory, which I want to avoid.

WebCodecs may eliminate the need for this kind of decoder. It is one of the APIs I’d like browsers to implement.