MP3 Extensions
by Alex Beregszaszi
November 28, 2003
Contents
--------
* Xing VBR header
* LAME header
* PlusV
* CodingTechnologies' SBR
Xing VBR header
---------------
The Xing header is placed at the beginning of a frame, but its exact position
depends on the MPEG version and on whether the stream is mono.
For LSF streams (MPEG-2 and later), it starts at byte 9 if the stream is mono,
else at byte 17.
For MPEG-1, it starts at byte 17 if the stream is mono, else at byte 32.
(These offsets do not count the 4-byte frame header.)
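The offset rule above can be sketched as follows. This is only an
illustration: the function name xing_offset is made up, and the bit positions
of the version and channel mode fields are taken from the usual MPEG audio
frame header layout, which this document does not describe.

```python
def xing_offset(frame_header: bytes) -> int:
    """Return the offset of the Xing tag from the start of the frame.

    frame_header holds the 4-byte MPEG audio frame header.
    """
    if len(frame_header) < 4:
        raise ValueError("need the 4-byte frame header")

    # Assumed layout: version ID in bits 4..3 of the second byte
    # (3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5, 1 = reserved).
    version_id = (frame_header[1] >> 3) & 0x3
    if version_id == 1:
        raise ValueError("reserved MPEG version ID")

    # Assumed layout: channel mode in the top two bits of the fourth byte
    # (3 = mono).
    mono = ((frame_header[3] >> 6) & 0x3) == 3

    if version_id == 3:          # MPEG-1
        skip = 17 if mono else 32
    else:                        # LSF (MPEG-2, MPEG-2.5)
        skip = 9 if mono else 17

    # The offsets in the text do not count the 4-byte frame header.
    return 4 + skip
```

For an MPEG-1 joint stereo frame this gives 4 + 32 = 36, so the tag would be
looked for at byte 36 of the frame.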
All values are stored in most significant bit first (big-endian) order.
4 bytes header tag ("Xing")
4 bytes header flags
If the first bit of flags (0x1) is set,
4 bytes number of frames
If the second bit of flags (0x2) is set,
4 bytes total stream size
If the third bit of flags (0x4) is set,
100*1 bytes 100 1-byte TOC entries
If the fourth bit of flags (0x8) is set,
4 bytes vbr quality: 0 (best) .. 100 (worst)
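A minimal sketch of reading these fields, in the order listed above. The
function name parse_xing and the dictionary keys are hypothetical, and the
flag masks (0x1 frames, 0x2 stream size, 0x4 TOC, 0x8 quality) are the ones
commonly used by Xing-aware decoders; treat them as an assumption and verify
them against real files.

```python
import struct

# Flag masks as commonly defined by Xing-aware decoders (an assumption).
FRAMES_FLAG = 0x1
BYTES_FLAG = 0x2
TOC_FLAG = 0x4
QUALITY_FLAG = 0x8


def parse_xing(buf: bytes) -> dict:
    """Parse a Xing (or Info) header that starts at buf[0]."""
    tag = buf[0:4]
    if tag not in (b"Xing", b"Info"):
        raise ValueError("no Xing or Info tag found")

    # All values are big-endian 32-bit integers.
    flags = struct.unpack_from(">I", buf, 4)[0]
    info = {"tag": tag.decode("ascii"), "flags": flags}
    pos = 8

    if flags & FRAMES_FLAG:
        info["frames"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4
    if flags & BYTES_FLAG:
        info["bytes"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4
    if flags & TOC_FLAG:
        if len(buf) < pos + 100:
            raise ValueError("truncated TOC")
        info["toc"] = list(buf[pos:pos + 100])
        pos += 100
    if flags & QUALITY_FLAG:
        info["quality"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4

    # Number of bytes consumed; 120 when all four flags are set.
    info["size"] = pos
    return info
```

Optional fields are simply absent from the result when their flag is clear,
so a caller can test for them with "frames" in info and so on.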
The TOC entries are seek points rather than frame sizes: entry n holds the
stream position reached after n percent of the playing time, scaled to the
range 0..255. Seeking to n percent thus means jumping to roughly
toc[n] / 256 * total stream size.
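A sketch of TOC-based seeking, assuming the interpretation used by common
Xing-aware players: each entry is a stream position scaled to 0..255, indexed
by the percentage of playing time. The function name toc_seek_offset and the
linear interpolation between neighbouring entries are illustrative choices,
not something this document specifies.

```python
def toc_seek_offset(toc, percent, total_bytes):
    """Estimate the byte offset for a seek to `percent` of the playing time."""
    if len(toc) != 100:
        raise ValueError("the TOC must have exactly 100 entries")

    # Clamp the request to the valid range.
    percent = min(max(float(percent), 0.0), 100.0)
    index = min(int(percent), 99)

    # Interpolate between this entry and the next; the end of the stream
    # is treated as position 256.
    low = toc[index]
    high = toc[index + 1] if index < 99 else 256
    scaled = low + (high - low) * (percent - index)

    return int(scaled / 256.0 * total_bytes)
```

The result is only an estimate; a player would still resynchronize on the
next valid frame header after jumping to it.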
LAME header
-----------
120 bytes Xing header
9 bytes lame version string (for example, "LAME3.90a")
1 byte revision and vbr method:
4 bits revision
4 bits vbr method
1 byte lowpass filter frequency (in units of 100 Hz)
4 bytes peak signal amplitude
2 bytes radio replay gain
2 bytes audiophile replay gain
1 byte flags 1:
4 bits ATH type
1 bit Naoki's psychoacoustic model was used
1 bit safe joint stereo
1 bit no gap, continued in the next track
1 bit no gap, continuation of the previous track
1 byte abr bitrate (0xFF means invalid)
12 bits encoding delay
12 bits encoding padding
1 byte flags 2:
2 bits noise shaping
3 bits stereo mode
1 bit non optimal
2 bits source frequency
1 byte unused
2 bytes preset value
4 bytes music length
2 bytes music crc
2 bytes crc
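The fields after the Xing header add up to 36 bytes, so they can be unpacked
in one go. The sketch below is hypothetical: parse_lame and its keys are made
up, it expects the 36 bytes that follow a full 120-byte Xing header, and it
assumes that the two packed fields it splits are stored most significant
first (revision in the high nibble, encoding delay in the upper 12 bits).
The flag bytes are returned raw.

```python
import struct

# One format character per field in the list above, big-endian, no padding.
LAME_FORMAT = ">9sBB4sHHBB3sBBHIHH"


def parse_lame(buf: bytes) -> dict:
    """Unpack the LAME fields that follow the Xing header."""
    if len(buf) < struct.calcsize(LAME_FORMAT):
        raise ValueError("need at least 36 bytes")

    (version, rev_vbr, lowpass, peak, radio_gain, audiophile_gain,
     flags1, abr_bitrate, delay_padding, flags2, _unused, preset,
     music_length, music_crc, crc) = struct.unpack_from(LAME_FORMAT, buf)

    # 12 bits delay + 12 bits padding share three bytes.
    packed = int.from_bytes(delay_padding, "big")

    return {
        "version": version.decode("ascii", "replace"),
        "revision": rev_vbr >> 4,        # assumed: high nibble
        "vbr_method": rev_vbr & 0x0F,    # assumed: low nibble
        "lowpass": lowpass,
        "peak": peak,                    # left as 4 raw bytes
        "radio_gain": radio_gain,
        "audiophile_gain": audiophile_gain,
        "flags1": flags1,
        "abr_bitrate": abr_bitrate,
        "encoding_delay": packed >> 12,
        "encoding_padding": packed & 0xFFF,
        "flags2": flags2,
        "preset": preset,
        "music_length": music_length,
        "music_crc": music_crc,
        "crc": crc,
    }
```

With the delay and padding known, a decoder can trim the extra samples at the
start and end of the stream, which is what makes gapless playback possible.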
Values of vbr method:
0: ? (-vbr 5)
1: ? (-vbr 0)
2: ? (-vbr 3)
3: ? (-vbr 2 and -vbr 6)
4: ? (-vbr 4)
5: ? (-vbr 1)
>5: unused
Values of source frequency:
0: <= 32000 Hz
1: 44100 Hz
2: 48000 Hz
3: > 48000 Hz
Values of stereo mode:
0: mono
1: stereo
2: dual channel
3: joint stereo
4: joint stereo with forced mid-side stereo
5: auto mid-side stereo
6: unused/reserved
7: default fallback (unknown)
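The "flags 2" byte can be split with shifts and masks. The sketch below
assumes that its fields are listed starting from the least significant bit
(noise shaping in bits 0..1, source frequency in bits 6..7); this ordering
is an assumption and should be checked against real files. The names
decode_flags2 and STEREO_MODES are hypothetical.

```python
# Names for the stereo mode values listed above.
STEREO_MODES = {
    0: "mono",
    1: "stereo",
    2: "dual channel",
    3: "joint stereo",
    4: "joint stereo with forced mid-side stereo",
    5: "auto mid-side stereo",
    6: "unused/reserved",
    7: "default fallback (unknown)",
}


def decode_flags2(value: int) -> dict:
    """Split the flags 2 byte into its four fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("flags 2 is a single byte")
    return {
        "noise_shaping": value & 0x3,                  # assumed: bits 0..1
        "stereo_mode": STEREO_MODES[(value >> 2) & 0x7],  # assumed: bits 2..4
        "non_optimal": bool((value >> 5) & 0x1),       # assumed: bit 5
        "source_frequency": (value >> 6) & 0x3,        # assumed: bits 6..7
    }
```

The source frequency is returned as the raw 2-bit code so that it can be
looked up in the table above.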
Note: the Xing header tag may be the string "Info" in the case of a non-VBR
stream.
PlusV
-----
It's open, well documented, and everyone is allowed to use it with only one
restriction: the resulting codec must be open!
[ coming soon ]
CodingTechnologies' SBR
-----------------------
[ coming soon ]
References
----------
MPlayerXP sources:
http://mplayerxp.sf.net/
LAME 3.93.1 sources:
http://www.mp3dev.org/ or http://lame.sf.net/
PlusV specification:
http://www.plusv.org/
FAAD sources:
http://www.audiocoding.com/