MP3 Extensions
by Alex Beregszaszi
November 28, 2003

Contents
--------
 * Xing VBR header
 * LAME header
 * PlusV
 * CodingTechnologies' SBR


Xing VBR header
---------------

The Xing header is placed at the beginning of a frame, but its actual position
depends on the MPEG version and on the channel mode (mono or not).

For LSF (MPEG-2 and MPEG-2.5) streams it starts at byte 9 if the stream is
mono, else at byte 17. For MPEG-1 it starts at byte 17 if the stream is mono,
else at byte 32. (These offsets do not count the 4-byte frame header.)
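
The offset rule above can be sketched as follows. This is a minimal
illustration, not code taken from any of the referenced sources: the function
name xing_offset is hypothetical, and it assumes the standard MPEG audio frame
header layout (version ID in bits 4-3 of the second byte, channel mode in the
top two bits of the fourth byte).

```python
def xing_offset(header: bytes) -> int:
    """Return the Xing tag offset from the start of the frame, in bytes.

    `header` is the 4-byte MPEG audio frame header. The result includes
    those 4 bytes, so it can be used directly as an index into the frame.
    """
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("not an MPEG audio frame header")

    version_id = (header[1] >> 3) & 0x03  # 3: MPEG-1, 2: MPEG-2, 0: MPEG-2.5
    if version_id == 1:
        raise ValueError("reserved MPEG version ID")

    is_mpeg1 = version_id == 3
    is_mono = ((header[3] >> 6) & 0x03) == 3  # channel mode 3 means mono

    if is_mpeg1:
        offset = 17 if is_mono else 32
    else:  # LSF: MPEG-2 or MPEG-2.5
        offset = 9 if is_mono else 17

    return 4 + offset  # add the 4-byte frame header
```

For an MPEG-1 joint stereo header such as FF FB 90 64 this gives 36, and for
an MPEG-2 mono header such as FF F3 40 C4 it gives 13.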

All values are stored in most significant byte first (big-endian) order.

4 bytes		header tag ("Xing")
4 bytes		header flags

If bit 0 (0x0001) of flags is set,
4 bytes		number of frames

If bit 1 (0x0002) of flags is set,
4 bytes		total stream size

If bit 2 (0x0004) of flags is set,
100*1 bytes	100 1-byte TOC entries

If bit 3 (0x0008) of flags is set,
4 bytes		vbr quality: 0 (best) .. 100 (worst)
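
A reader for this layout might look like the sketch below. The function and
constant names are hypothetical. The flag masks (0x1 frames, 0x2 stream size,
0x4 TOC, 0x8 quality) follow the values used in the LAME sources, and the
optional fields are assumed to be stored in that same order.

```python
import struct

FRAMES_FLAG = 0x0001
BYTES_FLAG = 0x0002
TOC_FLAG = 0x0004
QUALITY_FLAG = 0x0008


def parse_xing(buf: bytes, offset: int = 0) -> dict:
    """Parse a Xing/Info header starting at buf[offset]."""
    tag = buf[offset:offset + 4]
    if tag not in (b"Xing", b"Info"):
        raise ValueError("no Xing/Info tag at this offset")

    flags = struct.unpack_from(">I", buf, offset + 4)[0]
    pos = offset + 8
    info = {"tag": tag.decode("ascii"), "flags": flags}

    # Each optional field is present only if its flag bit is set.
    if flags & FRAMES_FLAG:
        info["frames"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4
    if flags & BYTES_FLAG:
        info["bytes"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4
    if flags & TOC_FLAG:
        toc = buf[pos:pos + 100]
        if len(toc) != 100:
            raise ValueError("truncated TOC")
        info["toc"] = list(toc)
        pos += 100
    if flags & QUALITY_FLAG:
        info["quality"] = struct.unpack_from(">I", buf, pos)[0]
        pos += 4

    info["size"] = pos - offset  # 120 when all four fields are present
    return info
```

The returned "size" tells the caller where any data following the Xing header
begins.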

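The TOC is a seek table: entry n describes the point n percent into the
playing time of the stream, stored as a fraction (0..255, out of 256) of the
total stream size. The byte position for n percent is therefore approximately:
toc[n] / 256 * total_stream_size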
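
A seek helper can be sketched as below. It assumes the seek-table
interpretation of the TOC used by common decoders (entry i holds the stream
position at i percent of the playing time, scaled to 0..255), and it
interpolates linearly between neighbouring entries. The name toc_seek_offset
is hypothetical.

```python
def toc_seek_offset(toc, stream_size: int, percent: float) -> int:
    """Estimate the byte offset for a position given in percent of the duration.

    `toc` is the list of 100 TOC entries, `stream_size` the total stream
    size in bytes taken from the Xing header.
    """
    percent = min(max(percent, 0.0), 100.0)
    i = min(int(percent), 99)

    lower = toc[i]
    # Past the last entry, interpolate towards the end of the stream (256/256).
    upper = toc[i + 1] if i < 99 else 256.0

    fraction = lower + (upper - lower) * (percent - i)
    return int(fraction / 256.0 * stream_size)
```

The result is only an estimate; a player would normally resynchronise on the
next valid frame header after seeking to it.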


LAME header
-----------

120 bytes	Xing header
9 bytes		lame version string (short form, for example "LAME3.90a"; older encoders wrote a longer string such as "LAME3.12 (beta 6)")
1 byte		revision and vbr method:
  4 bits		revision (high nibble)
  4 bits		vbr method (low nibble)
1 byte		lowpass filter frequency (in units of 100 Hz)
4 bytes		peak signal amplitude
2 bytes		radio replay gain
2 bytes		audiophile replay gain
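1 byte 		flags 1 (listed from the least significant bit):
  4 bits		ATH type
  1 bit			Naoki's psychoacoustic model (nspsytune) was used
  1 bit			safe joint
  1 bit			no gap more (continued in the next track)
  1 bit			no gap previous (continuation of the previous track)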
1 byte		abr bitrate (0xFF means invalid)
12 bits		encoding delay
12 bits		encoding padding
1 byte		flags 2 (listed from the least significant bit):
  2 bits		noise shaping
  3 bits		stereo mode
  1 bit			non optimal
  2 bits		source frequency
1 byte		unused
2 bytes		preset value
4 bytes		music length
2 bytes		music crc
2 bytes		crc
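
The 36 bytes that follow the 120-byte Xing header can be unpacked roughly as
in the sketch below. The function name and the dictionary keys are
hypothetical. The bit positions inside the packed bytes are an assumption
based on how the LAME sources assemble them: revision in the high nibble, and
the fields of flags 1 and flags 2 packed starting from the least significant
bit. The peak amplitude and replay gain fields are returned undecoded.

```python
LAME_EXT_SIZE = 36  # size of the fields listed above, without the Xing header


def parse_lame_ext(buf: bytes, offset: int = 0) -> dict:
    """Parse the LAME extension starting at buf[offset]."""
    ext = buf[offset:offset + LAME_EXT_SIZE]
    if len(ext) < LAME_EXT_SIZE:
        raise ValueError("truncated LAME extension")

    flags1 = ext[19]
    flags2 = ext[24]
    # Encoding delay and padding share 3 bytes: 12 bits each, delay first.
    delay_padding = int.from_bytes(ext[21:24], "big")

    return {
        "version": ext[0:9].decode("ascii", "replace"),
        "revision": ext[9] >> 4,
        "vbr_method": ext[9] & 0x0F,
        "lowpass": ext[10],
        "peak_raw": ext[11:15],
        "radio_gain_raw": int.from_bytes(ext[15:17], "big"),
        "audiophile_gain_raw": int.from_bytes(ext[17:19], "big"),
        "ath_type": flags1 & 0x0F,
        "nspsytune": bool(flags1 & 0x10),
        "safe_joint": bool(flags1 & 0x20),
        "nogap_more": bool(flags1 & 0x40),
        "nogap_previous": bool(flags1 & 0x80),
        # 0xFF is documented above as "invalid"
        "abr_bitrate": None if ext[20] == 0xFF else ext[20],
        "encoding_delay": delay_padding >> 12,
        "encoding_padding": delay_padding & 0x0FFF,
        "noise_shaping": flags2 & 0x03,
        "stereo_mode": (flags2 >> 2) & 0x07,
        "non_optimal": bool(flags2 & 0x20),
        "source_frequency": flags2 >> 6,
        "preset": int.from_bytes(ext[26:28], "big"),
        "music_length": int.from_bytes(ext[28:32], "big"),
        "music_crc": int.from_bytes(ext[32:34], "big"),
        "crc": int.from_bytes(ext[34:36], "big"),
    }
```

With a whole first frame in memory, the extension would be read at the Xing
header offset plus 120.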

Values of vbr method:
  0: ? (-vbr 5)
  1: ? (-vbr 0)
  2: ? (-vbr 3)
  3: ? (-vbr 2 and -vbr 6)
  4: ? (-vbr 4)
  5: ? (-vbr 1)
 >5: unused

Values of source frequency:
  0: <= 32000 Hz
  1:    44100 Hz
  2:    48000 Hz
  3:  > 48000 Hz

Values of stereo mode:
  0: mono
  1: stereo
  2: dual channel
  3: joint stereo
  4: joint stereo with forced mid-side stereo
  5: auto mid-side stereo
  6: unused/reserved
  7: default fallback (unknown)
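
The two tables above can be turned into lookups on the flags 2 byte, as in
this small sketch. The names are hypothetical, and the bit positions (stereo
mode in bits 2-4, source frequency in bits 6-7) are an assumption based on
the packing order used in the LAME sources.

```python
SOURCE_FREQUENCY = {
    0: "<= 32000 Hz",
    1: "44100 Hz",
    2: "48000 Hz",
    3: "> 48000 Hz",
}

STEREO_MODE = {
    0: "mono",
    1: "stereo",
    2: "dual channel",
    3: "joint stereo",
    4: "joint stereo with forced mid-side stereo",
    5: "auto mid-side stereo",
    6: "unused/reserved",
    7: "default fallback (unknown)",
}


def describe_flags2(flags2: int) -> tuple:
    """Return (stereo mode, source frequency) descriptions for a flags 2 byte."""
    stereo_mode = (flags2 >> 2) & 0x07
    source_frequency = (flags2 >> 6) & 0x03
    return STEREO_MODE[stereo_mode], SOURCE_FREQUENCY[source_frequency]
```

For example, a flags 2 value of 0x4D decodes to joint stereo with a 44100 Hz
source.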

Note: for a non-VBR stream the header tag may be the string "Info" instead of
"Xing".


PlusV
-----

It's open, well documented, and everyone is allowed to use it with only one
restriction: the resulting codec must be open!

[ coming soon ]


CodingTechnologies' SBR
-----------------------

[ coming soon ]


References
----------

MPlayerXP sources:
http://mplayerxp.sf.net/

LAME 3.93.1 sources:
http://www.mp3dev.org/ or http://lame.sf.net/

PlusV specification:
http://www.plusv.org/

FAAD sources:
http://www.audiocoding.com/