MPEG-4 Part 3

{{Short description|Third part of the ISO/IEC MPEG-4 standard}}

MPEG-4 Part 3 or MPEG-4 Audio (formally ISO/IEC 14496-3) is the third part of the ISO/IEC MPEG-4 international standard developed by the Moving Picture Experts Group.{{cite web|url=https://www.iso.org/standard/53943.html|title=ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio|author=ISO|year=2009|publisher=ISO|access-date=2009-10-06|author-link=International Organization for Standardization}} It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999.{{cite web|url=https://www.iso.org/standard/25035.html|title=ISO/IEC 14496-3:1999 - Information technology -- Coding of audio-visual objects -- Part 3: Audio|author=ISO|year=1999|publisher=ISO|access-date=2009-10-06|author-link=International Organization for Standardization}}

MPEG-4 Part 3 comprises a variety of audio coding technologies: lossy speech coding (HVXC, CELP), general audio coding (AAC, TwinVQ, BSAC), lossless audio compression (MPEG-4 SLS, Audio Lossless Coding, MPEG-4 DST), a Text-To-Speech Interface (TTSI), Structured Audio (using SAOL, SASL, MIDI), and many additional audio synthesis and coding techniques.{{cite web | url=http://www.thefreelibrary.com/MPEG-4+Audio+Licensing+Committee+Selects+Via+Licensing+Corporation+as+...-a094779778 | title=MPEG-4 Audio Licensing Committee Selects Via Licensing Corporation as Administrator; MPEG-4 Audio Licensing Committee Finalizing Terms for Audio Profile Licensing. | author=Business Wire | publisher=The Free Library | date=2002-12-02 | access-date=2009-10-06}}{{cite web|url=http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/profiles.html |title=MPEG-4 Natural Audio Coding – Audio profiles and levels |author1=Karlheinz Brandenburg |author2=Oliver Kunz |author3=Akihiko Sugiyama |publisher=chiariglione.org |year=1999 |access-date=2009-10-06 |url-status=dead |archive-url=https://web.archive.org/web/20100717130019/http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/profiles.html |archive-date=2010-07-17 }}{{cite web|url=http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/scalability.html |title=MPEG-4 Natural Audio Coding – scalability in MPEG-4 natural audio |author1=Karlheinz Brandenburg |author2=Oliver Kunz |author3=Akihiko Sugiyama |publisher=chiariglione.org |access-date=2009-10-06 |url-status=dead |archive-url=https://web.archive.org/web/20100228165503/http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/scalability.html |archive-date=2010-02-28 }}{{cite web|url=https://mpeg.chiariglione.org/faq/mp4-aud/mp4-aud.htm|title=MPEG Audio FAQ – MPEG-4|author=D. Thom, H. 
Purnhagen, and the MPEG Audio Subgroup|date=October 1998|publisher=chiariglione.org|access-date=2009-10-06}}{{citation | url=ftp://ftp.tnt.uni-hannover.de/pub/MPEG/audio/mpeg4/documents/w2803/w2803_n.pdf | title=ISO/IEC 14496-3:/Amd.1 – Final Committee Draft – MPEG-4 Audio Version 2 | author=ISO/IEC JTC 1/SC 29/WG 11 | date=July 1999 | access-date=2009-10-07 | archive-url=http://webarchive.loc.gov/all/20120801161551/ftp://ftp.tnt.uni-hannover.de/pub/MPEG/audio/mpeg4/documents/w2803/w2803_n.pdf | archive-date=2012-08-01 | url-status=dead }}{{citation | url=ftp://ftp.tnt.uni-hannover.de/pub/papers/1999/AES17-HP.pdf | archive-url=https://web.archive.org/web/20170706032218/ftp://ftp.tnt.uni-hannover.de/pub/papers/1999/AES17-HP.pdf | url-status=dead | archive-date=2017-07-06 | title=An Overview of MPEG-4 Audio Version 2 | author=Heiko Purnhagen | publisher=Heiko Purnhagen | date=1999-06-07 | access-date=2009-10-07 }}{{cite web | url=http://140.130.175.70/html/mpeg4/sound.media.mit.edu/mpeg4/audio/general/index.html#aes106 | title=The MPEG-4 Audio Standard: Overview and Applications | author=Heiko Purnhagen | publisher=Heiko Purnhagen | date=2001-06-01 | access-date=2009-10-07}} {{Dead link|date=September 2010|bot=H3llBot}}{{cite web | url=http://140.130.175.70/html/mpeg4/sound.media.mit.edu/mpeg4/audio/index.html#mpeg4 | title=The MPEG Audio Web Page – MPEG-4 Audio (ISO/IEC 14496-3) | author=Heiko Purnhagen | date=2001-11-07 | access-date=2009-10-07}} {{Dead link|date=September 2010|bot=H3llBot}}{{cite web|url=https://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm|title=Overview of the MPEG-4 Standard|author=Rob Koenen, ISO/IEC JTC1/SC29/WG11|date=March 2002|publisher=chiariglione.org|access-date=2009-10-06}}

MPEG-4 Audio does not target a single application such as real-time telephony or high-quality audio compression. It applies to every application which requires the use of advanced sound compression, synthesis, manipulation, or playback.

MPEG-4 Audio integrates numerous types of audio coding in a single standard: natural and synthetic sound, low-bitrate and high-quality delivery, speech and music, complex and simple soundtracks, and traditional and interactive content.

Versions

{| class="wikitable sortable"
|+MPEG-4 Audio versions and editions{{cite web|url=http://mpeg.chiariglione.org/standards.htm |title=MPEG standards – Full list of standards developed or under development |author=MPEG |publisher=chiariglione.org |access-date=2009-10-31 |url-status=dead |archive-url=https://web.archive.org/web/20100420192552/http://mpeg.chiariglione.org/standards.htm |archive-date=April 20, 2010 }}
! Edition
! Release date
! Latest amendment
! Standard
! Description
|-
| First edition
| 1999
| 2001
| ISO/IEC 14496-3:1999
| also known as "MPEG-4 Audio Version 1"
|-
|
|
| 2000
| ISO/IEC 14496-3:1999/Amd 1:2000{{cite web | url=http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=31568 | title=ISO/IEC 14496-3:1999/Amd 1:2000 - Audio extensions | author=ISO | publisher=ISO | year=2000 | access-date=2009-10-07| author-link=International Organization for Standardization }}
| also known as "MPEG-4 Audio Version 2", an Amendment to first edition
|-
| Second edition
| 2001
| 2005
| ISO/IEC 14496-3:2001{{cite web | url=http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=36083 | title=ISO/IEC 14496-3:2001 - Information technology -- Coding of audio-visual objects -- Part 3: Audio | author=ISO | publisher=ISO | year=2001 | access-date=2009-10-14| author-link=International Organization for Standardization }}
|
|-
| Third edition
| 2005
| 2008
| ISO/IEC 14496-3:2005{{cite web | url=http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=42739 | title=ISO/IEC 14496-3:2005 - Information technology -- Coding of audio-visual objects -- Part 3: Audio | author=ISO | publisher=ISO | year=2005 | access-date=2009-10-14| author-link=International Organization for Standardization }}
|
|-
| Fourth edition
| 2009
| 2015 and under development
| ISO/IEC 14496-3:2009{{citation | url=http://webstore.iec.ch/preview/info_isoiec14496-3%7Bed4.0%7Den.pdf | title=ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio | author=ISO/IEC | publisher=IEC | date=2009-09-01 | access-date=2009-10-07}}
|
|-
| Fifth edition
| 2019
|
| ISO/IEC 14496-3:2019{{citation | url=https://www.iso.org/standard/76383.html | title=ISO/IEC 14496-3:2019 - Information technology -- Coding of audio-visual objects -- Part 3: Audio | author=ISO/IEC | publisher=IEC | date=2019-12-01 | access-date=2020-06-02}}
| Current version
|}

Subparts

MPEG-4 Part 3 contains the following subparts:

  • Subpart 1: Main (list of Audio Object Types, Profiles, Levels, interface to ISO/IEC 14496-1, MPEG-4 Audio transport stream, etc.)
  • Subpart 2: Speech coding – HVXC (Harmonic Vector eXcitation Coding)
  • Subpart 3: Speech coding – CELP (Code Excited Linear Prediction)
  • Subpart 4: General Audio Coding (GA) (Time/Frequency Coding) – AAC, TwinVQ, BSAC
  • Subpart 5: Structured Audio (SA)
  • Subpart 6: Text to Speech Interface (TTSI)
  • Subpart 7: Parametric Audio Coding – HILN (Harmonic and Individual Line plus Noise)
  • Subpart 8: Technical description of parametric coding for high quality audio (SSC, Parametric Stereo)
  • Subpart 9: MPEG-1/MPEG-2 Audio in MPEG-4
  • Subpart 10: Technical description of lossless coding of oversampled audio (MPEG-4 DST – Direct Stream Transfer)
  • Subpart 11: Audio Lossless Coding (ALS)
  • Subpart 12: Scalable Lossless Coding (SLS)

MPEG-4 Audio Object Types

MPEG-4 Audio includes a system for handling a diverse group of audio formats in a uniform manner. Each format is assigned a unique Audio Object Type to represent it.{{cite web | url=http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio | title=MPEG-4 Audio | author=MultimediaWiki | publisher=MultimediaWiki | year=2009 | access-date=2009-10-09}}{{citation|url=http://www.iis.fraunhofer.de/fhg/Images/AES5270_MPEG-4_Audio_Components_on_various_Platforms_tcm278-67534.PDF |title=Implementation of MPEG-4 Audio Components on various Platforms |author1=Bernhard Grill |author2=Stefan Geyersberger |author3=Johannes Hilpert |author4=Bodo Teichmann |publisher=Fraunhofer Gesellschaft |date=July 2004 |access-date=2009-10-09 |url-status=dead |archive-url=https://web.archive.org/web/20070610222853/http://www.iis.fraunhofer.de/fhg/Images/AES5270_MPEG-4_Audio_Components_on_various_Platforms_tcm278-67534.PDF |archive-date=2007-06-10 }} The Object Type distinguishes between different coding methods and directly determines the MPEG-4 tool subset required to decode a specific object. The MPEG-4 profiles are based on the object types, and each profile supports a different list of object types.

{| class="wikitable sortable"
|+MPEG-4 Audio Object Types{{cite web|url=http://140.130.175.70/html/mpeg4/sound.media.mit.edu/mpeg4/audio/documents/index.html |title=MPEG-4 Audio (Final Committee Draft 14496-3) |author=ISO/IEC JTC1/SC29/WG11 N2203 |publisher=Heiko Purnhagen |date=March 1998 |access-date=2009-10-07 }}{{dead link|date=June 2016|bot=medic}}{{citation|url=http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n6475t.doc |title=Text of ISO/IEC 14496-3:2001/FPDAM 4, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions |format=DOC |author=ISO/IEC JTC1/SC29/WG11/N7016 |date=2005-01-11 |access-date=2009-10-09 |url-status=dead |archive-url=https://web.archive.org/web/20140512215821/http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n6475t.doc |archive-date=2014-05-12 }}
! Object Type ID
! Audio Object Type
! First public release date
! Description
|-
| 1
| AAC Main
| 1999
| contains AAC LC
|-
| 2
| AAC LC (Low Complexity)
| 1999
| Used in the "AAC Profile". MPEG-4 AAC LC Audio Object Type is based on the MPEG-2 Part 7 Low Complexity profile (LC) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4).{{cite web|url=http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/gacoding.html |title=MPEG-4 Natural Audio Coding – General Audio Coding (AAC based) |author1=Karlheinz Brandenburg |author2=Oliver Kunz |author3=Akihiko Sugiyama |publisher=chiariglione.org |year=1999 |access-date=2009-10-06 |url-status=dead |archive-url=https://web.archive.org/web/20100219233137/http://mpeg.chiariglione.org/tutorials/papers/icj-mpeg4-si/09-natural_audio_paper/gacoding.html |archive-date=2010-02-19 }}
|-
| 3
| AAC SSR (Scalable Sample Rate)
| 1999
| MPEG-4 AAC SSR Audio Object Type is based on the MPEG-2 Part 7 Scalable Sampling Rate profile (SSR) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4).
|-
| 4
| AAC LTP (Long Term Prediction)
| 1999
| contains AAC LC
|-
| 5
| SBR (Spectral Band Replication)
| 2003{{cite web | title=Bandwidth extension, ISO/IEC 14496-3:2001/Amd 1:2003 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38148 | author=ISO | publisher=ISO | year=2003 | access-date=2009-10-13}}
| used with AAC LC in the "High Efficiency AAC Profile" (HE-AAC v1)
|-
| 6
| AAC Scalable
| 1999
|
|-
| 7
| TwinVQ
| 1999
| audio coding at very low bitrates
|-
| 8
| CELP (Code Excited Linear Prediction)
| 1999
| speech coding
|-
| 9
| HVXC (Harmonic Vector eXcitation Coding)
| 1999
| speech coding
|-
| 10
| (Reserved)
|
|
|-
| 11
| (Reserved)
|
|
|-
| 12
| TTSI (Text-To-Speech Interface)
| 1999
|
|-
| 13
| Main synthesis
| 1999
| contains 'wavetable' sample-based synthesis and Algorithmic Synthesis and Audio Effects
|-
| 14
| 'wavetable' sample-based synthesis
| 1999
| based on SoundFont and DownLoadable Sounds,{{cite journal | last1 = Scheirer | first1 = Eric D. | last2 = Ray | first2 = Lee | title = Algorithmic and Wavetable Synthesis in the MPEG-4 Multimedia Standard | periodical= Audio Engineering Society Convention 105, 1998 | date = 1998 | quote = 2.2 Wavetable synthesis with SASBF: The SASBF wavetable-bank format had a somewhat complex history of development. The original specification was contributed by E-Mu Systems and was based on their "SoundFont" format [15]. After integration of this component in the MPEG-4 reference software was complete, the MIDI Manufacturers Association (MMA) approached MPEG requesting that MPEG-4 SASBF be compatible with their "Downloaded Sounds" format [13]. E-Mu agreed that this compatibility was desirable, and so a new format was negotiated and designed collaboratively by all parties. | citeseerx = 10.1.1.35.2773 }} contains General MIDI
|-
| 15
| General MIDI
| 1999
|
|-
| 16
| Algorithmic Synthesis and Audio Effects
| 1999
|
|-
| 17
| ER AAC LC
| 2000
| Error Resilient
|-
| 18
| (Reserved)
|
|
|-
| 19
| ER AAC LTP
| 2000
| Error Resilient
|-
| 20
| ER AAC Scalable
| 2000
| Error Resilient
|-
| 21
| ER TwinVQ
| 2000
| Error Resilient
|-
| 22
| ER BSAC (Bit-Sliced Arithmetic Coding)
| 2000
| It is also known as "Fine Granule Audio" or fine grain scalability tool. It is used in combination with the AAC coding tools and replaces the noiseless coding and the bitstream formatting of MPEG-4 Version 1 GA coder. Error Resilient
|-
| 23
| ER AAC LD (Low Delay)
| 2000
| Error Resilient, used with CELP, ER CELP, HVXC, ER HVXC and TTSI in the "Low Delay Profile", (commonly used for real-time conversation applications)
|-
| 24
| ER CELP
| 2000
| Error Resilient
|-
| 25
| ER HVXC
| 2000
| Error Resilient
|-
| 26
| ER HILN (Harmonic and Individual Lines plus Noise)
| 2000
| Error Resilient
|-
| 27
| ER Parametric
| 2000
| Error Resilient
|-
| 28
| SSC (SinuSoidal Coding)
| 2004{{cite web | title=Parametric coding for high-quality audio, ISO/IEC 14496-3:2001/Amd 2:2004 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=39382 | author=ISO | publisher=ISO | year=2004 | access-date=2009-10-13}}{{cite web|title=Text of ISO/IEC 14496-3:2001/FPDAM2 (Parametric Audio) - N5713 |url=http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n5462t.doc |format=DOC |author=ISO/IEC JTC1/SC29/WG11 |date=2003-07-25 |access-date=2009-10-13 |url-status=dead |archive-url=https://web.archive.org/web/20140512225123/http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n5462t.doc |archive-date=2014-05-12 }}
|
|-
| 29
| PS (Parametric Stereo)
| 2004{{cite web | title=3GPP TS 26.401 V6.0.0 (2004-09), General Audio Codec audio processing functions; Enhanced aacPlus General Audio CodecGeneral Description (Release 6) | url=http://www.3gpp.org/ftp/Specs/archive/26_series/26.401/26401-600.zip | format=DOC | author=3GPP | publisher=3GPP | date=2004-09-30 | access-date=2009-10-13}} and 2006{{cite web | title=ETSI TS 126 401 V6.1.0 (2004-12) - Universal Mobile Telecommunications System (UMTS)General audio codec audio processing functions; Enhanced aacPlus general audio codecGeneral description (3GPP TS 26.401 version 6.1.0 Release 6) | url=http://webapp.etsi.org/workprogram/Report_WorkItem.asp?wki_id=21806 | author=3GPP | publisher=3GPP | date=2005-01-04 | access-date=2009-10-13}}{{cite web | title=Audio Lossless Coding (ALS), new audio profiles and BSAC extensions, ISO/IEC 14496-3:2005/Amd 2:2006 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43026 | author=ISO | publisher=ISO | year=2006 | access-date=2009-10-13}}
| used with AAC LC and SBR in the "HE-AAC v2 Profile". PS coding tool was defined in 2004 and Object Type defined in 2006.
|-
| 30
| MPEG Surround
| 2007{{cite web | title=BSAC extensions and transport of MPEG Surround, ISO/IEC 14496-3:2005/Amd 5:2007 | url=http://www.iso.org/iso/catalogue_detail.htm?csnumber=44009 | author=ISO | publisher=ISO | year=2007 | access-date=2009-10-13}}
| also known as MPEG Spatial Audio Coding (SAC), it is a type of spatial audio coding{{cite web | url=http://mpeg.chiariglione.org/technologies/mpeg-d/mpd-mps/index.htm | title=Tutorial on MPEG Surround Audio Coding | author=ISO/IEC JTC1/SC29/WG11 | date=July 2005 | access-date=2010-02-09 | archive-url=https://web.archive.org/web/20100430175217/http://mpeg.chiariglione.org/technologies/mpeg-d/mpd-mps/index.htm | archive-date=2010-04-30 | url-status=dead }}{{cite web | url=http://www.chiariglione.org/mpeg/technologies/mpd-mps/index.htm | title=Tutorial on MPEG Surround Audio Coding | author=ISO/IEC JTC1/SC29/WG11 | date=July 2005 | access-date=2010-02-09 |archive-url = https://web.archive.org/web/20080324164259/http://www.chiariglione.org/mpeg/technologies/mpd-mps/index.htm |archive-date = 2008-03-24}} (MPEG Surround was also defined in ISO/IEC 23003-1 in 2007{{cite web | url=http://www.iso.org/iso/catalogue_detail.htm?csnumber=44159 | title=ISO/IEC 23003-1:2007 - Information technology -- MPEG audio technologies -- Part 1: MPEG Surround | author=ISO | publisher=ISO | date=2007-01-29 | access-date=2009-10-24}})
|-
| 31
| (ESCAPE)
|
|
|-
| 32
| MPEG-1/2 Layer-1
| 2005{{cite web | title=MPEG-1/2 audio in MPEG-4, ISO/IEC 14496-3:2001/Amd 3:2005 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=39584 | author=ISO | publisher=ISO | year=2005 | access-date=2009-10-13}}
|
|-
| 33
| MPEG-1/2 Layer-2
| 2005
|
|-
| 34
| MPEG-1/2 Layer-3
| 2005
| also known as "MP3onMP4"
|-
| 35
| DST (Direct Stream Transfer)
| 2005{{cite web | title=Lossless coding of oversampled audio, ISO/IEC 14496-3:2001/Amd 6:2005 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=41811 | author=ISO | publisher=ISO | year=2005 | access-date=2009-10-13}}
| lossless audio coding, used on Super Audio CD
|-
| 36
| ALS (Audio Lossless Coding)
| 2006
| lossless audio coding
|-
| 37
| SLS (Scalable Lossless Coding)
| 2006{{cite web | title=Scalable Lossless Coding (SLS), ISO/IEC 14496-3:2005/Amd 3:2006 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43362 | author=ISO | publisher=ISO | year=2006 | access-date=2009-10-13}}
| two-layer audio coding with lossless layer and lossy General Audio core/layer (e.g. AAC)
|-
| 38
| SLS non-core
| 2006
| lossless audio coding without lossy General Audio core/layer (e.g. AAC)
|-
| 39
| ER AAC ELD (Enhanced Low Delay)
| 2008{{cite web | title=Enhanced low delay AAC, ISO/IEC 14496-3:2005/Amd 9:2008 | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=46457 | author=ISO | publisher=ISO | year=2008 | access-date=2009-10-13}}
| Error Resilient
|-
| 40
| SMR (Symbolic Music Representation) Simple
| 2008
| note: Symbolic Music Representation is also the MPEG-4 Part 23 standard (ISO/IEC 14496-23:2008){{cite web | title=ISO/IEC 14496-23:2008, Information technology -- Coding of audio-visual objects -- Part 23: Symbolic Music Representation | url=http://www.iso.org/iso/catalogue_detail.htm?csnumber=45531 | author=ISO | publisher=ISO | year=2008 | access-date=2009-10-13}}{{cite web | title=Symbolic Music Representation conformance, ISO/IEC 14496-4:2004/Amd 29:2008 | url=http://www.iso.org/iso/catalogue_detail.htm?csnumber=46593 | author=ISO | publisher=ISO | year=2008 | access-date=2009-10-13}}
|-
| 41
| SMR Main
| 2008
|
|-
| 42
| USAC (Unified Speech and Audio Coding)
| 2012
| Unified Speech and audio Coding is defined in MPEG-D Part 3 (ISO/IEC 23003-3:2012){{cite web | title=ISO/IEC 23003-3:2012 - Information technology -- MPEG audio technologies -- Part 3: Unified speech and audio coding | url=http://www.iso.org/standard/57464.html | author=ISO | publisher=ISO | year=2012 | access-date=2019-11-07}}
|-
| 43
| SAOC (Spatial Audio Object Coding)
| 2010{{cite web | title=ISO/IEC 14496-3:2009/Amd 2:2010, ALS simple profile and transport of SAOC | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=54838 | author=ISO | publisher=ISO | year=2009 | access-date=2009-10-13}}{{citation|title=ISO/IEC 14496-3:200X/PDAM 2 – ALS Simple Profile and Transport of SAOC, N10826 |url=http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n10483t.doc |author=ISO/IEC JTC1/SC29/WG11 |format=DOC |date=2009-07-03 |access-date=2009-10-13 |url-status=dead |archive-url=https://web.archive.org/web/20140729155217/http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n10483t.doc |archive-date=2014-07-29 }}
| note: Spatial Audio Object Coding is also the MPEG-D Part 2 standard (ISO/IEC 23003-2:2010){{cite web | title=ISO/IEC 23003-2:2010 - Information technology -- MPEG audio technologies -- Part 2: Spatial Audio Object Coding (SAOC) | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=51827 | author=ISO | publisher=ISO | year=2010 | access-date=2010-12-27}}
|-
| 44
| LD MPEG Surround
| 2010{{citation|url=https://www.iis.fraunhofer.de/content/dam/iis/de/doc/ame/conference/AES-128-Convention_ANewParametricStereo-AndMultichannelExtension_AES8099.pdf |title=AES Convention Paper 8099 – A new parametric stereo and Multi Channel Extension for MPEG-4 Enhanced Low Delay AAC (AAC-ELD) |access-date=2019-11-07}}
| This object type conveys Low Delay MPEG Surround Coding side information (that was defined in MPEG-D Part 2 – ISO/IEC 23003-2) in the MPEG-4 Audio framework.
|-
| 45
| SAOC-DE
| 2013
| Spatial Audio Object Coding Dialogue Enhancement
|-
| 46
| Audio Sync
| 2015
| The audio synchronization tool provides capability of synchronizing multiple contents in multiple devices.
|}
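The table reserves ID 31 as an escape value, which is how IDs of 32 and above can be signalled. The following Python sketch illustrates one way to read an Object Type ID from the start of a decoder configuration. It assumes the ID is stored in a 5-bit field and that the escape value is followed by a 6-bit extension added to 32; that bit layout is taken from the standard's configuration syntax rather than from the table above, and the function name and the abbreviated name lookup are illustrative only.

```python
# Minimal sketch: read an MPEG-4 Audio Object Type ID from the first two
# bytes of a configuration record.
# Assumption (not stated in the table above): the ID is a 5-bit field; the
# value 31 (ESCAPE) means "32 + the next 6 bits".

# A few names copied from the Object Type table, for display only.
OBJECT_TYPE_NAMES = {
    1: "AAC Main",
    2: "AAC LC",
    5: "SBR",
    29: "PS",
    36: "ALS",
    42: "USAC",
}


def read_audio_object_type(data: bytes) -> int:
    """Return the Audio Object Type ID encoded at the start of `data`."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(data[:2], "big")  # first 16 bits, MSB first
    object_type = bits >> 11                # top 5 bits
    if object_type == 31:                   # ESCAPE: extended ID follows
        extension = (bits >> 5) & 0x3F      # next 6 bits
        object_type = 32 + extension
    return object_type


if __name__ == "__main__":
    for raw in (bytes([0x12, 0x10]), bytes([0xF8, 0x80])):
        object_type = read_audio_object_type(raw)
        name = OBJECT_TYPE_NAMES.get(object_type, "(not in lookup)")
        print(raw.hex(), "->", object_type, name)
```

With this layout, the escape mechanism lets the 5-bit field address the IDs from 32 upwards that were added in later amendments.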

Audio Profiles

File:HE-AAC and HE-AAC v2.svg

The MPEG-4 Audio standard defines several profiles. These profiles are based on the object types, and each profile supports a different list of object types. Each profile may also have several levels, which limit some parameters of the tools present in a profile. These parameters are usually the sampling rate and the number of audio channels decoded at the same time.

{| class="wikitable sortable" style="min-width:600px"
|+MPEG-4 Audio Profiles
! width="20%" | Audio Profile
! width="50%" | Audio Object Types
! width="10%" | First public release date
|-
| AAC Profile
| AAC LC
| 2003
|-
| High Efficiency AAC Profile
| AAC LC, SBR
| 2003
|-
| HE-AAC v2 Profile
| AAC LC, SBR, PS
| 2006
|-
| Main Audio Profile
| AAC Main, AAC LC, AAC SSR, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, Main synthesis
| 1999
|-
| Scalable Audio Profile
| AAC LC, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI
| 1999
|-
| Speech Audio Profile
| CELP, HVXC, TTSI
| 1999
|-
| Synthetic Audio Profile
| TTSI, Main synthesis
| 1999
|-
| High Quality Audio Profile
| AAC LC, AAC LTP, AAC Scalable, CELP, ER AAC LC, ER AAC LTP, ER AAC Scalable, ER CELP
| 2000
|-
| Low Delay Audio Profile
| CELP, HVXC, TTSI, ER AAC LD, ER CELP, ER HVXC
| 2000
|-
| Natural Audio Profile
| AAC Main, AAC LC, AAC SSR, AAC LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, ER AAC LC, ER AAC LTP, ER AAC Scalable, ER TwinVQ, ER BSAC, ER AAC LD, ER CELP, ER HVXC, ER HILN, ER Parametric
| 2000
|-
| Mobile Audio Internetworking Profile
| ER AAC LC, ER AAC Scalable, ER TwinVQ, ER BSAC, ER AAC LD
| 2000
|-
| HD-AAC Profile
| AAC LC, SLS{{citation|title=ISO/IEC 14496-3:2005/PDAM 10:200X HD-AAC profile, MPEG2008/N10188 |url=http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n9813t.doc |format=DOC |author=ISO/IEC JTC1/SC29/WG11 |date=2008-10-17 |access-date=2009-10-19 |url-status=dead |archive-url=https://web.archive.org/web/20140512223049/http://kikaku.itscj.ipsj.or.jp/sc29/open/29view/29n9813t.doc |archive-date=2014-05-12 }}
| 2009{{cite web | title=ISO/IEC 14496-3:2009/Amd 1:2009 - HD-AAC profile and MPEG Surround signaling | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53944 | author=ISO | publisher=ISO | date=2009-09-11 | access-date=2009-10-15}}
|-
| ALS Simple Profile
| ALS
| 2010{{cite web | title=ISO/IEC 14496-3:2009/Amd 2:2010 - ALS simple profile and transport of SAOC | url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=54838 | author=ISO | publisher=ISO | date=2009-10-08 | access-date=2009-10-15}}
|}
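Because each profile is defined by the object types it supports, a decoder's capabilities can be modelled as a set-membership check. The Python sketch below copies a subset of rows from the profile table and lists the profiles that cover a required set of object types. The function name and data layout are illustrative, and the sketch deliberately ignores levels (sampling rate and channel limits), which a real conformance check would also need.

```python
# Minimal sketch: which profiles (subset of the table above) cover a given
# set of Audio Object Types? Levels are NOT modelled here.

PROFILES = {
    "AAC Profile": {"AAC LC"},
    "High Efficiency AAC Profile": {"AAC LC", "SBR"},
    "HE-AAC v2 Profile": {"AAC LC", "SBR", "PS"},
    "Speech Audio Profile": {"CELP", "HVXC", "TTSI"},
    "Synthetic Audio Profile": {"TTSI", "Main synthesis"},
    "Low Delay Audio Profile": {
        "CELP", "HVXC", "TTSI", "ER AAC LD", "ER CELP", "ER HVXC",
    },
    "HD-AAC Profile": {"AAC LC", "SLS"},
}


def profiles_supporting(required_types):
    """Return the sorted names of profiles whose object types include all of
    `required_types`."""
    required = set(required_types)
    return sorted(
        name for name, supported in PROFILES.items() if required <= supported
    )


if __name__ == "__main__":
    print(profiles_supporting({"AAC LC", "SBR"}))
    print(profiles_supporting({"CELP"}))
```

For example, a stream that needs both AAC LC and SBR is covered by the two HE-AAC profiles but not by the plain AAC Profile.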

Audio storage and transport

{| class="wikitable sortable" width="100%"
|+Multiplex, storage and transmission formats for MPEG-4 Audio
! width="10%" |
! width="20%" | Standard
! width="50%" | Description
|-
| Multiplex
| ISO/IEC 14496-1
| MPEG-4 Multiplex scheme (M4Mux){{citation|url=http://webstore.iec.ch/preview/info_isoiec14496-1{ed3.0}en.pdf |title=ISO/IEC 14496-1, Third edition 2004-11-15, Part 1: Systems |author=ISO |publisher=ISO |date=2004-11-15 |access-date=2009-10-14 |url-status=dead |archive-url=https://web.archive.org/web/20110614010701/http://webstore.iec.ch/preview/info_isoiec14496-1%7Bed3.0%7Den.pdf |archive-date=June 14, 2011 }}
|-
| Multiplex
| ISO/IEC 14496-3
| Low Overhead Audio Transport Multiplex (LATM)
|-
| Storage
| ISO/IEC 14496-3 (informative)
| Audio Data Interchange Format (ADIF) – only for AAC
|-
| Storage
| ISO/IEC 14496-12
| MPEG-4 file format (MP4) / ISO base media file format
|-
| Transmission
| ISO/IEC 14496-3 (informative)
| Audio Data Transport Stream (ADTS) – only for AAC
|-
| Transmission
| ISO/IEC 14496-3
| Low Overhead Audio Stream (LOAS), based on LATM
|}

There is no standard for transport of elementary streams over a channel, because the broad range of MPEG-4 applications has delivery requirements that are too diverse to be characterized easily by a single solution.

The capabilities of a transport layer and the communication between transport, multiplex, and demultiplex functions are described in the Delivery Multimedia Integration Framework (DMIF) in ISO/IEC 14496-6. A wide variety of delivery mechanisms exist below this interface, e.g., MPEG transport stream, Real-time Transport Protocol (RTP), etc.

Transport in Real-time Transport Protocol is defined in RFC 3016 (RTP Payload Format for MPEG-4 Audio/Visual Streams), RFC 3640 (RTP Payload Format for Transport of MPEG-4 Elementary Streams), RFC 4281 (The Codecs Parameter for "Bucket" Media Types) and RFC 4337 (MIME Type Registration for MPEG-4).
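As an illustration of how the codecs parameter ties back to the Audio Object Types, the Python sketch below parses a string such as `mp4a.40.2`. It assumes the commonly used `mp4a.OO.A` form, in which `OO` is a hexadecimal object type indication (0x40 denoting MPEG-4 Audio) and `A` is the decimal Audio Object Type ID; that convention is not spelled out in the text above, so it is treated here as an assumption, and the function name is hypothetical.

```python
# Minimal sketch: split an "mp4a.OO.A" codecs string into its parts.
# Assumptions: OO is hexadecimal (0x40 = MPEG-4 Audio) and A is the decimal
# Audio Object Type ID from the Object Type table.

MPEG4_AUDIO_OTI = 0x40  # assumed object type indication for MPEG-4 Audio


def parse_mp4a_codec(codec: str):
    """Return (object_type_indication, audio_object_type) for `codec`."""
    parts = codec.strip().split(".")
    if len(parts) != 3 or parts[0] != "mp4a":
        raise ValueError(f"not an mp4a.OO.A string: {codec!r}")
    object_type_indication = int(parts[1], 16)  # hexadecimal field
    audio_object_type = int(parts[2], 10)       # decimal field
    return object_type_indication, audio_object_type


if __name__ == "__main__":
    for value in ("mp4a.40.2", "mp4a.40.5", "mp4a.40.29"):
        oti, aot = parse_mp4a_codec(value)
        is_mpeg4_audio = oti == MPEG4_AUDIO_OTI
        print(value, "-> OTI", hex(oti), "AOT", aot, "MPEG-4 Audio:", is_mpeg4_audio)
```

Under these assumptions, the three example strings map to Object Type IDs 2 (AAC LC), 5 (SBR) and 29 (PS) from the Object Type table.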

LATM and LOAS were defined for natural audio applications, which do not require sophisticated object-based coding or other functions provided by MPEG-4 Systems.

Bifurcation in the AAC technical standard

{{Main|Advanced Audio Coding}}

The Advanced Audio Coding in MPEG-4 Part 3 (MPEG-4 Audio) Subpart 4 was enhanced relative to the previous standard MPEG-2 Part 7 (Advanced Audio Coding), in order to provide better sound quality for a given encoding bitrate.

Any differences between Part 3 and Part 7 are expected to be reconciled by the ISO standards body to avoid future bitstream incompatibilities. No player or codec incompatibilities arising from these differences are currently known.

The MPEG-2 Part 7 standard (Advanced Audio Coding) was first published in 1997 and offers three default profiles:{{citation|url=http://jongyeob.com/moniwiki/pds/upload/13818-7.pdf |archive-url=https://web.archive.org/web/20110713115817/http://jongyeob.com/moniwiki/pds/upload/13818-7.pdf |url-status=dead |archive-date=2011-07-13 |title=ISO/IEC 13818-7, Third edition, Part 7 – Advanced Audio Coding (AAC) |author=ISO |page=32 |date=2004-10-15 |access-date=2009-10-19 }}{{cite web | url=http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=25040 | title=ISO/IEC 13818-7:1997, Information technology -- Generic coding of moving pictures and associated audio information -- Part 7: Advanced Audio Coding (AAC) | author=ISO | year=1997 | access-date=2009-10-19 }} Low Complexity profile (LC), Main profile and Scalable Sampling Rate profile (SSR).

The MPEG-4 Part 3 Subpart 4 (General Audio Coding) combined the profiles from MPEG-2 Part 7 with Perceptual Noise Substitution (PNS) and defined them as Audio Object Types (AAC LC, AAC Main, AAC SSR).

HE-AAC

{{Main|HE-AAC}}

High-Efficiency Advanced Audio Coding is an extension of AAC LC using Spectral Band Replication (SBR) and Parametric Stereo (PS). It is designed to increase coding efficiency at low bitrates by using a partial parametric representation of the audio.

AAC-SSR

AAC Scalable Sample Rate was introduced by Sony to the MPEG-2 Part 7 and MPEG-4 Part 3 standards.{{Citation needed|date=October 2009|reason=There should be a reference to Sony's work on this standard.}} It was first published in ISO/IEC 13818-7, Part 7: Advanced Audio Coding (AAC) in 1997. The audio signal is first split into 4 bands using a 4-band polyphase quadrature filter (PQF) bank. Each of these 4 bands is then further split using MDCTs with a size k of 32 or 256 samples. This is similar to normal AAC LC, which uses MDCTs with a size k of 128 or 1024 directly on the audio signal.

The advantage of this technique is that short block switching can be done separately for every PQF band. High frequencies can therefore be encoded using a short block to enhance temporal resolution, while low frequencies can still be encoded with high spectral resolution. However, due to aliasing between the 4 PQF bands, coding efficiency around (1, 2, 3) × fs/8 is worse than with normal MPEG-4 AAC LC.{{Citation needed|date=February 2013}}
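The band layout described above can be worked through numerically. The following Python sketch assumes four equal-width bands that together span 0 to fs/2, which is consistent with the fs/8 boundaries mentioned in the text; the function names are illustrative. It also shows that 4 bands of MDCT size 32 or 256 add up to the 128 or 1024 sizes used by AAC LC.

```python
# Minimal sketch: PQF band edges and per-frame MDCT sizes for AAC-SSR,
# assuming 4 equal-width bands covering 0 .. fs/2.

PQF_BANDS = 4


def pqf_band_edges(sample_rate_hz):
    """Return (low_hz, high_hz) for each of the 4 PQF bands."""
    width = sample_rate_hz / (2 * PQF_BANDS)  # fs/8 per band
    return [(i * width, (i + 1) * width) for i in range(PQF_BANDS)]


def pqf_boundaries(sample_rate_hz):
    """Return the inner band boundaries, i.e. (1, 2, 3) * fs/8, where
    aliasing between neighbouring bands reduces coding efficiency."""
    width = sample_rate_hz / (2 * PQF_BANDS)
    return [k * width for k in range(1, PQF_BANDS)]


def total_mdct_size(per_band_size):
    """Return the combined MDCT size across all 4 bands."""
    return PQF_BANDS * per_band_size


if __name__ == "__main__":
    print(pqf_band_edges(48000))
    print(pqf_boundaries(48000))
    print(total_mdct_size(32), total_mdct_size(256))
```

At a 48 kHz sample rate the inner boundaries fall at 6, 12 and 18 kHz, which are the frequencies around which the text says efficiency suffers.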

MPEG-4 AAC-SSR is very similar to ATRAC and ATRAC-3.

= Why AAC-SSR was introduced =

The idea behind AAC-SSR was not only the advantage listed above, but also the possibility of reducing the data rate by removing 1, 2 or 3 of the upper PQF bands. A very simple bitstream splitter can remove these bands and thus reduce the bitrate and sample rate.

Example:

  • 4 subbands: bitrate = 128 kbit/s, sample rate = 48 kHz, f_lowpass = 20 kHz
  • 3 subbands: bitrate ~ 120 kbit/s, sample rate = 48 kHz, f_lowpass = 18 kHz
  • 2 subbands: bitrate ~ 100 kbit/s, sample rate = 24 kHz, f_lowpass = 12 kHz
  • 1 subband: bitrate ~ 65 kbit/s, sample rate = 12 kHz, f_lowpass = 6 kHz

Note: although possible, the resulting quality is much worse than is typical for these bitrates. A normal 64 kbit/s AAC LC encoder instead achieves a bandwidth of 14–16 kHz by using intensity stereo and reduced noise-to-mask ratios (NMRs), which degrades audible quality less than transmitting a 6 kHz bandwidth with perfect quality.
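The bandwidth and sample rate figures in the example can be reproduced with simple arithmetic. The Python sketch below assumes each of the 4 PQF bands is fs/8 wide, and that the output sample rate is the smallest of fs/4, fs/2 and fs whose Nyquist frequency still covers the retained bands. The second rule is inferred from the example values rather than taken from the standard, and the bitrates are not modelled because they depend on the signal.

```python
# Minimal sketch: audio bandwidth and output sample rate after a bitstream
# splitter drops upper PQF bands from an AAC-SSR stream.
# Assumptions: 4 bands of width fs/8; output rate is the smallest of
# fs/4, fs/2, fs whose Nyquist frequency covers the kept bands (a rule
# inferred from the example list, not quoted from the standard).

PQF_BANDS = 4


def after_dropping_bands(sample_rate_hz, kept_bands):
    """Return (bandwidth_hz, output_sample_rate_hz) for `kept_bands` bands."""
    if not 1 <= kept_bands <= PQF_BANDS:
        raise ValueError("kept_bands must be between 1 and 4")
    band_width = sample_rate_hz / (2 * PQF_BANDS)
    bandwidth = kept_bands * band_width
    for divisor in (4, 2, 1):                 # try the lowest rate first
        candidate = sample_rate_hz / divisor
        if candidate / 2 >= bandwidth:        # Nyquist covers kept bands
            return bandwidth, candidate
    return bandwidth, float(sample_rate_hz)   # unreachable for valid input


if __name__ == "__main__":
    for kept in (4, 3, 2, 1):
        bandwidth, rate = after_dropping_bands(48000, kept)
        print(f"{kept} subband(s): bandwidth {bandwidth / 1000:g} kHz, "
              f"sample rate {rate / 1000:g} kHz")
```

For 3, 2 and 1 retained bands this yields 18, 12 and 6 kHz of bandwidth at 48, 24 and 12 kHz sample rates, matching the list; with all 4 bands the theoretical limit is 24 kHz, while the example's 20 kHz figure reflects the encoder's own low-pass filter.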

BSAC

Bit Sliced Arithmetic Coding is an MPEG-4 standard (ISO/IEC 14496-3 subpart 4) for scalable audio coding. BSAC uses an alternative noiseless coding to AAC, with the rest of the processing being identical to AAC. Its scalability allows for nearly transparent sound quality at 64 kbit/s and graceful degradation at lower bit rates. BSAC performs best in the range of 40 kbit/s to 64 kbit/s, though it operates in the range of 16 kbit/s to 64 kbit/s. The AAC-BSAC codec is used in Digital Multimedia Broadcasting (DMB) applications.

Licensing

In 2002, the MPEG-4 Audio Licensing Committee selected the Via Licensing Corporation as the Licensing Administrator for the MPEG-4 Audio patent pool.{{cite web | url=https://www.reuters.com/article/pressRelease/idUS137433+05-Jan-2009+BW20090105 | archive-url=https://archive.today/20130104140933/http://www.reuters.com/article/pressRelease/idUS137433+05-Jan-2009+BW20090105 | url-status=dead | archive-date=2013-01-04 | title=Via Licensing Announces MPEG-4 SLS Patent Pool License | author=Business Wire | publisher=Reuters | date=2009-01-05 | access-date=2009-10-09 }}{{cite web | url=http://www.businesswire.com/portal/site/home/permalink/?ndmViewId=news_view&newsId=20090512006643&newsLang=en | title=Via Licensing Announces the Availability of an MPEG-4 SLS Joint Patent Licensing Program | author=Via Licensing Corporation | publisher=Business Wire | date=2009-05-12 | access-date=2009-10-09}}

References

{{Reflist|30em}}