HTML audio
{{Short description|HTML element}}
HTML audio is a subject of the HTML specification, covering audio playback, processing and synthesis, as well as speech to text, all in the browser.
<audio> element
The {{tag|audio|o}} element represents a sound, or an audio stream. It is commonly used to play back a single audio file within a web page, showing a GUI widget with play/pause/volume controls.
The {{tag|audio|o}} element has these attributes:
- global attributes (accesskey; class; contenteditable; contextmenu; dir; draggable; dropzone; hidden; id; lang; spellcheck; style; tabindex; title; translate)
- autoplay = "autoplay" or "" (empty string) or empty
Instructs the User-Agent to automatically begin playback of the audio stream as soon as it can do so without stopping.
- preload = "none" or "metadata" or "auto" or "" (empty string) or empty
Represents a hint to the User-Agent about whether optimistic downloading of the audio stream itself or its metadata is considered worthwhile.
  - "none": Hints to the User-Agent that the user is not expected to need the audio stream, or that minimizing unnecessary traffic is desirable.
  - "metadata": Hints to the User-Agent that the user is not expected to need the audio stream, but that fetching its metadata (duration and so on) is desirable.
  - "auto": Hints to the User-Agent that optimistically downloading the entire audio stream is considered desirable.
- controls = "controls" or "" (empty string) or empty
Instructs the User-Agent to expose a user interface for controlling playback of the audio stream.
- loop = "loop" or "" (empty string) or empty
Instructs the User-Agent to seek back to the start of the audio stream upon reaching the end.
- mediagroup = string
Instructs the User-Agent to link multiple videos and/or audio streams together.
- muted = "muted" or "" (empty string) or empty
Represents the default state of the audio stream, potentially overriding user preferences.
- src = non-empty [URL] potentially surrounded by spaces
The URL for the audio stream.
Example:
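As a minimal sketch of how the attributes listed above combine, the following JavaScript builds the markup for an {{tag|audio|o}} element from a plain object. The helper names `escapeAttr` and `audioTag` and the file name `song.ogg` are assumptions made for illustration; they are not part of the HTML specification or of any browser API.

```javascript
// Hypothetical helper: escapes a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Hypothetical helper: turns { name: value } pairs into <audio> markup.
// A value of true produces a bare boolean attribute (e.g. "controls");
// false, null and undefined omit the attribute entirely.
function audioTag(attrs) {
  const parts = ['<audio'];
  for (const [name, value] of Object.entries(attrs)) {
    if (value === true) {
      parts.push(' ' + name);
    } else if (value !== false && value != null) {
      parts.push(' ' + name + '="' + escapeAttr(value) + '"');
    }
  }
  parts.push('></audio>');
  return parts.join('');
}

// "song.ogg" is a placeholder file name.
const markup = audioTag({
  src: 'song.ogg',
  controls: true,
  loop: false,
  preload: 'metadata',
});
console.log(markup);
// prints: <audio src="song.ogg" controls preload="metadata"></audio>
```

Because `autoplay`, `controls`, `loop` and `muted` are boolean attributes, their presence alone switches the behaviour on, which is why the sketch omits them when the value is `false`.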
= Supporting browsers =
On PC:
- Google Chrome
- Internet Explorer 9
- Firefox 3.5
- Opera 10.5
- Safari 3.1
On mobile devices:
- Android Browser 2.3
- Google Chrome
- Internet Explorer Mobile 9
- Safari 4
- Firefox
- Opera Mobile 11
Supported audio coding formats
The adoption of HTML audio, as with HTML video, has become polarized between proponents of free and patent-encumbered formats. In 2007, the recommendation to use Vorbis was retracted from the HTML5 specification by the W3C together with that to use Ogg Theora, citing the lack of a format accepted by all the major browser vendors.
Apple and Microsoft support the ISO/IEC-defined formats AAC and the older MP3. Mozilla and Opera support the free and open, royalty-free Vorbis format in Ogg and WebM containers, and criticize the patent-encumbered nature of MP3 and AAC, which are guaranteed to be “non-free”. Google has so far provided support for all common formats.
Most AAC files with finite length are wrapped in an MPEG-4 container (.mp4, .m4a), which is supported natively in Internet Explorer, Safari, and Chrome, and supported by the OS in Firefox and Opera.{{cn|date=August 2024}} Most AAC live streams with infinite length are wrapped in an Audio Data Transport Stream container (.aac, .adts), which is supported by Chrome, Safari, Firefox and Edge.{{Cite web|title=MP4 container · Issue #95 · karlheyes/icecast-kh|url=https://github.com/karlheyes/icecast-kh/issues/95|access-date=2022-11-18|website=GitHub|language=en}}{{Cite web|url=https://developer.apple.com/library/ios/technotes/tn2236/_index.html#//apple_ref/doc/uid/DTS40008748-CH1-SECTION5|title = Technical Note TN2236: High-Efficiency Advanced Audio Coding (HE-AAC)}}{{Cite web|url=https://bugzilla.mozilla.org/show_bug.cgi?id=1224887|title = 1224887 – Implement OpenMax IL AAC audio decoding client}}
Many browsers also support uncompressed PCM audio in a WAVE container.{{Cite web|url=https://developer.mozilla.org/en-US/docs/Web/Media/Formats|title=Media type and format guide: image, audio, and video content – Web media technologies | MDN|website=developer.mozilla.org}}
In 2012, the free and open royalty-free Opus format was released and standardized by the IETF. It is supported by Mozilla, Google, Opera and Edge.{{Cite web | url=https://www.xiph.org/press/2012/rfc-6716/ | title=September 11, 2012: Opus audio codec is now RFC6716, Opus 1.0.1 reference source released }}{{Cite web|url=https://hacks.mozilla.org/2012/09/its-opus-it-rocks-and-now-its-an-audio-codec-standard/|title = It's Opus, it rocks and now it's an audio codec standard! – Mozilla Hacks – the Web developer blog}}{{Cite web|url=https://blogs.windows.com/msedgedev/2016/04/18/webm-vp9-and-opus-support-in-microsoft-edge/#3ZKPLtTr0QvtJ2aF.97|title=WebM, VP9 and Opus Support in Microsoft Edge – Microsoft Edge Dev Blog|website=blogs.windows.com|date=18 April 2016|language=en-US|access-date=2017-03-22}}
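Because format support differs between browsers, a page can ask the browser which encodings it is able to play before choosing a file. The sketch below uses the standard `canPlayType` method of media elements, which returns `"probably"`, `"maybe"` or an empty string. The function name `pickSource` and the `clip.*` file names are assumptions made for illustration.

```javascript
// Candidate encodings of the same recording; the URLs are placeholders.
const candidates = [
  { url: 'clip.opus', type: 'audio/ogg; codecs="opus"' },
  { url: 'clip.m4a', type: 'audio/mp4; codecs="mp4a.40.2"' },
  { url: 'clip.mp3', type: 'audio/mpeg' },
];

// Hypothetical helper: returns the first candidate the browser says it can
// "probably" play, falling back to the first "maybe", or null if none match.
// canPlayType is passed in so the selection logic does not depend on the DOM.
function pickSource(list, canPlayType) {
  let fallback = null;
  for (const item of list) {
    const answer = canPlayType(item.type);
    if (answer === 'probably') return item;
    if (answer === 'maybe' && fallback === null) fallback = item;
  }
  return fallback;
}

// In a browser, probe with a detached audio element.
if (typeof Audio !== 'undefined') {
  const probe = new Audio();
  const chosen = pickSource(candidates, (type) => probe.canPlayType(type));
  console.log(chosen ? chosen.url : 'no supported format');
}
```

Listing several `source` children inside the {{tag|audio|o}} element achieves a similar result declaratively; the script form is useful when the choice has to be made in code.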
This table documents the current support for audio coding formats by the <audio> element.
Web Audio API and MediaStream Processing API
The Web Audio API specification developed by W3C describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported.{{Cite web |url=https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html |title=Web Audio API |author=Chris Rogers |date=2012-03-15 |publisher=W3C |archive-url=https://web.archive.org/web/20120720115514/https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html |archive-date=2012-07-20 |access-date=2012-07-04 |url-status=bot: unknown }}
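The routing-graph idea can be sketched with three nodes: an oscillator feeding a gain node, which feeds the context's destination. `createOscillator`, `createGain`, `connect` and `destination` are part of the Web Audio API; the function name `buildToneGraph` and the default frequency and volume are assumptions made for illustration.

```javascript
// Hypothetical helper: wires oscillator -> gain -> destination on the given
// AudioContext and returns the two nodes so the caller can start and stop them.
function buildToneGraph(ctx, frequency = 440, volume = 0.2) {
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();

  osc.type = 'sine';
  osc.frequency.value = frequency; // pitch in hertz
  gain.gain.value = volume;        // linear gain, 0 is silent and 1 is unchanged

  osc.connect(gain);               // first edge of the routing graph
  gain.connect(ctx.destination);   // second edge: out to the speakers

  return { osc, gain };
}

// Browser usage, typically run from a click handler because many browsers
// keep a new AudioContext suspended until the user interacts with the page:
//   const ctx = new AudioContext();
//   const { osc } = buildToneGraph(ctx);
//   osc.start();
//   osc.stop(ctx.currentTime + 1); // play for one second
```

The JavaScript only describes the graph; generating the samples is left to the browser's native audio engine, as the paragraph above notes.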
Mozilla's Firefox browser added a similar Audio Data API extension in version 4, implemented in 2010{{Cite web | url=https://wiki.mozilla.org/Audio_Data_API | title=Audio Data API}} and released in 2011, but Mozilla warns that it is non-standard and deprecated, and recommends the Web Audio API instead.{{Cite web |url=https://developer.mozilla.org/en/Introducing_the_Audio_API_Extension |title=Introducing the Audio API extension |date=2012-03-05 |work=Mozilla Developer Network |publisher=Mozilla |archive-url=https://web.archive.org/web/20120505042746/https://developer.mozilla.org/en/Introducing_the_Audio_API_Extension |archive-date=2012-05-05 |access-date=2012-07-04 |url-status=dead }}
Some JavaScript audio processing and synthesis libraries such as [https://oampo.github.com/Audiolet/ Audiolet] {{Webarchive|url=https://web.archive.org/web/20130128201851/http://oampo.github.com/Audiolet/ |date=2013-01-28 }} support both APIs.
The [http://www.w3.org/2011/audio/ W3C Audio Working Group] is also considering the MediaStream Processing API specification developed by Mozilla.{{Cite web |url=http://www.w3.org/TR/audioproc/ |title=Audio Processing API |date=2011-12-15 |publisher=W3C |archive-url=https://web.archive.org/web/20120614023917/http://www.w3.org/TR/audioproc/ |archive-date=2012-06-14 |access-date=2012-07-04 |url-status=bot: unknown }}
In addition to audio mixing and processing, it covers more general media streaming, including synchronization with HTML elements, capture of audio and video streams, and peer-to-peer routing of such media streams.{{Cite web |url=http://www.w3.org/TR/2012/NOTE-streamproc-20120531/ |title=MediaStream Processing API |author=Robert O'Callahan |date=2012-05-31 |publisher=W3C |access-date=2012-07-04}}
= Supporting browsers =
On PC:
- Google Chrome 10{{Cite web|title=Web Audio API is now available in Chrome from Chris Rogers on 2011-02-01 (public-xg-audio@w3.org from February 2011)|url=https://lists.w3.org/Archives/Public/public-xg-audio/2011Feb/0000.html|access-date=2022-11-18|website=lists.w3.org}} (Enabled by default since 14{{Cite web | url=http://www.webmonkey.com/2011/09/chrome-14-adds-better-audio-native-client-support/ | title=Chrome 14 Adds Better Audio, 'Native Client' Support |author=Scott Gilbertson |date=2011-09-19 |work=Webmonkey |publisher=Wired |access-date=2012-07-04}})
- Firefox 23 (Enabled by default since 25)
- Opera 15
- Safari 6
- Microsoft Edge 12
- Opera GX 36
On mobile devices:
- Google Chrome for Android 28 (Enabled by default since 29) and Apple iPads
- Safari 6 (with restrictions on use: audio is muted unless playback is started by a user action)
- Firefox 23 (Enabled by default since 25)
- Tizen
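The user-interaction restriction noted for mobile Safari has a counterpart in current browsers, where an `AudioContext` can start in the `suspended` state until the user interacts with the page. The sketch below shows one common way to handle this with the standard `state` property and `resume()` method; it is not necessarily the mechanism Safari 6 required, and the function name `resumeOnGesture` is an assumption made for illustration.

```javascript
// Hypothetical helper: resumes a suspended AudioContext the first time the
// user triggers `eventName` on `target`, then removes the listener again.
function resumeOnGesture(ctx, target, eventName = 'click') {
  function onGesture() {
    target.removeEventListener(eventName, onGesture);
    if (ctx.state === 'suspended') {
      ctx.resume(); // returns a Promise that settles once audio is running
    }
  }
  target.addEventListener(eventName, onGesture);
}

// Browser usage:
//   const ctx = new AudioContext();
//   resumeOnGesture(ctx, document.body);
```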
Web Speech API
The Web Speech API aims to provide an alternative input method for web applications (without using a keyboard). With this API, developers can give web apps the ability to transcribe voice to text, from the computer's microphone. The recorded audio is sent to speech servers for transcription, after which the text is typed out for the user. The API itself is agnostic of the underlying speech recognition implementation and can support both server-based and embedded recognizers.{{Cite web | url=http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html#introduction | title=API draft | access-date=January 28, 2012}}
The HTML Speech Incubator group has proposed the implementation of audio-speech technology in browsers in the form of uniform, cross-platform APIs. The API contains both:{{Cite web | url=https://wiki.mozilla.org/HTML5_Speech_API | title=HTML5 Speech API | access-date=January 28, 2012}}
- Speech Input API
- Text to Speech API
Google integrated this feature into Google Chrome in March 2011,{{Cite web | url=http://chrome.blogspot.com/2011/03/talking-to-your-computer-with-html5.html | title=Talking to your computer | access-date=January 28, 2012}} letting its users search the web with their voice with code like:
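A minimal sketch of voice-driven search, assuming the later `SpeechRecognition` interface of the Web Speech API (exposed in Chrome as `webkitSpeechRecognition`) rather than the markup Chrome shipped in 2011. The properties `lang`, `interimResults`, `maxAlternatives`, `onresult` and the `start()` method belong to that interface; the function names `listenOnce` and `searchUrl` are assumptions made for illustration.

```javascript
// Hypothetical helper: runs one recognition session and passes the top
// transcript to onText. The constructor is injected so the caller can supply
// window.SpeechRecognition or the prefixed window.webkitSpeechRecognition.
function listenOnce(RecognitionCtor, onText, lang = 'en-US') {
  const recognition = new RecognitionCtor();
  recognition.lang = lang;
  recognition.interimResults = false; // deliver final results only
  recognition.maxAlternatives = 1;    // keep just the best guess

  recognition.onresult = (event) => {
    // event.results is a list of results, each holding ranked alternatives.
    onText(event.results[0][0].transcript);
  };

  recognition.start();
  return recognition;
}

// Hypothetical helper: builds a search URL for the recognized phrase.
function searchUrl(query) {
  return 'https://www.google.com/search?q=' + encodeURIComponent(query);
}

// Browser usage, typically from a button click so the microphone prompt
// follows a user action:
//   const Ctor = window.SpeechRecognition || window.webkitSpeechRecognition;
//   if (Ctor) listenOnce(Ctor, (text) => { location.href = searchUrl(text); });
```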
= Supporting browsers =
- Safari 14.1 and up {{Cite web|url=https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API|title=Web Speech API – Web APIs | MDN|access-date=May 20, 2024}}
- Google Chrome 25 and up
- Firefox Desktop 44.0 and up (Linux and Mac) / 45.0 and up (Windows) [PARTIAL: speech synthesis only; no recognition; enabled by default since 49.0]{{Cite web|url=https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/49|title=Firefox 49 for developers – Mozilla | MDN|access-date=May 20, 2024}}{{Cite web|url=https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API|title=Web Speech API – Web APIs | MDN|access-date=May 20, 2024}}
See also
Notes
{{Notelist}}
References
{{Reflist|30em}}
External links
- HTML/Elements/audio – W3C Wiki
- [https://web.archive.org/web/20130606104953/http://www.w3.org/TR/html5/embedded-content-0.html#the-audio-element HTML5 audio element – W3C]
- [http://www.w3.org/TR/webaudio/ Web Audio API – W3C]
- [http://www.w3.org/TR/streamproc/ MediaStream Processing API – W3C]
- [https://web.archive.org/web/20150525223958/https://dvcs.w3.org/hg/speech-api/raw-file/9a0075d25326/speechapi.html Web Speech API – W3C]
- [https://github.com/rserota/wad Web Audio DAW – GitHub]
- [https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API Mozilla's Web Audio API]