XHTML#common errors
{{short description|Markup language which places HTML in XML form}}
{{Infobox file format
| name = XHTML
| icon =
| logo =
| caption =
| extension = .xhtml, .xht, .xml, .html, .htm
| mime = application/xhtml+xml
| developer = WHATWG
| type code =
| uniform type = public.xhtml
| conforms_to = public.xml
| magic =
| owner = World Wide Web Consortium (W3C)
| released = {{Start date|2000|01|26|df=yes}}
| latest release version =
| latest release date =
| creator code =
| genre = Markup language
| screenshot =
| container for =
| contained by =
| extended to =
| open = Yes
| standard = [https://html.spec.whatwg.org/multipage/ HTML LS]
| url =
}}
{{Html series}}
Extensible HyperText Markup Language (XHTML) is part of the family of XML markup languages which mirrors or extends versions of the widely used HyperText Markup Language (HTML), the language in which Web pages are formulated.
While HTML, prior to HTML5, was defined as an application of Standard Generalized Markup Language (SGML), a flexible markup language framework, XHTML is an application of XML, a more restrictive subset of SGML. XHTML documents are well-formed and may therefore be parsed using standard XML parsers, unlike HTML, which requires a lenient HTML-specific parser.{{cite web
|last1 = Graff
|first1 = Eliot
|title = Polyglot Markup: A robust profile of the HTML5 vocabulary
|url = http://dev.w3.org/html5/html-polyglot/html-polyglot.html
|publisher = W3C
|date = 7 May 2014
|access-date = 17 October 2015
|archive-date = 16 June 2022
|archive-url = https://web.archive.org/web/20220616202318/https://dev.w3.org/html5/html-polyglot/html-polyglot.html
|url-status = dead
}}
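The contrast between strict XML parsing and lenient HTML parsing can be illustrated with a minimal sketch using only the Python standard library. The sample strings and the helper names (<code>is_well_formed</code>, <code>TagCollector</code>, <code>lenient_tags</code>) are assumptions made for this illustration, and Python's <code>html.parser</code> is a simple tokenizer rather than a browser-grade HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample documents, invented for this illustration.
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>Hello<br /></p></body></html>"
)
# The <br> and <p> elements are never closed, so this is not well-formed XML.
TAG_SOUP = "<html><body><p>Hello<br></body></html>"


def is_well_formed(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Record start tags in the order the lenient tokenizer reports them."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup: str) -> list[str]:
    """Tokenize markup without enforcing well-formedness."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(is_well_formed(WELL_FORMED))  # accepted by the XML parser
print(is_well_formed(TAG_SOUP))     # rejected: mismatched tags
print(lenient_tags(TAG_SOUP))       # the HTML tokenizer still reports the tags
```

The XML parser stops at the first well-formedness error, whereas the HTML tokenizer reports whatever tags it encounters and leaves error recovery to the caller.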
XHTML 1.0 became a World Wide Web Consortium (W3C) recommendation on 26 January 2000. XHTML 1.1 became a W3C recommendation on 31 May 2001. XHTML is now referred to as "the XML syntax for HTML"{{cite web | url=https://html.spec.whatwg.org/multipage/xhtml.html | publisher=WHATWG | url-status=live | archive-url=https://web.archive.org/web/20230707043310/https://html.spec.whatwg.org/multipage/xhtml.html | archive-date=7 July 2023 | work=HTML Living Standard | title=Writing documents in the XML syntax}}{{cite web | url=https://html.spec.whatwg.org/dev/xhtml.html | publisher=WHATWG | url-status=live | archive-url=https://web.archive.org/web/20230605155122/https://html.spec.whatwg.org/dev/xhtml.html | archive-date=5 June 2023 | work=HTML: The Living Standard | title=The XML syntax}} and is being developed as an XML adaptation of the HTML living standard.{{cite web|title=HTML vs. XHTML|url=http://wiki.whatwg.org/wiki/HTML_vs._XHTML|work=whatwg.org}}{{cite web|url=http://blog.whatwg.org/xhtml5-in-a-nutshell|title=The WHATWG Blog|work=whatwg.org|date=25 July 2010 }}
==Overview==
XHTML 1.0 was "a reformulation of the three HTML 4 document types as applications of XML 1.0".{{cite web|access-date=2007-06-16|date=2000-01-26|publisher=World Wide Web Consortium|title=XHTML 1.0 Specification, Section 1: What is XHTML?|url=http://www.w3.org/TR/xhtml1/#xhtml}} The World Wide Web Consortium (W3C) simultaneously maintained the HTML 4.01 Recommendation. In the XHTML 1.0 Recommendation document, as published and revised in August 2002, the W3C commented that "The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility."
However, in 2005, the Web Hypertext Application Technology Working Group (WHATWG) formed, independently of the W3C, to work on advancing ordinary HTML not based on XHTML. The WHATWG eventually began working on a standard that supported both XML and non-XML serializations, HTML5, in parallel to W3C standards such as XHTML 2.0. In 2007, the W3C's HTML working group voted to officially recognize HTML5 and work on it as the next-generation HTML standard.{{cite web|title=results of HTML 5 text, editor, name questions|url=http://lists.w3.org/Archives/Public/public-html/2007May/0909.html|work=W3C }} In 2009, the W3C allowed the XHTML 2.0 Working Group's charter to expire, acknowledging that HTML5 would be the sole next-generation HTML standard, including both XML and non-XML serializations. Of the two serializations, the W3C suggests that most authors use the HTML syntax, rather than the XHTML syntax.{{cite web|url=http://www.w3.org/TR/html5/introduction.html#html-vs-xhtml|access-date=2011-02-16|date=2011-01-13|publisher=World Wide Web Consortium|title=HTML5 Working Draft, Section 1.6: HTML vs XHTML}}
The W3C recommendations of both XHTML 1.0 and XHTML 1.1 were retired on 27 March 2018,{{cite web |title=XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition) Publication History|date=27 March 2018 |publisher=World Wide Web Consortium|url=https://www.w3.org/standards/history/xhtml1}}{{cite web |title=XHTML™ 1.1 - Module-based XHTML - Second Edition Publication History|date=27 March 2018 |publisher=World Wide Web Consortium|url=https://www.w3.org/standards/history/xhtml11}} along with HTML 4.0,{{cite web |title=HTML 4.0 Publication History|date=27 March 2018 |publisher=World Wide Web Consortium|url=https://www.w3.org/standards/history/html40}} HTML 4.01,{{cite web |title=HTML 4.01 Publication History|date=27 March 2018 |publisher=World Wide Web Consortium|url=https://www.w3.org/standards/history/html401}} and HTML5.{{cite web |title=HTML5 Publication History|date=27 March 2018 |publisher=World Wide Web Consortium|url=https://www.w3.org/standards/history/html5}}
===Motivation===
XHTML was developed to make HTML more extensible and increase interoperability with other data formats.{{cite web|access-date=2007-06-16|date=2000-01-26|publisher=World Wide Web Consortium|title=XHTML 1.0 Specification, Section 1.1: Why the need for XHTML?|url=http://www.w3.org/TR/xhtml1/#why}} In addition, browsers were forgiving of errors in HTML, and most websites were displayed despite technical errors in the markup; XHTML introduced stricter error handling.{{Cite web
| url = http://diveintohtml5.info/past.html
| title = How Did We Get Here? - Dive Into HTML5
| last = Pilgrim
| first = Mark
| website = diveintohtml5.info
| access-date = 2016-06-11
}} HTML 4 was ostensibly an application of Standard Generalized Markup Language (SGML); however, the specification for SGML was complex, and neither web browsers nor the HTML 4 Recommendation were fully conformant to it.{{cite web|access-date=2008-12-29|author=Arjun Ray|date=1999-10-06|quote=... However, since ISO 8879 does not afford applications the leeway to prohibit internal subsets, it follows that the letter of the HTML [4] spec automatically disentitles it to be a conforming SGML application...|title=Dropping the Normative Reference to SGML (was: I-D ACTION.)|url=http://markmail.org/message/drvncr3f6yscveeg|archive-date=2021-02-25|archive-url=https://web.archive.org/web/20210225124458/https://markmail.org/message/drvncr3f6yscveeg|url-status=dead}} The XML standard, approved in 1998, provided a data format closer in simplicity to HTML 4.{{cite web
|url = http://www.dev-archive.net/articles/xhtml.html
|title = XHTML—Myths and Reality
|author = Tina Holmboe
|publisher = The Developer's Archive
|date = 2008-10-06
|access-date = 2008-12-29
|quote = ... Since the design goals of XML itself partially mirrored those of the original HTML, it was logical for work to begin on formulating an XML–based markup language...
|archive-date = 2017-01-12
|archive-url = https://web.archive.org/web/20170112181538/http://www.dev-archive.net/articles/xhtml.html
|url-status = dead
}} By shifting to an XML format, it was hoped HTML would become compatible with common XML tools;{{cite web
| url = http://www.xml.com/pub/a/2000/01/10/perlwebtools.html
| title = Creating Web Utilities Using XML::XPath
| author = Kip Hampton
| publisher = XML.com
| date = 2001-01-10
| access-date = 2008-12-29
| quote = ... The problem: You want to take advantage of the power and simplicity that XML tools can offer, but you face a site full of aging HTML documents. The solution: Convert your documents to XHTML and put Perl and XML::XPath to work...
}} servers and proxies would be able to transform content, as necessary, for constrained devices such as mobile phones.{{cite web
| url = http://www.xml.com/pub/a/2004/04/14/mobile.html
| title = Developing Wireless Content using XHTML Mobile
| author = Jean-Luc David
| publisher = XML.com
| date = 2004-04-14
| access-date = 2008-12-29
| quote = ... A useful feature of XHTML is that it can be manipulated as XML. Extensible Stylesheet Language Templates can be used to transform XHTML into WML or any other proprietary mobile formats...
}}
By using namespaces, XHTML documents could provide extensibility by including fragments from other XML-based languages such as Scalable Vector Graphics and MathML.{{cite web
|url = https://developer.mozilla.org/En/SVG:Namespaces_Crash_Course
|title = Namespaces Crash Course
|publisher = Mozilla Developer Center
|access-date = 2008-12-29
|quote = ... It has been a long-standing goal of the W3C to make it possible for different types of XML-based content to be mixed together in the same XML file. For example, SVG and MathML might be incorporated directly into an XHTML-based scientific document...
|archive-date = 2008-10-02
|archive-url = https://web.archive.org/web/20081002054459/http://developer.mozilla.org/En/SVG:Namespaces_Crash_Course
|url-status = dead
}} Finally, the renewed work would provide an opportunity to divide HTML into reusable components (XHTML Modularization) and clean up untidy parts of the language.{{cite web
| url = http://www.w3.org/MarkUp/2004/xhtml-faq
| title = HTML and XHTML Frequently Answered Questions
| author = Steven Pemberton
| publisher = World Wide Web Consortium
| date = 2004-07-21
| access-date = 2008-12-29
| quote = ... with an XML-based HTML other XML languages could include bits of XHTML, and XHTML documents could include bits of other markup languages. We could also take advantage of the redesign to clean up some of the more untidy parts of HTML and add some new needed functionality, like better forms...
}}
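The namespace-based mixing of vocabularies described above can be sketched as follows. The sample document and the helper names (<code>namespace_of</code>, <code>local_name</code>) are assumptions made for this illustration; only the two namespace URIs are the standard identifiers for XHTML and SVG.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical XHTML document embedding an SVG fragment.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">
      <circle cx="10" cy="10" r="8" />
    </svg>
  </body>
</html>"""

root = ET.fromstring(DOC)


def namespace_of(element: ET.Element) -> str:
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_name(element: ET.Element) -> str:
    """Return the tag name without its namespace prefix."""
    return element.tag.rsplit("}", 1)[-1]


# Collect the elements that belong to the SVG vocabulary, in document order.
svg_elements = [local_name(e) for e in root.iter() if namespace_of(e) == SVG_NS]
print(svg_elements)

# A namespace-qualified lookup finds the embedded circle directly.
circle = root.find(f".//{{{SVG_NS}}}circle")
print(circle.get("r"))
```

Because each element carries its namespace, a generic XML tool can tell the XHTML parts of the document from the SVG parts without any knowledge of either vocabulary.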
===Relationship to HTML===
XHTML and HTML differ in several ways. The Document Object Model (DOM) is a tree structure that represents the page internally in applications, and XHTML and HTML are two different ways of representing that tree in markup. Both are less expressive than the DOM – for example, "--" may be placed in comments in the DOM, but cannot be represented in a comment in either XHTML or HTML – and generally, XHTML's XML syntax is more expressive than HTML (for example, arbitrary namespaces are not allowed in HTML). XHTML uses an XML syntax, while HTML uses a pseudo-SGML syntax (officially SGML for HTML 4 and under, but never in practice, and standardized away from SGML in HTML5). Because the two syntaxes can express slightly different DOM contents, there are some differences in actual behavior between the two models. Syntax differences, however, can be overcome by implementing an alternate translational framework within the markup.
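The comment example can be reproduced with a minimal sketch using Python's <code>xml.dom.minidom</code>, which is one DOM implementation among many. The helper names (<code>build_document</code>, <code>try_serialize</code>) are assumptions made for this illustration, and the behavior shown is that of this particular library rather than of browsers.

```python
from xml.dom.minidom import getDOMImplementation


def build_document(comment_text: str):
    """Build a tiny DOM tree whose root element holds a single comment node."""
    doc = getDOMImplementation().createDocument(None, "html", None)
    doc.documentElement.appendChild(doc.createComment(comment_text))
    return doc


def try_serialize(doc):
    """Return the XML serialization, or None if the tree cannot be written out."""
    try:
        return doc.toxml()
    except ValueError:
        # minidom refuses to write a comment containing "--".
        return None


ok_doc = build_document("plain note")
bad_doc = build_document("a -- b")

# The DOM itself stores the text of both comments without complaint.
print(bad_doc.documentElement.firstChild.data)

# Only the first tree has a representation in XML syntax.
print(try_serialize(ok_doc))
print(try_serialize(bad_doc))
```

The tree containing "--" exists in memory as a valid DOM structure, but it has no markup form, which is the sense in which the syntax is less expressive than the DOM.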
First, there are some differences in syntax:{{cite web
| url = http://www.w3.org/TR/NOTE-sgml-xml-971215
| first = James
| last = Clark
| title = Comparison of SGML and XML
| publisher = World Wide Web Consortium Note
| date = 1997-12-15
}}
- {{anchor|self-closing tag|self-closing syntax|self-closing}}Broadly, the XML rules require that every element be closed, either with a separate closing tag (e.g. {{Code|<nowiki><p>...</p></nowiki>|xml}}) or by using the self-closing syntax (e.g. {{Code|<nowiki><br /></nowiki>|xml}}), while HTML syntax permits some elements to be unclosed because either they are always empty (e.g. {{Code|<nowiki><br></nowiki>|html}}) or their end can be determined implicitly ("omissibility", e.g. {{Code|<nowiki><p></nowiki>|html}}).
- XML is case-sensitive for element and attribute names, while HTML is not.
- Some shorthand features in HTML are omitted in XML, such as (1) attribute minimization, where attribute values or their quotes may be omitted (e.g. {{Code|