UniNetNews

Let's Unify the Net!

Standards News and Solutions for Web Designers

Why XHTML? and How does it differ from HTML?

Author: Jan Hunt

Without an understanding of the answers to these questions, there would really be no reason to change from HTML to XHTML. I believe that a poor understanding of these answers and the speed at which things are changing are probably the two main reasons that XHTML has not caught on faster.

Doesn't it seem like we just started hearing about XML, and now it is the biggest buzzword in the industry? So then, what is XHTML? Programmers and Web designers ask me this constantly. They are confused: if XML is the big thing, why is there XHTML? Then there is the problem that WYSIWYG Web editor software has yet to catch up; only recently have any major HTML editors begun producing XHTML-compliant Web pages. The World Wide Web Consortium (W3C) created an HTML editor called Amaya, and Macromedia has come out with version 5 of HomeSite, which produces XHTML-compliant code.

So, What is XHTML?

XHTML 1.0 is the reformulation of HTML 4 as an XML application; it is a markup language that combines the best of HTML and XML. In a nutshell, this means that HTML will now conform to the stricter syntax and structure of XML but use the tags from HTML 4.01 Strict. By conforming to these rules, your documents will move into the future of XML, which is opening the door to the expansion and full potential of the Web.

The long-run implications are enormous, and XHTML 1.0 is just the tip of the iceberg. XHTML 1.0 is really just a recast of HTML 4, so it should work fine in older browsers. The real power of XHTML arrives with XHTML 1.1 and XHTML Basic. In these versions, the W3C is breaking HTML into modules, which will allow devices to request only the subsets of information that they can handle. The server will build Web pages designed specifically for that particular device. Remember, small devices may not have the processing power to run a full-fledged browser, and they may not be able to support things like graphics or forms.

eXtensible?

Geez, I haven't even mentioned the word EXTENSIBLE, and isn't that the X in XHTML?
Yes, and that is the great thing about XHTML. Because it is based on XML, which allows developers to create their own tags (extending the language), XHTML too can have its own custom tags. The only requirement is that these elements (tags) conform to the rules of XML.

Document elements are defined in what is called a Document Type Definition (DTD), and the display formatting is handled by style sheets. Did you know that HTML has a DTD too? The difference is that its structure and syntax are not as strict as XHTML's, and the tags are not exactly the same. As you progress towards the later versions of HTML, such as HTML 4.01 Strict, the jump to XHTML is not as great. In other words, documents that conform to HTML 4.01 will be easier to clean up into XHTML.
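As a rough sketch of what extending the language could look like, the page below mixes ordinary XHTML elements with elements from a made-up vocabulary. The rcp prefix, its namespace URL and the ingredient element are hypothetical names invented for this illustration; they are not part of any W3C specification.

```html
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rcp="http://example.com/ns/recipe"
      xml:lang="en" lang="en">
<head>
  <title>Pancakes</title>
</head>
<body>
  <h1>Pancakes</h1>
  <!-- The rcp:* elements are hypothetical custom tags, not XHTML tags -->
  <rcp:ingredient quantity="2" unit="cups">flour</rcp:ingredient>
  <rcp:ingredient quantity="1" unit="cup">milk</rcp:ingredient>
</body>
</html>
```

This page follows the rules of XML (it is well-formed), but it would not validate against the standard XHTML 1.0 DTDs, which know nothing about the extra elements. That is why the sketch leaves out the DOCTYPE: a DTD that declares the new elements would be needed.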

In summary,

  1. What is XHTML?
    XHTML is the next-generation markup language of the Web. It will support device independence while remaining compatible with the HTML browsers we use today.
  2. Why should I use XHTML?
    Learning to hand-code XHTML 1.0 will allow you to start creating Web pages that work today and that will move into the XML future. As XML applications link into XHTML, valid XHTML will become a necessity. Lastly, the skills you learn will help prepare you for the more advanced XHTML and XML technologies that are being developed!

Why Not Use XML Now Instead of XHTML?

You are! XHTML is XML! Since XHTML uses the same tags as HTML 4.01 Strict, the pages will be understood by older and current browsers. Only the most current browsers (IE 5.0 or later, Netscape 6.0, or Opera 5.0) support other XML languages, and the level of support probably varies. I have read that it takes at least two years for 75% of users to upgrade to the most current browser, so using your own XML language today would leave a lot of users unable to view your pages. Also, convincing Web site designers to drop HTML and learn a whole new family of technologies (some still being developed) such as XML, XLink, XSL, XSLT, CSS, DTDs and XML Schemas is not likely, at least not yet.

XHTML uses tags that today's Web designers and browsers recognize, so the jump from HTML to XHTML is not so big; with XHTML, Web designers can ease into the world of XML. XHTML is the bridge: an XML application that will display in older browsers, current browsers and the XML software of the future.

Not only that, but XHTML 1.0 provides a foundation for device-independent Web access. If you are interested (and I am sure you will be), read the W3C press release of 26 January 2000, "World Wide Web Consortium Issues XHTML 1.0 as a Recommendation".

How does XHTML differ from HTML?

HTML evolved from a simple structural markup language into an intricate language that became overly concerned with document style (display) to the detriment of document structure. HTML code became a mix of structural and style tags interspersed with browser-specific markup. Small mobile devices could never hope to process these pages. HTML is a one-size-fits-all markup language that was constantly being added to in an attempt to please everyone. So the W3C set out to fix the problems with HTML and recommended XHTML 1.0 because it is compatible with both HTML and XML.
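To make the contrast concrete, here is a small sketch. The commented-out line shows the kind of style-heavy HTML described above; the markup around it keeps only the structure (a heading) in the body and moves the display rules into a style sheet. The heading text and the CSS rule are invented for this illustration.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Sale Today</title>
  <!-- Display rules live in the style sheet, not in the tags -->
  <style type="text/css">
    h1 { font-family: Arial, sans-serif; color: red; text-align: center; }
  </style>
</head>
<body>
  <!-- Old style-heavy HTML:
       <CENTER><FONT FACE="Arial" SIZE="5" COLOR="red"><B>Sale Today</B></FONT></CENTER> -->
  <h1>Sale Today</h1>
</body>
</html>
```

A device that cannot handle fonts or colors can skip the style sheet and still see that "Sale Today" is a top-level heading.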

Here is a quick list of how XHTML differs from HTML:

  • Documents must include a DOCTYPE declaration (which names the DTD) and the XHTML namespace.
  • You cannot omit head and body elements.
  • Documents must be well-formed: every element must be closed, and elements must be properly nested.
  • Element and attribute names must be lowercase.
  • Non-empty elements must have end tags.
  • Attribute values must be quoted.
  • There are no standalone (minimized) attributes; write checked="checked" rather than checked.
  • Empty elements must have an end tag or use a combined start/end tag such as <br />.
  • Special characters such as & and < must be written as character entities (&amp; and &lt;).
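The rules in this list can be sketched in one small page. Each comment shows loose markup of the kind HTML browsers have tolerated, followed by its XHTML equivalent. The file names logo.gif and subscribe.cgi and the field name news are placeholders invented for this example.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>XHTML syntax rules</title>
</head>
<body>
  <!-- Loose HTML: <P>Fish & Chips<BR>served daily
       XHTML: lowercase names, closed p, combined tag for br, &amp; entity -->
  <p>Fish &amp; Chips<br />served daily</p>

  <!-- Loose HTML: <IMG SRC=logo.gif ALT=Logo>
       XHTML: quoted attribute values, empty element closed with " />" -->
  <p><img src="logo.gif" alt="Logo" /></p>

  <!-- Loose HTML: <INPUT TYPE=checkbox NAME=news CHECKED>
       XHTML: the standalone attribute is written out as checked="checked" -->
  <form action="subscribe.cgi" method="post">
    <p><input type="checkbox" name="news" checked="checked" /> Send me news</p>
  </form>
</body>
</html>
```

Note the space before the slash in tags such as br and img; it helps older HTML browsers read the combined start/end tag.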

Tags Every XHTML Web Page Must Have

  1. A doctype declaration that indicates which XHTML version and DTD you are using. As an FYI, there are three DTDs for XHTML 1.0: Strict, Transitional and Frameset.
  2. HTML tags - note the opening tag is combined with the required XML namespace (xmlns).
  3. Head tags which hold information about the document and more.
  4. Title tags (always contained in the head)
  5. Body tags which hold the contents of the Web page
<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
	<title></title>
</head>

<body>


</body>
</html>

Don't forget the rules: there is always an opening and closing tag for every element! (Except that weird one, DOCTYPE. If you want more information on what makes up a DOCTYPE declaration, read the UniNetNews article "How to Read an XHTML Doctype Declaration".)
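For reference, these are the DOCTYPE declarations for the three XHTML 1.0 DTDs mentioned in the list above. A page uses exactly one of them as its first line; they are shown together here only for comparison.

```html
<!-- Strict: structural markup only, with presentation left to style sheets -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- Transitional: also allows the older presentational tags and attributes -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- Frameset: for pages that divide the window into frames -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
```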


Just the facts please...

Following is a list of the required tags for XHTML 1.0:

  • <!DOCTYPE>
  • <html>
  • <head>
  • <title>
  • </title>
  • </head>
  • <body>
  • </body>
  • </html>

Validating Your Page Using the W3C Validation Service

W3C HTML VALIDATION SERVICE

This service will tell you whether your Web page code complies with the rules of XML and with the tags, syntax and structure of XHTML. It looks at your DOCTYPE and validates against the DTD you declare at the top of your XHTML document (in the prolog). For example, I used the Transitional DTD for the example shown in the section Tags Every XHTML Web Page Must Have. In this case, the validator sees the Transitional DTD and uses its rules to check the submitted document.

The W3C validator is really easy to use: just type in the URL of the page you want to validate and click submit! The hard part can be reading the error report so you can find and fix the errors in your page. If your page has no errors, the service returns a page with a message saying "Congratulations, this document validates as XHTML 1.0 Transitional!...." (or whichever DTD you listed in your DOCTYPE). The service also provides code that you can cut and paste into your document to display a "Valid XHTML 1.0!" icon (GIF).
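The cut-and-paste code looks roughly like the snippet below. Treat it as a sketch from memory rather than the service's exact output, and copy the real thing from your own validation results page.

```html
<p>
  <a href="http://validator.w3.org/check/referer"><img
      src="http://www.w3.org/Icons/valid-xhtml10"
      alt="Valid XHTML 1.0!" height="31" width="88" /></a>
</p>
```

The link points back at the validator, so anyone who clicks the icon re-checks the page they came from.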

Valid XHTML 1.0!

