Objective: HTML versus XHTML, and validation
Learn:
Book: not covered in the book in one place
Web Readings:
To simulate in-class lectures
Name | MB | Description |
Validate.wmv | 6 | Validation |
You code to the rules of a particular standard, such as HTML 4.01. There are several standards because HTML has evolved and its successor, XHTML, continues to evolve. As a result, some rules have changed and some elements have been deprecated, meaning that at some future point they may no longer be supported. In a sense there are two standards:
As of today, browsers continue to support legacy code even if it does not conform to the latest official standard. Indeed, browsers are unlikely to discontinue support for HTML any time soon. But at some point a browser may no longer support or display the old code correctly. In one way this is similar to HDTV, where at some point everyone needs to convert; the difference is that with HTML no deadline has been set, so nobody knows how long legacy code will keep working. XHTML offers several advantages over HTML, which are:
There is no right or wrong, but the best advice is probably:
The path is not so clear, however, since the question is which standard should be the latest one you write for, as explained below.
XML is another markup language with a strict structure, meaning that if any piece of code has even one error, the whole document is deemed bad (not well-formed) and nothing displays. However, the biggest differences between HTML and XML are:
Although the name XHTML implies that it is HTML + XML, that is misleading. XHTML is really like HTML, since it uses pre-defined elements; however, it follows the syntax rules of XML. We cover XML later, so our choice for now is between HTML and XHTML. But XML has two concepts that carry over to XHTML:
So using XHTML allows us to ensure documents are well formed and are valid against some standard. The different versions to date are:
When creating new pages, your best choice is to code for XHTML 1.1, since it is more forward compatible than other versions. Even 1.1 may not be compatible with the future; however, it will be much easier to convert strict XHTML than HTML into whatever the future is. Further, companies insist on competent coders, even though non-standards-compliant code (sloppy and deprecated code) works just fine with today's browsers. Using strict XHTML code does NOT ensure that more of today's browsers will render the code better, since strict requires CSS, which very old browsers cannot handle. But most people use fairly modern browsers, so there is probably not much difference display-wise whether you use XHTML or not.
Code for XHTML 1.1 (or strict) means
For this course, you can use any code you want unless the assignment instructs otherwise. You should still understand HTML code, because HTML and XHTML are quite similar in many respects, and when you update old pages you need to know what the old HTML is. The book does not cover XHTML very well, so you must rely on the web readings.
The most confusing thing is that some elements have been deprecated and should not be used in XHTML. The hard part is realizing which tags are obsolete and which are not. Realize that CSS (style sheets, which we cover later) is the preferred way to control layout and formats in XHTML. Although some HTML tags have been deprecated in XHTML, they still work in browsers.
For example, there are various ways to do things like show a background image:
1. use the background attribute in the body tag (it will automatically tile, i.e., repeat)
2. use the bgimage tag (obsolete, so avoid it)
3. use a style
The preferred way is #3 because, as stated at w3schools.com/html/html_backgrounds.asp, #1-2 are deprecated, meaning they are no longer part of the HTML standard. However, #1-2 may still work because browsers may still support them (then again, they may not). So you will see lots of info about #1-2 on the web, but the modern way is #3. We will cover some of the deprecated tags because:
XHTML is not very different from HTML: most of the same markup exists, but there are some new rules. In technical terms, a valid XHTML document must be well formed (no syntax errors) and conform to a certain DTD (we will cover this more in XML) as specified in the DOCTYPE. There are many syntax rules, but the important points are that in XHTML (there may be other differences, but they are minor compared to those below):
Indeed most of these are no big deal since you probably coded that way in HTML also. XHTML 1.1 has even more rules like
Validation involves determining whether code has any syntax errors. It has little to do with whether a document is artistically well designed and looks nice in a browser. It simply determines whether the code follows the syntax standards.
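To make the idea of a syntax-only check concrete, here is a minimal sketch in Python using the standard library XML parser. It is only an illustration and not how the W3C validator works: it tests whether markup is well formed, not whether it is valid against a DTD. The function name is_well_formed and the two sample strings are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# A small XHTML-style document with every element closed.
GOOD = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>whatever</title></head>"
    "<body><p>your stuff goes here</p></body>"
    "</html>"
)

# The same document with the closing </p> removed, so <p> is never closed.
BAD = GOOD.replace("</p>", "")

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

The first string parses and the second does not, even though a browser reading it as HTML would most likely display both the same way.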
An XHTML document is validated against a Document Type Definition (DTD). Unlike XML, where you can make your own DTD, with XHTML the DTDs are built into the standard. So for validation you really need 2 things:
for example
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>whatever</title>
</head>
<body>
<p>your stuff goes here</p>
</body>
</html>
The main DTDs and doctypes are below; you can just copy and paste the doctype you want:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
NOTE: even though XHTML tags must be lowercase, the DOCTYPE keyword must be uppercase to validate.
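Since the DOCTYPE is what tells a validator which standard to check against, it can help to see how one might read it out of a page. The following is a rough sketch in Python, not the validator's actual logic. The names KNOWN_DTDS, DOCTYPE_RE and declared_standard are hypothetical, and the pattern only handles the PUBLIC form of the declaration shown above.

```python
import re

# Public identifiers copied from the DOCTYPE declarations listed above.
KNOWN_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
    "-//W3C//DTD XHTML 1.1//EN": "XHTML 1.1",
}

# Case-sensitive on purpose: the declaration is written with uppercase DOCTYPE.
# \s+ also matches line breaks, so the declaration may span several lines.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"\s+"([^"]+)"\s*>')


def declared_standard(markup):
    """Return the name of the declared standard, or None if it is not recognized."""
    match = DOCTYPE_RE.search(markup)
    if match is None:
        return None
    return KNOWN_DTDS.get(match.group(1))


PAGE = """<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"></html>"""

print(declared_standard(PAGE))
```

A page with no DOCTYPE, or one written in lowercase, returns None in this sketch.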
Every web page should be tested. There are 2 potential ways:
So to ensure code is well formed and valid for a certain standard, test it in a validator. But always test whether your page works in browsers as well. The issue then becomes what to do when your page generates lots of crazy validator errors and yet still looks okay in a browser, which is often the case.
Ideally, test a page in every browser your users might use. However, some browsers are awfully old and for casual use you likely do not have more than the latest IE or Mozilla.
As for validators, there are many programs and web sites available to do this, for example
Validators run your code against a known standard (like XHTML 1.1) and return any errors. Keep in mind errors
So in order of priority
If you want to create valid XHTML, there are some differences from HTML in which elements can be used. Over the years some elements evolved that did the same thing, and some worked in certain browsers only, for example embed versus object (Netscape versus IE browsers). So creating valid XHTML sometimes means the code will not actually work in some browsers. On the other hand, if validity is that important to you, you are probably already reconciled to leaving older browsers behind.
The code for this page should be valid so try it and see at
http://validator.w3.org/#validate_by_input+with_options
Also see examples at:
Browser display is more affected by MIME type than the DOCTYPE. Validation uses DOCTYPE to see what standard to compare code against. However, the browser displays a file based on its MIME type and not on what type of code it really has.
What does this mean? On one hand a lot, and on the other very little. If the MIME type is not correct, then unexpected results may occur. But for everything except XHTML, the MIME type is pretty well defined (it is like a file type association with filename extensions in Windows), meaning there is only 1 type you would ever use. However, for XHTML you can actually use a MIME type of either html or xml.
Read more at xml.com or at w3.org. The summary is
For XHTML as xml, use something like this (where xml-stylesheet is optional):
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="/style.css" media="screen,projection"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
For the latest browsers, using the <?xml ?> prolog or not makes no difference, but for older browsers it can be a problem, which is why many XHTML pages use type html. XHTML served as html is not much different from HTML served as html. If XHTML is served as xml, as the standards suggest, then you will find that XHTML has less browser support than HTML.
In technical detail, the MIME type (sent in the Content-Type HTTP header) tells the browser that the document is an application of XML if it is application/xhtml+xml; you can also use application/xml or even text/xml, although text/xml is not recommended. Most XHTML documents are served with a MIME type of text/html, which means that they are to be considered HTML documents. With such a MIME type, you are not using XHTML as far as browsers are concerned, but HTML.
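The rule above can be written down as a small lookup. This is only a sketch in Python of the decision described in these notes, not the code of any real browser. The names XML_TYPES and parsing_mode are made up for the example.

```python
# The three XML MIME types named above; text/xml works but is not recommended.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_mode(content_type):
    """Classify a Content-Type header value as 'xml', 'html' or 'other'."""
    # Drop parameters such as "; charset=utf-8" and ignore letter case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "other"


print(parsing_mode("application/xhtml+xml; charset=utf-8"))
print(parsing_mode("text/html"))
```

Notice that the function never looks at the markup itself: an XHTML page sent as text/html is classified as html, which is the point of the paragraph above.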
The XML namespace declaration (xmlns) in the <html> tag tells user agents
that it is XHTML (rather than any other application of XML). You must use one of
the three aforementioned XML MIME types, or user agents will ignore the XML
namespace. If, for instance, you serve your document as text/html, the namespace
is ignored, since HTML does not support XML namespaces.
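To see what it means for a parser to honor the namespace, here is a short sketch using Python's standard XML parser. This is an assumption for illustration only, since browsers use their own parsers. When the markup is read as XML, the namespace becomes part of the element's name. The function name root_namespace is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def root_namespace(markup):
    """Return the namespace of the root element, or None if it has none."""
    root = ET.fromstring(markup)
    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


WITH_NS = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    "<head><title>whatever</title></head><body></body></html>"
)
WITHOUT_NS = "<html><head><title>whatever</title></head><body></body></html>"

print(root_namespace(WITH_NS))
print(root_namespace(WITHOUT_NS))
```

An HTML parser has no such concept, which is why the xmlns attribute has no effect when the page is served as text/html.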
Meta tags can also be used for the character encoding, a stylesheet, or the Content-Type HTTP equivalent, but there is really no need to use meta because:
However, some browsers have trouble with a prolog, so you can instead specify the character encoding by inserting a Content-Type meta element into the <head> of your document and avoid the troublesome prolog.
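As a rough illustration of such a meta element and how a program might read it, here is a sketch using Python's standard html.parser module. The sample markup, the utf-8 charset, and the names MetaContentTypeFinder and find_meta_content_type are assumptions for this example only.

```python
from html.parser import HTMLParser


class MetaContentTypeFinder(HTMLParser):
    """Remember the content of the first <meta http-equiv="Content-Type"> seen."""

    def __init__(self):
        super().__init__()
        self.content_type = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-type" and self.content_type is None:
            self.content_type = attributes.get("content")


def find_meta_content_type(markup):
    """Return the declared Content-Type string, or None if no such meta element exists."""
    finder = MetaContentTypeFinder()
    finder.feed(markup)
    return finder.content_type


# A <head> that declares the encoding with a meta element instead of a prolog.
SAMPLE_HEAD = (
    "<head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
    "<title>whatever</title>"
    "</head>"
)

print(find_meta_content_type(SAMPLE_HEAD))
```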
The fact that XHTML may be served as HTML or XML makes a difference to the
way encoding information needs to be declared. Current browsers may display an
HTML file in either standards mode or quirks mode. This means that different
rules are applied to the display of the file, one conforming to the W3C
standards interpretation of expected behavior, the other to expectations based
on the non-standard behavior of older browsers.
In recent browsers such as Internet Explorer 7, Firefox, Opera, and others, a
page served with a DOCTYPE declaration will be rendered in standards mode with
or without the XML declaration.
With Internet Explorer 6, however, if anything (like the xml prolog) appears before the DOCTYPE declaration, the page is rendered in quirks mode. Because Internet Explorer 6 users still account for many users, this is a significant issue. If you want to ensure that your pages are rendered in the same way on all standards-compliant browsers, you need to think carefully about how you deal with this. It is a good idea to use a DOCTYPE declaration at the top of an HTML or XHTML file so that the document is rendered in standards mode by more recent user agents. The presence of an XML declaration in an XHTML file served as HTML will cause your file to be rendered in quirks mode on Internet Explorer 6 (and therefore for a potentially large proportion of your audience).
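A quick way to check a page for this problem is to test whether anything comes before the DOCTYPE. The sketch below, in Python, is a simple approximation of the rule described above and not Internet Explorer's actual logic. Treating leading whitespace as harmless is an assumption, and the function name has_content_before_doctype is made up for the example.

```python
def has_content_before_doctype(markup):
    """Return True if something other than whitespace precedes the DOCTYPE.

    Also returns True when the page has no DOCTYPE at all.
    """
    return not markup.lstrip().startswith("<!DOCTYPE")


DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)
PROLOG = '<?xml version="1.0" encoding="utf-8"?>'

# Page that starts with the DOCTYPE, and the same page with a prolog in front.
PLAIN_PAGE = DOCTYPE + "\n<html></html>"
PROLOG_PAGE = PROLOG + "\n" + PLAIN_PAGE

print(has_content_before_doctype(PLAIN_PAGE))
print(has_content_before_doctype(PROLOG_PAGE))
```

A True result for a page served as text/html is a hint that Internet Explorer 6 may fall back to quirks mode.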