Introduction
There seems to be some sort of fear of XHTML in the development community at the moment,
as if changing from HTML to XHTML is going to take years of learning and some
blood, sweat and tears in order to get compliant. What very few people realize
is that they are probably already very close to coding XHTML without even
knowing it. It's that simple. Also, the few minutes it takes you to read this
article could save you hours of cross-browser compatibility
exercises later on. With that firmly in mind, let us begin.
If this looks wrong to you:
<p>Hello there. <b>How are you?</p></b>
and you would rather do it this way
<p>Hello there. <b>How are you?</b></p>
you are already halfway there. In other words, nesting makes sense to you.
On the other hand, if you can't see the difference between the two examples
above, we need to explore the concept of nesting. It is really quite
simple: you close HTML tags in the reverse of the order you opened them in. In
other words, if you have an intricate list, you might open its elements in this
order:
<ul>
<li>
<ul>
<li>
And then you would need to close them in the reverse of the order in which they
were opened, so you would write:
</li>
</ul>
</li>
</ul>
And that is the way it always works. You open elements 1 2 3 and close them 3 2 1.
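If you would like to see the 1 2 3, 3 2 1 rule in action, here is a minimal sketch in Python. It is not part of XHTML itself, and the names NestingChecker and nesting_errors are my own invention for illustration. It pushes every opened tag onto a stack and complains whenever a closing tag does not match the tag opened most recently.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open tags on a stack and record tags closed out of order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # tags opened but not yet closed
        self.errors = []     # human-readable nesting problems

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Tags written like <br /> open and close in one step,
        # so they never go onto the stack.
        pass

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            # Mismatch: report it and leave the stack as it is.
            expected = self.open_tags[-1] if self.open_tags else None
            self.errors.append(f"closed </{tag}> but expected </{expected}>")


def nesting_errors(markup):
    """Return a list of nesting problems; an empty list means none found."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    # Anything still on the stack was never closed properly.
    leftover = [f"<{tag}> was never closed" for tag in checker.open_tags]
    return checker.errors + leftover


print(nesting_errors("<p>Hello there. <b>How are you?</b></p>"))  # []
print(nesting_errors("<p>Hello there. <b>How are you?</p></b>"))
```

The first call prints an empty list, while the second reports the badly nested tags. This is only a toy, and it is no substitute for a real validator.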
I hear someone at the back of the room whispering: "But what about an image tag? Or a break? Or a Horizontal Rule?"
Those questions neatly tie in with another integral part of XHTML compliance:
tag closing. Put simply, XHTML elements must always be closed. If you
open a tag, you need to close it. We have already discussed the order in which
tags should be closed. Now let us consider how to close them. I am assuming
that you understand that a tag like <b> is closed by a matching tag with
a forward slash "/" just after the opening angle bracket, like this:
</b>. But there are a few exceptions, just enough to keep
things interesting. The following elements have no content and so no separate
closing tag. They are said to be self closing: the slash goes just before the
final angle bracket.
<br />
<hr />
<img />
<input />
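Because XHTML is XML, any XML parser can tell you whether a tag was left open. The following is a rough sketch using Python's standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical, and the check only covers well-formedness, not full XHTML validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, which XHTML must do."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<p>Line one<br />Line two</p>"))  # True: <br /> closes itself
print(is_well_formed("<p>Line one<br>Line two</p>"))    # False: <br> is never closed
```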
Again I hear someone whispering in the background: "What about the doctype? That is not listed there?"
True. It isn't. But that is for a good reason: DOCTYPE is not really an HTML
tag. It is a declaration that does not define any of the content of the
page; rather, it tells the browser and the validator which set of rules (the
DTD) the page follows. Speaking of DOCTYPE, it is something
that is required in an XHTML page. To make things simple, XHTML 1.0 has only three
types of DOCTYPE, and their uses are pretty self-descriptive:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Strict: Follows the strictest XHTML rules. Deprecated elements and attributes
are not allowed, and the page will not validate unless the markup is 100%
XHTML compliant.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Transitional: If you are still using some features from older HTML
markup, like the "name" attribute on certain elements, this DOCTYPE will still validate.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Frameset: This DOCTYPE is to be used when you are
using frames on your website.
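If you ever need to find out which of the three a page declares, a small script can read it off the first line. The sketch below is just one way of doing it. The names XHTML_DOCTYPES, DOCTYPE_PATTERN and doctype_flavor are made up for this example, and it only recognizes the three XHTML 1.0 declarations shown above.

```python
import re

# Public identifiers of the three XHTML 1.0 DOCTYPEs listed above.
XHTML_DOCTYPES = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "Frameset",
}

# Capture the quoted public identifier that follows "PUBLIC".
DOCTYPE_PATTERN = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"')


def doctype_flavor(page):
    """Return 'Strict', 'Transitional', 'Frameset', or None if unrecognized."""
    match = DOCTYPE_PATTERN.search(page)
    if match is None:
        return None
    return XHTML_DOCTYPES.get(match.group(1))


page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    "<html></html>"
)
print(doctype_flavor(page))  # Transitional
```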
What is that about the name attribute not validating
anymore? Well, it's simple: on elements such as a, form,
img and map, the name attribute has been deprecated in
favor of the id attribute (form controls like input still
use name). However, as said before, if you are using the
transitional DOCTYPE you can use the name
and id attributes together, like this:
<a name="something" id="something"></a>
Doing it like that means that older browsers that do not
recognize the id attribute will still render the
content, albeit oddly (My own opinion is that anyone using a
browser older than IE6 or Firefox 2 deserves to have a bad
internet experience, but hey--that's just me).
Something I am sure you have noticed already is that my tags are always written
in lower case. Upper case tags in XHTML are taboo: XHTML is case-sensitive, and
element and attribute names must be in lower case. Stick to lower case
and all will be fine. Another thing you may or may not have noticed is that my
attributes are not minimized. XHTML does not allow attribute minimization, which
means that this is right:
<input type="text" readonly="readonly" />
while this is wrong:
<input type="text" readonly />
The idea is that the attribute value is the same as the
attribute name, so you write:
checked="checked"
readonly="readonly"
disabled="disabled"
selected="selected"
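Hunting for minimized attributes by eye gets tedious, so here is a small sketch that does it for you. It leans on the fact that Python's html.parser reports an attribute written without a value as None. The names MinimizedAttributeFinder and find_minimized are hypothetical.

```python
from html.parser import HTMLParser


class MinimizedAttributeFinder(HTMLParser):
    """Collect attributes written without a value, e.g. <input readonly>."""

    def __init__(self):
        super().__init__()
        self.minimized = []  # (tag, attribute) pairs

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # html.parser reports a bare attribute with the value None.
            if value is None:
                self.minimized.append((tag, name))


def find_minimized(markup):
    """Return every (tag, attribute) pair that lacks an attribute value."""
    finder = MinimizedAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.minimized


print(find_minimized('<input type="text" readonly />'))
print(find_minimized('<input type="text" readonly="readonly" />'))  # []
```

The first call reports the bare readonly attribute on the input tag, and the second finds nothing to complain about.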
Now I'm sure that another thing rearing its ugly head here,
at least for me, is the fact that all attribute values are
quoted. This is important. The following is wrong:
<input type=text readonly=readonly />
while this is right:
<input type="text" readonly="readonly" />
Either single quotes or double quotes are fine as long as all
attribute values are quoted. My obsessive-compulsive mind
goes through stages of preferring single quotes over double
quotes and vice versa. But whichever I use, I try to be
consistent. God is in the Mark-Up.
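If your markup is generated by a script, you can let the script do the quoting so that nothing slips through. Here is a minimal sketch using quoteattr from Python's standard xml.sax.saxutils module. The helper render_attributes is my own name for illustration.

```python
from xml.sax.saxutils import quoteattr


def render_attributes(attributes):
    """Render a dict of attributes with every value quoted and escaped."""
    return " ".join(
        f"{name}={quoteattr(value)}" for name, value in attributes.items()
    )


attributes = {"type": "text", "readonly": "readonly"}
print("<input " + render_attributes(attributes) + " />")
# <input type="text" readonly="readonly" />
```

quoteattr picks double quotes by default and switches to single quotes when the value itself contains a double quote, which fits the rule above: either kind is fine, as long as every value is quoted.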
Before calling it a day--and before you fall asleep due to
over-exposure--there are just a few minor things to look at.
First, the structure of an XHTML page. The basic elements
required in an XHTML page follow:
<!DOCTYPE ...>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>... </title>
</head>
<body>... </body>
</html>
They are ALWAYS required in the order shown above. More
elements can be added as need be, but these cannot be taken
away.
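To check that a page has this skeleton, you could run something like the sketch below. The sample PAGE and the names local_name and has_required_structure are made up for illustration. The sample declares the XHTML namespace on the html element, which the XHTML 1.0 specification asks for, and the check ignores namespaces so that it works with or without that attribute.

```python
import xml.etree.ElementTree as ET

# A hypothetical minimal page, used only to exercise the check below.
PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example page</title>
</head>
<body><p>Hello there.</p></body>
</html>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree adds to namespaced tags."""
    return tag.rsplit("}", 1)[-1]


def has_required_structure(page):
    """Check for html containing head (with a title) and then body."""
    try:
        root = ET.fromstring(page)
    except ET.ParseError:
        return False  # not even well-formed
    children = [local_name(child.tag) for child in root]
    if local_name(root.tag) != "html" or children != ["head", "body"]:
        return False
    head = root[0]
    return any(local_name(child.tag) == "title" for child in head)


print(has_required_structure(PAGE))  # True
```

Note that this only looks at the element skeleton. It does not check that the DOCTYPE is present, and it does not validate the page against the DTD.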
Last, but not least, always validate your pages with
the W3C Validator. This handy tool checks your pages and
even tells you how to fix the problems.
Until next time, Happy Validating!
Marc Steven Plotz