There seems to be some sort of fear of XHTML in the development community at the moment, as if changing from HTML to XHTML is going to take years of learning and some blood, sweat and tears in order to become compliant. What very few people realize is that they are probably already very close to coding XHTML without even knowing it. It's that simple. Also, the few minutes it takes for you to read this article could save you literally hundreds of hours in cross-browser compatibility exercises later on. With that firmly in mind, let us begin.
If this looks wrong to you:
<p>Hello there. <b>How are you?</p></b>
and you would rather do it this way
<p>Hello there. <b>How are you?</b></p>
you are already halfway there. In other words, nesting makes sense to you.
On the other hand, if you can't see the difference between the two examples above, we need to explore the concept of nesting. It is really quite simple: you close HTML tags in the opposite order to the one in which you opened them. In other words, if you have an intricate list, you might open its elements in this order:
<ul> <li> <b>
And then you would need to close them in the opposite order to the one in which they were opened, so you would say:
</b> </li> </ul>
And that is the way it always works. You open elements 1 2 3 and close them 3 2 1.
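If you like to see rules spelled out in code, the 1 2 3 / 3 2 1 rule is really just a stack. Here is a minimal Python sketch built on the standard library's html.parser module. The names NestingChecker and is_properly_nested are my own inventions for illustration, and the sketch only looks at the order of tags, nothing else.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Keeps open elements on a stack; a closing tag must match the top."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # currently open elements, innermost last
        self.errors = []      # closing tags that arrived out of order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Tags written like <br /> open and close in one step.
        pass

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(tag)


def is_properly_nested(markup):
    """Return True if every element is closed in reverse order of opening."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return not checker.errors and not checker.open_tags


print(is_properly_nested("<p>Hello there. <b>How are you?</p></b>"))  # False
print(is_properly_nested("<p>Hello there. <b>How are you?</b></p>"))  # True
```

The last element you opened sits on top of the stack, so it is the only one you are allowed to close next.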
I hear someone at the back of the room whispering: "But what about an image tag? Or a break? Or a Horizontal Rule?"
Those questions neatly tie in with another integral part of XHTML compliance: tag closing. Put simply, XHTML elements must always be closed. If you open a tag, you need to close it. We have already discussed the order in which tags should be closed. Now let us consider how to close them. I am assuming that you understand that a tag like <b> is closed by a matching tag with a forward slash "/" just after the opening angle bracket, like this: </b>. But there are a few exceptions, just enough to keep things interesting. The following tags, among others, have no content and therefore no separate closing tag. They are said to be self-closing.
<br />
<hr />
<img />
<input />
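Because XHTML is XML, any XML parser will complain about an element that is never closed. The following is a small Python sketch using the standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical, and keep in mind that "well-formed" is a weaker test than "valid".

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>Line one<br>Line two</p>"))    # False: <br> is never closed
print(is_well_formed("<p>Line one<br />Line two</p>"))  # True
```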
Again I hear someone whispering in the background: "What about the doctype? That is not listed there?"
True. It isn't. But that is for a good reason: DOCTYPE is not really an HTML tag. It is a special declaration that does not describe the content of the page. Rather, it tells the browser and the validator which set of rules (the Document Type Definition, or DTD) the page is written against. Speaking of DOCTYPE, it is required in an XHTML page. To make things simple, XHTML 1.0 has only three DOCTYPEs, and their uses are pretty self-descriptive:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Strict: Follows the strictest XHTML parsing rules and will not validate unless the markup is 100% XHTML compliant.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Transitional: If you are still using some elements or attributes from older HTML markup, like the "name" attribute, this DOCTYPE will still validate.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
Frameset: This DOCTYPE is to be used when you are using frames on your website.
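If you ever need to find out which of the three a page claims to be, the public identifier in the DOCTYPE is all you need to look at. This is a rough Python sketch. The function name xhtml_flavor is made up for this example, and the regular expression assumes the declaration is written exactly as in the examples above.

```python
import re

# Matches the public identifier of the three XHTML 1.0 DOCTYPEs.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML 1\.0 '
    r'(Strict|Transitional|Frameset)//EN"'
)


def xhtml_flavor(page):
    """Return 'Strict', 'Transitional' or 'Frameset', or None if no match."""
    match = DOCTYPE_RE.search(page)
    return match.group(1) if match else None


page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
print(xhtml_flavor(page))  # Transitional
```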
What is that about the name attribute not validating anymore? Well, it's simple: on elements such as a, img, form and map, the name attribute has been deprecated in favor of the id attribute. However, as I said before, if you are using the Transitional DOCTYPE you can use the name and id attributes together, like this:
<a name="something" id="something"></a>
Doing it like that means that older browsers that do not recognize the id attribute will still render the content, albeit oddly (My own opinion is that anyone using a browser older than IE6 or Firefox 2 deserves to have a bad internet experience, but hey, that's just me).
Something I am sure you have noticed already is that my tags are always written in lower case. Upper case tags in XHTML are taboo, because XHTML is case-sensitive and defines all of its element and attribute names in lower case. Stick to lower case and all will be fine. Another thing you may or may not have noticed is that my attributes are not minimized. Attribute minimization is not allowed in XHTML, which means that this is right:
<input type="text" readonly="readonly" />
while this is wrong:
<input type="text" readonly />
The idea is that the attribute value is simply the attribute name repeated, so readonly becomes readonly="readonly".
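Spotting minimized attributes by eye gets tedious, so here is a small Python sketch that does it with the standard html.parser module, which reports a value of None for any attribute written without one. The class MinimizedAttributeFinder and the function find_minimized are hypothetical names of my own.

```python
from html.parser import HTMLParser


class MinimizedAttributeFinder(HTMLParser):
    """Collects attributes written without a value, e.g. <input readonly />."""

    def __init__(self):
        super().__init__()
        self.minimized = []   # (tag, attribute) pairs that have no value

    def handle_starttag(self, tag, attrs):
        # html.parser gives None as the value when there is no "=..." part.
        for name, value in attrs:
            if value is None:
                self.minimized.append((tag, name))


def find_minimized(markup):
    """Return a list of (tag, attribute) pairs that use minimization."""
    finder = MinimizedAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.minimized


print(find_minimized('<input type="text" readonly />'))             # [('input', 'readonly')]
print(find_minimized('<input type="text" readonly="readonly" />'))  # []
```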
Another thing that I'm sure stands out here, at least for me, is the fact that all attribute values are quoted. This is important. The following is wrong:
<input type=text readonly=readonly />
while this is right:
<input type="text" readonly="readonly" />
Either single quotes or double quotes are fine, as long as all attribute values are quoted. My obsessive-compulsive mind goes through stages of preferring single quotes over double quotes and vice versa. But whichever I use, I try to be consistent. God is in the Mark-Up.
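For the curious, unquoted values can be hunted down with a couple of regular expressions. Treat this Python sketch as a quick-and-dirty check rather than a real parser: find_unquoted is a name I made up, and the patterns assume that no attribute value contains a ">" character.

```python
import re

TAG_RE = re.compile(r"<[A-Za-z][^>]*>")               # a start tag, roughly
QUOTED_RE = re.compile("\"[^\"]*\"|'[^']*'")          # a properly quoted value
UNQUOTED_RE = re.compile(r'''\s([\w:-]+)=(?!["'])''')  # name= with no quote after it


def find_unquoted(markup):
    """Return the names of attributes whose values are not in quotes."""
    names = []
    for tag in TAG_RE.findall(markup):
        # Blank out quoted values first so their contents cannot confuse us.
        bare = QUOTED_RE.sub('""', tag)
        names.extend(UNQUOTED_RE.findall(bare))
    return names


print(find_unquoted('<input type=text readonly=readonly />'))      # ['type', 'readonly']
print(find_unquoted('<input type="text" readonly="readonly" />'))  # []
```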
Before calling it a day--and before you fall asleep due to over-exposure--there are just a few minor things to look at. First, the structure of an XHTML page. The basic elements required in an XHTML page follow:
<!DOCTYPE ...>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>... </title>
</head>
<body>... </body>
</html>
They are ALWAYS required in the order shown above. More elements can be added as need be, but these cannot be taken away.
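To check that skeleton mechanically, you can lean on an XML parser again. The Python sketch below assumes the usual XHTML 1.0 layout, where an html root element in the XHTML namespace holds a head (with a title inside it) followed by a body. The function name has_required_skeleton is hypothetical, and the check assumes the page starts directly with its DOCTYPE.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def has_required_skeleton(page):
    """Return True if the page has a DOCTYPE, then html > head > title and body."""
    if not page.lstrip().startswith("<!DOCTYPE"):
        return False
    try:
        root = ET.fromstring(page)
    except ET.ParseError:
        return False
    if root.tag != XHTML_NS + "html":
        return False
    # head must come first and body second, with nothing else beside them.
    children = [child.tag for child in root]
    if children != [XHTML_NS + "head", XHTML_NS + "body"]:
        return False
    return root.find(f"{XHTML_NS}head/{XHTML_NS}title") is not None


PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Hello</title></head>
<body><p>Hello there.</p></body>
</html>"""

print(has_required_skeleton(PAGE))  # True
```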
Last, but not least, always Validate Your Pages with the W3C Validator. This handy tool checks your pages and even tells you how to fix the problems.
Until next time, Happy Validating!
Marc Steven Plotz