

Weblog Entry

Switched

April 20, 2009

Ah, blogging: the new long-form Tweet. This morning I said:

retraining myself not to /> close img, input, and meta tags. It’s an uphill battle.

Which received an instant string of responses asking, in a nutshell, “why?” So I clarified:

because I’m done with XHTML is why. Back to HTML 4.01 Strict for now, then HTML5 whenever that happens.

Which caused more replies and follow-up emails than I was really prepared for. People are still passionate about this stuff, which kind of surprised me.

So I’ll take the luxury of a few more than 140 characters to expand my thoughts. Within the last few months — though I’d been intending to do so for much longer than that — I made the decision to switch all my client work starting point templates from XHTML over to HTML 4.01 and start delivering everything as plain old semantic HTML, minus the X.

The XHTML/HTML choice still seems to be a bit of a dilemma, with most having a fairly established opinion one way or the other. Some love XHTML for its strict syntax and easier learning curve, some despise it for being so misused by the average content producer.

I made the switch more because of overall trends. Six years ago, many of us thought XHTML would be the future of the web and we’d be living in an XML world by now. But in the intervening time it’s become fairly apparent to myself and others that XHTML2 really isn’t going anywhere, at least not in the realm that we care about. For me, a guy who builds web sites and applications for clients that have to work in today’s browsers, XHTML2 is a non-issue. No browser support, no use to today’s web authors. End of story.

And as a result, there seems to be a shifting of consensus from “XHTML is the future of the web” to “XHTML2 is pretty much never going to happen, looks like HTML5 is the pony to back now”. For proof, just compare how fast the major browsers are implementing HTML5 features now to the significant steps they’ve made with XHTML2 over the past 5 years (hint: crickets are chirping).

Not to say that HTML5 doesn’t have its own set of problems; I’ve been seeing a lot of the same blog posts you have regarding the difficulty of implementation today, and the working group issues plaguing its development. It’s got a long ways to go. But if I were to make a bet on which of the two languages I’ll be writing in ten or fifteen years, HTML5 seems like a safer bet.

So given that I’ve lost faith in XHTML, it doesn’t make much sense to continue propagating it. I’m not ready to start working through the contortions needed to make my sites work with an HTML5 DOCTYPE yet, which leaves me with the most recent implemented version of the language. I may start writing HTML5-ready HTML 4.01 like Clear Left did for their UX London site, but until I get a better sense that HTML5 has arrived, 4.01 will do me just fine for the next four or five years.

Update: Jeff Allen points out in the comments that UX London does in fact use the HTML5 DOCTYPE. So, to clarify the above point: I may start writing pseudo-HTML5 markup using classes like Clear Left does, however I’m not quite ready to use the actual DOCTYPE as they did.

Keep in mind this is really just one opinion in the fray. I’m not necessarily right, I’m just offering my personal response to what I see as the trend.


April 20, 12h

Thanks for more than 140. ;) I have to say that I heartily agree with you and have been toying with the same idea myself lately. If XHTML is never going to be what it should have been, why should I write it? HTML 5 looks much more likely at this time.

When I started web development, before I knew of XHTML, I closed tags. A p, li, whatever had a matching closing tag. And I will always do that. But I can see having trouble losing my habit now of closing ALL tags!

Get well!
Stef.
@stefsull

April 20, 12h

I think it makes sense to ignore XHTML for now, but I don’t see the point in intentionally leaving out closing tags just because HTML4 is sloppy. Why not validate HTML4, but use the good parts of XHTML? In my opinion, /> in a tag shows your intent to close the tag, so it’s clear to anyone reading your HTML (and to browser rendering images) that the following tags aren’t being nested. Why not be explicit about your intent?

April 20, 12h

This mirrors thoughts I’ve been having on this very topic lately. I was editing some items in the header of my blog, and when I habitually closed the tag with /> I stopped to think “Why am I still doing this?” especially in the light of HTML5’s obvious center stage in the next phase of website production instead of XHTML2.

As such, I’ve been considering the jump sideways to HTML 4.01, saving myself from a few hundred / in the process.

The only catch with such that I can think of is potential problems for those producing sites for clients that have rich text editors that may exist in their CMSes. In some cases these format what the clients enter to XHTML, which could cause validation issues down the road.

All that said, although I don’t think you need to justify your choice to use HTML 4.01, I agree with you on the topic.

TJ Kelly says:
April 20, 12h

I really admire your simple, logical approach. I’m pretty set in my ways, sometimes to a fault. I think, given the “next 4 or 5 years” it will take before HTML5 is widely accepted, I will probably not switch out of XHTML for at least another 2-3 years.

For my money, there is no hugely compelling motivation to make any change yet. I appreciate your matter-of-fact point of view.

Thanks!

TJ Kelly
http://www.tjkwebdesign.com/

April 20, 12h

Oops - “browser rendering images” -> “browser rendering engines”.

April 20, 12h

I find this very strange.

Somehow you feel like HTML5 ‘is here’ while I would argue that neither XHTML2.0 nor HTML5 is here.

The ‘improvements’ offered by HTML5 are few and browser vendors have jumped into implementing it without really considering the future properly.

You are now taking a step back in order to ‘potentially take the step forward’.

Most people I speak to are not happy with HTML5 to a great extent and that specification does not seem to be heading forward that rapidly either (in other words one could argue both XHTML2.0 and HTML5 are pretty dead).

Either way, I find it extremely strange for you to now be opting for a less semantic and less standards compliant technology in order to reap the ‘potential future rewards’, which you will not be able to justify are coming.

Ted Graf says:
April 20, 12h

Thanks for your opinion on this Dave.
I’ve long been an advocate for XHTML, but I’d have to agree with your take.

@jon way to be explicit ;)

April 20, 12h

@Jason Grant: you’ve been misinformed. There is no discernible difference between HTML4.01 Strict and XHTML 1.0 Strict. There is no better or worse or less semantic or less standards compliant between the two.

trovster says:
April 20, 12h

I started doing this a couple of years ago. When I first moved to semantic web-development I abused the XHTML route. But learning semantic HTML and the reasons behind XHTML, it seemed obvious to me there was no reason to stick to XHTML - aside from the false sense that it has a “strict syntax and easier learning curve” – the truth is that it really doesn’t!

The argument always arises that you must close tags, whether it be single elements, such as img, or the corresponding end tag for a list-item or paragraph. Oh yeah, and the fact that it is lowercase.

Here’s the news – you can write HTML that is just as semantic, correctly nested, closed (aside from img tags and their ilk) and lowercased as you can in XHTML. And if you’re doing it correctly, one unescaped ampersand won’t render your page useless. HTML 4.01 Strict all the way – until HTML5 is established and well supported.

April 20, 12h

Thanks for sharing your thoughts. I’ve had the same impression that XHTML->XML dream is fading and have started making my XHTML HTML5-ready; though I’ll wait until HTML5 is a little more mainstream before making any specific DOCTYPE changes.

Peter says:
April 20, 12h

The main reason I wouldn’t switch back to HTML 4 is the set of rules XHTML imposed on us.

I like to think about HTML as a computer language, not a markup soup. Because when we decide to drop adding “/]” we actually leave some elements open. Which ones can be open? And why are they not nested?

The next step is going back to monstrosities like this:

[ul]
[li]One
[li]Two
[/ul]
[p]Lorem ipsum.
[p class=sth1 sth2 isthisattributeorclass]And lorem.

(Used “[” because comment engine seems to cut all HTML tags)

Because why the heck not… if HTML is not XML, we can do anything.

I understand XHTML2 is not going to be our future, but I’d rather stay with XHTML5 and write proper, XML-compatible markup.

April 20, 12h

@Jason: Actually, I should clarify. There is better or worse. HTML4.01 Strict is better than XHTML 1.0 Strict as it’s more widely accepted and more “fail-safe”. That is, your page will likely render even if errors exist within your page. Serving up an XHTML document as HTML doesn’t provide any benefit and is likely to allow XML errors to creep in. (The google analytics code on your page, for example, wouldn’t run.)

April 20, 12h

Props to you. It’s a bold, not quite farsighted but rather progressive move. Your reasons make all the sense in the world to me, but still ring as drastic measures.

As sites developed in XHTML won’t necessarily need to be converted to HTML 5.0 at a later date. Nor would 4.01 sites for that matter.

It’s almost like a race. We’ve been standing for a while waiting for the XHTML 2 gun to go off. It sounds like you’ve decided to put a foot back and get in your running stance while waiting for a new gun to go off. Just like in life, it’s not the weapon that causes the damage, it’s the people wielding it.

I’m not sure I have any more faith in HTML 5 practitioners than I do in its competitors. I’ll keep standing for now, and hope that my legs don’t give out before you take off running.

trovster says:
April 20, 12h

@Jason Grant: XHTML1.0’s variants (Strict, Transitional and Frameset) are pretty much simple XML reformulations of the HTML4.01 variants of the same name. As @snook pointed out, I think you’ve been misinformed.

My question to you is; are you sending the correct content-type for your XHTML – so it actually is XHTML and not just broken HTML masquerading? Do you CDATA-escape your internal JavaScript (if you have any)?

Dave S. says:
April 20, 12h

@Jon Galloway - I don’t disagree with your core point, it just doesn’t feel necessary to me. img tag closing always baffled me, for one.


@Jason Grant - time to re-read what I wrote. At no point did I make the claim that HTML5 was “here”, far from it.

April 20, 12h

@Peter - Just because it’s allowed doesn’t mean you have to use it. Using tables for layout in XHTML is technically allowed but not recommended for well-known reasons.

As an author you decide what’s ultimately right or wrong. And you can actually still use the “/” and still validate as HTML5.

Divya says:
April 20, 12h

Wow, I wrote on almost the exact same thing here: http://nimbupani.com/blog/the-long-road-from-xhtml-to-html.html

Dave S. says:
April 20, 12h

@Peter - The advantage of tag soup is you don’t have to have a CS degree to write it. I’m not necessarily advocating that professional web developers start writing it, but surely the millions of amateurs out there benefit from the looseness.

I don’t agree that it’s a slippery slope from <img> to your example; I’m choosing to leave off the closing tag, not doing it cause I don’t know better. (Your argument is valid, I just don’t quite see it the same way.)

@Antoine - Thanks, but it’s really not that drastic or unique. I’m following in other’s footsteps on this one.

April 20, 12h

As far as I’m concerned it’s less a matter of what I write in, and more that I write it well.

I can’t control the direction of the web and its coding standards very much, but I can control the degree to which I work hard to ensure I write well-structured, semantic and correct markup.

More focus on spelling and grammar and less on sentences, I say.

April 20, 12h

I agree! Why fake XML when we are serving pages as text/html?

April 20, 13h

Absolutely agree, I’ve been doing this for a while now. With current browser support XHTML can’t be served as true XHTML, so using an XHTML doctype is meaningless. As others have pointed out, there’s nothing “less semantic and less standards compliant” about HTML 4.01.

Jeff Allen says:
April 20, 13h

Dave,
Thanks for sharing your thoughts on this.

If you look at Clearleft’s UX London site it’s using the HTML 5 doctype and not the HTML 4.01 Strict Doctype. It may not be a fully HTML 5 based site as they substitute some classes for tags (like .section instead of <section>) but Jeremy did a good job of explaining that choice, and I think it makes sense.

I agree that you can write semantically correct markup in any of them (HTML 4.01, XHTML 1.0 or HTML 5) but wouldn’t the wiser move be to build in the hybrid HTML5 like what Jeremy and Clearleft are doing rather than taking a step back to take two steps forward?

Neal G says:
April 20, 13h

I completely agree with you. I also agree with Ian Hickson regarding XHTML. I always seem to get strange looks from fellow web developers when I mention that I only use HTML 4.01 Strict.

While I would love to serve all my content as XHTML using the correct MIME type, it’s nearly impossible to ensure ALL my content validates at all times. I mean sure, I can make my site’s template validate, but what happens when a user types in some strange HTML into the comment area of my site and de-validates it thus giving the entire page the ‘yellow screen of death’?

IE6 & IE7 do support XHTML if you use a non-standard MIME type and clone the entire DOM as XML - http://www.nealgrosskopf.com/tech/thread.asp?pid=1

April 20, 13h

I agree with you that XHTML 2 is unlikely to arrive anytime soon, but the consequence of switching back to tag soup is wrong. HTML 5 may be both written in non-well-formed as well as well-formed syntax.

April 20, 13h

I am really stoked to see this discussion going on today. It was recently suggested to me to switch from using XHTML to HTML. I did a bit of searching around and while I did find some reasoning behind the advice I never really found any concrete information.

It’s awesome to see some well respected designers and developers adding their comments and I think this article has already proved to be more helpful than anything I have found up until now.

@Dave S - Thanks once again for an awesome article that has def educated me a bit more on the subject.

bruce says:
April 20, 13h

I learned to code when xhtml was the next big thing, so I like lower case, quoting attributes and closing tags. They just feel more aesthetically pleasing to me.

But it seemed to me that HTML 5 was the way forward, not xhtml 2, so that’s what I decided to use. And, as luck would have it, I can have an html 5 doctype and still use lower case, close all tags (even image tags) and quote my attributes.

Having my doctype cake, and eating it. Huzzah!

April 20, 14h

@Peter - I’m going to agree with Dave that switching from XHTML “back” to HTML doesn’t mean a practiced developer is going to then start leaving open list tags, etc.

I agree that it could mislead a new coder to make a mistake like that, but if they’re using a validation service, they’ll be notified of their error. If they’re not validating, they’re probably not concerned about standards, and likely won’t be coding properly in HTML or XHTML regardless. They would be equally confused why img has a / even though li doesn’t, for example. In both HTML and XHTML the only way someone will know when to use the proper technique comes with experience.

April 20, 14h

Good to see another web designer moving back to HTML from that myth that was XHTML.

@Jon Galloway - HTML 4.01 isn’t any more sloppy than XHTML, it simply has better error handling for the real world. I doubt I’d be far off saying 99.9% of all XHTML sites out there aren’t even XHTML anyway and don’t benefit from the stricter XML error handling.

April 20, 14h

To put your mind at ease regarding the HTML5 DOCTYPE, as far as I can tell, all major browsers (including IE6) go into “standards mode.”

April 20, 14h

Here’s another good read for those who think HTML is tag soup, that XHTML is more semantic, cleaner, future compatible etc:

http://www.webdevout.net/articles/beware-of-xhtml

trovster says:
April 20, 14h

@bruce: You can “lower case, quoting attributes and closing tags” (with a caveat of img, link, meta, br etc – although, when (ab)using XHTML as text/html, you’re (most-likely) using “HTML-compatible spacing” between the last attribute and the closing slash and this can be used in HTML4.01 too).

@Jeff Allen: I really don’t see moving from XHTML to HTML4.01 a ‘step-backwards’, let alone two. Secondly, you could add in the HTML5 class-style convention, but one should take the time to understand HTML5 closer (me included), and grok .section and .article etc, as they affect heading levels and such can not just be added without knowing their (full) behaviour. The movement from XHTML1.0 to HTML4.01 should be a relatively simple one (as I mentioned before, one is simply an XML representation of the other), but adding in anything else would require more learning before implementing, in my opinion.

@Dale Mugford: I agree the whole topic is pretty much moot – the site and content should speak a lot louder than the mark-up choice and even validation. However, people should learn the technology they’re defending, and most advocates of XHTML seem to incorrectly argue better semantics, closing tags, lowercase tags and all-round stricter code by using XHTML over HTML – but they’re sending their content as text/html, and/or don’t know about the caveats of application/xhtml+xml from browser support (looking at you know who), draconian error-handling (great in XML, but doesn’t belong on the have-a-go philosophy of the web), CDATA with JavaScript and even the different behaviour of CSS on html/body!

April 20, 14h

Thanks @Snook and @Dave for comments on my comment.

I see XHTML as more semantic as it is more strict in nature and there is less room for interpretation on what each snippet of code might be (malformed XML or actual XML, etc.).

I have been using XHTML1.0 Strict for well over 5 years and have rarely come into any major cross-browser rendering problems.

We worked on major sites with it such as Tesco.com and there was not much problem even with advanced AJAX interfaces which were to be in strict conformance with W3C standards and also have valid XHTML at every point.

I run Google Analytics code on my site and it all works very well under XHTML1.0 Strict.

I like the concept of Interface as an API, which is much harder to achieve with HTML4.01.

Why not have a stricter HTML standard? What’s the problem with XML? Do we all prefer being sloppy?

trovster says:
April 20, 14h

@Jason Grant: I noticed you’re sending text/html as the content-type (even to Firefox) for the site you’re linking to from your avatar. You are giving the browser (some might argue, broken) HTML and *not* XHTML. Try sending the pages you create as XHTML using the correct content-type – application/xhtml+xml – and you’ll see the point people are trying to make. I suggest reading the following two articles:

http://hixie.ch/advocacy/xhtml
http://friendlybit.com/html/why-xhtml-is-a-bad-idea

April 20, 14h

@ Jason Grant Are you sure you’re using XHTML and not XHTML as text/html? Sending it as such is actually invalid HTML (tag soup). From your comments it would appear you’re sending it as this.

April 20, 14h

@Jason: The problem with XML is it hurts users if you get it wrong. Tesco.com, which you cite, can be made to very easily throw a fatal error: see http://www.tesco.com/books/search.aspx?Ntt=%EF%BF%BF&Ntk=primary&VSI=1&Ntx=mode%2Bmatchall&Nty=1&N=0 . The instant you start allowing user content, it becomes virtually impossible to ensure what you output is always XML, and if you’re using XML, if you get it wrong, your users will never forgive you. That took me a whole thirty seconds to break, mainly looking for where I can enter input.

Also, as long as you care about IE support, you need to serve it as text/html to IE, so you are limited to a specific subset of XHTML for compatibility with legacy HTML parsers.

Valid HTML is no more sloppy than XHTML: it has well-defined parsing behaviour.

Again, the problem with XML: it’s very hard to do correctly (as proven by sites like Tesco.com being broken as above), users don’t like seeing error messages, and it isn’t actually any easier to parse than HTML (regardless, the complexity of browsers is primarily in rendering/scripting: parsing is easy).
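
To make the error-handling difference concrete, here is a minimal sketch using only Python's standard library. It assumes `xml.etree.ElementTree` as a stand-in for a strict XML parser and `html.parser` as a stand-in for a forgiving HTML parser; neither is what a browser actually runs, and the helper names are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# One unescaped ampersand, the classic way user content breaks XML.
MARKUP = "<p>Fish & chips</p>"


class TextCollector(HTMLParser):
    """Collects text nodes the way a forgiving HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_xml(markup):
    """Return the element text, or None if the XML parser rejects the input."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        # Draconian handling: one syntax error and nothing is returned at all.
        return None


def parse_as_html(markup):
    """Return whatever text the tolerant HTML parser can recover."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


if __name__ == "__main__":
    print("XML: ", parse_as_xml(MARKUP))
    print("HTML:", parse_as_html(MARKUP))
```

The same input fails outright on the XML path and comes through intact on the HTML path, which is the trade-off being argued over in this thread.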

April 20, 14h

@Jason:

“I like the concept of Interface as an API, which is much harder to achieve with HTML4.01.”

How does XHTML 1.0 help? It contains the exact same elements and attributes. There is nothing semantically different between XHTML 1.0 and HTML 4.01.

The only difference between XHTML 1.0 and HTML 4.01 is that XHTML 1.0 has defined behaviour for syntactically invalid markup (i.e., throw a fatal error) whereas HTML 4.01 has none (HTML 5 adds that, though). Everything else is not an advantage of XHTML 1.0, and is entirely possible with HTML 4.01.

trovster says:
April 20, 14h

@Geoffrey Sneddon: Actually, XHTML should be easier to parse as it should work with generic XML parsers. However, I do not know anyone who uses XHTML that way. Happy Birthday by the way!

Mark B says:
April 20, 14h

From my own research, I think a much more elegant solution is going to wind up being XML and XSLT being rendered into HTML4/5/6/what have you.

At least from a UI perspective, it’s very tempting to have one XSLT stylesheet that encapsulates the entire view layer of a website, including very simple business logic.

I like to think of it as the “items in cart” problem. If you have 1 item in your cart, you want to display “1 item in cart”. If you have 2, you want to display “items”, plural. Right now, you’re probably spending cpu cycles on the server to determine that; XSLT allows you to push that code off to the client.

Now imagine having one XSLT stylesheet that holds the presentation layer for your entire site, and then caching that file on the client browser for a long period of time. Every subsequent page the user goes to gets _only_ the XML required to render that page; ie, only the actual data on the page, instead of any markup at all.

XHTML doesn’t really help with the above scenario, whereas XML/XSLT can output HTML5 easily.

April 20, 15h

@Ryan
I am following the standards on this one regarding outputting XHTML.

@Geoffrey Sneddon
There are quite a number of differences between HTML4.01 and XHTML1.0, the main difference being that proper XHTML1.0 is XML, which is quite a powerful concept as the raw XHTML1.0 output can be directly manipulated with things like XSLT or XQuery as it’s more popular now. Otherwise we have to jump through hoops to parse public data using shitty, custom APIs which have separate implementations for each back end language as opposed to plugging directly into the output itself.

@trovster
I am having a current project I am working on which requires some content scraping from a public site and adding more value to it for intranet. Luckily the content on the public site has been marked up with XHTML1.0 strict and parsing it into whatever I need will be a doddle.

April 20, 15h

@Kyle Weems
Unfinished list items in HTML4.01 Strict are valid code.
This is exactly the kind of problem I have with HTML4.01.
See a sample of perfectly valid HTML4.01 Strict code I just wrote at
http://flexewebs.com/semantix/examples/bad/html4.01.html

Parsing that with automated tools is much harder than XHTML1.0 Strict and that’s essentially where one of the major differences is.

Mark says:
April 20, 16h

I’m one of those grumpy old folk who never wholly moved on to XHTML anyway. I like it a lot and some things in the XHTML2 specification are very nice, but when I found out back in the original hype that with the then current browsers I wouldn’t be able to serve my pages with the correct headers I got all disgruntled and pouty.

I still /> self-closing tags and double-quote attributes etc though.

April 20, 19h

Hum, did anyone notice that HTML5 allows tags that don’t require a closing tag to end with “/>”? And I mean HTML, not XHTML5.

If I remember well, the idea behind this move was that everybody was already doing it, every browser supported it, and it wasn’t harmful in anyway. So given the current state of things it was more trouble to disallow it and explain to everyone why they need to remove the / they were told to add a few years before than allow it as an optional terminator for tags that do not require a closing counterpart.

So, with HTML5, it really doesn’t matter whether you write <img> or <img/>. You can do as you like.
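
A quick way to see the two spellings treated alike is the sketch below. It assumes Python's standard-library `html.parser`, which is forgiving but is not a full implementation of the HTML5 parsing algorithm; the `StartTags` class and `start_tags` helper are made up for this illustration.

```python
from html.parser import HTMLParser


class StartTags(HTMLParser):
    """Records every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # Also called for the "<img ... />" spelling, because the default
        # handle_startendtag() forwards to handle_starttag().
        self.seen.append((tag, attrs))


def start_tags(markup):
    parser = StartTags()
    parser.feed(markup)
    parser.close()
    return parser.seen


if __name__ == "__main__":
    print(start_tags('<img src="a.png" alt="x">'))
    print(start_tags('<img src="a.png" alt="x" />'))
```

Both calls report the same single img start tag with the same attributes, so for this parser the trailing slash is purely a matter of taste.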

Rob says:
April 20, 20h

More importantly, you should move back to your blog and dump Twitter. I mean, Oprah is now tweating, and they call it “tweating” for gosh sake.

Real men don’t tweat.

April 20, 23h

@trovster, @Jason: But there are equally off-the-shelf HTML parsers that parse into an XML object model, so you can use an XML tool-chain on it. libxml includes an HTML parser, for example.
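
As a rough illustration of that idea, the sketch below turns HTML with omitted end tags into an XML tree using only Python's standard library. It is not libxml and nowhere near a complete HTML parser: it assumes just two implied-end-tag rules (a new li closes an open li, a new p closes an open p) and a short list of empty elements, and the class and function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Elements that never have content or an end tag (partial list, an assumption).
VOID = {"br", "hr", "img", "input", "link", "meta"}
# Opening the key element implicitly closes an open element listed in the value.
IMPLIED_END = {"li": {"li"}, "p": {"p"}}


class SoupToTree(HTMLParser):
    """Builds an ElementTree from HTML that omits some end tags."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")  # synthetic wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Close a still-open sibling whose end tag was left out.
        if self.stack[-1].tag in IMPLIED_END.get(tag, ()):
            self.stack.pop()
        attributes = {name: value or "" for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        for depth in range(len(self.stack) - 1, 0, -1):
            if self.stack[depth].tag == tag:
                del self.stack[depth:]
                break

    def handle_data(self, data):
        if not data.strip():
            return  # skip whitespace-only text between tags
        current = self.stack[-1]
        if len(current):
            # Text after a child element belongs in that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_xml(markup):
    builder = SoupToTree()
    builder.feed(markup)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml("<ul><li>One<li>Two</ul><p>Lorem ipsum.<p>And lorem."))
```

Once the tree exists, the usual XML tooling (XPath-style queries via ElementTree, serialization, and so on) works on it regardless of how loosely the source was written.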

April 21, 00h

@ Jason Grant And what’s the standard? I only ask because everything you’ve said suggests you’re sending XHTML as text/html, which isn’t proper XHTML and gives you none of the benefits XML would give over HTML 4.01. Neither does XHTML 1.0 Strict make improvements to semantics over HTML 4.01 Strict. If you’re using proper XHTML served as application/xml then your sites won’t even work in IE and I can’t imagine Tesco accepting that.

April 21, 02h

@everyone
Self-closing tags in HTML4.01 are not allowed anyway according to the validator.

Unclosed list items are allowed even in HTML4.01 Strict.

Example: http://flexewebs.com/semantix/examples/bad/html4.01.html

These are things which allow newbie developers to code up very loosely and therefore create interface code which is very ambiguous and requires more machine programming to understand properly.

Anton says:
April 21, 02h

Please, keep closing non-paired tags with “/>”! Your documents will still validate as html 5!

Pete says:
April 21, 03h

I’m sticking with XHTML. It’s not perfect but it has better portability, mobile devices, web scraping etc.

While you may not always be developing for a specific device, if you have to add support later then XHTML will be a better place to start.

And, I expect it will work better with the different DOM apis that are out there.

Dan says:
April 21, 05h

Nice article, Dave.

I pretty much totally agree with you but I don’t think there is enough potential benefit for me to make the same switch just yet.

Dan

Robert says:
April 21, 06h

I never made the jump to XHTML because, at least by the time I wanted to make a decision about whether to use XHTML, the web standards movement’s embracing of XHTML is fundamentally broken. I know everyone anticipated XHTML to be the next big thing, but 5 or 6 years in, the most popular browser didn’t support REAL XHTML. Even 9 years in, the newest version of the most popular browser doesn’t support REAL XHTML. I laid out my longer argument in my own blog about the subject. Click my name for the link.

h3 says:
April 21, 07h

I did put my bet on XHTML 4 years ago when I started web dev, it seemed so promising. But when I began doing actual work for clients (about three years ago) with it, I quickly realized that it was DOA.

You just can’t serve XHTML as XHTML because of IE, so what’s the point of writing XHTML? Argument over.

However, for some reason, I still have a hard time dropping the closing tags.

April 21, 08h

Dave - why not go the whole hog and move to HTML5 right now? There really is nothing to be lost, and you get the benefit of being able to remember how to write a doctype without googling for it.

Sites like UX London not using the new HTML elements like section and header doesn’t mean they are not valid or semantic (okay, they’re not as semantic as they could be if they used new elements, but that’s not an automatic fail).

For the most part, you can take any HTML 4.01 Strict site, swap out the doctype for an HTML5 one, and have it validate straight away.

Dave S. says:
April 21, 09h

@Matthew Pennell - while there may be nothing to be lost, I don’t feel like it’s remotely close to the right time to start delivering client sites in HTML5. Experimental and personal sites, maybe.

Shelley says:
April 21, 11h

I don’t necessarily agree with your take that XHTML 2 is “dead”, but we’ll leave that aside…

Your writing and this discussion would have more meaning regarding the HTML versus XHTML debate if it weren’t for one thing: the HTML5 effort encompasses both HTML and XHTML serializations.

In other words, my site is valid XHTML5, right at this moment, and I still can make use of inline SVG and RDFa. In fact, I do make use of both, and it validates (using a test schema for the HTML5 validator that supports SVG and RDFa).

So, if you’re concerned about the future of XHTML dead ending with XHTML 2.0, you can still support XHTML, and the advantages of XHTML, with the HTML5 effort.

If you’re tired of closing tags and escaping ampersands, or whatever, and aren’t interested in the use of RDFa or any future XML-based vocabulary, then by all means, use the HTML serialization.


Dave S. says:
April 21, 13h

@Shelley - “the HTML5 effort encompasses both HTML and XHTML serializations.”

It’d be a fair charge to say I haven’t been following the specifics of HTML5 very closely. In fact, that’s news to me. I’m not sure if it actually changes anything, but I didn’t know that.

Shelley says:
April 21, 16h

Dave, most people don’t know – the HTML5 working group keeps referring to “HTML”, which gives an impression that HTML5 is specific to HTML, only.

As you can imagine, lots of confusion, too, about XHTML5 versus XHTML 2, and differences in the DOM, namespaces, et al.

Do you think the W3C could possibly make it all more confusing ;-)

April 21, 16h

Sorry, I know that this is naughty of me, but I don’t have the time right now to read all 47 comments. So if I repeat something, then sorry, my bad.

XHTML2 has some great aspects which aren’t yet in HTML5 and that makes me sad. But I do agree that HTML5 will undoubtedly be the new standard and XHTML2 will die alone.

@Shelley - “the HTML5 effort encompasses both HTML and XHTML serializations.”

This is correct and exactly what I was about to point out. I love the same things about XHTML that you do Dave and I wasn’t about to give these things up for HTML5.

I went and found Ian Hickson on IRC and pointed out what I didn’t like about HTML5. He responded that HTML5 is actually (X)HTML5. Basically it all comes down to the mime-type.

Of course, this creates problems with IE. (X)HTML5 as XML/XHTML is like XHTML1.1. IE can’t handle XML/XHTML mime-types so for a long time we’ve just ignored XHTML1.1.

What I am considering doing is actually going back to XHTML1.1 and serving it as HTML to browsers which can’t handle XHTML. This way I’ve still got my XHTML goodness and I can still use (X)HTML5-esque classes.

I know that this isn’t always considered to be the best approach but I think that we’re all pretty sick of IE.

Obviously you need to be careful of non-validating content though.
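
For what it's worth, the "serve it as HTML to browsers which can't handle XHTML" idea usually comes down to inspecting the request's Accept header. The following is only a sketch under assumptions: the function names are hypothetical, the header strings in the usage lines are illustrative rather than captured from real browsers, and a wildcard such as */* is deliberately not treated as XHTML support.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def accepted_types(accept_header):
    """Map each explicitly listed media type to its q-value (default 1.0)."""
    result = {}
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unreadable q-value: treat as not accepted
        result[media_type] = quality
    return result


def choose_content_type(accept_header):
    """Pick the XHTML MIME type only when the client names it explicitly."""
    types = accepted_types(accept_header or "")
    if types.get(XHTML, 0.0) > 0.0:
        return XHTML
    return HTML


if __name__ == "__main__":
    print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(choose_content_type("image/gif, image/jpeg, */*"))
```

The caveat in the comment above still applies: any client that receives the XHTML MIME type will also get XML's fatal error handling, so the output has to be well-formed every time.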

April 21, 17h

What timing! Dave, I found your post because I’m embarked on the same poignant turn in the road and was re-reading all the debates today, starting with Ian Hickson’s mostly reasonable rant [http://hixie.ch/advocacy/xhtml] and ending up with this remarkably well- and instantaneously-linked post of yours.

Having Gone Into The Light with XHTML a few years back, I am now blushing, turning, and Going Back Into The Other Goddam Light with HTML4. The last thing I’m going to do is give up my fetish for semantic, well-formed markup and separation of layers for the sake of a lousy doctype, as others here suggest might be my fate, so other than losing a few virgules I don’t expect to change my ways. But I really came to love typing that X!

Still, it’s tiresome to hear the refrain that there’s no advantage to using XHTML unless you’re serving it as application/xml. My motivation in outputting XHTML-Strict markup (yup, served as text/html) was to produce future-friendly sites that would require next to no modification when the XML dream came true. But it has always grated to knowingly mis-use a tool in anticipation that it will someday be used properly, and I’m tired of the wait.

I’m maintaining a lot of sites I’ve created using WordPress and I thought that would be a major hurdle since WordPress currently outputs XHTML markup only, until I re-read WP’s XHTML page, at the bottom of which [http://codex.wordpress.org/HTML_to_XHTML#Problems_with_XHTML] it reminds us that XHTML served as text/html is browserized as plain HTML and “In order to avoid this problem and output standards compliant code you can use the XHTML to HTML wordpress plugin.” [http://www.kilroyjames.co.uk/2008/07/xhtml-to-html-wordpress-plugin] I tried it, it works, ridiculously simply with a preg_replace() and an ob_start().
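
The plugin's own pattern isn't reproduced here, so the following is only a guess at the general technique: buffer the page, then run one regular-expression replace over it. It is sketched in Python rather than PHP, and `xhtml_to_html` is a hypothetical name.

```python
import re

# Optional whitespace followed by the "/>" that ends an XHTML-style void tag.
_SELF_CLOSING = re.compile(r"\s*/>")


def xhtml_to_html(markup: str) -> str:
    """Rewrite XHTML-style ' />' tag endings as plain HTML '>'."""
    return _SELF_CLOSING.sub(">", markup)
```

This is deliberately naive: it would also rewrite a "/>" that appears inside a script, a comment, or an unquoted attribute value, so a real filter would need to be more careful.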

Farewell, XHTML-Strict, we knew ye well.

HTML5? That will be nice when the time comes, but I’m loath to hold my breath so long again.

April 21, 23h

The only real benefit to having an XHTML/XML document is that it can be properly parsed by an XML parser. In theory, that should make manipulating the document easier and rendering the document less error-prone.

The reality is though, as a third party developer, since most sites don’t actually properly validate as XML, you can’t make the assumption that they are parsable, which basically means you have to treat most things as HTML anyways and allow for some slop.
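
As a minimal sketch of what "allow for some slop" tends to look like in practice, assuming Python's standard library and hypothetical names (`TagCollector`, `list_tags`): try a strict XML parse first, and fall back to a lenient HTML parser when the markup is not well-formed.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient parser that records start tags, even in malformed markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def list_tags(markup: str):
    """Return (mode, tags): strict XML if possible, otherwise lenient HTML."""
    try:
        root = ET.fromstring(markup)
        return "xml", [el.tag for el in root.iter()]
    except ET.ParseError:
        # Not well-formed XML, so treat it as tag soup instead.
        collector = TagCollector()
        collector.feed(markup)
        collector.close()
        return "html", collector.tags
```

Because the fallback path has to exist anyway, the strict path buys a third-party consumer very little, which is the point being made above.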

As another commenter pointed out, you actually can’t even serve an XHTML web page with a mime type of text/xml (which is technically required), as some browsers will barf and try to save the file to disk instead. So right now we have some weird pseudo XHTML going on (or as someone said, XHTML served as HTML basically).

I don’t know enough about HTML 5.0 to even comment, but I think it’s pretty clear that XHTML hasn’t been the end-game that everyone hoped it would be. I’d like to think that the skills required to construct a valid XHTML document will only benefit someone in the future working on HTML 5.0 or whatever the new standards become, but who knows.

April 23, 14h

Talk about looking left when I should have been looking right! Great post Dave, certainly proved to me that I’ve been working with my XHTML blinders on for far too long.

61
Bernhard Schulte says:
April 23, 15h

Some good reasons for sticking with HTML by Tina Holmboe from W3C XHTML working group.
http://www.dev-archive.net/articles/xhtml.html

Will says:
April 24, 00h

Is it really an either/or?

XHTML: I work at a web agency. We use the DOM to store data (well marked up visible “data” not goofy hidden stuff) and when I need to do something with that data, say, submit it to a server function with JS, I want to ensure that it’s XML happy already. Handily, I can use the validator to find out. I understand that I’m welcome to close my tags in HTML 4 but I don’t think the validator will tell me when I haven’t. We make a lot of stuff. XHTML Strict is a good standard across all of our projects as it works for brochureware and apps.

HTML: Amen. Power to the people.

April 30, 04h

What are the problems with HTML4 and XHTML1?

HTML permits tag soup and the error handling is inconsistent across browsers.

XHTML, in its strict sense, doesn’t work in the real world at all. XHTML as HTML is a bit of a fudge and doesn’t deliver (m)any practical benefits.

However, HTML5 brings us two wonderful things.

1) HTML5 as XML. If/when we can ever serve XHTML properly, then the opportunity is there for us. XML is great for its simple parsing options.

2) Standardised error handling! Finally, we should have consistent handling of HTML5 errors (incorrect nesting, missing angle brackets, etc.) because the error handling has been specced for the first time. That takes away one of the biggest problems with HTML.

So, in conclusion, when implemented, we’ll have consistent error handling in HTML, and we’ll have an XML flavour available for those who need it.

So just pick the most suitable for yourself and get on with the job!

64
Reinmar says:
May 06, 06h

Good to see more people offering balanced views to the effect that XHTML is perhaps not the Holy Grail after all. I wish more folks would actually look at the technical facts and real-world status quo instead of clinging (often aggressively) to XHTML just because they’re emotionally invested in it.

In XHTML vs. HTML discussions, one of the common myths involved – apart from “it’s more semantic than HTML” – is that XHTML is somehow more beneficial to mobile devices. There’s a lot of evidence to the contrary. No mobile browser vendor can afford to exclude HTML 4.01, whether valid/strict or sloppy tag soup, since that would exclude the majority of the Web. Therefore, plain HTML is in no way disadvantageous.

Simon Pieters conducted some revealing tests: http://simon.html5.org/test/mobile/
Results: http://simon.html5.org/articles/mobile-results

The relevant bit here is summarized in the result notes: “… most [mobile browsers] seem to think that application/xhtml+xml is equivalent to text/html”

Note that I’m not talking about XHTML Basic or XHTML-MP, which are specific to mobile sites, and thus not used commonly for sites aimed at modern desktop browsers anyway.

Murnau says:
May 07, 11h

Argh… a few years ago I needed some months to get used to the new XHTML. And now?

But it’s okay. If I can see the advantages for me, I can turn around again. Until then I’ll just watch the developments.

May 13, 16h

“HTML permits tag soup”

No, it’s website producers who permit tag soup.

Using HTML neither allows nor forces you to get sloppy. Only you can allow yourself to get sloppy.

So don’t get sloppy.

I have never known the W3C validator to overlook malformed HTML-Strict markup. With that robot to catch me when I slip, I have no fear walking into the scary swamps of HTML.

Try running the validator on XHTML-Strict sites you come across and you’ll see how sloppy developers can get even using the shiniest tool.

May 26, 08h

I’m sorry, but I’m just not following your logic here. What is gained by moving away from XHTML and back to HTML 4.01? It sounds like you’re suggesting that the only way to prepare for this new standard is XHTML 1.0 -> HTML 4.01 -> HTML 5 rather than XHTML 1 -> XHTML 5.

I can respect your disappointment in the lack of momentum of the web standards community when it comes to XHTML evolution, but it seems a bit cavalier to suggest people should abandon XHTML completely.

What is gained by “migrating” your work (or shifting your coding style) back to old HTML? In my opinion the slight added rigor in XHTML helps to enforce good, clean coding. Granted, both can be validated with nesting errors caught, but there are real-world advantages of XHTML that you are walking away from.

I think the more intelligent approach is to create your own personal, consistent “coding style guide” that establishes standards for using specific DOM ids (e.g. for divs that represent unique page elements like ) and classes (for sections that would eventually become like ). Then, when the time comes to move to HTML5, you could use a relatively simple XSLT template to mass-migrate your code to XHTML5 (or HTML5).

68
Gilsinger says:
May 29, 13h

As someone new to web design, I learnt both HTML and XHTML in parallel and having read the arguments for both sides, I see no inherent advantage to either. As I understand it both allow equally semantic markup when used correctly.

As I see it, the only reasons to use XHTML over HTML4 are personal choice or a site that benefits from features such as MathML. I would extend the same logic to HTML5: unless you genuinely need to use (as an example) the canvas tag, I see no benefit in converting.

As I say, I am very much a novice designer so if I’ve missed a major point here I apologise, but it seems to me that for 99% of the websites you see, there is no real advantage to either XHTML1 or HTML4.

69
Tom Wright says:
May 30, 19h

html5 is in no way the “death” of xhtml, it can be served with either a html or xhtml content-type whereas it uses a generic doctype (html) in a way to obsolete doctypes (whilst still making legacy browsers render in standards compliance mode).

What it really means is that either version has all of the features of xhtml which browsers (or one browser in particular at least) actually supported (including lovely /> ness) and you can start using it now as it will be treated just the same as html 4.01 except it gets the newer validator which is permissive of new features.

June 02, 10h

I’m eating crow.

See my May 26th comment. Now scratch that. I just created an interesting drama for myself when I tweaked a web server to pair my XHTML pages with the (appropriate) application/xhtml+xml MIME type. (I appropriately had the configuration exclude IE6 since it’s broken and won’t handle true XHTML.)

However, all my JavaScript broke because I hadn’t surrounded it in the necessary CDATA enclosures. I’m realizing that XHTML has its pitfalls and people are not necessarily ready to embrace all the necessary rules that come with switching your browser into XML mode. (I’m also aware of some CSS rule changes like background colors.)
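
To illustrate why unwrapped scripts fail once a page is parsed as XML, here is a small Python sketch. It uses the standard library's strict XML parser as a stand-in for a browser's XML mode, which is an assumption, and the script contents and names (`BARE`, `WRAPPED`, `is_well_formed`) are made up for the example.

```python
import xml.etree.ElementTree as ET

# Inline script with bare "<" and "&&": fine as text/html, not well-formed XML.
BARE = "<script>if (a < b && b > 0) { run(); }</script>"

# Same script inside a CDATA section, with the markers hidden behind JS comments.
WRAPPED = "<script>//<![CDATA[\nif (a < b && b > 0) { run(); }\n//]]></script>"


def is_well_formed(fragment: str) -> bool:
    """Return True if a strict XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False
```

Here `is_well_formed(BARE)` is False and `is_well_formed(WRAPPED)` is True: the parser trips on the bare `<` and `&` characters unless they sit inside a CDATA section (or are escaped).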

I think people are getting sloppy about their XHTML, which is ironic because (I believe) people choose XHTML as a sort of declaration of coding-discipline. The ironic truth is, unfortunately, that living in the XHTML world requires more contradictory rule-breaking and half-standards than clean, strict HTML 4.01.

I’ve written my own blog entry about the experience. (Link in this comment’s URL field.) Sorry, Dave, about being so dismissive in my earlier comment.

June 08, 16h

Dave, you write, “I’m not ready to start working through the contortions needed to make my sites work with an HTML5 DOCTYPE yet….” Can you go into more detail about what those contortions would be?

Reviewing the differences between HTML 4 & 5, it looks like the only habits that I myself would have to change in order to write HTML4 documents that were also valid HTML5 would be to change the doctype and stop using acronym. Surely the transition from XHTML to HTML requires many more changes than from HTML 4 to 5.

June 09, 10h

to wit:

HTML 5 differences from HTML 4
W3C Working Draft 23 April 2009
http://dev.w3.org/html5/html4-differences/Overview.html

The Changed Elements aren’t a problem for me because the few I use are not changed in a way that would force me to alter my markup habits.

Of the Absent Elements, the only one I ever use is acronym which I can replace with abbr.

I never use any of the Absent Attributes anyway.

So where are the contortions in the shift from XHTML or HTML4 to validatable HTML5? Changing the doctype to doesn’t necessitate using HTML5’s new features. Can we not take our time and introduce the new features gradually as they become adopted universally (hah) by the browser bowsers?

Curiously,
Paul

June 23, 16h

Stuff like this makes me want to start a lovely career in construction.

74
Anon the wee mous says:
June 24, 12h

I started transitioning everything back to HTML4 too.
I’m getting fewer complaints, and a much higher os-independent compatibility by doing so.
Plus my bandwidth usage is way lower.

xhtml was supposed to be the next big thing.
But it’s nothing like how we envisioned it in the 90’s, it’s really been a sad let-down.
The nay sayers and the creatively-challenged just don’t/can’t understand.

June 29, 13h

Personally, I prefer xhtml and I am pretty stoked about the inclusion of RDFa triples in xhtml2 (this makes parsing the document with javascript so much easier). And while x/html5 has some amazing features (the canvas tag has me salivating), I can’t help but feel that xhtml2 is a little more future-proof. By the looks of it, xhtml2 will probably only be fully realized a few years after x/html5, but hopefully that means that some of the cooler features of x/html5 will appear in xhtml2.

xhtml.com provides an excellent comparison (http://xhtml.com/en/future/x-html-5-versus-xhtml-2/) between xhtml2 and x/html5, covering the good and the bad.

That is all.

Dave S. says:
July 02, 11h

A quick update: the W3C just announced they will not be renewing the XHTML 2 Working Group’s charter when it expires at the end of this year:

http://www.w3.org/News/2009#item119

So, that’s the final nail in the coffin I was expecting and waiting for.

77
Joe Hoyle says:
July 05, 13h

Not sure if I am missing something here… I have read through the comments, and from what I understand HTML5 will allow the self-closing “ />”. It seems many people are switching back to HTML 4.01 in preparation for HTML5 support in browsers.

Rather than getting used to the minor differences (I say minor because, as I understand it, there is very little syntax variation between the two) of XHTML -> HTML 4.01, why not just keep using XHTML served as text/html so we can keep the stricter markup rules, then pretty much change to the HTML5 doctype whenever that may be?

78
Rhyaniwyn says:
July 16, 13h

@Jonathan - Tsk tsk. There is a very discernible difference between HTML 4.01 Strict and XHTML 1.0 Strict!

Write a document with HTML 4.01 Strict in mind, omitting the small XML-compatible syntax changes XHTML 1.0 Transitional introduces. Then try to validate it as XHTML 1.0 Strict. You should see at least a few errors, rules that differ. I discern a difference.

Many of them are very stupid errors, in my opinion. Like having to wrap all links in a parent block element. But it’s a mistake to think that the only differences are the minor ones introduced for XML compatibility.

I agree with the real point: it’s possible to write simple, semantic, syntactically correct, and neat HTML 4.01. Write it just like you would XHTML and omit the /> and use the HTML 4.01 doctype. Since XHTML *should* be served as xml, not as text/html, it’s really all the same thing.

That doesn’t mean that XHTML in and of itself is dead. XHTML 2, which was incompatible with XHTML 1 to begin with, is dead. XHTML 1 is well supported and will continue to be supported as long as HTML 4.01.

I personally still have a wistful fondness for the XML-filled future I thought we had. It takes care of extensibility and special use cases neatly. If implemented by browsers, of course…

I won’t comment on XHTML 2 itself. I don’t normally follow specs until they are much closer to completion than either “contender” is/was. For that reason I also can’t comment on HTML 5, though I certainly have more familiarity with it due to all the hype.

My opinion is not that HTML 5 will save us all, though it may be a reasonable successor with many improvements over HTML 4.01. But since it appears as of now to be the only new specification coming, I personally intend to get very familiar with the current draft and speak loudly about what I do and do not like.

After all, I will probably end up having to use it, so I would best serve myself by putting in due diligence and my 2 cents. I don’t intend to jump on any bandwagon – whether that wagon is full of haters or fans.

July 20, 13h

“I think it makes sense to ignore XHTML for now, but I don’t see the point in intentionally leaving out closing tags just because HTML4 is sloppy. Why not validate HTML4, but use the good parts of XHTML? In my opinion, /> in a tag shows your intent to close the tag, so it’s clear to anyone reading your HTML (and to browser rendering images) that the following tags aren’t being nested.”

This comment woke the pedant in me. HTML (or SGML) isn’t sloppy for minimising end tags. XML disallows minimising not because it is more pure, but because of design trade-offs. SGML, HTML, and XML documents contain a tree of elements.

The start and end tags are just that, tags to delimit the elements in the tree. Sometimes the tags are necessary, sometimes they are not, this is defined in the SGML application. Their presence or absence doesn’t change the document structure. Omitting the end tag for the ‘p’ element makes for a more terse and readable document, on the other hand it requires the parser to know that ‘p’ elements can’t be nested. This can be done with HTML because it is a known format, with SGML using DTDs, and can’t be done in XML.

A starker example can be the ‘body’ element. In HTML4 Strict it can’t be empty (it must contain at least one element), but both the start and end tags are optional. Why? Because ‘body’ is unambiguously defined by its content (different from ‘head’ content which has optional tags as well, just like the ‘html’ element), you don’t actually need start or end tags to know when you have finished parsing the ‘head’ element and have begun with the ‘body’ element.
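
The point about omitted end tags can be shown with a small Python sketch. It covers only the ‘p’ case, the names `ImpliedParagraphEnds` and `add_paragraph_end_tags` are hypothetical, and it is a simplification rather than a real HTML parser: it knows that ‘p’ elements can’t be nested, so a new ‘p’ start tag (or the end of the input) implies the missing end tag.

```python
from html.parser import HTMLParser


class ImpliedParagraphEnds(HTMLParser):
    """Rebuild markup, writing out the </p> end tags the author omitted."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.p_open = False

    def close_open_paragraph(self):
        if self.p_open:
            self.out.append("</p>")
            self.p_open = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # 'p' cannot nest, so a new one ends the previous one.
            self.close_open_paragraph()
            self.p_open = True
        # Keep the start tag exactly as the author wrote it.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "p":
            self.close_open_paragraph()
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def add_paragraph_end_tags(markup: str) -> str:
    parser = ImpliedParagraphEnds()
    parser.feed(markup)
    parser.close()
    parser.close_open_paragraph()  # end of input also ends an open 'p'
    return "".join(parser.out)
```

Both `<p>one<p>two` and `<p>one</p><p>two</p>` come out as the same explicit markup, which is the sense in which the tags' presence or absence doesn't change the document structure.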

You might want to see http://my.opera.com/jax/blog/html5-xml-stealth for some more discussion.

brooks says:
July 27, 16h

html 5 is still extensible.

the w3c could have let people know that they were treating the html 5 spec as having most of what xhtml 1 has, and as a natural evolution of it. this would have answered the question, “What happened to xhtml 2, 3, and 4?”

xml is still the structure language, and as browsers support more standards, like what 5 does with video, audio, interactive, and services, it will be possible for anybody to participate in making and sharing content.

<social api=["facebook", "flickr"]>
<photos caption="true">today</photos>
</social>

August 04, 11h

Seeing how IE has always put a full stopper to XHTML, the only reason I used it was so that you could make a simple XML parser load the XHTML data and traverse the data.

With XHTML4 that was fine as you could send XHTML1.0 as text/html and text/xml.

But with XHTML5 being text/xml only, I’m thinking I’d have to give up that little extra novelty accessibility.

Rooturaj says:
August 20, 20h

I am mostly a PHP programmer. But I do need to write code in HTML and being a team leader I have to guide my team. We have been mostly into XHTML till now.
You did not give a good idea as to why you have lost faith in XHTML. Would have been useful for us intermediates.
Anyway, I am going to take your tip and give HTML4-5 a try. *Thanks*