Excerpts from my XHTML 1.1 Chapter

One of the great things about XHTML 1.1 is that you are never allowed to serve it's content as text. You can send every type of XHTML 1.0 as text all day long, but never XHTML 1.1. You say you never send as text anyhow? Sure you do... that's what the text/html content-type is all about. The default for every web server (that I know of) is to send "browser" content (i.e. HTML, XHTML) as the text/html content-type. To use a different content-type you have to specifically say so.

The typical way to send XHTML 1.1 content is actually with the application/xhtml+xml content-type. To server a page using this type in .NET, you simply state the following in the early parts of your .NET page rendering (at the Init or Page events).

C#

Response.ContentType = "application/xhtml+xml";

VB2005

Response.ContentType = "application/xhtml+xml"

When you do this, you kick modern web browsers, like Firefox, into what I like to call "parsing mode", "ultra strict mode", or "application mode" depending on my mood, but what it really is is XML parsing mode. In my lectures I often say "HTML is fundamentally unparsable". What I really mean is "HTML is fundamentally unparsed". That is, browsers tokenize HTML (via scanning), rather than parse it via XML parsing. Not to say that web browsers parse XML per se, but they could fundamentally do so it they wanted to. XHTML is XML and therefore "XHTML is parseable". Kicking modern web browsers into this ultra strict mode actually forces the browser to parse the page in an XML centric manner. What's this mean? It means that if your XML (XHTML) is not well-formed, it will throw an error. It's important to note that when you are in this ultra strict mode, the browser is not a validator (that would be REALLY cool), but is more of a well-formedness checker and is the closest things we have to a runtime compiler (which, I know, seems like an oxymoron.)

So, why do this work to get this ultra strict mode? For one simple reason: QA! Quality assurance requires that you take your work seriously. You can't just throw together a bunch of pages and throw them out on the web. If you were writing C#, C++, or Java you would be forced to run your code through a compiler to check for errors. By using ultra strict XHTML mode, you again get this forced compliance.

One word of caution...and it's a common word of caution. Gosh, no matter what I say or what I do on the web it always seems to be the same word of caution: this doesn't work in IE! This is because IE, by default, has absolutely no idea what application/xhtml+xml is. If you try to send this type of content to IE, it will probably try to download the file locally. Now there are times when it will load fine, but what usually is happening in these cases is that the content type is specified in the Windows registry to tell IE how to render it. Since we are talking about the web here, and not the Intranet world we have to follow the universal rules of the W3C, ECMA, and...well least common modern denominators (I throw the word modern in there just in case anyone wants to say that Netscape Communicator 4 is the LCD...and for the record, I don't care about any 4th generation browsers.)

So we need to make sure that IE doesn't get this ultra strict content type. Consequently, we won't have ultra strict mode in IE. Now, if you're paying attention at all you will notice that we have a page with at least two content-types required to support two different planets of browsers (IE, the rest of Earth). You can't send two at the same time. Not a problem. All we need to do is whip out a quick condition based upon what the browser can and cannot do.

When it comes to server-side browser detection some people like like to rely on the UserAgent string of the browser. This is actually a very bad idea due to how easy it is to modify a browser's UserAgent string (especially in IE with all the IE toolbars out there!). I actually read somewhere where someone said that "using the UserAgent is the fastest way to hell" I'd have to go along with that hyperbole.

Here's what you really do: test if the browser can accept the application/xhtml+xml content-type. If they can use it, use it. If not, don't. That's all there is to it, but before I get into the code there's one more thing. We're not talking just about the content-type here, were also talking about the document type (the DTD). If the condition comes back negative, that is, it does not accept xhtml+xml, then you can't use XHTML 1.1 either. So you also have to control the content-type nad the document type in the same shot. This is also not a problem.

Here's a standard method I use in the pages I want to be in ultra strict mode.

In the ASP.NET page, I replace the doctype with the following.

<asp:literal id="litDoctype" runat="server"></asp:literal>

Now in the code-behind I do the following...

C#

Boolean debugMode = false;
private void SetDoctype( ) {
    Boolean mimeTypeOverride = false;
    if (Request.QueryString["xhtml"] != null && Request.QueryString["xhtml"] == "1") {
        mimeTypeOverride = true;
    }

    Boolean realBrowser = false;
    if (Request.ServerVariables["HTTP_ACCEPT"] != null || mimeTypeOverride) {
        String httpAccept = Request.ServerVariables["HTTP_ACCEPT"];
        if (httpAccept != null && httpAccept.IndexOf("application/xhtml+xml") > -1 || mimeTypeOverride) {
            realBrowser = true;
        }
    }

    if (realBrowser && !debugMode) {
        Response.ContentType = "application/xhtml+xml";
        litDoctype.Text = "";
        litDoctype.Text = "\n";
        litDoctype.Text += "\n";
    }
    else {
        Response.ContentType = "application/xml";
        litDoctype.Text = "\n";
    }
}

VB2005

Dim debugMode As Boolean = False
Sub SetDoctype()
    Dim mimeTypeOverride As Boolean = False

    If Request.QueryString("xhtml") <> Nothing And Request.QueryString("xhtml") = "1" Then
        mimeTypeOverride = True
    End If

    Dim realBrowser As Boolean = False
    If Not (Request.ServerVariables("HTTP_ACCEPT") Is Nothing) Or mimeTypeOverride Then
        Dim httpAccept As String = Request.ServerVariables("HTTP_ACCEPT")
        If Not (httpAccept Is Nothing) And httpAccept.IndexOf("application/xhtml+xml") > -1 Or mimeTypeOverride Then
            realBrowser = True
        End If
    End If

    If realBrowser And Not debugMode Then
        Response.ContentType = "application/xhtml+xml"
        litDoctype.Text = ""
        litDoctype.Text = "" & vbNewLine
        litDoctype.Text += ""
    Else
        Response.ContentType = "application/xml"
        litDoctype.Text = ""
    End If
End Sub

This code is actually longer than you might expect, but this version is a bit more robust than a simple condition. The first to notice about this code is obvious: it checks to see if the browser can accept the application/xhtml+xml content type by checking the HTTP_ACCEPT server variable. Depending on the result, the browser either gets application/xhtml+xml content type and the XHTML 1.1 doctype or text/html content type and XHTML 1.0 Transitional.

Secondly, as you can also see, I've included a debug mode which, if set to true, will tell the page to serve the "less-strict" doctype. This comes in handy when you want to make sure you get the "less-strict" doctype. I'm using a private class boolean field named debugMode as a means to put the entire page into debugMode. You could also use a query parameter to put just this one part into debug mode.

Finally, you can see something rather odd towards the beginning of the code. Even though we are doing all of this to guarantee well formed XML, this does not in any way give us any information about validation. That's where this other part comes in. If you were go point the W3C validator at this page it would actually get the XHTML 1.0 Transitional doctype with the text/html content-type. That's all well and good for validating against that doctype, but you technically have two different versions of the same page here. You need to validate against the XHTML 1.1 version as well. So, with the inclusion of a quick query parameter check I'm allowing for the ability to give the W3C validator a Url such as default.aspx?xhtml=1 which will force the page to be in XHTML 1.1 mode. This is a little easier than always having to tell the W3C validator the XHTML 1.1 override (I never much liked the warning it gives you anyways when you do it their way anyhow). As I mentioned previously, you could use a similar technique for for it into the "less-strict" mode. One idea would be to set default.aspx?xhtml=0 to kick in "less-strict" mode.

David Betz' Home for Architecture, Cloud, Linux, Docker, and Elegance