
XHTML 1.1 Escaping (Chapter Excerpt 3)

One thing you will inevitably notice when working with the ultra strict XHTML 1.1 is how the ampersand (&) works. In HTML you never had to worry much about the ampersand, though you could use it to insert an HTML space (&nbsp;). Now, given that XHTML is XML, you simply can't get by with the rules of HTML. In the XML world, if you want to save an XML document with an ampersand, then you have to escape it. If you are familiar with C-based programming languages, then you know all about special characters like this. In those languages you have to "escape" certain characters, such as the backslash (\), with another backslash.

In fact, Altova's XMLSpy application will stop you with a warning if you have XML that looks something like <b>&</b>. The appropriate way to have an ampersand in XML is to write it as &amp;. This is very similar to what you could do with the &nbsp; space in HTML, and it's not that far off from what we already do. For instance, if you wanted to display the literal text &nbsp; on a page, you would have to write &amp;nbsp;. In fact, if you want to display &amp;nbsp; you have to write &amp;amp;nbsp;. So, if you know HTML, you already know the fundamentals of this rule. It's just that in the strict world of XHTML 1.1, you can no longer let little mistakes enter your code.

Now this rule about dealing with ampersands doesn't end with standalone ampersands; it also applies to ampersands in links. For instance, you can't just have a link like default.aspx?id=3&category=5. Your link has to become default.aspx?id=3&amp;category=5.
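If you ever do have to escape a link by hand, the framework can do it for you. Here's a minimal sketch, assuming a .NET version that ships System.Net.WebUtility (.NET 4.0 or later); on ASP.NET 2.0 the equivalent call would be HttpUtility.HtmlEncode in System.Web. The class and method names are just for illustration.

```csharp
using System;
using System.Net;

public static class AmpersandEscaping
{
    // Escapes markup-significant characters (&, <, >, quotes) so the
    // result is safe to place in XHTML text or attribute values.
    public static string EscapeForXhtml(string raw)
    {
        return WebUtility.HtmlEncode(raw);
    }

    public static void Main()
    {
        // Prints: default.aspx?id=3&amp;category=5
        Console.WriteLine(EscapeForXhtml("default.aspx?id=3&category=5"));
    }
}
```

Note that this escapes the ampersand for markup only; it does not URL-encode the query string values themselves.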

Fortunately ASP.NET 2.0 is smart enough to deal with things like this in your databinding. For instance, look at the following code...

ASP.NET 2.0 Code

<asp:GridView ID="gvLinks" runat="server"></asp:GridView>

Code Behind

C#

String[] links = new String[] { "http://jampad.net/default.aspx?id=1&name=2", "http://jampad.net/default.aspx?id=1&name=2&frog=3", "Amburgers & Wootbeer" };
gvLinks.DataSource = links;
gvLinks.DataBind( );

Believe it or not, ASP.NET will produce the following output...

<table cellspacing="0" rules="all" border="1" id="gvLinks" style="border-collapse: collapse;">
    <tr>
        <th scope="col">Item</th>
    </tr>
    <tr>
        <td>http://jampad.net/default.aspx?id=1&amp;name=2</td>
    </tr>
    <tr>
        <td>http://jampad.net/default.aspx?id=1&amp;name=2&amp;frog=3</td>
    </tr>
    <tr>
        <td>Amburgers &amp; Wootbeer</td>
    </tr>
</table>

While I'm not a big fan of looking at table code, I am however a huge fan of automation. As you can see here, you don't have to fight with manually escaping data binding ampersands. Thus, you can give ampersands in XHTML data binding a vote towards deterministic.

Base64 PNG Server

In my mind, one of the coolest things that modern web browsers can do is deal with base64 PNG images. PNG images are the "new standard" in web images. They can be very small in size or they can be larger as true color images depending on your needs. They don't replace everything, but they do replace a lot.

A base64 PNG image is a PNG image encoded as base64. Base64 encoding is a way to encode non-printable characters (stuff you can't see, but the computer can read) into printable characters (things like letters and numbers).

Base64 PNG images (which are text) can actually be read by modern web browsers as real images. In fact, it's one of my qualification requirements for being a modern web browser (actually there are MANY requirements in my mind). You can actually use base64 PNG images directly in CSS. Here's an example...

background: url(data:image/png;base64,...);

Well, it's not SUPPOSED to be small! It's supposed to be embeddable. And it is. But think for a moment. You can take an image (not just PNG mind you, GIF can do this as well) and turn it into Base64. Now, can't you dynamically load images on the client? Well, yes you can... All you have to do is do a remote call to somewhere which will send the Base64 stream back.

Once you get the stream back, all you have to do is prefix the base64 stream with "data:image/png;base64," and assign the entire value to the src property (attribute) of an img object.
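On the server side, producing that value is a one-liner. The following is only a sketch of the idea, not the code from the example linked below; the class and method names are hypothetical, and in practice the bytes would come from a real PNG file (for instance via File.ReadAllBytes) rather than the four signature bytes used here.

```csharp
using System;

public static class PngDataUri
{
    // Builds the value a client can assign straight to img.src.
    public static string ToDataUri(byte[] pngBytes)
    {
        return "data:image/png;base64," + Convert.ToBase64String(pngBytes);
    }

    public static void Main()
    {
        // The first four bytes of every PNG file (not a complete image).
        byte[] signatureStart = { 0x89, 0x50, 0x4E, 0x47 };

        // Prints: data:image/png;base64,iVBORw==
        Console.WriteLine(ToDataUri(signatureStart));
    }
}
```

A service could just as well return only the base64 part and let the client add the prefix, which is the split described above.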

Here's an example I put together a few months ago of how you can do all this... Firefox users only please! IE6 won't get NEAR base64 images.

PNG Client/Service Example

Actually, this is also a great example of how to work with web remoting (I just CAN'T call it Ajax, that's too weird) and how to dynamically work with XML files.

XHTML 1.1 and DataBinding (Chapter Excerpt #2)

When doing serious data binding in ASP.NET you may want to reconsider using XHTML 1.1 or XHTML 1.0 Strict at all. The rule is simple: use the document type that can be deterministically proven to be proper in your situation. Put another way, unless you have deterministically proven that there will never be any invalid markup in the data, you should always use XHTML 1.0 Transitional.

If the binding data has some odd markup, then you will end up either sending invalid XHTML 1.0 to IE, breaking validation, or sending invalid XHTML 1.1 markup to browsers such as Firefox, halting the rendering.

But what does it mean to be deterministically proper? To put it simply, it means to absolutely guarantee that the data will always be proper, that is, to be able to always predict the properness of data. This does not mean "well, it worked for 100,000 tests, so it's good enough", but rather it means that it absolutely will always work 100% of the time. You can get this level of determinism by looking at the semantics of what is going to be bound. For example, if you are binding a table with numbers, which will always be numbers, then you should have a level of determinism here. However, if you are binding a table with unvalidated user input, then you do not have determinism, as you have no idea what a user will input. The user could input <b><i></b></i>, which will break the page. You have no idea. Having a proven, not demonstrated, view of data is what this is all about.

Here are a few guidelines that should help you with determining what is and is not deterministic.

These things are never deterministic...

  • Unvalidated user input
  • Unvalidated external data
  • HTML
  • Anything else with angle brackets (<, >), except well-formed XML

Given semantic care, these things should be deterministic...

  • Well-formed XML
  • Alphanumeric strings
  • Base64 encoded data

To reiterate: only use the document type that is deterministically proven to always be proper.
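If data falls in the "never deterministic" bucket, one option is to refuse to bind it as markup unless it passes a well-formedness check first. Here's a rough sketch using System.Xml; the class and method names are hypothetical, and keep in mind that this only checks well-formedness, not validity against any DTD. It also only demonstrates properness for the data it actually sees at runtime, so it is a gatekeeper, not a proof.

```csharp
using System;
using System.IO;
using System.Xml;

public static class BindingChecks
{
    // Returns true if the string parses as a well-formed XML fragment.
    // Plain text without markup counts as well-formed; a bare "&" does not.
    // Passing null throws ArgumentNullException (from StringReader).
    public static bool IsWellFormedFragment(string markup)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ConformanceLevel = ConformanceLevel.Fragment;
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(markup), settings))
            {
                while (reader.Read()) { } // walk the whole fragment
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsWellFormedFragment("<b><i>fine</i></b>")); // True
        Console.WriteLine(IsWellFormedFragment("<b><i></b></i>"));     // False
    }
}
```

Anything that fails the check can be escaped as text instead of being bound as markup.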

Excerpts from my XHTML 1.1 Chapter

One of the great things about XHTML 1.1 is that you are never allowed to serve its content as text. You can send every type of XHTML 1.0 as text all day long, but never XHTML 1.1. You say you never send as text anyhow? Sure you do... that's what the text/html content-type is all about. The default for every web server (that I know of) is to send "browser" content (i.e. HTML, XHTML) as the text/html content-type. To use a different content-type you have to specifically say so.

The typical way to send XHTML 1.1 content is actually with the application/xhtml+xml content-type. To serve a page using this type in .NET, you simply state the following in the early parts of your .NET page rendering (in the page's Init or Load event).

C#

Response.ContentType = "application/xhtml+xml";

VB2005

Response.ContentType = "application/xhtml+xml"

When you do this, you kick modern web browsers, like Firefox, into what I like to call "parsing mode", "ultra strict mode", or "application mode" depending on my mood, but what it really is is XML parsing mode. In my lectures I often say "HTML is fundamentally unparsable". What I really mean is "HTML is fundamentally unparsed". That is, browsers tokenize HTML (via scanning), rather than parse it via XML parsing. Not to say that web browsers parse XML per se, but they could fundamentally do so if they wanted to. XHTML is XML and therefore "XHTML is parseable". Kicking modern web browsers into this ultra strict mode actually forces the browser to parse the page in an XML-centric manner. What does this mean? It means that if your XML (XHTML) is not well-formed, the browser will throw an error. It's important to note that when you are in this ultra strict mode, the browser is not a validator (that would be REALLY cool), but is more of a well-formedness checker and is the closest thing we have to a runtime compiler (which, I know, seems like an oxymoron).

So, why do this work to get this ultra strict mode? For one simple reason: QA! Quality assurance requires that you take your work seriously. You can't just throw together a bunch of pages and throw them out on the web. If you were writing C#, C++, or Java you would be forced to run your code through a compiler to check for errors. By using ultra strict XHTML mode, you again get this forced compliance.

One word of caution...and it's a common word of caution. Gosh, no matter what I say or what I do on the web it always seems to be the same word of caution: this doesn't work in IE! This is because IE, by default, has absolutely no idea what application/xhtml+xml is. If you try to send this type of content to IE, it will probably try to download the file locally. Now there are times when it will load fine, but what usually is happening in these cases is that the content type is specified in the Windows registry to tell IE how to render it. Since we are talking about the web here, and not the Intranet world we have to follow the universal rules of the W3C, ECMA, and...well least common modern denominators (I throw the word modern in there just in case anyone wants to say that Netscape Communicator 4 is the LCD...and for the record, I don't care about any 4th generation browsers.)

So we need to make sure that IE doesn't get this ultra strict content type. Consequently, we won't have ultra strict mode in IE. Now, if you're paying attention at all you will notice that we have a page with at least two content-types required to support two different planets of browsers (IE, the rest of Earth). You can't send two at the same time. Not a problem. All we need to do is whip out a quick condition based upon what the browser can and cannot do.

When it comes to server-side browser detection some people like to rely on the UserAgent string of the browser. This is actually a very bad idea due to how easy it is to modify a browser's UserAgent string (especially in IE with all the IE toolbars out there!). I actually read somewhere where someone said that "using the UserAgent is the fastest way to hell". I'd have to go along with that hyperbole.

Here's what you really do: test if the browser can accept the application/xhtml+xml content-type. If it can, use it. If not, don't. That's all there is to it, but before I get into the code there's one more thing. We're not talking just about the content-type here, we're also talking about the document type (the DTD). If the condition comes back negative, that is, the browser does not accept xhtml+xml, then you can't use XHTML 1.1 either. So you also have to control the content-type and the document type in the same shot. This is also not a problem.

Here's a standard method I use in the pages I want to be in ultra strict mode.

In the ASP.NET page, I replace the doctype with the following.

<asp:literal id="litDoctype" runat="server"></asp:literal>

Now in the code-behind I do the following...

C#

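Boolean debugMode = false;
private void SetDoctype( ) {
    // Allow ?xhtml=1 to force XHTML 1.1 mode (handy for the W3C validator).
    Boolean mimeTypeOverride = false;
    if (Request.QueryString["xhtml"] != null && Request.QueryString["xhtml"] == "1") {
        mimeTypeOverride = true;
    }

    // A "real" browser advertises application/xhtml+xml in its Accept header.
    Boolean realBrowser = false;
    if (Request.ServerVariables["HTTP_ACCEPT"] != null || mimeTypeOverride) {
        String httpAccept = Request.ServerVariables["HTTP_ACCEPT"];
        if ((httpAccept != null && httpAccept.IndexOf("application/xhtml+xml") > -1) || mimeTypeOverride) {
            realBrowser = true;
        }
    }

    if (realBrowser && !debugMode) {
        // Strict mode: XML declaration (assumes a UTF-8 page) plus the standard XHTML 1.1 doctype.
        Response.ContentType = "application/xhtml+xml";
        litDoctype.Text = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
        litDoctype.Text += "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" \"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">\n";
    }
    else {
        // Less-strict mode: text/html with the standard XHTML 1.0 Transitional doctype.
        Response.ContentType = "text/html";
        litDoctype.Text = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n";
    }
}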

VB2005

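Dim debugMode As Boolean = False
Sub SetDoctype()
    ' Allow ?xhtml=1 to force XHTML 1.1 mode (handy for the W3C validator).
    Dim mimeTypeOverride As Boolean = False

    If Request.QueryString("xhtml") IsNot Nothing AndAlso Request.QueryString("xhtml") = "1" Then
        mimeTypeOverride = True
    End If

    ' A "real" browser advertises application/xhtml+xml in its Accept header.
    Dim realBrowser As Boolean = False
    If Request.ServerVariables("HTTP_ACCEPT") IsNot Nothing OrElse mimeTypeOverride Then
        Dim httpAccept As String = Request.ServerVariables("HTTP_ACCEPT")
        ' AndAlso short-circuits, so IndexOf is never called on Nothing.
        If (httpAccept IsNot Nothing AndAlso httpAccept.IndexOf("application/xhtml+xml") > -1) OrElse mimeTypeOverride Then
            realBrowser = True
        End If
    End If

    If realBrowser AndAlso Not debugMode Then
        ' Strict mode: XML declaration (assumes a UTF-8 page) plus the standard XHTML 1.1 doctype.
        Response.ContentType = "application/xhtml+xml"
        litDoctype.Text = "<?xml version=""1.0"" encoding=""utf-8""?>" & vbNewLine
        litDoctype.Text &= "<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.1//EN"" ""http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"">" & vbNewLine
    Else
        ' Less-strict mode: text/html with the standard XHTML 1.0 Transitional doctype.
        Response.ContentType = "text/html"
        litDoctype.Text = "<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">" & vbNewLine
    End If
End Sub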

This code is actually longer than you might expect, but this version is a bit more robust than a simple condition. The first thing to notice about this code is obvious: it checks to see if the browser can accept the application/xhtml+xml content type by checking the HTTP_ACCEPT server variable. Depending on the result, the browser either gets the application/xhtml+xml content type and the XHTML 1.1 doctype, or the text/html content type and the XHTML 1.0 Transitional doctype.

Secondly, as you can also see, I've included a debug mode which, if set to true, will tell the page to serve the "less-strict" doctype. This comes in handy when you want to make sure you get the "less-strict" doctype. I'm using a private class boolean field named debugMode as a means to put the entire page into debug mode. You could also use a query parameter to put just this one part into debug mode.

Finally, you can see something rather odd towards the beginning of the code. Even though we are doing all of this to guarantee well-formed XML, this does not in any way give us any information about validation. That's where this other part comes in. If you were to point the W3C validator at this page, it would actually get the XHTML 1.0 Transitional doctype with the text/html content-type. That's all well and good for validating against that doctype, but you technically have two different versions of the same page here. You need to validate against the XHTML 1.1 version as well. So, with the inclusion of a quick query parameter check, I'm allowing for the ability to give the W3C validator a URL such as default.aspx?xhtml=1, which will force the page to be in XHTML 1.1 mode. This is a little easier than always having to tell the W3C validator the XHTML 1.1 override (I never much liked the warning it gives you when you do it their way anyhow). As I mentioned previously, you could use a similar technique for forcing the page into the "less-strict" mode. One idea would be to set default.aspx?xhtml=0 to kick in "less-strict" mode.
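To make the xhtml=0 idea concrete, here's a sketch of the decision logic pulled out into a plain function so it can be exercised without a running page. This is not the method shown earlier; the class name, method name, and parameter names are made up for illustration, and the xhtml=0 handling is only the suggestion from the paragraph above.

```csharp
using System;

public static class DoctypeSelector
{
    public const string Strict = "application/xhtml+xml";
    public const string Loose = "text/html";

    // httpAccept: value of the HTTP_ACCEPT server variable (may be null).
    // xhtmlParam: value of the hypothetical ?xhtml= query parameter (may be null).
    public static string ChooseContentType(string httpAccept, string xhtmlParam, bool debugMode)
    {
        // Debug mode and ?xhtml=0 both force the "less-strict" branch.
        if (debugMode || xhtmlParam == "0") return Loose;

        // ?xhtml=1 forces strict mode, e.g. for the W3C validator.
        if (xhtmlParam == "1") return Strict;

        // Otherwise, trust what the browser says it accepts.
        bool accepts = httpAccept != null
            && httpAccept.IndexOf(Strict, StringComparison.OrdinalIgnoreCase) >= 0;
        return accepts ? Strict : Loose;
    }

    public static void Main()
    {
        Console.WriteLine(ChooseContentType("text/html,application/xhtml+xml", null, false));
        Console.WriteLine(ChooseContentType("text/html", null, false));
    }
}
```

The page would then pick the doctype that goes with whichever content type comes back. One caveat: like the original method, this is a simple substring check, so it ignores any q-values a browser may attach to the types in its Accept header.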

Video 1 (FWD) - "Setting up your Firefox Development Environment"

Finally! Here's the long awaited part 1 of my Firefox for ASP.NET 2.0 Developers Video Series. I will be releasing more parts to the series over the next few weeks.

This video is titled "Setting up your Firefox Development Environment" and contains valuable information on setting up your web development environment for maximizing efficiency. More setup information relating to this video will be mentioned in future videos as the utilities used in future videos of course also require setup.

Furthermore, this video is valuable not only to the professional web developer (as well as the non-professionals who still use tables for layout), but it's also valuable for anyone interested in maximizing their web experience.

Below is the link to part 1 of the Firefox for ASP.NET 2.0 Developers video series. You can download Visual Web Developer 2005 Express and Firefox 1.5 below as well.
