Grok the Web

A Programmer's Guide to the New Software Development Paradigm

by Andrew Schulman


Chapter 8

Don't Write HTML!

Last revised: April 6, 1997


Having sold you on the wonders of HTML as the lingua franca for the next generation of software development, in this chapter I'm going to tell you not to write HTML. Instead, write a program to write HTML. That, of course, is what a CGI program is, but the program I have in mind isn't really a CGI program: it's a throw-away program that generates your HTML for you. Probably discuss the benefits and problems of commercial HTML generators (HotMetal, FrontPage, etc.) here? Introduce the notion of "HTML compilers": programmers are not used to thinking of something as simple as HTML as "code," or of the programs that generate it as "compilers." But behind the simplicity of HTML, which is of course key to the web's success, there's a lot of complexity in what you can do with it. HTML really is the code programmers are going to be dealing with for the next 5-10 years. It will be the back end produced by more and more programs.

("Technology Review" interview with Tim Berners-Lee, in which he expresses surprise at the number of people who actually sit down and write HTML.)

I've *never* written a complete HTML table. Yet plenty of my documents contain them. How's that? In chapter 4, we saw how to use JavaScript to generate a table on the fly. But that only works if you have a formula from which the table contents come. What if the contents come from an external database?

We know the answer already: CGI. Remember, a CGI is just a program whose output is HTML. (Actually, that's not quite true: a CGI is any program whose output starts off with a MIME header, followed by data of the appropriate type. This is an important point that needs to be made, and dwelled on at some length, earlier in the book.)

So we can just write a CGI program to output complex HTML that we don't want to write by hand, such as a table. The point here is that we won't necessarily deploy this CGI on the web: we'll use it just to output some static pages that are otherwise too complex to construct by hand. Write these in Perl or C or whatever.

Note that there are of course many commercial HTML generators, and some are good, but before you use these, it's important to understand that "all" they're doing is spitting out HTML. A lot of people are not quite happy with the results. (Sometimes the HTML seems deliberately obfuscated, as in the stuff Microsoft generates. Need an HTML optimizing compiler?!) The alternative isn't necessarily writing the stuff by hand. No, write a program (maybe a throw-away two-line script) to write the HTML.

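/* mktab.c: print an HTML table of character codes 0-255, 16 per row */
#include <stdio.h>

int main(void)
{
    int i, j;

    printf("<TABLE BORDER=1>\n");
    for (i = 0; i < 256; i += 16)       /* one table row per 16 codes */
    {
        printf("<TR>");
        for (j = 0; j < 16; j++)
            printf("<TD>&#%d;", i + j); /* numeric character reference */
        printf("</TR>\n");
    }
    printf("</TABLE>\n");
    return 0;
}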

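C:\>mktab > table.html

(Make sure readers understand the difference between "\n" and "<BR>" in this context: "\n" only ends a line in the HTML source, while "<BR>" produces a line break in the page the browser displays.)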

The resulting table.html source code looks like crap, but it produces the output we wanted, and we didn't have to write it ourselves:

<TR><TD>P<TD>Q<TD>R<TD>S<TD>T<TD>U<TD>V<TD>W<TD>X<TD>Y<TD>Z<TD>[<TD>\<TD>]<TD>^<TD>_</TR>
<TR><TD>`<TD>a<TD>b<TD>c<TD>d<TD>e<TD>f<TD>g<TD>h<TD>i<TD>j<TD>k<TD>l<TD>m<TD>n<TD>o</TR>

Again, this is CGI without the net: local CGI, sort of. The difference with local CGI is that when a program on the hard disk produces output, we want the web browser to display it immediately, rather than having to load the output file (here, table.html) into the browser by hand. On the other hand, there are lots of interesting security issues here. (Note: most of the local CGI discussion belongs in chapter 5. This chapter is really about writing throw-away tools to generate HTML so you don't have to write it by hand.)

[Norm Walsh: "One can hope that by the time your book comes out, HTML will be only one of any number of DTDs that you can reasonably use for presentation on the web. With that in mind, Don't Write HTML might also address the appropriateness of authoring in a more structured DTD and either filtering to the web or serving your structure plus a stylesheet."]

A WebReview article on whether HTML is "irrelevant" (http://www.webreview.com/97/04/04/feature/index.html) compares writing HTML to writing RTF.