Content vs. Presentation

Content vs./and Presentation. Content is essential; presentation an afterthought. Content is unchanging, change being intrinsic to presentation. Content endures, while presentation(s) may come and go.

Presentation assures visibility, recognition, attention; content may be, on its own, unappealing to the eye. Presentation helps the content communicate—get across to the reader—what it has to say; content, without presentation, may fail to do so. Presentation is aware of its audience(s) and publication(s), and may modify its visual approach(es) accordingly; content is largely ignorant of these contexts (excepting perhaps writing style).

In typical website development over the past many years, Presentation has been applied to Content too directly: presentation has been too tightly "bound" to the content. More indirection is called for, to "uncouple" the two concepts, so that the inevitable desire for change in "look + feel" can be accommodated efficiently, by modifying a small number of "coupling" devices (such as a Cascading Style Sheet) rather than by editing hundreds of content instances (e.g., all the web pages on a site).

This isn't the place to write a discourse on the general problem in website development of content being too tightly bound to presentation, but it is worth noting that the basis for a solution lies in recognizing the essentially hierarchical nature of the various elements that come together to form a page. New kinds of design decisions can be made as to where, precisely, in this hierarchy—this "nesting" or "cascading" of objects—the critical information can be stored that defines the presentation decision for each element.

In the past, this presentation "decision" information has been closely attached to the source content elements themselves (e.g., through now-deprecated HTML tags like <font>), and this has made maintenance difficult.
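As a rough illustration of storing presentation decisions in the hierarchy rather than on each content element, here is a minimal Python sketch. The rule table, the element paths, and the function name resolve_style are all hypothetical, invented for this example; the sketch only mimics the general idea of a cascade and is not how any real stylesheet engine works.

```python
# Hypothetical presentation rules, kept apart from the content itself.
# Keys are element paths; a shorter path is a more general rule.
STYLE_RULES = {
    "page": {"font": "serif", "color": "black"},
    "page/article/title": {"font": "sans-serif"},
}

def resolve_style(path):
    """Merge rules from the outermost ancestor down to the element itself,
    so a rule stored deeper in the hierarchy overrides a more general one."""
    style = {}
    parts = path.split("/")
    for depth in range(1, len(parts) + 1):
        style.update(STYLE_RULES.get("/".join(parts[:depth]), {}))
    return style

title_style = resolve_style("page/article/title")
para_style = resolve_style("page/article/p")
```

Here a change to the single "page" rule alters the look of every element beneath it, while no content instance is edited.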

With this project (and many others like it), the adoption of the eXtensible Markup Language (XML) has permitted us to concentrate solely on the semantic and structural information found in the Content Types themselves, by use of an application of XML (a "vocabulary," or markup language) that we call "BusinessML." It is essentially an extension of W3C HTML with additional elements. These added "tags" are given names that reflect the actual semantic meaning of the information found within them (e.g., <dateline> in a press release, <patentNumber> in a patent). The parts of the page content that do not lend themselves to semantic identification (the flow of paragraphs, the bulleted lists, etc.) simply use what we refer to as a "ca. 1994" HTML tagset (e.g., <p>, <b>, <ul>, <li>, <a>), and explicitly eschew any use of HTML tags beyond strict structural expression of the content's information. For example, tables may be used, but only to present tabular information!
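To make the idea of semantic tagging concrete, the following Python sketch parses a small, invented press release. Only <dateline> and the structural HTML tags come from the description above; the <pressRelease> and <title> element names and all of the sample text are assumptions made for illustration, not the actual BusinessML vocabulary.

```python
import xml.etree.ElementTree as ET

# Invented sample document; element names other than <dateline> and the
# basic HTML tags are hypothetical.
SOURCE = """<pressRelease>
  <title>Example Corp Opens New Office</title>
  <dateline>Springfield, 1 March</dateline>
  <p>Example Corp today announced a <b>new</b> office.</p>
  <ul><li>First point</li><li>Second point</li></ul>
</pressRelease>"""

doc = ET.fromstring(SOURCE)

# Semantic elements can be addressed by what they mean, not how they look.
dateline = doc.findtext("dateline")

# Structural elements carry the flow of the content.
list_items = [li.text for li in doc.iter("li")]

# A simple check that no purely presentational tags have crept in.
presentational = [el.tag for el in doc.iter() if el.tag in {"font", "center"}]
```

Because the dateline is marked up by meaning, a program can find it without guessing from fonts or position on the page.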

BusinessML is in fact a higher-level abstraction than a single XML application: rather than being a single Document Type, it is a set of cooperating DTDs (Document Type Definitions), with one DTD created for each identified Content Type. So we have ct01_general.dtd, ct10_press_release.dtd, ct40_article.dtd, ct50_bio.dtd, ct80_glossary.dtd, etc. These all "import" ct00_core.dtd, with its shared set of fundamental element definitions (metadata fields, core required fields, etc.).

Authoring source content for the website is, for the present, usually a matter of accepting documents from business unit staff, who use Microsoft Word and write with a high degree of freedom from document type strictures. These documents are then processed by hand (and with simple macros) into BusinessML by website staff.

Only minimal requirements are placed on business unit staff:

Goals for the future include the use of more "XML-aware" tools (though largely invisible to the user) "upstream" by business unit staff, so that their work of producing content for the website becomes further automated, to the point of publishing directly to the site. Controls and workflow are to be introduced along with these new authoring capabilities.

An important, related goal is to identify additional Content Types, where possible, so that publishing from within the company becomes an activity practiced in more than a single "publication." Today's public-facing corporate website is the only publication for which information is being produced, but in the future additional publications may be devised, in particular those that aim at more focused audiences, and which may reach those audiences via channels other than just the web. The portability, and malleability, of the XML-based vocabulary in use then becomes even more valuable, as automated processes can be created to work on the content, as well as the metadata attached to the content, to produce various publications from a core base of information.
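The notion of producing several publications from one core base of information can be sketched in a few lines of Python. Everything here is assumed for the sake of the example: the <bio>, <name>, and <role> element names, the sample person, and the two output functions. It shows only the shape of the idea, not the project's actual tooling.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented source content; the element names are hypothetical.
SOURCE = "<bio><name>Pat Example</name><role>Chief Engineer</role></bio>"

def to_web(doc):
    """Render the content as an HTML fragment for a website."""
    name = escape(doc.findtext("name"))
    role = escape(doc.findtext("role"))
    return f'<div class="bio"><h2>{name}</h2><p>{role}</p></div>'

def to_plain_text(doc):
    """Render the same content as one line for a text-only channel."""
    return f"{doc.findtext('name')} ({doc.findtext('role')})"

doc = ET.fromstring(SOURCE)
web_page = to_web(doc)
text_line = to_plain_text(doc)
```

The source is written once; each publication is simply another function applied to the same tree.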

The figures below depict the transformation of a "tree" of content into a "rectangle" of presentable information.

Turning trees into rectangles is what this system is about. Trees are the full expression of the information the company possesses about its own resources, products, and services, as well as what it knows about the intended audiences for those assets.

Rectangles are the polished, "finished products" of publishing information about those assets, as designed for those audiences.