Slug: mark-pilgrim-on-opml
Date: 2002-04-15
Title: Mark Pilgrim on OPML
layout: post

Mark Pilgrim investigates OPML. He makes some of the same observations that I've made in the past (unfortunately I don't believe I published any of them, so go read his).

OPML is a nice idea, but it is limited by Dave's view that XML circa 1998 is all anyone should ever need, by his dislike for namespaces, and by his subsequent refusal to update Frontier's XML parser to support them.

For the record: having an XML-based outline format is an excellent idea! XML is perfect for describing hierarchies. Of course, an outline is often pretty boring unless you can also describe what the nodes represent. So Dave came up with nodetypes. Nodetypes allow you to say "this node represents an mp3 file" or "this node represents a Manila website structure". This is also an excellent idea. However...

To be XML compliant (and I DO mean XML circa 1998 or thereabouts), OPML needs a DTD against which strict parsers can validate the document. However (and I can't find the reference at this time), OPML does not define which attributes can or cannot appear in an <outline> node. The only required attribute is "text", which contains the text of the outline heading. This is a problem because each nodetype uses a separate set of attributes on the <outline> node to store its information. That makes it impossible to develop a DTD that covers every OPML file, because any node may carry arbitrary attributes.
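To make the problem concrete, here is a rough Python sketch. The OPML fragment is made up: the "type" values and every attribute other than "text" are invented for illustration and are not taken from any real nodetype. The DECLARED set stands in for what a DTD attribute list would have to enumerate up front.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML fragment. Attribute names other than "text" are
# invented for illustration only.
OPML = """<opml version="1.0">
  <head><title>Example</title></head>
  <body>
    <outline text="A song" type="song" artist="Someone" duration="3:41"/>
    <outline text="A site" type="site" url="http://example.com/" pages="12"/>
  </body>
</opml>"""

# Stand-in for a DTD attribute list on <outline>: a closed, fixed set.
DECLARED = {"text", "type"}


def undeclared_attributes(opml_text):
    """Map each outline's text to its attributes that fall outside DECLARED."""
    root = ET.fromstring(opml_text)
    report = {}
    for node in root.iter("outline"):
        extra = sorted(set(node.attrib) - DECLARED)
        if extra:
            report[node.get("text")] = extra
    return report


print(undeclared_attributes(OPML))
```

Every new nodetype adds names to that report, so no fixed list can ever be complete.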

The accepted solution to this problem in 98% of the XML-using world would be to break nodetypes out into their own namespaces. Define OPML-the-structured-document as simply as possible: <opml>, <head>, <body>, <outline>. Then let applications define their own elements and attributes in their own namespaces and point those namespaces at their own DTDs. A parser that cares about such things can then compare the OPML document to the included DTDs and be happy. Meanwhile, parsers that don't care can ignore the DTDs and keep right on going.
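A minimal sketch of how a consumer might treat such a document, using Python's standard ElementTree. The namespace URI, the "song" prefix, and the attribute names are all hypothetical. The sketch only separates core OPML attributes from namespaced ones and does not attempt any validation.

```python
import xml.etree.ElementTree as ET

SONG_NS = "http://example.com/ns/song"  # hypothetical namespace URI

# Hypothetical namespaced OPML: the core stays tiny, and the nodetype's
# attributes live in their own namespace.
OPML = """<opml version="1.0" xmlns:song="http://example.com/ns/song">
  <head><title>Example</title></head>
  <body>
    <outline text="A song" song:artist="Someone" song:duration="3:41"/>
  </body>
</opml>"""


def split_attributes(node):
    """Separate un-namespaced (core) attributes from namespaced extensions."""
    core, extensions = {}, {}
    for name, value in node.attrib.items():
        if name.startswith("{"):
            # ElementTree spells a namespaced name as "{uri}local".
            uri, local = name[1:].split("}", 1)
            extensions.setdefault(uri, {})[local] = value
        else:
            core[name] = value
    return core, extensions


root = ET.fromstring(OPML)
core, extensions = split_attributes(root.find("body/outline"))
print(core)        # attributes every OPML consumer can rely on
print(extensions)  # attributes grouped by the namespace that owns them
```

A consumer that only cares about the outline reads `core` and ignores the rest. One that understands a given namespace looks up its URI in `extensions`.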

Of course, Frontier's parser has not changed much since being "kernelized" a while back, and it does not support namespaces. So no UserLand XML will ever make use of them, which is sad, because I think XML Namespaces have "grown up" since their introduction and are a big part of taking the web to the next step.