The Incomplete Works of Josh English

Writings and Ramblings about almost anything...


The WCMSBaseParser class

The WCMSBaseParser class contains useful tools WCMS parsers can call. Since the purpose of this system is to convert a bunch of XML files into HTML, most of the default methods are involved in creating usable, valid, and readable HTML code.

WCMSBaseParser(path,cms,[menu=None])

Base class for all page parsers used in WCMS. Pass it the path to the source file, the cms object, and, optionally, a menu object. The menu object, if given, must be a WebMenu object.

The WCMSBaseParser has the following attributes:

Attributes

Conversion methods

Convert

Parse the source file. Use the output method to retrieve parsed code.

output

Returns the text of the parsed document.

GetText(node)

Gets the data of any text nodes that are children of the node. This method only looks at the first generation of children.

ExtractText(node)

Extracts only the text of a node’s children. This is a recursive function and it will get the text of every child that has text, no matter how many generations down it is.
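The difference between the two methods is easiest to see side by side. The following is only a sketch written against plain minidom, not the WCMS source; the function names get_text and extract_text are stand-ins I made up for illustration.

```python
from xml.dom import minidom


def get_text(node):
    """Join the data of text nodes that are direct children of node."""
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def extract_text(node):
    """Recursively join the text of every descendant of node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(extract_text(child))
    return "".join(parts)


doc = minidom.parseString("<p>Hello <b>big <i>wide</i></b> world</p>")
root = doc.documentElement
shallow = get_text(root)    # text of the first generation only
deep = extract_text(root)   # text of every generation
```

Here shallow skips everything inside the b element, while deep includes it.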

MakeMeta(node)

Creates meta information for the web. The tag name is assigned to the meta ‘name’ attribute and the data in the tag is assigned to the ‘content’ attribute.

In the Parser include the function:


    def do_author(self, node):
        self.MakeMeta(node)

The expected XML is '<author>Your Name</author>'. This adds '<meta name="author" content="Your Name">' to the output.
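For readers who want to see the transformation itself, here is a rough standalone sketch of the idea using only minidom and the standard library. The function make_meta is hypothetical and returns the string instead of adding it to the parser output; the real MakeMeta may differ in detail.

```python
from xml.dom import minidom
from xml.sax.saxutils import quoteattr


def make_meta(node):
    """Build a meta tag from an element: tag name -> name, text -> content."""
    content = "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )
    # quoteattr adds the surrounding quotes and escapes special characters.
    return "<meta name=%s content=%s>" % (
        quoteattr(node.tagName),
        quoteattr(content.strip()),
    )


author = minidom.parseString("<author>Your Name</author>").documentElement
meta = make_meta(author)
```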

MakeSpan(node)

Takes an XML tag and turns it into an HTML span with a class attribute of the tag name.

Example:

In XML '<note>This is cool</note>'

To HTML '<span class="note">This is cool</span>'

MakeParagraph(node)

Takes an XML tag and turns it into an HTML paragraph with a class attribute of the tag name.

Example:

In XML '<note>This is important</note>'

To HTML '<p class="note">This is important</p>'

MakeDiv(node)

Takes an XML tag and turns it into an HTML div with a class attribute of the tag name.

Example:

In XML '<note>This is important</note>'

To HTML '<div class="note">This is important</div>'
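MakeSpan, MakeParagraph, and MakeDiv differ only in the HTML tag they emit. A minimal sketch of that shared pattern follows; wrap_as is a hypothetical helper, and it handles only direct text children, whereas the real methods presumably deal with nested tags as well.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape


def wrap_as(html_tag, node):
    """Wrap the node's text in html_tag, using the XML tag name as the class."""
    text = "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )
    return '<%s class="%s">%s</%s>' % (
        html_tag, node.tagName, escape(text), html_tag
    )


note = minidom.parseString("<note>This is important</note>").documentElement
as_span = wrap_as("span", note)
as_paragraph = wrap_as("p", note)
as_div = wrap_as("div", note)
```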

PassAsHTML(node)

Passes the node exactly as is.

Example:

In XML: '<span class="yadda">This is yadda text.</span>'

In HTML: '<span class="yadda">This is yadda text.</span>'

PassAsBlockHTML(node)

Passes the node exactly as is with line spacing.

Example:

In XML: '<pre class="yadda">This is yadda text.</pre>'

In HTML: '<pre class="yadda"> This is yadda text.</pre>'

SetMenuKey(key)

Call this function with the appropriate string so that the WebMenu object can create relative links in the file. You must call it for every page you parse, or WebMenu will not work properly.

In my XML source I usually have a menukey attribute in the document element, and my parser calls the getAttribute method to get the right value, which is passed to SetMenuKey.
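That pattern looks roughly like the sketch below. StubParser is a hypothetical stand-in for a WCMSBaseParser subclass so the example runs on its own, and the page element and its menukey value are made up; only getAttribute and SetMenuKey come from the description above.

```python
from xml.dom import minidom


class StubParser:
    """Hypothetical stand-in for a WCMSBaseParser subclass."""

    def __init__(self):
        self.menukey = None

    def SetMenuKey(self, key):
        # The real method hands the key to the WebMenu object;
        # this stub only records it.
        self.menukey = key

    def do_page(self, node):
        # Read the menukey attribute from the document element.
        self.SetMenuKey(node.getAttribute("menukey"))


doc = minidom.parseString(
    '<page menukey="projects"><title>Projects</title></page>'
)
parser = StubParser()
parser.do_page(doc.documentElement)
```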

SetDate(date)

This function allows you to dynamically set the date for the file. I intended this to be used to encode a ‘last updated’ line on my pages.

Parsing methods

These methods are used to parse the document.

parse(node)

Parses a single XML component.

parse_Element(node)

Parses an element. The XML element corresponds to an actual tag in the source.

parse_Text(node)

Appends the text to self.pieces unless it consists of nothing but carriage returns.

parse_Comment(node)

The default behavior passes comments straight through to the final product.

do_unknowntag(node)

If there is no ‘do_tag’ handler for a tag in the XML source, the default behavior is to check whether the tag name is the same as the source tag name, in which case it adds the <html> tag. Otherwise it adds a tag with the node’s tagName attribute, which will probably result in bad HTML. You can override this method to raise an error, to parse only the node’s children (essentially ignoring the tag itself), or to ignore the node and all of its children.
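The parsing methods above suggest a dispatch-by-name design. The sketch below shows one way such a dispatcher could work; it is an assumption about the mechanism, not the WCMS source. SketchParser, parse_Document, parse_children, and do_em are hypothetical names, and do_unknowntag here shows the "ignore the tag but keep its children" override rather than the default behavior.

```python
from xml.dom import minidom


class SketchParser:
    """Hypothetical dispatcher; not the actual WCMSBaseParser."""

    def __init__(self):
        self.pieces = []

    def parse(self, node):
        # Dispatch on the minidom class name: Document, Element, Text, Comment.
        handler = getattr(self, "parse_%s" % node.__class__.__name__, None)
        if handler is not None:
            handler(node)

    def parse_Document(self, node):
        self.parse(node.documentElement)

    def parse_Element(self, node):
        # Look for a do_<tagName> handler, falling back to do_unknowntag.
        handler = getattr(self, "do_%s" % node.tagName, self.do_unknowntag)
        handler(node)

    def parse_Text(self, node):
        # Skip text nodes that contain only whitespace or line breaks.
        if node.data.strip():
            self.pieces.append(node.data)

    def parse_Comment(self, node):
        # Pass comments straight through.
        self.pieces.append(node.toxml())

    def parse_children(self, node):
        for child in node.childNodes:
            self.parse(child)

    def do_unknowntag(self, node):
        # One possible override: drop the tag itself, keep its children.
        self.parse_children(node)

    def do_em(self, node):
        self.pieces.append("<em>")
        self.parse_children(node)
        self.pieces.append("</em>")


parser = SketchParser()
parser.parse(minidom.parseString(
    "<doc><odd>Hello <em>there</em></odd><!-- kept --></doc>"
))
html = "".join(parser.pieces)
```

Neither doc nor odd has a do_ handler, so both fall through to do_unknowntag and only their children reach the output.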

AddPiece(item)

Adds the string representation of item to the parser’s pieces list, followed by an end-of-line character. If you don’t want to add the end-of-line character, you can call self.pieces.append(item) during parsing.
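The two ways of adding output compare as in the sketch below. PieceCollector is a hypothetical stand-in for the parser, and the choice of "\n" as the end-of-line character is my assumption.

```python
class PieceCollector:
    """Hypothetical stand-in showing how AddPiece and pieces interact."""

    def __init__(self):
        self.pieces = []

    def AddPiece(self, item):
        # Convert to a string and terminate the line (assumes "\n").
        self.pieces.append(str(item) + "\n")

    def output(self):
        return "".join(self.pieces)


collector = PieceCollector()
collector.AddPiece("<h1>Title</h1>")   # ends with a line break
collector.pieces.append("<p>")         # no line break added
collector.pieces.append("Body text")
collector.AddPiece("</p>")
text = collector.output()
```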

AddToReport(item)

This function adds information to the parser’s report, so you can track what’s going on.

GenerateReport

This method returns the report as one long piece of text.

ClearReport

This method clears the report.
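A minimal sketch of how the three report methods might fit together follows. ReportLog is a hypothetical class, and storing the report as a list joined with line breaks is an assumption about the internals.

```python
class ReportLog:
    """Hypothetical stand-in for the parser's report methods."""

    def __init__(self):
        self.report = []

    def AddToReport(self, item):
        self.report.append(str(item))

    def GenerateReport(self):
        # Return the whole report as one piece of text.
        return "\n".join(self.report)

    def ClearReport(self):
        self.report = []


log = ReportLog()
log.AddToReport("Parsing index.xml")
log.AddToReport("Found 3 unknown tags")
report_text = log.GenerateReport()
log.ClearReport()
empty_text = log.GenerateReport()
```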

FixTags

The minidom parser doesn’t escape entities, so the WCMSBaseParser class has the following methods built in. For instance, if you want an ampersand, put <amp/> in the XML source and it will appear in the HTML as the &amp; entity. If you set the parser’s allowfixtags attribute to 0, these methods do not process the data.

The encoding of the page needs to be iso-8859-1 for these to work.
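The idea behind the fix tags can be sketched as a lookup from empty XML tags to HTML entities. Everything here is illustrative: FIX_TAGS and render are hypothetical names, only the amp tag is taken from the description above, and passing the tag through unchanged when allowfixtags is 0 is my guess at what "do not process" means.

```python
from xml.dom import minidom

# Only <amp/> is named in the text above; other entries would be additions.
FIX_TAGS = {"amp": "&amp;"}


def render(node, allowfixtags=1):
    """Render a node's children, swapping fix tags for their entities."""
    out = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            out.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            if allowfixtags and child.tagName in FIX_TAGS:
                out.append(FIX_TAGS[child.tagName])
            else:
                # Assumption: with fix tags off, the tag passes through as is.
                out.append(child.toxml())
    return "".join(out)


para = minidom.parseString("<p>Salt <amp/> pepper</p>").documentElement
fixed = render(para)
untouched = render(para, allowfixtags=0)
```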