The name XML is an abbreviation of the language's full name, Extensible Markup Language. Although "markup" is in its name, do not think of XML as you do HTML. Aside from the fact that both languages are built from tag pairs, they have little in common. XML is a format for data exchange: it holds well-defined, structured content within its tags. HTML, on the other hand, does not care what the content means or how it is structured; its only purpose is to tell the browser how to display that content. XML is used to define and carry the content, whereas HTML is used to make it "pretty."
This is not to say that XML data cannot be made pretty, or that you cannot display XML data in your web browser. In fact, this is exactly what you do when using Extensible Stylesheet Language (XSL) and Cascading Style Sheets (CSS) to render your content into a format your web browser can understand, while still preserving the content categorization. For example, say you have an area on your website reserved for recent system messages, and those items each contain the following:
Title
Message
Author
Date of message
You might want to display the title in bold, the message as a paragraph, the author's name in italics, and the date in a small font. For this, you would use HTML. XML, on the other hand, only cares that there are four distinct content elements. By separating the data and its structure from the presentation elements, you can use the content however you want, and you're not limited to the particular marked-up style that static HTML has forced on you.
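As a rough illustration of this separation, the sketch below keeps the four content elements in one place and presents them in two different ways. It assumes Python, and the dictionary keys and function names (render_html, render_text) are hypothetical names chosen for this example, not part of any XML tool.

```python
from html import escape

# Hypothetical message data: the four content elements listed above.
message = {
    "title": "System Down for Maintenance",
    "message": "Going down for maintenance soon!",
    "author": "Joe SystemGod",
    "date": "March 4, 2004",
}

def render_html(msg):
    """Presentation 1: bold title, paragraph body, italic author, small date."""
    return (
        f"<b>{escape(msg['title'])}</b>\n"
        f"<p>{escape(msg['message'])}</p>\n"
        f"<i>{escape(msg['author'])}</i>\n"
        f"<small>{escape(msg['date'])}</small>"
    )

def render_text(msg):
    """Presentation 2: the same data as plain text, e.g. for a log file."""
    return f"{msg['title'].upper()}\n{msg['message']}\n-- {msg['author']}, {msg['date']}"

print(render_html(message))
print(render_text(message))
```

The data never changes between the two calls; only the presentation does, which is the point of keeping content and markup apart.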
Before moving forward into working with XML documents, you need to know exactly how to create them! XML documents contain two major parts, the prolog and the body. The prolog contains the XML declaration (much like an HTML document type declaration), along with any processing instructions and comments you want to add.
Note: For a complete definition of XML documents, read the XML specification at http://www.w3.org/TR/REC-xml.
Using the system message example from the previous section, open a text editor and create a file called messages.xml. Type the following:
<?xml version="1.0" ?>
<!-- Sample XML document -->
Next, the fun begins in the body area of the document, where the content structure is contained. XML is hierarchical, like a book: books have titles and chapters, each of which contains paragraphs, and so forth. There is only one root element in an XML document. Using the book example, the element might be called Book, and the tags <Book></Book> surround all other information.
But I am using the system messages example here, so call the root element SystemMessage, and add an opening tag to your messages.xml document:
<SystemMessage>
Next, add any subsequent elements, called children, to your document. Using the system messages example, you need title, body, author, and date information. Call the child elements MessageTitle, MessageBody, MessageAuthor, and MessageDate. But what if you want both the name and an e-mail address for the author? Not a problem: you just create another set of child elements within your parent element (which also just happens to be a child element of the root element). For example, just the <MessageAuthor> element could look like this:
<MessageAuthor>
  <MessageAuthorName>Joe SystemGod</MessageAuthorName>
  <MessageAuthorEmail>systemgod@someserver.com</MessageAuthorEmail>
</MessageAuthor>
All together, your sample messages.xml document could look something like this:
<?xml version="1.0" ?>
<!-- Sample XML document -->
<SystemMessage>
  <MessageTitle>System Down for Maintenance</MessageTitle>
  <MessageBody>Going down for maintenance soon!</MessageBody>
  <MessageAuthor>
    <MessageAuthorName>Joe SystemGod</MessageAuthorName>
    <MessageAuthorEmail>systemgod@someserver.com</MessageAuthorEmail>
  </MessageAuthor>
  <MessageDate>March 4, 2004</MessageDate>
</SystemMessage>
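To see the hierarchy of this document from a program's point of view, here is a minimal sketch that parses it and walks from the root element down to the nested author children. It assumes Python and its standard xml.etree.ElementTree module, which is a choice made for this illustration only. The document is embedded as a string so the sketch runs on its own; reading your messages.xml file from disk would work equally well.

```python
import xml.etree.ElementTree as ET

# The sample document from the text, embedded so the sketch is self-contained.
MESSAGES_XML = """<?xml version="1.0" ?>
<!-- Sample XML document -->
<SystemMessage>
  <MessageTitle>System Down for Maintenance</MessageTitle>
  <MessageBody>Going down for maintenance soon!</MessageBody>
  <MessageAuthor>
    <MessageAuthorName>Joe SystemGod</MessageAuthorName>
    <MessageAuthorEmail>systemgod@someserver.com</MessageAuthorEmail>
  </MessageAuthor>
  <MessageDate>March 4, 2004</MessageDate>
</SystemMessage>"""

# Parse the text; the result is the single root element of the document.
root = ET.fromstring(MESSAGES_XML)

# Direct children of the root, in document order.
child_tags = [child.tag for child in root]

# Children are looked up by name; nested children use a path-like expression.
title = root.findtext("MessageTitle")
author_name = root.findtext("MessageAuthor/MessageAuthorName")
author_email = root.findtext("MessageAuthor/MessageAuthorEmail")

print(root.tag)
print(child_tags)
print(title, author_name, author_email)
```

Note that MessageAuthorName is reached through its parent, MessageAuthor, because it is a grandchild of the root rather than a direct child.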
Here are two very important rules to keep in mind for creating well-formed XML documents:
XML is case-sensitive, so <Book> and <book> are different elements.
All XML tags must be properly closed, XML tags must be properly nested, and no overlapping tags are allowed.
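These rules can be checked mechanically, because an XML parser refuses any document that breaks them. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample snippets are hypothetical, made up for this illustration. It tests only whether the markup can be parsed, not whether the document conforms to any particular DTD or schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical snippets, one per rule.
samples = {
    "properly closed and nested": "<Book><Title>XML</Title></Book>",
    "case mismatch": "<Book></book>",                      # <Book> is not closed by </book>
    "unclosed tag": "<Book><Title>XML</Book>",             # <Title> is never closed
    "overlapping tags": "<Book><b><i>XML</b></i></Book>",  # <b> closes before <i>
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first snippet is accepted; the other three each break one of the rules above and are rejected by the parser.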
Put the messages.xml file (or one like it) in the document root of your web server for use in later examples. As a side note, current versions of some browsers, such as Microsoft Internet Explorer and Netscape, allow you to view your XML document in a tree-like format, using their own internal style sheets. The first figure shows the original view of the messages.xml file with all elements opened, whereas the next figure shows the messages.xml file with the MessageAuthor element collapsed.