Thursday, December 8, 2011

7 - DTDs


Quick questions:

1.    People who prepare XML documents sometimes put part of the document in a CDATA section.

a. Why would they do that?

A CDATA section marks text that the XML parser should treat as literal character data rather than as markup. This is useful when the text contains characters that XML reserves, such as "<", ">" or "&", for example a fragment of program code.

b. How is the CDATA section indicated?

A CDATA section starts with <![CDATA[ and ends with ]]>, with no spaces inside the delimiters: <![CDATA[Your text ...]]>
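
As a minimal sketch, a CDATA section holding a line of program code might look like the following. The <example> element name and the code inside it are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- <example> is a hypothetical element name, used only for illustration -->
<example><![CDATA[if (a < b && b > c) { swap(a, c); }]]></example>
```

The only character sequence that cannot appear inside the section is ]]>, because it marks the end.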

c. If CDATA sections hadn't been invented, would there be any other way to achieve the same effect?
Comments are not a real alternative: markup characters inside a comment are not parsed, but comment text is not part of the document's data, so an application may never see it. The practical alternative is escaping: replace "<" with "&lt;", ">" with "&gt;" and "&" with "&amp;" (the trailing semicolon is required). This takes more time and effort and makes the raw text harder for humans to read.
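
A short sketch of the escaping approach, using a made-up <example> element and a made-up line of code as its content:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- <example> is a hypothetical element name, used only for illustration -->
<example>if (a &lt; b &amp;&amp; b &gt; c) { swap(a, c); }</example>
```

After parsing, the application receives the text if (a < b && b > c) { swap(a, c); } with the entity references already replaced.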

2. What is a parser and what does it have to do with validity?

A parser is the software that reads an XML document, identifies its parts (such as the prolog, the document type declaration, the elements and their attributes) and passes the data on to an application. There are two types: non-validating and validating. A non-validating parser checks only that the document is well-formed. A validating parser also checks that the structure of the document conforms to a specified DTD or schema, that is, that the document is valid.
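
To illustrate the difference, here is a small sketch of a document that is well-formed but not valid. The element names <note>, <to> and <body> are hypothetical and chosen only for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document: element names are made up for illustration -->
<!DOCTYPE note [
<!ELEMENT note (to, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
  <body>Every tag is closed and properly nested, so this is well-formed.</body>
</note>
```

A non-validating parser would accept this document. A validating parser would report an error, because the DTD requires a <to> element before <body> and none is present.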

3. You write a .dtd file to accompany a class of XML documents. You want one of the elements, with the tag <trinity>, to appear exactly three times within the document element of every document in this class. Is it possible for the .dtd file to specify this?

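Yes. A DTD has no numeric occurrence indicator (only ?, * and +), so "exactly three" cannot be written as a number, but the content model of the document element can list <trinity> three times in a row. Other child elements can be placed before, between or after them in the same model. Renaming the elements to <trinity1>, <trinity2> and <trinity3> is therefore unnecessary.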
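
A content model may name the same element more than once, which gives a fixed count without a numeric operator. A minimal sketch, assuming a document element called <doc> and two other children, <intro> and <note>, all of which are invented for this example:

```xml
<!-- Hypothetical names: <doc>, <intro> and <note> stand in for whatever
     the real class of documents contains; only <trinity> is from the question -->
<!ELEMENT doc (intro, trinity, trinity, trinity, note*)>
<!ELEMENT intro (#PCDATA)>
<!ELEMENT trinity (#PCDATA)>
<!ELEMENT note (#PCDATA)>
```

With these declarations, a validating parser rejects any <doc> that contains two or four <trinity> elements.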

Longer questions:

1. The following is one of the documents that featured in last week’s exercises. As mentioned before, this is to be “Chapter 2: Volcanic winter” in a book.

a) Write a suitable prolog for this document.

b) Write a .dtd file to act as the Document Type Definition for this document. Or modify the one you wrote last week, if you wrote one.

c) Put tags into the document. Obviously, there must be a document element. But also, the poem needs special treatment (because of the way it will be displayed) and, in fact, each line of the poem needs special treatment (you can spot the places where the lines start, by the capital letters). The mention of the poets at Geneva needs to be identified, because it will feature in the index, and so do the pyroclastic flows and Mount Tambora and Sumbawa and the year without a summer and the famines.

2. This chapter obviously needs some pictures. You have available the following, and you decide to include them in the chapter, at appropriate places:
  • a picture of Sumbawa, after the volcanic eruption. It’s in a file sumbawa.jpg. Caption: “Sumbawa, after the volcanic eruption”.
  • a picture of Lake Geneva, in 1816. It’s in a file Geneva1816.jpg. Caption: “Lake Geneva, during the summer of 1816”.
  • a picture of Mary Shelley. It’s in a file MaryShelley.jpg. Caption: “Mary Shelley, author of Frankenstein”.

Amend your two files so that they can cope with these pictures and captions.

DTD File:

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT chapter (title, section)>
<!ELEMENT section (text | poem | img)*>
<!ELEMENT poem (verse+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT text (#PCDATA | indexEntry)*>
<!ELEMENT indexEntry (#PCDATA)>
<!ELEMENT verse (#PCDATA)>
<!ELEMENT img EMPTY>
<!ATTLIST chapter chapterNumber ID #REQUIRED>
<!ATTLIST img src CDATA #REQUIRED
              cap CDATA #REQUIRED>

XML File:


<?xml version="1.0" encoding="UTF-8"?>

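<!-- Assumes the DTD above is saved as chapter.dtd in the same folder; the file name is illustrative -->
<!DOCTYPE chapter SYSTEM "chapter.dtd">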

<chapter chapterNumber="Two">

            <title>Chapter Two: Volcanic winter</title>
            <section>
                        <text>A volcanic winter is very bad news. The worst eruption in recorded history happened at <indexEntry>Mount Tambora</indexEntry> in 1815. It killed about 71 000 people locally, mainly because the <indexEntry>pyroclastic flows</indexEntry> killed everyone on the island of <indexEntry>Sumbawa</indexEntry> and the tsunamis drowned the neighbouring islands, but also because the ash blanketed many other islands and killed the vegetation. It also put about 160 cubic kilometres of dust and ash, and about 150 million tons of sulphuric acid mist, into the sky, which started a volcanic winter throughout the northern hemisphere.</text>
                        <img src="/sumbawa.jpg" cap="Sumbawa, after the volcanic eruption"/>
                        <text>The next year was the <indexEntry>year without a summer</indexEntry>. No spring, no summer – it stayed dark and cold all the year round. This had its upside. In due course, all that ash and mist in the upper atmosphere made for some lovely sunsets, and Turner was inspired to paint this.</text>
                        <text>The Lakeland poets took a holiday at <indexEntry>Geneva</indexEntry>, and the weather was so horrible that Lord Byron was inspired to write this. </text>
                        <img src="/Geneva1816.jpg" cap="Lake Geneva, during the summer of 1816"/>
                        <poem>
                                    <verse>The bright sun was extinguish'd, and the stars </verse>
                                    <verse>Did wander darkling in the eternal space,</verse>
                                    <verse>Rayless, and pathless, and the icy earth</verse>
                                    <verse>Swung blind and blackening in the moonless air;</verse>
                                    <verse>Morn came and went—and came, and brought no day,</verse>
                        </poem>
                        <text>Mary Shelley was inspired to write Frankenstein.</text>
                        <img src="/MaryShelley.jpg" cap="Mary Shelley, author of Frankenstein"/>
                        <text>The downside was that there were <indexEntry>famines</indexEntry> throughout Europe, India, China and North America, and perhaps 200 000 people died of starvation in Europe alone.</text>
            </section>
</chapter>



