Wednesday, November 30, 2011

6 - DTDs

Quick questions:

1) What exactly does a DTD do in XML?

A DTD (Document Type Definition) is used to validate an XML document. It declares which elements and attributes the document may contain, and the order and nesting in which the elements may appear.
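
As a minimal sketch of what this means in practice, the document below carries its DTD in an internal subset. The element and attribute names (note, to, body, priority) are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example: names are illustrative only -->
<!DOCTYPE note [
  <!-- Order matters: "to" must come before "body" -->
  <!ELEMENT note (to, body)>
  <!-- The attribute is mandatory and limited to two values -->
  <!ATTLIST note priority (low | high) #REQUIRED>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note priority="high">
  <to>Jill</to>
  <body>Chapter 3 is due on Friday.</body>
</note>
```

Swapping the two child elements, or leaving out the priority attribute, would leave the document well-formed but no longer valid against this DTD.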

2) You’ve written an XML document, with the XML declaration  <?xml version="1.0"?>
  1. change the XML declaration to <?xml version="1.0" encoding="ISO-8859-6"?>
  2. change the XML declaration to <?xml version="1.0" encoding="UTF-8"?>
  3. do nothing: the declaration is fine as it is.
The declaration is fine as it is. When no encoding is declared (and there is no byte order mark signalling UTF-16), an XML parser assumes UTF-8, and UTF-8 can represent every Arabic character covered by ISO-8859-6. The file must, of course, actually be saved as UTF-8.
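
A small sketch of such a document, assuming the file is saved as UTF-8. The greeting element is a hypothetical name used only for this example.

```xml
<?xml version="1.0"?>
<!-- No encoding declaration: the parser falls back to UTF-8.
     This only works if the file really is stored as UTF-8. -->
<greeting xml:lang="ar">مرحبا</greeting>
```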

3) Can you use a binary graphics file in an XML document?

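Yes, but not inline. An image can be declared as an external unparsed entity: the DTD declares a notation identifying the image format and an entity that points to the file with NDATA, so the parser does not try to parse the binary data. The document then refers to the entity through an attribute of type ENTITY.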
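
A minimal sketch of how an image might be attached in this way. The file name toba-map.png, the notation identifier and the figure element are all assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE figure [
  <!-- The notation names the data format; "image/png" is an illustrative identifier -->
  <!NOTATION png SYSTEM "image/png">
  <!-- NDATA marks the entity as unparsed, so the XML parser never reads the bytes -->
  <!ENTITY tobaMap SYSTEM "toba-map.png" NDATA png>
  <!ELEMENT figure EMPTY>
  <!-- An unparsed entity can only be referenced from an ENTITY-typed attribute -->
  <!ATTLIST figure src ENTITY #REQUIRED>
]>
<figure src="tobaMap"/>
```

Note that the attribute value is the bare entity name, not an &tobaMap; reference. It is then up to the application, not the parser, to load and display the file.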

Longer questions:


I decide to produce a book called “Toba: the worst volcanic eruption of all”. I ask 3 colleagues to write three text files entitled:
  • “Chapter 1: The mystery of Lake Toba’s origins”.
  • “Chapter 2: Volcanic winter”.
  • “Chapter 3: What Toba did to the human race”.
All three text files are placed into a folder c:\bookproject\chapters on the hard drive on my computer. I insert <text> at the start of each file, and </text> at the end. I name the three files chap1.xml, chap2.xml, and chap3.xml respectively. I draw up the title page, title page verso and contents page of the book like this: Then I construct an XML document that encompasses the whole book.

a. Provide this XML document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book SYSTEM "Lab06_book.dtd">
<book>
            <titlePage>
                        <bookTitle>Toba: the worst volcanic eruption of all</bookTitle>
                        <author>John</author>
                        <author>Jack</author>
                        <author>Jill</author>
                        <author>Joe</author>
                        <publisher>&pub;</publisher>
            </titlePage>
            <titlePageVerso>
                        <copyright>Copyright 2010 STC Press</copyright>
                        <publishedBy>&pub;</publishedBy>
                        <ISBN>978-0-596-52722-0</ISBN>
            </titlePageVerso>
            <contents>
                        <chapterName number="1">The Mystery of Lake Toba's origins</chapterName>
                        <chapterName number="2">Volcanic Winter</chapterName>
                        <chapterName number="3">What Toba did to the human race</chapterName>
            </contents>
            <chapter number="1" name="The Mystery of Lake Toba's origins">&chap1;</chapter>
            <chapter number="2" name="Volcanic Winter">&chap2;</chapter>
            <chapter number="3" name="What Toba did to the human race">&chap3;</chapter>
</book>

b.  Provide the accompanying .dtd file
<?xml version="1.0" encoding="UTF-8"?>
        <!ENTITY pub "STC Press, Malta">
        <!ENTITY chap1 SYSTEM "chap1.xml">
        <!ENTITY chap2 SYSTEM "chap2.xml">
        <!ENTITY chap3 SYSTEM "chap3.xml">
        <!ELEMENT book (titlePage, titlePageVerso, contents, chapter+)>
        <!ELEMENT titlePage (bookTitle, author+, publisher)>
        <!ELEMENT bookTitle (#PCDATA)>
        <!ELEMENT author (#PCDATA)>
        <!ELEMENT publisher (#PCDATA)>
        <!ELEMENT titlePageVerso (copyright, publishedBy, ISBN)>
        <!ELEMENT copyright (#PCDATA)>
        <!ELEMENT publishedBy (#PCDATA)>
        <!ELEMENT ISBN (#PCDATA)>
        <!ELEMENT contents (chapterName+)>
        <!ELEMENT chapterName (#PCDATA)>
        <!ATTLIST chapterName number CDATA #REQUIRED>
        <!ELEMENT chapter (text)>
        <!ATTLIST chapter number CDATA #REQUIRED name CDATA #REQUIRED>
        <!ELEMENT text (#PCDATA)>
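
For the chapter entities to validate, each external file has to supply exactly what the declaration of chapter expects, namely a single text element. A sketch of what chap1.xml might look like under that assumption (the body text here is a placeholder, not the colleague's real chapter):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<text>
Placeholder for the body of Chapter 1.
</text>
```

The first line is a text declaration, which is optional in an external parsed entity but useful when the file's encoding might differ from that of the main document.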


Wednesday, November 23, 2011

5 - Well-formedness & DTDs

Quick questions:

1. This is a smiley. Is it also a well-formed XML document? Say why?

<:-/>

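Yes, according to the XML 1.0 grammar. A name may begin with a letter, an underscore or a colon, and later characters may include hyphens, full stops and digits, so “:-” is a legal element name and <:-/> is an empty-element tag that forms a complete document on its own. The W3C Recommendation does advise against using colons in names except for namespaces, and a namespace-aware parser is expected to reject the smiley because there is no prefix before the colon.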

2. What is the difference between well-formed and valid XML?

A well-formed XML document obeys the syntax rules of XML itself: a single root element, properly nested and closed tags, quoted attribute values, and so on. A valid XML document is well-formed and, in addition, conforms to the structure declared in a DTD or schema.
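
The difference can be sketched with a document that is well-formed but not valid. The DTD below is a hypothetical one written just for this illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE attraction [
  <!ELEMENT attraction (name, district)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT district (#PCDATA)>
]>
<!-- Well-formed: one root element, tags properly nested and closed.
     Not valid: the DTD requires name to come before district. -->
<attraction>
  <district>Central London</district>
  <name>London Eye</name>
</attraction>
```

A non-validating parser would accept this document without complaint; a validating parser would report the ordering error.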

3. Is it a good idea to start an XML document with a comment, explaining what the document is and what it’s for? Say why.

No, not before the XML declaration. The declaration is optional, but if it is present it must be the very first thing in the document, so no comment may precede it. However, it is good practice to add a comment describing the document right after the declaration, or anywhere else that comments are allowed.
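
A short sketch of the recommended placement, using a hypothetical attraction record:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Describes one London tourist attraction; comment text is illustrative -->
<attraction>
  <name>London Eye</name>
</attraction>
```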

Longer questions:

1. A set of documents is to be constructed as follows. The type of document is a college textbook. Every college textbook has a title page, on which is a title and an author and the publisher; optionally, there may be an aphorism. Every college textbook has a title page verso, on which is a publisher’s address, a copyright notice, an ISBN; there may be a dedication, or there may be more than one. Every college textbook has several chapters, and each chapter has several sections, and each section has several bodies of text. A chapter is identified by a chapter number and a chapter title. A section is identified by a section number and a section title. The name of the publisher will always be Excellent Books Ltd. The address of the publisher will always be 21 Cemetry Lane, SE1 1AA, UK. The application that will process the documents can accept Unicode.
Write a .dtd file for this specification.

<?xml version="1.0" encoding="utf-8"?>
<!ENTITY publisherName "Excellent Books Ltd">
<!ENTITY publisherAddress "21 Cemetry Lane, SE1 1AA, UK">
<!ELEMENT collegeTextbook (titlePage, titlePageVerso, chapter+)>
<!ELEMENT titlePage (title, author, publisher, aphorism?)>
<!ELEMENT titlePageVerso (publisherAddress, copyright, ISBN, dedication*)>
<!ELEMENT chapter (section+)>
<!ELEMENT section (bodyText+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT aphorism (#PCDATA)>
<!ELEMENT publisherAddress (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT dedication (#PCDATA)>
<!ELEMENT bodyText (#PCDATA)>
<!ATTLIST chapter chapterNumber CDATA #REQUIRED>
<!ATTLIST chapter chapterTitle CDATA #REQUIRED>
<!ATTLIST section sectionNumber CDATA #REQUIRED>
<!ATTLIST section sectionTitle CDATA #REQUIRED>

Write an XML document that contains the following information: the name of a London tourist attraction. The name of the district it is in. The type of attraction it is (official building, art gallery, park etc). Whether it is in-doors or out-doors. The year it was built or founded [Feel free to make this up if you don’t know]. Choose appropriate tags. Use attributes for the type of attraction and in-doors or out-doors status.

<?xml version="1.0" encoding="utf-8"?>
<attraction type="Ferris Wheel" indoors="No">
  <name>London Eye</name>
  <district>Central London</district>
  <yearFounded>1900</yearFounded>
</attraction>

The following is the document element (root element) of an XML document.

a) It’s clear that it’s concerned with English phrases and their Russian translations. One of the start tags is <targLangPhrase> with </targLangPhrase> as its end tag. Why do you suppose this isn’t <russianPhrase> with </russianPhrase> ?

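The document happens to translate English phrases into Russian, but the same structure could hold translations into any other language. A language-neutral tag name therefore makes sense, with the actual target language recorded in the targLang attribute of the root element.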

b) Write a suitable prolog for this document.


<!DOCTYPE phraseBook>
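
A fuller prolog would usually also carry the XML declaration and point to the external DTD. In the sketch below the file name phraseBook.dtd is an assumption, since the exercise does not name the file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- phraseBook.dtd is an assumed file name, not one given in the exercise -->
<!DOCTYPE phraseBook SYSTEM "phraseBook.dtd">
```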

c) Write a .dtd file to act as the Document Type Description for this document.

<!ELEMENT phraseBook (section+)>
<!ATTLIST phraseBook targLang CDATA #REQUIRED>
<!ELEMENT section (sectionTitle, phraseGroup+)>
<!ELEMENT sectionTitle (#PCDATA)>
<!ELEMENT phraseGroup (engPhrase, translitPhrase, targLangPhrase)>
<!ELEMENT engPhrase (#PCDATA | gloss)*>
<!ELEMENT translitPhrase (#PCDATA | gloss)*>
<!ELEMENT targLangPhrase (#PCDATA)>
<!ELEMENT gloss (#PCDATA)>

d) The application that is to use this document runs on a Unix system, and was written some years ago. Is that likely to make any difference to the XML declaration?

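An older Unix application may predate widespread Unicode support and may expect a legacy encoding. Because the document contains Cyrillic text, the XML declaration should state the encoding explicitly, for example encoding="UTF-8", and the file must be saved in that encoding. If the application can only read a legacy Cyrillic encoding such as KOI8-R or ISO-8859-5, the declaration should name that encoding instead.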

<phraseBook targLang="Russian">
  <section>
    <sectionTitle>Greetings</sectionTitle>
    <phraseGroup><engPhrase>Hi!</engPhrase><translitPhrase>privEt</translitPhrase><targLangPhrase>Привет!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Good morning!</engPhrase><translitPhrase>dObraye Utra</translitPhrase><targLangPhrase>Доброе утро!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Good evening!</engPhrase><translitPhrase>dObriy dEn/ vEcher <gloss>(day/evening)</gloss></translitPhrase><targLangPhrase>Добрый день/ вечер!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Welcome! <gloss>(to greet someone)</gloss></engPhrase><translitPhrase>dabrO pazhAlavat’</translitPhrase><targLangPhrase>Добро пожаловать!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>How are you?</engPhrase><translitPhrase>kak dela?</translitPhrase><targLangPhrase>Как дела?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>I'm fine, thanks!</engPhrase><translitPhrase>harashO! Spasiba</translitPhrase><targLangPhrase>Хорошо, спасибо!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>And you?</engPhrase><translitPhrase>a u tibyA?</translitPhrase><targLangPhrase>А у тебя?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Good/ So-So.</engPhrase><translitPhrase>harashO/ tAk sibe</translitPhrase><targLangPhrase>Хорошо/Так себе</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Thank you <gloss>(very much)</gloss>!</engPhrase><translitPhrase>spasiba</translitPhrase><targLangPhrase>Спасибо!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>You're welcome! <gloss>(for "thank you")</gloss></engPhrase><translitPhrase>pazhAlusta</translitPhrase><targLangPhrase>пожалуйста!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Hey! Friend!</engPhrase><translitPhrase>Ey, drug!</translitPhrase><targLangPhrase>Эй, друг\ Эй, приятель.</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>I missed you so much!</engPhrase><translitPhrase>Ya tak sil'no skuchAl/a <gloss>(female)</gloss> pa tibE</translitPhrase><targLangPhrase>Я так сильно скучал/a по тебе</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>What's new?</engPhrase><translitPhrase>Chto nOvava?</translitPhrase><targLangPhrase>Что нового?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Nothing much</engPhrase><translitPhrase>NiplOha/ NichivO</translitPhrase><targLangPhrase>Неплохо\ Ничего.</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Good night!</engPhrase><translitPhrase>spakOynay nOchi</translitPhrase><targLangPhrase>спокойной ночи</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>See you later!</engPhrase><translitPhrase>da vstrEchi/ da svidAn’ya</translitPhrase><targLangPhrase>до встречи/ до свидания</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Good bye!</engPhrase><translitPhrase>pakA/ da svidAn’ya</translitPhrase><targLangPhrase>Пока/до свидания</targLangPhrase></phraseGroup>
  </section>
  <section>
    <sectionTitle>Asking for Help and Directions</sectionTitle>
    <phraseGroup><engPhrase>I'm lost</engPhrase><translitPhrase>ya zabludils’a</translitPhrase><targLangPhrase>Я заблудился</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Can I help you?</engPhrase><translitPhrase>Ya magU vam pamOch?</translitPhrase><targLangPhrase>Я могу вам помочь?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Can you help me?</engPhrase><translitPhrase>Vy mOzhite mne pamOch?</translitPhrase><targLangPhrase>Вы можете мне помочь?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Where is the (bathroom/ pharmacy)?</engPhrase><translitPhrase>Gde nahOditsa (vAnnaya/ aptEka)?</translitPhrase><targLangPhrase>Где находится (Ванная/ Аптека)</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Go straight! then turn left/ right!</engPhrase><translitPhrase>idite pryAmo, patOm nalEva/ naprAva</translitPhrase><targLangPhrase>Идите прямо, потом налево/направо</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>I'm looking for john.</engPhrase><translitPhrase>Ya ichU DzhOna</translitPhrase><targLangPhrase>Я ищу Джона.</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>One moment please!</engPhrase><translitPhrase>MinUtu, pazhAlusta</translitPhrase><targLangPhrase>Минуту, пожалуйста.</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Hold on please! <gloss>(phone)</gloss></engPhrase><translitPhrase>PadazhdIte, pazhAlusta!</translitPhrase><targLangPhrase>Подождите, пожалуйста!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>How much is this?</engPhrase><translitPhrase>SkOl'ka Eta stOit?</translitPhrase><targLangPhrase>Сколько это стоит?</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Excuse me ...! <gloss>(to ask for something)</gloss></engPhrase><translitPhrase>izvinite! / prastite</translitPhrase><targLangPhrase>Извините\Простите</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Excuse me! <gloss>(to pass by)</gloss></engPhrase><translitPhrase>izvinite!</translitPhrase><targLangPhrase>Извините!</targLangPhrase></phraseGroup>
    <phraseGroup><engPhrase>Come with me!</engPhrase><translitPhrase>PaidyOmte sa mnOy!</translitPhrase><targLangPhrase>Пойдемте со мной!</targLangPhrase></phraseGroup>
  </section>
</phraseBook>