Explain DTD

A document type definition (DTD) describes the permissible tags of an XML document.

The DTD serves as a data template. It defines entities, elements, attributes, and notations, as well as the relationships between these.

For example, the DTD can state that a memo element consists of To, From, Subject, and Message elements.

You need to use a DTD if you want the XML processor to validate your XML documents.

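DTDs help you ensure that your XML documents are valid, meaning that they follow the structure the DTD declares. Well-formedness is a separate, more basic requirement that every XML parser checks even without a DTD.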

You need to indicate to the XML processor that a DTD should be used. This is done by adding a document type declaration before the document element.

The DTD can be internal or external to the document. If the DTD is external, you will need to specify its location or URL.
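As a rough sketch of the memo example and of an internal DTD, the Python snippet below places a document type declaration before the document element and parses the result with the standard library's xml.dom.minidom. The sample names and text inside the elements are illustrative assumptions. Python's standard-library parsers read the DTD but do not validate against it, so this only shows where the declaration goes and how it is exposed; actual validation needs a validating parser.

```python
from xml.dom import minidom

# Hypothetical memo document with an internal DTD (sample content is made up).
MEMO_XML = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (To, From, Subject, Message)>
  <!ELEMENT To (#PCDATA)>
  <!ELEMENT From (#PCDATA)>
  <!ELEMENT Subject (#PCDATA)>
  <!ELEMENT Message (#PCDATA)>
]>
<memo>
  <To>Alice</To>
  <From>Bob</From>
  <Subject>Status</Subject>
  <Message>All tests pass.</Message>
</memo>"""

doc = minidom.parseString(MEMO_XML)

# The document type declaration is available on the parsed document.
doctype_name = doc.doctype.name

# Collect the child element names, skipping whitespace text nodes.
child_names = [
    node.tagName
    for node in doc.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]

print(doctype_name)
print(child_names)
```

For an external DTD, the declaration would instead name the DTD's location, for example `<!DOCTYPE memo SYSTEM "memo.dtd">`, where `memo.dtd` is a hypothetical file name.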

Is an XML schema an alternative to DTD

Yes. Schemas are becoming more popular and DTDs less so. A schema is an XML-based language for describing the structure and content of an XML document. It serves the same purpose as a DTD, but a DTD has several drawbacks.
A DTD doesn’t use XML syntax for its declarations. You can’t do data typing in a DTD, and it’s not extensible.

An XML schema allows you to declare an element’s content as an integer, a float, a Boolean, or another data type, and it is more extensible. Plus, the schema, which is the definition for your XML, is itself written in XML.
Microsoft is pushing XML Schema, so you’re likely to see schemas more and more. That’s all we use. In fact, the new BizTalk Server from Microsoft actually uses schemas.
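To illustrate the point that a schema is itself XML and carries data types, here is a small sketch that reads a schema with an ordinary XML parser. The element names (quantity, price, shipped) are hypothetical. The snippet only inspects the schema; Python's standard library does not include an XML Schema validator, so validating a document against it would need a separate tool.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema declaring three typed elements.
SCHEMA_XML = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="quantity" type="xs:integer"/>
  <xs:element name="price" type="xs:float"/>
  <xs:element name="shipped" type="xs:boolean"/>
</xs:schema>"""

# Because the schema is XML, a plain XML parser can load it.
schema = ET.fromstring(SCHEMA_XML)

# Map each declared element name to its declared data type.
declared_types = {
    element.get("name"): element.get("type")
    for element in schema.findall(f"{{{XS}}}element")
}

print(declared_types)
```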

Explain XML Namespaces

An XML namespace is a collection of element and attribute names, identified by a URI, that keeps names from different vocabularies distinct within an XML document.
Let’s say you have an invoice XML document and an order XML document and you want to put them together. You usually have certain names that overlap each other – like the date. You might have an invoice date and an order date – both called “date”.

With a namespace, you have an invoice date and an order date, and you can reference them as such – “invoice:date” and then “order:date.” That way you can still use the same two tag names and you won’t get confused about which is which. That’s really what an XML namespace is supposed to do for you.
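The invoice and order case can be sketched as follows. The namespace URIs and the date values are made up for illustration; the prefixes match the ones used in the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; any unique URI would serve.
NAMESPACES = {
    "invoice": "http://example.com/invoice",
    "order": "http://example.com/order",
}

COMBINED_XML = """<document
    xmlns:invoice="http://example.com/invoice"
    xmlns:order="http://example.com/order">
  <invoice:date>2024-01-15</invoice:date>
  <order:date>2024-01-10</order:date>
</document>"""

root = ET.fromstring(COMBINED_XML)

# Both elements are called "date", but the prefix selects the right one.
invoice_date = root.find("invoice:date", NAMESPACES).text
order_date = root.find("order:date", NAMESPACES).text

print(invoice_date, order_date)
```

The prefix is only shorthand. What actually distinguishes the two date elements is the URI each prefix is bound to.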

What are the advantages of XML

XML is free-form, which means that you can structure it any way you want. It’s also very easy to read: you don’t have to read cryptic code to figure it out.
It was designed for use over the Internet, which makes it simple to transmit an XML document over a protocol such as HTTP.
Strong data typing is available for XML through XML Schema, and XML is compatible with the SGML standard, of which it is a subset.

XML is application independent: you can transfer XML data from a C language program to a Visual Basic program. Or you could, for example, have XML data going from one server out across the Internet and being picked up by another application running on a UNIX box. XML is also platform independent: it works equally well on Windows, UNIX, or, for example, CICS on a mainframe.
XML is also language independent so it doesn’t matter what type of programming language you’re using – C, Visual Basic, ASP using JavaScript. It doesn’t matter – all have mechanisms to read an XML document.

XML is Unicode-based, which makes it very flexible and well suited to working across languages such as English, Spanish, or French. Text in any of these languages can be represented because it is all Unicode.
A huge benefit of XML is that it’s license free: it doesn’t cost anything to use XML.
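As a small illustration of the Unicode point, the sketch below stores English, Spanish, and French text in one UTF-8 encoded document and reads it back. The element names and phrases are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# One document holding text in several languages, encoded as UTF-8 bytes.
GREETINGS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<greetings>
  <greeting lang="en">Good morning</greeting>
  <greeting lang="es">Buenos días, señor</greeting>
  <greeting lang="fr">Ça va très bien</greeting>
</greetings>""".encode("utf-8")

root = ET.fromstring(GREETINGS_XML)

# Map each language code to its greeting text.
greeting_by_language = {
    greeting.get("lang"): greeting.text
    for greeting in root.findall("greeting")
}

for language, text in greeting_by_language.items():
    print(language, text)
```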

Differences between DTD and Schema

| Feature | XML Schema | DTD |
|---|---|---|
| Markup validation | Any global element can be root. No ambiguous content support. | Can specify only the root element in the instance document. No ambiguous content support. |
| Namespace support | Yes. Declarations only where multiple namespaces are used. | No. |
| Code reuse | Can reuse definitions using named types. | Poorly supported. Can use parameter entities. |
| Datatype validation | Provides flexible set of datatypes. Provides multi-field key cross references. No co-occurrence constraints. | No real datatype support. |

What are the disadvantages of DTD

  • They are not written in XML syntax, which means you have to learn a new syntax in order to write them
  • There is no support for namespaces
  • There are no constraints imposed on the kind of character data allowed, so datatyping is not possible
  • There is minimal support for code modularity and none for inheritance
  • Large DTDs are hard to read and maintain
  • There are no default values for elements, and attribute defaults must be specified when they are declared
  • Their attribute value models and ID attribute mechanism are simplistic
  • There is limited ability to control whitespace
  • There is limited documentation support, as you cannot use the structured documentation features available for schema notation

Explain XML attributes

Attributes are simple name/value pairs associated with an element.

They are attached to the start-tag, as shown below, but not to the end-tag:

<name nickname="sharat">
<first>sharat</first>
<middle></middle>
<last>lastname</last>
</name>

Attributes must have values, even if that value is just an empty string (""), and those values must be in quotes.

Attributes can provide metadata that may not be relevant to most applications dealing with the XML.

Attributes are unordered.

Elements can be more complex than attributes, since an element can contain child elements while an attribute holds only a text value.
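A short sketch of reading attributes with Python's standard library follows. It reuses the name element from the example above; the empty title attribute is a hypothetical addition to show that an empty string is a legal value.

```python
import xml.etree.ElementTree as ET

# The "title" attribute is an illustrative addition with an empty value.
NAME_XML = """<name nickname="sharat" title="">
  <first>sharat</first>
  <middle></middle>
  <last>lastname</last>
</name>"""

name = ET.fromstring(NAME_XML)

nickname = name.get("nickname")   # value of an attribute that is present
title = name.get("title")         # present, but an empty string
suffix = name.get("suffix")       # not present at all, so None

# An attribute value without quotes is rejected by the parser.
try:
    ET.fromstring("<name nickname=sharat/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(repr(nickname), repr(title), repr(suffix), unquoted_accepted)
```

Attributes are exposed as a name-to-value mapping (`name.attrib`), which matches the point that their order carries no meaning.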

What do you mean by well-formed XML

XML documents must adhere to the following rules to be well-formed.

  1. Every start-tag must have a matching end-tag, or be a self-closing tag
  2. Tags can’t overlap
  3. XML documents can have only one root element
  4. Element names must obey XML naming conventions
  5. XML is case-sensitive, so the names in a start-tag and its end-tag must match exactly
  6. XML will keep whitespace in your text
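The rules above can be checked with any XML parser, since a parser must reject input that is not well-formed. The sketch below wraps Python's ElementTree parser in a hypothetical helper, is_well_formed, and tries it on small made-up fragments, one per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


checks = {
    "matched tags": is_well_formed("<a><b>text</b></a>"),
    "self-closing tag": is_well_formed("<a><b/></a>"),
    "missing end-tag": is_well_formed("<a><b>text</a>"),
    "overlapping tags": is_well_formed("<a><b><c></b></c></a>"),
    "two root elements": is_well_formed("<a/><b/>"),
    "case mismatch": is_well_formed("<a></A>"),
}

# Whitespace inside text content is kept as written.
kept_text = ET.fromstring("<a>  two  spaces  </a>").text

for label, result in checks.items():
    print(f"{label}: {result}")
```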

What are the differences between SAX and DOM parsers

Both SAX and DOM are used to parse XML documents. Each has advantages and disadvantages, and which one to use depends on the situation.

| SAX | DOM |
|---|---|
| Parses node by node | Stores the entire XML document in memory before processing |
| Doesn’t store the XML in memory | Occupies more memory |
| We can’t insert or delete a node | We can insert or delete nodes |
| Top-to-bottom traversal | Traversal in any direction |
| SAX is an event-based parser | DOM is a tree-model parser |
| SAX is the Simple API for XML | Document Object Model (DOM) API |
| Java imports: javax.xml.parsers.*, org.xml.sax.*, org.xml.sax.helpers.* | Java imports: javax.xml.parsers.*, org.w3c.dom.* |
| Doesn’t preserve comments | Preserves comments |
| Generally runs a little faster than DOM | Generally runs a little slower than SAX |

If we only need to find a node and don’t need to insert or delete anything, we can go with SAX; otherwise we should use DOM, provided we have enough memory.
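The contrast between the two models can be sketched in a few lines. This example uses Python's xml.sax and xml.dom.minidom rather than the Java packages named above, and the books document and BookCounter class are illustrative assumptions. The SAX half reacts to events while the document streams past; the DOM half builds the whole tree and then inserts a node into it.

```python
import xml.sax
from xml.dom import minidom

BOOKS_XML = "<books><book>A</book><book>B</book></books>"


class BookCounter(xml.sax.ContentHandler):
    """SAX handler: counts <book> start events as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "book":
            self.count += 1


# SAX: event based, top to bottom, nothing is kept in memory by the parser.
counter = BookCounter()
xml.sax.parseString(BOOKS_XML.encode("utf-8"), counter)

# DOM: the whole tree is in memory, so nodes can be added or removed.
doc = minidom.parseString(BOOKS_XML)
new_book = doc.createElement("book")
new_book.appendChild(doc.createTextNode("C"))
doc.documentElement.appendChild(new_book)

titles = [book.firstChild.data for book in doc.getElementsByTagName("book")]

print(counter.count)
print(titles)
```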

Describe the differences between XML and HTML

| XML | HTML |
|---|---|
| User-definable tags | Defined set of tags designed for web display |
| Content driven | Format driven |
| End tags required for well-formed documents | End tags not required |
| Quotes required around attribute values | Quotes not required |
| Slash required in empty tags | Slash not required |
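The syntax differences can be seen by feeding the same HTML-style fragment to an HTML parser and to an XML parser. This is a sketch using Python's html.parser and ElementTree; the fragment and the TagCollector class are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML-style markup: unquoted attribute value, <br> without a slash, no </p>.
HTML_STYLE = "<p class=note>first line<br>second line"

# The same content written to XML rules.
XML_STYLE = '<p class="note">first line<br/>second line</p>'


class TagCollector(HTMLParser):
    """Records each start tag and its attributes as the HTML parser sees them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


# The HTML parser accepts the loose markup.
collector = TagCollector()
collector.feed(HTML_STYLE)
collector.close()

# The XML parser rejects it as not well-formed.
try:
    ET.fromstring(HTML_STYLE)
    xml_accepts_html_style = True
except ET.ParseError:
    xml_accepts_html_style = False

# The XML-style version parses without error.
xml_root = ET.fromstring(XML_STYLE)

print(collector.tags)
print(xml_accepts_html_style)
print(xml_root.get("class"))
```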