Posted on May 18, 2007 by sharat
A document type definition (DTD) describes the permissible tags of an XML document.
The DTD serves as a data template. It defines entities, elements, attributes, and notations, as well as the relationships between these.
For example, the DTD can state that a memo element consists of To, From, Subject, and Message elements.
You need to use a DTD if you want the XML processor to validate your XML documents.
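DTDs can help you ensure that your XML documents are valid, that is, that they follow the structure the DTD declares. Well-formedness is a separate, more basic requirement that every XML document must meet, with or without a DTD.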
You need to indicate to the XML processor that a DTD should be used. This is done by adding a document type declaration before the document element.
The DTD can be internal or external to the document. If the DTD is external, you will need to specify its location or URL.
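As a rough sketch of where the document type declaration goes, here is the memo example with an internal DTD, read with Python's standard library. The memo content is made up for illustration. Note that `xml.dom.minidom` only reads the declaration; it does not validate against it, so a validating parser would be needed for real checking. For an external DTD the declaration would instead look something like `<!DOCTYPE memo SYSTEM "memo.dtd">`, where `memo.dtd` is a hypothetical file name.

```python
from xml.dom import minidom

# Hypothetical memo document with an internal DTD. The element names come from
# the memo example in the text; the content is invented for illustration.
MEMO_XML = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (To, From, Subject, Message)>
  <!ELEMENT To (#PCDATA)>
  <!ELEMENT From (#PCDATA)>
  <!ELEMENT Subject (#PCDATA)>
  <!ELEMENT Message (#PCDATA)>
]>
<memo>
  <To>Alice</To>
  <From>Bob</From>
  <Subject>Lunch</Subject>
  <Message>Noon at the usual place?</Message>
</memo>"""

# minidom reads the document type declaration but does NOT validate against it.
doc = minidom.parseString(MEMO_XML)

# The declaration sits before the document element and names that element.
doctype_name = doc.doctype.name

# Collect the child element names, skipping whitespace text nodes.
child_names = [
    node.tagName
    for node in doc.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]

print(doctype_name)
print(child_names)
```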
Filed under: XML | 1 Comment »
Posted on May 18, 2007 by sharat
Schemas are becoming more popular and DTDs less so. A schema is an XML-based syntax for describing how the XML document is marked up or how it looks – very similar to a DTD, but a DTD has a lot of drawbacks.
A DTD doesn’t use anything like XML syntax to describe the definition. You can’t do data typing in a DTD, and it’s not extensible.
An XML schema allows you to declare the content of an element as an integer, a float, a Boolean, a date, and so on, and it is more extensible. Plus, the schema itself, the definition for your XML, is also written in XML.
Microsoft is pushing the XML schema – so you’re likely to see schemas more and more. That’s all we use. In fact, the new BizTalk Server from Microsoft actually uses schemas.
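To make the "it's in XML format" point concrete, here is a small sketch. The `order` schema and its element names are hypothetical. Python's standard library has no schema validator, so this only shows that an ordinary XML parser can read a schema and see the declared data types; actual validation would need a schema-aware tool.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for an <order> element; names and types are illustrative.
ORDER_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="quantity" type="xs:integer"/>
        <xs:element name="price" type="xs:float"/>
        <xs:element name="shipped" type="xs:boolean"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Because a schema is itself XML, an ordinary XML parser can read it.
schema_root = ET.fromstring(ORDER_XSD)

# Map each typed element declaration to its declared data type.
declared_types = {
    element.get("name"): element.get("type")
    for element in schema_root.iter(f"{{{XS}}}element")
    if element.get("type") is not None
}

print(declared_types)
```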
Posted on May 18, 2007 by sharat
An XML namespace is a collection of element and attribute names, identified by a URI, that can be used within an XML document.
Let’s say you have an invoice XML document and an order XML document and you want to put them together. You usually have certain names that overlap each other – like the date. You might have an invoice date and an order date – both called “date”.
With a namespace, you have an invoice date and an order date, and you can reference them as such – “invoice:date” and then “order:date.” That way you can still use the same two tag names and you won’t get confused about which is which. That’s really what an XML namespace is supposed to do for you.
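The invoice and order example might look like the sketch below. The namespace URIs and the `record` wrapper element are assumptions made up for illustration; the prefixes `invoice` and `order` are bound to those URIs, and it is the URI, not the prefix, that tells the two `date` elements apart.

```python
import xml.etree.ElementTree as ET

# The namespace URIs are hypothetical; any unique URI would do.
NAMESPACES = {
    "invoice": "http://example.com/invoice",
    "order": "http://example.com/order",
}

# Two elements share the local name "date" but live in different namespaces.
COMBINED_XML = """<record xmlns:invoice="http://example.com/invoice"
        xmlns:order="http://example.com/order">
  <invoice:date>2007-05-18</invoice:date>
  <order:date>2007-05-01</order:date>
</record>"""

root = ET.fromstring(COMBINED_XML)

# Look each date up by its qualified name so there is no confusion.
invoice_date = root.find("invoice:date", NAMESPACES).text
order_date = root.find("order:date", NAMESPACES).text

print(invoice_date, order_date)
```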
Posted on May 18, 2007 by sharat
XML is free form which means that you can configure it any way you want. It’s also very easy to read – you don’t have to read cryptic code to figure it out.
It was designed for use over the internet, which makes it simple to transmit an XML document over a protocol such as HTTP.
Strong data typing is available for XML through schemas, and XML is also compatible with the SGML standard.
XML is application independent: you can transfer XML data from a C program to a Visual Basic program. Or you could, for example, have XML data going from one server out across the internet and being picked up by another application running on a Unix box. XML is also platform independent: it works equally well on Windows, on UNIX, or, for example, under CICS on a mainframe.
XML is also language independent so it doesn’t matter what type of programming language you’re using – C, Visual Basic, ASP using JavaScript. It doesn’t matter – all have mechanisms to read an XML document.
XML is Unicode-based which makes it very flexible and very good for operating across languages like English, Spanish, or French. You can describe all these types of languages since it’s in Unicode.
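A small sketch of the Unicode point, assuming a made-up `greetings` document: the text is encoded to UTF-8 bytes, as it would be on the wire, and parsed back without losing any accented characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing English, Spanish, and French text.
GREETINGS_XML = """<greetings>
  <greeting lang="en">Good morning</greeting>
  <greeting lang="es">Buenos días, señor</greeting>
  <greeting lang="fr">Bonjour, ça va très bien</greeting>
</greetings>"""

# Encode to UTF-8 bytes, as the document would travel over a network.
wire_bytes = GREETINGS_XML.encode("utf-8")

# Parse the bytes back on the receiving side; UTF-8 is the XML default.
root = ET.fromstring(wire_bytes)
by_language = {g.get("lang"): g.text for g in root.findall("greeting")}

print(by_language["es"])
print(by_language["fr"])
```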
A huge benefit of XML is that it’s license free: it doesn’t cost anything to use XML.
Posted on May 18, 2007 by sharat
| | XML Schema | DTD |
|---|---|---|
| Markup validation | Any global element can be root. No ambiguous content support. | Can specify only the root element in the instance document. No ambiguous content support. |
| Namespace support | Yes. Declarations only where multiple namespaces are used. | No. |
| Code reuse | Can reuse definitions using named types. | Poorly supported. Can use parameter entities. |
| Datatype validation | Provides flexible set of datatypes. Provides multi-field key cross references. No co-occurrence constraints. | No real datatype support. |
Posted on May 18, 2007 by sharat
- DTDs are not written in XML syntax, which means you have to learn a new syntax in order to write them
- There is no support for namespaces
- There are no constraints on the kind of character data allowed, so datatyping is not possible
- There is minimal support for code modularity and none for inheritance
- Large DTDs are hard to read and maintain
- There are no default values for elements, and attribute defaults must be specified when the attributes are declared
- DTD attribute value models and the ID attribute mechanism are simplistic
- There is limited ability to control whitespace
- There is limited documentation support, as you cannot use the structured documentation features available for schema notation
Posted on October 9, 2006 by sharat
Attributes are simple name/value pairs associated with an element.
They are attached to the start-tag, as shown below, but not to the end-tag:
<name nickname="sharat">
<first>sharat</first>
<middle></middle>
<last>lastname</last>
</name>
Attributes must have values, even if that value is just an empty string (like ""), and those values must be in quotes (single or double).
- Attributes can provide metadata that may not be relevant to most applications dealing with our XML
- Attributes are unordered
- Elements can be more complex than attributes
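Here is a short sketch of these attribute rules using Python's ElementTree. The `id` attribute and the `parses` helper are made up for illustration; the rest mirrors the `name` element from this post.

```python
import xml.etree.ElementTree as ET

NAME_XML = """<name nickname="sharat">
  <first>sharat</first>
  <middle></middle>
  <last>lastname</last>
</name>"""

# The attribute lives on the start-tag and is read as a name/value pair.
name = ET.fromstring(NAME_XML)
nickname = name.get("nickname")

# Attribute order is not significant: both elements expose the same mapping.
# (The "id" attribute is hypothetical.)
first_order = ET.fromstring('<name nickname="sharat" id="1"/>')
second_order = ET.fromstring('<name id="1" nickname="sharat"/>')
same_attributes = first_order.attrib == second_order.attrib


def parses(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An empty quoted value is fine; an unquoted value is rejected.
empty_value_ok = parses('<name nickname=""/>')
unquoted_value_ok = parses("<name nickname=sharat/>")

print(nickname, same_attributes, empty_value_ok, unquoted_value_ok)
```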
Posted on October 9, 2006 by sharat
XML documents must adhere to the following rules to be well-formed:
- Every start-tag must have a matching end-tag, or be a self-closing tag
- Tags can’t overlap
- XML documents can have only one root element
- Element names must obey XML naming conventions
- XML is case-sensitive
- XML will keep whitespace in your text
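The rules above can be tried out with a small checker. This is only a sketch: `is_well_formed` is a hypothetical helper built on ElementTree, and the sample snippets are made up to trip one rule each.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One made-up snippet per rule in the list above.
checks = {
    "matching tags": "<memo><to>Alice</to></memo>",
    "self-closing tag": "<memo><attachment/></memo>",
    "missing end-tag": "<memo><to>Alice</memo>",
    "overlapping tags": "<memo><b><i>hi</b></i></memo>",
    "two root elements": "<memo/><memo/>",
    "case mismatch": "<memo></Memo>",
    "name starts with digit": "<1memo/>",
}
results = {label: is_well_formed(text) for label, text in checks.items()}

# Whitespace inside text content is kept as-is.
kept_text = ET.fromstring("<to>  Alice  </to>").text

for label, ok in results.items():
    print(label, "->", "well-formed" if ok else "rejected")
```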
Posted on September 27, 2006 by sharat
Both SAX and DOM are used to parse XML documents. Each has advantages and disadvantages, and which one to use depends on the situation.

| SAX | DOM |
|---|---|
| Parses node by node | Stores the entire XML document in memory before processing |
| Doesn’t store the XML in memory | Occupies more memory |
| We can’t insert or delete a node | We can insert or delete nodes |
| Top-to-bottom traversal | Traversal in any direction |
| SAX is an event-based parser | DOM is a tree-model parser |
| SAX is the Simple API for XML | Document Object Model (DOM) API |
| import javax.xml.parsers.*; import org.xml.sax.*; import org.xml.sax.helpers.*; | import javax.xml.parsers.*; import org.w3c.dom.*; |
| Doesn’t preserve comments | Preserves comments |
| Generally runs a little faster than DOM | Generally runs a little slower than SAX |

If we only need to find a node and don’t need to insert or delete anything, SAX is enough; otherwise use DOM, provided there is enough memory.
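To see the two styles side by side, here is a sketch in Python rather than Java, using the standard `xml.sax` and `xml.dom.minidom` modules. The `catalog` document and the `BookCounter` handler are made up for illustration. The SAX part reacts to events as they stream past; the DOM part loads the whole tree, which is what makes inserting a node possible.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical catalog document used by both parsers.
CATALOG_XML = b"""<catalog>
  <!-- a comment -->
  <book>XML Basics</book>
  <book>Parsing with SAX</book>
</catalog>"""


class BookCounter(xml.sax.ContentHandler):
    """SAX handler: counts <book> start-tags as events arrive, top to bottom."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "book":
            self.count += 1


# SAX: event based, nothing is kept in memory beyond the counter.
counter = BookCounter()
xml.sax.parseString(CATALOG_XML, counter)
sax_count = counter.count

# DOM: the whole document is loaded into a tree first.
doc = minidom.parseString(CATALOG_XML)
dom_count = len(doc.getElementsByTagName("book"))

# With the tree in memory we can insert a node.
new_book = doc.createElement("book")
new_book.appendChild(doc.createTextNode("DOM in Depth"))
doc.documentElement.appendChild(new_book)
dom_count_after_insert = len(doc.getElementsByTagName("book"))

# The DOM tree also keeps comments as nodes.
comments = [
    node.data
    for node in doc.documentElement.childNodes
    if node.nodeType == node.COMMENT_NODE
]

print(sax_count, dom_count, dom_count_after_insert, comments)
```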
Posted on September 26, 2006 by sharat
| XML | HTML |
|---|---|
| User-definable tags | Defined set of tags designed for web display |
| Content driven | Format driven |
| End tags required for well-formed documents | End tags not required |
| Quotes required around attribute values | Quotes not required |
| Slash required in empty tags | Slash not required |
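The last three rows of the table can be sketched with Python's standard library. The fragment and the `TagCollector` class are made up for illustration: the lenient `html.parser` accepts a missing end tag, an unquoted attribute value, and a bare `<br>`, while an XML parser rejects the same text until it is rewritten the XML way.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML-style markup: unquoted attribute, no </p>, no slash in the empty tag.
HTML_STYLE = "<p class=note>Line one<br>Line two"

# The same content written to XML rules.
XML_STYLE = '<p class="note">Line one<br/>Line two</p>'


class TagCollector(HTMLParser):
    """Records each start-tag and its attributes as the HTML parser sees them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


collector = TagCollector()
collector.feed(HTML_STYLE)
collector.close()
html_tags = collector.tags


def xml_accepts(text):
    """Return True if the XML parser accepts the text as well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


xml_accepts_html_style = xml_accepts(HTML_STYLE)
xml_accepts_xml_style = xml_accepts(XML_STYLE)

print(html_tags)
print(xml_accepts_html_style, xml_accepts_xml_style)
```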