The latest version of Microsoft XML Core Services (MSXML
4.0) complies with the World Wide Web Consortium (W3C) 2001 XML Schema
Recommendation.
Numerous features in this version provide XML Schema support. You can
validate XML against XML schemas in both SAX and DOM using either an
external schema cache or xsi:schemaLocation/xsi:noNamespaceSchemaLocation
attributes. Although there is no XPath 2.0 yet, MSXML 4.0 provides extension
functions, permitted by standards, to support handling XSD types in
XPath and XSLT.
MSXML 4.0 also provides a way to reach schema information
in validated documents using type discovery in SAX and the Schema Object
Model (SOM) in DOM. In addition to this added support for the final
XML Schema (XSD) recommendation, MSXML continues to support XML-Data
Reduced (XDR) and document type definition (DTD) validations.
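As a rough illustration of validating through an external schema cache in DOM, here is a minimal JScript-style sketch. The namespace URI and file names are hypothetical, and the createObject parameter is an assumption of this sketch (a thin wrapper around ActiveXObject) rather than anything MSXML itself requires.

```javascript
// Sketch: validate a document against an XSD held in an external schema cache.
// createObject is injected so the function can wrap ActiveXObject under
// Windows Script Host, or be exercised elsewhere with stand-in objects.
function validateWithSchemaCache(createObject, namespaceUri, xsdPath, xmlPath) {
  var cache = createObject("Msxml2.XMLSchemaCache.4.0");
  cache.add(namespaceUri, xsdPath);   // associate the namespace with its schema

  var doc = createObject("Msxml2.DOMDocument.4.0");
  doc.async = false;                  // load synchronously
  doc.validateOnParse = true;         // validate while loading
  doc.schemas = cache;                // point the document at the external cache

  var ok = doc.load(xmlPath);         // load() returns false on a parse or validation error
  return {
    valid: ok && doc.parseError.errorCode === 0,
    reason: ok ? "" : doc.parseError.reason
  };
}

// Runs only under Windows Script Host (for example cscript.exe).
if (typeof WScript !== "undefined") {
  var result = validateWithSchemaCache(
    function (progId) { return new ActiveXObject(progId); },
    "urn:example:books", "books.xsd", "books.xml");  // hypothetical names
  WScript.Echo(result.valid ? "valid" : result.reason);
}
```

Passing the object factory in keeps the version-specific ProgIDs in one place and lets the logic be tried without MSXML installed.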
Performance Improvements:
MSXML 4.0 provides a new, faster XML parser and a substantially improved
XSLT engine. You can use the new parser with DOM by setting the NewParser
property to true, for example: xmlDoc.setProperty("NewParser", true).
The new parser does not yet support asynchronous DOM load or DTD validation.
However, everything else functions the same way as with the old parser,
only faster. Microsoft's tests of MSXML 4.0 showed about 2x better
performance for pure parsing, and more than 4x better performance for
XSLT transformation. Other test claims I've seen show up to 8x
better performance. My own tests are consistent with this, but I hesitate
to make such claims myself because my tests, mostly for lack of time,
have been less than "lab quality".
Extended Support For Sequential XML Processing
MSXML 4.0 provides extended support for sequential XML processing architectures
based on the SAX2 API. This includes:
* Integration between the DOM and SAX parsing models
* Ability to generate HTML output
* Ability to plug a SAX content handler into the output of the XSLT
processor
* Tracking of namespace declarations
You can now use the MXXMLWriter object to generate SAX events from a
DOM tree. You can also build a DOM tree out of SAX events. This feature
allows you to closely integrate DOM and SAX in your applications. For
developers who really want to integrate complex XML processing, this
opens up a whole new world of efficiencies.
A new object, MXHTMLWriter, enables you to output HTML using
a stream of SAX events in much the same way that the <xsl:output>
element in XSLT can generate HTML from a result tree. The new MXHTMLWriter
object provides support for high-performing Active Server Pages, which
can now read XML documents with a SAX reader, put those documents
through custom SAX filters, and output the data to the user as
an HTML page. The MXHTMLWriter object is also useful for a number
of other applications, such as the manual generation of HTML pages. You
will also find corresponding classes in the .NET platform that do things
like this, with fine-grained control over the output.
The XSLT processor can now accept a SAX content handler as its output.
This means that a chain of SAX filters can directly process the
transformed XML. You can use this feature to eliminate XML regeneration
and reparsing, so that an application can consume incoming XML documents
immediately even when they first need to be translated into a common
vocabulary.
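To make the idea of sending XSLT output to a SAX-based writer more concrete, here is a minimal sketch. It assumes the processor's output property accepts an MXXMLWriter or MXHTMLWriter object, as the release notes describe. The file names and the createObject wrapper are hypothetical, and a custom filter chain would need additional plumbing not shown here.

```javascript
// Sketch: run an XSLT transformation whose result is delivered to a
// SAX-based writer instead of being serialized and reparsed.
function transformToWriter(createObject, xmlDoc, xslDoc, writerProgId) {
  var template = createObject("Msxml2.XSLTemplate.4.0");
  template.stylesheet = xslDoc;             // expected to be a FreeThreadedDOMDocument

  var writer = createObject(writerProgId);  // e.g. "Msxml2.MXHTMLWriter.4.0"
  var processor = template.createProcessor();
  processor.input = xmlDoc;
  processor.output = writer;                // the writer receives the result as SAX events
  processor.transform();

  return writer.output;                     // text accumulated by the writer
}

// Runs only under Windows Script Host (for example cscript.exe).
if (typeof WScript !== "undefined") {
  var create = function (progId) { return new ActiveXObject(progId); };
  var xml = create("Msxml2.DOMDocument.4.0");
  xml.async = false;
  xml.load("orders.xml");                   // hypothetical input file
  var xsl = create("Msxml2.FreeThreadedDOMDocument.4.0");
  xsl.async = false;
  xsl.load("orders.xsl");                   // hypothetical stylesheet
  WScript.Echo(transformToWriter(create, xml, xsl, "Msxml2.MXHTMLWriter.4.0"));
}
```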
The new MXNamespaceManager object allows you to programmatically
track namespace declarations and resolve them, either in the current
context or in the context of a particular DOM node. Of course MSXML
supports namespaces and can automatically resolve names of elements
and attributes, but there are many cases in which an attribute's value
or an element's content uses qualified names. The MXNamespaceManager
object tracks and resolves these qualified names.
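The case of a qualified name appearing inside an attribute value can be sketched as follows. The helper name resolveQName is hypothetical. It assumes only that the namespace manager exposes declarePrefix and getURI, and it splits the prefix from the local name by hand.

```javascript
// Sketch: resolve a qualified name found in attribute content (for example
// type="xs:string") using whatever prefixes the namespace manager knows.
function resolveQName(nsManager, qname) {
  var colon = qname.indexOf(":");
  var prefix = colon === -1 ? "" : qname.substring(0, colon);
  var localName = colon === -1 ? qname : qname.substring(colon + 1);
  return {
    prefix: prefix,
    localName: localName,
    namespaceURI: nsManager.getURI(prefix)  // lookup in the current context
  };
}

// Runs only under Windows Script Host (for example cscript.exe).
if (typeof WScript !== "undefined") {
  var manager = new ActiveXObject("Msxml2.MXNamespaceManager.4.0");
  manager.declarePrefix("xs", "http://www.w3.org/2001/XMLSchema");
  var resolved = resolveQName(manager, "xs:string");
  WScript.Echo(resolved.namespaceURI + " " + resolved.localName);
}
```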
Separate WinHTTP Version 5.0 Component
The former functions of the ServerHTTPRequest
component are now provided by the separate WinHTTP component. This is
a new server-side component that provides reliable HTTP stack functionality.
(It has been in a separate beta as a stand-alone component for some months,
with its own Microsoft newsgroup.) Without the WinHTTP component, ServerHTTPRequest
and DOM/SAX in server-side mode cannot access HTTP-based data. When
you install MSXML 4.0 on a computer running Windows NT, 2000, or XP, you automatically
get the WinHTTP component. Windows 95, 98, and Me cannot
support WinHTTP. You can still install MSXML 4.0 on Windows 98 or Windows
Me, but you will have to use the default DOM/SAX mode, or the XMLHTTPRequest
object, which uses WinInet.
The RTM release provides more compact, faster, and more conformant XML
processing components to be used in a server-side environment with enterprise-grade
systems. MSXML 4.0 can still be used on the client side in a controlled
environment where you can ensure installation of the component on client
machines, as in cases with Intranet or trusted site environments and
applications. Now let's look at a few of the negatives (at least for
some of us).
NewParser Property to Use the New Parser with DOMDocument (Transitional):
The NewParser internal property (a True/False flag) holds a value indicating
whether MSXML uses the old or new internal parser when loading DOMDocument objects.
IMPORTANT: If you want to use the new, faster parser, you must explicitly set
this flag to True!
This property is transitional for the period while MSXML provides
a choice of two internal parsers. The new parser is faster and more
reliable, but it does not yet support asynchronous mode or DTD validation.
Once the new parser has been updated to provide for asynchronous mode
and DTD validation, this property will always be set to True.
If the NewParser property is set to False, which is the current default
setting, subsequent DOMDocument objects are loaded using the old parser.
If this property is set to True, subsequent DOMDocument objects are
loaded using the new parser.
For example, the following code makes a DOMDocument object use the new
parser when loading.
xmlDoc.setProperty("NewParser", true);
Side-by-Side Functionality and the Removal of Replace Mode: XMLInst.exe is Gone!
Through MSXML 3.0, you could use replace mode to make the latest MSXML
component stand in for MSXML 2.0, which Internet Explorer 5.0 and 5.5 used
for presenting XML when browsing. Replace mode has been completely removed
from MSXML 4.0, so MSXML 4.0 cannot be substituted for MSXML 2.0 in Internet
Explorer. That means that if Internet Explorer is your default program
for opening XML files and you double-click an XML document, Internet
Explorer will not use MSXML 4.0 to show it. MSXML 4.0 can still be used
in the traditional way to manipulate XML within an HTML page using a
script.
Removal of Version-Independent ProgIDs
Version-independent ProgIDs are gone. This provides true side-by-side
installation, compared to previous versions in which some ProgIDs were
upgraded with the installation of a new version of MSXML. Now CreateObject("MSXML2.DOMDocument")
will not instantiate the MSXML 4.0 DOM, but a previous version (if one
is registered). If you want to use MSXML 4.0, you must use a ProgID
like this: CreateObject("MSXML2.DOMDocument.4.0"). With C++ and Visual
Basic you will create "MSXML2.DOMDocument40". Similar changes
will be necessary with all other MSXML objects in order to use the MSXML
4.0 version.
The reason for this change is to improve the maintainability of code
which otherwise would be error-prone when unexpected changes occur in
the environment. Version-independent ProgIDs were great for developers
trying MSXML, but proved risky in a production environment. If a user
developed code with version-independent ProgIDs, expecting MSXML 3.0
to be in place, and later installed or reinstalled SQL Server, for example,
they might find that they were using MSXML 2.6 instead of MSXML 3.0.
Removing version-independent ProgIDs in MSXML 4.0 kind of "bites
the bullet", eliminating such instability, and improves MSXML as
a server-side enterprise-grade component.
Side-by-Side Functionality
The release version of MSXML 4.0 is shipped with the same DLL names
(msxml4.dll, msxml4r.dll, and msxml4a.dll) as in preview releases. With
version-independent ProgIDs removed, this guarantees that MSXML 4.0
will not interfere with any versions of MSXML (2.0, 2.6, or 3.0) previously
installed. If you have code that uses version-independent ProgIDs
to instantiate MSXML 3.0 or 3.0 SP1, the installation of MSXML 4.0 RTM
should have no effect whatsoever. Windows XP side-by-side installation
does this in an even more integrated manner. This means that with Windows
XP, you can use the special side-by-side functionality to manage how
your applications use MSXML and which versions (starting from
MSXML 4.0) they use. You create a Windows XP application
manifest that links your application to a specific version
of MSXML 4.0.
Important Notes
If you have an MSXML 4.0 preview installed (the April or July Technical
Preview Release of MSXML 4.0):
Direct upgrade from the previews to RTM is not supported. You will have
to uninstall the preview, and after that install RTM. You might have to
manually unregister and remove the msxml4*.dll files from your system32
directory. To unregister the MSXML 4.0 preview, run:
regsvr32 /u msxml4.dll
If you have the April Technical Preview Release of MSXML 4.0
installed:
Note that version-independent ProgIDs have been removed from MSXML 4.0
(despite having existed in the April release), so installing this release
will make them non-functional. This might seriously affect a number
of applications (such as Microsoft Visual Studio® .NET setup)
that use MSXML 3.0. To avoid this problem, run the following two commands
from the command line and delete msxml4*.dll files from the system32
directory before installing this release.
regsvr32 /u msxml4.dll
regsvr32 msxml3.dll
Note that after unregistration you might have to manually delete
msxml4*.dll files from your system32 directory.
Some Final Comments:
MSXML 4.0 RTM represents what I believe
is the final stage in the evolution of Microsoft's COM-compliant XML
technologies, outside of "Dot Net". Developers and organizations who
want to be able to upgrade their code base would be well advised to use
application-wide global constants for the instantiation of version-specific
ProgIDs. In this manner an entire application's source code
base can be upgraded with a simple search and replace: for example, find
strMSXMLVersionNum = ".3.0" and change the "3" to a "4".
In addition, it is possible to use the temporary "NewParser"
property in code inside an "if" test, such as: if (strMSXMLVersionNum == ".4.0")
xmlDoc.setProperty("NewParser", true);
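The version-constant approach described above might look something like this in JScript. The helper names msxmlProgId and createDomDocument are hypothetical, and the createObject parameter is assumed to wrap ActiveXObject (or Server.CreateObject in ASP).

```javascript
// The single place to edit when moving the application to another MSXML version.
var strMSXMLVersionNum = ".4.0";

// Build a version-specific ProgID such as "Msxml2.DOMDocument.4.0".
function msxmlProgId(className) {
  return "Msxml2." + className + strMSXMLVersionNum;
}

// Create a DOMDocument and opt in to the new parser only when running on 4.0,
// since earlier versions do not have the transitional NewParser property.
function createDomDocument(createObject) {
  var xmlDoc = createObject(msxmlProgId("DOMDocument"));
  if (strMSXMLVersionNum == ".4.0") {
    xmlDoc.setProperty("NewParser", true);
  }
  return xmlDoc;
}

// Runs only under Windows Script Host (for example cscript.exe).
if (typeof WScript !== "undefined") {
  var doc = createDomDocument(function (progId) { return new ActiveXObject(progId); });
  doc.async = false;   // the new parser does not support asynchronous load
  WScript.Echo(msxmlProgId("DOMDocument"));
}
```

Keeping the NewParser test next to the version constant means the opt-in can be deleted in one place once the property stops being transitional.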
However, conversion may not be that
smooth. Developers should be ready to run into additional problems,
most of which will revolve around the fact that older, non-standards-compliant
code in XPath-type statements, replacement-type variables
such as "$any$", and other previously acceptable constructs
will no longer work under MSXML 4.0 RTM. It's time to meet the W3C and
bring the company's code base up to standards if you wanna play, and
unfortunately it may not be a picnic.