Week 8


21/05/2001

 

XML:

http://www.oasis-open.org/
OASIS, the Organization for the Advancement of Structured Information Standards, is a non-profit, international consortium that creates interoperable industry specifications based on public standards such as XML and SGML, as well as others that are related to structured information processing.
 

DOM:

"camelback" notation: docType instead of doctype

DOM Interfaces:

Accessing Nodes in DOM: p.334: When you need that flexibility and performance, you should seriously consider using SAX.
import org.xml.sax.*;
callbacks
handlers
HandlerBase: just subclass, define a few methods, and you're done.
parser.DocumentHandler (docHandler)
PI: Processing Instruction
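
The callback style noted above can be sketched as follows. This is only an illustration: the class name SaxEventLog and the sample document are made up, the handler extends the SAX2 DefaultHandler rather than the SAX1 HandlerBase, and the parser is obtained through the JAXP SAXParserFactory (an assumption, not what the book or Xerces samples use).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: records each SAX callback in the order the parser fires it. */
class SaxEventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void processingInstruction(String target, String data) {
        // PI callback: target is the name right after "<?"
        events.add("pi:" + target);
    }

    /** Parses an XML string and returns the callback trace. */
    static List<String> trace(String xml) {
        try {
            SaxEventLog log = new SaxEventLog();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(log);
            reader.parse(new InputSource(new StringReader(xml)));
            return log.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trace("<doc><?render fast?><item/></doc>"));
    }
}
```

Only the methods of interest are overridden; every other callback keeps the empty default, which is the "just subclass, define a few methods" idea.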
 
 

Xerces:

[barkati@ceres dom]$ javac DOMCount.java -classpath "../../src/:../../class"
[barkati@ceres dom]$ cd ..
[barkati@ceres samples]$ java dom.DOMCount -classpath "../src/:../class"
Exception in thread "main" java.lang.NoClassDefFoundError: org/w3c/dom/Node
 

Subject:      WELCOME to xerces-j-user@xml.apache.org
Date:         21 May 2001 14:43:09 -0000
From:         xerces-j-user-help@xml.apache.org
To:           barkati@edite-de-paris.com.fr
 

Hi! This is the ezmlm program. I'm managing the
xerces-j-user@xml.apache.org mailing list.

I'm working for my owner, who can be reached
at xerces-j-user-owner@xml.apache.org.

Acknowledgment: I have added the address
   barkati@edite-de-paris.com.fr
to the xerces-j-user mailing list.

Welcome to xerces-j-user@xml.apache.org!

Please save this message so that you know the address you are
subscribed under, in case you later want to unsubscribe or change your
subscription address.
 

--- Administrative commands for the xerces-j-user list ---

I can handle administrative requests automatically. Please
do not send them to the list address! Instead, send
your message to the correct command address:

To subscribe to the list, send a message to:
   <xerces-j-user-subscribe@xml.apache.org>

To remove your address from the list, send a message to:
   <xerces-j-user-unsubscribe@xml.apache.org>

Send mail to the following for info and FAQ for this list:
   <xerces-j-user-info@xml.apache.org>
   <xerces-j-user-faq@xml.apache.org>

Similar addresses exist for the digest list:
   <xerces-j-user-digest-subscribe@xml.apache.org>
   <xerces-j-user-digest-unsubscribe@xml.apache.org>

To get messages 123 through 145 (a maximum of 100 per request), mail:
   <xerces-j-user-get.123_145@xml.apache.org>

To get an index with subject and author for messages 123-456 , mail:
   <xerces-j-user-index.123_456@xml.apache.org>

They are always returned as sets of 100, max 2000 per request,
so you'll actually get 100-499.

To receive all messages with the same subject as message 12345,
send an empty message to:
   <xerces-j-user-thread.12345@xml.apache.org>

The messages do not really need to be empty, but I will ignore
their content. Only the ADDRESS you send to is important.

You can start a subscription for an alternate address,
for example "john@host.domain", just add a hyphen and your
address (with '=' instead of '@') after the command word:
<xerces-j-user-subscribe-john=host.domain@xml.apache.org>

To stop subscription for this address, mail:
<xerces-j-user-unsubscribe-john=host.domain@xml.apache.org>

In both cases, I'll send a confirmation message to that address. When
you receive it, simply reply to it to complete your subscription.

If despite following these instructions, you do not get the
desired results, please contact my owner at
xerces-j-user-owner@xml.apache.org. Please be patient, my owner is a
lot slower than I am ;-)

--- Enclosed is a copy of the request I received.

Return-Path: <barkati@edite-de-paris.com.fr>
Received: (qmail 93390 invoked from network); 21 May 2001 14:42:47 -0000
Received: from unknown (HELO ceres.lip6.fr) (132.227.64.159)
  by h31.sny.collab.net with SMTP; 21 May 2001 14:42:47 -0000
Received: from edite-de-paris.com.fr (localhost.localdomain [127.0.0.1])
        by ceres.lip6.fr (8.11.0/8.11.0) with ESMTP id f4LDjTc14345
        for <xerces-j-user-sc.990455345.dbcdccimcpffpmcgajdo-barkati=edite-de-paris.com.fr@xml.apache.org>; Mon, 21 May 2001 15:45:29 +0200
Sender: barkati@ceres.lip6.fr
Message-ID: <3B091BF7.E8E9DEE1@edite-de-paris.com.fr>
Date: Mon, 21 May 2001 15:45:27 +0200
From: Karim Barkati <barkati@edite-de-paris.com.fr>
X-Mailer: Mozilla 4.75 [fr] (X11; U; Linux 2.2.16-22 i586)
X-Accept-Language: en
MIME-Version: 1.0
To: xerces-j-user-sc.990455345.dbcdccimcpffpmcgajdo-barkati=edite-de-paris.com.fr@xml.apache.org
Subject: Re: confirm subscribe to xerces-j-user@xml.apache.org
References: <990455345.79297.ezmlm@xml.apache.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Rating: h31.sny.collab.net 1.6.2 0/1000/N
 

Batik:

org.apache.batik.refimpl.util.JSVGCanvas

The SVG generator lets all Java applications export their graphics as SVG, using the same code used for drawing to the screen or for printing. For example, an application that displays a pie chart in a window can use the SVG generator to easily export the sequence of Java2D drawing calls for the pie chart to SVG format.

Apache's mission is to allow the web to be an open environment and to remain an open environment. Batik is an open source implementation of a key format for today and tomorrow's web and fits well in Apache's mission.

Note that Batik uses Xalan for its support of XSL transformations (have a look at the xsltest.svg file in the distribution).

For example, if your data (say stock information) is contained in an XML document, you could use XSLT to transform your XML data into SVG. If your data comes from a database and you retrieve that data in a servlet on a Web server (e.g., using JDBC), you could use the Java binding for the DOM API to generate an SVG document from the database data. You could also use Batik's SVG generator and use the Java 2D API to generate that graphic.
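
The DOM route mentioned above might look roughly like this. It is a sketch under assumptions: the values stand in for rows fetched from a database, the class name SvgBarsSketch and the bar layout are invented for illustration, and only the JDK's own DOM and serializer are used (no Batik).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical example: turns a few numbers into an SVG bar chart through the DOM API. */
class SvgBarsSketch {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Builds one rect per value (values assumed to lie in 0..100) and serializes the tree. */
    static String toSvg(int[] values) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().newDocument();

            Element svg = doc.createElementNS(SVG_NS, "svg");
            svg.setAttribute("width", String.valueOf(values.length * 30));
            svg.setAttribute("height", "100");
            doc.appendChild(svg);

            for (int i = 0; i < values.length; i++) {
                Element rect = doc.createElementNS(SVG_NS, "rect");
                rect.setAttribute("x", String.valueOf(i * 30));
                rect.setAttribute("y", String.valueOf(100 - values[i])); // bars grow upwards
                rect.setAttribute("width", "20");
                rect.setAttribute("height", String.valueOf(values[i]));
                svg.appendChild(rect);
            }

            // Identity transform: serializes the DOM tree as text.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSvg(new int[] {40, 75, 10}));
    }
}
```

In a servlet the array would be replaced by whatever the JDBC query returns; the DOM calls stay the same.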

java.awt.Graphics2D abstract class
There are specialized implementations of this abstract class for each type of output, such as a monitor or a printer. SVGGraphics2D is a new implementation of that abstract class that generates SVG content instead of drawing to a screen or a printer.

SVGGraphics2D provides the following:


TestSVGGen

SVG has two ways of expressing properties, such as the fill color: either XML attributes or CSS values. The 'useCss' parameter allows the user to control that option.
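
To make the two notations concrete, here is a small sketch with plain DOM calls. It is not the Batik generator: the class FillStyleSketch is hypothetical and its useCss flag only mirrors the name of the Batik parameter.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical illustration of the two ways SVG can carry a fill color. */
class FillStyleSketch {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Creates an empty namespace-aware DOM document. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a circle whose fill is written as a CSS declaration or as an XML attribute. */
    static Element circle(Document doc, String color, boolean useCss) {
        Element c = doc.createElementNS(SVG_NS, "circle");
        c.setAttribute("r", "10");
        if (useCss) {
            c.setAttribute("style", "fill:" + color + ";"); // CSS value inside the style attribute
        } else {
            c.setAttribute("fill", color);                   // presentation attribute
        }
        return c;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        System.out.println(circle(doc, "red", true).getAttribute("style"));
        System.out.println(circle(doc, "red", false).getAttribute("fill"));
    }
}
```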


22/05/2001

 

SAX:

If you actually have a URI for a document, avoid using an InputSource object to describe it to the parser.
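
A minimal sketch of that advice: the URI string is handed to parse() directly instead of being wrapped in an InputSource by hand, presumably so that the parser opens the stream itself and can use the URI as the base for relative references. The class name ParseByUriSketch, the temporary file and the use of the JAXP factory are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example: parse a document identified by its URI. */
class ParseByUriSketch {

    /** Writes the XML to a temporary file and returns that file's URI as a string. */
    static String writeTemp(String xml) {
        try {
            Path file = Files.createTempFile("sketch", ".xml");
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
            return file.toUri().toString();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the document at the given URI and returns the name of its root element. */
    static String rootName(String uri) {
        final String[] root = new String[1];
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String ns, String local, String qName, Attributes a) {
                    if (root[0] == null) {
                        root[0] = qName; // the first element seen is the root
                    }
                }
            });
            reader.parse(uri); // system identifier passed as-is, no InputSource built by hand
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        String uri = writeTemp("<greeting tone='effusive'>Hello, world!</greeting>");
        System.out.println(rootName(uri));
    }
}
```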

ErrorHandler and SAXParseException:

DocumentHandler reports element and text content, + startDocument and endDocument callbacks, + Locator object.
p.354: With SAX, you can choose your parser separately from the API you use to access it!
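
How an ErrorHandler and SAXParseException fit together can be sketched like this. The class name ErrorReportSketch and the malformed sample are invented, and the parser again comes from the JAXP factory rather than from a class named on the command line.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

/** Hypothetical handler: remembers where the first fatal error was reported. */
class ErrorReportSketch implements ErrorHandler {
    int fatalLine = -1; // line of the first fatal error, -1 if none was reported

    @Override
    public void warning(SAXParseException e) {
        // ignored in this sketch
    }

    @Override
    public void error(SAXParseException e) {
        // recoverable errors (e.g. validity) are ignored in this sketch
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        if (fatalLine < 0) {
            fatalLine = e.getLineNumber(); // the exception carries the position
        }
        throw e; // well-formedness errors end the parse
    }

    /** Parses the text and returns the line of the first fatal error, or -1. */
    static int firstFatalLine(String xml) {
        ErrorReportSketch handler = new ErrorReportSketch();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // expected for malformed input; the position was already recorded
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.fatalLine;
    }

    public static void main(String[] args) {
        // The end tag on the third line does not match the open <b> element.
        System.out.println(firstFatalLine("<a>\n<b>\n</a>"));
    }
}
```

The handler is registered on the XMLReader interface only, so swapping the underlying parser does not touch this code, which is the point of the p.354 remark.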

EntityResolver for external parsed entities.
DTDHandler -> parser.setDTDHandler (handler); // listing 16.10
ENTITY, ENTITIES, NOTATION; ex: images.
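
A DTDHandler sees the notation and unparsed-entity declarations that ENTITY and ENTITIES attributes refer to. The sketch below is an assumption-laden illustration (class name DtdDeclSketch, sample DTD and image name are made up); it only records the declared names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

/** Hypothetical handler: lists the notations and unparsed entities declared in a DTD. */
class DtdDeclSketch implements DTDHandler {
    /** Made-up document: an image is attached through an ENTITY attribute. */
    static final String SAMPLE =
        "<!DOCTYPE doc ["
        + "<!NOTATION gif SYSTEM 'image/gif'>"
        + "<!ENTITY logo SYSTEM 'logo.gif' NDATA gif>"
        + "<!ELEMENT doc EMPTY>"
        + "<!ATTLIST doc img ENTITY #IMPLIED>"
        + "]><doc img='logo'/>";

    final List<String> decls = new ArrayList<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        decls.add("notation:" + name);
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId, String systemId,
                                   String notationName) {
        // The entity itself is never read by the parser; only its declaration is reported.
        decls.add("entity:" + name + "/" + notationName);
    }

    /** Parses the text and returns the declarations in document order. */
    static List<String> declarations(String xml) {
        try {
            DtdDeclSketch handler = new DtdDeclSketch();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setDTDHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.decls;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declarations(SAMPLE));
    }
}
```

An application would keep these declarations in a table and look the entity name up when it meets the img attribute in startElement.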
 

SVG:

XML to SVG sample codes:
http://www.kurtcagle.net/

Subject:      WELCOME to batik-users@xml.apache.org
Date:         22 May 2001 08:46:57 -0000
From:         batik-users-help@xml.apache.org
To:           barkati@edite-de-paris.com.fr

Hi! This is the ezmlm program. I'm managing the
batik-users@xml.apache.org mailing list.

Acknowledgment: I have added the address
   barkati@edite-de-paris.com.fr
to the batik-users mailing list.

Welcome to batik-users@xml.apache.org!
 

XML tools:

[barkati@ceres xmlpro]$ java -jar xmlpro
 

Xerces:

[barkati@ceres dom]$ java dom.DOMCount good.xml
good.xml: 1953 ms (1 elems, 1 attrs, 0 spaces, 14 chars)
[barkati@ceres dom]$ more good.xml
<?xml version='1.0' ?>
<greeting tone='effusive'>
Hello, world!</greeting>
[barkati@ceres sax]$ java sax.SAXCount ../../data/personal.xml
../../data/personal.xml: 3459 ms (37 elems, 18 attrs, 140 spaces, 128 chars)
[barkati@ceres sax]$ cd ../dom
[barkati@ceres dom]$  java dom.DOMCount ../../data/personal.xml
../../data/personal.xml: 3104 ms (37 elems, 18 attrs, 140 spaces, 128 chars)
 

Java:

export CLASSPATH=.:$JAVA_HOME/
To compile a file that belongs to a package, the package name must be given at compile time; the same holds at run time:
[barkati@ceres dom]$ java dom.DOMCount good.xml
java and javac are shell scripts!!!!

Ask Nabil for Runjava.


23/05/2001

 

News:

http://xml.apache.org/batik/architecture.html
The SVG Font Converter illustrates how Batik can help you embed SVG Font definitions in an SVG file by providing an application that converts ranges of characters from a True Type Font format to the SVG Font format.
 

Subject:      [xsl] RE: XML parser for use on the client
Date:         Wed, 16 May 2001 15:37:42 +0100
From:         "Michael Kay" <mhkay@iclway.co.uk>
Reply-To:     xsl-list@lists.mulberrytech.com
To:           <xsl-list@lists.mulberrytech.com>

>     We need an xml parser, to be used at the client side...
> Xalan is 3mbs.

Xalan is an XSLT transformer, not an XML parser. Which do you want?
If you want an XML parser, Crimson and AElfred are far smaller than Xerces.
If you want an XSLT transformer, Saxon is somewhat smaller than Xalan.
I suspect this is mainly because Xalan and Xerces have support for a very
extensive range of character encodings.

Mike Kay
 

Subject:      Xerces-J 1.4.0 released
Date:         Tue, 22 May 2001 15:31:20 -0400
From:         neilg@ca.ibm.com
Reply-To:     xerces-j-user@xml.apache.org
To:           xerces-j-user@xml.apache.org, xerces-j-dev@xml.apache.org
Cc:           general@xml.apache.org

Hi folks,

Xerces-J 1.4.0 is now ready for prime-time.  The most important new feature
in Xerces-J 1.4.0 is full schema support (except for some small limitations
as detailed in the documentation).  Other improvements include:

- Completed implementation of schema Identity Constraints [Neil Graham]
- Update XPath support to bring it into compliance with Schema PR [Achille
Fokoue Nkoutche/Neil Graham]
- Implemented Schema PR changes to the syntax of <attribute> declarations
[Ted Han (than@ghx.com)/Neil  Graham]
- Added French resource bundle for regex package [Jean-Claude Dufourd,
Laurent Foret/Neil Graham]
- Added support for nillable and removed limitation for xsi:schemaLocation
usage [Elena Litani]
- PR changes for Datatypes (including implementation of date/time) [Sandy
Gao, Elena Litani]
- Added support for fixed attribute on datatype facets [Elena Litani]
- Constraint checking [Lisa Martin, Neil Graham, Sandy Gao, Elena Litani]
- Re-implemented "all" group support for performance reasons [Henry
Zongaro]
- Re-implemented "mixed" content model groups for Schema [Lisa Martin]
- Miscellaneous bug fixes [Arnaud Le Hors, Jeffrey Rodrigues, Elena Litani]

Enjoy!
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  416-448-3519, T/L 778-3519
E-mail:  neilg@ca.ibm.com
 

Xerces:

[barkati@ceres sax]$ java sax.SAXCount ~/MusicXML/mut.xml
/home/barkati/MusicXML/mut.xml: 13218 ms (9120 elems, 730 attrs, 56139 spaces, 15060 chars)
[barkati@ceres sax]$ java sax.SAX2Count ~/MusicXML/mut.xml
/home/barkati/MusicXML/mut.xml: 16154 ms (9120 elems, 730 attrs, 56139 spaces, 15060 chars)
[barkati@ceres sax]$ cd -
[barkati@ceres dom]$ java dom.DOMCount ~/MusicXML/mut.xml
/home/barkati/MusicXML/mut.xml: 15877 ms (9120 elems, 730 attrs, 56139 spaces, 15060 chars)

Paper analysis of:

-DOMCount.java (no inheritance):
    private static final String
    DEFAULT_PARSER_NAME = "dom.wrappers.DOMParser";
[...]
            DOMParserWrapper parser =
                (DOMParserWrapper)Class.forName(parserWrapperName).newInstance();
            DOMCount counter = new DOMCount();
            long before = System.currentTimeMillis();
            parser.setFeature( "http://apache.org/xml/features/dom/defer-node-expansion",
                               setDeferredDOM );
            parser.setFeature( "http://xml.org/sax/features/validation",
                               setValidation );
            parser.setFeature( "http://xml.org/sax/features/namespaces",
                               setNameSpaces );
            parser.setFeature( "http://apache.org/xml/features/validation/schema",
                               setSchemaSupport );
            Document document = parser.parse(uri);
            counter.traverse(document);   // each counter is incremented according to the Node type
            long after = System.currentTimeMillis();
            counter.printResults(uri, after - before);
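
The traverse() step is not shown in the excerpt; a plausible reading is a depth-first walk that bumps a counter per node type. The sketch below is a guess at that idea, not the Xerces sample itself: the class name DomCountSketch is invented, the parser comes from the JAXP factory, and the ignorable-whitespace counter is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical re-creation of a DOM counter: elements, attributes, characters. */
class DomCountSketch {
    int elements;
    int attributes;
    int characters;

    /** Walks the tree depth-first and increments the counter matching each node type. */
    void traverse(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                elements++;
                attributes += node.getAttributes().getLength();
                break;
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                characters += node.getNodeValue().length();
                break;
            default:
                break; // document, comments, PIs: nothing to count
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            traverse(child);
        }
    }

    /** Parses an XML string into a DOM tree and counts its nodes. */
    static DomCountSketch count(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            DomCountSketch counter = new DomCountSketch();
            counter.traverse(doc);
            return counter;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String goodXml = "<?xml version='1.0' ?>\n<greeting tone='effusive'>\nHello, world!</greeting>\n";
        DomCountSketch c = count(goodXml);
        System.out.println(c.elements + " elems, " + c.attributes + " attrs, " + c.characters + " chars");
    }
}
```

On the good.xml content this gives 1 element, 1 attribute and 14 characters (the line break plus "Hello, world!"), the same figures as the DOMCount run logged on 22/05.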

-SAXCount.java:
public class SAXCount
extends HandlerBase {
[...]
    private static final String
    DEFAULT_PARSER_NAME = "org.apache.xerces.parsers.SAXParser";
[...]
            Parser parser = ParserFactory.makeParser(parserName);
            parser.setDocumentHandler(counter);
            parser.setErrorHandler(counter);
            try {
                //if (validate && parser instanceof XMLReader)
                if ( parser instanceof XMLReader ){
                    ((XMLReader)parser).setFeature( "http://xml.org/sax/features/validation",
                                                    validate);
                    ((XMLReader)parser).setFeature( "http://xml.org/sax/features/namespaces",
                                                    setNameSpaces );
                    ((XMLReader)parser).setFeature( "http://apache.org/xml/features/validation/schema",
                                                    setSchemaSupport );
                    ((XMLReader)parser).setFeature( "http://apache.org/xml/features/nonvalidating/load-external-dtd",
                                                    setLoadExternalDTD );

                }
            } catch (Exception ex) {
            }
            long before = System.currentTimeMillis();
            parser.parse(uri);   // counters are incremented in startElement(), characters(), and ignorableWhitespace()
            long after = System.currentTimeMillis();
            counter.printResults(uri, after - before);

-SAX2Count.java:
public class SAX2Count
extends DefaultHandler {
[...]
    private static final String
    DEFAULT_PARSER_NAME = "org.apache.xerces.parsers.SAXParser";
[...]
            XMLReader parser = (XMLReader)Class.forName(parserName).newInstance();
            parser.setContentHandler(counter);
            parser.setErrorHandler(counter);

            //if (validate)
            //   parser.setFeature("http://xml.org/sax/features/validation", true);

            parser.setFeature( "http://xml.org/sax/features/validation",
                                               validate);
            parser.setFeature( "http://xml.org/sax/features/namespaces",
                                               setNameSpaces );
            parser.setFeature( "http://apache.org/xml/features/validation/schema",
                                               setSchemaSupport );

            long before = System.currentTimeMillis();
            parser.parse(uri);   // counters are incremented in startElement(), characters(), and ignorableWhitespace()
            long after = System.currentTimeMillis();
            counter.printResults(uri, after - before);
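
For comparison with the DOM version, the SAX2 counting can be sketched as a handler whose callbacks do the incrementing while the parser streams through the document. Again a sketch under assumptions: SaxCountSketch is a made-up name and the parser is obtained through JAXP instead of Class.forName on the Xerces class.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX2 counter: no tree is built, the callbacks update the totals. */
class SaxCountSketch extends DefaultHandler {
    int elements;
    int attributes;
    int spaces;     // ignorable whitespace, only reported when a DTD describes the content
    int characters;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements++;
        attributes += atts.getLength();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        characters += length; // may be called several times for one run of text
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        spaces += length;
    }

    /** Parses an XML string and returns the filled-in counter. */
    static SaxCountSketch count(String xml) {
        try {
            SaxCountSketch counter = new SaxCountSketch();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(counter);
            reader.parse(new InputSource(new StringReader(xml)));
            return counter;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String goodXml = "<?xml version='1.0' ?>\n<greeting tone='effusive'>\nHello, world!</greeting>\n";
        SaxCountSketch c = count(goodXml);
        System.out.println(c.elements + " elems, " + c.attributes + " attrs, "
            + c.spaces + " spaces, " + c.characters + " chars");
    }
}
```

The totals for the good.xml content are the same as with DOM (1 element, 1 attribute, 0 spaces, 14 characters); only the way they are collected differs.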


25/05/2001


Subject:      Re: Terminating with </>
Date:         Wed, 23 May 2001 14:59:41 -0700
From:         Arnaud Le Hors <lehors@us.ibm.com>
Reply-To:     xerces-j-user@xml.apache.org
Organization: IBM
To:           xerces-j-user@xml.apache.org
References:   1 , 2 , 3 , 4
 

Tom Bradford wrote:
>
> Gregory Steuck wrote:
> >  > the c++ programmers to actually spit out proper XML than this
> >  > bastardization of it.  It will only help them in the end.
> > This vaguely reminded me some SGML shortcut, or am I dreaming?
>
> I've seen it used in SML discussions, I don't remember it being part of
> SGML, but maybe it is.

Yes, I think it is. I believe it's called End Tag Minimization. I'm
guessing it was voted out of XML because not using them allows for more
robust processing and enforcement of proper element nesting. I'd
personally favor it though.

But back to the original question, I agree with Tom. There isn't much
point in doing "almost XML". It's true that using XML has a cost. For
one thing, it's not anywhere near being the most compact format you
could use. But that's not the only overhead. Fully compliant XML parsers
have to deal with many obscure scenarios that make them inherently
slower than most ad hoc parsers. But for the price you get a lot: a
whole set of tools that are interoperable (and often available for
free).
Using your "almost XML" you pay most of the price of using XML without
ANY of the benefits...
--
Arnaud  Le Hors - IBM Cupertino, XML Strategy Group