Release Information

################################################################

$RCSfile: README.txt,v $

Authors: Chip Morris and Craeg Strong, Ariel Partners LLC

$Date: 2002/11/08 05:21:42 $

################################################################

Contents

  1. Quick Start
  2. Prerequisites
  3. Description
  4. Known Limitations
  5. Contributions
  6. Schema Migration
  7. XSLT Processor Support Status
  8. Notes
  9. Unit Testing on Win32

Quick Start

Don't forget to read the Prerequisites section below!

XMLTransform enables Zope users to associate an XSLT transformer with an XML document that automatically renders the result of the transformation when called. It can behave like either a Page Template or a DTML Method. There are no constraints on the type of Zope objects used for the XML or XSLT. In fact, the content may be cobbled together from multiple sources, as long as the final content may be obtained as well-formed XML from a single object for each.

The XMLTransform product adds three separate objects to the "Add" menu in the Zope Management Interface:

  • XML Transform
  • XML Transform Cache Manager
  • XML Transform Registry

The quickest way to get started with XMLTransform is to read the description below, follow the directions in TUTORIAL.txt, and then re-read the description :-) The tutorial includes examples of increasing complexity that should cover most normal uses of the product.

XMLTransform features a pluggable architecture that makes it possible to dynamically choose between different XSLT Processors at runtime. It currently works with any of the following combinations:

  • PyXML 0.6.6 and 4Suite 0.11.1. For me, on either my Win2K machine or my Red Hat Linux 7.2 machine, that means downloading and installing the following:
           PyXML-0.6.6.tar.gz
           4Suite-0.11.1.tar.gz
    
       **DON'T FORGET TO SPECIFY THE --without-xslt --without-xpath
         OPTIONS FOR PYXML**
    
  • PyXML 0.8.1 and 4Suite 0.12.a3. For me, on either my Win2K machine or my Red Hat Linux 7.2 machine, that means downloading and installing the following:
           PyXML-0.8.1.tar.gz
           4Suite-0.12.0a3.tar.gz
    
       **DON'T FORGET TO SPECIFY THE --without-xpath OPTION FOR PYXML**
    
  • libxml2 2.4.26 and libxslt 1.0.22 (Python bindings). For me, on my Red Hat Linux 7.2 machine, that means downloading and installing the following:
            libxml2-2.4.26-1.i386.rpm
            libxml2-python-2.4.26-1.i386.rpm
            libxslt-1.0.22-1.i386.rpm
            libxslt-python-1.0.22-1.i386.rpm
    
         There is a
         "site":http://www.fh-frankfurt.de/~igor/projects/libxml/index.html
         with a Win2K port of the base libraries, but I don't see any way
         of installing the python bindings on Win2K.  *If someone has
         successfully set up libxslt on Win2K, please email me the recipe
         and I will update this document.*
    
  • Pyana 0.6. This is a Python wrapper around the Apache XML parser xercesC++, version 1.4, and the XSLT processor xalanC++, version 2.1. These should not be confused with the Java products xalanJ and xercesJ, which are something totally different. For me, on my Red Hat Linux 7.2 machine, that means downloading and installing the following:
            Pyana-0.6.0.linux-i686-extras.tar.gz
            Pyana-0.6.0.linux-i686-py2.1.tar.gz
    
         On my Win2K machine, that means downloading and installing:
    
            Pyana-0.6.0.win32-py2.1.exe
    
  • SabPyth 0.52. This is a python wrapper around the Sablotron XSLT processor, version 0.96. For me, on my Red Hat Linux 7.2 machine, that means downloading and installing the following:
            js-1.5rc4-2.i386.rpm
            sablotron-0.96-1.i386.rpm
            sablotron-devel-0.96-1.i386.rpm
            Sab-pyth-0.52.linux-i686.tar.gz
    
        Unfortunately I have not been able to get Sablotron working on Red
        Hat Linux 7.2.  However, I did get it working on Win2K, by
        downloading and installing the following:
    
            Sablot-Win-0.96-FullPack.zip
            Sab-pyth-0.52-win32-py2.1.exe
            expat_win32bin_1_95_5.exe (from expat.sourceforge.net, for libexpat.dll)
    

One should be able to get XMLTransform working with a Java-based XSLT processor via XML-RPC without too much trouble. See ZOPE\lib\python\Products\XMLTransform\IXSLTProcessor.py for more details. Contributions are gratefully accepted. Support Open Source!

Prerequisites

This product requires the presence of at least one XSLT processor. It features a pluggable architecture that makes it possible to dynamically choose between different XSLT Processors at runtime. Today, the product offers support for 4Suite 0.11, 4Suite 0.12, Pyana 0.6, SabPyth 0.52, and GNOME libxml2/libxslt out of the box, but it should be straightforward to add support for another library, if your favorite is not on that list. You can either do it yourself, or ask us to help you (email us at mailto:[email protected]).
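
Before registering a processor, it can help to check which bindings Python can actually import. The following is a minimal sketch, not part of XMLTransform: the function name is hypothetical, and the module names (Ft, libxslt, Pyana, Sablot) are assumptions about what each package installs, so verify them against your own installation.

```python
# Assumed top-level module names for each supported binding; these are
# guesses to be checked against the packages you actually installed.
CANDIDATE_MODULES = {
    "4Suite": "Ft",
    "libxslt": "libxslt",
    "Pyana": "Pyana",
    "SabPyth": "Sablot",
}


def installed_processors(candidates=CANDIDATE_MODULES):
    """Return the sorted names of processors whose module can be imported."""
    found = []
    for name in sorted(candidates):
        try:
            __import__(candidates[name])
        except ImportError:
            continue  # binding not installed, try the next one
        found.append(name)
    return found


print(installed_processors())
```

Run it with the same Python interpreter that Zope uses, since a binding installed for a different interpreter will not be visible to Zope.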

Java-based XSLT processors can be supported as well (for example, Saxon or XalanJ) via XML-RPC, perhaps using EIONET's XMLRPC product. This requires a little extra work and more maintenance, but shouldn't be too bad. Supporting URI resolution to local Zope resources might be difficult, however.

Below are some quick instructions for how to set up the various alternatives. Please be sure to read the installation instructions for the package you are installing.

XMLTransform should work fine with Zope releases 2.4 and above. We have done most of our testing on releases 2.5.1 and 2.6 under Linux.

  1. Install Zope
  2. Ensure that you are starting Zope with at least two threads (by default Zope starts with 4 threads, unless you change it via the -t option)
  3. Install your XML/XSLT processor libraries
  4. Install XMLTransform (see ZOPE\lib\python\Products\XMLTransform\INSTALL.txt)
  5. Ensure that XMLTransform is registered to use the particular XSLT processor you wish (if you installed more than one).

You can use the automated test suite to check the installation. Here's what you do:

  1. Download and install ZopeTestCase carefully, as per instructions. Be sure to install it into the lib/python/Testing area, not the lib/python/Products area! We used ZopeTestCase version 0.5.3.
  2. From a shell or DOS window prompt, do the following, where ZOPE represents the directory in which you installed Zope. UNIX users substitute / for '\':
           cd ZOPE\lib\python\Products\XMLTransform\tests
           ZOPE\bin\python alltests.py
    
  3. As long as you are running a version of 4Suite, you should see lots of messages followed by OK. That means all 39 tests ran successfully. You are in, baby!:
            Ran 39 tests in 3.686s
            OK
    

If you are running libxslt, you should see even more messages followed by FAILED. Four testcases fail, but that is expected, because the current version of libxslt does not yet offer support for URI resolver hooks in Python. But 35 out of 39 ain't bad!:

        Ran 39 tests in 3.686s
        FAILED (failures=5)

If you are running Pyana on Linux, you should see even more messages followed by FAILED. Three testcases fail, but that is expected, because the current version of Pyana/Linux does not yet offer support for URI resolver hooks in Python. But 36 out of 39 ain't bad!:

        Ran 39 tests in 3.686s
        FAILED (failures=4)

Description

XMLTransform

An XMLTransform is a Zope object that links an XML document to a desired XSLT script. The XMLTransform automatically runs the XSLT transformation and renders the results when accessed through DTML or page templates.

An XMLTransform object contains neither the XML source document nor the XSLT transformer. Instead, it obtains each of them from two separate Zope objects, whose IDs are recorded as properties. In this way, an XMLTransform object represents an association between an XML document and a transformer.

This feature differentiates XMLTransform from other XML/XSLT-based Zope products, in that it recognizes the fact that there is often a many to many relationship between XML documents and XSLT transformers.

The XML source pointed to by an XMLTransform can come from nearly anywhere, for example:

  • Content retrieved from an SQL database and converted to XML format
  • A DTMLDocument that is an XML file, with portions grabbed from other objects via DTML tags.
  • An XMLFile instance (XMLFile is part of the XMLKit Zope product)
  • A CVSFile object that points to an external XML document in a CVS repository (CVSFile is part of the CVSFile Zope product)
  • An ExternalFile object that points to an external XML document in the file system (ExternalFile is part of the ExternalFile Zope product)
  • A Page Template, a DTML Method, a File object, etc.

The only requirement is that the Zope object from which XMLTransform obtains the XML source must support the __call__() method, and that the resulting XML must be well-formed. Unfortunately, two well-known Zope objects do not support the __call__() method:

  • ParsedXML object (ParsedXML is part of the "ParsedXML":http://www.zope.org/Members/faassen/ParsedXML Zope product)
  • File object

We expect to add the capability to specify an alternate method (for example, __str__() would work in both cases above) to call in a future release of XMLTransform.

In this way, XMLTransform can be used to form "pipelines," where the output of one object becomes the input of the next. This approach is more modular: each kind of object performs only one task, and can be tested and/or replaced on an individual basis.
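
The association described above can be illustrated with a small stand-alone sketch. It is not the product's implementation: the class name, the dictionary standing in for Zope object lookup, and the fake processor are all assumptions made for illustration. The point is only that the transform stores two IDs and calls both objects when rendered.

```python
class TransformSketch:
    """Records only the IDs of the XML source and the XSLT transformer."""

    def __init__(self, source_id, transformer_id, run_xslt):
        self.source_id = source_id
        self.transformer_id = transformer_id
        self.run_xslt = run_xslt  # callable(xml_text, xslt_text) -> result

    def __call__(self, container):
        # Both objects are rendered by calling them, which is why the
        # source object must support __call__().
        xml_text = container[self.source_id]()
        xslt_text = container[self.transformer_id]()
        return self.run_xslt(xml_text, xslt_text)


def fake_processor(xml_text, xslt_text):
    # Stand-in for a real XSLT processor; it only labels its inputs.
    return "transformed(%s) with (%s)" % (xml_text, xslt_text)


# A plain dictionary stands in for looking objects up in Zope.
container = {
    "resume.xml": lambda: "<resume/>",
    "to_html.xsl": lambda: "<xsl:stylesheet/>",
}
transform = TransformSketch("resume.xml", "to_html.xsl", fake_processor)
print(transform(container))
```

Because only IDs are stored, the same source can be paired with several transformers, and the same transformer with several sources, simply by creating more association objects.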

TransformerRegistry

XMLTransform obtains the XSLT transformer from a Transformer Registry. An XMLTransform obtains its registry through acquisition (that is, by looking for it in its parent folder, then its parent's parent, on up the tree to the root folder), and it stops when it finds the first one. The Registry is a folder-like object that contains a logically related group of transformers. Transformers may be any kind of Zope object that can render a well-formed XSLT, such as:

  • DTMLDocument
  • PageTemplate
  • DTMLMethod
  • ExternalFile
  • CVSFile
  • XMLFile

...to name a few. Objects contained in a TransformerRegistry (either directly or in subfolders) are made available as choices for selection for use as the XSLT in each XMLTransform instance that "sees" this registry. Again, the same caveats apply:

The only requirement is that the Zope object from which XMLTransform obtains the XSLT source must support the __call__() method, and that the resulting XSLT must be well-formed XML. As with XML sources, ParsedXML objects and File objects do not support the __call__() method and so cannot be used. We expect to add the capability to specify an alternate method to call (for example, __str__() would work in both of these cases) in a future release of XMLTransform.

Because it is often convenient to store objects other than the actual transformers themselves inside a registry, the TransformerRegistry may be customized to omit certain types of objects from its transformer menu. For example, if an XSLT transformer was cobbled together from bits and pieces from different Zope objects, you could store all of the pieces inside the registry together with the object that assembles it all into well-formed XML (after all, the registry is nothing more than a slightly specialized Zope folder). Optionally, you could also store Zope objects that serve as documentation, such as README files inside the registry. You don't have to, but you could. If you choose to do so, you must indicate to the Registry that you want it to omit these (non-transformer) objects from the list of transformers made available to XMLTransform objects.

Objects can be omitted in one of two ways:

  1. Omit all objects of a certain type. This is controlled through the transformer_meta_types property in the Properties Page of the registry.
  2. Omit selected objects. An object is omitted from the list (no matter what its meta_type) if it has an attribute called omit (it doesn't matter what kind of attribute it is, or what its value is, only that its name is omit). This attribute may be added via the Properties page of the object in question (that is, the object to be omitted).
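
The two omission rules can be sketched as a simple filter. This is an illustration under assumptions, not the registry's code: the Item class and function name are hypothetical, and the omitted_meta_types argument merely stands in for whatever the transformer_meta_types property encodes.

```python
class Item:
    """Minimal stand-in for a Zope object stored in a registry."""

    def __init__(self, id, meta_type):
        self.id = id
        self.meta_type = meta_type


def selectable_transformers(objects, omitted_meta_types):
    """Return the IDs of the objects that would be offered as transformers."""
    selected = []
    for obj in objects:
        if obj.meta_type in omitted_meta_types:
            continue  # rule 1: the whole meta type is omitted
        if hasattr(obj, "omit"):
            continue  # rule 2: any attribute named "omit", whatever its value
        selected.append(obj.id)
    return selected


html = Item("html", "DTML Method")
fragment = Item("header_fragment", "DTML Method")
fragment.omit = ""  # the value is irrelevant, only the name matters
readme = Item("README", "File")

print(selectable_transformers([html, fragment, readme], ["File"]))
```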

Because no instance of XMLTransform may be added without a transformer, the first thing you should do is to create a TransformerRegistry and put a transformer in it. See ZOPE/lib/python/Products/XMLTransform/TUTORIAL.txt for some simple examples.

CacheManager

By default, XMLTransform never caches its results. However, XSLT processing is expensive. In the real world, there is generally no need to transform the content every time except in rare circumstances such as when the XML content is retrieved from a database on the fly and changes dynamically. For all but these rare circumstances, caching can improve performance considerably. This is what CacheManager is for.

An XMLTransform object searches for a CacheManager instance via acquisition. That is, it looks for it in its parent folder, then its parent's parent, on up the tree to the root folder, and stops when it finds the first one.
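
The lookup order can be pictured with a toy folder tree. This sketch does not use Zope's acquisition machinery; the Folder class and the "kind" labels are assumptions for illustration. It only shows the nearest-ancestor rule that applies to both the CacheManager and the Transformer Registry.

```python
class Folder:
    """Toy folder that knows its parent and the kinds of objects it holds."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.kinds = {}  # object id -> kind label


def find_nearest(folder, kind):
    """Walk from folder up to the root and return the first match found."""
    while folder is not None:
        for obj_id in sorted(folder.kinds):
            if folder.kinds[obj_id] == kind:
                return folder.name + "/" + obj_id
        folder = folder.parent
    return None


root = Folder("root")
root.kinds["site_cache"] = "cache manager"
docs = Folder("docs", parent=root)
docs.kinds["docs_cache"] = "cache manager"
drafts = Folder("drafts", parent=docs)

# The search starts in "drafts", finds nothing there, and stops at "docs".
print(find_nearest(drafts, "cache manager"))
```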

A CacheManager is a purely optional thing. Removing one will never break anything -- it is only for performance.

Every XMLTransform object has a "caching" property. This controls whether caching should be done if a CacheManager is present. If a Cache Manager is not present, caching will never be done, regardless of the setting of the property. The property is there to guard against those situations where caching is never appropriate, such as when dynamic content is obtained from a database as described above.

If an XMLTransform's "caching" parameter is set to "CacheWhenAvailable" and a CacheManager is present, caching will be done. The CacheManager stores cached content in the filesystem, not in the Zope object database.

Assuming the conditions above, once a CacheManager has stored the results of a transformation on behalf of an XMLTransform, the XMLTransform will thereafter always retrieve its results from the CacheManager rather than re-running the transformation unless one of the following occurs:

  1. Its source XML document is modified
  2. Its source XML document reference is changed to point to a different Zope object altogether
  3. Its XSLT transformer is modified
  4. Its XSLT transformer reference is changed to point to a different transformer altogether.
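
One way to picture these four conditions is a cache key built from the two references and their modification markers: if any of the four changes, the key no longer matches and the transformation is re-run. The sketch below is an in-memory illustration under that assumption. The real CacheManager stores its content in files, and its actual bookkeeping is not described here.

```python
def cache_key(xml_id, xml_modified, xslt_id, xslt_modified):
    """Combine both references and both modification markers into one key."""
    return (xml_id, xml_modified, xslt_id, xslt_modified)


class CacheSketch:
    """In-memory stand-in for a cache of transformation results."""

    def __init__(self):
        self._entries = {}  # transform id -> (key, cached result)
        self.runs = 0       # how many times a transformation was executed

    def render(self, transform_id, key, run_transform):
        entry = self._entries.get(transform_id)
        if entry is not None and entry[0] == key:
            return entry[1]  # nothing changed, so reuse the cached result
        result = run_transform()
        self.runs += 1
        self._entries[transform_id] = (key, result)
        return result


cache = CacheSketch()
key_v1 = cache_key("resume.xml", 1, "resume/html", 1)
cache.render("my_transform", key_v1, lambda: "<html/>")
cache.render("my_transform", key_v1, lambda: "<html/>")  # served from cache

key_v2 = cache_key("resume.xml", 2, "resume/html", 1)    # source was modified
cache.render("my_transform", key_v2, lambda: "<html/>")
print(cache.runs)
```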

Typically, cached content is stored in the system temporary directory, c:\tmp on Windows platforms and /tmp on UNIX platforms. The placement of cache files is controlled by the Cache Manager's cachePrefix property.

The CacheManager includes some convenience functionality in the Cache tab of its Zope Management Interface. A CacheManager operates on the set of XMLTransform objects that exist in its containing folder and all subfolders: its clients. From the Cache tab, it is possible to perform batch operations on all of its clients, such as:

  1. Turn caching on or off
  2. Force regeneration of transformed content (regardless of whether or not it is actually "out of date")
  3. List the filenames for all of the currently cached content
  4. Remove all the files representing the currently cached content

Note that the files that contain the cached content may be manually removed from the disk without any ill effects. CacheManager is robust enough to notice this and simply re-cache (replace) the missing files the next time the relevant content is requested.

Transformers

We typically organize our XSL transformers in a hierarchy, for convenience. It is useful to regard each transformer as belonging to a family of transformers. The family is determined by the format of the XML file to be transformed. For example, there might be a DocBook family that understands the popular XML DocBook format, a "Resume" family that understands a homegrown Resume format, and others.

For each family, there may be several individual transformers, one for each kind of output desired. Standard examples of different outputs include XML (to convert XML of one format into another), browseable HTML, printable HTML, Structured Text (STX), FO (formatting objects), VXML, WAP, etc. Coupled with a FO processor like FOP, one could churn out many more output types such as PDF, PCL, PS, AWT, etc.

Typically, we create a subfolder underneath the registry for each family, then create a transformer for each output type as an object within the subfolder. The transformers then show up in the Registry menu in a very intuitive way, such as:

  • resume/html
  • resume/fo
  • article/html
  • article/fo

But you could put them all in the registry base folder:

  • resumeToHTML
  • ArticleToSTX

Or even have multiple sublevels:

  • Role/Print/HTML
  • Role/Print/FO
  • Role/Browse/STX
  • Article/Print/HTML

The possibilities are endless! Our advice is to start simple and add structure as you add more and more transformers.

There are no requirements on transformer files other than that they be well-formed XSLT documents. They need not produce stand-alone HTML pages (pages with tags), but can produce HTML fragments, XML fragments, or plain text output.

behave_like

An XMLTransform can behave_like standard Zope products. The current list is:

  • DTML Method
  • Page Template

For example, if an XML source file was transformed via XSLT to HTML, and that HTML included some TAL attributes, that is, it was actually a Zope Page Template, the templates would automatically be resolved assuming the behave_like was set to "PageTemplate". Refer to the examples in the tutorial for more details.

XSL Parameters

Parameters may be passed to the XSLT transformation. If parameters are needed, create a property in the current context named XSLparameters. This property should be of type "lines", that is, it should return a (Python) sequence of strings. It may be defined on the XMLTransform object itself or acquired from the XMLTransform's context.

Each name on the list is itself looked up in the current context. If its value is a scalar, then the pair name=value will be supplied to the XSL transformer as a parameter specification. If the value is an object, then the pair name=url is supplied, where url is the absolute URL of the object.

For example, suppose the XSLparameters value is ["properties", "color"]. Suppose a Zope object with id=properties exists in the context of the current XMLTransform, say at localhost:8080/test/foobar. Suppose the XMLTransform itself has a property named color with value blue. Then the following parameters will be "passed on the command line" to the XSLT transformer:

      properties=localhost:8080/test/foobar

      color=blue

See the tutorial for some examples of how parameters might be used.
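
The lookup rule for XSLparameters can be sketched in a few lines. This is an illustration under assumptions rather than the product's code: a dictionary stands in for the Zope context, FakeZopeObject is a hypothetical stand-in for an object that has an absolute URL, and the test for "scalar" is a simplification.

```python
class FakeZopeObject:
    """Hypothetical stand-in for a Zope object that knows its own URL."""

    def __init__(self, url):
        self._url = url

    def absolute_url(self):
        return self._url


def resolve_xsl_parameters(names, context):
    """Map each listed name to a scalar value or to an object's URL."""
    params = {}
    for name in names:
        value = context[name]
        if isinstance(value, (str, int, float)):
            params[name] = value                 # scalar: name=value
        else:
            params[name] = value.absolute_url()  # object: name=url
    return params


context = {
    "properties": FakeZopeObject("localhost:8080/test/foobar"),
    "color": "blue",
}
print(resolve_xsl_parameters(["properties", "color"], context))
```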

URN namespaces and reusable content

In certain circumstances, it is desirable to "genericize" content, such that it is independent of the particular context in which it is currently being used. For example, some documents may include portions of other documents. Certain documents may be created out of reusable "boilerplate" pieces. There may be standard legal clauses, headers/footers, or other pieces of content that are used in multiple places. Even XSLT transformers themselves might be created out of reusable pieces (for example, a family of XSLT transformers with multiple output flavors might include several reusable templates).

In circumstances such as the above, URI resolution may be used to avoid hardcoding document references. For example, the URN "urn:acme:legal/header_boilerplate" might refer to a header that is included in all legal documents. URI resolvers and XML catalog technology were invented to provide a way to map such URNs to actual URLs. By plugging in a different map, you can reuse the URN in many different situations.

Fortunately, the 4Suite XSLT processor supports URI resolution. URNs consist of two pieces, an NID and an NSS:

      NID: namespace ID

      NSS: namespace specific string

For example:

      URN: urn:acme:legal/header_boilerplate

      NID: acme

      NSS: legal/header_boilerplate
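
Splitting a URN into these two pieces takes only a few lines. The helper below is a hypothetical sketch, not a function provided by XMLTransform or 4Suite, and it does no validation beyond checking the "urn" prefix.

```python
def split_urn(urn):
    """Split 'urn:NID:NSS' into its namespace ID and namespace specific string."""
    # Split on the first two colons only, so the NSS may itself contain colons.
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: %r" % urn)
    return nid, nss


nid, nss = split_urn("urn:acme:legal/header_boilerplate")
print(nid)
print(nss)
```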

In order to do URN lookup, there must be a string variable named URNnamespaces in the context in which a new XMLTransform is created. It may be defined on the XMLTransform object itself or acquired from the XMLTransform's context. This variable defines the different namespaces that can be used in URNs. URNs may be used to look up XSLT transformers or XML documents using XPath expressions, for example in the XSLT document() function, or via xsl:import or xsl:include. The namespaces themselves can either be the names of folders in the ZODB, or string properties that give base URIs. If the namespace is the name of a ZODB folder, the NSS will be interpreted as a list of Zope contexts relative to the folder.

In the example above, the URNnamespaces property would contain a single string "acme". You would create a folder called "acme", obtainable via acquisition from the XMLTransform, with a subfolder called "legal". The "legal" subfolder would contain an object with the ID "header_boilerplate":

      Root_Folder/acme/legal/header_boilerplate

See the tutorial for some examples of URN namespaces in action.
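
The mapping from a URN to a location can be sketched as follows. This is an assumption-laden illustration, not the resolver shipped with the product: the function name and its two arguments are hypothetical, and joining a base URI to the NSS with a slash is a guess at how base-URI namespaces behave.

```python
def resolve_urn(urn, folder_namespaces, base_uri_namespaces):
    """Turn a URN into a folder-relative path or a full URI."""
    scheme, nid, nss = urn.split(":", 2)
    if nid in folder_namespaces:
        # Folder namespace: each NSS segment is a Zope context under the folder.
        return "/".join([nid] + nss.split("/"))
    if nid in base_uri_namespaces:
        # Base URI namespace (assumed behaviour): append the NSS to the base.
        return base_uri_namespaces[nid].rstrip("/") + "/" + nss
    raise KeyError("unknown URN namespace: %s" % nid)


print(resolve_urn("urn:acme:legal/header_boilerplate", ["acme"], {}))
```

With the folder layout shown above, the returned path is relative to wherever the "acme" folder is acquired from.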

Known Limitations

libxslt

libxml2/libxslt currently does not export URI resolver hooks to the Python API (it's actually written in C). This means that URNs currently cannot be used with this processor (see TUTORIAL.txt for examples of using URNs). If this feature is avoided, however, it should work fine.

Pyana/Linux

Pyana has limitations in its URI resolver hooks on the Linux platform; Pyana/Win32 does not share this limitation. This means that URNs currently cannot be used with Pyana on Linux (see TUTORIAL.txt for examples of using URNs). If this feature is avoided, however, it should work fine.

Not compatible with Zope File objects or ParsedXML objects

Zope File objects and ParsedXML objects do not support the __call__() method as required by the current release of XMLTransform. We expect to add the capability to specify an alternate method to call (for example, __str__() would work in both cases) in a future release of XMLTransform.

Contributions

We hope others find this code useful. If you have extended or improved this product, please feel free to submit your changes to us.

If there is enough interest, we would certainly consider setting up an open-source project on SourceForge.

Schema Migration

From time to time, you may find yourself with a new version of this product, either because we have released a "new improved" version or from some changes you may have made on your own. How do you deal with all of the existing instances in your ZODB that were created with the old definition? Here is our preferred technique:

  1. Find the repair() method (we usually put it at the bottom of the respective python source file)
  2. Change the repair() method so that it updates the object from the old version to the new version
  3. Create a python script in the root folder that recursively calls repair() on each of the objects with a given metatype
  4. Execute the python script (the easy way is via the "test" tab)

Here is our python script. We call it "repairAll" and put it in the ZODB root folder:

    # metaType is assumed to be a parameter of this script: the meta type to repair
    objects = context.ZopeFind(context, obj_metatypes=[metaType], search_sub=1)
    for path, obj in objects:
        obj.repair()
    return [path for path, obj in objects]
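
A repair() method of the kind described in step 2 might look like the sketch below. It is purely illustrative: the class and the attribute name new_setting are hypothetical and do not come from the XMLTransform source. The idea is simply to add whatever the newer class definition expects without overwriting values an instance already has.

```python
class OldStyleObject:
    """Hypothetical stand-in for an instance created by an older version."""

    def __init__(self):
        self.source_id = "doc"

    def repair(self):
        # Add attributes introduced by the newer class definition,
        # leaving any value the instance already has untouched.
        if not hasattr(self, "new_setting"):
            self.new_setting = "default"
        return self


obj = OldStyleObject()
obj.repair()
print(obj.new_setting)
```

Writing repair() so that it is safe to call more than once means the script can be re-run over the whole ZODB without harm.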

XSLT Processor Support Status

4Suite 0.11.1

  1. of 39 tests pass on Win2000 and Linux. (EXSLT dyn:evaluate() not supported). URN resolution works on both platforms. Ariel website runs 100% on both platforms. No catalog resolver available.

4Suite 0.12.a3

  1. of 39 tests pass on both Win2000 and Linux. (EXSLT dyn:evaluate() and small URN issue). URN resolution works on both platforms. Ariel website runs 95% on both platforms. Catalog resolver adaptor to be implemented.

libxslt 1.0.22

  1. of 39 tests pass on Win2000 and Linux (all except URN resolver tests, EXSLT dyn:evaluate() not supported). URN resolution not supported on either platform (to be exported to Python API). Ariel website does not run on either platform. No catalog resolver available (to be exported to Python API).

Pyana 0.6

  1. of 39 tests pass on Linux, 38 of 39 pass on Win2000. (EXSLT dyn:evaluate() not supported). URN resolution not supported on Linux. URN resolution works on Win32. Ariel website has problems (to be tested). Catalog resolver adaptor to be implemented.

SabPyth 0.52

I was unable to configure Sablotron for Linux. 38 of 39 tests run on Win2000 (EXSLT dyn:evaluate() not supported). URN resolution works on Win2000. Ariel website has two problems: a) EXSLT function dyn:evaluate() not supported; b) bugs in xsl:import. No catalog resolver available.

Notes

This document is written in structured text. For a quickie intro to structured text, look here.

Unit Testing on Win32

If you don't plan to run the automated unit test suite, this section is irrelevant.

For some reason, the automated unit tests don't run properly for me on Win32 using the FAT file system under Zope versions earlier than 2.5.1. I found the following workaround. In the file named:

    [ZOPE]\lib\python\Testing\custom_zodb.py

where ZOPE stands for the directory in which you installed Zope, change line number seven:

    Storage = DemoStorage(base=FileStorage(dfi,read_only=1), quota=(1<<20))

to instead read:

    Storage = DemoStorage(base=FileStorage(dfi, read_only=0), quota=(1<<20))

My guess is that FAT does not support the read_only attribute as required.