The functionality of this record model has been improved and superseded by the DOM XML record model; see Chapter 7, DOM XML Record Model and Filter Module. The Alvis XML record model is considered obsolete and will eventually be removed from future releases of the Zebra software.
   The record model described in this chapter applies to the fundamental,
   structured XML
   record type alvis, introduced in
   Section 2.5.2, “ALVIS XML Record Model and Filter Module”.
  
This filter has been developed under the ALVIS project funded by the European Community under the "Information Society Technologies" Program (2002-2006).
    The experimental, loadable Alvis XML/XSLT filter module
    mod-alvis.so is packaged in the GNU/Debian package
    libidzebra1.4-mod-alvis.
    It is invoked by the zebra.cfg configuration statement
    
     recordtype.xml: alvis.db/filter_alvis_conf.xml
    
    In this example, the filter is applied to all data files with the suffix
    *.xml, and the
    Alvis XSLT filter configuration file is found at the
    path db/filter_alvis_conf.xml.
   
The Alvis XSLT filter configuration file must be valid XML. It might look like this (the example is used for indexing and display of OAI-harvested records):
     <?xml version="1.0" encoding="UTF-8"?>
     <schemaInfo>
     <schema name="identity" stylesheet="xsl/identity.xsl" />
     <schema name="index" identifier="http://indexdata.dk/zebra/xslt/1"
     stylesheet="xsl/oai2index.xsl" />
     <schema name="dc" stylesheet="xsl/oai2dc.xsl" />
     <!-- use split level 2 when indexing whole OAI Record lists -->
     <split level="2"/>
     </schemaInfo>
    
    All named stylesheets defined inside
    schema element tags
    are for presentation after search, including
    the indexing stylesheet (which is a great debugging help). The
    names defined in the name attributes must be
    unique; these are the literal schema or
    element set names used in
    SRW,
    SRU and
    Z39.50 protocol queries.
    The paths in the stylesheet attributes
    are either relative to Zebra's working directory or absolute from the
    file system root.
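
    As an illustration, a presentation stylesheet such as
    xsl/identity.xsl can be very small. The following is only a sketch:
    the actual file is not reproduced in this chapter, so its content here
    is an assumption. It returns the stored record unchanged, which is why
    requesting it is a convenient way to inspect what was indexed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical content for xsl/identity.xsl (assumed, not taken from
     the Zebra distribution): return the record exactly as stored. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- copy the whole document tree, including namespaces, to the output -->
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

    With the configuration above, a client would select this stylesheet by
    asking for the schema or element set name identity.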
   
    The <split level="2"/> element determines where the
    XML Reader splits
    collections of records into individual records, which are then
    loaded into DOM and have the indexing XSLT stylesheet applied.
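
    To make the split level concrete, the following sketch shows a
    shortened OAI record list. It assumes that the document root element
    counts as level 0, so that level 2 addresses the record elements; this
    counting convention is an assumption and is not stated in this chapter.
    The second identifier and datestamp are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- level 0 (assumed): the document root element -->
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <!-- level 1: the list wrapper -->
  <ListRecords>
    <!-- level 2: with <split level="2"/>, each record element
         would be handed to the indexing stylesheet on its own -->
    <record>
      <header>
        <identifier>oai:JTRS:CP-3290---Volume-I</identifier>
        <datestamp>2004-07-09</datestamp>
      </header>
    </record>
    <record>
      <header>
        <!-- hypothetical second record, for illustration only -->
        <identifier>oai:example:second-record</identifier>
        <datestamp>2004-07-10</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
```

    Under that assumption, this file would yield two separate records
    rather than one large one.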
   
    There must be exactly one indexing XSLT stylesheet, which is
    defined by the magic attribute
    identifier="http://indexdata.dk/zebra/xslt/1".
   
When indexing, an XML Reader is invoked to split the input
     files into suitable record XML pieces. Each record piece is then
     transformed to an XML DOM structure, which is essentially the
     record model. Only XSLT transformations can be applied during
     indexing, search and retrieval. Consequently, output formats are
     restricted to whatever XSLT can deliver from the record XML
     structure, be it other XML formats, HTML, or plain text. In case
     you have libxslt1 running with EXSLT support,
     you can use this functionality inside the Alvis
     filter configuration XSLT stylesheets.
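
     As a minimal sketch of such a retrieval transformation, a stylesheet
     in the spirit of xsl/oai2dc.xsl could extract the Dublin Core block
     from the stored record. It assumes that each record is an OAI-PMH
     record element carrying an oai_dc:dc metadata block; the real
     stylesheet is not shown in this chapter and may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical presentation stylesheet (assumed record layout:
     oai:record/oai:metadata/oai_dc:dc). -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:oai="http://www.openarchives.org/OAI/2.0/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- return only the Dublin Core part of the record;
       the output is empty if the record has no oai_dc:dc block -->
  <xsl:template match="/">
    <xsl:copy-of select="oai:record/oai:metadata/oai_dc:dc"/>
  </xsl:template>

</xsl:stylesheet>
```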
    
The output of the indexing XSLT stylesheets must contain
     certain elements in the magic
     xmlns:z="http://indexdata.dk/zebra/xslt/1"
     namespace. The output of the XSLT indexing transformation is then
     parsed using DOM methods, and the contained instructions are
     performed on the magic elements and their
      subtrees.
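
     A hedged sketch of an indexing stylesheet that emits such elements
     is given here. The input layout (an OAI-PMH record with oai_dc
     metadata) and the XPath expressions are assumptions made for
     illustration; only the z: namespace, the z:record and z:index
     elements, and the index names are taken from this chapter. The
     optional z:rank attribute is left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative indexing stylesheet; the XPath expressions assume an
     OAI-PMH record with an oai_dc:dc metadata block. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://indexdata.dk/zebra/xslt/1"
    xmlns:oai="http://www.openarchives.org/OAI/2.0/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="oai oai_dc dc">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- suppress the built-in rule that copies text nodes to the output -->
  <xsl:template match="text()"/>

  <!-- one z:record per input record; z:id carries the record ID -->
  <xsl:template match="/oai:record">
    <z:record z:id="{normalize-space(oai:header/oai:identifier)}">

      <!-- type 0: inserted without character normalization -->
      <z:index name="oai_identifier" type="0">
        <xsl:value-of select="normalize-space(oai:header/oai:identifier)"/>
      </z:index>
      <z:index name="oai_datestamp" type="0">
        <xsl:value-of select="oai:header/oai:datestamp"/>
      </z:index>

      <!-- nested indexes: text inside dc_title and dc_creator is
           also contained in the enclosing dc_all index -->
      <z:index name="dc_all" type="w">
        <xsl:for-each select="oai:metadata/oai_dc:dc/dc:title">
          <z:index name="dc_title" type="w">
            <xsl:value-of select="."/>
          </z:index>
        </xsl:for-each>
        <xsl:for-each select="oai:metadata/oai_dc:dc/dc:creator">
          <z:index name="dc_creator" type="w">
            <xsl:value-of select="."/>
          </z:index>
        </xsl:for-each>
      </z:index>

    </z:record>
  </xsl:template>

</xsl:stylesheet>
```

     Nesting the title and creator indexes inside dc_all avoids selecting
     the same fields twice, because the text content of nested elements is
     picked up recursively by the enclosing index.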
    
For example, the output of the command
      xsltproc xsl/oai2index.xsl one-record.xml
     might look like this:
      <?xml version="1.0" encoding="UTF-8"?>
      <z:record xmlns:z="http://indexdata.dk/zebra/xslt/1"
      z:id="oai:JTRS:CP-3290---Volume-I"
      z:rank="47896">
      <z:index name="oai_identifier" type="0">
      oai:JTRS:CP-3290---Volume-I</z:index>
      <z:index name="oai_datestamp" type="0">2004-07-09</z:index>
      <z:index name="oai_setspec" type="0">jtrs</z:index>
      <z:index name="dc_all" type="w">
      <z:index name="dc_title" type="w">Proceedings of the 4th
      International Conference and Exhibition:
      World Congress on Superconductivity - Volume I</z:index>
      <z:index name="dc_creator" type="w">Kumar Krishen and *Calvin
      Burnham, Editors</z:index>
      </z:index>
      </z:record>
     
This means the following: From the original XML file
     one-record.xml (or from the XML record DOM of the
     same form coming from a split input file), the indexing
     stylesheet produces an indexing XML record, which is defined by
     the record element in the magic namespace
     xmlns:z="http://indexdata.dk/zebra/xslt/1".
     Zebra uses the content of
     z:id="oai:JTRS:CP-3290---Volume-I" as internal
     record ID, and - in case static ranking is set - the content of
     z:rank="47896" as static rank. Following the
     discussion in Section 9, “Relevance Ranking and Sorting of Result Sets”
     we see that this record is internally ordered
     lexicographically according to the value of the string
     oai:JTRS:CP-3290---Volume-I47896.
     
    
In this example, the following literal indexes are constructed:
      oai_identifier
      oai_datestamp
      oai_setspec
      dc_all
      dc_title
      dc_creator
     
     where the indexing type is defined in the
     type attribute
     (any value from the standard configuration
     file default.idx will do). Finally, any
     text() node content recursively contained
     inside the index will be filtered through the
     appropriate char map for character normalization, and will be
     inserted in the index.
    
     Specific to this example, we see that the single word
     oai:JTRS:CP-3290---Volume-I will be inserted literally,
     byte for byte and without any form of character normalization,
     into the index named oai_identifier. The text
     Kumar Krishen and *Calvin Burnham, Editors
     will be inserted, using the w character
     normalization defined in default.idx, into
     the index dc_creator (that is, after character
     normalization the index will keep the individual words
     kumar, krishen,
     and, calvin,
     burnham, and editors). Finally,
     both the texts
     Proceedings of the 4th International Conference and Exhibition:
      World Congress on Superconductivity - Volume I
     and
     Kumar Krishen and *Calvin Burnham, Editors
     will be inserted into the index dc_all using
     the same character normalization map w.
    
Finally, this example configuration can be queried using PQF queries, transported either by Z39.50 (here using yaz-client)
      
      Z> open localhost:9999
      Z> elem dc
      Z> form xml
      Z>
      Z> f @attr 1=dc_creator Kumar
      Z> scan @attr 1=dc_creator adam
      Z>
      Z> f @attr 1=dc_title @attr 4=2 "proceeding congress superconductivity"
      Z> scan @attr 1=dc_title abc
      
     
     or by the proprietary
     extensions x-pquery and
     x-pScanClause to
     SRU and SRW
     
      
      http://localhost:9999/?version=1.1&operation=searchRetrieve&x-pquery=%40attr+1%3Ddc_creator+%40attr+4%3D6+%22the
      http://localhost:9999/?version=1.1&operation=scan&x-pScanClause=@attr+1=dc_date+@attr+4=2+a
      
     See the section called “The SRU Server” for more information on SRU/SRW configuration, and the section called “YAZ server virtual hosts” or the YAZ CQL section for the details of the YAZ frontend server.
     Notice that no *.abs,
     *.est, *.map, or other GRS-1
     filter configuration files are involved in this process, and that the
     literal index names are used during search and retrieval.