Class IndexFilter

java.lang.Object
org.xml.sax.helpers.XMLFilterImpl
org.faceless.publisher.ext.IndexFilter
All Implemented Interfaces:
ReportFactoryExtension, ContentHandler, DTDHandler, EntityResolver, ErrorHandler, XMLFilter, XMLReader

public class IndexFilter extends XMLFilterImpl implements ReportFactoryExtension
An XMLFilter which can be specified with "bfo:xslt" on the about:index resource to style the generated index as a book index. It's typically used like this:

 <h2 style="break-before: page">Index</h2>
 <xi:include
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:bfo="http://bfo.com/ns/publisher" 
    bfo:xslt="classpath:org.faceless.publisher.ext.IndexFilter"
    href="about:index"
 />
 

Input format

The expected XML consists of a root element containing multiple <entry> elements. Elements other than <entry> and <term> (see below) are traversed but ignored; <entry> elements can be at any depth or location in the tree.

Entry elements can have the following attributes:

  • term - specifies the index term. An index term is required, and can be set with the term attribute, a <term> descendant of the <entry> or both. See below for formatting details.
  • page - the page the item is on. The format is an integer starting at 1, optionally followed by the formatted value of the page. For example, "12" and "12 12" both link to page 12 and display it as "12", while "12 xii" links to the same page but displays it in lower-case Roman numerals. If no page is specified, the entry is assumed to be a cross-reference and requires a "see" value set in the term.
  • to-page - if the entry covers a range of pages, the to-page attribute specifies the last page of the range. The format is identical to page.
  • class - the class to apply to the term. Optional.
  • page-class - the class to apply to the page number. Optional.
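
The page format described above can be illustrated with a minimal sketch. PageRef is a hypothetical helper written for this example; it is not part of the org.faceless.publisher API and does not claim to reflect how IndexFilter parses the attribute internally.

```java
// Illustrative only: PageRef is a hypothetical helper, not a Publisher class.
final class PageRef {
    final int number;   // 1-based page number used as the link target
    final String label; // text displayed for the page in the index

    PageRef(int number, String label) {
        this.number = number;
        this.label = label;
    }

    /** Parses "12", "12 12" or "12 xii" into a page number and a display label. */
    static PageRef parse(String value) {
        String[] parts = value.trim().split("\\s+", 2);
        int number = Integer.parseInt(parts[0]);
        if (number < 1) {
            throw new IllegalArgumentException("page numbers start at 1: " + value);
        }
        // With no formatted value, the number itself is displayed.
        String label = parts.length > 1 ? parts[1] : Integer.toString(number);
        return new PageRef(number, label);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12", "12 12", "12 xii"}) {
            PageRef p = parse(s);
            System.out.println("\"" + s + "\" links to page " + p.number + ", shown as " + p.label);
        }
    }
}
```

The same shape applies to to-page, since its format is identical to page.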

Term formatting

The term attribute is best demonstrated by example:
  • term="apples" - add an entry with the term "apples"
  • term="fruit/apples" - add an entry with the term "fruit" and a sub-entry with the term "apples". The class, page number etc. apply to the sub-entry.
  • term="α-particle { alpha-particle }" - add an entry with the term "α-particle" but sort it as if it were "alpha-particle"
  • term="α{alpha}-particle" - exactly as the previous example.
  • term="http:\/\/" - create an entry "http://" - the slash characters are escaped by prefixing with a backslash "\"
  • term="malus domestica -> apple" - create an entry "malus domestica" which is a cross-reference to the entry "apple".
  • term="rgb() // rgba()" - create two identical entries, "rgb()" and "rgba()".
  • term="rgba() -> rgb() // #number -> rgb()" - create two entries for "rgba()" and "#number", both of which are cross-references to the "rgb()" entry.

If a cross-reference is set and no page is specified, a See NNN style entry is created. If a page is specified as well, a See also NNN entry is created instead.
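
To make the term syntax concrete, here is a rough sketch of how a term string could be broken down using the default tokens. TermSyntax and its methods are hypothetical names invented for this example, not part of the Publisher API. The order of splitting ("//" first, then "->", then "/") is an assumption inferred from the examples above, and the "{ }" sort tokens are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: TermSyntax is a hypothetical helper, not a Publisher class.
final class TermSyntax {

    /** One parsed entry: its path of terms and, if "->" was present, the path it refers to. */
    static final class Parsed {
        final List<String> path;
        final List<String> xref; // null when the term has no cross-reference

        Parsed(List<String> path, List<String> xref) {
            this.path = path;
            this.xref = xref;
        }

        /** "See" for an xref without a page, "See also" for an xref with one, else empty. */
        String xrefLabel(boolean hasPage) {
            if (xref == null) return "";
            return hasPage ? "See also" : "See";
        }

        @Override public String toString() {
            return xref == null ? path.toString() : path + " -> " + xref;
        }
    }

    /** Splits on an unescaped token and trims each piece; backslash escapes are kept intact. */
    static List<String> split(String text, String token) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                cur.append(c).append(text.charAt(i + 1)); // keep the escape for later passes
                i += 2;
            } else if (text.startsWith(token, i)) {
                out.add(cur.toString().trim());
                cur.setLength(0);
                i += token.length();
            } else {
                cur.append(c);
                i++;
            }
        }
        out.add(cur.toString().trim());
        return out;
    }

    /** Drops the backslash from every escape, so "\/" becomes "/". */
    static String unescape(String s) {
        return s.replaceAll("\\\\(.)", "$1");
    }

    /** Splits one side of a term on "/" into its entry and sub-entry names. */
    static List<String> pathOf(String side) {
        List<String> path = new ArrayList<>();
        for (String part : split(side, "/")) path.add(unescape(part));
        return path;
    }

    /** Assumed order: divisions ("//"), then the cross-reference ("->"), then subdivisions ("/"). */
    static List<Parsed> parse(String term) {
        List<Parsed> result = new ArrayList<>();
        for (String division : split(term, "//")) {
            List<String> sides = split(division, "->");
            List<String> xref = sides.size() > 1 ? pathOf(sides.get(1)) : null;
            result.add(new Parsed(pathOf(sides.get(0)), xref));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] samples = {
            "apples", "fruit/apples", "http:\\/\\/",
            "malus domestica -> apple", "rgba() -> rgb() // #number -> rgb()"
        };
        for (String s : samples) System.out.println(s + " => " + parse(s));
    }
}
```

Splitting on the longest token first keeps "//" from being read as two subdivisions, and carrying escapes through each pass means "http:\/\/" survives as a single term.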

Any nested <term> elements are merged into the term attribute: they first replace any zero-length terms, and are appended if no zero-length terms exist. In each pair below, the two constructions are equivalent:

<entry term="apples" />
<entry>
  <term>apples</term>
</entry>

<entry term="apples" />
<entry>
  <term term="apples" />
</entry>

<entry term="fruit / apples" />
<entry>
  <term>fruit</term>
  <term term="apples" />
</entry>

<entry term="fruit/apples" />
<entry term="/apples">
  <term>fruit</term>
</entry>

<entry term="fruit/apples" />
<entry term="fruit/">
  <term term="apples" />
</entry>

<entry term="fruit/apples" />
<entry term="fruit">
  <term>apples</term>
</entry>

<entry term="fruit/apples/golden delicious" />
<entry term="fruit//golden delicious">
  <term>apples</term>
</entry>
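
The merging rule can be sketched as follows. TermMerge is a hypothetical helper written for this example only. It assumes the default "/" subdivision token, ignores backslash escapes, and treats the text of each nested <term> as a plain string. Appending any nested terms left over after the empty slots are filled is also an assumption; the description above does not cover that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: TermMerge is a hypothetical helper, not a Publisher class.
final class TermMerge {

    /**
     * Merges nested term values into the term attribute: empty segments are
     * filled first, in order, and any remaining nested terms are appended.
     */
    static String merge(String termAttribute, List<String> nested) {
        List<String> segments = new ArrayList<>();
        if (termAttribute != null) {
            // A limit of -1 keeps trailing empty segments, so "fruit/" yields ["fruit", ""].
            for (String s : termAttribute.split("/", -1)) segments.add(s.trim());
        }
        int next = 0;
        for (int i = 0; i < segments.size() && next < nested.size(); i++) {
            if (segments.get(i).isEmpty()) segments.set(i, nested.get(next++));
        }
        while (next < nested.size()) segments.add(nested.get(next++));
        return String.join("/", segments);
    }

    public static void main(String[] args) {
        System.out.println(merge(null, Arrays.asList("fruit", "apples")));
        System.out.println(merge("/apples", Collections.singletonList("fruit")));
        System.out.println(merge("fruit/", Collections.singletonList("apples")));
        System.out.println(merge("fruit", Collections.singletonList("apples")));
        System.out.println(merge("fruit//golden delicious", Collections.singletonList("apples")));
    }
}
```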

The tokens "/" (subdivision), "{" and "}" (open and close sort), "->" (xref) and "//" (division) can be surrounded by whitespace, and each can be overridden by setting either an environment variable or an attribute on the root element. For example, to change "/" and "//" to "|" and "||", and the sort tokens to square brackets, either include the following in a stylesheet:


  @bfo env {
      bfo-ext-index-subdivision: "|";
      bfo-ext-index-division: "||";
      bfo-ext-index-sort: "[ ]";
  }
 

or set the following attributes on the root element of the XML:


  <index subdivision="|" division="||" sort="[ ]">
 

To set these attributes on the root element when the index is included with XInclude, use the XInclude local-attributes namespace:

 <xi:include
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:bfo="http://bfo.com/ns/publisher" 
    xmlns:xila="http://www.w3.org/2001/XInclude/local-attributes"
    bfo:xslt="classpath:org.faceless.publisher.ext.IndexFilter"
    href="about:index"
    xila:subdivision="|"
    xila:division="||"
    xila:sort="[ ]"
 />
 

Output format

Given the following input:

  <index>
   <entry term="aardvark" page="9" />
   <entry term="aardvark" page-class="main" page="10" />
   <entry term="fruit/apples" page="12" />
   <entry term="fruit/malus domestica -> fruit/apples" />
  </index>
 
The generated XML looks like this:

 <section class="bfo-index-container">
  <div class="bfo-index">
 
   <div class="bfo-index-group" data-term="A">
    <div class="bfo-index-heading" data-term="A">A</div>
 
    <div class="bfo-index-entry bfo-index-entry-final" data-term="aardvark">
     <span class="bfo-index-term">aardvark</span>
     <span class="bfo-index-pages">
      <a href="pdf:goto(9)">9</a>,
      <a class="main" href="pdf:goto(10)">10</a>
     </span>
    </div>
 
   </div>
 
   <div class="bfo-index-group" data-term="F">
    <div class="bfo-index-heading" data-term="F">F</div>
 
    <div class="bfo-index-entry" data-term="fruit">
     <span class="bfo-index-term">fruit</span>
 
     <div class="bfo-index-entry bfo-index-entry-final" data-term="apples">
      <span class="bfo-index-term">apples</span>
      <span class="bfo-index-pages">
       <a href="pdf:goto(12)">12</a>
      </span>
     </div>
 
     <div class="bfo-index-entry bfo-index-entry-final" data-term="malus domestica">
      <span class="bfo-index-term">malus domestica</span>
      <span class="bfo-index-xref">apples</span>
     </div>
    </div>
 
   </div>
  </div>
 </section>
 

Each bfo-index-group contains a bfo-index-heading followed by one or more bfo-index-entry elements. Each entry contains a bfo-index-term and, optionally, a bfo-index-pages and/or a bfo-index-xref. If the term has children, the entry contains nested bfo-index-entry elements; if not, it has the additional class bfo-index-entry-final.
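
As an illustration of this structure, the sketch below walks a fragment of generated output with the standard DOM API and collects the full path of every leaf entry. IndexWalker is a hypothetical name used only for this example; it relies solely on the class and data-term attributes described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: IndexWalker is a hypothetical helper, not a Publisher class.
final class IndexWalker {

    /** Returns the "/"-joined data-term path of every bfo-index-entry-final element. */
    static List<String> leafTerms(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<String> out = new ArrayList<>();
            walk(root, "", out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse index XML", e);
        }
    }

    private static void walk(Element e, String prefix, List<String> out) {
        List<String> classes = Arrays.asList(e.getAttribute("class").trim().split("\\s+"));
        String path = prefix;
        if (classes.contains("bfo-index-entry")) {
            String term = e.getAttribute("data-term");
            path = prefix.isEmpty() ? term : prefix + "/" + term;
            // Entries without children carry the extra "final" class.
            if (classes.contains("bfo-index-entry-final")) out.add(path);
        }
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) walk((Element) n, path, out);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<div class=\"bfo-index-group\" data-term=\"F\">"
          + "<div class=\"bfo-index-heading\" data-term=\"F\">F</div>"
          + "<div class=\"bfo-index-entry\" data-term=\"fruit\">"
          + "<span class=\"bfo-index-term\">fruit</span>"
          + "<div class=\"bfo-index-entry bfo-index-entry-final\" data-term=\"apples\"/>"
          + "</div></div>";
        System.out.println(leafTerms(xml));
    }
}
```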

A basic stylesheet is included that lays the index out in two columns; it can be overridden.