Class: XML_Indexing_Reader

Source Location: /XML_Indexing-0.3.5/Indexing/Reader.php

Class Overview


Transparent XML Indexing Reader


Version:

  • Release: 0.3.5

Copyright:

  • 2004 Samalyse SARL corporation

Class Details

[line 97]
Transparent XML Indexing Reader

This class makes it possible to work on large XML files without a sharp increase in access time. To do so, it creates an index containing the information needed to seek quickly through a given XML file and retrieve a specific portion of it.

The indexing process is based on XPath expressions. It does not support the whole XPath language, only a subset suited to what a large XML file is expected to contain.

Currently, this class works transparently: it creates each index on demand, the first time a request needs it.

For example, when /foo/bar[232] is first looked up, all instances from /foo/bar[1] to /foo/bar[n] get indexed, so the first run is slow. Subsequent calls with expressions such as /foo/bar[232], /foo/bar[100] or /foo/bar[25] then all use the existing index and are fast.
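
The "index once, seek afterwards" idea can be illustrated with a toy sketch. This is not the XML_Indexing implementation: the helper names buildToyIndex() and fetchByPosition() are hypothetical, the scan uses a simple regular expression, and it only handles flat, non-nested, non-self-closing elements held in memory.

```php
<?php
// Toy illustration only, NOT how XML_Indexing works internally.
// Maps each 1-based position to the byte offset and length of a simple element.
function buildToyIndex(string $xml, string $tag): array
{
    $quoted  = preg_quote($tag, '~');
    $pattern = '~<' . $quoted . '\b[^>]*>.*?</' . $quoted . '>~s';
    preg_match_all($pattern, $xml, $matches, PREG_OFFSET_CAPTURE);

    $index = [];
    foreach ($matches[0] as $i => [$text, $offset]) {
        // XPath positions are 1-based, so /foo/bar[1] is the first match.
        $index[$i + 1] = ['offset' => $offset, 'length' => strlen($text)];
    }
    return $index;
}

// Once the index exists, any position is a direct slice with no re-scan.
function fetchByPosition(string $xml, array $index, int $position): ?string
{
    if (!isset($index[$position])) {
        return null;
    }
    return substr($xml, $index[$position]['offset'], $index[$position]['length']);
}

$xml   = '<foo><bar id="a">one</bar><bar id="b">two</bar><bar id="c">three</bar></foo>';
$index = buildToyIndex($xml, 'bar');              // slow part, done once
echo fetchByPosition($xml, $index, 2), "\n";      // prints <bar id="b">two</bar>
```

The first call pays for scanning every element. Later lookups for any position only read the stored offset and length, which is the trade-off described above.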

In addition to numerical indexes, indexing by attribute value is also supported, that is, expressions such as /foo/bar[@id='someValue']. As with numerical indexing, looking up such an expression indexes all values of the 'id' attribute for the given XPath root (/foo/bar here).

Using this class is straightforward:

    // Assumes the package is installed in the standard PEAR include path
    require_once 'XML/Indexing/Reader.php';

    $reader = new XML_Indexing_Reader('test.xml');
    $reader->find('/foo/bar[232]'); // Or any other XPath expression
    $xmlStrings = $reader->fetchStrings();

    echo "Extracted XML data : ";
    foreach ($xmlStrings as $n => $str) {
        echo "######## Match $n ######### \n";
        echo "$str\n\n";
    }

Namespace extraction is supported. The namespace declarations are stored in the index files. You can retrieve them with:

    $reader->find(...); // Needs to be called prior to getNamespaces()
    $nsList = $reader->getNamespaces();
    foreach ($nsList as $prefix => $uri) {
        echo "$prefix => $uri";
    }
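
One possible use for the namespace list: a fragment cut out of a larger document may use prefixes that were declared on an ancestor element, so it will not parse on its own. The sketch below wraps such a fragment in an element that re-declares the namespaces. The helper wrapWithNamespaces() is hypothetical and not part of XML_Indexing, the sample values stand in for what fetchStrings() and getNamespaces() would return, and treating an empty prefix as the default namespace is an assumption.

```php
<?php
// Hypothetical helper, not part of XML_Indexing.
function wrapWithNamespaces(string $fragment, array $nsList): string
{
    $attrs = '';
    foreach ($nsList as $prefix => $uri) {
        // Assumption: an empty prefix stands for the default namespace.
        $name   = ((string) $prefix === '') ? 'xmlns' : 'xmlns:' . $prefix;
        $attrs .= sprintf(' %s="%s"', $name, htmlspecialchars($uri, ENT_QUOTES | ENT_XML1));
    }
    return '<wrapper' . $attrs . '>' . $fragment . '</wrapper>';
}

// Stand-ins for one fetchStrings() result and for the getNamespaces() result.
$fragment = '<dc:title>Example</dc:title>';
$nsList   = ['dc' => 'http://purl.org/dc/elements/1.1/'];

$doc = new DOMDocument();
$doc->loadXML(wrapWithNamespaces($fragment, $nsList));

// The prefixed element now resolves to its namespace URI.
echo $doc->documentElement->firstChild->namespaceURI, "\n";
```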

The index storage strategy can be customized by modifying the default dsn value. Currently, only local file containers are supported.

    // The following will store indexes in /tmp, using file names with an .xi
    // suffix. That is the default.
    $options['dsn'] = 'file:///tmp/%s.xi';
    $indexer = new XML_Indexing_Reader('test.xml', $options);

    // You can specify your own path as long as you include the %s expression:
    $options['dsn'] = 'file:///var/cache/xi/%s.xi';
    $indexer = new XML_Indexing_Reader('test.xml', $options);

See the constructor documentation for more information on options.





Method Detail

XML_Indexing_Reader (Constructor)   [line 224]

XML_Indexing_Reader XML_Indexing_Reader( string $filename, [array $options = array()])

Constructor

Supported options:

  • "dsn" : Index storage strategy. The default is to create a file in the system default temporary directory (i.e. /tmp on *nix), with a '.xi' suffix. The only currently supported format is 'file://<path>'. Example: 'file:///var/cache/xi/%s.xi'. The '%s' expression is required.
  • "gz_level" : Zlib compression level of the index files. 0 by default (no compression). Goes up to 9 (maximum compression, slow). Use this if you expect large indexes (many attributes, etc.).
  • "profiling" : Takes a boolean value to enable or disable profiling support. Default is false. Enabling this option requires the Benchmark and Console_Table packages. See profile().
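
The three options can be assembled in one place before the reader is constructed. The following is a minimal sketch: buildReaderOptions() is a hypothetical helper, not part of the package, and its range check simply mirrors the 0 to 9 range documented for "gz_level".

```php
<?php
// Hypothetical helper, not part of XML_Indexing: builds the $options array
// described above and rejects values outside the documented ranges.
function buildReaderOptions(string $cacheDir, int $gzLevel = 0, bool $profiling = false): array
{
    if ($gzLevel < 0 || $gzLevel > 9) {
        throw new InvalidArgumentException('gz_level must be between 0 and 9');
    }
    return [
        // The '%s' placeholder is required by the "dsn" option.
        'dsn'       => 'file://' . rtrim($cacheDir, '/') . '/%s.xi',
        'gz_level'  => $gzLevel,
        'profiling' => $profiling,
    ];
}

$options = buildReaderOptions('/var/cache/xi', 6);
echo $options['dsn'], "\n"; // prints file:///var/cache/xi/%s.xi
```

The resulting array would then be passed as the second constructor argument, as in new XML_Indexing_Reader('test.xml', $options).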

  • Access: public

Parameters:

string   $filename   —  The XML file to parse
array   $options   —  Optional custom options


count   [line 565]

int count( )

Retrieves the total number of matches
  • Return: The number of matches
  • Access: public


fetchDomNodes   [line 620]

array fetchDomNodes( [int $offset = 0], [int $limit = null])

Fetch a set of XML matches as DOM nodes
  • Return: Array of DomElement nodes
  • Access: public

Parameters:

int   $offset   —  Index of the first match to fetch (zero-based, default: 0)
int   $limit   —  How many matches to fetch (default: all)


fetchStrings   [line 577]

array fetchStrings( [int $offset = 0], [int $limit = null])

Fetch a set of XML matches as raw strings
  • Return: Array of XML strings

Parameters:

int   $offset   —  Index of the first match to fetch (zero-based, default: 0)
int   $limit   —  How many matches to fetch (default: all)
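
The $offset and $limit parameters make it possible to walk through a large result set in fixed-size pages instead of loading every match at once. A minimal sketch follows. iterateMatchesInPages() is a hypothetical helper, and it assumes that find() has already been called on the reader and that count() reports the total for that search.

```php
<?php
// Hypothetical helper, not part of XML_Indexing.
// Yields successive arrays of at most $pageSize XML strings.
function iterateMatchesInPages($reader, int $pageSize): Generator
{
    if ($pageSize < 1) {
        throw new InvalidArgumentException('pageSize must be at least 1');
    }
    $total = $reader->count();
    for ($offset = 0; $offset < $total; $offset += $pageSize) {
        // Zero-based offset, as documented for fetchStrings().
        yield $reader->fetchStrings($offset, $pageSize);
    }
}
```

After a successful find(), a loop such as foreach (iterateMatchesInPages($reader, 50) as $page) would receive up to 50 strings per iteration.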


find   [line 400]

bool find( string $xpath)

Search for an XPath expression
  • Return: The number of nodes matched or a PEAR_Error
  • Access: public

Parameters:

string   $xpath   —  XPath expression to look for
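
Because the documented return value is either a match count or a PEAR_Error (the signature above says bool, so treat the exact type as uncertain), callers may want to check the result before fetching. The sketch below converts an error into an exception. findOrFail() is a hypothetical wrapper, not part of the package; PEAR_Error::getMessage() is the standard PEAR accessor.

```php
<?php
// Hypothetical wrapper, not part of XML_Indexing.
// Runs the search and turns a PEAR_Error result into an exception.
function findOrFail($reader, string $xpath)
{
    $result = $reader->find($xpath);
    if ($result instanceof PEAR_Error) {
        throw new RuntimeException('XPath lookup failed: ' . $result->getMessage());
    }
    // Returned unchanged: documented as the number of nodes matched.
    return $result;
}
```

A caller would wrap findOrFail($reader, '/foo/bar[232]') in a try/catch block and only call fetchStrings() when no exception was thrown.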


getNamespaces   [line 675]

array getNamespaces( )

Return namespaces declared in the XML file
  • Return: An associative array of the form ('prefix' => 'uri', ...)
  • Access: public


profile   [line 687]

void profile( )

Output profiling information
  • Access: public



Documentation generated on Mon, 11 Mar 2019 14:23:33 -0400 by phpDocumentor 1.4.4.