Transparent XML Indexing Reader
This class makes it possible to work with large XML files without a sharp increase in access time. To do so, it creates an index containing the information needed to seek quickly through a given XML file and retrieve a specific portion of it.
The indexing process is based on XPath expressions. Not all of the XPath language is supported, only a subset suited to what a large XML file is expected to contain.
Currently, this class works transparently: it creates each index on demand, the first time a request needs it.
For example, the first lookup of /foo/bar[232] indexes every instance from /foo/bar[1] to /foo/bar[n], so that first run is slow. Subsequent calls with expressions such as /foo/bar[232], /foo/bar[100] or /foo/bar[25] all reuse the index that was created, and are fast.
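The index-on-first-use behaviour described above can be sketched as follows. This is an illustration only, not the class's actual implementation: buildPositionIndex() and fetchNth() are hypothetical helpers, the document is held in a string rather than read from disk, and the scan assumes <bar> elements that are neither nested nor self-closing.

```php
<?php
// Hypothetical sketch; none of these names belong to XML_Indexing_Reader.

/**
 * Scan the XML once and record [offset, length] for every <$name> element.
 * Keys are 1-based, like XPath positions.
 */
function buildPositionIndex(string $xml, string $name): array
{
    $index = [];
    $open  = "<$name";
    $close = "</$name>";
    $pos   = 0;
    $n     = 0;
    while (($start = strpos($xml, $open, $pos)) !== false) {
        // Accept "<bar>" or "<bar ...>", but skip names like "<barcode>".
        $next = $xml[$start + strlen($open)] ?? '';
        if ($next === '' || strpos("> \t\r\n", $next) === false) {
            $pos = $start + strlen($open);
            continue;
        }
        $end = strpos($xml, $close, $start);
        if ($end === false) {
            break; // unterminated element: stop indexing
        }
        $end += strlen($close);
        $index[++$n] = [$start, $end - $start];
        $pos = $end;
    }
    return $index;
}

/**
 * Return the $n-th element. The index is built on the first call (slow)
 * and reused on every later call (fast).
 */
function fetchNth(string $xml, ?array &$index, int $n): ?string
{
    if ($index === null) {
        $index = buildPositionIndex($xml, 'bar');
    }
    if (!isset($index[$n])) {
        return null;
    }
    [$offset, $length] = $index[$n];
    return substr($xml, $offset, $length);
}

$xml   = '<foo><bar id="a">one</bar><bar id="b">two</bar><bar id="c">three</bar></foo>';
$index = null;
echo fetchNth($xml, $index, 2), "\n"; // builds the index, then extracts
echo fetchNth($xml, $index, 3), "\n"; // reuses the index
```

With a file on disk, the same stored offsets could be passed to fseek() and fread() instead of substr(), which is where the time saving on large files would come from.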
In addition to numerical indexes, attribute value indexing is also supported, that is, expressions such as /foo/bar[@id='someValue']. As with numerical indexing, looking up such an expression indexes all values of the 'id' attribute for the given XPath root (/foo/bar here).
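An attribute value index can be pictured as a map from each attribute value to the offsets of the elements that carry it. The sketch below is an assumption about the general idea, not the class's code: buildAttributeIndex() is a hypothetical helper, and its regular expression is only adequate for simple, well-formed start tags.

```php
<?php
// Hypothetical sketch; not part of XML_Indexing_Reader.

/**
 * Map every value of $attr found on <$element> start tags to the byte
 * offsets of those tags.
 */
function buildAttributeIndex(string $xml, string $element, string $attr): array
{
    $pattern = '/<' . preg_quote($element, '/') . '\b[^>]*?\s'
             . preg_quote($attr, '/') . '\s*=\s*(["\'])(.*?)\1[^>]*>/s';
    preg_match_all($pattern, $xml, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $index = [];
    foreach ($matches as $m) {
        $value  = $m[2][0];  // captured attribute value
        $offset = $m[0][1];  // byte offset of the start tag
        $index[$value][] = $offset;
    }
    return $index;
}

$xml   = '<foo><bar id="a">one</bar><bar id="b">two</bar></foo>';
$index = buildAttributeIndex($xml, 'bar', 'id');

// One scan indexes every 'id' value, so a later lookup of any other
// value is a plain array access.
echo substr($xml, $index['b'][0], 12), "\n";
```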
Using this class is straightforward:
$reader = new XML_Indexing_Reader ('test.xml');
$reader->find('/foo/bar[232]'); // Or any other XPath expression
$xmlStrings = $reader->fetchStrings();
echo "Extracted XML data:\n";
foreach ($xmlStrings as $n => $str) {
    echo "######## Match $n #########\n";
    echo "$str\n\n";
}
Namespace extraction is supported. Namespace declarations are stored in the index files. You can retrieve them with:
$reader->find(...); // Must be called before getNamespaces()
$nsList = $reader->getNamespaces();
foreach ($nsList as $prefix => $uri) {
    echo "$prefix => $uri\n";
}
The index storage strategy can be customized by changing the default dsn value. Currently, only local file containers are supported.
// The following stores indexes in /tmp, using file names with an .xi
// suffix. This is the default.
$options['dsn'] = 'file:///tmp/%s.xi';
$indexer = new XML_Indexing_Reader ('test.xml', $options);
// You can specify your own path, as long as it includes the %s placeholder:
$options['dsn'] = 'file:///var/cache/xi/%s.xi';
$indexer = new XML_Indexing_Reader ('test.xml', $options);
See the constructor documentation for more information on options.
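To illustrate how a dsn template with a %s placeholder might map to a file on disk, here is a minimal sketch. It rests on an assumption: that %s is replaced by the base name of the XML file. What the class actually substitutes is not stated here, and resolveIndexPath() is a hypothetical helper, not part of the class.

```php
<?php
// Hypothetical helper; not part of XML_Indexing_Reader.
// ASSUMPTION: %s is filled with the XML file's base name.
function resolveIndexPath(string $dsn, string $xmlFile): string
{
    if (strpos($dsn, '%s') === false) {
        throw new InvalidArgumentException('The dsn must contain a %s placeholder');
    }
    $url   = sprintf($dsn, basename($xmlFile));
    $parts = parse_url($url);
    if ($parts === false
        || ($parts['scheme'] ?? '') !== 'file'
        || !isset($parts['path'])
    ) {
        throw new InvalidArgumentException('Only file:// dsn values are handled in this sketch');
    }
    return $parts['path'];
}

echo resolveIndexPath('file:///tmp/%s.xi', 'test.xml'), "\n";
echo resolveIndexPath('file:///var/cache/xi/%s.xi', 'test.xml'), "\n";
```

Rejecting a template without %s up front mirrors the requirement that custom paths include the placeholder: without it, every XML file would resolve to the same index file.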