=====================
APIs specific to lxml
=====================
lxml tries to follow established APIs wherever possible. Sometimes, however,
the need to expose a feature in an easy way led to the invention of a new API.

.. contents::
..
   1  lxml.etree
   2  Other Element APIs
   3  Trees and Documents
   4  Iteration
   5  Parsers
   6  iterparse and iterwalk
   7  Error handling on exceptions
   8  Python unicode strings
   9  XPath
   10 XSLT
   11 RelaxNG
   12 XMLSchema
   13 xinclude
   14 write_c14n on ElementTree

lxml.etree
----------
lxml.etree tries to follow the `ElementTree API`_ wherever it can. There are,
however, some incompatibilities (see `compatibility`_). The extensions are
documented here.
.. _`ElementTree API`: http://effbot.org/zone/element-index.htm
.. _`compatibility`: compatibility.html
If you need to know which version of lxml is installed, you can access the
``lxml.etree.LXML_VERSION`` attribute to retrieve a version tuple. Note,
however, that it did not exist before version 1.0, so you will get an
AttributeError in older versions. The versions of libxml2 and libxslt are
available through the attributes ``LIBXML_VERSION`` and ``LIBXSLT_VERSION``.
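
As a rough illustration of the version attributes described above, the
following sketch reads them defensively. The ``getattr`` fallback and the
variable names are assumptions made for this example, not part of lxml::

    from lxml import etree

    # LXML_VERSION is missing before lxml 1.0, so fall back to None
    # instead of letting the AttributeError propagate.
    lxml_version = getattr(etree, "LXML_VERSION", None)

    # Versions of the underlying C libraries, as tuples.
    libxml_version = etree.LIBXML_VERSION
    libxslt_version = etree.LIBXSLT_VERSION

    if lxml_version is not None and lxml_version >= (1, 0):
        print("lxml %s" % ".".join(str(part) for part in lxml_version))

Comparing tuples this way keeps the check independent of how many components
the version tuple carries.
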
The examples below generally assume that the following has been executed
first::

    >>> from lxml import etree
    >>> from StringIO import StringIO

Other Element APIs
------------------
While lxml.etree itself uses the ElementTree API, it is possible to replace
the Element implementation with `custom element subclasses`_. This has been
used to implement well-known XML APIs on top of lxml. The ``lxml.elements``
package contains examples. Currently, there is a data-binding implementation
called `objectify`_, which is similar to the `Amara bindery`_ tool.
Additionally, the `lxml.elements.classlookup`_ module provides a number of
different schemes to customize the mapping between libxml2 nodes and the
Element classes used by lxml.etree.
.. _`custom element subclasses`: namespace_extensions.html
.. _`objectify`: objectify.html
.. _`lxml.elements.classlookup`: elements.html#lxml.elements.classlookup
.. _`Amara bindery`: http://uche.ogbuji.net/tech/4suite/amara/
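
To give a flavour of what a data-binding API on top of lxml looks like, here
is a minimal sketch using ``lxml.objectify``. It assumes that this module is
importable in the installed lxml version; the XML document and its tag names
are made up for illustration::

    from lxml import objectify

    # Hypothetical sample document.
    order = objectify.fromstring(
        "<order><quantity>3</quantity><label>widget</label></order>")

    # Child elements are reachable as Python attributes ...
    quantity = order.quantity
    label = order.label

    # ... and leaf values are exposed as Python values through ``pyval``.
    total = quantity.pyval + 1      # integer arithmetic on the parsed value
    name = label.text               # plain text content of the element
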
Trees and Documents
-------------------
Compared to the original ElementTree API, lxml.etree has an extended tree
model. It knows about parents and siblings of elements::

    >>> root = etree.Element("root")
    >>> a = etree.SubElement(root, "a")
    >>> b = etree.SubElement(root, "b")
    >>> c = etree.SubElement(root, "c")
    >>> d = etree.SubElement(root, "d")
    >>> e = etree.SubElement(d, "e")
    >>> b.getparent() == root
    True
    >>> print b.getnext().tag
    c
    >>> print c.getprevious().tag
    b

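
The navigation methods can be combined into small helpers. The sketch below
walks the siblings that follow an element by calling ``getnext()`` until it
returns ``None``. The helper name ``following_siblings`` is an assumption
made for this example and not part of lxml::

    from lxml import etree

    def following_siblings(element):
        """Return the tags of all siblings after ``element``, in order."""
        tags = []
        sibling = element.getnext()
        while sibling is not None:
            tags.append(sibling.tag)
            sibling = sibling.getnext()
        return tags

    root = etree.Element("root")
    a = etree.SubElement(root, "a")
    b = etree.SubElement(root, "b")
    c = etree.SubElement(root, "c")

    tags_after_a = following_siblings(a)

    # At the edges of the tree the navigation methods return None.
    root_parent = root.getparent()
    before_first = a.getprevious()
    after_last = c.getnext()
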
Elements always live within a document context in lxml. This implies that
there is also a notion of an absolute document root. You can retrieve an
ElementTree for the root node of a document from any of its elements::

    >>> tree = d.getroottree()
    >>> print tree.getroot().tag
    root

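
The same holds for parsed documents. As a small sketch (the sample XML is
made up for illustration), elements at different depths lead back to the same
document root::

    from lxml import etree

    root = etree.fromstring("<root><a><b/></a></root>")
    a = root[0]
    b = a[0]

    # Both elements belong to the same document, so both trees share
    # the same root element.
    root_from_a = a.getroottree().getroot()
    root_from_b = b.getroottree().getroot()
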
Note that this is different from wrapping an Element in an ElementTree. You
can use ElementTrees to create XML trees with an explicit root node::

    >>> tree = etree.ElementTree(d)
    >>> print tree.getroot().tag
    d
    >>> print etree.tostring(tree)