XPath axes let you navigate an XML document by the relationships between its nodes (parent, child, sibling, ancestor, and so on). An axis specifies the direction in which to search relative to the current node, which gives you more control and flexibility when querying XML data.
In XPath, an axis defines the relationship between the node you’re starting at (called the context node) and the nodes you want to select, so each step of a path can move through the XML tree in a different direction.
Here are the most commonly used XPath axes:
child: Selects the child nodes of the current node.
parent: Selects the parent of the current node.
descendant: Selects all descendants (children, grandchildren, etc.) of the current node.
ancestor: Selects all ancestors (parents, grandparents, etc.) of the current node.
following-sibling: Selects all sibling nodes after the current node.
preceding-sibling: Selects all sibling nodes before the current node.
following: Selects everything in the document after the closing tag of the current node (its descendants are not included).
preceding: Selects all nodes that come before the current node in the document (its ancestors are not included).
self: Selects the current node.
attribute: Selects the attributes of the current node.
We will use the following XML structure for all our examples:
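<bookstore>
  <book category="fiction">
    <title lang="en">The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <price>10.99</price>
  </book>
  <book category="non-fiction">
    <title lang="en">Sapiens</title>
    <author>Yuval Noah Harari</author>
    <price>15.99</price>
  </book>
  <magazine category="science">
    <title>National Geographic</title>
    <price>5.50</price>
  </magazine>
</bookstore>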
The child axis selects the children of the current node. This is the default axis in XPath, so you can omit the axis name if you’re selecting child elements.
Example: Select all title child elements of the book elements.
/bookstore/book/child::title
This can be simplified to:
/bookstore/book/title
Output:
<title lang="en">The Great Gatsby</title>
<title lang="en">Sapiens</title>
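The equivalence of the explicit and abbreviated forms can be checked with a minimal lxml sketch. It assumes the sample document has the structure implied by the outputs in this article; the author element name and the variable names are assumptions made for illustration.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# Explicit child axis versus the abbreviated form
explicit = [t.text for t in root.xpath('/bookstore/book/child::title')]
abbreviated = [t.text for t in root.xpath('/bookstore/book/title')]

print(explicit)
print(explicit == abbreviated)
```

Both expressions select the same two book titles; the magazine title is skipped because the path only steps through book elements.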
The parent axis selects the parent of the current node.
Example: Select the parent of any title element, provided that parent is a book.
//title/parent::book
Output:
<book category="fiction">...</book>
<book category="non-fiction">...</book>
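The name test after the axis acts as a filter, which is why the magazine does not appear above even though it also has a title. A small sketch of that behaviour, assuming the sample structure implied by this article (the author element name and the variable names are illustrative assumptions):

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# parent::book keeps only parents named "book"
book_parents = [e.tag for e in root.xpath('//title/parent::book')]
# parent::* accepts any element as the parent
any_parents = [e.tag for e in root.xpath('//title/parent::*')]
# ".." is the abbreviated form of parent::node()
abbreviated = [e.tag for e in root.xpath('//title/..')]

print(book_parents)
print(any_parents)
print(abbreviated)
```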
The descendant axis selects all descendants (children, grandchildren, etc.) of the current node.
Example: Select all descendant title elements of bookstore.
/bookstore/descendant::title
Output:
<title lang="en">The Great Gatsby</title>
<title lang="en">Sapiens</title>
<title>National Geographic</title>
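A short lxml sketch of the descendant axis follows. It assumes the sample structure implied by this article (the author element name and the variable names are illustrative assumptions) and compares the explicit axis with the familiar // shorthand.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# Titles at any depth below bookstore
via_axis = [t.text for t in root.xpath('/bookstore/descendant::title')]
# "//title" searches the whole document, so it finds the same titles here
via_double_slash = [t.text for t in root.xpath('//title')]
# Every element nested inside the first book
first_book_parts = [e.tag for e in root.xpath('/bookstore/book[1]/descendant::*')]

print(via_axis)
print(first_book_parts)
```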
The ancestor axis selects all ancestors (parents, grandparents, etc.) of the current node.
Example: Select the bookstore ancestor of the price elements.
//price/ancestor::bookstore
Output:
<bookstore>...</bookstore>
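To see the difference between naming one ancestor and asking for all of them, here is a minimal sketch for a single price element. It assumes the sample structure implied by this article; the author element name and the variable names are illustrative assumptions.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# Use the first book's price as the context node
price = root.xpath('/bookstore/book[1]/price')[0]

# ancestor::* returns every element above the price
all_ancestors = [e.tag for e in price.xpath('ancestor::*')]
# ancestor::bookstore keeps only ancestors named "bookstore"
only_bookstore = [e.tag for e in price.xpath('ancestor::bookstore')]

print(all_ancestors)
print(only_bookstore)
```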
The following-sibling axis selects all sibling nodes that come after the current node.
Example: Select the sibling elements after the first book element.
/bookstore/book[1]/following-sibling::*
Output:
<book category="non-fiction">...</book>
<magazine category="science">...</magazine>
The preceding-sibling axis selects all sibling nodes that come before the current node.
Example: Select the sibling elements before the magazine element.
/bookstore/magazine/preceding-sibling::*
Output:
<book category="fiction">...</book>
<book category="non-fiction">...</book>
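The sketch below runs the same query with lxml and also picks out the nearest preceding sibling. It assumes the sample structure implied by this article (the author element name and the variable names are illustrative assumptions). On a reverse axis such as preceding-sibling, XPath counts positions backwards from the context node, so [1] means the closest sibling.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

magazine = root.xpath('/bookstore/magazine')[0]

# All element siblings before the magazine
before = [e.get('category') for e in magazine.xpath('preceding-sibling::*')]
# Position 1 on a reverse axis is the sibling nearest to the context node
nearest = [e.get('category') for e in magazine.xpath('preceding-sibling::*[1]')]

print(before)
print(nearest)
```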
The following axis selects everything in the document that comes after the closing tag of the current node, so the node's own descendants are not included.
Example: Select all elements after the first book element.
/bookstore/book[1]/following::*
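Output (these elements, plus every element nested inside them, such as their title and price elements):
<book category="non-fiction">...</book>
<magazine category="science">...</magazine>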
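The difference between following-sibling and following is easiest to see side by side. This is a minimal sketch, assuming the sample structure implied by this article; the author element name and the variable names are illustrative assumptions.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

first_book = root.xpath('/bookstore/book[1]')[0]

# Only elements that share the first book's parent
siblings = [e.tag for e in first_book.xpath('following-sibling::*')]
# Every element that starts after the first book ends, at any depth
following = [e.tag for e in first_book.xpath('following::*')]

print(siblings)
print(following)
```

The following axis returns the later siblings together with everything nested inside them, while following-sibling stops at the sibling level.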
The preceding axis selects all nodes that come before the current node in the document, excluding its ancestors.
Example: Select all elements before the magazine element.
/bookstore/magazine/preceding::*
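Output (these elements, plus every element nested inside them; the bookstore ancestor is not included):
<book category="fiction">...</book>
<book category="non-fiction">...</book>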
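A short lxml sketch of the preceding axis, assuming the sample structure implied by this article (the author element name and the variable names are illustrative assumptions). It shows that nested elements are included while ancestors are left out.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

magazine = root.xpath('/bookstore/magazine')[0]

# Every element that ends before the magazine starts
preceding = [e.tag for e in magazine.xpath('preceding::*')]

print(preceding)
# The bookstore element contains the magazine, so it is an ancestor, not a preceding node
print('bookstore' in preceding)
```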
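The self axis selects the current node itself.
Example: Select the bookstore element. An absolute path starts at the document root node, which is not an element, so /self::bookstore on its own matches nothing; step to the element first and then apply the self axis.
/bookstore/self::bookstore
Output: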
<bookstore>...</bookstore>
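The self axis is mostly useful as a filter on whatever node a previous step has reached. A minimal sketch, assuming the sample structure implied by this article (the author element name and the variable names are illustrative assumptions):

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# "/" is the document root node, which is not an element named bookstore
from_document_root = root.xpath('/self::bookstore')
# Step to the element first, then test it with the self axis
from_element = [e.tag for e in root.xpath('/bookstore/self::bookstore')]
# self:: as a filter: keep only the children of bookstore that are books
only_books = [e.get('category') for e in root.xpath('/bookstore/*/self::book')]

print(from_document_root)
print(from_element)
print(only_books)
```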
The attribute axis selects the attributes of the current node.
Example: Select the category attribute of the book elements.
/bookstore/book/attribute::category
This can be simplified to:
/bookstore/book/@category
Output:
fiction
non-fiction
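The explicit and abbreviated attribute forms can be compared with a small lxml sketch. It assumes the sample structure implied by this article; the author element name and the variable names are illustrative assumptions.

```python
from lxml import etree

# Assumed reconstruction of the sample document (compact form);
# the <author> element name is a guess.
XML = (
    '<bookstore>'
    '<book category="fiction"><title lang="en">The Great Gatsby</title>'
    '<author>F. Scott Fitzgerald</author><price>10.99</price></book>'
    '<book category="non-fiction"><title lang="en">Sapiens</title>'
    '<author>Yuval Noah Harari</author><price>15.99</price></book>'
    '<magazine category="science"><title>National Geographic</title>'
    '<price>5.50</price></magazine>'
    '</bookstore>'
)
root = etree.fromstring(XML)

# Explicit attribute axis versus the "@" abbreviation
explicit = list(root.xpath('/bookstore/book/attribute::category'))
abbreviated = list(root.xpath('/bookstore/book/@category'))
# "@*" selects every attribute; only the two book titles carry one
title_attributes = list(root.xpath('//title/@*'))

print(explicit)
print(title_attributes)
```

lxml returns attribute nodes as string values, which is why the results print as plain lists of text.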
Let’s implement XPath axes in Python using the lxml library.
from lxml import etree

# Load and parse the XML document
tree = etree.parse('books.xml')

# 1. Select the parent of the title element
parent_of_title = tree.xpath('//title/parent::book')
for parent in parent_of_title:
    print("Parent of title:", parent.tag)

# 2. Select the following sibling of the first book element
following_sibling = tree.xpath('/bookstore/book[1]/following-sibling::*')
for sibling in following_sibling:
    print("Following sibling of first book:", sibling.tag)

# 3. Select the ancestors of the price element
ancestors_of_price = tree.xpath('//price/ancestor::*')
for ancestor in ancestors_of_price:
    print("Ancestor of price:", ancestor.tag)

# 4. Select the attributes of book elements
book_attributes = tree.xpath('/bookstore/book/@category')
print("Book categories:", book_attributes)
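Output (//price matches all three price elements, so their ancestors are returned once each, in document order):
Parent of title: book
Parent of title: book
Following sibling of first book: book
Following sibling of first book: magazine
Ancestor of price: bookstore
Ancestor of price: book
Ancestor of price: book
Ancestor of price: magazine
Book categories: ['fiction', 'non-fiction']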
XPath axes provide advanced capabilities for navigating an XML document in different directions: not just from parent to child, but also across siblings, ancestors, and descendants. They are especially helpful when you are dealing with complex XML structures and need to access nodes relative to a certain position. Happy coding! ❤️