XML parsing refers to reading XML documents and extracting meaningful data from them. XML parsers allow us to work with XML data by transforming it into an easily manageable format (like objects or data structures) in different programming languages.
This chapter walks through XML parsing in several popular programming languages, with examples in Python, Java, C#, JavaScript, PHP, and Go. By the end of the chapter, you will have seen how to parse a small XML document in each of these languages.
Python has a built-in library called xml.etree.ElementTree for parsing XML.
import xml.etree.ElementTree as ET
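xml_data = '''
<bookstore>
    <book>
        <title>XML Developer's Guide</title>
        <author>Author Name</author>
        <price>44.95</price>
    </book>
    <book>
        <title>Learn XML</title>
        <author>Another Author</author>
        <price>39.95</price>
    </book>
</bookstore>
'''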
# Parse the XML data
root = ET.fromstring(xml_data)
# Extract and print book titles and authors
for book in root.findall('book'):
    title = book.find('title').text
    author = book.find('author').text
    print(f"Title: {title}, Author: {author}")
# Output
Title: XML Developer's Guide, Author: Author Name
Title: Learn XML, Author: Another Author
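The same lookup can also feed a typed data structure instead of printing directly, which is the "objects or data structures" idea from the introduction. The sketch below is only one possible way to do that: the Book dataclass and the parse_books helper are illustrative names, not part of ElementTree, and the defaults used for missing fields are an assumption.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Book:
    title: str
    author: str
    price: float


def parse_books(xml_text: str) -> List[Book]:
    """Convert every <book> child of the root element into a Book record."""
    root = ET.fromstring(xml_text)
    books = []
    for node in root.findall("book"):
        books.append(
            Book(
                # findtext() returns the default when the child element is absent,
                # whereas find(...).text would raise AttributeError on None
                title=node.findtext("title", default=""),
                author=node.findtext("author", default=""),
                price=float(node.findtext("price", default="0")),
            )
        )
    return books


sample = """
<bookstore>
  <book>
    <title>XML Developer's Guide</title>
    <author>Author Name</author>
    <price>44.95</price>
  </book>
  <book>
    <title>Learn XML</title>
  </book>
</bookstore>
"""

for book in parse_books(sample):
    print(book)
```

The second book in the sample has no author or price, so it falls back to the defaults rather than failing. Whether a silent default or an explicit error is the better choice depends on the data being parsed.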
In the Python code:
ET.fromstring() parses the XML string into a tree structure and returns its root element.
findall() returns every matching child element, and find() returns the first match; its .text attribute holds the element's content.
Java provides the javax.xml.parsers package, with support for both DOM and SAX parsers. Below is an example using DOM parsing.
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
public class XMLParser {
    public static void main(String[] args) throws Exception {
        String xmlData = "<bookstore><book><title>XML Developer's Guide</title><author>Author Name</author><price>44.95</price></book></bookstore>";
        // Parse XML
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xmlData)));
        // Get elements by tag name
        NodeList books = doc.getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            String title = book.getElementsByTagName("title").item(0).getTextContent();
            String author = book.getElementsByTagName("author").item(0).getTextContent();
            System.out.println("Title: " + title + ", Author: " + author);
        }
    }
}
# Output
Title: XML Developer's Guide, Author: Author Name
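The SAX approach mentioned alongside DOM reads the document as a stream of events instead of building a whole tree in memory, which can matter for large files. The following is a rough sketch of that idea using Python's xml.sax module, chosen only so that the supplementary examples in this chapter stay in one language. The BookHandler class and its attribute names are illustrative assumptions.

```python
import xml.sax


class BookHandler(xml.sax.ContentHandler):
    """Collects title/author pairs while the parser streams through the XML."""

    def __init__(self):
        super().__init__()
        self.books = []        # finished records
        self._current = None   # record being filled in, or None outside <book>
        self._field = None     # name of the child element currently open
        self._chunks = []      # text pieces; characters() may fire more than once

    def startElement(self, name, attrs):
        if name == "book":
            self._current = {}
        elif self._current is not None and name in ("title", "author"):
            self._field = name
            self._chunks = []

    def characters(self, content):
        # Only keep text that belongs to a field we are tracking
        if self._field is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == self._field:
            self._current[name] = "".join(self._chunks)
            self._field = None
        elif name == "book":
            self.books.append(self._current)
            self._current = None


xml_data = b"""<bookstore>
  <book>
    <title>XML Developer's Guide</title>
    <author>Author Name</author>
    <price>44.95</price>
  </book>
</bookstore>"""

handler = BookHandler()
xml.sax.parseString(xml_data, handler)
for book in handler.books:
    print(f"Title: {book['title']}, Author: {book['author']}")
```

The trade-off is visible in the code: the handler has to track its own state between callbacks, which DOM-style parsing avoids, but no complete tree is ever held in memory.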
In the Java code:
DocumentBuilder.parse() parses the XML string into a DOM Document.
getElementsByTagName() retrieves the matching elements, and getTextContent() extracts their text.
In C#, the System.Xml namespace provides tools for parsing XML, including the XmlDocument class.
using System;
using System.Xml;
class Program
{
    static void Main()
    {
        string xmlData = "<bookstore><book><title>XML Developer's Guide</title><author>Author Name</author><price>44.95</price></book></bookstore>";
        // Parse the XML string into a DOM document
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlData);
        // Print the title and author of each book
        XmlNodeList books = doc.GetElementsByTagName("book");
        foreach (XmlNode book in books)
        {
            string title = book["title"].InnerText;
            string author = book["author"].InnerText;
            Console.WriteLine($"Title: {title}, Author: {author}");
        }
    }
}
// Output
Title: XML Developer's Guide, Author: Author Name
In the C# code:
XmlDocument.LoadXml() parses the XML string.
GetElementsByTagName() retrieves elements; the indexer (for example book["title"]) selects a child element, and InnerText reads its text.
In JavaScript, you can use the DOMParser (available in browsers) for parsing XML strings.
const xmlData = `
<bookstore>
    <book>
        <title>XML Developer's Guide</title>
        <author>Author Name</author>
        <price>44.95</price>
    </book>
</bookstore>
`;
const parser = new DOMParser();
const xmlDoc = parser.parseFromString(xmlData, "application/xml");
const books = xmlDoc.getElementsByTagName("book");
for (let i = 0; i < books.length; i++) {
    const title = books[i].getElementsByTagName("title")[0].textContent;
    const author = books[i].getElementsByTagName("author")[0].textContent;
    console.log(`Title: ${title}, Author: ${author}`);
}
// Output
Title: XML Developer's Guide, Author: Author Name
In the JavaScript code:
DOMParser.parseFromString() converts the XML string into a document object.
getElementsByTagName() retrieves XML elements, and textContent extracts the value.
In PHP, the SimpleXML extension provides an easy way to parse XML.
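<?php
$xmlData = '<bookstore>
    <book>
        <title>XML Developer\'s Guide</title>
        <author>Author Name</author>
        <price>44.95</price>
    </book>
</bookstore>';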
$xml = simplexml_load_string($xmlData);
foreach ($xml->book as $book) {
    echo "Title: " . $book->title . ", Author: " . $book->author . "\n";
}
?>
// Output
Title: XML Developer's Guide, Author: Author Name
In the PHP code:
simplexml_load_string() parses the XML string into an object whose child elements are accessed as properties (for example $book->title).
Go provides the encoding/xml package for parsing XML data.
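package main
import (
    "encoding/xml"
    "fmt"
)
// Struct tags map each field to the XML element of the same name
type Book struct {
    Title  string `xml:"title"`
    Author string `xml:"author"`
    Price  string `xml:"price"`
}
type Bookstore struct {
    Books []Book `xml:"book"`
}
func main() {
    xmlData := `
<bookstore>
    <book>
        <title>XML Developer's Guide</title>
        <author>Author Name</author>
        <price>44.95</price>
    </book>
</bookstore>
`
    var bookstore Bookstore
    // Unmarshal returns an error for malformed XML, so check it
    if err := xml.Unmarshal([]byte(xmlData), &bookstore); err != nil {
        fmt.Println("Error parsing XML:", err)
        return
    }
    for _, book := range bookstore.Books {
        fmt.Printf("Title: %s, Author: %s\n", book.Title, book.Author)
    }
}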
// Output
Title: XML Developer's Guide, Author: Author Name
In the Go code:
xml.Unmarshal() parses the XML string into Go structs.
Structs (Book, Bookstore) are defined with tags to match XML elements.
XML parsing is essential for working with structured data in various programming languages. Whether you're working with small XML files or handling large datasets, each language provides libraries for reading and manipulating XML. From Python’s ElementTree to Go’s encoding/xml, you have now seen how to parse XML in six widely used programming languages. This chapter covers the basics of XML parsing on each of these platforms and can serve as a starting point for further work. Happy coding! ❤️