How to Convert XML to a Dictionary in Python

XML stands for Extensible Markup Language. It is used to store and transport small to medium amounts of data, and it is widely used for sharing structured information. Python lets us parse and modify XML documents.

In this tutorial, we will demonstrate how to convert an XML string into a dictionary in Python.

Use the xmltodict Module to Convert XML String Into a Dictionary in Python

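xmltodict is a third-party Python module (installable with pip install xmltodict) that makes working with XML feel like working with JSON. Because XML is hierarchical, with elements nested inside other elements, it maps naturally onto nested dictionaries, and this module performs that conversion in a single call.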

See the code snippet below.

    import xmltodict

    # Sample XML document held in a string.
    xml_data = """
        <student>
          <id>DEL</id>
          <name> Jack </name>
          <email>jack@example.com</email>
          <semester>8</semester>
          <class>CSE</class>
          <cgpa> 7.5</cgpa>
        </student>
    """

    # parse() converts the XML string into a nested dictionary.
    d = xmltodict.parse(xml_data)
    print(d)

Output:

    OrderedDict([('student', OrderedDict([('id', 'DEL'), ('name', 'Jack'), ('email', 'jack@example.com'), ('semester', '8'), ('class', 'CSE'), ('cgpa', '7.5')]))])

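Here, the result is an ordered dictionary (OrderedDict), a dictionary subclass that preserves the insertion order of its key-value pairs. The parse() function parses the XML data into this structure and, by default, strips surrounding whitespace from element text, which is why " Jack " becomes 'Jack'. Recent releases of xmltodict running on Python 3.7 or later return a regular dict instead, which also preserves insertion order, so the printed output may look like a plain nested dictionary.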
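If the result is an OrderedDict and a plain nested dict is preferred, one common approach is a JSON round trip. The following is a minimal sketch of that idea; it hand-builds an OrderedDict shaped like the output above so that it runs without xmltodict installed, and the variable names parsed and plain are illustrative only.

```python
import json
from collections import OrderedDict

# Hand-built stand-in for the value returned by xmltodict.parse() above,
# so this sketch runs even without xmltodict installed.
parsed = OrderedDict([
    ("student", OrderedDict([
        ("id", "DEL"),
        ("name", "Jack"),
        ("email", "jack@example.com"),
    ])),
])

# json.dumps() serializes nested OrderedDicts as JSON objects, and
# json.loads() rebuilds them as plain dicts with the same key order.
plain = json.loads(json.dumps(parsed))

print(type(plain["student"]).__name__)  # dict
print(plain["student"]["name"])         # Jack
```

The round trip only works when every value is JSON-serializable, which holds here because the parsed values are strings, lists, and nested dictionaries.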

Use the ElementTree Library to Convert XML String Into a Dictionary in Python

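ElementTree (xml.etree.ElementTree) is a standard-library module that allows us to parse and navigate an XML document. Older code imports it as cElementTree, a C-accelerated alias that was deprecated in Python 3.3 and removed in Python 3.9; the regular ElementTree module now uses the fast implementation automatically. With ElementTree, we can break down the XML document into a tree of elements that is easy to work with.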
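As a brief illustration of that tree structure, the sketch below parses a shortened version of the student document with the standard library's xml.etree.ElementTree and walks its children. The shortened XML and the variable names are assumptions made for the example.

```python
from xml.etree import ElementTree as ET

xml_data = """
<student>
  <id>DEL</id>
  <name> Jack </name>
</student>
"""

# fromstring() parses the string and returns the root Element.
root = ET.fromstring(xml_data)
print(root.tag)  # student

# Iterating over an element yields its direct children in document order.
tags = [child.tag for child in root]
print(tags)  # ['id', 'name']

# find() returns the first matching child; .text holds its raw text,
# including surrounding whitespace, so strip() is applied here.
name = root.find("name").text.strip()
print(name)  # Jack
```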

We will write our own function to walk this tree and convert it to a dictionary. A defaultdict object from the collections module groups the converted child elements by tag, so that tags appearing more than once end up in a list while tags appearing once keep a single value.
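The grouping step can be seen in isolation in the sketch below. The (tag, value) pairs are hypothetical sample data standing in for the children of one element; they are not taken from the student document.

```python
from collections import defaultdict

# Hypothetical (tag, value) pairs, as if read from the children of one element.
pairs = [("course", "Math"), ("course", "Physics"), ("id", "DEL")]

grouped = defaultdict(list)
for tag, value in pairs:
    # A missing key is created automatically with an empty list.
    grouped[tag].append(value)

# Unwrap single-item lists so that only repeated tags stay as lists.
result = {k: v[0] if len(v) == 1 else v for k, v in grouped.items()}
print(result)  # {'course': ['Math', 'Physics'], 'id': 'DEL'}
```

Using defaultdict(list) avoids checking whether a key already exists before appending to it.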

See the following code.

    from collections import defaultdict
    # cElementTree was removed in Python 3.9; ElementTree replaces it.
    from xml.etree import ElementTree as ET


    def xml2dict(t):
        # Start with the tag alone; use a dict only if there are attributes.
        d = {t.tag: {} if t.attrib else None}
        children = list(t)
        if children:
            # Convert each child recursively and group the results by tag.
            dd = defaultdict(list)
            for dc in map(xml2dict, children):
                for k, v in dc.items():
                    dd[k].append(v)
            # Repeated tags stay as lists; single tags are unwrapped.
            d = {t.tag: {k: v[0] if len(v) == 1 else v
                         for k, v in dd.items()}}
        if t.attrib:
            # Prefix attribute names with '@' to tell them apart from child tags.
            d[t.tag].update(('@' + k, v)
                            for k, v in t.attrib.items())
        if t.text:
            text = t.text.strip()
            if children or t.attrib:
                if text:
                    d[t.tag]['#text'] = text
            else:
                d[t.tag] = text
        return d


    xml_data = ET.XML("""
        <student>
          <id>DEL</id>
          <name> Jack </name>
          <email>jack@example.com</email>
          <semester>8</semester>
          <class>CSE</class>
          <cgpa> 7.5</cgpa>
        </student>
    """)

    d = xml2dict(xml_data)

    print(d)

Output:

    {'student': {'id': 'DEL', 'name': 'Jack', 'email': 'jack@example.com', 'semester': '8', 'class': 'CSE', 'cgpa': '7.5'}}

Note that the result mirrors the tree structure of the XML. In both methods, the final dictionary is nested: the root element is the only top-level key, and its child elements become the keys of the inner dictionary.
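To show how such a nested result might be used, the sketch below reads values from a hand-written dictionary in the same shape as the output above. The "phone" key is a hypothetical optional element that is not part of the sample data.

```python
# Hand-written dictionary in the same shape as the converted output.
d = {"student": {"id": "DEL", "name": "Jack", "cgpa": "7.5"}}

# The root tag is the only top-level key; its children sit one level down.
student = d["student"]
print(student["name"])  # Jack

# Element text is always a string, so convert explicitly when a number is needed.
cgpa = float(student["cgpa"])
print(cgpa)  # 7.5

# get() supplies a default for elements that may be absent
# ("phone" is a hypothetical optional element).
phone = student.get("phone", "n/a")
print(phone)  # n/a
```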
