How to Download and Read/Parse XML Files in R Programming

XML stands for Extensible Markup Language. It is widely used to structure and exchange data, and just like HTML it is built from tags. In the previous tutorial I showed how to read CSV, Excel and table files in R. Now we are going to see how to download an XML file from the Internet into the R environment and how to parse it with R functions.

Downloading Files from Internet

Before we can read or parse an XML file, it first needs to be downloaded into the R environment. To do so, we can use the download.file() function. Suppose I need to parse the XML file at the URL http://technokarak.com/sitemap-pt-post-2016-04.xml. In the call shown below, the destfile parameter names the local file in which the downloaded contents are stored.

[Image: Downloading Files from Internet]
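The original screenshot is not reproduced in the text, so here is a minimal sketch of the download step rather than the original listing. The destination path in a temporary directory, the mode = "wb" argument and the tryCatch() wrapper are my own assumptions, added so that the script keeps running if the URL is no longer reachable.

```r
# URL from the article; it may no longer be reachable.
url <- "http://technokarak.com/sitemap-pt-post-2016-04.xml"

# Assumed destination: a file inside R's temporary directory.
destfile <- file.path(tempdir(), "sitemap.xml")

# download.file() returns 0 on success and signals an error on failure.
# tryCatch() turns an error into NA so the script can continue.
status <- tryCatch(
  download.file(url, destfile = destfile, mode = "wb"),
  error = function(e) NA
)

if (isTRUE(status == 0)) {
  message("Saved to ", destfile)
} else {
  message("Download failed; check the URL and your connection.")
}
```

If the download succeeds, destfile is the path to pass to the parsing functions in the next section.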

How to Read/Parse XML File

To use the XML reading and parsing functions, we need to install and load the "XML" package of R. Use the install.packages("XML") command to install the package. After that, load the XML and methods packages with the library("XML") and library("methods") commands. Now let us see how to read and print the XML file we downloaded in the previous step.

[Image: How to Read/Parse XML File]
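As a rough sketch of the read-and-print step, the code below uses xmlParse() from the XML package. To keep the example runnable offline, it writes a small made-up sitemap to a temporary file instead of relying on the real download; the sample content and the file name sample-sitemap.xml are assumptions, not the actual Technokarak sitemap.

```r
# install.packages("XML")  # run once if the package is not installed
library(XML)
library(methods)

# Hypothetical stand-in for the downloaded sitemap.
xml_text <- '<urlset>
  <url>
    <loc>http://example.com/post-1</loc>
    <lastmod>2016-04-01</lastmod>
  </url>
  <url>
    <loc>http://example.com/post-2</loc>
    <lastmod>2016-04-15</lastmod>
  </url>
</urlset>'
destfile <- file.path(tempdir(), "sample-sitemap.xml")
writeLines(xml_text, destfile)

# Read the file into an in-memory XML document and print it.
doc <- xmlParse(file = destfile)
print(doc)
```

With a real download, point destfile at the file saved by download.file() and skip the writeLines() step.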

To parse the XML file and fetch the root node, use the code shown in the image below:

[Image: Parse the XML file and fetch root nodes]
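Since the screenshot is missing, here is one possible way to fetch the root node, using xmlRoot(), xmlName(), xmlSize() and xmlValue() from the XML package. The inline sample document is a made-up stand-in for the sitemap, and the original post may have used a different set of functions.

```r
library(XML)

# Hypothetical stand-in for the downloaded sitemap.
xml_text <- '<urlset>
  <url><loc>http://example.com/post-1</loc><lastmod>2016-04-01</lastmod></url>
  <url><loc>http://example.com/post-2</loc><lastmod>2016-04-15</lastmod></url>
</urlset>'
doc <- xmlParse(xml_text, asText = TRUE)

root <- xmlRoot(doc)                       # top-level node of the document
root_name <- xmlName(root)                 # tag name of the root node
n_children <- xmlSize(root)                # number of child nodes under the root
first_loc <- xmlValue(root[[1]][["loc"]])  # text of <loc> in the first <url>

print(root_name)
print(n_children)
print(first_loc)
```

Child nodes can be indexed by position (root[[1]]) or by tag name (node[["loc"]]), which is usually enough to walk a small document by hand.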

You can use the functions mentioned above to parse the XML and extract whatever your task requires. Before concluding the post, I would like to show how to convert XML into a data frame, which makes the data in an XML file much easier to process. To do so, use the xmlToDataFrame() function as shown below:

[Image: How to Convert XML into Data Frame]
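A minimal sketch of the conversion follows, again on a made-up sample rather than the real sitemap. It assumes every child of the root has the same flat set of fields, which is the case xmlToDataFrame() handles best; the stringsAsFactors = FALSE argument is my own choice to keep the columns as plain character vectors.

```r
library(XML)

# Hypothetical stand-in for the downloaded sitemap.
xml_text <- '<urlset>
  <url><loc>http://example.com/post-1</loc><lastmod>2016-04-01</lastmod></url>
  <url><loc>http://example.com/post-2</loc><lastmod>2016-04-15</lastmod></url>
</urlset>'
doc <- xmlParse(xml_text, asText = TRUE)

# Each <url> element becomes a row; each of its child tags becomes a column.
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
print(df)
```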

To access the values of a column, you can use the following code:

[Image: Access the column value]
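Column access works the same way as for any other data frame. The sketch below assumes the made-up sample used in this example, with columns named loc and lastmod; a real sitemap may carry different or additional columns.

```r
library(XML)

# Hypothetical stand-in for the downloaded sitemap.
xml_text <- '<urlset>
  <url><loc>http://example.com/post-1</loc><lastmod>2016-04-01</lastmod></url>
  <url><loc>http://example.com/post-2</loc><lastmod>2016-04-15</lastmod></url>
</urlset>'
df <- xmlToDataFrame(xmlParse(xml_text, asText = TRUE), stringsAsFactors = FALSE)

locs <- df$loc                        # whole column as a character vector
first_lastmod <- df[["lastmod"]][1]   # single value: first row of lastmod
lastmod_dates <- as.Date(df$lastmod)  # values arrive as text, so convert as needed

print(locs)
print(first_lastmod)
```

Values read from XML come in as text, so convert them with as.Date(), as.numeric() and so on before doing any calculations.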

Thank you for reading the post. Subscribe to Technokarak.com for more programming tutorials.
