XML stands for Extensible Markup Language. It is widely used to describe structured data and, like HTML, it is built from tags. In the previous tutorial I showed how to read CSV, Excel and table files in R. Now we are going to see how to download an XML file from the internet into the R environment and how to parse it with R functions.
Downloading Files from Internet
Before we can read or parse an XML file, it first needs to be downloaded into the R environment. To do so, we can use the download.file() function. Suppose I need to parse the XML file at the URL http://technokarak.com/sitemap-pt-post-2016-04.xml. The destfile parameter specifies the local file in which the downloaded contents are stored.
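The download step might look like the sketch below. The URL is the one from the text and may no longer be reachable; the local file name sitemap.xml is an assumption chosen for illustration.

```r
# URL taken from the text above; it may no longer be reachable.
url <- "http://technokarak.com/sitemap-pt-post-2016-04.xml"

# Hypothetical local file name, chosen for illustration.
destfile <- "sitemap.xml"

# download.file() returns 0 on success. Failures surface as errors or
# warnings, so both are caught here and mapped to a non-zero status.
status <- tryCatch(
  download.file(url, destfile = destfile, mode = "wb", quiet = TRUE),
  error = function(e) {
    message("Download failed: ", conditionMessage(e))
    1L
  },
  warning = function(w) {
    message("Download warning: ", conditionMessage(w))
    1L
  }
)

if (status == 0) {
  message("Saved to ", normalizePath(destfile))
}
```

Passing mode = "wb" writes the file in binary mode, which avoids line-ending changes on Windows.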
How to Read/Parse XML File
To use the XML reading and parsing functions, we need to install and load the "XML" package of R. Run install.packages("XML") to install it, then load the XML and methods packages with library("XML") and library("methods"). The file downloaded in the step above can then be read and printed.
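A minimal sketch of reading and printing a file follows. So that it runs without a network connection, it writes a small, simplified sitemap-like file to a temporary path; that sample content and the name xml_path are assumptions, and in practice you would pass the path of the file you downloaded.

```r
# install.packages("XML")   # run once if the package is not installed
library(XML)
library(methods)

# Hypothetical stand-in for the downloaded file: a simplified
# sitemap-like document written to a temporary path.
xml_path <- tempfile(fileext = ".xml")
writeLines(c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<urlset>',
  '<url><loc>http://example.com/a</loc><lastmod>2016-04-01</lastmod></url>',
  '<url><loc>http://example.com/b</loc><lastmod>2016-04-15</lastmod></url>',
  '</urlset>'
), xml_path)

# xmlParse() reads the file and returns an in-memory document object.
doc <- xmlParse(xml_path)

# Printing the object shows the XML content.
print(doc)
```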
To parse the XML file and fetch the root node, call xmlRoot() on the parsed document.
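The original screenshot is not available, so the following is only a sketch of how the root node and its children can be inspected with functions from the XML package. The inline sample document and the variable names are assumptions made for illustration.

```r
library(XML)

# Hypothetical sample document, parsed from a string instead of a file.
xml_text <- paste0(
  "<urlset>",
  "<url><loc>http://example.com/a</loc><lastmod>2016-04-01</lastmod></url>",
  "<url><loc>http://example.com/b</loc><lastmod>2016-04-15</lastmod></url>",
  "</urlset>"
)
doc <- xmlParse(xml_text, asText = TRUE)

# Fetch the root node of the document.
root <- xmlRoot(doc)

root_name  <- xmlName(root)   # name of the root element
n_children <- xmlSize(root)   # number of child nodes under the root

# Child nodes can be indexed by position or by element name.
first_url <- root[[1]]
first_loc <- xmlValue(first_url[["loc"]])

print(root_name)
print(n_children)
print(first_loc)
```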
You can use the functions mentioned above to parse the XML and get the result you require. Before concluding the post, I would like to show how to convert XML into a data frame, which provides an easy way to process the data in an XML file. To do so, use the xmlToDataFrame() function.
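As a sketch, xmlToDataFrame() can be applied to a parsed document whose root has one child element per record. The sample document is again an assumption; each url element becomes a row and each of its children becomes a column.

```r
library(XML)

# Hypothetical sample document with one <url> element per record.
xml_text <- paste0(
  "<urlset>",
  "<url><loc>http://example.com/a</loc><lastmod>2016-04-01</lastmod></url>",
  "<url><loc>http://example.com/b</loc><lastmod>2016-04-15</lastmod></url>",
  "</urlset>"
)
doc <- xmlParse(xml_text, asText = TRUE)

# Each child of the root becomes a row; its sub-elements become columns.
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

print(df)
str(df)
```

This works best for flat, record-like XML. Deeply nested documents usually need to be walked node by node instead.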
To access the values of a column in the resulting data frame, use the $ operator with the column name.
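A short sketch of column access is shown here. The column names loc and lastmod come from the hypothetical sample document, not from the original sitemap, so adjust them to whatever elements your file contains.

```r
library(XML)

# Hypothetical sample document, as in the earlier sketches.
xml_text <- paste0(
  "<urlset>",
  "<url><loc>http://example.com/a</loc><lastmod>2016-04-01</lastmod></url>",
  "<url><loc>http://example.com/b</loc><lastmod>2016-04-15</lastmod></url>",
  "</urlset>"
)
df <- xmlToDataFrame(xmlParse(xml_text, asText = TRUE),
                     stringsAsFactors = FALSE)

# A column is accessed with $ and comes back as a character vector.
locations <- df$loc
print(locations)

# Values are read as text, so convert them when another type is needed.
modified <- as.Date(df$lastmod)
print(modified)
```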