Here’s a quick example showing how to write XML using the XML package in R. There is a more modern package called xml2 that I haven’t had time to try yet.
After loading the necessary libraries and determining how many records we want to write
library(dplyr)
library(randomNames)
library(uuid)
library(XML)
library(parallel)

set.seed(0)
Nrecords <- 500 # number of records for the test data
we create random contacts, each with a name, an id and up to four nicknames.
# create a data frame with random names, an id and some secondary first names
contacts <- data.frame(name = randomNames(Nrecords)) %>%
  rowwise() %>%
  mutate(id = UUIDgenerate(),
         otherNames = paste(randomNames(sample(1:4, size = 1), which.names = 'first'), collapse = ','))
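The otherNames column packs several nicknames into one comma-separated string, which has to be split again before writing. A minimal base R sketch of that round trip, using made-up names rather than output from randomNames:

```r
# Hypothetical nicknames; the post draws them with randomNames()
nicknames <- c("Ana", "Ben", "Cleo")

# Pack into a single string, as done for the otherNames column
packed <- paste(nicknames, collapse = ",")

# Unpack again; strsplit() returns a list, so take its first element
unpacked <- strsplit(packed, ",")[[1]]

print(packed)
print(unpacked)
```

This only works as long as no nickname itself contains a comma.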
Next we write a function that writes an XML file for each record. The nicknames are listed as separate nickName elements.
# a function that takes an id, name and nicknames and writes out
# an XML file
writeData <- function(id, name, moreNames) {
  fileName <- paste("~/tmp/xml/", id, '.xml', sep = "")
  contactXML <- xmlOutputDOM(tag = "Contacts", nsURI = "http://example.org/dddd/eee")
  contactXML$addTag("id", id)
  contactXML$addTag("name", name)
  # split the comma-separated nicknames into a character vector
  otherNames <- strsplit(moreNames, ',')[[1]]
  # open <otherNames> and leave it open so the nicknames nest inside it
  contactXML$addTag("otherNames", close = FALSE)
  for (j in seq_along(otherNames)) {
    contactXML$addTag("nickName", otherNames[j])
  }
  contactXML$closeTag()
  #saveXML(contactXML$value(),file = fileName, prefix = '\n')
  saveXML(contactXML$value(), file = fileName, prefix = '')
}
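The function writes into ~/tmp/xml/, so that directory presumably has to exist before the first call. A small sketch of creating the target directory up front; it uses a temporary directory and a made-up id so that it can be run anywhere, and the variable names are illustrative only:

```r
# Illustrative output directory under R's per-session temp directory
outDir <- file.path(tempdir(), "xml")

# Create the directory (and any missing parents); no warning if it already exists
dir.create(outDir, recursive = TRUE, showWarnings = FALSE)

# Build the file name for one record from a made-up id
exampleId <- "0000-example-id"
fileName <- file.path(outDir, paste0(exampleId, ".xml"))

print(dir.exists(outDir))
print(basename(fileName))
```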
Then we run the function as a single thread.
# single thread
system.time(
  mapply(writeData, contacts$id, contacts$name, contacts$otherNames)
)
#  user  system elapsed
# 1.992   0.040   2.031
And, for comparison, with four parallel workers (mcmapply forks processes rather than starting threads).
# four concurrent threads
system.time(
  mcmapply(writeData, contacts$id, contacts$name, contacts$otherNames, mc.cores = 4)
)
#  user  system elapsed
# 1.068   0.072   0.614
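As an aside, mcmapply relies on forking, which is not available on Windows, where mc.cores has to be 1. A hedged sketch of choosing the number of workers before the call; nCores is a name introduced here, and the cap of four mirrors the run above:

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1
available <- detectCores()
if (is.na(available)) available <- 1L

# Forking is unavailable on Windows; use a single worker there
nCores <- if (.Platform$OS.type == "windows") 1L else min(4L, available)

print(nCores)
```

The result could then be passed as mc.cores = nCores.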
We get a speedup of about 3.3 in elapsed time (2.031 s versus 0.614 s) rather than the ideal four, presumably due to overheads relating to writing files and other inefficiencies.
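The speedup can be read directly off the elapsed times reported above. A short calculation, with variable names chosen here for illustration:

```r
# Elapsed times in seconds, copied from the two system.time() outputs
elapsedSingle <- 2.031
elapsedParallel <- 0.614
workers <- 4

speedup <- elapsedSingle / elapsedParallel   # ratio of wall-clock times
efficiency <- speedup / workers              # fraction of the ideal speedup

print(round(speedup, 2))
print(round(efficiency, 2))
```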
The XML result looks like this:
To learn more about XML head to the XML tutorials.
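For comparison, here is a rough sketch of how the same record might be built with the xml2 package mentioned at the start. It has not been tested against the workflow in this post and uses made-up values; the element names and namespace URI are taken from writeData, and the output goes to a temporary file.

```r
library(xml2)

# Made-up record; the post generates these with randomNames() and UUIDgenerate()
id <- "0000-example-id"
name <- "Doe, Jane"
moreNames <- "Ana,Ben"

# Root element with the same default namespace as in writeData
doc <- xml_new_root("Contacts", xmlns = "http://example.org/dddd/eee")
xml_add_child(doc, "id", id)
xml_add_child(doc, "name", name)

# Container element, then one nickName child per nickname
others <- xml_add_child(doc, "otherNames")
for (nick in strsplit(moreNames, ",")[[1]]) {
  xml_add_child(others, "nickName", nick)
}

# Write to a temporary file rather than ~/tmp/xml/
outFile <- tempfile(fileext = ".xml")
write_xml(doc, outFile)
```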