We begin our experiments with a corpus made up of the following five texts:

  • R. Bigge, “Making the Invisible Visible: The Neo-Conceptual Tentacles of Mark Lombardi,” Left History, vol. 10, iss. 2, pp. 127-134, 2005.
  • A. Friedman, “Mark Lombardi’s visualisation discovery,” in Visual Literacy Conference (ADV-VIS), 2011, pp. 12-16.
  • J. M. Law, “Mark Lombardi’s ‘Narrative Structures’: The Visibility of the Network and the New Global Order,” Master’s Dissertation, 2012.
  • Y. Yeadon, “Occasional Notes on Mark Lombardi’s Banca Nazionale del Lavoro, Reagan, Bush, Thatcher and the Arming of Iraq, c. 1979-1990, 3rd Version,” Rethinking Marxism: A Journal of Economics, Culture & Society, vol. 15, iss. 3, pp. 343-349, 2003.
  • J. Zdebik, “Networks of Corruption: The Aesthetics of Mark Lombardi’s Relational Diagrams,” RACAR Revue d’art canadienne / Canadian Art Review, vol. 36, iss. 2, pp. 66-77, 2011.

As a first experiment, we used the tm package for R to build a full-text index of the corpus and generated a simple word cloud from it:

[Figure: word cloud of the corpus]

Admittedly, the result is not surprising 🙂

Here is the corresponding (beginner-level) R code.

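library(tm)

# Read the five PDFs of the corpus. readPDF() returns a reader function and
# relies on a PDF engine (e.g. the pdftools package) being installed.
# The calls below assume a recent tm release (0.6 or later).
read.pdf <- readPDF()
pdf1 <- read.pdf(elem = list(uri = "M:/RSpace/pdf/5684-5555-1-PB.pdf"), language = "en", id = "id1")
pdf2 <- read.pdf(elem = list(uri = "M:/RSpace/pdf/5999-6062-1-PB.pdf"), language = "en", id = "id2")
pdf3 <- read.pdf(elem = list(uri = "M:/RSpace/pdf/0893569032000131659.pdf"), language = "en", id = "id3")
pdf4 <- read.pdf(elem = list(uri = "M:/RSpace/pdf/Making_visible_the_invisible.pdf"), language = "en", id = "id4")
pdf5 <- read.pdf(elem = list(uri = "M:/RSpace/pdf/ohiou1338559899.pdf"), language = "en", id = "id5")

# Collapse each document into a single string. content() extracts the text;
# joining with a space keeps words at line ends from running together.
doc.list <- list(paste(content(pdf1), collapse = " "), paste(content(pdf2), collapse = " "),
                 paste(content(pdf3), collapse = " "), paste(content(pdf4), collapse = " "),
                 paste(content(pdf5), collapse = " "))
names(doc.list) <- paste0("doc", seq_along(doc.list))

# Build the corpus from the five strings
my.corpus <- Corpus(VectorSource(unlist(doc.list)))

# Normalise the text. Base R functions such as tolower must be wrapped in
# content_transformer(). Stopwords are removed before punctuation so that
# contractions such as "don't" still match the stopword list.
my.corpus <- tm_map(my.corpus, content_transformer(tolower))
my.corpus <- tm_map(my.corpus, removeWords, stopwords("en"))
my.corpus <- tm_map(my.corpus, removePunctuation)
my.corpus <- tm_map(my.corpus, removeNumbers)
my.corpus <- tm_map(my.corpus, stripWhitespace)

# Draw the word cloud of the 100 most frequent terms
library(wordcloud)
wordcloud(my.corpus, scale = c(5, 0.5), max.words = 100, random.order = FALSE,
          rot.per = 0.35, use.r.layout = FALSE, colors = brewer.pal(8, "Dark2"))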
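
A word cloud is essentially a picture of term frequencies, so it can help to look at the counts directly. The following is only a rough base R sketch of that idea, not what tm or wordcloud do internally: the two toy documents and the short stopword list are made up for illustration, and the cleaning steps merely imitate lower-casing, punctuation and number removal, and stopword filtering.

```r
# Toy documents standing in for the extracted PDF text (hypothetical examples).
docs <- c(doc1 = "Lombardi's drawings map networks of power.",
          doc2 = "The network, drawn in 1999, makes power visible.")

# A tiny illustrative stopword list; tm's stopwords("en") is much longer.
stop.words <- c("the", "of", "in", "a", "and")

clean <- tolower(docs)                     # lower-case everything
clean <- gsub("[[:punct:]]+", "", clean)   # drop punctuation
clean <- gsub("[[:digit:]]+", "", clean)   # drop numbers

# Split on whitespace, then discard empty tokens and stopwords.
tokens <- unlist(strsplit(clean, "[[:space:]]+"))
tokens <- tokens[nzchar(tokens) & !(tokens %in% stop.words)]

# Count how often each term occurs, most frequent first.
term.freq <- sort(table(tokens), decreasing = TRUE)
print(head(term.freq, 5))
```

In this toy example "power" comes out on top because it is the only term that appears in both documents. Note that "network" and "networks" are counted separately, since no stemming is applied here.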
