A WordCloud is one of the best tools for visualizing the words and terms that appear most often in tweets. In this article we will see how to create a WordCloud of Twitter data using R. A WordCloud displays the words in a text sized according to how frequently they occur. To build one, we first need Twitter data, which must then be cleaned and prepared. To see how that is done, read the following posts:
- How to Setup Twitter App for Analysis
- How to Fetch Tweets in R Programming
- How to Clean Twitter Data using R
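To make the idea of frequency concrete, here is a minimal base-R sketch of the counting step that a WordCloud relies on. The sample text is made up for illustration and is not real Twitter data.

```r
# Made-up sample text standing in for cleaned tweets (not real data).
tweets <- c("cricket match tonight", "great cricket match", "cricket fans")

# Lower-case the text, split each tweet on whitespace, and flatten to one vector of words.
words <- unlist(strsplit(tolower(tweets), "\\s+"))

# Count how often each word occurs and sort from most to least frequent.
word_freq <- sort(table(words), decreasing = TRUE)
print(word_freq)
```

The most frequent words in this table are the ones a WordCloud would draw largest.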
Packages Required to Build WordCloud
The following packages need to be installed and loaded to build a WordCloud in R:
- tm
- wordcloud
- RColorBrewer
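One possible way to install and load these packages is sketched below. The helper names missing_packages and ensure_packages are hypothetical, and the CRAN mirror URL is only one reasonable choice.

```r
# Hypothetical helper: return the packages from pkgs that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install whatever is missing, then attach every package.
ensure_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = "https://cloud.r-project.org")
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# The later code also uses twitteR and tm's stemDocument (which relies on
# SnowballC); add them to this vector if they are not installed yet.
ensure_packages(c("tm", "wordcloud", "RColorBrewer"))
```

Installing only the missing packages keeps repeated runs of the script fast.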
Fetch Some Tweets to start building WordCloud in R
Before building the WordCloud, set up a connection with Twitter. For this I have written a function named "twitter_auth". See the code below:
twitter_auth <- function() {
  library(twitteR)
  # Replace the placeholders with the credentials of your Twitter app
  consumer_key <- 'Your Key'
  consumer_secret <- 'Your secret'
  access_token <- 'Your Token Key'
  access_secret <- 'Your Token Secret'
  setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
}
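If you would rather not keep the keys in the script itself, one alternative is to read them from environment variables. The sketch below is an assumption and not part of the original function: the names read_twitter_credentials and twitter_auth_env and the four TWITTER_* variable names are hypothetical.

```r
# Hypothetical helper: read the four credentials from environment variables.
# The variable names are assumptions; set them in ~/.Renviron or in the shell.
read_twitter_credentials <- function() {
  vars <- c(
    consumer_key    = "TWITTER_CONSUMER_KEY",
    consumer_secret = "TWITTER_CONSUMER_SECRET",
    access_token    = "TWITTER_ACCESS_TOKEN",
    access_secret   = "TWITTER_ACCESS_SECRET"
  )
  values <- Sys.getenv(vars, unset = NA)
  # Treat unset and empty variables the same way.
  missing <- vars[is.na(values) | values == ""]
  if (length(missing) > 0) {
    stop("Missing environment variables: ", paste(missing, collapse = ", "))
  }
  setNames(as.list(unname(values)), names(vars))
}

# Hypothetical variant of twitter_auth() that uses the helper above.
twitter_auth_env <- function() {
  creds <- read_twitter_credentials()
  twitteR::setup_twitter_oauth(creds$consumer_key, creds$consumer_secret,
                               creds$access_token, creds$access_secret)
}
```

Calling twitter_auth_env() in place of twitter_auth() should behave the same way, provided the four variables are set.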
Before building the WordCloud, the data must be cleaned, as explained in the previous post.
Once cleaning is done, the "wordcloud()" function is used to plot the WordCloud of the Twitter data. For demonstration I have written a function named "tweet_word_cloud()" that takes two arguments: tweet_text, the search term used to fetch Twitter data, and i, the number of tweets to fetch. See the code below:
tweet_word_cloud <- function(tweet_text, i) {
  library(twitteR)
  library(tm)
  library(wordcloud)
  library(RColorBrewer)

  # Fetch up to i English-language tweets matching tweet_text
  tweets_data <- searchTwitter(tweet_text, n = i, lang = "en")
  # Extract text from tweets
  tweets_text <- sapply(tweets_data, function(x) x$getText())
  # Create corpus
  tweets_corpus <- Corpus(VectorSource(tweets_text))
  # Convert corpus to plain text document
  tweets_corpus <- tm_map(tweets_corpus, PlainTextDocument)
  # Remove punctuation
  tweets_corpus <- tm_map(tweets_corpus, removePunctuation)
  # Remove stop words, plus terms specific to this search
  tweets_corpus <- tm_map(tweets_corpus, removeWords, c('IPL', '2016', stopwords('english')))
  # Perform stemming of words (stemDocument relies on the SnowballC package being installed)
  tweets_corpus <- tm_map(tweets_corpus, stemDocument)
  # Plot WordCloud
  wordcloud(tweets_corpus, max.words = 100, random.order = FALSE, colors = brewer.pal(8, "Dark2"))
}
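To try the cleaning and plotting steps without a Twitter connection, the same idea can be sketched on a few made-up sentences. This variant is an illustration rather than the function above: it uses VCorpus and content_transformer from tm, computes the word frequencies explicitly with TermDocumentMatrix, and passes them to wordcloud(). The sample text is invented.

```r
library(tm)
library(wordcloud)
library(RColorBrewer)

# Made-up sample text standing in for fetched tweets (not real data).
sample_text <- c(
  "IPL final tonight, great cricket!",
  "Great cricket and a great crowd at the IPL",
  "Cricket fans love the final"
)

# Build a corpus and apply a few of the cleaning steps.
corpus <- VCorpus(VectorSource(sample_text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Count how often each term occurs across all documents.
tdm <- TermDocumentMatrix(corpus)
word_freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

# Fix the random seed so the layout is repeatable, then plot.
# min.freq = 1 keeps rare words, since the sample is tiny.
set.seed(42)
wordcloud(names(word_freq), word_freq, min.freq = 1,
          random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

Inspecting word_freq before plotting is a quick way to check that the cleaning steps removed what you expected.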
To run the above code use the following commands:
> source('~/tweet_word_cloud.R')
> twitter_auth()
> tweet_word_cloud("IPL", 1000)
Output: (WordCloud image of the most frequent words in the fetched tweets)
Please comment below with any queries, and read the next post about Comparison WordCloud of Twitter Data in R Programming.