Sentiment Analysis of Twitter Data using R Programming – To perform sentiment analysis in R we need the “sentiment” R package developed by Timothy P. Jurka. The package comes with two important functions: “classify_emotion” and “classify_polarity”. In this post I will briefly show how to use both functions.
Read Also:
- How to Setup Twitter App for Analysis
- How to Fetch Tweets in R Programming
- How to Clean Twitter Data using R
First, extract the tweets for the desired keyword. For this demonstration I am fetching 2000 tweets for each of two keywords, so there are 4000 tweets in total to analyze with the sentiment package in R.
For those steps, read Fetching Twitter Data using R and Cleaning of Twitter Data.
In this tutorial I will use both the “classify_emotion” and “classify_polarity” functions to perform sentiment analysis on the collected Twitter data. The code of the sentiment analysis function is below:
sentiment_twitter <- function(searchterm, i) {
  library(twitteR)
  library(wordcloud)
  library(tm)
  library(RColorBrewer)
  library(sentiment)
  library(plyr)
  library(ggplot2)

  # Twitter API credentials (replace with your own)
  consumer_key <- 'Your-Key'
  consumer_secret <- 'Your Secret'
  access_token <- 'Your Access Token'
  access_secret <- 'Your Access Secret'
  setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

  # Fetch i recent English tweets for the search term and clean their text
  tweet_to_score <- searchTwitter(searchterm, n=i, lang="en", resultType="recent")
  tweet_text <- sapply(tweet_to_score, function(x) x$getText())
  tweet_clean <- clean_text(tweet_text)

  # Classify emotion; column 7 holds the best-fit label
  emotion_class <- classify_emotion(tweet_clean, algorithm = "bayes", prior = 1)
  emotion <- emotion_class[,7]
  emotion[is.na(emotion)] = "unknown"

  # Classify polarity; column 4 holds the best-fit label
  polarity_class <- classify_polarity(tweet_clean, algorithm = "bayes")
  polarity = polarity_class[,4]

  # Combine the results and order emotions from most to least frequent
  sent_df = data.frame(text=tweet_clean, emotion=emotion, polarity=polarity, stringsAsFactors=FALSE)
  sent_df = within(sent_df, emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
  print(head(sent_df, n=5))  # print() is needed to show output from inside a function

  # Bar chart of the number of tweets per emotion category
  ggplot(sent_df, aes(x=emotion)) +
    geom_bar(aes(y=..count.., fill=emotion)) +
    scale_fill_brewer(palette="Dark2") +
    labs(x="emotion categories", y="number of tweets") +
    ggtitle("Sentiment Analysis of Twitter") +
    theme(plot.title = element_text(size=12, face="bold"))
}
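The reordering of the emotion factor is what makes the bar chart run from the most frequent category to the least frequent one. Here is a minimal sketch of that step in base R. The emotion vector is a made-up stand-in for the best-fit column returned by classify_emotion, not real tweet data.

```r
# Hypothetical labels standing in for the best-fit emotion column;
# NA means the classifier found no best-fit emotion for that tweet.
emotion <- c("joy", NA, "anger", "joy", NA, "sadness", "joy", NA, NA, "anger")

# Same NA handling as in sentiment_twitter()
emotion[is.na(emotion)] <- "unknown"

# Count each label, then sort from most to least frequent
counts <- sort(table(emotion), decreasing = TRUE)

# Use the sorted names as factor levels so plots follow that order
emotion_factor <- factor(emotion, levels = names(counts))

print(counts)
print(levels(emotion_factor))
```

With this toy input the levels come out as unknown, joy, anger, sadness, which is the order the bars would be drawn in.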
Code of the clean_text function:
clean_text = function(x) {
  x = gsub("\\b[Rr][Tt]\\b", "", x)  # remove the standalone retweet marker rt / RT
  x = gsub("@\\w+", "", x)           # remove @mentions
  x = gsub("[[:punct:]]", "", x)     # remove punctuation
  x = gsub("[[:digit:]]", "", x)     # remove numbers
  x = gsub("http\\w+", "", x)        # remove links (their punctuation is already stripped)
  x = gsub("[ \t]{2,}", " ", x)      # collapse runs of spaces and tabs into one space
  x = gsub("^ +", "", x)             # remove blank spaces at the beginning
  x = gsub(" +$", "", x)             # remove blank spaces at the end

  # Lowercase a string, returning NA if tolower() fails (e.g. on invalid encoding)
  try.error = function(z) {
    y = NA
    try_error = tryCatch(tolower(z), error=function(e) e)
    if (!inherits(try_error, "error"))
      y = tolower(z)
    return(y)
  }
  x = sapply(x, try.error)
  return(x)
}
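One thing to watch when removing the retweet marker: a plain substring pattern such as "rt" also matches inside ordinary words. The short sketch below compares that with a word-boundary pattern. The sample tweet is invented for illustration.

```r
# Invented sample tweet for illustration
sample_tweet <- "RT @fan: What a party at the start of #IPL2016!"

# Plain substring match: also strips "rt" from inside ordinary words
naive <- gsub("rt", "", sample_tweet)

# Word-boundary match: removes rt / RT only when it stands alone
bounded <- gsub("\\b[Rr][Tt]\\b", "", sample_tweet, perl = TRUE)

print(naive)    # "party" and "start" are damaged, "RT" is still there
print(bounded)  # only the leading "RT" is removed
```

The leading space left behind by the word-boundary version is taken care of by the later whitespace-trimming steps of the cleaning function.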
Copy and paste the above code into an R script file and call the function as sentiment_twitter("Search Term", number of tweets).
For example, I am calling the function for the search term IPL2016:
sentiment_twitter("IPL2016", 2000)
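The output of the analysis is a bar chart titled "Sentiment Analysis of Twitter" that shows the number of tweets in each emotion category.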
Please comment below if you have any questions.