How to Do Twitter Sentiment Analysis in R

November 11, 2025

Curious what people really think about your brand on Twitter and other social media platforms? Instead of guessing from a handful of mentions, you can use the R programming language to analyze sentiment systematically. This guide walks you through every step, from fetching tweets with the Twitter API to cleaning the text and turning your findings into easy-to-read charts.

What is Sentiment Analysis? (And Why Should You Care?)

Sentiment analysis, or opinion mining, is the process of using natural language processing (NLP) to determine if a piece of text is positive, negative, or neutral. For social media marketers, this isn't just a cool tech trick - it's a goldmine of actionable data. It helps you:

Monitor Brand Health: Get a real-time pulse check on how your audience feels about your brand, competitors, or industry. A sudden dip in sentiment can be a red flag for a PR issue before it blows up.
Measure Campaign ROI: Did your latest campaign spark joy or frustration? Sentiment analysis gives you feedback beyond simple likes and retweets, showing you the emotional impact of your content.
Gather Product Feedback: Customers are constantly sharing their opinions on social media. Analyzing their tweets can reveal what they love about your product and, more importantly, what they wish you would fix.
Stay Ahead of Competitors: Turn the same lens on your competitors. Understanding their audience's pain points and praises can reveal opportunities for your own brand.

In short, it’s a way to listen at scale and turn unstructured chatter into a clear picture of public perception.

Setting Up Your R Environment

Before we start pulling data, we need to get our digital toolkit ready. R’s power comes from its massive library of packages created by the community. For this task, we'll need a few key players. Open up RStudio (or your R console of choice) and run the following command to install them. Note that install.packages() reinstalls any package you pass it, even one you already have, so trim the list if you only need some of them.

install.packages(c("rtweet", "tidyverse", "tidytext", "lubridate", "wordcloud", "textdata", "reshape2"))

Let's quickly break down what each of these does:

rtweet: This is our primary tool for interacting with the Twitter/X API. It handles the authentication and allows us to pull tweet data directly into R.
tidyverse: Not just one package, but an entire collection of data science tools. We'll mainly use dplyr for data manipulation (like filtering and rearranging), stringr for text cleaning, and ggplot2 for creating beautiful data visualizations.
tidytext: This package brings the principles of "tidy data" to text analysis. It makes cleaning and manipulating text data much simpler by treating it like any other well-structured table.
lubridate: Working with dates and times can be messy. This package makes it much easier to handle the timestamps that come with tweet data.
wordcloud: A fun and intuitive way to visualize text data by making the most frequent words appear larger.
textdata: The tidytext package needs this to access some of the sentiment lexicons (dictionaries) we'll be using later on.
reshape2: Provides the acast() function, which the word cloud step uses to reshape word counts into a matrix.
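If you want to be explicit about which packages actually need installing, one option is to compare the list against what is already on your machine. The sketch below uses only base R. The helper names missing_packages() and install_missing() are made up for this example and are not part of any package.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only the packages that are missing
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Packages used in this guide (reshape2 is needed by the word cloud step)
needed <- c("rtweet", "tidyverse", "tidytext", "lubridate",
            "wordcloud", "textdata", "reshape2")

# See what is missing without installing anything
missing_packages(needed)

# install_missing(needed)  # uncomment to install whatever is missing
```

Running missing_packages(needed) first lets you see what would be installed before anything is downloaded.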

Once those are installed, you're ready for the first major step: getting the data.

Step 1: Get Access to Twitter Data

To pull tweets, you need permission from X in the form of API credentials. This used to be easy, but recent changes have made it a bit more involved. The key is to apply for a Developer Account on the X Developer Platform. You'll need to sign up for a plan - for learning purposes, the free tier is the cheapest way to start, but it's much more limited than it used to be, so check the current plan details to confirm that it covers search access and how many tweets you can pull.

Once you are approved and have created a new "App" in your developer dashboard, you'll be given a set of unique keys and tokens. These are like a username and password for your R script.

You’ll need the following:

API Key
API Key Secret

Keep these handy and safe! Now, let’s use rtweet to connect. When you first run rtweet, it will prompt you to authenticate. You can follow the browser authentication flow, which is the most straightforward method. Once you've logged in and authorized your app, rtweet will cache your credentials so you don't have to do it every time.
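One way to keep the keys "handy and safe" is to store them in environment variables (for example in your .Renviron file) rather than pasting them into a script. The sketch below is only an illustration: read_x_credentials() and the variable names X_API_KEY and X_API_SECRET are assumptions made for this example, so use whatever names you set yourself.

```r
# Hypothetical helper: read X API credentials from environment variables.
# The default variable names are assumptions, not something rtweet requires.
read_x_credentials <- function(key_var = "X_API_KEY", secret_var = "X_API_SECRET") {
  creds <- list(
    api_key = Sys.getenv(key_var),
    api_secret = Sys.getenv(secret_var)
  )
  # Sys.getenv() returns "" for unset variables, so look for empty strings
  missing <- names(creds)[!nzchar(unlist(creds))]
  if (length(missing) > 0) {
    stop("Missing credentials: ", paste(missing, collapse = ", "))
  }
  creds
}

# Example usage, with placeholder values set for this session only
Sys.setenv(X_API_KEY = "your-api-key", X_API_SECRET = "your-api-key-secret")
creds <- read_x_credentials()
names(creds)
```

From there you can hand the values to whichever rtweet authentication function your installed version expects. The function names and required tokens differ between rtweet releases, so check the package's authentication documentation.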

Let’s try it by pulling the 1,000 most recent non-retweet statuses in English that mention the hashtag #socialmediamarketing.

library(rtweet)
library(tidyverse)

# The first time you use rtweet, it may automatically open a browser for you to authorize.
# If that flow doesn't work, you'll need to manually set up your tokens.

# Search for 1,000 recent English-language tweets mentioning #socialmediamarketing
tweets <- search_tweets("#socialmediamarketing", n = 1000, include_rts = FALSE, lang = "en")

# Let's take a look at what we got.
# Column names such as screen_name and status_id depend on your rtweet version;
# if select() reports a missing column, run names(tweets) to see what you have.
tweets %>%
  select(screen_name, text) %>%
  head()

If all went well, you'll see a table with Twitter handles and their corresponding tweet text. You're officially pulling live social media data!
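Since API access is limited, it can be worth saving the result to disk once and reloading it in later sessions instead of searching again. Here is a minimal sketch using base R's saveRDS() and readRDS(). The tweets_demo data frame is made-up stand-in data and the file location is an arbitrary choice for the example.

```r
# Made-up stand-in for the `tweets` data frame returned by search_tweets()
tweets_demo <- data.frame(
  status_id = c("1", "2"),
  text = c("Loving the new scheduler", "The update is slow"),
  stringsAsFactors = FALSE
)

# Save to disk once (a temporary folder here; pick a project folder in practice)
cache_file <- file.path(tempdir(), "tweets_cache.rds")
saveRDS(tweets_demo, cache_file)

# Later, reload the saved copy instead of calling the API again
tweets_cached <- readRDS(cache_file)
head(tweets_cached)
```

With real data you would pass tweets to saveRDS() in place of tweets_demo.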

Step 2: Clean and Prepare Your Text for Analysis

Raw tweet text is a mess. It's full of URLs, mentions (@), hashtags (#), numbers, retweets (RT), and punctuation. To analyze the sentiment accurately, we need to clean this up and reshape our data into a "tidy" format. Tidy text means having a table with one token (in our case, one word) per row.

The tidytext package makes this incredibly smooth. Here's our game plan:

Create a cleaner version of the raw texts.
Break the texts down into individual words (a process called tokenization).
Remove "stop words" - common words like "the," "is," "a," "in" - that don't carry much sentiment.

library(tidytext)
library(stringr)

# First, clean the text column by removing links, mentions, punctuation, and numbers
tweets_cleaned <- tweets %>%
  select(status_id, text, created_at) %>%
  mutate(clean_text = str_remove_all(text, "https?://\\S+")) %>%     # Remove URLs
  mutate(clean_text = str_remove_all(clean_text, "@\\S+")) %>%       # Remove mentions
  mutate(clean_text = str_remove_all(clean_text, "[:punct:]")) %>%   # Remove punctuation
  mutate(clean_text = str_remove_all(clean_text, "[:digit:]")) %>%   # Remove numbers
  mutate(clean_text = str_remove_all(clean_text, "\\bRT\\b"))        # Remove standalone "RT" (not the letters inside other words)

# Now, tokenize the text to have one word per row
tweet_words <- tweets_cleaned %>%
  select(status_id, clean_text, created_at) %>%
  unnest_tokens(word, clean_text)

# Get the built-in list of stop words
data("stop_words")

# Remove stop words from our dataset
tidy_tweets <- tweet_words %>%
  anti_join(stop_words, by = "word")

# Let's have a look at the most common words after cleaning
tidy_tweets %>%
  count(word, sort = TRUE) %>%
  head(10)

Now we have a clean, organized table where each meaningful word has its own row. The data is ready for the real magic: the analysis itself.
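To see what each cleaning step does without touching the API, you can run the same kind of pipeline on a couple of made-up tweets. This is a sketch for illustration only: the toy data is invented, and the "\\bRT\\b" pattern uses word boundaries so that the letters "RT" inside a word like "SMART" are left alone.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Two made-up tweets, used only for illustration
toy <- tibble(
  status_id = c("1", "2"),
  text = c("RT @brand: We LOVE the new SMART scheduler! https://example.com",
           "The update is slow and buggy :( #fail 2024")
)

toy_words <- toy %>%
  mutate(clean_text = text %>%
    str_remove_all("https?://\\S+") %>%   # URLs
    str_remove_all("@\\S+") %>%           # mentions
    str_remove_all("\\bRT\\b") %>%        # standalone "RT" marker
    str_remove_all("[[:punct:]]") %>%     # punctuation, including "#"
    str_remove_all("[[:digit:]]")) %>%    # numbers
  unnest_tokens(word, clean_text) %>%     # one lowercase word per row
  anti_join(stop_words, by = "word")      # drop common stop words

toy_words %>% select(status_id, word)
```

Printing toy_words is a quick way to confirm that links, handles, and numbers are gone before you run the pipeline on real data.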

Step 3: Perform the Sentiment Analysis

This is where we connect words to feelings. To do this, we use a "sentiment lexicon," which is just a pre-made dictionary of words classified by their emotional content or polarity (positive/negative).

The tidytext package gives us easy access to several lexicons. We'll use two popular ones:

"bing": Developed by Bing Liu, this lexicon classifies words as either "positive" or "negative." It’s straightforward and great for getting a general positive-vs-negative score.
"nrc": From Saif Mohammad and Peter Turney, this lexicon is more detailed. It classifies words into positive/negative categories and emotions like "joy," "sadness," "anger," and "trust."

Let’s start with the "bing" lexicon. Here’s how we can join it with our cleaned tweet words and count up the number of positive and negative words.

# Get the 'bing' sentiment lexicon
bing_sentiments <- get_sentiments("bing")

# Inner join our tidy data with the lexicon
tweet_sentiments <- tidy_tweets %>%
  inner_join(bing_sentiments, by = "word")

# Now, we can count the number of positive and negative words
sentiment_counts <- tweet_sentiments %>%
  count(sentiment, sort = TRUE)

print(sentiment_counts)

This will output a simple table showing the total count of words falling into the "positive" and "negative" categories. Right there, you have your first high-level insight into the general perception of your topic!
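Overall totals hide which tweets are driving the result. A common next step, sketched below, is a net score per tweet: positive words minus negative words. The data and the four-word lexicon here are made up so the example runs on its own. With real data you would use tidy_tweets and the bing lexicon in their place.

```r
library(dplyr)
library(tidyr)

# Made-up stand-ins for tidy_tweets and the bing lexicon
toy_tweets <- tibble(
  status_id = c("1", "1", "1", "2", "2", "3"),
  word = c("love", "great", "slow", "broken", "slow", "calendar")
)
toy_lexicon <- tibble(
  word = c("love", "great", "slow", "broken"),
  sentiment = c("positive", "positive", "negative", "negative")
)

net_sentiment <- toy_tweets %>%
  inner_join(toy_lexicon, by = "word") %>%   # keep only words found in the lexicon
  count(status_id, sentiment) %>%            # words per tweet and sentiment
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative)          # above zero means mostly positive

net_sentiment
```

Tweets with no lexicon words (tweet "3" here) drop out at the join. Note also that if your data contains no positive words at all, or no negative ones, the matching column will not exist and the mutate() step will need adjusting.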

Step 4: Visualize Your Findings Like a Pro

Numbers in a table are good, but a chart is often better for sharing insights with your team or clients. Let’s use ggplot2 to create a simple bar chart of our positive vs. negative sentiment counts.

library(ggplot2)

sentiment_counts %>%
  mutate(sentiment = str_to_title(sentiment)) %>%   # Capitalize for prettier plot
  ggplot(aes(x = sentiment, y = n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  labs(title = "Sentiment of Tweets mentioning #socialmediamarketing",
       x = "Sentiment",
       y = "Number of Words")

This code generates a professional-looking bar chart, making it immediately obvious which sentiment is more dominant.
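If you want to spot a sudden dip rather than a single overall total, you can group the sentiment words by day using the tweet timestamps. The sketch below uses lubridate's floor_date() on made-up data. With real data, the created_at column carried through the cleaning step would play the same role.

```r
library(dplyr)
library(lubridate)

# Made-up sentiment-tagged words with timestamps (UTC)
toy_sentiments <- tibble(
  created_at = ymd_hms(c("2025-11-01 09:15:00", "2025-11-01 18:40:00",
                         "2025-11-02 08:05:00", "2025-11-02 12:30:00",
                         "2025-11-02 21:10:00")),
  sentiment = c("positive", "negative", "positive", "positive", "negative")
)

daily_sentiment <- toy_sentiments %>%
  mutate(day = floor_date(created_at, unit = "day")) %>%   # round down to midnight
  count(day, sentiment)                                    # words per day and sentiment

daily_sentiment
```

The resulting table has one row per day and sentiment, which you can pass to ggplot2 (for example with geom_line() or geom_col()) to chart sentiment over time.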

Going Deeper with Word Clouds

Another powerful way to visualize tweet analysis is with a word cloud. It helps you see instantly which positive and negative words appear most often. Let's create a comparison cloud that splits the words by sentiment.

library(wordcloud)
library(reshape2)   # provides acast(); run install.packages("reshape2") first if it is missing

# Create a word cloud of the most common positive and negative words
tidy_tweets %>%
  inner_join(bing_sentiments, by = "word") %>%
  count(word, sentiment, sort = TRUE) %>%
  acast(word ~ sentiment, value.var = "n", fill = 0) %>%
  comparison.cloud(colors = c("red", "darkgreen"), max.words = 100)

This visual will display the top negative words in one color and the top positive words in another. Now, you’re not just saying "the sentiment was mostly positive" - you can point to specific words like "opportunity," "growth," or "valuable" that are driving that positivity.
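It helps to know what comparison.cloud() receives: a matrix with one row per word and one column per sentiment, with colors assigned to the columns in order. The sketch below builds that shape from made-up counts using tidyr instead of reshape2. It is an alternative way to get the same layout, not the method used above.

```r
library(dplyr)
library(tidyr)

# Made-up sentiment-tagged words
toy_counts <- tibble(
  word = c("growth", "valuable", "slow", "growth"),
  sentiment = c("positive", "positive", "negative", "positive")
) %>%
  count(word, sentiment)

# One row per word, one column per sentiment, zeros where a word never appears
wide <- toy_counts %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0)

# Fix the column order explicitly so colors map predictably:
# first color goes to "negative", second to "positive"
cloud_matrix <- as.matrix(wide[, c("negative", "positive")])
rownames(cloud_matrix) <- wide$word

cloud_matrix
```

A matrix in this shape can be passed to comparison.cloud() with colors = c("red", "darkgreen"), so negative words are drawn in red and positive words in green.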

And there you have it! You've gone from a formless stream of tweets to a tangible, data-backed analysis of public opinion. You can now adapt this script to track brand names, monitor campaign hashtags, or even spy on your competition.

Final Thoughts

In this guide, you walked through how to connect to the Twitter API, pull live data, reshape and clean it, run a sentiment analysis, and visualize the results using R. With these steps, you can start turning raw social media conversations into meaningful insights that can inform your entire marketing strategy.

While diving deep into data with R is incredibly powerful for specific campaigns and reports, we know that managing your day-to-day content rhythm requires tools built for speed and clarity. That’s why we designed Postbase to make your life simpler. It helps you plan with visual calendars, schedule to all your platforms at once, and see what's actually working with clean analytics - all without writing a single line of code. We believe you should have easy access to the data you need to make smarter decisions, so you can spend less time tackling complex code and more time creating amazing content.

Spencer Lanoue

Spencer's spent a decade building products at companies like Buffer, UserTesting, and Bump Health. He's spent years in the weeds of social media management—scheduling posts, analyzing performance, coordinating teams. At Postbase, he's building tools to automate the busywork so you can focus on creating great content.
