Winning in Trump Country

Examining the Twitter Messaging Strategies of Two Democratic Newcomers Who Overcame the Red Tide in the 2022 Midterms 

Introduction

This project applies Natural Language Processing (NLP) techniques to analyze the Twitter messaging strategies of Marie Gluesenkamp Pérez (WA-03) and Chris Deluzio (PA-17), Democratic newcomers who competed in two of the most challenging districts for Democrats in the 2022 midterm cycle.

Because the 2022 midterms were marked by the defeats of many election deniers and January 6th apologists, a secondary focus of this study is to assess how our candidates’ messaging strategies differed against distinct types of opponents. One faced Joe Kent in WA, a ‘Kooky’ nominee who fully embraced the 2020 election conspiracies; the other faced Jeremy Shaffer in PA, a mainstream Republican who acknowledged, though reluctantly, Joe Biden’s 2020 victory.

Methodology

I used classification models to analyze a dataset of 5000 Twitter and Facebook posts by members of the 114th Congress. The dataset was pre-labeled with categories including bias, message nature, and political affiliation. My goal was to train the models to classify tweets based on these labels, which I then applied to my two candidates’ tweets leading up to the 2022 midterm elections.

I also applied unsupervised topic modeling techniques, beginning with Latent Dirichlet Allocation (LDA) as a baseline method and then using Non-Negative Matrix Factorization (NMF) on Twitter GloVe vectors for refined clustering. After generating the topic groupings, I searched each candidate’s tweet corpus for words closely associated with these topics and compared how often each candidate messaged on them, as a way to assess differing strategies. I used cosine similarity within the Twitter GloVe vector space to determine which words were most semantically similar to each topic.

Data Used

  1. PVI score data was sourced from the Cook Political Report.
  2. 2022 Midterm Results were sourced from The Daily Kos.
  3. The campaign tweets from Marie Gluesenkamp Pérez and Chris Deluzio were hand-copied from their Twitter accounts, @MGPforCongress and @ChrisforPA.
  4. The 114th Congress tweets, annotated with characterization labels, were sourced from Crowdflower’s Data For Everyone Library via Kaggle.
  5. GloVe models and vector arrays were sourced from Jeffrey Pennington, Richard Socher, and Christopher D. Manning of Stanford.

Selecting the Candidates

SYNOPSIS: I chose the candidates by comparing their 2022 electoral margins with their districts’ Partisan Voting Index (PVI) scores. I ultimately landed on Marie Gluesenkamp Pérez (WA-03) and Chris Deluzio (PA-17). The walk-through below documents the selection process step by step.

Expand for Detailed Walk-Through Below
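The comparison described in the synopsis can be sketched roughly as follows. This is a minimal illustration, not the original workflow: it assumes PVI and the 2022 margin are both expressed as signed points (positive for a Democratic lean, negative for a Republican one), the helper `overperformance` is hypothetical, and the district rows are made-up placeholders rather than Cook or Daily Kos figures.

```python
# Hypothetical illustration: rank candidates by how far their 2022 margin
# ran ahead of their district's partisan lean.
# Assumed convention: positive = Democratic, negative = Republican.

def overperformance(margin: float, pvi: float) -> float:
    """Return the candidate's margin minus the district lean (signed points)."""
    return margin - pvi

# Placeholder rows (NOT real data): (district, signed PVI, signed 2022 margin)
districts = [
    ("District A", -5.0, 1.0),
    ("District B", 2.0, 3.0),
    ("District C", -1.0, -4.0),
]

# Sort from largest overperformance to smallest.
ranked = sorted(
    ((name, overperformance(margin, pvi)) for name, pvi, margin in districts),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:+.1f}")
```

Candidates at the top of such a ranking won by more than their district's lean would suggest, which is the kind of result that makes a campaign worth a closer look.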

Analyzing Tweets with Trained Models (114th Congress)

This section has been removed to streamline the portfolio (and because the dataset had its limitations). However, it does explore foundational NLP concepts in depth. If you’d like to dive into the original analysis and workflow details, feel free to check out the GitHub repository.

Topic Modeling – Unsupervised Learning

Baseline Model

For my baseline model, I applied Latent Dirichlet Allocation (LDA) to Term Frequency-Inverse Document Frequency (TF-IDF) features.

LDA with TF-IDF

Advanced Modeling

For my advanced models, I leveraged GloVe embeddings and Non-Negative Matrix Factorization (NMF) to enhance topic grouping and capture deeper semantic relationships.

Twitter-Trained GloVe Embeddings with NMF
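One way to combine GloVe embeddings with NMF is sketched below. Several parts are assumptions rather than the method used here: each tweet is represented by the average of its word vectors, the matrix is shifted by its minimum so it contains no negative entries (scikit-learn's NMF rejects negative input, and GloVe vectors contain negative values), and `toy_glove` is a hypothetical four-dimensional stand-in for the pretrained Twitter vectors.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical 4-d stand-ins for GloVe vectors; real ones would be loaded
# from the pretrained Twitter GloVe files.
toy_glove = {
    "union":     np.array([0.9, 0.1, -0.2, 0.0]),
    "workers":   np.array([0.8, 0.2, -0.1, 0.1]),
    "wages":     np.array([0.7, 0.0, -0.3, 0.1]),
    "volunteer": np.array([-0.1, 0.9, 0.1, 0.0]),
    "canvass":   np.array([-0.2, 0.8, 0.2, 0.1]),
    "doors":     np.array([0.0, 0.7, 0.1, -0.1]),
}

# Placeholder "tweets" built only from words in the toy vocabulary.
tweets = [
    "union workers wages",
    "volunteer canvass doors",
    "union wages",
    "canvass doors",
]

def tweet_vector(text):
    """Average the vectors of the words we have embeddings for."""
    vecs = [toy_glove[w] for w in text.split() if w in toy_glove]
    return np.mean(vecs, axis=0)

X = np.vstack([tweet_vector(t) for t in tweets])
X_nonneg = X - X.min()  # assumed workaround: NMF requires non-negative input

nmf = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=1000)
W = nmf.fit_transform(X_nonneg)  # tweet-by-topic weights
H = nmf.components_              # topic-by-dimension weights

for text, row in zip(tweets, W):
    print(text, np.round(row, 2))
```

The rows of `W` are the per-tweet topic weights; sorting a column of `W` in descending order gives the tweets most strongly associated with that topic.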

Marie Gluesenkamp Pérez Topics

These are the distributions of unlabeled tweet topics that the model found to share semantic similarity. I found that seven topics gave the best grouping.

MGP Topic Distribution

Once the tweets were grouped, I went through the top 50 tweets associated with each topic and found them to be best described by the following themes:

1. MGP Topic 1 — “Voice for Working Class”

Working Class Tweet

 2. MGP Topic 2 — “Digital & Community Engagement”

 3. MGP Topic 3 — “Endorsements & Policy Priorities”

 4. MGP Topic 4 — “Voter Mobilization Efforts”

 5. MGP Topic 5 — “Anti-Extremism”

 6. MGP Topic 6 — “Volunteer & Fundraising”

 7. MGP Topic 7 — “Defending Rights & Freedoms”

The important thing to note here is that a tweet isn’t assigned to a single category. Instead, each tweet receives a score for how strongly it is associated with each topic found by NMF. This makes sense, because one statement can touch on several things. A tweet like “My extreme opponent wants to ban abortion, but I will work to protect choice. That’s why I’m endorsed by Planned Parenthood” would have high scores in Topics 3, 5, and 7, but would be less associated with the other topics.
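To make the soft assignment concrete, here is a small sketch. The scores are made-up numbers standing in for one row of NMF weights, and the 0.15 cutoff is an arbitrary choice for illustration, not a threshold used in the analysis.

```python
# Topic labels taken from the MGP list above.
topic_names = [
    "Voice for Working Class",
    "Digital & Community Engagement",
    "Endorsements & Policy Priorities",
    "Voter Mobilization Efforts",
    "Anti-Extremism",
    "Volunteer & Fundraising",
    "Defending Rights & Freedoms",
]

# Hypothetical NMF weights for the example tweet (made-up values).
scores = [0.02, 0.01, 0.41, 0.03, 0.38, 0.00, 0.45]

# Normalize so the weights can be read as shares of the tweet.
total = sum(scores)
shares = [s / total for s in scores]

# Keep every topic whose share clears an arbitrary cutoff.
cutoff = 0.15
strong = [name for name, share in zip(topic_names, shares) if share >= cutoff]

print(strong)
```

With these placeholder values the tweet is associated with Topics 3, 5, and 7 at once, rather than being forced into a single bucket.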

The interactive graph below shows the top 50 tweets associated with each category; hover over a data point to see the full tweet.

Chris Deluzio Topics

These are the distributions of unlabeled tweet topics for Chris Deluzio that the model found to share semantic similarity.

Deluzio Topic Distribution

Once the tweets were grouped, I went through the top 50 tweets associated with each topic and found them to be best described by the following themes:

 1. Deluzio Topic 1 — “Union Solidarity & Local Empowerment”

Union Tweet

 2. Deluzio Topic 2 — “Reproductive Rights & Fighting Extremism”

 3. Deluzio Topic 3 — “Community Events”

 4. Deluzio Topic 4 — “Jobs & Infrastructure”

 5. Deluzio Topic 5 — “Advocacy & Community Solidarity”

 6. Deluzio Topic 6 — “Corporate Greed & Economic Fairness”

 7. Deluzio Topic 7 — “Defending Rights & Democracy”

The interactive graph below shows the top 50 tweets associated with each category; hover over a data point to see the full tweet.

Topic Comparisons Between Candidates

To quantify and compare how often each candidate tweeted on specific topics, I analyzed their tweet corpora using keywords and semantically similar terms identified in topic modeling. To reduce bias in keyword selection, I used the Twitter-trained GloVe model and cosine similarity. For each keyword, I printed the 50 nearest words by cosine similarity and then divided them into two groups, relevant and irrelevant, based on their semantic context.
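The nearest-word lookup can be sketched in plain Python. The three-dimensional `toy_vectors` are hypothetical stand-ins for the Twitter GloVe vectors, and `nearest_words` is an illustrative helper, not a function from the project.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical toy embeddings (real GloVe vectors have 25 to 200 dimensions).
toy_vectors = {
    "extreme":   [0.9, 0.1, 0.3],
    "radical":   [0.8, 0.2, 0.2],
    "dangerous": [0.7, 0.1, 0.4],
    "fitness":   [0.3, 0.9, 0.1],
    "volunteer": [0.0, 0.2, 0.9],
}

def nearest_words(query, vectors, top_n=50):
    """Return up to top_n (word, similarity) pairs, most similar first."""
    q = vectors[query]
    scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

for word, sim in nearest_words("extreme", toy_vectors):
    print(f"{word}: {sim:.3f}")
```

With a real embedding model the same ranking would be run with `top_n=50`, and the resulting list is what gets sorted by hand into relevant and irrelevant terms.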

Take the term ‘extreme’ as an example. The GloVe model identified similar terms like ‘radical’, ‘dangerous’, and ‘far-right’, alongside unrelated terms such as ‘fitness’, ‘depression’, and ‘jihadist’. I sorted these into the relevant and irrelevant lists, calculated the average vector of each list, and used the GloVe model to isolate terms associated with my context and exclude terms outside the zone of interest.
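One plausible reading of this filtering step is sketched below: a candidate term is kept when it sits closer (by cosine similarity) to the average of the relevant list than to the average of the irrelevant list. That decision rule is an assumption, as are the toy vectors and the candidate words ‘extremist’ and ‘workout’.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Hypothetical toy embeddings standing in for Twitter GloVe vectors.
toy_vectors = {
    "radical":    [0.8, 0.2, 0.2],
    "dangerous":  [0.7, 0.1, 0.4],
    "far-right":  [0.9, 0.0, 0.3],
    "fitness":    [0.3, 0.9, 0.1],
    "depression": [0.2, 0.8, 0.3],
    "extremist":  [0.85, 0.1, 0.25],  # candidate term
    "workout":    [0.25, 0.95, 0.05],  # candidate term
}

relevant = ["radical", "dangerous", "far-right"]
irrelevant = ["fitness", "depression"]
candidates = ["extremist", "workout"]

rel_mean = mean_vector([toy_vectors[w] for w in relevant])
irr_mean = mean_vector([toy_vectors[w] for w in irrelevant])

# Assumed rule: keep a term if it is closer to the relevant centroid.
kept = [
    w for w in candidates
    if cosine(toy_vectors[w], rel_mean) > cosine(toy_vectors[w], irr_mean)
]

print(kept)
```

Averaging the hand-sorted lists gives two reference points in the embedding space, so new terms can be screened without reviewing each one manually.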

The topic-words I chose to explore were:

  1. ‘extreme’
  2. ‘volunteer’
  3. ‘unions’
  4. ‘endorsement’
  5. ‘protect’
  6. ‘folks’
  7. ‘abortion’
  8. ‘manufacturing’
  9. ‘china’
  10. ‘corporations’

I searched each candidate’s tweets for these terms, along with a list of semantically similar terms, to gauge how frequently each candidate messaged on the associated topic. The interactive graph linked below shows the results of these queries, the exact terms used in each list, and example tweets from each candidate for each category.
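The frequency query can be sketched as a whole-word, case-insensitive search. The helper `share_of_tweets`, the sample strings, and the short term list are all placeholders; the actual term lists are the ones shown in the interactive graph.

```python
import re

def share_of_tweets(tweets, terms):
    """Fraction of tweets containing at least one term as a whole word."""
    if not tweets:
        return 0.0
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    hits = sum(1 for tweet in tweets if pattern.search(tweet))
    return hits / len(tweets)

# Placeholder strings (not real tweets from either campaign).
sample_tweets = [
    "My opponent is too extreme for this district.",
    "Join us to knock on doors this Saturday!",
    "Radical ideas will not fix our roads.",
    "Proud to earn this endorsement.",
]

# Placeholder term list for the 'extreme' topic.
extreme_terms = ["extreme", "extremist", "radical"]

share = share_of_tweets(sample_tweets, extreme_terms)
print(f"{share:.0%} of sample tweets mention the topic")
```

Counting each tweet at most once, however many terms it contains, keeps the result interpretable as a share of all tweets, which is how the percentages in the next section are phrased.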

Insights and Conclusions

1. Abortion

In the wake of the Dobbs v. Jackson Women’s Health Organization decision, abortion rights became a major topic in the 2022 midterms. The results above show that nearly 10% of all tweets from both Marie Gluesenkamp Pérez and Chris Deluzio touched on abortion rights and reproductive health generally.

Messaging Differences:

  • Marie Gluesenkamp Pérez: Her tweets on this topic usually offered a more personal perspective to connect with voters in Washington’s 3rd. “Like many moms, I’ve suffered through the heartbreak of miscarriage – imagine the horror of compounding that with being thrown in JAIL. Mothers deserve autonomy, not a police state.”
  • Chris Deluzio: Deluzio’s messaging on the topic focused more on the broader themes of rights and freedoms. “I think you should have the right to make your own decisions about your pregnancy and health care, and I’ll vote in Congress to protect abortion rights.”

Despite the low risk of losing abortion access in Washington and Pennsylvania, nearly 1 in 10 tweets from both campaigns touched on the topic. It would be interesting to observe a candidate from a state that increased restrictions, but overall, whether through personal stories or broader rights discussions, abortion was a central topic for these campaigns and for the 2022 midterms.

2. Extremist Opponent

Both campaigns hammered the narrative of “extremism”. Marie Gluesenkamp Pérez (12% of all tweets) did so slightly more than Chris Deluzio (9.5%), but that is probably due to the wealth of source material provided by the genuinely insane and extreme positions of her opponent, Joe Kent:

“Joe Kent says the attack on #January6th ‘reeks of an intelligence operation’ done by the police. Even today he continues defending the violent mob that ransacked the Capitol.”

“I’m committed to taking the actions necessary to confront the climate crisis. My opponent thinks climate change is a hoax invented by the Chinese government to make money. #WA03 needs a Member of Congress, not an extreme conspiracy theorist.”

“Joe Kent’s QAnon rants are desperate, weird, and do nothing to improve the lives of people in our district. Anyone who’s tired of this is welcome to join my campaign. I’m running because Congress could use someone who actually knows how to fix things.”

Chris Deluzio, running against a moderate opponent with less controversial views, incorporated ‘extremism’ in a brilliant two-pronged strategy. First, he highlighted Jeremy Shaffer’s silence on things like January 6 to imply tacit consent. Shaffer couldn’t denounce the extreme views of the far right without alienating those base voters, so Deluzio’s campaign used this silence to associate Shaffer with broader extremism:

“Jeremy Shaffer refuses to denounce the radical right’s attack on our elections. His silence speaks volumes.”

“Jeremy, why do you refuse to denounce the insurrection? Why won’t you denounce the assault on our democracy???”

Further, Deluzio frequently connected Shaffer to other more extreme political figures, to paint him with the same brush:

“My opponent campaigns alongside extremists like Doug Mastriano and Kevin McCarthy, who tried to overthrow our democracy.”

“Jeremy Shaffer just opened a joint campaign office with Doug Mastriano. These extremists are a threat to our freedom.”

“Jeremy Shaffer is showing you exactly who he is: Campaigning with insurrectionists, courting endorsements from extremists, and begging for money from the radical right.”

This method of linking Shaffer with known extremists, despite his moderate stances, and highlighting his silence on extreme issues was a great strategy in the political context of 2022.

3. Unions vs. Corporations

Deluzio’s campaign emphasized unions and criticized corporate outsourcing, aligning with his district’s industrial heritage and union-heavy electorate. He connected his opponent to corporate interests and used the “China” narrative to highlight the need for domestic manufacturing.

“Corporate execs have been stiffing folks, crushing unions, & outsourcing jobs to China & all over the planet for way too long. [1/2]”

“Corporate executive Jeremy Shaffer really can’t stand to be asked about his business in China (or Saudi Arabia!). Corporations like the one that made him rich are raking in millions building up China’s infrastructure & selling out folks from #PA17. Need proof? Here:”

This strategy was creative because, in recent years, China has been used as a cudgel mostly by Republicans. Take a similar district like IN-05, in the Indianapolis suburbs, where Republican Rep. Victoria Spartz is branding her Republican primary opponent as “China Chuck.”

4. Ground Game

Marie Gluesenkamp Pérez’s campaign prioritized ground efforts, with 1 in every 4 of her tweets promoting volunteering, canvassing, and fundraising. She often mentioned Vancouver, the largest metro area in her district, underscoring her strategy to maximize urban turnout and minimize rural conservative opposition. Since so much of her district is rural, solidifying turnout in metro and suburban areas was critical:

“Join me this Saturday for our Longview and Vancouver canvass kick-offs! First we’ll rally, next I’ll say a few words, and then we’ll all go knock on doors to tell our neighbors about this important race. RSVP and learn about other upcoming events here: https://mobilize.us/marieforcongress/”

“Hello from Pacific County, where we’re having fun in the rain at the South Bend Labor Day parade! Our campaign is powered by volunteers like you. Join us at our next event ➡️ http://marieforcongress.com/volunteer/”

Summary: Both campaigns effectively leveraged the national issues of 2022, such as abortion rights and anti-extremism, in their strategies. Deluzio capitalized on his district-specific issues, particularly unions, to resonate with his electorate. In contrast, MGP used more of her messaging capital on structuring her ground game, using her voice and reach to mobilize volunteers and supporters to turn out the right votes in the right places.