English Summary
25 September 2023 Aghasi Tavadyan
Data Science Media

This is the third month that I have been sending you these economic analyses, and approximately 1,600 email addresses now receive them every week.

This week, we lost an important part of our identity and dignity.

I’ve thought long and hard about what kind of statistical analysis could be relevant at this time. I’ve concluded that I need your help to create value.

Right now, it is crucial to conduct a large-scale analysis of Armenia’s media landscape. This will include sentiment analysis, which will show whether major media outlets have covered certain critical issues positively or negatively over time. We will also be able to observe how these outlets influence our society over time.

We intend to study three types of media outlets:

  • Media outlets that support the current government (e.g., armtimes.com),
  • Media outlets that oppose the current government (e.g., 168.am),
  • Media outlets that are as impartial as possible.

This analysis will show how positively or negatively these media outlets have covered certain important terms over time. These terms could include “Artsakh”, “Armenia”, and others (e.g., “Russia”, “Azerbaijan”, “Nikol”, “France”, “USA”).
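To make the idea concrete, here is a minimal sketch of keyword-level sentiment tracking. It is not the method used in this project. It assumes a tiny hand-made English word list (`POSITIVE`, `NEGATIVE`) and made-up sample headlines, whereas a real study would need an Armenian sentiment resource or a trained model. The function names and outlet labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"growth", "support", "agreement"}
NEGATIVE = {"crisis", "loss", "decline"}


def headline_score(headline: str) -> int:
    """Count positive words minus negative words in one headline."""
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_by_outlet_month(articles, keyword):
    """Average headline score per (outlet, 'YYYY-MM') for headlines
    that mention the keyword. `articles` holds (outlet, date, headline)."""
    buckets = defaultdict(list)
    for outlet, date, headline in articles:
        if keyword.lower() in headline.lower():
            buckets[(outlet, date[:7])].append(headline_score(headline))
    return {key: mean(scores) for key, scores in buckets.items()}


# Made-up sample data; outlet names are placeholders.
articles = [
    ("outlet_a", "2023-09-20", "Armenia reports growth"),
    ("outlet_a", "2023-09-25", "Armenia faces crisis"),
    ("outlet_b", "2023-09-21", "Crisis and decline in Armenia"),
    ("outlet_b", "2023-09-22", "Weather update"),
]
result = sentiment_by_outlet_month(articles, "Armenia")
```

Grouping by outlet and month is one possible design choice. It keeps the output small enough to plot as one line per outlet, at the cost of hiding day-to-day swings.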

The foundation for this analysis was laid at the beginning of 2022, when materials published by Armenpress from 2010 to 2022 were downloaded and studied. The results can be seen on this website, which we developed. It lets you count how often specific words appear in news headlines and view other usage metrics. You can select one or more words to count, as well as the time period and the analysis method.
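The kind of count the website offers can be sketched roughly as follows. This is an illustrative example, not the site's actual code. The function name `keyword_counts_by_year` and the sample records are assumptions, and matching is done on exact lowercase tokens without any word normalization.

```python
import re
from collections import Counter


def keyword_counts_by_year(records, keywords):
    """Count how often each keyword appears in headlines, per year.
    `records` holds (date, headline) pairs with dates as 'YYYY-MM-DD'."""
    wanted = {k.lower() for k in keywords}
    counts = Counter()
    for date, headline in records:
        year = date[:4]
        # \w+ is Unicode-aware in Python 3, so Armenian words are tokenized too.
        for token in re.findall(r"\w+", headline.lower()):
            if token in wanted:
                counts[(year, token)] += 1
    return counts


# Made-up sample headlines.
records = [
    ("2010-03-01", "Armenia and France sign agreement"),
    ("2010-07-15", "France hosts summit"),
    ("2022-01-10", "Armenia economy update"),
]
counts = keyword_counts_by_year(records, ["Armenia", "France"])
```

Because matching is exact, inflected forms of a word are counted separately, which is why word normalization matters for Armenian text.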

The research was based on materials from the Armenpress news agency’s website. A program was written that downloaded every link on the site dating from 2010 onward, totaling over half a million links (19GB of text data). The news links were then cleaned and a database was created containing the date, location, link, headline text, and article text of each news item. Research was then carried out on the collected data, leading to the creation of this interactive website.
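The storage step described above could look something like the sketch below. It is an assumption-laden illustration, not the original program: the table name `news`, the helper names, and the use of SQLite are all hypothetical, and the downloading step is left out entirely. The columns mirror the fields listed above (date, location, link, headline, article text).

```python
import sqlite3


def clean(text: str) -> str:
    """Collapse runs of whitespace left over from HTML extraction."""
    return " ".join(text.split())


def create_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with one row per news item, keyed by link."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS news (
               link TEXT PRIMARY KEY,
               date TEXT,
               location TEXT,
               headline TEXT,
               body TEXT
           )"""
    )
    return conn


def add_item(conn: sqlite3.Connection, item: dict) -> bool:
    """Insert one item; return False if the link was already stored."""
    row = {key: clean(value) for key, value in item.items()}
    cur = conn.execute(
        "INSERT OR IGNORE INTO news (link, date, location, headline, body) "
        "VALUES (:link, :date, :location, :headline, :body)",
        row,
    )
    conn.commit()
    return cur.rowcount == 1


# Placeholder item; the URL and text are invented for the example.
conn = create_store()
item = {
    "link": "https://example.com/news/1",
    "date": "2010-01-05",
    "location": "Yerevan",
    "headline": "  Sample   headline ",
    "body": "Sample article text.",
}
first = add_item(conn, item)
second = add_item(conn, item)  # same link again, so it is ignored
```

Using the link as the primary key is one simple way to avoid storing the same article twice when a crawl is repeated.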

Continuing this analysis requires your financial and intellectual support. One challenge is normalizing Armenian text, that is, reducing each word to its root form, which requires a strong understanding of how the Armenian language works. Another challenge is that each media outlet’s website has its own structure and complexities to work around.
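To illustrate why normalization is hard, here is a deliberately naive suffix-stripping sketch. It is a toy stemmer, not a proper lemmatizer and not the approach used in this project. The suffix lists are a small illustrative selection of common Armenian noun endings, and the names `naive_stem` and `MIN_STEM` are hypothetical.

```python
# Toy suffix groups, applied in this order. Illustrative and incomplete.
ARTICLE = ("ը",)                  # definite article after a consonant
CASE = ("ում", "ից", "ով", "ի")   # a few common case endings
PLURAL = ("ներ", "եր")            # plural endings, longest first
MIN_STEM = 3                      # never cut a word below this length


def strip_one(word: str, suffixes) -> str:
    """Remove the first matching suffix, if enough of the word remains."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[: -len(suffix)]
    return word


def naive_stem(word: str) -> str:
    """Strip at most one article, one case ending, and one plural ending."""
    for group in (ARTICLE, CASE, PLURAL):
        word = strip_one(word, group)
    return word


stems = [naive_stem(w) for w in ("Արցախը", "Հայաստանում", "քաղաքներում")]
```

The limits show up quickly. The definite article "ն" is left out on purpose, because stripping it blindly would also damage words such as "Հայաստան" that simply end in that letter. Likewise, a word such as "ընկեր" would be wrongly cut by the plural rule. Handling cases like these is where real knowledge of the language is needed.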

Citation

Tavadyan, A. (2023, September 25). English Summary. Tvyal Newsletter. https://tvyal.com/newsletter/en/2023/2023-09-25/

Analysis code available on GitHub.
