
Public Sentiment Analysis and Topic Modeling Regarding COVID-19's Three Waves of Total Lockdown: A Case Study on Movement Control Order in Malaysia

  • Alamoodi, A.H. (Department of Computing, Faculty of Arts, Computing and Creative Industry, Universiti Pendidikan Sultan Idris (UPSI)) ;
  • Baker, Mohammed Rashad (Department of Computer Techniques Engineering, Imam Ja'afar Al-Sadiq University) ;
  • Albahri, O.S. (Department of Computing, Faculty of Arts, Computing and Creative Industry, Universiti Pendidikan Sultan Idris (UPSI)) ;
  • Zaidan, B.B. (Future Technology Research Center, College of Future, National Yunlin University of Science and Technology) ;
  • Zaidan, A.A. (British University in Dubai) ;
  • Wong, Wing-Kwong (Future Technology Research Center, College of Future, National Yunlin University of Science and Technology) ;
  • Garfan, Salem (Department of Computing, Faculty of Arts, Computing and Creative Industry, Universiti Pendidikan Sultan Idris (UPSI)) ;
  • Albahri, A.S. (Informatics Institute for Postgraduate Studies (IIPS), Iraqi Commission for Computers and Informatics (ICCI)) ;
  • Alonso, Miguel A. (Departamento de Ciencias da Computacion e Tecnoloxías da Informacion, Universidade da Coruna and CITIC) ;
  • Jasim, Ali Najm (Foundation of Alshuhda) ;
  • Baqer, M.J. (Foundation of Alshuhda)
  • Received : 2022.01.10
  • Accepted : 2022.05.28
  • Published : 2022.07.31

Abstract

The COVID-19 pandemic has affected many aspects of human life. The pandemic not only caused millions of fatalities and problems but also changed public sentiment and behavior. Owing to the magnitude of this pandemic, governments worldwide adopted full lockdown measures that attracted much discussion on social media platforms. To investigate the effects of these lockdown measures, this study performed sentiment analysis and latent Dirichlet allocation topic modeling on textual data from Twitter published during the three lockdown waves in Malaysia between 2020 and 2021. Three lockdown measures were identified, and the related data for the first two weeks of each lockdown were collected and analyzed to understand the public sentiment. The changes between these lockdowns were identified, and the latent topics were highlighted. Most of the public sentiment focused on the first lockdown, as reflected in the large number of latent topics generated during this period. The overall sentiment for each lockdown was mostly positive, followed by neutral and then negative. Topic modeling results identified staying at home, quarantine and lockdown as the main aspects of discussion for the first lockdown, whilst the importance of health measures and government efforts were the main aspects for the second and third lockdowns. Governments may utilize these findings to understand public sentiment, formulate precautionary measures that assure the safety of their citizens, and tend to their most pressing problems. These results also highlight the importance of positive messaging during difficult times, establishing digital interventions and formulating new policies to improve the reaction of the public to emergency situations.

Keywords

1. Introduction

SARS-CoV-2, the virus responsible for the disease commonly known as COVID-19, is a contagious pathogen that originated in China and spread worldwide [1] before being officially declared a pandemic by the World Health Organization in March 2020 [2]. As of November 2021, the virus had claimed more than 5,195,354 lives [3]. The COVID-19 pandemic not only raised public concern and claimed many lives, as any major infectious disease does, but also brought devastating economic [4], social [5], and health [6] consequences across the globe [7]. Given its highly infectious nature, COVID-19 drove people away from living, traveling, working, and engaging in social interactions and confined them to their homes out of fear and anxiety of losing their lives [8]. The viral strain initially detected in China has since undergone genetic mutations that produced new and more transmissible variants, including but not limited to Alpha, Beta, Gamma, Delta, and Omicron. With the continuous emergence of these variants, people are not expected to return to their pre-pandemic lifestyles anytime soon. To control the pandemic, countries worldwide are racing against time to vaccinate the majority of their populations [9]. Before the introduction of vaccines, governments adopted cost-effective measures, such as enforcing social distancing, implementing movement control orders (MCO, in the case of Malaysia) [10], limiting social gatherings, and allowing only essential businesses to operate [11]. To force people to stay inside their homes, some governments even started filing criminal charges against those who violated their COVID-19 response measures [12], especially during total lockdowns [13] in which people were not allowed to leave their premises [14] (except to attend to essential matters) and non-essential services were not permitted to operate [15]. No other event in recent history has pressured people to stay inside their homes and stop living their usual lives [16]. The implementation of MCO measures in Malaysia has had devastating consequences on people's mental health [17], with many scientific works reporting an increased prevalence of anxiety, depression, and psychological distress during the COVID-19 pandemic [18]. In 2020, Malaysia gradually lifted its MCO restrictions upon noticing downward trends in its number of infections [19]. However, the number of infections gradually increased again in the following year, and the Malaysian government initially responded with less rigorous strategies. As the number of cases skyrocketed, the government implemented a second lockdown known as MCO.2 [18]. While the Malaysian economy started to recover at the beginning of 2021 as vaccination programmes rolled out and people started leaving their homes again, other countries were experiencing new infections as new variants of the COVID-19 virus emerged [20]. These countries were forced to shut down again, a decision that received much criticism from their citizens. While some segments of the population favored lockdowns, others opposed them due to specific issues, such as unemployment, frustration, and depression [21], [22].

Consequently, many people retreated to online platforms, such as Twitter, Reddit, and Facebook, to voice their feelings and opinions about the continuous implementation of lockdown measures [23]. To date, millions of opinions about lockdowns have been shared and discussed online (Twitter users alone send approximately 500 million tweets daily) [24]. The pandemic also encouraged the spread of misinformation on these platforms, hence necessitating rapid analytical tools for interpreting and analysing the flow of information, the evolution of mass opinion in different pandemic scenarios, and the main topics associated with them. To measure online public sentiment, researchers have adopted computer science tools that deal with the language and text in social media posts, with natural language processing (NLP) being the most commonly used tool [25]. This study seeks to understand public sentiment and opinion regarding continuous COVID-19 lockdown measures (i.e., more than two lockdowns) by using data from Twitter. This work investigates the public sentiment exhibited in social media discussions and determines whether these sentiments reflect the general public sentiment towards lockdown measures. This study not only offers novel insights into the general public sentiment but also provides relevant guidelines and policies that facilitate the implementation of intervention measures and the decision making of public health officials in similar situations that may arise in the future. The goals of this research are as follows:

• to analyze public sentiment towards continuous lockdown measures by performing sentiment analysis of public tweets;

• to determine whether public sentiment changed during the first weeks of each lockdown; and

• to identify the most pressing issues associated with these lockdown measures.

The rest of this paper is organized as follows. Section 2 discusses the works related to sentiment analysis and COVID-19 lockdown measures. Section 3 describes the proposed approach and the case study for data collection and analysis. Section 4 presents the sentiment analysis experiments and their results. Section 5 presents the topic modeling results. Section 6 discusses the findings. Section 7 concludes the paper and proposes directions for future research.

2. Related Works

Sentiment analysis, also known as opinion mining, is a branch of NLP in which people's opinions, emotions, and feelings about a particular subject are extracted from their social media interactions [26]. These interactions are presented as written posts or texts and assigned a polarity value (i.e., positive, negative, or neutral) [27]. Given the importance of public sentiment in determining which interventions and policies need to be delivered, sentiment analysis has become an emerging topic in many research areas and has attracted scientific, social, and commercial applications [28]. Accordingly, sentiment analysis has received significant research interest in recent years and has been integrated with other technology areas, including machine learning (ML) [29], topic modeling [30], and emotion analysis [31]. This technique has also been used to analyze trending topics, such as the COVID-19 pandemic [1] and vaccine hesitancy [32].

Many studies have applied sentiment analysis to explore topics related to COVID-19. For instance, Manguri et al. [33] applied sentiment analysis to COVID-19-related tweets published worldwide during the onset of the pandemic in early 2020. They crawled more than 530,232 tweets using the official Twitter API and the Tweepy library of Python and then identified both the polarity and the subjectivity of the tweets using the TextBlob library. They detected high polarity in more than half of these tweets, with a high objectivity ratio, and concluded that the reactions of people to the pandemic varied daily, as reflected not only in their increased social media interactions but also in the sentiments expressed in their tweets. Garcia and Berton [34] applied sentiment analysis and topic identification to COVID-19-related tweets published in the US and Brazil by collecting more than 3,000,000 tweets in English and Portuguese using the official Twitter API. They also explored the main content or topics of these two groups of tweets and found that 7 out of the 10 topics generated were consistent across both countries. Most of these topics were associated with healthcare, case reports, and daily statistics. Sharma and Sharma [35] performed sentiment analysis to understand the online behaviors of people while they were locked inside their homes. They found that many of these people developed suicidal tendencies and boredom, necessitating an analysis of their psychology and sentiments to stop them from taking extreme decisions, such as ending their lives. They extracted 3,000 tweets and performed an unsupervised signature-based sentiment analysis to identify the associated emotions. Afterwards, they categorised the activities of people in accordance with the extracted emotions and then used their findings to detect depression and psychological states resulting from lockdowns and to help authorities reach out to those Twitter users who demonstrate extreme mental states. Kaur et al. [36] explored how leaders made use of the emotions portrayed in social media during the COVID-19 pandemic, especially with respect to the public dissemination of knowledge. They identified different emotions by applying NRC-based sentiment analysis to 12,128 lockdown-related tweets addressed to different Indian political leaders. They revealed that most users were generally confident in the guidelines issued or communicated by their governments. These findings can help leaders improve their online communication by considering the emotions presented in social media posts. Mittal et al. [37] crawled tweets to assess the coping behaviors and reactions of social media users worldwide during their first few days in lockdown. Their sentiment analysis showed that most Twitter users had positive sentiments toward the potential of lockdowns in curbing the spread of COVID-19 and preventing further deaths. They also found that some people managed to keep themselves engaged and entertained during these lockdowns, whereas others were merely fence-sitters whose opinions and emotions could swing either way depending on how the pandemic progressed and on the actions adopted by their governments. Perlstein and Verboord [38] used sentiment analysis to examine how people from Northern European countries addressed their authorities during the initial stages of their COVID-19 lockdowns. They also used topic modeling to identify the prevalent Twitter discourses in each country during different phases of the lockdowns. They crawled 100,000 tweets from each country between 27-1-2020 and 25-4-2020 and identified some interesting topics related to clear, comprehensive, and timely communication about the pandemic and the deliberations surrounding the coping strategies of various governments. These findings may be useful for policymakers and authorities in high-trust countries during epidemics. Gupta et al. [39] applied sentiment analysis to the Twitter posts of Indian users related to the enforcement of lockdown measures by the Indian government. They used ML classifiers on 12,741 tweets collected from 5-4-2020 to 17-4-2020 during the second phase of lockdowns in India and annotated these data using the TextBlob and VADER lexicons. They found that the majority of Indian citizens supported the lockdown implemented by their government. These findings highlight the importance of analysing tweets before and after a lockdown to understand the changes in public sentiment and the consequences of lockdown measures.

The aforementioned papers highlight the usefulness of sentiment analysis in making decisions during difficult times. Sentiment analysis not only assists governments in formulating and executing their strategies but also identifies the main issues under discussion and the potential countermeasures. However, most studies that use sentiment analysis focus on European countries, with very few focusing on Asian countries. To the best of our knowledge, no study in Malaysia has attempted to use sentiment analysis to understand public reactions towards the three waves of lockdowns implemented in the country. Such an analysis will undoubtedly reveal different sentiments and topics of interest. Therefore, this research explores the public sentiment about the three phases of lockdowns in Malaysia and compares the reactions of people who support and reject such measures. The findings of this work can help governments identify the countermeasures that can be taken in case of future pandemics.

3. Proposed Approach

This study was divided into four phases, namely, (1) data collection, (2) preprocessing, (3) sentiment analysis, and (4) topic modeling, as illustrated in Fig. 1.


Fig. 1. Proposed Approach

3.1 Data Collection

A primary concern in the data collection process is the availability of suitable datasets. In sentiment analysis, fetching data from social media requires exploring the available tools for extracting the most relevant data. To investigate public perception and sentiment regarding the different waves of COVID-19 lockdowns, this study collected related data from Twitter using the Twint tool, a Python library developed as an alternative to the official Twitter API that allows users to scrape tweets and other historical information [40]. Twint was selected instead of the official Twitter API because the latter requires a license from Twitter, can only extract a limited number of tweets, and often consumes much time. The retrieved dataset contained more than 3,000,000 tweets published across the three lockdown periods in Malaysia from March 2020 to November 2021. Several keywords were used to obtain relevant tweets, including 'Movement Control', 'MCO', 'MCO 2.0', and 'MCO 3.0'. After the data collection, various elements were extracted from the collected tweets apart from their texts, including tweet ID, tweet date, language, and hashtags. These were saved as comma-separated values (CSV) for the analysis. The collected data were preprocessed as described in the following section before the analysis.
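A minimal collection sketch in Python is shown below. It is illustrative only: the search keyword, date range, and output file name are placeholders rather than the exact settings used in this study.

```python
import twint

# Hypothetical Twint configuration for scraping MCO-related tweets.
# Keyword, dates, and output path are placeholders, not the study's exact settings.
config = twint.Config()
config.Search = "MCO"              # one of the study keywords
config.Since = "2020-03-18"        # first day of MCO.1
config.Until = "2020-04-01"        # end of the two-week window
config.Store_csv = True            # store tweet ID, date, language, hashtags, text, ...
config.Output = "mco1_tweets.csv"
config.Hide_output = True          # language filtering is done later, in preprocessing

twint.run.Search(config)
```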

3.2 Preprocessing

In this research, preprocessing is an essential step that must be completed before performing sentiment analysis and applying topic modeling techniques to increase the effectiveness of the whole process. Several aspects of the raw data need to be handled, either by removing elements or by converting them into a form suitable for further processing. Various Python tools and libraries were used, including (1) the Pandas data analysis library and (2) the Python NLP toolkit. For the Pandas preprocessing, the relevant variables included (1) the actual text of the tweets, (2) the language of the tweets, and (3) the tweet ID. Language filtering was then applied to the 3,000,000 collected tweets to retain only those written in English for the next stage. For the Python NLP toolkit preprocessing, the following tasks were applied to prepare the data for the sentiment analysis and topic modeling (a code sketch combining these steps is given after the list):

• Lower Case: All uppercase letters in the text were converted to lowercase to reduce the complexity of the analysis. Table 1 presents an example.

Table 1. Sample Lowercase Preprocessing


• Links Removal: All non-textual elements associated with a tweet, such as HTML tags, punctuation marks, special characters, URLs, and numbers, were removed because they are not essential for sentiment analysis and topic modeling. Table 2 presents an example.

Table 2. Sample Links Removal


• Emojis Removal: Emojis were removed because Python cannot process them as pictures; the tweets were cleaned by filtering emojis out using their identifiers. Table 3 presents an example.

Table 3. Sample Emojis Removal


• Lemmatization: Words were converted into their word roots by removing their suffixes and affixes. This process aims to create meaningful forms of these words known as lemmas. For example, ‘going’ was lemmatized as ‘go’. This process was applied across the entire dataset.

• Stop Words Removal: Each language has its own stop words (e.g., articles, prepositions, pronouns, and conjunctions), which are often used to connect sentences but carry little significance in lexicon-based sentiment analysis (i.e., stop words do not add much information about a particular topic). Therefore, these words were removed. Examples in English include 'is', 'a', 'an', and 'the'.
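A minimal sketch of the combined preprocessing pipeline is given below, assuming Pandas and NLTK are available. The column names 'tweet' and 'language' are assumptions about the layout of the scraped CSV, and the regular expressions are illustrative rather than the exact patterns used in this study.

```python
import re
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# One-time downloads of the required NLTK resources
nltk.download("stopwords")
nltk.download("wordnet")
nltk.download("omw-1.4")

STOP_WORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def clean_tweet(text: str) -> str:
    text = text.lower()                              # lower case
    text = re.sub(r"http\S+|www\.\S+", " ", text)    # URLs
    text = re.sub(r"<[^>]+>", " ", text)             # HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)            # punctuation, numbers, emojis
    tokens = [LEMMATIZER.lemmatize(tok, pos="v")     # lemmatize (e.g. 'going' -> 'go')
              for tok in text.split()
              if tok not in STOP_WORDS]              # drop stop words
    return " ".join(tokens)

# 'language' and 'tweet' are assumed column names in the Twint CSV output
df = pd.read_csv("mco1_tweets.csv")
df = df[df["language"] == "en"].copy()               # keep English tweets only
df["clean_text"] = df["tweet"].astype(str).apply(clean_tweet)
```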

3.3 Sentiment Analysis

After data preprocessing, sentiment analysis was applied to the cleaned data. According to Alamoodi et al. [1], sentiment analysis approaches can be categorized into (1) lexicon-based, (2) ML-based, and (3) hybrid approaches. The first approach applies semantic orientation dictionaries that map words to subjectivity scores, and these scores are added up to determine the overall sentiment of a given input. The sentiment output is usually assigned a polarity score ranging between -1 and +1, where -1 is the strongest negative sentiment, +1 is the strongest positive sentiment, and 0 indicates neutral sentiment. Any score between 0 and 1 (or 0 and -1) indicates the intensity of positive (or negative) polarity: a score closer to 1 in absolute value corresponds to a stronger intensity, and vice versa. The ML-based approach classifies and predicts sentiment using ML techniques, including deep learning algorithms. ML models are fed with training and testing datasets that need to be annotated with target labels through manual processes, ML techniques, or other means. Hybrid sentiment analysis integrates the above approaches for polarity score identification and machine classification. The lexicon-based approach was used in this study to establish a global perspective of public sentiment toward continuous lockdown measures. Specifically, sentiment analysis was conducted using TextBlob, a well-known lexicon-based technique frequently used in previous studies [41] for different NLP tasks, including sentiment analysis, part-of-speech tagging, classification, noun phrase extraction, and tokenization [42]. TextBlob uses a predefined dictionary to classify positive and negative words quickly and efficiently [43] by returning both a polarity and a subjectivity score. The subjectivity score indicates the degree of personal opinion presented in a text: a score closer to 0 indicates a more objective view, whereas a score closer to 1 indicates a more subjective view.
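As an illustration of the lexicon-based scoring described above, the sketch below (continuing from the hypothetical DataFrame produced in the preprocessing step) assigns TextBlob polarity and subjectivity scores to each cleaned tweet and maps the polarity to a sentiment label.

```python
from textblob import TextBlob

def label_polarity(score: float) -> str:
    # Map the polarity score to a sentiment class by its sign
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

df["polarity"] = df["clean_text"].apply(lambda t: TextBlob(t).sentiment.polarity)
df["subjectivity"] = df["clean_text"].apply(lambda t: TextBlob(t).sentiment.subjectivity)
df["sentiment"] = df["polarity"].apply(label_polarity)
```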

3.4 Topic Modelling

Topic modelling techniques extract hidden topics from large datasets. These techniques have been used extensively in sentiment analysis and ML research [44], given the increasing popularity of social media platforms containing text on unorganised topics that needs to be shaped into useful information for purposes such as decision making and understanding public opinion and sentiment. Latent Dirichlet allocation (LDA) was applied in this study for topic modeling. This technique allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar [45]. Fig. 2 illustrates the LDA process.


Fig. 2. LDA Topic Modelling

LDA assumes that documents with similar topics use similar terms and that topics follow a sparse Dirichlet distribution. LDA calculates the proportion of words in a document assigned to a specific topic and then determines the proportion of words assigned to that topic across all documents. The resulting topics are then qualitatively analyzed to confirm their content. In this study, sentiment analysis and topic modelling were applied to the tweets published during the first two weeks of each lockdown in Malaysia to check for any significant changes in the sentiment or topics discussed during the three lockdown waves.
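A minimal Gensim sketch of the LDA step is shown below. It assumes the tokenized, cleaned tweets from the preprocessing stage; the number of topics and the training hyperparameters are placeholders (Section 5 describes how the number of topics was chosen via coherence scores).

```python
from gensim import corpora, models

# Tokenized, preprocessed tweets (list of lists of words), assumed to come
# from the cleaning step in Section 3.2.
texts = [doc.split() for doc in df["clean_text"]]

dictionary = corpora.Dictionary(texts)                 # word <-> id mapping
corpus = [dictionary.doc2bow(doc) for doc in texts]    # bag-of-words corpus

lda = models.LdaModel(
    corpus=corpus,
    id2word=dictionary,
    num_topics=10,       # placeholder; selected via coherence scores in Section 5
    random_state=42,
    passes=5,
)

# Inspect the highest-weight words of each latent topic
for topic_id, words in lda.print_topics(num_words=8):
    print(topic_id, words)
```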

4. Experiments on Sentiment Analysis

Several experiments were conducted to analyze public sentiment across the three waves of lockdowns in Malaysia. The first two weeks of each lockdown were deemed sufficient for determining public sentiment towards the measures implemented by the government because, whenever a significant incident occurs, people demonstrate their strongest reactions during the first few days of its implementation. Each of the three lockdowns in Malaysia was analyzed separately and compared with the other two in the following subsections.
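The per-day and per-week statistics reported below (tweet counts, sentiment ratios, and the mean and variance of polarity and subjectivity) can be derived with a Pandas aggregation along the following lines. The sketch continues from the hypothetical DataFrame of the earlier steps; the 'date' column name is an assumption.

```python
import pandas as pd

# 'date' is assumed to hold the tweet timestamp from the Twint CSV
df["day"] = pd.to_datetime(df["date"]).dt.date

# Daily tweet counts and the mean/variance of polarity and subjectivity
daily = df.groupby("day").agg(
    tweets=("polarity", "size"),
    mean_polarity=("polarity", "mean"),
    var_polarity=("polarity", "var"),
    mean_subjectivity=("subjectivity", "mean"),
    var_subjectivity=("subjectivity", "var"),
)

# Share of positive / neutral / negative tweets per day
ratios = (
    df.groupby("day")["sentiment"]
      .value_counts(normalize=True)
      .unstack(fill_value=0)
)

print(daily.join(ratios))
```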

4.1 First Lockdown

MCO.1 occurred from 18-3-2020 to 3-5-2020. Results of the sentiment analysis for the first two weeks of MCO.1 are summarised in Table 4.

Table 4. First Lockdown Statistics.


Table 4 clearly shows that positive polarity dominates the first two weeks of MCO.1. Statistics for the first week show that, amongst the 298,150 published tweets, positive sentiment dominated with a 47.12% ratio, a 0.310 mean and a 0.046 variance, followed by neutral sentiment with a 34.25% ratio (n=102,138) and negative sentiment with an 18.61% ratio (n=55,507), a -0.228 mean and a 0.039 variance. The sentiment subjectivity for the first week of MCO.1 had a 0.373 mean and a 0.095 variance. In absolute numbers, most of the sentiment in the first week was reported on day 5 (22-3-2020), indicating that people had begun to realize the impact of the pandemic during this period. However, subjectivity was highest on day 7, with a 0.377 mean. The volume of tweets in the second week of MCO.1 (n=200,531) was lower than in the first week, but positive polarity again dominated with a 49.97% ratio (n=100,221), a 0.329 mean, and a 0.047 variance. Neutral sentiment had a 34.63% ratio (n=69,449), followed by negative sentiment (n=30,861) with a -0.221 mean and a 0.036 variance. The sentiment subjectivity for the second week had a 0.371 mean and a 0.095 variance. In absolute numbers, most of the sentiment in the second week was reported on day 7 (31-3-2020), but subjectivity was highest on day 3 (27-3-2020) with an average score of 0.381. Although most tweets were published during the first week of MCO.1, the ratio of positive sentiment in the second week increased from 47.12% to 49.97%, indicating that people adapted to and accepted the lockdown. This finding is also supported by the ratio of negative sentiment, which decreased from 18.61% to 15.38% during the second week. These findings indicate that people were generally in favour of MCO.1. Fig. 3 plots the polarity scores and sentiment ratios over time.


Fig. 3. MCO.1 polarity versus day. Polarity is presented on the y-axis, and the day is shown on the x-axis. The blue circles indicate the data points, and their size indicates the number of tweets published during that period: a larger circle corresponds to a larger number of tweets. The red line indicates the rolling mean polarity, and the yellow line indicates the expanding mean polarity.

Fig. 3 shows that public sentiment is mainly concentrated on day 5 of the first week and day 7 of the second week of MCO.1. Polarity was mostly positive, between (0.05, 0.20), with positive sentiment concentrated between (0.05, 0.15) on both days. Meanwhile, the rolling mean fluctuated between (0.06, 0.017) on days 1 and 2, remained between (0.10, 0.15) on day 3, decreased to 0.06 on days 4 and 5, and then fluctuated between 0.09 on day 6 and 0.12 on day 7. The expanding mean polarity fluctuated between (0.11, 0.12) across the entire week.
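The polarity-versus-day plots in Figs. 3-5 can be approximated with Matplotlib as sketched below. This is a simplified reconstruction under assumptions not stated in the paper: the rolling-window length, marker scaling, and aggregation level are placeholders.

```python
import matplotlib.pyplot as plt

df = df.sort_values("day")

# Daily mean polarity and tweet counts for the circle markers
per_day = df.groupby("day").agg(mean_polarity=("polarity", "mean"),
                                tweets=("polarity", "size"))

fig, ax = plt.subplots()
ax.scatter(per_day.index, per_day["mean_polarity"],
           s=per_day["tweets"] / 50, alpha=0.5, color="blue",
           label="daily mean polarity (marker size = tweet count)")
ax.plot(df["day"], df["polarity"].rolling(window=1000, min_periods=1).mean(),
        color="red", label="rolling mean polarity")
ax.plot(df["day"], df["polarity"].expanding().mean(),
        color="gold", label="expanding mean polarity")
ax.set_xlabel("Day")
ax.set_ylabel("Polarity")
ax.legend()
plt.show()
```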


Fig. 7. Topics Example

4.2 Second Lockdown

MCO.2 was implemented at the beginning of 2021, with the first two weeks covering 13-1-2021 to 26-1-2021. Table 5 presents the statistics for these two weeks.

Table 5. Second Lockdown Statistics.


For the first week of MCO.2, the highest number of tweets was recorded on day 1 (n=1,499), the majority of which had positive sentiment (n=727), followed by neutral (n=532) and negative (n=243). Positive sentiment had a 0.324 mean and a 0.040 variance, whereas negative sentiment had a -0.206 mean and a 0.032 variance. Unlike in MCO.1, day 5 of MCO.2 had the lowest number of tweets (n=831). The positive sentiment on day 5 also had the highest mean (0.343) and variance (0.343). Sentiment subjectivity reported its highest mean (0.372) and variance (0.088) on day 1, given the many tweets published on that date. Meanwhile, the second week of MCO.2 had more tweets and a higher volume of sentiment than the first week. Day 5 (24-1-2021) had the highest number of tweets (n=2,020), most of which had positive sentiment (n=971) with a mean of 0.336 and a variance of 0.054, followed by neutral (n=677) and negative (n=372) with a mean of -0.232 and a variance of 0.04. The lowest number of tweets was published on day 3 (22-1-2021), and the highest subjectivity was observed on day 5, with a mean of 0.388. Meanwhile, the highest sentiment in the first week was reported on day 1, which may be linked to the public's previous experience of MCO.1. Neutral and negative tweets maintained the same ranking as in the first MCO, but the margin in favour of positive sentiment was smaller. Compared with MCO.1, MCO.2 had more negative tweet interactions and lower positive sentiment scores, as shown in Fig. 4.


Fig. 4. MCO.2 polarity versus day. Polarity is presented on the y-axis, and the day is shown on the x-axis. The blue circles indicate the data points, and their size indicates the number of tweets published during that period: a larger circle corresponds to a larger number of tweets. The red line indicates the rolling mean polarity, and the yellow line indicates the expanding mean polarity.

Fig. 4 shows that most of the sentiment is positive, with scores mostly ranging between (0, 0.4), although some highly subjective tweets reached polarity scores above 0.8. A notable trend was also observed for negative sentiment, whose scores were mostly distributed between (0, -0.2) but reached as low as -0.6. The rolling mean indicated that polarity was maintained between (0.1, 0.3) across all days, whereas the expanding mean remained at approximately 0.1.

4.3 Third Lockdown

MCO.3 was implemented at the beginning of May 2021, with the first two weeks occurring between 12-5-2021 and 25-5-2021. Table 6 presents the sentiment analysis results for these two weeks.

Table 6. Third Lockdown Statistics.


Table 6 shows that the total volume of sentiment in MCO.3 is far lower than that in MCO.1 and MCO.2. Upon implementation of the lockdown, most of the sentiment (n=542) was observed on day 1 of the first week (12-5-2021). The majority of the tweets published on that day were positive (n=300), followed by neutral (n=186) and negative (n=56). During the first week, the highest mean positive sentiment was observed on day 2 (0.419), whereas the highest mean negative sentiment was observed on day 7 (-0.176). Sentiment subjectivity was observed mainly on days 1 and 2, with a mean of 0.378. The sentiment for the first week was mostly positive (n=1,570) with a mean of 0.354, followed by neutral (n=1,135) and negative (n=364) with a mean of -0.211. The subjectivity for the first week had a mean of 0.362. A higher volume of tweets was reported in the second week (n=3,440), with the majority expressing positive sentiment (n=1,547) with a mean of 0.316, followed by neutral (n=122) and negative (n=571) with a mean of -0.224. Fig. 5 presents the polarity scores recorded throughout MCO.3.


Fig. 5. MCO.3 polarity versus day. Polarity is presented on the y-axis, and the day is shown on the x-axis. The blue circles indicate the data points, and their size is determined by the number of tweets published during that period: a larger circle corresponds to a larger number of tweets. The red line indicates the rolling mean polarity, and the yellow line indicates the expanding mean polarity.

Fig. 5 shows that the majority of the sentiment is positive, with polarity scores ranging between (0, 0.50) and subjectivity scores of positive tweets reaching up to 1.00. Negative polarity was mainly between (0, -0.25), with some scores reaching -0.75. The rolling mean indicated that positive sentiment on most days ranged between (0.01, 0.25), whereas negative sentiment ranged between (-0.01, -0.12). The expanding mean for all days ranged between (0, 0.15).

4.4 Summary

Each wave of lockdowns exhibited different levels of sentiment. In MCO.1, 48.27% of the tweets had positive polarity scores, 17.31% had negative scores, and 34.40% had neutral scores. In MCO.2, 47.06% of the tweets had positive polarity scores, 15.22% had negative scores, and 37.71% had neutral scores. In MCO.3, 47.88% had positive polarity scores, 14.36% had negative scores, and 37.74% had neutral scores. The subjectivity of these tweets ranged between (0.05, 0.15) in MCO.1, between (0, 0.2) in MCO.2, and between (0, 0.25) in MCO.3. MCO.2 and MCO.3 were highly similar, as reflected in the closeness of their subjectivity scores.

5. Sentiment Topic Modeling

Determining the closeness of the main topics discussed across the different MCOs is critical; if no similarities are detected, the concerns surrounding each MCO warrant separate discussion. To this end, LDA was performed using the Gensim library, which requires the suitable number of topics for each MCO to be determined. The three lockdown measures discussed in the previous sections were split into three independent datasets, and the coherence scores for each candidate model were computed using Gensim, as shown in Fig. 6.
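A sketch of the coherence-based selection of the number of topics is given below, assuming the Gensim dictionary, corpus, and tokenized texts from the earlier LDA sketch; the candidate range and LDA hyperparameters are placeholders.

```python
from gensim.models import CoherenceModel, LdaModel

def best_lda(corpus, dictionary, texts, max_topics=50):
    """Train LDA models for 2..max_topics topics and record the c_v coherence scores."""
    scores = {}
    for k in range(2, max_topics + 1):
        lda = LdaModel(corpus=corpus, id2word=dictionary,
                       num_topics=k, random_state=42, passes=5)
        cm = CoherenceModel(model=lda, texts=texts,
                            dictionary=dictionary, coherence="c_v")
        scores[k] = cm.get_coherence()
    best_k = max(scores, key=scores.get)   # topic count with the highest coherence
    return best_k, scores

# Example usage for one MCO dataset (corpus, dictionary, texts as built earlier):
# best_k, scores = best_lda(corpus, dictionary, texts)
```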


Fig. 6. Coherence Scores

For MCO.1, LDA models were trained with up to 50 topics, and the model with 38 topics returned the highest coherence value of 0.5276, as shown in Fig. 6. For MCO.2 and MCO.3, fewer topics were retrieved given the variations in the data: MCO.2 and MCO.3 each yielded two topics, with coherence scores of 0.4669 and 0.4809, respectively. All coherence scores and topic representations are presented in the Appendix. Fig. 7 gives examples of the main topics discussed during MCO.1.

In all lockdown periods, positive sentiment was observed far more often than the other sentiments, followed by neutral and then negative. The topics discussed in all MCOs are listed in Table 7.

Table 7. Main Topics in All MCOs


The topic modeling results across all lockdown waves are discussed as follows:

• The discussions during MCO.1 were related to staying at home, quarantine and lockdown, as reflected in the keywords [Stayathome, quarantine, quarantinelife, stayhome, selfquarantine, selfisolation, stayathome, quaratinelife, lockdownnow, staysafe, stayathome, staysafe, stayhome, coronalockdown, staystrong, stayathome]; the implementation of lockdowns, travel restrictions, fines and other measures [Government, message, listen, important, control, prevent, state, travel, close, city, governor, police, fine, government, lead, global, response, action, flattenthecurve]; the reactions of people to measures for curbing COVID-19, maintaining hygiene and social distancing [social_distance, wash, washyourhand, health, care, washyourhand]; the activities performed by people before the lockdown (e.g. going to schools and offices) and how they transitioned to online and work-from-home settings [team, hard, workfromhome, office, continue, workingfromhome, employee, kid, child, school, learn, student, teacher, online, learn, join, virtual, website, plan]; how people coped with being stuck inside their homes and the activities they performed to pass the time and maintain their emotional and psychological health [read, book, story, great, list, write, share, art, fun, post, enjoy, idea, watch, play, video, game, movie, show, live, netflix, tv, cook, food, recipe, youtube, make, garden, enjoy, workout]; and praying for the country, the need for citizens to stick together during such difficult times and the important role of frontliners [hospital, staff, care, medical, worker, hero, world, save, country, pray, rest, human, nation, life, save_live, strong, inside, connected, stick, stopthespread].

• MCO.2 had fewer topics compared with MCO.1. The discussions during this period mainly revolved around the importance of staying at home [stayathome, family] and observing health measures [mask, care, staff, face].

• The discussions in MCO.3 primarily focused on the government and staying at home [Stayathome, stayhome, govt].

6. Discussion

The topic modelling and sentiment analysis results highlight the continuity of positive public sentiment across all lockdown waves. Most of the discussions during these lockdowns favoured the stay-at-home policies implemented by the government and stressed the importance of preventive measures in curbing the spread of COVID-19 and protecting people from the emotional burden of being stuck inside their homes. Although the polarity trend was mostly positive, followed by neutral and negative, the above results may be subject to bias, which is understandable given the nature of online social media platforms and of users who may be unable or afraid to express their ideas or feelings. Concerning the topic modeling results, given the differences in the number of tweets published during each wave, different topics were generated for each lockdown, with MCO.1 having the largest number of topics and MCO.2 and MCO.3 having only two topics each. The number of topics was determined based on coherence scores and evaluated accordingly. However, these scores rely purely on statistical relationships among word occurrences, and a high score does not guarantee that the topics match what is expected from the model. Therefore, obtaining an optimal model is challenging even with a high coherence score. In this case, different attempts were made with various numbers of topics and scenarios until a final set of topics that shared a common theme across time was obtained. Most of the topics discussed across all lockdown waves were related to the importance of lockdowns and staying at home. These findings were expected given that people are generally supportive of the government's decisions and are willing to abide by the rules to reduce the burden placed on hospitals and other medical institutions.

7. Conclusion

The experiments reported in this article aimed to understand the public sentiment and the topics under discussion across the different lockdown waves in Malaysia during the COVID-19 pandemic. Unlike previous studies that only focused on a particular lockdown, the experimental results in this work revealed that, even with the continuity of the lockdown measures implemented by the Malaysian government, the sentiments expressed on Twitter were primarily positive, followed by neutral and negative. The results also highlighted how different topics were presented in each lockdown and their main associated keywords. These findings can aid governments in understanding public sentiment and formulating preventive measures that protect the safety of their citizens and tend to their most pressing problems in times of crisis. They may also refer to these findings when designing digital interventions and formulating policies to improve people's reactions to future emergencies. These findings also support previous studies, which confirm that public sentiment towards lockdown measures is generally positive, followed by neutral and negative. Although this study offers valuable insights, some limitations must be noted. Firstly, different numbers of tweets were used to generate topics for each lockdown period, leading to considerable differences in the number of topics generated for each wave. Future works should utilize a more balanced dataset to ensure fairness in their sentiment analysis. Secondly, the sentiment analysis in this work was mainly lexicon-based. Future studies should use more intelligent tools and approaches, including large language models and machine learning algorithms. Thirdly, this study only focused on public sentiment in Malaysia during lockdowns. Future studies may examine public sentiment before and after lockdowns and compare the results across countries implementing the same number of lockdown measures.

References

  1. A. Alamoodi, B. Zaidan, A. Zaidan, O. Albahri, K. Mohammed, R. Malik, et al., "Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: A systematic review," Expert systems with applications, p. 114155, 2020.
  2. G. Bell, "Pandemic Passages: An Anthropological Account of Life and Liminality during COVID-19," Anthropology in Action, vol. 28, pp. 79-84, 2021. https://doi.org/10.3167/aia.2021.280115
  3. WHO, "Weekly Operational Update on COVID-19, 30 November 2021," December 2021.
  4. B. Debata, P. Patnaik, and A. Mishra, "COVID-19 pandemic! It's impact on people, economy, and environment," Journal of Public Affairs, vol. 20, p. e2372, 2020. https://doi.org/10.1002/pa.2372
  5. V. Saladino, D. Algeri, and V. Auriemma, "The psychological and social impact of Covid-19: new perspectives of well-being," Frontiers in psychology, vol. 11, p. 2550, 2020.
  6. K. S. Khan, M. A. Mamun, M. D. Griffiths, and I. Ullah, "The mental health impact of the COVID-19 pandemic across different cohorts," International journal of mental health and addiction, pp. 1-7, 2020.
  7. P. Kumari and D. Toshniwal, "Impact of lockdown on air quality over major cities across the globe during COVID-19 pandemic," Urban Climate, vol. 34, p. 100719, 2020. https://doi.org/10.1016/j.uclim.2020.100719
  8. X. Zhu and J. Liu, "Education in and after Covid-19: Immediate responses and long-term visions," Postdigital Science and Education, vol. 2, pp. 695-699, 2020. https://doi.org/10.1007/s42438-020-00126-3
  9. R. M. Burgos, M. E. Badowski, E. Drwiega, S. Ghassemi, N. Griffith, F. Herald, et al., "The race to a COVID-19 vaccine: Opportunities and challenges in development and distribution," Drugs in Context, vol. 10, 2021.
  10. M. Gupta, M. Abdelsalam, and S. Mittal, "Enabling and enforcing social distancing measures using smart city and its infrastructures: a COVID-19 Use case," arXiv preprint arXiv:2004.09246, 2020.
  11. A. Kayes, M. S. Islam, P. A. Watters, A. Ng, and H. Kayesh, "Automated measurement of attitudes towards social distancing using social media: a COVID-19 case study," 2020.
  12. T. H. Oum and K. Wang, "Socially optimal lockdown and travel restrictions for fighting communicable virus including COVID-19," Transport Policy, vol. 96, pp. 94-100, 2020. https://doi.org/10.1016/j.tranpol.2020.07.003
  13. L. Chen, H. Lyu, T. Yang, Y. Wang, and J. Luo, "In the eyes of the beholder: analyzing social media use of neutral and controversial terms for COVID-19," arXiv preprint arXiv:2004.10225, 2020.
  14. M. Kaushik and N. Guleria, "The impact of pandemic COVID-19 in workplace," European Journal of Business and Management, vol. 12, pp. 1-10, 2020.
  15. C. Mejia, R. Pittman, J. M. Beltramo, K. Horan, A. Grinley, and M. K. Shoss, "Stigma & dirty work: In-group and out-group perceptions of essential service workers during COVID-19," International Journal of hospitality management, vol. 93, p. 102772, 2021. https://doi.org/10.1016/j.ijhm.2020.102772
  16. A. D. Dubey, "Decoding the Twitter Sentiments towards the Leadership in the times of COVID-19: A Case of USA and India," Available at SSRN 3588623, 2020.
  17. A. Bauerle, V. Musche, K. Schmidt, A. Schweda, M. Fink, B. Weismuller, et al., "Mental Health Burden of German Cancer Patients before and after the Outbreak of COVID-19: Predictors of Mental Health Impairment," International journal of environmental research and public health, vol. 18, p. 2318, 2021. https://doi.org/10.3390/ijerph18052318
  18. S. Moradian, A. Bauerle, A. Schweda, V. Musche, H. Kohler, M. Fink, et al., "Differences and similarities between the impact of the first and the second COVID-19-lockdown on mental health and safety behaviour in Germany," Journal of public health (Oxford, England), vol. 43(4), pp. 710-713, 2021. https://doi.org/10.1093/pubmed/fdab037
  19. A. Dzien, C. Dzien-Bischinger, M. Lechleitner, H. Winner, and G. Weiss, "Will the COVID-19 pandemic slow down in the Northern hemisphere by the onset of summer? An epidemiological hypothesis," Infection, vol. 48, pp. 627-629, Aug 2020. https://doi.org/10.1007/s15010-020-01460-1
  20. V. Shinde, S. Bhikha, Z. Hoosain, M. Archary, Q. Bhorat, L. Fairlie, et al., "Efficacy of NVX-CoV2373 Covid-19 Vaccine against the B. 1.351 Variant," New England Journal of Medicine, vol. 384, pp. 1899-1909, 2021. https://doi.org/10.1056/NEJMoa2103055
  21. M. Haque, I. E. Haque, M. N.-e.-A. Ziku, N. Ahamed, and M. S. Hossain, "COVID-19 Pandemic and Its Effects on Youth Mental Health in Bangladesh," Malaysian Journal of Social Sciences and Humanities (MJSSH), vol. 6, pp. 365-377, 2021.
  22. F. C. Onuoha, G. E. Ezirim, and P. A. Onuh, "Extortionate policing and the futility of COVID-19 pandemic nationwide lockdown in Nigeria: Insights from the South East Zone," African Security Review, vol. 30, no. 4, pp. 451-472, 2021. https://doi.org/10.1080/10246029.2021.1969961
  23. V. Basile, F. Cauteruccio, and G. Terracina, "How dramatic events can affect emotionality in social posting: The impact of COVID-19 on Reddit," Future Internet, vol. 13, p. 29, 2021. https://doi.org/10.3390/fi13020029
  24. D. Antonakaki, P. Fragopoulou, and S. Ioannidis, "A survey of Twitter research: Data model, graph structure, sentiment analysis and attacks," Expert Systems with Applications, vol. 164, p. 114006, 2021. https://doi.org/10.1016/j.eswa.2020.114006
  25. I. Lauriola, A. Lavelli, and F. Aiolli, "An Introduction to Deep Learning in Natural Language Processing: Models, Techniques, and Tools," Neurocomputing, vol. 470, pp. 443-456, 2022. https://doi.org/10.1016/j.neucom.2021.05.103
  26. L. Yue, W. Chen, X. Li, W. Zuo, and M. Yin, "A survey of sentiment analysis in social media," Knowledge and Information Systems, vol. 60, pp. 617-663, 2019. https://doi.org/10.1007/s10115-018-1236-4
  27. A. Tripathy, A. Agrawal, and S. K. Rath, "Classification of sentiment reviews using n-gram machine learning approach," Expert Systems with Applications, vol. 57, pp. 117-126, 2016. https://doi.org/10.1016/j.eswa.2016.03.028
  28. A. Keramatfar and H. Amirkhani, "Bibliometrics of sentiment analysis literature," Journal of Information Science, vol. 45, pp. 3-15, 2019. https://doi.org/10.1177/0165551518761013
  29. Y. Wang, Q. Chen, J. Shen, B. Hou, M. Ahmed, and Z. Li, "Aspect-level sentiment analysis based on gradual machine learning," Knowledge-Based Systems, vol. 212, p. 106509, 2021. https://doi.org/10.1016/j.knosys.2020.106509
  30. B. Ozyurt and M. A. Akcayol, "A new topic modeling based approach for aspect extraction in aspect based sentiment analysis: SS-LDA," Expert Systems with Applications, vol. 168, p. 114231, 2021. https://doi.org/10.1016/j.eswa.2020.114231
  31. A. Chiorrini, C. Diamantini, A. Mircoli, and D. Potena, "Emotion and sentiment analysis of tweets using BERT," in Proc. of EDBT/ICDT Workshops, 2021.
  32. C. A. Melton, O. A. Olusanya, N. Ammar, and A. Shaban-Nejad, "Public sentiment analysis and topic modeling regarding COVID-19 vaccines on the Reddit social media platform: A call to action for strengthening vaccine confidence," Journal of Infection and Public Health, vol. 14, no. 10, pp. 1505-1512, 2021. https://doi.org/10.1016/j.jiph.2021.08.010
  33. K. H. Manguri, R. N. Ramadhan, and P. R. M. Amin, "Twitter sentiment analysis on worldwide COVID-19 outbreaks," Kurdistan Journal of Applied Research, vol. 5, no. 3, pp. 54-65, 2020.
  34. K. Garcia and L. Berton, "Topic detection and sentiment analysis in Twitter content related to COVID-19 from Brazil and the USA," Applied Soft Computing, vol. 101, p. 107057, 2021. https://doi.org/10.1016/j.asoc.2020.107057
  35. S. Sharma and S. Sharma, "Analyzing the depression and suicidal tendencies of people affected by COVID-19's lockdown using sentiment analysis on social networking websites," Journal of Statistics and Management Systems, vol. 24, pp. 115-133, 2021. https://doi.org/10.1080/09720510.2020.1833453
  36. M. Kaur, R. Verma, and F. N. K. Otoo, "Emotions in leader's crisis communication: Twitter sentiment analysis during COVID-19 outbreak," Journal of Human Behavior in the Social Environment, vol. 31, pp. 362-372, 2021. https://doi.org/10.1080/10911359.2020.1829239
  37. R. Mittal, W. Ahmed, A. Mittal, and I. Aggarwal, "Twitter users exhibited coping behaviours during the COVID-19 lockdown: an analysis of tweets using mixed methods," Information Discovery and Delivery, Vol. 49, No. 3, pp. 193-202, 2021. https://doi.org/10.1108/IDD-08-2020-0102
  38. S. G. Perlstein and M. Verboord, "Lockdowns, lethality, and laissez-faire politics. Public discourses on political authorities in high-trust countries during the COVID-19 pandemic," PLOS ONE, vol. 16, p. e0253175, 2021. https://doi.org/10.1371/journal.pone.0253175
  39. P. Gupta, S. Kumar, R. Suman, and V. Kumar, "Sentiment Analysis of Lockdown in India During COVID-19: A Case Study on Twitter," IEEE Transactions on Computational Social Systems, vol. 8, no. 4, pp. 992-1002, 2021. https://doi.org/10.1109/TCSS.2020.3042446
  40. P. Lohar, G. Xie, M. Bendechache, R. Brennan, E. Celeste, R. Trestian, et al., "Irish attitudes toward COVID tracker app & privacy: sentiment analysis on Twitter and survey data," in Proc. of The 16th International Conference on Availability, Reliability and Security, pp. 1-8, 2021.
  41. T. Zhang and C. Cheng, "Temporal and Spatial Evolution and Influencing Factors of Public Sentiment in Natural Disasters-A Case Study of Typhoon Haiyan," ISPRS International Journal of Geo-Information, vol. 10, p. 299, 2021. https://doi.org/10.3390/ijgi10050299
  42. M. A. Alsheri, L. M. Alrajhi, A. Alamri, and A. I. Cristea, "MOOCSent: a Sentiment Predictor for Massive Open Online Courses," in Proc. of 29TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS DEVELOPMENT (ISD2021 VALENCIA, SPAIN), 2021.
  43. P. Sinha, P. Mitra, A. A. B. da Costa, and N. Kekatos, "Explaining Outcomes of Multi-Party Dialogues using Causal Learning," arXiv preprint arXiv:2105.00944, 2021.
  44. R. Debnath and R. Bardhan, "India nudges to contain COVID-19 pandemic: a reactive public policy analysis using machine-learning based topic modelling," PloS one, vol. 15, p. e0238972, 2020. https://doi.org/10.1371/journal.pone.0238972
  45. F. Wilhelm, "Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint Latent Dirichlet Allocation Model After All," in Proc. of Fifteenth ACM Conference on Recommender Systems, pp. 55-62, 2021.