Search | Korea Science

An Empirical Comparison of Machine Learning Models for Classifying Emotions in Korean Twitter (한국어 트위터의 감정 분류를 위한 기계학습의 실증적 비교)

Lim, Joa-Sang;Kim, Jin-Man
- Journal of Korea Multimedia Society / v.17 no.2 / pp.232-239 / 2014
As online texts have grown rapidly, their automatic classification with machine learning methods has attracted increasing interest. Nevertheless, comparatively little research has targeted Korean texts, and statistical evaluation of such methods is also rare. This study took a sample of tweets and used machine learning methods to classify emotions using morpheme and n-gram features. As a result, about 76% of the emotions contained in the tweets were correctly classified. Of the two methods compared in this study, Support Vector Machines were found to be more accurate than Naïve Bayes. The linear SVM model was not inferior to the non-linear one. Morphological features did not contribute more to accuracy than n-grams did.
https://doi.org/10.9717/kmms.2014.17.2.232
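The n-gram features mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say whether the n-grams are character-level or word-level, so character n-grams are an assumption here, and the function names `char_ngrams` and `ngram_features` are hypothetical. The resulting counts are the kind of sparse feature vector that a Naïve Bayes or SVM classifier could consume.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of text, ignoring whitespace."""
    compact = "".join(text.split())
    return [compact[i:i + n] for i in range(len(compact) - n + 1)]


def ngram_features(text, n_values=(1, 2)):
    """Count n-grams for every n in n_values (assumed: unigrams and bigrams)."""
    features = Counter()
    for n in n_values:
        features.update(char_ngrams(text, n))
    return features


# Toy input, not taken from the paper's data.
features = ngram_features("좋아 좋아요")
```

Character n-grams need no morphological analyzer, which is one plausible reason such features are compared against morphemes for Korean text.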

Developing a Sentiment Analysing and Tagging System (감성 분석 및 감성 정보 부착 시스템 구현)

Lee, Hyun Gyu;Lee, Songwook
- KIPS Transactions on Software and Data Engineering / v.5 no.8 / pp.377-384 / 2016
Our goal is to build a system that collects tweets from Twitter, analyzes the sentiment of each tweet, and helps users build a sentiment-tagged corpus semi-automatically. After collecting tweets with the Twitter API, we analyze their sentiment with a sentiment dictionary. With the proposed system, users can verify the system's results and insert new sentiment words or dependency relations where sentiment information exists. Sentiment information is tagged in a JSON structure, which is useful for building or accessing the corpus. On a test set, the system achieves about 76% accuracy in classifying the sentiment of sentences as positive, neutral, or negative.
https://doi.org/10.3745/KTSDE.2016.5.8.377
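The dictionary lookup and JSON tagging described in the abstract above might look roughly like the following sketch. The toy dictionary, the scoring rule (sum of word polarities), and the JSON field names are all assumptions for illustration; the abstract does not specify the authors' lexicon, their dependency-relation handling, or their schema.

```python
import json

# Hypothetical toy dictionary: word -> polarity (not the authors' lexicon).
SENTIMENT_DICT = {"good": 1, "great": 1, "bad": -1, "terrible": -1}


def classify(sentence):
    """Label a sentence by summing the polarities of its known words."""
    score = sum(SENTIMENT_DICT.get(word, 0) for word in sentence.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tag(sentence):
    """Serialize the sentence and its label as JSON (field names are assumed)."""
    record = {"text": sentence, "sentiment": classify(sentence)}
    return json.dumps(record, ensure_ascii=False)
```

A record such as `tag("great movie")` can then be stored line by line, which keeps the corpus easy to verify and correct by hand.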

Investigation of Twitter Users' Activity Radius and Home Region in the City: The Case of Las Vegas (트위터 사용자의 도시 내 활동반경과 거주지역의 탐색: 라스베이거스 사례)

Cho, Jaehee;Seo, Il-Jung
- The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.505-513 / 2017
In this study, we collected 200,578,703 geo-tweets and removed Twitter bots. Twitter users are classified using the concept of activity radius. Users are first divided into domestic and overseas users, and domestic users are further divided into locals and non-locals. Statistical characteristics of the activity strength and active area of Twitter users are described according to activity radius and home region, and the geographical distribution is presented visually. Through a case study of Las Vegas, we identified differences in activity strength and active area by the user's home region. We expect to derive theories about human mobility by analyzing various cities with the method proposed in this study.
https://doi.org/10.7840/kics.2017.42.2.505

Framing North Korea on Twitter: Is Network Strength Related to Sentiment?

Kang, Seok
- Journal of Contemporary Eastern Asia / v.20 no.2 / pp.108-128 / 2021
Research on the news coverage of North Korea has paid less attention to social media platforms than to legacy media. An increasing number of social media users post, retweet, share, interpret, and set agendas on North Korea. The accessibility of international users and North Korea's publicity purposes make social media a venue for expression, news diversity, and framing about the nation. This study examined the sentiment of Twitter posts on North Korea from a framing perspective and the relationship between network strength and sentiment from a social network perspective. Data were collected using two tools: Jupyter Notebook with Python 3.6 for preliminary analysis and NodeXL for main analysis. A total of 11,957 tweets about North Korea posted on June 20-21, 2020 were collected, 10,000 using Python and 1,957 using NodeXL. Results demonstrated that there was more negative sentiment than positive sentiment about North Korea in the sampled Twitter posts. Some users belonging to small networks reached out to others on Twitter to build networks and spread positive information about North Korea. Influential users tended to be impartial in sentiment about North Korea, while some Twitter users with a small network exhibited high percentages of positive words about North Korea. Overall, marginalized populations with network bonding were more likely to express positive sentiment about North Korea than were influencers at the center of networks.
https://doi.org/10.17477/jcea.2021.20.2.108

Malaria Epidemic Prediction Model by Using Twitter Data and Precipitation Volume in Nigeria

Nduwayezu, Maurice;Satyabrata, Aicha;Han, Suk Young;Kim, Jung Eon;Kim, Hoon;Park, Junseok;Hwang, Won-Joo
- Journal of Korea Multimedia Society / v.22 no.5 / pp.588-600 / 2019
Each year, malaria affects over 200 million people worldwide. The African continent is hit particularly hard by this disease. According to many studies, the continent provides ideal conditions for Anopheles mosquitoes, which transmit malaria parasites, to thrive. Rainfall volume is one of the major factors favoring the development of Anopheles in tropical Sub-Saharan Africa (SSA). However, the surveillance, monitoring, and reporting of this epidemic remain poor and purely bureaucratic. In our paper, we propose a method to quickly monitor and report malaria instances using Social Network Systems (SNS) and precipitation volume in Nigeria. We used the Twitter search Application Programming Interface (API) to live-stream Twitter messages mentioning malaria, preprocessed those tweets, classified them into malaria cases in Nigeria using a Support Vector Machine (SVM) classification algorithm, and compared those cases with average precipitation volume. The comparison yielded a correlation of 0.75 between malaria cases recorded via Twitter and average precipitation in Nigeria. To ensure the reliability of our classification algorithm, we used an oversampling technique to eliminate the class imbalance in our training tweets.
https://doi.org/10.9717/kmms.2019.22.5.588
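Two of the steps named in the abstract above, the correlation between case counts and precipitation and the oversampling of training tweets, can be sketched as follows. The abstract does not state which correlation coefficient or which oversampling technique was used, so the Pearson coefficient and random duplication of minority-class samples are assumptions, and the function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((sample, label) for sample in items + extra)
    return balanced
```

In use, `pearson` would be applied to two aligned series, for example weekly tweet-derived case counts and weekly average precipitation, while `oversample` would be applied to the labelled training tweets before fitting the classifier.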

Sentiment Analysis of COVID-19 Tweets: Impact of Pre-processing Step

Ayadi, Rami;Shahin, Osama R.;Ghorbel, Osama;Alanazi, Rayan;Saidi, Anouar
- International Journal of Computer Science & Network Security / v.21 no.3 / pp.206-211 / 2021
Internet users are increasingly invited to express their opinions on various subjects in social networks, e-commerce sites, news sites, forums, etc. Much of this information, which describes feelings, has become the subject of study in several areas of research, such as opinion mining and sentiment analysis. Sentiment analysis is the process of identifying the polarity of the feelings held in the opinions found in the interactions of Internet users on the web and classifying them as positive, negative, or neutral. In this article, we propose a sentiment analysis tool that detects the polarity of people's opinions about COVID-19 extracted from social media (Twitter) in the Arabic language, and we examine the impact of the pre-processing phase on opinion classification. The results show gaps in this area of research. First, resources are lacking when collecting data. Second, the Arabic language is more complex to pre-process, especially its dialects. Ultimately, however, the results obtained are promising.
https://doi.org/10.22937/IJCSNS.2021.21.3.28

An Extended Work Architecture for Online Threat Prediction in Tweeter Dataset

Sheoran, Savita Kumari;Yadav, Partibha
- International Journal of Computer Science & Network Security / v.21 no.1 / pp.97-106 / 2021
Social networking platforms have become a smart way for people to interact and meet on the internet. They provide a way to keep in touch with friends, family, colleagues, business partners, and many more. Among the various social networking sites, Twitter is one of the fastest-growing, where users can read the news, share ideas, discuss issues, etc. Due to its vast popularity, the accounts of legitimate users are vulnerable to a large number of threats. Spam and malware are among the most damaging threats found on Twitter. Therefore, to enjoy seamless services, Twitter must be secured against malicious users by dealing with them in advance. Various studies have used Machine Learning (ML) based approaches to detect spammers on Twitter. This research aims to devise a secure system based on hybrid Cosine and Soft Cosine similarity measures in combination with a Genetic Algorithm (GA) and an Artificial Neural Network (ANN) to secure the Twitter network against spammers. The similarity among tweets is determined using Cosine with Soft Cosine, applied to the Twitter dataset. GA is used to enhance training with minimum training error by selecting the most suitable features according to the designed fitness function. Tweets are classified as spammer or non-spammer based on the ANN structure along with a voting rule. The True Positive Rate (TPR), False Positive Rate (FPR), and classification accuracy are used as the evaluation parameters to assess the performance of the system designed in this research. The simulation results reveal that our proposed model outperforms the existing state of the art.
https://doi.org/10.22937/IJCSNS.2021.21.1.14
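The cosine similarity between tweets referred to in the abstract above can be illustrated with a small sketch over bag-of-words counts. Only plain cosine similarity is shown; the Soft Cosine variant additionally needs a word-to-word similarity matrix, which the abstract does not describe, so it is left out. Whitespace tokenization and the function name are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(tweet_a, tweet_b):
    """Cosine of the angle between the word-count vectors of two tweets."""
    vec_a = Counter(tweet_a.lower().split())
    vec_b = Counter(tweet_b.lower().split())
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty tweet is treated as dissimilar to everything
    return dot / (norm_a * norm_b)
```

Near-duplicate spam messages score close to 1, while tweets with no shared words score 0, which is what makes the measure usable as a classifier feature.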

Slangs and Short forms of Malay Twitter Sentiment Analysis using Supervised Machine Learning

Yin, Cheng Jet;Ayop, Zakiah;Anawar, Syarulnaziah;Othman, Nur Fadzilah;Zainudin, Norulzahrah Mohd
- International Journal of Computer Science & Network Security / v.21 no.11 / pp.294-300 / 2021
Society today relies on social media on an everyday basis, which motivates finding which supervised machine learning algorithms used in sentiment analysis are more accurate in detecting Malay internet slang and short forms, which can be offensive to a person. This paper aims to determine which of the chosen supervised machine learning algorithms detects internet slang and short forms with higher accuracy. To analyze the results of the supervised machine learning classifiers, we chose two types of datasets: one is political topic-based, and the other is the same set mixed with 50 tweets per targeted keyword. The datasets were then manually labelled as positive or negative before the 275 tweets were separated into training and testing sets. The performance of the Naïve Bayes and Random Forest classifiers was then analyzed and evaluated. Our experimental results show that Random Forest is a better classifier than Naïve Bayes.
https://doi.org/10.22937/IJCSNS.2021.21.11.40

Phrase-Chunk Level Hierarchical Attention Networks for Arabic Sentiment Analysis

Abdelmawgoud M. Meabed;Sherif Mahdy Abdou;Mervat Hassan Gheith
- International Journal of Computer Science & Network Security / v.23 no.9 / pp.120-128 / 2023
In this work, we have presented ATSA, a hierarchical attention deep learning model for Arabic sentiment analysis. ATSA was proposed to address several challenges and limitations that arise when applying classical models to opinion mining in Arabic. Arabic-specific challenges, including morphological complexity and language sparsity, were addressed by modeling semantic composition at the level of Arabic morphological analysis after performing tokenization. ATSA performs phrase-chunk sentiment embedding to provide a broader set of features that cover syntactic, semantic, and sentiment information. We used a phrase structure parser to generate syntactic parse trees that are used as a reference for ATSA. This allowed modeling semantic and sentiment composition following the natural order in which words and phrase-chunks are combined in a sentence. The proposed model was evaluated on three Arabic corpora that correspond to different genres (newswire, online comments, and tweets) and different writing styles (MSA and dialectal Arabic). Experiments showed that each of the proposed contributions in ATSA achieved a significant improvement. The combination of all contributions, which makes up the complete ATSA model, improved classification accuracy by 3% and 2% on the Tweets and Hotel reviews datasets, respectively, compared to existing models.
https://doi.org/10.22937/IJCSNS.2023.23.9.15

Real-time Knowledge Structure Mapping from Twitter for Damage Information Retrieval during a Disaster

Sohn, Jiu;Kim, Yohan;Park, Somin;Kim, Hyoungkwan
- International conference on construction engineering and project management / 2020.12a / pp.505-509 / 2020
Twitter is a useful medium for grasping the various damage situations that occur in society. However, spotting damage-related topics over time is a laborious task in an environment where information is constantly produced. This paper proposes a methodology for constructing a knowledge structure by combining a BERT-based classifier and community detection techniques to discover the topics underlying the damage information. The methodology consists of two steps. In the first step, the tweets are classified into classes related to human damage, infrastructure damage, and industrial activity damage by a BERT-based transfer learning approach. In the second step, networks of the words that appear in the damage-related tweets are constructed based on the co-occurrence matrix. The derived networks are partitioned by maximizing the modularity to reveal the hidden topics. Five keywords with high degree centrality are selected to interpret the topics. The proposed methodology is validated with the Hurricane Harvey test data.
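The second step described in the abstract above, building a word co-occurrence network and picking keywords by degree centrality, can be sketched in a few lines. This is a simplified illustration under stated assumptions: words co-occur when they appear in the same tweet, tokenization is whitespace-based, and the modularity-based partitioning into topics is omitted. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(tweets):
    """Link every pair of distinct words that appear in the same tweet."""
    neighbors = defaultdict(set)
    for tweet in tweets:
        words = sorted(set(tweet.lower().split()))
        for first, second in combinations(words, 2):
            neighbors[first].add(second)
            neighbors[second].add(first)
    return neighbors


def degree_centrality(neighbors):
    """Degree of each word divided by the maximum possible degree (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {word: 0.0 for word in neighbors}
    return {word: len(adjacent) / (n - 1) for word, adjacent in neighbors.items()}


def top_keywords(tweets, k=5):
    """Return the k words with the highest degree centrality (ties by word)."""
    centrality = degree_centrality(cooccurrence_graph(tweets))
    return sorted(centrality, key=lambda word: (-centrality[word], word))[:k]
```

In the methodology summarized above, the keyword selection would be run once per detected community rather than on the whole network, so that each topic gets its own five keywords.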

Search Result 179, Processing Time 0.026 seconds