Analyzing the Phenomena of Hate in Korea by Text Mining Techniques

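  • 김혜진 (Department of Library and Information Science Education, Kongju National University; School Library Education Research Institute)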
  • Received : 2022.10.18
  • Accepted : 2022.11.16
  • Published : 2022.11.30

Abstract

Hate is a collective expression of exclusivity toward others, and it is fostered and reproduced through false public perception. This study explores the targets and issues of hate discussed in our society using text mining techniques. To this end, we collected 17,867 news articles published from 1990 to 2020, constructed a co-word network, and performed cluster analysis. To derive a co-word network explicitly related to hate, we split the articles into sentences in the preprocessing phase and extracted a total of 52,520 sentences containing the words 'hate', 'prejudice', or 'discrimination'. Word frequency analysis of the collected news data showed that the targets appearing most frequently in relation to hate in our society were women, race, and sexual minorities, and that the related issues were the laws and crimes concerning these groups. Cluster analysis based on the co-word network yielded a total of six hate-related clusters. The largest cluster was 'genderphobic', accounting for 41.4% of the total, followed by 'sexual minority hatred' at 28.7%, 'racial hatred' at 15.1%, 'selective hatred' at 8.5%, 'political hatred' at 5.7%, and 'environmental hatred' at 0.3%. In the discussion, we comprehensively extracted from the collected news data all specific hate targets that the cluster analysis did not reveal.

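Hate is a collective expression of exclusivity toward others, and it is produced and reproduced through mistaken public perceptions. To explore at a macro level the aspects of 'hate' discussed in our society, this study applied text mining techniques to 17,867 news articles published from 1990 to 2020 and carried out keyword network and cluster analyses. Before extracting words, we first split the articles into sentences in a preprocessing step and extracted a total of 52,520 sentences containing the words 'hate', 'prejudice', or 'discrimination' for analysis, thereby building a keyword network composed of words adjacent to the word 'hate'. The word co-occurrence frequency analysis of the collected news data showed that the targets appearing most frequently in relation to hate in our society were women, race, and sexual minorities, and that the related issues were laws and crimes concerning these groups. The cluster analysis of the keyword network revealed a total of six hate clusters: gender (41.4%), minority (28.7%), race and ethnicity (15.1%), selective and interest-based (8.5%), political and ideological (5.7%), and environmental and survival-related (0.3%) hate. In the discussion, we extracted and analyzed all hate targets that were not specifically revealed by the cluster analysis.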
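
The preprocessing and network steps described in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the abstract names neither the tools nor the clustering algorithm, so everything below is an assumption made for illustration. The toy articles are English stand-ins (the study analyzed Korean news, which would need a morphological analyzer rather than a regex tokenizer), the stopword list is invented, removing the three filter words from the network is a choice of this sketch, and connected components over frequent word pairs is a deliberately simple substitute for the cluster analysis step.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: English stand-ins for the three filter words named in the abstract.
FILTER_WORDS = {"hate", "prejudice", "discrimination"}
# Assumption: a tiny illustrative stopword list, not the one used in the study.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "was", "against"}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def hate_sentences(articles):
    """Keep only the sentences that contain at least one filter word."""
    kept = []
    for article in articles:
        for sentence in split_sentences(article):
            tokens = tokenize(sentence)
            if FILTER_WORDS & set(tokens):
                kept.append(tokens)
    return kept


def co_word_network(token_lists):
    """Count how often each unordered word pair occurs in the same sentence."""
    edges = Counter()
    for tokens in token_lists:
        # Sketch-level choice: drop the filter words so they do not link everything.
        vocab = sorted(set(tokens) - FILTER_WORDS)
        edges.update(combinations(vocab, 2))
    return edges


def clusters(edges, min_weight=2):
    """Connected components over edges with weight >= min_weight (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in edges.items():
        if weight >= min_weight:
            parent[find(a)] = find(b)
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical toy corpus, invented for this example.
articles = [
    "Hate against women is rising. The weather was mild.",
    "Discrimination against women and migrants persists. Prejudice targets migrants and refugees.",
    "Reports describe hate toward women and migrants.",
]
sentences = hate_sentences(articles)
edges = co_word_network(sentences)
groups = clusters(edges, min_weight=2)
```

On a real corpus, a modularity-based community detection method would be a more natural fit for the clustering step than thresholded connected components; the sketch keeps to the standard library so that each stage (sentence filtering, pair counting, grouping) stays visible.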

Acknowledgement

This research was supported by a research grant from Kongju National University in 2019.
