Token-Based Classification and Dataset Construction for Detecting Modified Profanity

  • Received : 2023.12.26
  • Accepted : 2024.03.22
  • Published : 2024.04.30

Abstract

Traditional profanity detection methods struggle to identify intentionally altered profanities. This paper introduces a new method based on Named Entity Recognition (NER), a subfield of Natural Language Processing. We developed a profanity detection technique that uses sequence labeling; to support it, we constructed a dataset by labeling some of the profanities in Korean malicious comments and ran experiments on it. To further improve model performance, we augmented the dataset by using ChatGPT, a large language model, to label part of a Korean hate speech dataset, and trained on the result. In this process, we confirmed that simply having humans filter the dataset generated by the large language model was enough to improve performance. This suggests that human oversight is still necessary when augmenting datasets.
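To make the sequence-labeling framing concrete, the following is a minimal sketch of how annotated profanity spans could be turned into per-token labels and decoded again. It is an illustration only: the BIO scheme, the `PROF` label name, the whitespace tokenization, the helper names `spans_to_bio` and `bio_to_phrases`, and the English example sentence are all assumptions, not details taken from the paper.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # character offsets [start, end); assumed annotation format


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Label whitespace tokens as B-PROF / I-PROF / O from character spans."""
    labeled = []
    pos = 0
    prev_span = None
    for token in text.split():
        start = text.index(token, pos)  # locate this token in the raw text
        end = start + len(token)
        pos = end
        # First annotated span that overlaps the token, if any.
        hit = next((s for s in spans if s[0] < end and start < s[1]), None)
        if hit is None:
            tag = "O"
        elif hit == prev_span:
            tag = "I-PROF"  # continues the span opened by an earlier token
        else:
            tag = "B-PROF"  # first token of a new span
        prev_span = hit
        labeled.append((token, tag))
    return labeled


def bio_to_phrases(labeled: List[Tuple[str, str]]) -> List[str]:
    """Group consecutive B-/I- tokens back into detected phrases."""
    phrases, current = [], []
    for token, tag in labeled:
        if tag == "B-PROF":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == "I-PROF" and current:
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hypothetical example: a word deliberately split by spaces to evade filters.
text = "what a d a m n mess"
labeled = spans_to_bio(text, [(7, 14)])
detected = bio_to_phrases(labeled)
```

Labeling at the token level is what lets a tagger mark the four separate fragments as one span, whereas a dictionary lookup on whole words would miss them. A real system would replace the whitespace split with the tokenizer of the underlying model.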

Acknowledgement

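This research was carried out as part of the bachelor's-master's linked ICT core talent development program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP-2024-RS-2023-00260175).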
