Search | Korea Science

Korean Text Classification Using Randomforest and XGBoost Focusing on Seoul Metropolitan Civil Complaint Data (RandomForest와 XGBoost를 활용한 한국어 텍스트 분류: 서울특별시 응답소 민원 데이터를 중심으로)

Ha, Ji-Eun;Shin, Hyun-Chul;Lee, Zoon-Ky
- The Journal of Bigdata, v.2 no.2, pp.95-104, 2017
In 2014, the Seoul Metropolitan Government launched a response service to respond promptly to civil complaints. Complaints received are categorized by content and forwarded to the department in charge. Automating this step would reduce time and labor costs. In this study, we collected 17,700 complaints filed over 7 years, from June 1, 2010 to May 31, 2017. We compared XGBoost with RandomForest to assess their suitability for Korean text classification. XGBoost was generally more accurate than RandomForest. RandomForest's accuracy was unstable after upsampling and downsampling of the same sample, whereas XGBoost's overall accuracy remained stable.
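The abstract does not describe how the upsampling and downsampling were carried out. As a rough illustration only, the sketch below shows generic random resampling for balancing classes; the function name `resample`, the toy complaint IDs, and the department labels are all hypothetical and not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def resample(samples, labels, mode, seed=0):
    """Balance classes by random resampling (illustrative, not the paper's code).

    mode="up":   draw extra items with replacement until every class
                 matches the largest class.
    mode="down": draw items without replacement until every class
                 matches the smallest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if mode == "up":
            # Keep all originals, then pad with random duplicates.
            picked = items + rng.choices(items, k=target - len(items))
        else:
            # Keep a random subset of the originals.
            picked = rng.sample(items, target)
        out_samples.extend(picked)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Hypothetical imbalanced data: four "road" complaints, two "noise" complaints.
complaints = ["c1", "c2", "c3", "c4", "c5", "c6"]
departments = ["road", "road", "road", "road", "noise", "noise"]

up_x, up_y = resample(complaints, departments, "up")
down_x, down_y = resample(complaints, departments, "down")
print(Counter(up_y), Counter(down_y))
```

Upsampling repeats minority-class items, so a classifier sees the same complaints several times; downsampling discards majority-class items. Either effect could plausibly change how a tree ensemble behaves, which is one reason to compare models under both settings.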

Odor Classification and Source Analysis using Pseudo Inverse (Pseudo Inverse를 이용한 악취분류와 악취원 분석)

Yu, Suk-Hyun;Park, Sang-Jin;Koo, Youn-Seo;Kwon, Hee-Yong
- Journal of Korea Multimedia Society, v.13 no.8, pp.1171-1182, 2010
In this paper, odor classification and source analysis methods are proposed to trace odor sources in the air at a specific place and period. Representative patterns are needed to classify the various odors efficiently. We therefore create 67 kinds of representative odor patterns measured from the main sources. Because ambient air contains a mixture of odors, several mixed representative patterns are also generated by combining two or three different odors. In addition, the weights of the odor sources for an odor from a civil complaint region are computed using the pseudo-inverse method. As a result, we can trace and identify the sources that lead to a specific odor and the contribution of each source. The results of this study will be useful for settling odor-related civil complaints.
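The abstract gives no formulas, so the following is only a minimal sketch of how source weights might be estimated with a pseudo-inverse. It assumes a linear mixing model in which an observed odor pattern `b` is approximately `A @ w`, where each column of `A` is one source's representative pattern; the matrix values, the four sensor channels, and the three sources are invented for illustration.

```python
import numpy as np

# Hypothetical representative patterns: rows are sensor channels,
# columns are odor sources (values are made up for illustration).
A = np.array([
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [1.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
])

# Simulate an observed pattern from known weights so the result can be checked.
true_weights = np.array([0.6, 0.3, 0.1])
observed = A @ true_weights

# Least-squares estimate of the source weights via the Moore-Penrose pseudo-inverse.
weights = np.linalg.pinv(A) @ observed

# Express each source's share of the total as a fraction.
contribution = weights / weights.sum()
print(weights, contribution)
```

With more channels than sources, the pseudo-inverse yields the least-squares solution, so the same call still works when the observed pattern contains measurement noise and no exact solution exists.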

A Study on an Automatic Classification Model for Facet-Based Multidimensional Analysis of Civil Complaints (패싯 기반 민원 다차원 분석을 위한 자동 분류 모델)

Na Rang Kim
- Journal of Korea Society of Industrial Information Systems, v.29 no.1, pp.135-144, 2024
In this study, we propose an automatic classification model for quantitative multidimensional analysis based on facet theory to understand public opinions and demands on major issues through big data analysis. Civil complaints, as a form of public feedback, are generated by various individuals on multiple topics repeatedly and continuously in real-time, which can be challenging for officials to read and analyze efficiently. Specifically, our research introduces a new classification framework that utilizes facet theory and political analysis models to analyze the characteristics of citizen complaints and apply them to the policy-making process. Furthermore, to reduce administrative tasks related to complaint analysis and processing and to facilitate citizen policy participation, we employ deep learning to automatically extract and classify attributes based on the facet analysis framework. The results of this study are expected to provide important insights into understanding and analyzing the characteristics of big data related to citizen complaints, which can pave the way for future research in various fields beyond the public sector, such as education, industry, and healthcare, for quantifying unstructured data and utilizing multidimensional analysis. In practical terms, improving the processing system for large-scale electronic complaints and automation through deep learning can enhance the efficiency and responsiveness of complaint handling, and this approach can also be applied to text data processing in other fields.
https://doi.org/10.9723/jksiis.2024.29.1.135

A Study on the Classification of Unstructured Data through Morpheme Analysis

Kim, SungJin;Choi, NakJin;Lee, JunDong
- Journal of the Korea Society of Computer and Information, v.26 no.4, pp.105-112, 2021
In the era of big data, interest in data is exploding. In particular, the development of the Internet and social media has led to the creation of new data, enabling the era of big data and artificial intelligence and opening a new chapter in convergence technology. There is also growing demand for analysis of data that programs could not handle in the past. In this paper, an analysis model was designed and verified for the classification of unstructured data, which is often required in the era of big data. Thesis summaries, main keywords, and sub-keywords were crawled from DBPia, a database was created using KoNLP's data dictionary, and words were tokenized through morpheme analysis. In addition, nouns were extracted using KAIST's 9 part-of-speech classification system, TF-IDF values were generated, and an analysis dataset was created by combining training data and Y values. Finally, the adequacy of classification was measured by applying three analysis algorithms (random forest, SVM, decision tree) to the generated analysis dataset. The classification model proposed in this paper can be applied in various fields such as civil complaint classification and text-related analysis, in addition to thesis classification.
https://doi.org/10.9708/jksci.2021.26.04.105
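The TF-IDF step mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it assumes noun extraction has already been done (the token lists below are hypothetical), that every document has at least one token, and that TF-IDF is computed with one common variant, term frequency divided by document length times `log(N / df)`. The paper may use a different weighting.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) where rows[i][j] is the TF-IDF weight
    of vocabulary[j] in docs[i]. Each doc is a non-empty list of tokens."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vocabulary = sorted(doc_freq)

    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([
            (counts[term] / len(doc)) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, rows


# Hypothetical noun tokens, as if produced by a morpheme analyzer.
docs = [
    ["complaint", "classification", "complaint"],
    ["classification", "model"],
    ["odor", "complaint"],
]
labels = ["admin", "ml", "environment"]  # hypothetical Y values

vocabulary, rows = tfidf(docs)
for label, row in zip(labels, rows):
    print(label, [round(value, 3) for value in row])
```

Each row, paired with its label, forms one record of the kind of analysis dataset the abstract describes, which could then be passed to a classifier such as a random forest, SVM, or decision tree.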
