• Title/Summary/Keyword: 정보 불균형 (information imbalance)

Search Result 593, Processing Time 0.081 seconds

Comparison of Korean Classification Models' Korean Essay Score Range Prediction Performance (한국어 학습 모델별 한국어 쓰기 답안지 점수 구간 예측 성능 비교)

  • Cho, Heeryon;Im, Hyeonyeol;Yi, Yumi;Cha, Junwoo
    • KIPS Transactions on Software and Data Engineering / v.11 no.3 / pp.133-140 / 2022
  • We investigate the performance of deep learning-based Korean language models on the task of predicting the score range of Korean essays written by foreign students. We construct a data set of 304 essays on four topics: the criteria for choosing a job ('job'), conditions of a happy life ('happ'), the relationship between money and happiness ('econ'), and the definition of success ('succ'). The essays were labeled with four letter grades (A, B, C, and D), and eleven score range prediction experiments were conducted: five on 'job' essays, five on 'happ' essays, and one on mixed-topic essays. Three deep learning-based Korean language models, KoBERT, KcBERT, and KR-BERT, were fine-tuned using various training data. Two traditional probabilistic machine learning classifiers, naive Bayes and logistic regression, were also evaluated. Experimental results show that the deep learning-based models outperformed the two traditional classifiers, with KR-BERT performing best at 55.83% overall average prediction accuracy, followed closely by KcBERT (55.77%) and then KoBERT (54.91%). The naive Bayes and logistic regression classifiers achieved 52.52% and 50.28%, respectively. Because training data were scarce and the class distribution was imbalanced, overall prediction performance was not high for any classifier. Moreover, the classifiers' vocabulary did not explicitly capture the error features that help in grading Korean essays correctly. We expect score range prediction performance to improve once these two limitations are overcome.
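
For readers unfamiliar with the traditional baselines named in the abstract above, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing over whitespace tokens. The toy English texts, the two grade labels, and the function names `train_nb` and `predict_nb` are illustrative assumptions, not the authors' data or implementation.

```python
import math
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Count documents per class and words per class (whitespace tokens)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two grades only, invented essay snippets.
texts = [
    "clear argument rich vocabulary",
    "clear structure rich examples",
    "many errors short answer",
    "short answer unclear errors",
]
labels = ["A", "A", "D", "D"]
model = train_nb(texts, labels)
print(predict_nb(model, "rich vocabulary clear"))
```

A bag-of-words model like this only sees which words occur, which is one way to picture the abstract's remark that the classifiers' vocabulary did not capture error features.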

The Performance Improvement of U-Net Model for Landcover Semantic Segmentation through Data Augmentation (데이터 확장을 통한 토지피복분류 U-Net 모델의 성능 개선)

  • Baek, Won-Kyung;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1663-1676 / 2022
  • A number of deep learning-based land cover segmentation studies have been introduced recently. Some of them noted that segmentation performance deteriorated because of insufficient training data. In this study, we verified that data augmentation improves land cover segmentation performance. U-Net was used as the segmentation model, and a 2020 satellite-derived land cover dataset was used as the study data. The pixel accuracies were 0.905 and 0.923 for U-Net trained on the original and augmented data, respectively, and the mean F1 scores were 0.720 and 0.775, respectively, indicating better performance with data augmentation. For the U-Net trained on the original data, the F1 scores for the building, road, paddy field, upland field, forest, and unclassified area classes were 0.770, 0.568, 0.433, 0.455, 0.964, and 0.830. With data augmentation, the F1 score of every class improved, to 0.838, 0.660, 0.791, 0.530, 0.969, and 0.860, respectively. Although we applied data augmentation without considering class balance, the comparison of the two models shows that data augmentation can mitigate the biased segmentation performance caused by data imbalance. We expect this study to help demonstrate the importance and effectiveness of data augmentation in various image processing fields.
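
The two metrics reported in the abstract above, pixel accuracy and mean F1 over classes, can be sketched as follows. This is a minimal illustration on flattened label lists. The six toy pixels and the function names are assumptions, and the paper's own evaluation code is not shown in the source.

```python
def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class equals the reference class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, classes):
    """F1 = 2TP / (2TP + FP + FN), computed separately for each class."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def mean_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical flattened label maps (one entry per pixel).
y_true = ["forest", "forest", "forest", "road", "road", "building"]
y_pred = ["forest", "forest", "road", "road", "forest", "building"]

acc = pixel_accuracy(y_true, y_pred)
f1 = per_class_f1(y_true, y_pred, ["forest", "road", "building"])
print(acc, f1, mean_f1(f1))
```

Because the mean is unweighted, a rare class with a low F1 pulls the mean down even when pixel accuracy stays high, which is why the abstract reports both numbers.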

Qualitative Research on Mothers' Stress Level of Meal Preparation and Change of Food Consumption Pattern in Context of COVID-19 (코로나19 이후 가정 내 어머니의 식사준비 스트레스와 먹거리 소비패턴 변화에 관한 질적연구)

  • Lee, Yoonsun;Ryu, Sihyun
    • The Journal of the Korea Contents Association / v.22 no.2 / pp.695-709 / 2022
  • This study examined mothers' stress related to meal preparation and changes in food consumption during the COVID-19 pandemic. Using in-depth research methods, I asked mothers about difficulties with meal preparation, changes in food consumption, and their interest in food and health information. The results show that food delivery and online food purchasing expanded markedly as eating out decreased. Mothers tried to serve fresh food at home because they perceived delivery food as unhealthy, and they held to this more firmly when their children were younger. Analysis of the determinants of food consumption showed that interest in organic food was higher among younger mothers with higher incomes, whereas mothers with a low level of education tended to be uninterested in it. Thus the ages of the children and mothers, household income, and education level all affected food consumption behavior.

Establishing Design Directions for Nutrition Education Materials for Early Elementary Students in South Korea (초등학교 저학년 영양교육 학습 자료의 디자인 방향 설정에 관한 연구)

  • Park, YuBin;Paik, JinKyung
    • Design Convergence Study / v.14 no.2 / pp.1-16 / 2015
  • As childhood obesity and nutritional imbalance emerge as social issues in South Korea, developing educational materials on diet and nutrition has become important and has been attempted in diverse ways. The present study targeted early elementary school students, with the objective of establishing design directions for personalized nutrition education materials that promote a proper, balanced diet by helping children apply in daily life the knowledge acquired through self-learning linked to the nutrition education taking place in their schools. For this purpose, previous theoretical literature on nutrition education for early elementary students was reviewed. Survey questions were formulated with the advice of field experts in medicine, education, and design, and the survey was conducted among 110 children in the 1st and 2nd grades of two elementary schools, one in Seoul and the other in Changwon. The results of the user evaluation suggested that early elementary school students responded positively to nutrition education, preferred learning through multimedia-based content and quiz activities, were willing to learn about calorie-adjusted meals, and preferred the Gothic typeface and the colors orange and green. They also responded positively to the use of a numerical surveying method and a pictorial style close to actual appearance for representing nutrition-related information. Together, these results describe the students' preferences regarding learning styles and design elements.

A Classification Model for Customs Clearance Inspection Results of Imported Aquatic Products Using Machine Learning Techniques (머신러닝 기법을 활용한 수입 수산물 통관검사결과 분류 모델)

  • Ji Seong Eom;Lee Kyung Hee;Wan-Sup Cho
    • The Journal of Bigdata / v.8 no.1 / pp.157-165 / 2023
  • Seafood is a major source of protein in many countries, and its consumption is increasing. In Korea, seafood consumption is rising while the self-sufficiency rate is falling, so safety management grows more important as imports increase. Hundreds of species of aquatic products are imported into Korea from over 110 countries, and relying only on inspectors' experience for the safety management of imported aquatic products has its limits. In this study, a data-driven machine learning classification model is developed that predicts the customs inspection result of imported aquatic products, determining whether a product is likely to be non-conforming when an import declaration is submitted. In customs inspections of imported aquatic products, the non-conformity rate is less than 1%, so the data are highly imbalanced. Sampling methods that can compensate for this characteristic were therefore compared, and a preprocessing method that keeps the classification results interpretable was applied. Among the machine learning-based classification models, Random Forest and XGBoost performed well. The model that best predicts both conformity and non-conformity is the basic random forest model with ADASYN and one-hot encoding, which achieves an accuracy of 99.88%, precision of 99.87%, recall of 99.89%, and AUC of 99.88%. XGBoost is the most stable model, with all indicators exceeding 90% regardless of the oversampling and encoding type.
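
To see why a non-conformity rate near 1% calls for resampling and for metrics beyond accuracy, consider the sketch below. It scores a trivial classifier that labels every declaration as conforming. The 1,000 toy declarations with exactly 10 non-conforming cases are an assumption chosen for round numbers, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) with `positive` marking non-conforming items."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Assumed toy data: 10 non-conforming (1) among 1,000 declarations.
y_true = [1] * 10 + [0] * 990
always_conforming = [0] * 1000  # a classifier that never flags anything

result = evaluate(y_true, always_conforming)
print(result)
```

The trivial classifier is right on 990 of 1,000 items yet finds none of the non-conforming products, so accuracy alone says little on data like this.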

Fine-tuning BERT-based NLP Models for Sentiment Analysis of Korean Reviews: Optimizing the sequence length (BERT 기반 자연어처리 모델의 미세 조정을 통한 한국어 리뷰 감성 분석: 입력 시퀀스 길이 최적화)

  • Sunga Hwang;Seyeon Park;Beakcheol Jang
    • Journal of Internet Computing and Services / v.25 no.4 / pp.47-56 / 2024
  • This paper proposes a method for fine-tuning BERT-based natural language processing models to perform sentiment analysis on Korean review data. By varying the input sequence length and comparing the resulting performance, we aim to identify the input sequence length that gives the best performance. For this purpose, text review data were collected by web scraping from the clothing shopping platform M. During data preprocessing, positive and negative satisfaction scores were recalibrated to improve the accuracy of the analysis. Specifically, the GPT-4 API was used to reset the labels to reflect the actual sentiment of the review texts, and data imbalance was addressed by adjusting the data to a 6:4 ratio. Reviews on the clothing shopping platform averaged about 12 tokens in length; to find a model suited to such short texts, five BERT-based pre-trained models were compared in the modeling stage, with a focus on input sequence length and memory usage. The experimental results indicated that an input sequence length of 64 generally gave the most appropriate balance of performance and memory usage. In particular, the KcELECTRA model showed optimal performance and memory usage at an input sequence length of 64, achieving higher than 92% accuracy and reliability in sentiment analysis of Korean review data. Furthermore, using BERTopic, we provide a Korean review sentiment analysis process that classifies newly incoming review data by category and extracts sentiment scores for each category with the final model.
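
Fixing the input sequence length, as the abstract above describes, means every review is padded or truncated to the same number of tokens. The following is a minimal sketch of that step on integer token IDs. The padding ID of 0, the function name `pad_or_truncate`, and the toy ID sequences are assumptions; real tokenizers also add special tokens, which this sketch leaves out.

```python
def pad_or_truncate(token_ids, max_len=64, pad_id=0):
    """Return (ids, attention_mask), both exactly `max_len` long."""
    clipped = list(token_ids[:max_len])  # drop tokens beyond max_len
    n_pad = max_len - len(clipped)       # padding needed for short inputs
    ids = clipped + [pad_id] * n_pad
    mask = [1] * len(clipped) + [0] * n_pad  # 1 = real token, 0 = padding
    return ids, mask


# Hypothetical inputs: a 12-token review (the average length reported in
# the abstract) and an unusually long 100-token review.
short_review = list(range(1, 13))
long_review = list(range(1, 101))

short_ids, short_mask = pad_or_truncate(short_review)
long_ids, long_mask = pad_or_truncate(long_review)
print(len(short_ids), sum(short_mask), len(long_ids), sum(long_mask))
```

With reviews averaging about 12 tokens, a longer fixed length mostly adds padding positions that still cost memory, which is consistent with the abstract's finding that a length of 64 balanced performance and memory usage.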

Battery Level Calculation and Failure Prediction Algorithm for ESS Optimization and Stable Operation (ESS 최적화 및 안정적인 운영을 위한 배터리 잔량 산출 및 고장 예측 알고리즘)

  • Joo, Jong-Yul;Lee, Young-Jae;Park, Kyoung-Wook;Oh, Jae-Chul
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.15 no.1 / pp.71-78 / 2020
  • When power is generated from renewable energy, production may be unsteady because of the weather. Energy storage systems (ESS) are used to increase the efficiency of solar and wind power generation. ESS fires have occurred repeatedly because of inadequate battery protection, operation management, and control systems, or careless installation, causing heavy casualties and economic losses. Technology for ESS stability and for operating and managing battery protection systems is therefore indispensable. In this paper, we present a battery level calculation algorithm and a failure prediction algorithm for ESS optimization and stable operation. The proposed algorithm calculates the battery level by accumulating the current in real time while the battery is charged and discharged, and predicts battery failure using the voltage imbalance between battery cells. The proposed algorithms can determine the battery level accurately and predict the failures that must be anticipated to operate the ESS optimally. Accurate status information on the ESS battery can thus be measured and reliably monitored to prevent major accidents.
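
The two ideas in the abstract above, accumulating current to track the battery level and using cell voltage imbalance as a failure signal, can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the sign convention (positive current means charging), the 100 Ah capacity, the 0.1 V threshold, and all function names are hypothetical.

```python
def update_charge(charge_ah, current_a, dt_s, capacity_ah):
    """Coulomb counting: add current * time (in Ah), clamped to [0, capacity]."""
    charge_ah += current_a * dt_s / 3600.0  # ampere-seconds to ampere-hours
    return min(max(charge_ah, 0.0), capacity_ah)


def state_of_charge(charge_ah, capacity_ah):
    """Battery level as a fraction between 0 and 1."""
    return charge_ah / capacity_ah


def voltage_imbalance(cell_voltages):
    """Spread between the highest and lowest cell voltage."""
    return max(cell_voltages) - min(cell_voltages)


def failure_suspected(cell_voltages, threshold_v=0.1):
    """Flag the pack when the spread exceeds an assumed threshold."""
    return voltage_imbalance(cell_voltages) > threshold_v


# Assumed scenario: 100 Ah pack starting at 50 Ah, charged at 10 A
# for one hour, sampled once per second.
capacity = 100.0
charge = 50.0
for _ in range(3600):
    charge = update_charge(charge, current_a=10.0, dt_s=1.0, capacity_ah=capacity)

soc = state_of_charge(charge, capacity)
print(round(soc, 3))
print(failure_suspected([3.70, 3.71, 3.69, 3.45]))
```

In practice the accumulated value drifts with sensor error, so the threshold and any recalibration strategy would need to come from the actual system rather than from this sketch.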

Predicting Highway Concrete Pavement Damage using XGBoost (XGBoost를 활용한 고속도로 콘크리트 포장 파손 예측)

  • Lee, Yongjun;Sun, Jongwan
    • Korean Journal of Construction Engineering and Management / v.21 no.6 / pp.46-55 / 2020
  • The maintenance cost of highway pavement is gradually increasing because the total length of roads continues to grow and the number of aging routes that have exceeded their service life is rising. As a result, a way to minimize costs through preventive maintenance is needed. Preventive maintenance requires a strategic plan based on accurate prediction of aging highway pavement. Therefore, in this study, XGBoost, a machine learning classification model, was used to develop a highway pavement damage prediction model. First, we addressed the imbalanced data issue through data sampling, and then developed a predictive model using XGBoost. The model was evaluated with performance indicators such as accuracy and F1 score. The over-sampling method gave the best performance. The main variables affecting road damage were, in order, the number of years of service, ESAL, and the number of days with a minimum temperature below -2 degrees Celsius. If the prediction model is improved through further data accumulation and more detailed data pre-processing, more accurate prediction of sections requiring maintenance is expected to become possible. The model is also expected to serve as important basic information for estimating future highway pavement maintenance budgets.
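
The abstract above reports that over-sampling worked best but does not say which variant was used. As one simple possibility, the sketch below shows random over-sampling, which duplicates minority-class rows until every class matches the largest one. The feature tuples (years of service, ESAL, cold days), their values, and the function name are invented for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical pavement sections: (years of service, ESAL, cold days).
rows = [(5, 1.2, 30), (7, 1.5, 28), (3, 0.9, 35), (6, 1.1, 31),
        (4, 1.0, 29), (8, 1.4, 33), (2, 0.8, 27), (9, 1.6, 32),
        (25, 3.8, 60), (30, 4.1, 65)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 1 = damaged (minority class)

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

Resampling should be applied to the training split only, so that the evaluation data keep the original class distribution.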

A Survey on the Critical Success Factors of Knowledge Management Using AHP (AHP 분석을 이용한 지식경영 실천 요소의 중요도에 관한 실증적 연구)

  • 이영수;박준아;정광식;김진우
    • Proceedings of the Korea Database Society Conference / 1999.06a / pp.85-94 / 1999
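  • To carry out knowledge management effectively, a company needs to understand precisely the elements that make up knowledge management, and investment should follow these key elements. By presenting the key elements of knowledge management, this study aims to offer activity guidelines and investment directions that let companies planning knowledge management pursue it effectively. To this end, drawing on issues discussed in domestic and international knowledge management literature, we extracted 30 key elements of knowledge management, arranged the elements needed to achieve knowledge management into a hierarchical structure using the analytic hierarchy process (AHP), and at the final level established 238 evaluation criteria for knowledge management implementation. To determine the relative importance of these implementation elements in practice, we first surveyed managers and related staff at 48 Korean companies that are pursuing or have shown interest in knowledge management; at the same time, to assess the current state of knowledge management at each company, we surveyed managers at 13 companies that actually practice knowledge management on the evaluation criteria for knowledge management practice. Taken together, the two surveys show that companies regard process within the infrastructure, and the use and dissemination of knowledge that make up process, as important, whereas actual investment has gone to information technology within the infrastructure and to the creation and accumulation of knowledge, the other axis of process. In addition, elements such as conversion into knowledge, the linking of performance and value, and the visualization of knowledge were found to be neglected in knowledge management practice, contrary to their perceived relative importance. This study therefore proposes that future investment be directed toward correcting this imbalance in knowledge management. Industry density was computed as a non-financial indicator variable and tested with logistic regression and an artificial neural network. In the logistic regression results, the overall prediction hit rate of the financial-indicator model was 87.50%, whereas that of the financial/non-financial indicator model was 90.18%, showing an improvement from using non-financial indicator variables. When the sample companies were split into training and test sets, the artificial neural network using financial/non-financial indicators showed the highest prediction hit rate overall. Specifically, the logistic regression financial-indicator model scored 84.45% and 85.10% on the training and test sets, and the financial/non-financial indicator model scored 84.45% and 85.08%, nearly identical; in the neural network analysis, the financial-indicator model scored 92.23% and 85.10%, while the financial/non-financial indicator model scored 91.12% and 88.06%, an improved prediction hit rate. (ⅱ) managerial and strategic learning to provide the information needed to improve program and policy decision making. The objectives of the study are to develop a methodology for modeling socioeconomic evaluation and to build a practical socioeconomic evaluation model of the HAN projects that includes scientific and technological effects. Since the HAN projects consist of 18 subprograms, it is difficult to evaluate all the subprograms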
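
The AHP step in the abstract above turns pairwise importance judgments into relative weights. A minimal sketch of one common way to do this, the row geometric mean approximation of the priority vector, is shown below. The three element names and the judgment values in the matrix are hypothetical, and the source does not say which weighting procedure the authors used.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priorities: normalize the geometric mean of each row."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical reciprocal comparison matrix for three elements.
# matrix[i][j] = how much more important element i is than element j.
elements = ["process", "information technology", "culture"]
matrix = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

weights = ahp_weights(matrix)
for name, w in zip(elements, weights):
    print(f"{name}: {w:.3f}")
```

Comparing weights like these with where investment actually went is the kind of gap analysis the abstract describes.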


Improving QoS using Cellular-IP/PRC in Hospital Wireless Network

  • Kim, Sung-Hong
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.1 no.2 / pp.120-126 / 2006
  • In this paper, we propose a method for improving QoS in a hospital wireless network using Cellular-IP/PRC (Paging Route Cache), which combines the Paging Cache and Route Cache of Cellular-IP. Although Cellular-IP/PRC was devised for mobile Internet communication, it is vulnerable in environments with frequent handoffs. A handoff state machine using differentiated handoff improves quality of service in Cellular-IP/PRC. The suggested algorithm performs better than existing technology in a wireless mobile Internet communication environment. The proposed admission method is a new-call admission method for the home base station based on estimating the transmit power of mobile nodes; it assumes that the home base station has ample capacity for mobile nodes and that most mobile nodes in adjacent cells are admitted by the home base station at once, and it secures speech quality by taking the resulting increase in received interference into account. The Paging Cache (PC) and Routing Cache (RC) used to manage paging and routing in the wireless Internet network are combined into a single PRC so that they can be managed together and applied at all nodes, and a handoff state machine is added to the mobile node so that its handoff and roaming states can be managed efficiently and connection functions can be carried out at the node. We analyze the factors that influence traffic in the system environment and predict the traffic volume and degree of imbalance on each link. The proposed algorithm uses total transmit and receive power to judge from the downlink whether traffic capacity is limited, and admits or blocks calls accordingly. Applying it, we predict the most important QoS measures, namely call blocking probability, call dropping probability, GoS (Grade of Service), and cell capacity efficiency, and show the improvement in QoS performance.
