• Title/Summary/Keyword: dataset construction

Data set design and implementation for Assistive walking device AI service construction (보조보행기구 AI 서비스 구축을 위한 데이터셋 설계 및 구현)

  • Choi, Kyu-Min;Kim, Yu-Min;Shin, Joon-Pyo;Sung, Seung-min;Lee, Byung-kwon
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.227-229 / 2021
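  • The growing number of elderly people and people with disabilities has increased the use of assistive walking devices. Physical assistive devices exist, but AI-based services and AI datasets for assistive walking devices are lacking. To address this gap, this paper designs and builds such a dataset: an image crawling program was implemented in Node JS to collect image data, and Yolo Maker was used to convert the collected images into a dataset. This approach makes it easy to design and build the data needed to construct AI services for the elderly and people with disabilities.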
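
The abstract above does not describe the label format that Yolo Maker writes. As a rough illustration only, the sketch below assumes the commonly used YOLO text format, where each object is one line of `class x_center y_center width height` with coordinates normalized by the image size. The function name `to_yolo_line` and the example box are hypothetical and do not come from the paper.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box to one YOLO-style label line.

    box is (x_min, y_min, x_max, y_max) in pixels and image_size is
    (width, height). The output format is an assumption about what a
    YOLO labeling tool produces, not something stated in the paper.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive size")
    # Normalize the box center and size to the range [0, 1].
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: one object of class 0 in a 400x500 image.
print(to_yolo_line(0, (100, 50, 300, 250), (400, 500)))
```

Normalized coordinates keep the labels valid if the crawled images are later resized.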

Identifying Key Grammatical Errors of Japanese English as a Foreign Language Learners in a Learner Corpus: Toward Focused Grammar Instruction with Data-Driven Learning

  • Atsushi Mizumoto;Yoichi Watari
    • Asia Pacific Journal of Corpus Research / v.4 no.1 / pp.25-42 / 2023
  • The number of studies on data-driven learning (DDL) has increased in recent years, and DDL's overall effectiveness as an L2 (second language) teaching methodology has been reported to be high. However, the degree of its effectiveness in grammar instruction, particularly for the goal of correcting errors in L2 writing, is still unclear. To provide guidelines for focused grammar instruction with DDL in the Japanese classroom setting, we aimed to identify the typical grammatical errors made by Japanese learners in the Cambridge Learner Corpus First Certificate in English (CLC FCE) dataset. The results revealed that three error types (nouns, articles, and prepositions) should be addressed in DDL grammar instruction for Japanese English as a foreign language (EFL) learners. In light of the findings, pedagogical implications and suggestions for future DDL research and practice are discussed.

Dataset Construction of Taekwondo Beginner AI (태권도 초심자를 위한 AI의 DataSet 구축)

  • Cho, Kyu Cheol;Kim, Ju Yeon
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.249-252 / 2022
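  • The World Taekwondo Federation has as many member nations as FIFA, and taekwondo continues to spread worldwide. The way taekwondo is taught, however, has not changed: the head or instructor of a dojang (training hall) must watch each posture, judge it by eye, and give guidance in person. This study was conducted to develop a more varied and engaging way to learn taekwondo as technology advances. In this paper, a subject model is filmed, images are extracted, human joint keypoints are labeled in the images, and a COCO-format dataset is created from these labels. Training a machine on this dataset could produce an educational taekwondo AI for beginners. After training, the AI could be applied in real teaching settings and used directly in the curriculum, and the authors expect it to be useful in other ways as well, such as the development of educational games.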
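
To make the COCO keypoint layout concrete, here is a minimal sketch of how labeled joints could be packed into one COCO-style annotation. It assumes the standard 17-joint COCO person skeleton and a tight bounding box around the labeled joints; the paper does not state which joints or box rule it used, and the helper names `make_keypoint_annotation` and `make_dataset` are hypothetical.

```python
import json

# The 17 joint names of the COCO person-keypoint task (assumed here;
# the paper does not list its joints).
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def make_keypoint_annotation(ann_id, image_id, joints):
    """Build one COCO-style annotation from a list of (x, y, v) triples.

    v follows the COCO convention: 0 = not labeled, 1 = labeled but
    not visible, 2 = labeled and visible.
    """
    if len(joints) != len(COCO_KEYPOINT_NAMES):
        raise ValueError("expected one (x, y, v) triple per joint")
    labeled = [(x, y) for x, y, v in joints if v > 0]
    if not labeled:
        raise ValueError("at least one joint must be labeled")
    xs = [x for x, _ in labeled]
    ys = [y for _, y in labeled]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    flat = []
    for x, y, v in joints:
        flat.extend([x, y, v])  # COCO stores keypoints as a flat list
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,
        "keypoints": flat,
        "num_keypoints": len(labeled),
        # Assumption: the box is the tight box around the labeled joints.
        "bbox": [min(xs), min(ys), width, height],
        "area": width * height,
        "iscrowd": 0,
    }


def make_dataset(images, annotations):
    """Wrap images and annotations in the top-level COCO structure."""
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{
            "id": 1,
            "name": "person",
            "keypoints": COCO_KEYPOINT_NAMES,
            "skeleton": [],  # joint connections omitted in this sketch
        }],
    }


# Hypothetical frame with only two joints labeled.
joints = [(10, 20, 2), (30, 60, 1)] + [(0, 0, 0)] * 15
dataset = make_dataset(
    [{"id": 1, "file_name": "frame_0001.jpg", "width": 640, "height": 480}],
    [make_keypoint_annotation(1, 1, joints)],
)
print(json.dumps(dataset["annotations"][0]["bbox"]))
```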

A Study on the Dataset Construction Needed to Realize a Digital Human in Fitness with Single Image Recognition (단일 이미지 인식으로 피트니스 분야 디지털 휴먼 구현에 필요한 데이터셋 구축에 관한 연구)

  • Soo-Hyuong Kang;Sung-Geon Park;Kwang-Young Park
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.642-643 / 2023
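  • This study proposes improving the performance of AI services in the fitness field by improving dataset quality rather than by developing new AI models, and aims to evaluate the performance achieved through data quality. Ten experts from the relevant fields took part in the data design, and Google's MediaPipe model was used to classify exercise motions automatically from single-view video. Recognition accuracy for the push-up motion was 100%, but repetitions were not counted when the elbow angle was 15° or less, and this result disagreed with the opinion of fitness experts. Future work needs a system that not only classifies motions but also links and analyzes the amount of exercise.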
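
The abstract does not say how the elbow angle or the repetition count was computed. The sketch below shows one common approach under stated assumptions: the angle at the elbow is taken from three 2D keypoints (shoulder, elbow, wrist), and a repetition is counted each time the angle drops below one threshold and then rises above another. The thresholds `down_below` and `up_above` and both function names are hypothetical, and no pose-estimation library is called here.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, formed by points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("points must not coincide with the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def count_reps(angles, down_below=90.0, up_above=160.0):
    """Count down-then-up cycles in a sequence of elbow angles.

    The two thresholds are illustrative assumptions, not values from
    the paper.
    """
    reps = 0
    is_down = False
    for angle in angles:
        if angle < down_below:
            is_down = True
        elif angle > up_above and is_down:
            reps += 1
            is_down = False
    return reps


# Hypothetical per-frame elbow angles for two push-ups.
print(count_reps([170, 120, 80, 120, 170, 85, 165]))
```

Using two thresholds instead of one avoids counting small jitters around a single cut-off as extra repetitions.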

Dataset construction and Automatic classification of Department information appearing in Domestic journals (국내 학술지 출현 학과정보 데이터셋 구축 및 자동분류)

  • Byungkyu Kim;Beom-Jong You;Hyoung-Seop Shim
    • Proceedings of the Korean Society of Computer Information Conference / 2023.01a / pp.343-344 / 2023
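  • Department information is very useful in bibliometric analysis of science and technology literature. In this paper, we extracted the department information of university-affiliated authors appearing in domestic journal articles indexed in the Korea Science Citation Index database, and built a department information dataset through data cleaning and department-type classification. Using this dataset as training and validation data, we implemented a deep learning-based automatic classification model; in the performance evaluation, the model reached an accuracy of 98.6% on Korean department information and 97.6% on English department information. The automatic department classifier is expected to be useful in future work such as analyzing intellectual relationships across science and technology fields and classifying articles by subject.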

Reliable Fault Diagnosis Method Based on An Optimized Deep Belief Network for Gearbox

  • Oybek Eraliev;Ozodbek Xakimov;Chul-Hee Lee
    • Journal of Drive and Control / v.20 no.4 / pp.54-63 / 2023
  • High and intermittent loading cycles induce fatigue damage to transmission components, resulting in premature gearbox failure. To identify gearbox defects, numerous vibration-based diagnostics techniques, using several artificial intelligence (AI) algorithms, have recently been presented. In this paper, an optimized deep belief network (DBN) model for gearbox problem diagnosis was designed based on time-frequency visual pattern identification. To optimize the hyperparameters of the model, a particle swarm optimization (PSO) approach was integrated into the DBN. The proposed model was tested on two gearbox datasets: a wind turbine gearbox and an experimental gearbox. The optimized DBN model demonstrated strong and robust performance in classification accuracy. In addition, the accuracy of the generated datasets was compared using traditional ML and DL algorithms. Furthermore, the proposed model was evaluated on different partitions of the dataset. The results showed that, even with a small amount of sample data, the optimized DBN model achieved high accuracy in diagnosis.

Axial load prediction in double-skinned profiled steel composite walls using machine learning

  • G., Muthumari G;P. Vincent
    • Computers and Concrete / v.33 no.6 / pp.739-754 / 2024
  • This study presents an innovative AI-driven approach to assess the ultimate axial load in Double-Skinned Profiled Steel sheet Composite Walls (DPSCWs). Utilizing a dataset of 80 entries, seven input parameters were employed, and various AI techniques, including Linear Regression, Polynomial Regression, Support Vector Regression, Decision Tree Regression, Decision Tree with AdaBoost Regression, Random Forest Regression, Gradient Boost Regression Tree, Elastic Net Regression, Ridge Regression, and LASSO Regression, were evaluated. Decision Tree Regression and Random Forest Regression emerged as the most accurate models. The top three performing models were integrated into a hybrid approach, excelling in accurately estimating DPSCWs' ultimate axial load. This adaptable hybrid model outperforms traditional methods, reducing errors in complex scenarios. The validated Artificial Neural Network (ANN) model showcases less than 1% error, enhancing reliability. Correlation analysis highlights robust predictions, emphasizing the importance of steel sheet thickness. The study contributes insights for predicting DPSCW strength in civil engineering, suggesting optimization and database expansion. The research advances precise load capacity estimation, empowering engineers to enhance construction safety and explore further machine learning applications in structural engineering.

A Digital Thesaurus of the Traditional Common Culture of the Greater Mekong Subregion

  • Suwannee Hoaihongthong;Kanyarat Kwiecien
    • Journal of Information Science Theory and Practice / v.12 no.3 / pp.63-74 / 2024
  • This study aimed to develop a digital thesaurus dedicated to cataloging the traditional common culture of the Greater Mekong Subregion. The process followed a meticulous seven-step methodology, including scoping, vocabulary collection, knowledge structure analysis, relationship delineation, related word adjustments, list validation, and evaluation. Leveraging principles from knowledge organization, thesaurus construction, and digital platform development, the TemaTres web application emerged as the primary tool for constructing this thesaurus. The study's results showed that 2,042 principal words related to the traditional common culture of the Greater Mekong Subregion were compiled and classified into terms for each of the seven deep levels. Each term was accompanied by essential metadata, including broader and narrower terms, related terms, cross-references, and scope notes. This rich dataset empowered semantic search capabilities across diverse applications and web services, providing access to knowledge pertaining to the traditional common culture of the Greater Mekong Subregion and contributing to a deeper understanding of this cultural domain.

Developing an Ensemble Classifier for Bankruptcy Prediction (부도 예측을 위한 앙상블 분류기 개발)

  • Min, Sung-Hwan
    • Journal of Korea Society of Industrial Information Systems / v.17 no.7 / pp.139-148 / 2012
  • An ensemble of classifiers employs a set of individually trained classifiers and combines their predictions. In most cases, ensembles have been found to produce more accurate predictions than their base classifiers. Combining outputs from multiple classifiers, known as ensemble learning, is one of the standard and most important techniques for improving classification accuracy in machine learning. An ensemble is effective only if the individual classifiers make decisions that are as diverse as possible. Bagging is the most popular ensemble learning method for generating a diverse set of classifiers. Diversity in bagging comes from using different training sets: the training data subsets are drawn randomly with replacement from the entire training dataset. The random subspace method is an ensemble construction technique that uses different attribute subsets. As in bagging, the training dataset is modified, but the modification is performed in the feature space. Bagging and random subspace are both well-known and popular ensemble algorithms. However, few studies have dealt with integrating bagging and random subspace using SVM classifiers, although there is great potential for useful applications in this area. This paper proposes methods for improving SVM performance using a hybrid ensemble strategy for bankruptcy prediction, and applies the proposed ensemble model to the bankruptcy prediction problem using a real dataset from Korean companies.
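
The two sampling schemes described above can be sketched as follows. This is a minimal illustration of one plausible way to combine them (a bootstrap sample of rows plus a random subset of features for each ensemble member), not the paper's exact procedure. Training the SVM base classifiers is left out, and the names `bootstrap_rows`, `random_subspace`, `hybrid_training_sets`, and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def bootstrap_rows(n_rows, rng):
    """Bagging: draw n_rows row indices at random, with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_subspace(n_features, k, rng):
    """Random subspace: pick k distinct feature indices."""
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    return sorted(rng.sample(range(n_features), k))


def hybrid_training_sets(X, y, n_members, k, seed=0):
    """Build one (features, X_sub, y_sub) training set per ensemble member.

    Each member sees a bootstrap sample of the rows restricted to a
    random subset of k features.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        rows = bootstrap_rows(len(X), rng)
        cols = random_subspace(len(X[0]), k, rng)
        X_sub = [[X[r][c] for c in cols] for r in rows]
        y_sub = [y[r] for r in rows]
        members.append((cols, X_sub, y_sub))
    return members


def majority_vote(predictions):
    """Combine the members' class predictions by simple majority."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: 4 firms, 3 financial ratios, label 1 = bankrupt.
X = [[0.1, 1.2, 3.0], [0.4, 0.9, 2.1], [0.8, 0.3, 1.5], [0.2, 1.1, 2.8]]
y = [0, 0, 1, 0]
for cols, X_sub, y_sub in hybrid_training_sets(X, y, n_members=3, k=2):
    print(cols, y_sub)
```

Each member's feature indices must be kept so that the same columns are selected when that member classifies a new sample.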

Evaluation of SWAT Prediction Error according to Accuracy of Land Cover Map (토지피복도 정확도에 따른 SWAT 예측 오류 평가)

  • Heo, Sunggu;Kim, Kisung;Kim, Namwon;Ahn, Jaehun;Park, Sanghun;Yoo, Dongseon;Choi, JoongDae;Lim, Kyoungjae
    • Journal of Korean Society on Water Environment / v.24 no.6 / pp.690-700 / 2008
  • Users of the Soil and Water Assessment Tool (SWAT) model tend to use readily available input datasets, such as the Ministry of Environment (MOE) land cover data, ignoring temporal and spatial changes in land cover. The SWAT model was calibrated and validated with this land cover data. The EI values were 0.79 and 0.85 for streamflow calibration and validation, respectively, and 0.79 and 0.86 for sediment calibration and validation, respectively. With a newly prepared land cover dataset for the Doam-dam watershed, the SWAT model better predicted hydrologic and sediment behaviors. The number of HRUs with the new land cover data increased by 70.2% compared with the MOE land cover, indicating better representation of small agricultural field boundaries. The annual average sediment yield estimated by SWAT for the Doam-dam watershed was 61.8 ton/ha/year with the MOE land cover data and 36.2 ton/ha/year with the new land cover data (a 70.7% difference). The largest difference in estimated sediment yield, 548.0%, occurred in subwatershed #2. Land cover for the study watershed should therefore be validated carefully to obtain accurate hydrologic and sediment simulation with the SWAT model.
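
The abstract does not give the formula behind the 70.7% figure. The short check below assumes the difference is expressed relative to the estimate obtained with the new land cover data, which reproduces the reported number from the two sediment yields; the function name `percent_difference` is hypothetical.

```python
def percent_difference(value, baseline):
    """Difference between value and baseline as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100


# Sediment yields from the abstract, in ton/ha/year.
moe_yield = 61.8   # estimated with the MOE land cover data
new_yield = 36.2   # estimated with the new land cover data

# Assumption: the new land cover estimate is the baseline.
print(f"{percent_difference(moe_yield, new_yield):.1f}%")
```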