• Title/Summary/Keyword: Machine data analysis

Search results: 2,207 items (processing time: 0.03 seconds)

A Machine Learning-Based Vocational Training Dropout Prediction Model Considering Structured and Unstructured Data

  • 하만석;안현철
    • 한국콘텐츠학회논문지 / Vol. 19, No. 1 / pp.1-15 / 2019
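  • One of the greatest difficulties faced in vocational training is trainee dropout. A large number of students drop out of each training course, which wastes national budget and hinders efforts to improve the youth employment rate. Unlike previous studies, which mainly analyzed the causes of dropout, this study proposes a machine learning-based model that predicts dropout in advance using various kinds of trainee information. In particular, the proposed model considers not only structured data about trainees but also unstructured data, namely instructors' counseling journals, in order to improve prediction accuracy. The unstructured data were analyzed using Word2vec and a convolutional neural network, text analysis techniques that have recently attracted attention. When the proposed model was applied to real data from a vocational training institute in Korea, prediction accuracy improved by up to 20% when unstructured data were considered together with structured data, compared with predicting dropout from structured data alone. In addition, when structured and unstructured data were combined and analyzed with a Support Vector Machine, the model achieved a high prediction accuracy in the upper 90% range on the validation dataset.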

Performance Assessment of an Access Point for Human Data and Machine Data

  • 이훈
    • 한국통신학회논문지 / Vol. 40, No. 6 / pp.1081-1090 / 2015
  • This work proposes a theoretical framework for the performance assessment of an access point in an IP network that accommodates MD (Machine Data) and HD (Human Data). First, we investigate typical resource allocation methods in LTE for MD and HD. Next, we carry out a Max-Min analysis of the surplus and deficiency of network resources as seen from MD and HD. Finally, we evaluate the performance via numerical experiments.

Machine Learning Data Analysis for Tool Wear Prediction in Core Multi Process Machining

  • 최수진;이동주;황승국
    • 한국기계가공학회지 / Vol. 20, No. 9 / pp.90-96 / 2021
  • As real-time factory data can be collected using various sensors, the adoption of intelligent unmanned processing systems is spreading through the establishment of smart factories. In intelligent unmanned processing systems, data are collected in real time using sensors, and the equipment is controlled by predicting future situations from the collected data. In particular, a technology for predicting tool wear and determining the exact timing of tool replacement is needed to prevent defective or unprocessed products caused by tool breakage or tool wear. Directly measuring tool wear in real time is difficult during the cutting process in milling. Therefore, tool wear should be predicted indirectly by analyzing the cutting load of the main spindle, current, vibration, noise, etc. In this study, data from the current and acceleration sensors; displacement data along the X, Y, and Z axes; tool wear values; and shape change data observed using Newroview were collected from the high-speed, two-edge, flat-end mill machining process of SKD11 steel. The support vector machine technique (a machine learning technique) was applied to predict the amount of tool wear using these data. Additionally, the prediction accuracies of all kernels were compared.

Development of a Wearable Inertial Sensor-based Gait Analysis Device Using Machine Learning Algorithms -Validity of the Temporal Gait Parameter in Healthy Young Adults-

  • Seol, Pyong-Wha;Yoo, Heung-Jong;Choi, Yoon-Chul;Shin, Min-Yong;Choo, Kwang-Jae;Kim, Kyoung-Shin;Baek, Seung-Yoon;Lee, Yong-Woo;Song, Chang-Ho
    • PNF and Movement / Vol. 18, No. 2 / pp.287-296 / 2020
  • Purpose: The study aims were to develop a wearable inertial sensor-based gait analysis device that uses machine learning algorithms, and to validate this novel device using temporal gait parameters. Methods: Thirty-four healthy young participants (22 male, 12 female, aged 25.76 years) with no musculoskeletal disorders were asked to walk at three different speeds. As they walked, data were simultaneously collected by a motion capture system and inertial measurement units (Reseed®). The data were sent to a machine learning algorithm adapted to the wearable inertial sensor-based gait analysis device. The validity of the newly developed instrument was assessed by comparing it to data from the motion capture system. Results: At normal speeds, intra-class correlation coefficients (ICC) for the temporal gait parameters were excellent (ICC [2, 1], 0.99~0.99), and coefficient of variation (CV) error values were insignificant for all gait parameters (0.31~1.08%). At slow speeds, ICCs for the temporal gait parameters were excellent (ICC [2, 1], 0.98~0.99), and CV error values were very small for all gait parameters (0.33~1.24%). At the fastest speeds, ICCs for temporal gait parameters were excellent (ICC [2, 1], 0.86~0.99) but less impressive than for the other speeds. CV error values were small for all gait parameters (0.17~5.58%). Conclusion: These results confirm that both the wearable inertial sensor-based gait analysis device and the machine learning algorithms have strong concurrent validity for temporal variables. On that basis, this novel wearable device is likely to prove useful for establishing temporal gait parameters while assessing gait.

A Comparative Analysis of the Pre-Processing in the Kaggle Titanic Competition

  • Tai-Sung, Hur;Suyoung, Bang
    • 한국컴퓨터정보학회논문지 / Vol. 28, No. 3 / pp.17-24 / 2023
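  • This study examines how data pre-processing methods and model construction affect prediction accuracy and score, based on 'Titanic - Machine Learning from Disaster', a representative competition on Kaggle, a platform that presents data science tasks for participants to solve. Excluding solutions that used duplicate models or ensemble techniques, we selected seven solutions that earned high scores and ranked near the top, and compared and analyzed their characteristics. We found that most of the pre-processing approaches had unique and distinctive characteristics, and that in some cases scores differed by model type even though the pre-processing was almost identical. We expect this comparative analysis to be of considerable help to Kaggle participants and data science beginners by clarifying the characteristics and analysis flow of the pre-processing used by top-scoring participants.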

Study on Correlation-based Feature Selection in an Automatic Quality Inspection System using Support Vector Machine (SVM)

  • 송동환;오영광;김남훈
    • 대한산업공학회지 / Vol. 42, No. 6 / pp.370-376 / 2016
  • Manufacturing data analysis and its applications are gaining great popularity in various industries. Despite the fast advancement of big data analysis technology, however, the manufacturing quality data monitored by automated inspection systems are sometimes not reliable enough because of the complex patterns of product quality. In this study, we therefore aim to define the level of trust of an automated quality inspection system and to improve the reliability of the quality inspection data. Using correlation analysis and feature selection, this paper presents a method for improving inspection accuracy and efficiency in an SVM-based automatic product quality inspection system that uses thermal image data in an auto part manufacturing case. The proposed method is implemented in the sealer dispensing process of automobile manufacturing and verified by analyzing the optimal feature selection from the quality analysis results.

Underwater Acoustic Research Trends with Machine Learning: Passive SONAR Applications

  • Yang, Haesang;Lee, Keunhwa;Choo, Youngmin;Kim, Kookhyun
    • 한국해양공학회지 / Vol. 34, No. 3 / pp.227-236 / 2020
  • Underwater acoustics, which is the domain that addresses phenomena related to the generation, propagation, and reception of sound waves in water, has been applied mainly in the research on the use of sound navigation and ranging (SONAR) systems for underwater communication, target detection, investigation of marine resources and environment mapping, and measurement and analysis of sound sources in water. The main objective of remote sensing based on underwater acoustics is to indirectly acquire information on underwater targets of interest using acoustic data. Meanwhile, highly advanced data-driven machine-learning techniques are being used in various ways in the processes of acquiring information from acoustic data. The related theoretical background is introduced in the first part of this paper (Yang et al., 2020). This paper reviews machine-learning applications in passive SONAR signal-processing tasks including target detection/identification and localization.

A Residual Power Estimation Scheme Using Machine Learning in Wireless Sensor Networks

  • 배시규
    • 한국멀티미디어학회논문지 / Vol. 24, No. 1 / pp.67-74 / 2021
  • As IoT (Internet of Things) devices such as smart sensors have constrained power sources, a power strategy is critical in WSNs (Wireless Sensor Networks). It is therefore necessary to determine the residual power of each sensor node in order to manage power strategies in a WSN; this, however, requires additional data transmission, leading to more power consumption. In this paper, a residual power estimation method is proposed that consumes a negligibly small amount of power in resource-constrained wireless networks, including WSNs. In this proposal, residual power can be predicted with minimal data transmission by using a machine learning method with some training data. The performance of the proposed scheme was evaluated by a machine learning method, simulation, and analysis.

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 2 / pp.475-486 / 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis that pools multiple data sets increases statistical power, improves the reliability of the results, and reduces the specific bias of an individual study. However, integrating several data sets from different studies requires dealing with many problems. In this study, I limited the focus to cross platform classification, in which the platform of a testing sample differs from the platform of the training set, and suggested a simple classification method based on rank. This method is compared with diagonal linear discriminant analysis, the k nearest neighbor method, and the support vector machine using real cross platform example data sets for two cancers.
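
The abstract above does not spell out its rank-based method, so the following is only an illustrative sketch of the general idea, not the author's algorithm. It assumes that replacing each sample's expression values with their within-sample ranks makes samples from differently scaled platforms comparable. The function names, the nearest-centroid rule, and the "tumor"/"normal" labels are all hypothetical.

```python
def to_ranks(values):
    """Return 1-based ranks of values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def nearest_centroid_by_rank(train, labels, sample):
    """Assign sample to the class whose mean rank profile is closest.

    Hypothetical classifier: squared Euclidean distance in rank space.
    """
    sums, counts = {}, {}
    for profile, label in zip(train, labels):
        ranked = to_ranks(profile)
        acc = sums.setdefault(label, [0.0] * len(ranked))
        for gene, rank in enumerate(ranked):
            acc[gene] += rank
        counts[label] = counts.get(label, 0) + 1
    target = to_ranks(sample)

    def distance(label):
        return sum((s / counts[label] - t) ** 2
                   for s, t in zip(sums[label], target))

    return min(sums, key=distance)


# Made-up data: training samples on one intensity scale, test sample on another.
train = [[100, 200, 300, 400], [120, 210, 290, 500],
         [400, 300, 200, 100], [390, 310, 150, 90]]
labels = ["tumor", "tumor", "normal", "normal"]
prediction = nearest_centroid_by_rank(train, labels, [0.1, 0.2, 0.5, 0.9])
```

Because only the ordering of genes within each sample is used, the test sample's much smaller numeric scale does not affect which class it is assigned to.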

Taxation Analysis Using Machine Learning

  • 최동빈;조인수;박용범
    • 반도체디스플레이기술학회지 / Vol. 18, No. 2 / pp.73-77 / 2019
  • Data mining techniques can also be used to increase productivity in the tax sector, which requires professional skills. As tax-related work was computerized, large amounts of data accumulated, creating a good environment for data mining. In this paper, we developed a system that can assist tax accountants, who already have professional expertise, by applying data mining techniques to accumulated tax-related data. The data mining technique used is random forest, which was improved using the F1-score. The implemented system was trained on data accumulated over two years and showed high prediction accuracy.