• Title/Summary/Keyword: machine data


Machine Layout Decision Algorithm for Cell Formation Problem Using Self-Organizing Map (자기조직화 신경망을 이용한 셀 형성 문제의 기계 배치순서 결정 알고리듬)

  • Jeon, Yong-Deok
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.42 no.2
    • /
    • pp.94-103
    • /
    • 2019
  • The Self-Organizing Map (SOM) is a neural network that classifies patterns effectively by extracting characteristics of the input data to form a feature map. In this study, we propose an algorithm that uses the SOM to determine both the cell formation and the machine layout within each cell for the cell formation problem with operation sequences. In the proposed algorithm, the output layer of the SOM is a one-dimensional structure, and the SOM is applied to the parts and the machines in two steps. Initial cells are formed by grouping the resulting clusters mainly according to the utilization of the machines within the cell. At this stage, the machine cells are formed. The next step is to create a flow matrix over all machines that records the frequency of consecutive forward movements between machines. The machine layout order in each machine cell is determined from this flow matrix so that it reflects the machine operation sequences as closely as possible. The final step is to jointly optimize the machines and parts to increase machine layout efficiency. As a result, the final cells are formed and the machine layout within each cell is determined. The proposed algorithm was tested on well-known cell formation problems with operation sequences from previous papers, and it performed better than the other algorithms.
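The flow-matrix step described in this abstract can be illustrated with a minimal sketch. It assumes each part's operation sequence is given as a list of machine identifiers and simply counts how often one machine is immediately followed by another; the function name `flow_matrix` and the sample data are hypothetical, and the paper's exact definition of "forward movement" may differ.

```python
def flow_matrix(sequences, machines):
    """Count consecutive machine-to-machine moves over all part routings.

    sequences: list of operation sequences, each a list of machine IDs
    machines:  ordered list of all machine IDs (defines row/column order)
    Returns a square list-of-lists where entry [i][j] is the number of
    times machines[i] is immediately followed by machines[j].
    """
    index = {m: i for i, m in enumerate(machines)}
    n = len(machines)
    flow = [[0] * n for _ in range(n)]
    for seq in sequences:
        # Pair each operation with the one that directly follows it.
        for src, dst in zip(seq, seq[1:]):
            flow[index[src]][index[dst]] += 1
    return flow


# Hypothetical routings for three parts over three machines.
machines = ["M1", "M2", "M3"]
routings = [["M1", "M2", "M3"], ["M1", "M3"], ["M2", "M3"]]
flow = flow_matrix(routings, machines)
```

A layout order within a cell could then favor placing machine pairs with large flow counts next to each other, which is the intuition the abstract describes.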

Performance Characteristics of an Ensemble Machine Learning Model for Turbidity Prediction With Improved Data Imbalance (데이터 불균형 개선에 따른 탁도 예측 앙상블 머신러닝 모형의 성능 특성)

  • HyunSeok Yang;Jungsu Park
    • Ecology and Resilient Infrastructure
    • /
    • v.10 no.4
    • /
    • pp.107-115
    • /
    • 2023
  • High turbidity in source water can have adverse effects on water treatment plant operations and aquatic ecosystems, necessitating turbidity management. Consequently, research aimed at predicting river turbidity continues. This study developed a multi-class classification model for prediction of turbidity using LightGBM (Light Gradient Boosting Machine), a representative ensemble machine learning algorithm. The model utilized data that was classified into four classes ranging from 1 to 4 based on turbidity, from low to high. The number of input data points used for analysis varied among classes, with 945, 763, 95, and 25 data points for classes 1 to 4, respectively. The developed model exhibited precisions of 0.85, 0.71, 0.26, and 0.30, as well as recalls of 0.82, 0.76, 0.19, and 0.60 for classes 1 to 4, respectively. The model tended to perform less effectively in the minority classes due to the limited data available for these classes. To address data imbalance, the SMOTE (Synthetic Minority Over-sampling Technique) algorithm was applied, resulting in improved model performance. For classes 1 to 4, the Precision and Recall of the improved model were 0.88, 0.71, 0.26, 0.25 and 0.79, 0.76, 0.38, 0.60, respectively. This demonstrated that alleviating data imbalance led to a significant enhancement in Recall of the model. Furthermore, to analyze the impact of differences in input data composition addressing the input data imbalance, input data was constructed with various ratios for each class, and the model performances were compared. The results indicate that an appropriate composition ratio for model input data improves the performance of the machine learning model.
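The core idea of SMOTE mentioned in this abstract is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The following is a simplified sketch of that interpolation step only, written with the standard library; the function name `smote_like`, its parameters, and the sample points are assumptions for illustration and do not reproduce the study's setup or any library's implementation.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from minority-class samples.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbor = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbor)))
    return synthetic


# Hypothetical two-feature minority samples.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=5)
```

Because every synthetic point is a convex combination of two real minority samples, the oversampled class stays inside the region the original samples span.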

KISTI-ML Platform: A Community-based Rapid AI Model Development Tool for Scientific Data (KISTI-ML 플랫폼: 과학기술 데이터를 위한 커뮤니티 기반 AI 모델 개발 도구)

  • Lee, Jeongcheol;Ahn, Sunil
    • Journal of Internet Computing and Services
    • /
    • v.20 no.6
    • /
    • pp.73-84
    • /
    • 2019
  • Machine learning as a service, the so-called MLaaS, has recently attracted much attention in almost all industries and research groups. The main reason is that, apart from the data itself, you do not need network servers, storage, or even data scientists to build a productive service model. However, machine learning is often very difficult for most developers, especially in traditional science, because well-structured big data is lacking for scientific data. For experimental or applied researchers, the results of an experiment are rarely shared with other researchers, so creating big data in specific research areas is also a big challenge. In this paper, we introduce the KISTI-ML platform, a community-based rapid AI model development tool for scientific data. It provides a user-friendly online development environment in which machine learning beginners can use their own data to generate code automatically. Users can share datasets and their Jupyter interactive notebooks among authorized community members, including know-how such as data preprocessing to extract features, hidden network design, and other engineering techniques.

A Study on Training Data Selection Method for EEG Emotion Analysis using Semi-supervised Learning Algorithm (준 지도학습 알고리즘을 이용한 뇌파 감정 분석을 위한 학습데이터 선택 방법에 관한 연구)

  • Yun, Jong-Seob;Kim, Jin Heon
    • Journal of IKEEE
    • /
    • v.22 no.3
    • /
    • pp.816-821
    • /
    • 2018
  • Recently, machine learning algorithms based on artificial neural networks have become widely used as classifiers in EEG research for emotion analysis and disease diagnosis. When a machine learning model is used to classify EEG data, classification performance may deteriorate on data from another group if the training data consists only of data with similar characteristics. In this paper, we propose a method that addresses this problem by constructing the training data set from several groups of data selected with a semi-supervised learning algorithm. We then compared the performance of two models: one trained on a data set consisting only of data with similar characteristics, and one trained on the data set constructed using the proposed method.

Conceptual Data Modeling: Entity-Relationship Models as Thinging Machines

  • Al-Fedaghi, Sabah
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.9
    • /
    • pp.247-260
    • /
    • 2021
  • Data modeling is a process of developing a model to design and develop a data system that supports an organization's various business processes. A conceptual data model represents a technology-independent specification of structure of data to be stored within a database. The model aims to provide richer expressiveness and incorporate a set of semantics to (a) support the design, control, and integrity parts of the data stored in data management structures and (b) coordinate the viewing of connections and ideas on a database. The described structure of the data is often represented in an entity–relationship (ER) model, which was one of the first data-modeling techniques and is likely to continue to be a popular way of characterizing entity classes, attributes, and relationships. This paper attempts to examine the basic ER modeling notions in order to analyze the concepts to which they refer as well as ways to represent them. In such a mission, we apply a new modeling methodology (thinging machine; TM) to ER in terms of its fundamental building constructs, representation entities, relationships, and attributes. The goal of this venture is to further the understanding of data models and enrich their semantics. Three specific contributions to modeling in this context are incorporated: (a) using the TM model's five generic actions to inject processing in the ER structure; (b) relating the single ontological element of TM modeling (i.e., a thing/machine or thimac) to ER entities and relationships; and (c) proposing a high-level integrated, extended ER model that includes structural and time-oriented notions (e.g., events or behavior).

Development of Online Machine Learning Model for AHU Supply Air Temperature Prediction using Progressive Sampling and Normalized Mutual Information (점진적 샘플링과 정규 상호정보량을 이용한 온라인 기계학습 공조기 급기온도 예측 모델 개발)

  • Chu, Han-Gyeong;Shin, Han-Sol;Ahn, Ki-Uhn;Ra, Seon-Jung;Park, Cheol Soo
    • Journal of the Architectural Institute of Korea Structure & Construction
    • /
    • v.34 no.6
    • /
    • pp.63-69
    • /
    • 2018
  • A machine learning model can capture the dynamics of building systems with fewer inputs than a first-principles-based simulation model. The training data for developing a machine learning model are usually selected in a heuristic manner. In this study, the authors developed a machine learning model that describes the supply air temperature from an AHU in a real office building. For rational reduction of the training data, the progressive sampling method was used. It was found that even though progressive sampling requires far less training data (n=60) than offline regular sampling (n=1,799), the MBEs of both models are similar (2.6% vs. 5.4%). In addition, the normalized mutual information (NMI) was applied to decide when to update the machine learning model: if the NMI between the simulation output and the measured data is less than 0.2, the model has to be updated. With the NMI-based update, the model predicts better ($5.4\% \rightarrow 1.3\%$).
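The NMI-based update rule in this abstract can be sketched as follows. The abstract does not say how the NMI was computed, so this example makes its own assumptions: both series are discretized into equal-width bins, and the mutual information is normalized by the geometric mean of the two entropies. The function names and the bin count are hypothetical; only the 0.2 threshold comes from the abstract.

```python
import math
from collections import Counter


def discretize(values, bins=5):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(x, y, bins=5):
    """Normalized mutual information, I(X;Y) / sqrt(H(X) * H(Y))."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    hx, hy = entropy(dx), entropy(dy)
    mutual_info = hx + hy - entropy(list(zip(dx, dy)))
    denom = math.sqrt(hx * hy)
    return mutual_info / denom if denom > 0 else 0.0


def needs_update(simulated, measured, threshold=0.2):
    """Flag the model for retraining when NMI falls below the threshold."""
    return nmi(simulated, measured) < threshold


# Hypothetical series: a perfect match, and a constant (uninformative) output.
measured = [float(i) for i in range(20)]
print(needs_update(measured, measured))     # identical series share all information
print(needs_update([3.0] * 20, measured))   # constant output carries none
```

Other normalizations of mutual information exist (for example dividing by the arithmetic mean or the maximum of the entropies), so the absolute values depend on that choice and on the binning.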

Development of ensemble machine learning model considering the characteristics of input variables and the interpretation of model performance using explainable artificial intelligence (수질자료의 특성을 고려한 앙상블 머신러닝 모형 구축 및 설명가능한 인공지능을 이용한 모형결과 해석에 대한 연구)

  • Park, Jungsu
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.36 no.4
    • /
    • pp.239-248
    • /
    • 2022
  • The prediction of algal bloom is an important field of study in algal bloom management, and chlorophyll-a concentration (Chl-a) is commonly used to represent the status of algal bloom. In recent years, advanced machine learning algorithms have been increasingly used for the prediction of algal bloom. In this study, XGBoost (XGB), an ensemble machine learning algorithm, was used to develop a model to predict Chl-a in a reservoir. Daily observations of water quality data and climate data were used for training and testing the model. In the first step of the study, the input variables were clustered into two groups (low and high value groups) based on the observed values of water temperature (TEMP), total organic carbon concentration (TOC), total nitrogen concentration (TN), and total phosphorus concentration (TP). For each of the four water quality items, two XGB models were developed using only the data in each clustered group (Model 1). The results were compared to the predictions of an XGB model developed using the entire data set before clustering (Model 2). The model performance was evaluated using three indices, including the root mean squared error-observation standard deviation ratio (RSR). The model performance was improved using Model 1 for TEMP, TN, and TP, as the RSR of each model was 0.503, 0.477, and 0.493, respectively, while the RSR of Model 2 was 0.521. On the other hand, Model 2 showed better performance than Model 1 for TOC, where the RSR was 0.532. Explainable artificial intelligence (XAI) is an ongoing field of research in machine learning. Shapley value analysis, a novel XAI algorithm, was also used for the quantitative interpretation of the performance of the XGB model developed in this study.
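The RSR index named in this abstract is commonly defined as the root mean squared error divided by the standard deviation of the observations. The sketch below follows that common definition; the abstract itself does not spell out the formula, and the function name `rsr` and the sample values are hypothetical.

```python
import math


def rsr(observed, predicted):
    """RMSE-observation standard deviation ratio.

    Computed as sqrt(sum((obs - pred)^2)) / sqrt(sum((obs - mean_obs)^2)).
    A value of 0 means a perfect fit; a value of 1 means the model is no
    better than always predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(sq_err / sq_dev)


# Hypothetical Chl-a observations with two candidate predictions.
obs = [1.0, 2.0, 3.0]
perfect = rsr(obs, [1.0, 2.0, 3.0])      # exact predictions
mean_only = rsr(obs, [2.0, 2.0, 2.0])    # always predicts the mean
```

Under this definition, lower values are better, which matches the abstract's reading that 0.503 for Model 1 improves on 0.521 for Model 2.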

Performance Assessment of an Access Point for Human Data and Machine Data

  • Lee, Hoon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.6
    • /
    • pp.1081-1090
    • /
    • 2015
  • This work proposes a theoretic framework for the performance assessment of an access point in the IP network that accommodates MD (Machine Data) and HD (Human Data). First, we investigate typical resource allocation methods in LTE for MD and HD. After that we carry out a Max-Min analysis about the surplus and deficiency of network resource seen from MD and HD. Finally, we evaluate the performance via numerical experiment.

Study on the Application Methods of Big Data at a Corporation -Cases of A and Y corporation Big Data System Projects- (기업의 빅데이터 적용방안 연구 -A사, Y사 빅데이터 시스템 적용 사례-)

  • Lee, Jae Sung;Hong, Sung Chan
    • Journal of Internet Computing and Services
    • /
    • v.15 no.1
    • /
    • pp.103-112
    • /
    • 2014
  • In recent years, the rapid diffusion of smart devices and the growth of internet usage and social media have led to the constant production of huge amounts of valuable data, including personal information, buying patterns, location information, and more. IT and production infrastructure has also started to produce its own data with the vitalization of M2M (Machine-to-Machine) and IoT (Internet of Things). This study examines the effects of applying structured and unstructured big data in various business circumstances, and aims to identify how a corporation can create value, through case studies of structured and unstructured big data. The results show that corporations looking for an optimized big data utilization plan could maximize the value they create by utilizing the unstructured and structured big data generated inside and outside the corporation.

Extraction of the control data for the shoe laster by using tension spline method and verification of the geometric grading system (Tension spline 방법을 이용한 제화용 라스팅기의 제어데이터 추출 및 기하할출제도의 검증)

  • Jang, Kwang-Keol;Kim, Seung-Ho;Huh, Hoon
    • Proceedings of the KSME Conference
    • /
    • 2001.06c
    • /
    • pp.140-145
    • /
    • 2001
  • Lasting machines for shoe manufacturing are continuously being developed with the aid of automation and Computer Aided Manufacturing (CAM). Adaptive lasting machines and CAD data of a shoe last are inevitably introduced for a labor-free manufacturing process. Recently, a method for converting a shoe last into CAD data using a finite element mesh system was suggested. Initial setup data and control data of machine parts are required for the adaptive lasting machine. For an efficient process, grading of those data is essential to minimize data storage and production costs. In this paper, bonding lines are extracted from the CAD data of a shoe last and graded by the geometric grading system. The tension spline method is adopted for the interpolation of the last CAD data. The results are compared with the results from the arithmetic grading system that is widely adopted in shoemaking companies.
