• Title/Abstract/Keyword: Process Data

Search results: 23,765 items (processing time: 0.048 seconds)

A Container Orchestration System for Process Workloads

  • Jong-Sub Lee;Seok-Jae Moon
    • International Journal of Internet, Broadcasting and Communication / Vol. 15, No. 4 / pp.270-278 / 2023
  • We propose a container orchestration system for process workloads that combines the potential of big data and machine learning technologies to integrate enterprise process-centric workloads. The proposed system analyzes big data generated from industrial automation to identify hidden patterns and build machine learning prediction models. For each machine learning case, training data is loaded into a data store and preprocessed for model training. An appropriate model is then selected, trained on the training data, and evaluated on held-out test data; this step, called model construction, can be performed in a deployment framework. Additionally, a visual hierarchy is constructed to display prediction results and facilitate big data analysis. For evaluation and analysis, the cluster needed for parallel computation of PCA was built by creating multiple virtual machines in a big data cluster. The proposed system is modeled as layers of individual components that can be connected together, so components can be added, replaced, or reused without affecting the rest of the system.
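
A minimal scikit-learn sketch of the model-construction step described above (preprocess, reduce dimensionality with PCA, fit a model, evaluate on test data); the synthetic data and the model choice are illustrative assumptions, and the paper's cluster-based parallel PCA is not reproduced here:

```python
# Sketch of the model-construction step: preprocessing, PCA, training, evaluation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))              # stand-in for preprocessed process data
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # stand-in for labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = make_pipeline(
    StandardScaler(),                        # preprocessing
    PCA(n_components=10),                    # dimensionality reduction with PCA
    RandomForestClassifier(random_state=0),  # illustrative model choice
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```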

Principles of Multivariate Data Visualization

  • Huh, Moon Yul;Cha, Woon Ock
    • Communications for Statistical Applications and Methods / Vol. 11, No. 3 / pp.465-474 / 2004
  • Data visualization is an automated discovery process applied to data sets in an effort to uncover the underlying information in the data. It provides rich visual depictions of the data and has distinct advantages over traditional data analysis techniques, such as exploring the structure of data sets that are large both in the number of observations and in the number of variables, by allowing a high degree of interaction between the data and the end user. We discuss the principles of data visualization and evaluate the characteristics of various visualization tools according to these principles.
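
As one small illustration of multivariate visual depiction, a scatterplot matrix shows structure across observations and variables at once; this sketch assumes pandas, matplotlib, and the Iris data as a stand-in, and it is not one of the tools evaluated in the paper:

```python
# Scatterplot matrix: one panel per pair of variables, colored by class.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame

pd.plotting.scatter_matrix(df.iloc[:, :4], c=df["target"], figsize=(8, 8), diagonal="hist")
plt.show()
```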

Character Recognition Algorithm using Accumulation Mask

  • Yoo, Suk Won
    • International Journal of Advanced Culture Technology / Vol. 6, No. 2 / pp.123-128 / 2018
  • Learning data is composed of 100 characters in 10 different fonts, and test data is composed of 10 characters in a new font that is not used in the learning data. To account for the variety of the learning data across the different fonts, 10 learning masks are constructed by accumulating the pixel values of the same character in the 10 different fonts. This process eliminates minute differences between characters in different fonts. After finding the maximum value of each learning mask, the test data is expanded by multiplying it by these maximum values. The algorithm then calculates the sum of differences between corresponding pixel values of the expanded test data and each learning mask. The learning mask with the smallest of these 10 sums is selected as the recognition result for the test data. The proposed algorithm can recognize various types of fonts, and the learning data can be extended easily by adding a new font. The recognition process is also easy to understand, and the algorithm produces satisfactory character recognition results.
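
A minimal NumPy sketch of the accumulation-mask procedure described above; the array shapes (e.g. 28x28 binary character images) are illustrative assumptions:

```python
# Accumulation-mask character recognition sketch.
import numpy as np

def build_masks(learning_data):
    """learning_data: array of shape (10 characters, 10 fonts, H, W).
    Each mask accumulates the pixel values of one character over all fonts."""
    return learning_data.sum(axis=1)              # shape (10, H, W)

def recognize(test_image, masks):
    """Expand the test image by each mask's maximum value and pick the mask
    with the smallest sum of pixel-wise differences."""
    scores = []
    for mask in masks:
        expanded = test_image * mask.max()        # scale test data to the mask's range
        scores.append(np.abs(expanded - mask).sum())
    return int(np.argmin(scores))                 # index of the recognized character

# Usage with random stand-in data (10 characters x 10 fonts x 28x28 pixels):
learning_data = np.random.randint(0, 2, size=(10, 10, 28, 28))
masks = build_masks(learning_data)
test_image = learning_data[3, 0]                  # pretend this is a new-font sample
print("recognized character index:", recognize(test_image, masks))
```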

6 시그마 위한 대용량 공정데이터 분석에 관한 연구 (A Study on Analysis of Superlarge Manufacturing Process Data for Six Sigma)

  • 박재홍;변재현
    • 한국경영과학회:학술대회논문집
    • /
    • 한국경영과학회 2001년도 추계학술대회 논문집
    • /
    • pp.411-415
    • /
    • 2001
  • Advances in computer and sensor technology have made it possible to obtain superlarge manufacturing process data in real time, enabling us to extract meaningful information from these superlarge data sets. We propose a systematic data analysis procedure that field engineers can easily apply to manufacture quality products. The procedure consists of a data cleaning stage and a data analysis stage. The data cleaning stage constructs a database suitable for statistical analysis from the original superlarge manufacturing process data. In the data analysis stage, we suggest a graphical, easy-to-implement approach to extract practical information from the cleaned database. This study will help manufacturing companies achieve six sigma quality.
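
A hedged pandas sketch of the two-stage procedure (data cleaning, then a simple graphical analysis); the file and column names are illustrative assumptions, not the paper's actual process variables:

```python
# Two-stage sketch: clean raw process data, then summarize it graphically.
import pandas as pd
import matplotlib.pyplot as plt

# Data cleaning stage: build an analysis-ready table from raw process data.
raw = pd.read_csv("process_log.csv")              # hypothetical raw process data file
clean = (
    raw.drop_duplicates()
       .dropna(subset=["thickness"])              # drop records missing the quality characteristic
       .query("(thickness > 0) and (thickness < 100)")  # remove physically impossible readings
)

# Data analysis stage: simple graphical summaries for field engineers.
clean.groupby("line")["thickness"].mean().plot(kind="bar", title="Mean thickness by line")
plt.show()
clean.boxplot(column="thickness", by="shift")     # spread of the quality characteristic by shift
plt.show()
```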


금융데이터의 성능 비교를 통한 연합학습 기법의 효용성 분석 (Utility Analysis of Federated Learning Techniques through Comparison of Financial Data Performance)

  • 장진혁;안윤수;최대선
    • 정보보호학회논문지 / Vol. 32, No. 2 / pp.405-416 / 2022
  • AI technology improves the quality of life through data-driven machine learning. When machine learning is used, collecting distributed data in one place carries a risk of privacy violations, so the data go through a de-identification process. De-identified data suffer from damaged or missing information, which degrades the performance of the machine learning process and complicates preprocessing. In 2017, Google announced federated learning, a method of training without de-identifying the data or gathering it on a single server. In this paper, using real financial data, we analyze the utility of federated learning by comparing its performance with that of learning on data de-identified through k-anonymity, differential privacy, and synthetic data generation, and we show the superiority of federated learning. In our experiments, learning on the original data achieved 91% accuracy; learning on k-anonymized data achieved 79% accuracy for k=2, 76% for k=5, and 62% for k=7; learning on differentially private data achieved 52% accuracy for ε=2, 50% for ε=1, and 36% for ε=0.1; synthetic data achieved 82%; and federated learning achieved 86%, the second-highest performance.
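
Federated learning trains a shared model by exchanging model updates rather than raw data. A minimal FedAvg-style sketch in NumPy, using synthetic stand-in data rather than the paper's financial data set, might look like this:

```python
# Federated averaging sketch: clients train locally, the server averages weights.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local logistic-regression updates on its own data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)      # gradient step on local data only
    return w

rng = np.random.default_rng(0)
clients = []
for _ in range(3):                            # e.g. three institutions holding separate data
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    clients.append((X, y))

global_w = np.zeros(5)
for _ in range(20):                           # communication rounds
    local_weights = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_weights, axis=0) # server averages model updates, not data

acc = np.mean([(((X @ global_w) > 0) == y).mean() for X, y in clients])
print("average client accuracy:", round(acc, 3))
```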

산업유형별 데이터융합과 데이터처리 모델의 설계 (Design of Data Fusion and Data Processing Model According to Industrial Types)

  • 정민승;진선아;조우현
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 6, No. 2 / pp.67-76 / 2017
  • In industrial sites across various fields, large volumes of interrelated data are generated in complex ways. Although a variety of data can be collected from the processes of each industry type, they have not been integrated in a way that relates the processes to one another. In existing industry-specific data, operators entered arbitrary values for the molding condition table settings whenever a problem occurred in the work process. In this paper, we design a fusion and analysis processing model for the data collected for each industry type. Through a prediction case study (automotive connectors), master data based on the standard molding condition table is compared with production history files collected during the manufacturing process in order to analyze and reinterpret patterns of defect factors and exceptions and to quantify the arbitrary values entered by operators. This makes it possible to design data analysis and verification models suited to manufacturing processes and to improve corporate profits through a new molding condition table that reduces defect rates, increases productivity, improves processes, and cuts costs. In addition, the optimization, consistency, and objectivity of the manufacturing process can be secured through the analyzed and verified standard settings, and optimization (standard setting) technology suited to each industry type can be supported through the various pattern types.
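
A hedged pandas sketch of the comparison step described above (master data from a standard molding condition table versus a production history file); the file and column names are illustrative assumptions, not the paper's actual schema:

```python
# Compare master condition data with production history and flag deviations.
import pandas as pd

master = pd.read_csv("standard_condition_table.csv")   # hypothetical: setting, standard_value, tolerance
history = pd.read_csv("production_history.csv")        # hypothetical: lot, setting, actual_value, defect

merged = history.merge(master, on="setting", how="left")
merged["deviation"] = (merged["actual_value"] - merged["standard_value"]).abs()
merged["out_of_spec"] = merged["deviation"] > merged["tolerance"]

# Pattern analysis: which settings are most often out of specification on defective lots?
pattern = (
    merged[merged["defect"] == 1]
    .groupby("setting")["out_of_spec"]
    .mean()
    .sort_values(ascending=False)
)
print(pattern.head())
```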

Recurrent Neural Network Modeling of Etch Tool Data: a Preliminary for Fault Inference via Bayesian Networks

  • Nawaz, Javeria;Arshad, Muhammad Zeeshan;Park, Jin-Su;Shin, Sung-Won;Hong, Sang-Jeen
    • 한국진공학회:학술대회논문집 / 한국진공학회 2012년도 제42회 동계 정기 학술대회 초록집 / pp.239-240 / 2012
  • With advancements in semiconductor device technologies, manufacturing processes are getting more complex, and it has become more difficult to maintain tight process control. As the number of processing steps for fabricating complex chip structures has increased, potential fault-inducing factors are prevalent and their allowable margins are continuously reduced. Therefore, one of the keys to success in semiconductor manufacturing is highly accurate and fast fault detection and classification at each stage, so as to reduce any undesired variation and identify the cause of the fault. Sensors in the equipment are used to monitor the state of the process. The idea is that whenever there is a fault in the process, it appears as some variation in the output of one of the sensors monitoring the process. These sensors may report information such as pressure, RF power, or gas flow in the equipment. By relating the data from these sensors to the process condition, any abnormality in the process can be identified, although this identification still carries some degree of uncertainty. Our approach in this research is to capture the features of equipment condition data from a library of healthy processes. The healthy data can be used as a reference for upcoming processes, which is made possible by mathematically modeling the acquired data. In this work we demonstrate the use of a recurrent neural network (RNN), a dynamic neural network whose output is a function of previous inputs. In our case we have etch equipment tool-set data consisting of 22 parameters and 9 runs. This data was first synchronized using the Dynamic Time Warping (DTW) algorithm. The synchronized sensor data, in the form of time series, is then provided to the RNN, which trains and restructures itself according to the input and predicts a value one step ahead in time that depends on the past values of the data. Eight runs of process data were used to train the network, and one run was used as a test input to check the network's performance. Next, a mean squared error based probability generating function was used to assign a fault probability to each parameter by comparing the predicted and actual values of the data. In the future we will make use of Bayesian networks to classify the detected faults. Bayesian networks use directed acyclic graphs that relate different parameters through their conditional dependencies in order to perform inference among them. The relationships between parameters obtained from the data will be used to generate the structure of the Bayesian network, and the posterior probabilities of different faults will then be calculated using inference algorithms.
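
A minimal Keras sketch of one-step-ahead prediction on multichannel sensor data, in the spirit of the RNN modeling described above; the DTW synchronization and the MSE-based fault-probability step are not reproduced, and the data here are random stand-ins:

```python
# One-step-ahead prediction of tool sensor data with a simple RNN.
import numpy as np
from tensorflow import keras

n_params, window = 22, 10                     # 22 tool parameters, 10-step input window
runs = np.random.rand(9, 200, n_params)       # stand-in for 9 synchronized runs

def make_windows(run):
    X = np.stack([run[i:i + window] for i in range(len(run) - window)])
    y = run[window:]                          # value one step ahead of each window
    return X, y

X_train, y_train = map(np.concatenate, zip(*[make_windows(r) for r in runs[:8]]))
X_test, y_test = make_windows(runs[8])        # hold out one run as a test input

model = keras.Sequential([
    keras.layers.SimpleRNN(32, input_shape=(window, n_params)),
    keras.layers.Dense(n_params),             # predict all parameters one step ahead
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=5, verbose=0)

errors = (model.predict(X_test) - y_test) ** 2   # per-parameter squared prediction error
print("mean squared error per parameter:", errors.mean(axis=0).round(4))
```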


R기반 데이터 분석 프레임워크를 이용한 코팅제 배합 분석 기술 (An Analysis Techniques for Coatings Mixing using the R Data Analysis Framework)

  • 노성여;김민정;김영진
    • 한국멀티미디어학회논문지 / Vol. 18, No. 6 / pp.734-741 / 2015
  • Coating is a type of paint. It protects a product by forming a film layer on its surface and gives the product various properties. Coating is one of the fields being studied actively in the polymer industry, and its importance across various industries continues to grow. However, the mixing process has been performed based on operators' experience. In this paper, we identify the relationships among the data from the coating formulation process and propose a framework to analyze that process, which can improve it. In particular, the suggested framework may reduce quality degradation and loss costs caused by the absence of standard data that provide accurate formulation criteria. It also suggests responses to errors that may occur in the future, based on analysis of the error data generated in the mixing step.
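
The paper builds its framework in R; as a generic illustration (in Python, consistent with the other sketches here) of the kind of relationship and error analysis described, with file and column names that are assumptions only:

```python
# Relationship and error analysis sketch for coating-mixing records.
import pandas as pd

mix = pd.read_csv("coating_mix_log.csv")       # hypothetical mixing-process records
quality = ["hardness", "adhesion"]             # hypothetical quality characteristics
ingredients = ["resin_ratio", "solvent_ratio", "additive_ratio", "mix_time"]

# Relationship between formulation variables and quality characteristics.
print(mix[ingredients + quality].corr().loc[ingredients, quality])

# Errors recorded during the mixing step, grouped to suggest likely responses.
errors = mix[mix["error_code"].notna()]
print(errors.groupby("error_code")[ingredients].mean())
```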

A Decision Tree Approach for Identifying Defective Products in the Manufacturing Process

  • Choi, Sungsu;Battulga, Lkhagvadorj;Nasridinov, Aziz;Yoo, Kwan-Hee
    • International Journal of Contents / Vol. 13, No. 2 / pp.57-65 / 2017
  • Recently, owing to the significance of Industry 4.0, the manufacturing industry has been developing globally. The manufacturing industry conventionally generates a large volume of data that is often related to processes, lines, and products. In this paper, we analyzed the causes of defective products in the manufacturing process using the decision tree technique, a well-known data mining technique. We used data collected from the domestic manufacturing industry, including Manufacturing Execution System (MES) data, Point of Production (POP) data, equipment data accumulated directly in the equipment, in-process/external air-conditioning sensor data, and static electricity data. We propose a model built with the C4.5 decision tree algorithm. Specifically, the proposed decision tree model is built from the components of a specific part. We propose to identify the state of products where a defect occurred and compare it with the generated decision tree model to determine the cause of the defect.
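
A minimal scikit-learn sketch of a decision-tree model for defect causes; note that the paper uses C4.5, whereas scikit-learn implements the closely related CART algorithm, and the file and feature names below are illustrative assumptions:

```python
# Decision-tree defect analysis sketch (CART as a stand-in for C4.5).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical merged record per product: MES/POP fields, equipment readings,
# air-conditioning sensors, static electricity, and a defect label.
df = pd.read_csv("merged_manufacturing_data.csv")      # hypothetical file
features = ["temperature", "humidity", "static_electricity", "line_speed"]
X, y = df[features], df["defect"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=features))        # inspect the rules behind each defect path
```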

자기상관자료를 갖는 공정을 위한 다변량 관리도 (Multivariate Control Chart for Autocorrelated Process)

  • 남국현;장영순;배도선
    • 대한산업공학회지
    • /
    • 제27권3호
    • /
    • pp.289-296
    • /
    • 2001
  • This paper proposes a multivariate control chart for autocorrelated data, which are common in chemical and process industries and lead to an increased number of false alarms when conventional control charts are applied. The effect of autocorrelation is modeled as a vector autoregressive process, and canonical analysis is used to reduce the dimensionality of the data set and find the canonical variables that explain as much of the data variation as possible. Charting statistics are constructed from the residual vectors of the canonical variables, which are uncorrelated over time, so the control charts for these statistics can attenuate the autocorrelation in the process data. The charting procedures are illustrated with a numerical example, and a Monte Carlo simulation is conducted to investigate the performance of the proposed control charts.
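
A hedged sketch of a residual-based multivariate chart for autocorrelated data: fit a vector autoregressive (VAR) model and chart Hotelling's T² on the residuals (statsmodels/NumPy). The canonical-analysis step of the paper is not reproduced, and the data and control limit are illustrative only:

```python
# Residual-based multivariate monitoring of an autocorrelated process.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
# Stand-in autocorrelated process data: 3 variables, 300 time points.
data = np.zeros((300, 3))
for t in range(1, 300):
    data[t] = 0.7 * data[t - 1] + rng.normal(size=3)

model = VAR(data).fit(maxlags=1)
resid = model.resid                              # residuals are approximately uncorrelated over time

cov_inv = np.linalg.inv(np.cov(resid, rowvar=False))
center = resid.mean(axis=0)
t2 = np.einsum("ij,jk,ik->i", resid - center, cov_inv, resid - center)  # Hotelling's T^2 per time point

ucl = np.percentile(t2, 99)                      # simple empirical control limit for illustration
print("out-of-control points:", np.sum(t2 > ucl))
```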
