• Title/Summary/Keyword: 심볼릭 표현법

Search Result 4, Processing Time 0.015 seconds

Application of Symbolic Representation Method for Fault Detection and Clustering in Semiconductor Fabrication Processes (반도체공정 이상탐지 및 클러스터링을 위한 심볼릭 표현법의 적용)

  • Loh, Woong-Kee;Hong, Sang-Jeen
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.11
    • /
    • pp.806-818
    • /
    • 2009
  • Since the invention of the integrated circuit (IC) in 1950s, semiconductor technology has undergone dramatic development up to these days. A complete semiconductor is manufactured through a diversity of processes. For better semiconductor productivity, fault detection and classification (FDC) has been rigorously studied for finding faults even before the processes are completed. For FDC, various kinds of sensors are attached in many semiconductor manufacturing devices, and sensor values are collected in a periodic manner. The collection of sensor values consists of sequences of real numbers, and hence is regarded as a kind of time-series data. In this paper, we propose an algorithm for detecting and clustering faults in semiconductor processes. The proposed algorithm is a modification of the existing anomaly detection algorithm dealing with symbolically-represented time-series. The contributions of this paper are: (1) showing that a modification of the existing anomaly detection algorithm dealing with general time-series could be used for semiconductor process data and (2) presenting experimental results for improving correctness of fault detection and clustering. As a result of our experiment, the proposed algorithm caused neither false positive nor false negative.

On principal component analysis for interval-valued data (구간형 자료의 주성분 분석에 관한 연구)

  • Choi, Soojin;Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.1
    • /
    • pp.61-74
    • /
    • 2020
  • Interval-valued data, one type of symbolic data, are observed in the form of intervals rather than single values. Each interval-valued observation has an internal variation. Principal component analysis reduces the dimension of data by maximizing the variance of data. Therefore, the principal component analysis of the interval-valued data should account for the variance between observations as well as the variation within the observed intervals. In this paper, three principal component analysis methods for interval-valued data are summarized. In addition, a new method using a truncated normal distribution has been proposed instead of a uniform distribution in the conventional quantile method, because we believe think there is more information near the center point of the interval. Each method is compared using simulations and the relevant data set from the OECD. In the case of the quantile method, we draw a scatter plot of the principal component, and then identify the position and distribution of the quantiles by the arrow line representation method.

Study on the Performance Evaluation of Encoding and Decoding Schemes in Vector Symbolic Architectures (벡터 심볼릭 구조의 부호화 및 복호화 성능 평가에 관한 연구)

  • Youngseok Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.17 no.4
    • /
    • pp.229-235
    • /
    • 2024
  • Recent years have seen active research on methods for efficiently processing and interpreting large volumes of data in the fields of artificial intelligence and machine learning. One of these data processing technologies, Vector Symbolic Architecture (VSA), offers an innovative approach to representing complex symbols and data using high-dimensional vectors. VSA has garnered particular attention in various applications such as natural language processing, image recognition, and robotics. This study quantitatively evaluates the characteristics and performance of VSA methodologies by applying five VSA methodologies to the MNIST dataset and measuring key performance indicators such as encoding speed, decoding speed, memory usage, and recovery accuracy across different vector lengths. BSC and VT demonstrated relatively fast performance in encoding and decoding speeds, while MAP and HRR were relatively slow. In terms of memory usage, BSC was the most efficient, whereas MAP used the most memory. The recovery accuracy was highest for MAP and lowest for BSC. The results of this study provide a basis for selecting appropriate VSA methodologies depending on the application area.

A Study on Gene Search Using Test for Interval Data (구간형 데이터 검정법을 이용한 유전자 탐색에 관한 연구)

  • Lee, Seong-Keon
    • Journal of the Korean Data Analysis Society
    • /
    • v.20 no.6
    • /
    • pp.2805-2812
    • /
    • 2018
  • The methylation score, expressed as a percentage of the methylation status data derived from the iterative sequencing process, has a value between 0 and 1. It is contrary to the assumption of normal distribution that simply applying the t-test to examine the difference in population-specific methylation scores in these data. In addition, since the result may vary depending on the number of repetitions of sequencing in the process of methylation score generation, a method that can analyze such errors is also necessary. In this paper, we introduce the symbolic data analysis and the interval K-S test method which convert observation data into interval data including uncertainty rather than one numerical data. In addition, it is possible to analyze the characteristics of methylation score by using Beta distribution without using normal distribution in the process of converting into interval data. For the data analysis, the nature of the proposed method was examined using sequencing data of actual patients and normal persons. While the t-test is only possible for the location test, it is found that the interval type K-S statistic can be used to test not only the location parameter but also the heterogeneity of the distribution function.