Search | Korea Science

A Hybrid Clustering Technique for Processing Large Data (대용량 데이터 처리를 위한 하이브리드형 클러스터링 기법)

Kim, Man-Sun;Lee, Sang-Yong
- The KIPS Transactions:PartB, v.10B no.1, pp.33-40, 2003
Data mining plays an important role in the knowledge discovery process, and a data mining algorithm can be selected to suit a specific purpose. Most traditional hierarchical clustering methods are suited to small data sets, so they have difficulty handling large data sets because of limited resources and insufficient efficiency. In this study we propose a hybrid neural network clustering technique called PPC (Pre-Post Clustering) that can be applied to large data sets and can find unknown patterns. PPC combines an artificial intelligence method, the self-organizing map (SOM), with a statistical method, hierarchical clustering, and clusters data in two stages. In the pre-clustering stage, PPC digests a large data set using SOM. In the post-clustering stage, PPC measures similarity values from cohesive distances, which capture the internal features of a cluster, and adjacent distances, which capture the external distances between clusters. Finally, PPC clusters the large data set using these similarity values. Experiments with UCI repository data showed that PPC achieved better cohesion values than the other clustering techniques.
https://doi.org/10.3745/KIPSTB.2003.10B.1.033
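
The two-stage idea in the abstract above (SOM pre-clustering, then hierarchical post-clustering of the SOM prototypes) can be sketched roughly as follows. This is a minimal illustration and not the authors' PPC: the function names, the 1-D map, the decay schedules, and the centroid-distance merge rule are all assumptions, and the merge rule is only a stand-in for the paper's similarity measure based on cohesive and adjacent distances, which the abstract does not define.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_som(data, n_units, epochs=50, seed=0):
    """Pre-clustering: fit a 1-D SOM and return its prototype vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac                          # assumed linear decay
        radius = max(1.0, (n_units / 2) * frac)  # assumed neighbourhood width
        for x in rng.sample(data, len(data)):
            bmu = min(range(n_units), key=lambda i: dist(units[i], x))
            for i in range(n_units):
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                units[i] = [u + lr * h * (xi - u) for u, xi in zip(units[i], x)]
    return units

def merge_prototypes(units, k):
    """Post-clustering: merge the closest prototype groups until k remain.
    Centroid distance is a stand-in for the paper's similarity measure."""
    groups = [[i] for i in range(len(units))]

    def centroid(group):
        dims = range(len(units[0]))
        return [sum(units[i][d] for i in group) / len(group) for d in dims]

    while len(groups) > k:
        pairs = [(dist(centroid(groups[a]), centroid(groups[b])), a, b)
                 for a in range(len(groups)) for b in range(a + 1, len(groups))]
        _, a, b = min(pairs)
        groups[a] += groups.pop(b)   # b > a, so index a stays valid
    return groups

def assign(data, units, groups):
    """Label each point with the cluster of its best-matching prototype."""
    unit_to_cluster = {u: c for c, g in enumerate(groups) for u in g}
    return [unit_to_cluster[min(range(len(units)), key=lambda i: dist(units[i], x))]
            for x in data]

# Hypothetical toy data: two well-separated groups of points.
blob_a = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
blob_b = [(10 + 0.1 * i, 10 + 0.1 * j) for i in range(3) for j in range(3)]
data = blob_a + blob_b
units = train_som(data, n_units=4)
labels = assign(data, units, merge_prototypes(units, k=2))
```

The point of the split is that the expensive pairwise merging runs over a handful of prototypes rather than over every record, which is what makes the hierarchical step affordable on large data sets.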

Sparse Distributed Memory with Monotonic Decision Function (단조 결정 함수를 갖는 축약 분산 기억 장치)

Gwon, Hui-Yong;Jang, Jeong-U;Im, Seong-Jun;Jo, Dong-Seop;Hwang, Hui-Yung
- The KIPS Transactions:PartB, v.8B no.1, pp.105-113, 2001
Sparse Distributed Memory (SDM) has recently been proposed as a practical neural network model because of its adaptive problem-solving ability and its ease of hardware implementation. However, whereas each neuron of a multilayer perceptron bisects the solution space with a linear or nonlinear decision function, and these neurons combine in varied ways to give general problem-solving ability, each SDM neuron splits the solution space into the inside and the outside of a fixed-radius region centered on itself, and the neurons are simply summed. SDM therefore becomes an inefficient model when the solution space has a magnitude ordering, as in a real-valued space. This paper identifies this characteristic of SDM and its cause, and proposes a modified SDM model that, when the solution space of a problem is divided in two by a monotonically increasing or decreasing decision function, solves the problem efficiently by introducing a magnitude comparison step into the conventional SDM. An example applying the proposed model to call admission control in ATM networks is also presented.
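
For readers unfamiliar with the conventional SDM that the abstract above contrasts against, the fixed-radius activation and simple summation can be sketched as below. This illustrates only a basic binary SDM, not the modified model with magnitude comparison that the paper proposes; the class name, parameters, and sizes are assumptions chosen for the example.

```python
import random

class SparseDistributedMemory:
    """Minimal binary SDM: random hard locations, Hamming-radius activation."""

    def __init__(self, n_locations, dim, radius, seed=0):
        rng = random.Random(seed)
        self.addresses = [[rng.randint(0, 1) for _ in range(dim)]
                          for _ in range(n_locations)]
        self.counters = [[0] * dim for _ in range(n_locations)]
        self.radius = radius

    def _active(self, address):
        # A location fires only if the address falls inside its radius.
        return [i for i, loc in enumerate(self.addresses)
                if sum(a != b for a, b in zip(loc, address)) <= self.radius]

    def write(self, address, data):
        # Add +1/-1 per bit to the counters of every active location.
        for i in self._active(address):
            for j, bit in enumerate(data):
                self.counters[i][j] += 1 if bit else -1

    def read(self, address):
        # Sum the counters of active locations and threshold at zero.
        sums = [0] * len(self.counters[0])
        for i in self._active(address):
            for j, count in enumerate(self.counters[i]):
                sums[j] += count
        return [1 if s > 0 else 0 for s in sums]

# Hypothetical usage: store one 16-bit pattern at its own address.
memory = SparseDistributedMemory(n_locations=200, dim=16, radius=6)
pattern = [1, 0] * 8
memory.write(pattern, pattern)
recalled = memory.read(pattern)
```

The `_active` test depends only on Hamming distance, so a location has no notion of whether an address is larger or smaller than its own, which is the limitation the abstract attributes to SDM in ordered solution spaces.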

Prediction of high turbidity in rivers using LSTM algorithm (LSTM 모형을 이용한 하천 고탁수 발생 예측 연구)

Park, Jungsu;Lee, Hyunho
- Journal of Korean Society of Water and Wastewater, v.34 no.1, pp.35-43, 2020
Turbidity has various effects on the water quality and ecosystem of a river. High turbidity during floods increases the operating cost of a drinking water supply system. Thus, the management of turbidity is essential for providing safe water to the public. There have been various efforts to estimate turbidity in river systems for proper management and early warning of high turbidity in the water supply process. Advanced data analysis using machine learning has been increasingly used in water quality management. Artificial neural networks (ANNs) were among the first algorithms applied, but overfitting of a model to observed data and vanishing gradients in the backpropagation process limit the wide application of ANNs in practice. In recent years, deep learning, which overcomes these limitations of ANNs, has been applied in water quality management. LSTM (Long Short-Term Memory) is a deep learning algorithm that is widely used in the analysis of time series data. In this study, LSTM is used to predict high turbidity (>30 NTU) in a river from the relationship between turbidity and discharge, which enables early warning of high turbidity in a drinking water supply system. The model showed 0.98, 0.99, 0.98 and 0.99 for precision, recall, F1-score and accuracy, respectively, for the prediction of high turbidity in a river with data at 2-hour intervals. The sensitivity of the model to the observation interval is also compared for intervals of 2 hours, 8 hours, 1 day and 2 days. The model shows higher precision with shorter observation intervals, which underscores the importance of collecting high-frequency data for better management of water resources in the future.
https://doi.org/10.11001/jksww.2020.34.1.035
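
The four scores reported in the abstract above follow from a binary confusion matrix once turbidity is thresholded at 30 NTU. A minimal sketch of that bookkeeping is given below; the function names and the sample readings are hypothetical and are not taken from the paper.

```python
def label_high_turbidity(values_ntu, threshold=30.0):
    """Mark each reading as high turbidity when it exceeds the threshold."""
    return [v > threshold for v in values_ntu]

def classification_metrics(y_true, y_pred):
    """Precision, recall, F1-score and accuracy for boolean labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Hypothetical observed and predicted turbidity values in NTU.
observed = [5, 40, 35, 10, 50]
predicted = [6, 45, 20, 31, 60]
scores = classification_metrics(label_high_turbidity(observed),
                                label_high_turbidity(predicted))
```

With these made-up readings there are two true positives, one false positive, one false negative and one true negative, so precision, recall and F1 all come to 2/3 and accuracy to 0.6.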

The diagnosis of Plasma Through RGB Data Using Rough Set Theory

Lim, Woo-Yup;Park, Soo-Kyong;Hong, Sang-Jeen
- Proceedings of the Korean Vacuum Society Conference, 2010.02a, pp.413-413, 2010
In semiconductor manufacturing, all equipment has various sensors to diagnose the state of processes. To increase the accuracy of diagnosis, hundreds of sensors are employed. Because the sensors produce millions of data points, diagnosing the process from them directly is unrealistic. Moreover, in some cases data recorded under the same conditions give different results. We want to extract information, such as data and knowledge, from the data. Fault detection and classification (FDC) has recently drawn attention as a way to increase yield. Certain faults and non-faults can be classified by various FDC tools. Uncertainty in semiconductor manufacturing, that is, non-faulty cases among the faulty and faulty cases among the non-faulty, has caused productivity to decrease. Given this uncertainty, rough set theory is a viable approach for extracting meaningful knowledge and making predictions. Its reduction of data sets, discovery of hidden data patterns, and generation of decision rules contrast with other approaches such as regression analysis and neural networks. In this research, an RGB sensor was used to diagnose plasma instead of optical emission spectroscopy (OES). RGB data have just three variables (red, green and blue), while OES data have thousands of variables. RGB data, however, are difficult to analyze by eye: the same value of a variable can correspond to different outcomes. In other words, RGB data include uncertainty. In this research, decision rules were generated by rough set theory, and with these rules we could find hidden data patterns in the uncertainty. Using rough set theory, the RGB sensor can diagnose changes in plasma condition with over 90% accuracy. Although we present only a preliminary research result in this paper, we will continue to develop data mining algorithms that address the uncertainty problem for semiconductor process diagnosis.
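
The kind of uncertainty described in the abstract above, where identical sensor values lead to different outcomes, is what rough set theory captures with lower and upper approximations. The sketch below shows that core computation on a tiny decision table; the discretized RGB levels, the labels, and the function name are invented for illustration and are not the authors' data or implementation.

```python
from collections import defaultdict

def approximations(rows, condition_keys, decision_key, target):
    """Return (lower, upper) approximations of the rows labelled `target`.

    Rows with identical condition values are indiscernible and form one
    equivalence class.
    """
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row[k] for k in condition_keys)].append(idx)

    lower, upper = set(), set()
    for members in classes.values():
        decisions = {rows[i][decision_key] for i in members}
        if decisions == {target}:   # class certainly belongs to target
            lower.update(members)
        if target in decisions:     # class possibly belongs to target
            upper.update(members)
    return lower, upper

# Hypothetical discretized RGB readings with a process label.
table = [
    {"r": "high", "g": "low",  "b": "low",  "state": "fault"},
    {"r": "high", "g": "low",  "b": "low",  "state": "normal"},
    {"r": "low",  "g": "high", "b": "low",  "state": "normal"},
    {"r": "high", "g": "high", "b": "high", "state": "fault"},
]
lower, upper = approximations(table, ["r", "g", "b"], "state", "fault")
```

Here rows 0 and 1 share the same RGB levels but differ in outcome, so they fall in the upper approximation only: `lower` is `{3}` and `upper` is `{0, 1, 3}`. Decision rules drawn from the lower approximation are certain, while those drawn from the boundary are only possible.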

Development and Application of Total Maximum Daily Loads Simulation System Using Nonpoint Source Pollution Model (비점원오염모델을 이용한 오염총량모의시스템의 개발 및 적용)

Kang, Moon-Seong;Park, Seung-Woo
- Journal of Korea Water Resources Association, v.36 no.1, pp.117-128, 2003
The objectives of this study are to develop the total maximum daily loads simulation system, TOLOS, which is capable of estimating annual nonpoint source pollution from small watersheds; to monitor the hydrology and water quality of the Balkan HP#6 watershed; and to validate TOLOS with the field data. TOLOS consists of three subsystems: the input data processor based on a geographic information system, the models, and the post processor. The land use pattern at the tested watershed was classified from Landsat TM data using an artificial neural network model that adopts an error backpropagation algorithm. Paddy field components were added to the SWAT model to simulate the water balance at irrigated paddy blocks. SWAT model parameters were obtained from the GIS database, and additional parameters were calibrated with field data. TOLOS was then tested under ungauged conditions. The simulated runoff agreed reasonably well with the observed data, and the simulated water quality parameters appear to be reasonably comparable to the field data.
https://doi.org/10.3741/JKWRA.2003.36.1.117

A Study on the Control of the Welding Quality Using a Infrared sensor (적외선센서를 이용한 용접품질 제어에 관한 연구)

Kim I.S.;Son S.J.;Kim I.J.;Kim H.H.;Seo J.H.
- Proceedings of the Korean Society of Precision Engineering Conference, 2005.10a, pp.754-758, 2005
Optimizing process variables such as arc current, welding voltage and welding speed for the desired weld characteristics is the key step in achieving high quality and improving performance without increasing cost. Incorrect settings of these process variables cause the welding characteristics to deviate from the desired bead geometry. Trainee welders are therefore referred to tabulated information, organized by metal type and thickness, that recommends values for the process variables. The bead geometry plays an important role in determining the mechanical properties of the weld, so it is very important to select process variables that yield optimal bead geometry. However, it is difficult for traditional identification methods to provide an accurate model because the optimized welding process is non-linear and time-dependent. This paper presents the possibilities of the infrared sensor for sensing and controlling bead geometry in the automated welding process. The infrared sensor is a well-known means of dealing with problems that have a high degree of fuzziness, so it is employed to build the relationship between the process variables and the quality characteristics described above. Based on several neural networks, mathematical models are derived from extensive experiments with different welding parameters and complex geometrical features. The developed system makes it possible to select the optimal welding parameters and control the desired weld dimensions during the arc welding process.

Design of Face Recognition Algorithm based Optimized pRBFNNs Using Three-dimensional Scanner (최적 pRBFNNs 패턴분류기 기반 3차원 스캐너를 이용한 얼굴인식 알고리즘 설계)

Ma, Chang-Min;Yoo, Sung-Hoon;Oh, Sung-Kwun
- Journal of the Korean Institute of Intelligent Systems, v.22 no.6, pp.748-753, 2012
In this paper, a face recognition algorithm is designed based on an optimized pRBFNNs pattern classifier using a three-dimensional scanner. In general, a two-dimensional image-based face recognition system extracts facial features using the gray levels of images. Environmental variations such as natural sunlight, artificial light and face pose degrade the performance of such a system. The proposed face recognition algorithm uses a three-dimensional scanner to overcome this drawback of two-dimensional face recognition systems. First, the face shape is scanned using the three-dimensional scanner, and the pose of the scanned face is converted to a frontal image through a pose compensation process. Second, data with face depth are extracted using the point signature method. Finally, the recognition performance is confirmed by using the optimized pRBFNNs for solving high-dimensional pattern recognition problems.
https://doi.org/10.5391/JKIIS.2012.22.6.748

Data Modeling using Cluster Based Fuzzy Model Tree (클러스터 기반 퍼지 모델트리를 이용한 데이터 모델링)

Lee, Dae-Jong;Park, Jin-Il;Park, Sang-Young;Jung, Nahm-Chung;Chun, Meung-Geun
- Journal of the Korean Institute of Intelligent Systems, v.16 no.5, pp.608-615, 2006
This paper proposes a fuzzy model tree consisting of local linear models built with fuzzy clustering for data modeling. First, cluster centers are calculated by a fuzzy clustering method using all input and output attributes. Linear models are then constructed at internal nodes using the fuzzy membership values between the centers and the input attributes. Whether an internal node is expanded is determined by comparing the error calculated at the parent node with that at the child nodes. As a final step, data prediction is performed with the linear model that has the highest fuzzy membership value between the input attributes and the cluster centers in the leaf nodes. To show the effectiveness of the proposed method, we applied it to various datasets. Across these experiments, the proposed method showed better performance than a conventional model tree and artificial neural networks.
https://doi.org/10.5391/JKIIS.2006.16.5.608

Social Issue Analysis Based on Sentiment of Twitter Users (트위터 사용자들의 감성을 이용한 사회적 이슈 분석)

Kim, Hannah;Jeong, Young-Seob
- Journal of Convergence for Information Technology, v.9 no.11, pp.81-91, 2019
Recently, social network services (SNS) have been actively used by the public. Among them, Twitter has many tweets that express sentiment, and its data are convenient to collect through an open Application Programming Interface (API). In this paper, we analyze social issues using the sentiment information of users and suggest the possibility of using them in marketing. We collect Twitter text about social issues and classify it as positive or negative with a sentiment classifier to provide a qualitative analysis. We provide a quantitative analysis by analyzing the correlation between the number of likes and retweets of each tweet. From the qualitative analysis, we suggest ways to attract the interest of the public or consumers. From the quantitative analysis, we conclude that positive tweets should be brief to attract users' attention on Twitter. As future work, we will continue to analyze various social issues.
https://doi.org/10.22156/CS4SMB.2019.9.11.081
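
The quantitative step mentioned in the abstract above, relating like counts to retweet counts, could be computed along the following lines. The abstract does not say which correlation measure was used, so the Pearson coefficient here is an assumption, and the counts are made up for the example.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-tweet counts.
likes = [10, 25, 3, 40, 18]
retweets = [4, 9, 1, 15, 6]
r = pearson_correlation(likes, retweets)
```

A value of `r` near 1 would mean that tweets with more likes also tend to be retweeted more; the function is undefined when either sequence is constant, because the denominator is then zero.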

Development of hybrid activation function to improve accuracy of water elevation prediction algorithm (수위예측 알고리즘 정확도 향상을 위한 Hybrid 활성화 함수 개발)

Yoo, Hyung Ju;Lee, Seung Oh
- Proceedings of the Korea Water Resources Association Conference, 2019.05a, pp.363-363, 2019
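The activation function is one of the important elements that introduce nonlinearity into the learning process of machine learning, making deep learning easier and raising prediction accuracy (Roy et al., 2019). Activation functions commonly used in machine learning include the step function, the sigmoid function, the hyperbolic tangent function, and the ReLU (Rectified Linear Unit) function, and various forms of activation function are being proposed to improve prediction accuracy. In this study, a hybrid activation function is proposed to improve the accuracy of water level prediction by machine learning. The Han River, which is affected by tides, was selected as the study area, and 10 years of hydrological data from 2009 to 2018 were used. The water level prediction algorithm used the RNN (Recurrent Neural Networks) model of TensorFlow in Python and was trained on hydrological data for rainfall, water level, tide level, dam discharge, and river flow to predict the water level 3 hours and 6 hours ahead. To improve prediction accuracy, the input data were normalized, and the optimal values of the number of hidden layers and the learning rate of the neural network model were derived through sensitivity analysis. The hybrid activation function is a mixture of the hyperbolic tangent function and the ReLU function, and accuracy was evaluated while varying the respective weights ($w_1, w_2$, with $w_1+w_2=1$). As a result, depending on the weight ratio ($w_1/w_2$), it was possible to identify the point at which the RMSE (Root Mean Square Error) of the prediction is smallest and the NSE (Nash-Sutcliffe model Efficiency coefficient) is largest, and the point at which the prediction accuracy of the peak water level is highest. This study is a foundation for improving the accuracy of water level prediction through data modeling; if further forms of activation function are proposed and accuracy improves, the prediction results are expected to support decision-making on flood forecasting.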
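
One plausible reading of the hybrid activation described in the abstract above is a weighted sum of tanh and ReLU with weights that add up to one. The sketch below assumes exactly that form, which the abstract does not spell out, and pairs it with the standard definitions of RMSE and NSE used to score a prediction; the function names are hypothetical.

```python
import math

def hybrid_activation(x, w1):
    """Assumed form: w1 * tanh(x) + w2 * ReLU(x), with w2 = 1 - w1."""
    w2 = 1.0 - w1
    return w1 * math.tanh(x) + w2 * max(0.0, x)

def rmse(observed, simulated):
    """Root mean square error between observed and simulated series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# The two extremes of the weight recover the plain functions.
relu_only = hybrid_activation(2.0, w1=0.0)
tanh_only = hybrid_activation(-1.0, w1=1.0)
```

Sweeping `w1` from 0 to 1, retraining the model at each value, and recording `rmse` and `nse` on held-out water levels would reproduce the kind of weight-ratio comparison the abstract describes, under the assumptions stated here.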

