Search | Korea Science

Data anomaly detection and Data fusion based on Incremental Principal Component Analysis in Fog Computing

Yu, Xue-Yong;Guo, Xin-Hui
- KSII Transactions on Internet and Information Systems (TIIS), v.14 no.10, pp.3989-4006, 2020
Intelligent agriculture monitoring is based on the perception and analysis of environmental data, which enables monitoring of the production environment and control of environmental regulation equipment. As the scale of the application expands, a large amount of data is generated at the perception layer and uploaded to the cloud service, which brings challenges of insufficient bandwidth and processing capacity. This paper proposes a fog-based hybrid data analysis architecture that combines offline and real-time analysis to enable real-time data processing on resource-constrained IoT devices. Furthermore, we propose a data processing algorithm based on incremental principal component analysis, which achieves data dimensionality reduction and updates the principal components. We also introduce the Squared Prediction Error (SPE) value and detect data anomalies by combining the SPE value with a data fusion algorithm. To ensure the accuracy and effectiveness of the algorithm, we design a regular-SPE hybrid model update strategy, which allows the principal components to be updated on demand when data anomalies are found. In addition, this strategy can significantly reduce the growth in resource consumption caused by the data analysis architecture. Simulations on practical datasets confirm that the proposed algorithm can perform data fusion and exception processing in real time on resource-constrained devices, and that the model update strategy reduces overall system resource consumption while preserving the accuracy of the algorithm.
https://doi.org/10.3837/tiis.2020.10.004
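The SPE idea in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of SPE as the squared norm of the residual left after projecting a centered sample onto the retained principal components; the function names, the orthonormal `components` input, and the `threshold` value are hypothetical and are not taken from the paper.

```python
def spe(sample, mean, components):
    """Squared Prediction Error of `sample` against a PCA model.

    `components` is assumed to be a list of orthonormal principal
    directions (hypothetical input format, not the paper's).
    """
    centered = [x - m for x, m in zip(sample, mean)]
    reconstruction = [0.0] * len(sample)
    for comp in components:
        # Score of the sample along this principal direction.
        score = sum(c * v for c, v in zip(comp, centered))
        for i, c in enumerate(comp):
            reconstruction[i] += score * c
    # The residual is the part of the sample the model cannot explain.
    return sum((v - r) ** 2 for v, r in zip(centered, reconstruction))


def is_anomaly(sample, mean, components, threshold):
    """Flag a sample whose SPE exceeds an assumed, user-chosen threshold."""
    return spe(sample, mean, components) > threshold
```

A sample lying in the subspace spanned by the components has an SPE of zero, so only deviations outside the learned subspace contribute to the score.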

Distributed data deduplication technique using similarity based clustering and multi-layer bloom filter (SDS 환경의 유사도 기반 클러스터링 및 다중 계층 블룸필터를 활용한 분산 중복제거 기법)

Yoon, Dabin;Kim, Deok-Hwan
- The Journal of Korean Institute of Next Generation Computing, v.14 no.5, pp.60-70, 2018
Software-defined storage (SDS) is being deployed in cloud environments to allow multiple users to virtualize physical servers, but a solution for optimizing space efficiency with limited physical resources is needed. In a conventional data deduplication system, it is difficult to deduplicate redundant data uploaded to distributed storage. In this paper, we propose a distributed deduplication method using similarity-based clustering and a multi-layer Bloom filter. A Rabin hash is applied to determine the degree of similarity between virtual machine servers and to cluster similar virtual machines, which improves deduplication efficiency compared with deduplicating individual storage nodes. In addition, a multi-layer Bloom filter is incorporated into the deduplication process to shorten processing time by reducing the number of false positives. Experimental results show that the proposed method improves the deduplication ratio by 9% compared with a deduplication method using IP-address-based clusters, without any difference in processing time.
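One way a layered Bloom filter can lower the false-positive rate is sketched below. This is an illustration under stated assumptions, not the authors' design: the class names, the per-layer salting, and the rule that an item counts as "possibly seen" only when every layer agrees are all hypothetical choices.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over byte strings (illustrative sketch)."""

    def __init__(self, size_bits, num_hashes=3, salt=b""):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.salt = salt
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(self.salt + bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class LayeredBloomFilter:
    """Assumed layering rule: report a hit only if every layer reports one."""

    def __init__(self, layer_sizes, num_hashes=3):
        # A different salt per layer keeps the layers' hash positions distinct.
        self.layers = [BloomFilter(size, num_hashes, salt=bytes([idx]))
                       for idx, size in enumerate(layer_sizes)]

    def add(self, item):
        for layer in self.layers:
            layer.add(item)

    def __contains__(self, item):
        return all(item in layer for layer in self.layers)
```

In a deduplication setting, a chunk fingerprint that misses in any layer is certainly new and can skip the expensive index lookup; only fingerprints that hit in all layers need to be verified against the real index.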

Efficient Information System Sizing Selection Using Cloud Computing Platform (클라우드 컴퓨팅 플랫폼을 이용한 효율적인 정보시스템 용량 산정 방법에 관한 연구)

Seong, Baek-min;Lee, Min-gyu;Sohn, Hyo-jung;Kim, Jong-bae
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2014.10a, pp.79-81, 2014
Recently, various information systems have been built as IT has evolved. However, when building an information system, it is difficult to determine the appropriate scale, and the decision relies heavily on SI companies and professionals. To address this problem, the Korea Information Security Agency and others set out to develop formal H/W capacity equations for each system type. However, these equations were produced through discussion among an expert group of suppliers and were developed a relatively long time ago, so they are difficult to apply formally in the current situation. In this study, we propose a capacity planning technique that can guarantee the best performance for the budget invested. To this end, we derive a proper H/W capacity equation by regression analysis on performance metrics and costs gathered from simulations of various cases in a virtual cloud environment. With this approach, capacity planning can reduce costs, because an information system can be built on quantified data in an environment that does not rely on SI businesses or professionals.
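The regression step described above can be sketched with ordinary least squares on a single predictor. The abstract does not state the form of the capacity equation, so the linear model, the function names, and the idea of predicting required capacity from a workload measure are assumptions made only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_capacity(workload, intercept, slope):
    """Evaluate the fitted (assumed linear) capacity equation."""
    return intercept + slope * workload
```

In practice the measurements would come from the cloud simulations the abstract mentions, and a real capacity equation would likely involve several predictors; the single-variable form is kept here only to show the fitting step.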

Design and Implementation of a Hadoop-based Efficient Security Log Analysis System (하둡 기반의 효율적인 보안로그 분석시스템 설계 및 구현)

Ahn, Kwang-Min;Lee, Jong-Yoon;Yang, Dong-Min;Lee, Bong-Hwan
- Journal of the Korea Institute of Information and Communication Engineering, v.19 no.8, pp.1797-1804, 2015
An integrated log management system can help predict security risks, contributes to improving an organization's security level, and supports the preparation of an appropriate security policy. In this paper, we design and implement a Hadoop-based log analysis system that uses a distributed database model, which can store large amounts of data and reduces analysis time by automating the log collection procedure. The proposed system uses HBase to store large amounts of data efficiently in a scale-out fashion and proposes a simple data storage scheme for analyzing data with Hadoop-based regular expressions, which improves data processing speed compared with the existing system.
https://doi.org/10.6109/jkiice.2015.19.8.1797
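The regular-expression-based storing scheme can be pictured with a small sketch that turns one log line into a row key and a set of column values, which is the general shape a wide-column store such as HBase expects. The log line format, the field names, and the row-key layout are all hypothetical; the abstract does not describe them, and no HBase client calls are shown.

```python
import re

# Hypothetical log format: "<date> <time> <source IP> <ACTION> <detail>"
LOG_PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<src>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<action>[A-Z]+) "
    r"(?P<detail>.*)"
)


def parse_line(line):
    """Return (row_key, columns) for a matching line, or None otherwise."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Assumed row-key layout: source address first, then timestamp,
    # so that rows for one source sort together.
    row_key = f"{fields['src']}|{fields['ts']}"
    columns = {"action": fields["action"], "detail": fields["detail"]}
    return row_key, columns
```

Lines that do not match the pattern return `None`, so a collector can count or quarantine them instead of storing malformed rows.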

Design of Trajectory Data Indexing and Query Processing for Real-Time LBS in MapReduce Environments (MapReduce 환경에서의 실시간 LBS를 위한 이동궤적 데이터 색인 및 검색 시스템 설계)

Chung, Jaehwa
- Journal of Digital Contents Society, v.14 no.3, pp.313-321, 2013
Recently, the proliferation of mobile smart devices has ushered in the big-data era, and the importance of location-based services (LBS) is increasing with the exponential growth of trajectory-related data. Processing trajectory data requires parallel processing platforms such as cloud computing and MapReduce. Research based on MapReduce is in progress, but because MapReduce relies on batch processing and a simple key-value structure, applying the MapReduce framework to real-time LBS is difficult. Therefore, based on a detailed analysis of the properties of MapReduce, we propose a system design with efficient indexing and search techniques suited to real-time services.
https://doi.org/10.9728/dcs.2013.14.3.313

An Application Study of Disaster Information System Based on Cloud Computing Service (클라우드 컴퓨팅 서비스 기반 재난안전정보시스템 활용에 관한 연구)

Jeong, Inkyu;Park, Jin Yi;Kim, Min Ho;Lim, Jungtak;Kim, Jinyoung
- Proceedings of the Korean Society of Disaster Information Conference, 2016.11a, pp.366-367, 2016
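Disaster-related information was previously used with a focus on recovery from disaster damage, such as rapidly disseminating news of a disaster or assessing the scale of damage and the status of recovery resources. Recently, however, this has been changing as IT infrastructure expands and computing performance improves. Big data analysis techniques that use structured and unstructured data are being applied to disaster prevention and preparedness, and IoT technology is being introduced to acquire real-time information from disaster sites. As the collection, management, analysis, and provision of disaster information grow in importance, interest in cloud computing is rising as a way to respond proactively to changing disaster patterns and to manage and use information efficiently. This paper therefore examines how cloud computing services can be applied to disaster management systems to cope with the changing use of disaster-related information. Because disasters in an increasingly complex society now require handling information from across society, the paper describes ways to go beyond the original three elements of big data, Volume, Velocity, and Variety, to extract Veracity and Value. It further discusses how cloud computing services can be used to make efficient use of disaster information systems.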

A Web-Based High Performance Multiple Sequence Alignment System Design and Implementation (웹 기반 고성능 다중서열정렬시스템 설계 및 구현)

Kim, Tae-Kyung;Kim, Hun-Gi;Choi, Chi-Hwan;Jung, Seung-Hyun;Hou, Bo-Kyeng;Cho, Wan-Sup
- Proceedings of the Korean Society of Computer Information Conference, 2010.07a, pp.79-82, 2010
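Multiple sequence alignment algorithms are the most widely used approach to sequence-based phylogenetic analysis in bioinformatics, and the most representative public program is ClustalW, which users can install and run on a local system. In practice, however, after installing ClustalW users face various difficulties in preparing, manipulating, and processing sequence data and in interoperating with other systems. This paper therefore proposes a web-based high-performance multiple sequence alignment system that performs multiple sequence alignment conveniently and quickly. The proposed system (1) maximizes computing performance by using the resources of many PCs efficiently through an Inter-Query routing algorithm, and (2) provides a user-friendly web interface with personalized data management, real-time monitoring, and data editing, so that users can easily collect, manage, and process sequence data.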

Design and Implementation of a PCR Primer Search System on Cloud Computing Environments (클라우드 컴퓨팅 환경에서 PCR Primer 검색 시스템 설계 및 개발)

Park, Junho;Lim, Jongtae;Kim, Dongjoo;Lee, Yunjeong;Ryu, Eunkyung;Ahn, Minje;Cha, Jaehong;Yu, Seok Jong;Yoo, Jaesoo
- Proceedings of the Korea Contents Association Conference, 2012.05a, pp.269-270, 2012
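Designing accurate PCR primers is a core enabling technology for gene amplification. Previous work proposed tools that can design PCR primers specific to each gene, but they were not suited to large-scale design work using genome information. In this paper, we design and implement a system that can design and search for specific PCR primers across large genomes in a cloud computing environment. The proposed system is designed and implemented on the MapReduce framework of the Hadoop platform so that gene sequence searches can be performed at large scale. In a performance evaluation using 50,000 queries, the proposed scheme showed about a threefold performance improvement over the existing BLAST-based search method.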

Low Power GPS Data Sharing System based on Cloud Computing (클라우드 기반 저전력 GPS Data Sharing 시스템 제안)

Lee, Young-Kwon;Choe, Sun-taag;Cho, We-Duke
- Proceedings of the Korea Information Processing Society Conference, 2016.04a, pp.762-765, 2016
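With the spread of smartphones, users can easily receive a variety of convenient services. Location services rely on a GPS module, which consumes a great deal of power. In a group situation with many GPS modules, the power consumption problem is addressed by choosing a group header and sharing the header's location data. To this end, we propose a cloud-based GPS data sharing system. Social relationship groups are registered in advance; the system receives the location data of group members and detects a group situation based on distance, bearing, and speed. The Depth First Search (DFS) algorithm is used to detect the group situation. In each generated group, the member with the most remaining battery is chosen as the header. The frequency of location data collection is adapted to the header's remaining battery. Applying the system is expected to reduce group members' power consumption in group situations; in addition, if public transport location data are made public, they could be used in place of users' location data, and relationships among members of social relationship groups could be quantified.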
https://doi.org/10.3745/PKIPS.y2016m04a.762
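The group detection and header selection steps can be sketched as a depth-first search over a proximity graph. This is a simplified illustration: it uses planar coordinates and only distance and speed as the closeness test (the abstract also mentions bearing), and the data layout, function names, and thresholds are hypothetical.

```python
from math import hypot


def detect_groups(members, max_dist, max_speed_diff):
    """Group members that are transitively close to each other.

    `members` maps a name to (x, y, speed, battery); this tuple layout
    is an assumption made for the sketch.
    """
    def close(a, b):
        ax, ay, a_speed, _ = members[a]
        bx, by, b_speed, _ = members[b]
        return (hypot(ax - bx, ay - by) <= max_dist
                and abs(a_speed - b_speed) <= max_speed_diff)

    seen = set()
    groups = []
    for start in members:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited member.
        stack = [start]
        seen.add(start)
        group = []
        while stack:
            current = stack.pop()
            group.append(current)
            for other in members:
                if other not in seen and close(current, other):
                    seen.add(other)
                    stack.append(other)
        groups.append(group)
    return groups


def pick_header(group, members):
    """Choose the member with the most remaining battery as the header."""
    return max(group, key=lambda name: members[name][3])
```

Each connected component found by the search corresponds to one detected group, and only that group's header would keep its GPS module active and share its position with the others.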

UHD Video Transcoding System in Cloud Computing Environment (클라우드 기반 UHD 영상 트랜스코딩 시스템)

Moon, Hee-Cheol;Kim, Yong-Hwan;Kim, Dong-Hyeok
- Proceedings of the Korean Society of Broadcast Engineers Conference, 2014.11a, pp.203-205, 2014
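UHD video content offers more vivid, higher-quality images than FHD video, but the amount of video data is at least four times larger in the case of 4K UHD. If such very large UHD video is coded using conventional parallel/distributed processing, its sheer size causes a heavy computational load. UHD video therefore requires a new distributed processing technique that can process very large data quickly, rather than conventional distributed processing. This paper proposes a cloud-based UHD video transcoding system that can transcode UHD content quickly. The proposed system consists of three components: a packet analyzer, a distributed transcoder, and a stream synthesizer. The packet analyzer analyzes the input video, separates the audio and video streams, and splits the video stream into video packets for distributed processing. The distributed transcoder uses the cloud environment to decode and encode the split video packets in a distributed manner. The stream synthesizer combines the transcoded video stream with the audio stream obtained by the packet analyzer. We implemented a cloud-based video transcoding system using the proposed approach, and the implemented system can transcode large UHD video at high speed.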
