• Title/Summary/Keyword: Big data Processing

Recommendation System Using Big Data Processing Technique (빅 데이터 처리 기법을 적용한 추천 시스템에 관한 연구)

  • Yun, So-Young;Youn, Sung-Dae
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.6 / pp.1183-1190 / 2017
  • With the development of network and IT technology, people search for and purchase the items they want regardless of location. As a result, many studies address the scalability problem that rapidly growing data causes in recommendation systems. In this paper, we propose an item-based collaborative filtering method that uses tag weights, together with a recommendation technique based on MapReduce, a distributed parallel processing method. To improve speed and efficiency, the proposed method classifies items into categories during preprocessing and groups them according to the number of nodes. Each distributed node processes its data in four MapReduce passes. To recommend better items to users, item tag weights are used in the similarity calculation. The experimental results indicate that the proposed method yields more appropriate recommendations than the plain item-based method and runs efficiently on large amounts of data.
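
The abstract does not give the tag-weight formula, so the following is only a minimal single-process sketch of item-based similarity with a tag term. The blend of cosine rating similarity with Jaccard tag overlap, the `alpha` parameter, and the function name `item_similarity` are all assumptions made for illustration, and the distributed MapReduce passes are not reproduced.

```python
from collections import defaultdict
from math import sqrt


def item_similarity(ratings, tags, alpha=0.5):
    """ratings: {user: {item: rating}}; tags: {item: set of tag strings}.

    Returns {(i, j): score} for every item pair with i < j.
    The score blends cosine similarity of rating vectors with
    Jaccard overlap of tags (an assumed form of "tag weight").
    """
    # Invert the user -> item ratings into one rating vector per item.
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors[item][user] = rating

    def norm(vec):
        return sqrt(sum(v * v for v in vec.values()))

    scores = {}
    names = sorted(vectors)
    for idx, i in enumerate(names):
        for j in names[idx + 1:]:
            common = vectors[i].keys() & vectors[j].keys()
            dot = sum(vectors[i][u] * vectors[j][u] for u in common)
            denom = norm(vectors[i]) * norm(vectors[j])
            cosine = dot / denom if denom else 0.0

            tags_i, tags_j = tags.get(i, set()), tags.get(j, set())
            union = tags_i | tags_j
            jaccard = len(tags_i & tags_j) / len(union) if union else 0.0

            scores[(i, j)] = (1 - alpha) * cosine + alpha * jaccard
    return scores
```

Setting `alpha=0` reduces the score to plain item-based cosine similarity, which makes it easy to compare against the baseline the abstract mentions.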

Dynamic Generation Methods of the Wireless Map Database using Generalization and Filtering (Generalization과 Filtering을 이용한 무선 지도 데이터베이스의 동적 생성 기법)

  • Kim, Mi-Ran;Choe, Jin-O
    • The KIPS Transactions:PartD / v.8D no.4 / pp.367-376 / 2001
  • An existing map database cannot be used directly for a wireless electronic map service. The data volume of a map is too large to transfer over a wireless link, and even when the map is transferred successfully, the devices that display it usually have fewer resources than desktop computers. Building a map database exclusively for wireless service is also impractical because of the high cost. We propose a new technique that dynamically generates a map for wireless service from the existing map database. The technique combines a generalization method, which reduces the map data volume, with a filtering method, which guarantees that the data volume does not exceed the bandwidth limit. Generalization is performed in three steps: merging the layers, reducing the size of spatial objects, and processing the user interface. Filtering is performed by two modules, a counter and a selector. The counter module checks whether the data volume of the map produced by generalization exceeds the bandwidth limit. The selector module eliminates the excess objects and selects the rest on the basis of distance.
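
The counter and selector modules can be sketched roughly as follows. The abstract does not say what the distance is measured from or how objects are represented, so the `center` point, the `x`/`y`/`size` dictionary fields, and the policy of stopping at the first object that no longer fits are assumptions for illustration.

```python
from math import hypot


def count_volume(objects):
    """Counter: total data volume (bytes) of the generalized map objects."""
    return sum(obj["size"] for obj in objects)


def select_objects(objects, center, limit):
    """Selector: if the volume exceeds the limit, keep the objects nearest
    to `center` (assumed reference point) that still fit within `limit`."""
    if count_volume(objects) <= limit:
        return list(objects)

    # Nearest objects first.
    by_distance = sorted(
        objects,
        key=lambda o: hypot(o["x"] - center[0], o["y"] - center[1]),
    )
    kept, total = [], 0
    for obj in by_distance:
        if total + obj["size"] > limit:
            break  # everything farther away is dropped as excess
        kept.append(obj)
        total += obj["size"]
    return kept
```

For example, with three 40-byte objects and a 100-byte limit, the two objects closest to `center` are kept and the farthest one is eliminated.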

Ontology and Sequential Rule Based Streaming Media Event Recognition (온톨로지 및 순서 규칙 기반 대용량 스트리밍 미디어 이벤트 인지)

  • Soh, Chi-Seung;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.43 no.4 / pp.470-479 / 2016
  • As the volume of media data such as UCC (User Created Content) grows, research is actively being carried out in many fields to provide meaningful media services. Among these studies, a semantic web-based media classification approach has been proposed; however, it is limited in video classification because its underlying ontology is derived from meta-information such as video tags and titles. In this paper, we define the objects recognized in a video and the activities composed of video objects in a shot, and introduce a reasoning approach based on description logic. We define sequential rules over the sequence of shots in a video and describe how to classify the video. To process the large and growing amount of media data, we use Spark Streaming, a distributed in-memory big data processing framework, and describe how to classify media data in parallel. To evaluate the efficiency of the proposed approach, we conducted an experiment using a large media ontology extracted from YouTube videos.
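
One simple way to picture a sequential rule is as an ordered list of activities that must appear, in order, among a video's shots. The sketch below assumes exactly that reading; the rule format, the activity labels, and the function names are hypothetical, and neither the description-logic reasoning nor the Spark Streaming pipeline is reproduced.

```python
def matches(rule, shot_activities):
    """True if the activities in `rule` occur in order (not necessarily
    adjacent) within the per-shot activity sequence."""
    remaining = iter(shot_activities)
    # `step in remaining` consumes the iterator up to the first match,
    # so later steps can only match later shots.
    return all(step in remaining for step in rule)


def classify(shot_activities, rules):
    """Return the first label whose rule matches, or None."""
    for label, rule in rules.items():
        if matches(rule, shot_activities):
            return label
    return None


# Hypothetical rules and a hypothetical video for illustration only.
rules = {
    "cooking": ["chop", "fry", "serve"],
    "soccer": ["kick", "goal"],
}
video = ["intro", "chop", "stir", "fry", "serve"]
label = classify(video, rules)
```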

Development of Prediction Model of Groundwater Pollution based on Food Available Water and Validation in Small Watersheds (식품용수 수질자료를 이용한 지하수 오염 예측 모델 개발 및 소규모 유역에서의 검증)

  • Nam, Sungwoo;Park, Eungyu;Yi, Myeong-jae;Jeon, Seonkeum;Jung, Hyemin;Kim, Jeongwoo
    • Journal of Soil and Groundwater Environment / v.26 no.6 / pp.165-175 / 2021
  • In Korea, groundwater is used in many areas of the food industry, such as food manufacturing, food processing, cooking, and liquor production. Because groundwater accounts for a large share of the water used in the food industry, it is necessary to predict deterioration of water quality to ensure the safety of food water: using undrinkable groundwater has a ripple effect that can cause great harm or anxiety to consumers. In this study, a spatiotemporal data aggregation method was used to obtain spatially representative data, which enable prediction of groundwater quality change in a small watershed. In addition, a highly reliable predictive model was developed to estimate long-term changes in groundwater quality by applying a non-parametric segmented regression technique. Two pilot watersheds were selected where a large number of companies use groundwater as food water, and the appropriateness of the model was assessed by comparing the model-produced values with actual measurements. The results of this study can contribute to establishing a customized, big data-based food water management system that responds quickly, accurately, and preemptively to changes in groundwater quality and pollution. They are also expected to contribute to improved food safety management.

Implementation of Customer Behavior Evaluation System Using Real-time Web Log Stream Data (실시간 웹로그 스트림데이터를 이용한 고객행동평가시스템 구현)

  • Lee, Hanjoo;Park, Hongkyu;Lee, Wonsuk
    • The Journal of Korean Institute of Information Technology / v.16 no.12 / pp.1-11 / 2018
  • The online shopping market continues to grow rapidly, which makes it important to provide customized services based on analysis of customer behavior. Existing systems only provide analysis data on consumer profiles and behaviors, and their real-time processing is limited because they rely on disk-based mining. Applying them to web services that require real-time processing and analysis therefore raises both accuracy and system performance problems. The system proposed in this paper analyzes web click log streams generated in real time to calculate the concentration level for specific products, finds interested customers who are likely to purchase those products, and provides intensive promotions to them. We also verify the efficiency and accuracy of the proposed system.

Computer Vision-Based Measurement Method for Wire Harness Defect Classification

  • Yun Jung Hong;Geon Lee;Jiyoung Woo
    • Journal of the Korea Society of Computer and Information / v.29 no.1 / pp.77-84 / 2024
  • In this paper, we propose a method for accurately and rapidly detecting defects in wire harnesses by using computer vision to calculate six crucial measurement values: the length of crimped terminals, the dimensions (width) of terminal ends, and the width of crimped sections (wire and core portions). We employ Harris corner detection to locate object positions in two types of data. We then generate reference points for extracting measurement values by using features specific to each measurement area and exploiting the contrast in shading between the background and the objects, thus reflecting the slope of each sample. Next, we introduce a method that uses the Euclidean distance and correction coefficients to predict values, so that measurements can be predicted regardless of changes in the wire's position. The accuracies for the six measurement types are 99.1%, 98.7%, 92.6%, 92.5%, 99.9%, and 99.7%, for an overall average accuracy of 97% across all measurements. This inspection method not only addresses the limitations of conventional visual inspection but also yields excellent results with a small amount of data. Moreover, because it relies solely on image processing, it is expected to be more cost-effective than deep learning methods and applicable with less data.
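
The distance step can be illustrated with a small sketch. The abstract does not state how the correction coefficient is applied, so treating it as a multiplicative pixel-to-length factor is an assumption, as is the function name `measure`. The last lines only recompute the overall average from the six accuracies reported above.

```python
from math import dist


def measure(p1, p2, coeff):
    """Euclidean distance between two reference points (in pixels),
    scaled by an assumed multiplicative correction coefficient."""
    return dist(p1, p2) * coeff


# Six per-measurement accuracies as reported in the abstract (percent).
reported = [99.1, 98.7, 92.6, 92.5, 99.9, 99.7]
mean_accuracy = sum(reported) / len(reported)
```

The mean of the six reported values is about 97.1%, which agrees with the stated overall average of 97%.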

The study of Defense Artificial Intelligence and Block-chain Convergence (국방분야 인공지능과 블록체인 융합방안 연구)

  • Kim, Seyong;Kwon, Hyukjin;Choi, Minwoo
    • Journal of Internet Computing and Services / v.21 no.2 / pp.81-90 / 2020
  • The purpose of this study is to examine how block-chain technology can be applied to prevent data forgery and alteration in defense-sector AI (artificial intelligence). AI is a technology that makes predictions from big data by clustering or classifying it with various machine learning methodologies, and military powers including the U.S. have reached the completion stage of the technology. If the data behind data-based AI is forged or altered, then even a perfect data processing pipeline can become the greatest risk factor, and such falsification can be carried out all too easily through hacking. Unexpected attacks could occur if data used by weaponized AI is hacked and manipulated by North Korea. Therefore, a technology that prevents data from being falsified and altered is essential for the use of AI. Block-chain is expected to solve this problem: because encrypted data is stored in a distributed manner using a hash function, the data is not damaged even if a single computer is hacked, unless more than half of the connected computers agree.

Real-time and Parallel Semantic Translation Technique for Large-Scale Streaming Sensor Data in an IoT Environment (사물인터넷 환경에서 대용량 스트리밍 센서데이터의 실시간·병렬 시맨틱 변환 기법)

  • Kwon, SoonHyun;Park, Dongwan;Bang, Hyochan;Park, Youngtack
    • Journal of KIISE / v.42 no.1 / pp.54-67 / 2015
  • Studies on the fusion of Semantic Web technologies are being carried out to promote the interoperability and value of sensor data in an IoT environment. To accomplish this, the semantic translation of sensor data is essential for convergence with service domain knowledge. The existing semantic translation technique, however, translates static metadata into semantic data (RDF) and cannot properly handle the real-time, large-scale nature of data in an IoT environment. Therefore, in this paper, we propose a technique for translating large-scale streaming sensor data generated in an IoT environment into semantic data, using real-time and parallel processing. In this technique, we define rules for semantic translation and store them in the semantic repository. The sensor data is translated in real time with parallel processing using these pre-defined rules and an ontology-based semantic model. To improve performance, we use Apache Storm, a real-time big data analysis framework for parallel processing. For demonstration, the proposed technique was performance-tested with the AWS observation data of the Meteorological Administration, which are large-scale streaming sensor data.
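
Rule-driven translation of sensor readings into RDF-style triples can be sketched as below. The rule format (field name mapped to a predicate), the `ex:` vocabulary, and the reading fields are hypothetical, and a thread pool stands in for Apache Storm purely to show that each reading can be translated independently.

```python
from concurrent.futures import ThreadPoolExecutor


def translate(reading, rules):
    """Turn one sensor reading (a dict) into (subject, predicate, object)
    triples using pre-defined field -> predicate rules."""
    subject = f"ex:obs/{reading['station']}/{reading['time']}"
    triples = [(subject, "rdf:type", "ex:Observation")]
    for field, predicate in rules.items():
        if field in reading:
            triples.append((subject, predicate, reading[field]))
    return triples


def translate_stream(readings, rules, workers=4):
    """Translate readings in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: translate(r, rules), readings))


# Hypothetical rules and readings for illustration only.
rules = {"temp": "ex:hasTemperature", "humidity": "ex:hasHumidity"}
readings = [
    {"station": "108", "time": "t1", "temp": 21.5},
    {"station": "108", "time": "t2", "temp": 21.7, "humidity": 40},
]
result = translate_stream(readings, rules)
```

Keeping the rules as data rather than code mirrors the abstract's idea of storing translation rules in a repository so they can be changed without touching the translator.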

An Energy-Efficient Algorithm for Solving Coverage Problem and Sensing Big Data in Sparse MANET Environments (희소 모바일 애드 혹 네트워크 환경에서 빅데이터 센싱을 위한 에너지 효율적인 센서 커버리지 알고리즘)

  • Gil, Joon-Min
    • KIPS Transactions on Computer and Communication Systems / v.6 no.11 / pp.463-468 / 2017
  • To sense a wide area with mobile nodes, the uniformity of node deployment is a very important issue. In this paper, we consider the coverage problem of sensing big data in sparse mobile ad hoc networks. Most existing work on the coverage problem assumes that the number of nodes is large enough to cover the area in the network. The coverage problem in sparse mobile ad hoc networks differs, however, in that a long distance must be maintained between nodes to avoid overlapping coverage areas. We formulate the sensor coverage problem in sparse mobile ad hoc networks and solve it with a self-organized approach that requires no central authority. The experimental results show that our approach is more efficient than existing ones in terms of both coverage area and energy consumption.

A Realtime Malware Detection Technique Using Multiple Filter (다중 필터를 이용한 실시간 악성코드 탐지 기법)

  • Park, Jae-Kyung
    • Journal of the Korea Society of Computer and Information / v.19 no.7 / pp.77-85 / 2014
  • Damage caused by malicious or suspicious code has recently been increasing in various environments, and comprehensive response systems for malware detection are being actively studied. Suspicious code is installed on a PC without the user's consent, and users remain unaware of the damage. Technology for real-time processing of big data is also needed, and more advanced malware detection technology must be developed. Fundamentally, malware detection requires both static and dynamic analysis of executable files, with the results verified through reputation. Similarity judgment is needed for real-time response with big data. In this paper, we propose a real-time detection and verification technique that uses multiple filters. Our study suggests a new direction for real-time malware detection.