• Title/Summary/Keyword: HADOOP

Search Result 394

Design of Search System Based on Lucene for Minimum Price Products (루씬 기반의 최저가 상품 검색 시스템 설계)

  • Kim, A-Yong;Jeong, Dae-Jin;Gye, Min-Suk;Kim, Chang-Su;Jung, Hoe-kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.603-605 / 2014
  • With the spread of the internet and the growing use of smart devices, consumers have shifted from physical stores to the online shopping market, which has changed users' consumption patterns and consumer culture. Open markets offer a wide variety of events, lowest-price policies, safe transactions, and similar incentives to attract consumers and to expand their distribution channels on the web and on mobile. In this paper, we design a search system that collects and analyzes information on products on sale in open markets and provides users with minimum-price product information.

Performance Factor of Distributed Processing of Machine Learning using Spark (스파크를 이용한 머신러닝의 분산 처리 성능 요인)

  • Ryu, Woo-Seok
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.1 / pp.19-24 / 2021
  • In this paper, we study the performance factors of machine learning in a distributed environment using Apache Spark and present an efficient distributed processing method through experiments. This work first presents the performance factors that arise when performing machine learning on a distributed cluster, classifying them into cluster performance, data size, and configuration of the Spark engine. In addition, a performance study of regression analysis using Spark MLlib running on a Hadoop cluster is performed while varying the configuration of the nodes and the Spark executors. The experiments confirmed that the effective number of executors was affected by the number of data blocks, but depending on the cluster size, the maximum and minimum values were limited by the number of cores and the number of worker nodes, respectively.
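
One way to read the relationship reported in this abstract is as a simple clamp. The sketch below is only an illustration of that reading, not the paper's method: the function name `effective_executors` and the starting point of one executor per data block are assumptions; only the two bounds (worker nodes as the minimum, cores as the maximum) come from the abstract.

```python
def effective_executors(num_blocks: int, num_workers: int, total_cores: int) -> int:
    """Hypothetical estimate of a useful executor count.

    Assumption (not from the paper): start from one executor per data block,
    then clamp the value between the number of worker nodes (lower bound)
    and the total number of cores in the cluster (upper bound).
    """
    if min(num_blocks, num_workers, total_cores) < 1:
        raise ValueError("all arguments must be positive")
    if num_workers > total_cores:
        raise ValueError("a cluster cannot have more workers than cores")
    return max(num_workers, min(num_blocks, total_cores))


# Example cluster: 4 worker nodes with 4 cores each (16 cores in total).
few_blocks = effective_executors(num_blocks=2, num_workers=4, total_cores=16)
mid_blocks = effective_executors(num_blocks=8, num_workers=4, total_cores=16)
many_blocks = effective_executors(num_blocks=100, num_workers=4, total_cores=16)
```

With few blocks the estimate is held at the worker count, with many blocks it is capped at the core count, and in between it follows the number of blocks.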

Big IoT Healthcare Data Analytics Framework Based on Fog and Cloud Computing

  • Alshammari, Hamoud;El-Ghany, Sameh Abd;Shehab, Abdulaziz
    • Journal of Information Processing Systems / v.16 no.6 / pp.1238-1249 / 2020
  • Throughout the world, aging populations and doctor shortages have helped drive the increasing demand for smart healthcare systems. Recently, these systems have benefited from the evolution of the Internet of Things (IoT), big data, and machine learning. However, these advances result in the generation of large amounts of data, making healthcare data analysis a major issue. These data have a number of complex properties such as high-dimensionality, irregularity, and sparsity, which makes efficient processing difficult to implement. These challenges are met by big data analytics. In this paper, we propose an innovative analytic framework for big healthcare data that are collected either from IoT wearable devices or from archived patient medical images. The proposed method would efficiently address the data heterogeneity problem using middleware between heterogeneous data sources and MapReduce Hadoop clusters. Furthermore, the proposed framework enables the use of both fog computing and cloud platforms to handle the problems faced through online and offline data processing, data storage, and data classification. Additionally, it guarantees robust and secure knowledge of patient medical data.

Research on Regional Smart Farm Data Linkage and Service Utilization (지역 스마트팜 데이터 연계 및 서비스 활용에 대한 연구)

  • Won-Goo Lee;Hyun Jung Koo;Cheol-Joo Chae
    • Journal of Practical Agriculture & Fisheries Research / v.26 no.2 / pp.14-24 / 2024
  • To enhance the usability of smart agriculture, methods for utilizing smart farm data are required. Therefore, this study proposes a scheme for utilizing regional smart farm data by linking it to services. The current status of domestic and foreign smart farm data collection and linkage services is analyzed. To collect and link regional smart farm data, necessary data collection, data cleaning, data storage structure and schema, and data storage and linkage systems are proposed. Based on the standards currently being implemented for regional smart farm internal data storage, a farm schema, environmental information schema, facility control information schema, and growth information schema are designed by extending the crop schema and crop main environmental factor information database schema. A data collection and management system structure based on the Hadoop Ecosystem is designed for data collection and management at regional smart farm data centers. Strategies are proposed for utilizing regional smart farm data to provide smart farm productivity improvement and revenue optimization services, image-based crop analysis services, and virtual reality-based smart farm simulation services.

Update Frequency Reducing Method of Spatio-Temporal Big Data based on MapReduce (MapReduce와 시공간 데이터를 이용한 빅 데이터 크기의 이동객체 갱신 횟수 감소 기법)

  • Choi, Youn-Gwon;Baek, Sung-Ha;Kim, Gyung-Bae;Bae, Hae-Young
    • Spatial Information Research / v.20 no.2 / pp.137-153 / 2012
  • Many indexing methods that reduce update cost have been proposed for managing massive numbers of moving objects, because an index over moving objects must be updated periodically as the objects frequently change their location data. However, these indexing methods incur a load that exceeds system capacity when the number of moving objects increases dramatically. In this paper, we propose an update frequency reduction method that combines MapReduce with existing indices. Using MapReduce, we group the update requests of each moving object. We decide whether to update by comparing the latest and the oldest data in each group, and we reduce the update frequency by applying only the latest data. So that data is not lost while an update is delayed and is still updated periodically, we store the data for a certain period in a hash table that keeps the previous update data. The performance evaluation shows that the proposed method reduces the update frequency compared with methods that do not apply it.
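
The grouping step described in this abstract can be sketched in plain Python that imitates the map, shuffle, and reduce phases. This is a minimal illustration under stated assumptions, not the authors' implementation: the `UpdateRequest` fields, the distance threshold `min_distance` used to compare the oldest and latest data, and the `deferred` dictionary standing in for the hash table are all hypothetical.

```python
from collections import defaultdict
from math import hypot
from typing import NamedTuple


class UpdateRequest(NamedTuple):
    """Hypothetical location update sent by a moving object."""
    object_id: str
    timestamp: int
    x: float
    y: float


def map_phase(requests):
    """Map: emit one (object_id, request) pair per update request."""
    for req in requests:
        yield req.object_id, req


def shuffle(pairs):
    """Shuffle: group the emitted requests by moving object."""
    groups = defaultdict(list)
    for object_id, req in pairs:
        groups[object_id].append(req)
    return groups


def reduce_phase(groups, min_distance):
    """Reduce: compare the oldest and latest request of each object.

    Assumed rule: if the object moved at least `min_distance`, only its
    latest request is forwarded to the index. Otherwise the latest request
    is kept in a hash table so that it is not lost while the update waits.
    """
    updates, deferred = [], {}
    for object_id, reqs in groups.items():
        oldest = min(reqs, key=lambda r: r.timestamp)
        latest = max(reqs, key=lambda r: r.timestamp)
        moved = hypot(latest.x - oldest.x, latest.y - oldest.y)
        if moved >= min_distance:
            updates.append(latest)
        else:
            deferred[object_id] = latest
    return updates, deferred


requests = [
    UpdateRequest("car-1", 1, 0.0, 0.0),
    UpdateRequest("car-1", 2, 3.0, 4.0),
    UpdateRequest("car-2", 1, 10.0, 10.0),
    UpdateRequest("car-2", 3, 10.0, 10.5),
    UpdateRequest("car-1", 3, 6.0, 8.0),
]
updates, deferred = reduce_phase(shuffle(map_phase(requests)), min_distance=1.0)
```

In this toy run, five incoming requests result in a single index update (the latest position of `car-1`), while the latest position of `car-2` waits in the hash table.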

A reviews on the social network analysis using R (R을 이용한 사회연결망 분석에 대한 고찰)

  • Choi, Kyoungho;Yoo, Jin Ah
    • Journal of the Korea Convergence Society / v.6 no.1 / pp.77-83 / 2015
  • Although social network analysis (SNA) has been used in various fields, especially social science fields such as politics, journalism, and public administration, as well as in the natural sciences, few studies introduce its analysis tools. Performing SNA requires collecting data fit for the purpose, deriving statistical values, and visualizing the results with an analysis tool, but studies that explain these steps systematically are still insufficient. In this study, we therefore introduce the analytic process, from data input to interpretation, with empirical data, using the R program, which is free, to help researchers who plan to study using SNA. The empirical data in this study are citations among domestic scientific journals on food, supplied by the citation index DB of Korean scientific journals. As a study methodology, SNA is a new paradigm that can substitute for existing research methods as well as complement statistical analysis. Therefore, this study should contribute to the vitalization of SNA.

Real time predictive analytic system design and implementation using Bigdata-log (빅데이터 로그를 이용한 실시간 예측분석시스템 설계 및 구현)

  • Lee, Sang-jun;Lee, Dong-hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.6 / pp.1399-1410 / 2015
  • Gartner urges companies to considerably change their survival paradigms, insisting that they need to understand and prepare for the upcoming era of data competition. As successful business cases built on statistical algorithm-based predictive analytics come to light, moving from after-the-fact responses based on analysis of past data to preemptive countermeasures based on predictive analysis is becoming a necessity for leading enterprises. This trend is influencing security analysis and log analysis, and cases that apply big data analysis frameworks to large-scale log analysis and to intelligent, long-term security analysis are being reported one after another. However, a Hadoop-based big data platform cannot accommodate all the functions and techniques required for a big data log analysis system, so big data log analysis products based on independent platforms are still being offered on the market. This paper proposes a framework that is equipped with real-time and non-real-time predictive analysis engines for these independent big data log analysis systems and can cope with cyber attacks preemptively.

Development of Procurement Announcement Analysis Support System (전자조달공고 분석지원 시스템 개발)

  • Lim, Il-kwon;Park, Dong-Jun;Cho, Han-Jin
    • Journal of the Korea Convergence Society / v.9 no.8 / pp.53-60 / 2018
  • Domestic public e-procurement has been recognized for its excellence at home and abroad. However, it is difficult for procurement companies to check the related announcements and to grasp the status of procurement announcements at a glance. In this paper, we propose an e-Procurement Announcement Analysis Support System that uses HDFS, Apache Spark, and collaborative filtering technology to provide a procurement announcement recommendation service and a procurement announcement and contract trend analysis service for an effective e-procurement system. The procurement announcement recommendation service relieves procurement companies from searching for announcements by recommending them according to the characteristics of each company. The procurement announcement/contract trend analysis service visualizes procurement announcement and contract information and is implemented so that procurement companies and demand organizations can see e-procurement analysis information at a glance.

Operational Big Data Analytics platform for Smart Factory (스마트팩토리를 위한 운영빅데이터 분석 플랫폼)

  • Bae, Hyerim;Park, Sanghyuck;Choi, Yulim;Joo, Byeongjun;Sutrisnowati, Riska Asriana;Pulshashi, Iq Reviessay;Putra, Ahmad Dzulfikar Adi;Adi, Taufik Nur;Lee, Sanghwa;Won, Seokrae
    • The Journal of Bigdata / v.1 no.2 / pp.9-19 / 2016
  • Since ICT convergence became a major issue, the German government has pursued its 'Industry 4.0' policy, which triggered the convergence of ICT with manufacturing. This trend is now gaining momentum, and we can expect a great leap toward quality perfection at low cost. Recently, the Korean government has also been carrying out its 'Manufacturing 3.0' policy to upgrade the Korean manufacturing industry, accelerated by many related technologies. In this paper, we developed a custom-made operational big data analysis platform for implementing operational intelligence to improve industrial capability. Our platform is designed on the Spring framework and the web. In addition, HDFS and Spark architectures help our system analyze massive field data, with streamed data processed by a process mining algorithm. Knowledge extracted from the data will support the enhancement of manufacturing performance.

Design of GlusterFS Based Big Data Distributed Processing System in Smart Factory (스마트 팩토리 환경에서의 GlusterFS 기반 빅데이터 분산 처리 시스템 설계)

  • Lee, Hyeop-Geon;Kim, Young-Woon;Kim, Ki-Young;Choi, Jong-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.1 / pp.70-75 / 2018
  • Smart Factory is an intelligent factory that can enhance productivity, quality, customer satisfaction, etc. by applying information and communications technology to the entire production process including design & development, manufacture, and distribution & logistics. The precise amount of data generated in a smart factory varies depending on the factory's size and state of facilities. Regardless, it would be difficult to apply traditional production management systems to a smart factory environment, as it generates vast amounts of data. For this reason, the need for a distributed big-data processing system has risen, which can process a large amount of data. Therefore, this article has designed a Gluster File System (GlusterFS)-based distributed big-data processing system that can be used in a smart factory environment. Compared to existing distributed processing systems, the proposed distributed big-data processing system reduces the system load and the risk of data loss through the distribution and management of network traffic.