• Title/Summary/Keyword: Map-Reduce

Travel Time Prediction Algorithm for Trajectory data by using Rule-Based Classification on MapReduce (맵리듀스 환경에서 규칙 기반 분류화를 이용한 궤적 데이터 주행 시간 예측 알고리즘)

  • Kim, JaeWon;Lee, HyunJo;Chang, JaeWoo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.11a
    • /
    • pp.798-801
    • /
    • 2014
  • In trajectory-based services such as traveler information systems (ATIS) and traffic management systems (ITS), accurately predicting the travel time of a given trajectory query is essential for improving service quality. A representative spatial-data analysis technique for this purpose is rule-based classification, which guarantees high accuracy in data classification. However, existing rule-based classification techniques consider only single-machine environments and are therefore unsuitable for processing large volumes of spatial data. To address this, this study develops a travel time prediction algorithm for trajectory data that uses rule-based classification on MapReduce. First, the proposed algorithm analyzes large-scale spatial data in parallel with MapReduce to generate highly usable trajectory-data rules, reducing rule-generation time over large spatial datasets. Second, partitioning the map data on a grid improves search performance during query processing: when searching the rule groups for travel time prediction, only the grid cells containing the query are examined, so query-processing performance improves. Finally, a query-processing algorithm designed for the MapReduce framework supports efficient parallel query processing: the map function measures, in parallel, the travel time of each road segment included in the query for the selected grid cells, and the reduce function merges the map outputs based on the departure time and the per-segment travel times to produce the final result. Together, these steps improve both the processing time and the accuracy of travel time prediction based on spatial big data analysis.
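
The map/reduce split described in this abstract can be illustrated with a minimal, self-contained Python sketch. The rule table, record formats, and field names below are hypothetical stand-ins, not the paper's actual data structures: the map stage looks up per-segment travel times only in the grid cells the query touches, and the reduce stage merges them into one estimate.

```python
from collections import defaultdict

# Hypothetical rule table mined offline: (grid cell, road segment,
# departure hour) -> predicted travel time in seconds. In the paper these
# rules are generated in parallel with MapReduce; they are hard-coded here
# purely for illustration.
RULES = {
    ("cell_42", "seg_a", 8): 95.0,
    ("cell_42", "seg_b", 8): 140.0,
}

def map_fn(query):
    """Map stage: for each road segment of the query, look up the rule in
    the grid cell containing it and emit (query_id, (order, time))."""
    for order, (cell, seg) in enumerate(query["segments"]):
        t = RULES.get((cell, seg, query["departure_hour"]))
        if t is not None:
            yield query["id"], (order, t)

def reduce_fn(query_id, values):
    """Reduce stage: merge the per-segment times in road order into the
    final travel-time estimate."""
    return query_id, sum(t for _, t in sorted(values))

query = {"id": "q1", "departure_hour": 8,
         "segments": [("cell_42", "seg_a"), ("cell_42", "seg_b")]}
grouped = defaultdict(list)
for key, value in map_fn(query):          # map phase
    grouped[key].append(value)            # shuffle/group by key
for qid, values in grouped.items():       # reduce phase
    print(reduce_fn(qid, values))         # ('q1', 235.0)
```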

An analysis of Flood Inundation using Query and Mathematical Method (Query 및 Mathematical 기법을 이용한 홍수범람 해석)

  • Jeong, Ha-Ok;Park, Sang-Woo;Choo, Tai-Ho;Park, Kun-Chul
    • Journal of Wetlands Research
    • /
    • v.12 no.1
    • /
    • pp.33-40
    • /
    • 2010
  • This study addresses several problems identified in previous work, such as the difficulty of using the existing program, the computation and application of a large number of parameters, and processing that is more complicated than necessary. It also proposes ways to produce a flood inundation map and a detailed inundation analysis that can reduce risk factors. We selected the Anseong-cheon basin, constructed a flood inundation scenario based on an extreme flood exceeding the design frequency, considering only overflow and levee breach, and ran inundation simulations. Overflow and levee breach were analyzed with a one-dimensional model using the storage function of HEC-RAS. For more accurate inundation simulation, an elevation-versus-volume curve was applied instead of the commonly used area-time-depth method. The results suggest that the mathematical method in SURFER provides a simpler and more precise flood inundation analysis, with only a small difference in inundation area, compared with the more complicated query method of ArcView 3.2a.
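
The elevation-versus-volume curve mentioned above has a simple numerical core: for a candidate water-surface elevation, the stored volume is the summed water depth over all DEM cells below that level, times the cell area. A toy sketch, using hypothetical DEM values rather than the study's Anseong-cheon data:

```python
import numpy as np

def elevation_volume_curve(dem, cell_area, levels):
    """Storage volume at each water-surface elevation: every DEM cell
    below the level contributes depth (level - elevation) * cell area."""
    return [(h, float(np.maximum(h - dem, 0.0).sum() * cell_area))
            for h in levels]

# Toy 3x3 DEM (elevations in m) on 10 m x 10 m cells -- illustrative only.
dem = np.array([[10.0, 10.5, 11.0],
                [10.2, 10.8, 11.4],
                [10.6, 11.2, 11.8]])
for h, v in elevation_volume_curve(dem, cell_area=100.0,
                                   levels=[10.5, 11.0, 11.5]):
    print(f"level {h} m -> volume {v:.0f} m^3")
```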

GIS Based Flood Inundation Analysis in Protected Lowland Considering the Affection of Structure (구조물의 영향을 고려한 GIS기반의 제내지 홍수범람해석)

  • Choi, Seung-Yong;Han, Kun-Yeun;Cho, Wan-Hee
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.12 no.4
    • /
    • pp.1-17
    • /
    • 2009
  • In recent years, most flood damage has been associated with levee failure. The objective of this study is to predict flow depths, flooded area, flooding time, and flood damage through a flood inundation analysis that considers levee overflow and the characteristics of levee failure. Hydrological parameters were extracted from GIS data such as the DEM, land cover, and soil maps to estimate the levee-failure discharge. In addition, the characteristics of flood-wave propagation could be predicted accurately because the inundation analysis accounted for the effect of structures within the protected lowland. The hourly prediction of flooded areas and the estimation of flood strength can be utilized as basic data for flood defence and for establishing measures to reduce flood damage.

High Resolution 3D Magnetic Resonance Fingerprinting with Hybrid Radial-Interleaved EPI Acquisition for Knee Cartilage T1, T2 Mapping

  • Han, Dongyeob;Hong, Taehwa;Lee, Yonghan;Kim, Dong-Hyun
    • Investigative Magnetic Resonance Imaging
    • /
    • v.25 no.3
    • /
    • pp.141-155
    • /
    • 2021
  • Purpose: To develop a 3D magnetic resonance fingerprinting (MRF) method for high resolution knee cartilage PD, T1, and T2 mapping. Materials and Methods: A novel 3D acquisition trajectory with golden-angle rotating radials in the kxy direction and interleaved echo planar imaging (EPI) acquisition in the kz direction was implemented in the MRF framework. A centric ordering was applied to the interleaved EPI acquisition to reduce Nyquist ghosting artifacts due to field inhomogeneity. For the reconstruction, a singular value decomposition (SVD) compression method was used to accelerate reconstruction, and conjugate gradient sensitivity encoding (CG-SENSE) was performed to overcome the low SNR of the high resolution data. Phantom experiments were performed to verify the proposed method. In vivo experiments were performed on 6 healthy volunteers and 2 early osteoarthritis (OA) patients. Results: In the phantom experiments, the T1 and T2 values of the proposed method were in good agreement with the spin-echo references. The in vivo scans yielded high quality proton density (PD), T1, and T2 maps with an EPI echo train length NETL = 4, a through-plane acceleration factor Rz = 5, and Nspk = 4 radial spokes. In patients, high T2 values (50-60 ms) were seen in the transverse, sagittal, and coronal views, and the damaged cartilage regions agreed with the hyper-intensity regions on conventional turbo spin-echo (TSE) images. Conclusion: The proposed 3D MRF method can acquire high resolution (0.5 mm³) quantitative maps in a practical scan time (~7 min 10 s) with full coverage of the knee (FOV: 160 × 160 × 120 mm³).
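
The SVD compression step is a standard ingredient of MRF reconstruction and is easy to illustrate: the dictionary of simulated fingerprints is projected onto its first k right-singular vectors, and matching is done by inner products in that low-dimensional subspace. The sketch below uses a random stand-in dictionary, not the authors' sequence simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical MRF dictionary: each row is one simulated fingerprint
# (signal evolution over 500 time points for one (T1, T2) pair).
dictionary = rng.standard_normal((2000, 500))

# SVD compression: keep the first k right-singular vectors and match in
# the k-dimensional subspace instead of the full time dimension.
k = 25
_, _, Vt = np.linalg.svd(dictionary, full_matrices=False)
Vk = Vt[:k]                       # (k, 500) compression basis
dict_c = dictionary @ Vk.T        # (2000, k) compressed dictionary

signal = dictionary[123] + 0.05 * rng.standard_normal(500)  # noisy voxel
sig_c = Vk @ signal               # compress the measured signal

# Dot-product matching on normalized entries (standard MRF matching).
norm = dict_c / np.linalg.norm(dict_c, axis=1, keepdims=True)
best = int(np.argmax(norm @ sig_c))
print(best)  # 123 -> the (T1, T2) pair of the matched dictionary entry
```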

A Study on Research Paper Classification Using Keyword Clustering (키워드 군집화를 이용한 연구 논문 분류에 관한 연구)

  • Lee, Yun-Soo;Pheaktra, They;Lee, JongHyuk;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.12
    • /
    • pp.477-484
    • /
    • 2018
  • Owing to the advancement of computer and information technologies, an enormous number of papers have been published, and as new research fields continue to emerge, users have great difficulty finding and categorizing the papers they are interested in. To alleviate this difficulty, this paper presents a method that groups similar papers into clusters. The presented method extracts primary keywords from the abstract of each paper using TF-IDF. Based on the extracted TF-IDF values, it clusters papers with similar content using the K-means clustering algorithm. To demonstrate the practicality of the proposed method, we use papers from the FGCS journal as real data. On these data, we derive the number of clusters using the elbow method and evaluate clustering performance using the silhouette coefficient.
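
The described pipeline (TF-IDF keyword extraction, K-means clustering, elbow and silhouette evaluation) maps directly onto scikit-learn. A minimal sketch with toy abstracts standing in for the FGCS paper data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy abstracts in place of the FGCS journal data used in the study.
abstracts = [
    "mapreduce parallel processing of big data on hadoop clusters",
    "hadoop based distributed storage and mapreduce job scheduling",
    "deep learning image classification with convolutional networks",
    "convolutional neural networks for medical image segmentation",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(abstracts)   # papers x TF-IDF keyword weights

# Elbow heuristic: inspect inertia over candidate k, then validate the
# chosen k with the silhouette coefficient.
for k in (2, 3):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 3),
          round(silhouette_score(X, km.labels_), 3))
```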

Livestock Disease Forecasting and Smart Livestock Farm Integrated Control System based on Cloud Computing (클라우드 컴퓨팅기반 가축 질병 예찰 및 스마트 축사 통합 관제 시스템)

  • Jung, Ji-sung;Lee, Meong-hun;Park, Jong-kweon
    • Smart Media Journal
    • /
    • v.8 no.3
    • /
    • pp.88-94
    • /
    • 2019
  • Livestock disease is a critical issue in the livestock industry because the damage can be devastating if an outbreak is not handled quickly. Addressing livestock disease requires diagnosing its status in advance and developing systematic, scientific livestock-feeding technologies, yet domestic studies on such technologies are scarce in Korea. This paper therefore proposes a livestock disease forecasting and livestock farm integrated control system based on cloud computing for rapid management of livestock disease. The proposed system collects a variety of livestock data from wireless sensor networks and applications, and stores and manages the data in Hadoop HBase, a column-oriented database management system. It provides livestock disease forecasting and integrated farm control services through parallel data processing based on the MapReduce model. Finally, it exposes a REST-based web service so that users can access the service on various platforms, such as PCs and mobile devices.
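
As a rough illustration of the HBase side, the sketch below stores one sensor reading under a farm-prefixed row key and scans it back with the happybase client. The table name, column families, and row-key layout are assumptions for illustration, not the system's actual schema:

```python
import happybase  # third-party HBase client (Thrift gateway)

connection = happybase.Connection("hbase-master")   # hypothetical host
table = connection.table("livestock_sensors")       # hypothetical table

# Row key: farm id + timestamp, so one farm's readings are stored
# contiguously -- a common HBase time-series pattern.
row_key = b"farm001#1718000000"
table.put(row_key, {
    b"env:temperature": b"27.4",   # barn environment column family
    b"env:humidity": b"61.0",
    b"bio:body_temp": b"39.1",     # animal biometrics column family
})

# Prefix scan over the row-key design: all readings of one farm.
for key, data in table.scan(row_prefix=b"farm001#"):
    print(key, data)
```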

A study on the enhancement and performance optimization of parallel data processing model for Big Data on Emissions of Air Pollutants Emitted from Vehicles (차량에서 배출되는 대기 오염 물질의 빅 데이터에 대한 병렬 데이터 처리 모델의 강화 및 성능 최적화에 관한 연구)

  • Kang, Seong-In;Cho, Sung-youn;Kim, Ji-Whan;Kim, Hyeon-Joung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.6
    • /
    • pp.1-6
    • /
    • 2020
  • Big data on air pollutants emitted by road traffic links real-time traffic data, such as vehicle type, speed, and load, collected by permanent traffic-survey equipment (AVC, VDS, WIM, and DTG), with GIS-based road-geometry data (uphill, downhill, and turning sections) to form traffic-flow data. Unlike general data, it arrives in large volumes per unit time and in various formats. In particular, since roughly 7.4 million records per hour of detailed traffic-flow information must be collected, stored, and processed, a system that can process the data efficiently is required. This study therefore investigates open source based optimization of parallel data processing performance for visualizing big data on the air environment of road-transport pollution.
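
At roughly 7.4 million records per hour, per-record processing has to be parallelized; one common open-source route is Spark. A hedged PySpark sketch of an hourly per-segment aggregation, with an assumed schema (column names such as segment_id and nox_g are illustrative, not the study's actual format):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pollutant-aggregation").getOrCreate()

# Hypothetical per-vehicle traffic-flow records landed on HDFS.
df = spark.read.csv("hdfs:///traffic/flow/*.csv",
                    header=True, inferSchema=True)

# Parallel aggregation: emissions and flow per road segment and hour.
hourly = (df
    .withColumn("hour", F.date_trunc("hour", F.col("timestamp")))
    .groupBy("segment_id", "hour")
    .agg(F.sum("nox_g").alias("nox_g"),
         F.avg("speed_kmh").alias("avg_speed_kmh"),
         F.count("*").alias("vehicles")))

# Columnar output for the downstream visualization layer.
hourly.write.mode("overwrite").parquet("hdfs:///traffic/agg/hourly")
```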

Cause Analysis and Reduction of Safety Accident in Modular Construction - Focusing on Manufacturing and Construction Process - (모듈러 건축에서의 안전사고 원인 분석 및 저감방안 - 제작 및 시공단계 작업을 중심으로 -)

  • Jeong, Gilsu;Lee, Hyunsoo;Park, Moonseo;Hyun, Hosang;Kim, Hyunsoo
    • Journal of the Architectural Institute of Korea Structure & Construction
    • /
    • v.35 no.8
    • /
    • pp.157-168
    • /
    • 2019
  • Modular construction is regarded as safer than traditional construction because most of the manufacturing takes place in plants. Contrary to this general perception, several sets of industrial accident data and studies have pointed out that the accident rate of modular construction is not as low as practitioners expect, which indicates a clear need to improve its safety management. To enhance safety, the types and causes of accidents must be identified from accident cases so that accidents can be prevented in advance. Accordingly, this study analyzed the types and causes of accidents through a root cause analysis procedure applied to accident cases from the U.S. OSHA. Cases were classified in the order of process type, accident type, and cause of accident; following these criteria, causal factors were derived and a root cause map was created. A cross-analysis of the results shows that the activity characteristics of modular construction are related to safety accidents. In addition, prevention methods for reducing accidents in each major activity are presented from organizational, educational, and technical perspectives. The results can be used as basic safety-management data for the manufacturing and construction phases of modular construction.

Study of Efficient Algorithm for Deduplication of Complex Structure (복잡한 구조의 데이터 중복제거를 위한 효율적인 알고리즘 연구)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.14 no.1
    • /
    • pp.29-36
    • /
    • 2021
  • The amount of data generated has been growing exponentially, and the complexity of data has been increasing owing to the advancement of information technology (IT). Big data analysts and engineers have therefore been actively conducting research to minimize the analysis targets for faster processing and analysis of big data. Hadoop, which is widely used as a big data platform, provides various processing and analysis functions, including minimization of analysis targets through Hive, which is a subproject of Hadoop. However, Hive uses a vast amount of memory for data deduplication because it is implemented without considering the complexity of data. Therefore, an efficient algorithm has been proposed for data deduplication of complex structures. The performance evaluation results demonstrated that the proposed algorithm reduces the memory usage and data deduplication time by approximately 79% and 0.677%, respectively, compared to Hive. In the future, performance evaluation based on a large number of data nodes is required for a realistic verification of the proposed algorithm.
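
The paper's algorithm itself is not reproduced here, but the general idea of deduplicating complex structures cheaply can be sketched: canonicalize each nested record so that field order does not matter, hash it, and keep only fixed-size hashes in memory instead of full records. A minimal Python sketch under those assumptions:

```python
import hashlib
import json

def record_key(record):
    """Canonicalize a nested record (stable key order, no whitespace)
    and fingerprint it with SHA-256."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def deduplicate(records):
    """Keep the first occurrence of each distinct record; only the
    fixed-size hashes stay in memory, not the full records."""
    seen, unique = set(), []
    for rec in records:
        key = record_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

data = [
    {"id": 1, "tags": ["a", "b"], "meta": {"src": "x", "ver": 2}},
    {"meta": {"ver": 2, "src": "x"}, "tags": ["a", "b"], "id": 1},  # dup
    {"id": 2, "tags": ["c"], "meta": {"src": "y", "ver": 1}},
]
print(len(deduplicate(data)))  # 2
```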

Metadata Log Management for Full Stripe Parity in Flash Storage Systems (플래시 저장 시스템의 Full Stripe Parity를 위한 메타데이터 로그 관리 방법)

  • Lim, Seung-Ho
    • The Journal of Korean Institute of Information Technology
    • /
    • v.17 no.11
    • /
    • pp.17-26
    • /
    • 2019
  • RAID-5 is one of the choices for enhancing the reliability of flash storage devices. However, RAID-5 has an inherent parity-update overhead; in particular, the parity overhead of partial stripe writes is one of the crucial issues for flash-based RAID-5. In this paper, we design an efficient parity-log architecture for RAID-5 that eliminates the runtime overhead of partial parity. At runtime, partial parity is retained in buffer memory until a full stripe write completes, and the parity is written together with the full stripe write. In addition, a parity log is maintained in memory until the whole stripe group has been used for data writes; with this parity log, partial parity can be recovered after a power loss. In the experiments, the parity-log method eliminates the overhead of partial parity writes at the cost of a small number of parity-log writes, thereby reducing write amplification while preserving reliability.
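
The buffered partial-parity idea can be sketched in a few lines: XOR incoming chunks into an in-memory parity accumulator and persist the parity only once the stripe is full. The sketch below is a toy model of that policy, not the paper's FTL-level design; the on-flash parity log and the recovery path are omitted:

```python
# Minimal sketch of buffered partial parity, assuming a RAID-5-style
# stripe whose parity is the XOR of all data chunks.

STRIPE_WIDTH = 4          # data chunks per stripe (parity excluded)
CHUNK = 16                # bytes per chunk in this toy example

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class StripeBuffer:
    """Accumulate partial parity in memory; flush parity only when the
    whole stripe has been written (a full stripe write)."""
    def __init__(self):
        self.parity = bytes(CHUNK)   # zeroed accumulator
        self.filled = 0

    def write_chunk(self, data):
        self.parity = xor(self.parity, data)   # update partial parity
        self.filled += 1
        if self.filled == STRIPE_WIDTH:        # full stripe: persist it
            return self.parity                 # would be written to flash
        return None                            # parity stays in memory

buf = StripeBuffer()
for i in range(STRIPE_WIDTH):
    p = buf.write_chunk(bytes([i + 1]) * CHUNK)
print(p.hex())  # parity of the full stripe, written exactly once
```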