• Title/Summary/Keyword: Distribute Processing

A Study On Recommend System Using Co-occurrence Matrix and Hadoop Distribution Processing (동시발생 행렬과 하둡 분산처리를 이용한 추천시스템에 관한 연구)

  • Kim, Chang-Bok;Chung, Jae-Pil
    • Journal of Advanced Navigation Technology / v.18 no.5 / pp.468-475 / 2014
  • Real-time recommendation is becoming harder for recommender systems because of larger preference data sets, limits on computing power, and the demands of recommendation algorithms. For this reason, distributed processing of large preference data sets is being actively studied. This paper studies a distributed processing method for large preference data sets using the Hadoop distributed processing platform and the Mahout machine learning library. The recommendation algorithm uses a co-occurrence matrix, similar to item-based collaborative filtering. The co-occurrence matrix can be processed in a distributed manner across many nodes of a Hadoop cluster; it requires a large amount of computation, but that load can be reduced through distributed processing. This paper simplifies the distributed processing of the co-occurrence matrix by reducing it from four stages to three. As a result, fewer MapReduce jobs are needed to generate the recommendation file, processing is faster, and the map output data is reduced.
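To make the co-occurrence idea concrete, the following is a minimal single-process sketch in Python. It is not the paper's Hadoop/Mahout pipeline and does not reproduce its three-stage job layout; the function names (`build_cooccurrence`, `recommend`) and the toy preference data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import permutations


def build_cooccurrence(preferences):
    """Count how often each ordered pair of items appears in the same user's history."""
    cooc = defaultdict(lambda: defaultdict(int))
    for items in preferences.values():
        for a, b in permutations(set(items), 2):
            cooc[a][b] += 1
    return cooc


def recommend(user, preferences, cooc, top_n=3):
    """Score unseen items by summing their co-occurrence counts with the user's items."""
    seen = set(preferences[user])
    scores = defaultdict(int)
    for item in seen:
        for other, count in cooc[item].items():
            if other not in seen:
                scores[other] += count
    # Sort by descending score, then by item name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical toy data: user id -> items the user has expressed a preference for.
preferences = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "B"],
    "u3": ["B", "C", "D"],
    "u4": ["A", "D"],
}
cooc = build_cooccurrence(preferences)
print(recommend("u2", preferences, cooc))
```

In a distributed setting, the pair counting in `build_cooccurrence` is the part that would typically be split across nodes, since each user's history can be processed independently before the counts are summed.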

A File Merging Scheme for Efficient Handling of Small Files in Hadoop Distributed File System (Hadoop Distribute file system에서 Small file을 효과적으로 처리하기 위한 파일 병합 기법 연구)

  • Park, Jong-Chang;Youn, Hee-Yong
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.15-17 / 2013
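  • HDFS (Hadoop Distributed File System) was designed for processing large files and is currently regarded as an ideal distributed file system. HDFS has much in common with existing distributed file systems, but it differs in that it provides fault tolerance and supports a streaming data access pattern, which lets it store large files efficiently. In practice, however, small files make up a considerable share of HDFS data sets. These numerous small files not only incur a high data processing cost but also degrade the file handling and memory performance of the master node. This paper therefore analyzes the impact of small files on HDFS and proposes a file merging scheme based on a local index file to solve these problems.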
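The general idea of merging small files behind an index can be sketched as follows. This is an in-memory illustration only, not the authors' scheme and not an HDFS API: the helper names (`merge_small_files`, `read_file`) and the index layout (file name mapped to byte offset and length) are assumptions.

```python
import io


def merge_small_files(files):
    """Concatenate small files into one blob and record (offset, length) per file name."""
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


def read_file(blob, index, name):
    """Recover one original file from the merged blob using the local index."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Hypothetical small files, held in memory instead of on a distributed file system.
small_files = {"a.txt": b"alpha", "b.txt": b"be", "c.txt": b"gamma!"}
merged, local_index = merge_small_files(small_files)
print(local_index)
print(read_file(merged, local_index, "b.txt"))
```

With this layout the storage layer tracks one merged file instead of three, and the index is the only extra structure needed to locate each original file.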

A New Request & Distribute algorithm for maintaining QoS of 802.16 based Mobile IPTV (802.16 기반의 모바일 IPTV의 QoS를 유지하기 위한 새로운 요청 & 할당 알고리즘)

  • Dong-Hyon Kim;Hee Yong Youn
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.743-746 / 2008
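  • Research on guaranteeing QoS over IEEE 802.16 BWA (Broadband Wireless Access) is very important and is one of the most actively pursued areas. The current IEEE 802.16 standard defines several mechanisms for guaranteeing QoS, but these are definitions only; the actual system design is left to the designer. The mechanisms designed so far also show weaknesses in several respects. In addition, they are tailored to the general Internet environment and are not specialized for IPTV service. This paper therefore studies how to guarantee QoS when providing IPTV service over IEEE 802.16 and proposes a request and allocation (distribute) algorithm for high bandwidth usage when delivering MPEG services in IPTV.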

The Collision Processing Design of an Online Distributed Game Server (온라인 분산게임 서버의 충돌처리 설계)

  • Lee Sung-Ug
    • The Journal of the Korea Contents Association / v.6 no.1 / pp.72-79 / 2006
  • Recently, MMORPGs (Massively Multiplayer Online Role-Playing Games) have built distributed servers to provide a seamless world. This paper proposes an efficient collision detection method. DLS is used to dynamically adjust the spatial subdivisions in each boundary region of the distributed servers. We use an index table to exploit the relationships between nodes, and perform collision detection efficiently by reconstructing the nodes of the tree. We also maintain information on the boundary regions to detect collisions efficiently, and adjust the boundary regions between distributed servers using DLS. Because DLS uses pointers, the information for each server is not needed and the boundary regions between the distributed servers are searched efficiently. Using node index points, a construction table can be built to search between a ray and its neighboring nodes. In addition, network traffic processing is reduced because a copy of the boundary regions is not needed when an object moves in real time.

An Anycast Routing Algorithm by Estimating Traffic Conditions of Multimedia Sources

  • Park, Won-Hyuck;Shin, Hye-Jin;Lee, Tae-Seung;Kim, Jung-Sun
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.213-215 / 2003
  • Multimedia has to carry data of heterogeneous types. Multicast communication techniques can supply the most appropriate infrastructure for such multimedia. Of the many multicast protocols, the core based tree (CBT) protocol is the one on which studies have been most concentrated. CBT places a core router at the center of the shared tree and transfers data through the core router. However, CBT has two problems caused by centralizing all network traffic at a core router. First, it can create a bottleneck at the core router. Second, it can incur additional processing overhead when the core router is distant from the receivers. To cope with these problems, this paper proposes an intelligent anycast routing protocol. The anycast routing attempts to distribute the centralized traffic across multiple core routers using a knowledge-based algorithm. It estimates the traffic characteristics of the multimedia data for each multicast source and, when receivers make requests, distributes the traffic effectively by assigning an appropriate core router to process the incoming traffic based on that traffic information. This method avoids additional overhead for distributing traffic because each core router uses the information estimated for the multicast sources connected to itself and the traffic processing statistics shared with the other core routers.

A Study on the Development of Radar Signal Detecting & Processor (Radar Signal Detecting & Processing 장치의 개발에 관한 연구)

  • 송재욱
    • Journal of the Korean Institute of Navigation / v.24 no.5 / pp.435-441 / 2000
  • This paper deals with the development of RACOM (Radar Signal Detecting & Processing Computer). RACOM is a radar display system specially designed for radar scan conversion, signal processing, and PCI radar image display. RACOM contains two components: (i) the RSP (Radar Signal Processor) board, a PCI-based board that receives video, trigger, heading, and bearing signals from the radar scanner and transceiver units and processes these signals to generate a high-resolution radar image, and (ii) applications that perform ordinary radar display functions such as EBL and VRM. Because RACOM is designed to meet a wide variety of specifications (types of output signal from the transceiver unit), to record radar images, and to distribute those images in real time anywhere in a networked environment, it is applicable to AIS (Automatic Identification System) and VDR (Voyage Data Recorder).

A Structure Distributed Processing Method in Data Flow Systems (Data Flow 시스템에서 구조체 분산 처리 방식)

  • Maeng, S.Y.;Hyun, W.M.;Ha, Y.H.;Lim, I.C.
    • Proceedings of the KIEE Conference / 1987.07b / pp.1125-1128 / 1987
  • This paper proposes a method for distributing and handling structure data represented as a tree. The method partitions the structure data and allocates the partitions across multiple processing elements. Each processing element includes a structure memory to store its partition and a structure controller to handle the distributed structure efficiently. Because the structure is distributed, stored in the structure memories, and handled by the structure controllers, the processing time is reduced.

Honey Bee Based Load Balancing in Cloud Computing

  • Hashem, Walaa;Nashaat, Heba;Rizk, Rawya
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.5694-5711 / 2017
  • Cloud computing technology is growing very quickly, so the process of resource allocation needs to be managed. In this paper, a load balancing algorithm based on honey bee behavior (LBA_HB) is proposed. Its main goal is to distribute the workload of multiple network links in a way that avoids underutilization and overutilization of resources. This is achieved by allocating each incoming task to a virtual machine (VM) that meets two conditions: the number of tasks currently being processed by this VM is less than the number being processed by the other VMs, and the deviation of this VM's processing time from the average processing time of all VMs is less than a threshold value. The proposed algorithm is compared with several scheduling algorithms: honey bee, ant colony, modified throttled, and round robin. The experimental results show the efficiency of the proposed algorithm in terms of execution time, response time, makespan, standard deviation of load, and degree of imbalance.
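The two allocation conditions can be illustrated with a short Python sketch. This is a simplified reading, not the LBA_HB algorithm itself: it first keeps the VMs whose processing time deviates from the average by less than the threshold, then picks the one with the fewest running tasks among them. The `VM` class, the `pick_vm` function, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class VM:
    name: str
    running_tasks: int
    processing_time: float  # hypothetical unit, e.g. seconds


def pick_vm(vms, threshold):
    """Return the least-loaded VM whose processing time is close to the average, or None."""
    avg_time = mean(vm.processing_time for vm in vms)
    # Condition 2: deviation from the average processing time is below the threshold.
    eligible = [vm for vm in vms if abs(vm.processing_time - avg_time) < threshold]
    if not eligible:
        return None
    # Condition 1 (relaxed): fewest running tasks among the eligible VMs.
    return min(eligible, key=lambda vm: vm.running_tasks)


vms = [VM("vm1", 3, 10.0), VM("vm2", 1, 30.0), VM("vm3", 2, 12.0)]
chosen = pick_vm(vms, threshold=8.0)
print(chosen.name if chosen else "no eligible VM")
```

Here `vm2` has the fewest tasks but is excluded because its processing time is far from the average, so the task goes to `vm3`.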

A Study on the Analysis Method of Artificial Intelligence for Real-Time Data Prediction (실시간 데이터 예측을 위한 인공지능 분석 방법 연구)

  • Hong, Phil-Doo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.547-549 / 2021
  • In artificial intelligence analysis, creating and verifying a model requires computational processing time because it is batch processing performed on data that has already been generated. For real-time data such as stock and defense information, however, modeling, validation, and prediction must be carried out on data as it is generated. To address this, we apply techniques that segment the data required for AI modeling tasks in time order and distribute the segments across multiple processes.
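The segment-and-distribute step can be sketched in Python as follows. The abstract does not specify the segmentation rule or the per-segment model, so everything here is an assumption: fixed-size segments, a per-segment mean standing in for the modeling work, and the function names `segment_by_time`, `summarize_segment`, and `process_segments`.

```python
from concurrent.futures import ProcessPoolExecutor


def segment_by_time(records, segment_size):
    """Sort (timestamp, value) records by time and cut them into consecutive segments."""
    ordered = sorted(records, key=lambda r: r[0])
    return [ordered[i:i + segment_size] for i in range(0, len(ordered), segment_size)]


def summarize_segment(segment):
    """Stand-in for per-segment modeling: the mean of the values in the segment."""
    values = [value for _, value in segment]
    return sum(values) / len(values)


def process_segments(segments, workers=2):
    """Run summarize_segment on each segment in a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize_segment, segments))


if __name__ == "__main__":
    # Hypothetical out-of-order stream of (timestamp, value) pairs.
    records = [(3, 30.0), (1, 10.0), (2, 20.0), (4, 40.0), (5, 50.0)]
    segments = segment_by_time(records, segment_size=2)
    print(process_segments(segments))
```

`pool.map` returns results in segment order, so the per-segment outputs stay aligned with the time order even though the segments are processed concurrently.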

Development of Big-data Management Platform Considering Docker Based Real Time Data Connecting and Processing Environments (도커 기반의 실시간 데이터 연계 및 처리 환경을 고려한 빅데이터 관리 플랫폼 개발)

  • Kim, Dong Gil;Park, Yong-Soon;Chung, Tae-Yun
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.4 / pp.153-161 / 2021
  • Real-time access is required to handle continuous, unstructured data, and management must remain flexible under dynamic conditions. A platform can be built to collect, store, and process data on a local server or on multiple servers. The former, centralized method is easy to control but creates an overload problem because all processing is performed in one unit. The latter, distributed method performs parallel processing, so it responds quickly and can easily scale system capacity, but its design is complex. This paper takes the distributed approach: it provides data collection and processing on a single platform to derive significant insights from the various data held by an enterprise or agency, presents the results intuitively on dashboards, and uses Spark to improve distributed processing performance. All services are deployed and managed using Docker. All of the data used in this study was collected from Kafka. For a file size of 4.4 gigabytes, the data processing time in Spark cluster mode was 2 minutes 15 seconds, about 3 minutes 19 seconds faster than in local mode.
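The reported timings imply a local-mode time and a speedup that the abstract does not state directly. The short Python calculation below derives them, assuming that "3 minutes 19 seconds faster" means the local-mode run took that much longer than the cluster-mode run; the variable names are illustrative.

```python
# Timings taken from the abstract, converted to seconds.
cluster_s = 2 * 60 + 15          # Spark cluster mode: 2 min 15 s
difference_s = 3 * 60 + 19       # reported gap to local mode: 3 min 19 s

# Assumption: local-mode time = cluster-mode time + reported gap.
local_s = cluster_s + difference_s
speedup = local_s / cluster_s

print(f"local mode: {local_s // 60} min {local_s % 60} s")
print(f"speedup: {speedup:.2f}x")
```

Under that assumption the local-mode run took 5 minutes 34 seconds, so cluster mode was roughly 2.5 times faster on the 4.4-gigabyte file.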