• Title/Summary/Keyword: distributed file system (분산형 파일 시스템)

A Study on the Data Collection Methods based Hadoop Distributed Environment (하둡 분산 환경 기반의 데이터 수집 기법 연구)

  • Jin, Go-Whan
    • Journal of the Korea Convergence Society / v.7 no.5 / pp.1-6 / 2016
  • Many studies have recently been carried out on technologies for using and analyzing big data. Government agencies and companies are increasingly adopting Hadoop as a processing platform for big data analysis. As interest in processing and analyzing big data grows, data collection technology has become a major issue alongside it. However, research on collection technology remains scarce compared with research on data analysis techniques. Therefore, this paper builds a Hadoop cluster as a big data analysis platform and collects structured data from relational databases through Apache Sqoop. In addition, it provides a system that uses Apache Flume to collect unstructured data as a stream, such as sensor data, data files of Web applications, and log files. Data collected through this convergence could serve as basic material for big data analysis.

Development on Improved of LZW Compression Algorithm by Mixed Text File for Embedded System (임베디드시스템을 위한 혼용텍스트 파일의 개선된 LZW 압축 알고리즘 구현)

  • Cho, Mi-Nam; Ji, Yoo-Kang
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.70-76 / 2010
  • The extended ELZW (EBCDIC Lempel Ziv Welch) algorithm in this paper uses a 2-byte prefix field as a pointer into a table and a 1-byte suffix field as a repeat counter. The prefix field holds a pointer (index) into the compression table, and the suffix field holds a count of overlapping or recurring text data in the compression table. To increase the compression ratio, after the compression table is constructed, the table data are packed into different bit strings according to whether they are alphabetic characters, Hangeul, or pointers. As a result, the proposed ELZW algorithm outperforms the 1-byte LZW algorithm by 5.22 percent and the 2-byte LZW algorithm by 8.96 percent.
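
For readers unfamiliar with the baseline, the following is a minimal sketch of classic LZW, the algorithm the abstract compares against. It is not the paper's ELZW: it has no repeat-counter suffix field and no bit packing by character class, and the function names `lzw_compress` and `lzw_decompress` are chosen here for illustration only.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Classic LZW: emit one table index for each longest known prefix."""
    # The table starts with every single byte value (codes 0..255).
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the table knows the string.
            current = candidate
        else:
            # Emit the index of the longest match and learn the new string.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same table on the fly while decoding."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)
```

In this sketch the codes are plain Python integers; a real encoder would still have to choose a fixed or variable code width, which is the layer where the abstract's prefix/suffix fields and bit packing apply.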

Development of Process Control Graphic System for Power Plant Using Multiple Microcomputers (다중 마이크로 컴퓨터를 이용한 발전소 공정제어 그래픽 시스템의 개발)

  • ;;;Zeungnam Bien
    • The Transactions of the Korean Institute of Electrical Engineers / v.38 no.3 / pp.217-227 / 1989
  • A process control graphic system is proposed as an efficient tool for monitoring the operation of a power plant. It uses a multi-processor structure with 60 Kbyte of shared memory as one implementation of a distributed computer system, so it is flexible, functionally extensible, and applicable to real-time processes. The shared memory is used as a real-time database that handles the process values and the operator's commands. The database files, generated either by the user-interactive graphic editor developed for the system or by a text editor, are simple and user-friendly. The process control graphic system, which can monitor the operation of the boiler and act as a backup controller if the boiler controller fails, is applied to the Ulsan power plant. As a result, it displays the operating data of the boiler process without error on 14 pages of color graphic images according to the operation menu, and it also functions well as a fault-tolerant control system.

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung; Kim, Daehee; Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems / v.5 no.8 / pp.181-188 / 2016
  • Recently, as the amount of spatial information has increased, interest in spatial information processing has grown. Spatial database systems extended from traditional relational database systems have difficulty handling large data sets because of limited scalability. SpatialHadoop, which extends Hadoop, performs poorly because its spatial computations write many intermediate results to disk. In this paper, Spatial Computation Spark (SC-Spark), an in-memory distributed processing framework, is proposed. SC-Spark extends Spark to perform spatial operations efficiently on large-scale data. In addition, a GPGPU-based SC-Spark is developed to improve the performance of SC-Spark. SC-Spark exploits Spark's ability to hold intermediate results in memory, and GPGPU-based SC-Spark can perform spatial operations in parallel using the multiple processing elements of a GPU. To verify the proposed work, experiments on a single AMD system were performed using SC-Spark and GPGPU-based SC-Spark for Point-in-Polygon and spatial join operations. The experimental results showed that SC-Spark and GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop.
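
The Point-in-Polygon operation named in the abstract can be illustrated with a minimal single-threaded sketch. It uses the standard ray-casting test and is not SC-Spark's implementation: there is no Spark or GPU code here, and the function name `point_in_polygon` and the list-of-tuples polygon representation are assumptions made for this example.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many edges a ray going right from the point crosses.

    `polygon` is a list of (x, y) vertices in order; the last vertex is
    implicitly connected back to the first. An odd crossing count means inside.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

Each point is tested independently of every other point, which is the property that makes the operation a natural candidate for the parallel execution the abstract describes. Points lying exactly on an edge are not handled specially in this sketch.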

An Incentive mechanism for VOD Streaming Under Insufficient System Resources (한정된 자원 환경에서의 주문형 비디오 스트리밍 서비스를 위한 효율적인 인센티브 메커니즘)

  • Shin, Kyuyong; Lee, Jongdeog; Shin, Jinhee; Park, Chanjin
    • Journal of the Korea Society of Computer and Information / v.18 no.4 / pp.65-77 / 2013
  • Recently, the share of Internet traffic generated by video streaming applications, including video-on-demand (VOD), has been rising steadily, whereas P2P-based naive content distribution was the main source of Internet traffic in the past. As a result, the paradigm of cooperatively distributed systems (e.g., P2P) is changing to support streaming applications. Most P2P-assisted approaches to video streaming today are based on BitTorrent because it is simple to implement and easy to adapt. However, they inherit BitTorrent's vulnerability to free-riding, which inevitably hurts their performance when system resources are limited and free-riders are present. This paper studies the weakness of existing BitTorrent-based video streaming applications to free-riding and investigates how well T-Chain (which was originally designed to prevent free-riding in cooperatively distributed systems) adapts to video streaming applications. Our experimental results show that the video streaming approach based on T-Chain outperforms most existing BitTorrent-based ones by 60% on average under limited system resources with free-riding.

The Trace Analysis of SaaS from a Client's Perspective (클라이언트관점의 SaaS 사용 흔적 분석)

  • Kang, Sung-Lim; Park, Jung-Heum; Lee, Sang-Jin
    • The KIPS Transactions:PartC / v.19C no.1 / pp.1-8 / 2012
  • Recently, with the development of broadband, the use of on-demand SaaS (Software as a Service), which takes advantage of that technology, has increased significantly. Nevertheless, digital forensics for cloud computing environments has not yet been established at either the academic or the practical level. In addition, user behavior data is unlikely to be stored on the local system; the relevant data may be stored across various remote servers. Investigators may therefore encounter problems when performing digital forensics in a cloud computing environment. Because SaaS basically uses the web to connect to the Internet service, it is important to analyze History files, Cookie files, Temporary Internet Files, physical memory, etc. from the client's viewpoint. In this paper, we propose a method for analyzing the usage traces of SaaS, one of the most popular cloud computing services.

GWB: An integrated software system for Managing and Analyzing Genomic Sequences (GWB: 유전자 서열 데이터의 관리와 분석을 위한 통합 소프트웨어 시스템)

  • Kim In-Cheol; Jin Hoon
    • Journal of Internet Computing and Services / v.5 no.5 / pp.1-15 / 2004
  • In this paper, we explain the design and implementation of GWB (Gene WorkBench), a web-based, integrated system for efficiently managing and analyzing genomic sequences. Most existing software systems that handle genomic sequences rarely provide both management and analysis facilities. The analysis programs also tend to be unit programs that cover only one or a few of the required functions. Moreover, these programs are widely distributed over the Internet and require different execution environments. Because using these programs together requires a great deal of manual work and format conversion, many life science researchers suffer great inconvenience. To overcome the problems of existing systems and to provide a more convenient system that supports genomic research effectively, this paper integrates both management and analysis facilities into a single system called GWB. The most important issues in the design of GWB are how to integrate many different analysis programs into a single software system, and how to provide the data or databases in the different formats required to run these programs. To address these issues, GWB integrates different analysis programs by using common input/output interfaces called wrappers, suggests a common format for genomic sequence data, organizes local databases consisting of a relational database and an indexed sequential file, and provides facilities for converting data among several well-known formats and for exporting local databases into XML files.
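
The format conversion and XML export described in the abstract can be pictured with a small sketch. Everything specific here is an assumption made for illustration: the abstract names neither FASTA as an input format nor any XML schema, and `parse_fasta`, `records_to_xml`, and the `sequences`/`sequence` element names are hypothetical, not GWB's actual interfaces.

```python
import xml.etree.ElementTree as ET


def parse_fasta(text):
    """Parse FASTA text into a list of (name, residues) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def records_to_xml(records):
    """Export records to an XML string (hypothetical element names)."""
    root = ET.Element("sequences")
    for name, residues in records:
        node = ET.SubElement(root, "sequence", {"name": name})
        node.text = residues
    return ET.tostring(root, encoding="unicode")
```

The list of `(name, residues)` tuples plays the role of a common in-memory format: each additional input or output format then needs only one parser or one exporter, rather than a converter for every pair of formats.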