• Title/Summary/Keyword: Big data Processing

Search Results: 1,063

Implementation and Performance Measuring of Erasure Coding of Distributed File System (분산 파일시스템의 소거 코딩 구현 및 성능 비교)

  • Kim, Cheiyol;Kim, Youngchul;Kim, Dongoh;Kim, Hongyeon;Kim, Youngkyun;Seo, Daewha
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1515-1527 / 2016
  • With the growth of big data, machine learning, and cloud computing, the importance of storage that can hold large amounts of unstructured data has been increasing. Commodity-hardware-based distributed file systems such as MAHA-FS, GlusterFS, and the Ceph file system have therefore received much attention because of their scale-out capability and low cost. For data fault tolerance, most of these file systems initially used replication. However, as storage sizes grow to tens or hundreds of petabytes, the low space efficiency of replication has come to be regarded as a problem. This paper applies an erasure coding fault tolerance policy to MAHA-FS for higher space efficiency and introduces the VDelta technique to solve the data consistency problem. We compare the performance of two file systems with different I/O processing architectures: MAHA-FS, which is server-centric, and GlusterFS, which is client-centric. We found that the erasure coding performance of MAHA-FS is better than that of GlusterFS.
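
To illustrate the space-efficiency argument behind the paper, here is a minimal Python sketch (with illustrative parameters, not MAHA-FS or GlusterFS defaults) comparing the raw-storage overhead of 3-way replication with a k+m erasure-coded layout, plus a toy single-parity reconstruction as the simplest special case of erasure coding.

```python
# Minimal illustration of replication vs. erasure-coding storage overhead.
# Parameters are illustrative only; they are not taken from the paper.

def storage_overhead_replication(copies: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(copies)

def storage_overhead_erasure(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data under a (k data + m parity) code."""
    return (k + m) / k

print(storage_overhead_replication(3))      # 3.0  -> 200% overhead
print(storage_overhead_erasure(k=8, m=4))   # 1.5  ->  50% overhead, tolerates 4 lost chunks

# Toy reconstruction with a single XOR parity chunk:
data_chunks = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity = bytes(a ^ b ^ c for a, b, c in zip(*data_chunks))

# Recover a lost chunk (say chunk 1) from the survivors plus parity.
recovered = bytes(a ^ c ^ p for a, c, p in zip(data_chunks[0], data_chunks[2], parity))
assert recovered == data_chunks[1]
```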

Case Study for Information Quality Maturity Model (정보 품질 성숙도 모델에 관한 연구)

  • Kim Chang-Jae;Choi Yong-Rak;Rhew Sung-Yul
    • The KIPS Transactions:PartD / v.13D no.4 s.107 / pp.557-564 / 2006
  • Information, when used effectively, contributes to profit creation, supports quick management decisions, and is an important resource that can be reused. Recent information systems improve an enterprise's competitiveness by reflecting users' various requirements, and they are becoming larger and more complex to adapt to rapidly changing circumstances; accordingly, the importance of information quality is increasingly emphasized. The biggest problem is that user requirements are supported by low-quality data. When business management relies on low-quality information, a company inevitably loses competitiveness against its rivals in areas such as strategy establishment, strategy execution, and management focus. Low-quality information increases the time and expense needed to correct or revise inaccurate data and makes it hard to obtain correct information in specific situations. To solve these problems, high-quality data must be obtained through a clear understanding of the data, the establishment of a data management system, and systematic data management. Until now, studies related to information quality have been developed only partially, and no systematic methodology covering the whole scope of information quality management has existed. Therefore, this paper shows how to extract processes for information quality management and the related evaluation factors using the five levels of the CMM (Capability Maturity Model) as information quality assurance process levels. We hope this paper contributes to the competitiveness of companies and organizations through an information quality improvement management process.

Application and Utilization of Environmental DNA Technology for Biodiversity in Water Ecosystems (수생태계 생물다양성 연구를 위한 환경유전자(environmental DNA) 기술의 적용과 활용)

  • Kwak, Ihn-Sil;Park, Young-Seuk;Chang, Kwang-Hyeon
    • Korean Journal of Ecology and Environment / v.54 no.3 / pp.151-155 / 2021
  • The application of environmental DNA to domestic ecosystems is accelerating, but the processing and analysis of the produced data are limited, doubts are raised about the reliability of the biological taxa identification data that are produced, and quantification and improvement of analysis methods are needed for each sample medium (target sample, water, air, sediment, gastric contents, feces, etc.). Therefore, to secure the reliability and accuracy of biodiversity research using environmental DNA in domestic ecosystems, it is absolutely necessary to actively use the databases accumulated through ecological taxonomy, to apply verification procedures, and to have experts verify the resolution of the data produced by gene sequence analysis. Environmental DNA research cannot be addressed by molecular biology alone; interdisciplinary cooperation across ecology, taxa identification, genetics, and informatics is important to secure the reliability of the produced data, and an information sharing platform that researchers dealing with various media can approach together is urgently needed. Development is expected to proceed rapidly, and the accumulated data are expected to grow into big data within a few years.

A Study on the Measurement and Comparison(IEC 60079-32-2) of Flammable Liquid Conductivity (인화성 액체 도전율에 관한 측정 및 비교(IEC 60079-32-2) 연구)

  • Lee, Dong Hoon;Byeon, Junghwan
    • Journal of the Korean Society of Safety / v.34 no.4 / pp.22-31 / 2019
  • Flammable liquid conductivity is an important factor in determining electrostatic charge generation in fire and explosion hazardous areas, so studying the physical properties of flammable liquids is necessary. In particular, the conductivity of liquids handled in processes in fire and explosion hazardous areas such as chemical plants is classified as a main evaluation item for risk assessment and risk control according to the IEC standard, and measuring devices and related data are required for the material's handling conditions, such as temperature and mixing ratio, to prevent electrostatic-related fires and explosions. IEC 60079-32-2 [Explosive Atmospheres - Part 32-2 (Electrostatic hazards - Tests)] describes the measuring device standard and the conductivity of single substances, but it provides no measurement data for handling conditions such as the mixing ratio and temperature of flammable liquids, nor usage and measurement examples. We improved measurement reliability by improving the structure, material, and measurement method of the measuring device with reference to the IEC standard, and developed and manufactured a measurement device in-house. Using measuring, data acquisition, processing, and PC communication functions, we measured the conductivity of flammable liquids handled in fire and explosion hazardous places, and compared and verified the results against NFPA 77 (Recommended Practice on Static Electricity), Annex B, Table B.2, Static Electric Characteristics of Liquids. This work will contribute to the prevention of electrostatic-related disasters by enabling preliminary measures for fire and explosion prevention and by providing technical guidance for electrostatic risk assessment and risk control based on flammable liquid conductivity measurement. In addition, based on the experimental results, a big data foundation can be created by compiling electrostatic physical characteristic data of flammable liquids by process and material, which is expected to help add specific conductivity information for flammable liquids to the physical and chemical characteristics section of the MSDS.
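
As a rough illustration of the data-processing step only, the sketch below converts a measured current and applied voltage into a liquid conductivity value using an assumed cell constant; the function name and all numeric values are placeholders, not the paper's device or data.

```python
# Hypothetical post-processing of a conductivity measurement.
# sigma = K * I / V, where K is the cell constant of the measuring cell (1/m),
# I is the measured current (A) and V is the applied voltage (V).
# Values below are placeholders, not data from the paper or from NFPA 77.

def liquid_conductivity_pS_per_m(current_a: float, voltage_v: float, cell_constant_per_m: float) -> float:
    """Return conductivity in picosiemens per metre (pS/m)."""
    conductance_s = current_a / voltage_v          # G = I / V (siemens)
    sigma_s_per_m = conductance_s * cell_constant_per_m
    return sigma_s_per_m * 1e12                    # S/m -> pS/m

# Example: 2 nA measured at 100 V with a cell constant of 10 m^-1.
print(liquid_conductivity_pS_per_m(2e-9, 100.0, 10.0))  # 200.0 pS/m
```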

Verification of Ground Subsidence Risk Map Based on Underground Cavity Data Using DNN Technique (DNN 기법을 활용한 지하공동 데이터기반의 지반침하 위험 지도 작성)

  • Han Eung Kim;Chang Hun Kim;Tae Geon Kim;Jeong Jun Park
    • Journal of the Society of Disaster Information / v.19 no.2 / pp.334-343 / 2023
  • Purpose: In this study, cavity data found through ground cavity exploration were combined with underground facility data to derive correlations, and a ground subsidence risk prediction map was verified based on an AI algorithm. Method: The study was conducted in three stages: data investigation and big data collection related to risk assessment; data pre-processing for AI analysis; and verification of the ground subsidence risk prediction map using the AI algorithm. Result: Analysis of the ground subsidence risk prediction map confirmed the distribution of risk grades in three levels (emergency, priority, and general) for Busanjin-gu and Saha-gu. In addition, by arranging the predicted ground subsidence risk grades for each road route section, it was confirmed that 3 out of 61 sections in Busanjin-gu and 7 out of 68 sections in Saha-gu included roads with an emergency grade. Conclusion: Based on the verified ground subsidence risk prediction map, it is possible to provide citizens with a safe road environment by setting exploration sections according to the risk grade and conducting investigations.
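
The abstract does not give the network architecture, so the following is only a minimal sketch of the kind of three-grade (emergency / priority / general) DNN classifier the risk map implies, using scikit-learn; the features and data are synthetic placeholders, not the study's cavity and facility dataset.

```python
# Minimal sketch of a three-grade subsidence-risk classifier (placeholder data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((300, 5))            # e.g. cavity density, pipe age, soil type, ...
y = rng.integers(0, 3, 300)         # 0 = general, 1 = priority, 2 = emergency

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```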

Methodology for Estimating Highway Traffic Performance Based on Origin/Destination Traffic Volume (기종점통행량(O/D) 기반의 고속도로 통행실적 산정 방법론 연구)

  • Howon Lee;Jungyeol Hong;Yoonhyuk Choi
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.23 no.2 / pp.119-131 / 2024
  • Understanding accurate traffic performance is crucial for ensuring efficient highway operation and providing a sustainable mobility environment. However, immediate and precise estimation of highway traffic performance faces challenges because of infrastructure and technological constraints, data processing complexities, and limitations in using integrated big data. This paper introduces a framework for estimating traffic performance by analyzing real-time data sourced from the toll collection systems and dedicated short-range communications used on highways. In particular, this study addresses data errors arising from segmented information that affects the individual travel trajectories of vehicles, and establishes a more reliable Origin-Destination (OD) framework. The study revealed that trip linkage is necessary for accurate estimation when consecutive segments of an individual vehicle's travel within the OD occur within a 20-minute window. By linking these trip ODs, the daily average highway traffic performance for South Korea was estimated to be 248,624 thousand vehicle-kilometers per day, an increase of approximately 458 thousand vehicle-kilometers per day compared to the 248,166 thousand vehicle-kilometers per day reported in the highway operations manual. This outcome highlights the potential for supplementing previously omitted traffic performance data through the methodology proposed in this study.
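
The 20-minute trip-linkage rule described in the abstract can be sketched as follows: consecutive segments of the same vehicle are merged into one OD trip whenever the time gap between them is at most 20 minutes. The record fields and place names below are assumptions for illustration, not the paper's data schema.

```python
# Hedged sketch of the 20-minute trip-linkage rule; field names are assumptions.
from datetime import datetime, timedelta

LINK_WINDOW = timedelta(minutes=20)

def link_trips(segments):
    """Merge a vehicle's time-ordered segments into linked OD trips
    when the gap to the next segment is within 20 minutes."""
    linked = []
    for seg in sorted(segments, key=lambda s: s["entry_time"]):
        if linked and seg["entry_time"] - linked[-1]["exit_time"] <= LINK_WINDOW:
            linked[-1]["exit"] = seg["exit"]              # extend the destination
            linked[-1]["exit_time"] = seg["exit_time"]
        else:
            linked.append(dict(seg))                      # start a new OD trip
    return linked

segments = [
    {"entry": "Seoul", "exit": "Suwon",   "entry_time": datetime(2024, 1, 1, 9, 0),  "exit_time": datetime(2024, 1, 1, 9, 30)},
    {"entry": "Suwon", "exit": "Daejeon", "entry_time": datetime(2024, 1, 1, 9, 45), "exit_time": datetime(2024, 1, 1, 11, 0)},
]
print(link_trips(segments))   # one linked Seoul -> Daejeon trip (15-minute gap <= 20)
```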

An Algorithm for Finding a Relationship Between Entities: Semi-Automated Schema Integration Approach (엔티티 간의 관계명을 생성하는 알고리즘: 반자동화된 스키마 통합)

  • Kim, Yongchan;Park, Jinsoo;Suh, Jihae
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.243-262 / 2018
  • Database schema integration is a significant issue in information systems. Because schema integration is a time-consuming and labor-intensive task, many studies have attempted to automate it. Researchers typically use XML as the source schema and leave much of the work to DBA intervention; for example, various naming conflicts related to relationship names arise during schema integration, and in the past the DBA had to intervene to resolve them. In this paper, we introduce an algorithm that automatically generates relationship names to resolve the relationship name conflicts that occur during schema integration. The algorithm is based on an Internet collocation dictionary and an English example sentence dictionary: the relationship between two entities is generated by analyzing, through natural language processing, example sentences extracted from the dictionary data. By building a semi-automated schema integration system and testing this algorithm, we found that it showed about 90% accuracy. Using this algorithm, the naming conflicts that occur during schema integration can be resolved automatically without DBA intervention.
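
A highly simplified sketch of the idea: given example sentences retrieved for a pair of entity names, pick the most frequent word occurring between the two mentions as the candidate relationship name. The tokenization, stop-word list, and sentences here are toy assumptions, not the paper's dictionary sources or NLP pipeline, which would keep only verbs via POS tagging.

```python
# Toy sketch: derive a candidate relationship name for two entities from example
# sentences by counting the words that appear between the entity mentions.
from collections import Counter
import re

def candidate_relationship(entity_a: str, entity_b: str, sentences: list[str]) -> str:
    between_words = Counter()
    pattern = re.compile(rf"{entity_a}\s+(.*?)\s+{entity_b}", re.IGNORECASE)
    for sentence in sentences:
        match = pattern.search(sentence)
        if match:
            between_words.update(match.group(1).lower().split())
    # Drop a few obvious stop words; a real pipeline would keep only verbs.
    for stop in ("a", "an", "the", "is", "are", "and"):
        between_words.pop(stop, None)
    return between_words.most_common(1)[0][0] if between_words else "related_to"

examples = [
    "A professor teaches a course every semester.",
    "Each professor teaches the course in English.",
    "The professor supervises students and teaches a course.",
]
print(candidate_relationship("professor", "course", examples))  # -> "teaches"
```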

A Study of Relationship Derivation Technique using object extraction Technique (개체추출기법을 이용한 관계성 도출기법)

  • Kim, Jong-hee;Lee, Eun-seok;Kim, Jeong-su;Park, Jong-kook;Kim, Jong-bae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.309-311 / 2014
  • Despite increasing demand for big data applications based on the analysis of scattered unstructured data, few relevant studies have been reported. Accordingly, the present study suggests a technique that enables sentence-based semantic analysis by extracting objects from collected web information and automatically analyzing the relationships between those objects using collective intelligence and language processing technology. Specifically, collected information is stored in a DBMS in structured form, and then morpheme and feature information is analyzed. The obtained morphemes are classified into objects of interest, marginal objects, and objects of non-interest. Then, with an inter-object attribute recognition technique, the relationships between objects are analyzed in terms of their degree, scope, and nature. The analysis of relevance between pieces of information is based on certain keywords and uses an inter-object relationship extraction technique that can determine positivity and negativity. The study also suggests a method to design a system suited to real-time, large-capacity processing and applicable to high value-added services.
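
As a rough illustration of the classification and polarity steps, the sketch below classifies extracted terms against interest lists and scores the positivity or negativity of a sentence linking two objects with a small lexicon; the lexicons, categories, and example sentence are placeholder assumptions standing in for the paper's morpheme analysis and attribute recognition.

```python
# Placeholder sketch: classify extracted terms and score object-pair polarity.
OBJECTS_OF_INTEREST = {"battery", "display"}
MARGINAL_OBJECTS = {"charger"}
POSITIVE_WORDS = {"improves", "extends", "boosts"}
NEGATIVE_WORDS = {"drains", "damages", "degrades"}

def classify(term: str) -> str:
    if term in OBJECTS_OF_INTEREST:
        return "object of interest"
    if term in MARGINAL_OBJECTS:
        return "marginal object"
    return "object of non-interest"

def pair_polarity(sentence: str, obj_a: str, obj_b: str) -> int:
    """+1 / -1 / 0 polarity of the relationship expressed between two objects."""
    tokens = sentence.lower().split()
    if obj_a not in tokens or obj_b not in tokens:
        return 0
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)
    return (score > 0) - (score < 0)

print(classify("battery"))                                                             # object of interest
print(pair_polarity("the display drains the battery quickly", "display", "battery"))   # -1
```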

Extended Adaptation Database Construction for Oriental Medicine Prescriptions Based on Academic Information (학술 정보 기반 한의학 처방을 위한 확장 적응증 데이터베이스 구축)

  • Lee, So-Min;Baek, Yeon-Hee;Song, Sang-Ho;CHRISTOPHER, RETITI DIOP EMANE;Han, Xuan-Zhong;Hong, Seong-Yeon;Kim, Ik-Su;Lim, Jong-Tea;Bok, Kyoung-Soo;TRAN, MINH NHAT;NGUYEN, QUYNH HOANG NGAN;Kim, So-Young;Kim, An-Na;Lee, Sang-Hun;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.21 no.8 / pp.367-375 / 2021
  • The quality of medical care can be defined in terms of four aspects: effectiveness, efficiency, adequacy, and scientific-technical quality. To manage the scientific-technical aspect, medical institutions disseminate the latest knowledge annually in the form of continuing education. However, there is an obvious limit to how quickly the latest knowledge can reach clinical sites through one-time continuing education alone. If intelligent information processing technologies such as big data and artificial intelligence are applied to the medical field, the limitation of conducting research with only a small amount of information can be overcome. In this paper, we construct databases with which existing oriental medicine prescription adaptations can be extended. To do this, we collect, store, manage, and analyze information related to oriental medicine from domestic and foreign journals, and we design a processing and analysis technique for oriental medicine evidence research data to construct a database of extended oriental medicine prescription adaptations. The results can be used as basic content for evidence-based prescription information in oriental medicine-related decision support services.

Milling Cutter Selection in Machining Center Using AHP (AHP를 활용한 머시닝센터의 밀링커터 선정)

  • Lee, Kyo-Sun;Park, Soo-Yong;Lee, Dong-Hyung
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.164-170 / 2017
  • The CNC machine tool field is growing with the recent rapid development of manufacturing industries such as semiconductors, automobiles, medical devices, inspection and test equipment, mechanical metal processing equipment, aircraft, shipbuilding, and electronic equipment. However, small and medium-sized machining companies that use CNC machine tools face increasingly intense competition, and small companies receiving orders as third- or fourth-tier vendors have particular difficulty managing their businesses. Recently, company S experienced difficulty meeting product quality and delivery requirements because the processing method was not well understood when manufacturing a cooling plate jig made of SUS304, a material used in processing cell phone liquid crystal glass. To solve these problems, we redesigned the process to fit the size of the company and tried to manage all processes with quantified data. In doing so, we found that the cutter process, which accounts for most of the machining, needed improvement, and we investigated the correlation between the RPM and FEED of the three cutters that had been used. As a result, we found that the most urgent problem was the roughing process, which occupies more than 70% of total machining time during cutter operation. To shorten machining time and improve quality in machining the SUS304 cooling plate jig, we selected the main factors of price, tool life, maintenance cost, productivity, quality, RPM, and FEED, and used AHP to find the most suitable milling cutter. We also tried to solve company S's problems with delivery, quality, and production capacity through experimental operation with the selected cutter. The following conclusions were drawn. First, the most efficient of the three cutters currently available in the machining center proved to be the M-cutter. Second, although one additional facility appeared to be required, the productivity improvement made up for the lack of production capacity, so production was possible with existing facilities and without additional investment. Third, the company's difficulties with delivery and capacity shortfalls were resolved. Fourth, annual sales increased by KRW 109 million and annual profits by KRW 32 million. Fifth, the results confirm the usefulness of the AHP method in corporate decision making, and it can be applied to various facility investment and process improvement decisions in the future.
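
For readers unfamiliar with AHP, the sketch below computes priority weights and a consistency ratio from a pairwise comparison matrix using the standard principal-eigenvector method; the 3x3 matrix values are illustrative only and are not the paper's seven-criterion survey data.

```python
# Standard AHP calculation: priority weights from a pairwise comparison matrix
# via the principal eigenvector, plus Saaty's consistency ratio.
import numpy as np

RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def ahp_weights(matrix: np.ndarray):
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    k = np.argmax(eigenvalues.real)
    weights = np.abs(eigenvectors[:, k].real)
    weights /= weights.sum()
    n = matrix.shape[0]
    lambda_max = eigenvalues.real[k]
    ci = (lambda_max - n) / (n - 1)                      # consistency index
    cr = ci / RANDOM_INDEX[n] if RANDOM_INDEX[n] else 0.0
    return weights, cr

# Illustrative comparison of three criteria, e.g. price vs. tool life vs. productivity.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
weights, cr = ahp_weights(A)
print("weights:", weights.round(3))   # relative importance of each criterion
print("CR:", round(cr, 3))            # < 0.1 is conventionally acceptable
```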