• Title/Summary/Keyword: File distribution

Search Result 184, Processing Time 0.025 seconds

A Study on the Domain Discrimination Model of CSV Format Public Open Data

  • Ha-Na Jeong;Jae-Woong Kim;Young-Suk Chung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.12
    • /
    • pp.129-136
    • /
    • 2023
  • The government of the Republic of Korea manages the quality of public open data through a public data quality management level evaluation. Public open data is provided in various open formats such as XML, JSON, and CSV, with CSV accounting for the majority. When diagnosing the quality of public open data in CSV format, the quality diagnosis manager determines the domain of each field from the field name and the data within the field, then diagnoses it. However, this takes a lot of time because quality diagnosis is performed on large numbers of open data files. Additionally, for fields whose meaning is difficult to understand, the accuracy of the diagnosis depends on the diagnostician's ability to understand the data. This paper proposes a domain discrimination model for public open data in CSV format that uses field names and data distribution statistics, so that diagnosis results are consistent and accurate regardless of the capabilities of the person in charge, and so that diagnosis time is shortened. When the model was applied, the correct answer rate was about 77%, which is 2.8% higher than the file format open data diagnostic tool provided by the Ministry of Public Administration and Security. We therefore expect the proposed model to improve accuracy when diagnosing and evaluating the quality management level of public data.
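As a rough illustration of combining field names with data distribution statistics, the following is a toy heuristic and not the model proposed in the paper. The domain labels, keyword hints, regular expressions, and weights are all assumptions made for this example.

```python
import re

# Hypothetical keyword hints per domain; not taken from the paper.
NAME_HINTS = {
    "date": ("date", "day", "ymd"),
    "amount": ("amount", "price", "cost"),
    "code": ("code", "id"),
}

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
NUMBER_RE = re.compile(r"^-?\d+(\.\d+)?$")


def column_stats(values):
    """Return simple distribution statistics for one CSV column."""
    filled = [v.strip() for v in values if v.strip()]
    n = len(filled) or 1  # avoid division by zero for empty columns
    return {
        "date_ratio": sum(bool(DATE_RE.match(v)) for v in filled) / n,
        "number_ratio": sum(bool(NUMBER_RE.match(v)) for v in filled) / n,
        "distinct_ratio": len(set(filled)) / n,
    }


def guess_domain(field_name, values):
    """Score each domain from value statistics plus field-name hints."""
    stats = column_stats(values)
    scores = {
        "date": stats["date_ratio"],
        "amount": stats["number_ratio"],
        "code": stats["distinct_ratio"] * 0.5,  # assumed weight
    }
    name = field_name.lower()
    for domain, hints in NAME_HINTS.items():
        if any(hint in name for hint in hints):
            scores[domain] += 0.5  # assumed bonus for a matching name
    return max(scores, key=scores.get)
```

For instance, `guess_domain("reg_date", ["2023-01-01", "2023-02-15"])` scores the date domain highest because both the values and the field name point to it.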

A Watermarking Scheme Based on k-means++ for Design Drawings (k-means++ 기반의 설계도면 워터마킹 기법)

  • Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.46 no.5
    • /
    • pp.57-70
    • /
    • 2009
  • CAD design drawings based on vector data are important works in industrial fields and are regarded as content for which copyright protection is urgently needed. This paper presents a watermarking scheme based on k-means++ for CAD design drawings. A CAD design drawing consists of several layers, and each layer consists of various geometric objects such as LINE, POLYLINE, CIRCLE, ARC, 3DFACE and POLYGON. POLYLINE with LINE, 3DFACE and ARC, which are fundamental objects, make up the majority of a CAD design drawing. Therefore, the proposed scheme selects as the target object the most frequent type among the POLYLINE, 3DFACE and ARC objects in the drawing, and then selects the layers that contain the most target objects. We then cluster the target objects in the selected layers using k-means++ and embed the watermark into the geometric distribution of each group. The geometric distribution is the normalized length distribution for POLYLINE objects, the normalized area distribution for 3DFACE objects and the angle distribution for ARC objects. Experimental results verified that the proposed scheme is robust against file format conversion, layer attacks, and the various geometric editing operations provided in CAD editing tools.

Integrated Water Distribution Network System using the Mathematical Analysis Model and GIS (수리해석 모형과 GIS를 이용한 통합 용수배분 시스템)

  • Kwon, Jae-Seop;Jo, Myung-Hee
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.4 no.4
    • /
    • pp.21-28
    • /
    • 2001
  • In this study, GNLP (GIS-linked non-linear network analysis program) was developed for pipeline system analysis. GNLP obtains the input data for pipeline analysis automatically from existing GIS (geographic information system) data and provides a GUI (graphical user interface) for the user. A non-linear method based on the Hazen-Williams equation was used for the hydraulic analysis of the pipe network, and Microsoft Access, a relational database management system (RDBMS), was used as the framework of the database application. The GNLP system environment was improved so that a pipe network designer can enter the data for hydraulic analysis of a pipeline system more easily than with existing models. Furthermore, the model generates output such as pressures and water quantities in the form of tables and charts, and also exports output data to an Excel file. The model can also display data effectively for checking analysed data and supports the query function that is the core of a GIS program.
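For reference, the Hazen-Williams relation mentioned in this abstract can be sketched for a single pipe as follows. This is the commonly used SI form of the equation and is not taken from GNLP itself; the function name and the sample values are assumptions, and a network solver would apply such a relation iteratively across all pipes.

```python
def hazen_williams_head_loss(length_m, flow_m3s, diameter_m, c_factor):
    """Head loss in metres along one pipe, SI form of Hazen-Williams.

    length_m   : pipe length in metres
    flow_m3s   : volumetric flow in cubic metres per second (non-negative)
    diameter_m : inside diameter in metres
    c_factor   : Hazen-Williams roughness coefficient (dimensionless)
    """
    return 10.67 * length_m * flow_m3s ** 1.852 / (
        c_factor ** 1.852 * diameter_m ** 4.8704
    )


# Illustrative values only: a 100 m pipe, 0.2 m diameter, C = 130.
loss = hazen_williams_head_loss(100.0, 0.03, 0.2, 130.0)
```

Because head loss grows with flow raised to the power 1.852, the network equations are non-linear, which is why an iterative method is needed.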


Equity in the Delivery of Health care in the Republic of Korea (의료이용의 형평성에 관한 실증적 연구 -공.교 의료보험 피부양자를 대상으로-)

  • 명지영;문옥륜
    • Health Policy and Management
    • /
    • v.5 no.2
    • /
    • pp.155-172
    • /
    • 1995
  • This study is an empirical analysis of equity in the delivery of health care under the Korean Medical Insurance Corporation system. Its purposes are to identify the effects of income on health care utilization and to measure income-related inequity in the distribution of health care. The study starts from the fact that the health insurance program was organized to achieve the equity objective of "equal treatment for equal needs". Data on 41,828 insured persons who had been diagnosed in the 1993 Health Screening Test, together with utilization data from 1 January 1993 through 31 December 1993, were derived from the Benefit Management File. Inequity was measured by means of i) the share approach, ii) the standardized concentration curve approach, iii) an inequity index, and iv) a test for inequity. The major findings were as follows: 1. The expenditure shares of the top two quintile groups exceeded their morbidity shares, whereas the opposite was true of the bottom three quintile groups, which gave a positive HI$_{LG}$ inequity index, suggesting the presence of some inequity favoring the rich group. 2. Compared with other residential areas, the rural area showed the highest positive HI$_{LG}$ irrespective of the need indicator applied. 3. Standardized expenditure concentration indices adjusted for age, gender and need structure were also positive, and therefore still indicated inequity favoring the rich after standardization. 4. The log-likelihood ratio (LR) test for the statistical significance of income-related inequity in medical care utilization was carried out using a logistic regression model. The resulting test statistic was 176, which exceeds the 0.5 percent critical value of the chi-square distribution with 28 degrees of freedom, 50.993. Therefore, the null hypothesis of no income-related inequity in medical care utilization was rejected at the 99.5 percent confidence level. 5. A regression-based F-test was carried out to analyze income-related inequity in medical expenditure, with age, gender and morbidity indicators as explanatory variables. The hypothesis of no income-related inequity was rejected for all need indicators at the 95% confidence level.
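To make the idea of a concentration index concrete, here is a minimal sketch of the plain (unstandardized) concentration index for values ordered by income rank. It does not reproduce the standardized indices or the HI$_{LG}$ index used in the study, and the sample figures are invented for illustration.

```python
def concentration_index(values_poorest_to_richest):
    """Concentration index of a variable ordered by income rank.

    Input must be sorted from the poorest person to the richest.
    A positive result means the variable is concentrated among the
    better-off; zero means it is spread evenly across income ranks.
    """
    n = len(values_poorest_to_richest)
    mean = sum(values_poorest_to_richest) / n
    # Weight each value by its fractional income rank i / n.
    weighted = sum(
        value * (rank / n)
        for rank, value in enumerate(values_poorest_to_richest, start=1)
    )
    return 2 * weighted / (n * mean) - 1 - 1 / n


# Made-up expenditure figures for five income groups.
pro_rich = concentration_index([1, 2, 3, 4, 10])
```

In the abstract's terms, a positive index for expenditure that persists after adjusting for need is what is read as inequity favoring the rich.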


Design of GlusterFS Based Big Data Distributed Processing System in Smart Factory (스마트 팩토리 환경에서의 GlusterFS 기반 빅데이터 분산 처리 시스템 설계)

  • Lee, Hyeop-Geon;Kim, Young-Woon;Kim, Ki-Young;Choi, Jong-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.11 no.1
    • /
    • pp.70-75
    • /
    • 2018
  • Smart Factory is an intelligent factory that can enhance productivity, quality, customer satisfaction, etc. by applying information and communications technology to the entire production process including design & development, manufacture, and distribution & logistics. The precise amount of data generated in a smart factory varies depending on the factory's size and state of facilities. Regardless, it would be difficult to apply traditional production management systems to a smart factory environment, as it generates vast amounts of data. For this reason, the need for a distributed big-data processing system has risen, which can process a large amount of data. Therefore, this article has designed a Gluster File System (GlusterFS)-based distributed big-data processing system that can be used in a smart factory environment. Compared to existing distributed processing systems, the proposed distributed big-data processing system reduces the system load and the risk of data loss through the distribution and management of network traffic.

A Study on Standardization of Copyright Collective Management for Digital Contents (디지털콘덴츠 집중관리를 위한 표준화에 관한 연구)

  • 조윤희;황도열
    • Journal of the Korean Society for information Management
    • /
    • v.20 no.1
    • /
    • pp.301-320
    • /
    • 2003
  • The rapidly increasing use of the Internet, the advancement of communication networks, the explosive growth of digital contents from personal home pages to professional information services, the emergence of file exchange services, and the development of hacking techniques are some of the trends contributing to the spread of illegal reproduction and distribution of digital contents, threatening the exclusive copyrights of creative works that should be legally protected. Accordingly, there is an urgent need for a digital copyright management system that provides centralized management while acting as a bridge between copyright owners and users for smooth trading of the rights to digital contents, reliable billing, security measures, and monitoring of illegal use. In this study, I therefore examined the legal and institutional requirements for introducing a centralized management system to support the smooth distribution of digital contents, and surveyed the current status of domestic and international centralized copyright management systems. Furthermore, I tried to provide basic materials for the standardization of digital contents copyright management information by examining the essential elements of centralized digital contents management, such as the system for unique identification, the standardization of data elements, and digital rights management (DRM).

Terrestrial DTV Broadcasting Program Protection System based on Program Protection Information (방송프로그램 보호신호에 기반한 지상파 방송프로그램 보호 시스템)

  • Choo, Hyon-Gon;Lee, Joo-Young;Nam, Je-Ho
    • Journal of Broadcast Engineering
    • /
    • v.15 no.2
    • /
    • pp.192-204
    • /
    • 2010
  • As illegal online distribution of terrestrial DTV broadcast programs occurs very frequently, the need to protect broadcast programs has increased. In this paper, a new approach to implementing a protection system for terrestrial DTV broadcast programs based on program protection information (PPI) is proposed. In our approach, the broadcast program is recorded with encryption according to the redistribution condition of the PPI and packaged into a file together with the key information and the PPI. We also define a set of domain protocols to support users' fair use of broadcast programs. In the proposed system, copy control can also be provided through home domain management. Implementation results show that our system can protect broadcast programs efficiently and can support conditional distribution within the home domain in order to satisfy users' fair use.

Term Clustering and Duplicate Distribution for Efficient Parallel Information Retrieval (효율적인 병렬정보검색을 위한 색인어 군집화 및 분산저장 기법)

  • 강재호;양재완;정성원;류광렬;권혁철;정상화
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.1_2
    • /
    • pp.129-139
    • /
    • 2003
  • The PC cluster architecture is considered as a cost-effective alternative to the existing supercomputers for realizing a high-performance information retrieval (IR) system. To implement an efficient IR system on a PC cluster, it is essential to achieve maximum parallelism by having the data appropriately distributed to the local hard disks of the PCs in such a way that the disk I/O and the subsequent computation are distributed as evenly as possible to all the PCs. If the terms in the inverted index file can be classified to closely related clusters, the parallelism can be maximized by distributing them to the PCs in an interleaved manner. One of the goals of this research is the development of methods for automatically clustering the terms based on the likelihood of the terms' co-occurrence in the same query. Also, in this paper, we propose a method for duplicate distribution of inverted index records among the PCs to achieve fault-tolerance as well as dynamic load balancing. Experiments with a large corpus revealed the efficiency and effectiveness of our method.
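The interleaved placement and duplicate distribution described in this abstract can be pictured with a small sketch. It is only an illustration under simple assumptions (round-robin placement, one duplicate on the neighbouring node, at least two nodes) and is not the authors' algorithm; the function and variable names are hypothetical.

```python
def distribute_terms(clusters, num_nodes):
    """Map each term to (primary_node, duplicate_node).

    Terms of the same cluster are placed on consecutive nodes so that
    a query touching one cluster spreads its disk I/O across the PCs.
    The duplicate copy goes to the next node, which allows failover
    and gives a second choice for load balancing.
    """
    if num_nodes < 2:
        raise ValueError("duplicate distribution needs at least two nodes")
    placement = {}
    slot = 0
    for cluster in clusters:
        for term in cluster:
            primary = slot % num_nodes
            duplicate = (primary + 1) % num_nodes
            placement[term] = (primary, duplicate)
            slot += 1
    return placement


# Two made-up term clusters spread over three PCs.
layout = distribute_terms([["cat", "dog", "pet"], ["cpu", "ram"]], 3)
```

With three nodes, the three terms of the first cluster land on three different primaries, so a query containing all of them reads from every PC in parallel.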

Initial Authentication Protocol of Hadoop Distribution System based on Elliptic Curve (타원곡선기반 하둡 분산 시스템의 초기 인증 프로토콜)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Digital Convergence
    • /
    • v.12 no.10
    • /
    • pp.253-258
    • /
    • 2014
  • Recently, cloud computing technology has developed rapidly as smartphone use has increased, and more users want to receive big data services. The Hadoop framework for big data services provides the Hadoop file system and Hadoop MapReduce to support data-intensive distributed applications. However, smartphone services that use a Hadoop system are very vulnerable with respect to data authentication. In this paper, we propose an initial authentication protocol for a Hadoop system supporting smartphone services. The proposed protocol combines symmetric key cryptography with the ECC algorithm in order to support secure multiple data processing systems. In particular, when a user accesses the Hadoop system to process data, the proposed protocol uses an initial authentication key and a symmetric key instead of elliptic curve based public key operations, which improves security.

A Study on the Smart Printing Work Distribution Program to Increase the Efficiency of Managing Multiple Printers (복수의 프린터 관리효율을 증가시키기 위한 스마트한 인쇄작업 분배 프로그램 구현에 관한 연구)

  • Oh, Eun-Yeol
    • Journal of Convergence for Information Technology
    • /
    • v.9 no.10
    • /
    • pp.1-8
    • /
    • 2019
  • Printers are commonly shared among users over a wired or wireless local area network. As the number of printers on the same network increases, management of multiple printers becomes necessary. To address this, we implement a program that drives two or more printers and a computer connected by a wired or wireless network. When the computer's control unit receives a print command for a designated file, it receives status information from the printers, selects a printer, and sends the print command for execution. As a research method, we reviewed prior art and the literature to identify how this study differs from existing approaches. The purpose of the study is to distribute print commands smartly according to the real-time status information of many printers in order to increase the efficiency of printer management, and to distribute print commands according to the cumulative and usable workload of many printers so that parts replacement for many printers can be performed at once.
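The receive-status, select, and send steps in this abstract can be sketched as a simple selection rule. The status fields and the tie-breaking order below are assumptions chosen for illustration and are not the program described in the paper.

```python
from dataclasses import dataclass


@dataclass
class PrinterStatus:
    """Hypothetical status record reported by one printer."""
    name: str
    online: bool
    queued_pages: int      # pages currently waiting to print
    cumulative_pages: int  # total pages printed so far


def select_printer(printers):
    """Pick the online printer with the shortest queue.

    Ties are broken in favour of the printer with the lower cumulative
    page count, so wear is spread across the printers over time.
    Returns None when no printer is available.
    """
    candidates = [p for p in printers if p.online]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (p.queued_pages, p.cumulative_pages))


# Made-up status snapshot for three printers.
fleet = [
    PrinterStatus("A", True, 12, 50_000),
    PrinterStatus("B", True, 3, 80_000),
    PrinterStatus("C", False, 0, 10_000),
]
chosen = select_printer(fleet)
```

Here printer C is skipped because it is offline, and B is chosen over A because its queue is shorter.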