• Title/Summary/Keyword: Data

Component Development and Importance Weight Analysis of Data Governance (Data Governance 구성요소 개발과 중요도 분석)

  • Jang, Kyoung-Ae;Kim, Woo-Je
    • Journal of the Korean Operations Research and Management Science Society / v.41 no.3 / pp.45-58 / 2016
  • Data are important in an organization because they are used to make decisions and obtain insights. Given the increasing importance of data in modern society, data governance is needed to strengthen an organization's competitiveness. However, the concept of data governance has caused confusion because of the myriad guidelines proposed by related institutions and researchers. In this study, we re-establish the ambiguous concept of data governance and derive its top-level components by analyzing previous research. We identify the components of data governance and quantitatively analyze the relations between them using DEMATEL and context analysis techniques, which are often used to solve complex problems. Three higher components (data compliance management, data quality management, and data organization management) and 13 lower components are derived as data governance components. Importance analysis shows that data quality management, data compliance management, and data organization management are the top components of data governance, in order of priority. This study can serve as a basis for presenting standards or establishing concepts of data governance. (A minimal sketch of the DEMATEL computation follows.)
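
The following is a minimal, hedged sketch of the standard DEMATEL computation the abstract names, in Python; the 3x3 direct-influence matrix is hypothetical and merely stands in for expert ratings among the three higher components.

```python
import numpy as np

# Hypothetical direct-influence matrix among the paper's three higher
# components: compliance (C), quality (Q), organization (O).
D = np.array([[0, 3, 2],    # C -> Q, C -> O
              [2, 0, 3],    # Q -> C, Q -> O
              [1, 2, 0]])   # O -> C, O -> Q

# Normalize by the largest row sum, then compute the total-relation
# matrix T = N (I - N)^{-1}, which accumulates indirect influence.
N = D / D.sum(axis=1).max()
T = N @ np.linalg.inv(np.eye(3) - N)

prominence = T.sum(axis=1) + T.sum(axis=0)  # D + R: overall importance
relation   = T.sum(axis=1) - T.sum(axis=0)  # D - R: net cause (+) / effect (-)
for name, p, r in zip(["compliance", "quality", "organization"], prominence, relation):
    print(f"{name}: prominence={p:.3f}, relation={r:+.3f}")
```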

Data Mining for High Dimensional Data in Drug Discovery and Development

  • Lee, Kwan R.;Park, Daniel C.;Lin, Xiwu;Eslava, Sergio
    • Genomics & Informatics / v.1 no.2 / pp.65-74 / 2003
  • Data mining differs from traditional data analysis primarily on one important dimension: the scale of the data. This is why not only statistical but also computer science principles are needed to extract information from large data sets. In this paper we briefly review data mining, its characteristics, typical data mining algorithms, and potential and ongoing applications of data mining in the biopharmaceutical industry. The distinguishing characteristics of data mining lie in its understandability, its scalability, its problem-driven nature, and its analysis of retrospective or observational data, in contrast to experimentally designed data. At a high level, one can identify three types of problems for which data mining is useful: description, prediction, and search. Our brief review of data mining algorithms covers decision trees and rules, nonlinear classification methods, memory-based methods, model-based clustering, and graphical dependency models. Application areas covered are compound libraries for discovery, clinical trial and disease management data, genomics and proteomics, structural databases for candidate drug compounds, and other applications of pharmaceutical relevance. (An illustrative decision tree sketch follows.)
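
As an illustration of one algorithm family the review covers (decision trees), the hedged sketch below fits a shallow, readable tree with scikit-learn; the dataset is a stand-in for a biopharmaceutical classification task and is not from the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A shallow tree keeps the model understandable, one of the data mining
# characteristics the paper emphasizes.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print(export_text(tree, feature_names=list(X.columns)))
print("held-out accuracy:", tree.score(X_te, y_te))
```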

Environmental Survey Data Analysis by Data Fusion Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1201-1208 / 2006
  • Data fusion is generally defined as the use of techniques that combine data from multiple sources in order to draw inferences; it is also called data combination or data matching. Data fusion is divided into five types: exact matching, judgemental matching, probability matching, statistical matching, and data linking. Gyeongnam province currently conducts an annual social survey of its residents, but analysis is limited because different surveys are run on three-year cycles. In this paper, we study the fusion of environmental survey data using a SAS macro. The data fusion outputs can be used for environmental preservation and improvement. (A statistical-matching sketch follows.)
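
Below is a minimal sketch of one of the five types named above, statistical matching, written in Python rather than the SAS macro the paper actually uses; all survey variables and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Survey A (recipient) and survey B (donor) share common variables; each
# donor's unique variable is transferred to the nearest recipient.
rng = np.random.default_rng(0)
common = ["age", "income"]
survey_a = pd.DataFrame({"age": rng.integers(20, 70, 100),
                         "income": rng.normal(300, 50, 100)})
survey_b = pd.DataFrame({"age": rng.integers(20, 70, 80),
                         "income": rng.normal(300, 50, 80),
                         "env_satisfaction": rng.integers(1, 6, 80)})

# Standardize common variables so distances are comparable.
za = (survey_a[common] - survey_b[common].mean()) / survey_b[common].std()
zb = (survey_b[common] - survey_b[common].mean()) / survey_b[common].std()

# Nearest-neighbor (hot-deck) match: each recipient takes the donor
# record that is closest on the common variables.
dists = ((za.values[:, None, :] - zb.values[None, :, :]) ** 2).sum(axis=2)
survey_a["env_satisfaction"] = survey_b["env_satisfaction"].values[dists.argmin(axis=1)]
print(survey_a.head())
```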

Data Reduction Method in Massive Data Sets

  • Namo, Gecynth Torre;Yun, Hong-Won
    • Journal of information and communication convergence engineering / v.7 no.1 / pp.35-40 / 2009
  • Many researchers strive to improve the performance of RFID systems, and many papers have been written to address one of the major drawbacks of this potent technology: data management. Because an RFID system captures billions of reads, the problems arising from dirty data and sheer data volume have stirred the RFID community, and researchers are looking for ways to address them. Effective data management is especially important for handling such large volumes. This paper presents data reduction techniques that attempt to address these issues and introduces a new data reduction algorithm that may serve as an alternative way to reduce data in RFID systems. A process for extracting data from the reduced database is also presented. A performance study is conducted to analyze the new algorithm; our analysis shows the utility and feasibility of our categorization reduction algorithms. (A sketch of duplicate-read reduction follows.)
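
The paper's own categorization reduction algorithm is not detailed in the abstract, so the sketch below illustrates only the general goal with a common RFID reduction idea: collapsing duplicate reads of the same tag at the same reader into a single interval record.

```python
import pandas as pd

# Toy raw read stream: the same tag is reported repeatedly by a reader.
reads = pd.DataFrame({
    "tag":    ["EPC1", "EPC1", "EPC1", "EPC2", "EPC2"],
    "reader": ["dock", "dock", "dock", "dock", "gate"],
    "ts":     pd.to_datetime(["2009-01-01 08:00:00", "2009-01-01 08:00:01",
                              "2009-01-01 08:00:02", "2009-01-01 08:00:05",
                              "2009-01-01 08:10:00"]),
})

# Reduce to one row per (tag, reader) stay: first seen, last seen, count.
reduced = reads.groupby(["tag", "reader"], as_index=False).agg(
    first_seen=("ts", "min"), last_seen=("ts", "max"), n_reads=("ts", "size"))
print(reduced)
```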

A Study on the Data Value: In Public Data (데이터 가치에 대한 탐색적 연구: 공공데이터를 중심으로)

  • Lee, Sang Eun;Lee, Jung Hoon;Choi, Hyun Jin
    • Journal of Information Technology Services / v.21 no.1 / pp.145-161 / 2022
  • Data are a key catalyst for the development of the fourth industrial revolution and have been viewed as an essential element of new industries built on converging technologies such as artificial intelligence, augmented/virtual reality, autonomous driving, and 5G. The price and value of data are determined by the context in which users apply them, rather than by the data themselves, as in the past supplier-centric view. This study therefore began with the question of which factors increase the value of data from a user perspective rather than a supplier perspective. The study was limited to public data and to users who work with data, such as those performing analysis or development based on it. It was designed to gauge the value of data from the user's perspective, which had not previously been studied, and can be instrumental in raising the value of data for the institutions that supply and manage it.

A Study on Priorities of the Components of Big Data Information Security Service by AHP (AHP 기법을 활용한 Big Data 보안관리 요소들의 우선순위 분석에 관한 연구)

  • Biswas, Subrata;Yoo, Jin Ho;Jung, Chul Yong
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.301-314 / 2013
  • The development of IT has made human life easier through the existing computing environment, numerous mobile environments, and the internet. With the emergence of the mobile and internet environments, data are growing rapidly, and organizations can exploit these data as economic assets; this has given rise to the Big Data environment and to Big Data security services. Big Data services are increasing, but security for these services remains insufficient. In Big Data security research, studies on security by Big Data (using Big Data for security) are increasing, which creates value for security by Big Data but not security for Big Data. Accordingly, this paper shows how security for Big Data can vitalize Big Data services for organizations. In detail, it derives the priorities of the components of a Big Data information security service using AHP. (A minimal AHP sketch follows.)
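
Below is a minimal sketch of the standard AHP computation named in the title: priority weights from the principal eigenvector of a pairwise-comparison matrix, plus the consistency ratio. The 3x3 matrix is hypothetical; the paper's actual components and judgments are not given in the abstract.

```python
import numpy as np

# Hypothetical pairwise comparisons among three illustrative security
# components (Saaty's 1-9 scale, reciprocal matrix).
A = np.array([[1,   3,   5],
              [1/3, 1,   2],
              [1/5, 1/2, 1]])

# Priority weights = principal right eigenvector, normalized to sum to 1.
vals, vecs = np.linalg.eig(A)
k = np.argmax(vals.real)
w = np.abs(vecs[:, k].real)
w /= w.sum()

# Consistency ratio: CI = (lambda_max - n) / (n - 1); RI for n=3 is 0.58.
n = A.shape[0]
CI = (vals.real[k] - n) / (n - 1)
CR = CI / 0.58
print("weights:", w.round(3), "CR:", round(CR, 3))  # CR < 0.1 is acceptable
```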

Draft Design of DataLake Framework based on Abyss Storage Cluster (Abyss Storage Cluster 기반의 DataLake Framework의 설계)

  • Cha, ByungRae;Park, Sun;Shin, Byeong-Chun;Kim, JongWon
    • Smart Media Journal / v.7 no.1 / pp.9-15 / 2018
  • As an organization grows in size, many different types of data are generated in different systems, and a way is needed to improve efficiency by processing data across those systems more intelligently. With a DataLake, one creates a single domain model that accurately describes the data and can represent the most important data for the entire business. To realize the benefits of a DataLake, it is important to know how a DataLake may be expected to work and which architectural components may help to build a fully functional one. DataLake components have a life cycle that follows the data flow: as data flows into the DataLake from the point of acquisition, its metadata is captured and managed, along with data traceability, data lineage, and security aspects based on data sensitivity, across the whole life cycle. For these reasons, we have designed a DataLake framework based on Abyss Storage Cluster. (A sketch of such a metadata record follows.)
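
As a hedged illustration of the life-cycle metadata the abstract describes (traceability, lineage, sensitivity), the sketch below defines a toy catalog record; all field names are assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LakeMetadata:
    dataset_id: str
    source_system: str                                  # traceability: where it came from
    ingested_at: datetime
    lineage: list[str] = field(default_factory=list)    # upstream dataset ids
    sensitivity: str = "internal"                       # drives security handling

catalog: dict[str, LakeMetadata] = {}

def ingest(dataset_id: str, source: str, parents: list[str] | None = None,
           sensitivity: str = "internal") -> LakeMetadata:
    """Register a dataset at the point of acquisition."""
    meta = LakeMetadata(dataset_id, source, datetime.now(timezone.utc),
                        list(parents or []), sensitivity)
    catalog[dataset_id] = meta
    return meta

raw = ingest("sales_raw", source="erp")
ingest("sales_clean", source="lake", parents=[raw.dataset_id], sensitivity="confidential")
print(catalog["sales_clean"])
```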

Development and Lessons Learned of Clinical Data Warehouse based on Common Data Model for Drug Surveillance (약물부작용 감시를 위한 공통데이터모델 기반 임상데이터웨어하우스 구축)

  • Mi Jung Rho
    • Korea Journal of Hospital Management / v.28 no.3 / pp.1-14 / 2023
  • Purposes: Establishing a clinical data warehouse based on a common data model is very important for offsetting the different data characteristics of each medical institution and for drug surveillance. This study set out to establish a clinical data warehouse for Dankook university hospital for drug surveillance and to derive the main items necessary for its development. Methodology/Approach: We extracted nine years of electronic medical record data from Dankook university hospital (2013.01.01 to 2021.12.31) to build the clinical data warehouse. The extracted data were converted into the Observational Medical Outcomes Partnership Common Data Model (version 5.4). Data term mapping was performed using the hospital's electronic medical record data and the standard term mapping guide. To verify the warehouse, the use of angiotensin receptor blockers and the incidence of liver toxicity were analyzed, and the results were compared with an analysis of the hospital's raw data. Findings: The study used a total of 670,933 electronic medical records for the Dankook university clinical data warehouse. After excluding overlapping cases, the target data were mapped to standard terms: diagnoses (100% of cases), drugs (92.1%), and measurements (94.5%) were standardized, while for treatment and surgery the insurance EDI (electronic data interchange) codes were used as-is. Extraction, conversion, and loading were completed; R-language-based conversion and loading software was developed for the process, and warehouse construction was completed through data verification. Practical Implications: This study established and verified a common-data-model-based clinical data warehouse for Dankook university hospital supporting drug surveillance research. The results provide guidelines for institutions that want to build a clinical data warehouse in the future by deriving the key points necessary for doing so. (A term-mapping sketch follows.)
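
The sketch below illustrates the term-mapping step of such an ETL in Python (the study itself used R-based software and the standard term mapping guide); all local codes and concept ids are made up for illustration.

```python
import pandas as pd

# Hypothetical local EMR drug records and a local-to-standard concept map.
local_drugs = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "local_code": ["DKU-ARB-01", "DKU-ARB-02", "DKU-XYZ-99"],
})
concept_map = pd.DataFrame({
    "local_code": ["DKU-ARB-01", "DKU-ARB-02"],
    "concept_id": [1308842, 40226742],   # placeholder standard concept ids
})

# A left join keeps unmapped rows visible; in practice these feed back
# into the mapping work (the paper reports 92.1% of drug codes mapped).
drug_exposure = local_drugs.merge(concept_map, on="local_code", how="left")
unmapped = drug_exposure["concept_id"].isna()
print(f"mapped {100 * (1 - unmapped.mean()):.1f}% of rows")
print(drug_exposure)
```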

A Novel Data Prediction Model using Data Weights and Neural Network based on R for Meaning Analysis between Data (데이터간 의미 분석을 위한 R기반의 데이터 가중치 및 신경망기반의 데이터 예측 모형에 관한 연구)

  • Jung, Se Hoon;Kim, Jong Chan;Sim, Chun Bo
    • Journal of Korea Multimedia Society / v.18 no.4 / pp.524-532 / 2015
  • All data created in the Big Data era potentially contain meaning and correlations. A wide variety of data is created and stored every day across all sectors of society, and research on analyzing and grasping the meaning between data is proceeding briskly. In particular, the accuracy of meaning prediction and the data imbalance problem are important issues in the data analysis field. In this paper, we propose an R-based data prediction model using data weights and a neural network for analyzing the meaning between data. The proposed model is composed of a classification model and an analysis model. The classification model applies weights based on the normal distribution and selects optimal independent variables through multiple regression analysis; the analysis model increases the prediction accuracy of the output variable through a neural network. In the performance evaluation, we confirmed the superiority of the model: prediction performance on the raw data was measured at 87.475%. (A sketch of the two-stage idea follows.)
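
Below is a hedged Python sketch of the two-stage idea (the original is in R, and its exact weighting scheme is not given in the abstract): normal-distribution sample weights, regression-based variable selection, then a neural network.

```python
import numpy as np
from sklearn.feature_selection import f_regression
from sklearn.neural_network import MLPRegressor

# Synthetic data: only columns 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.3, size=500)

# Normal-distribution weights: samples near the center of y count more.
w = np.exp(-0.5 * ((y - y.mean()) / y.std()) ** 2)

# Variable selection via univariate regression F-scores (keep top 2).
F, _ = f_regression(X, y)
keep = np.argsort(F)[-2:]

# MLPRegressor has no per-sample weights, so weighting is approximated
# here by resampling proportionally to w before training.
idx = rng.choice(len(y), size=len(y), p=w / w.sum())
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X[idx][:, keep], y[idx])
print("selected columns:", keep, "R^2:", round(model.score(X[:, keep], y), 3))
```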

A Study on Quantitative Modeling for EPCIS Event Data (EPCIS Event 데이터 크기의 정량적 모델링에 관한 연구)

  • Lee, Chang-Ho;Jho, Yong-Chul
    • Journal of the Korea Safety Management & Science / v.11 no.4 / pp.221-228 / 2009
  • Electronic Product Code Information Services (EPCIS) is an EPCglobal standard for sharing EPC-related information between trading partners. EPCIS provides an important new capability to improve efficiency, security, and visibility in the global supply chain. EPCIS data are classified into two categories: master data (static data) and event data (dynamic data). Master data are static and constant for objects, for example the name and code of a product and its manufacturer. Event data refer to things that happen dynamically with the passing of time, for example the date of manufacture, the period and route of circulation, and the date of storage in a warehouse. There are four kinds of event data: Object Event data, Aggregation Event data, Quantity Event data, and Transaction Event data. In this paper we propose an event-based data model for an EPC Information Service repository in an RFID-based integrated logistics center. This data model can reduce the data volume and handles all kinds of entity relationships well. From the aspect of data quantity, we propose a formula model that explains how many EPCIS event data are created per business activity. Using this formula model, we can estimate the size of one day's EPCIS event data for an RFID-based integrated logistics center under an assumed scenario. (A toy counting sketch follows.)
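
The paper's actual formula model is not reproduced in the abstract, so the sketch below is only a toy counting function in its spirit; every event-generation rule in it is an assumption.

```python
def epcis_events_per_day(pallets: int, cases_per_pallet: int,
                         read_points: int) -> dict[str, int]:
    """Toy estimate of EPCIS events for one day of logistics-center activity."""
    # Commissioning each case tag, plus each pallet read at each point.
    object_events = pallets * cases_per_pallet + pallets * read_points
    aggregation_events = pallets * 2       # pack at inbound, unpack at outbound
    quantity_events = read_points          # one stock count per read point
    transaction_events = pallets           # one order association per pallet
    return {
        "ObjectEvent": object_events,
        "AggregationEvent": aggregation_events,
        "QuantityEvent": quantity_events,
        "TransactionEvent": transaction_events,
        "total": (object_events + aggregation_events
                  + quantity_events + transaction_events),
    }

print(epcis_events_per_day(pallets=200, cases_per_pallet=40, read_points=5))
```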