• Title/Summary/Keyword: Data collection framework

AQS: An Analytical Query System for Multi-Location Rice Evaluation Data

  • Nazareno, Franco; Jung, Seung-Hyun; Kang, Yu-Jin; Lee, Kyung-Hee; Cho, Wan-Sup
    • Journal of Korea Society of Industrial Information Systems / v.15 no.2 / pp.59-67 / 2010
  • Rice varietal information exchange is vital for agricultural experiments and trials. With the growing volume of rice data gathered around the world, and numerous research and development achievements, the effective collection and convenient dissemination of these data are important aspects to be dealt with. The collection of these data is continuously carried out through various international cooperation and network programs. Acquiring this information anytime, anywhere is the new challenge faced by rice breeders, scientists, and crop information specialists who need to perform rapid analyses and obtain significant results in rice research, thereby improving rice production. To address these constraints, we propose an Online Analytical Query System, a web query application that provides breeders and rice scientists around the world with a fast web search engine for rice varieties, giving users the freedom to choose the trial in which a variety has been used, trait observation parameters, and geographical, weather, and location specifications. The application uses data warehouse techniques and OLAP to summarize the agricultural trials conducted, and statistical analysis to derive the outstanding varieties used in these trials, consolidated in a Model-View-Controller web framework.

A Study on the Design and Implementation of Metadata for Archival and Manuscripts Control (기록물정보 관리를 위한 메타데이터 설계와 구현에 관한 연구)

  • 김현희
    • Journal of the Korean Society for Information Management / v.18 no.4 / pp.57-79 / 2001
  • This study designs and implements a metadata management system for archival and manuscripts control. It has two purposes: first, to organize the collection held by the Institution of Korean Church History using the proposed metadata format; second, to suggest a model and framework for managing the collection. The proposed metadata is designed based on ISAD(G), USMARC AMC, EAD, and Ebind. Using the proposed metadata, a collection management system that allows integrated retrieval is implemented. To evaluate the efficiency of the proposed system and to gather baseline data for its improvement, a questionnaire survey was conducted by e-mail. The evaluation results will be used to improve and upgrade the proposed system, and a phased implementation for applying the system to digital libraries is suggested.


Big Data, Business Analytics, and IoT: The Opportunities and Challenges for Business (빅데이터, 비즈니스 애널리틱스, IoT: 경영의 새로운 도전과 기회)

  • Jang, Young Jae
    • The Journal of Information Systems / v.24 no.4 / pp.139-152 / 2015
  • With the advancement of Internet/IT technologies and increased computational power, massive data can now be collected, stored, and processed. The availability of large databases has brought forth a new era in which companies are hard pressed to find innovative ways to utilize the immense amounts of data at their disposal. Indeed, data has opened a new age of business operations and management. There are already many cases of innovative businesses reaping success thanks to scientific decisions based on data analysis and mathematical algorithms. Big Data is a new paradigm in itself. In this article, Big Data is viewed as a new perspective rather than a new technology. This value-centric definition of Big Data provides new insights and opportunities. Moreover, Business Analytics, a framework for creating tangible results in management, is introduced. Then the Internet of Things (IoT), another innovative concept in data collection and networking, is presented, along with how it can be interpreted together with Big Data from the value-centric perspective. The challenges and opportunities arising from these new concepts are also discussed.

Efficient K-Anonymization Implementation with Apache Spark

  • Kim, Tae-Su; Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.17-24 / 2018
  • Today, we are living in the era of data and information. With the advent of the Internet of Things (IoT), the popularity of social networking sites, and the development of mobile devices, a large amount of data is being produced in diverse areas. The collection of such data generated in various areas is called big data. As the importance of big data grows, there has been a growing need to share big data containing information about individual entities. Because big data contains sensitive information about individuals, directly releasing it for public use may violate existing privacy requirements. Thus, privacy-preserving data publishing (PPDP) has been actively studied to share big data containing personal information for public use while preserving the privacy of individuals. K-anonymity, the most popular method in the area of PPDP, transforms each record in a table such that at least k records have the same values for the given quasi-identifier attributes, so that each record is indistinguishable from the other records in the same equivalence class. As the size of big data continues to grow, there is a growing demand for methods that can efficiently anonymize vast amounts of data. Thus, in this paper, we develop an efficient k-anonymity method using the Spark distributed framework. Experimental results show that the developed method achieves significant gains in processing time.
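As a rough illustration of the k-anonymity property described in this abstract (not the authors' Spark implementation), the check below groups records by their quasi-identifier values and verifies that every equivalence class contains at least k records. The table, attribute names, and generalized values are hypothetical.

```python
from collections import defaultdict

def is_k_anonymous(records, quasi_ids, k):
    """True if every combination of quasi-identifier values is shared
    by at least k records (the k-anonymity property)."""
    classes = defaultdict(int)
    for rec in records:
        key = tuple(rec[q] for q in quasi_ids)  # equivalence-class key
        classes[key] += 1
    return all(count >= k for count in classes.values())

# Hypothetical anonymized table: ages generalized to ranges, ZIPs truncated.
table = [
    {"age": "20-29", "zip": "123**", "disease": "flu"},
    {"age": "20-29", "zip": "123**", "disease": "cold"},
    {"age": "30-39", "zip": "456**", "disease": "flu"},
    {"age": "30-39", "zip": "456**", "disease": "asthma"},
]
print(is_k_anonymous(table, ["age", "zip"], 2))  # True: each class has 2 records
print(is_k_anonymous(table, ["age", "zip"], 3))  # False: classes are too small
```

A real anonymizer would also choose generalizations that make this check pass with minimal information loss; the paper's contribution is doing that efficiently at scale with Spark.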

The Life of Patients with a Heart Transplant (심장 이식 수혜자의 삶)

  • Song, Yeoung-Suk
    • Journal of Korean Academy of Nursing / v.37 no.1 / pp.64-71 / 2007
  • Purpose: The main purpose of this study was to develop a substantive theory on the life of patients with heart transplantation in the context of Korean society and culture. The research question was 'What is the life of patients with a heart transplant like?'. Method: A grounded theory method guided the data collection and analysis. Participants were 12 adults who regularly visited a cardiovascular outpatient clinic in a medical center. The data were collected through in-depth interviews, and analyses were performed simultaneously. Result: 'Developing a new life to live on borrowed time' was the core category in this study. It revealed two types of life: living in peace and thinking positively. Conclusion: This study provides a framework for the development of individualized nursing interventions to care for patients with heart transplantation. The findings may provide pointers for health professionals about ways to improve support for heart transplant recipients.

A Study on Caring Experience from their Spouses Perceived by Hemodialysis Patients : A Grounded Theory (혈액투석환자가 지각하는 배우자 돌봄 경험)

  • Kim, Hyo-Bin
    • The Korean Journal of Rehabilitation Nursing / v.8 no.2 / pp.157-164 / 2005
  • Purpose: To develop a substantive theory that represents the caring experience from spouses as perceived by hemodialysis patients. Method: A grounded theory method guided the data collection and analysis. A purposeful sample of 15 hemodialysis patients participated from April 2005 to September 2005. The data were collected through in-depth interviews. All interviews were audiotaped and transcribed verbatim. Constant comparative analysis was performed simultaneously. Result: The core category of the caring experience from spouses as perceived by hemodialysis patients was identified as "Re-establishment of life". The process was categorized into four stages: "Escaping", "Accepting", "Enduring", and "Transcending". Conclusion: This study provides a framework for the development of individualized nursing interventions to care for hemodialysis patients.


A Study of Methodology for Automatic Construction of OWL Ontologies from Sejong Electronic Dictionary (대용량 OWL 온톨로지 자동구축을 위한 세종전자사전 활용 방법론 연구)

  • Song, Do Gyu
    • Language and Information / v.9 no.1 / pp.19-34 / 2005
  • An ontology is an indispensable component in the intelligent and semantic processing of knowledge and information, such as in the semantic web. However, ontology construction requires a vast amount of data collection and arduous effort in processing these unstructured data. This study proposes a methodology to automatically construct and generate ontologies from the Sejong Electronic Dictionary. As the Sejong Electronic Dictionary is structured in XML format, it can be processed automatically by programmed tools into OWL (Web Ontology Language)-based ontologies as specified by the W3C. This paper presents the process and a concrete application of this methodology.


Augmentation of Hidden Markov Chain for Complex Sequential Data in Context

  • Sin, Bong-Kee
    • Journal of Multimedia Information System / v.8 no.1 / pp.31-34 / 2021
  • The classical HMM is defined by a parameter triple λ = (π, A, B), where each parameter represents a collection of probability distributions: the initial state, state transition, and output distributions, in that order. This paper proposes a new stationary parameter e = (e1, e2, …, eN), where N is the number of states and ei describes the probability that an input pattern y ends in state xt = i at time t followed by nothing. It is often said that all is well that ends well. We argue here that all should end well. The paper sets the framework for the theory and presents efficient inference and training algorithms based on dynamic programming and expectation-maximization. The proposed model is applicable to analyzing any sequential data in which two or more finite segmental patterns are concatenated, each forming a context for its neighbors. Experiments on online Hangul handwriting characters demonstrated the effect of the proposed augmentation in terms of highly intuitive segmentation, improved recognition performance, and a 13.2% error rate reduction.

Estimating the AUC of the MROC curve in the presence of measurement errors

  • G, Siva; R, Vishnu Vardhan; Kamath, Asha
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.533-545 / 2022
  • The collection of data on several variables, especially in the field of medicine, gives rise to the problem of measurement errors. The presence of such measurement errors may influence the outcomes or the parameter estimates of the model. In a classification scenario, measurement errors affect the intrinsic and summary measures of the Receiver Operating Characteristic (ROC) curve. In the context of the ROC curve, only a few researchers have attempted to study the problem of measurement errors in estimating the area under the curve, and only in a univariate setup. In this paper, we work on the estimation of the area under the multivariate ROC (MROC) curve in the presence of measurement errors. The proposed work is supported by a real dataset and simulation studies. Results show that the proposed bias-corrected estimator helps in correcting the AUC with minimum bias and minimum mean square error.

RHadoop platform for K-Means clustering of big data (빅데이터 K-평균 클러스터링을 위한 RHadoop 플랫폼)

  • Shin, Ji Eun; Oh, Yoon Sik; Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.609-619 / 2016
  • RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. In this paper, we implement the K-Means algorithm on the MapReduce framework with RHadoop to make the clustering method applicable to large-scale data. The main idea is to introduce a combiner as a function of the map output to decrease the amount of data that must be processed by the reducers. We show that our K-Means algorithm using RHadoop with a combiner is faster than the regular algorithm without a combiner as the size of the data set increases. We also implement the Elbow method with MapReduce for finding the optimal number of clusters for K-Means clustering on a large dataset. A comparison of our MapReduce implementation of the Elbow method with the classical kmeans() in R on small data showed similar results.
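The combiner idea in the abstract above can be sketched outside Hadoop: rather than shipping every point to the reducer, the map side emits one partial (sum, count) pair per centroid per input split, and the reducer merges these partials before recomputing centroids as means. This is a minimal plain-Python sketch of that one K-Means iteration, not the authors' RHadoop code; the splits, centroids, and function names are illustrative.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the closest centroid by squared Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def map_combine(split, centroids):
    """Map side with combiner: emit one (sum, count) pair per centroid
    for the whole split, instead of one record per point."""
    dims = len(centroids[0])
    sums = defaultdict(lambda: [0.0] * dims)
    counts = defaultdict(int)
    for pt in split:
        i = nearest(pt, centroids)
        for d, x in enumerate(pt):
            sums[i][d] += x
        counts[i] += 1
    return {i: (sums[i], counts[i]) for i in counts}

def reduce_step(partials, centroids):
    """Reducer: merge per-split partials, recompute each centroid as a mean."""
    dims = len(centroids[0])
    sums = defaultdict(lambda: [0.0] * dims)
    counts = defaultdict(int)
    for part in partials:
        for i, (s, n) in part.items():
            for d, x in enumerate(s):
                sums[i][d] += x
            counts[i] += n
    return [tuple(x / counts[i] for x in sums[i]) if counts[i] else centroids[i]
            for i in range(len(centroids))]

# One iteration over two "splits" with initial centroids (0,) and (10,).
splits = [[(1.0,), (2.0,)], [(9.0,), (11.0,)]]
cents = [(0.0,), (10.0,)]
partials = [map_combine(s, cents) for s in splits]
print(reduce_step(partials, cents))  # [(1.5,), (10.0,)]
```

The gain the paper measures comes from the combiner: each split contributes at most K small records to the shuffle, regardless of how many points it holds.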