• Title/Summary/Keyword: analysis data


BaSDAS: a web-based pooled CRISPR-Cas9 knockout screening data analysis system

  • Park, Young-Kyu;Yoon, Byoung-Ha;Park, Seung-Jin;Kim, Byung Kwon;Kim, Seon-Young
    • Genomics & Informatics
    • /
    • v.18 no.4
    • /
    • pp.46.1-46.4
    • /
    • 2020
  • We developed BaSDAS (Barcode-Seq Data Analysis System), a GUI-based pooled knockout screening data analysis system, to help researchers with limited bioinformatics skills analyze pooled knockout screen data easily and effectively. BaSDAS supports the analysis of various pooled screening libraries, including yeast, human, and mouse libraries, and provides many useful statistical and visualization functions through a user-friendly web interface. We expect that BaSDAS will be a useful tool for the analysis of genome-wide screening data and will support the development of novel drugs based on functional genomics information.

Research on the Development of Big Data Analysis Tools for Engineering Education (공학교육 빅 데이터 분석 도구 개발 연구)

  • Kim, Younyoung;Kim, Jaehee
    • Journal of Engineering Education Research
    • /
    • v.26 no.4
    • /
    • pp.22-35
    • /
    • 2023
  • As information and communication technology has advanced remarkably, it has become possible to analyze various types of large-volume data generated at close to real-time speed and, on that basis, to create reliable value. Such big data analysis is becoming an important means of supporting decision-making based on scientific figures. The purpose of this study is to develop a big data analysis tool that can analyze the large amounts of data generated through engineering education. The tasks of this study are as follows. First, a database is designed to store information on entries in the National Creative Capstone Design Contest. Second, the pre-processing procedure required for analysis with big data analysis tools is checked. Finally, the data are analyzed using the developed big data analysis tool. In this study, 1,784 works submitted to the National Creative Comprehensive Design Contest from 2014 to 2019 were analyzed. When the top 10 words were selected through topic analysis, 'robot' ranked first from 2014 to 2019, and energy, drones, ultrasound, solar energy, and IoT appeared with high frequency. This result seems to reflect the current core topics and technology trends of the 4th Industrial Revolution. In addition, owing to the nature of the Capstone Design Contest, it seems that students majoring in electrical/electronic, computer/information and communication engineering, mechanical engineering, and chemical/new materials engineering, who can submit complete products for problem solving, were selected. The significance of this study is that its results can be used in engineering education as basic data for developing educational content and teaching methods that reflect industry and technology trends. Furthermore, the results of big data analysis related to engineering education are expected to serve as a means of preparing preemptive countermeasures when establishing education policies that reflect social changes.

Component Development and Importance Weight Analysis of Data Governance (Data Governance 구성요소 개발과 중요도 분석)

  • Jang, Kyoung-Ae;Kim, Woo-Je
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.41 no.3
    • /
    • pp.45-58
    • /
    • 2016
  • Data are important in an organization because they are used in making decisions and obtaining insights. Furthermore, given the increasing importance of data in modern society, data governance is needed to strengthen an organization's competitive power. However, data governance concepts have caused confusion because of the myriad guidelines proposed by related institutions and researchers. In this study, we re-established the ambiguous concept of data governance and derived its top-level components by analyzing previous research. This study identified the components of data governance and quantitatively analyzed the relations between these components by using DEMATEL and context analysis techniques, which are often used to solve complex problems. Three higher components (data compliance management, data quality management, and data organization management) and 13 lower components were derived as data governance components. Furthermore, importance analysis shows that data quality management, data compliance management, and data organization management are the top components of data governance, in that order of priority. This study can be used as a basis for presenting standards or establishing concepts of data governance.
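
As a rough illustration of the DEMATEL step mentioned in this abstract, the following sketch computes a total-relation matrix and the usual prominence (r + c) and relation (r - c) scores. The 3x3 direct-influence scores and the labels C1 to C3 are invented for illustration and are not taken from the paper; the normalization by the largest row or column sum is one common convention, assumed here.

```python
# Minimal DEMATEL sketch (stdlib only). All scores below are hypothetical.

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_inv(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [rv - f * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dematel(direct):
    """Return (total-relation matrix T, prominence r + c, relation r - c)."""
    n = len(direct)
    row_sums = [sum(row) for row in direct]
    col_sums = [sum(direct[i][j] for i in range(n)) for j in range(n)]
    scale = max(max(row_sums), max(col_sums))
    norm = [[v / scale for v in row] for row in direct]           # N = D / s
    i_minus_n = [[(1.0 if i == j else 0.0) - norm[i][j] for j in range(n)]
                 for i in range(n)]
    total = mat_mul(norm, mat_inv(i_minus_n))                     # T = N (I - N)^-1
    r = [sum(row) for row in total]                               # influence given
    c = [sum(total[i][j] for i in range(n)) for j in range(n)]    # influence received
    prominence = [ri + ci for ri, ci in zip(r, c)]
    relation = [ri - ci for ri, ci in zip(r, c)]
    return total, prominence, relation

# Hypothetical expert scores: DIRECT[i][j] = influence of component i on j.
LABELS = ["C1", "C2", "C3"]
DIRECT = [[0.0, 3.0, 2.0],
          [1.0, 0.0, 3.0],
          [2.0, 1.0, 0.0]]

total, prominence, relation = dematel(DIRECT)
for label, p, q in zip(LABELS, prominence, relation):
    print(f"{label}: prominence={p:.3f} relation={q:+.3f}")
```

A positive relation score marks a component as a net cause and a negative score as a net effect; the relation scores always sum to zero.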

An Empirical Study on Manufacturing Process Mining of Smart Factory (스마트 팩토리의 제조 프로세스 마이닝에 관한 실증 연구)

  • Taesung, Kim
    • Journal of the Korea Safety Management & Science
    • /
    • v.24 no.4
    • /
    • pp.149-156
    • /
    • 2022
  • Manufacturing process mining performs various performance analyses on event logs that record production. That is, it analyzes the event log data accumulated in the information system and extracts useful information necessary for business execution. Process data analysis by process mining analyzes actual data extracted from manufacturing execution systems (MES) to enable accurate manufacturing process analysis. To continuously manage and improve manufacturing processes, the processes need to be structured, monitored, and analyzed, but suitable technology for doing so is lacking. The purpose of this research is to propose a manufacturing process analysis method using process mining and to establish a manufacturing process mining system by analyzing empirical data. In this research, the manufacturing process was analyzed by process mining technology using transaction data extracted from MES. A relationship model of the manufacturing process and equipment was derived, and various performance analyses were performed on the derived process model from the viewpoints of work, equipment, and time. The results of this analysis are highly effective in shortening process lead times (bottleneck analysis, time analysis), improving productivity (throughput analysis), and reducing costs (equipment analysis).
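
A basic building block of process mining on event logs is the directly-follows relation: how often one activity is immediately followed by another within the same case. The sketch below counts these pairs from a toy log. The log format (case id, activity, timestamp) and the activity names are assumptions for illustration, not the MES data or the method used in the paper.

```python
from collections import Counter, defaultdict

def directly_follows(event_log):
    """Count (activity, next_activity) pairs within each case of an event log.

    event_log is an iterable of (case_id, activity, timestamp) tuples.
    """
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        traces[case_id].append((timestamp, activity))
    counts = Counter()
    for events in traces.values():
        # Order each case by timestamp, then count consecutive pairs.
        ordered = [activity for _, activity in sorted(events)]
        counts.update(zip(ordered, ordered[1:]))
    return counts

# Hypothetical event log: two production orders passing through work steps.
EVENT_LOG = [
    ("order-1", "cut", 1),
    ("order-1", "weld", 2),
    ("order-1", "paint", 3),
    ("order-2", "cut", 1),
    ("order-2", "paint", 2),
]

DFG = directly_follows(EVENT_LOG)
for (src, dst), n in sorted(DFG.items()):
    print(f"{src} -> {dst}: {n}")
```

The resulting counts can be drawn as a directly-follows graph, which is a typical starting point for bottleneck and throughput analysis.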

Education and Training of Product Data Analytics using Product Data Management System (PDM 시스템을 활용한 Product Data Analytics 교육 훈련)

  • Do, Namchul
    • Korean Journal of Computational Design and Engineering
    • /
    • v.22 no.1
    • /
    • pp.80-88
    • /
    • 2017
  • Product data analytics (PDA) is a data-driven analysis method that uses product data management (PDM) databases as its operational data. It aims to understand and evaluate product development processes indirectly through the analysis of product data from the PDM databases. To provide education and training in PDA efficiently, this study proposed an approach that combines courses for both product development and PDA in one class. The participant group for product development provides a PDM database as a result of its product development activities, and the other group, for PDA, analyzes the PDM database and provides the analysis results to the product development group, who can explain the causes of the results. The collaboration between the two groups can enhance the efficiency of the education and training course on PDA. This study also includes an example application of the approach to a graduate class on PDA and a discussion of its results.

Reliability Assessment of Machine Tools Using Failure Mode Analysis Programs (고장모드 분석 프로그램을 통한 공작기계의 신뢰성 평가)

  • Kim Bong-Suk;Lee Soo-Hun;Song Jun-Yeob;Lee Seung-Woo
    • Transactions of the Korean Society of Machine Tool Engineers
    • /
    • v.14 no.1
    • /
    • pp.15-23
    • /
    • 2005
  • For the reliability assessment of machine tools, this paper studied failure mode analyses from two viewpoints. First, this study developed a reliability data analysis program, which searches for an optimal failure distribution, such as failure rate or MTBF (Mean Time Between Failure), using failure data and reliability test data of mechanical parts on the web. Moreover, this data analysis program saves both the failure or reliability data and their failure rate or MTBF to build a database. Second, this paper conducted failure mode analysis through performance tests such as circular movement tests and vibration tests for machine tools when reliability data are not available. The developed web-based analysis program shows correlations between failure modes and performance test results and also accumulates all the data. These kinds of data analysis programs and stored data furnish valuable information for improving the reliability of mechanical systems.
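
For readers unfamiliar with the quantities named in this abstract, the sketch below shows how MTBF and failure rate relate under the simplest assumption of a constant failure rate (exponential distribution). The paper's program searches for an optimal distribution; this sketch does not do that, and the operating hours are invented for illustration.

```python
import math

def mtbf(times_between_failures):
    """Mean time between failures: total operating time / number of failures."""
    if not times_between_failures:
        raise ValueError("at least one failure interval is required")
    return sum(times_between_failures) / len(times_between_failures)

def failure_rate(times_between_failures):
    """Constant failure rate (lambda) under the exponential assumption."""
    return 1.0 / mtbf(times_between_failures)

def reliability(t, times_between_failures):
    """Probability of surviving t hours without failure: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate(times_between_failures) * t)

# Hypothetical operating hours between successive failures of one part.
HOURS = [100.0, 200.0, 300.0]

print("MTBF (h):", mtbf(HOURS))
print("failure rate (1/h):", failure_rate(HOURS))
print("R(200 h):", reliability(200.0, HOURS))
```

With other distributions (for example Weibull), the failure rate varies with time and MTBF is no longer simply its reciprocal.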

Stream Data Analysis of the Weather on the Location using Principal Component Analysis (주성분 분석을 이용한 지역기반의 날씨의 스트림 데이터 분석)

  • Kim, Sang-Yeob;Kim, Kwang-Deuk;Bae, Kyoung-Ho;Ryu, Keun-Ho
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.28 no.2
    • /
    • pp.233-237
    • /
    • 2010
  • Recent advances in sensor networks and ubiquitous techniques allow data to be collected and analyzed in real time for decision-making, overcoming the limitations imposed by time and space. Analysis and prediction of the collected data can also provide users with useful and necessary information. The data collected in a sensor network environment are stream data, which are continuous, unbounded, and sequential. Because stream data are continuous, unbounded, and large in volume, they are difficult to manage, and memory constraints and access limitations call for a dynamic processing method. Accordingly, we analyze correlations in stream data using principal component analysis, and the results of the analysis help users make decisions.
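
One way to apply principal component analysis under the memory constraint described in this abstract is to keep only a running mean and covariance, updated one reading at a time, and extract the leading component when needed. The sketch below does this with a Welford-style update and power iteration. It is an illustrative assumption, not the method of the paper, and the sensor readings are invented.

```python
import math

class StreamingCovariance:
    """Running mean and covariance of a vector stream, without storing the stream."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [[0.0] * dim for _ in range(dim)]  # sum of co-deviations

    def update(self, x):
        self.n += 1
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + d / self.n for mi, d in zip(self.mean, delta_old)]
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        for i, di in enumerate(delta_old):
            for j, dj in enumerate(delta_new):
                self.m2[i][j] += di * dj

    def covariance(self):
        if self.n < 2:
            raise ValueError("need at least two observations")
        return [[v / (self.n - 1) for v in row] for row in self.m2]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            break
        v = [wi / norm for wi in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                     for i in range(dim))
    return v, eigenvalue

# Hypothetical stream of two perfectly correlated readings (second = 2 x first).
tracker = StreamingCovariance(dim=2)
for reading in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]:
    tracker.update(reading)

COV = tracker.covariance()
PC1, VAR1 = first_component(COV)
EXPLAINED = VAR1 / (COV[0][0] + COV[1][1])
print("first component:", PC1, "explained variance ratio:", EXPLAINED)
```

Memory use depends only on the number of variables, not on the length of the stream, which is the property that matters for unbounded sensor data.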

Verification of Improving a Clustering Algorithm for Microarray Data with Missing Values

  • Kim, Su-Young
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.2
    • /
    • pp.315-321
    • /
    • 2011
  • Gene expression microarray data often include multiple missing values. Most gene expression analyses (including gene clustering analysis), however, require a complete data matrix as input. In ordinary clustering methods, a single missing value forces one to discard all the data for a gene even if the rest of the data for that gene are intact. The quality of the analysis may decrease seriously as the missing rate increases. On the other hand, imputing missing values may introduce artifacts that reduce the reliability of the analysis. To clarify this contradiction in microarray clustering analysis, this paper compared the accuracy of clustering with and without imputation over several microarray data sets with different missing rates. This paper also tested the clustering efficiency of several imputation methods, including our proposed algorithm. The results showed that, for imperfect microarray data, it is worthwhile to check the clustering result in this alternative way, without any imputed data.
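
Clustering without imputation needs a distance that tolerates missing entries. One common choice, shown here as an assumption and not as the algorithm proposed in the paper, is a partial Euclidean distance that uses only the positions observed in both profiles and rescales by the fraction of positions used. Missing values are represented as `None`.

```python
import math

def partial_distance(a, b):
    """Euclidean distance over positions observed in both profiles.

    The squared sum is rescaled by (total positions / shared positions) so that
    profiles with many missing values are not artificially close.
    Returns math.inf when the two profiles share no observed position.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

# Hypothetical expression profiles; None marks a missing measurement.
GENE_A = [1.0, None, 3.0]
GENE_B = [1.0, 2.0, 5.0]
print(partial_distance(GENE_A, GENE_B))
```

A distance of this kind can be plugged into distance-based clustering (for example hierarchical clustering), so that a gene with one missing value is kept instead of being discarded.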

Utilization and Analysis of Big-data

  • Lee, Soowook;Han, Manyong
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.4
    • /
    • pp.255-259
    • /
    • 2019
  • This study reviews the analysis and characteristics of databases built from big data and then establishes a representational strategy. Analysis of the quantity and quality of data has continued for a long time, and the place of data in the social sciences has changed with past trends and the emergence of big data. The introduction of big data is presented as a prototype of a new social science and as a useful practical example that empirically shows the need for, basis of, and direction of analysis through trend prediction services. Big data provides a future perspective as an important foundation for social change within the framework of the basic social sciences.

Reinforcement learning multi-agent using unsupervised learning in a distributed cloud environment

  • Gu, Seo-Yeon;Moon, Seok-Jae;Park, Byung-Joon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.2
    • /
    • pp.192-198
    • /
    • 2022
  • Companies are building and using their own data analysis systems in the distributed cloud according to their business characteristics. However, as businesses and data types become more complex and diverse, the demand for more efficient analytics has increased. In response to this demand, we propose an unsupervised learning-based data analysis agent to which reinforcement learning is applied for effective data analysis. The proposed agent consists of a reinforcement learning processing manager module and an unsupervised learning manager module. These two modules configure an agent with k-means clustering on multiple nodes and then perform distributed training on multiple data sets. This enables data analysis in a relatively short time compared with conventional systems that analyze large-scale data in one batch.