• Title/Summary/Keyword: data for analysis

Statistical analysis of metagenomics data

  • Calle, M. Luz
    • Genomics & Informatics / v.17 no.1 / pp.6.1-6.9 / 2019
  • Understanding the role of the microbiome in human health and how it can be modulated is becoming increasingly relevant for preventive medicine and for the medical management of chronic diseases. The development of high-throughput sequencing technologies has boosted microbiome research by enabling the study of microbial genomes and a more precise quantification of microbiome abundances and function. Microbiome data analysis is challenging because the data are high-dimensional, structured, multivariate, and sparse, and because they are compositional. In this review we outline some of the procedures that are most commonly used for microbiome analysis and that are implemented in R packages. We place particular emphasis on the compositional structure of microbiome data. We describe the principles of compositional data analysis and distinguish between standard methods and those that fit into compositional data analysis.

Xperanto: A Web-Based Integrated System for DNA Microarray Data Management and Analysis

  • Park, Ji Yeon;Park, Yu Rang;Park, Chan Hee;Kim, Ji Hoon;Kim, Ju Ha
    • Genomics & Informatics / v.3 no.1 / pp.39-42 / 2005
  • The DNA microarray is a high-throughput biomedical technology that monitors the expression of thousands of genes in parallel. The abundance and complexity of gene expression data have created a need for systematic management and analysis to support the many laboratories performing microarray research. To meet these demands, we developed Xperanto for integrated data management and analysis through a user-friendly web-based interface. Xperanto provides an integrated environment for management and analysis by linking computational tools with rich sources of biological annotation. In response to the growing need for data sharing, it is designed to comply with the MGED (Microarray Gene Expression Data) standards for microarray data annotation and exchange. Xperanto enables fast and efficient management of vast amounts of data and serves as a communication channel among multiple researchers within an emerging interdisciplinary field.

The analysis of flight data of B747-400 aircraft with Missed Approach (B747-400 항공기의 Missed Approach 비행자료 분석)

  • Shin, D.W.;Park, J.H.;Eun, H.B.
    • Journal of the Korean Society for Aviation and Aeronautics / v.11 no.2 / pp.93-107 / 2003
  • This study aims to improve the safety of civil aviation by establishing a systematic capability for analyzing Flight Data Recorder data. It covers reading out the UFDR (Universal Flight Data Recorder) to a personal computer, numerical analysis of the flight data, and the regulations governing missed approaches. For the analysis, flight data from a B747-400 aircraft that performed a missed approach at San Francisco (KSFO) were selected.

A guideline for the statistical analysis of compositional data in immunology

  • Yoo, Jinkyung;Sun, Zequn;Greenacre, Michael;Ma, Qin;Chung, Dongjun;Kim, Young Min
    • Communications for Statistical Applications and Methods / v.29 no.4 / pp.453-469 / 2022
  • The study of immune cellular composition has been of great scientific interest in immunology because of the generation of multiple large-scale data. From the statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive, and all the elements sum to a constant, which can be set to one in general. Standard statistical methods are not directly applicable for the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and the alternative approach using Dirichlet regression analysis, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
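The log-ratio idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the centered log-ratio (CLR) transform as one common choice of log-ratio transformation; the abstract does not say which transformation the authors use, and the cell counts below are hypothetical values chosen only for illustration.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to one."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical immune cell counts for one sample (illustrative only).
counts = [120.0, 60.0, 20.0]
fractions = closure(counts)   # compositional form: positive, sums to one
clr_values = clr(counts)      # unconstrained coordinates that sum to zero
```

Because CLR values depend only on ratios between parts, rescaling the raw counts by any positive constant leaves them unchanged, which is the property that makes log-ratio coordinates suitable for data carrying only relative information.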

Analysis and Estimation for Market Share of Biologics based on Google Trends Big Data (구글 트렌드 빅데이터를 통한 바이오의약품의 시장 점유율 분석과 추정)

  • Bong, Ki Tae;Lee, Heesang
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.2 / pp.14-24 / 2020
  • Google Trends is useful not only for restricting searches to a given period but also for reporting search volume for specific countries, regions, and cities. Previous research showed that big data from Google Trends can be used for online market analysis of opinion-sensitive products in place of an on-site survey. This study investigated the market share of tumor necrosis factor-alpha (TNF-α) inhibitors, a pharmaceutical product in great demand, using big data provided by Google Trends. In this case study, consumer interest data from Google Trends were compared with the actual product sales of the top three TNF-α inhibitors (Enbrel, Remicade, and Humira). The correlation and relative gap between sales-based market share and interest-based market share were analyzed statistically. In addition, for the country-specific analysis, three major countries (USA, Germany, and France) were selected for market share analysis of the top three TNF-α inhibitors. The analysis identified significant correlation and similarity between the two measures. In the case of Remicade's biosimilars, consumer interest in two biosimilar products (Inflectra and Renflexis) increased after FDA approval. The analytical data showed that Google Trends is a powerful tool for estimating the market share of biosimilars. This study is the first investigation of market share analysis for pharmaceutical products using Google Trends big data, and it shows that global and regional market share analysis and estimation are applicable to interest-sensitive products.
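The comparison of sales-based and interest-based market share described above might be sketched as follows. This is only an illustration under stated assumptions: the figures are hypothetical, the gap is taken to be a simple absolute difference in share (the abstract does not define its "relative gap"), and a Pearson coefficient stands in for whatever correlation measure the authors used.

```python
import math


def shares(values):
    """Convert raw totals (sales or search interest) into market shares."""
    total = sum(values)
    return [v / total for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical totals for three products (not taken from the study).
sales = [50.0, 30.0, 20.0]
interest = [45.0, 35.0, 20.0]

sales_share = shares(sales)
interest_share = shares(interest)
r = pearson(sales_share, interest_share)

# Assumed gap definition: absolute difference between the two shares.
share_gap = [abs(s - i) for s, i in zip(sales_share, interest_share)]
```

With only three products the coefficient is not statistically meaningful on its own; in practice one would compare the two share series over many time periods.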

Street Fashion Information Analysis System Design Using Data Fusion

  • Park, Hee-Chang;Park, Hye-Won
    • Proceedings of the Korean Data and Information Science Society Conference / 2005.10a / pp.35-45 / 2005
  • Data fusion is a method for combining data. The purpose of this study is to design and implement a street fashion information analysis system using data fusion. Because it fuses image data and survey data on street fashion, the system can offer varied and realistic information. Data fusion methods include exact matching, judgmental matching, probability matching, statistical matching, and data linking. In this study, we use exact matching. Our system supports visual information analysis from the customer's viewpoint because it can analyze the image data and survey data both individually and in fused form.
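As a rough illustration of exact matching, the sketch below fuses two record sets on a shared key and keeps only records whose key appears in both. The field names (`subject_id`, `style`, `age_group`) and the records are hypothetical; the abstract does not describe the system's actual schema or matching key.

```python
def exact_match_fuse(image_records, survey_records, key):
    """Fuse records whose `key` value is identical in both sources."""
    survey_by_key = {record[key]: record for record in survey_records}
    fused = []
    for image in image_records:
        match = survey_by_key.get(image[key])
        if match is not None:
            # Merge the two records; image fields win on a name clash.
            fused.append({**match, **image})
    return fused


# Hypothetical image-derived and survey-derived records.
images = [
    {"subject_id": 1, "style": "casual"},
    {"subject_id": 2, "style": "formal"},
]
surveys = [
    {"subject_id": 1, "age_group": "20s"},
    {"subject_id": 3, "age_group": "30s"},
]

fused_records = exact_match_fuse(images, surveys, key="subject_id")
```

Only subject 1 appears in both sources, so it is the only fused record; the other methods listed in the abstract relax this requirement of an identical key.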

Statistical approach for development of objective evaluation method on tobacco smoke

  • Hwang, Keon-Joong;Rhee, Moon-Soo;Ra, Do-Young
    • Journal of the Korean Society of Tobacco Science / v.22 no.2 / pp.184-189 / 2000
  • This study was conducted to develop an objective evaluation method for tobacco smoke. The evaluation used data on cut or blended tobacco components, smoke components, the electric nose system (ENS), and sensory tests. The relationships among the tobacco, smoke, ENS, and sensory evaluation data were studied using statistical methods such as cluster analysis, discriminant analysis, factor analysis, correlation analysis, and multiple regression analysis. In the cluster analysis, the data from smoke analysis by GC and ENS were able to distinguish differences in tobacco leaf characteristics. In the discriminant analysis, grouping by the components of tobacco leaves and smoke was possible, and the results of GC analysis of smoke could be used to discriminate tobacco leaves. In the factor analysis, nicotine, tar, CO, puff number, and pH in the smoke were the factors affecting the tobacco leaf characteristics. In the correlation analysis, the aroma, taste, irritation, and smoke volume scores from the sensory test were highly related to tar, p-cresol threonolatone, levoglucosane, and quinic acid-$\gamma$-lactone in the smoke. The ENS data were highly effective for discriminant analysis and cluster analysis but performed poorly for factor analysis and correlation analysis. It was possible to estimate tobacco leaves and their blending characteristics from the analytical data on tobacco leaves, smoke, ENS, and sensory test results. The multiple regression analysis found some correlations between selected chemical components and the sensory evaluation. This study strongly indicates that some chemical analysis data can be used for the objective evaluation of tobacco sensory attributes.

A Study on the Reliability of Observational Settlement Analysis Using Data Mining (데이터마이닝을 이용한 관측적 침하해석의 신뢰성 연구)

  • 우철웅;장병욱
    • Magazine of the Korean Society of Agricultural Engineers / v.45 no.6 / pp.183-193 / 2003
  • Most construction work on soft ground uses instrumentation to manage the settlement and stability of the embankment. Rapid progress in information technology and digital data acquisition for soft ground instrumentation has led to a fast-growing amount of data. Although valuable information about the behaviour of the soft ground may be hidden in the data, most of the data are used only for managing settlement and stability. One of the critical issues in soft ground instrumentation is long-term settlement prediction. Several observational settlement analysis methods are used for this purpose, but the reliability of their results remains unclear. Knowledge could be discovered from the large body of accumulated experience with observational settlement analysis. In this article, we present a database for storing settlement records and a data mining procedure. A large volume of knowledge about observational settlement prediction was collected from the database by applying a filtering algorithm and a knowledge discovery algorithm. Statistical analysis revealed that the reliability of observational settlement analysis depends on stay duration and the estimated degree of consolidation.

Effective Data Management Method for Operational Data on Accredited Engineering Programs (공학교육인증 프로그램의 효과적인 운영 데이터 관리 방법)

  • Han, Kyoung-Soo
    • Journal of Engineering Education Research / v.17 no.5 / pp.51-58 / 2014
  • This study proposes an effective data management method that eases the burden of preparing the self-study report by analyzing operational data on accredited engineering programs. Four analysis criteria are developed: variability, difficulty of collection, urgency of analysis, and timeliness. After the operational data are analyzed against these criteria, the data that should be managed in a timely manner are extracted according to the analysis results. This study proposes a data management method in which the tasks of managing these time-sensitive data are performed according to the regular academic schedule, so that the results may be used as working-level reference material.

Analysis on Types of Golf Tourism After COVID-19 by using Big Data

  • Hyun Seok Kim;Munyeong Yun;Gi-Hwan Ryu
    • International Journal of Advanced Culture Technology / v.12 no.1 / pp.270-275 / 2024
  • Introduction. The purpose of this study is to analyze the types of golf tourism, inbound or outbound, using big data, and to examine how the golf industry is moving and what changes took place during and after COVID-19. Method. Using Textom, a big data analysis tool, "golf tourism" and "Covid-19" were selected as keywords, search frequency information from Naver and Daum was collected for one year from 1 January 2023 to 31 December 2023, and data preprocessing was conducted on this basis. To ensure the suitability of the study and more accurate data, data not related to "golf tourism" were removed through a refining process, and similar keywords were grouped into a single keyword for analysis. As a result of the word refining process, the top 36 keywords with the highest relevance and search frequency were selected and applied to this study. These 36 keywords were subjected to TF-IDF analysis, visualization analysis using the Ucinet6 and NetDraw programs, network analysis between keywords, and cluster analysis of the keywords through CONCOR analysis. Results. The big data analysis found that the availability of overseas golf tourism is affecting inbound golf travel. "Golf", "Tourism", "Vietnam", and "Thailand" showed high frequencies, which indicates that overseas golf tours are trending again.
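The TF-IDF step named in this abstract can be illustrated with a minimal sketch. It assumes the plain textbook weighting, term frequency multiplied by the logarithm of the inverse document frequency; the exact formula used by Textom is not stated in the abstract, and the tokenized documents below are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized search snippets (illustrative only).
docs = [
    ["golf", "tourism", "vietnam"],
    ["golf", "tourism", "thailand"],
    ["golf", "covid"],
]
weights = tf_idf(docs)
```

Under this weighting a term that occurs in every document, such as "golf" here, scores zero, while terms concentrated in few documents score highest, which is why TF-IDF is usually read alongside raw frequency counts.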