• Title/Summary/Keyword: multiple testing

Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data

  • Ko, Hyoseok;Kim, Kipoong;Sun, Hokeun
    • Genomics & Informatics / v.14 no.4 / pp.187-195 / 2016
  • In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required to identify disease- or trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on an individual test suffer from multiple testing issues such as control of the family-wise error rate and dependent tests. Moreover, the main interest in genetic association studies is detecting only a few genes associated with a phenotype outcome among tens of thousands of genes. For this reason, regularization procedures, in which a phenotype outcome is regressed on all genomic markers and the regression coefficients are estimated from a penalized likelihood, have been considered a good alternative approach to the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's $T^2$ test, and the permutation test are compared with the group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated from the Illumina Infinium HumanMethylation27K Beadchip. We found a large discrepancy in the selected genes between the multiple group testing procedures and the group lasso.
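
A group-level permutation test of the kind named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `group_permutation_pvalue`, the test statistic (sum over sites of squared mean differences), and the default of 2000 permutations are all hypothetical choices made for the example.

```python
import random


def group_permutation_pvalue(cases, controls, n_perm=2000, seed=0):
    """Permutation p-value for one gene (group) measured at several sites.

    cases, controls: lists of samples; each sample is a list holding one
    value per site in the gene. The statistic is an assumed choice: the
    sum over sites of squared differences between the group means.
    """
    def statistic(a, b):
        n_sites = len(a[0])
        total = 0.0
        for j in range(n_sites):
            mean_a = sum(sample[j] for sample in a) / len(a)
            mean_b = sum(sample[j] for sample in b) / len(b)
            total += (mean_a - mean_b) ** 2
        return total

    rng = random.Random(seed)
    observed = statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)

    hits = 0
    for _ in range(n_perm):
        # Shuffling whole samples keeps the within-gene site correlation intact.
        rng.shuffle(pooled)
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1

    # The add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because one p-value is produced per gene, the resulting set of p-values would still need a multiple testing correction across genes.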

A Bayesian Multiple Testing of Detecting Differentially Expressed Genes in Two-sample Comparison Problem

  • Oh Hyun-Sook;Yang Wan-Youn
    • Communications for Statistical Applications and Methods / v.13 no.1 / pp.39-47 / 2006
  • The Bayesian approach to multiple testing for the one-sample testing problem proposed by Scott and Berger (2003) is extended to the two-sample comparison problem in microarray experiments. The prior distribution of each gene's mean for one sample is given conditionally on the corresponding gene's mean for the other sample. Posterior distributions of the parameters of interest are derived and estimated using an importance sampling method. A simulated example is given for illustration.

Speaker Verification with the Constraint of Limited Data

  • Kumari, Thyamagondlu Renukamurthy Jayanthi;Jayanna, Haradagere Siddaramaiah
    • Journal of Information Processing Systems / v.14 no.4 / pp.807-823 / 2018
  • The performance of a speaker verification system depends on the utterance of each speaker. To verify the speaker, important information has to be captured from the utterance. Under the constraint of limited data, where the training and testing data last only a few seconds, speaker verification becomes a challenging task. The feature vectors extracted by single frame size and rate (SFSR) analysis are not sufficient for training and testing speakers in speaker verification. This leads to poor speaker modeling during training and may not yield good decisions during testing. The problem is addressed by increasing the number of feature vectors obtained from training and testing data of the same duration. To that end, we use multiple frame size (MFS), multiple frame rate (MFR), and multiple frame size and rate (MFSR) analysis techniques for speaker verification under the limited data condition. These analysis techniques extract relatively more feature vectors during training and testing and yield improved modeling and testing for limited data. To demonstrate this, we use mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) as features. The Gaussian mixture model (GMM) and the GMM-universal background model (GMM-UBM) are used for modeling the speaker. The database used is NIST-2003. The experimental results indicate that MFS, MFR, and MFSR analysis perform markedly better than SFSR analysis, and that LPCC-based MFSR analysis performs better than the other analysis techniques and feature extraction techniques.
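
The reason multiple frame sizes and rates yield more feature vectors from the same utterance can be illustrated by counting frames. The sketch below is only an illustration: the function names, the 8 kHz sampling rate, and the specific frame sizes and shifts are assumptions for the example and are not taken from the paper.

```python
def frame_starts(n_samples, frame_size, frame_shift):
    """Start indices of every full frame of length frame_size."""
    if n_samples < frame_size:
        return []
    return list(range(0, n_samples - frame_size + 1, frame_shift))


def count_frames(n_samples, frame_sizes, frame_shifts):
    """Total frames over every (size, shift) combination.

    One feature vector (e.g. MFCC or LPCC) would be computed per frame,
    so this count is the number of feature vectors available.
    """
    return sum(
        len(frame_starts(n_samples, size, shift))
        for size in frame_sizes
        for shift in frame_shifts
    )


# Assumed setup: 3 seconds of speech sampled at 8 kHz.
n_samples = 3 * 8000

sfsr = count_frames(n_samples, [160], [80])                    # one size, one shift
mfs = count_frames(n_samples, [120, 160, 200], [80])           # several sizes
mfr = count_frames(n_samples, [160], [40, 80, 120])            # several shifts
mfsr = count_frames(n_samples, [120, 160, 200], [40, 80, 120])  # both varied

print(sfsr, mfs, mfr, mfsr)
```

Under these assumed settings each multi-frame variant produces several times as many frames as SFSR from the same three seconds of audio, which is the effect the abstract relies on.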

Application of a Modular Multi-Gaussian Beam Model to Ultrasonic Wave Propagation with Multiple Interfaces

  • Jeong, Hyun-Jo;Park, Moon-Cheol;Schmerr, Lester W.
    • Journal of the Korean Society for Nondestructive Testing / v.25 no.3 / pp.163-170 / 2005
  • A modular Gaussian beam model is developed to simulate ultrasonic testing configurations in which multiple interfaces are involved. A general formulation is given in modular matrix form to represent Gaussian beam propagation through multiple interfaces. The ultrasonic transducer fields are modeled by a multi-Gaussian beam model formed by superposing 10 single Gaussian beams. The proposed model, referred to as the "MMGB" (modular multi-Gaussian beam) model, is then applied to a typical contact angle beam testing configuration to predict the output signal reflected from the corner of a vertical crack. The resulting expressions, given in modular matrix form, are implemented on a personal computer using MATLAB. Simulation results are presented and compared with available experimental results.

Objective Bayesian multiple hypothesis testing for the shape parameter of generalized exponential distribution

  • Lee, Woo Dong;Kim, Dal Ho;Kang, Sang Gil
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.217-225 / 2017
  • This article deals with the problem of multiple hypothesis testing for the shape parameter of the generalized exponential distribution. We propose Bayesian testing procedures for multiple hypotheses about the shape parameter under a noninformative prior. The Bayes factor under a noninformative prior is not well defined, because most noninformative priors are improper. Therefore, we study default Bayesian multiple hypothesis testing methods using the fractional and intrinsic Bayes factors with reference priors. A simulation study is performed and an example is given.

Multiple Testing in Genomic Sequences Using Hamming Distance

  • Kang, Moonsu
    • Communications for Statistical Applications and Methods / v.19 no.6 / pp.899-904 / 2012
  • High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a large number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, since a multivariate analysis of variance is not feasible. The family-wise error rate (FWER) is not appropriate when thousands of genes are tested simultaneously in a multiple testing procedure; the false discovery rate (FDR) is better suited to such multiple testing problems. Data from the 2002-2003 SARS epidemic show that a conventional FDR procedure and the proposed test statistic, based on a pseudo-marginal approach with Hamming distance, perform better.
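
The contrast between FWER and FDR control, and the Hamming distance itself, can be sketched briefly. The code below assumes that the "conventional FDR procedure" is the Benjamini-Hochberg step-up procedure and uses the Bonferroni correction as the FWER baseline. Both are standard choices, but the abstract does not confirm either, and the paper's pseudo-marginal test statistic is not reproduced here.

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def bonferroni(p_values, alpha=0.05):
    """Indices rejected with FWER controlled at level alpha."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, alpha=0.05):
    """Indices rejected with FDR controlled at level alpha (step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if p_values[i] <= rank * alpha / m:
            k_max = rank
    return sorted(order[:k_max])


p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
print(hamming_distance("ACGT", "AGGT"))
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

On this made-up list of p-values the FDR procedure rejects one more hypothesis than Bonferroni, which illustrates why FWER control becomes overly conservative as the number of tests grows.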

Design of Multiple-Purpose Protocol Test System (다기능 프로토콜 시험시스템 설계)

  • 최양희
    • The Journal of Korean Institute of Communications and Information Sciences / v.15 no.5 / pp.434-445 / 1990
  • Protocol testing techniques have expanded from traditional simple function testing based on the OSI model to sophisticated performance testing, conformance testing, and interoperability testing. In addition, both point-to-point and point-to-multipoint protocols must be covered. This paper presents a new multiple-purpose protocol test system in which a common platform handles test sequence generation and test result analysis, and a modular test execution part is selectively adjusted according to the test purposes and the protocols under test. The paper describes a test system for a network routing protocol and a test system for a transport protocol, both designed on the ideas of the multiple-purpose protocol test system.

Testing Environment based on TTCN-3 for Network-based Embedded Software (TTCN-3를 이용한 네트워크 기반 임베디드 소프트웨어 테스팅 환경 구축)

  • Chae, Hochang;Jin, Xiulin;Cho, Jeonghun;Lee, Seonghun
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.1 / pp.29-38 / 2010
  • Increasingly complicated embedded software is required to deliver high performance and multiple functions, which inevitably increases the number of errors. Embedded software testing has therefore recently grown in importance. No general testing method can be applied to every embedded system, but in this research we introduce a testing method for embedded systems based on TTCN-3, a testing standard. A testing environment for network-based embedded software is implemented, taking into account that TTCN-3 testing is based on message exchange. In addition to the TTCN-3 test system, the testing environment has two parts: a network analyzer to access the network-based systems, and the communication interface for embedded systems suggested in previous work. We implemented the whole testing environment by making these two parts interact. In addition to the normal testing domain, called single node testing, which corresponds to unit testing in the V-model, we suggest another concept for testing multiple nodes in a network. This is achieved by adding keywords such as supervisor and object, which describe features of the TTCN-3 testing component, and by generating TTCN-3 executable code that contains the new keywords. Testing was carried out on embedded software based on a CAN network, and a demonstration of the testing environment is shown in this paper.

A Usability Test of a New Computerized Open-ended Math Testing System for Elementary School Students (초등학생용 컴퓨터화 개방형 수학 시험 방식의 사용가능성 검증)

  • Park, Joo-Yong;Kim, Yong-Guk
    • Korean Journal of Cognitive Science / v.21 no.2 / pp.283-307 / 2010
  • In this study, a new open-ended format math testing system for elementary school students is proposed. The system applies the recently proposed Constructive Multiple-choice Testing (CMT) system to math testing. In the CMT system, the examinee responds to each item twice, first in an open-ended format and then in a multiple-choice format. The advantages of this system are that process information can easily be obtained and that the examinee can receive feedback immediately after the test, based on his/her multiple-choice responses. The open-ended format math testing system includes a manager mode, which allows generation of test items and management of student accounts, and a testing mode, which allows students to input their solution process using the menu bar and the keyboard. When a group tested using the CMT system was compared with a group tested using a paper-and-pencil test, there was no significant difference in average scores between the two groups, although the testing time was longer for the group tested using the CMT system. This result suggests that the open-ended format math testing system proposed in this study can be used effectively in an actual classroom setting.

Estimating the Failure Rate of a Large Scaled Software in Multiple Input Domain Testing (다중입력영역시험에서의 대형 소프트웨어 고장률 추정 연구)

  • 문숙경
    • Journal of Korean Society for Quality Management / v.30 no.3 / pp.186-194 / 2002
  • In this paper we introduce formulae for estimating the failure rate of large-scale software using Bayes' rule when black-box random testing, which selects an element (test case) at random with equal probability, is performed. A program or piece of software can be treated as a mathematical function with a well-defined (input) domain and range. For large-scale software, the input domain can be partitioned into multiple subdomains, and exhaustive testing is generally not practical. Testing proceeds by selecting a subdomain and then picking a test case from within the selected subdomain. We developed formulae for both cases, whether or not the proportion with which a subdomain is selected is treated as an assumed probability, using Bayes' rule with a gamma distribution as the prior distribution.
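
How a gamma prior combines with subdomain test results can be sketched with a generic conjugate update. This is not the paper's formulae: it assumes, purely for illustration, that the number of failures in n tests of a subdomain is approximately Poisson with mean n times the failure rate, so that a Gamma(a, b) prior gives a Gamma(a + failures, b + tests) posterior, and that the subdomain selection probabilities are known. The function names and the default prior parameters are hypothetical.

```python
def posterior_mean_rate(failures, tests, a=1.0, b=1.0):
    """Posterior mean failure rate for one subdomain.

    Assumed model: Gamma(a, b) prior (shape a, rate b) with a Poisson
    likelihood, giving a Gamma(a + failures, b + tests) posterior.
    """
    return (a + failures) / (b + tests)


def overall_failure_rate(subdomains, a=1.0, b=1.0):
    """Selection-weighted failure rate over all subdomains.

    subdomains: list of (selection_probability, failures, tests).
    The selection probabilities are assumed known and must sum to 1.
    """
    total_probability = sum(p for p, _, _ in subdomains)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("selection probabilities must sum to 1")
    return sum(
        p * posterior_mean_rate(failures, tests, a, b)
        for p, failures, tests in subdomains
    )


# Two equally likely subdomains, 99 random tests each, one failure overall.
print(overall_failure_rate([(0.5, 0, 99), (0.5, 1, 99)]))
```

When the selection proportions are not known in advance, they would have to be estimated or given their own prior, which this sketch does not attempt.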