• Title/Summary/Keyword: sequential data

Monitoring of Clinical Trials: Issues and Recommendations

  • Fleming Thomas R.;Demets David L.
    • Proceedings of the Korean Society for Preventive Medicine Conference (대한예방의학회:학술대회논문집) / 1994.02b / pp.270-284 / 1994
  • Interim analyses of randomized trials enable investigators to make more efficient use of limited research resources and to satisfy ethical requirements that a regimen be discontinued as soon as it has been established to have an inferior efficacy/toxicity profile. Unfortunately, the integrity and credibility of these trials can be compromised if inappropriate procedures are used in monitoring interim data. In this paper we discuss how group sequential designs provide useful guidelines that enable one to satisfy the valid objectives of interim monitoring while avoiding undesirable consequences, and we consider how flexible one can be in the way such designs are implemented. We also provide motivation for the role of data-monitoring committees in preserving study integrity and credibility in either government- or industry-sponsored trials. In our view, these committees should have multidisciplinary representation and membership limited to individuals free of apparent significant conflict of interest, and ideally should be the only individuals to whom the data analysis center provides interim results on relative efficacy of treatment regimens. Finally, we discuss some important practical issues such as estimation following group sequential testing, analysis of secondary outcomes after using a group sequential design applied to a primary outcome, early stopping of negative trials, and the role of administrative analyses.

An Algorithm for Sequential Sampling Method in Data Mining (데이터 마이닝에서 샘플링 기법을 이용한 연속패턴 알고리듬)

  • 홍지명;김낙현;김성집
    • Journal of Korean Society of Industrial and Systems Engineering / v.21 no.45 / pp.101-112 / 1998
  • Data mining, also referred to as knowledge discovery in databases, is the nontrivial extraction of implicit, previously unknown, and potentially useful information (such as knowledge rules, constraints, and regularities) from data in databases. The discovered knowledge can be applied to information management, decision making, and many other applications. In this paper, a new data mining problem, discovering sequential patterns, is addressed: the goal is to find all sequential patterns using a sampling method. Because databases are growing exponentially and transaction databases are updated frequently, sampling offers a fast approach that reduces time and cost while still extracting trends in customer behavior. The method analyzes only a fraction of the database but can in general produce highly accurate results. The relaxation factor and the sample size can be adjusted to improve accuracy while minimizing execution time. The superiority of the proposed algorithm is shown by comparing its accuracy and efficiency with those of the AprioriAll algorithm.
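The sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it counts only ordered item pairs rather than full sequential patterns, and the names `mine_pairs`, `mine_pairs_sampled`, and `relaxation` are hypothetical. The assumption is that the relaxation factor lowers the support threshold on the sample so that fewer true patterns are missed.

```python
import random
from collections import Counter
from itertools import combinations

def pair_supports(sequences):
    """Fraction of sequences that contain each ordered pair (a before b)."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each ordered pair is counted at most once per sequence.
        pairs = {(seq[i], seq[j]) for i, j in combinations(range(len(seq)), 2)}
        counts.update(pairs)
    n = len(sequences)
    return {pair: c / n for pair, c in counts.items()}

def mine_pairs(sequences, min_support):
    """Ordered pairs whose support in `sequences` reaches min_support."""
    return {p for p, s in pair_supports(sequences).items() if s >= min_support}

def mine_pairs_sampled(sequences, min_support, sample_size, relaxation, seed=0):
    """Mine a random sample with a relaxed threshold (assumed: relaxation <= 1)."""
    rng = random.Random(seed)
    sample = rng.sample(sequences, sample_size)
    return mine_pairs(sample, min_support * relaxation)

db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
full = mine_pairs(db, 0.5)
approx = mine_pairs_sampled(db, 0.5, sample_size=4, relaxation=1.0)
```

Comparing `approx` with `full` for smaller sample sizes and different relaxation values gives a rough picture of the accuracy versus cost trade-off the abstract refers to.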

Missing Values Estimation for Time Course Gene Expression Data Using the Sequential Partial Least Squares Regression Fitting (순차적 부분최소제곱 회귀적합에 의한 시간경로 유전자 발현 자료의 결측치 추정)

  • Kim, Kyung-Sook;Oh, Mi-Ra;Baek, Jang-Sun;Son, Young-Sook
    • The Korean Journal of Applied Statistics / v.21 no.2 / pp.275-290 / 2008
  • Microarray gene expression data sets are very large and the observation process is complex, so missing values occur frequently. In this paper we propose the sequential partial least squares (SPLS) regression fitting method to estimate missing values in time course gene expression data, in which observations are correlated across time points. The SPLS method combines a sequential technique with the partial least squares (PLS) regression fitting method. The usefulness of the proposed method is evaluated through simulation studies on three yeast time course data sets.

Sequential Pattern Mining Algorithms with Quantities (정량 정보를 포함한 순차 패턴 마이닝 알고리즘)

  • Kim, Chul-Yun;Lim, Jong-Hwa;Ng Raymond T.;Shim Kyu-Seok
    • Journal of KIISE:Databases / v.33 no.5 / pp.453-462 / 2006
  • Discovering sequential patterns is an important problem for many applications. Existing algorithms find sequential patterns in the sense that only items are included in the patterns. However, for many applications, such as business and scientific applications, quantitative attributes are often recorded in the data, which are ignored by existing algorithms but can provide useful insight to the users. In this paper, we consider the problem of mining sequential patterns with quantities. We demonstrate that naive extensions to existing algorithms for sequential patterns are inefficient, as they may enumerate the search space blindly. Thus, we propose hash filtering and quantity sampling techniques that significantly improve the performance of the naive extensions. Experimental results confirm that compared with the naive extensions, these schemes not only improve the execution time substantially but also show better scalability for sequential patterns with quantities.

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences

  • Kang, Tae-Ho;Yoo, Jae-Soo;Kim, Hak-Yong;Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.2 / pp.18-24 / 2007
  • Biological sequences such as DNA and amino acid sequences typically contain a large number of items. They contain contiguous sequences that ordinarily consist of hundreds of frequent items or more. In biological sequence analysis (BSA), frequent contiguous sequence search is one of the most important operations. Many studies have addressed the efficient mining of sequential patterns. Most of the existing methods for mining sequential patterns are based on the Apriori algorithm. In particular, the prefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, since the algorithm expands the sequential patterns from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. In recent years, the MacosVSpan algorithm was proposed based on the idea of the prefixSpan algorithm to significantly reduce its recursive process. However, the algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method to mine maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the superiority of the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
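To make the term "maximal frequent contiguous sequence" concrete, here is a naive brute-force sketch. It is not the fixed-length spanning tree method proposed in the paper, and it is not efficient for long sequences. The function names and the toy DNA strings are hypothetical, and support is assumed to mean the number of input sequences that contain the substring.

```python
from collections import Counter

def frequent_contiguous(sequences, min_count, max_len):
    """Contiguous substrings (length <= max_len) found in >= min_count sequences."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # each substring counts once per sequence
        for length in range(1, max_len + 1):
            for start in range(len(seq) - length + 1):
                seen.add(seq[start:start + length])
        counts.update(seen)
    return {s for s, c in counts.items() if c >= min_count}

def maximal_only(frequent):
    """Keep substrings that are not contained in a longer frequent substring."""
    return {s for s in frequent
            if not any(s != t and s in t for t in frequent)}

reads = ["ACGTAC", "TACGTT", "GACGTA"]
maximal = maximal_only(frequent_contiguous(reads, min_count=3, max_len=6))
```

Here `ACGT` and `TA` are the only substrings shared by all three reads that cannot be extended without losing support, so every shorter frequent substring such as `CG` is dropped as non-maximal.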

Time Series Forecasting Based on Modified Ensemble Algorithm (시계열 예측의 변형된 ENSEMBLE ALGORITHM)

  • Kim Yon Hyong;Kim Jae Hoon
    • The Korean Journal of Applied Statistics / v.18 no.1 / pp.137-146 / 2005
  • Neural networks are among the most notable forecasting techniques and usually provide more powerful forecasting models than traditional time series techniques. When the Ensemble technique is employed in a forecasting model, an initial distribution must be provided. Usually the uniform distribution is assumed so that the initialization is noninformative. However, a sequential, informative initialization based on the data would be expected to reduce the forecasting error further than uniform initialization. In this note, a modified Ensemble algorithm using sequential initial probabilities is developed. The sequential distribution is designed to place more weight on recent data.
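The contrast between a uniform and a sequential initial distribution can be sketched as follows. The abstract does not give the form of the sequential distribution, so the linearly increasing weights here are purely an assumption for illustration, and both function names are hypothetical.

```python
def uniform_initial_weights(n):
    """Noninformative start: every observation gets weight 1/n."""
    return [1 / n] * n

def sequential_initial_weights(n):
    """Assumed form: weight proportional to time index, so recent data count more."""
    total = n * (n + 1) / 2  # 1 + 2 + ... + n
    return [i / total for i in range(1, n + 1)]

weights = sequential_initial_weights(4)
```

With four observations ordered oldest to newest, the assumed sequential weights are proportional to 1, 2, 3, 4, whereas the uniform weights are all 0.25.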

Sequential patient recruitment monitoring in multi-center clinical trials

  • Kim, Dong-Yun;Han, Sung-Min;Youngblood, Marston Jr.
    • Communications for Statistical Applications and Methods / v.25 no.5 / pp.501-512 / 2018
  • We propose Sequential Patient Recruitment Monitoring (SPRM), a new monitoring procedure for patient recruitment in a clinical trial. Based on the sequential probability ratio test using improved stopping boundaries by Woodroofe, the method allows for continuous monitoring of the rate of enrollment. It gives an early warning when the recruitment is unlikely to achieve the target enrollment. The packet data approach combined with the Central Limit Theorem makes the method robust to the distribution of the recruitment entry pattern. A straightforward application of the counting process framework can be used to estimate the probability to achieve the target enrollment under the assumption that the current trend continues. The required extension of the recruitment period can also be derived for a given confidence level. SPRM is a new, continuous patient recruitment monitoring tool that provides an opportunity for corrective action in a timely manner. It is suitable for the modern, centralized data management environment and requires minimal effort to maintain. We illustrate this method using real data from two well-known, multicenter, phase III clinical trials.
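As a rough illustration of sequential monitoring of an enrollment rate, the sketch below runs a classical Wald sequential probability ratio test on per-period enrollment counts. It is not the SPRM procedure: it assumes Poisson counts, uses Wald's approximate boundaries rather than Woodroofe's improved ones, and omits the packet data approach. The function name, rates, and error levels are hypothetical.

```python
import math

def sprt_recruitment(counts, target_rate, low_rate, alpha=0.05, beta=0.05):
    """Wald SPRT of H0: rate = target_rate against H1: rate = low_rate.

    counts: enrollments observed in each successive period (assumed Poisson).
    Returns (decision, number of periods used).
    """
    upper = math.log((1 - beta) / alpha)  # cross: evidence for the low rate
    lower = math.log(beta / (1 - alpha))  # cross: evidence for the target rate
    llr = 0.0
    for period, x in enumerate(counts, start=1):
        # Poisson log-likelihood ratio contribution of one period.
        llr += x * math.log(low_rate / target_rate) - (low_rate - target_rate)
        if llr >= upper:
            return ("warning", period)
        if llr <= lower:
            return ("on_track", period)
    return ("continue", len(counts))

slow = sprt_recruitment([5, 5, 5, 5], target_rate=10, low_rate=6)
fast = sprt_recruitment([12, 12, 12], target_rate=10, low_rate=6)
```

With a target of 10 patients per period and a low rate of 6, three consecutive periods of 5 enrollments are enough to trigger the warning, while two periods of 12 confirm that recruitment is on track.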

A Hierarchical Sequential Index Scheme for Range Queries in Wireless Location-based Services (무선 위치기반서비스에서 영역질의처리를 위한 계층적 인덱스기법)

  • Park, Kwang-Jin
    • Journal of Internet Computing and Services / v.11 no.1 / pp.15-20 / 2010
  • In this paper, we propose a novel approach to reduce spatial query access latency and energy consumption by leveraging results from nearby peers in wireless broadcast environments. We propose a three-tier Hierarchical Location-Based Sequential access index, called HLBS, which provides selective tuning (pruning and searching entries) without pointers using a linear accessing structure based on the location of each data object. The HLBS saves search cost and index overhead, since the small index size with a sequential index structure results in low access latency overhead and facilitates efficient searches for sequential-access media (wireless channels with data broadcast). Comprehensive experiments illustrate that the proposed scheme is more efficient than the previous techniques in terms of energy consumption.

The Forward Sequential Procedure for the Identifying Multiple Outliers in Linear Regression

  • Park, Jin-Pyo
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.1053-1066 / 2005
  • In this paper we consider the problem of identifying and testing outliers in linear regression. First we consider the use of the so-called scale ratio tests for testing the null hypothesis of no outliers. This test is based on the ratio of two residual scale estimates. We present the asymptotic distribution of the test statistic and investigate its properties. Next we consider the problem of identifying the outliers. A forward sequential procedure using the suggested test is proposed. The new method is compared with the classical procedure on a real data example. Unlike other forward procedures, the present one is unaffected by masking and swamping effects because the test statistic is based on a robust scale estimate.

Design and Implementation of Sequential Pattern Miner to Analyze Alert Data Pattern (경보데이터 패턴 분석을 위한 순차 패턴 마이너 설계 및 구현)

  • Shin, Moon-Sun;Paik, Woo-Jin
    • Journal of Internet Computing and Services / v.10 no.2 / pp.1-13 / 2009
  • Intrusion detection is the process of identifying attacks and responding to malicious intrusions in order to protect computer and network resources. With the rapid development of the Internet, intrusions have become more complex and new intrusion types appear increasingly often, so immediate and correct responses are needed. To address these problems in intrusion detection systems, we propose a sequential pattern miner that analyzes alert data to support intelligent and automatic intrusion detection. Sequential pattern mining is a method for finding patterns among extracted items that occur frequently in fixed sequences. We apply the prefixSpan algorithm to discover alert sequences. This method can be used to predict the actions of sequential patterns and to create intrusion rules. In this paper, we propose an extended prefixSpan algorithm designed to account for the specific characteristics of alert data. The extended sequential pattern miner will be used as part of the alert data analyzer of intrusion detection systems. Using the rules created by the sequential pattern miner, the HA (high-level alert analyzer) of the PEP (policy enforcement point), usually called the IDS, predicts sequence behaviors and changing patterns that could not be checked visibly.
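For readers unfamiliar with prefixSpan, the following minimal sketch shows the basic prefix-projection idea on sequences of single items. It is the textbook form of the algorithm, not the extended version proposed in the paper, and the alert names and the `prefixspan` function are hypothetical.

```python
from collections import defaultdict

def prefixspan(sequences, min_count):
    """Return (pattern, support) pairs for all frequent sequential patterns."""
    results = []

    def mine(prefix, projected):
        # projected: (sequence index, start of remaining suffix) pairs.
        occurrences = defaultdict(list)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:  # project on the first occurrence only
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item, new_projected in sorted(occurrences.items()):
            if len(new_projected) >= min_count:
                pattern = prefix + [item]
                results.append((pattern, len(new_projected)))
                mine(pattern, new_projected)  # grow the pattern recursively

    mine([], [(i, 0) for i in range(len(sequences))])
    return results

alerts = [
    ["scan", "login_fail", "root"],
    ["scan", "root"],
    ["login_fail", "scan", "root"],
]
patterns = {tuple(p): n for p, n in prefixspan(alerts, min_count=3)}
```

On this toy alert log, `scan` followed later by `root` appears in all three sequences, which is the kind of pattern that could be turned into an intrusion rule.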