• Title/Summary/Keyword: exact sequences

Search Result 102

ANALYSIS OF NEIGHBOR-JOINING BASED ON BOX MODEL

  • Cho, Jin-Hwan;Joe, Do-Sang;Kim, Young-Rock
    • Journal of applied mathematics & informatics / v.25 no.1_2 / pp.455-470 / 2007
  • In phylogenetic tree construction, the neighbor-joining algorithm is the best-known method; it constructs a trivalent tree from pairwise distance data measured from DNA sequences. The core of the algorithm is its cherry-picking criterion, which is based on the tree structure of each quartet. We give a generalized version of the criterion based on the exact box model of quartets, known as the tight span of a metric. We also show by experiment why neighbor-joining and the quartet consistency count method give similar performance.
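
The cherry-picking step named in this abstract can be illustrated with the classical neighbor-joining Q-criterion. The sketch below covers only that standard criterion, not the generalized box-model version the paper proposes; the function name `pick_cherry` and the sample distance matrix are illustrative assumptions.

```python
from itertools import combinations


def pick_cherry(d):
    """Return the pair (i, j), i < j, that minimizes the classical criterion
    Q(i, j) = (n - 2) * d[i][j] - sum_k d[i][k] - sum_k d[j][k]."""
    n = len(d)
    row_sums = [sum(row) for row in d]

    def q(pair):
        i, j = pair
        return (n - 2) * d[i][j] - row_sums[i] - row_sums[j]

    # combinations() yields pairs in lexicographic order, so ties resolve
    # to the first minimal pair encountered.
    return min(combinations(range(n), 2), key=q)


# Hypothetical symmetric distance matrix for five taxa (illustrative values).
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(pick_cherry(D))
```

In a full neighbor-joining run, the selected pair would be merged into a new internal node and the distance matrix reduced before the criterion is applied again.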

ALGEBRAIC STRUCTURES IN A PRINCIPAL FIBRE BUNDLE

  • Park, Joon-Sik
    • Journal of the Chungcheong Mathematical Society / v.21 no.3 / pp.371-376 / 2008
  • Let $P(M,G,\pi)=:P$ be a principal fibre bundle with structure Lie group $G$ over a base manifold $M$. In this paper we establish the following facts: 1. The tangent bundle $TG$ of the structure Lie group $G$ of $P$ is a Lie group. 2. The Lie algebra $\mathfrak{g}=T_eG$ is a normal subgroup of the Lie group $TG$. 3. $TP(TM,TG,\pi_*)=:TP$ is a principal fibre bundle with structure Lie group $TG$ and projection $\pi_*$ over the base manifold $TM$, where $\pi_*$ is the differential of the projection $\pi$ of $P$ onto $M$. 4. For a Lie group $H$, $TH=H\circ T_eH=T_eH\circ H$ and $H\cap T_eH=\{e\}$, but $H$ is not a normal subgroup of the group $TH$ in general.

Finding approximate occurrence of a pattern that contains gaps by the bit-vector approach

  • Lee, In-Bok;Park, Kun-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.193-199 / 2003
  • Applications of finding occurrences of a pattern that contains gaps include information retrieval, data mining, and computational biology. Because biological sequences may contain errors, it is important to find not only the exact occurrences of a pattern but also approximate ones. In this paper we present an $O(mnk_{max}/w)$ time algorithm for the approximate gapped pattern matching problem, where m is the length of the text, n is the length of the pattern, w is the word size of the target machine, and $k_{max}$ is the greatest error bound for subpatterns.

Motion Boundary Detection and Motion Vector Estimation by spatio-temporal Gradient Method using a New Spatial Gradient (새로운 공간경사를 사용한 시공간 경사법에 의한 운동경계 검출 및 이동벡터 추정)

  • 김이한;김성대
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.59-68 / 1993
  • Motion vector estimation and motion boundary detection have been actively studied because they provide important clues for analyzing object structure and 3-D motion. The aim of this research is more exact estimation, but two main factors cause inaccuracy: erroneous measurement of gradients in brightness values, and blurring of motion boundaries caused by the smoothness constraint. In this paper, we analyze the gradient measurement error of conventional methods and propose a new technique based on that analysis. When the proposed method is applied to Schunck's motion boundary detection and to Horn & Schunck's motion vector estimation, it performs much better than the conventional methods on some artificial and real image sequences.

Estimation of Gini-Simpson index for SNP data

  • Kang, Joonsung
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1557-1564 / 2017
  • We consider high-dimensional low sample size (HDLSS) genomic sequences whose response categories have no ordering. When constructing an appropriate test statistic in this model, the classical multivariate analysis of variance (MANOVA) approach might not be useful owing to the very large number of parameters and the very small sample size. For these reasons, we present a pseudo marginal model based upon the Gini-Simpson index estimated via a Bayesian approach. In view of the small sample size, we consider the permutation distribution generated by every possible (equally likely) permutation, n! in all, of the joined sample observations across G groups (of sizes $n_1,\ldots,n_G$). We simulate data and apply the false discovery rate (FDR) and positive false discovery rate (pFDR) with the associated proposed test statistics to the data. We also analyze real SARS data and compute the FDR and pFDR. The FDR and pFDR procedures, along with the associated test statistics for each gene, control the FDR and pFDR respectively at any level $\alpha$ for the set of p-values by using the exact conditional permutation theory.
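
For orientation, the Gini-Simpson index of a categorical sample is $1 - \sum_c p_c^2$, where $p_c$ is the proportion of category c. The sketch below computes the simple plug-in estimate from observed frequencies; it is not the Bayesian estimator the abstract refers to, and the function name `gini_simpson` is an illustrative assumption.

```python
from collections import Counter


def gini_simpson(observations):
    """Plug-in Gini-Simpson index: 1 minus the sum of squared category proportions."""
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one observation")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Nucleotides observed at one hypothetical SNP site across four samples.
print(gini_simpson("AACG"))
```

A value of 0 means every observation falls in one category; larger values indicate more diversity at the site.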

An Efficient Video Indexing Method using Object Motion Map in Compressed Domain (압축영역에서 객체 움직임 맵에 의한 효율적인 비디오 인덱싱 방법에 관한 연구)

  • Kim, So-Yeon;No, Yong-Man
    • The Transactions of the Korea Information Processing Society / v.7 no.5 / pp.1570-1578 / 2000
  • Object motion is an important feature of content in video sequences. Various methods for extracting features of object motion have been reported [1,2]. However, they are not suitable for indexing video by motion, since many bits and complex indexing parameters are needed for the indexing [3,4]. In this paper, we propose an object motion map that provides an efficient indexing method for object motion. The proposed object motion map holds both global and local motion information while an object is moving. Furthermore, it requires little memory for the indexing. To evaluate the performance of the proposed indexing technique, experiments are performed with a video database consisting of MPEG-1 video sequences from the MPEG-7 test set.

Distribution of Runs and Patterns in Four State Trials

  • Jungtaek Oh
    • Kyungpook Mathematical Journal / v.64 no.2 / pp.287-301 / 2024
  • From the mathematical and statistical point of view, a segment of a DNA strand can be viewed as a sequence of four-state (A, C, G, T) trials. Herein, we consider the distributions of runs and patterns related to the run lengths of multi-state sequences, especially for four states (A, B, C, D). Let $X_1, X_2, \ldots$ be a sequence of four-state independent and identically distributed trials taking values in the set 𝒢 = {A, B, C, D}. In this study, we obtain exact formulas for the probability distribution function for the discrete distribution of runs of B's of order k. We obtain longest run statistics, shortest run statistics, and determine the distributions of waiting times and run lengths.
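
As a small illustration of the longest-run statistic for i.i.d. multi-state trials, the sketch below measures the longest run of one symbol and computes the exact probability that no run of that symbol exceeds a given length, using a simple dynamic program over the trailing run length. This is a generic textbook-style computation, not the paper's formulas; the function names and the parameter `p` (the probability of the symbol on each trial) are illustrative assumptions.

```python
def longest_run(sequence, symbol):
    """Length of the longest run of consecutive `symbol` in `sequence`."""
    best = current = 0
    for x in sequence:
        current = current + 1 if x == symbol else 0
        best = max(best, current)
    return best


def prob_longest_run_at_most(n, m, p):
    """Exact P(longest run of the symbol <= m) in n i.i.d. trials,
    where the symbol occurs with probability p on each trial."""
    # state[r]: probability that the trailing run has length r
    # and no run longer than m has occurred so far.
    state = [1.0] + [0.0] * m
    for _ in range(n):
        nxt = [0.0] * (m + 1)
        nxt[0] = (1.0 - p) * sum(state)   # any other symbol resets the run
        for r in range(m):
            nxt[r + 1] = p * state[r]     # the symbol extends the run by one
        state = nxt
    return sum(state)


print(longest_run("ABBBCBB", "B"))
print(prob_longest_run_at_most(n=10, m=2, p=0.25))
```

Paths in which the run would grow past m are simply dropped, so the remaining probability mass is the chance that the longest run stays within the bound.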

The Effect of Acoustic Correlates of Domain-initial Strengthening in Lexical Segmentation of English by Native Korean Listeners

  • Kim, Sa-Hyang;Cho, Tae-Hong
    • Phonetics and Speech Sciences / v.2 no.3 / pp.115-124 / 2010
  • The current study investigated the role of acoustic correlates of domain-initial strengthening in lexical segmentation of a non-native language. In a series of cross-modal identity-priming experiments, native Korean listeners heard English auditory stimuli and made lexical decisions on visual targets (i.e., written words). The auditory stimuli contained critical two-word sequences which created temporary lexical ambiguity (e.g., 'mill#company', with the competitor 'milk'). There was either an IP boundary or a word boundary between the two words in the critical sequences. The initial CV of the second word (e.g., [kʌ] in 'company') was spliced from another token of the sequence in IP- or Wd-initial positions. The prime words were postboundary words (e.g., company) in Experiment 1, and preboundary words (e.g., mill) in Experiment 2. In both experiments, Korean listeners showed priming effects only in IP contexts, indicating that they can make use of IP boundary cues of English in lexical segmentation of English. The acoustic correlates of domain-initial strengthening were also exploited by Korean listeners, but significant effects were found only for the segmentation of postboundary words. The results therefore indicate that L2 listeners can make use of prosodically driven phonetic detail in lexical segmentation of L2, as long as the direction of those cues is similar in their L1 and L2. The exact use of the cues by Korean listeners was, however, different from that found with native English listeners in Cho, McQueen, and Cox (2007). The differential use of the prosodically driven phonetic cues by the native and non-native listeners is thus discussed.

Differentially Expressed Genes by Methylmercury in Neuroblastoma cell line using suppression subtractive hybridization (SSH) and cDNA Microarray

  • Kim, Youn-Jung;Chang, Suk-Tai;Yun, Hye-Jung;Ryu, Jae-Chun
    • Proceedings of the Korea Society of Environmental Toxicology Conference / 2003.05a / pp.187-187 / 2003
  • Methylmercury (MeHg), one of the heavy metal compounds, can cause severe damage to the central nervous system in humans. Many reports have shown that MeHg has been released into the environment and is poisonous to the human body through contaminated foods. Despite many studies on the pathogenesis of MeHg-induced central neuropathy, no useful mechanism of toxicity has been established so far. In this study, two methods, cDNA microarray and SSH, were performed to assess the expression profile against MeHg and to identify genes differentially expressed by MeHg in a neuroblastoma cell line. TwinChip Human-8K (Digital Genomics) was used with total RNA from SH-SY5Y (a human neuroblastoma cell line) treated with solvent (DMSO) and 6.25 µM (IC50) MeHg. We also performed forward and reverse SSH on mRNA derived from SH-SY5Y treated with DMSO and MeHg (6.25 µM). Differentially expressed cDNA clones were sequenced and screened by dot blot and ribonuclease protection assay to confirm that individual clones indeed represent differentially expressed genes. These sequences were identified by BLAST homology search against known genes or expressed sequence tags (ESTs). Analysis of these sequences may provide insight into the biological effects of MeHg in the pathogenesis of neurodegenerative disease and a possibility of developing a more efficient and exact monitoring system for heavy metals as environmental pollutants.

Estimation of Substring Selectivity in Biological Sequence Database (생물학 서열 데이타베이스에서 부분 문자열의 선적도 추정)

  • 배진욱;이석호
    • Journal of KIISE:Databases / v.30 no.2 / pp.168-175 / 2003
  • Until now, substring selectivities have been estimated in two steps. The first step is to build a count-suffix tree, which holds statistical information about substrings, and the second step is to estimate substring selectivity using it. However, it is practically infeasible to build a count-suffix tree from biological sequences because they are too long. This paper therefore proposes a novel data structure, the count q-gram tree, consisting of fixed-length substrings. The count q-gram tree retains the exact counts of all substrings whose lengths are equal to or less than q; it is generated in O(N) time and in a size that does not depend on N, the total length of all sequences. This paper also presents an estimation technique, k-MO. k-MO can choose the overlapping length of the substrings split from a query string, and this choice affects the accuracy of the selectivity estimate and the query processing time. Experiments show that k-MO can estimate very accurately.
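
The idea of keeping exact counts for every substring of length at most q can be sketched with a flat counter. This is a simplification under stated assumptions: the paper describes a tree, whereas the sketch uses a plain `Counter`, it does not implement the k-MO estimator, and the function name `count_qgrams` is hypothetical.

```python
from collections import Counter


def count_qgrams(sequences, q):
    """Exact occurrence counts of all substrings of length 1..q."""
    counts = Counter()
    for seq in sequences:
        for length in range(1, q + 1):
            # Slide a window of the given length over the sequence.
            for start in range(len(seq) - length + 1):
                counts[seq[start:start + length]] += 1
    return counts


counts = count_qgrams(["ACGAC"], q=2)
print(counts["AC"], counts["G"])
```

For a fixed q and a fixed alphabet, the number of distinct keys is bounded by the number of possible strings of length at most q, regardless of how long the input sequences are, which matches the size property stated in the abstract.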