• Title/Summary/Keyword: exact sequences

Searching Sequential Patterns by Approximation Algorithm (근사 알고리즘을 이용한 순차패턴 탐색)

  • Sarlsarbold, Garawagchaa;Hwang, Young-Sup
    • Journal of the Korea Society of Computer and Information / v.14 no.5 / pp.29-36 / 2009
  • Sequential pattern mining, which discovers frequent subsequences as patterns in a sequence database, is an important data mining problem with broad applications. Since a sequential pattern in DNA sequences can be a motif, we studied how to find sequential patterns in DNA sequences. Most previously proposed mining algorithms follow an exact-matching definition of a sequential pattern. In practice, they cannot cope with noisy environments and inaccurate data. These problems occur frequently in DNA sequences, which are biological data. We investigated an approximate matching method to deal with those cases. Our idea is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call approximated patterns. The existing PrefixSpan algorithm can successfully find sequential patterns in a long sequence. We improved the PrefixSpan algorithm to find approximate sequential patterns. The experimental results showed that the number of repeats found by the proposed method was 5 times more than that of PrefixSpan when the pattern length was 4.

Historical Introduction of Japanese Wild Mice, Mus musculus, from South China and the Korean Peninsula

  • Nunome, Mitsuo;Suzuki, Hitoshi;Moriwaki, Kazuo
    • Animal Systematics, Evolution and Diversity / v.29 no.4 / pp.267-271 / 2013
  • In Japan, the wild house mouse Mus musculus consists of two lineages, one from Southeast Asia (Mus musculus castaneus; CAS) and one from northern Eurasia (Mus musculus musculus; MUS). However, the exact origins of the parental lineages are unclear. A recent work using mitochondrial sequences revealed that Japanese CAS and MUS are closely related to haplotypes from South China and the Korean Peninsula, respectively. Recent phylogeographic analyses using nuclear gene sequences have also confirmed a close relationship between Japan and Korea in the MUS component. However, the Japanese CAS components in the nuclear genome are likely to be unique and to differ from those of other CAS territories, including South China. Although the origins are still unresolved, these results allow us to conclude that two areas of the continent, South China and the Korean Peninsula, are the primary source areas of Japanese wild mice and suggest pre-historical introductions associated with certain historical agricultural developments in East Asia.

Identification of Nicotiana tabacum Cultivars using Molecular Markers

  • Um, Yu-Rry;Cho, Eun-Jeong;Shin, Ha-Jeong;Kim, Ho-Bang;Seok, Yeong-Seon;Kim, Kwan-Suk;Lee, Yi
    • Journal of the Korean Society of Tobacco Science / v.30 no.2 / pp.85-93 / 2008
  • This report describes a set of seven informative single-nucleotide polymorphisms (SNPs) and one insertion-deletion (INDEL) distributed over 24 cultivars that can be used for tobacco (Nicotiana tabacum L.) cultivar identification. We analyzed 163,000 genomic DNA sequences downloaded from the Tobacco Genome Initiative database and assembled 31,370 contigs and 60,000 singletons. Using relatively long contigs, we designed primer sets for PCR amplification. We amplified 61 loci from 24 cultivars and sequenced the PCR products. We found seven significant SNPs and one INDEL among the sequences and classified the 24 cultivars into 10 groups. The SNP frequency of tobacco, 1/8,380 bp, was very low in comparison with those of other plant species, which range between 1/46 bp and 1/336 bp. For exact identification of tobacco cultivars, many more SNP markers should be developed. This study is the first attempt to identify tobacco cultivars using SNP markers.

Efficient Preprocessing Method for Binary Centroid Tracker in Cluttered Image Sequences (복잡한 배경영상에서 효과적인 전처리 방법을 이용한 표적 중심 추적기)

  • Cho, Jae-Soo
    • Journal of Advanced Navigation Technology / v.10 no.1 / pp.48-56 / 2006
  • This paper proposes an efficient preprocessing technique for a binary centroid tracker in correlated image sequences. It is known that the following factors, among others, determine the performance of the binary centroid target tracker: (1) an efficient real-time preprocessing technique, (2) exact target segmentation from cluttered background images, and (3) intelligent tracking window sizing. The proposed centroid tracker consists of an adaptive segmentation method based on novel distance features and an efficient real-time preprocessing technique that enhances the distinction between the objects of interest and their local background. Various tracking experiments using synthetic images as well as real Forward-Looking InfraRed (FLIR) images are performed to show the usefulness of the proposed methods.

Identification of differentially expressed Genes by methyl mercury in neuroblastoma cell line using SSH

  • Kim, Youn-Jung;Chang, Suk-Tai;Ryu, Jae-Chun
    • Proceedings of the Korea Society of Environmental Toxicology Conference / 2002.10a / pp.167-167 / 2002
  • Methylmercury (MeHg), one of the heavy metal compounds, can cause severe damage to the central nervous system in humans. Many reports have attributed MeHg poisoning to contaminated foods and release into the environment. Despite many studies on the pathogenesis of MeHg-induced central neuropathy, no useful mechanism of toxicity has been established. To find genes differentially expressed by MeHg in neuronal cells, we performed forward and reverse suppression subtractive hybridization (SSH) on mRNA derived from the neuroblastoma cell line SH-SY5Y treated with solvent (DMSO) and $6.25\;{\mu}M\;(IC_{50})$ MeHg. Differentially expressed cDNA clones were sequenced and the mRNAs were re-examined on Northern blots. These sequences were identified by BLAST homology search against known genes or expressed sequence tags (ESTs). Analysis of these sequences has provided an insight into the biological effects of MeHg in the pathogenesis of neurodegenerative disease and a possibility of developing a more efficient and exact monitoring system for heavy metals as common environmental pollutants.

Effectual Method FOR 3D Rebuilding From Diverse Images

  • Leung, Carlos Wai Yin;Hons, B.E.
    • 한국정보컨버전스학회:학술대회논문집 / 2008.06a / pp.145-150 / 2008
  • This thesis explores the problem of reconstructing a three-dimensional (3D) scene given a set of images or image sequences of the scene. It describes efficient methods for the 3D reconstruction of static and dynamic scenes from stereo images, stereo image sequences, and images captured from multiple viewpoints. Novel methods for image-based and volumetric modelling approaches to 3D reconstruction are presented, with an emphasis on the development of efficient algorithms which produce high quality and accurate reconstructions. For image-based 3D reconstruction, a novel energy minimisation scheme, Iterated Dynamic Programming, is presented for the efficient computation of strong local minima of discontinuity preserving energy functions. Coupled with a novel morphological decomposition method and subregioning schemes for the efficient computation of a narrowband matching cost volume, the minimisation framework is applied to solve problems in stereo matching, stereo-temporal reconstruction, motion estimation, 2D image registration and 3D image registration. This thesis establishes Iterated Dynamic Programming as an efficient and effective energy minimisation scheme suitable for computer vision problems which involve finding correspondences across images. For 3D reconstruction from multiple view images with arbitrary camera placement, a novel volumetric modelling technique, Embedded Voxel Colouring, is presented that efficiently embeds all reconstructions of a 3D scene into a single output in a single scan of the volumetric space under exact visibility. An adaptive thresholding framework is also introduced for the computation of the optimal set of thresholds to obtain high quality 3D reconstructions. This thesis establishes the Embedded Voxel Colouring framework as a fast, efficient and effective method for 3D reconstruction from multiple view images.

Scene Change Detection Techniques Using DC components and Moving Vector in DCT-domain of MPEG systems (MPEG system의 DCT변환영역에서 DC성분과 움직임 벡터를 이용한 영상 장면전환 검출기법)

  • 박재두;이광형
    • Journal of the Korea Society of Computer and Information / v.4 no.3 / pp.28-34 / 1999
  • In this paper, we propose a method of scene change detection for video sequences using the DC components and the motion vectors of the macroblocks in the DCT blocks. The proposed method detects scene changes in the compressed MPEG domain independently of the specific sequence. To do this, we define new metrics for scene change detection using the features of picture components and detect the exact scene change point in B-pictures using the sharp response of B-pictures to the motion vectors. In brief, we detect cut points using I-pictures, as well as gradual scene changes such as dissolve, fade, and wipe. As a result, our proposed method shows good test results for various MPEG sequences.

Identification of Differentially Expressed Genes by Exposure of Methylmercury in Neuroblastoma Cell Line Using Suppression Subtractive Hybridization (SSH)

  • Kim, Youn-Jung;Ryu, Jae-Chun
    • Molecular & Cellular Toxicology / v.2 no.1 / pp.60-66 / 2006
  • Methylmercury (MeHg), one of the heavy metal compounds, can cause severe damage to the central nervous system in humans. Many reports have shown that MeHg poisons the human body through contaminated foods and has been released into the environment. Despite many studies on the pathogenesis of MeHg-induced central neuropathy, no useful mechanism of toxicity has been established so far. This study used the suppression subtractive hybridization (SSH) method to identify genes differentially expressed by MeHg in the SH-SY5Y human neuroblastoma cell line. We prepared total RNA from SH-SY5Y cells treated with solvent (DMSO) and $6.25\;{\mu}M\;(IC_{50})$ MeHg and performed forward and reverse SSH. Differentially expressed cDNA clones were screened by dot blot and sequenced, and real-time RT-PCR confirmed that individual clones indeed represent differentially expressed genes. These sequences were identified by BLAST homology search against known genes or expressed sequence tags (ESTs). Analysis of these sequences may provide an insight into the biological effects of MeHg in the pathogenesis of neurodegenerative disease and a possibility of developing a more efficient and exact monitoring system for heavy metals as ubiquitous environmental pollutants.

An Efficient DNA Sequence Compression using Small Sequence Pattern Matching

  • Murugan, A.;Punitha, K.
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.281-287 / 2021
  • Bioinformatics blends biology and information technology and employs statistical methods to address problems in nutrition, medical research, and the study of the living environment. The continual growth of DNA sequencing technologies has produced voluminous genomic data, especially DNA sequences, which calls for increased storage and bandwidth. Bioinformatics currently faces the major hurdle of managing, interpreting, and accurately preserving this large volume of information. Compression is a promising way to resolve these issues. To keep storage efficient, a compression methodology is proposed. In addition, an efficient algorithm is introduced that aids exact matching of small patterns. The DNA sequence representation is then used to find matches of 2 to 6 bases against the remaining input sequence. The process transforms the DNA sequence into ASCII symbols in the first level, compresses it with the LZ77 compression method in the second level, and then forms grid variables of size 3 to hold 100 characters. In the third level of compression, the compressed output is held in the grid variables. The proposed algorithm, S_Pattern DNA, gives a better average compression ratio of 93% compared with existing compression algorithms on datasets from the UCI repository.

AllEC: An Implementation of Application for EC Numbers Prediction based on AEC Algorithm

  • Park, Juyeon;Park, Mingyu;Han, Sora;Kim, Jeongdong;Oh, Taejin;Lee, Hyun
    • International Journal of Advanced Culture Technology / v.10 no.2 / pp.201-212 / 2022
  • With the development of sequencing technology, there is a need for technology to predict the function of a protein sequence. Enzyme Commission (EC) numbers are becoming markers that distinguish the function of a sequence. In particular, many researchers are studying various deep-learning-based methods of predicting the EC numbers of protein sequences. However, because different methods can give different answers, it is unclear which prediction for a sequence is correct. To solve this problem, this paper proposes an All Enzyme Commission (AEC) algorithm. The proposed AEC algorithm executes various prediction methods and integrates their results when predicting sequences. It gives more weight to a value when the same value is obtained from multiple methods. The largest value, among the final prediction result values for each method to which the weight is applied, is the final prediction result. Moreover, for the convenience of researchers, the proposed algorithm is provided through the AllEC web service, which can be used regardless of operating system, installation, or operating environment.
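
The integration step described in the AllEC abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weighting formula, so the sketch assumes each method returns an EC number with a confidence score and that the weight is simply the number of methods agreeing on that EC number. The function name `integrate_predictions` and the method names are hypothetical.

```python
from collections import Counter


def integrate_predictions(predictions):
    """Combine EC number predictions from several methods.

    predictions: dict mapping a method name to (ec_number, score).
    Assumption (not from the paper): the weight of a prediction is the
    number of methods that returned the same EC number.
    Returns the chosen EC number and the weighted score per method.
    """
    # Count how many methods agree on each EC number.
    agreement = Counter(ec for ec, _ in predictions.values())

    # Apply the agreement weight to each method's score.
    weighted = {
        method: score * agreement[ec]
        for method, (ec, score) in predictions.items()
    }

    # The prediction with the largest weighted score is the final result.
    best_method = max(weighted, key=weighted.get)
    return predictions[best_method][0], weighted


# Hypothetical outputs from three prediction methods.
example = {
    "method_a": ("1.1.1.1", 0.6),
    "method_b": ("1.1.1.1", 0.5),
    "method_c": ("2.7.11.1", 0.9),
}
final_ec, weighted_scores = integrate_predictions(example)
```

In this example two methods agree on the same EC number, so their scores are doubled and the agreed value outranks a single higher-scoring but isolated prediction.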