• Title/Summary/Keyword: Sequence Alignment


Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

  • Kim, Jong-Kyoung;Raghava, G. P. S.;Kim, Kwang-S.;Bang, Sung-Yang;Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2004.11a
    • /
    • pp.158-166
    • /
    • 2004
  • Predicting the destination of a protein in a cell gives valuable information for annotating the function of the protein. Recent technological breakthroughs have led us to develop more accurate methods for predicting the subcellular localization of proteins. The most important factor in determining the accuracy of these methods is the way useful features are extracted from protein sequences. We propose a new method that extracts appropriate features from the sequence data alone by computing pairwise sequence alignment scores. A support vector machine (SVM) is used as the classifier. The overall prediction accuracy, evaluated by the jackknife validation technique, reaches 94.70% for the eukaryotic non-plant data set and 92.10% for the eukaryotic plant data set, the highest prediction accuracy among methods reported so far on these data sets. Our numerical experiments confirm that our feature extraction method based on pairwise sequence alignment is useful for this classification problem.
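
The feature extraction idea in this abstract can be illustrated with a minimal sketch: a sequence is represented by its alignment scores against a set of reference sequences. The abstract does not state the alignment type or scoring scheme, so the Smith-Waterman local score, the match/mismatch/gap values, the function names, and the toy sequences below are all assumptions made for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not the paper's.
    Only two rows of the DP matrix are kept, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def alignment_features(query, references):
    """Represent a sequence by its alignment score against each reference."""
    return [local_alignment_score(query, ref) for ref in references]


# Hypothetical toy sequences; real inputs would be full protein sequences.
references = ["MKTAYIAKQR", "MSLLTEVETP", "MKKLLPTAAA"]
features = alignment_features("MKTAYIAKQR", references)
print(features)
```

A vector like `features` could then be passed to any SVM implementation (for example scikit-learn's `SVC`) together with the localization labels of the training sequences.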


Sequence Analysis and Potential Action of Eukaryotic Type Protein Kinase from Streptomyces coelicolor A3(2)

  • Roy, Daisy R.;Chandra, Sathees B.C.
    • Genomics & Informatics
    • /
    • v.6 no.1
    • /
    • pp.44-49
    • /
    • 2008
  • Protein kinase C (PKC) is a family of kinases involved in the transduction of cellular signals that promote lipid hydrolysis. PKC plays a pivotal role in mediating cellular responses to extracellular stimuli involved in proliferation, differentiation, and apoptosis. A comparative analysis of the PKC-$\alpha$, -$\beta$, and -$\varepsilon$ isozymes in 200 recently sequenced microbial genomes was carried out using a variety of bioinformatics tools. The diversity and evolution of PKC were determined by sequence alignment. Streptomyces coelicolor A3(2) is the only bacterium whose Ser/Thr protein kinase shows a sequence alignment score greater than 30% with all three PKC isotypes. S. coelicolor is of interest because it is notable for producing pharmaceutically useful compounds, including anti-tumor agents, immunosuppressants, and over two-thirds of all natural antibiotics currently available. A comparative analysis of the three human PKC isotypes and the serine/threonine protein kinase of S. coelicolor was carried out, and a possible mechanism of action of PKC was derived. Our analysis indicates that the serine/threonine protein kinase from S. coelicolor can be a good candidate for a potent anti-tumor agent. The presence of three representative isotypes of the PKC superfamily in this organism helps us understand the mechanism of PKC from an evolutionary perspective.

A Study of Alignment Tolerance's Definition and Test Method for Airborne Camera (항공기 탑재용 카메라 정렬오차 정의 및 시험방안 연구)

  • Song, Dae-Buem;Yoon, Yong-Eun;Lee, Hang-Bok
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.16 no.2
    • /
    • pp.154-159
    • /
    • 2013
  • Alignment tolerance for an EO/IR airborne camera using common optics is an important factor in stabilization accuracy and geo-pointing accuracy. Before an airborne camera is mounted on the aircraft, defining and verifying its alignment tolerance is essential in production as well as in research and development. In this paper, we establish the basic concept of the definition and elements of alignment tolerance for airborne cameras and propose how to measure each element. The components of alignment tolerance, in measurement sequence, are: 1) tolerance of alignment between the EO and IR LOS; 2) tolerance of sensor alignment; 3) tolerance of position reporting accuracy; 4) tolerance of mount alignment.

Multi-Level Sequence Alignment: An Adaptive Control Method Between Speed and Accuracy for Document Comparison (계산속도 및 정확도의 적응적 제어가 가능한 다단계 문서 비교 시스템)

  • Seo, Jong-Kyu;Tak, Haesung;Cho, Hwan-Gue
    • Journal of KIISE
    • /
    • v.41 no.9
    • /
    • pp.728-743
    • /
    • 2014
  • Fingerprinting and sequence alignment are well-known approaches to document similarity comparison. Fingerprinting is simple and fast, but it cannot locate the specific regions that are similar. String alignment identifies regions of similarity by arranging the sequences of a string. It can locate specific similar regions, but it takes more computing time. Multi-Level Alignment (MLA) is a new method designed to combine the advantages of both. MLA divides the input documents into uniform-length blocks, extracts fingerprints from each block, and calculates the similarity of block pairs by comparing the fingerprints. A similarity table is created in this process. Finally, sequence alignment is used to identify the longest similar regions in the similarity table. MLA lets users change the block size to control the proportion of work done by the fingerprint algorithm and by sequence alignment. Because a document is divided into several blocks, similar regions are also fragmented across two or more blocks. To solve this fragmentation problem, we propose a united block method. Experimentally, we show that computing document similarity with the united block is more accurate than the original MLA method, with a minor loss in time.
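
The block, fingerprint, and similarity-table steps described in this abstract can be sketched roughly as follows. Everything specific here is an assumption: the fingerprint is taken to be the set of character k-grams in a block, block similarity is Jaccard overlap, and the final alignment step is replaced by a much simpler stand-in that finds the longest gapless diagonal run of similar blocks. The united block method is not shown.

```python
def split_blocks(text, block_size):
    """Divide a document into uniform-length blocks (the last may be shorter)."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


def fingerprint(block, k=3):
    """Assumed fingerprint: the set of character k-grams in the block."""
    return {block[i:i + k] for i in range(len(block) - k + 1)}


def jaccard(x, y):
    """Overlap of two fingerprint sets, in [0, 1]."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


def similarity_table(doc_a, doc_b, block_size=20, k=3):
    """table[i][j] is the similarity of block i of doc_a and block j of doc_b."""
    fa = [fingerprint(b, k) for b in split_blocks(doc_a, block_size)]
    fb = [fingerprint(b, k) for b in split_blocks(doc_b, block_size)]
    return [[jaccard(x, y) for y in fb] for x in fa]


def longest_similar_run(table, threshold=0.5):
    """Longest diagonal run of block pairs with similarity >= threshold.

    A gapless stand-in for the alignment step; returns
    (length in blocks, start block in doc_a, start block in doc_b).
    """
    best = (0, 0, 0)
    prev = []
    for i, row in enumerate(table):
        curr = []
        for j, sim in enumerate(row):
            if sim >= threshold:
                length = (prev[j - 1] if i > 0 and j > 0 else 0) + 1
            else:
                length = 0
            curr.append(length)
            if length > best[0]:
                best = (length, i - length + 1, j - length + 1)
        prev = curr
    return best


shared = "sequence alignment finds similar regions"  # 40 characters
doc_a = "x" * 20 + shared
doc_b = shared + "y" * 20
table = similarity_table(doc_a, doc_b, block_size=20)
run = longest_similar_run(table)
print(run)
```

Here `block_size` plays the role the abstract gives it: larger blocks shift work toward the cheap fingerprint comparison, while smaller blocks produce a larger table for the alignment step and finer localization.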

A Survey of Sequence Alignment Algorithms (서열 정렬 알고리즘의 연구 동향)

  • 성종희;김동규
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2003.05b
    • /
    • pp.571-574
    • /
    • 2003
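  • Sequence alignment is widely used in molecular biology and related fields because it facilitates the functional, structural, and evolutionary analysis of new sequences. Sequence alignment algorithms have been an active area of research. In particular, with the exponential growth of biological data and the increasing number of species whose whole genome sequences have been analyzed, algorithms that perform sequence alignment faster and more accurately have become necessary. This paper surveys research trends in sequence alignment algorithms, from dynamic programming approaches to whole-genome sequence algorithms.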


Evaluation of Alignment Methods for Genomic Analysis in HPC Environment (HPC 환경의 대용량 유전체 분석을 위한 염기서열정렬 성능평가)

  • Lim, Myungeun;Jung, Ho-Youl;Kim, Minho;Choi, Jae-Hun;Park, Soojun;Choi, Wan;Lee, Kyu-Chul
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.107-112
    • /
    • 2013
  • With the progress of NGS technologies, the volume of genome data has exploded recently. To analyze such data effectively, HPC techniques are necessary. In this paper, we organized a genome analysis pipeline to call SNPs from NGS data. To organize the pipeline efficiently in an HPC environment, we analyzed the CPU utilization pattern of each pipeline step. We found that sequence alignment is computation-centric and suitable for parallelization. We also analyzed the performance of parallel open-source alignment tools and found that an alignment method utilizing many-core processors can improve the performance of the genome analysis pipeline.

miRNA Pattern Discovery from Sequence Alignment

  • Sun, Xiaohan;Zhang, Junying
    • Journal of Information Processing Systems
    • /
    • v.13 no.6
    • /
    • pp.1527-1543
    • /
    • 2017
  • MiRNAs are short biological sequences that play a crucial role in almost all important biological processes. MiRNA patterns are common sequence segments of multiple mature miRNA sequences, and they are significant for identifying miRNAs because of their functional implications. In the proposed approach, the primary miRNA patterns are produced from sequence alignment and then cut into short segment miRNA patterns. From the segment miRNA patterns, candidate miRNA patterns are selected based on estimated probability, and from these, potential miRNA patterns are further selected according to classification performance between authentic and artificial miRNA sequences. Three parameters are suggested: bi-nucleotides are employed to compute the estimated probability of segment miRNA patterns, and the top 1% of segment miRNA patterns of length four, ranked by estimated probability, are selected as potential miRNA patterns.

Bioinformatics Approach to Direct Target Prediction for RNAi Function and Non-specific Cosuppression in Caenorhabditis elegans (생물정보학적 접근을 통한 Caenorhabditis elegans 모델시스템의 생체내 RNAi 기능예측 및 비특이적 공동발현억제 현상 분석)

  • Kim, Tae-Ho;Kim, Eui-Yong;Joo, Hyun
    • KSBB Journal
    • /
    • v.26 no.2
    • /
    • pp.131-138
    • /
    • 2011
  • Computational approaches are needed to characterize RNAi sequences, because verifying almost all RNAi sequences experimentally takes considerable time and effort. Incomplete understanding of the RNAi mechanism and other unknown factors in the organism frequently raise questions about the potential use of RNAi in therapeutic applications. Our massively parallelized pairwise alignment scoring between dsRNA in GenBank and expressed sequence tags (ESTs) from the Caenorhabditis elegans Genome Sequencing Project proved to be a useful tool for predicting the details of RNAi-induced cosuppression for practical use. This pairwise alignment scoring method, run on high-performance computing, indicated that numerous instances of unwanted gene silencing and cosuppression may exist even among pairs with high matching scores. Classifying the relatively high-scoring matches according to the GO (Gene Ontology) system could map C. elegans dsRNA to functional roles in an applied system. Our prediction also showed that more than 78% of the predicted co-suppressible genes are located in the ribosomal spot of C. elegans.

A Database Retrieval Model for Efficient Gene Sequence Alignment (효율적인 유전자 서열 비교를 위한 데이타베이스 검색 모델)

  • 김민준;임성화;김재훈;이원태;정진원
    • Journal of KIISE:Databases
    • /
    • v.31 no.3
    • /
    • pp.243-251
    • /
    • 2004
  • Most bioinformatics programs provide biochemists and biologists with retrieval and analysis services for gene and protein databases. Because these services query the database separately for each arriving user request, they take a long time and increase server load and response time. In this paper, by exploiting the database retrieval patterns of sequence alignment programs in bioinformatics, a grouping method is proposed to share database retrieval among many requests. A carpool method is also proposed to reduce response time and increase system scalability by combining newly arriving requests with requests already in progress. The performance of the two proposed schemes is verified by mathematical analysis and simulation.
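
The two sharing schemes in this abstract can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's design: `score` is a toy stand-in for an alignment program, "grouping" is modeled as one database pass serving a batch of queries, and "carpool" is modeled as a cyclic scan that a new request joins at the current position and leaves after it has seen every record once. All names and data are hypothetical.

```python
def score(query, record):
    """Toy similarity: count of positions where the two strings agree."""
    return sum(1 for q, r in zip(query, record) if q == r)


def serve_grouped(queries, database):
    """Answer a batch of queries with a single pass over the database."""
    best = {q: (-1, None) for q in queries}
    reads = 0
    for record in database:  # each record is read once for the whole group
        reads += 1
        for q in queries:
            s = score(q, record)
            if s > best[q][0]:
                best[q] = (s, record)
    return {q: rec for q, (s, rec) in best.items()}, reads


def serve_carpool(arrivals, database):
    """Cyclic scan; arrivals is a list of (arrival_step, query) pairs.

    One record is read per step. A request is active from its arrival
    step until it has seen all len(database) records.
    """
    n = len(database)
    best = {q: (-1, None) for _, q in arrivals}
    first = min(a for a, _ in arrivals)
    last = max(a for a, _ in arrivals) + n
    reads = 0
    for step in range(first, last):
        active = [q for a, q in arrivals if a <= step < a + n]
        if not active:
            continue
        record = database[step % n]
        reads += 1
        for q in active:
            s = score(q, record)
            if s > best[q][0]:
                best[q] = (s, record)
    return {q: rec for q, (s, rec) in best.items()}, reads


database = ["ACGTAC", "TTGACA", "GGCATC", "ACGTTT"]
grouped_result, grouped_reads = serve_grouped(["ACGTAA", "TTGACC"], database)
carpool_result, carpool_reads = serve_carpool(
    [(0, "ACGTAA"), (2, "TTGACC")], database
)
print(grouped_reads, carpool_reads)
```

Serving the two queries independently would read `2 * len(database)` records. The grouped pass reads each record once, and the carpool scan falls in between because the second request arrives while the first scan is already under way.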

An Algorithm for multiple local alignment with Normalized Local Alignment Algorithm (정규화된 지역 정렬 알고리즘을 적용한 다중 지역 정렬 알고리즘)

  • Jang, Suk-Bong;Lee, Gye-Sung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2003.05b
    • /
    • pp.1019-1022
    • /
    • 2003
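  • Among sequence alignment methods that compare two sequences to find similarity or homology, the Smith-Waterman algorithm is widely used for local alignment. We examine an efficient way to overcome its limitations, the mosaic effect and the shadow effect. We find the multiple local alignments that may exist within the sequences by locating several maxima rather than a single maximum and aligning each of them, and we rank the resulting alignments using the normalized sequence alignment algorithm.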
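
A rough sketch of the idea in this abstract: fill the Smith-Waterman matrix, take several high-scoring end cells instead of only the global maximum, and rank the resulting local alignments by a length-normalized score. The details are assumptions for illustration only: the scoring values, the rule that rejects a candidate whose region overlaps an accepted one, and the normalization `score / (len_a + len_b + L)` with an arbitrary constant `L`. True normalized local alignment optimizes the normalized score directly rather than re-ranking Smith-Waterman results.

```python
def smith_waterman_matrix(a, b, match=2, mismatch=-1, gap=-2):
    """Fill the Smith-Waterman score matrix with a linear gap penalty."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
    return H


def trace_start(H, a, b, i, j, match, mismatch, gap):
    """Walk back from end cell (i, j) to where the local alignment starts."""
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return i, j


def multiple_local_alignments(a, b, min_score=4, L=5,
                              match=2, mismatch=-1, gap=-2):
    """Return non-overlapping local alignments, best normalized score first.

    Each item is (normalized, raw_score, a_start, a_end, b_start, b_end),
    with half-open intervals into a and b.
    """
    H = smith_waterman_matrix(a, b, match, mismatch, gap)
    cells = sorted(
        ((H[i][j], i, j)
         for i in range(1, len(a) + 1)
         for j in range(1, len(b) + 1)
         if H[i][j] >= min_score),
        reverse=True,
    )
    found = []
    for raw, i, j in cells:
        i0, j0 = trace_start(H, a, b, i, j, match, mismatch, gap)
        # Reject candidates whose rectangle intersects an accepted one.
        overlaps = any(i0 < ei and si < i and j0 < ej and sj < j
                       for (_, _, si, ei, sj, ej) in found)
        if not overlaps:
            normalized = raw / ((i - i0) + (j - j0) + L)
            found.append((normalized, raw, i0, i, j0, j))
    return sorted(found, reverse=True)


for hit in multiple_local_alignments("ACGTTTGATTACA", "GATTACAGGGACGT"):
    print(hit)
```

Dividing by the aligned lengths penalizes long alignments that accumulate score across poorly matching stretches, which is the kind of behavior the mosaic effect refers to.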
