Search | Korea Science

Algorithm for Predicting Functionally Equivalent Proteins from BLAST and HMMER Searches

Yu, Dong Su; Lee, Dae-Hee; Kim, Seong Keun; Lee, Choong Hoon; Song, Ju Yeon; Kong, Eun Bae; Kim, Jihyun F.
Journal of Microbiology and Biotechnology / v.22 no.8 / pp.1054-1058 / 2012
To predict biologically significant attributes such as function from protein sequences, searching large databases for homologous proteins is common practice. In particular, BLAST and HMMER are widely used in a variety of biological fields. However, sequence-homologous proteins determined by BLAST and proteins having the same domains predicted by HMMER are not always functionally equivalent, even when their sequences align with high similarity. Thus, accurate assignment of functionally equivalent proteins from aligned sequences remains a challenge in bioinformatics. We have developed the FEP-BH algorithm to predict functionally equivalent proteins from protein-protein pairs identified by BLAST and from protein-domain pairs predicted by HMMER. When examined against domain classes of the Pfam-A seed database, FEP-BH showed 71.53% accuracy, whereas BLAST and HMMER showed 57.72% and 36.62%, respectively. We expect that the FEP-BH algorithm will be effective in predicting functionally equivalent proteins from BLAST and HMMER outputs and will also suit biologists who want to identify functionally equivalent proteins among sequence-homologous proteins.
https://doi.org/10.4014/jmb.1203.03050

In silico Prediction of Angiogenesis-related Genes in Human Hepatocellular Carcinoma

Kang, Seung-Hui; Park, Jeong-Ae; Hong, Soon-Sun; Kim, Kyu-Won
Genomics & Informatics / v.2 no.3 / pp.134-141 / 2004
Hepatocellular carcinoma (HCC) is one of the most common malignancies worldwide and a typical hypervascular tumor. Therefore, it is important to find factors related to angiogenesis in the process of HCC malignancy. To find angiogenesis-related factors in HCC, we combined in silico prediction with an experimental assay. We analyzed 1457 genes extracted from cDNA microarrays of HCC patients by text mining, sequence similarity search, and domain analysis. As a result, we predicted that 16 genes were likely to be involved in angiogenesis, and the effects of these genes were then confirmed by a hypoxia response element (HRE)-luciferase assay. For instance, we classified osteopontin as a potent angiogenic factor and coagulation factor XII as a significant antiangiogenic factor. Collectively, we suggest that a combination of in silico prediction and experimental approaches can identify HCC-specific angiogenesis-related factors effectively and rapidly.

Metagenome Analysis of Protein Domain Collocation within Cellulase Genes of Goat Rumen Microbes

Lim, SooYeon; Seo, Jaehyun; Choi, Hyunbong; Yoon, Duhak; Nam, Jungrye; Kim, Heebal; Cho, Seoae; Chang, Jongsoo
Asian-Australasian Journal of Animal Sciences / v.26 no.8 / pp.1144-1151 / 2013
In this study, protein domains with cellulase activity in goat rumen microbes were investigated using metagenomic and bioinformatic analyses. After the complete genome of goat rumen microbes was obtained using a shotgun sequencing method, 217,892,109 paired reads were filtered, including only those with 70% identity, 100-bp matches, and E-value thresholds below $10^{-10}$, using METAIDBA. These filtered contigs were assembled and annotated using blastN against the NCBI nucleotide database. As a result, a microbial community structure with 1431 species was analyzed, among which Prevotella ruminicola 23 and Butyrivibrio proteoclasticus B316 were the dominant groups. In parallel, 201 sequences related to cellulase activity (EC 3.2.1.4) were obtained through BLAST searches using the enzyme.dat file provided by the NCBI database. After translating the nucleotide sequences into protein sequences using InterProScan, 28 protein domains with cellulase activity were identified using the HMMER package with E-value thresholds below $10^{-5}$. Cellulase activity protein domain profiling showed that the major protein domains, such as lipase GDSL, cellulase, and Glyco hydro 10, were present in bacterial species with strong cellulase activities. Furthermore, correlation plots clearly displayed a strong positive correlation between some protein domain groups, which was indicative of microbial adaptation in the goat rumen based on feeding habits. This is the first metagenomic analysis of cellulase activity protein domains from the goat rumen using bioinformatics.
https://doi.org/10.5713/ajas.2013.13219
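
The cutoffs quoted in the abstract above (70% identity, 100-bp matches, E-value below $10^{-10}$) can be illustrated with a minimal Python sketch. It assumes hits are stored in the default BLAST tabular layout (`-outfmt 6`) and reads "70% identity" and "100-bp matches" as lower bounds; the function name `filter_hits` and the example rows are hypothetical and are not taken from the study.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, min_identity=70.0, min_length=100, max_evalue=1e-10):
    """Yield hits that pass the identity, length, and E-value cutoffs.

    The default cutoffs mirror the numbers in the abstract; treating them
    as inclusive lower bounds (identity, length) is an assumption.
    """
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        if (float(row["pident"]) >= min_identity
                and int(row["length"]) >= min_length
                and float(row["evalue"]) < max_evalue):
            yield row


# Made-up rows for illustration only (not data from the study).
example = (
    "read1\tsubjA\t85.0\t120\t18\t0\t1\t120\t1\t120\t1e-30\t200\n"
    "read2\tsubjB\t65.0\t150\t52\t0\t1\t150\t1\t150\t1e-20\t150\n"
    "read3\tsubjC\t90.0\t80\t8\t0\t1\t80\t1\t80\t1e-15\t100\n"
)
kept = [row["qseqid"] for row in filter_hits(io.StringIO(example))]
print(kept)
```

Here `read2` is dropped for low identity and `read3` for a short alignment, so only `read1` is kept.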

A Study on Construction of Integrated Prokaryotes Gene Prediction System (통합형 미생물 유전자 예측 시스템의 구축에 관한 연구)

Chang Jong-won; Ryoo Yoon-kyu; Ku Ja-hyo; Yoon Young-woo
Journal of the Institute of Convergence Signal Processing / v.6 no.1 / pp.27-32 / 2005
As a large amount of genome sequencing has been completed at a surprising speed in a short period, an automatic genome annotation process has become a prerequisite. The most difficult step in genome annotation is finding the protein-coding genes within a genome. The two main subjects of gene prediction are eukaryotes and prokaryotes; their genes have different structures, so their gene prediction methods also differ. Of the 231 species whose genomes have been sequenced so far, 200 are prokaryotes; prokaryotes may therefore be more appropriate than eukaryotes for biotechnology studies based on comparative genomics. Moreover, prokaryotes do not have the gene structure called an intron, which makes gene prediction easier. Earlier prokaryotic gene predictions have shown 80% to 90% accuracy, and recent studies aim at 100% accuracy. In this paper, gene prediction accuracy for the E. coli K-12 and S. typhi genomes reached 98.5% and 98.7%, respectively, which was better than that of the previous GLIMMER.
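
Because prokaryotic genes lack introns, a candidate gene can be approximated as one uninterrupted open reading frame (ORF). The sketch below is a simplified illustration of that idea only; it is not the system described in the paper, nor GLIMMER's method. It scans the forward strand only, and the function name `find_orfs` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of ATG..stop ORFs on the forward strand.

    Coordinates are 0-based with an exclusive end, and the stop codon is
    included in the reported span. Only the first ATG after the previous
    stop in each frame is used as the start.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and keep scanning this frame
    return sorted(orfs)


# Toy sequence for illustration only.
toy = "CCATGAAATTTTAGGG"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

A realistic gene finder would also scan the reverse complement, handle alternative start codons, and score candidates statistically; this sketch only shows why the absence of introns makes the search a contiguous scan.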

In silico genome wide identification and expression analysis of the WUSCHEL-related homeobox gene family in Medicago sativa

Yang, Tianhui; Gao, Ting; Wang, Chuang; Wang, Xiaochun; Chen, Caijin; Tian, Mei; Yang, Weidi
Genomics & Informatics / v.20 no.2 / pp.19.1-19.15 / 2022
Alfalfa (Medicago sativa) is an important food and feed crop that is rich in minerals. The WUSCHEL-related homeobox (WOX) gene family plays important roles in plant development, and identifying putative gene families, their structures, and their potential functions is a primary step not only for understanding the genetic mechanisms behind various biological processes but also for genetic improvement. A variety of computational tools, including MAFFT, HMMER, hidden Markov models, Pfam, SMART, MEGA, ProtTest, BLASTn, and BRAD, among others, were used. We identified 34 MsWOX genes, spread across eight chromosomes, based on a systematic analysis of the alfalfa genome. This is an expansion of the gene family, which we attribute to observed chromosomal duplications. Sequence alignment analysis revealed 61 conserved proteins containing a homeodomain. The phylogenetic study revealed five evolutionary clades with 15 motif distributions. Gene structure analysis reveals various exon, intron, and untranslated structures that are consistent among genes from similar clades. Functional prediction of promoter regions reveals various transcription factor binding sites for key growth, development, and stress-responsive transcription factor families such as MYB, ERF, AP2, and NAC, which are spread across the genes. Most of the genes are predicted to be located in the nucleus. In addition, duplication events in some genes explain the expansion of the family. The present research provides clues to the potential roles of MsWOX family genes that will be useful for further understanding their functions in alfalfa.
https://doi.org/10.5808/gi.22013
