• Title/Summary/Keyword: Comparative bioinformatics


Verifying Orthologous Paralogenes using Whole Genome Alignment

  • Chan, P.Y.;Lam, T.W.;Yiu, S.M.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.109-112 / 2005
  • Identifying orthologous paralogenes is a fundamental problem in comparative genomics and can facilitate the study of the evolutionary history of species. Existing approaches for locating paralogs rely on local-alignment-based algorithms such as BLAST. However, there are cases in which genes with high alignment scores are not paralogenes. Whole genome alignment tools, on the other hand, are designed to locate orthologs. Most of these tools identify an orthologous pair by means of unique substrings (called anchors) that the pair shares. Intuitively, these tools may not seem useful for identifying orthologous paralogenes, because paralogenes are very similar and there may not be enough unique anchors. However, our study shows that this is not true. Although paralogenes are similar, they have undergone different mutations, so there are enough unique anchors to identify them. Our contributions include the following. Based on this counter-intuitive finding, we propose employing whole genome alignment tools to help verify paralogenes. Our experiments on five pairs of human-mouse chromosomes show that our approach is effective and can identify most of the mis-classified paralog groups (more than 80%). We verify our finding that whole genome alignment tools are able to locate orthologous paralogenes through a simulation study. The result from the study confirms our finding.
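
The anchor idea in this abstract can be illustrated with a minimal sketch. It assumes an anchor is a k-mer that occurs exactly once in each of the two sequences; the function names and the toy sequences are hypothetical, and real whole genome alignment tools use more elaborate anchor definitions and index structures.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def unique_anchors(seq_a, seq_b, k):
    """Return (pos_a, pos_b, kmer) for k-mers that occur exactly once in each sequence."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    anchors = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        # A k-mer is an anchor only if it is unique on both sides.
        if counts_a[kmer] == 1 and counts_b.get(kmer, 0) == 1:
            anchors.append((i, seq_b.find(kmer), kmer))
    return anchors


# Toy example: the only shared 4-mer is "ACGT", and it is unique in both sequences.
print(unique_anchors("AAACGTAAA", "CCACGTCC", 4))  # → [(2, 2, 'ACGT')]
```

A k-mer that is repeated in either sequence is rejected, which is why highly similar paralogs would be expected to yield few anchors unless they have diverged by mutation.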


An Orthologous Group Clustering Technique based on the Grid Computing

  • Oh, J.S.;Kim, T.K.;Kim, S.S.;Kwon, H.R.;Kim, Y.C.;Yoo, J.S.;Cho, W.S.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.72-77 / 2005
  • Orthologs are genes with the same function in different species that diverged from a single gene in the last common ancestor of those species. Orthologous groups are useful in genome annotation, studies of gene evolution, and comparative genomics. However, constructing orthologous groups is difficult to automate and time-consuming, and it is hard to guarantee the accuracy of the resulting groups. We propose a system that constructs orthologous groups across many genomes automatically and rapidly. We use grid computing to reduce the sequence alignment time, and we apply a clustering algorithm within the database application to automate the whole process. Thanks to grid computing, we generated orthologous groups for 20 complete prokaryotic genomes in just one day. Furthermore, new genomes can be accommodated easily by the clustering algorithm and grid computing. We compared the generated orthologous groups with COGs (Clusters of Orthologous Groups of proteins) and KO (KEGG Orthology). The comparison shows about 85 percent similarity with these well-known orthologous databases.
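
The abstract does not say which clustering algorithm the system uses. As an illustration only, the sketch below groups genes by a common stand-in technique: keep reciprocal best hits between genomes, then merge them into connected components with union-find. The `genome:gene` identifier format, the `best_hits` input layout, and all function names are assumptions made for this example.

```python
def reciprocal_pairs(best_hits):
    """best_hits maps (gene, other_genome) -> that gene's best-scoring hit in other_genome.

    Gene identifiers are assumed to look like "genome:gene".
    """
    pairs = set()
    for (gene, _other_genome), hit in best_hits.items():
        own_genome = gene.split(":")[0]
        # Keep the pair only if the hit's best match back in our genome is this gene.
        if best_hits.get((hit, own_genome)) == gene:
            pairs.add(tuple(sorted((gene, hit))))
    return pairs


def cluster(pairs):
    """Merge reciprocal pairs into groups (connected components) with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for gene in list(parent):
        groups.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(group) for group in groups.values())


best_hits = {
    ("eco:a1", "bsu"): "bsu:b1",
    ("bsu:b1", "eco"): "eco:a1",
    ("bsu:b1", "mtu"): "mtu:c1",
    ("mtu:c1", "bsu"): "bsu:b1",
    ("eco:a2", "bsu"): "bsu:b1",  # one-way hit only, so eco:a2 joins no group
}
print(cluster(reciprocal_pairs(best_hits)))
```

Because groups are connected components, adding a new genome only requires its best hits against existing genomes before re-running the merge, which is one plausible reading of how new genomes could be accommodated incrementally.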


3D-QSAR Studies on 2-(indol-5-yl)thiazole Derivatives as Xanthine Oxidase (XO) Inhibitors

  • Nagarajan, Santhosh Kumar;Madhavan, Thirumurthy
    • Journal of Integrative Natural Science / v.8 no.4 / pp.258-266 / 2015
  • Xanthine oxidase is an enzyme that oxidizes hypoxanthine to xanthine, and xanthine to uric acid. It is widely distributed throughout various organs including the liver, gut, lung, kidney, heart, brain and plasma, and it is involved in gout pathogenesis. In this study, we performed Comparative Molecular Field Analysis (CoMFA) on a series of 2-(indol-5-yl)thiazole derivatives as xanthine oxidase (XO) inhibitors to relate structural variations to their inhibitory activities. Ligand-based CoMFA models were generated based on atom-by-atom matching alignment. In atom-by-atom matching, the bioactive conformation of the highly active molecule 11 was generated using systematic search. Compounds were aligned on this bioactive conformation, and the alignment was used for model generation. Different CoMFA models were generated using different alignments, and the best model yielded a cross-validated $q^2$ of 0.698 with five components and a non-cross-validated correlation coefficient ($r^2$) of 0.992, with a Fisher value of 236.431 and an estimated standard error of 0.068. The predictive ability of the best CoMFA model was found to be $r^2_{pred}$ = 0.653. The CoMFA study revealed that the $R_3$ position of the structure is important in influencing the biological activity of the inhibitors. Electropositive groups and bulkier substituents at this position enhance the biological activity.
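
The difference between the non-cross-validated $r^2$ and the cross-validated $q^2$ quoted above can be sketched as follows. This is a minimal illustration that assumes leave-one-out cross-validation and uses a one-descriptor least-squares line as a stand-in for the PLS model that CoMFA actually fits; the function names and data are hypothetical, and some packages define the $q^2$ denominator slightly differently.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Non-cross-validated r^2: fit on all compounds, score on the same compounds."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot, predicting each compound from the rest."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot


descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical descriptor values
activity = [1.1, 1.9, 3.2, 3.8, 5.1]     # hypothetical activities
print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

For a least-squares fit, each held-out residual is at least as large as the corresponding fitted residual, so $q^2$ never exceeds $r^2$; a large gap between the two is a common warning sign of overfitting.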

Docking and QSAR Studies of PARP-1 Inhibitors

  • Kim, Hye-Jung;Cho, Seung-Joo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.210-218 / 2004
  • Poly(ADP-ribose) polymerase-1 (PARP-1) is a nuclear enzyme involved in various physical functions related to genomic repair, and PARP inhibitors have therapeutic applications in a variety of neurological diseases. Docking and QSAR (quantitative structure-activity relationship) studies of 52 PARP-1 inhibitors were conducted using the FlexX algorithm, comparative molecular field analysis (CoMFA), and hologram quantitative structure-activity relationship analysis (HQSAR). The resultant FlexX model showed a reasonable correlation ($r^2$ = 0.701) between predicted and observed activity. Partial least squares analysis produced statistically significant models with $q^2$ values of 0.795 (SDEP = 0.690, $r^2$ = 0.940, s = 0.367) and 0.796 (SDEP = 0.678, $r^2$ = 0.919, s = 0.427) for CoMFA and HQSAR, respectively. The models for the entire inhibitor set were validated by a prediction test and by scrambling in both QSAR methods. In this work, the combination of docking, CoMFA with 3D descriptors, and HQSAR based on molecular fragments provided an improved understanding of the interaction between the inhibitors and PARP. This can be utilized in virtual screening to design novel PARP-1 inhibitors.


A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing

  • Hyeonwoo Kim;Jiwon Kim;Ji Won Cho;Kwang-Sung Ahn;Dong-Il Park;Sangsoo Kim
    • Genomics & Informatics / v.21 no.3 / pp.40.1-40.11 / 2023
  • Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of sequencing reads during quality control, particularly in datasets with a large number of samples. We present a streamlined pipeline that integrates FastP for read trimming, HmmUFOtu for operational taxonomic unit (OTU) clustering, Vsearch for chimera checking, and Kraken2 for taxonomic assignment. To assess the pipeline's performance, we reprocessed two published stool datasets of normal Korean populations: one with 890 and the other with 1,462 independent samples. In the first dataset, HmmUFOtu retained 93.2% of over 104 million read pairs after quality trimming, discarding chimeric or unclassifiable reads, while DADA2, a commonly used ASV method, retained only 44.6% of the reads. Nonetheless, both methods yielded qualitatively similar β-diversity plots. For the second dataset, HmmUFOtu retained 89.2% of read pairs, while DADA2 retained a mere 18.4% of the reads. HmmUFOtu, being a closed-reference clustering method, facilitates merging separately processed datasets, with shared OTUs between the two datasets exhibiting a correlation coefficient of 0.92 in total abundance (log scale). While the first two dimensions of the β-diversity plot exhibited a cohesive mixture of the two datasets, the third dimension revealed the presence of a batch effect. Our comparative evaluation of ASV and OTU methods within this streamlined pipeline provides valuable insights into their performance when processing large-scale microbial 16S rRNA amplicon sequencing data. The strengths of HmmUFOtu and its potential for dataset merging are highlighted.
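
The log-scale correlation of shared OTU abundance mentioned above could be computed along the following lines. This is a sketch under assumptions: it uses Pearson correlation on log10 of total read counts, restricted to OTUs with a positive count in both datasets. The abstract does not state the log base or the correlation type, and the function name and OTU identifiers are hypothetical.

```python
import math


def shared_otu_log_correlation(counts_a, counts_b):
    """Pearson correlation of log10 total abundance over OTUs present in both datasets."""
    shared = sorted(
        otu for otu in set(counts_a) & set(counts_b)
        if counts_a[otu] > 0 and counts_b[otu] > 0
    )
    xs = [math.log10(counts_a[otu]) for otu in shared]
    ys = [math.log10(counts_b[otu]) for otu in shared]
    n = len(shared)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical total read counts per OTU in two separately processed datasets.
dataset_1 = {"otu1": 10, "otu2": 100, "otu3": 1000, "otu4": 5}
dataset_2 = {"otu1": 100, "otu2": 1000, "otu3": 10000, "otu5": 7}
print(shared_otu_log_correlation(dataset_1, dataset_2))
```

Matching OTUs by identifier is only meaningful because closed-reference clustering assigns reads to a fixed reference set, so the same identifier denotes the same OTU in both runs.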

De Novo Assembly and Comparative Analysis of the Enterococcus faecalis Genome (KACC 91532) from a Korean Neonate

  • Ham, Jun Sang;Kwak, Woori;Chang, Oun Ki;Han, Gi Sung;Jeong, Seok Geun;Seol, Kuk Hwan;Kim, Hyoun Wook;Kang, Geun Ho;Park, Beom Young;Lee, Hyun-Jeong;Kim, Jong Geun;Kim, Kyu-Won;Sung, Samsun;Lee, Taeheon;Cho, Seoae;Kim, Heebal
    • Journal of Microbiology and Biotechnology / v.23 no.7 / pp.966-973 / 2013
  • Using a newly constructed de novo assembly pipeline, a finished-genome-level assembly was produced for the probiotic candidate strain E. faecalis KACC 91532, isolated from a stool sample of Korean neonates. Our gene prediction identified 3,061 genes in the assembled genome of the strain. Among these, nine genes were specific to E. faecalis KACC 91532 when compared with all four known reference genomes (EF62, D32, V583, OG1RF). We identified genes related to phenotypic characters and detected E. faecalis KACC 91532-specific evolutionarily accelerated genes using dN/dS analysis. From these results, we found the potential risk of KACC 91532 as a useful probiotic strain and identified some candidate genetic variations that could affect the function of enzymes.

Pharmacophore-Based Comparative Molecular Similarity Indices Analysis of CRTh2 Antagonists

  • Babu, Sathya
    • Journal of Integrative Natural Science / v.8 no.4 / pp.273-284 / 2015
  • Chemoattractant Receptor Homologous molecule expressed on Th2 cells (CRTh2) is a chemoattractant receptor with seven transmembrane helices, targeted for inflammatory diseases such as asthma and allergic rhinitis. In this study, pharmacophore-based Comparative Molecular Similarity Indices Analysis (CoMSIA) was performed on a series of 2-(2-(benzylthio)-1H-benzo[d]imidazol-1-yl) acetic acid derivatives. Initially, the GASP module was used to generate pharmacophore models from five highly active compounds in the dataset. Among the generated pharmacophores, the best model was selected based on fitness score and used as the template for aligning the compounds for CoMSIA analysis. The best predictions were obtained using steric, hydrophobic and H-bond acceptor parameters, giving $q^2$ = 0.559 and $r^2$ = 0.730. Fifteen test set compounds were used to investigate the predictive ability of the CoMSIA model. Contour maps suggested that the presence of bulky substituents and H-bond acceptor atoms at the 5th position of the benzene ring will increase the activity of the compounds. The results obtained from this study will be useful for designing more potent CRTh2 antagonists.

Comparative Modeling Studies of 1-deoxy-D-xylulose 5-phosphate Synthase (MEP pathway) from Mycobacterium tuberculosis

  • Kothandan, Gugan
    • Journal of Integrative Natural Science / v.4 no.3 / pp.202-209 / 2011
  • Tuberculosis is a major health problem in humans because of its multidrug resistance, and new treatments for this disease are urgently required. The synthesis of isoprenoids in Mycobacterium tuberculosis has been reported as an interesting pathway to target. In this context, the 2C-methyl-D-erythritol 4-phosphate (MEP) pathway of M. tuberculosis has drawn attention. The MEP pathway begins with the condensation of glyceraldehyde 3-phosphate and pyruvate to form 1-deoxy-D-xylulose 5-phosphate (DXP), which is catalyzed by 1-deoxy-D-xylulose 5-phosphate synthase (DXS). As no X-ray structure has been reported for this target, comparative modeling was used to generate the three-dimensional structure. The structure was further validated with PROCHECK, VERIFY-3D, PROSA, ERRAT and WHATIF. Molecular docking studies were performed with the substrate (thiamine pyrophosphate) and the reported inhibitor (2-methyl-3-(4-fluorophenyl)-5-(4-methoxy-phenyl)-4H-pyrazolo[1,5-a]pyrimidin-7-one) against the developed model to identify the crucial residues in the active site. This study may further be useful for structure-based drug design.

A comparative study of filter methods based on information entropy

  • Kim, Jung-Tae;Kum, Ho-Yeun;Kim, Jae-Hwan
    • Journal of Advanced Marine Engineering and Technology / v.40 no.5 / pp.437-446 / 2016
  • Feature selection has become an essential technique for reducing the dimensionality of data sets. Many features are frequently irrelevant or redundant for classification tasks. The purpose of feature selection is to select relevant features and remove irrelevant and redundant ones. Applications of feature selection range from text processing, face recognition, bioinformatics, speaker verification, and medical diagnosis to financial domains. In this study, we focus on filter methods based on information entropy: IG (Information Gain), FCBF (Fast Correlation Based Filter), and mRMR (minimum Redundancy Maximum Relevance). FCBF has the advantage of reducing the computational burden by eliminating redundant features that satisfy the approximate Markov blanket condition. However, FCBF considers only the relevance between a feature and the class when selecting the best features, and thus fails to take the interaction between features into consideration. In this paper, we propose an improved FCBF to overcome this shortcoming. We also perform a comparative study to evaluate the performance of the proposed method.
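
The entropy-based scores behind these filter methods can be sketched for discrete features as follows. The code computes information gain and the symmetric uncertainty that FCBF uses as its relevance measure; it illustrates the standard definitions only, not the improved FCBF proposed in the paper, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X) in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(values, given):
    """H(values | given): entropy of values within each group of given, weighted by group size."""
    n = len(values)
    total = 0.0
    for group, count in Counter(given).items():
        subset = [v for v, g in zip(values, given) if g == group]
        total += (count / n) * entropy(subset)
    return total


def information_gain(feature, labels):
    """IG(labels; feature) = H(labels) - H(labels | feature)."""
    return entropy(labels) - conditional_entropy(labels, feature)


def symmetric_uncertainty(feature, labels):
    """SU = 2 * IG / (H(feature) + H(labels)), a normalized score in [0, 1]."""
    denom = entropy(feature) + entropy(labels)
    return 2 * information_gain(feature, labels) / denom if denom else 0.0


labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]   # determines the class exactly
irrelevant = ["a", "b", "a", "b"]    # independent of the class
print(symmetric_uncertainty(informative, labels))  # → 1.0
print(symmetric_uncertainty(irrelevant, labels))   # → 0.0
```

Both scores look at one feature against the class at a time, which is the limitation the abstract points out: a feature that is only informative in combination with another would score near zero here.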

Sequence Analysis and Potential Action of Eukaryotic Type Protein Kinase from Streptomyces coelicolor A3(2)

  • Roy, Daisy R.;Chandra, Sathees B.C.
    • Genomics & Informatics / v.6 no.1 / pp.44-49 / 2008
  • Protein kinase C (PKC) is a family of kinases involved in the transduction of cellular signals that promote lipid hydrolysis. PKC plays a pivotal role in mediating cellular responses to extracellular stimuli involved in proliferation, differentiation and apoptosis. Comparative analysis of the PKC-$\alpha$, -$\beta$ and -$\varepsilon$ isozymes against 200 recently sequenced microbial genomes was carried out using a variety of bioinformatics tools. The diversity and evolution of PKC were determined by sequence alignment. Streptomyces coelicolor A3(2) is the only bacterium whose Ser/Thr protein kinase shows a sequence alignment score greater than 30% with all three PKC isotypes. S. coelicolor is of interest because it is notable for the production of pharmaceutically useful compounds, including anti-tumor agents, immunosuppressants and over two-thirds of all natural antibiotics currently available. A comparative analysis of the three human PKC isotypes and the serine/threonine protein kinase of S. coelicolor was carried out, and a possible mechanism of action of PKC was derived. Our analysis indicates that the serine/threonine protein kinase from S. coelicolor can be a good candidate for a potent anti-tumor agent. The presence of three representative isotypes of the PKC superfamily in this organism helps us to understand the mechanism of PKC from an evolutionary perspective.