• Title/Summary/Keyword: Biological Sequence Database

Search Result 94, Processing Time 0.045 seconds

Cloning and Characterization of a Rice cDNA Encoding Glutamate Decarboxylase

  • Oh, Suk-Heung;Choi, Won-Gyu;Lee, In-Tae;Yun, Song-Joong
    • BMB Reports / v.38 no.5 / pp.595-601 / 2005
  • In this study, we isolated a rice (Oryza sativa L.) glutamate decarboxylase (RicGAD) clone from a root cDNA library, using a partial Arabidopsis thaliana GAD gene as a probe. The rice root cDNA library was constructed with mRNA derived from the roots of rice seedlings subjected to phosphorus deprivation. Nucleotide sequence analysis indicated that the RicGAD clone was 1,712 bp long and harbored a complete open reading frame of 505 amino acids. The 505 amino acid sequence deduced from this RicGAD clone exhibited 67.7% and 61.9% identity with OsGAD1 (AB056060) and OsGAD2 (AB056061) in the database, respectively. The 505 amino acid sequence also exhibited 62.9, 64.1, and 64.2% identity to Arabidopsis GAD (U9937), Nicotiana tabacum GAD (AF020425), and Petunia hybrida GAD (L16797), respectively. RicGAD was found to possess a highly conserved tryptophan residue, but to lack the lysine cluster at the C-proximal position, as well as other stretches of positively charged residues. The GAD sequence was expressed heterologously using the high copy number plasmid pVUCH. Our activation analysis revealed that the maximal activation of RicGAD occurred in the presence of both $Ca^{2+}$ and calmodulin. The GAD-encoded 56-58 kDa protein was identified via Western blot analysis, using an anti-GAD monoclonal antibody. The results of our RT-PCR analyses revealed that RicGAD is expressed predominantly in rice roots obtained from rice seedlings grown under phosphorus deprivation conditions, and in non-germinated brown rice, which is known to have limited phosphorus bioavailability. These results indicate that RicGAD is a $Ca^{2+}$/calmodulin-dependent enzyme, and that RicGAD is expressed primarily under phosphate deprivation conditions.

Analysis of partial cDNA sequence from Theileria sergenti

  • Park, Jin-ho;Chae, Joon-seok;Kim, Dae-hyuk;Jang, Yong-suk;Kwon, Oh-deog;Lee, Joo-mook
    • Korean Journal of Veterinary Research / v.39 no.4 / pp.797-805 / 1999
  • A T. sergenti cDNA library was constructed to obtain broader information about the structural, functional, and antigenic properties of its proteins, and partial cDNA sequences and expressed sequence tags (ESTs) were analyzed. mRNA was purified from T. sergenti isolates to identify antigen genes, and first- and second-strand cDNA was then synthesized. The synthesized cDNA was ligated to EcoR I adaptors, digested with Xho I, and ligated into a Uni-ZAP XR vector. The T. sergenti cDNA library was then constructed by in vitro packaging and amplification. The library was screened with antisera against T. sergenti. Among the resulting clones, eight phagemids were rescued from the recombinants by in vivo excision with f1 helper phage. Restriction endonuclease analysis and PCR showed that the recombinant cDNAs carried inserts of 0.5-3.0 kb. Partial sequences of the eight cDNA clones were obtained and examined for homology using BLASTN and BLASTX. The eight sequenced clones were classified into three groups on the basis of database searches. A total of 3,045 bp of partial cDNA sequence was determined from six clones. The putatively identified clones contain a cytochrome c gene, a heat shock protein gene, a cyclophilin gene, and a ribosomal protein gene. The unidentified clones show homology to the ATP-binding protein (mtrA) gene of S. argillaceus, the DNA-binding protein (DBP) gene of Pseudorabies virus, the 85 kDa merozoite protein gene of B. bovis, the spm1 protein mRNA of T. annulata, and the glycine-rich RNA-binding protein mRNA of O. sativa, among others.

Structure-based Functional Discovery of Proteins: Structural Proteomics

  • Jung, Jin-Won;Lee, Weon-Tae
    • BMB Reports / v.37 no.1 / pp.28-34 / 2004
  • The discovery of biochemical and cellular functions of unannotated gene products begins with a database search for structural or sequence homologues among known genes. Very recently, a number of frontier groups in structural biology proposed a new paradigm for predicting the biological functions of an unknown protein from its three-dimensional structure on a genomic scale. Structural proteomics (genomics), a research area for structure-based functional discovery, aims to complete the protein-folding universe of all gene products in a cell. It would lead us to a complete understanding of a living organism from protein structure. Two major complementary experimental techniques, X-ray crystallography and NMR spectroscopy, combined with recently developed high-throughput methods, have played a central role in structural proteomics research; however, integrating these methodologies with comparative modeling and electron microscopy would accelerate progress toward completing a full dictionary of protein folding space in the near future.

Development of Digital Endoscopic Image Processing System (디지탈 내시경 영상처리 시스템의 개발)

  • 송철규;이영묵
    • Journal of Biomedical Engineering Research / v.18 no.2 / pp.121-126 / 1997
  • Endoscopy has become a crucial diagnostic and therapeutic procedure in clinical practice. Over the past three years, we have developed a computerized system to record and store clinical data pertaining to endoscopic surgery, including laparoscopic cholecystectomy, pelviscopic endometriosis surgery, and surgical arthroscopy. In this study, we developed a computer system composed of a frame grabber, a sound board, a VCR control board, a LAN card, and EDMS (endoscopic data management software). The computer system also controls peripheral instruments such as a color video printer and a video cassette recorder, as well as endoscopic input/output signals (images and doctors' comments). The digital endoscopic data management system is based on an open architecture and a set of widely available industry standards, namely Windows 3.1 as the operating system, TCP/IP as the network protocol, and a time-sequence-based database that handles both images and doctors' comments. For data storage, we used MOD and CD-R. The digital endoscopic system was designed to store, recreate, change, and compress signals and medical images.

A DNA Index Structure using Frequency and Position Information of Genetic Alphabet (염기문자의 빈도와 위치정보를 이용한 DNA 인덱스구조)

  • Kim Woo-Cheol;Park Sang-Hyun;Won Jung-Im;Kim Sang-Wook;Yoon Jee-Hee
    • Journal of KIISE:Databases / v.32 no.3 / pp.263-275 / 2005
  • In large DNA databases, indexing techniques are widely used for rapid approximate sequence searching. However, most indexing techniques require more space than the original database and are difficult to integrate seamlessly with a DBMS. In this paper, we propose a space-efficient, disk-based indexing and query processing algorithm for approximate DNA sequence searching, specifically for exact match queries, wildcard match queries, and k-mismatch queries. Our indexing method places a sliding window at every possible location of a DNA sequence and extracts its signature by considering the occurrence frequency of each nucleotide. It then stores the set of signatures in a multi-dimensional index, such as an R*-tree. In particular, by assigning a weight to each position of a window, it prevents signatures from being concentrated around a few spots in the index space. Our query processing algorithm converts a query sequence into a multi-dimensional rectangle and searches the index for the signatures that overlap the rectangle. Experiments with real biological data sets revealed that the proposed method is at least three times, twice, and several orders of magnitude faster than the suffix-tree-based method for exact match, wildcard match, and k-mismatch queries, respectively.
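The signature idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the linear position weights (1, 2, 3, ...), and the way the k-mismatch rectangle is built are assumptions made for illustration only, since the abstract does not state the actual weighting scheme or rectangle construction.

```python
from collections import Counter

ALPHABET = "ACGT"  # one signature dimension per nucleotide


def frequency_signature(window):
    """Count how often each nucleotide occurs in the window."""
    counts = Counter(window)
    return tuple(counts[base] for base in ALPHABET)


def weighted_signature(window):
    """Position-weighted variant. The linear weights 1..w are a
    hypothetical choice; the abstract does not give the real weights."""
    sig = dict.fromkeys(ALPHABET, 0)
    for weight, base in enumerate(window, start=1):
        if base in sig:  # ignore symbols outside A/C/G/T
            sig[base] += weight
    return tuple(sig[base] for base in ALPHABET)


def sliding_signatures(seq, w, signature=frequency_signature):
    """Place a window of length w at every position and extract its signature."""
    return [(pos, signature(seq[pos:pos + w])) for pos in range(len(seq) - w + 1)]


def query_rectangle(sig, k, w):
    """Bounding box for a k-mismatch query on unweighted signatures:
    k substitutions change each nucleotide count by at most k."""
    return tuple((max(0, c - k), min(w, c + k)) for c in sig)


if __name__ == "__main__":
    print(sliding_signatures("ACGTA", 4))
    # Same composition, different order: only the weighted signature differs.
    print(frequency_signature("AACC"), frequency_signature("CCAA"))
    print(weighted_signature("AACC"), weighted_signature("CCAA"))
    print(query_rectangle(frequency_signature("ACGTAC"), k=1, w=6))
```

In a full system the signatures would be stored in a multi-dimensional index such as an R*-tree and the rectangle used as a range query. Candidates returned by the range query would still need to be verified against the actual sequence, because distinct windows can share a signature.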

Algorithm for Predicting Functionally Equivalent Proteins from BLAST and HMMER Searches

  • Yu, Dong Su;Lee, Dae-Hee;Kim, Seong Keun;Lee, Choong Hoon;Song, Ju Yeon;Kong, Eun Bae;Kim, Jihyun F.
    • Journal of Microbiology and Biotechnology / v.22 no.8 / pp.1054-1058 / 2012
  • To predict biologically significant attributes such as function from protein sequences, searching large databases for homologous proteins is common practice. In particular, BLAST and HMMER are widely used in a variety of biological fields. However, sequence-homologous proteins determined by BLAST and proteins having the same domains predicted by HMMER are not always functionally equivalent, even when their sequences align with high similarity. Thus, accurate assignment of functionally equivalent proteins from aligned sequences remains a challenge in bioinformatics. We have developed the FEP-BH algorithm to predict functionally equivalent proteins from protein-protein pairs identified by BLAST and from protein-domain pairs predicted by HMMER. When examined against domain classes of the Pfam-A seed database, FEP-BH showed 71.53% accuracy, whereas BLAST and HMMER achieved 57.72% and 36.62%, respectively. We expect that the FEP-BH algorithm will be effective in predicting functionally equivalent proteins from BLAST and HMMER outputs and will also suit biologists who want to find functionally equivalent proteins among sequence-homologous proteins.

Draft Genome Assembly and Annotation for Cutaneotrichosporon dermatis NICC30027, an Oleaginous Yeast Capable of Simultaneous Glucose and Xylose Assimilation

  • Wang, Laiyou;Guo, Shuxian;Zeng, Bo;Wang, Shanshan;Chen, Yan;Cheng, Shuang;Liu, Bingbing;Wang, Chunyan;Wang, Yu;Meng, Qingshan
    • Mycobiology / v.50 no.1 / pp.66-78 / 2022
  • The identification of oleaginous yeast species capable of simultaneously utilizing xylose and glucose as substrates to generate value-added biological products is an area of key economic interest. We have previously demonstrated that the Cutaneotrichosporon dermatis NICC30027 yeast strain is capable of simultaneously assimilating both xylose and glucose, resulting in considerable lipid accumulation. However, as no high-quality genome sequencing data or associated annotations for this strain are available at present, it remains challenging to study the metabolic mechanisms underlying this phenotype. Herein, we report a 39,305,439 bp draft genome assembly for C. dermatis NICC30027 comprised of 37 scaffolds, with 60.15% GC content. Within this genome, we identified 524 tRNAs, 142 sRNAs, 53 miRNAs, 28 snRNAs, and eight rRNA clusters. Moreover, repeat sequences totaling 1,032,129 bp in length were identified (2.63% of the genome), as were 14,238 unigenes that were 1,789.35 bp in length on average (64.82% of the genome). The NCBI non-redundant protein sequences (NR) database was employed to successfully annotate 11,795 of these unigenes, while 3,621 and 11,902 were annotated with the Swiss-Prot and TrEMBL databases, respectively. Unigenes were additionally subjected to pathway enrichment analyses using the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Cluster of Orthologous Groups of proteins (COG), Clusters of orthologous groups for eukaryotic complete genomes (KOG), and Non-supervised Orthologous Groups (eggNOG) databases. Together, these results provide a foundation for future studies aimed at clarifying the mechanistic basis for the ability of C. dermatis NICC30027 to simultaneously utilize glucose and xylose to synthesize lipids.
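The genome fractions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures given above; the variable names are illustrative, and the unigene fraction assumes that total unigene length equals the unigene count times the reported mean length.

```python
# Figures taken from the abstract above.
genome_bp = 39_305_439        # draft assembly size
repeat_bp = 1_032_129         # total length of repeat sequences
unigene_count = 14_238
mean_unigene_bp = 1_789.35    # average unigene length

# Percentage of the genome covered by repeats.
repeat_pct = 100 * repeat_bp / genome_bp

# Percentage covered by unigenes, assuming total length = count x mean length.
unigene_pct = 100 * unigene_count * mean_unigene_bp / genome_bp

print(f"repeats:  {repeat_pct:.2f}% of the genome")   # 2.63
print(f"unigenes: {unigene_pct:.2f}% of the genome")  # 64.82
```

Both values agree with the percentages reported in the abstract (2.63% and 64.82%).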

Identification and Expression of Equine MER-Derived miRNAs

  • Gim, Jeong-An;Kim, Heui-Soo
    • Molecules and Cells / v.40 no.4 / pp.262-270 / 2017
  • MicroRNAs (miRNAs) are single-stranded, small RNAs (21-23 nucleotides) that function in gene silencing and translational inhibition via the RNA interference mechanism. Most miRNAs originate from host genomic regions, such as intergenic regions, introns, exons, and transposable elements (TEs). Here, we focused on the palindromic structure of medium reiteration frequencies (MERs), which are similar to precursor miRNAs. Five MER consensus sequences (MER5A1, MER53, MER81, MER91C, and MER117) were matched with paralogous transcripts predicted to be precursor miRNAs in the horse genome (equCab2) and located in either intergenic regions or introns. The MER5A1, MER53, and MER91C sequences obtained from RepeatMasker were matched with the eca-miR-544b, eca-miR-1302, and eca-miR-652 precursor sequences derived from the Ensembl transcript database, respectively. Each precursor form was anticipated to yield two mature forms, and we confirmed miRNA expression in six different tissues (cerebrum, cerebellum, lung, spleen, adrenal gland, and duodenum) of one thoroughbred horse. MER5A1-derived miRNAs generally showed significantly higher expression in the lung than in other tissues. MER91C-derived miRNA-5p also showed significantly higher expression in the duodenum than in other tissues (cerebellum, lung, spleen, and adrenal gland). The MER117-overlapped expressed sequence tag generated polycistronic miRNAs, which showed higher expression in the duodenum than in other tissues. These data indicate that horse MER transposons encode miRNAs that are expressed in several tissues and are thought to have biological functions.

Transcription Factor for Gene Function Analysis in Maize (옥수수 유전자 기능 분석을 위한 전사인자의 이해)

  • Moon, Jun-Cheol;Kim, Jae Yoon;Baek, Seong-Bum;Kwon, Young-Up;Song, Kitae;Lee, Byung-Moo
    • Korean Journal of Crop Science / v.59 no.3 / pp.263-281 / 2014
  • Transcription factors are essential for the regulation of gene expression in plants. They bind to either the enhancer or the promoter region of DNA adjacent to a gene and are involved in basal transcription regulation, differential enhancement of transcription, development, response to intercellular signals or the environment, and cell cycle control. The mechanisms by which transcription factors control gene expression can be understood through assessment of the complete maize genome sequence. The maize genome may encode 4,000 or more transcription factors because it has undergone whole-genome duplication in the past. Several maize transcription factors have been characterized previously. In this review article, transcription factor families with many members relative to other families were selected using the Pfam database and are listed as follows: ABI3/VP1, AP2/EREBP, ARF, ARID, AS2, AUX/IAA, BES1, bHLH, bZIP, C2C2-CO-like, C2C2-Dof, C2C2-GATA, C2C2-YABBY, C2H2, E2F/DP, FHA, GARP-ARR-B, GeBP, GRAS, HMG, HSF, MADS, MYB, MYB-related, NAC, PHD, and WRKY families. For motif analysis, the amino acid sequences were aligned with ClustalW and the conserved sequences are shown as sequence logos. This review will contribute to further molecular biological analysis and breeding studies that use maize transcription factors as a strategy for selecting target genes.

Design of Heterogeneous Content Linkage Method by Analyzing Genbank (Genbank 분석을 통한 이종의 콘텐츠 연계 방안 설계)

  • Ahn, Bu-Young;Lee, Myung-Sun;Kim, Ji-Young;Oh, Chung-Shick
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.49-54 / 2010
  • Because gene sequence information is not only diverse but also extremely large in volume, high-performance computing and information technology are required to build and analyze gene sequence databases. This has given rise to bioinformatics, a field of research in which computers are used to collect, manage, store, evaluate, and analyze biological data. In line with the continued development of bioinformatics, the Korea Institute of Science and Technology Information (KISTI) has built an IT-based infrastructure for biological information and provided this information to bioscience researchers. This paper analyzes the reference fields of GenBank, the gene database most frequently used by researchers worldwide among life-science information databases, and proposes a method for interfacing with NDSL, the integrated science and technology information service provided by KISTI. To this end, after collecting GenBank data from the NCBI FTP site, we rebuilt the database by separating the GenBank text files into basic gene data and reference data. New tables were then generated by extracting paper and patent information from the GenBank reference fields. Finally, we suggest a method for linking these to the paper DB and the patent DB operated by KISTI.
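The separation step described in this abstract, splitting a GenBank flat-file record into basic gene data and reference data, might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the authors' code: the function names and the sample record (accession `X00000`, authors, titles) are invented, the parser relies only on the flat-file layout in which keywords occupy the first 12 columns, and treating a `JOURNAL` value that starts with `Patent:` as a patent reference is an assumed rule based on how GenBank records commonly mark patents.

```python
# Entirely made-up record, laid out in GenBank flat-file style
# (keywords in the first 12 columns, continuation lines indented).
SAMPLE = """\
LOCUS       EXAMPLE1                 120 bp    DNA     linear   PLN 01-JAN-2000
DEFINITION  Hypothetical example record.
ACCESSION   X00000
SOURCE      Oryza sativa
  ORGANISM  Oryza sativa
REFERENCE   1  (bases 1 to 120)
  AUTHORS   Doe,J. and Roe,R.
  TITLE     A made-up title that wraps
            onto a second line
  JOURNAL   J. Example 1 (1), 1-2 (2000)
REFERENCE   2  (bases 1 to 120)
  AUTHORS   Doe,J.
  TITLE     A made-up patent
  JOURNAL   Patent: XX 0000000-A 1 01-JAN-2000;
FEATURES             Location/Qualifiers
ORIGIN
//
"""


def split_genbank_record(text):
    """Return (basic_fields, references) parsed from one record's header."""
    basic, references = {}, []
    current, key = None, None  # dict and field receiving continuation lines
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(("FEATURES", "ORIGIN", "//")):
            break  # the header ends where the feature table begins
        label, value = line[:12].strip(), line[12:].strip()
        if not label:  # continuation of the previous field
            if current is not None and key is not None:
                current[key] += " " + value
        elif line[0] != " " and label == "REFERENCE":
            current = {"REFERENCE": value}  # start a new reference block
            references.append(current)
            key = label
        else:
            if line[0] != " ":  # any other top-level keyword
                current = basic
            if current is not None:
                current[label] = value
                key = label
    return basic, references


def reference_kind(ref):
    """Assumed rule: a JOURNAL value starting with 'Patent:' marks a patent."""
    return "patent" if ref.get("JOURNAL", "").startswith("Patent:") else "paper"


basic, refs = split_genbank_record(SAMPLE)
print(basic["ACCESSION"], [reference_kind(r) for r in refs])
```

Each returned reference dictionary could then be written to a paper table or a patent table keyed by accession number, which is one plausible way to link the records to external paper and patent databases.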