• Title/Summary/Keyword: bio-database


HExDB: Human EXon DataBase for Alternative Splicing Pattern Analysis

  • Park, Junghwan;Lee, Minho;Bhak, Jong
    • Genomics & Informatics / v.3 no.3 / pp.80-85 / 2005
  • HExDB is a database for analyzing exon and splicing pattern information in Homo sapiens. HExDB is useful for two specific purposes: 1) designing primers for exon amplification from cDNA and 2) understanding how alternative splicing changes ORFs. HExDB was constructed by integrating data from AltExtron, a computationally predicted exon database, Ensembl cDNA annotation, and the recently published Affymetrix genome tile. Although it may contain false positives, HExDB is a good starting point because of its sensitivity. At present, as many as 2,046,519 exons are stored in HExDB. We found that $16.8\%$ of the exons in the database were constitutive exons and $83.1\%$ were novel gene exons.
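
    As a rough illustration of the second use case (how alternative splicing can change an ORF), the sketch below checks whether skipping a single exon keeps the downstream reading frame. The exon lengths and the function name `skipping_preserves_frame` are hypothetical and are not taken from HExDB. The sketch also assumes that every exon lies entirely within the coding region.

    ```python
    def skipping_preserves_frame(exon_lengths, skipped_index):
        """Return True if removing one exon keeps downstream codons in frame.

        Assumes every exon is fully coding (a simplifying assumption).
        """
        # Removing a multiple of 3 nucleotides leaves the codon phase unchanged.
        return exon_lengths[skipped_index] % 3 == 0


    # Hypothetical coding-exon lengths in nucleotides (not HExDB data).
    exons = [120, 87, 154, 99]
    for i, length in enumerate(exons):
        status = "in-frame" if skipping_preserves_frame(exons, i) else "frameshift"
        print(f"skip exon {i} ({length} nt): {status}")
    ```

    An in-frame skip removes amino acids but leaves the rest of the ORF intact, whereas a frameshift alters every downstream codon.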

REPEATOME: A Database for Repeat Element Comparative Analysis in Human and Chimpanzee

  • Woo, Tae-Ha;Hong, Tae-Hui;Kim, Sang-Soo;Chung, Won-Hyong;Kang, Hyo-Jin;Kim, Chang-Bae;Seo, Jung-Min
    • Genomics & Informatics / v.5 no.4 / pp.179-187 / 2007
  • An increasing number of primate genomes are being sequenced. A direct comparison of repeat elements in human genes and their corresponding chimpanzee orthologs will not only give information on their evolution, but also shed light on the major evolutionary events that shaped our species. We have developed REPEATOME to enable visualization and subsequent comparisons of human and chimpanzee repeat elements. REPEATOME (http://www.repeatome.org/) provides easy access to a complete repeat element map of the human genome, as well as repeat element-associated information. It provides a convenient and effective way to access the repeat elements within or spanning the functional regions in human and chimpanzee genome sequences. REPEATOME includes information to compare repeat elements and gene structures of human genes and their counterparts in chimpanzee. This database can be accessed using comparative search options such as intersection, union, and difference to find lineage-specific or common repeat elements. REPEATOME allows researchers to perform visualization and comparative analysis of repeat elements in human and chimpanzee.
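
    The comparative search options mentioned above (intersection, union, and difference) correspond to ordinary set operations. The following is a minimal sketch with made-up repeat assignments for one hypothetical orthologous gene pair. It does not use the REPEATOME interface or its data, and the variable names are assumptions.

    ```python
    # Illustrative repeat-element labels for one hypothetical gene pair.
    # The assignments are invented for this example, not REPEATOME output.
    human_repeats = {"AluY", "AluSx", "L1PA2", "MIR"}
    chimp_repeats = {"AluSx", "L1PA2", "MIR", "L1PA3"}

    common = human_repeats & chimp_repeats          # intersection: shared repeats
    all_repeats = human_repeats | chimp_repeats     # union: repeats seen in either
    human_specific = human_repeats - chimp_repeats  # difference: human lineage only
    chimp_specific = chimp_repeats - human_repeats  # difference: chimp lineage only

    print("common:", sorted(common))
    print("human-specific:", sorted(human_specific))
    print("chimpanzee-specific:", sorted(chimp_specific))
    ```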

Dissemination of Advanced Mouse Resources and Technologies at RIKEN BioResource Center

  • Yoshiki, Atsushi
    • Interdisciplinary Bio Central / v.2 no.4 / pp.15.1-15.5 / 2010
  • RIKEN BioResource Center (BRC) has collected, preserved, conducted quality control of, and distributed mouse resources since 2002 as the core facility of the National BioResource Project by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan. Our mouse resources include over 5,000 strains such as humanized disease models, fluorescent reporters, and knockout mice. We have developed novel mouse strains such as tissue-specific Cre-drivers and optogenetic strains that are in high demand by the research community. We have removed all our specified pathogens from the deposited mice and used our quality control tests to examine their genetic modifications and backgrounds. RIKEN BRC is a founding member of the Federation of International Mouse Resources and the Asian Mouse Mutagenesis and Resource Association, and provides mouse resources to the one-stop International Mouse Strain Resource database. RIKEN BRC also participates in the International Gene Trap Consortium, having registered 713 gene-trap clones and their sequences in a public library, and is an advisory member of the CREATE (Coordination of resources for conditional expression of mutated mouse alleles) consortium which represents major European and international mouse database holders for the integration and dissemination of Cre-driver strains. RIKEN BRC provides training courses in the use of advanced technologies for the quality control and cryopreservation of mouse strains to promote the effective use of mouse resources worldwide.

Cereal Resources in National BioResource Project of Japan

  • Sato, Kazuhiro;Endo, Takashi R.;Kurata, Nori
    • Interdisciplinary Bio Central / v.2 no.4 / pp.13.1-13.8 / 2010
  • The National BioResource Project of Japan is a governmental project to promote domestic and international research activities using biological resources. The project has 27 biological resources, including three cereal resources. For each cereal program, the core center and sub-center that historically collected the cereal resources were selected. These resources are categorized into several types in the project: germplasm, genetic stocks, genome resources, and database information. The rice resources comprise wild species, local varieties in East and Southwest Asia and wild relatives, MNU-induced chemical mutant lines, marker tester lines, chromosome substitution lines, and other experimental lines. The wheat resources comprise wild strains, cultivated strains, experimental lines, rye wild and cultivated strains, EST clones, and full-length cDNA clones. The barley resources comprise cultivars and experimental lines, a core collection, EST/cDNA clones, BAC clones, their filters, and superpool DNA. Each resource is accessible from an online database that shows the contents and information about the resources. Links to genome information and genomic tools are also an important function of each database. The major contents and some examples are presented here.

Formalized Web-based Data Searching System for GRID Environment

  • Lee, Sang-keon;Hwang, Seog-chan;Choi, Jae-young;No, Kyoung-Tai
    • The KIPS Transactions:PartA / v.11A no.1 / pp.75-80 / 2004
  • To make database data available to a GRID system, a data manipulation module that handles the database data and its index normally has to be implemented and installed. By developing a search system that searches web-based databases and integrating it with a grid system, data can be searched on the web and used directly on the GRID system without a separate data module. This makes it possible to build a grid system easily and effectively, with a more flexible architecture that adapts to data changes. In this paper, we propose a search system that connects web-based databases with GRID systems. We integrated the search system with a bio grid system that runs virtual screening jobs. As a result, UB Grid (Universal Bio Grid) was constructed. Developers can reduce the time and effort required to integrate web data into a GRID system, and users can use the UB Grid system easily and effectively.

Informatics for protein identification by tandem mass spectrometry; Focused on two most-widely applied algorithms, Mascot and SEQUEST

  • Sohn, Chang-Ho;Jung, Jin-Woo;Kang, Gum-Yong;Kim, Kwang-Pyo
    • Bioinformatics and Biosystems / v.1 no.2 / pp.89-94 / 2006
  • Mass spectrometry (MS) is widely applied for high-throughput proteomics analysis. Large-scale proteome analysis experiments generate massive amounts of data. To search these proteomics data against protein databases, fully automated database search algorithms such as Mascot and SEQUEST are routinely employed. At present, it is critical to reduce false positives and false negatives during such analysis. In this review, we focus on aspects of automated protein identification using tandem mass spectrometry (MS/MS) spectra and on validation of the protein identifications produced by the two most common automated protein identification algorithms, Mascot and SEQUEST.


Design and Implementation of Reference Evapotranspiration Database for Future Climate Scenarios

  • Kim, Taegon;Suh, Kyo;Nam, Won-Ho;Lee, Jemyung;Hwang, Syewoon;Yoo, Seung-Hwan;Hong, Soun-Ouk
    • Journal of Korean Society of Rural Planning / v.22 no.4 / pp.71-80 / 2016
  • Although reference evapotranspiration (ET0) is important information for agricultural management, including irrigation planning and drought assessment, databases of reference evapotranspiration for future periods have rarely been constructed, especially at the district level across the country. The Coupled Model Intercomparison Project Phase 5 (CMIP5) provides meteorological data such as precipitation, average temperature, humidity, wind speed, and radiation for long-term future periods at a daily time scale. This study aimed to build a database of reference evapotranspiration using high-resolution climate forecasts (the outputs of HadGEM3-RA provided by the Korea Meteorological Administration (KMA)). To estimate reference evapotranspiration, we implemented four models: FAO Modified Penman, FAO Penman-Monteith, FAO Blaney-Criddle, and Thornthwaite. The database system has an open architecture so that users can add other models to the database. The database contains data for 5,050 regions for each of the four models and four Representative Concentration Pathway (RCP) climate change scenarios. The database system provides selection features with which users can extract data for a specific region and period.
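
    To make the idea of pluggable ET0 models concrete, here is a minimal sketch of a model registry with one temperature-based estimator. It uses the simplified Blaney-Criddle form, ET0 = p(0.46 T_mean + 8), which may differ from the FAO Blaney-Criddle variant the authors implemented. The registry `ET0_MODELS`, the `register` helper, and the input values are all assumptions made for illustration.

    ```python
    from typing import Callable, Dict

    # Hypothetical registry: users could add further ET0 models under new names.
    ET0_MODELS: Dict[str, Callable[..., float]] = {}


    def register(name: str):
        """Decorator that stores an ET0 model function under the given name."""
        def wrap(func):
            ET0_MODELS[name] = func
            return func
        return wrap


    @register("blaney_criddle_simplified")
    def blaney_criddle(t_mean_c: float, p: float) -> float:
        """Simplified Blaney-Criddle ET0 in mm/day.

        t_mean_c: mean daily temperature in degrees Celsius.
        p: mean daily percentage of annual daytime hours.
        """
        return p * (0.46 * t_mean_c + 8.0)


    # Illustrative inputs only (not values from the database described above).
    et0 = ET0_MODELS["blaney_criddle_simplified"](t_mean_c=21.5, p=0.27)
    print(f"ET0 = {et0:.2f} mm/day")
    ```

    Looking models up by name keeps the calling code unchanged when a new estimator is registered.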

Reinterpretation of the protein identification process for proteomics data

  • Kwon, Kyung-Hoon;Lee, Sang-Kwang;Cho, Kun;Park, Gun-Wook;Kang, Byeong-Soo;Park, Young-Mok
    • Interdisciplinary Bio Central / v.1 no.3 / pp.9.1-9.6 / 2009
  • Introduction: In mass spectrometry-based proteomics, biological samples are analyzed to identify proteins by mass spectrometry and database search. Database search is the process of selecting the best matches to the experimental mass spectra from an amino acid sequence database, and the protein is identified as the matched sequence. A match score is defined to find matches in the database, and the highest-scoring hit is declared the most probable protein. The search result varies with the score definition. In this study, the differences among the search results of different search engines and different databases were investigated in order to suggest a better way to identify more proteins with higher reliability. Materials and Methods: The protein extract of human mesenchymal stem cells was separated into several bands by one-dimensional electrophoresis. The bands of the one-dimensional gel were excised one by one, digested with trypsin, and analyzed by a mass spectrometer, FT LTQ. The tandem mass (MS/MS) spectra of peptide ions were searched with the X!Tandem, Mascot, and Sequest search engines against the IPI human database and the SwissProt database. The search results were filtered at several threshold probability values using the Trans-Proteomic Pipeline (TPP) of the Institute for Systems Biology, and the output generated by TPP was analyzed. Results and Discussion: For each MS/MS spectrum, the peptide sequences identified under different conditions, such as search engine, threshold probability, and sequence database, were compared. At high threshold probability, the main difference in peptide identification was caused not by the difference in sequence database but by the difference in score. As the threshold probability decreased, the missed peptides appeared. Conversely, at extremely high threshold levels, many true assignments were missed.
Conclusion and Prospects: The different identification results of the search engines were mainly caused by their different scoring algorithms. Usually in proteomics, high-scored peptides are selected and low-scored peptides are discarded. Many of them are true negatives. By integrating the search results from different parameters and different search engines, the protein identification process can be improved.
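
    One way to picture the integration of search results is the sketch below, which compares a strict probability cutoff with a simple cross-engine consensus rule. The consensus rule is an illustrative stand-in, not the authors' procedure or the TPP algorithm. The peptide-spectrum matches, probabilities, and function names are invented for the example.

    ```python
    from collections import defaultdict

    # Hypothetical matches: (spectrum_id, engine, peptide, probability).
    psms = [
        ("s1", "Mascot", "LVNELTEFAK", 0.97),
        ("s1", "Sequest", "LVNELTEFAK", 0.92),
        ("s2", "Mascot", "AEFVEVTK", 0.55),
        ("s2", "X!Tandem", "AEFVEVTK", 0.60),
        ("s3", "Sequest", "YLYEIAR", 0.40),
    ]


    def strict_filter(matches, threshold):
        """Keep (spectrum, peptide) pairs whose probability passes the threshold."""
        return {(s, pep) for s, _, pep, prob in matches if prob >= threshold}


    def consensus(matches, min_engines=2):
        """Keep pairs reported by at least `min_engines` distinct engines."""
        engines = defaultdict(set)
        for s, engine, pep, _ in matches:
            engines[(s, pep)].add(engine)
        return {key for key, names in engines.items() if len(names) >= min_engines}


    high_conf = strict_filter(psms, 0.9)
    # Assignments below the strict cutoff that two engines nevertheless agree on.
    rescued = consensus(psms) - high_conf
    print("high confidence:", sorted(high_conf))
    print("rescued by agreement:", sorted(rescued))
    ```

    Here the assignment for spectrum s2 falls below the strict cutoff but is retained because two engines report the same peptide, while the single-engine, low-probability match for s3 is still discarded.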

BioCC: An Openfree Hypertext Bio Community Cluster for Biology

  • Gong, Sung-Sam;Kim, Tae-Hyung;Oh, Jung-Su;Kwon, Je-Keun;Cho, Su-An;Bolser, Dan;Bhak, Jong
    • Genomics & Informatics / v.4 no.3 / pp.125-128 / 2006
  • We present an openfree hypertext (also known as wiki) web cluster called BioCC. BioCC is a novel wiki farm that lets researchers create hundreds of biological web sites. The web sites form an organic information network. The contents of all the sites on the BioCC wiki farm are modifiable by anonymous as well as registered users. This enables biologists with diverse backgrounds to form their own Internet bio-communities. Each community can have custom-made layouts for information, discussion, and knowledge exchange. BioCC aims to form an ever-expanding network of openfree biological knowledge databases used and maintained by biological experts, students, and general users. The philosophy behind BioCC is that the formation of biological knowledge is best achieved by open-minded individuals freely exchanging information. In the near future, the amount of genomic information will have flooded society. BioCC can be an effective and quickly updated knowledge database system. BioCC uses an open-source wiki system called Mediawiki. However, for easier editing, a modified version of Mediawiki, called Biowiki, has been applied. Unlike Mediawiki, Biowiki uses a WYSIWYG (What You See Is What You Get) text editor. BioCC is under a share-alike license called BioLicense (http://biolicense.org). The BioCC top-level site is found at http://bio.cc/

Combining Neuroinformatics Databases for Multi-Level Analysis of Brain Disorders

  • Yu, Ha Sun;Bang, Joon;Jo, Yousang;Lee, Doheon
    • Interdisciplinary Bio Central / v.4 no.3 / pp.7.1-7.8 / 2012
  • With the development of many methods for studying the brain, the field of neuroscience has generated large amounts of information obtained from various techniques: imaging techniques, electrophysiological techniques, techniques for analyzing brain connectivity, techniques for obtaining molecular information about the brain, and so on. Many neuroinformatics databases have been created to store and share this useful information, and researchers can access them publicly as needed. However, because there are so many neuroinformatics databases, it is difficult to find the appropriate database for a researcher's needs. Moreover, many researchers in neuroscience are unfamiliar with using neuroinformatics databases in their studies, because the data are too diverse for neuroscientists to handle and there is little precedent for using such databases in their research. Therefore, in this article, we review databases in the field of neuroscience according to both their methods for obtaining data and their objectives, to help researchers use the databases properly. We also introduce major neuroinformatics databases for each type of information. In addition, to show examples of novel uses of neuroinformatics databases, we present several studies that combine neuroinformatics databases of different information types to reach new findings. Finally, we conclude the paper with a discussion of potential applications of neuroinformatics databases.