• Title/Summary/Keyword: genome project

BioSubroutine: an Open Web Server for Bioinformatics Algorithms and Subroutines

  • Lee, Joowon;Kim, Hana;Lee, Wonhye;Chung, Dongil;Bhak, Jong
    • Genomics & Informatics
    • /
    • v.3 no.1
    • /
    • pp.35-38
    • /
    • 2005
  • We present BioSubroutine, an open depository server that automatically categorizes various subroutines frequently used in bioinformatics research. We processed a large bioinformatics subroutine library called Bio.pl that was the first Bioperl subroutine library built in 1995. Over 1000 subroutines were processed automatically and an HTML interface has been created. BioSubroutine can accept new subroutines and algorithms from any such subroutine library, as well as provide interactive user forms. The subroutines are stored in an SQL database for quick searching and accessing. BioSubroutine is an open access project under the BioLicense license scheme.

Genomic Applications of Biochip Informatics (유전체 발현의 정보학적 분석과 응용)

  • Kim, Ju-Han
    • KOGO NEWS
    • /
    • v.5 no.4
    • /
    • pp.9-16
    • /
    • 2005
  • Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic expression data transforms the challenges in biomedical research into ones in bioinformatics. Clinical informatics has long developed technologies to improve biomedical research by integrating experimental and clinical information systems. Biomedical informatics, powered by high-throughput techniques, genomic-scale databases and advanced clinical information systems, is likely to transform our biomedical understanding forever, much the same way that biochemistry did to biology a generation ago. The emergence of healthcare and biomedical informatics, revolutionizing both bioinformatics and clinical informatics, will eventually change the current practice of medicine, including diagnostics, therapeutics and prognostics.

Global Optimization of Clusters in Gene Expression Data of DNA Microarrays by Deterministic Annealing

  • Lee, Kwon Moo;Chung, Tae Su;Kim, Ju Han
    • Genomics & Informatics
    • /
    • v.1 no.1
    • /
    • pp.20-24
    • /
    • 2003
  • The analysis of DNA microarray data is one of the most important tasks in functional genomics research. The matrix representation of microarray data and its successive 'optimal' incisional hyperplanes is a useful platform for developing optimization algorithms to determine the optimal partitioning of a pairwise proximity matrix representing a completely connected, weighted graph. We developed a Deterministic Annealing (DA) approach to determine the successive optimal binary partitioning. The DA algorithm demonstrated good performance, with the ability to find the 'globally optimal' binary partitions. In addition, the objects that have not been clustered at a small nonzero temperature are considered to be very sensitive to even small randomness, and can be used to estimate the reliability of the clustering.

The Protein Identification system Design and Implementation by Peptide mass mapping in Distributed Environment (분산 환경에서 Peptide Mass Mapping에 의한 단백질 검증 시스템 설계 및 구현)

  • 신민수;김도완;허철구;임소형
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2000.11a
    • /
    • pp.571-574
    • /
    • 2000
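  • Today, with the arrival of the post-genome era following the Human Genome Project (HGP), protein information analysis is recognized as a very important field. Applications that use protein information span many areas, including the discovery of protein structure/function relationships, evolutionary relationships, and 3D modeling. Among these areas, a wide variety of software has emerged for protein structure analysis in particular. To identify proteins in complex mixtures, however, the peptide mass information generated by mass spectrometry can be used. This paper therefore describes the design and implementation of an automated protein identification system that uses the peptide mass map generated by mass spectrometry and compares it against the proteins in existing protein databases. The system is developed around a 3-tier architecture and uses RMI-based middleware for smooth communication with heterogeneous systems and for communication among the objects in the multi-tier environment.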

Translation and Transcription: the Dual Functionality of LysRS in Mast Cells

  • Yannay-Cohen, Nurit;Razin, Ehud
    • Molecules and Cells
    • /
    • v.22 no.2
    • /
    • pp.127-132
    • /
    • 2006
  • In the post-genome-project era, it is well established that the human genome contains a smaller number of genes than expected. The complexity found in higher organisms can be explained if proteins are multifunctional. Indeed, recent studies are continuing to reveal proteins that are capable of a broad repertoire of functions. A good paradigm for multifunctionality can be found in the aminoacyl-tRNA synthetases (aaRSs), an ancient conserved family of proteins. This unique family, which comprises 20 different enzymes, is well known for its participation in protein synthesis. Several studies have described numerous examples of these "housekeeping" proteins taking part in a wide range of critical cellular activities. In this review, we focus on a member of that family, lysyl-tRNA synthetase (LysRS), which has been shown to have a dual functionality. In addition to its contribution to the translation process, LysRS also takes part in the regulation of MITF and USF2 target genes. This phenomenon was first described in mast cells.

Adjusting sampling bias in case-control genetic association studies

  • Seo, Geum Chu;Park, Taesung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.1127-1135
    • /
    • 2014
  • Genome-wide association studies (GWAS) are designed to discover genetic variants such as single nucleotide polymorphisms (SNPs) that are associated with human complex traits. Although there is increasing interest in applying GWAS methodologies to population-based cohorts, many published GWAS have adopted a case-control design, which raises the issue of sampling bias in both case and control samples. Because cases and controls have unequal selection probabilities, the samples are not representative of the population that they are purported to represent. Therefore, non-random sampling in a case-control study can lead to inconsistent and biased estimates of SNP-trait associations. In this paper, we propose inverse-probability-of-sampling weights based on disease prevalence to eliminate case-control sampling bias in estimation and testing for association between SNPs and quantitative traits. We apply the proposed method to data from the Korea Association Resource project and show that the standard estimators applied to the weighted data yield unbiased estimates.
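The weighting idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' estimator: the abstract does not give the exact weight formula, so the sketch uses a common form in which each group is weighted by its population share divided by its sample share. The function names `sampling_weights` and `weighted_slope` are hypothetical.

```python
from typing import List, Sequence


def sampling_weights(is_case: Sequence[bool], prevalence: float) -> List[float]:
    """Inverse-probability-of-sampling weights for a case-control sample.

    Assumed form (not taken from the paper): cases get
    prevalence / (sample fraction of cases) and controls get
    (1 - prevalence) / (sample fraction of controls), so the weighted
    sample has the population's case/control mix.
    """
    n = len(is_case)
    n_case = sum(1 for c in is_case if c)
    n_control = n - n_case
    if n_case == 0 or n_control == 0:
        raise ValueError("need at least one case and one control")
    w_case = prevalence / (n_case / n)
    w_control = (1.0 - prevalence) / (n_control / n)
    return [w_case if c else w_control for c in is_case]


def weighted_slope(x: Sequence[float], y: Sequence[float],
                   w: Sequence[float]) -> float:
    """Weighted least-squares slope of a quantitative trait y on genotype x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Toy data: two cases, two controls, assumed disease prevalence of 10%.
is_case = [True, True, False, False]
genotype = [0, 1, 2, 1]          # minor-allele counts
trait = [1.0, 3.0, 5.0, 3.0]     # quantitative trait values
weights = sampling_weights(is_case, prevalence=0.1)
slope = weighted_slope(genotype, trait, weights)
```

With these toy inputs, cases are down-weighted and controls up-weighted because cases make up half the sample but only an assumed 10% of the population.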

Ethical Considerations in Genomic Cohort Study (유전체 코호트 연구의 윤리적 고려 사항)

  • Choi, Eun-Kyung;Kim, Ock-Joo
    • Journal of Preventive Medicine and Public Health
    • /
    • v.40 no.2
    • /
    • pp.122-129
    • /
    • 2007
  • During the last decade, genomic cohort studies have been developed in many countries by linking health data and genetic data in stored samples. Genomic cohort studies are expected to find key genetic components that contribute to common diseases, thereby promising great advances in genomic medicine. While many countries endeavor to build biobank systems, biobank-based genome research has raised important ethical concerns including genetic privacy, confidentiality, discrimination, and informed consent. Informed consent for biobanks poses an important question: whether true informed consent is possible in population-based genomic cohort research, where the nature of future studies is unforeseeable when consent is obtained. Due to the sensitive character of genetic information, protecting privacy and keeping confidentiality become important topics. To minimize ethical problems and achieve scientific goals to the greatest possible degree, each country strives to build population-based genomic cohort research projects by organizing public consultation, seeking public and expert consensus in research, and providing safeguards to protect privacy and confidentiality.

Biotea-2-Bioschemas, facilitating structured markup for semantically annotated scholarly publications

  • Garcia, Leyla;Giraldo, Olga;Garcia, Alexander;Rebholz-Schuhmann, Dietrich
    • Genomics & Informatics
    • /
    • v.17 no.2
    • /
    • pp.14.1-14.6
    • /
    • 2019
  • The total number of scholarly publications grows day by day, making it necessary to explore and use simple yet effective ways to expose their metadata. Schema.org supports adding structured metadata to web pages via markup, making it easier for data providers but also for search engines to provide the right search results. Bioschemas is based on the standards of schema.org, providing new types, properties and guidelines for metadata, i.e., providing metadata profiles tailored to the Life Sciences domain. Here we present our proposed contribution to Bioschemas (from the project "Biotea"), which supports metadata contributions for scholarly publications via profiles and web components. Biotea comprises a semantic model to represent publications together with annotated elements recognized from the scientific text; our Biotea model has been mapped to schema.org following Bioschemas standards.
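To make the idea of schema.org markup for a publication concrete, the following Python sketch builds a JSON-LD record for this entry and wraps it in a script tag for embedding in a web page. It uses only general schema.org terms (`ScholarlyArticle`, `Person`, `Periodical`, `pagination`) and is an assumption-laden illustration; it is not the Biotea model or an official Bioschemas profile, and the helper names are hypothetical.

```python
import json
from typing import Dict, List


def scholarly_article_jsonld(title: str, authors: List[str], journal: str,
                             year: str, pages: str) -> Dict:
    """Build a minimal schema.org ScholarlyArticle record (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "author": [{"@type": "Person", "name": a} for a in authors],
        "datePublished": year,
        "pagination": pages,
        "isPartOf": {"@type": "Periodical", "name": journal},
    }


def to_script_tag(doc: Dict) -> str:
    """Serialize the record as an embeddable JSON-LD script element."""
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


record = scholarly_article_jsonld(
    title=("Biotea-2-Bioschemas, facilitating structured markup for "
           "semantically annotated scholarly publications"),
    authors=["Garcia, Leyla", "Giraldo, Olga", "Garcia, Alexander",
             "Rebholz-Schuhmann, Dietrich"],
    journal="Genomics & Informatics",
    year="2019",
    pages="14.1-14.6",
)
tag = to_script_tag(record)
print(tag)
```

A search engine or harvester that reads the page can parse the script element's content as JSON and recover the same fields shown in the listing above.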

Multi-block Analysis of Genomic Data Using Generalized Canonical Correlation Analysis

  • Jun, Inyoung;Choi, Wooree;Park, Mira
    • Genomics & Informatics
    • /
    • v.16 no.4
    • /
    • pp.33.1-33.9
    • /
    • 2018
  • Recently, there have been many studies in medicine related to genetic analysis. Many genetic studies have been performed to find genes associated with complex diseases. To find out how genes are related to disease, we need to understand not only the simple relationship of genotypes but also the way they are related to phenotype. Multi-block data, a collection of variable sets (blocks), is used to enhance the analysis of relationships among different blocks. By identifying relationships in multi-block form, we can understand the associations and correlations between the blocks. Several statistical analysis methods have been developed to understand the relationships within multi-block data. In this paper, we use generalized canonical correlation methodology to analyze multi-block data from the Korean Association Resource project, which combines single nucleotide polymorphism blocks, phenotype blocks, and disease blocks.

Analysis of unmapped regions associated with long deletions in Korean whole genome sequences based on short read data

  • Lee, Yuna;Park, Kiejung;Koh, Insong
    • Genomics & Informatics
    • /
    • v.17 no.4
    • /
    • pp.40.1-40.9
    • /
    • 2019
  • While studies aimed at detecting and analyzing indels or single nucleotide polymorphisms within human genomic sequences have been actively conducted, studies on detecting long insertions/deletions are not easy to orchestrate. Over the last 10 years, the availability of long-read data of human genomes from PacBio or Nanopore platforms has increased, which makes it easier to detect long insertions/deletions. However, because long-read data remain relatively costly, most next-generation sequencing data are still produced by short-read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, where no reads are mapped on the reference genome), scanned 40 Korean genomes to select UMR long deletion candidates, and compared the candidates with the long deletion break points within the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs were found in the 40 Korean genomes tested, 284 UMRs were common across the 40 genomes, and a total of 37,943 UMRs were found. Compared with the 74,045 break points provided by the 1KGP, 30,698 UMRs overlapped. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased, eventually reaching a peak of 80.9% of the total UMRs found in this study. As the total number of overlapped UMRs could probably grow to encompass the 74,045 break points with the inclusion of more Korean genomes, this approach could be practically useful for studies on long deletions utilizing short-read data.
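The notion of an unmapped region can be illustrated with a short Python sketch. It assumes a per-base read-depth list as input and treats a UMR as a maximal run of zero depth; the authors' actual programs, input formats, and thresholds are not described in the abstract, so `find_umrs`, `overlaps_breakpoint`, and the `min_len` and `slack` parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in reference coordinates


def find_umrs(depth: Sequence[int], min_len: int = 1) -> List[Interval]:
    """Return maximal runs of zero read depth that are at least min_len long."""
    umrs: List[Interval] = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0:
            if start is None:
                start = pos          # a zero-depth run begins here
        elif start is not None:
            if pos - start >= min_len:
                umrs.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_len:
        umrs.append((start, len(depth)))
    return umrs


def overlaps_breakpoint(umr: Interval, breakpoints: Sequence[int],
                        slack: int = 0) -> bool:
    """True if any break point falls inside the UMR, widened by slack bases."""
    start, end = umr
    return any(start - slack <= b < end + slack for b in breakpoints)


# Toy example: depth per base, and known deletion break points.
depth = [3, 0, 0, 0, 5, 0, 2, 0, 0]
umrs = find_umrs(depth, min_len=2)
hits = [u for u in umrs if overlaps_breakpoint(u, breakpoints=[2, 20])]
overlap_fraction = len(hits) / len(umrs)
```

The same ratio computed on the figures reported in the abstract, 30,698 overlapping UMRs out of 37,943, gives about 80.9%.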