• Title/Abstract/Keywords: Comparative bioinformatics

120 search results (processing time: 0.027 s)

A Gene Clustering Method with Hierarchical Visualization of Alignment Pairs

  • 진희정;박수현;조환규
    • 정보처리학회논문지A / Vol. 16A, No. 3 / pp. 143-152 / 2009
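  • Research in bioinformatics has recently shifted from studying genes one at a time to studying the relationships among genes. One such line of work is the study of gene teams. A gene team is a set of genes that is conserved across several chromosomes, and it can be viewed as a set of genes conserved within a bounded region. Over the course of evolution, the positions and types of the genes within a gene team change. Many studies have sought to find such gene teams. This paper introduces a method that modifies hierarchical clustering, which is widely used in bioinformatics, to find meaningful regions within a whole-genome pair and then to find gene teams within those regions. With this method, the map of related genes or similar regions between two structurally complex genomes can be simplified step by step and displayed.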
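
    The idea of grouping alignment pairs into regions at successively coarser levels can be illustrated with a generic single-linkage sketch. This is not the paper's modified hierarchical clustering, which the abstract does not specify. The `(position in genome 1, position in genome 2)` encoding, the `cluster_pairs` name, and the `max_gap` parameter are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (gene position in genome 1, gene position in genome 2); assumed encoding


def cluster_pairs(pairs: List[Pair], max_gap: int) -> List[List[Pair]]:
    """Single-linkage clusters: two pairs are linked if they lie within max_gap on both axes."""
    parent = list(range(len(pairs)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            dx = abs(pairs[i][0] - pairs[j][0])
            dy = abs(pairs[i][1] - pairs[j][1])
            if max(dx, dy) <= max_gap:
                parent[find(i)] = find(j)  # merge the two clusters

    groups: Dict[int, List[Pair]] = {}
    for i, pair in enumerate(pairs):
        groups.setdefault(find(i), []).append(pair)
    return sorted(sorted(group) for group in groups.values())
```

    Calling `cluster_pairs` with increasing `max_gap` values merges nearby regions into larger ones, which gives one simple way to view the same pair map at progressively coarser levels.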

Comparative Proteomic Profiling of Pancreatic Ductal Adenocarcinoma Cell Lines

  • Kim, Yikwon;Han, Dohyun;Min, Hophil;Jin, Jonghwa;Yi, Eugene C.;Kim, Youngsoo
    • Molecules and Cells / Vol. 37, No. 12 / pp. 888-898 / 2014
  • Pancreatic cancer is one of the most fatal cancers and is associated with limited diagnostic and therapeutic modalities. Currently, gemcitabine is the only effective drug and represents the preferred first-line treatment for chemotherapy. However, a high level of intrinsic or acquired resistance of pancreatic cancer to gemcitabine can contribute to the failure of gemcitabine treatment. To investigate the molecular mechanisms underlying gemcitabine resistance in pancreatic cancer, we performed label-free quantification of protein expression in intrinsically gemcitabine-resistant and -sensitive human pancreatic adenocarcinoma cell lines using our improved proteomic strategy, combined with filter-aided sample preparation, single-shot liquid chromatography-mass spectrometry, enhanced spectral counting, and a statistical method based on a power law global error model. We identified 1931 proteins and quantified 787 differentially expressed proteins in the BxPC3, PANC-1, and HPDE cell lines. Bioinformatics analysis showed that 15 epithelial-to-mesenchymal transition (EMT) markers and 13 EMT-related proteins closely associated with drug resistance were differentially expressed. Interestingly, 8 of these proteins were involved in glutathione and cysteine/methionine metabolism. These results suggest that proteins related to the EMT and glutathione metabolism play important roles in the development of intrinsic gemcitabine resistance in pancreatic cancer cell lines.

Comparative Analysis of Envelope Proteomes in Escherichia coli B and K-12 Strains

  • Han, Mee-Jung;Lee, Sang-Yup;Hong, Soon-Ho
    • Journal of Microbiology and Biotechnology / Vol. 22, No. 4 / pp. 470-478 / 2012
  • Recent genome comparisons of E. coli B and K-12 strains have indicated that the makeup of the cell envelopes in these two strains is quite different. Therefore, we analyzed and compared the envelope proteomes of E. coli BL21(DE3) and MG1655. A total of 165 protein spots, including 62 nonredundant proteins, were unambiguously identified by two-dimensional gel electrophoresis and mass spectrometry. Of these, 43 proteins were conserved between the two strains, whereas 4 and 16 strain-specific proteins were identified only in E. coli BL21(DE3) and MG1655, respectively. Additionally, 24 proteins showed more than 2-fold differences in intensity between the B and K-12 strains. The reference envelope proteome maps showed that the E. coli envelope mainly contained channel proteins and lipoproteins. Notable proteomic differences between the two strains were as follows: (i) B produced more OmpF porin with a larger pore size than K-12, indicating an increase in membrane permeability; (ii) B produced higher amounts of lipoproteins, which facilitate the assembly of outer membrane β-barrel proteins; and (iii) motility-related (FliC) and chemotaxis-related (CheA and CheW) proteins were detected only in K-12, indicating that E. coli B is restricted with regard to migration under unfavorable conditions. These differences may influence the permeability and integrity of the cell envelope, suggesting that E. coli B may be more susceptible than K-12 to certain stress conditions. Thus, these findings suggest that E. coli K-12 and its derivatives will be more favorable strains in certain biotechnological applications, such as cell surface display or membrane engineering studies.

Prospect and Roles of Molecular Ecogenetic Techniques in the Ecophysiological Study of Cyanobacteria

  • 안치용
    • 생태와환경 / Vol. 51, No. 1 / pp. 16-28 / 2018
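  • Although long-standing research on cyanobacteria has revealed a great deal, much remains unknown, and advances in omics technologies based on molecular biology have recently spurred studies that use new tools and take different perspectives. First, genome sequencing is being used for comparative genome analysis of diverse cyanobacteria and for studying gene expression patterns, with considerable effort devoted to uncovering the principles behind physiological traits, such as the regulatory mechanisms of toxin synthesis. In addition, studies of how the diversity of cyanobacterial genotypes and the bacterial communities that closely interact with them respond to seasonal and environmental factors, and of how these changes affect the ecosystem, are being combined with bioinformatic analysis, improving our understanding of the complex workings of ecosystems. In particular, the combined application of multiple omics techniques is making it realistic to draw a holistic picture of the biological responses at every level within an ecosystem, and expectations are rising that the resulting blueprint will yield new insights for effectively controlling algal blooms and maintaining healthy aquatic ecosystems.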

Genomic Analysis of Dairy Starter Culture Streptococcus thermophilus MTCC 5461

  • Prajapati, Jashbhai B.;Nathani, Neelam M.;Patel, Amrutlal K.;Senan, Suja;Joshi, Chaitanya G.
    • Journal of Microbiology and Biotechnology / Vol. 23, No. 4 / pp. 459-466 / 2013
  • The lactic acid bacterium Streptococcus thermophilus is widely used as a starter culture for the production of dairy products. Whole-genome sequencing is expected to clarify the genetic basis of the metabolic functioning of lactic acid bacteria (LAB) and so support their use in biotechnological and probiotic applications. We sequenced the whole genome of Streptococcus thermophilus MTCC 5461, a strain isolated from a curd source, using 454 GS-FLX Titanium and Ion Torrent PGM. We performed comparative genome analysis using local BLAST and RDP for 16S rDNA comparison, and the RAST server for functional comparison against the published genome sequence of Streptococcus thermophilus CNRZ 1066. The genome of S. thermophilus MTCC 5461 is 1.73 Mb in size with a GC content of 39.3%. Streptococcal virulence-related genes are either inactivated or absent in the strain. The genome possesses coding sequences for features important for a probiotic organism, such as adhesion, acid tolerance, bacteriocin production, and lactose utilization, which were found to be conserved between strains MTCC 5461 and CNRZ 1066. Biochemical analysis revealed that the bacterium utilizes 17 sugars, and the presence of genes encoding the enzymes involved in metabolism was confirmed in the genome for 16 of these 17 sugars. This study supports the view that strain MTCC 5461 is nonpathogenic and harbors essential features that can be exploited for its probiotic potential.

Global Sequence Homology Detection Using Word Conservation Probability

  • Yang, Jae-Seong;Kim, Dae-Kyum;Kim, Jin-Ho;Kim, Sang-Uk
    • Interdisciplinary Bio Central / Vol. 3, No. 4 / pp. 14.1-14.9 / 2011
  • Protein homology detection is an important issue in comparative genomics. Because of the exponential growth of sequence databases, fast and efficient homology detection tools are urgently needed. Currently, sequence comparison methods based on local alignment, such as BLAST, are generally used for homology detection because they give a reasonable measure of sequence similarity. However, these methods have drawbacks in capturing overall sequence similarity, especially when dealing with eukaryotic genomes, which often contain many insertions and duplications in their sequences. These methods also do not provide explicit models of speciation, so it is difficult to interpret their similarity measures in terms of homology. Here, we present a novel method based on the Word Conservation Score (WCS) to address the current limitations of homology detection. Instead of counting each amino acid, we adopted the concept of a 'Word' to compare sequences. WCS measures overall sequence similarity by comparing word contents, which is much faster than BLAST comparisons. Furthermore, the evolutionary distance between homologous sequences can be measured by WCS. Therefore, we expect that sequence comparison with WCS will be useful for multiple-species comparisons of large genomes. In performance comparisons on protein structural classifications, our method showed a considerable improvement over BLAST. Our method also found larger micro-syntenic blocks, which consist of orthologs with conserved gene order. By testing on various datasets, we showed that WCS gives a faster and better overall similarity measure than BLAST.
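
    To make "comparing word contents" concrete, the sketch below splits each sequence into overlapping length-k words and compares the resulting sets. The abstract does not give the WCS formula, so the Jaccard index used here is a swapped-in stand-in, and the function names and the default `k = 3` are illustrative assumptions.

```python
def words(sequence: str, k: int = 3) -> set:
    """All overlapping length-k words (k-mers) in a sequence; empty if the sequence is shorter than k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def word_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard index of the two word sets. This is an assumed stand-in, not the WCS formula."""
    words_a, words_b = words(a, k), words(b, k)
    if not words_a and not words_b:
        return 0.0  # neither sequence is long enough to contain a word
    return len(words_a & words_b) / len(words_a | words_b)
```

    Because the comparison only needs set operations on precomputed word sets, no alignment is performed, which is the general reason word-based measures tend to be cheaper than alignment-based ones.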

A New Method for Imputation of Missing Genotype using Linkage Disequilibrium and Haplotype Information

  • 박윤주;김영진;박정선;김규찬;고인송;정호열
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 32, No. 2 / pp. 99-107 / 2005
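  • This paper presents two imputation methods for missing values in genotype data such as single nucleotide polymorphisms (SNPs): linkage disequilibrium-based imputation and haplotype-based imputation. Both take the specific characteristics of genotype data into account in order to minimize the loss of information from the original data. Imputation is needed because it minimizes the loss of important information caused by missing values that arise during experiments. Until now, missing values in biological data have mostly been filled by major allele imputation, but when applied to genotype data this method can produce a high error rate for the missing values because of the specific characteristics of the data, which lowers the reliability of the data. Using simulations of SNP genotype data, this paper compares conventional major allele imputation with the proposed linkage disequilibrium-based and haplotype-based imputation methods and reports the results.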
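
    For reference, the baseline that the abstract compares against, major allele imputation, can be sketched as follows. The genotype encoding (two-letter strings such as "AG", with None for a missing call) and the function name are assumptions for illustration. The proposed linkage disequilibrium-based and haplotype-based methods are not shown because the abstract does not describe their details.

```python
from collections import Counter
from typing import List, Optional

Genotype = Optional[str]  # e.g. "AG"; None marks a missing call (assumed encoding)


def impute_major_allele(samples: List[List[Genotype]]) -> List[List[Genotype]]:
    """Fill each missing genotype with the homozygous major allele of its SNP column."""
    if not samples:
        return []
    filled = [list(row) for row in samples]  # copy so the input is left unchanged
    for snp in range(len(samples[0])):
        counts: Counter = Counter()
        for row in samples:
            if row[snp] is not None:
                counts.update(row[snp])  # counts both alleles of the genotype
        if not counts:
            continue  # every call at this SNP is missing; nothing to impute from
        major = counts.most_common(1)[0][0]
        for row in filled:
            if row[snp] is None:
                row[snp] = major * 2
    return filled
```

    Each SNP is treated independently here, so correlations between neighboring SNPs are ignored; that is the kind of information linkage disequilibrium-based and haplotype-based approaches are meant to exploit.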

Effect of Fc Fusion on Folding and Immunogenicity of Middle East Respiratory Syndrome Coronavirus Spike Protein

  • Chun, Jungmin;Cho, Yeondong;Park, Ki Hoon;Choi, Hanul;Cho, Hansam;Lee, Hee-Jung;Jang, Hyun;Kim, Kyung Hyun;Oh, Yu-Kyoung;Kim, Young Bong
    • Journal of Microbiology and Biotechnology / Vol. 29, No. 5 / pp. 813-819 / 2019
  • Middle East respiratory syndrome coronavirus (MERS-CoV) induces severe respiratory impairment with a reported mortality rate of ~36% in humans. The absence of clinically available MERS-CoV vaccines and treatments to date has resulted in uncontrolled incidence and propagation of the virus. In vaccine design, fusion with the IgG Fc domain is reported to increase the immunogenicity of various vaccine antigens. However, limited reports have documented the potential negative effects of Fc fusion on vaccine antigens. To determine whether Fc fusion affects the immunogenicity of MERS-CoV antigen, we constructed an Fc-associated MERS-CoV spike protein (eS770-Fc, 110 kDa), in which the human IgG4 Fc domain was fused to the MERS-CoV spike protein (eS770) via a Gly/Pro linker, using baculovirus as the expression system. For comparative analyses, two eS770 proteins lacking the IgG4 Fc domain were generated using the IdeS protease (eS770-ΔFc) or His tag attachment (eS770-His), and the immunogenicity of these constructs was examined following intramuscular immunization in mice. Contrary to expectations, the non-Fc spike proteins (eS770-ΔFc, eS770-His; 90 kDa) showed higher immunogenicity than the Fc fusion protein (eS770-Fc). Moreover, unlike the non-Fc spike proteins, eS770-Fc immunization did not elicit neutralizing antibodies against MERS-CoV. The lower immunogenicity of Fc-fused eS770 was related to alterations in the structural conformation of the spike protein. Taken together, our results indicate that IgG Fc fusion reduces the immunogenicity of eS770 by interfering with its proper folding.

Comparison of Morphological Analysis and DNA Metabarcoding of Crustacean Mesozooplankton in the Yellow Sea

  • 김가람;강형구;김충곤;최재호;김성
    • Ocean and Polar Research / Vol. 43, No. 1 / pp. 45-51 / 2021
  • Studies on marine zooplankton diversity and ecology are important for understanding marine ecosystems, as well as for environmental conservation and fisheries management. DNA metabarcoding is known as a useful tool for revealing and understanding diversity among animals, but a comparative evaluation against classical microscopy is still required in order to use it properly for marine zooplankton research. This study compared the crustacean mesozooplankton taxa revealed by morphological analysis and by metabarcoding of cytochrome oxidase I (COI). A total of 17 crustacean species were identified by morphological analysis, and 18 species by metabarcoding. Copepods made up the highest proportion of taxa, accounting for more than 50% of the total number of species delineated by both methods. Cladocerans were not found by morphological analysis, whereas amphipods and mysids were not detected by metabarcoding. Unlike morphological analysis, metabarcoding was able to identify decapods down to the species level. There were some discrepancies in copepod species, which could be due to an incomplete genetic database or to biases during DNA extraction, amplification, pooling, and bioinformatic analysis. Morphological analysis will be useful for ecological studies, as it can classify and quantify the life history stages of marine zooplankton that metabarcoding cannot detect. Metabarcoding can be a powerful tool for determining marine zooplankton diversity if its methods and databases are further supplemented.

High-performance computing for SARS-CoV-2 RNAs clustering: a data science-based genomics approach

  • Oujja, Anas;Abid, Mohamed Riduan;Boumhidi, Jaouad;Bourhnane, Safae;Mourhir, Asmaa;Merchant, Fatima;Benhaddou, Driss
    • Genomics & Informatics / Vol. 19, No. 4 / pp. 49.1-49.11 / 2021
  • Genomic data now constitutes one of the fastest growing datasets in the world. By 2025, it is expected to become the fourth largest source of Big Data, which mandates an adequate high-performance computing (HPC) platform for processing. With the latest unprecedented and unpredictable mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the research community urgently needs ICT tools to process SARS-CoV-2 RNA data, e.g., by classifying it (i.e., clustering), and thus to assist in tracking virus mutations and predicting future ones. In this paper, we present an HPC-based SARS-CoV-2 RNA clustering tool. We adopt a data science approach, from data collection, through analysis, to visualization. In the analysis step, we present how our clustering approach leverages HPC and the longest common subsequence (LCS) algorithm. The approach uses the Hadoop MapReduce programming paradigm and adapts the LCS algorithm to efficiently compute the length of the LCS for each pair of SARS-CoV-2 RNA sequences. The sequences are extracted from the U.S. National Center for Biotechnology Information (NCBI) Virus repository. The computed LCS lengths are used to measure the dissimilarities between RNA sequences in order to identify existing clusters. In addition, we present a comparative study of the LCS algorithm's performance under variable workloads and different numbers of Hadoop worker nodes.
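
    The pairwise step described above can be sketched in a few lines of Python: compute the LCS length of two sequences with the standard dynamic-programming recurrence, then convert it to a dissimilarity. The function names and the normalization `1 - LCS / max(len)` are assumptions, since the abstract does not say how LCS lengths are turned into dissimilarities, and the Hadoop MapReduce distribution of the pairs is not shown.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, in O(len(a) * len(b)) time."""
    if len(b) > len(a):
        a, b = b, a  # iterate over the longer string so the DP row stays short
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)  # extend the common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_dissimilarity(a: str, b: str) -> float:
    """Assumed normalization: 0.0 for identical sequences, 1.0 when nothing is shared."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest
```

    Only the previous row of the table is kept, so memory grows with the shorter sequence rather than with the product of the two lengths, which matters when every pair of sequences in a collection has to be compared.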