• Title/Abstract/Keywords: multiple genome sequences

64 search results (processing time: 0.026 s)

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005
    • /
    • pp.407-411
    • /
    • 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. About 157,000 high-throughput 5'-EST sequences, collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients, were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database, together with annotation obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. Attribute-based search uses DBMS technology, while similarity search uses BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results include: 1) information on clones and libraries, 2) accession keys, genome location, gene ontology, and pathways, with links to public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system can provide useful information for studies elucidating the causes of diseases prevalent in Koreans.

Gene Duplications Revealed during the Process of SNP Discovery in Soybean [Glycine max (L.) Merr.]

  • Cai, Chun Mei;Van, Kyu-Jung;Lee, Suk-Ha
    • Journal of Crop Science and Biotechnology
    • /
    • Vol. 10, No. 4
    • /
    • pp.237-242
    • /
    • 2007
  • Genome duplication (i.e., polyploidy) is a common phenomenon in plant evolution. The objective of this study was to gain a comprehensive understanding of genome duplication encountered during SNP discovery, using thymine/adenine (TA) cloning for confirmation. Primer pairs were designed from 793 EST contigs expressed in the roots of a supernodulating soybean mutant and screened between 'Pureunkong' and 'Jinpumkong 2' by direct sequencing. Almost 27% of the primer sets failed to yield sequence data because of multiple bands on agarose gels or poor-quality sequence data from a single band. TA cloning identified duplicate genes, and the paralogous sequences coincided with the nonspecific peaks in direct sequencing. Our study confirmed that heterogeneous products from the co-amplification of gene family members were the main cause of multiple bands or poor-quality sequence data in direct sequencing. Counts of amplified bands on agarose gels and of peaks in sequencing traces suggested that almost 27% of nonrepetitive soybean sequences were present in as many as four copies, with an average of 2.33 duplications per segment. Copy numbers would be underestimated when a long intron lies between primer binding sites or a mutation occurs at a priming site. Copy numbers also could not be estimated accurately because of deletions or tandem duplications across the soybean genome.

Structural Characterization of the Genome of BERV γ4, the Most Abundant Endogenous Retrovirus Family in Cattle

  • Xiao, Rui;Park, Kwangha;Oh, Younshin;Kim, Jinhoi;Park, Chankyu
    • Molecules and Cells
    • /
    • Vol. 26, No. 4
    • /
    • pp.404-408
    • /
    • 2008
  • The genome of the replication-competent BERV γ4 provirus, which belongs to the most abundant ERV family in the bovine genome, was characterized in detail. The BERV γ4 genome comprises 8576 nucleotides and has the typical 5'-long terminal repeat (LTR)-gag-pro-pol-env-LTR-3' retroviral organization, with a long leader region positioned before the gag open reading frame. Multiple sequence analysis showed that the nucleotide difference between the 5' and 3' LTRs was 4.2% on average (mean value 0.042), suggesting that the provirus formed at most 13.3 million years ago. Gag is separated from pro-pol by a stop codon in the same reading frame, while env resides in another reading frame and lacks a functional surface domain. According to the current bovine genome sequence assembly, full-length BERV γ4 provirus sequences were found only on chromosomes 1, 2, 6, 10, 15, 23, 26, 28, and X and in unassigned sequence, although partial sequences were distributed almost evenly across the entire bovine genome. This is the first detailed study describing the genome structure of BERV γ4, the most abundant ERV family present in the bovine genome. Combined with our recent reports characterizing bovine ERVs, this study will help illuminate ERVs in cattle, for which no information was previously available.
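
The dating step in this abstract can be sketched with the commonly used LTR-divergence estimate T = K / (2r), where K is the divergence between the 5' and 3' LTRs and r is the substitution rate per site per year. The abstract does not state the rate the authors used; the rate below is an assumption, back-calculated so that K = 0.042 gives roughly 13.3 million years, and the function name is hypothetical.

```python
def ltr_insertion_age(divergence: float, rate_per_site_per_year: float) -> float:
    """Estimate provirus age in years from 5'/3' LTR divergence.

    The two LTRs are identical at integration and then diverge
    independently, so divergence accumulates at twice the
    substitution rate: T = K / (2r).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return divergence / (2.0 * rate_per_site_per_year)


# ASSUMPTION: this rate is not given in the abstract. It is chosen only to be
# consistent with the reported 4.2% divergence and 13.3-million-year bound.
ASSUMED_RATE = 1.58e-9

age_years = ltr_insertion_age(0.042, ASSUMED_RATE)
print(f"{age_years / 1e6:.1f} million years")
```

A different assumed rate scales the estimate inversely, which is why the abstract's figure is best read as an upper bound tied to whatever rate the authors adopted.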

Evidence of genome duplication revealed by sequence analysis of multi-loci expressed sequence tag-simple sequence repeat bands in Panax ginseng Meyer

  • Kim, Nam-Hoon;Choi, Hong-Il;Kim, Kyung Hee;Jang, Woojong;Yang, Tae-Jin
    • Journal of Ginseng Research
    • /
    • Vol. 38, No. 2
    • /
    • pp.130-135
    • /
    • 2014
  • Background: Panax ginseng, the most famous medicinal herb, has a highly duplicated genome structure. However, the genome duplication of P. ginseng has not been characterized at the sequence level. Multiple band patterns have been consistently observed during the development of DNA markers using unique sequences in P. ginseng. Methods: We compared the sequences of multiple bands derived from unique expressed sequence tag-simple sequence repeat (EST-SSR) markers to investigate genome duplication at the sequence level. Results: Reamplification and sequencing of the individual bands revealed that, for each marker, two bands around the expected size were genuine amplicons derived from two paralogous loci. In each case, one of the two bands was polymorphic, showing different allelic forms among nine ginseng cultivars, whereas the other band was usually monomorphic. Sequences derived from the two loci showed high similarity, including the same primer-binding site, but each locus could be distinguished based on SSR number variations and additional single nucleotide polymorphisms (SNPs) or InDels. A locus-specific marker designed from the SNP site between the paralogous loci produced a single band that also showed clear polymorphism among ginseng cultivars. Conclusion: Our data imply that the recent genome duplication has resulted in two highly similar paralogous regions in the ginseng genome. The two paralogous sequences could be differentiated by large SSR number variations and one or two additional SNPs or InDels in every 100 bp of genic region, which can serve as a reliable identifier for each locus.

Genetically Independent Tetranucleotide to Hexanucleotide Core Motif SSR Markers for Identifying Lentinula edodes Cultivars

  • Saito, Teruaki;Sakuta, Genki;Kobayashi, Hitoshi;Ouchi, Kenji;Inatomi, Satoshi
    • Mycobiology
    • /
    • Vol. 47, No. 4
    • /
    • pp.466-472
    • /
    • 2019
  • For the purpose of protecting the rights of Lentinula edodes breeders, we developed a new simple sequence repeat (SSR) marker set consisting only of genetically independent tetranucleotide or longer core motifs. Using available genome sequences for five L. edodes strains, we designed primers for 13 SSR markers that amplified polymorphic sequences in 20 L. edodes cultivars. We evaluated the independence of every possible marker pair based on genotype data. Consequently, eight genetically independent markers were selected. The polymorphic information content values of the markers ranged from 0.269 to 0.764, with an average of 0.409. The markers could distinguish among 20 L. edodes cultivars and produced highly repeatable and reproducible results. The markers developed in this study will enable the precise identification of L. edodes cultivars, and may be useful for protecting breeders' rights.
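
The polymorphic information content (PIC) values quoted above can be illustrated with a minimal sketch. The abstract does not say which formula was used; the code assumes the widely cited definition of Botstein et al. (1980), PIC = 1 - Σp_i² - ΣΣ 2p_i²p_j², and the allele frequencies shown are hypothetical, not data from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content of one marker.

    Assumes the Botstein et al. (1980) definition:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2.0 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


# Hypothetical marker with two equally frequent alleles (illustration only).
two_allele_pic = pic([0.5, 0.5])
print(round(two_allele_pic, 3))
```

Under this definition a monomorphic marker scores 0, and the score rises as alleles become more numerous and more evenly distributed among cultivars.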

A One-Step System for Convenient and Flexible Assembly of Transcription Activator-Like Effector Nucleases (TALENs)

  • Zhao, Jinlong;Sun, Wenye;Liang, Jing;Jiang, Jing;Wu, Zhao
    • Molecules and Cells
    • /
    • Vol. 39, No. 9
    • /
    • pp.687-691
    • /
    • 2016
  • Transcription activator-like effector nucleases (TALENs) are powerful tools for targeted genome editing in diverse cell types and organisms. However, the highly identical TALE repeat sequences make it challenging to assemble TALEs using conventional cloning approaches, and multiple repeats in one plasmid are prone to homologous recombination in bacteria. Although methods for TALE assembly are constantly improving, they remain inconvenient because of laborious assembly steps or large module libraries, which limits their broad utility. To overcome the barrier of multiple assembly steps, we report a one-step system for the convenient and flexible assembly of a 180 TALE module library. This study is the first demonstration of ligating 9 mono-/dimer modules and one circular TALEN backbone vector in a one-step process, generating 9.5 to 18.5 repeat sequences with an overall assembly rate higher than 50%. This system makes TALEN assembly much simpler than the conventional cloning of two DNA fragments because it combines digestion and ligation into one step, using circular vectors and different modules to avoid gel extraction. Therefore, this system provides a convenient tool for the application of TALEN-mediated genome editing in scientific studies and clinical trials.

Heterogeneity Analysis of the 16S rRNA Gene Sequences of the Genus Vibrio

  • 기장서
    • Korean Journal of Microbiology (미생물학회지)
    • /
    • Vol. 45, No. 4
    • /
    • pp.430-434
    • /
    • 2009
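  • Bacterial 16S rRNA gene sequences have been used for a variety of purposes, including molecular phylogeny, reconstruction of evolutionary history, and microbial detection. Bacterial genomes carry multiple rRNA operons, and the sequences of these genes are known to vary to some degree. In this study, intragenomic heterogeneity of 16S rRNA was characterized using 16S rRNA gene sequences of the genus Vibrio. The analysis used GenBank data for V. cholerae, V. harveyi, V. parahaemolyticus, V. splendidus, and V. vulnificus, for which genome sequence annotation has been completed. Members of the genus Vibrio carry 7 to 10 copies of the 16S rRNA gene on chromosome 1, and the intragenomic variation among these copies was 0.9% or less (DNA similarity of 99.1% or more). Chromosome 2 carried at most one 16S rRNA gene. The number of 16S rRNA genotypes within a genome ranged from a minimum of 5 (V. vulnificus #CMCP6) to a maximum of 8 (V. parahaemolyticus #RIMD 2210633, V. harveyi #ATCC BAA-1116). These results suggest that the 16S rRNA gene sequences of the genus Vibrio are highly heterogeneous.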
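
The two quantities reported in this abstract, the number of distinct 16S rRNA genotypes in a genome and the dissimilarity among copies, can be sketched as follows. This is a minimal illustration, assuming the copies are already aligned to equal length; the function names and the toy sequences are hypothetical and are not the study's data or pipeline.

```python
from itertools import combinations


def dissimilarity(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def summarize_copies(copies):
    """Return (number of distinct genotypes, maximum pairwise dissimilarity)."""
    genotypes = len(set(copies))
    max_diff = max(
        (dissimilarity(a, b) for a, b in combinations(copies, 2)),
        default=0.0,
    )
    return genotypes, max_diff


# Toy aligned fragments standing in for 16S rRNA gene copies (not real data).
toy_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
n_genotypes, max_diff = summarize_copies(toy_copies)
print(n_genotypes, max_diff)
```

Identical copies collapse into one genotype, so a genome can carry many gene copies yet fewer genotypes, as in the 7 to 10 copies versus 5 to 8 genotypes reported above.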

Whole Genome Analysis of Human Papillomavirus Type 16 Multiple Infection in Cervical Cancer Patients

  • Chansaenroj, Jira;Theamboonlers, Apiradee;Junyangdikul, Pairoj;Swangvaree, Sukumarn;Karalak, Anant;Poovorawan, Yong
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 13, No. 2
    • /
    • pp.599-606
    • /
    • 2012
  • This study characterized the whole genome of human papillomavirus type 16 (HPV16) from cervical cancer specimens with multiple infections in comparison with single-infection samples, as the oncogenic potential of the virus may differ. Cervical carcinoma specimens positive for HPV16 by PCR and INNO-LiPA were randomly selected for whole genome characterization. Two HPV16 single-infection and six HPV16 multiple-infection specimens were subjected to whole genome analysis using conserved primers and subsequent sequencing. All HPV16 whole genomes from single-infection samples clustered in the European (E) lineage, while all multiple-infection specimens belonged to the non-European lineage. Variations in the nucleotide sequences of E6, E7, E2, L1, and the long control region (LCR) were evaluated. In the E6 region, the amino acid change L83V was related to increased cancer progression. An amino acid variation, N29S, within the E7 oncoprotein, significantly associated with lesion severity, was also discovered. Nonsynonymous mutations were found in all three domains of the E2 gene. The L1 region showed various mutations that may be related to conformational changes of viral epitopes. In the LCR, variations were found in transcription factor binding sites correlated with virulence: GRE/1, TEF-1, YY14, and Oct-1. The HPV16 European variant, prone to single infection, may harbor a major variation at L83V that significantly increases the risk of developing cervical carcinoma. HPV16 non-European variants, prone to multiple infections, may require many polymorphisms to enhance the risk of cervical cancer development.

Development of CRISPR technology for precise single-base genome editing: a brief review

  • Lee, Hyomin K.;Oh, Yeounsun;Hong, Juyoung;Lee, Seung Hwan;Hur, Junho K.
    • BMB Reports
    • /
    • Vol. 54, No. 2
    • /
    • pp.98-105
    • /
    • 2021
  • The clustered regularly interspaced short palindromic repeats (CRISPR) system is a family of DNA sequences originally discovered as a type of acquired immunity in prokaryotes such as bacteria and archaea. In many CRISPR systems, the functional ribonucleoproteins (RNPs) are composed of a CRISPR protein and guide RNAs. They selectively bind and cleave specific target DNAs or RNAs, based on sequences complementary to the guide RNA. The specific targeted cleavage of nucleic acids by CRISPR has been broadly utilized in genome editing methods. In genome editing of eukaryotic cells, CRISPR-mediated DNA double-strand breaks (DSBs) at specific genomic loci activate the endogenous DNA repair systems and induce mutations at the target sites with high efficiency. Two of the major endogenous DNA repair machineries are non-homologous end joining (NHEJ) and homology-directed repair (HDR). At a DSB, the two repair pathways operate in competition, resulting in several possible outcomes, including deletions, insertions, and substitutions. Because of the inherent stochasticity of DSB-based genome editing methods, it has been difficult to achieve defined single-base changes without unanticipated random mutation patterns. To overcome the heterogeneity of DSB-mediated genome editing, novel methods have been developed that incorporate precise single-base changes without inducing DSBs. These approaches use catalytically compromised CRISPR in conjunction with base-modifying enzymes and DNA polymerases to accomplish highly efficient and precise genome editing of single and multiple bases. In this review, we introduce some of the advances in single-base CRISPR genome editing methods and their applications.

Evolutionary course of CsRn1 long-terminal-repeat retrotransposon and its heterogeneous integrations into the genome of the liver fluke, Clonorchis sinensis

  • Bae, Young-An;Kong, Yoon
    • Parasites, Hosts and Diseases
    • /
    • Vol. 41, No. 4
    • /
    • pp.209-219
    • /
    • 2003
  • The evolutionary course of the CsRn1 long-terminal-repeat (LTR) retrotransposon was predicted by conducting a phylogenetic analysis with its paralogous LTR sequences. Based on the clustering patterns in the phylogenetic tree, multiple CsRn1 copies could be grouped into four subsets, which were shown to have different integration times. Their differential sequence divergences and heterogeneous integration patterns strongly suggested that these subsets appeared sequentially in the genome of C. sinensis. Members of the recently expanding subset showed the lowest level of divergence in their LTR and reverse transcriptase gene sequences. They were also shown to be highly polymorphic among individual genomes of the trematode. The CsRn1 element exhibited a preference for repetitive, agenic chromosomal regions when selecting integration targets. Our results suggested that CsRn1 might induce a considerable degree of intergenomic variation and thereby have influenced the evolution of the C. sinensis genome.