• Title/Summary/Keyword: gene set

OryzaGP: rice gene and protein dataset for named-entity recognition

  • Larmande, Pierre;Do, Huy;Wang, Yue
    • Genomics & Informatics / v.17 no.2 / pp.17.1-17.3 / 2019
  • Text mining has become an important research method in biology. Its original purpose was to extract biological entities, such as genes, proteins and phenotypic traits, from scientific papers in order to extend knowledge. However, few thorough studies on text mining and application development have been performed for plant molecular biology data, especially for rice, resulting in a lack of datasets for named-entity recognition tasks in this species. Because few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extracting information on gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of titles and abstracts extracted from scientific papers focusing on rice and downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.

Simple Assessment of Taxonomic Status and Genetic Diversity of Korean Long-Tailed Goral (Naemorhedus caudatus) Based on Partial Mitochondrial Cytochrome b Gene Using Non-Invasive Fecal Samples

  • Kim, Baek-Jun
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.2 no.1 / pp.32-41 / 2021
  • South Korea presently harbors fewer than 800 long-tailed gorals (Naemorhedus caudatus), an endangered species. I report for the first time on the taxonomic status and genetic diversity of the Korean species using non-invasive fecal sampling based on mitochondrial cytochrome b gene sequence analyses. To determine the taxonomic status of this species, I reconstructed a consensus neighbor-joining tree and generated a minimum spanning network combining haplotype sequences obtained from feces with a new goral-specific primer set developed using known sequences of the Korean goral and related species (e.g., Russian goral, Chinese goral, Himalayan goral, and Japanese serow). I also examined the genetic diversity of this species. The Korean goral showed only three different haplotypes. The phylogenetic tree and parsimony haplotype network revealed a single cluster of Korean and Russian gorals, separate from related species. Generally, the Korean goral has relatively low genetic diversity compared with that of other ungulate species (e.g., moose and red deer). I preliminarily showcased the application of non-invasive fecal sampling to the study of genetic characteristics, including the taxonomic status and genetic diversity of gorals, based on mitochondrial DNA. More phylogenetic studies are necessary to ensure the conservation of goral populations throughout South Korea.
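
The abstract reports three haplotypes and low genetic diversity but does not say which diversity statistic was used. As a minimal sketch, assuming Nei's haplotype diversity is the measure of interest, the calculation from a list of per-sample haplotype labels could look like this. The function name and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` holds one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Toy example (invented counts): 10 samples carrying three haplotypes.
toy_samples = ["H1"] * 6 + ["H2"] * 3 + ["H3"] * 1
toy_h = haplotype_diversity(toy_samples)
```

Values near 0 mean that nearly all samples share one haplotype; values near 1 mean that most samples carry distinct haplotypes.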

Molecular Basis of the Hrp Pathogenicity of the Fire Blight Pathogen Erwinia amylovora: a Type III Protein Secretion System Encoded in a Pathogenicity Island

  • Kim, Jihyun F.;Beer, Steven V.
    • The Plant Pathology Journal / v.17 no.2 / pp.77-82 / 2001
  • Erwinia amylovora causes a devastating disease called fire blight in rosaceous trees and shrubs such as apple, pear, and raspberry. To successfully infect its hosts, the pathogen requires a set of clustered genes termed hrp. Studies on the hrp system of E. amylovora indicated that it consists of three functional classes of genes. Regulation genes, including hrpS, hrpXY, and hrpL, produce proteins that control the expression of other genes in the cluster. Secretion genes, many of which are named hrc, encode proteins that may form a transmembrane complex devoted to type III protein secretion. Finally, several genes encode the proteins that are delivered by the protein secretion apparatus. They include harpins, DspE, and other potential effector proteins that may contribute to proliferation of E. amylovora inside the hosts. Harpins are glycine-rich, heat-stable elicitors of the hypersensitive response and induce systemic acquired resistance. The pathogenicity protein DspE is homologous and functionally similar to an avirulence protein of Pseudomonas syringae. The region encompassing the hrp/dsp gene cluster of E. amylovora shows features characteristic of a genomic island: a cryptic recombinase/integrase gene and a tRNA gene are present at one end, and genes corresponding to those of the Escherichia coli K-12 chromosome are found beyond the region. This island, designated the Hrp pathogenicity island, is more than 60 kilobases in size and carries as many as 60 genes.

Optimized Protocols for Efficient Plant Regeneration and Gene Transfer in Pepper (Capsicum annuum L.)

  • Mihalka, Virag;Fari, Miklos;Szasz, Attila;Balazs, Ervin;Nagy, Istvan
    • Journal of Plant Biotechnology / v.2 no.3 / pp.143-149 / 2000
  • An efficient in vitro regeneration system and an optimized Agrobacterium-mediated transformation protocol are described, based on the use of young seedling cotyledons of Capsicum annuum L. Optimal regeneration efficiency can be obtained by cultivating cotyledon explants on media containing 4 mg/L benzyladenine and 0.1 mg/L indoleacetic acid. The effect of antibiotics used to eliminate Agrobacteria, as well as the toxic levels of some commonly used selection agents (kanamycin, geneticin, hygromycin, phosphinothricin and methotrexate) in regenerating pepper tissues, were determined. To enable the comparison of different selection markers in an identical vector background, a set of binary vectors containing the marker genes for NPTII, HPT, DHFR and BAR, respectively, as well as the CaMV 35S promoter/enhancer-GUS chimaeric gene, was constructed and introduced into four different Agrobacterium host strains.

Identification of csp Homolog in Bradyrhizobium japonicum

  • No, Jae-Sang;Yu, Ji-Cheol;So, Jae-Seong
    • Korean Society for Biotechnology and Bioengineering: Conference Proceedings / 2001.11a / pp.602-605 / 2001
  • Low-temperature adaptation and protection against environmental stresses were studied in the gram-negative soil bacterium Bradyrhizobium japonicum 61A101c. B. japonicum was more resistant to alcohol, $H_2O_2$, heat and freezing following a pretreatment at $4^{\circ}C$, resulting in approximately 10- to 1,000-fold increased survival compared to mid-exponential-phase cells grown at the optimal temperature of $28^{\circ}C$. This phenomenon relates to the cold shock proteins expressed when cells are exposed to a downshift in temperature. To confirm the presence of cold shock protein genes in B. japonicum, a PCR strategy was employed using a degenerate primer set, which successfully amplified a putative csp gene fragment. Sequence analysis of the PCR product (200 bp) revealed csp-like sequences that were up to 96% identical to the csp gene of S. typhimurium.

A Unified Object Database for Biochemical Pathways

  • Jung, T.S.;Oh, J.S.;Jang, H.K.;Ahn, M.S.;Roh, D.H.;Cho, W.S.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.383-387 / 2005
  • One of the most important issues in the post-genome era is identifying the functions of genes and understanding the interactions among them. Such interactions form complex biochemical pathways, which are very useful for understanding the organism as a system. We present an integrated biochemical pathway database system with a set of software tools for reconstruction, visualization, and simulation of the pathways from the database. The novel features of the presented system include: (a) automatic integration of heterogeneous biochemical pathway databases, (b) gene ontology for a high-quality database in integration and query, (c) various biochemical simulations on the pathway database, (d) dynamic pathway reconstruction for a gene list or sequence data, (e) graphical tools that enable users to view the reconstructed pathways in a dynamic form, and (f) importing/exporting SBML documents, a data exchange standard for systems biology.

A Novel Marker for the Species-Specific Detection and Quantitation of Shigella sonnei by Targeting a Methylase Gene

  • Cho, Min Seok;Ahn, Tae-Young;Joh, Kiseong;Kwon, Oh-Sang;Jheong, Won-Hwa;Park, Dong Suk
    • Journal of Microbiology and Biotechnology / v.22 no.8 / pp.1113-1117 / 2012
  • Shigella sonnei is a causal agent of fever, nausea, stomach cramps, vomiting, and diarrheal disease. The present study describes a quantitative polymerase chain reaction (qPCR) assay for the specific detection of S. sonnei using a primer pair based on the methylase gene for the amplification of a 325 bp DNA fragment. The qPCR primer set for the accurate diagnosis of S. sonnei was developed from publicly available genome sequences. This qPCR-based method will potentially simplify and facilitate the diagnosis of this pathogen and guide disease management.

The Role of a Floral Identity Gene LFY in Plant Morphological Evolution

  • Park, Young-Doo;Yoon, Ho-Sung
    • Korean Journal of Plant Taxonomy / v.37 no.4 / pp.323-333 / 2007
  • The degree to which parallel evolution utilizes the same genetic mechanisms indicates the degree to which developmental processes constrain or channel phenotypic evolution. A transgenic strategy was used to elucidate the role of one floral meristem identity gene, LEAFY (LFY), in the evolution of rosette flowering, a plant architecture that has evolved in parallel in several lineages of the mustard family, Brassicaceae. The LFY genes from three rosette flowering species were cloned and introduced into a species with the ancestral architecture. The results indicated that changes at the LFY locus contributed to the evolution of rosette flowering in two of the three lineages, but that in each lineage a different set of genetic partners was involved. LFY was also shown to play a role in the evolution of flower size. A transgenic strategy may be useful in the study of plant morphological evolution and parallelism.

Discrimination of Listeria monocytogenes by Sequence Typing Based on Two Housekeeping Genes and Its Comparison to PFGE Patterns

  • Suh, Dong-Kyun
    • Biomedical Science Letters / v.11 no.3 / pp.289-293 / 2005
  • Two housekeeping genes of Listeria monocytogenes, dat and hlyA, were analyzed in a set of 28 isolates from different sources to estimate their genetic diversity. These strains were previously characterized by pulsed-field gel electrophoresis (PFGE). Complete gene sequences for dat (465 bp) and hlyA (584 bp) had sequence similarities of 99.87-100% and 99.96-100% among isolates, respectively. We also found that the number of sequence types (STs) was about 3-fold lower than the number of PFGE types (3 STs versus 11 PFGE types). There was, however, a good correlation between the PFGE patterns and the phylogenetic grouping of the two gene sequences among the isolates. Further studies analyzing additional loci would increase the discriminatory power of sequence typing for L. monocytogenes strains.
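
The abstract does not describe how similarity values or sequence types were computed. As a rough sketch only, assuming pre-aligned, equal-length sequences and assuming that an ST is defined as a distinct combination of locus sequences, the two steps could be illustrated as follows. All function names and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def assign_sequence_types(isolates):
    """Map each isolate to an ST number.

    `isolates` maps an isolate name to a tuple of locus sequences
    (for example, one entry per housekeeping gene). Each distinct
    tuple receives a new ST number in first-seen order.
    """
    st_of_profile = {}
    st_of_isolate = {}
    for name, profile in isolates.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_isolate[name] = st_of_profile[profile]
    return st_of_isolate


# Toy data (invented sequences, not from the study).
toy_isolates = {
    "iso1": ("ACGTACGT", "TTGACA"),
    "iso2": ("ACGTACGT", "TTGACA"),
    "iso3": ("ACGTACGA", "TTGACA"),
}
toy_sts = assign_sequence_types(toy_isolates)
toy_identity = percent_identity("ACGTACGT", "ACGTACGA")
```

Adding more loci to each tuple can only split existing STs further, which is consistent with the abstract's point that additional loci would increase discriminatory power.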

Detection of Mycoplasma Infection in Cultured Cells on the Basis of Molecular Profiling of Host Responses

  • Chung, Tae Su;Kim, Ju Han;Lee, Young-Ju;Park, Woong-Yang
    • Genomics & Informatics / v.3 no.3 / pp.63-67 / 2005
  • Adaptive responses to diverse microbial pathogens might be limited to relatively few types. Host cell responses to pathogens are believed to be patterned or stereotyped according to species or class. We tried to characterize the host response to Mycoplasma in terms of cellular gene expression. Although the gene expression profiles of the two host cell lines, HeLa and 293, were quite different from each other, 30 genes were differentially expressed upon mycoplasma infection in both HeLa and 293 cells. Six of them (PR48, MADH4, MKPX, CRK, RBM7, NEK3) were related to cell cycle or proliferation. Another category of genes, such as IL1HY1, KLRF1, TNFSF14, and GBP1, was related to host defense and the elicitation of immune responses. With this set of genes, we established a prediction model for mycoplasma contamination.