• Title/Summary/Keyword: seq2seq

Identifying long non-coding RNAs and characterizing their functional roles in swine mammary gland from colostrogenesis to lactogenesis

  • Shi, Lijun;Zhang, Longchao;Wang, Ligang;Liu, Xin;Gao, Hongmei;Hou, Xinhua;Zhao, Fuping;Yan, Hua;Cai, Wentao;Wang, Lixian
    • Animal Bioscience
    • /
    • v.35 no.6
    • /
    • pp.814-825
    • /
    • 2022
  • Objective: This study was conducted to identify functional long non-coding RNAs (lncRNAs) for swine lactation using RNA-seq data from the mammary gland. Methods: Using RNA-seq data from swine mammary gland, we screened lncRNAs, performed differential expression analysis, and confirmed the functional lncRNAs for swine lactation through validation with genome-wide association study (GWAS) signals, functional annotation, and weighted gene co-expression network analysis (WGCNA). Results: We identified a total of 286 differentially expressed (DE) lncRNAs in the mammary gland at different stages from 14 days prior to (-) parturition to day 1 after (+) parturition, and the expression of most lncRNAs changed strongly from day -2 to day +1. Further, the GWAS signals for the sow milk ability trait were significantly enriched in DE lncRNAs. Functional annotation revealed that these DE lncRNAs were mainly involved in mammary gland and lactation development, milk composition metabolism, and colostrum function. By performing WGCNA, we identified 7 of 12 lncRNA-mRNA modules that were highly associated with the mammary gland at day -14, day -2, and day +1, involving 35 lncRNAs and 319 mRNAs. Conclusion: This study suggested that 18 lncRNAs and their 20 target genes were promising candidates for swine parturition and colostrum occurrence processes. Our research provided new insights into lncRNA profiles and their regulatory mechanisms from colostrogenesis to lactogenesis in swine.

Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

  • Lee, Joon-Ho;Lee, Taeheon;Lee, Hak-Kyo;Cho, Byung-Wook;Shin, Dong-Hyun;Do, Kyoung-Tag;Sung, Samsun;Kwak, Woori;Kim, Hyeon Jeong;Kim, Heebal;Cho, Seoae;Park, Kyung-Do
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.27 no.9
    • /
    • pp.1236-1243
    • /
    • 2014
  • Genetics is important for the breeding and selection of horses, but there is a lack of well-established horse-related browsers or databases. To better understand horses, more variants and other integrated information are needed. Thus, we constructed a horse genomic variant database that includes expression and other information. The Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) provides the number of unexplored genomic variants still remaining to be identified in the horse genome, including rare variants, by using population genome sequences of eighteen horses and RNA-seq data of four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and variants from RNA-seq, which showed concordance levels of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies for Thoroughbreds.

PC-SAN: Pretraining-Based Contextual Self-Attention Model for Topic Essay Generation

  • Lin, Fuqiang;Ma, Xingkong;Chen, Yaofeng;Zhou, Jiajun;Liu, Bo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.8
    • /
    • pp.3168-3186
    • /
    • 2020
  • Automatic topic essay generation (TEG) is a controllable text generation task that aims to generate informative, diverse, and topic-consistent essays based on multiple topics. To generate high-quality essays, a reasonable method should consider both diversity and topic consistency. Another essential issue is the intrinsic link between the topics, which helps the essays stay close to the semantics of the provided topics. However, it remains challenging for TEG to fill the semantic gap between source topic words and target output, and a more powerful model is needed to capture the semantics of the given topics. To this end, we propose a pretraining-based contextual self-attention (PC-SAN) model that is built upon the seq2seq framework. For the encoder of our model, we employ a dynamic weighted sum of layers from BERT to fully utilize the semantics of topics, which greatly helps to fill the gap and improve the quality of the generated essays. In the decoding phase, we also transform the target-side contextual history information into the query layers to alleviate the lack of context in typical self-attention networks (SANs). Experimental results on large-scale paragraph-level Chinese corpora verify that our model is capable of generating diverse, topic-consistent text and makes substantial improvements compared with strong baselines. Furthermore, extensive analysis validates the effectiveness of contextual embeddings from BERT and contextual history information in SANs.
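The "dynamic weighted sum of layers" mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the exact formulation, so this example assumes one common form: a single learnable score per layer, normalized with a softmax and used to mix the per-layer vectors of one token. The function names and the toy values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layer_outputs, layer_scores):
    """Mix per-layer vectors for one token into a single vector.

    layer_outputs: list of L vectors (one per encoder layer), all of equal length.
    layer_scores:  list of L raw scores (assumed learnable in a real model).
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_outputs[0])
    return [
        sum(w * layer[i] for w, layer in zip(weights, layer_outputs))
        for i in range(dim)
    ]


# Toy example: two "layers", each giving a 2-dimensional vector for one token.
layers = [[1.0, 2.0], [3.0, 4.0]]
mixed = weighted_layer_sum(layers, [0.0, 0.0])  # equal scores give a plain average
```

With equal scores the result is the plain average of the layers; raising one layer's score shifts the mixture toward that layer, which is the sense in which the weighting is "dynamic" in this sketch.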

Microbial Community of Tannery Wastewater Involved in Nitrification Revealed by Illumina MiSeq Sequencing

  • Ma, Xiaojian;Wu, Chongde;Jun, Huang;Zhou, Rongqing;Shi, Bi
    • Journal of Microbiology and Biotechnology
    • /
    • v.28 no.7
    • /
    • pp.1168-1177
    • /
    • 2018
  • The aim of this study was to investigate the microbial community involved in nitrification in three tannery wastewater treatment plants (WWTPs) by Illumina MiSeq sequencing. The results showed that highly diverse communities were present in tannery wastewater. A total of six phyla, including Proteobacteria (37-41%), Bacteroidetes (6.04-16.80%), Planctomycetes (3.65-16.55%), Chloroflexi (2.51-11.48%), Actinobacteria (1.91-9.21%), and Acidobacteria (3.04-6.20%), were identified as the main phyla, and Proteobacteria dominated in all the samples. Within Proteobacteria, Betaproteobacteria was the most abundant class, with sequence percentages ranging from 9.66% to 17.44%. Analysis of the community at the genus level suggested that Thauera, Gp4, Ignavibacterium, Phycisphaera, and Arenimonas were the core genera shared by at least two tannery WWTPs. A detailed analysis of the abundance of ammonia-oxidizing bacteria (AOB) and nitrite-oxidizing bacteria (NOB) indicated that Nitrosospira, Nitrosomonas, and Nitrospira were the main AOB and NOB in tannery wastewater, respectively, and exhibited relatively high abundance in all samples. In addition, real-time quantitative PCR was conducted to validate the results by quantifying the abundance of the AOB and total bacteria, and similar results were obtained. Overall, the results presented in this study may provide new insights into key microorganisms and the entire community of tannery wastewater and contribute to improving nitrogen removal efficiency.

Attention-based word correlation analysis system for big data analysis (빅데이터 분석을 위한 어텐션 기반의 단어 연관관계 분석 시스템)

  • Chi-Gon, Hwang;Chang-Pyo, Yoon;Soo-Wook, Lee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.27 no.1
    • /
    • pp.41-46
    • /
    • 2023
  • Advances in machine learning have made a variety of techniques available for big data analysis. However, big data collected in practice lacks an automated technique, based on semantic analysis of the relationships between words, for refining identical or similar terms. Because most big data is written as ordinary sentences, it is difficult to understand the meaning of the sentences and their terms. Solving these problems requires morphological analysis and an understanding of sentence meaning. Natural language processing (NLP), a technique for analyzing natural language, can capture the relationships between words and sentences. Among NLP techniques, the transformer has been proposed as a way to overcome the disadvantages of RNNs by using self-attention within the encoder-decoder structure of seq2seq. In this paper, transformers are used to form associations between words in order to understand the words and phrases of sentences extracted from big data.
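To make the idea of self-attention as a measure of word association concrete, here is a minimal sketch of scaled dot-product self-attention. It is not the system described in the abstract. It assumes the word embeddings themselves serve as queries, keys, and values (a real transformer applies learned projections first), and the three toy embeddings are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(embeddings):
    """Return an n x n matrix; row i holds how strongly word i attends to each word."""
    scale = math.sqrt(len(embeddings[0]))
    weights = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        weights.append(softmax(scores))
    return weights


def attend(embeddings):
    """Replace each word vector with the attention-weighted mix of all word vectors."""
    weights = self_attention_weights(embeddings)
    dim = len(embeddings[0])
    return [
        [sum(w * vec[i] for w, vec in zip(row, embeddings)) for i in range(dim)]
        for row in weights
    ]


# Toy embeddings: words 0 and 1 point the same way, word 2 is orthogonal to both.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
association = self_attention_weights(toy)
```

Each row of `association` sums to 1, and word 0 places more weight on word 1 than on word 2, so the rows can be read as a simple association score between words.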

Fatty Acid Binding Protein 5 (FABP5) Promotes Aggressiveness of Gastric Cancer Through Modulation of Tumor Immunity

  • Mei-qing Qiu;Hui-jun Wang;Ya-fei Ju;Li Sun;Zhen Liu;Tao Wang;Shi-feng Kan;Zhen Yang;Ya-yun Cui;You-qiang Ke;Hong-min He;Shu Zhang
    • Journal of Gastric Cancer
    • /
    • v.23 no.2
    • /
    • pp.340-354
    • /
    • 2023
  • Purpose: Gastric cancer (GC) is the second most lethal cancer globally and is associated with poor prognosis. Fatty acid-binding proteins (FABPs) can regulate biological properties of carcinoma cells. FABP5 is overexpressed in many types of cancers; however, the role and mechanisms of action of FABP5 in GC remain unclear. In this study, we aimed to evaluate the clinical and biological functions of FABP5 in GC. Materials and Methods: We assessed FABP5 expression using immunohistochemical analysis in 79 patients with GC and evaluated its biological functions following in vitro and in vivo ectopic expression. FABP5 targets relevant to GC progression were determined using RNA sequencing (RNA-seq). Results: Elevated FABP5 expression was closely associated with poor outcomes, and ectopic expression of FABP5 promoted proliferation, invasion, migration, and carcinogenicity of GC cells, thus suggesting its potential tumor-promoting role in GC. Additionally, RNA-seq analysis indicated that FABP5 activates immune-related pathways, including cytokine-cytokine receptor interaction pathways, interleukin-17 signaling, and tumor necrosis factor signaling, suggesting an important rationale for the possible development of therapies that combine FABP5-targeted drugs with immunotherapeutics. Conclusions: These findings highlight the biological mechanisms and clinical implications of FABP5 in GC and suggest its potential as an adverse prognostic factor and/or therapeutic target.

Comparison of Gene Expression Changes in Three Wheat Varieties with Different Susceptibilities to Heat Stress Using RNA-Seq Analysis

  • Myoung Hui Lee;Kyeong-Min Kim;Wan-Gyu Sang;Chon-Sik Kang;Changhyun Choi
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.197-197
    • /
    • 2022
  • Wheat is highly susceptible to heat stress, which significantly reduces grain yield. In this study, we used RNA-seq technology to analyze transcript expression at three different time points after heat treatment in three cultivars differing in their susceptibility to heat stress: Jopum, Keumkang, and Olgeuru. A total of 11,751, 8,850, and 14,711; 10,959, 7,946, and 14,205; and 22,895, 13,060, and 19,408 differentially expressed genes (log2 fold-change > 1 and FDR (padj) < 0.05) were identified in Jopum, Keumkang, and Olgeuru in the control vs. 6-h, the control vs. 12-h, and the 6-h vs. 12-h heat treatment comparisons, respectively. Functional enrichment analysis showed that biological processes for the DEGs, such as the cellular response to heat and oxidative stress (including the removal of superoxide radicals and the positive regulation of superoxide dismutase activity), were significantly enriched in all three comparisons in all three cultivars. Furthermore, we investigated the differential expression patterns of reactive oxygen species (ROS)-scavenging enzymes, heat shock proteins, and heat-stress transcription factors using qRT-PCR to confirm the differences in gene expression among the three varieties under heat stress. This study contributes to a better understanding of the wheat heat-stress response at the early growth stage and the varietal differences in heat tolerance.
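The DEG thresholds quoted in the abstract above (log2 fold-change > 1 and padj < 0.05) amount to a simple filter, sketched below. The abstract does not say whether the fold-change cutoff applies to the absolute value; this sketch assumes it does by default and exposes a `two_sided` flag to switch that off. The record layout and gene names are hypothetical.

```python
def filter_degs(records, lfc_cutoff=1.0, padj_cutoff=0.05, two_sided=True):
    """Return the names of genes passing the fold-change and adjusted p-value cutoffs.

    records:   iterable of dicts with keys "gene", "log2FoldChange", "padj".
    two_sided: if True (an assumption), compare |log2FoldChange| with the cutoff,
               so strongly down-regulated genes are kept as well.
    """
    selected = []
    for rec in records:
        padj = rec["padj"]
        if padj is None:  # genes without an adjusted p-value are skipped
            continue
        lfc = rec["log2FoldChange"]
        magnitude = abs(lfc) if two_sided else lfc
        if magnitude > lfc_cutoff and padj < padj_cutoff:
            selected.append(rec["gene"])
    return selected


# Hypothetical results table for one comparison (for example, control vs. 6-h).
example = [
    {"gene": "geneA", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "geneB", "log2FoldChange": -1.8, "padj": 0.01},
    {"gene": "geneC", "log2FoldChange": 0.4, "padj": 0.0001},
    {"gene": "geneD", "log2FoldChange": 3.1, "padj": 0.2},
    {"gene": "geneE", "log2FoldChange": 1.5, "padj": None},
]
degs = filter_degs(example)
```

Running the filter once per comparison and per cultivar would give counts of the kind listed in the abstract, assuming the same cutoffs were applied throughout.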

Transcriptome Profiling of Differentially Expressed Genes in Cowpea (Vigna unguiculata L.) Under Salt Stress

  • Byeong Hee Kang;Woon Ji Kim;Sreepama Chowdhury;Chang Yeok Moon;Sehee Kang;Bo-Keun Ha
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.261-261
    • /
    • 2022
  • Cowpea [Vigna unguiculata (L.) Walp] is one of the most important grain legumes; it enhances soil fertility and is well adapted to various abiotic stresses. It is cultivated worldwide as a tropical annual crop, and semi-arid regions are known as the main cowpea-producing regions. However, the accumulation of soil salinity induced by low rainfall in these regions is reducing crop yields and quality. In general, plants exposed to soil salinity accumulate high levels of chloride ions, which leads to the degradation of root and leaf proteins. In this study, we identified candidate genes associated with salinity tolerance through an analysis of differentially expressed genes (DEGs) in four cowpea germplasms with contrasting salinity tolerance. A total of 553,776,035 short reads were obtained using the Illumina NovaSeq 6000 platform for RNA-Seq and were subsequently aligned to the cowpea reference genome Vunguiculata v1.2. A total of 9,806 DEGs were identified between the NaCl treatment and the control in the four cowpea germplasms. Among these DEGs, genes with functions such as calcium transport and the cytochrome P450 family were associated with salt stress. In GO and KEGG analyses, these DEGs were enriched in terms such as "phosphorylation", "extracellular region", and "ion binding". These RNA-seq results will improve the understanding of salt tolerance in cowpea and can be used as basic data for molecular breeding technology in the future.

Single-Cell RNA Sequencing of Bone Marrow Mesenchymal Stem Cells from the Elderly People

  • Dezhou Zhu;Jie Gao;Chengxuan Tang;Zheng Xu;Tiansheng Sun
    • International Journal of Stem Cells
    • /
    • v.15 no.2
    • /
    • pp.173-182
    • /
    • 2022
  • Background and Objectives: Bone marrow mesenchymal stem cells (BMSCs) show considerable promise in regenerative medicine. Many studies demonstrated that BMSCs cultured in vitro were highly heterogeneous and composed of diverse cell subpopulations, which may be the basis of their multiple biological characteristics. However, the exact cell subpopulations that make up BMSCs are still unknown. Methods and Results: In this study, we used single-cell RNA sequencing (scRNA-Seq) to divide 6,514 BMSCs into three clusters. The number and corresponding proportion of cells in clusters 1 to 3 were 3,766 (57.81%), 1,720 (26.40%), and 1,028 (15.78%). The gene expression profile and function of the cells in the same cluster were similar. The vast majority of cells expressed the markers defining BMSCs by flow cytometry and gene expression analysis. Each cluster had at least 20 differentially expressed genes (DEGs). We conducted Gene Ontology enrichment analysis on the top 20 DEGs of each cluster and found that the three clusters had different functions, which were related to self-renewal, multilineage differentiation and cytokine secretion, respectively. In addition, the function of the top 20 DEGs of each cluster was checked by the National Center for Biotechnology Information gene database to further verify our hypothesis. Conclusions: This study indicated that scRNA-Seq can be used to divide BMSCs into different subpopulations, demonstrating the heterogeneity of BMSCs.

Transcriptomic Analysis of Triticum aestivum under Salt Stress Reveals Change of Gene Expression (RNA sequencing을 이용한 염 스트레스 처리 밀(Triticum aestivum)의 유전자 발현 차이 확인 및 후보 유전자 선발)

  • Jeon, Donghyun;Lim, Yoonho;Kang, Yuna;Park, Chulsoo;Lee, Donghoon;Park, Junchan;Choi, Uchan;Kim, Kyeonghoon;Kim, Changsoo
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.67 no.1
    • /
    • pp.41-52
    • /
    • 2022
  • The Korean wheat cultivar 'Keumgang' has a short growth period and can be grown stably. Hexaploid wheat (Triticum aestivum) has moderately high salt tolerance compared to tetraploid wheat (Triticum turgidum L.). However, the molecular mechanisms underlying salt tolerance in hexaploid wheat have not yet been elucidated. In this study, candidate genes related to salt tolerance were identified by investigating genes that are differentially expressed between the Keumgang variety and the salt-tolerant mutant '2020-s1340'. A total of 85,771,537 reads were obtained after quality filtering using NextSeq 500 Illumina sequencing technology. A total of 23,634,438 reads were aligned to the NCBI Campala Lr22a pseudomolecule v5 reference genome (Triticum aestivum). A total of 282 differentially expressed genes (DEGs) were identified between the two Triticum aestivum materials. These DEGs include genes with functions related to salt tolerance, such as 'wall-associated receptor kinase-like 8', 'cytochrome P450', and '6-phosphofructokinase 2'. In addition, the identified DEGs were classified into three categories (biological process, molecular function, and cellular component) using gene ontology analysis. These DEGs were significantly enriched for terms such as 'copper ion transport', 'oxidation-reduction process', and 'alternative oxidase activity'. These results, obtained using RNA-seq analysis, will improve our understanding of salt tolerance in wheat. Moreover, this study will be a useful resource for breeding wheat varieties with improved salt tolerance using molecular breeding technology.