• Title/Summary/Keyword: seq2seq

Screening for candidate genes related with histological microstructure, meat quality and carcass characteristic in pig based on RNA-seq data

  • Ropka-Molik, Katarzyna;Bereta, Anna;Zukowski, Kacper;Tyra, Miroslaw;Piorkowska, Katarzyna;Zak, Grzegorz;Oczkowicz, Maria
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.31 no.10
    • /
    • pp.1565-1574
    • /
    • 2018
  • Objective: The aim of the present study was to identify genetic variants based on RNA-seq data, obtained via transcriptome sequencing of muscle tissue of pigs differing in muscle histological structure, and to verify the variants' effect on histological microstructure and production traits in a larger pig population. Methods: RNA-seq data were used to identify a panel of single nucleotide polymorphisms (SNPs) significantly related to the percentage and diameter of each fiber type (I, IIA, IIB). Detected polymorphisms were mapped to quantitative trait locus (QTL) regions. Next, an association study was performed on 944 animals representing five breeds (Landrace, Large White, Pietrain, Duroc, and the native Puławska breed) to evaluate the relationship between selected SNPs and histological characteristics, meat quality, and carcass traits. Results: Mapping of detected genetic variants to QTL regions showed that chromosome 14 was the most overrepresented, with four QTLs related to the percentage of fiber types I and IIA. The association study performed on 293 longissimus muscle samples confirmed a significant positive effect of transforming acidic coiled-coil-containing protein 2 (TACC2) polymorphisms on fiber diameter, while an SNP within the forkhead box O1 (FOXO1) locus was associated with decreased diameter of fiber types IIA and IIB. Moreover, subsequent general linear model analysis showed a significant relationship of the FOXO1, delta 4-desaturase, sphingolipid 1 (DEGS1), and troponin T2 (TNNT2) genes with loin 'eye' area, of FOXO1 with loin weight, and of FOXO1 and TACC2 with lean meat percentage. Furthermore, intramuscular fat content was positively associated (p<0.01) with polymorphisms within the DEGS1 and TNNT2 genes and negatively associated with the TACC2 polymorphism.
Conclusion: This study's results indicate that the SNP calling analysis based on RNA-seq data can be used to search candidate genes and establish the genetic basis of phenotypic traits. The presented results can be used for future studies evaluating the use of selected SNPs as genetic markers related to muscle histological profile and production traits in pig breeding.

Computational approaches for prediction of protein-protein interaction between Foot-and-mouth disease virus and Sus scrofa based on RNA-Seq

  • Park, Tamina;Kang, Myung-gyun;Nah, Jinju;Ryoo, Soyoon;Wee, Sunghwan;Baek, Seung-hwa;Ku, Bokkyung;Oh, Yeonsu;Cho, Ho-seong;Park, Daeui
    • Korean Journal of Veterinary Service
    • /
    • v.42 no.2
    • /
    • pp.73-83
    • /
    • 2019
  • Foot-and-mouth disease (FMD) is a highly contagious transboundary viral disease caused by FMD virus (FMDV), which causes huge economic losses. FMDV infects cloven-hoofed (two-toed) mammals such as cattle, sheep, goats, pigs, and various wildlife species. To control FMDV, it is necessary to understand its life cycle and pathogenesis in the host. In particular, knowledge of the protein-protein interactions between FMDV and the host will help to understand the survival cycle of the virus in the host cell and to establish new therapeutic strategies. However, computational approaches to protein-protein interaction between FMDV and pig hosts have not been applied to studies of the onset mechanism of FMDV. In the present work, we predicted the pig proteins that interact with FMDV based on RNA-Seq data, protein sequence, and structure information. After identifying the virus-host interactions, we looked for meaningful pathways and anticipated changes in the host caused by FMDV infection. A total of 78 pig proteins were predicted to interact with FMDV. The 156 interactions include 94 predicted by a sequence-based method and 62 predicted by a structure-based method using domain information. The protein interaction network contained integrin as well as STYK1, VTCN1, IDO1, CDH3, SLA-DQB1, FER, and FGFR2, which were related to the up-regulation of inflammation and the down-regulation of cell adhesion and host defense systems such as macrophages and leukocytes. These results provide clues to the mechanism by which FMDV affects the host cell.

A Transliteration Model based on the Seq2seq Learning and Methods for Phonetically-Aware Partial Match for Transliterated Terms in Korean (문장대문장 학습을 이용한 음차변환 모델과 한글 음차변환어의 발음 유사도 기반 부분매칭 방법론)

  • Park, Joohee;Park, Wonjun;Seo, Heecheol
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.443-448
    • /
    • 2018
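  • To improve the quality of web search results, exact query matching is not enough; flexible matching, such as matching a Korean string with an English string that refers to the same entity (e.g., 네이버-naver), is also important. This paper presents a method for transliterating English strings into Korean strings through sequence-to-sequence (seq2seq) learning. We also present a pronunciation-similarity-based partial matching method that can match the Korean string obtained by transliteration against the various transliterations of the same English string, and we quantitatively evaluate both methods using Wikipedia redirect keywords. We show that seq2seq-based transliteration can take complex context into account, and that using jamo similarity in the computation of the Damerau-Levenshtein distance enables more effective partial matching between Korean keywords than previous approaches.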
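
The abstract above does not spell out how jamo similarity enters the distance, so the following is only a minimal sketch under stated assumptions, not the authors' method. It uses the restricted (optimal string alignment) form of the Damerau-Levenshtein distance, and it assumes a hypothetical substitution cost equal to the fraction of the three jamo positions (initial, medial, final) that differ between two Hangul syllables. The function names `decompose`, `sub_cost`, and `jamo_dl_distance` are illustrative only.

```python
HANGUL_BASE = 0xAC00  # first precomposed Hangul syllable (가)
HANGUL_LAST = 0xD7A3  # last precomposed Hangul syllable (힣)


def decompose(ch):
    """Return (initial, medial, final) jamo indices for a Hangul syllable.

    Non-Hangul characters are returned as a 1-tuple holding the character.
    """
    code = ord(ch)
    if HANGUL_BASE <= code <= HANGUL_LAST:
        idx = code - HANGUL_BASE
        # 21 medials * 28 finals = 588 syllables per initial consonant
        return (idx // 588, (idx % 588) // 28, idx % 28)
    return (ch,)


def sub_cost(a, b):
    """Hypothetical substitution cost: share of differing jamo positions."""
    if a == b:
        return 0.0
    ja, jb = decompose(a), decompose(b)
    if len(ja) != 3 or len(jb) != 3:
        return 1.0  # at least one side is not a Hangul syllable
    return sum(x != y for x, y in zip(ja, jb)) / 3


def jamo_dl_distance(s, t):
    """Restricted Damerau-Levenshtein distance with jamo-aware substitution."""
    n, m = len(s), len(t)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = float(i)
    for j in range(m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,                             # deletion
                d[i][j - 1] + 1.0,                             # insertion
                d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),  # substitution
            )
            # transposition of two adjacent characters
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[n][m]
```

Under this assumed cost, two transliterations that differ only in one vowel, such as 네이버 and 내이버, end up closer to each other than two strings that differ in a whole syllable, which is the kind of pronunciation-aware partial matching the abstract describes.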

Identification of Hemimethylated DNA Binding Activity in the seqA Mutant

  • Lee, Ho;Kang, Suk-Hyun;Yim, Jeong-Bin;Hwang, Deog-Su
    • Animal cells and systems
    • /
    • v.2 no.3
    • /
    • pp.351-353
    • /
    • 1998
  • A 245 bp segment of the E. coli chromosomal replication origin, oriC, contains 11 repeats of the GATC sequence, in which adenine is methylated by Dam methylase. Newly replicated oriC is hemimethylated: the parental strand is methylated, but the nascent strand remains unmethylated until it is methylated by Dam methylase. Hemimethylated oriC plays an important role in the regulation of chromosomal replication. An activity was identified in the seqA mutant that binds preferentially to hemimethylated DNA, but not to fully methylated DNA. This activity may participate in the sequestration of initiation of chromosomal replication.

Mitigating Hate Speech in Korean Open-domain Chatbot using CTRL (한국어 오픈 도메인 대화 모델의 CTRL을 활용한 혐오 표현 생성 완화)

  • Jwa, Seung Yeon;Cha, Young-rok;Han, Moonsu;Shin, Donghoon
    • Annual Conference on Human and Language Technology
    • /
    • 2021.10a
    • /
    • pp.365-370
    • /
    • 2021
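  • Language models trained on large corpora also learn the social biases and hate speech contained in those corpora. This study presents a method for mitigating hate speech generation in a Korean open-domain dialogue model. Building on BART [1], which has a seq2seq architecture, we added control codes to control hate speech generation. Compared with a baseline model that does not use control codes, the model trained with added control codes generated less hate speech, with no change in dialogue quality.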

Korean Generation-based Dialogue State Tracking using Korean Token-Free Pre-trained Language Model KeByT5 (한국어 토큰-프리 사전학습 언어모델 KeByT5를 이용한 한국어 생성 기반 대화 상태 추적)

  • Kiyoung Lee;Jonghun Shin;Soojong Lim;Ohwoog Kwon
    • Annual Conference on Human and Language Technology
    • /
    • 2023.10a
    • /
    • pp.644-647
    • /
    • 2023
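  • In dialogue systems, dialogue state tracking plays an important role in identifying the user's intent as the conversation proceeds and in determining the system response. In task-oriented dialogue in particular, dialogue state tracking is essential for satisfying the user's goal. Recently, many natural language processing downstream tasks have achieved good performance by using a pre-trained language model as the backbone network and fine-tuning it on the target domain task. This paper introduces a Korean generation-based dialogue state tracking model that uses KeByT5, a Korean token-free pre-trained language model, fine-tuned in an end-to-end seq2seq manner, and describes the results of the related experiments.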

Deep Learning-based Delinquent Taxpayer Prediction: A Scientific Administrative Approach

  • YongHyun Lee;Eunchan Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.1
    • /
    • pp.30-45
    • /
    • 2024
  • This study introduces an effective method for predicting individual local tax delinquencies using prevalent machine learning and deep learning algorithms. The evaluation of credit risk holds great significance in the financial realm, impacting both companies and individuals. While credit risk prediction has been explored using statistical and machine learning techniques, their application to tax arrears prediction remains underexplored. We forecast individual local tax defaults in Republic of Korea using machine and deep learning algorithms, including convolutional neural networks (CNN), long short-term memory (LSTM), and sequence-to-sequence (seq2seq). Our model incorporates diverse credit and public information like loan history, delinquency records, credit card usage, and public taxation data, offering richer insights than prior studies. The results highlight the superior predictive accuracy of the CNN model. Anticipating local tax arrears more effectively could lead to efficient allocation of administrative resources. By leveraging advanced machine learning, this research offers a promising avenue for refining tax collection strategies and resource management.

Analysis of opposing histone modifications H3K4me3 and H3K27me3 reveals candidate diagnostic biomarkers for TNBC and gene set prediction combination

  • Park, Hyoung-Min;Kim, HuiSu;Lee, Kang-Hoon;Cho, Je-Yoel
    • BMB Reports
    • /
    • v.53 no.5
    • /
    • pp.266-271
    • /
    • 2020
  • Breast cancer accounts for a major portion of human cancers and must be carefully monitored for appropriate diagnosis and treatment. Among the many types of breast cancer, triple negative breast cancer (TNBC) has the worst prognosis and the fewest cases reported. To gain a better understanding of TNBC and a more decisive precursor for it, two major histone modifications, the activating modification H3K4me3 and the repressive modification H3K27me3, were analyzed using data from normal breast cell lines against TNBC cell lines. The combination of these two histone markers on gene promoter regions showed a strong correlation with gene expression. According to the distribution of these histone modifications on the promoter regions, a list of signature genes was defined as active (highly enriched H3K4me3), including NOVA1, NAT8L, and MMP16, or repressive (highly enriched H3K27me3), including IRX2 and ADRB2. To extend the investigation, potential candidates were also compared with other types of breast cancer to identify signs specific to TNBC. RNA-seq data were used to confirm and verify gene regulation governed by the histone modifications. Combinations of the biomarkers based on H3K4me3 and H3K27me3 showed a diagnostic value of AUC 93.28% with a P-value of 1.16e-226. The results of this study suggest that analysis of opposing histone modifications may be valuable for developing biomarkers and targets for TNBC.

RNA-seq profiling of skin in temperate and tropical cattle

  • Morenikeji, Olanrewaju B.;Ajayi, Oyeyemi O.;Peters, Sunday O.;Mujibi, Fidalis D.;De Donato, Marcos;Thomas, Bolaji N.;Imumorin, Ikhide G.
    • Journal of Animal Science and Technology
    • /
    • v.62 no.2
    • /
    • pp.141-158
    • /
    • 2020
  • Skin is a major thermoregulatory organ in the body controlling homeothermy, a critical function for climate adaptation. We compared genes expressed between tropical- and temperate-adapted cattle to better understand genes involved in climate adaptation and hence thermoregulation. We profiled the skin of representative tropical and temperate cattle using RNA-seq. A total of 214,754,759 reads were generated and assembled into 72,993,478 reads and were mapped to unique regions in the bovine genome. Gene coverage of unique regions of the reference genome showed that of 24,616 genes, only 13,130 genes (53.34%) displayed more than one count per million reads for at least two libraries and were considered suitable for downstream analyses. Our results revealed that of 255 differentially expressed genes, 98 were upregulated and 157 were downregulated in tropically adapted White Fulani (WF; Bos indicus) compared to Angus (AG; Bos taurus). Fifteen pathways were identified from the differential gene sets through gene ontology and pathway analyses. These include the significantly enriched melanin metabolic process, proteinaceous extracellular matrix, inflammatory response, defense response, calcium ion binding, and response to wounding. Quantitative PCR was used to validate six representative genes associated with skin thermoregulation and epithelial dysfunction (mean correlation 0.92; p < 0.001). Our results contribute to identifying genes and understanding molecular mechanisms of skin thermoregulation that may influence strategic genomic selection in cattle to withstand climate adaptation, microbial invasion, and mechanical damage.
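
The expression filter mentioned above (more than one count per million reads in at least two libraries) can be illustrated with a small sketch. This is not the authors' pipeline: it assumes library size is simply the column sum of the raw count table, and the function name `cpm_filter`, its parameters, and the toy gene names are hypothetical.

```python
def cpm_filter(counts, min_cpm=1.0, min_libraries=2):
    """Return genes with CPM above min_cpm in at least min_libraries libraries.

    counts maps a gene name to a list of raw read counts, one per library.
    Library size is assumed to be the sum of counts in that library.
    """
    n_libs = len(next(iter(counts.values())))
    lib_sizes = [sum(row[k] for row in counts.values()) for k in range(n_libs)]
    kept = []
    for gene, row in counts.items():
        passing = sum(
            1
            for count, size in zip(row, lib_sizes)
            if size > 0 and count / size * 1e6 > min_cpm
        )
        if passing >= min_libraries:
            kept.append(gene)
    return kept


# Toy example: three libraries of exactly one million reads each.
toy_counts = {
    "geneA": [500000, 600000, 400000],
    "geneB": [1, 0, 0],           # CPM of 1.0 in one library, not above 1
    "geneC": [499997, 399997, 600000],
    "geneD": [2, 3, 0],           # CPM above 1 in exactly two libraries
}
kept_genes = cpm_filter(toy_counts)
```

In this toy table, `geneB` is dropped because it never exceeds one count per million, while `geneD` is kept because it passes the threshold in two of the three libraries.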

Identification of a novel immune-related gene in the immunized black soldier fly, Hermetia illucens (L.)

  • Jung, Seong-Tae;Goo, Tae-Won;Kim, Seong Ryul;Choi, Gwang-Ho;Kim, Sung-Wan;Nga, Pham Thi;Park, Seung-Won
    • International Journal of Industrial Entomology and Biomaterials
    • /
    • v.36 no.2
    • /
    • pp.25-30
    • /
    • 2018
  • The larvae of Hermetia illucens have a high probability of coming into contact with microorganisms such as bacteria and fungi. Therefore, the survival of H. illucens depends primarily on protecting itself against microbial infection, which in turn depends on the development of the innate immune system. Antimicrobial peptides (AMPs) exhibit antimicrobial activity against other bacterial strains and can provide important data for understanding the basis of the innate immunity of H. illucens. In this study, we injected larvae with Enterococcus faecalis (gram-positive bacteria) and Serratia marcescens (gram-negative bacteria) to test the hypothesis that H. illucens is protected from infection by its immune-related gene expression repertoire. To identify the inducible immune-related genes, we generated and cataloged the transcriptomes by RNA-Seq analysis. We compared the transcriptomes of whole larvae and obtained, by RACE, a DNA fragment of 465 bp including the poly(A) tail as a novel H. illucens immune-related gene against bacteria. Expression of the novel target mRNA was higher in the groups immunized with E. faecalis and S. marcescens than in the non-immunized group. We expect our study to provide evidence that the global RNA-Seq approach allowed for the identification of a gene of interest, which was further analyzed by quantitative RT-PCR together with genes chosen from the available literature.