• Title/Abstract/Keywords: seq2seq

227 search results, processing time 0.032 s

Attention-LSTM based Lane Change Possibility Decision Algorithm for Urban Autonomous Driving

  • 이희성;이경수
    • 자동차안전학회지
    • /
    • Vol. 14, No. 3
    • /
    • pp.65-70
    • /
    • 2022
  • Lane changes in urban environments are challenging for both human and automated driving because of their complexity and non-linearity. With recent advances in deep learning, RNNs, which operate on time-series data, have become the mainstream approach in this field. Many RNN-based studies report high accuracy in highway environments but not yet in urban environments, where the surrounding situation is complex and changes rapidly. This paper therefore proposes a lane change possibility decision network that adopts an attention layer, a state-of-the-art (SOTA) technique in the seq2seq field. By weighting each time step within a given time horizon, the network interprets the context of the road situation in a more human-like way. The input is a 7D vector in total: the x and y distances and longitudinal relative speed of the side-front and side-rear vehicles, plus the longitudinal speed of the ego vehicle. A total of 5,614 expert data samples (4,098 yield cases and 1,516 non-yield cases) were used for training, and the network was tested on 1,817 samples. The network achieves a test accuracy of 99.641%, about 4% higher than a network using only LSTM in an urban environment. Furthermore, it behaves robustly against false-positive or true-negative objects.
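
The idea of weighting each time step can be illustrated with a minimal sketch. This is not the authors' network: it assumes a simple dot-product score between each per-step state and a hypothetical query vector, followed by a softmax, and it uses made-up numbers in place of real LSTM outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Score each time step as h_t . query, normalize with softmax,
    and return (weights, weighted-sum context vector)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Toy stand-in for LSTM outputs: 4 time steps of 7-dimensional states
# (7 matches the abstract's input size; the values are invented).
steps = [[0.1 * (t + 1)] * 7 for t in range(4)]
query = [1.0] * 7  # hypothetical query vector; learned in a real model
weights, context = attention_pool(steps, query)
```

The weights sum to 1, so the context vector is a convex combination of the per-step states; steps with higher scores contribute more to the final decision.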

News Recommendation Exploiting Document Summarization based on Deep Learning

  • 허지욱
    • 한국인터넷방송통신학회논문지
    • /
    • Vol. 22, No. 4
    • /
    • pp.23-28
    • /
    • 2022
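  • As smart devices such as smartphones and tablet PCs have become gateways to information, consuming web news through web portals has become increasingly important for many users. However, users struggle to keep up with the volume of news generated on the Internet, and the flood of duplicated and repetitive articles can instead cause confusion. This paper proposes a news recommendation system that applies KoBART-based document summarization to the candidate news articles retrieved by a user's query on a news portal. Experiments showed that KoBART trained on newly collected news data outperformed the pre-trained model, and the summaries generated by KoBART were used to recommend news to users effectively.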

Epigenetic Silencing of CHOP Expression by the Histone Methyltransferase EHMT1 Regulates Apoptosis in Colorectal Cancer Cells

  • Kim, Kwangho;Ryu, Tae Young;Lee, Jinkwon;Son, Mi-Young;Kim, Dae-Soo;Kim, Sang Kyum;Cho, Hyun-Soo
    • Molecules and Cells
    • /
    • Vol. 45, No. 9
    • /
    • pp.622-630
    • /
    • 2022
  • Colorectal cancer (CRC) has a high mortality rate among cancers worldwide. To reduce this mortality rate, chemotherapy (5-fluorouracil, oxaliplatin, and irinotecan) or targeted therapy (bevacizumab, cetuximab, and panitumumab) has been used to treat CRC. However, due to various side effects and poor responses to CRC treatment, novel therapeutic targets for drug development are needed. In this study, we identified the overexpression of EHMT1 in CRC using RNA sequencing (RNA-seq) data derived from TCGA, and we observed that knocking down EHMT1 expression suppressed cell growth by inducing cell apoptosis in CRC cell lines. In Gene Ontology (GO) term analysis using RNA-seq data, apoptosis-related terms were enriched after EHMT1 knockdown. Moreover, we identified the CHOP gene as a direct target of EHMT1 using a ChIP (chromatin immunoprecipitation) assay with an anti-histone 3 lysine 9 dimethylation (H3K9me2) antibody. Finally, after cotransfection with siEHMT1 and siCHOP, we again confirmed that CHOP-mediated cell apoptosis was induced by EHMT1 knockdown. Our findings reveal that EHMT1 plays a key role in regulating CRC cell apoptosis, suggesting that EHMT1 may be a therapeutic target for the development of cancer inhibitors.

Transcriptomic Profile Analysis of Jeju Buckwheat using RNA-Seq Data

  • 한송이;정성진;오대주;정용환;김찬식;김재훈
    • 한국산학기술학회논문지
    • /
    • Vol. 19, No. 1
    • /
    • pp.537-545
    • /
    • 2018
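  • In this study, to collect diverse information on transcripts expressed during early germination of buckwheat, RNA was extracted from Yangjeol buckwheat and Daegwan 3-3 and transcriptome analysis was performed. Total RNA was extracted from seeds of Jeju-grown Yangjeol buckwheat and Daegwan 3-3 and at 12, 24, and 36 hours after germination, and sequenced on the Illumina HiSeq 2000 platform. Raw data were processed with the DynamicTrim and LengthSort programs of the SolexaQA package, followed by assembly and annotation. From the RNA-seq raw data, 16.5 Gb and 16.2 Gb of transcriptome data were obtained, corresponding to about 84.2% and 81.5% of the raw data, respectively. A total of 43,494 representative transcripts amounting to 47 Mb were obtained, of which 23,165 showed sequence similarity to entries in the annotation DB. Gene ontology analysis of the buckwheat representative transcripts showed that genes were most abundant in metabolic process (49.49%) for biological process, cell (46.12%) for cellular component, and catalytic activity (80.43%) for molecular function. For the germination-related gibberellin receptor GID1C, expression increased over time in both Yangjeol buckwheat and Daegwan 3-3, whereas gibberellin 20-oxidase 1 increased within 12 hours after germination in Yangjeol buckwheat but continued to increase up to 36 hours in Daegwan 3-3. These stage-specific transcriptome data from early germination of Jeju buckwheat are expected to help elucidate the mechanisms underlying functional and morphological differences between species.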

Skin Transcriptome Profiling of the Blass Bloched Rockfish (Sebastes pachycephalus) with Different Body Color Patterns

  • 장요순
    • 한국어류학회지
    • /
    • Vol. 32, No. 3
    • /
    • pp.117-129
    • /
    • 2020
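  • Among the indicators used to distinguish species, body color is a distinctive morphological marker and a useful trait for identifying fish species. Sebastes pachycephalus is a commercially important fish distributed in central and southern Korea and south of Hokkaido, Japan, and it has complex body color characteristics: it is divided into four subspecies according to the presence or absence of spots on the skin and the position of markings. However, no studies have sought to identify the genes involved in body color formation in this species, such as by searching for genes associated with its diverse body color patterns or discovering genetic variants. Therefore, as a basic study toward finding genes related to body color patterns and characterizing their expression in S. pachycephalus, we profiled skin transcriptomes by body color type. Specimens were classified as Wild type (no spots or markings) or Color type (both spots and markings), and skin transcriptomes were analyzed by RNA-seq. Comparing the expression levels of skin transcripts yielded 164 differentially expressed genes between body color types. Gene ontology (GO) analysis of these differentially expressed genes assigned 2 to the molecular function group, 46 to biological process, and 6 to cellular component. Among the differentially expressed genes, CTL (Galactose-specific lectin nattectin), CUL1 (Cullin-1), CMAS (N-acylneuraminate cytidylyltransferase), NMRK2 (Nicotinamide riboside kinase 2), ALOXE3 (Hydroperoxide isomerase ALOXE3), and SLC4A7 (Sodium bicarbonate cotransporter 3) showed expression patterns specific to a particular body color type. This is the first study to explore transcripts involved in body color pattern formation in S. pachycephalus, and its significance lies in securing the differentially expressed genes by body color type as candidates for discovering functional genes related to body color formation. Future work will analyze the expression patterns and functions of these candidate genes to characterize the functional genes associated with the complex body color patterns of S. pachycephalus.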

Integration and Reanalysis of Four RNA-Seq Datasets Including BALF, Nasopharyngeal Swabs, Lung Biopsy, and Mouse Models Reveals Common Immune Features of COVID-19

  • Rudi Alberts;Sze Chun Chan;Qian-Fang Meng;Shan He;Lang Rao;Xindong Liu;Yongliang Zhang
    • IMMUNE NETWORK
    • /
    • Vol. 22, No. 3
    • /
    • pp.22.1-22.25
    • /
    • 2022
  • Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread across the world, causing a pandemic that has been ongoing since its emergence in late 2019. A great amount of effort has been devoted to understanding the pathogenesis of COVID-19 with the hope of developing better therapeutic strategies. Transcriptome analysis using technologies such as RNA sequencing has become a commonly used approach in the study of host immune responses to SARS-CoV-2. Although a substantial amount of information can be gathered from transcriptome analysis, the different analysis tools used in these studies may lead to conclusions that differ dramatically from each other. Here, we re-analyzed four RNA-sequencing datasets of COVID-19 samples, including human bronchoalveolar lavage fluid, nasopharyngeal swabs, lung biopsy, and hACE2 transgenic mice, using the same standardized method. The results showed that common features of COVID-19 include upregulation of chemokines including CCL2, CXCL1, and CXCL10, the inflammatory cytokine IL-1β, and the alarmin S100A8/S100A9, which are associated with dysregulated innate immunity marked by abundant neutrophil and mast cell accumulation. Downregulation of chemokine receptor genes associated with impaired adaptive immunity, such as lymphopenia, is another common feature of COVID-19 observed. In addition, a few interferon-stimulated genes but no type I IFN genes were found to be enriched in COVID-19 samples compared to their respective controls in these datasets. These features are in line with results from single-cell RNA sequencing studies in the field. Therefore, our re-analysis of the RNA-seq datasets revealed common features of dysregulated immune responses to SARS-CoV-2 and sheds light on the pathogenesis of COVID-19.

Genome-wide survey and expression analysis of F-box genes in wheat

  • Kim, Dae Yeon;Hong, Min Jeong;Seo, Yong Weon
    • Korean Society of Crop Science: Conference Proceedings
    • /
    • Korean Society of Crop Science, 2017, 9th Asian Crop Science Association Conference
    • /
    • pp.141-141
    • /
    • 2017
  • The ubiquitin-proteasome pathway is the major regulatory mechanism for selective protein degradation in a number of cellular processes and involves three steps: (1) ATP-dependent activation of ubiquitin by the E1 enzyme, (2) transfer of activated ubiquitin to E2, and (3) transfer of ubiquitin to the protein to be degraded by the E3 complex. F-box proteins are subunits of the SCF complex and confer specificity for the target substrate to be degraded. F-box proteins regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress responses, hormonal responses, and senescence. However, little is known about the F-box genes in wheat. The draft genome sequence of wheat (IWGSC Reference Sequence v1.0 assembly) was used for a genome-wide survey of the F-box gene family in wheat. The Hidden Markov Model (HMM) profiles of the F-box (PF00646), F-box-like (PF12937), F-box-like 2 (PF13013), FBA (PF04300), FBA_1 (PF07734), FBA_2 (PF07735), FBA_3 (PF08268), and FBD (PF08387) domains were downloaded from the Pfam database and searched against the IWGSC Reference Sequence v1.0 assembly. RNA-seq paired-end libraries from different stages of wheat, namely seedling, tillering, booting, day after flowering (DAF) 1, DAF 10, DAF 20, and DAF 30, were constructed and sequenced on the Illumina HiSeq2000 for expression analysis of F-box protein genes. Basic analyses including Hisat, HTseq, DEseq, gene ontology analysis, and KEGG mapping were conducted for differentially expressed gene (DEG) analysis and annotation mapping of DEGs from the various stages. About 950 F-box domain proteins identified by Pfam were mapped to the wheat reference genome sequence by blastX (e-value < 0.05). Among them, more than 140 putative F-box protein genes were selected using cut-offs of fold change > 2, p-value < 0.01, and FDR < 0.01. Expression profiles of the selected F-box protein genes were shown by heatmap analysis, and clustering of the 144 putative F-box protein genes by expression pattern was calculated using average linkage and squared Euclidean distance. This work may provide valuable basic information for further investigation of the protein degradation mechanism of the ubiquitin-proteasome system involving F-box proteins during wheat developmental stages.
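
The three selection cut-offs named in the abstract (fold change > 2, p-value < 0.01, FDR < 0.01) can be sketched as a simple filter. This is an illustrative assumption rather than the authors' pipeline: the FDR here is computed with the Benjamini-Hochberg procedure, the gene names and numbers are invented, and fold change is treated as a plain ratio.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, fold_cut=2.0, p_cut=0.01, fdr_cut=0.01):
    """genes: list of (name, fold_change, p_value).
    Keep the names that pass all three cut-offs."""
    fdr = benjamini_hochberg([p for _, _, p in genes])
    return [name for (name, fc, p), q in zip(genes, fdr)
            if fc > fold_cut and p < p_cut and q < fdr_cut]

# Invented toy data, not taken from the study.
toy = [("geneA", 4.0, 0.0001), ("geneB", 1.5, 0.0002),
       ("geneC", 3.0, 0.2), ("geneD", 2.5, 0.002)]
selected = select_degs(toy)
```

In this toy input, geneB fails the fold-change cut-off and geneC fails the p-value cut-off, so only geneA and geneD are kept.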

BSA-Seq Technologies Identify a Major QTL for Clubroot Resistance in Chinese Cabbage (Brassica rapa ssp. pekinensis)

  • Yuan, Yu-Xiang;Wei, Xiao-Chun;Zhang, Qiang;Zhao, Yan-Yan;Jiang, Wu-Sheng;Yao, Qiu-Ju;Wang, Zhi-Yong;Zhang, Ying;Tan, Yafei;Li, Yang;Xu, Qian;Zhang, Xiao-Wei
    • Korean Society of Mycology Newsletter: Conference Proceedings
    • /
    • Korean Society of Mycology, 2015 Spring Conference and Extraordinary General Meeting
    • /
    • pp.41-41
    • /
    • 2015
  • BSA-seq technologies, which combine Bulked Segregant Analysis (BSA) and Next-Generation Sequencing (NGS), are making it faster and more efficient to associate agronomic traits with molecular markers or candidate genes, a requirement for marker-assisted selection in molecular breeding. Clubroot disease, caused by Plasmodiophora brassicae, is a serious threat to Brassica crops. Although we have bred new clubroot-resistant varieties of Chinese cabbage (B. rapa ssp. pekinensis), the underlying genetic mechanism is unclear. In this study, an $F_2$ population of 340 plants was inoculated with P. brassicae from Xinye (Pathotype 2 on the differentials of Williams). The resistance phenotype segregation ratio of the population fit a 3:1 (R:S) segregation model, consistent with a single dominant gene model. Super-BSA, based on re-sequencing the parents and extreme R and S DNA pools of 50 plants each, revealed 3 potential candidate regions on chromosome A03, with the most significant region falling between 24.30 Mb and 24.75 Mb. A linkage map with 31 markers in this region was constructed, and several closely linked markers were identified. A major QTL for clubroot resistance, CRq, was identified with a peak LOD score of 169.3, explaining 89.9% of the phenotypic variation. We also developed a new co-segregating InDel marker, BrQ-2. Joint BSA-seq and traditional QTL analysis delimited CRq to a 250 kb genomic region, where four TIR-NBS-LRR genes (Bra019409, Bra019410, Bra019412 and Bra019413) clustered. The CR gene CRq and closely linked markers will be highly useful for breeding new resistant Chinese cabbage cultivars.
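
A fit to a 3:1 segregation model is commonly checked with a chi-square goodness-of-fit test. The abstract does not report the resistant and susceptible counts, so the sketch below uses hypothetical counts (250 R and 90 S, chosen only to sum to the 340 plants mentioned) and the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_3_to_1(resistant, susceptible):
    """Chi-square statistic for observed R:S counts against a 3:1 expectation."""
    total = resistant + susceptible
    expected_r = total * 0.75
    expected_s = total * 0.25
    return ((resistant - expected_r) ** 2 / expected_r
            + (susceptible - expected_s) ** 2 / expected_s)

# Critical value of the chi-square distribution, df = 1, alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841

# Hypothetical counts for illustration only; the abstract gives just the total.
chi2 = chi_square_3_to_1(250, 90)
fits_3_to_1 = chi2 < CRITICAL_DF1_ALPHA_05
```

A statistic below the critical value means the 3:1 hypothesis is not rejected, which is what "fit a 3:1 segregation model" conventionally reports.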

Whole genome MBD-seq and RRBS analyses reveal that hypermethylation of gastrointestinal hormone receptors is associated with gastric carcinogenesis

  • Kim, Hee-Jin;Kang, Tae-Wook;Haam, Keeok;Kim, Mirang;Kim, Seon-Kyu;Kim, Seon-Young;Lee, Sang-Il;Song, Kyu-Sang;Jeong, Hyun-Yong;Kim, Yong Sung
    • Experimental and Molecular Medicine
    • /
    • Vol. 50, No. 12
    • /
    • pp.1.1-1.14
    • /
    • 2018
  • DNA methylation is a regulatory mechanism in epigenetics that is frequently altered during human carcinogenesis. To detect critical methylation events associated with gastric cancer (GC), we compared three DNA methylomes from gastric mucosa (GM), intestinal metaplasia (IM), and gastric tumor (GT) cells that were microscopically dissected from an intestinal-type early gastric cancer (EGC) using methylated DNA binding domain sequencing (MBD-seq) and reduced representation bisulfite sequencing (RRBS) analysis. In this study, we focused on differentially methylated promoters (DMPs) that could be directly associated with gene expression. We detected 2,761 and 677 DMPs between the GT and GM by MBD-seq and RRBS, respectively, for a total of 3,035 DMPs. Of all DMPs, 514 (17%) were also detected in the genome of IM, a precancerous stage of GC, suggesting that some DMPs might represent an early event in gastric carcinogenesis. A pathway analysis of all DMPs demonstrated that 59 G protein-coupled receptor (GPCR) genes linked to the hypermethylated DMPs were significantly enriched in a neuroactive ligand-receptor interaction pathway. Furthermore, among the 59 GPCRs, six GI hormone receptor genes (NPY1R, PPYR1, PTGDR, PTGER2, PTGER3, and SSTR2) that play an inhibitory role in the secretion of gastrin or gastric acid were selected and validated as potential biomarkers for the diagnosis or prognosis of GC patients in two cohorts. These data suggest that the loss of function of gastrointestinal (GI) hormone receptors by promoter methylation may lead to gastric carcinogenesis, because gastrin and gastric acid are known to play a role in cell differentiation and carcinogenesis in the GI tract.

Genome Survey and Microsatellite Marker Selection of Tegillarca granosa

  • 김진무;이승재;조은아;최은경;김현진;이정식;박현
    • 한국해양생명과학회지
    • /
    • Vol. 6, No. 1
    • /
    • pp.38-46
    • /
    • 2021
  • Tegillarca granosa, a type of cockle, is a marine bivalve and one of the important fishery resources of Korea, China, and Japan. Its chromosome number is known to be 2n=38, but its genome size and genetic information are not yet clearly known. To estimate the genome size, we performed an in silico analysis of short DNA reads obtained on the NGS Illumina HiSeq platform. The genome size of T. granosa was estimated at 770.61 Mb. A draft genome was then assembled with the MaSuRCA assembler, and SSR (simple sequence repeat) analysis was performed with the QDD pipeline. A total of 43,944 SSRs were identified in the genome, comprising 69.51% di-nucleotide, 16.68% tri-nucleotide, 12.96% tetra-nucleotide, 0.82% penta-nucleotide, and 0.03% hexa-nucleotide repeats. Primer sets for 100 microsatellite markers that can be used in studies of the genetic diversity of T. granosa were then selected. This study is expected to aid population genetic studies and the characterization of genetic diversity in T. granosa and, further, to make it possible to determine the geographic origin of individuals of the same species.
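
The SSR classification by motif length can be illustrated with a small regular-expression scan. This is a simplified sketch and not the QDD pipeline: it assumes perfect tandem repeats only, a hypothetical minimum of four repeat units, and an invented toy sequence.

```python
import re
from collections import Counter

MOTIF_NAMES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}

def find_ssrs(seq, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats
    of 2-6 bp motifs; min_repeats is an assumed threshold."""
    # Lazy motif match so the shortest repeating unit is reported.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:  # skip homopolymer runs such as AAAA
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

def motif_composition(hits):
    """Percentage of SSRs in each motif-length class."""
    counts = Counter(MOTIF_NAMES[len(motif)] for _, motif, _ in hits)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

# Invented toy sequence: one (AC)5 repeat and one (ATG)4 repeat.
toy_seq = "GG" + "AC" * 5 + "TTC" + "ATG" * 4 + "CC"
hits = find_ssrs(toy_seq)
composition = motif_composition(hits)
```

Applied to an assembled genome, the same tally would yield a breakdown in the form reported in the abstract (percent di-, tri-, tetra-, penta-, and hexa-nucleotide repeats), although dedicated tools also handle imperfect and compound repeats.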