• Title/Abstract/Keywords: Data annotation

261 search results (processing time: 0.033 s)

Prediction of ORFs in Metagenome by Using Cis-acting Transcriptional and Translational Factors

  • 정대은;김근중
    • KSBB Journal / Vol. 25, No. 5 / pp. 490-496 / 2010
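  • Roughly $5 \times 10^{30}$ microbial cells exist on Earth, and they hold 350~550 Pg (1 Pg = $10^{15}$ g) of carbon, 85~130 Pg of nitrogen, and 9~14 Pg of phosphorus, a larger pool of these elements than that of any other group of organisms on Earth. Relationships between these microbes and the other organisms and inorganic matter that make up ecosystems continue to be uncovered. Because the basic goal of such studies is to identify the factors important to these interactions (typically genes), searching for and confirming true ORFs on the chromosome is the most important basic tool. However, for environmental genomes composed of diverse microbes, the fraction that can be searched with existing information cannot be estimated accurately, which causes many difficulties. Searching data with such unclear boundaries requires more information (more gene sequences for training or for defining the search space), and additional search methods and techniques will have to be developed. As an alternative, a search method based on the conservation of transcriptional/translational factors in microbial intergenic sequences could have a wide range of application, depending on how it is improved. Even at its current level it has sufficient value in combined searches, that is, when used together with existing methods or as a complement to them. This assessment was confirmed by results that ranged from no agreement at all with conventional ORF-centered predictions to more than 90% agreement. The many cases of disagreement arise because they include novel ORFs that are not detected by BLAST searches.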
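The idea of supporting an ORF call with a conserved cis-acting signal in the upstream intergenic sequence can be illustrated with a small sketch. The abstract does not name the motifs the authors used, so everything here is an assumption for illustration: the function name `find_orfs_with_rbs`, the simplified Shine-Dalgarno-like pattern, the 20 nt upstream window, and the minimum length are all hypothetical, and only the forward strand is scanned.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumption: a simplified Shine-Dalgarno-like core motif. The motifs used in
# the paper are not given in the abstract.
SD_PATTERN = re.compile(r"AGGAGG|GGAGG|AGGAG")


def find_orfs_with_rbs(seq, min_codons=10, upstream_window=20):
    """Return (start, end, has_rbs) for forward-strand ORFs that begin with ATG.

    start/end are 0-based, end-exclusive, and the stop codon is included.
    has_rbs is True when an SD-like motif lies within upstream_window
    nucleotides before the start codon.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    upstream = seq[max(0, start - upstream_window):start]
                    has_rbs = bool(SD_PATTERN.search(upstream))
                    orfs.append((start, pos + 3, has_rbs))
                break
    return orfs


if __name__ == "__main__":
    toy = "CCCCAGGAGGTTTTTT" + "ATG" + "GCT" * 12 + "TAA"
    for start, end, has_rbs in find_orfs_with_rbs(toy):
        print(start, end, has_rbs)
```

In a combined search of the kind the abstract describes, the `has_rbs` flag would serve as one extra line of evidence alongside homology-based calls, not as a replacement for them.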

Image retrieval based on a combination of deep learning and behavior ontology for reducing semantic gap

  • 이승;정혜욱
    • 예술인문사회 융합 멀티미디어 논문지 / Vol. 9, No. 11 / pp. 1133-1144 / 2019
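  • With the recent advance of smart devices, the amount of image data on the Internet is growing rapidly, and various methods for effective image retrieval are being studied. Existing image retrieval methods simply detect the objects present in an image and search on the label information of each object, so a semantic gap arises, that is, a difference in meaning between the image the user wants and the images returned by the search. To reduce the semantic gap in image retrieval, this paper connects a deep-learning-based multi-object classification module with a module that classifies human behavior, and combines these modules with a behavior ontology. In other words, it proposes an image retrieval system that takes the relationships among objects into account, based on the combination of deep learning and a behavior ontology. To account for dynamic behavior contained in images, the results of experiments on Walking and Running data were analyzed. The proposed method can be extended to research on automatic annotation generation for images, which could improve the accuracy of image retrieval results.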

A study on collecting and classifying the Chosen literatures and archives of Chosen General Government

  • 이승일
    • 기록학연구 / No. 4 / pp. 93-130 / 2001
  • The Chosen General Government began collecting and managing the archives of the Chosen Dynasty because it needed them to advance its colonial policies. These efforts eventually deepened the regime's understanding of Korean culture and contributed, to an extent, to its rule over Korea. Several developments reflect this. In 1910 the Chosen General Government took over, and began to arrange and classify, the huge volume of archives held by the royal family. During this period it collected and arranged the literature it had taken over from the earlier Korean government. In 1913 it greatly increased the variety and volume of the archives it intended to collect, starting with literature that had existed in the civil sector before 1894. It was thus in 1913 that the Chosen General Government revealed its intention to collect and classify both royal and civil archives. Along with collecting, classifying, and annotating archives, it began compiling Chosensa (Korean History). These efforts aimed at the cultural assimilation and education of the Korean people, and in the process the importance of the Chosen Dynasty's archives was reconfirmed. One representative case was a change of terminology. With the compilation in full swing from 1915, the Chosen General Government began to use the term 'Saryo' (historical records) repeatedly for Chosen's literature and archives. The term had previously been used in Japanese literature, and from that time it appears to have served as a general term for the archives of the Chosen Dynasty. This signifies that the Chosen General Government began to bring its own historical point of view to the archives of Chosen.
As it broadened its understanding of Korea through the annotation of old literature and the compilation of Chosen History, it set about in earnest the work of culturally assimilating the Korean people in order to tighten its rule over Korea. The archives of Chosen were likewise crucial basic data for understanding Korea and its people, and the Chosen General Government is deemed to have used them as a means of ruling and assimilating the Korean people.

An Analytical Study on Automatic Classification of Domestic Journal articles Using Random Forest

  • 김판준
    • 정보관리학회지 / Vol. 36, No. 2 / pp. 57-77 / 2019
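  • Random forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in library and information science. In particular, multifaceted experiments were carried out on key factors such as the number of trees, feature selection, and training set size, in terms of classification performance when automatically assigning subject categories to domestic journal articles. Through these experiments the study sought ways to optimize the performance of RF on a real-world imbalanced dataset. As a result, in the automatic classification of domestic journal articles, RF can be expected to perform best with a number of trees in the interval 100~1000 (C), a small feature set (10%) selected by the chi-square statistic (CHI), and most of the training set (9~10 years).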
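The chi-square (CHI) feature selection step mentioned in the abstract can be sketched in plain Python. This covers only the selection of a small term subset, not the random forest itself, and it is an illustration under stated assumptions: documents are represented as sets of terms, selection is done against a single target class, and the names `chi_square` and `select_top_terms` are hypothetical rather than taken from the paper.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_terms(docs, labels, target, fraction=0.10):
    """Rank terms by chi-square against one class and keep the top fraction.

    docs is a list of term sets, labels the matching class labels.
    At least one term is always kept; ties are broken alphabetically.
    """
    vocab = sorted({term for doc in docs for term in doc})
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            in_class = label == target
            if term in doc:
                a += in_class
                b += not in_class
            else:
                c += in_class
                d += not in_class
        scores[term] = chi_square(a, b, c, d)
    keep = max(1, round(len(vocab) * fraction))
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:keep]


if __name__ == "__main__":
    docs = [{"library", "index"}, {"library", "catalog"},
            {"gene", "index"}, {"gene", "catalog"}]
    labels = ["lis", "lis", "bio", "bio"]
    print(select_top_terms(docs, labels, target="lis", fraction=0.5))
```

Note that the statistic is symmetric: a term that is strongly absent from the target class scores as high as one that is strongly present, which is usually acceptable when the selected terms feed a classifier.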

Appendix The Annotation of 『Gongchengzuofazeli (工程做法則例)』, and Commentary on its First Volume

  • 한동수;동건비;이성호;양희식
    • 헤리티지:역사와 과학 / Vol. 43, No. 2 / pp. 82-119 / 2010
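  • "Gongchengzuofazeli (工程做法則例)" is a book published by the Ministry of Works (工部) in the 12th year of the Yongzheng (雍正) reign of the Qing dynasty (1734) to unify construction standards for buildings and strengthen the system of construction project management. To establish the construction standards, it records the dimensions of each individual member on the basis of the doukou (斗口), and it also records various costs for project management. These records are now recognized as important material for inferring the building technology and building environment of the time. However, Korean architectural historians have tended to concentrate on "Yingzao Fashi (營造法式)", published in the Northern Song period, so although the importance of "Gongchengzuofazeli" is acknowledged, it has not yet even been translated. This paper therefore sets out the basic contents of "Gongchengzuofazeli" at the outset and presents a translation of the original text of Volume 1, with the aim of providing basic material for later research.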

Identification of Hub Genes in the Pathogenesis of Ischemic Stroke Based on Bioinformatics Analysis

  • Yang, Xitong;Yan, Shanquan;Wang, Pengyu;Wang, Guangming
    • Journal of Korean Neurosurgical Society / Vol. 65, No. 5 / pp. 697-709 / 2022
  • Objective : The present study aimed to identify the function of peripheral blood from ischemic stroke (IS) patients and its role in IS, explore the pathogenesis, and provide direction for clinical research through comprehensive bioinformatics analysis. Methods : Two datasets, GSE58294 and GSE22255, were downloaded from the Gene Expression Omnibus database. GEO2R was used to obtain differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). The protein-protein interaction (PPI) network of the DEGs was constructed with the Search Tool for the Retrieval of Interacting Genes (STRING) and visualized in Cytoscape, and hub genes were then identified by degree analysis. The microRNAs (miRNAs) and miRNA target genes closely related to the onset of stroke were obtained through the miRNA gene regulatory network. Results : In total, 36 DEGs, comprising 27 up-regulated and nine down-regulated DEGs, were identified. GO functional analysis showed that these DEGs were involved in regulation of apoptotic process, cytoplasm, protein binding and other biological processes. KEGG enrichment analysis showed that these DEGs mediated signaling pathways, including human T-cell lymphotropic virus (HTLV)-I infection and microRNAs in cancer. The results of the PPI network and cytoHubba showed that there was a relationship between the DEGs, and five hub genes related to stroke were obtained : SOCS3, KRAS, PTGS2, EGR1, and DUSP1. Combined with the visualization of DEG-miRNAs, hsa-mir-16-5p, hsa-mir-181a-5p and hsa-mir-124-3p were predicted to be the key miRNAs in stroke, and the three miRNAs were related to the hub genes. Conclusion : Thirty-six DEGs, five hub genes, and three miRNAs were obtained from bioinformatics analysis of IS microarray data, which might provide potential targets for diagnosis and treatment of IS.
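Degree analysis, the criterion the abstract names for picking hub genes, amounts to counting each gene's distinct interaction partners in the PPI network and ranking by that count. The following is a minimal sketch under stated assumptions: the edge list is toy data with placeholder gene names (G1 to G4), not interactions from the study, and the function name `hub_genes_by_degree` is hypothetical. The study itself used Cytoscape and cytoHubba, not this code.

```python
from collections import defaultdict


def hub_genes_by_degree(edges, top_n=5):
    """Rank genes by degree in an undirected interaction network.

    edges is an iterable of (gene_a, gene_b) pairs. Duplicate edges and
    self-loops are ignored. Returns up to top_n (gene, degree) tuples,
    highest degree first, ties broken alphabetically.
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # a self-loop adds no interaction partner
        neighbors[gene_a].add(gene_b)
        neighbors[gene_b].add(gene_a)
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(gene, len(neighbors[gene])) for gene in ranked[:top_n]]


if __name__ == "__main__":
    # Toy network with placeholder names, for illustration only.
    toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
                 ("G2", "G3"), ("G3", "G1")]
    print(hub_genes_by_degree(toy_edges, top_n=2))
```

Storing neighbors in sets keeps a repeated edge such as ("G3", "G1") from inflating the degree, which matters when an edge list is exported with both orientations.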

Draft Genome Sequence of the Reference Strain of the Korean Medicinal Mushroom Wolfiporia cocos KMCC03342

  • Bogun Kim;Byoungnam Min;Jae-Gu Han;Hongjae Park;Seungwoo Baek;Subin Jeong;In-Geol Choi
    • Mycobiology / Vol. 50, No. 4 / pp. 254-257 / 2022
  • Wolfiporia cocos is a wood-decay brown rot fungus belonging to the family Polyporaceae. While the fungus grows, the sclerotium body of the strain, dubbed Bokryeong in Korean, is formed around the roots of conifer trees. The dried sclerotium has been widely used as a key component of many medicinal recipes in East Asia. Wolfiporia cocos strain KMCC03342 is the reference strain registered and maintained by the Korea Seed and Variety Service for commercial uses. Here, we present the first draft genome sequence of W. cocos KMCC03342 using a hybrid assembly technique combining both short- and long-read sequences. The genome has a total length of 55.5 Mb comprised of 343 contigs with N50 of 332 kb and 95.8% BUSCO completeness. The GC ratio was 52.2%. We predicted 14,296 protein-coding gene models based on ab initio gene prediction and evidence-based annotation procedure using RNAseq data. The annotated genome was predicted to have 19 terpene biosynthesis gene clusters, which was the same number as the previously sequenced W. cocos strain MD-104 genome but higher than Chinese W. cocos strains. The genome sequence and the predicted gene clusters allow us to study biosynthetic pathways for the active ingredients of W. cocos.
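The assembly statistics quoted in the abstract (N50 and GC ratio) follow from simple definitions, which the sketch below spells out. N50 is taken here as the length of the contig at which the running total of contig lengths, sorted longest first, reaches at least half of the assembly size. The function names and the toy inputs are assumptions for illustration and are not derived from the W. cocos assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum covers half the assembly.
        if running * 2 >= total:
            return length
    return 0


def gc_ratio(seq):
    """Return the fraction of G and C among A/C/G/T bases (0.0 if none)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
    print(gc_ratio("ATGCGC"))
```

Ambiguous bases such as N are left out of the GC denominator in this sketch. Tools differ on that choice, so reported GC values may not match exactly across pipelines.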

Analysis of antibiotic resistance genes in pig feces during the weaning transition using whole metagenome shotgun sequencing

  • Gi Beom Keum;Eun Sol Kim;Jinho Cho;Minho Song;Kwang Kyo Oh;Jae Hyoung Cho;Sheena Kim;Hyeri Kim;Jinok Kwak;Hyunok Doo;Sriniwas Pandey;Hyeun Bum Kim;Ju-Hoon Lee
    • Journal of Animal Science and Technology / Vol. 65, No. 1 / pp. 175-182 / 2023
  • Antibiotics have been used in livestock production not only for treatment but also to increase the effectiveness of animal feed, aid animal growth, and prevent infectious diseases when immunity is lowered by stress. South Korea and the EU are among the countries that have prohibited the use of antibiotics for growth promotion to prevent their indiscriminate use, as previous studies have shown that it may lead to an increase in antibiotic-resistant bacteria. Therefore, this study evaluated the number of antibiotic resistance genes in piglets from pre-weaning to weaning. Fecal samples were collected from 8 piglets just prior to weaning (21 d of age) and again one week after weaning (28 d of age). Total DNA was extracted from 200 mg of the feces collected from the 8 piglets. Whole metagenome shotgun sequencing was carried out on the Illumina HiSeq 2000 platform, and raw sequence data were imported into the Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) pipeline for microbial functional analysis. The results did not show an increase in antibiotic-resistant bacteria, although they confirmed an increase in antibiotic resistance genes as a consequence of changes in diet and environment during the experiment.

Status of Philippine Mango Genomics: Enriching Molecular Genomics Towards a Globally Competitive Philippine Mango Industry

  • Eureka Teresa M. Ocampo;Cris Q. Cortaga;Jhun Laurence S. Rasco;John Albert P. Lachica;Darlon V. Lantican
    • 한국작물학회:학술대회논문집 / 2022 Autumn Conference / p. 28 / 2022
  • This paper presents the first genome assemblies of Philippine mangoes, which provide a valuable reference for varietal improvement and genomic studies on mango and related fruit crops. We sequenced whole genomes of 3 species, Mangifera odorata (Huani), Mangifera altissima (Paho), and Mangifera indica 'Carabao' (Sweet Elena). 'Carabao' is the major export variety of the Philippines; Paho is identified as vulnerable by the IUCN Red List of Threatened Species; Huani has acrid fruit sap, which is its primary defense mechanism against insects and birds. We used Falcon, a diploid-aware de novo assembler, to assemble SMRT-generated long-read sequences. Falcon-unzip was employed to phase the output assembly, producing larger contig sets (primary contigs) and shorter contigs corresponding to haplotypes (haplotigs). Assembly statistics were generated by comparing each assembly to a reference genome, 'Tommy Atkins', using the Quality Assessment Tool (QUAST). Moreover, the extent of duplication and completeness of gene content was measured using Benchmarking Universal Single-Copy Orthologs (BUSCO). Draft assemblies with high duplication were processed using Purge Haplotigs and Purge Dups to lessen duplication with minimal impact on genome completeness. De novo assemblies of Huani, Paho and 'Carabao' were then generated with primary contig sizes of 463.64 Mb, 508.95 Mb and 401.51 Mb, respectively. These draft assemblies showed 96.90%, 95.17% and 99.07% complete BUSCOs, respectively, which is comparable to the 'Tommy Atkins' genome (98.6%). Using two mango transcriptome datasets (pooled RNA-seq from different mango varieties and tissues), 91-96% of reads, or 24-30 million, were successfully mapped back to each generated assembly, indicating a high degree of completeness.
The results demonstrate highly contiguous, phased, and near-complete genome assemblies of three Philippine mango species for structural and functional annotation of gene units, especially those of economic importance.


Transcriptome Analysis Reveals the Putative Polyketide Synthase Gene Involved in Hispidin Biosynthesis in Sanghuangporus sanghuang

  • Jiansheng Wei;Liangyan Liu;Xiaolong Yuan;Dong Wang;Xinyue Wang;Wei Bi;Yan Yang;Yi Wang
    • Mycobiology / Vol. 51, No. 5 / pp. 360-371 / 2023
  • Hispidin is an important styrylpyrone produced by Sanghuangporus sanghuang. To analyze hispidin biosynthesis in S. sanghuang, the transcriptomes of hispidin-producing and non-producing S. sanghuang were determined by Illumina sequencing. Five PKSs were identified using genome annotation. Comparative analysis with the reference transcriptome showed that two PKSs (ShPKS3 and ShPKS4) had low expression levels in four types of media. The gene expression pattern of only ShPKS1 was consistent with the yield variation of hispidin. The combined analyses of gene expression with qPCR and hispidin detection by liquid chromatography-mass spectrometry coupled with ion-trap and time-of-flight technologies (LCMS-IT-TOF) showed that ShPKS1 was involved in hispidin biosynthesis in S. sanghuang. ShPKS1 is a partially reducing PKS gene with extra AMP and ACP domains before the KS domain. The domain architecture of ShPKS1 was AMP-ACP-KS-AT-DH-KR-ACP-ACP. Phylogenetic analysis shows that ShPKS1 and other PKS genes from Hymenochaetaceae form a unique monophyletic clade closely related to the clade containing Agaricales hispidin synthase. Taken together, our data indicate that ShPKS1 is a novel PKS of S. sanghuang involved in hispidin biosynthesis.