• Title/Abstract/Keywords: source text


PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central / Vol. 3, No. 2 / pp. 7.1-7.6 / 2011
  • Background: Published manuscripts are the main source of biological knowledge. Because manual examination is almost impossible given the huge volume of literature (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because they i) provide abstract-based rather than sentence-based search, ii) use ontology terms improperly or not at all, iii) are designed for specific subjects, or iv) respond too slowly for web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations of these. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The current version of PubMine includes the HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy, and is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine will serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexible design can accommodate general ontologies.
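
The sentence-based search that the abstract contrasts with abstract-based search can be illustrated with a small sketch. This is not PubMine's implementation: the two term sets below are hypothetical stand-ins for HUGO gene symbols and MeSH disease terms, and the sentence splitter is deliberately naive.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical mini-dictionaries (assumptions), standing in for
# HUGO gene symbols and MeSH disease terms.
GENES = {"BRCA1", "TP53"}
DISEASES = {"breast cancer", "leukemia"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def cooccurrences(abstract):
    """Count gene-disease pairs that appear together in the same sentence."""
    counts = Counter()
    for sentence in split_sentences(abstract):
        lowered = sentence.lower()
        # Gene symbols are matched case-sensitively on word boundaries.
        genes = {g for g in GENES if re.search(rf"\b{re.escape(g)}\b", sentence)}
        # Disease terms are matched case-insensitively as substrings.
        diseases = {d for d in DISEASES if d in lowered}
        for pair in product(sorted(genes), sorted(diseases)):
            counts[pair] += 1
    return counts

abstract = (
    "BRCA1 mutations are linked to breast cancer. "
    "TP53 is frequently altered in leukemia and breast cancer. "
    "BRCA1 was also studied."
)
result = cooccurrences(abstract)
```

Counting at the sentence level means that BRCA1 and leukemia, which appear in the same abstract but never in the same sentence, are not reported as related.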

은퇴노인의 도서관 이용 경험에 관한 내러티브 탐구 (A Narrative Inquiry on the Retired Elderly Person's Library Use Experience)

  • 이호신
    • 정보관리학회지 / Vol. 36, No. 1 / pp. 215-246 / 2019
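  • This study uses the narrative inquiry method proposed by Clandinin and Connelly to explore retired elderly people's experiences of using libraries. It aims to identify, concretely and in depth, the changes that using the library as a space has brought to their lives, and to examine what those changes mean to them. To this end, three retired elderly people who use public libraries in Seoul were selected as research participants and interviewed, and field texts were composed from the interviews. Based on the field texts, the participants' stories were reconstructed as research texts in the form of a novel, an essay, and a letter. Their library use experiences were interpreted, respectively, as a base for a regular daily life; a treasure trove for fun, vitality, and new dreams; and a source of comfort for enduring old age. A shared orientation toward a healthy life through reading was found. The results are expected to be useful for broadening the understanding of elderly public library users and to serve as basic data for improving services.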

빅데이터 분석을 위한 비용효과적 오픈 소스 시스템 설계 (Designing Cost Effective Open Source System for Bigdata Analysis)

  • 이종화;이현규
    • 지식경영연구 / Vol. 19, No. 1 / pp. 119-132 / 2018
  • Many advanced products and services are emerging in the market thanks to data-based technologies such as the Internet of Things (IoT), big data, and AI. A data processing system for an IoT network environment is not simple to configure, and the high cost of building a high-performance server environment imposes many restrictions. In this paper, we therefore design a low-cost, practical development environment for a large-scale data analysis computing platform using open source software. Specifically, this study implements a big data processing system using Raspberry Pi, an ultra-small PC environment, and open source APIs. The system comprises building a portable server system, building a web server for web mining, developing Python IDE classes for crawling, and developing R libraries for NLP and visualization. Through this research, we develop a web environment that can control real-time collection and analysis of web media data in a mobile environment, and present it as a curriculum for non-IT specialists.

복수의 이미지를 합성하여 사용하는 캡차의 안전성 검증 (On the Security of Image-based CAPTCHA using Multi-image Composition)

  • 변제성;강전일;양대헌;이경희
    • 정보보호학회논문지 / Vol. 22, No. 4 / pp. 761-770 / 2012
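  • CAPTCHAs, a means of telling computers and humans apart, are widely used to block automated bots that carry out attacks such as advertising, spam mail, and DDoS. Early schemes mainly distorted an image of rendered characters so that computers would have difficulty recognizing them, but several studies have shown that such methods can be easily defeated by artificial intelligence or image processing techniques. For this reason, image-based CAPTCHAs attracted attention as an alternative to text-based CAPTCHAs, and various forms of image-based CAPTCHA were proposed. However, providing higher security than text-based CAPTCHAs required a large number of source images. Accordingly, 강전일 et al. (2008) proposed an image-based CAPTCHA that uses a small image database. User experiments showed this CAPTCHA to be more convenient for users than the text-based CAPTCHAs in wide use today, but its security has not yet been verified. In this paper, we verify the security of the CAPTCHA proposed by 강전일 et al. (2008), which composites multiple images, by actually attacking it.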

텍스트 마이닝을 통한 키워드 추출과 머신러닝 기반의 오픈소스 소프트웨어 주제 분류 (Keyword Extraction through Text Mining and Open Source Software Category Classification based on Machine Learning Algorithms)

  • 이예슬;백승찬;조용준;신동명
    • 한국소프트웨어감정평가학회 논문지 / Vol. 14, No. 2 / pp. 1-9 / 2018
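  • The proportion of users and companies using open source continues to grow, and the open source software market is expanding rapidly in Korea as well as abroad. Despite this continued growth, however, there has been almost no research on classifying open source software by topic, and software classification schemes have not been specified in detail. At present, users enter or tag topics by hand, which leads to misclassification and inconvenience. Moreover, research on open source software classification can serve as a foundation for work on open source software evaluation, recommendation, and filtering. This study therefore proposes a technique for classifying open source software using machine learning models, and presents a performance comparison across the models.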

"금궤요략"과 "상한론(傷寒論)"의 상사조문(相似條文)에 대한 분석(分析) (An analysis on the analogous text of Shanghanlun and Jinguiyaolue)

  • 염용하;하기태;현동환;윤상주;김준기;최달영
    • 동국한의학연구소논문집 / Vol. 9 / pp. 155-163 / 2000
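  • "Shanghanlun (傷寒論)" and "Jinguiyaolue" are both works by Zhongjing (仲景) and have been regarded as the ancestors of medical formularies (醫方之祖), but the relationship between the two books has long been debated. The analogous passages (相似條文) of "Jinguiyaolue" and "Shanghanlun (傷寒論)" account for as much as 10.8% and 11% of the respective books, and analysis of each passage showed that passages with high homology make up 63.9% of the total, which indicates that the two books derive from the same source. Therefore, recognizing these analogous passages should come first when trying to understand the relationship between "Shanghanlun (傷寒論)" and "Jinguiyaolue".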

텍스트 마이닝과 기계 학습을 이용한 국내 가짜뉴스 예측 (Fake News Detection for Korean News Using Text Mining and Machine Learning Techniques)

  • 윤태욱;안현철
    • Journal of Information Technology Applications and Management / Vol. 25, No. 1 / pp. 19-32 / 2018
  • Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, because of the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried over the past years to develop automated fake news detection methods using artificial intelligence techniques. Unfortunately, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained on the values produced in the first step. Machine learning techniques such as multiple discriminant analysis, case-based reasoning, artificial neural networks, and support vector machines can be applied as the classifiers. To validate the effectiveness of the proposed method, we collected 200 Korean news items from Seoul National University's FactCheck (http://factcheck.snu.ac.kr), which provides detailed analysis reports from about 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important and which classifiers are effective in detecting Korean fake news.
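
The first step described above, turning news text into quantified values, can be sketched with a plain TF-IDF computation. The abstract does not give the authors' exact weighting or tokenization, so the formula (term frequency times log of inverse document frequency) and the toy English headlines below are assumptions made for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Toy, made-up headlines; real input would be tokenized Korean news text.
docs = [
    "shocking secret cure revealed".split(),
    "government announces budget plan".split(),
    "shocking budget secret".split(),
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents, such as "cure", receive a higher weight than terms shared across documents, such as "shocking". In the second step, vectors like these would be passed to a classifier, which is not shown here.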

A Practical Application of "Writing" Hypertext Literature in the English Education of the Elementary School

  • Oh, Sei-Chan
    • 영어어문교육 / Vol. 11, No. 2 / pp. 19-34 / 2005
  • Hypertext calls into question general assumptions behind our conventional conceptions of education. In this essay, three kinds of learning models are presented by applying the "writing" of hypertext literature to English education in the elementary school. These models, which I call the "scene-centered" system, give knowledge to learners in a non-linear, non-sequential structure. A "scene" is a single concept or idea composed of a single sub-text, which is to be made by a group of students. This system focuses on collaborative composition by students. By generating sub-texts and connecting texts, students perform educational activities that expand the source text. The "scene-centered" system is, to use Barthes's term, a "writerly text." But "writing" must be accompanied by "reading," so this system is a learning model in which writing and reading are carried on simultaneously. Throughout the process, students play the role of multi-users with three access rights: read, write, and annotate. Students making use of hypertext systems will thus act as reader-authors, and teachers will take on a new role in the collaborative writing environment. No longer the central authoritarian evaluators, they will become consultants, co-writers, and coaches of their students.

Hot Topic Discovery across Social Networks Based on Improved LDA Model

  • Liu, Chang;Hu, RuiLin
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 11 / pp. 3935-3949 / 2021
  • With the rapid development of Internet and big data technology, various online social network platforms have been established, producing massive amounts of information every day. Hot topic discovery aims to dig out meaningful content that users are commonly concerned about from the massive information on the Internet. Most existing hot topic discovery methods focus on a single network data source; they can hardly grasp hot spots as a whole, nor meet the challenges of text sparsity and topic hotness evaluation in cross-network scenarios. This paper proposes a novel hot topic discovery method across social networks based on an improved LDA model, which first integrates the text information from multiple social network platforms into a unified data set, then obtains the latent topic distribution in the text through the improved LDA model. Finally, it adopts a heat evaluation method based on the word frequency of topic label words and takes the latent topic with the highest heat value as the hot topic. This paper obtains data from online social networks and constructs a cross-network topic discovery data set. The experimental results demonstrate the superiority of the proposed method over baseline methods.
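
The final step of the pipeline, heat evaluation over topic label words, can be sketched as follows. The abstract does not state the heat formula, so summing the merged-corpus frequency of each topic's label words is an assumption, as are the platform names, the toy posts, and the label words, which in the paper would come from the improved LDA model.

```python
from collections import Counter

def hottest_topic(platform_posts, topic_labels):
    """Pick the topic whose label words are most frequent in the merged corpus.

    platform_posts: {platform name: list of tokenized posts}
    topic_labels:   {topic id: list of label words}, assumed to be the
                    output of a topic model that is not shown here
    """
    # Merge every platform into one unified word-frequency table.
    freq = Counter()
    for posts in platform_posts.values():
        for tokens in posts:
            freq.update(tokens)
    # Assumed heat formula: summed frequency of a topic's label words.
    heat = {
        topic: sum(freq[word] for word in words)
        for topic, words in topic_labels.items()
    }
    return max(heat, key=heat.get), heat

# Hypothetical data for two platforms and two topics.
platform_posts = {
    "platform_a": [["flood", "rescue", "city"], ["flood", "warning"]],
    "platform_b": [["match", "final", "goal"], ["flood", "relief", "city"]],
}
topic_labels = {
    "weather": ["flood", "warning", "rescue"],
    "sports": ["match", "goal", "final"],
}
best, heat = hottest_topic(platform_posts, topic_labels)
```

Because frequencies are counted after the platforms are merged, a topic discussed on both platforms accumulates heat from each of them.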

Towards a Student-centred Approach to Translation Teaching

  • Almanna, Ali;Lazim, Hashim
    • 비교문화연구 / Vol. 36 / pp. 241-270 / 2014
  • The aim of this article is to review the traditional methodologies of teaching translation that concentrate on text typologies and, as an alternative, to propose an eclectic multi-componential approach that involves a set of interdisciplinary skills with a view to improving trainee translators' competences and skills. To this end, three approaches, namely a minimalist approach, a pre-transferring adjustment approach, and a revision vs. editing approach, are proposed to shift the focus of attention from teacher-centred approaches towards student-centred approaches. It is shown that translator training programmes need to focus on improving trainee translators' competences and skills, for example by training them to produce different versions and select among them with justified confidence as quickly as they can (minimalist approach), to adjust the original text semantically, syntactically, and/or textually so that the source text fits flexibly into the linguistic system of the target language (pre-transferring adjustment), and to revise and edit others' translations. As the validity of the proposed approach relies partially on instructors' competences and skills in teaching translation, universities, particularly in the Arab world, need to invest in recruiting expert practitioners instead of depending mainly on bilingual teachers to teach translation.