• Title/Summary/Keyword: text mining (텍스트마이닝)


Offering system for major article Using Text Mining and Data Mining (텍스트마이닝과 데이터마이닝을 이용한 주요기사 제공 시스템)

  • Song, Sung-Mook;Ryu, Joon-Suk;Kim, Ung-Mo
    • Proceedings of the Korea Information Processing Society Conference, 2009.11a, pp.733-734, 2009
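  • In modern society, the rapid development and spread of the Internet has increased the amount of information available to us, and it is not easy to pick out only the information we need. News articles in particular are unstructured, non-standardized text data. We propose a system that uses text mining to split article headlines into terms and extract them, and then applies association rules from data mining, so that the content of articles can be accessed effectively through the support of frequent itemsets and the associations among terms.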
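The abstract above describes applying association rules to headline terms and using the support of frequent itemsets. The following is a minimal sketch of that idea, not the authors' implementation. The headlines, the `min_support` threshold, and the function names `frequent_itemsets` and `confidence` are all assumptions made for illustration, and only itemsets of one or two terms are counted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical headlines, already split into terms (illustrative data only).
headlines = [
    ["economy", "export", "growth"],
    ["economy", "export", "decline"],
    ["election", "candidate", "debate"],
    ["economy", "growth", "forecast"],
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for 1- and 2-term itemsets meeting min_support.

    Support is the fraction of headlines that contain every term in the itemset.
    """
    n = len(transactions)
    counts = Counter()
    for terms in transactions:
        unique = sorted(set(terms))  # count each term once per headline
        for size in (1, 2):
            for itemset in combinations(unique, size):
                counts[itemset] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def confidence(supports, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    pair = tuple(sorted((antecedent, consequent)))
    return supports[pair] / supports[(antecedent,)]

supports = frequent_itemsets(headlines, min_support=0.5)
rule_confidence = confidence(supports, "export", "economy")
```

With this toy data, the rule "export -> economy" has high confidence because every headline containing "export" also contains "economy". A real system would need a proper tokenizer for the headlines and a full frequent-itemset algorithm such as Apriori.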

A Study on the Method for Extracting the Purpose-Specific Customized Information from Online Product Reviews based on Text Mining (텍스트 마이닝 기반의 온라인 상품 리뷰 추출을 통한 목적별 맞춤화 정보 도출 방법론 연구)

  • Kim, Joo Young;Kim, Dong soo
    • The Journal of Society for e-Business Studies, v.21 no.2, pp.151-161, 2016
  • In the era of Web 2.0, characterized by openness, sharing, and participation, internet users can easily produce and share data. The amount of unstructured data, which makes up most of the digital world's data, has increased exponentially. Personal online product reviews, one kind of unstructured data, are valuable both to the companies that produce the products and to potential customers interested in them. Extracting useful information from large volumes of scattered review data requires a process of collecting, storing, preprocessing, and analyzing the data, and then drawing conclusions. We therefore introduce a text-mining methodology that applies natural language processing technology to text data such as product reviews in order to extract structured data using R. We also introduce data mining to derive purpose-specific customized information from the structured review information obtained through text mining.

Research Trend Analysis on Living Lab Using Text Mining (텍스트 마이닝을 이용한 리빙랩 연구동향 분석)

  • Kim, SeongMook;Kim, YoungJun
    • Journal of Digital Convergence, v.18 no.8, pp.37-48, 2020
  • This study aimed to understand trends in living lab research and to derive implications for its future direction using text mining. The study applied network analysis and topic modelling to keywords and abstracts from a total of 166 theses published between 2011 and November 2019. Centrality analysis showed that living lab studies have focused on keywords such as innovation, society, technology, development, and user. Topic modelling extracted five topics: "regional innovation and user support", "social policy program of government", "smart city platform building", "technology innovation model of company", and "participation in system transformation". Since the foundation of KNoLL in 2017, living lab research subjects have diversified. Quantitative analysis using text mining provides useful results for the development of living lab studies.
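The centrality analysis mentioned in the abstract above can be illustrated on a keyword co-occurrence network. The abstract does not say which centrality measure was used, so the sketch below assumes degree centrality. The keyword lists and the function name `degree_centrality` are hypothetical, and the data is not the study's.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical keyword lists, one per paper (illustrative, not the study's data).
papers = [
    ["innovation", "society", "user"],
    ["innovation", "technology"],
    ["innovation", "development", "user"],
    ["society", "technology"],
]

def degree_centrality(keyword_lists):
    """Degree centrality of each keyword in the co-occurrence network.

    Two keywords are linked if they appear in the same paper. The score is
    the number of distinct neighbours divided by (number of nodes - 1).
    Keywords that never co-occur with another keyword are not included.
    """
    neighbours = defaultdict(set)
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {k: 0.0 for k in neighbours}
    return {k: len(v) / (n - 1) for k, v in neighbours.items()}

centrality = degree_centrality(papers)
```

In this toy network "innovation" co-occurs with every other keyword, so it receives the highest score.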

A Semantic Text Model with Wikipedia-based Concept Space (위키피디어 기반 개념 공간을 가지는 시멘틱 텍스트 모델)

  • Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies, v.19 no.3, pp.107-123, 2014
  • Current text mining techniques suffer from the problem that conventional text representation models cannot express the semantic or conceptual information in documents written in natural language. Conventional text models, which include the vector space model, Boolean model, statistical model, and tensor space model, represent documents as bags of words. These models express documents only with term literals for indexing and frequency-based weights for the corresponding terms; that is, they ignore the semantic, sequential order, and structural information of terms. Most text mining techniques have been developed on the assumption that the given documents are represented with 'bag-of-words' text models. However, the big data era calls for a new text representation paradigm that can analyse huge amounts of textual documents more precisely. Our text model regards the 'concept' as an independent space on a par with the 'term' and 'document' spaces used in the vector space model, and it expresses the relatedness among the three spaces. To develop the concept space, we use Wikipedia articles, each of which defines a single concept. Consequently, a document collection is represented as a third-order tensor with semantic information, and we call the proposed model the text cuboid model. Through experiments on the popular 20NewsGroup document corpus, we demonstrate the superiority of the proposed text model in terms of document clustering and concept clustering.
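The bag-of-words limitation described in the abstract above (loss of sequential order information) can be seen in a small sketch. This only illustrates the conventional model that the paper argues against, not the proposed text cuboid model. The helper name `bag_of_words` and the example sentences are assumptions.

```python
from collections import Counter

def bag_of_words(text):
    """Represent a text as term -> frequency, discarding word order."""
    return Counter(text.lower().split())

# Two sentences with opposite meanings but the same terms.
doc_a = bag_of_words("dog bites man")
doc_b = bag_of_words("man bites dog")

# Under a frequency-only representation the two documents are indistinguishable.
same_representation = doc_a == doc_b
```

Because both sentences map to the same term counts, any method built only on this representation cannot tell them apart.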

Using Text Mining Techniques for Intrusion Detection Problem in Computer Network (텍스트 마이닝 기법을 이용한 컴퓨터 네트워크의 침입 탐지)

  • Oh Seung-Joon;Won Min-Kwon
    • Journal of the Korea Society of Computer and Information, v.10 no.5 s.37, pp.27-32, 2005
  • Recently there has been much interest in applying data mining to computer network intrusion detection. A new approach, based on the k-Nearest Neighbour (kNN) classifier, is used to classify program behaviour as normal or intrusive. Each system call is treated as a word, and the collection of system calls over each program execution is treated as a document. These documents are then classified using the kNN classifier, a popular method in text mining. A simple example illustrates the proposed procedure.
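The procedure in the abstract above, with system calls as words and program executions as documents, can be sketched as a small kNN classifier. This is a minimal illustration under stated assumptions, not the paper's implementation. The traces and labels are made up, raw call counts are used as term weights (the paper may weight terms differently), cosine similarity is assumed as the similarity measure, and `cosine` and `classify` are hypothetical names.

```python
import math
from collections import Counter

# Hypothetical labelled traces: each program execution is a "document"
# whose "words" are system calls (illustrative data only).
training = [
    (["open", "read", "close"], "normal"),
    (["open", "read", "read", "close"], "normal"),
    (["open", "write", "close"], "normal"),
    (["execve", "setuid", "chmod"], "intrusive"),
    (["execve", "setuid", "open", "write"], "intrusive"),
]

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def classify(trace, labelled, k=3):
    """Label a trace by majority vote among its k most similar training traces."""
    query = Counter(trace)
    ranked = sorted(
        labelled,
        key=lambda item: cosine(query, Counter(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

benign_label = classify(["open", "read", "close", "read"], training)
attack_label = classify(["execve", "setuid", "chmod", "write"], training)
```

Choosing an odd `k` avoids tied votes in a two-class setting like this one.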
