• Title/Summary/Keyword: Text features

Exploring an Optimal Feature Selection Method for Effective Opinion Mining Tasks

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of the Korea Society of Computer and Information / v.24 no.2 / pp.171-177 / 2019
  • This paper aims to find the most effective feature selection method for opinion mining tasks. Opinion mining belongs to sentiment analysis, which categorizes the opinions expressed in online texts as positive or negative from a text mining point of view. Using a dataset of five product groups (apparel, books, DVDs, electronics, and kitchen), TF-IDF and Bag-of-Words (BOW) are calculated to form the product review feature sets. Next, we applied the feature selection methods to see which one yields the most robust results. The results show that the stacking classifier built on the features chosen by the Information Gain feature selection method yields the best result.
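
The weighting and selection steps named in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the toy reviews, the labels, and the function names `tf_idf` and `information_gain` are all hypothetical, the TF-IDF variant (relative term frequency times log inverse document frequency) is one common choice among several, and the stacking classifier is not shown.

```python
import math
from collections import Counter

# Hypothetical toy data: tokenized reviews with sentiment labels (not from the paper).
docs = [
    ["great", "fit", "love"],
    ["poor", "fit", "broke"],
    ["great", "sound"],
    ["poor", "sound", "broke"],
]
labels = ["pos", "neg", "pos", "neg"]


def tf_idf(documents):
    """Return one {term: weight} dict per document (relative TF x log IDF)."""
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


def entropy(class_labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(class_labels)
    if total == 0:
        return 0.0
    counts = Counter(class_labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(documents, class_labels, term):
    """Entropy reduction from splitting documents on presence of `term`."""
    present = [y for d, y in zip(documents, class_labels) if term in d]
    absent = [y for d, y in zip(documents, class_labels) if term not in d]
    n = len(class_labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(class_labels) - conditional


weights = tf_idf(docs)
vocab = sorted({term for doc in docs for term in doc})
# Rank terms by information gain; the top-ranked terms would be kept as features.
ranked = sorted(
    ((term, information_gain(docs, labels, term)) for term in vocab),
    key=lambda pair: pair[1],
    reverse=True,
)
```

In this toy set, terms that appear only in one class (such as "great") separate the labels perfectly and rank at the top, while a term spread evenly across both classes (such as "fit") carries no information gain and would be dropped.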

Visualizing the phenotype diversity: a case study of Alexander disease

  • Dohi, Eisuke;Bangash, Ali Haider
    • Genomics & Informatics / v.19 no.3 / pp.28.1-28.4 / 2021
  • Since only a small number of patients have a rare disease, it is difficult to identify all of the features of these diseases. This is especially true for patients who present with a rare disease in an uncommon way. It can also be difficult for patients, their families, and even clinicians to know which one of a number of disease phenotypes the patient is exhibiting. To address this issue, during Biomedical Linked Annotation Hackathon 7 (BLAH7), we tried to extract Alexander disease patient data from documents in Portable Document Format. We then visualized the phenotypic diversity of those Alexander disease patients with uncommon presentations. This led us to identify several issues that we need to overcome in our future work.

Teaching The Adventures of Wu Han of Korea in Secondary Education (중등 영문학 교재로서의 『한국인 우한의 모험』 연구)

  • Om, Donghee
    • American Studies / v.43 no.2 / pp.1-25 / 2020
  • This paper examines the benefits of teaching The Adventures of Wu Han of Korea in secondary education in Korea. The novel is a rare sample of twentieth-century American fiction that features a Korean protagonist. What is notable in this novel is that its major Korean characters seem to share the mindset of their American author and creator and represent the Western perspective in their discourse on Korean/Eastern ideas and culture. The novel is packed with Orientalist attitudes and could be taught as a case study of Orientalism. Teachers can also use the novel to teach students the art of close reading by analyzing selected scenes from the text.

Analysis on Trends of No-Code Machine Learning Tools

  • Yo-Seob, Lee;Phil-Joo, Moon
    • International Journal of Advanced Culture Technology / v.10 no.4 / pp.412-419 / 2022
  • The amount of digital text data is growing exponentially, and many machine learning solutions are being used to monitor and manage this data. Artificial intelligence and machine learning are used in many areas of our daily lives, but the underlying processes and concepts are not easy for most people to understand. At a time when many experts are needed to run a machine learning solution, no-code machine learning tools are a good alternative. A no-code machine learning tool is a platform that enables machine learning functions to be performed without engineers or developers. The latest no-code machine learning tools run in the browser, so no additional software needs to be installed, and their simple GUI makes them easy to use. Using these platforms can save a lot of money and time because they require less skill and less code. No-code machine learning tools also make artificial intelligence and machine learning easier to understand. In this paper, we examine no-code machine learning tools and compare their features.

Sentence model based subword embeddings for a dialog system

  • Chung, Euisok;Kim, Hyun Woo;Song, Hwa Jeon
    • ETRI Journal / v.44 no.4 / pp.599-612 / 2022
  • This study focuses on improving a word embedding model to enhance the performance of downstream tasks, such as those of dialog systems. To improve traditional word embedding models, such as skip-gram, it is critical to refine the word features and expand the context model. In this paper, we approach the word model from the perspective of subword embedding and attempt to extend the context model by integrating various sentence models. Our proposed sentence model is a subword-based skip-thought model that integrates self-attention and relative position encoding techniques. We also propose a clustering-based dialog model for downstream task verification and evaluate its relationship with the sentence-model-based subword embedding technique. The proposed subword embedding method produces better results than previous methods in evaluating word and sentence similarity. In addition, the downstream task verification, a clustering-based dialog system, demonstrates an improvement of up to 4.86% over the results of FastText in previous research.

Machine Learning Based Blog Text Opinion Classification System Using Opinion Word Centered-Dependency Tree Pattern Features (의견어중심의 의존트리패턴자질을 이용한 기계학습기반 한국어 블로그 문서 의견분류시스템)

  • Kwak, Dong-Min;Lee, Seung-Wook
    • Annual Conference of KIPS / 2009.11a / pp.337-338 / 2009
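  • Research on opinion polarity classification of blog documents has mainly relied on machine learning methods, and the features used have mainly been part-of-speech information, such as nouns and verbs, and opinion-word lexical information. However, if only a single opinion word is considered, the information needed to determine its polarity may be insufficient, which can lead to inaccurate results. In this paper, we hypothesize that opinion classification can be performed more accurately when multiple words are considered together. To extract effective opinion-word features, we extracted dependency tree patterns through dependency parsing, centered on opinion words that are likely to carry an opinion, applied the proposed PF-IDF weighting, and ran comparative experiments with the support vector machine (SVM) and multinomial naive Bayes (MNNB) algorithms. Compared with the baseline system, which uses TF-IDF weighting, accuracy improved by 5% with the SVM and by 8.9% with multinomial naive Bayes.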

Odometry Using Strong Features of Recognized Text (인식된 문자의 강한 특징점을 활용하는 측위시스템)

  • Song, Do-hoon;Park, Jong-il
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.219-222 / 2021
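  • In this paper, we use optical character recognition (OCR) in a visual-inertial odometry (VIO) system to locate text regions, and we store their positions and feature points so that they can be compared when the same text is recognized again by the odometry system. First, we propose a method that detects text in the video from a camera moving in real time and uses the camera's relative position to store the position at which the text was recognized together with its feature points. We also propose a method for determining whether stored text has been re-recognized when it is detected again. Because this method uses whatever text is present in the scene rather than artificial markers or pre-trained objects, it can be used in any general space where text exists.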

Analyzing User Feedback on a Fan Community Platform 'Weverse': A Text Mining Approach

  • Thi Thao Van Ho;Mi Jin Noh;Yu Na Lee;Yang Sok Kim
    • Smart Media Journal / v.13 no.6 / pp.62-71 / 2024
  • This study applies topic modeling to uncover user experience and app issues expressed in users' online reviews of a fan community platform, Weverse, on the Google Play Store. This allows us to identify the features that need to be improved to enhance user experience and those that should be maintained and leveraged to attract more users. To this end, we collect 88,068 first-level English online reviews of Weverse on the Google Play Store with the Google-Play-Scraper tool. After the initial preprocessing step, a dataset of 31,861 online reviews is analyzed using Latent Dirichlet Allocation (LDA) topic modeling with the Gensim library in Python. Five topics are explored in this study, highlighting significant issues such as network connection errors, delayed notifications, and incorrect translations. In addition, the results reveal the app's effectiveness in fostering not only interaction between fans and artists but also relationships among fans. Consequently, the business can strengthen user engagement and loyalty by addressing the identified drawbacks and leveraging the platform for user communication.

Retrieving Semantic Image Using Shape Descriptors and Latent-dynamic Conditional Random Fields

  • Mahmoud Elmezain;Hani M. Ibrahem
    • International Journal of Computer Science & Network Security / v.24 no.10 / pp.197-205 / 2024
  • This paper introduces a new approach to semantic image retrieval that uses dispersion and moments as shape descriptors in conjunction with the discriminative model of Latent-dynamic Conditional Random Fields (LDCRFs). The target region is first localized via a background subtraction model. In the second stage, the dispersion and moment features are passed to a k-means procedure to extract the object's features. After that, the learning process is carried out by LDCRFs. Finally, a SPARQL query on the input text or image retrieves semantic images through the sequential processes of the Query Engine, Matching Module, and Ontology Manager. Experimental findings show that our approach can successfully retrieve images on the mammals benchmark with a rate of 98.11. These outcomes are likely to compare very favorably with those reported in the literature by other researchers.

Architectural Features of Naedeok-dong Cathedral, Cheongju Diocese under the Jurisdiction of Maryknoll Missioners (메리놀회 관할 청주교구 내덕동 주교좌성당의 건축적 특징)

  • Kim, Myungsun;Lee, Jeong-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.259-268 / 2020
  • Eighteen Catholic churches, built in the Chungbuk area (Cheongju diocese) under the jurisdiction of the American Maryknoll missioners in 1953-1969, are not constrained by specific architectural styles, unlike those built by other foreign Catholic missionary organizations. The same is true of Naedeok-dong cathedral in Cheongju, which is the highest-ranking and representative church of the diocese. Nevertheless, it has unique architectural features that distinguish it from other churches in the diocese. This study examined what those features were, how they were embodied, and their origins. This study also shows that the features are common in the missioners' churches in Pyeogyang diocese in 1923-1942 and that Father James V. Pardy and the architect Tae-Bong Park played a bridging role in carrying the same features over from the Pyeogyang diocese to the Cheongju diocese. In conclusion, this study summarizes the significance of Naedeok-dong cathedral in relation to the missioners' ideology, in the history of the churches in 1923-1969, and in Korean modern Catholic church architecture. To this end, a literature search that mainly utilized primary sources, such as newly discovered architectural drawings, photographs, and text related to the cathedral, was performed.