• Title/Summary/Keyword: Bert

Search Result 390, Processing Time 0.021 seconds

Evaluation of Similarity Analysis of Newspaper Article Using Natural Language Processing

  • Ayako Ohshiro;Takeo Okazaki;Takashi Kano;Shinichiro Ueda
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.6
    • /
    • pp.1-7
    • /
    • 2024
  • Comparing text features involves evaluating the "similarity" between texts. It is crucial to use appropriate similarity measures when comparing similarities. This study utilized various techniques to assess the similarities between newspaper articles, including deep learning and a previously proposed method: a combination of Pointwise Mutual Information (PMI) and Word Pair Matching (WPM), denoted as PMI+WPM. For performance comparison, law data from medical research in Japan were utilized as validation data in evaluating the PMI+WPM method. The distribution of similarities in text data varies depending on the evaluation technique and genre, as revealed by the comparative analysis. For newspaper data, non-deep learning methods demonstrated better similarity evaluation accuracy than deep learning methods. Additionally, evaluating similarities in law data is more challenging than in newspaper articles. Despite deep learning being the prevalent method for evaluating textual similarities, this study demonstrates that non-deep learning methods can be effective regarding Japanese-based texts.
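
The abstract above names Pointwise Mutual Information (PMI) as one component of the PMI+WPM method. As a minimal sketch of the PMI part only, assuming document-level co-occurrence and a toy corpus, the quantity PMI(a, b) = log2(P(a, b) / (P(a) P(b))) can be computed as follows. The function name `pmi_table` and the sample documents are hypothetical, and the Word Pair Matching step and the authors' actual preprocessing are not reproduced here.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return PMI (log base 2) for every word pair that co-occurs in a document.

    Probabilities are estimated as document frequencies, which is an
    assumption of this sketch, not a detail taken from the paper.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        vocab = sorted(set(doc))  # count each word once per document
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))  # pairs in sorted order

    pmi = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs
        p_a = word_counts[a] / n_docs
        p_b = word_counts[b] / n_docs
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Hypothetical tokenized documents, for illustration only.
docs = [
    ["court", "ruling", "appeal"],
    ["court", "ruling", "sentence"],
    ["market", "stocks", "rally"],
    ["market", "stocks", "court"],
]
table = pmi_table(docs)
```

Pairs that always appear together, such as `("market", "stocks")` here, receive a positive score, while pairs that co-occur less often than chance would predict receive a negative one.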

A Study on Intelligent Document Processing Management using Unstructured Data (비정형 데이터를 활용한 지능형 문서 처리 관리에 관한 연구)

  • Kyoung Hoon Park;Kwang-Kyu Seo
    • Journal of the Semiconductor & Display Technology
    • /
    • v.23 no.2
    • /
    • pp.71-75
    • /
    • 2024
  • This research focuses on using text mining techniques to efficiently process unstructured data containing various formulas, in the context of processing and managing the terms and rules of domestic insurance documents. Through parsing and compilation technology, document context, content, constants, and variables are automatically separated, and errors are verified following the order of the document and its logic to improve document accuracy. Through document debugging technology, errors in the document are identified in real time. Furthermore, it is necessary to predict the changes that intelligent document processing will bring to document management work, in particular the impact on documents and utilization tasks that are managed in duplicate because of the various formulas, and to prepare the capabilities that will be needed in the future.

Term of Penalty Prediction using ChatGPT (ChatGPT 를 이용한 형사사건 양형 예측 연구)

  • Minhan Cho;Jinyoung Han
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.784-785
    • /
    • 2024
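  • Sentence prediction is one of the most actively studied areas of legal artificial intelligence, and it can help raise non-experts' trust in the judiciary and ease the workload of legal professionals. This study proposes an approach that applies ChatGPT to sentencing prediction in criminal cases: by retrieving prior precedents similar to the input facts, it reduces the training time and cost of the model required for sentence prediction. The model achieved a weighted F1-score of 0.53, comparable to that of a fine-tuned BERT model.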
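
The abstract above reports a weighted F1-score. As a minimal sketch of how that metric is usually defined (per-class F1 averaged with weights proportional to each class's share of the true labels), the following pure-Python function may help. The function name `weighted_f1` and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * support[label] / total
    return score


# Hypothetical sentence-length classes, for illustration only.
y_true = ["short", "short", "short", "long"]
y_pred = ["short", "short", "long", "long"]
score = weighted_f1(y_true, y_pred)
```

Because the weights follow class frequency, frequent classes dominate the score, which matters when sentence-length categories are imbalanced.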

Analysis of the feasibility of using title-id indexing in a news recommendation system (뉴스 추천 시스템에서의 제목 인덱싱의 활용 가능성 분석)

  • Jun-Pyo Kim;Tae-Ho Kim;Sang-Wook Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.680-682
    • /
    • 2024
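  • News recommendation systems studied to date generally recommend personalized news to users based on textual information such as the news title, news body, and category information. Specifically, they rely on a task-specific architecture that generates embedding vectors representing news items from their text and uses them to recommend personalized news. Previous studies used language models such as BERT when generating the news embedding vectors within the task-specific architecture, in order to better reflect the textual information. In contrast to this existing structure, this study proposes an approach that uses news title indexing so that the language model can be fully exploited across the entire news recommendation system.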

The Research on Emotion Recognition through Multimodal Feature Combination (멀티모달 특징 결합을 통한 감정인식 연구)

  • Sung-Sik Kim;Jin-Hwan Yang;Hyuk-Soon Choi;Jun-Heok Go;Nammee Moon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.739-740
    • /
    • 2024
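  • This study proposes a new multimodal training method that improves emotion classification accuracy by effectively combining data from two modalities, speech and text. To this end, feature vectors extracted from speech data using HuBERT and MFCC (Mel-Frequency Cepstral Coefficients) are combined with feature vectors extracted from text data using RoBERTa to classify emotions. In experiments, the proposed multimodal model achieved an F1-score of 92.30, outperforming unimodal approaches.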

Simulation Techniques for Mid-Frequency Vibro-Acoustics Virtual Tools For Real Problems

  • Desmet, Wim;Pluymers, Bert;Atak, Onur;Bergen, Bart;Deckers, Elke;Huijssen, Koos;Van Genechten, Bert;Vergote, Karel;Vandepitte, Dirk
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2010.05a
    • /
    • pp.49-49
    • /
    • 2010
  • The most commonly used numerical modelling techniques for acoustics and vibration are based on element based techniques, such as the finite element and boundary element method. Due to the huge computational efforts involved, the use of these deterministic techniques is practically restricted to low-frequency applications. For high-frequency modelling, probabilistic techniques such as SEA are well established. However, there is still a wide mid-frequency range, for which no adequate and mature prediction techniques are available. In this frequency range, the computational efforts of conventional element based techniques become prohibitively large, while the basic assumptions of the probabilistic techniques are not yet valid. In recent years, a vast amount of research has been initiated in a quest for an adequate solution for the current mid-frequency problem. One family of research methods focuses on novel deterministic approaches with an enhanced convergence rate and computational efficiency compared to the conventional element based methods in order to shift the practical frequency limitation towards the mid-frequency range. Amongst those techniques, a wave based prediction technique using an indirect Trefftz approach is being developed at the K.U.Leuven - Noise and Vibration Research group. This paper starts with an outline of the major features of the mid-frequency modelling challenge and provides a short overview of the current research activities in response to this challenge. Next, the basic concepts of the wave based technique and its hybrid coupling with finite element schemes are described. Various validations on two- and three-dimensional acoustic, elastic, poro-elastic and vibro-acoustic examples are given to illustrate the potential of the method and its beneficial performance as compared to conventional element based methods. A closing part shares some views on the open issues and future research directions.

Effects of Gamma-ray and Chemical Mutagens on the Germination and Seedling Growth in Stevia rebaudiana Bert. (감마선 및 화학적 돌연변이원 처리가 스테비아 (Stevia rebaudiana Bert.)의 종자 발아 및 초기 생장에 미치는 영향)

  • Yoon, Tai-Young;Kim, Ee-Youb;Kim, Young-Ho;Choi, Gin-Su;Hyun, Kyung-Sup;Seong, Yoon-Hee;Jo, Han-Jig;Kim, Dong Sub;Kang, Si-Yong;Ko, Jeong-Ae
    • Journal of Radiation Industry
    • /
    • v.6 no.2
    • /
    • pp.189-197
    • /
    • 2012
  • This study was carried out to develop useful mutants with improved yield or composition in stevia plants using gamma ray or chemical mutagen treatments. Seeds of stevia 'Suwon No. 11' were irradiated with up to 400 Gy of gamma rays. Chemical mutagens were applied to seeds of 'Suwon No. 11' using 0.07% colchicine, 10 mM sodium azide, or 10 mM NMU for various durations. The germination rate and the shoot and root growth of seedlings were estimated 30 days after gamma ray irradiation or chemical mutagen treatment, and the plant height, the number of branches, and leaf length and width were examined 3 months after the mutagenesis treatments. In the gamma ray treatments, the germination rate and early-stage growth decreased with increasing radiation dose, and the 50% lethal dose was found to be 200 Gy. The plant height decreased with increasing radiation dose, while the number of branches per plant and the leaf length increased. Leaf shape became relatively longer than in the control, which was more apparent in treatments above 150 Gy. In the chemical mutagen treatments, the rates of germination and survival decreased with increasing incubation time. The 50% lethal dose for germination rate corresponded to 15 hours of incubation in 0.07% colchicine, 4 hours in 10 mM sodium azide, and 2 hours in 10 mM NMU. Chemical mutagens had no influence on shoot growth, while root growth increased, especially as the incubation time was extended. The highest root growth occurred in the NMU treatment at 6 hours of incubation. The plant height decreased with increasing incubation time in the chemical mutagen treatments. Among the chemical mutagens, NMU was the most effective at inducing mutants with long-shaped or the least lobed leaves.

A Study on Classification of Mobile Application Reviews Using Deep Learning (딥러닝을 활용한 모바일 어플리케이션 리뷰 분류에 관한 연구)

  • Son, Jae Ik;Noh, Mi Jin;Rahman, Tazizur;Pyo, Gyujin;Han, Mumoungcho;Kim, Yang Sok
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.76-83
    • /
    • 2021
  • As the development and use of smart devices such as smartphones and tablets increase, the mobile application market based on mobile devices is growing rapidly. Mobile application users write reviews to share their experience of using an application. These reviews reveal consumers' various needs and give application developers useful feedback for improving the application. However, measures are needed to minimize the time and expense of manually analyzing the large volume of reviews that consumers leave. In this work, we propose to collect delivery application user reviews from Google PlayStore and then use machine learning and deep learning techniques to classify them into four categories: application feature advantages, disadvantages, feature improvement requests, and bug reports. Hugging Face's pretrained BERT-based Transformer model achieved F1 scores of 0.93, 0.51, 0.76, and 0.83 for the four categories, respectively, outperforming LSTM and GRU.

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies
    • /
    • v.27 no.1
    • /
    • pp.15-28
    • /
    • 2022
  • Due to the recent expansion of online markets such as clothing, utilizing customer reviews has become a major marketing measure. User reviews have been used as a tool for analyzing customer sentiment. Sentiment analysis can be broadly classified into machine learning-based and lexicon-based methods. The machine learning-based method learns a classification model from reviews and labels. As sentiment analysis research has developed, multimodal models trained on image and video data in reviews have been studied. The characteristics of words in reviews differ depending on product and customer categories. In this paper, sentiment is analyzed by considering review data together with metadata of products and users. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Self Attention-based Multi-head Attention models, and Bidirectional Encoder Representations from Transformers (BERT) are used in this study. The same Multi-Layer Perceptron (MLP) model is applied to all product information. This paper suggests a multimodal sentiment analysis model that simultaneously considers user reviews and product meta-information.

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Dong-Kyu, Kim;So Hwa, Lee;Jae Hwan, Bong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.6
    • /
    • pp.1137-1144
    • /
    • 2022
  • In this study, an artificial intelligence (AI) was developed to help users practice facial expressions for expressing emotions. The developed AI used multimodal inputs consisting of sentences and facial images for deep neural networks (DNNs). The DNNs calculated similarities between the emotions predicted from the sentences and the emotions predicted from the facial images. The user practiced facial expressions based on the situation given by the sentences, and the AI provided the user with numerical feedback based on the similarity between the emotion predicted from the sentence and the emotion predicted from the facial expression. The ResNet34 structure was trained on the FER2013 public data to predict emotions from facial images. To predict emotions in sentences, a KoBERT model was trained in a transfer learning manner using the conversational speech dataset for emotion classification made public by AIHub. The DNN that predicts emotions from facial images demonstrated 65% accuracy, which is comparable to human emotion classification ability. The DNN that predicts emotions from sentences achieved 90% accuracy. The performance of the developed AI was evaluated through experiments with changing facial expressions in which an ordinary person participated.
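
The abstract above describes numerical feedback based on the similarity between two predicted emotions but does not state which similarity measure was used. As one possible illustration, assuming each network outputs a probability vector over the same emotion classes, cosine similarity could be scaled to a 0 to 100 score. The measure, the class list, the probability values, and the scaling are all assumptions of this sketch rather than details from the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical emotion classes and model outputs, for illustration only.
EMOTIONS = ["angry", "happy", "sad", "neutral"]
text_probs = [0.05, 0.80, 0.05, 0.10]  # emotion distribution predicted from the sentence
face_probs = [0.10, 0.60, 0.10, 0.20]  # emotion distribution predicted from the face image

# Scale to a 0-100 score that could be shown to the user as feedback.
feedback = round(100 * cosine_similarity(text_probs, face_probs), 1)
```

A score near 100 would indicate that the facial expression conveys roughly the same emotion distribution as the sentence; other measures, such as a divergence between the two distributions, would fit the same description.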