• Title/Summary/Keyword: intelligence embedding

Search results: 80

Opera Clustering: K-means on librettos datasets

  • Jeong, Harim;Yoo, Joo Hun
    • Journal of Internet Computing and Services, v.23 no.2, pp.45-52, 2022
  • With the development of artificial intelligence analysis methods, especially machine learning, many fields are widely expanding their range of application. In the case of classical music, however, some difficulties remain in applying machine learning techniques: genre classification and music recommendation systems built on deep learning algorithms are actively used for popular music, but not for classical music. In this paper, we attempt to classify operas within classical music. To this end, an experiment was conducted to determine which criterion is most suitable among composer, period of composition, and emotional atmosphere, the basic features of a piece of music. To generate emotion labels, we adopted zero-shot classification with four basic emotions: 'happiness', 'sadness', 'anger', and 'fear'. After embedding each opera libretto with a doc2vec model, the optimal number of clusters was computed with the elbow method. The resulting four centroids were then used in k-means clustering to group the unlabeled libretto dataset, and the clustering was validated with adjusted Rand index scores. We then compared the clusters with the annotated musical variables. As a result, it was confirmed that the four clusters produced by the trained model were most similar to the grouping by period of composition, while the emotional similarity between composer and period was not significant. Knowing that period is the right criterion, we hope this makes it easier for music listeners to find music that suits their tastes.
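  • To make the described pipeline concrete, the following is a minimal sketch of doc2vec embedding, elbow-based selection of k, k-means clustering, and adjusted-Rand-index comparison, using gensim and scikit-learn. It is not the authors' code; the toy librettos, period labels, and hyperparameters are assumptions, and the zero-shot emotion labeling step is omitted.

```python
# Minimal sketch of the libretto clustering pipeline described above.
# Assumptions: `librettos` is a list of libretto strings and `period_labels`
# holds the notated period of each opera (neither is from the paper's data).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

librettos = ["love and betrayal in the count's palace ...",
             "the hero departs for war and never returns ..."]
period_labels = ["classical", "romantic"]

# 1) Embed each libretto with doc2vec.
docs = [TaggedDocument(text.lower().split(), [i]) for i, text in enumerate(librettos)]
d2v = Doc2Vec(docs, vector_size=100, min_count=1, epochs=40)
X = [d2v.dv[i] for i in range(len(docs))]

# 2) Elbow method: inspect inertia for a range of k and pick the knee.
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(2, min(9, len(X) + 1))}
print(inertias)

# 3) Cluster with the chosen k (the paper settles on k = 4).
k = 4 if len(X) >= 4 else 2
clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

# 4) Compare clusters against a notated variable (e.g., period) with ARI.
print(adjusted_rand_score(period_labels, clusters))
```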

Fine-tuning BERT Models for Keyphrase Extraction in Scientific Articles

  • Lim, Yeonsoo;Seo, Deokjin;Jung, Yuchul
    • Journal of Advanced Information Technology and Convergence, v.10 no.1, pp.45-56, 2020
  • Despite extensive research, performance enhancement of keyphrase (KP) extraction remains a challenging problem in modern informatics. Recently, deep learning-based supervised approaches have exhibited state-of-the-art accuracies on this problem, and several of the previously proposed methods utilize Bidirectional Encoder Representations from Transformers (BERT)-based language models. However, few studies have investigated the effective application of BERT-based fine-tuning techniques to KP extraction. In this paper, we consider the problem in the context of scientific articles by investigating the fine-tuning characteristics of two distinct BERT models: BERT (the base BERT model released by Google) and SciBERT (a BERT model pre-trained on scientific text). Three datasets from the computer science domain (WWW, KDD, and Inspec) are used to compare the results obtained by fine-tuning BERT and SciBERT for KP extraction.
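  • As a rough illustration of the compared setups, the sketch below loads BERT and SciBERT checkpoints for token classification, treating KP extraction as BIO tagging; this formulation, the label set, and the example sentence are assumptions rather than the paper's exact configuration.

```python
# Hedged sketch: fine-tuning BERT vs. SciBERT for keyphrase extraction,
# framed as BIO token classification (an assumption, not necessarily the
# exact formulation used in the paper).
from transformers import AutoTokenizer, AutoModelForTokenClassification

LABELS = ["O", "B-KP", "I-KP"]  # assumed tag set

for checkpoint in ["bert-base-uncased", "allenai/scibert_scivocab_uncased"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForTokenClassification.from_pretrained(
        checkpoint, num_labels=len(LABELS))

    # Tokenize one example sentence; in practice the WWW/KDD/Inspec datasets
    # would be tokenized, aligned with BIO tags, and fine-tuned with a Trainer.
    enc = tokenizer("We study keyphrase extraction with BERT.",
                    return_tensors="pt")
    logits = model(**enc).logits          # (1, seq_len, num_labels)
    print(checkpoint, logits.shape)
```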

Claim-Evidence Pair Extraction Model using Hierarchical Label Embedding (계층적 레이블 임베딩을 이용한 주장-증거 쌍 추출 모델)

  • Yujin Sim;Damrin Kim;Tae-il Kim;Sung-won Choi;Harksoo Kim
    • Annual Conference on Human and Language Technology, 2023.10a, pp.474-478, 2023
  • Argument mining is a subfield of natural language processing that identifies, analyzes, and extracts argumentative structures and their components from unstructured text. Claim-evidence pair extraction, a subtask of argument mining, automatically extracts pairs of claims and supporting evidence from a given document. For effective claim-evidence pair extraction, this paper proposes a hierarchical LAN (label attention network) method that exploits document-level context and reflects the dependency between claims and evidence. Experiments show that the dependent structure, in which the two subtasks use each other's information, outperforms an independent structure, and the final proposed model improves performance by 13.5% in terms of Macro F1.
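  • The abstract only names the key component, so the following sketch illustrates the generic label-attention idea, in which token representations attend over learned label embeddings; the shapes, names, and the hierarchical wiring hinted at in the comment are illustrative assumptions, not the authors' model.

```python
# Illustrative sketch of label attention over learned label embeddings
# (the generic LAN idea); not the paper's hierarchical model.
import torch
import torch.nn as nn

class LabelAttention(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, hidden_size)

    def forward(self, token_reprs: torch.Tensor) -> torch.Tensor:
        # token_reprs: (batch, seq_len, hidden)
        # Attention scores between each token and each label embedding.
        scores = token_reprs @ self.label_emb.weight.T          # (B, L, num_labels)
        attn = scores.softmax(dim=-1)
        # Label-aware token representations (weighted sum of label embeddings).
        label_aware = attn @ self.label_emb.weight              # (B, L, hidden)
        return torch.cat([token_reprs, label_aware], dim=-1)    # (B, L, 2*hidden)

# A hierarchical variant could feed claim-level label-aware representations
# into the evidence extractor so the two subtasks share information.
tokens = torch.randn(2, 16, 256)
print(LabelAttention(256, num_labels=3)(tokens).shape)  # torch.Size([2, 16, 512])
```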


Burmese Sentiment Analysis Based on Transfer Learning

  • Mao, Cunli;Man, Zhibo;Yu, Zhengtao;Wu, Xia;Liang, Haoyuan
    • Journal of Information Processing Systems, v.18 no.4, pp.535-548, 2022
  • Using a rich-resource language to classify sentiment in a language with few resources is a popular subject of research in natural language processing. Burmese is a low-resource language. In light of the scarcity of labeled training data for sentiment classification in Burmese, we propose a transfer learning method for Burmese sentiment analysis that transfers sentiment features from English. The method generates a cross-language word-embedding representation of the Burmese vocabulary to map Burmese text into the semantic space of English text. A model for classifying English sentiment is then pre-trained using a convolutional neural network and an attention mechanism, and the parameters of its network layers, which capture cross-language sentiment features, are transferred to the model that classifies Burmese sentiment. Finally, the model is fine-tuned using the labeled Burmese data. Experimental results show that the proposed method significantly improves the classification of sentiment in Burmese compared to a model trained using only a Burmese corpus.
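  • The transfer scheme can be approximated as in the PyTorch sketch below: a CNN sentiment classifier whose convolution and classifier layers are pre-trained on English and copied into a Burmese model that shares the cross-lingual embedding space, before fine-tuning on labeled Burmese data. The attention mechanism is omitted, and all dimensions are assumptions.

```python
# Hedged sketch of cross-lingual transfer for sentiment classification:
# pre-train a CNN on English, then reuse its convolution/classifier weights
# for Burmese text mapped into the same (cross-lingual) embedding space.
import torch
import torch.nn as nn

class CNNSentiment(nn.Module):
    def __init__(self, embedding: nn.Embedding, num_classes: int = 2):
        super().__init__()
        self.embedding = embedding            # language-specific table
        self.conv = nn.Conv1d(embedding.embedding_dim, 128, kernel_size=3, padding=1)
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embedding(token_ids).transpose(1, 2)    # (B, dim, seq)
        h = torch.relu(self.conv(x)).max(dim=2).values   # global max pooling
        return self.classifier(h)

dim = 300
en_model = CNNSentiment(nn.Embedding(50_000, dim))   # pre-trained on English (assumed)
my_model = CNNSentiment(nn.Embedding(30_000, dim))   # Burmese vocab in the shared space

# Transfer the shared layers; only the embedding table stays language-specific.
my_model.conv.load_state_dict(en_model.conv.state_dict())
my_model.classifier.load_state_dict(en_model.classifier.state_dict())
# ...then fine-tune my_model on the small labeled Burmese corpus.
print(my_model(torch.randint(0, 30_000, (4, 50))).shape)  # torch.Size([4, 2])
```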

Korean Sentiment Analysis Using Natural Network: Based on IKEA Review Data

  • Sim, YuJeong;Yun, Dai Yeol;Hwang, Chi-gon;Moon, Seok-Jae
    • International Journal of Internet, Broadcasting and Communication, v.13 no.2, pp.173-178, 2021
  • In this paper, we identify a suitable methodology for Korean sentiment analysis through a comparative experiment that determines which combination of embedding method and neural network model learns with the highest accuracy and the fastest speed. For the embedding, we compare a plain word embedding layer with Word2Vec. For the model, we compare representative neural network models (CNN, RNN, LSTM, GRU, Bi-LSTM, and Bi-GRU) on IKEA review data. Experiments show that Word2Vec with Bi-GRU achieved the highest accuracy and the second fastest speed, at 94.23% accuracy in 42.30 seconds, while Word2Vec with GRU achieved the third highest accuracy and the fastest speed, at 92.53% accuracy in 26.75 seconds.
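  • A minimal sketch of the best-reported combination, Word2Vec vectors feeding a Bi-GRU classifier, written with gensim and PyTorch; the toy reviews, vocabulary handling, and hyperparameters are assumptions, not the paper's setup.

```python
# Hedged sketch: Word2Vec embeddings + Bi-GRU sentiment classifier,
# roughly the best combination reported above (details are assumed).
import torch
import torch.nn as nn
from gensim.models import Word2Vec

reviews = [["assembly", "was", "easy"], ["the", "desk", "arrived", "broken"]]

w2v = Word2Vec(reviews, vector_size=100, min_count=1, epochs=20)
vocab = {w: i for i, w in enumerate(w2v.wv.index_to_key)}
emb_matrix = torch.tensor(w2v.wv.vectors)

class BiGRUClassifier(nn.Module):
    def __init__(self, emb_matrix: torch.Tensor):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(emb_matrix, freeze=False)
        self.gru = nn.GRU(emb_matrix.size(1), 64, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * 64, 2)   # positive / negative

    def forward(self, ids):
        _, h = self.gru(self.emb(ids))                   # h: (2, B, 64)
        return self.out(torch.cat([h[0], h[1]], dim=-1))

ids = torch.tensor([[vocab[w] for w in reviews[0]] + [0]])  # toy padding
print(BiGRUClassifier(emb_matrix)(ids).shape)               # torch.Size([1, 2])
```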

Knowledge Graph Embedding Methods for Political Stance Prediction: Performance Evaluation (뉴스 기사의 정치적 성향 판단을 위한 지식 그래프 임베딩 기법의 효과 분석)

  • Seongeun Ryu;Yunyong Ko;Sang-Wook Kim
    • Annual Conference of KIPS, 2023.05a, pp.519-521, 2023
  • The growth of online news platforms has intensified the echo chamber effect and political polarization, and research that predicts the political stance of news articles is needed as a step toward mitigating them. Existing studies use an external knowledge graph to enrich the textual representation of a news article. However, there are many knowledge graph embedding (KGE) methods for embedding such external knowledge, and the effect of each KGE method on the accuracy of political stance prediction has not been studied sufficiently. In this paper, we analyze the effectiveness of various KGE methods in order to maximize the benefit of external knowledge for political stance prediction. Experimental results confirm that using ModE, which can represent complex relations between entities in the external knowledge graph simply and accurately, is the most effective for political stance prediction.
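  • For reference, the sketch below shows a ModE-style knowledge graph embedding scorer, which (following the modulus component of the HAKE model) scores a triple roughly as -||h ∘ r - t||; it is an illustrative snippet, not the KGE implementations evaluated in the paper.

```python
# Illustrative sketch of a ModE-style knowledge graph embedding scorer
# (score ≈ -||h ∘ r - t||, following the modulus part of HAKE); the actual
# KGE implementations compared in the paper are not reproduced here.
import torch
import torch.nn as nn

class ModE(nn.Module):
    def __init__(self, num_entities: int, num_relations: int, dim: int = 200):
        super().__init__()
        self.ent = nn.Embedding(num_entities, dim)
        self.rel = nn.Embedding(num_relations, dim)

    def score(self, h, r, t):
        # Higher score = more plausible (head, relation, tail) triple.
        return -torch.norm(self.ent(h) * self.rel(r) - self.ent(t), p=2, dim=-1)

kge = ModE(num_entities=1000, num_relations=50)
h, r, t = torch.tensor([3]), torch.tensor([7]), torch.tensor([42])
print(kge.score(h, r, t))  # entity vectors from such a model can enrich article text
```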

English-Korean speech translation corpus (EnKoST-C): Construction procedure and evaluation results

  • Jeong-Uk Bang;Joon-Gyu Maeng;Jun Park;Seung Yun;Sang-Hun Kim
    • ETRI Journal, v.45 no.1, pp.18-27, 2023
  • We present an English-Korean speech translation corpus, named EnKoST-C. End-to-end model training for speech translation tasks often suffers from a lack of parallel data, that is, speech data in the source language paired with equivalent text data in the target language. Most available public speech translation corpora were developed for European languages, and there is currently no public corpus for English-Korean end-to-end speech translation. Thus, we created EnKoST-C centered on TED Talks. In this process, we enhanced the sentence alignment approach using subtitle time information and bilingual sentence embedding information. As a result, we built a 559-hour English-Korean speech translation corpus. The proposed sentence alignment approach achieved an excellent f-measure of 0.96. We also report the baseline performance of an English-Korean speech translation model trained with EnKoST-C. EnKoST-C is freely available on a Korean government open data hub site.
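  • The kind of alignment scoring described above can be sketched by combining bilingual sentence-embedding similarity (here via the publicly available LaBSE model in sentence-transformers) with subtitle time overlap; the weighting and the example pair are assumptions, not the authors' procedure.

```python
# Hedged sketch: score candidate English-Korean subtitle pairs by combining
# bilingual sentence-embedding similarity with subtitle time overlap.
# The 0.7/0.3 weighting is an assumption, not the paper's setting.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")

def time_overlap(a, b):
    # a, b: (start_sec, end_sec) of the English and Korean subtitles.
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def alignment_score(en_text, ko_text, en_span, ko_span, w=0.7):
    emb = model.encode([en_text, ko_text], convert_to_tensor=True)
    sim = util.cos_sim(emb[0], emb[1]).item()
    return w * sim + (1 - w) * time_overlap(en_span, ko_span)

print(alignment_score("Thank you so much.", "정말 감사합니다.", (10.0, 12.0), (10.2, 12.5)))
```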

STL-Attention based Traffic Prediction with Seasonality Embedding (계절성 임베딩을 고려한 STL-Attention 기반 트래픽 예측)

  • Yeom, Sungwoong;Choi, Chulwoong;Kolekar, Shivani Sanjay;Kim, Kyungbaek
    • Annual Conference of KIPS, 2021.11a, pp.95-98, 2021
  • Network traffic prediction is applied in various areas such as abnormal network activity detection and network service provisioning. To overcome the performance degradation caused by missing traffic due to network communication problems and by the nonlinear characteristics arising from irregular user activity, research on deep neural networks has become active. Among them, time-series deep neural networks show low error rates when predicting short-term network traffic volume, but they suffer from limitations such as vanishing and exploding gradients, nonlinearity, multiple seasonality, and long-term dependencies. In this paper, we propose an attention-based traffic prediction method that incorporates seasonality embedding. The proposed method uses the trend, seasonality, and residual components obtained by STL decomposition to embed daily and weekly seasonality, and predicts future traffic with an attention network built on these embeddings.
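  • To make the pipeline concrete, the sketch below decomposes a synthetic traffic series with STL (statsmodels) and builds hour-of-day and day-of-week seasonality embeddings that feed a small attention layer; the architecture and all parameters are illustrative assumptions, not the proposed STL-Attention model.

```python
# Hedged sketch: STL decomposition of a traffic series plus hour-of-day and
# day-of-week seasonality embeddings feeding a small attention layer.
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from statsmodels.tsa.seasonal import STL

# Hourly synthetic traffic with daily seasonality (stand-in for real data).
idx = pd.date_range("2021-01-01", periods=24 * 28, freq="h")
traffic = pd.Series(100 + 20 * np.sin(2 * np.pi * idx.hour / 24)
                    + np.random.randn(len(idx)), index=idx)

stl = STL(traffic, period=24).fit()      # trend / seasonal / resid components
features = torch.tensor(
    np.stack([stl.trend, stl.seasonal, stl.resid], axis=1), dtype=torch.float32)

# Seasonality embeddings for hour-of-day and day-of-week.
hour_emb = nn.Embedding(24, 8)(torch.as_tensor(idx.hour.values, dtype=torch.long))
dow_emb = nn.Embedding(7, 8)(torch.as_tensor(idx.dayofweek.values, dtype=torch.long))
x = torch.cat([features, hour_emb, dow_emb], dim=1).unsqueeze(0)  # (1, T, 19)

attn = nn.MultiheadAttention(embed_dim=19, num_heads=1, batch_first=True)
out, _ = attn(x, x, x)                   # attended sequence; a prediction head
print(out.shape)                         # would forecast future traffic from it
```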

Facial Manipulation Detection with Transformer-based Discriminative Features Learning Vision (트랜스포머 기반 판별 특징 학습 비전을 통한 얼굴 조작 감지)

  • Van-Nhan Tran;Minsu Kim;Philjoo Choi;Suk-Hwan Lee;Hoanh-Su Le;Ki-Ryong Kwon
    • Annual Conference of KIPS, 2023.11a, pp.540-542, 2023
  • Due to the serious issues posed by facial manipulation technologies, many researchers are becoming increasingly interested in the identification of face forgeries. The majority of existing face forgery detection methods leverage the powerful data adaptation ability of neural networks to derive distinguishing features. These deep learning-based detection methods frequently treat the detection of fake faces as a binary classification problem and employ a softmax loss to supervise CNN training. However, the features learned with a softmax loss alone are insufficiently discriminative. To overcome these limitations, in this study, we introduce a novel discriminative feature learning method based on the Vision Transformer architecture. Additionally, a separation-center loss is designed to compress the intra-class variation of original faces while enhancing the inter-class differences in the embedding space.
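  • The separation-center loss is only named in the abstract; the sketch below shows one plausible form in PyTorch, combining center-loss-style intra-class compaction with a margin that pushes class centers apart. It is an assumption, not the authors' exact formulation.

```python
# Hedged sketch of a "separation-center"-style loss: pull embeddings toward
# their class center while pushing the two class centers apart by a margin.
import torch
import torch.nn as nn

class SeparationCenterLoss(nn.Module):
    def __init__(self, num_classes: int = 2, dim: int = 768, margin: float = 1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, dim))
        self.margin = margin

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Intra-class compactness: distance to the assigned class center.
        intra = (features - self.centers[labels]).pow(2).sum(dim=1).mean()
        # Inter-class separation: penalize centers closer than the margin.
        center_dist = torch.cdist(self.centers, self.centers)
        mask = ~torch.eye(len(self.centers), dtype=torch.bool)
        inter = torch.relu(self.margin - center_dist[mask]).mean()
        return intra + inter

# Example with ViT-sized embeddings (dim = 768 is an assumption).
loss = SeparationCenterLoss()(torch.randn(8, 768), torch.randint(0, 2, (8,)))
print(loss)
```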

User Sentiment Analysis on Amazon Fashion Product Review Using Word Embedding (워드 임베딩을 이용한 아마존 패션 상품 리뷰의 사용자 감성 분석)

  • Lee, Dong-yub;Jo, Jae-Choon;Lim, Heui-Seok
    • Journal of the Korea Convergence Society, v.8 no.4, pp.1-8, 2017
  • In modern society, the size of the fashion market is continuously increasing both overseas and domestically. When purchasing a product through e-commerce, the evaluation data created by other consumers affects a consumer's decision to purchase the product. By analyzing consumers' evaluations of a product, a company can reflect consumer opinion, which can positively affect its performance. In this paper, we propose a method for building a model that analyzes user sentiment using a word embedding space formed by learning Amazon fashion product review data. Experiments were conducted by training three SVM classifiers, varying the number of positive and negative reviews, on a word embedding space learned from 5.7 million Amazon reviews. The experiments showed the highest accuracy, 88.0%, when the SVM classifier was trained with 50,000 positive and 50,000 negative reviews.
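  • The classifier setup can be sketched roughly as follows: represent each review by averaging its word vectors in the learned embedding space and train an SVM on the result; the toy data, vector size, and SVM variant are assumptions, not the paper's configuration.

```python
# Hedged sketch: average Word2Vec vectors per review, then train an SVM
# sentiment classifier; the toy data and hyperparameters are assumptions.
import numpy as np
from gensim.models import Word2Vec
from sklearn.svm import LinearSVC

reviews = [("great fit and fast shipping", 1),
           ("the fabric ripped after one wash", 0),
           ("love this dress", 1),
           ("poor quality, would not buy again", 0)]

tokenized = [text.split() for text, _ in reviews]
labels = [label for _, label in reviews]

# Word embedding space (the paper learns it from ~5.7 million Amazon reviews).
w2v = Word2Vec(tokenized, vector_size=50, min_count=1, epochs=50)

def review_vector(tokens):
    # Average the vectors of in-vocabulary words in the review.
    return np.mean([w2v.wv[t] for t in tokens if t in w2v.wv], axis=0)

X = np.stack([review_vector(t) for t in tokenized])
clf = LinearSVC().fit(X, labels)
print(clf.predict([review_vector("fast shipping and great quality".split())]))
```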