• Title/Abstract/Keywords: classification and extraction

1,099 search results (processing time: 0.025 seconds)

Research on Chinese Microblog Sentiment Classification Based on TextCNN-BiLSTM Model

  • Haiqin Tang;Ruirui Zhang
    • Journal of Information Processing Systems / Vol. 19, No. 6 / pp.842-857 / 2023
  • Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without comprehending users' emotional inclinations or grasping moral nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and word2vec for word vectorization. To mitigate local information loss, a TextCNN model without pooling layers is employed for local feature extraction, while BiLSTM is used for contextual feature extraction. Microblog comment sentiments are then categorized by a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%. This represents improvements of 1.21%, 1.25%, and 1.25% in precision, recall, and F1 values, respectively, over the standalone TextCNN model. It also outperforms the standalone BiLSTM model by 0.78%, 0.9%, and 0.9% in precision, recall, and F1 values, respectively.
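
The role of the pooling-free convolution in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `conv1d_tanh`, the toy embeddings, and the kernel values are all assumptions made for illustration. The point it shows is that, without pooling, the convolution output remains a sequence (one value per window), which is what a downstream recurrent layer such as a BiLSTM would consume.

```python
import math

def conv1d_tanh(embeddings, kernel, bias=0.0):
    """Slide a kernel over consecutive word vectors and apply tanh.

    No pooling is applied, so the result is still a sequence with one
    activation per window position (len(embeddings) - len(kernel) + 1).
    """
    k = len(kernel)
    outputs = []
    for start in range(len(embeddings) - k + 1):
        window = embeddings[start:start + k]
        total = bias
        # Element-wise product of the kernel with the current window.
        for kernel_row, word_vector in zip(kernel, window):
            total += sum(w * x for w, x in zip(kernel_row, word_vector))
        outputs.append(math.tanh(total))
    return outputs

# Hypothetical 4-token sentence with 2-dimensional embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical kernel spanning 2 tokens.
kernel = [[1.0, 0.0], [0.0, 1.0]]

features = conv1d_tanh(sentence, kernel)
print(len(features))  # one activation per window: 4 - 2 + 1 = 3
```

A pooled TextCNN would collapse these three activations into a single value per filter; keeping them preserves token order for the contextual layer.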

Framework for Content-Based Image Identification with Standardized Multiview Features

  • Das, Rik;Thepade, Sudeep;Ghosh, Saurav
    • ETRI Journal / Vol. 38, No. 1 / pp.174-184 / 2016
  • Information identification with image data by means of low-level visual features has evolved as a challenging research domain. Conventional text-based mapping of image data has been gradually replaced by content-based techniques of image identification. Feature extraction from image content plays a crucial role in facilitating content-based detection processes. In this paper, the authors have proposed four different techniques for multiview feature extraction from images. The efficiency of extracted feature vectors for content-based image classification and retrieval is evaluated by means of fusion-based and data standardization-based techniques. It is observed that the latter surpasses the former. The proposed methods outclass state-of-the-art techniques for content-based image identification and show an average increase in precision of 17.71% and 22.78% for classification and retrieval, respectively. Three public datasets - Wang; Oliva and Torralba (OT-Scene); and Corel - are used for verification purposes. The research findings are statistically validated by conducting a paired t-test.
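
The abstract does not say which standardization the authors applied. As a sketch only, assuming a per-feature z-score, the following Python shows why standardizing before combining features from different views can matter: features on very different scales end up comparable. The function name `zscore_columns` and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0 instead of dividing by zero.
        [(value - m) / s if s else 0.0 for value, (m, s) in zip(row, stats)]
        for row in rows
    ]

# Hypothetical feature vectors: the second feature is 100x the scale of the first.
features = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
standardized = zscore_columns(features)
for row in standardized:
    print([round(v, 4) for v in row])
```

After standardization both columns carry the same values, so neither dominates a distance-based comparison purely because of its units.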

Automatic extraction of similar poetry for study of literary texts: An experiment on Hindi poetry

  • Prakash, Amit;Singh, Niraj Kumar;Saha, Sujan Kumar
    • ETRI Journal / Vol. 44, No. 3 / pp.413-425 / 2022
  • The study of literary texts is one of the earliest disciplines practiced around the globe. Poetry is artistic writing in which words are carefully chosen and arranged for their meaning, sound, and rhythm. Poetry usually has a broad and profound sense that makes it difficult even for humans to interpret. The essence of poetry is Rasa, which signifies mood or emotion. In this paper, we propose a poetry classification-based approach to automatically extract similar poems from a repository. Specifically, we perform a novel Rasa-based classification of Hindi poetry. For the task, we primarily used lexical features in a bag-of-words model trained using the support vector machine classifier. In the model, we employed Hindi WordNet, Latent Semantic Indexing, and Word2Vec-based neural word embedding. To extract the rich feature vectors, we prepared a repository containing 37,717 poems collected from various sources. We evaluated the performance of the system on a manually constructed dataset containing 945 Hindi poems. Experimental results demonstrated that the proposed model attained satisfactory performance.
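
The bag-of-words representation mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the general technique: the helper names and the placeholder English tokens are assumptions, and the paper's actual tokenization, Hindi WordNet features, and SVM training are not reproduced here.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect the sorted set of whitespace-separated tokens across all documents."""
    return sorted({token for doc in documents for token in doc.split()})

def bow_vector(document, vocabulary):
    """Count how often each vocabulary token occurs in one document."""
    counts = Counter(document.split())
    return [counts[token] for token in vocabulary]

# Placeholder texts standing in for poems (not taken from the paper's dataset).
poems = ["moon night moon", "night rain"]
vocabulary = build_vocabulary(poems)
vectors = [bow_vector(poem, vocabulary) for poem in poems]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Vectors of this kind are what a classifier such as a support vector machine would be trained on, with one class label per poem.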

Efficient Extraction of Hierarchically Structured Rules Using Rough Sets

  • Lee, Chul-Heui;Seo, Seon-Hak
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 4, No. 2 / pp.205-210 / 2004
  • This paper deals with rule extraction from data using rough set theory. We construct the rule base in a hierarchical granulation structure by applying the core as the classification criterion at each level. When more than one core exists, coverage is used to select an appropriate one among them to increase the classification rate and accuracy. In addition, a probabilistic approach is suggested so that the partially useful information contained in inconsistent data can contribute to knowledge reduction, decreasing the effect of the uncertainty or vagueness of the data. As a result, the proposed method yields a more suitable and efficient rule base in terms of compatibility and size. The simulation results show that it performs well despite using very simple rules and short conditionals.
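
For readers unfamiliar with the rough-set notion of a core, here is a small Python sketch using the textbook definition: a condition attribute belongs to the core if dropping it shrinks the positive region of the decision table. The toy table, attribute names, and function names are hypothetical, and the paper's hierarchical construction, coverage-based selection, and probabilistic extension are not implemented.

```python
from collections import defaultdict

def positive_region(rows, attributes, decision):
    """Indices of rows whose equivalence class (on `attributes`) has one decision value."""
    classes = defaultdict(list)
    for index, row in enumerate(rows):
        classes[tuple(row[a] for a in attributes)].append(index)
    region = set()
    for members in classes.values():
        if len({rows[i][decision] for i in members}) == 1:
            region.update(members)
    return region

def core(rows, attributes, decision):
    """Attributes whose removal changes the positive region (indispensable attributes)."""
    full = positive_region(rows, attributes, decision)
    return [
        a for a in attributes
        if positive_region(rows, [b for b in attributes if b != a], decision) != full
    ]

# Hypothetical decision table: conditions a, b, c and decision d.
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
core_attributes = core(table, ["a", "b", "c"], "d")
print(core_attributes)
```

In this toy table only `b` is indispensable: removing `a` or `c` still leaves every row distinguishable, while removing `b` merges rows with conflicting decisions.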

Enhanced CT-image for Covid-19 classification using ResNet 50

  • Lobna M. Abouelmagd;Manal soubhy Ali Elbelkasy
    • International Journal of Computer Science & Network Security / Vol. 24, No. 1 / pp.119-126 / 2024
  • Disease caused by the coronavirus (COVID-19) is sweeping the globe. There are numerous methods for identifying this disease using chest imaging. In this study, Computerized Tomography (CT) chest scans are used to detect COVID-19 with a pretrained Convolutional Neural Network (CNN), ResNet50. The model is based on an image dataset taken from two hospitals and is used to identify COVID-19 cases. The pretrained CNN (ResNet50) architecture was used for feature extraction, and fully connected layers were then used for classification, yielding 97%, 96%, 96%, and 96% for accuracy, precision, recall, and F1-score, respectively. Combining the feature extraction techniques with a Back Propagation Neural Network (BPNN) produced accuracy, precision, recall, and F1-score of 92.5%, 83%, 92%, and 87.3%. In our suggested approach, we use a preprocessing phase to improve accuracy. The image was enhanced using the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm and then cropped before feature extraction with ResNet50. Finally, a fully connected layer was added for classification, with results of 99.1%, 98.7%, 99%, and 98.8% in terms of accuracy, precision, recall, and F1-score.

한국어 음절 인식을 위한 MLP 신경망 구조 및 특징 추출에 관한 연구 (A Study on MLP Neural Network Architecture and Feature Extraction for Korean Syllable Recognition)

  • 금지수;이현수
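    • 대한전자공학회:학술대회논문집 (The Institute of Electronics Engineers of Korea: Conference Proceedings) / Proceedings of the 1999 Fall Conference of the Institute of Electronics Engineers of Korea / pp.672-675 / 1999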
  • In this paper, we propose an MLP neural network architecture and a feature extraction method for Korean syllable recognition. In the proposed syllable recognition system, the onset is first classified by an onset classification neural network, and the results of this network are used for feature selection on the input pattern vectors. The feature extraction of Korean syllables is based on sonority: a threshold rate is used to separate the syllable, and the results of the separation are used as features of the onset, nucleus, and coda. ETRI's SAMDORI was used as the speech DB. The recognition rate is 96% in the speaker-dependent case and 93.3% in the speaker-independent case.

다중센서 영상 기반의 지상 표적 분류 알고리즘 (Ground Target Classification Algorithm based on Multi-Sensor Images)

  • 이은영;구은혜;이희열;조웅호;박길흠
    • 한국멀티미디어학회논문지 (Journal of Korea Multimedia Society) / Vol. 15, No. 2 / pp.195-203 / 2012
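  • This paper proposes a decision-fusion-based ground target classification algorithm and a feature extraction technique using multi-sensor images. To improve the target recognition rate, the results obtained from the individual classifiers are fused by applying a weighted voting method. In addition, to classify targets within the individual sensor images, features robust to scale and rotation changes are extracted using the brightness difference of the CM image obtained from the CCD image, the contour information of the target in the FLIR image, and the width ratio of the vehicle body to the turret. Finally, the performance of the proposed ground target classification algorithm and feature extraction technique is verified through experiments.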
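
Weighted voting, the decision-fusion step named in this abstract, can be sketched as follows in Python. The classifier names, class labels, and weights are hypothetical, and the abstract does not state how the authors derived their weights; this only shows the general mechanism of summing per-classifier weights for each predicted label.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Fuse per-classifier labels by summing each classifier's weight per label.

    Ties are resolved in favor of the label that was encountered first.
    """
    scores = defaultdict(float)
    for classifier, label in predictions.items():
        scores[label] += weights[classifier]
    return max(scores, key=scores.get)

# Hypothetical outputs of three individual classifiers for one target.
predictions = {
    "ccd_brightness": "tank",
    "flir_contour": "truck",
    "flir_width_ratio": "tank",
}
# Hypothetical reliability weights (for example, validation accuracy).
weights = {"ccd_brightness": 0.3, "flir_contour": 0.5, "flir_width_ratio": 0.3}

fused_label = weighted_vote(predictions, weights)
print(fused_label)
```

Here the two lower-weight classifiers together outvote the single higher-weight one, which is the behavior that distinguishes weighted voting from simply trusting the best classifier.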

Wavelet-based feature extraction for automatic defect classification in strands by ultrasonic structural monitoring

  • Rizzo, Piervincenzo;Lanza di Scalea, Francesco
    • Smart Structures and Systems / Vol. 2, No. 3 / pp.253-274 / 2006
  • The structural monitoring of multi-wire strands is of importance to prestressed concrete structures and cable-stayed or suspension bridges. This paper addresses the monitoring of strands by ultrasonic guided waves with emphasis on the signal processing and automatic defect classification. The detection of notch-like defects in the strands is based on the reflections of guided waves that are excited and detected by magnetostrictive ultrasonic transducers. The Discrete Wavelet Transform was used to extract damage-sensitive features from the detected signals and to construct a multi-dimensional Damage Index vector. The Damage Index vector was then fed to an Artificial Neural Network to provide the automatic classification of (a) the size of the notch and (b) the location of the notch from the receiving sensor. Following an optimization study of the network, it was determined that five damage-sensitive features provided the best defect classification performance with an overall success rate of 90.8%. It was thus demonstrated that the wavelet-based multidimensional analysis can provide excellent classification performance for notch-type defects in strands.
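
To make the idea of wavelet-based features concrete, the Python sketch below performs one level of a Haar Discrete Wavelet Transform and derives two simple features from the detail coefficients. The choice of the Haar wavelet, the feature definitions (energy and peak magnitude), and the sample signal are assumptions for illustration; the abstract does not specify the wavelet or the five features the authors selected.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; the signal length must be even."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail

def energy(coefficients):
    """Sum of squared coefficients."""
    return sum(c * c for c in coefficients)

# Hypothetical sampled waveform (not measured data).
signal = [4.0, 2.0, 6.0, 6.0]
approx, detail = haar_dwt(signal)

# Hypothetical feature vector built from the detail band.
feature_vector = [energy(detail), max(abs(c) for c in detail)]
print([round(v, 4) for v in feature_vector])
```

Because the transform is orthonormal, the energies of the approximation and detail bands add up to the energy of the original signal, which makes band energies convenient inputs for a classifier.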

A Novel Model for Smart Breast Cancer Detection in Thermogram Images

  • Kazerouni, Iman Abaspur;Zadeh, Hossein Ghayoumi;Haddadnia, Javad
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 24 / pp.10573-10576 / 2015
  • Background: Accuracy in feature extraction is an important factor in image classification and retrieval. In this paper, a breast tissue density classification and image retrieval model is introduced for breast cancer detection based on thermographic images. The new method of thermographic image analysis for automated detection of high tumor risk areas, which uses a two-directional two-dimensional principal component analysis technique for feature extraction and a support vector machine for thermographic image retrieval, was tested on 400 images. The sensitivity and specificity of the model are 100% and 98%, respectively.
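
Sensitivity and specificity, the two figures reported in this abstract, follow the standard definitions sketched below in Python. The confusion counts are invented for illustration and are not the paper's data; the abstract does not give the breakdown of the 400 images.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion counts.

    sensitivity = TP / (TP + FN): share of actual positives that were detected.
    specificity = TN / (TN + FP): share of actual negatives that were cleared.
    """
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts for illustration only.
sensitivity, specificity = sensitivity_specificity(tp=45, fn=5, tn=90, fp=10)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

A sensitivity of 100% therefore means no positive case was missed, while a specificity of 98% means 2% of negative cases were flagged incorrectly.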

기술용어 분산표현을 활용한 특허문헌 분류에 관한 연구 (A Study on Patent Literature Classification Using Distributed Representation of Technical Terms)

  • 최윤수;최성필
    • 한국문헌정보학회지 (Journal of the Korean Society for Library and Information Science) / Vol. 53, No. 2 / pp.179-199 / 2019
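  • The purpose of this study is to examine various feature extraction methods together with machine learning and deep learning models in order to find the most suitable methodology for patent literature classification, and to analyze through experiments which methodology provides the best performance. For feature extraction, the traditional BoW method was compared experimentally with word embedding vectors, a distributed representation approach; for building the document collection, approaches using morphological analysis and multigrams were compared. Classification performance was also verified using traditional machine learning models and deep learning models. The experiments showed that classification performance was best when a deep learning model was applied on top of feature extraction based on distributed representations and morphological analysis, outperforming traditional machine learning methods by 5.71%, 18.84%, and 21.53% in the section, class, and subclass classification experiments, respectively.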