• Title/Summary/Keyword: classification and extraction


Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong
    • Journal of Information Processing Systems
    • /
    • v.16 no.2
    • /
    • pp.329-342
    • /
    • 2020
  • In recent years, emotional text classification has become one of the essential research topics in natural language processing. It has been widely used in the sentiment analysis of reviews of commodities such as hotels and of other commentary corpora. This paper proposes an improved W-LDA (weighted latent Dirichlet allocation) topic model to address the shortcomings of traditional LDA topic models. In the Gibbs sampling of topic words and the calculation of the expected word distribution in the W-LDA topic model, an average weighted value is adopted to prevent topic-related words from being submerged by high-frequency words and to improve the distinction between topics. The method further integrates the support vector machine classification algorithm, based on the extracted high-quality document-topic distribution and topic-word vectors. Finally, an efficient integrated method is constructed for the analysis and extraction of emotional words, topic distribution calculations, and sentiment classification. Tests on real teaching evaluation data and on a test set from a public comment set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.

Malware Classification using Dynamic Analysis with Deep Learning

  • Asad Amin;Muhammad Nauman Durrani;Nadeem Kafi;Fahad Samad;Abdul Aziz
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.8
    • /
    • pp.49-62
    • /
    • 2023
  • The creation and alteration of new malware samples has increased rapidly, which is a huge financial risk for many organizations. There is strong demand for improvement in the classification and detection mechanisms available today: some older strategies, such as classification using machine learning algorithms, proved useful but do not perform well in scalable automatic feature extraction scenarios. To overcome this, a mechanism is needed that analyzes malware automatically on the basis of an automatic feature extraction process. For this purpose, dynamic analysis of real malware executable files was performed to extract useful features such as API call sequences and opcode sequences. Different hashing techniques were analyzed for converting these features into an image-representable form, which allows more advanced deep learning approaches to classify large numbers of images. Deep learning algorithms such as convolutional neural networks (CNNs) enable the classification of malware once it has been converted into images. These images, converted to grayscale and fed into the CNN, perform comparatively well under dynamic changes in malware code, because such changes alter the image samples by only a few pixels. In this work, we used the VGG-16 CNN architecture for experimentation.
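
As a rough illustration of the sequence-to-image step described in this abstract, the sketch below hashes each API call name to one grayscale pixel and arranges the pixels in rows. The abstract does not say which hashing techniques or image layout the authors used, so the SHA-256 choice, the one-pixel-per-token mapping, the `tokens_to_grayscale` helper, and the sample call names are all assumptions made for illustration.

```python
import hashlib


def tokens_to_grayscale(tokens, width=8):
    """Map each token to one grayscale pixel (0-255) and split into rows.

    Hypothetical scheme: the first byte of the token's SHA-256 digest is
    used as the pixel intensity. The paper's actual hashing is not given.
    """
    pixels = [hashlib.sha256(t.encode("utf-8")).digest()[0] for t in tokens]
    # Pad the last row with black pixels so every row has `width` entries.
    pad = (-len(pixels)) % width
    pixels.extend([0] * pad)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Hypothetical API call trace from a dynamic analysis run.
calls = ["CreateFileW", "WriteFile", "CloseHandle", "RegOpenKeyExW", "CreateFileW"]
image = tokens_to_grayscale(calls, width=4)
```

Because identical calls always hash to the same intensity, a small change in the trace changes only a few pixels, which is the property the abstract relies on. The resulting rows could then be resized and passed to a CNN.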

A Study on Optimal Shape-Size Index Extraction for Classification of High Resolution Satellite Imagery (고해상도 영상의 분류결과 개선을 위한 최적의 Shape-Size Index 추출에 관한 연구)

  • Han, You-Kyung;Kim, Hye-Jin;Choi, Jae-Wan;Kim, Yong-Il
    • Korean Journal of Remote Sensing
    • /
    • v.25 no.2
    • /
    • pp.145-154
    • /
    • 2009
  • High spatial resolution satellite image classification is limited when only the spectral information is used, due to the complex spatial arrangement of features and the spectral heterogeneity within each class. Therefore, the extraction of spatial information is one of the most important steps in high resolution satellite image classification. This study proposes a new spatial feature extraction method, named SSI (Shape-Size Index). SSI uses a simple region-growing based image segmentation and allocates a spatial property value to each segment. The extracted feature is integrated with the spectral bands to improve overall classification accuracy. The classification is achieved by applying an SVM (support vector machine) classifier. To evaluate the proposed feature extraction method, KOMPSAT-2 and QuickBird-2 data are used for experiments. It is demonstrated that the proposed SSI algorithm leads to a notable increase in classification accuracy.

Estimation of Classification Error Based on the Bhattacharyya Distance for Data with Multimodal Distribution (Multimodal 분포 데이터를 위한 Bhattacharyya distance 기반 분류 에러예측 기법)

  • 최의선;이철희
    • Proceedings of the IEEK Conference
    • /
    • 2000.06d
    • /
    • pp.85-87
    • /
    • 2000
  • In pattern classification, the Bhattacharyya distance has been used as a class separability measure and provides useful information for feature selection and extraction. In this paper, we propose a method to predict the classification error for multimodal data based on the Bhattacharyya distance. In our approach, we first approximate the pdf of the multimodal distribution with a Gaussian mixture model and then find the Bhattacharyya distance and the classification error. Experimental results showed that there is a strong relationship between the Bhattacharyya distance and the classification error for multimodal data.

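
To make the link between the Bhattacharyya distance and classification error concrete, here is a minimal sketch for the simplest case only: two classes, each modeled by a single univariate Gaussian. It uses the standard closed form of the distance and the standard Bhattacharyya upper bound on the Bayes error. It does not reproduce the paper's Gaussian-mixture procedure for multimodal data, and the function names are illustrative.

```python
import math


def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    # Term driven by the mismatch of the variances.
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def error_upper_bound(distance, prior1=0.5, prior2=0.5):
    """Bhattacharyya upper bound on the two-class Bayes error."""
    return math.sqrt(prior1 * prior2) * math.exp(-distance)


# Two unit-variance classes whose means are 2 apart.
b = bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.0)
bound = error_upper_bound(b)
```

A larger distance gives a smaller bound, which is why the distance can serve as a class separability measure.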

Terrain Cover Classification Technique Based on Support Vector Machine (Support Vector Machine 기반 지형분류 기법)

  • Sung, Gi-Yeul;Park, Joon-Sung;Lyou, Joon
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.45 no.6
    • /
    • pp.55-59
    • /
    • 2008
  • For effective mobility control of a UGV (unmanned ground vehicle), terrain cover classification is an important component, as are terrain geometry recognition and obstacle detection. The vision-based terrain cover classification algorithm consists of pre-processing, feature extraction, classification, and post-processing. In this paper, we present a method to classify terrain covers based on color and texture information. Color space conversion is performed for pre-processing, the wavelet transform is applied for feature extraction, and an SVM (support vector machine) is used as the classifier. Experimental results show that the proposed algorithm has promising classification performance.

Feature Selection for Image Classification of Hyperion Data (Hyperion 영상의 분류를 위한 밴드 추출)

  • 한동엽;조영욱;김용일;이용웅
    • Korean Journal of Remote Sensing
    • /
    • v.19 no.2
    • /
    • pp.170-179
    • /
    • 2003
  • In order to classify Land Use/Land Cover using multispectral images, we have to give weight to defining proper classes and selecting training samples with higher class separability. Processing satellite hyperspectral images, which have many bands, is difficult and time-consuming. Furthermore, the classification result of a noisy hyperspectral image is often worse than that of a multispectral image. When selecting training fields according to the signatures in the study area, it is difficult to calculate the covariance matrix in clusters that have fewer pixels than the number of bands. Therefore, in this paper we present an overview of feature extraction methods for classification of Hyperion data and examine the effectiveness of feature extraction through the accuracy assessment of the classified image. We also evaluate the classification accuracy of optimal meaningful features by class separation distance, which is also a method for band reduction. As a result, the classification accuracies of the feature-extracted image and the original image are similar regardless of the classifier, but the number of bands used and the computing time were reduced. The MLC, SAM, and ECHO classifiers were used.

Extraction of paddy field in Jaeryeong, North Korea by object-oriented classification with RapidEye NDVI imagery (RapidEye 위성영상의 시계열 NDVI 및 객체기반 분류를 이용한 북한 재령군의 논벼 재배지역 추출 기법 연구)

  • Lee, Sang-Hyun;Oh, Yun-Gyeong;Park, Na-Young;Lee, Sung Hack;Choi, Jin-Yong
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.56 no.3
    • /
    • pp.55-64
    • /
    • 2014
  • While the use of high-resolution satellite images for land use classification has become popular, object-oriented classification has been adopted as an affordable classification method in place of conventional statistical classification. The aim of this study is to extract the paddy field area using object-oriented classification with time-series NDVI from high-resolution satellite images; RapidEye satellite images of Jaeryung-gun in North Korea were used. To implement object-oriented classification, objects were created by setting scale and color factors, and then 3 land use categories (paddy field, forest, and water bodies) were extracted from the objects by applying the variation of time-series NDVI. The unclassified objects that were not covered by the previous extraction were classified into 6 categories using unsupervised classification by clustering analysis. Finally, unsuitable paddy field areas were separated out using topographic factors such as elevation and slope. As a result, about 33.6 % of the total area (32313.1 ha) was classified as paddy field (10847.9 ha), and 851.0 ha was classified as unsuitable paddy field based on the topographic factors. The user accuracy of paddy field classification was calculated as 83.3 %, and about 60.0 % of the total paddy fields were classified from the time-series NDVI before the unsupervised classification. Other land covers were classified as upland (5255.2 ha), forest (10961.0 ha), residential area and bare land (3309.6 ha), and lake and river (1784.4 ha) by this object-oriented classification.
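
For readers unfamiliar with the index, the sketch below computes NDVI from near-infrared and red reflectance using its standard definition, and checks the paddy share reported in this abstract from the two area figures it gives. The reflectance values are made-up examples, and the zero-denominator fallback is a choice made here; neither comes from the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        # No signal in either band; returning 0.0 is an arbitrary fallback.
        return 0.0
    return (nir - red) / total


# Hypothetical reflectances for a vegetated pixel.
sample_ndvi = ndvi(0.5, 0.1)

# Area figures quoted in the abstract (hectares).
total_area_ha = 32313.1
paddy_area_ha = 10847.9
paddy_share_pct = 100.0 * paddy_area_ha / total_area_ha
```

Rounded to one decimal place, `paddy_share_pct` matches the 33.6 % stated in the abstract.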

Surface Classification and Its Threshold Value Selection for the Recognition of 3-D Objects (3차원 물체 인식을 위한 표면 분류 및 임계치의 선정)

  • 조동욱;백승재;김동원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.20-25
    • /
    • 2000
  • This paper proposes a method of surface classification, and of threshold value selection for surface classification, for three-dimensional object recognition. A three-dimensional image processing system consists of three steps, i.e., acquisition of range data, feature extraction, and matching. This paper proposes a method of shape feature extraction from the acquired range data within the entire three-dimensional image processing system. To achieve these goals, this article first proposes a surface classification method that uses the distribution characteristics of the sign values obtained from range values. The pre-existing method, which uses the K-curvature and K-curvature, is limited in practical threshold value selection. To overcome this, this article proposes a way to select the threshold value for surface classification. Finally, the effectiveness of the proposed method is demonstrated by several experiments.


A Study for the Land-cover Classification of Remote Sensed Data Using Quadratic Programming (원격탐사 데이터의 이차계획법에 의한 토지피복분류에 관한 연구)

  • 전형섭;조기성
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.19 no.2
    • /
    • pp.163-172
    • /
    • 2001
  • This study presents quadratic programming as a classification method for remotely sensed data applied to land-cover extraction, and examines its applicability by comparing the classification accuracy of quadratic programming with that of the neural network and maximum likelihood methods, which are used in the extraction of thematic layers. As a result, because quadratic programming improved the classification results by 6% over the maximum likelihood method, we could conclude that the quadratic programming method is applicable to classifying remotely sensed data. Also, because the class composition proportion affects the class decision in quadratic programming classification, we could clearly indicate results that were ignored by the previous extreme (binary) classification method.


A New Method for Classification of Structural Textures

  • Lee, Bongkyu
    • International Journal of Control, Automation, and Systems
    • /
    • v.2 no.1
    • /
    • pp.125-133
    • /
    • 2004
  • In this paper, we present a new method that combines the characteristics of edge information and second-order neural networks for the classification of structural textures. The edges of a texture are extracted using an edge detection approach. From this edge information, classification features called second-order features are obtained. These features are fed into a second-order neural network for training and subsequent classification. It will be shown that the main disadvantage of using structural methods in texture classification, namely the difficulty of extracting texels, is overcome by the proposed method.