• Title/Summary/Keyword: Semantic Classification Model

Search Result 112, Processing Time 0.026 seconds

A Study on Design and Analysis of Metadata and Ontology based on Humanities and Social Sciences (기초학문자료 메타데이터 설계 분석 및 온톨로지 적용 방안 연구)

  • Lee, Jung-Yeoun;Kim, Jung-Min;Choi, Suk-Doo;Kim, Lee-Kyum
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.41 no.2
    • /
    • pp.291-316
    • /
    • 2007
  • The purpose of this study is to design metadata model for describing different kinds of concepts, properties, and semantic relationships of result materials of researches. We examine our metadata model to evaluate correctness and efficiency of the model through contents analysis of a constructed database. From the results of examination, we suggest more effective structure of metadata schema. Domain ontology could constructed by the enlarged thesaurus in order to overcome the limitation of the keyword search, therefore we design a philosophy and religion ontology based on subject classification to improve information retrieval and implement it using XML/Topic Maps to improve retrieval functionality of our database.

Spatial Replicability Assessment of Land Cover Classification Using Unmanned Aerial Vehicle and Artificial Intelligence in Urban Area (무인항공기 및 인공지능을 활용한 도시지역 토지피복 분류 기법의 공간적 재현성 평가)

  • Geon-Ung, PARK;Bong-Geun, SONG;Kyung-Hun, PARK;Hung-Kyu, LEE
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.25 no.4
    • /
    • pp.63-80
    • /
    • 2022
  • As a technology to analyze and predict an issue has been developed by constructing real space into virtual space, it is becoming more important to acquire precise spatial information in complex cities. In this study, images were acquired using an unmanned aerial vehicle for urban area with complex landscapes, and land cover classification was performed object-based image analysis and semantic segmentation techniques, which were image classification technique suitable for high-resolution imagery. In addition, based on the imagery collected at the same time, the replicability of land cover classification of each artificial intelligence (AI) model was examined for areas that AI model did not learn. When the AI models are trained on the training site, the land cover classification accuracy is analyzed to be 89.3% for OBIA-RF, 85.0% for OBIA-DNN, and 95.3% for U-Net. When the AI models are applied to the replicability assessment site to evaluate replicability, the accuracy of OBIA-RF decreased by 7%, OBIA-DNN by 2.1% and U-Net by 2.3%. It is found that U-Net, which considers both morphological and spectroscopic characteristics, performs well in land cover classification accuracy and replicability evaluation. As precise spatial information becomes important, the results of this study are expected to contribute to urban environment research as a basic data generation method.

Building Domain Ontology through Concept and Relation Classification (개념 및 관계 분류를 통한 분야 온톨로지 구축)

  • Huang, Jin-Xia;Shin, Ji-Ae;Choi, Key-Sun
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.9
    • /
    • pp.562-571
    • /
    • 2008
  • For the purpose of building domain ontology, this paper proposes a methodology for building core ontology first, and then enriching the core ontology with the concepts and relations in the domain thesaurus. First, the top-level concept taxonomy of the core ontology is built using domain dictionary and general domain thesaurus. Then, the concepts of the domain thesaurus are classified into top-level concepts in the core ontology, and relations between broader terms (BT) - narrower terms (NT) and related terms (RT) are classified into semantic relations defined for the core ontology. To classify concepts, a two-step approach is adopted, in which a frequency-based approach is complemented with a similarity-based approach. To classify relations, two techniques are applied: (i) for the case of insufficient training data, a rule-based module is for identifying isa relation out of non-isa ones; a pattern-based approach is for classifying non-taxonomic semantic relations from non-isa. (ii) For the case of sufficient training data, a maximum-entropy model is adopted in the feature-based classification, where k-NN approach is for noisy filtering of training data. A series of experiments show that performances of the proposed systems are quite promising and comparable to judgments by human experts.

Aspect-Based Sentiment Analysis with Position Embedding Interactive Attention Network

  • Xiang, Yan;Zhang, Jiqun;Zhang, Zhoubin;Yu, Zhengtao;Xian, Yantuan
    • Journal of Information Processing Systems
    • /
    • v.18 no.5
    • /
    • pp.614-627
    • /
    • 2022
  • Aspect-based sentiment analysis is to discover the sentiment polarity towards an aspect from user-generated natural language. So far, most of the methods only use the implicit position information of the aspect in the context, instead of directly utilizing the position relationship between the aspect and the sentiment terms. In fact, neighboring words of the aspect terms should be given more attention than other words in the context. This paper studies the influence of different position embedding methods on the sentimental polarities of given aspects, and proposes a position embedding interactive attention network based on a long short-term memory network. Firstly, it uses the position information of the context simultaneously in the input layer and the attention layer. Secondly, it mines the importance of different context words for the aspect with the interactive attention mechanism. Finally, it generates a valid representation of the aspect and the context for sentiment classification. The model which has been posed was evaluated on the datasets of the Semantic Evaluation 2014. Compared with other baseline models, the accuracy of our model increases by about 2% on the restaurant dataset and 1% on the laptop dataset.

Deep learning-based monitoring for conservation and management of coastal dune vegetation (해안사구 식생의 보전 및 관리를 위한 딥러닝 기반 모니터링)

  • Kim, Dong-woo;Gu, Ja-woon;Hong, Ye-ji;Kim, Se-Min;Son, Seung-Woo
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.25 no.6
    • /
    • pp.25-33
    • /
    • 2022
  • In this study, a monitoring method using high-resolution images acquired by unmanned aerial vehicles and deep learning algorithms was proposed for the management of the Sinduri coastal sand dunes. Class classification was done using U-net, a semantic division method. The classification target classified 3 types of sand dune vegetation into 4 classes, and the model was trained and tested with a total of 320 training images and 48 test images. Ignored label was applied to improve the performance of the model, and then evaluated by applying two loss functions, CE Loss and BCE Loss. As a result of the evaluation, when CE Loss was applied, the value of mIoU for each class was the highest, but it can be judged that the performance of BCE Loss is better considering the time efficiency consumed in learning. It is meaningful as a pilot application of unmanned aerial vehicles and deep learning as a method to monitor and manage sand dune vegetation. The possibility of using the deep learning image analysis technology to monitor sand dune vegetation has been confirmed, and it is expected that the proposed method can be used not only in sand dune vegetation but also in various fields such as forests and grasslands.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.105-122
    • /
    • 2019
  • Dimensionality reduction is one of the methods to handle big data in text mining. For dimensionality reduction, we should consider the density of data, which has a significant influence on the performance of sentence classification. It requires lots of computations for data of higher dimensions. Eventually, it can cause lots of computational cost and overfitting in the model. Thus, the dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed from only lessening the noise of data like misspelling or informal text to including semantic and syntactic information. On top of it, the expression and selection of the text features have impacts on the performance of the classifier for sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find latent space that is representative of raw data from observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, learning low-dimensional vector space representations of words, that can capture semantic and syntactic information from data are also utilized. For improving performance, recent studies have suggested methods that the word dictionary is modified according to the positive and negative score of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm selects the words that are not important, we thought the words that are similar to the selected words also have no impacts on sentence classification. This study proposes two ways to achieve more accurate classification that conduct selective word elimination under specific regulations and construct word embedding based on Word2Vec embedding. To select words having low importance from the text, we use information gain algorithm to measure the importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form word embedding. Second, we select words additionally that are similar to the words that have a low level of information gain values and make word embedding. In the end, these filtered text and word embedding apply to the deep learning models; Convolutional Neural Network and Attention-Based Bidirectional LSTM. This study uses customer reviews on Kindle in Amazon.com, IMDB, and Yelp as datasets, and classify each data using the deep learning models. The reviews got more than five helpful votes, and the ratio of helpful votes was over 70% classified as helpful reviews. Also, Yelp only shows the number of helpful votes. We extracted 100,000 reviews which got more than five helpful votes using a random sampling method among 750,000 reviews. The minimal preprocessing was executed to each dataset, such as removing numbers and special characters from text data. To evaluate the proposed methods, we compared the performances of Word2Vec and GloVe word embeddings, which used all the words. We showed that one of the proposed methods is better than the embeddings with all the words. By removing unimportant words, we can get better performance. However, if we removed too many words, it showed that the performance was lowered. For future research, it is required to consider diverse ways of preprocessing and the in-depth analysis for the co-occurrence of words to measure similarity values among words. Also, we only applied the proposed method with Word2Vec. Other embedding methods such as GloVe, fastText, ELMo can be applied with the proposed methods, and it is possible to identify the possible combinations between word embedding methods and elimination methods.

Training Performance Analysis of Semantic Segmentation Deep Learning Model by Progressive Combining Multi-modal Spatial Information Datasets (다중 공간정보 데이터의 점진적 조합에 의한 의미적 분류 딥러닝 모델 학습 성능 분석)

  • Lee, Dae-Geon;Shin, Young-Ha;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.40 no.2
    • /
    • pp.91-108
    • /
    • 2022
  • In most cases, optical images have been used as training data of DL (Deep Learning) models for object detection, recognition, identification, classification, semantic segmentation, and instance segmentation. However, properties of 3D objects in the real-world could not be fully explored with 2D images. One of the major sources of the 3D geospatial information is DSM (Digital Surface Model). In this matter, characteristic information derived from DSM would be effective to analyze 3D terrain features. Especially, man-made objects such as buildings having geometrically unique shape could be described by geometric elements that are obtained from 3D geospatial data. The background and motivation of this paper were drawn from concept of the intrinsic image that is involved in high-level visual information processing. This paper aims to extract buildings after classifying terrain features by training DL model with DSM-derived information including slope, aspect, and SRI (Shaded Relief Image). The experiments were carried out using DSM and label dataset provided by ISPRS (International Society for Photogrammetry and Remote Sensing) for CNN-based SegNet model. In particular, experiments focus on combining multi-source information to improve training performance and synergistic effect of the DL model. The results demonstrate that buildings were effectively classified and extracted by the proposed approach.

Ontology Construction of Diet Data for Food Hygiene Informatization (식품 위생 정보화를 위한 식단 정보 온톨로지 구축과 활용)

  • Cha, Kyung-Ae;Yeo, Sun-Dong;Yoon, Seong-Wook;Hong, Won-Kee
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.11 no.1
    • /
    • pp.21-27
    • /
    • 2017
  • To guarantee the effectiveness of the HACCP(Hazard analysis and critical control points) system, it is necessary to develop of an ontology-based information system that can automatically manage the large amount of HACCP records or information derived from the HACCP operation results. In this paper, we construct a food information ontology which represents the relationships between ingredients, recipe, and features of food categories. Moreover, we develop HACCP automation application adopt the ontology to verify the semantic quality of the designed ontology model by performing HACCP processes such as HACCP diet classification. We expect to contribute to develop a food hygiene information and improve the accuracy of the HACCP data through the semantic system.

Evaluation of the Feasibility of Deep Learning for Vegetation Monitoring (딥러닝 기반의 식생 모니터링 가능성 평가)

  • Kim, Dong-woo;Son, Seung-Woo
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.26 no.6
    • /
    • pp.85-96
    • /
    • 2023
  • This study proposes a method for forest vegetation monitoring using high-resolution aerial imagery captured by unmanned aerial vehicles(UAV) and deep learning technology. The research site was selected in the forested area of Mountain Dogo, Asan City, Chungcheongnam-do, and the target species for monitoring included Pinus densiflora, Quercus mongolica, and Quercus acutissima. To classify vegetation species at the pixel level in UAV imagery based on characteristics such as leaf shape, size, and color, the study employed the semantic segmentation method using the prominent U-net deep learning model. The research results indicated that it was possible to visually distinguish Pinus densiflora Siebold & Zucc, Quercus mongolica Fisch. ex Ledeb, and Quercus acutissima Carruth in 135 aerial images captured by UAV. Out of these, 104 images were used as training data for the deep learning model, while 31 images were used for inference. The optimization of the deep learning model resulted in an overall average pixel accuracy of 92.60, with mIoU at 0.80 and FIoU at 0.82, demonstrating the successful construction of a reliable deep learning model. This study is significant as a pilot case for the application of UAV and deep learning to monitor and manage representative species among climate-vulnerable vegetation, including Pinus densiflora, Quercus mongolica, and Quercus acutissima. It is expected that in the future, UAV and deep learning models can be applied to a variety of vegetation species to better address forest management.

Analysis of Sentential Paraphrase Patterns and Errors through Predicate-Argument Tuple-based Approximate Alignment (술어-논항 튜플 기반 근사 정렬을 이용한 문장 단위 바꿔쓰기표현 유형 및 오류 분석)

  • Choi, Sung-Pil;Song, Sa-Kwang;Myaeng, Sung-Hyon
    • The KIPS Transactions:PartB
    • /
    • v.19B no.2
    • /
    • pp.135-148
    • /
    • 2012
  • This paper proposes a model for recognizing sentential paraphrases through Predicate-Argument Tuple (PAT)-based approximate alignment between two texts. We cast the paraphrase recognition problem as a binary classification by defining and applying various alignment features which could effectively express the semantic relatedness between two sentences. Experiment confirmed the potential of our approach and error analysis revealed various paraphrase patterns not being solved by our system, which can help us devise methods for further performance improvement.