• Title/Summary/Keyword: Feature Classification

Performance Analysis of Automatic Target Recognition Using Simulated SAR Image (표적 SAR 시뮬레이션 영상을 이용한 식별 성능 분석)

  • Lee, Sumi;Lee, Yun-Kyung;Kim, Sang-Wan
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.283-298 / 2022
  • Because Synthetic Aperture Radar (SAR) images can be acquired regardless of weather and time of day, SAR is well suited to Automatic Target Recognition (ATR) in surveillance, reconnaissance, and national security. However, cost and operational constraints limit the construction of large and varied target image collections for SAR-ATR systems. Interest is therefore growing in ATR systems based on SAR images simulated from a target model. In this study, Attributed Scattering Center (ASC) matching and template matching, the two approaches mainly used in SAR-ATR, are applied to target classification. The ASC matching method was developed using World View Vector (WVV) feature reconstruction and Weighted Bipartite Graph Matching (WBGM). Template matching was carried out by calculating the correlation coefficient between two simulated images reconstructed from points adjacent to each other. To analyze the performance of the two proposed methods, the Synthetic and Measured Paired Labeled Experiment (SAMPLE) dataset, recently published by the U.S. Defense Advanced Research Projects Agency (DARPA), was used. Experiments were conducted under standard operating conditions, partial target occlusion, and random occlusion. ASC matching generally outperformed template matching. Under standard operating conditions, the average recognition rate was 85.1% for ASC matching and 74.4% for template matching. ASC matching also showed less performance variation across the 10 targets. Across the tested amounts of partial target occlusion, ASC matching scored about 10% higher than template matching, and even with 60% random occlusion the recognition rate was 73.4%.
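
The template-matching step described above relies on a correlation coefficient between two images. The following is a minimal sketch of that idea only, assuming small images stored as lists of rows; the function names `correlation_coefficient` and `classify_by_template` and the toy templates are hypothetical and are not taken from the paper, which also reconstructs the images before comparing them.

```python
from math import sqrt


def correlation_coefficient(a, b):
    """Pearson correlation between two equally sized 2-D images (lists of rows)."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    if not x or len(x) != len(y):
        raise ValueError("images must be non-empty and the same size")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant image carries no pattern to correlate
    return cov / sqrt(var_x * var_y)


def classify_by_template(image, templates):
    """Return the label of the template that correlates best with the image."""
    return max(
        templates,
        key=lambda label: correlation_coefficient(image, templates[label]),
    )


# Hypothetical 2x2 "templates" for two target classes.
templates = {"A": [[1, 2], [3, 4]], "B": [[4, 3], [2, 1]]}
print(classify_by_template([[2, 4], [6, 8]], templates))
```

In this toy case the query image is a scaled copy of template "A", so the correlation with "A" is 1 and the sketch picks that label.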

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies / v.27 no.1 / pp.15-28 / 2022
  • With the recent expansion of online markets such as clothing, using customer reviews has become a major marketing tool. User reviews have been used to analyze customer sentiment. Sentiment analysis methods can be broadly divided into machine learning-based and lexicon-based methods. Machine learning-based methods train a classification model on reviews and their labels. As sentiment analysis research has advanced, multimodal models trained on the image and video data in reviews have also been studied. The characteristics of the words used in reviews differ depending on product and customer categories. In this paper, sentiment is analyzed by considering review data together with product and user metadata. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), self-attention-based multi-head attention, and Bidirectional Encoder Representations from Transformers (BERT) models are used in this study. The same Multi-Layer Perceptron (MLP) model is applied to the product information in every case. This paper proposes a multimodal sentiment analysis model that simultaneously considers user reviews and product meta-information.

Domain Knowledge Incorporated Counterfactual Example-Based Explanation for Bankruptcy Prediction Model (부도예측모형에서 도메인 지식을 통합한 반사실적 예시 기반 설명력 증진 방법)

  • Cho, Soo Hyun;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.307-332 / 2022
  • One of the most intensively studied topics in business applications is bankruptcy prediction, a representative classification problem related to lending, investment decision making, and the profitability of financial institutions. Many studies have demonstrated outstanding performance for bankruptcy prediction models using artificial intelligence techniques. However, since most machine learning algorithms are "black boxes," providing users with explanations of AI models has become a prominent research topic. Although there are many approaches to explanation, this study focuses on explaining a bankruptcy prediction model using counterfactual examples. A counterfactual-based explanation provides an alternative case that shows users how to obtain the desired output from the model. This study introduces a counterfactual generation technique based on a genetic algorithm (GA) that leverages both domain knowledge (i.e., causal feasibility) and feature importance from a black-box model, along with other critical counterfactual variables including proximity, distribution, and sparsity. The proposed method was evaluated quantitatively and qualitatively to measure its quality and validity.
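
Two of the counterfactual criteria named above, proximity and sparsity, can be illustrated with a short sketch. It assumes numeric feature vectors, an L1 distance for proximity, and a simple weighted sum as a GA-style fitness; the names `proximity`, `sparsity`, and `fitness`, the weights, and the toy classifier are all hypothetical, and the paper's domain-knowledge, feature-importance, and distribution terms are not modeled here.

```python
def proximity(original, counterfactual):
    """L1 distance between the two feature vectors (smaller means closer)."""
    return sum(abs(o - c) for o, c in zip(original, counterfactual))


def sparsity(original, counterfactual, tol=1e-9):
    """Number of features that were changed to reach the counterfactual."""
    return sum(1 for o, c in zip(original, counterfactual) if abs(o - c) > tol)


def fitness(original, counterfactual, predict, target, w_prox=1.0, w_sparse=1.0):
    """Lower is better; infinite if the candidate does not reach the target class."""
    if predict(counterfactual) != target:
        return float("inf")
    return (w_prox * proximity(original, counterfactual)
            + w_sparse * sparsity(original, counterfactual))


# Hypothetical black-box model: class 1 ("non-bankrupt") if feature 0 exceeds 0.5.
def toy_predict(x):
    return 1 if x[0] > 0.5 else 0


original = [0.4, 2.0]
print(fitness(original, [0.6, 2.0], toy_predict, target=1))  # changes one feature
print(fitness(original, [0.4, 3.0], toy_predict, target=1))  # never flips the class
```

A genetic algorithm would evaluate many such candidates and keep those with the lowest fitness; only the scoring step is sketched here.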

Designing a Molecular Diagnostic Laboratory for Testing Highly Pathogenic Viruses (고병원성 바이러스 검사를 위한 분자진단검사실 구축)

  • Jung, Tae Won;Jung, Jaeyoung;Kim, Sunghyun;Kim, Young-Kwon
    • Korean Journal of Clinical Laboratory Science / v.53 no.2 / pp.143-150 / 2021
  • The recent spread of novel and highly variant pathogenic viruses, including the coronavirus (SARS-CoV-2), has increased the demand for diagnostic testing for rapid confirmation. This has prompted investigation of the functional capability of each laboratory space and the preparation of facility guidelines to ensure the safety of medical technologists. Viral testing requires negative-pressure facilities along with thread separation during sample pre-treatment and before nucleic acid amplification. The layout of the spaces therefore needs to be planned with unidirectional air flow in mind. This classification of safety management facilities is designated as biosafety level 2, and personal protective equipment is placed accordingly. In case of handling dangerous materials, they need to be carried out of the biosafety cabinet, and sterilizers are required for suitable disposal of infectious agents. A common feature of domestic laboratories is that the sample pre-treatment space is maintained at a negative pressure of -2.5 Pa or less and that separate pre-treatment and reagent preparation spaces are arranged for the test process. We believe that the data generated in this study are meaningful and offer an efficient direction and detailed flow for separating the inspection process and space functions. Moreover, this study introduces the construction of a laboratory that applies the safety management standards.

Denoising Self-Attention Network for Mixed-type Data Imputation (혼합형 데이터 보간을 위한 디노이징 셀프 어텐션 네트워크)

  • Lee, Do-Hoon;Kim, Han-Joon;Chun, Joonghoon
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.135-144 / 2021
  • Data-driven decision making has recently become a key technology leading the data industry, and the machine learning behind it requires high-quality training datasets. However, real-world data contain missing values for various reasons, which degrades the performance of prediction models learned from the poor training data. Therefore, to build high-performance models from real-world datasets, many studies on automatically imputing missing values in the initial training data have been conducted. Many conventional machine learning-based imputation techniques involve very time-consuming and cumbersome work because they apply only to numeric columns or create an individual predictive model for each column. This paper therefore proposes a new data imputation technique called the Denoising Self-Attention Network (DSAN), which can be applied to mixed-type datasets containing both numerical and categorical columns. DSAN learns robust feature representation vectors by combining self-attention and denoising techniques, and automatically imputes multiple missing variables in parallel through multi-task learning. To verify the validity of the proposed technique, data imputation experiments were performed after arbitrarily generating missing values in several mixed-type training datasets. We then show the validity of the proposed technique by comparing the performance of binary classification models trained on the imputed data, together with the errors between the original and imputed values.

Preoperative Assessment of Renal Sinus Invasion by Renal Cell Carcinoma according to Tumor Complexity and Imaging Features in Patients Undergoing Radical Nephrectomy

  • Ji Hoon Kim;Kye Jin Park;Mi-Hyun Kim;Jeong Kon Kim
    • Korean Journal of Radiology / v.22 no.8 / pp.1323-1331 / 2021
  • Objective: To identify the association between renal tumor complexity and pathologic renal sinus invasion (RSI) and evaluate the usefulness of computed tomography tumor features for predicting RSI in patients with renal cell carcinoma (RCC). Materials and Methods: This retrospective study included 276 consecutive patients who underwent radical nephrectomy for RCC with a size of ≤ 7 cm between January 2014 and October 2017. Tumor complexity and anatomical renal sinus involvement were evaluated using two standardized scoring systems: the radius (R), exophytic or endophytic (E), nearness to collecting system or sinus (N), anterior or posterior (A), and location relative to polar lines (RENAL) nephrometry and preoperative aspects and dimensions used for anatomical classification (PADUA) system. CT-based tumor features, including shape, enhancement pattern, margin at the interface of the renal sinus (smooth vs. non-smooth), and finger-like projection of the mass, were also assessed by two independent radiologists. Univariable and multivariable logistic regression analyses were performed to identify significant predictors of RSI. The positive predictive value, negative predictive value (NPV), accuracy of anatomical renal sinus involvement, and tumor features were evaluated. Results: Eighty-one of 276 patients (29.3%) demonstrated RSI. Among highly complex tumors (RENAL or PADUA score ≥ 10), the frequencies of RSI were 42.4% (39/92) and 38.0% (71/187) using RENAL and PADUA scores, respectively. Multivariable analysis showed that a non-smooth margin and the presence of a finger-like projection were significant predictors of RSI. Anatomical renal sinus involvement showed high NPVs (91.7% and 95.2%) but low accuracy (40.2% and 43.1%) for RSI, whereas the presence of a non-smooth margin or finger-like projection demonstrated comparably high NPVs (90.0% and 91.3% for both readers) and improved accuracy (67.0% and 73.9%, respectively). 
Conclusion: A non-smooth margin or the presence of a finger-like projection can be used as a preoperative CT-based tumor feature for predicting RSI in patients with RCC.
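
The positive predictive value, negative predictive value (NPV), and accuracy reported above follow from a standard 2x2 confusion table. The sketch below shows the arithmetic only; the function name `diagnostic_metrics` is hypothetical, and the counts in the usage line are made-up illustration values, not data from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute PPV, NPV, and accuracy from confusion-table counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one case is required")
    # PPV: of the cases called positive, the fraction that truly are positive.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: of the cases called negative, the fraction that truly are negative.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    accuracy = (tp + tn) / total
    return {"ppv": ppv, "npv": npv, "accuracy": accuracy}


# Illustrative counts only (not taken from the study).
print(diagnostic_metrics(tp=30, fp=20, tn=45, fn=5))
```

A feature can therefore have a high NPV and still a low accuracy when it produces many false positives, which is the pattern the abstract reports for anatomical renal sinus involvement.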

The Validation Study of the Questionnaire for Sasang Constitution Classification (the 2nd edition revised in 1995) - In the field of profile analysis (사상체질분류검사지(四象體質分類檢査紙)(QSCC)II에 대(對)한 타당화(妥當化) 연구(硏究) -각(各) 체질집단(體質集團)의 군집별(群集別) Profile 분석(分析)을 중심(中心)으로-)

  • Lee, Jung-Chan;Go, Byeong-Hui;Song, Il-Byeong
    • Journal of Sasang Constitutional Medicine / v.8 no.1 / pp.247-294 / 1996
  • Using data collected with the newly revised QSCC from an outpatient group examined at Kyung-Hee Medical Center and from an open general-population group, the authors performed statistical analyses to validate the revised questionnaire. First, the correct discrimination rate was checked by performing discriminant analysis on the data from the patient group. Next, T-scores were obtained by applying the norms derived from the standardization on the general-population group to the Sasang scale scores of the outpatient group, and the distinctive features of the subpopulations identified by multivariate cluster analysis were investigated. The results are summarized as follows. 1. The validity of the questionnaire was established by a correct discrimination rate (the agreement between predicted and actual group) of 70.08%. 2. In the profile analysis, each constitutional group showed notably elevated responses on its corresponding scale, so the questionnaire appears suitable for constitutional discrimination. 3. In terms of how strongly each constitutional group's profile was expressed, the Soyang group showed the most pronounced result, the Soeum group the weakest, and the Taieum group a kind of dual character. 4. Among the three subpopulations of each constitutional group, the so-called seceder group was clearly distinguished from the contrasting groups by its distinctive profile features, as described below.
(1) The seceder group of Soyang-in showed a considerably passive disposition, unlike the general character of the ordinary Soyang group, and notably gave comparatively high responses on the Soeum scale. (2) The seceder group of Taieum-in scored low overall, indicating a passive disposition, and, unlike the typical Taieum group, whose Taiyang and Taieum scale scores rise together, showed a sharply lower score on the Taiyang scale. (3) The seceder group of Soeum-in showed a profile similar to that of the Soyang group, indicating that the passive character of the Soeum group was largely diluted. According to these results, the validity of the newly revised questionnaire was confirmed, and the profile analysis revealed the characteristics of the seceder groups to some degree. The results are expected to serve as research material for producing the next edition of the questionnaire, and, as noted several times in the main text, further investigation of the differences between the seceder groups and the contrasting groups is needed to improve the questionnaire.
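
The T-score step mentioned above can be sketched with the conventional definition T = 50 + 10 (x - mean) / SD, where the mean and SD come from a norm group. This is an assumption: the abstract does not state the exact formula or whether a sample or population standard deviation was used. The function name `t_scores` and the numbers are hypothetical.

```python
from statistics import mean, stdev


def t_scores(raw_scores, norm_scores):
    """Convert raw scale scores to T-scores using norms from a reference group."""
    mu = mean(norm_scores)
    sigma = stdev(norm_scores)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("norm group has no variance")
    return [50 + 10 * (x - mu) / sigma for x in raw_scores]


# Hypothetical norm-group scores and outpatient raw scores on one scale.
norm_group = [10, 20, 30]
print(t_scores([20, 30, 5], norm_group))
```

A T-score of 50 then corresponds to the norm-group mean, and each 10 points corresponds to one standard deviation, which makes profiles on different scales comparable.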

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.29-45 / 2015
  • Response modeling is a well-known research problem aimed at better predicting customers' responses to marketing promotions. A response model reduces marketing cost by identifying prospective customers in a very large customer database and predicting the purchase intention of the selected customers, whereas a promotion derived from an undifferentiated marketing strategy incurs unnecessary cost. In addition, the big data environment has accelerated the development of response models using data mining techniques such as case-based reasoning (CBR), neural networks, and support vector machines. CBR is one of the most widely used tools in business because it is simple and robust to apply to response models. It remains an attractive technique for business data mining applications even though it has not shown high performance compared with other machine learning techniques. Many studies have therefore tried to improve CBR for business data mining with enhanced algorithms or with the support of other techniques such as genetic algorithms, decision trees, and the Analytic Hierarchy Process (AHP). Ahn and Kim (2008) used logit, neural networks, and CBR to predict which customers would purchase the items promoted by a marketing department, and optimized the number of neighbors k for k-nearest neighbor with a genetic algorithm to improve the performance of the integrated model. Hong and Park (2009) noted that an approach integrating CBR with logit, neural networks, and Support Vector Machine (SVM) predicted customer responses to marketing promotions better than the individual logit, neural network, and SVM models. This paper presents an approach for predicting customers' responses to marketing promotions with case-based reasoning. The proposed model was developed by applying a different weight to each feature.
We fitted a logit model to a database containing promotion and purchase data for bath soap, and then used its coefficients as the feature weights for CBR. We empirically compared the proposed weighted CBR model with neural networks and a pure CBR model and found that the weighted CBR model outperformed the pure CBR model. Imbalanced data is a common problem when building data mining models for classification on real data, as in bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instances in one class is remarkably small or large compared with the number of instances in the other classes. A classification model such as a response model has difficulty learning patterns from such data because it tends to ignore the minority class while classifying the majority class correctly. Sampling is one of the most representative approaches to the problems caused by imbalanced data distributions, and it can be categorized into undersampling and oversampling. However, CBR is not sensitive to the data distribution because, unlike machine learning algorithms, it does not learn from the data. In this study, we investigated the robustness of the proposed model while varying the ratio of response customers to nonresponse customers, because in the real world the customers who respond to a promotion are always a small fraction compared with those who do not. We simulated the proposed model 100 times with different ratios of response customers to nonresponse customers to validate its robustness under imbalanced data distributions. Finally, we found that the proposed CBR-based model outperformed the compared models on the imbalanced data sets.
Our study is expected to improve the performance of response models for promotion programs with CBR under imbalanced data distributions in the real world.
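
The feature-weighted CBR idea described above can be sketched as a weighted k-nearest-neighbor lookup. This is a minimal illustration under stated assumptions: the weights stand in for (absolute) logit coefficients, the distance is a weighted Euclidean distance, and the answer is a majority vote; the names `weighted_distance` and `weighted_cbr_predict`, the case base, and the weights are hypothetical and not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its weight."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def weighted_cbr_predict(query, cases, weights, k=3):
    """Retrieve the k most similar past cases and return their majority label.

    cases is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(
        cases, key=lambda case: weighted_distance(query, case[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical case base: label 1 = responded to the promotion, 0 = did not.
cases = [([0, 0], 0), ([0, 1], 0), ([1, 0], 1), ([1, 1], 1)]
# Hypothetical weights, e.g. absolute logit coefficients; feature 1 is ignored.
weights = [1.0, 0.0]
print(weighted_cbr_predict([0.9, 0.0], cases, weights, k=2))
```

Because nothing is fitted in the retrieval step, changing the class ratio in the case base alters which neighbors are available but does not retrain a model, which is the property the abstract appeals to when discussing imbalanced data.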

Development of a Prototype System for Aquaculture Facility Auto Detection Using KOMPSAT-3 Satellite Imagery (KOMPSAT-3 위성영상 기반 양식시설물 자동 검출 프로토타입 시스템 개발)

  • KIM, Do-Ryeong;KIM, Hyeong-Hun;KIM, Woo-Hyeon;RYU, Dong-Ha;GANG, Su-Myung;CHOUNG, Yun-Jae
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.4 / pp.63-75 / 2016
  • Aquaculture has historically delivered marine products because the country is surrounded by ocean on three sides. Surveys on production have been conducted recently to systematically manage aquaculture facilities. Based on the survey results, pricing controls on marine products have been implemented to stabilize local fishery resources and to ensure a minimum income for fishermen. Such surveys on aquaculture facilities depend on manual digitization of aerial photographs each year. Surveys that use manual digitization of high-resolution aerial photographs can evaluate aquaculture accurately with the knowledge of experts, who are aware of each facility's characteristics and deployment. However, aerial photographs impose monetary and time limitations on monitoring aquaculture resources with different life cycles, and they also require a number of experts. Therefore, in this study, we investigated an automatic prototype system for detecting boundary information and monitoring aquaculture facilities based on satellite images. KOMPSAT-3 (13 scenes), a local high-resolution satellite, provided the imagery, collected between October and April, a period in which many aquaculture facilities were operating. The ANN classification method was used to automatically detect facility types such as cage, longline, and buoy. Furthermore, shape files were generated using a digitizing image processing method that incorporates polygon generation techniques. In this study, our newly developed prototype method detected aquaculture facilities at a rate of 93%. The suggested method not only overcomes the limits of the existing monitoring method based on aerial photographs but also assists experts in detecting aquaculture facilities. Aquaculture facility detection systems must be developed in the future through the application of image processing techniques and the classification of aquaculture facilities.
Such systems will assist in related decision-making through aquaculture facility monitoring.

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.71-89 / 2016
  • Sentiment analysis is used to identify emotions or sentiments embedded in user-generated data such as customer reviews from blogs, social network services, and so on. Various research fields such as computer science and business management can take advantage of it to analyze customer-generated opinions. In previous studies, the star rating of a review was regarded as equivalent to the sentiment embedded in the text. However, the star rating does not always correspond to the sentiment polarity, and this assumption limited the accuracy of previous studies. To solve this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, Support Vector Machines (SVM) and a Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier and analyzed through statistical correlations between movie reviews and box-office success. Movie reviews are collected along with their star ratings. The dataset used in this study consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms a Naive Bayes (NB) classifier, with accuracy about 6% higher than NB. Furthermore, the results indicate positive correlations between the star rating and the number of viewers, which can be regarded as the box-office success of a movie. The study also shows a mild positive correlation between the sentiment scores estimated by the classifier and the number of viewers. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted.
For this, the movies were divided into two groups using the average of the sentiment scores. The two groups differ significantly in terms of their star ratings.