• Title/Summary/Keyword: attribute extraction (속성 추출)


A Study on an Automatic Classification Model for Facet-Based Multidimensional Analysis of Civil Complaints (패싯 기반 민원 다차원 분석을 위한 자동 분류 모델)

  • Na Rang Kim
    • Journal of Korea Society of Industrial Information Systems / v.29 no.1 / pp.135-144 / 2024
  • In this study, we propose an automatic classification model for quantitative multidimensional analysis based on facet theory to understand public opinions and demands on major issues through big data analysis. Civil complaints, as a form of public feedback, are generated by various individuals on multiple topics repeatedly and continuously in real-time, which can be challenging for officials to read and analyze efficiently. Specifically, our research introduces a new classification framework that utilizes facet theory and political analysis models to analyze the characteristics of citizen complaints and apply them to the policy-making process. Furthermore, to reduce administrative tasks related to complaint analysis and processing and to facilitate citizen policy participation, we employ deep learning to automatically extract and classify attributes based on the facet analysis framework. The results of this study are expected to provide important insights into understanding and analyzing the characteristics of big data related to citizen complaints, which can pave the way for future research in various fields beyond the public sector, such as education, industry, and healthcare, for quantifying unstructured data and utilizing multidimensional analysis. In practical terms, improving the processing system for large-scale electronic complaints and automation through deep learning can enhance the efficiency and responsiveness of complaint handling, and this approach can also be applied to text data processing in other fields.

An Efficient Estimation of Place Brand Image Power Based on Text Mining Technology (텍스트마이닝 기반의 효율적인 장소 브랜드 이미지 강도 측정 방법)

  • Choi, Sukjae;Jeon, Jongshik;Subrata, Biswas;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.113-129 / 2015
  • Place branding is an important income-generating activity: it gives special meaning to a specific location and produces identity and communal value grounded in an understanding of the place-branding concept and methodology. Many other fields, such as marketing, architecture, and city construction, influence the creation of an impressive brand image. A place brand that is well recognized by both South Koreans and foreigners creates significant economic effects. There has been research on creating strategic and detailed place brand images; the representative study was carried out by Anholt, who surveyed two million people from 50 countries. However, such investigations, including survey research, require a great deal of labor and significant expense. As a result, more affordable, objective, and effective research methods are needed. The purpose of this paper is to find a way to measure the intensity of a brand image objectively and at low cost through text mining. The proposed method extracts keywords and the factors that constitute the place brand image from related web documents. In this way, the brand image intensity of a specific location can be measured. The performance of the proposed methodology was verified by comparison with Anholt's city image consistency index ranking of 50 cities around the world. Four methods were applied in the test. First, the RANDOM method artificially ranks the cities included in the experiment. The HUMAN method prepares a questionnaire and selects 9 volunteers who are well acquainted with both brand management and the cities to be evaluated; they are asked to rank the cities, and their rankings are compared with Anholt's evaluation results. The TM method applies the proposed method to evaluate the cities with all evaluation criteria. TM-LEARN, an extension of TM, selects significant evaluation items from the items in every criterion and then evaluates the cities with the selected items. RMSE is used as the metric to compare the evaluation results. The experimental results are as follows. First, the method was more accurate than an evaluation based on ordinary people. Second, compared with the traditional survey method, it requires much less time and cost because it uses automated means. Third, the methodology is timely because the evaluation can be repeated at any time. Fourth, whereas Anholt's method evaluates only prespecified cities, the proposed methodology is applicable to any location. Finally, the methodology has relatively high objectivity because it is based on open source data. As a result, the text mining approach to city image evaluation showed validity in terms of accuracy, cost-effectiveness, timeliness, scalability, and reliability. The proposed method provides managers with clear guidelines for brand management in the public and private sectors. In the public sector, for example, local officials could use the proposed method to formulate strategies and enhance the image of their places efficiently. Rather than conducting heavy questionnaires, local officials could first monitor the current place image quickly and then decide to carry out a formal place image test only if the results from the proposed method are out of the ordinary, whether they indicate an opportunity or a threat to the place. Moreover, by combining the proposed method with morphological analysis, extraction of meaningful facets of the place brand from text, sentiment analysis, and other techniques, marketing strategy planners or civil engineering professionals may obtain deeper and richer insights for better place brand images. In the future, a prototype system will be implemented to show the feasibility of the idea proposed in this paper.

Effect of the Consumer's Perception of the University Foodservice Quality on the Consumer Attitude (대학교 급식소의 급식서비스 품질에 대한 인식이 소비자태도에 미치는 영향)

  • Kim, Hyun-Ah
    • Journal of the Korean Society of Food Science and Nutrition / v.35 no.6 / pp.815-822 / 2006
  • The purposes of this study were to investigate consumers' perception of who manages the foodservice operation in the university, and to analyze the effects of consumers' perception of university foodservice quality on intent to revisit and intent to recommend. Questionnaires were distributed to 575 students at K University in Masan, sampled by the proportionate stratified sampling method. The survey was performed from May 17 to June 2, 2005. A total of 566 questionnaires were returned; 6 unusable questionnaires were excluded, and 560 were used for the final analysis (response rate: 97.4%). For the statistical analysis, SPSS (12.0) was used to conduct descriptive analysis, factor analysis, reliability analysis, and multiple regression analysis. The results were as follows. First, 254 respondents (47.3%) did not know that their foodservice operation was managed by a contract foodservice company, and 374 students (66.8%) did not know the name of the contract foodservice company that ran their foodservice operation. Second, the food factor of university foodservice quality had a significant positive effect on intent to revisit (p<0.001) and also on intent to recommend (p<0.001). It was concluded that as the food factor of university foodservice quality increased, the intent to revisit and the intent to recommend the university foodservice increased. Therefore, when university foodservice managers plan their operation strategy, they should focus on improving customers' perception of foodservice quality and on advertising the contract foodservice company's brand name.

A Spatio-Temporal Clustering Technique for the Moving Object Path Search (이동 객체 경로 탐색을 위한 시공간 클러스터링 기법)

  • Lee, Ki-Young;Kang, Hong-Koo;Yun, Jae-Kwan;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.7 no.3 s.15 / pp.67-81 / 2005
  • Recently, with the development of Geographic Information Systems, interest in and research on new application services such as Location Based Services and Telematics, which provide emergency services, neighbor information search, and route search, have been increasing. A user's search in a spatio-temporal database used for Location Based Services or Telematics usually fixes the current time on the time axis and queries the spatial and aspatial attributes. Thus, if the query range on the time axis is extensive, it is difficult to process the search efficiently. To solve this problem, the snapshot, a method that summarizes the location data of moving objects, was introduced. However, if the range of data to be stored is wide, more storage space is required, and snapshots are created even for space that is not frequently searched. Thus, the snapshot method generally wastes storage space and memory. Therefore, in this paper, we suggest the Hash-based Spatio-Temporal Clustering Algorithm (H-STCA), which extends the two-dimensional spatial hash algorithm previously used for spatial clustering to a three-dimensional spatial hash algorithm in order to overcome the disadvantages of the snapshot method. This paper also suggests a knowledge extraction algorithm, based on H-STCA, that extracts knowledge for the path search of moving objects from past location data. In the performance evaluation on a huge amount of moving object data, the snapshot clustering method using H-STCA outperformed the spatio-temporal index methods and the original snapshot method in search time, storage structure construction time, and optimal path search time. In particular, the more moving objects there were, the greater the performance improvement of the snapshot clustering method using H-STCA over the existing spatio-temporal index methods and the original snapshot method.
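The idea of extending a two-dimensional spatial hash with a time axis can be sketched with a uniform grid. This is not the paper's H-STCA, whose details the abstract does not give; the function names, the cell size, and the time step below are all assumptions made for illustration.

```python
from collections import defaultdict

def cell_key(x, y, t, cell_size, time_step):
    """Map a point (x, y, t) to an integer 3D grid cell (assumed uniform grid)."""
    return (int(x // cell_size), int(y // cell_size), int(t // time_step))

def cluster_by_cell(points, cell_size=100.0, time_step=60.0):
    """Group (object_id, x, y, t) records by the 3D cell they hash to."""
    buckets = defaultdict(list)
    for object_id, x, y, t in points:
        buckets[cell_key(x, y, t, cell_size, time_step)].append(object_id)
    return dict(buckets)

# Hypothetical location records: (object id, x, y, timestamp in seconds).
points = [
    ("a", 10.0, 20.0, 5.0),
    ("b", 90.0, 99.0, 59.0),
    ("c", 150.0, 20.0, 5.0),
    ("a", 10.0, 20.0, 65.0),
]
clusters = cluster_by_cell(points)
print(clusters)
```

Only cells that actually contain records get a bucket, which avoids allocating storage for regions that are never visited.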


An Analysis of a 100-Years-Old Map of the Heritage Trees in Jeju Island (제주도 노거수 자연유산의 100년 전과 현재 분석)

  • Song, Kuk-Man;Kim, Yang-Ji;Seo, Yeon-Ok;Choi, Hyung-Soon;Choi, Byoung-Ki
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.37 no.2 / pp.20-29 / 2019
  • The purpose of this study is to verify and reconstruct the records of big old trees in Jeju on the basis of a precise map of Jeju Island produced in 1918, about 100 years ago. For the analysis of high altitude, the coordinate system was set and georeferencing was performed by selecting representative points using ArcGIS. We extracted digitized information using a point extraction method and extracted attribute information based on legend type and relative size on the map. Based on the 100-year-old map, the present status of big old trees in Jeju and their characteristics were analyzed. In addition, based on information on the currently protected big old trees, we discussed the characteristics of past trees (1918) and present trees (2019) and the contribution of big old trees to the Jeju landscape and vegetation. As a result, 1,013 individuals were found to have been distributed on Jeju Island 100 years ago. Even when timber use was intensive, big old trees were protected and served as a representative component of Jeju's unique landscape. The remaining big old trees in Jeju number 159. As in the past, their distribution is concentrated in the lowlands, but declines in number are found throughout the island. The major factors in the decline are trees reaching the limit of their lifespan, natural disturbance (typhoon, disease, pests, drought, etc.), and large-scale development projects. However, it is presumed that a large number of individuals (405) have played a leading role in forming the present forest by contributing to the species pool in the restoration process of Jeju vegetation.

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The main reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lack regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, showing that different emotions are induced by different colors. Among studies using deep learning, some have applied color information to image emotion classification. Using the image's color information in addition improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the picture's colors. The most frequent two-color combinations were found for all training data; at test time, the most frequent two-color combination was found for each test image, and the result values were corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion using an expression based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model. Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color when building a model that classifies an image's sentiment. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Each color has a meaning. Using Scikit-learn's clustering, the seven colors that are most widely distributed in the image are identified. Then, the RGB coordinate values of these colors are compared with the RGB coordinate values of the 16 colors listed above; that is, each is converted to the closest color. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. Therefore, to solve this problem, two-color combinations were found and used to weight the model. Before training, the most frequent color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary for use during testing. During the test, the most frequent two-color combination is found for each test image. We then checked how that color combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted color as described above. The data set was randomly divided 80:20, and the model was verified using 20% of the data as a test set. The remaining 80% of the data was split into five folds for 5-fold cross-validation, and the model was trained five times using different validation sets. Finally, the performance was checked using the test dataset that had been set aside. Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs, and if the validation loss did not decrease for five epochs, training was stopped. Early stopping was set to load the model with the best validation loss. The classification accuracy was better when the information extracted from color properties was used together with the CNN than when only the CNN architecture was used.
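The color-matching step described in the abstract (convert each dominant color to the closest named color, then take the most frequent two-color combination) might look like the sketch below. It starts from dominant colors that are assumed to have been extracted already, so the K-means step is omitted. The RGB anchor values, the reduced eight-color palette, and the function names are illustrative assumptions, not values from the paper.

```python
# Assumed RGB anchors for a subset of the named colors (illustrative values).
NAMED_COLORS = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "gray": (128, 128, 128),
}

def nearest_named_color(rgb):
    """Return the named color with the smallest squared RGB distance."""
    return min(
        NAMED_COLORS,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, NAMED_COLORS[name])),
    )

def top_two_colors(dominant):
    """Return the two named colors with the largest total pixel share.

    dominant is a list of (rgb, pixel_share) pairs; the result is sorted
    alphabetically so it can serve as a dictionary key.
    """
    share = {}
    for rgb, weight in dominant:
        name = nearest_named_color(rgb)
        share[name] = share.get(name, 0.0) + weight
    ranked = sorted(share, key=share.get, reverse=True)
    return tuple(sorted(ranked[:2]))

# Hypothetical cluster centers with their share of the image's pixels.
dominant = [
    ((250, 10, 5), 0.4),
    ((20, 20, 240), 0.3),
    ((240, 30, 30), 0.2),
    ((10, 10, 10), 0.1),
]
combo = top_two_colors(dominant)
print(combo)
```

Sorting the pair gives one key per combination regardless of order, which suits the per-class dictionary of combination counts that the abstract mentions.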

The Validation Study of the Questionnaire for Sasang Constitution Classification (the 2nd edition revised in 1995) - In the field of profile analysis (사상체질분류검사지(四象體質分類檢査紙)(QSCC)II에 대(對)한 타당화(妥當化) 연구(硏究) -각(各) 체질집단(體質集團)의 군집별(群集別) Profile 분석(分析)을 중심(中心)으로-)

  • Lee, Jung-Chan;Go, Byeong-Hui;Song, Il-Byeong
    • Journal of Sasang Constitutional Medicine / v.8 no.1 / pp.247-294 / 1996
  • Using statistical data collected with the newly revised QSCC from an outpatient group examined at Kyung-Hee Medical Center and from an open group of ordinary persons, the author carried out a statistical analysis to validate the revised questionnaire itself. First, the accurate discrimination rate was checked by performing discriminant analysis on the data of the patient group. Next, T-scores were obtained by applying the norms gained in the standardization of the open ordinary-person group to the Sasang scale scores of the outpatient group, and the distinctive features of the subpopulations divided through multivariate cluster analysis were investigated. The results are summarized as follows: 1. The validity of the questionnaire was established by the fact that the accurate discrimination rate, the ratio of agreement between predicted group and actual group, was 70.08%. 2. In the profile analysis, the response to the relevant scale showed a notable upward tendency in each constitutional group, so the questionnaire appears pertinent for constitutional discrimination. 3. In terms of the power of expression observed through the profile analysis of each constitutional group, the Soyang group demonstrated the most remarkable outcome, the Soeum group was the most inferior, and the Taieum group revealed a sort of dual property. 4. The so-called seceder group among the three subpopulations of each constitutional group was clearly distinguished from the contrasted groups in its distinctive profile features, as described below. (1) The seceder group of Soyang-in showed a considerably passive disposition, unlike the general character of the ordinary Soyang group, and notably demonstrated a comparatively higher response on the Soeum scale. (2) The seceder group of Taieum-in obtained low scores in general, which indicates the passive disposition of the group, and, contrary to the general property of the Taieum group, which showed accompanying rises on the Taiyang and Taieum scales, demonstrated a sharply lower score on the Taiyang scale. (3) The seceder group of Soeum-in demonstrated a distinctive property similar to the profile features of the Soyang group, which indicates that the passive property of the Soeum group was largely diluted. According to the above results, the validity of the newly revised questionnaire has been demonstrated, and the properties of the seceder groups could be observed to some degree through the profile analysis in this study. The results of this study are expected to be used as research material for producing the next edition of the questionnaire, and, as noted several times in the main text, further inquiry into the difference between the seceder group and the contrasted group is required to improve the questionnaire.
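Two quantities in this abstract follow standard definitions: a T-score rescales a raw score against a normative mean and standard deviation, and the discrimination rate is the share of cases where the predicted group matches the actual group. The sketch below assumes the conventional T-score form (mean 50, standard deviation 10); the norm values and group labels are hypothetical, not taken from the study.

```python
def t_score(raw, norm_mean, norm_sd):
    """Conventional T-score: 50 + 10 * z, with z computed against the norm group."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd

def hit_rate(predicted, actual):
    """Fraction of cases where the predicted group equals the actual group."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

# Hypothetical norms for one scale and one patient's raw score.
print(t_score(18.0, norm_mean=14.0, norm_sd=4.0))

# Hypothetical predicted and actual constitution labels.
predicted = ["Soyang", "Soeum", "Taieum", "Soeum"]
actual = ["Soyang", "Soeum", "Taieum", "Taieum"]
print(hit_rate(predicted, actual))
```

Expressing every scale on the same T-score metric is what makes the per-group profiles comparable across scales.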


Increasing Accuracy of Classifying Useful Reviews by Removing Neutral Terms (중립도 기반 선택적 단어 제거를 통한 유용 리뷰 분류 정확도 향상 방안)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.129-142 / 2016
  • Customer product reviews have become one of the important factors in purchase decision making. Customers believe that reviews written by others who have already experienced the product offer more reliable information than that provided by sellers. However, because there are too many products and reviews, the advantage of e-commerce can be outweighed by increasing search costs. Reading all of the reviews to find out the pros and cons of a certain product can be exhausting. To help users find the most useful information about products without much difficulty, e-commerce companies try to provide various ways for customers to write and rate product reviews. To assist potential customers, online stores have devised various ways to provide useful customer reviews. Different methods have been developed to classify and recommend useful reviews to customers, primarily using feedback provided by customers about the helpfulness of reviews. Most shopping websites provide customer reviews and offer the following information: the average preference for a product, the number of customers who have participated in preference voting, and the preference distribution. Most information on the helpfulness of product reviews is collected through a voting system. Amazon.com asks customers whether a review of a certain product is helpful, and it places the most helpful favorable review and the most helpful critical review at the top of the list of product reviews. Some companies also predict the usefulness of a review based on certain attributes, including length, author(s), and the words used, publishing only reviews that are likely to be useful. Text mining approaches have been used to classify useful reviews in advance. To apply a text mining approach to all reviews for a product, we need to build a term-document matrix: we extract all words from the reviews and build a matrix containing the number of occurrences of each term in each review. Since there are many reviews, the term-document matrix becomes very large, which makes it difficult to apply text mining algorithms. Thus, researchers delete some terms on the basis of sparsity, since sparse words have little effect on classification or prediction. The purpose of this study is to suggest a better way of building the term-document matrix by deleting terms that are useless for review classification. In this study, we propose a neutrality index to select the words to be deleted. Many words still appear in both classes, useful and not useful, and these words have little or even a negative effect on classification performance. Thus, we defined such words as neutral terms and deleted neutral terms that appear similarly in both classes. After deleting sparse words, we selected words to be deleted in terms of neutrality. We tested our approach with Amazon.com review data from five product categories: Cellphones & Accessories; Movies & TV program; Automotive; CDs & Vinyl; and Clothing, Shoes & Jewelry. We used reviews that received more than four votes from users, with a 60% ratio of useful votes to total votes as the threshold for classifying useful and not-useful reviews. We randomly selected 1,500 useful reviews and 1,500 not-useful reviews for each product category. We then applied Information Gain and Support Vector Machine algorithms to classify the reviews and compared classification performance in terms of precision, recall, and F-measure. Though performance varies by product category and data set, deleting terms by both sparsity and neutrality showed the best F-measure for the two classification algorithms. However, deleting terms by sparsity only showed the best recall for Information Gain, and using all terms showed the best precision for SVM. Thus, term deletion methods and classification algorithms need to be selected carefully according to the data set.
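The abstract does not give the formula for the neutrality index, so the sketch below uses a stand-in definition chosen only for illustration: a term counts as neutral when its document frequencies in the useful and not-useful classes are nearly equal. The formula, the 0.9 threshold, and the counts are all assumptions, not the authors' method.

```python
def neutrality(useful_count, not_useful_count):
    """Assumed index in [0, 1]; 1.0 means the term is equally frequent in both classes."""
    total = useful_count + not_useful_count
    if total == 0:
        return 0.0
    return 1.0 - abs(useful_count - not_useful_count) / total

def drop_neutral_terms(doc_freq, threshold=0.9):
    """Keep terms whose assumed neutrality is below the threshold.

    doc_freq maps term -> (useful_doc_freq, not_useful_doc_freq).
    """
    return {
        term
        for term, (useful, not_useful) in doc_freq.items()
        if neutrality(useful, not_useful) < threshold
    }

# Hypothetical document frequencies per class.
doc_freq = {
    "battery": (120, 40),
    "the": (1500, 1490),
    "refund": (10, 90),
}
kept = drop_neutral_terms(doc_freq)
print(sorted(kept))
```

In the pipeline the abstract describes, a filter of this kind would run after sparse terms have already been removed from the term-document matrix.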

Development of Sauces Made from Gochujang Using the Quality Function Deployment Method: Focused on U.S. and Chinese Markets (품질기능전개(Quality Function Deployment) 방법을 적용한 고추장 소스 콘셉트 개발: 미국과 중국 시장을 중심으로)

  • Lee, Seul Ki;Kim, A Young;Hong, Sang Pil;Lee, Seung Je;Lee, Min A
    • Journal of the Korean Society of Food Science and Nutrition / v.44 no.9 / pp.1388-1398 / 2015
  • Quality Function Deployment (QFD) is the most complete and comprehensive method for translating what customers need from a product. This study utilized QFD to develop sauces made from Gochujang and to determine how to fulfill international customers' requirements. A customer survey and an expert opinion survey were conducted from May 13 to August 22, 2014, targeting 220 consumers and 20 experts in the U.S. and China. In total, 208 usable responses (190 consumers and 18 experts) were selected. The top three customer requirements for Gochujang sauces were identified as fresh flavor (4.40), making better flavor (3.99), and cooking availability (3.90). Thirty-three engineering characteristics were developed. The calculation of the relative importance of the engineering characteristics identified 'cooking availability', 'free sample and food testing', 'unique concept', and 'development of brand' as the highest. The relative importance of engineering characteristics, correlation, and technical difficulties were ranked, and this result could contribute to the development of Korean sauces based on customer needs and engineering characteristics.
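In a typical QFD house of quality, the relative importance of an engineering characteristic is the sum of customer-requirement weights multiplied by relationship strengths, normalized over all characteristics. The sketch below uses the three requirement scores quoted in the abstract as weights; the relationship strengths (on the common 1/3/9 scale) are hypothetical, and the study's own matrix is not given in the abstract.

```python
# Customer requirement importance scores quoted in the abstract.
customer_weights = {
    "fresh flavor": 4.40,
    "making better flavor": 3.99,
    "cooking availability": 3.90,
}

# Hypothetical relationship strengths (1 = weak, 3 = medium, 9 = strong).
relationships = {
    "unique concept": {
        "fresh flavor": 3,
        "making better flavor": 9,
        "cooking availability": 1,
    },
    "free sample and food testing": {
        "fresh flavor": 1,
        "cooking availability": 3,
    },
}

def relative_importance(weights, rel):
    """Weighted-sum importance per engineering characteristic, normalized to sum to 1."""
    raw = {
        characteristic: sum(weights[req] * strength for req, strength in row.items())
        for characteristic, row in rel.items()
    }
    total = sum(raw.values())
    return {characteristic: value / total for characteristic, value in raw.items()}

importance = relative_importance(customer_weights, relationships)
for characteristic, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(characteristic, round(value, 3))
```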

A Study on the Selection of Parameter Values of FUSION Software for Improving Airborne LiDAR DEM Accuracy in Forest Area (산림지역에서의 LiDAR DEM 정확도 향상을 위한 FUSION 패러미터 선정에 관한 연구)

  • Cho, Seungwan;Park, Joowon
    • Journal of Korean Society of Forest Science / v.106 no.3 / pp.320-329 / 2017
  • This study aims to evaluate whether the accuracy of a LiDAR DEM is affected by changes in the five input levels ('1', '3', '5', '7', and '9') of the median parameter ($F_{md}$) and mean parameter ($F_{mn}$) of the Filtering Algorithm (FA) in the GroundFilter module and of the median parameter ($I_{md}$) and mean parameter ($I_{mn}$) of the Interpolation Algorithm (IA) in the GridSurfaceCreate module of FUSION, in order to present the combination of parameter levels that produces the most accurate LiDAR DEM. Accuracy is measured by the residuals, calculated as the difference between the field elevation values and their corresponding DEM elevation values. A multi-way ANOVA is used to statistically examine whether parameter level changes affect the means of the residuals. The Tukey HSD is conducted as a post-hoc test. The results of the multi-way ANOVA show that changes in the levels of $F_{md}$, $F_{mn}$, and $I_{mn}$ have significant effects on DEM accuracy, with a significant interaction effect between $F_{md}$ and $F_{mn}$. Therefore, the levels of $F_{md}$ and $F_{mn}$ and the interaction between the two variables, as well as the level of $I_{mn}$, are considered factors affecting the accuracy of the LiDAR DEM. According to the Tukey HSD test on the combination levels of $F_{md}{\ast}F_{mn}$, the mean of residuals of the '$9{\ast}3$' combination provides the highest accuracy, while the '$1{\ast}1$' combination provides the lowest. Regarding $I_{mn}$ levels, the means of residuals of both '3' and '1' provide the highest accuracy. This study can contribute to improving the accuracy of forest attributes as well as of the topographic information extracted from LiDAR data.
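The accuracy measure described here, residuals between field and DEM elevations averaged per parameter combination, can be sketched as below. Taking the absolute value of each residual is an assumption made so that positive and negative errors do not cancel; the abstract only says "mean of residuals". The elevation values are hypothetical, and the ANOVA and Tukey HSD steps are not reproduced.

```python
from collections import defaultdict
from statistics import mean

def mean_abs_residual_by_combo(records):
    """Average |field - DEM| for each (F_md, F_mn) parameter combination.

    records is an iterable of (f_md, f_mn, field_elevation, dem_elevation).
    """
    groups = defaultdict(list)
    for f_md, f_mn, field_elev, dem_elev in records:
        groups[(f_md, f_mn)].append(abs(field_elev - dem_elev))
    return {combo: mean(values) for combo, values in groups.items()}

# Hypothetical check points: two per parameter combination, elevations in meters.
records = [
    (9, 3, 101.0, 100.5),
    (9, 3, 98.0, 98.5),
    (1, 1, 101.0, 99.0),
    (1, 1, 98.0, 101.0),
]
result = mean_abs_residual_by_combo(records)
best = min(result, key=result.get)
print(result, best)
```

The combination with the smallest mean residual is the most accurate under this measure.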