• Title/Summary/Keyword: similarity cost

Search Results: 182

A Study on Clustering of Core Competencies to Deploy in and Develop Courseworks for New Digital Technology (카드소팅을 활용한 디지털 신기술 과정 핵심역량 군집화에 관한 연구)

  • Ji-Woon Lee;Ho Lee;Joung-Huem Kwon
    • Journal of Practical Engineering Education
    • /
    • v.14 no.3
    • /
    • pp.565-572
    • /
    • 2022
  • Card sorting is a useful data-collection method for understanding users' perceptions of the relationships between items. In general, card sorting is an intuitive and cost-effective technique that is very useful for user research and evaluation. In this study, the core competencies of each field were used as the competency cards in the card-sorting stage of course development, and clusters were derived by applying the K-means algorithm to the sorting results. The resulting competency clusters for the core competencies of each occupation in each field were verified based on Participant-Centric Analysis (PCA). For each occupation's set of core competency cards, the number of participants whose sorts agreed sufficiently for clustering and the degree of card similarity were derived relative to the number of sorting participants.
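The clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the competency cards, the co-occurrence counts, and the evenly spaced centroid initialization are all hypothetical, and a plain k-means is written out by hand.

```python
def kmeans(vectors, k, iters=100):
    """Plain k-means: assign each vector to its nearest centroid,
    then recompute centroids, until assignments stop changing."""
    step = max(1, len(vectors) // k)
    centroids = [vectors[i * step] for i in range(k)]   # spread-out init
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(v, centroids[c])))
            clusters[nearest].append(v)
        new = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return clusters

# Hypothetical co-occurrence rows: card i's row counts how many of 5
# participants grouped it with each other card (self count = 5).
cards = {
    "SQL":        (5, 4, 4, 0, 1, 0),
    "Python":     (4, 5, 5, 1, 0, 0),
    "Statistics": (4, 5, 5, 0, 1, 1),
    "Teamwork":   (0, 1, 0, 5, 4, 5),
    "Presenting": (1, 0, 1, 4, 5, 4),
    "Leadership": (0, 0, 1, 5, 4, 5),
}
clusters = kmeans(list(cards.values()), k=2)
```

Representing each card by its row of the participant co-occurrence matrix means that cards sorted together by many participants end up near each other, so k-means recovers the groups the participants implicitly agreed on.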

Sleep/Wake Dynamic Classifier based on Wearable Accelerometer Device Measurement (웨어러블 가속도 기기 측정에 의한 수면/비수면 동적 분류)

  • Park, Jaihyun;Kim, Daehun;Ku, Bonhwa;Ko, Hanseok
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.6
    • /
    • pp.126-134
    • /
    • 2015
  • A sleep disorder is recognized as one of the major health issues related to high levels of stress, and interest in sleep quality is rapidly increasing. However, diagnosing a sleep disorder is not a simple task, because the patient must undergo a polysomnography test, which requires a long time and a high cost. To solve this problem, a wrist-worn device with an embedded accelerometer is being considered as a simple and low-cost alternative. However, conventional methods classify the user's state as "sleep" or "wake" according to whether the accelerometer values in each section exceed a certain threshold. As a result, a high misclassification rate is observed, caused by the user's intermittent movements while sleeping and tiny movements while awake. In this paper, we propose a novel method that resolves these problems by employing a dynamic classifier that evaluates the similarity between neighboring data scores obtained from an SVM classifier. The performance of the proposed method is evaluated using 50 data sets, and its superiority is verified by achieving 88.9% accuracy, 88.9% sensitivity, and 88.5% specificity.
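The core idea of the abstract, that an isolated movement epoch should not flip the sleep/wake decision when its neighbors disagree, can be sketched with a simple moving-average smoothing over per-epoch scores. This is only an illustration of the principle: the paper trains an SVM and a dynamic classifier, while the scores and window size below are hypothetical.

```python
def classify_epochs(scores, threshold=0.0, window=2):
    """Per-epoch 'wake' if score > threshold (conventional rule).
    Smoothed variant: average each score with its neighbors first,
    so isolated movements during sleep (or brief stillness while
    awake) are damped by the surrounding epochs."""
    naive = ["wake" if s > threshold else "sleep" for s in scores]
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - window), min(len(scores), i + window + 1)
        mean = sum(scores[lo:hi]) / (hi - lo)
        smoothed.append("wake" if mean > threshold else "sleep")
    return naive, smoothed

# Hypothetical per-epoch scores: positive = movement/wake evidence.
scores = [-2.0, -2.0, 3.0, -2.0, -2.0]   # isolated movement during sleep
naive, smoothed = classify_epochs(scores)
```

The naive rule labels the third epoch "wake", while the neighborhood average keeps it "sleep", which is the kind of misclassification the paper's dynamic classifier is designed to suppress.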

Multimodal Route Selection from Korea to Europe Using Fuzzy AHP-TOPSIS Approaches: The Perspective of the China-Railway Express (한-유럽 복합운송 경로선택에 관한 연구 중국-유럽 화물열차를 중심으로)

  • Wang, Guan;Ahn, Seung-Bum
    • Journal of Korea Port Economic Association
    • /
    • v.37 no.4
    • /
    • pp.13-31
    • /
    • 2021
  • Since the signing of the Korea-Europe Free Trade Agreement, the volume of trade between South Korea and Europe has increased. The traditional single-mode transport system has been transformed into an intermodal system using two or more modes of transport. In addition, under the influence of COVID-19, the conventional sea and air routes have been restricted, leading to a decline in Korean exports to Europe, while rail transport is becoming mainstream in the market. This paper focuses on the China-Railway Express to explore a new intermodal transport route from Korea to Europe. First, the fuzzy analytic hierarchy process (AHP) is used to evaluate the factor weights for selecting intermodal transport routes from Korea to Europe. Then, the TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) method is used to rank three alternative routes. The results show that among the four factors (total cost, total time, transportation capability, and service reliability), total cost is the most significant, followed by total time, service reliability, and transportation capability. Furthermore, alternative Route 1 (Incheon-Dalian-Manchuria-Hamburg) is preferred.
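The TOPSIS ranking step mentioned in the abstract can be sketched directly. The decision matrix and the weights below are illustrative only (the paper's fuzzy-AHP weights and route data are not reproduced); the four criteria follow the abstract, with cost and time treated as cost-type criteria.

```python
import math

def topsis(matrix, weights, benefit):
    """TOPSIS: rank alternatives by relative closeness to the ideal
    solution. matrix[i][j] = score of alternative i on criterion j;
    benefit[j] = True if larger is better, False for cost criteria."""
    n, m = len(matrix), len(matrix[0])
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(n))) for j in range(m)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(m)] for i in range(n)]
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    d_best = [math.dist(row, best) for row in v]      # distance to ideal
    d_worst = [math.dist(row, worst) for row in v]    # distance to anti-ideal
    return [dw / (db + dw) for db, dw in zip(d_best, d_worst)]

# Hypothetical scores for three routes on (total cost, total time,
# transportation capability, service reliability); weights are also
# illustrative, with cost weighted highest as in the paper's finding.
matrix = [[5200, 28, 7, 8],    # Route 1 (Incheon-Dalian-Manchuria-Hamburg)
          [5600, 30, 8, 7],    # Route 2
          [6100, 26, 6, 9]]    # Route 3
closeness = topsis(matrix, weights=[0.4, 0.3, 0.1, 0.2],
                   benefit=[False, False, True, True])
```

Each alternative gets a closeness coefficient in [0, 1]; the route with the largest coefficient is the preferred one.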

The Adaptive Personalization Method According to Users Purchasing Index : Application to Beverage Purchasing Predictions (고객별 구매빈도에 동적으로 적응하는 개인화 시스템 : 음료수 구매 예측에의 적용)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.95-108
    • /
    • 2011
  • This is a study of a personalization method that intelligently adapts the level of clustering to a customer's purchasing index. In the e-business era, many companies gather customers' demographic and transactional information, such as age, gender, purchase date, and product category. They use this information to predict customers' preferences or purchasing patterns so that they can provide more customized services. The conventional Customer-Segmentation method provides customized services for each customer group: it clusters the whole customer set into groups based on similarity and builds a predictive model for each resulting group. This keeps the number of predictive models manageable and also provides more data for customers who do not have enough of their own to build a good predictive model, by borrowing the data of similar customers. However, this method often fails to provide highly personalized services to each customer, which is especially important for VIP customers. Furthermore, it clusters customers who already have a considerable amount of data together with customers who have only a little, which increases computational cost unnecessarily without significant performance improvement. The other conventional method, the 1-to-1 method, provides more customized services than the Customer-Segmentation method, since each predictive model is built using only that individual customer's data. This method not only provides highly personalized services but also builds a relatively simple and less costly model for each customer. However, the 1-to-1 method does not produce a good predictive model when a customer has only a few records; in other words, when a customer's transactional data are insufficient, its performance deteriorates.
To overcome the limitations of these two conventional methods, we suggest a new method, called the Intelligent Customer Segmentation method, that provides adaptively personalized services according to the customer's purchasing index. The suggested method clusters customers according to their purchasing index, so that predictions for customers with few purchases are based on data from more intensively clustered groups, while VIP customers, who already have a considerable amount of data, are clustered to a much lesser extent or not at all. The main idea is to apply the clustering technique only when the target customer's number of transactions is less than a predefined criterion data size. To find this criterion, we propose an algorithm called sliding-window correlation analysis, which seeks the transactional data size below which the performance of the 1-to-1 method drops sharply due to data sparsity. After finding this criterion data size, we apply the conventional 1-to-1 method to customers who have more data than the criterion, and apply the clustering technique to those who have less, until they can use at least the criterion amount of data in the model-building process. We apply the two conventional methods and the newly suggested method to Nielsen's beverage purchasing data to predict customers' purchase amounts and purchase categories. We use two data-mining techniques (Support Vector Machine and Linear Regression) and two performance measures (MAE and RMSE) to predict the two dependent variables. The results show that the suggested Intelligent Customer Segmentation method outperforms the conventional 1-to-1 method in many cases and achieves the same level of performance as the Customer-Segmentation method at a much lower computational cost.
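The adaptive decision rule described here, use a customer's own data when it exceeds the criterion size and otherwise borrow from similar customers, can be sketched as below. Everything in this sketch is hypothetical: the similarity proxy (average purchase amount), the customer data, and the greedy pooling loop stand in for the paper's clustering and sliding-window correlation analysis.

```python
def training_pool(target, customers, criterion):
    """Adaptive personalization rule: a customer with at least
    `criterion` transactions gets a 1-to-1 model trained on their own
    data; a sparser customer borrows data from the most similar
    customers until the pool reaches `criterion` records."""
    own = customers[target]
    if len(own) >= criterion:
        return list(own)                 # 1-to-1: no clustering needed

    def avg(txs):
        return sum(txs) / len(txs)

    # Rank other customers by similarity of average purchase amount
    # (a toy proxy for the paper's similarity-based clustering).
    others = sorted((c for c in customers if c != target),
                    key=lambda c: abs(avg(customers[c]) - avg(own)))
    pool = list(own)
    for c in others:
        if len(pool) >= criterion:
            break
        pool.extend(customers[c])
    return pool

# Hypothetical purchase-amount histories per customer.
customers = {
    "vip":   [12, 11, 13, 12, 14, 10, 12, 13, 11, 12],
    "light": [3, 4],
    "mid":   [5, 4, 6, 5],
    "other": [20, 22, 21],
}
vip_pool = training_pool("vip", customers, criterion=6)
light_pool = training_pool("light", customers, criterion=6)
```

The VIP customer keeps a purely individual training set, while the light customer's two records are topped up with data from the most similar customer until the criterion size is met.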

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in text because the amount of data is very large and it is relatively easy to collect compared with other unstructured and structured data. Among the various text-analysis applications, document classification, which assigns documents to predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main contents of one or several documents, have been actively studied. In particular, text summarization is actively applied in business through news summary services, privacy-policy summary services, etc. In academia, much research has followed either the extraction approach, which selectively provides the main elements of a document, or the abstraction approach, which extracts elements of the document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as much as automatic text summarization itself. Most existing studies of summarization quality evaluation manually summarized documents, used them as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, automatic summarization is performed on the full text through various techniques, and the quality of the automatic summary is measured by comparison with the reference document, which serves as an ideal summary.
Reference documents are provided in two major ways; the most common is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention, it takes a lot of time and cost to prepare the summary, and the evaluation result may differ depending on who writes it. Therefore, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method has recently been devised that reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more frequently a term from the full text appears in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means minimizing the amount of content while also minimizing content omissions, a summary judged "good" on frequency alone is not necessarily a good summary in this essential sense. To overcome the limitations of these previous studies of summarization evaluation, this study proposes an automatic quality-evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little of the original content is missing from the summary. We propose a method for automatic quality evaluation of text summarization based on these two concepts.
To evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor's hotel reviews, the reviews for each hotel were summarized, and the quality of the summaries was evaluated according to the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method for performing optimal summarization by changing the threshold of sentence similarity.
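The completeness/succinctness/F-score scheme described above can be sketched as follows. Jaccard word overlap is used here only as a stand-in for the paper's sentence-similarity measure, and the review sentences and the 0.3 threshold are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (a simple
    stand-in for the paper's sentence-similarity measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def evaluate_summary(source, summary, threshold=0.3):
    """Completeness: share of source sentences covered by some summary
    sentence. Succinctness: share of summary sentences that do not
    duplicate an earlier summary sentence. F-score: harmonic mean of
    the two, capturing their trade-off."""
    covered = sum(any(jaccard(s, t) >= threshold for t in summary)
                  for s in source)
    completeness = covered / len(source)
    dup = sum(any(jaccard(summary[i], summary[j]) >= threshold
                  for j in range(i))
              for i in range(len(summary)))
    succinctness = 1 - dup / len(summary)
    total = completeness + succinctness
    f = 2 * completeness * succinctness / total if total else 0.0
    return completeness, succinctness, f

source = ["the room was clean", "staff were friendly", "breakfast was poor"]
summary = ["the room was clean", "staff were friendly and helpful"]
completeness, succinctness, f_score = evaluate_summary(source, summary)
```

Here the summary covers two of three source sentences (completeness 2/3) with no internal duplication (succinctness 1), so the F-score reflects the omission of the breakfast sentence.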

Determining the Size of a Hankel Matrix in Subspace System Identification for Estimating the Stiffness Matrix and Flexural Rigidities of a Shear Building (전단빌딩의 강성행렬 및 부재의 강성추정을 위한 부분공간 시스템 확인기법에서의 행켈행렬의 크기 결정)

  • Park, Seung-Keun;Park, Hyun Woo
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.26 no.2
    • /
    • pp.99-112
    • /
    • 2013
  • This paper presents a subspace system identification method for estimating the stiffness matrix and flexural rigidities of a shear building. System matrices are estimated by LQ decomposition and singular value decomposition of an input-output Hankel matrix. The estimated system matrices are converted into real coordinates through a similarity transformation, and the stiffness matrix is estimated from them. The accuracy and stability of the estimated stiffness matrix depend on the size of the associated Hankel matrix. The estimation-error curve of the stiffness matrix with respect to the size of the Hankel matrix is obtained using a prior finite element model of the shear building. Hankel-matrix sizes consistent with a target accuracy level are chosen from this curve; among these candidates, a more suitable size can then be determined by considering the computational cost of subspace identification. The stiffness matrix and flexural rigidities are estimated using Hankel matrices of the candidate sizes. The validity of the proposed method is demonstrated through a numerical example of a five-story shear building model with and without damage.
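The role of the Hankel matrix in subspace identification can be illustrated on a toy scalar signal. This is not the paper's LQ+SVD pipeline with input-output data; it is a minimal sketch, with a hypothetical noise-free two-mode response, showing that the singular values of the Hankel matrix reveal the model order for any sufficiently large size, while the factorization cost grows with the size chosen.

```python
import numpy as np

def hankel(seq, rows):
    """Hankel matrix whose i-th row is seq[i : i + cols]."""
    cols = len(seq) - rows + 1
    return np.array([seq[i:i + cols] for i in range(rows)])

# Hypothetical free response with two decaying modes: the numerical
# rank of its Hankel matrix equals the model order (2) for any row
# count >= 2, while SVD cost increases with the matrix size.
k = np.arange(40)
y = 0.9 ** k + (-0.5) ** k

orders = {}
for rows in (2, 4, 8):
    sv = np.linalg.svd(hankel(y, rows), compute_uv=False)
    orders[rows] = int(np.sum(sv > 1e-8 * sv[0]))   # numerical rank
```

With noisy measured data the picture changes (larger Hankel matrices average out noise at higher cost), which is exactly the accuracy-versus-cost trade-off the paper resolves with its error curve.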

A Running Stability Test of 1/5 Scaled Bogie using Small-Scaled Derailment Simulator (소형탈선시뮬레이터를 이용한 1/5 축소대차의 주행안정성 시험)

  • Eom, Beom-Gyu;Kang, Bu-Byoung;Lee, Hi-Sung
    • Journal of the Korean Society for Railway
    • /
    • v.15 no.1
    • /
    • pp.9-16
    • /
    • 2012
  • The dynamic stability of railway vehicles has been one of the important issues in railway safety. Dynamic simulators have been used in studies of railway vehicle dynamic stability and wheel/rail interface optimization. In particular, small-scale simulators have been widely used for fundamental studies in the laboratory instead of full-scale roller rigs, which are not cost-effective and are inconvenient for testing diverse design parameters. However, techniques for designing a small-scale simulator that reproduces the dynamic characteristics of the wheel-rail system and the bogie system have not been well developed in Korea. Therefore, research using a small-scaled derailment simulator and a 1/5 scaled bogie has been conducted. In this paper, we performed a running stability test of the 1/5 scaled bogie using the small-scaled derailment simulator. To operate the small-scale simulator, it is also necessary to investigate the performance and characteristics of the simulator system, which can be achieved through a comparative study between analysis and experiment. This paper presents an analytical model that can be used to verify the test results and to understand the physical behavior of the dynamic system comprising the small-scaled derailment simulator and the 1/5 scaled bogie.

Practical Classification of Herbicide by Two-dimensional Ordination Analysis in Transplanted Lowland Rice Field (Two-dimensional Ordination 분석법(分析法)에 의한 제초제(除草劑) 살초(殺草) Spectrum 분류(分類)에 관한 연구(硏究))

  • Kim, Soon-Chul;Park, Rae-Kyeong
    • Korean Journal of Weed Science
    • /
    • v.2 no.2
    • /
    • pp.129-140
    • /
    • 1982
  • Herbicides were classified by two-dimensional ordination analysis based on the weed flora that was not controlled by application of a particular herbicide. The number of herbicide groups varied depending on the weed community type and the experiment site. The two-dimensional ordination analysis gave more comprehensive information for selecting herbicides: for increasing herbicidal efficacy, for widening the weed-control spectrum, and for reducing herbicide cost through herbicide mixtures. Two-dimensional ordination analysis can be used not only for herbicide classification and for selecting an effective herbicide or herbicide combination, but also for evaluating the systematic application of herbicides.

A Korean Community-based Question Answering System Using Multiple Machine Learning Methods (다중 기계학습 방법을 이용한 한국어 커뮤니티 기반 질의-응답 시스템)

  • Kwon, Sunjae;Kim, Juae;Kang, Sangwoo;Seo, Jungyun
    • Journal of KIISE
    • /
    • v.43 no.10
    • /
    • pp.1085-1093
    • /
    • 2016
  • A community-based question answering system provides answers to questions from the documents uploaded on web communities. To enhance question analysis, former methods have developed specific rules suited to a target domain or have applied machine learning to only some of the processing steps. However, these methods incur an excessive cost when expanding to new fields, or lead to a system overfitted to a specific field. This paper proposes a multiple-machine-learning method that automates the overall process by adopting an appropriate machine-learning technique for each procedure of a community-based question answering system. The system is divided into a question-analysis part and an answer-selection part. The question-analysis part consists of a question-focus extractor, which analyzes the focused phrases in questions using conditional random fields, and a question-type classifier, which classifies the topics of questions using a support vector machine. In the answer-selection part, we train the weights used by the similarity-estimation models through an artificial neural network. In addition, there are many cases in which the results of morphological analysis are unreliable for data uploaded on web communities; we therefore suggest a method that minimizes the impact of morphological analysis by using character features in the question-analysis stage. The proposed system outperforms the former system, with a Mean Average Precision of 0.765 and an R-Precision of 0.872.
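The character-feature idea in this abstract, classifying question types without trusting a morphological analyzer, can be sketched with character bigrams and cosine similarity. This is an illustrative stand-in for the paper's SVM question-type classifier: the nearest-example rule, the English toy questions, and the labels are all hypothetical.

```python
from collections import Counter
import math

def char_ngrams(text, n=2):
    """Character n-gram counts; usable even when morphological
    analysis of noisy community text is unreliable."""
    t = text.replace(" ", "")
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(question, labeled):
    """Nearest labeled example by character-bigram cosine similarity
    (a toy stand-in for the paper's SVM classifier)."""
    return max(labeled,
               key=lambda ex: cosine(char_ngrams(question),
                                     char_ngrams(ex[0])))[1]

labeled = [("where is the nearest station", "LOCATION"),
           ("when does the store open", "TIME")]
```

Because the features are raw character sequences, misspellings or segmentation errors degrade the similarity only gradually rather than breaking a word-level feature outright.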

Representative Labels Selection Technique for Document Cluster using WordNet (문서 클러스터를 위한 워드넷기반의 대표 레이블 선정 방법)

  • Kim, Tae-Hoon;Sohn, Mye
    • Journal of Internet Computing and Services
    • /
    • v.18 no.2
    • /
    • pp.61-73
    • /
    • 2017
  • In this paper, we propose a document-cluster labeling method that uses the information content of the words in each cluster to convey what the cluster implies. To do so, we calculate the weight and frequency of the words; these two measures determine the relative weight of the words in the cluster. As a next step, we identify candidate labels using WordNet: each candidate label is the least common hypernym of the words in the cluster. Finally, the representative labels are determined with respect to the information content and the weight of the words. To demonstrate the superiority of our method, we perform a heuristic experiment using two measures, the suitability of the candidate label ($Suitability_{cl}$) and the appropriacy of the representative label ($Appropriacy_{rl}$). With the proposed method, the suitability of the candidate label decreases slightly compared with existing methods, but the computational cost is about 20% of theirs. We also confirmed that the appropriacy of the representative label gives better results than the existing methods. As a result, the method is expected to help data analysts interpret document clusters more easily.
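The least-common-hypernym step can be sketched over a toy taxonomy. The hand-coded hierarchy below is a hypothetical miniature stand-in for WordNet's noun hierarchy (the paper itself queries WordNet), but the walk-up-and-intersect logic is the same.

```python
# Toy hypernym taxonomy (each word maps to its immediate hypernym);
# a hypothetical miniature of WordNet's noun hierarchy.
HYPERNYM = {
    "dog": "canine", "canine": "carnivore", "carnivore": "mammal",
    "cat": "feline", "feline": "carnivore",
    "mammal": "animal", "animal": "entity",
    "sparrow": "bird", "bird": "animal",
}

def path_to_root(word):
    """The word followed by its chain of hypernyms up to the root."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def least_common_hypernym(a, b):
    """Deepest ancestor shared by both words: the candidate cluster
    label in the method described above."""
    ancestors = set(path_to_root(b))
    for node in path_to_root(a):
        if node in ancestors:
            return node
    return None
```

For a cluster containing "dog" and "cat" this yields "carnivore", a label more specific (higher information content) than the shared ancestor "animal" that a cluster containing "dog" and "sparrow" would receive.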