• Title/Summary/Keyword: hierarchical clustering (계층적 군집화)

Automatic e-mail Hierarchy Classification using Dynamic Category Hierarchy and Principal Component Analysis (PCA와 동적 분류체계를 사용한 자동 이메일 계층 분류)

  • Park, Sun
    • Journal of Advanced Navigation Technology / v.13 no.3 / pp.419-425 / 2009
  • The volume of incoming e-mail is increasing rapidly with the widespread use of the Internet, so incoming e-mails increasingly need to be classified efficiently and accurately. Current e-mail classification techniques focus on two-way classification, filtering spam from normal mail, based mainly on Bayesian and rule-based methods. Clustering methods have been used for multi-way classification of e-mails, but they suffer from low classification accuracy and produce no category labels. Classification methods, in turn, require the user to train the classifier and define the category labels. In this paper, we propose a novel multi-way e-mail hierarchy classification method that uses PCA for automatic category generation and a dynamic category hierarchy for high classification accuracy. It classifies large volumes of incoming e-mail automatically, efficiently, and accurately.

Building Matching Analysis and New Building Update for the Integrated Use of the Digital Map and the Road Name Address Map (수치지도와 도로명주소지도의 통합 활용을 위한 건물 매칭 분석과 신규 건물 갱신)

  • Yeom, Jun Ho;Huh, Yong;Lee, Jeabin
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.32 no.5 / pp.459-467 / 2014
  • The importance of fusing and associating established spatial information has gradually increased with the production and supply of various spatial data by public institutions. Generating the necessary spatial information without field investigation or additional surveying can reduce time, labor, and financial costs. However, the integration of the newly introduced road name address map with the digital map has not been studied sufficiently. Even though the use of the road name address map is encouraged for public works related to spatial information, the digital map is still widely used because it is the national basic map. Therefore, in this study, building matching and updating were performed to associate the digital map with the road name address map. After geometric calibration using the block-based ICP (Iterative Closest Point) method, multi-scale corresponding pair searching with hierarchical clustering was applied to detect multi-type matches. The accuracy assessment showed that the proposed method is more than 95% accurate and that the matched building layer of the two maps is useful for integrated application and fusion. In addition, the use of the road name address map, which carries the latest and most frequently renewed data, enables cost-effective updating of new buildings.

Validation Technique of Simulation Model using Weighted F-measure with Hierarchical X-means (WF-HX) Method (계층적 X-means와 가중 F-measure를 통한 시뮬레이션 모델 검증 기법)

  • Yang, Dae-Gil;HwangBo, Hun;Cheon, Hyun-Jae;Lee, Hong-Chul
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.2 / pp.562-574 / 2012
  • The simulation validation techniques employed in most studies are statistical analyses, which validate a model using the mean or variance of throughput and resource utilization as the evaluation target. However, these methods have not been able to ensure the reliability of the individual elements of the model. To overcome this problem, the weighted F-measure method was proposed, but this technique also has limitations. First, it is difficult to apply to complex system environments with numerous interarrival time values, because it assigns a class to each individual interarrival time value. In addition, because the weights are unbounded, the weighted F-measure has no lower bound, so it is difficult to determine a threshold. Therefore, this paper proposes a weighted F-measure technique combined with cluster analysis to solve these problems. The classes for the technique are defined by the clusters, which considerably reduces the number of classes and allows the technique to be applied to various systems. Moreover, we improved the validation technique by assigning weights with a minimum bound without any loss of objectivity.

A Study on the Improvement Direction of Selection Evaluation Indicators for the Land Transport Technology Commercialization Support Project: Focusing on the Follow-up Project Linkage Plan (국토교통기술사업화지원사업 선정평가 지표 개선방안 연구: 후속사업 연계 방안을 중심으로)

  • Hyung-Wook Shim;Seok-Ki Cha;Seung-Hee Back
    • Journal of Industrial Convergence / v.20 no.12 / pp.87-96 / 2022
  • The Ministry of Land, Infrastructure and Transport has been promoting the land transport technology commercialization support project to commercialize technologies owned by small and medium-sized venture companies and to support the transfer and commercialization of public technologies. To improve the investment effect of subsequent new projects and to select excellent research institutes, it is necessary to establish a valid evaluation index system suited to the purpose of the project. The evaluation index system for subsequent new projects should be linked to the objectives and goals of the preceding project, and should be chosen with existing evaluation indicators in mind to prevent interruption of research results. Therefore, this study sets up the evaluation index system as multiple scenarios through hierarchical cluster analysis, using the evaluation result data for each evaluation committee for small and medium venture companies participating in the land transportation technology commercialization support project, and then analyzes a structural equation model. In the scenario analysis, considering the measurement effect of each path representing the causal relationships between evaluation indicators and the effect of each evaluation index on the evaluation items, the scenario with the highest impact on the evaluation result was selected as the improvement plan.

Development of an SNP set for marker-assisted breeding based on the genotyping-by-sequencing of elite inbred lines in watermelon (수박 엘리트 계통의 GBS를 통한 마커이용 육종용 SNP 마커 개발)

  • Lee, Junewoo;Son, Beunggu;Choi, Youngwhan;Kang, Jumsoon;Lee, Youngjae;Je, Byoung Il;Park, Younghoon
    • Journal of Plant Biotechnology / v.45 no.3 / pp.242-249 / 2018
  • This study was conducted to develop an SNP set useful for marker-assisted breeding (MAB) in watermelon (Citrullus lanatus L.) using genotyping-by-sequencing (GBS) analysis of 20 commercial elite watermelon inbreds. The GBS results showed that 77% of approximately 1.1 billion raw reads were mapped to the watermelon genome with an average mapping region of about 4,000 Kb, indicating a genome coverage of 2.3%. After filtering, a total of 2,670 SNPs with an average depth of 31.57 and PIC (Polymorphic Information Content) values of 0.1~0.38 across the 20 elite inbreds were obtained. Among those SNPs, 55 SNPs (5 SNPs per chromosome, evenly distributed on each chromosome) were selected. To understand the genetic relationships among the 20 elite inbreds, PCA (Principal Component Analysis) was carried out with the 55 SNPs, which classified the inbreds into 4 groups based on PC1 (52%) and PC2 (11%) and thus differentiated the inbreds. Hierarchical clustering analysis showed a classification pattern similar to that of PCA. The SNP set developed in this study has potential for application to cultivar identification, F1 seed purity testing, and marker-assisted backcrossing (MABC), not only for the 20 elite inbreds but also for diverse watermelon breeding resources.

Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

  • Seong, NohYoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
  • Because stock price forecasting is an important issue both academically and practically, research on stock price prediction has been actively conducted. Stock price forecasting research is divided into studies using structured data and studies using unstructured data. With structured data such as historical stock prices and financial statements, past studies usually used technical analysis and fundamental analysis. In the big data era, the amount of information has increased rapidly, and artificial intelligence methods that find meaning by quantifying text, an unstructured form of data that accounts for a large share of information, have developed rapidly. With these developments, many attempts are being made to predict stock prices from online news by applying text mining to unstructured data. The methodology adopted in many papers forecasts a stock price using news about the target company. However, according to previous research, a company's stock price is affected not only by news about that company but also by news about related companies. Finding highly relevant companies is not easy because of market-wide effects and random signals. Thus, existing studies have identified highly relevant companies based primarily on predetermined international industry classification standards. However, according to recent research, the Global Industry Classification Standard (GICS) has varying homogeneity within its sectors, so forecasting stock prices by taking all companies in a sector together, rather than only the relevant ones, can adversely affect predictive performance. To overcome this limitation, we first used random matrix theory with text mining for stock prediction. When the dimension of the data is large, the classical limit theorems are no longer suitable because statistical efficiency is reduced. Therefore, a simple correlation analysis in the financial market does not reveal the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlation between companies. With the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering analysis, we then use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel is assigned to predict stock prices with features of the financial news of the target firm and its relevant firms. The results of this study are as follows. (1) Following the existing research flow, we confirmed that forecasting stock prices using news from relevant companies is effective. (2) Identifying relevant companies in the wrong way can lower AI prediction performance. (3) The proposed approach with random matrix theory performs better than previous studies when cluster analysis is based on the true correlation obtained by removing market-wide effects and random signals. The contributions of this study are as follows. First, this study shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce good methodologies. This suggests that it is important not only to develop AI algorithms but also to adopt theory from physics. It extends existing research that integrated artificial intelligence with complex system theory through transfer entropy. Second, this study stresses that finding the right companies in the stock market is an important issue. This suggests that it is important to study not only artificial intelligence algorithms but also how to adjust the input values on a theoretical basis. Third, we confirmed that firms grouped together under the Global Industry Classification Standard (GICS) might have low relevance, and we suggest that relevance should be defined theoretically rather than simply taken from the GICS.

Analysis of the Difference on Elementary Students' School Adaptation and Academic Performance by Dependence on Smart Devices

  • Lee, KyungHee;Park, Hye-Young
    • Journal of the Korea Society of Computer and Information / v.27 no.4 / pp.213-221 / 2022
  • The purpose of this study is to find ways to prevent and reduce over-dependence on smart devices by analyzing differences in school life adaptation and academic performance according to children's dependence on smart devices. For this purpose, data on fifth-grade elementary school students were extracted from the 12th year of the Panel Survey of Korean Children. The data were analyzed using non-hierarchical (K-means) cluster analysis, t-tests, one-way ANOVA, and Scheffé tests. The results of this study are as follows. First, dependence on smart devices was negatively correlated with school adaptation and academic performance. Second, students in the potential-risk and high-risk groups, who are highly dependent on smart devices, showed significantly lower school adaptation than those in the safety group. Third, high-risk students showed significantly lower academic performance than those in the potential-risk group and the general group. Based on these findings, we suggest that elementary school students who rely on smart devices need various forms of learning support and national efforts such as counseling for school life adaptation.

Development of Wooden Coffin(木棺) and Chamber(木槨) Tombs in Gyeongju(慶州) and Sarokuk(斯盧國) (경주지역 목관·목곽묘의 전개와 사로국)

  • Lee, Ju Heun
    • Korean Journal of Heritage: History & Science / v.42 no.3 / pp.106-130 / 2009
  • This paper analyzes the structure and development of wooden coffin and wooden chamber tombs in Gyeongju from the 2nd century B.C. to the 3rd century A.D. in order to examine the socio-political trends and growth process of Sarokuk. Tombs containing iron objects were built in Youngnam (嶺南) from the 2nd century B.C., along with the spread of wooden coffin tombs with stone mounds (積石木棺墓). Medium and small wooden coffin tombs containing Western Han (前漢) bronze mirrors and soft stoneware (瓦質土器) also appeared in Gyeongju in the 2nd century B.C., following the establishment of the Han commanderies (漢郡縣) on the Korean Peninsula and the movement of refugees from the Daedong River (大同江) area to Jinhan (辰韓). Separate tombs (獨立墓) with many bronze objects are assumed to be high-ranking tombs of priest-kings (司祭王) or local chiefs (地域首長). From the 2nd century A.D., wooden coffin tombs grew larger and their funerary objects more abundant, as in Sarari Tomb No. 130 (舍羅里 130號). The burial pattern of this tomb resembles that of the wooden chamber tombs in Lelang (樂浪), which held prestige goods such as lacquerware and bronze mirrors in a wooden box (木匣) beside the coffin. The appearance of these wooden chamber tombs, which differed from the original wooden coffin tombs, implies interaction between Lelang and this area involving iron. The Sarari community, which held rights of trade and distribution with the outside through its geographical advantage, grew into a politically, socially, and culturally central position in Gyeongju. The chamber, a new structural concept for securing funerary objects within the tomb, became firmly established in Gyeongju from the 2nd century A.D., and large wooden chamber tombs were commonly built in the early 3rd century A.D. This tendency reflected the stratification of the community and its growth into the center of a local state. After the late 3rd century A.D., the Gyeongju-type wooden chamber tomb (慶州式木槨墓) with a subordinate outer coffin (副槨) appeared, and by the 4th century A.D. the subordinate outer coffin had become as large as the main chamber (主槨), owing to centralization and stratification in society and the unification of the various communities in the Gyeongju area.

A Market Segmentation Scheme Based on Customer Information and QAP Correlation between Product Networks (고객정보와 상품네트워크 유사도를 이용한 시장세분화 기법)

  • Jeong, Seok-Bong;Shin, Yong Ho;Koo, Seo Ryong;Yoon, Hyoup-Sang
    • Journal of the Korea Society for Simulation / v.24 no.4 / pp.97-106 / 2015
  • Recently, hybrid market segmentation techniques, which conduct segmentation using both general variables and transaction-based variables, have been widely adopted. However, although their methodology and concept are easy to apply, these techniques can generate incorrect segmentation results. In this paper, we propose a novel scheme that overcomes this limitation of the hybrid techniques and takes advantage of product information obtained from customers' transaction data. In this scheme, we first divide the whole market into several unit segments based on the general variables and then agglomerate the unit segments with higher QAP correlations. Each product network represents the purchasing patterns of its corresponding segment; thus, the QAP correlation between the product networks of two segments can be a good measure of the similarity between those segments. A case study was conducted to validate the proposed scheme. The results show that our scheme works effectively for Internet shopping malls.

Basic Study for Selection of Factors Constituents of User Satisfaction for Micro Electric Vehicles (초소형전기차 사용자만족도 구성요인 선정을 위한 기반연구)

  • Jin, Eunju;Seo, Imki;Kim, Jongmin;Park, Jejin
    • KSCE Journal of Civil and Environmental Engineering Research / v.41 no.5 / pp.581-589 / 2021
  • With the recent increase in the introduction of micro-electric vehicles in Korea, interest in micro-electric vehicle user satisfaction is growing as a way to revitalize related markets. In this paper, a basic study was conducted on the development of public services using micro-electric vehicles based on the constituent factors of user satisfaction. The surveys were conducted in the following order: ① an Analytic Hierarchy Process (AHP) analysis to select the priority factors of micro-electric vehicle user satisfaction, ② a micro-electric vehicle image survey to collect preliminary data on user preferences and transportation services for micro-electric vehicles, and ③ a user satisfaction survey of drivers who had actually operated micro-electric vehicles. In the AHP analysis, users ranked importance in the order of 'user utilization data', 'vehicle movement data', and 'charging service data'. In the micro-electric vehicle image survey, users perceived micro-electric vehicles more positively than electric motorcycles in terms of 'safety', 'durability', 'ride comfort', 'design', 'MOOE (Maintenance and other operating expense)', and 'environment-friendliness'. In the user satisfaction survey of micro-electric vehicle drivers, the use of micro-electric vehicles did not directly affect work performance efficiency, and some drivers had experienced being disadvantaged on the road because of the vehicle's size. Driving micro-electric vehicles in a cluster for outdoor advertising had a great public relations effect for the city, but raised safety concerns. In the future, based on the results of this study, we plan to build a user satisfaction structural equation model, preemptively identify feedback R&D for micro-electric vehicle utilization services in the public field, and actively seek new public mobility support services.