• Title/Summary/Keyword: Intelligent data analysis (지능형 데이터 분석)

How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores (평점과 리뷰 텍스트 감성분석을 결합한 추천시스템 향상 방안 연구)

  • Hyun, Jiyeon;Ryu, Sangyi;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.219-239 / 2019
  • As customized services for individuals grow in importance, research on personalized recommendation systems continues to expand. Collaborative filtering is one of the most popular approaches in academia and industry. Its limitation is that recommendations have mostly relied on quantitative information such as users' ratings, which lowers accuracy. To address this, many studies have tried to improve recommendation performance by using information beyond ratings, for example sentiment analysis of customer review text. Existing research, however, has not directly combined sentiment analysis results with quantitative rating scores in the recommendation system. This study therefore aims to reflect the sentiments expressed in reviews in the rating scores. In other words, we propose a new algorithm that converts a user's own review into quantitative information and feeds it directly into the recommendation system. To do this, we needed to quantify users' reviews, which are originally qualitative information. The sentiment score was calculated through sentiment analysis, a text mining technique. The data consisted of movie reviews, from which a domain-specific sentiment dictionary was constructed. Regression analysis was used to construct the dictionary: positive and negative dictionaries were each built using Lasso regression, Ridge regression, and ElasticNet. The accuracy of each dictionary was verified with a confusion matrix. The Lasso-based dictionary reached 70% accuracy, the Ridge-based dictionary 79%, and the ElasticNet dictionary ($\alpha = 0.3$) 83%. The sentiment score of each review was therefore calculated with the ElasticNet dictionary and combined with the rating to create a new rating. We show that collaborative filtering that reflects the sentiment scores of user reviews outperforms the traditional method that considers only the existing rating. To show this, the proposed approach was applied to memory-based user-based collaborative filtering (UBCF) and item-based collaborative filtering (IBCF), and to model-based matrix factorization with SVD and SVD++. For each algorithm, the mean absolute error (MAE) and the root mean square error (RMSE) were calculated to compare the recommendation system that uses ratings combined with sentiment scores against the system that considers ratings only. In terms of MAE, the improvement was 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD, and 0.188 for SVD++. In terms of RMSE, the corresponding figures were 0.0431 for UBCF, 0.0882 for IBCF, 0.1103 for SVD, and 0.1756 for SVD++. These results indicate that ratings adjusted by the proposed sentiment score yield better prediction performance than conventional ratings, that is, collaborative filtering that reflects the sentiment score of user reviews is more accurate than conventional collaborative filtering that considers only the quantitative score. A paired t-test was then used to check that the proposed model is the better approach, and it supported that conclusion. To overcome the limitation of previous research, which judges a user's sentiment only by the quantitative rating score, this study quantified the review text so that a more refined measure of the user's opinion enters the recommendation system and improves accuracy. The findings are expected to have managerial implications for recommendation system developers who need to consider both quantitative and qualitative information, and the way the combined system is constructed in this paper might be used directly by developers.
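
The two evaluation indices named in the abstract above, MAE and RMSE, can be sketched as follows. This is a minimal illustration of the metrics only: the toy ratings and the names `baseline` and `combined` are hypothetical and are not data or code from the paper, and the way the paper blends sentiment scores into ratings is not reproduced here.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical toy values, not results from the paper.
actual = [4.0, 3.0, 5.0, 2.0]
baseline = [3.0, 3.0, 4.0, 4.0]   # predictions from a ratings-only system
combined = [4.0, 3.0, 4.0, 3.0]   # predictions from a sentiment-adjusted system

# A positive gain means the second system has the lower error.
mae_gain = mae(actual, baseline) - mae(actual, combined)
rmse_gain = rmse(actual, baseline) - rmse(actual, combined)
```

Reporting the difference in error between the two systems, as done here with `mae_gain` and `rmse_gain`, mirrors how the abstract states its improvements per algorithm.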

A Template-based Interactive University Timetabling Support System (템플릿 기반의 상호대화형 전공강의시간표 작성지원시스템)

  • Chang, Yong-Sik;Jeong, Ye-Won
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.121-145 / 2010
  • University timetabling, which depends on each university's educational environment, is an NP-hard problem in which the amount of computation required to find solutions increases exponentially with the problem size. For many years there have been many studies on university timetabling, motivated by the need for automatic timetable generation for students' convenience and effective lessons, and for the effective allocation of subjects, lecturers, and classrooms. Timetables are classified into course timetables and examination timetables. This study focuses on the former. In general, the course timetable for liberal arts is scheduled by the office of academic affairs, and the course timetable for major subjects is scheduled by each department. We found several problems in an analysis of current course timetabling in departments. First, it is time-consuming and inefficient for each department to do the routine and repetitive timetabling work manually. Second, many classes are concentrated in a few time slots, which reduces the effectiveness of students' classes. Third, several major subjects might overlap with required liberal arts subjects in the same time slots, in which case students must choose only one of the overlapping subjects. Fourth, many subjects are taught by the same lecturers every year, and most lecturers prefer the same time slots as in the previous year. This means that it would be helpful for departments to reuse previous timetables. To solve these problems and support effective course timetabling in each department, this study proposes a university timetabling support system based on two phases. In the first phase, each department generates a timetable template from the most similar timetable case, using case-based reasoning. In the second phase, the department schedules a timetable with the help of an interactive user interface under the timetabling criteria, using a rule-based approach. This study provides illustrations from Hanshin University. We classified timetabling criteria into intrinsic and extrinsic criteria. The intrinsic criteria comprise three criteria related to lecturer, class, and classroom, all of which are hard constraints. The extrinsic criteria comprise four criteria related to 'the numbers of lesson hours' by the lecturer, 'prohibition of lecture allocation to specific day-hours' for committee members, 'the number of subjects in the same day-hour,' and 'the use of common classrooms.' Within 'the numbers of lesson hours' by the lecturer there are three kinds of criteria: 'minimum number of lesson hours per week,' 'maximum number of lesson hours per week,' and 'maximum number of lesson hours per day.' The extrinsic criteria are also all hard constraints, except for 'minimum number of lesson hours per week,' which is treated as a soft constraint. In addition, we proposed two indices: one for measuring the similarity between subjects of the current semester and subjects of previous timetables, and one for evaluating the distribution degree of a scheduled timetable. Similarity is measured by comparing two attributes, subject name and lecturer, between the current semester and a previous semester. The index of distribution degree, based on information entropy, indicates how subjects are distributed across the timetable. To show the viability of this study, we implemented a prototype system and performed experiments with real data from Hanshin University. The average similarity from the most similar cases of all departments was estimated as 41.72%, which suggests that a timetable template generated from the most similar case will be helpful. A sensitivity analysis shows that the distribution degree will increase if 'the number of subjects in the same day-hour' is set to more than 90%.
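
The abstract above describes a distribution-degree index based on information entropy but does not give its formula. One plausible reading, sketched here as an assumption, is the Shannon entropy of the number of subjects per day-hour slot, normalized to the range 0 to 1. The function name `distribution_degree` and the toy counts are hypothetical.

```python
import math


def distribution_degree(counts):
    """Normalized Shannon entropy of subject counts across time slots.

    Returns 1.0 when subjects are spread evenly over all slots and 0.0
    when they are all packed into a single slot. This normalization is an
    assumption; the paper's exact index may differ.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    # Entropy written as sum of p * log(1 / p) over the non-empty slots.
    entropy = sum((c / total) * math.log(total / c) for c in counts if c > 0)
    return entropy / math.log(len(counts))


# Hypothetical counts of subjects scheduled in four day-hour slots.
even = distribution_degree([2, 2, 2, 2])
skewed = distribution_degree([8, 0, 0, 0])
partial = distribution_degree([4, 4, 0, 0])
```

Under this reading, a timetable that concentrates classes in a few slots scores lower than one that spreads them out, which matches the concentration problem the abstract raises.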

Design and Analysis of a Scenario for Evaluating Application Service Performance of a Hybrid V2X Communication System (하이브리드 V2X 통신시스템의 응용서비스 성능 평가를 위한 시나리오 설계 및 분석 연구)

  • Lee, Sung-Hun;Lee, Chang-Kyo;Byun, Sang-Bong;Cho, Soo-Hyun;Cho, Hyun-Kyu
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.4 / pp.423-430 / 2019
  • The convergence of the automotive industry and ICT can be broadly divided into the commercial service sector and the Cooperative-ITS (C-ITS) service sector. The C-ITS service sector uses V2X communication technology and aims to provide safer, greener, and more efficient transportation as well as more predictable and productive mobility. The recent convergence of self-driving cars and connected cars requires high data rates, low transmission delays, and low transmission error rates. Interest in comparing the performance of WAVE and C-V2X (LTE-V2X, 5G-V2X) has grown, and application services for each communication technology are being studied. In this paper, we design an application performance evaluation method for a hybrid V2X communication system and confirm that the degradation in packet error rate (PER) performance is caused by the increase in communication distance, not by vehicle speed.

Requirement Analysis for Agricultural Meteorology Information Service Systems based on the Fourth Industrial Revolution Technologies (4차 산업혁명 기술에 기반한 농업 기상 정보 시스템의 요구도 분석)

  • Kim, Kwang Soo;Yoo, Byoung Hyun;Hyun, Shinwoo;Kang, DaeGyoon
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.3 / pp.175-186 / 2019
  • Efforts have been made to introduce climate smart agriculture (CSA) for adaptation to future climate conditions, which would require the collection and management of site-specific meteorological data. The objectives of this study were to identify requirements for building an agricultural meteorology information service system (AMISS) using technologies that lead the fourth industrial revolution, e.g., the internet of things (IoT), artificial intelligence, and cloud computing. IoT sensors with low cost and low operating current would be useful for organizing a wireless sensor network (WSN) for the collection and analysis of weather measurement data, which would help assess the productivity of an agricultural ecosystem. It is recommended to extend the spatial extent of the WSN to a rural community, which would benefit a greater number of farms. It is preferable to build big data for agricultural meteorology in order to produce and evaluate site-specific data in rural areas. The digital climate map can be improved using artificial intelligence such as deep neural networks. Furthermore, cloud computing and fog computing would help reduce costs and enhance the user experience of the AMISS. In addition, it would be advantageous to combine environmental data with farm management data, e.g., price data for the produce of interest. A mobile application whose user interface meets the needs of stakeholders would also need to be developed. These fourth industrial revolution technologies would facilitate the development of the AMISS and the wide application of CSA.

Are you a Machine or Human?: The Effects of Human-likeness on Consumer Anthropomorphism Depending on Construal Level (Are you a Machine or Human?: 소셜 로봇의 인간 유사성과 소비자 해석수준이 의인화에 미치는 영향)

  • Lee, Junsik;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.129-149 / 2021
  • Interest in social robots that can interact socially with humans has recently been increasing. Thanks to advances in ICT, it has become easier for social robots to provide personalized services and emotional connection to individuals, and they are drawing attention as a means of addressing modern social problems and the resulting decline in individuals' quality of life. Along with this interest, the adoption of social robots is also growing significantly. Many companies are bringing robot products to market for a variety of target segments, but so far no clear trend leads the market. Accordingly, more and more attempts are being made to differentiate robots through their design. Anthropomorphism in particular has been an important topic in social robot design, and many approaches have tried to anthropomorphize social robots to produce positive effects. However, research that systematically describes the mechanism by which anthropomorphism of social robots is formed is lacking. Most existing studies have focused on verifying the positive effects of the anthropomorphism of social robots on consumers. In addition, the formation of anthropomorphism may vary with an individual's motivation or temperament, but few studies have examined this. A vague understanding of anthropomorphism makes it difficult to identify optimal design points for shaping the anthropomorphism of social robots. The purpose of this study is to verify the mechanism by which the anthropomorphism of social robots is formed. Using an experiment with a 3×2 mixed design, this study examined the effects of the human-likeness of social robots (within-subjects) and the construal level of consumers (between-subjects) on the formation of anthropomorphism. Research hypotheses on the mechanism by which anthropomorphism is formed were presented and tested by analyzing data from a sample of 206 people. The first hypothesis is that the higher the human-likeness of the robot, the higher the level of anthropomorphism for the robot. Hypothesis 1 was supported by a one-way repeated measures ANOVA and a post hoc test. The second hypothesis is that the effect of human-likeness on the level of anthropomorphism differs depending on the construal level of consumers. First, this study predicts that the change in the level of anthropomorphism as human-likeness increases will be greater under the high construal condition than under the low construal condition. Second, if the robot has no human-likeness, there will be no difference in the level of anthropomorphism according to construal level. Third, if the robot has low human-likeness, the low construal level condition will make the robot more anthropomorphic than the high construal level condition. Finally, if the robot has high human-likeness, the high construal level condition will make the robot more anthropomorphic than the low construal level condition. We performed a two-way repeated measures ANOVA to test these hypotheses and confirmed that the interaction effect of human-likeness and construal level was significant. Further analysis of the interaction effect also provided results in support of our hypotheses. The analysis shows that the human-likeness of the robot increases the level of anthropomorphism of social robots, and that the effect of human-likeness on anthropomorphism varies with the construal level of consumers. This study has implications in that it explains the mechanism by which anthropomorphism is formed by considering both human-likeness, a design attribute of social robots, and the construal level of consumers, an individual's way of thinking. We expect the findings to serve as a basis for design optimization for the formation of anthropomorphism in social robots.

Intelligent Brand Positioning Visualization System Based on Web Search Traffic Information : Focusing on Tablet PC (웹검색 트래픽 정보를 활용한 지능형 브랜드 포지셔닝 시스템 : 태블릿 PC 사례를 중심으로)

  • Jun, Seung-Pyo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.93-111 / 2013
  • As Internet and information technology (IT) continues to develop and evolve, the issue of big data has emerged at the foreground of scholarly and industrial attention. Big data is generally defined as data that exceed the range that can be collected, stored, managed and analyzed by existing conventional information systems and it also refers to the new technologies designed to effectively extract values from such data. With the widespread dissemination of IT systems, continual efforts have been made in various fields of industry such as R&D, manufacturing, and finance to collect and analyze immense quantities of data in order to extract meaningful information and to use this information to solve various problems. Since IT has converged with various industries in many aspects, digital data are now being generated at a remarkably accelerating rate while developments in state-of-the-art technology have led to continual enhancements in system performance. The types of big data that are currently receiving the most attention include information available within companies, such as information on consumer characteristics, information on purchase records, logistics information and log information indicating the usage of products and services by consumers, as well as information accumulated outside companies, such as information on the web search traffic of online users, social network information, and patent information. Among these various types of big data, web searches performed by online users constitute one of the most effective and important sources of information for marketing purposes because consumers search for information on the internet in order to make efficient and rational choices. Recently, Google has provided public access to its information on the web search traffic of online users through a service named Google Trends. 
Research that uses this web search traffic information to analyze the information search behavior of online users is now receiving much attention in academia and in fields of industry. Studies using web search traffic information can be broadly classified into two fields. The first field consists of empirical demonstrations that show how web search information can be used to forecast social phenomena, the purchasing power of consumers, the outcomes of political elections, etc. The other field focuses on using web search traffic information to observe consumer behavior, identifying the attributes of a product that consumers regard as important or tracking changes on consumers' expectations, for example, but relatively less research has been completed in this field. In particular, to the extent of our knowledge, hardly any studies related to brands have yet attempted to use web search traffic information to analyze the factors that influence consumers' purchasing activities. This study aims to demonstrate that consumers' web search traffic information can be used to derive the relations among brands and the relations between an individual brand and product attributes. When consumers input their search words on the web, they may use a single keyword for the search, but they also often input multiple keywords to seek related information (this is referred to as simultaneous searching). A consumer performs a simultaneous search either to simultaneously compare two product brands to obtain information on their similarities and differences, or to acquire more in-depth information about a specific attribute in a specific brand. Web search traffic information shows that the quantity of simultaneous searches using certain keywords increases when the relation is closer in the consumer's mind and it will be possible to derive the relations between each of the keywords by collecting this relational data and subjecting it to network analysis. 
Accordingly, this study proposes a method of analyzing how brands are positioned by consumers and what relationships exist between product attributes and an individual brand, using simultaneous search traffic information. It also presents case studies demonstrating the actual application of this method, with a focus on tablets, belonging to innovative product groups.

A Study on the Status of Medical Equipment and Radiological Technologists using Big Data for Health Care: Based on Data for 2020-2021 (보건의료 빅데이터를 활용한 의료장비 및 방사선사 인력 현황 연구 : 2020-2021년 자료를 기준으로)

  • Jang, Hyon-Chol
    • Journal of the Korean Society of Radiology / v.15 no.5 / pp.667-673 / 2021
  • As we enter the era of the 4th industrial revolution, the scope of work of radiological technologists is expected to expand further with innovation and advances in radiation medical technology. This study identified the current status of medical equipment and radiological technologists, and provides basic data for plans to nurture talent in the field of radiation medical technology in the era of the 4th industrial revolution, as well as for career and employment counseling. Data from the second quarter of 2020 and the second quarter of 2021 were analyzed using health and medical big data. Comparing the status of medical equipment by type in 2021 with 2020, C-Arm X-ray examination equipment increased by 5.83% to 6,638 units, followed by MRI examination equipment (up 5.29% to 1,811 units), angiography equipment (up 5.22% to 725 units), general X-ray examination equipment (up 3.99% to 21,557 units), CT examination equipment (up 3.03% to 2,136 units), and breast examination equipment (up 3.00% to 3,425 units). The total number of radiological technologists in 2021 was 29,038, an increase of 2.73% over 2020. By region, the increase in radiological technologists was highest in the Gyeonggi region at 5.96%, followed by the Gangwon region at 5.66% and the Chungnam region at 3.81%. With both medical equipment and radiological technologist manpower increasing, universities need to nurture qualified radiological technologists by building specialized knowledge and practical competency through courses on understanding and using customized artificial intelligence and big data applicable to the medical radiation technology field in the era of the 4th industrial revolution. At the level of the association, active policies are thought to be needed to create new jobs and improve employment.

Application of Amplitude Demodulation to Acquire High-sampling Data of Total Flux Leakage for Tendon Nondestructive Estimation (덴던 비파괴평가를 위한 Total Flux Leakage에서 높은 측정빈도의 데이터를 획득하기 위한 진폭복조의 응용)

  • Joo-Hyung Lee;Imjong Kwahk;Changbin Joh;Ji-Young Choi;Kwang-Yeun Park
    • Journal of the Korea institute for structural maintenance and inspection / v.27 no.2 / pp.17-24 / 2023
  • A post-processing technique for the measurement signal of a solenoid-type sensor is introduced. The solenoid-type sensor nondestructively evaluates an external tendon of prestressed concrete using the total flux leakage (TFL) method. The TFL solenoid sensor consists of primary and secondary coils. A sinusoidal AC current is input to the primary coil, and a signal proportional to the derivative of the input is induced in the secondary coil. Because the amplitude of the induced signal is proportional to the cross-sectional area of the tendon, sectional loss of the tendon caused by ruptures or corrosion can be identified from the induced signal. It is therefore important to extract amplitude information from the measurement signal of the TFL sensor. Previously, the amplitude was extracted using local maxima, which is the simplest way to obtain amplitude information. However, because amplitude extraction using local maxima dramatically decreases the sampling rate, the previous method places many restrictions on the direction of TFL sensor development, such as applying additional signal processing and/or artificial intelligence. The proposed method instead uses amplitude demodulation to obtain the signal amplitude from the TFL sensor, so the sampling rate of the amplitude information is the same as that of the raw TFL sensor data. The proposed method provides ample freedom for development by eliminating restrictions on the primary coil input frequency of the TFL sensor and on the speed at which the sensor is moved along the external tendon. It also maintains a high measurement sampling rate, which is an advantage when applying additional signal processing or artificial intelligence. The proposed method was validated through experiments, and its advantages were verified through comparison with the previous method. For example, in this study the amplitudes extracted by amplitude demodulation provided a sampling rate 100 times greater than those of the previous method. There may be differences depending on the given situation and specific equipment settings; however, in most cases, extracting amplitude information using amplitude demodulation yields more satisfactory results than previous methods.
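
The contrast the abstract draws between local-maxima extraction and amplitude demodulation can be sketched as below. The abstract does not say which demodulation technique the authors use, so this sketch substitutes a simple synchronous (I/Q) demodulator with a one-period moving average; it assumes the carrier frequency is known and that each carrier period contains a whole number of samples. All names and the synthetic signal are hypothetical.

```python
import math


def local_maxima_amplitude(signal):
    """Peak picking: keeps only the peaks, so one amplitude value per period."""
    return [
        signal[i]
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def demodulated_amplitude(signal, samples_per_period):
    """Synchronous (I/Q) amplitude demodulation.

    The signal is mixed with sine and cosine references at the carrier
    frequency, each product is averaged over one full period, and the
    amplitude is recovered as 2 * sqrt(I^2 + Q^2). One estimate is produced
    for every window position, so the output keeps the raw sampling rate
    apart from the edge samples.
    """
    n = samples_per_period
    i_mix = [s * math.sin(2 * math.pi * k / n) for k, s in enumerate(signal)]
    q_mix = [s * math.cos(2 * math.pi * k / n) for k, s in enumerate(signal)]
    envelope = []
    for start in range(len(signal) - n + 1):
        i_avg = sum(i_mix[start:start + n]) / n
        q_avg = sum(q_mix[start:start + n]) / n
        envelope.append(2 * math.hypot(i_avg, q_avg))
    return envelope


# Synthetic constant-amplitude carrier (hypothetical values, not sensor data).
SAMPLES_PER_PERIOD = 20
PERIODS = 10
AMPLITUDE = 3.0
raw = [
    AMPLITUDE * math.sin(2 * math.pi * k / SAMPLES_PER_PERIOD)
    for k in range(SAMPLES_PER_PERIOD * PERIODS)
]

peaks = local_maxima_amplitude(raw)
envelope = demodulated_amplitude(raw, SAMPLES_PER_PERIOD)
```

In this toy setup peak picking yields one amplitude sample per carrier period, while the demodulated envelope has nearly as many samples as the raw signal, which is the property the abstract highlights.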

Strategy for Store Management Using SOM Based on RFM (RFM 기반 SOM을 이용한 매장관리 전략 도출)

  • Jeong, Yoon Jeong;Choi, Il Young;Kim, Jae Kyeong;Choi, Ju Choel
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.93-112 / 2015
  • As consumers' consumption patterns have changed, existing retail shops have evolved into hypermarkets or convenience stores that mostly offer groceries and daily products. It is therefore important to maintain appropriate inventory levels and product configuration in order to use the limited space in a retail store effectively and increase sales. Accordingly, this study proposes a product configuration and inventory level strategy based on the RFM (Recency, Frequency, Monetary) model and SOM (self-organizing map) for managing a retail shop effectively. The RFM model is an analytic model for analyzing customer behavior based on customers' past buying activities, and it can distinguish important customers in large data sets using three variables. R represents recency, which refers to the last purchase of commodities; the most recent purchaser has the largest R. F represents frequency, which refers to the number of transactions in a particular period, and M represents monetary, which refers to the amount of money spent in a particular period. The RFM method is thus known to be a very effective model for customer segmentation. In this study, SOM cluster analysis was performed on normalized values of the RFM variables. SOM is regarded as one of the most distinguished artificial neural network models among unsupervised learning tools. It is a popular tool for clustering and visualizing high-dimensional data in such a way that similar items are grouped spatially close to one another, and it has been applied successfully in various technical fields for finding patterns. In our research, the procedure tries to find sales patterns by analyzing product sales records with Recency, Frequency, and Monetary values. To suggest a business strategy, we build a decision tree based on the SOM results. To validate the proposed procedure, we used M-mart data collected between 2014.01.01 and 2014.12.31. Each product is assigned R, F, and M values, and the products are clustered into 9 groups using SOM. We also performed three tests, using weekday data, weekend data, and the whole data set, to analyze changes in the sales pattern. To propose a strategy for each cluster, we examined the criteria of product clustering: the clusters produced by the SOM can be explained by the characteristics identified by the decision trees. As a result, we can suggest an inventory management strategy for each of the 9 clusters through the proposed procedure. Products in the cluster with the highest values on all three variables (R, F, M) need a high inventory level and should be placed where they can extend customers' paths through the store. In contrast, products in the cluster with the lowest values on all three variables need a low inventory level and can be placed where visibility is low. Products in the cluster with the highest R value are usually new releases and need to be placed at the front of the store. Managers should gradually decrease inventory levels for products in the cluster with the highest F value, which were purchased frequently in the past, because that cluster has R and M values lower than the product average. It can be inferred that these products have sold poorly in recent days and that their total sales are low relative to their purchase frequency. The procedure presented in this study is expected to contribute to raising the profitability of the retail store. The paper is organized as follows. The second chapter briefly reviews the literature related to this study. The third chapter presents the proposed research procedure, and the fourth chapter applies the procedure to actual product sales data. Finally, the fifth chapter presents the conclusion of the study and directions for further research.
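
The preprocessing step described in the abstract above, computing R, F, and M per product and normalizing them before clustering, can be sketched as follows. The sales records, the day-index encoding of recency, and the choice of min-max normalization are assumptions made for illustration; the abstract only states that normalized RFM values were used, and the SOM clustering itself is not shown.

```python
from collections import defaultdict

# Hypothetical sales records: (product, day index within the period, amount).
sales = [
    ("milk", 1, 2.0), ("milk", 300, 2.0), ("milk", 360, 2.0),
    ("wine", 40, 15.0), ("wine", 90, 15.0),
    ("umbrella", 10, 8.0),
]


def rfm(records):
    """Return {product: (R, F, M)}; a larger R means a more recent last sale."""
    table = defaultdict(lambda: [0, 0, 0.0])
    for product, day, amount in records:
        row = table[product]
        row[0] = max(row[0], day)   # R: day index of the latest sale
        row[1] += 1                 # F: number of transactions
        row[2] += amount            # M: total sales amount
    return {product: tuple(values) for product, values in table.items()}


def min_max_normalize(table):
    """Scale each of R, F, and M to [0, 1] across all products."""
    columns = list(zip(*table.values()))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return {
        product: tuple(
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(values, lows, highs)
        )
        for product, values in table.items()
    }


scores = rfm(sales)
normalized = min_max_normalize(scores)
```

Each normalized triple could then serve as the three-dimensional input vector for a clustering step such as the SOM that the abstract describes.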