Search | Korea Science

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

Ahn, Hyunchul
- Information Systems Review
- /
- v.16 no.3
- /
- pp.161-177
- /
- 2014
Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts should be employed to assess the ratings. As a result, the data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating.2) Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and appy it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from the studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. Especially, mutation operator prevents GA from falling into the local optima, thus we can find the globally optimal or near-optimal solution using it. GA has popularly been applied to search optimal parameters or feature subset selections of AI techniques including MSVM. With these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including the one-way ANOVA and the stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). 80 percent of total data for each class was used for training, and remaining 20 percent was used for validation. And, to overcome small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In case of MSVM, we adopted One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate approaches among various MSVM approaches. GAMSVM was implemented using LIBSVM-an open-source software, and Evolver 5.5-a commercial software enables GA. Other comparative models were experimented using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model-GAMSVM-outperformed all the competitive models. In addition, the model was found to use less independent variables, but to show higher accuracy. In our experiments, five variables such as X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity) were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than those of other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
https://doi.org/10.14329/isr.2014.16.3.161 인용 PDF

Development of Standard Process for Private Information Protection of Medical Imaging Issuance (개인정보 보호를 위한 의료영상 발급 표준 업무절차 개발연구)

Park, Bum-Jin;Yoo, Beong-Gyu;Lee, Jong-Seok;Jeong, Jae-Ho;Son, Gi-Gyeong;Kang, Hee-Doo
- Journal of radiological science and technology
- /
- v.32 no.3
- /
- pp.335-341
- /
- 2009
Purpose : The medical imaging issuance is changed from conventional film method to Digital Compact Disk solution because of development on IT technology. However other medical record department's are undergoing identification check through and through whereas medical imaging department cannot afford to do that. So, we examine present applicant's recognition of private intelligence safeguard, and medical imaging issuance condition by CD & DVD medium toward various medical facility and then perform comparative analysis associated with domestic and foreign law & recommendation, lastly suggest standard for medical imaging issuance and process relate with internal environment. Materials and methods : First, we surveyed issuance process & required documents when situation of medical image issuance in the metropolitan medical facility by wire telephone between 2008.6.1$\sim$2008.7.1. in accordance with the medical law Article 21$\sim$clause 2, suggested standard through applicant's required documents occasionally - (1) in the event of oneself $\rightarrow$ verifying identification, (2) in the event of family $\rightarrow$ verifying applicant identification & family relations document (health insurance card, attested copy, and so on), (3) third person or representative $\rightarrow$ verifying applicant identification & letter of attorney & certificate of one's seal impression. Second, also checked required documents of applicant in accordance with upper standard when situation of medical image issuance in Kyung-hee university medical center during 3 month 2008.5.1$\sim$2008.7.31. Third, developed a work process by triangular position of issuance procedure for situation when verifying required documents & management of unpreparedness. Result : Look all over the our manufactured output in the hospital - satisfy the all conditions $\rightarrow$ 4 place(12%), possibly request everyone $\rightarrow$ 4 place(12%), and apply in the clinic section $\rightarrow$ 9 place(27%) that does not medical imaging issuance office, so we don't know about required documents condition. and look into whether meet or not the applicant's required documents on upper 3month survey - satisfy the all conditions $\rightarrow$ 629 case(49%), prepare a one part $\rightarrow$ 416 case(33%), insufficiency of all document $\rightarrow$ 226case(18%). On the authority of upper research result, we are establishing the service model mapping for objective reception when image export situation through triangular position of issuance procedure and reduce of friction with patient and promote the patient convenience. Conclusion : The PACS is classified under medical machinery that mean indicates about higher importance of medical information therefore medical information administrator's who already received professional education & mind, are performer about issuance process only and also have to provide under ID checking process exhaustively.
PDF

Relationships Among Employees' IT Personnel Competency, Personal Work Satisfaction, and Personal Work Performance: A Goal Orientation Perspective (조직구성원의 정보기술 인적역량과 개인 업무만족 및 업무성과 간의 관계: 목표지향성 관점)

Heo, Myung-Sook;Cheon, Myun-Joong
- Asia pacific journal of information systems
- /
- v.21 no.4
- /
- pp.63-104
- /
- 2011
The study examines the relationships among employee's goal orientation, IT personnel competency, personal effectiveness. The goal orientation includes learning goal orientation, performance approach goal orientation, and performance avoid goal orientation. Personal effectiveness consists of personal work satisfaction and personal work performance. In general, IT personnel competency refers to IT expert's skills, expertise, and knowledge required to perform IT activities in organizations. However, due to the advent of the internet and the generalization of IT, IT personnel competency turns out to be an important competency of technological experts as well as employees in organizations. While the competency of IT itself is important, the appropriate harmony between IT personnel's business capability and technological capability enhances the value of human resources and thus provides organizations with sustainable competitive advantages. The rapid pace of organization change places increased pressure on employees to continually update their skills and adapt their behavior to new organizational realities. This challenge raises a number of important questions concerning organizational behavior? Why do some employees display remarkable flexibility in their behavioral responses to changes in the organization, whereas others firmly resist change or experience great stress when faced with the need to alter behavior? Why do some employees continually strive to improve themselves over their life span, whereas others are content to forge through life using the same basic knowledge and skills? Why do some employees throw themselves enthusiastically into challenging tasks, whereas others avoid challenging tasks? The goal orientation proposed by organizational psychology provides at least a partial answer to these questions. Goal orientations refer to stable personally characteristics fostered by "self-theories" about the nature and development of attributes (such as intelligence, personality, abilities, and skills) people have. Self-theories are one's beliefs and goal orientations are achievement motivation revealed in seeking goals in accordance with one's beliefs. The goal orientations include learning goal orientation, performance approach goal orientation, and performance avoid goal orientation. Specifically, a learning goal orientation refers to a preference to develop the self by acquiring new skills, mastering new situations, and improving one's competence. A performance approach goal orientation refers to a preference to demonstrate and validate the adequacy of one's competence by seeking favorable judgments and avoiding negative judgments. A performance avoid goal orientation refers to a preference to avoid the disproving of one's competence and to avoid negative judgements about it, while focusing on performance. And the study also examines the moderating role of work career of employees to investigate the difference in the relationship between IT personnel competency and personal effectiveness. The study analyzes the collected data using PASW 18.0 and and PLS(Partial Least Square). The study also uses PLS bootstrapping algorithm (sample size: 500) to test research hypotheses. The result shows that the influences of both a learning goal orientation (${\beta}$ = 0.301, t = 3.822, P < 0.000) and a performance approach goal orientation (${\beta}$ = 0.224, t = 2.710, P < 0.01) on IT personnel competency are positively significant, while the influence of a performance avoid goal orientation(${\beta}$ = -0.142, t = 2.398, p < 0.05) on IT personnel competency is negatively significant. The result indicates that employees differ in their psychological and behavioral responses according to the goal orientation of employees. The result also shows that the impact of a IT personnel competency on both personal work satisfaction(${\beta}$ = 0.395, t = 4.897, P < 0.000) and personal work performance(${\beta}$ = 0.575, t = 12.800, P < 0.000) is positively significant. And the impact of personal work satisfaction(${\beta}$ = 0.148, t = 2.432, p < 0.05) on personal work performance is positively significant. Finally, the impacts of control variables (gender, age, type of industry, position, work career) on the relationships between IT personnel competency and personal effectiveness(personal work satisfaction work performance) are partly significant. In addition, the study uses PLS algorithm to find out a GoF(global criterion of goodness of fit) of the exploratory research model which includes a mediating variable, IT personnel competency. The result of analysis shows that the value of GoF is 0.45 above GoFlarge(0.36). Therefore, the research model turns out be good. In addition, the study performs a Sobel Test to find out the statistical significance of the mediating variable, IT personnel competency, which is already turned out to have the mediating effect in the research model using PLS. The result of a Sobel Test shows that the values of Z are all significant statistically (above 1.96 and below -1.96) and indicates that IT personnel competency plays a mediating role in the research model. At the present day, most employees are universally afraid of organizational changes and resistant to them in organizations in which the acceptance and learning of a new information technology or information system is particularly required. The problem is due' to increasing a feeling of uneasiness and uncertainty in improving past practices in accordance with new organizational changes. It is not always possible for employees with positive attitudes to perform their works suitable to organizational goals. Therefore, organizations need to identify what kinds of goal-oriented minds employees have, motivate them to do self-directed learning, and provide them with organizational environment to enhance positive aspects in their works. Thus, the study provides researchers and practitioners with a matter of primary interest in goal orientation and IT personnel competency, of which they have been unaware until very recently. Some academic and practical implications and limitations arisen in the course of the research, and suggestions for future research directions are also discussed.
PDF KSCI

Case Analysis of the Promotion Methodologies in the Smart Exhibition Environment (스마트 전시 환경에서 프로모션 적용 사례 및 분석)

Moon, Hyun Sil;Kim, Nam Hee;Kim, Jae Kyeong
- Journal of Intelligence and Information Systems
- /
- v.18 no.3
- /
- pp.171-183
- /
- 2012
In the development of technologies, the exhibition industry has received much attention from governments and companies as an important way of marketing activities. Also, the exhibitors have considered the exhibition as new channels of marketing activities. However, the growing size of exhibitions for net square feet and the number of visitors naturally creates the competitive environment for them. Therefore, to make use of the effective marketing tools in these environments, they have planned and implemented many promotion technics. Especially, through smart environment which makes them provide real-time information for visitors, they can implement various kinds of promotion. However, promotions ignoring visitors' various needs and preferences can lose the original purposes and functions of them. That is, as indiscriminate promotions make visitors feel like spam, they can't achieve their purposes. Therefore, they need an approach using STP strategy which segments visitors through right evidences (Segmentation), selects the target visitors (Targeting), and give proper services to them (Positioning). For using STP Strategy in the smart exhibition environment, we consider these characteristics of it. First, an exhibition is defined as market events of a specific duration, which are held at intervals. According to this, exhibitors who plan some promotions should different events and promotions in each exhibition. Therefore, when they adopt traditional STP strategies, a system can provide services using insufficient information and of existing visitors, and should guarantee the performance of it. Second, to segment automatically, cluster analysis which is generally used as data mining technology can be adopted. In the smart exhibition environment, information of visitors can be acquired in real-time. At the same time, services using this information should be also provided in real-time. However, many clustering algorithms have scalability problem which they hardly work on a large database and require for domain knowledge to determine input parameters. Therefore, through selecting a suitable methodology and fitting, it should provide real-time services. Finally, it is needed to make use of data in the smart exhibition environment. As there are useful data such as booth visit records and participation records for events, the STP strategy for the smart exhibition is based on not only demographical segmentation but also behavioral segmentation. Therefore, in this study, we analyze a case of the promotion methodology which exhibitors can provide a differentiated service to segmented visitors in the smart exhibition environment. First, considering characteristics of the smart exhibition environment, we draw evidences of segmentation and fit the clustering methodology for providing real-time services. There are many studies for classify visitors, but we adopt a segmentation methodology based on visitors' behavioral traits. Through the direct observation, Veron and Levasseur classify visitors into four groups to liken visitors' traits to animals (Butterfly, fish, grasshopper, and ant). Especially, because variables of their classification like the number of visits and the average time of a visit can estimate in the smart exhibition environment, it can provide theoretical and practical background for our system. Next, we construct a pilot system which automatically selects suitable visitors along the objectives of promotions and instantly provide promotion messages to them. That is, based on the segmentation of our methodology, our system automatically selects suitable visitors along the characteristics of promotions. We adopt this system to real exhibition environment, and analyze data from results of adaptation. As a result, as we classify visitors into four types through their behavioral pattern in the exhibition, we provide some insights for researchers who build the smart exhibition environment and can gain promotion strategies fitting each cluster. First, visitors of ANT type show high response rate for promotion messages except experience promotion. So they are fascinated by actual profits in exhibition area, and dislike promotions requiring a long time. Contrastively, visitors of GRASSHOPPER type show high response rate only for experience promotion. Second, visitors of FISH type appear favors to coupon and contents promotions. That is, although they don't look in detail, they prefer to obtain further information such as brochure. Especially, exhibitors that want to give much information for limited time should give attention to visitors of this type. Consequently, these promotion strategies are expected to give exhibitors some insights when they plan and organize their activities, and grow the performance of them.
https://doi.org/10.13088/jiis.2012.18.3.171 인용 PDF KSCI

Analyzing the User Intention of Booth Recommender System in Smart Exhibition Environment (스마트 전시환경에서 부스 추천시스템의 사용자 의도에 관한 조사연구)

Choi, Jae Ho;Xiang, Jun-Yong;Moon, Hyun Sil;Choi, Il Young;Kim, Jae Kyeong
- Journal of Intelligence and Information Systems
- /
- v.18 no.3
- /
- pp.153-169
- /
- 2012
Exhibitions have played a key role of effective marketing activity which directly informs services and products to current and potential customers. Through participating in exhibitions, exhibitors have got the opportunity to make face-to-face contact so that they can secure the market share and improve their corporate images. According to this economic importance of exhibitions, show organizers try to adopt a new IT technology for improving their performance, and researchers have also studied services which can improve the satisfaction of visitors through analyzing visit patterns of visitors. Especially, as smart technologies make them monitor activities of visitors in real-time, they have considered booth recommender systems which infer preference of visitors and recommender proper service to them like on-line environment. However, while there are many studies which can improve their performance in the side of new technological development, they have not considered the choice factor of visitors for booth recommender systems. That is, studies for factors which can influence the development direction and effective diffusion of these systems are insufficient. Most of prior studies for the acceptance of new technologies and the continuous intention of use have adopted Technology Acceptance Model (TAM) and Extended Technology Acceptance Model (ETAM). Booth recommender systems may not be new technology because they are similar with commercial recommender systems such as book recommender systems, in the smart exhibition environment, they can be considered new technology. However, for considering the smart exhibition environment beyond TAM, measurements for the intention of reuse should focus on how booth recommender systems can provide correct information to visitors. In this study, through literature reviews, we draw factors which can influence the satisfaction and reuse intention of visitors for booth recommender systems, and design a model to forecast adaptation of visitors for booth recommendation in the exhibition environment. For these purposes, we conduct a survey for visitors who attended DMC Culture Open in November 2011 and experienced booth recommender systems using own smart phone, and examine hypothesis by regression analysis. As a result, factors which can influence the satisfaction of visitors for booth recommender systems are the effectiveness, perceived ease of use, argument quality, serendipity, and so on. Moreover, the satisfaction for booth recommender systems has a positive relationship with the development of reuse intention. For these results, we have some insights for booth recommender systems in the smart exhibition environment. First, this study gives shape to important factors which are considered when they establish strategies which induce visitors to consistently use booth recommender systems. Recently, although show organizers try to improve their performances using new IT technologies, their visitors have not felt the satisfaction from these efforts. At this point, this study can help them to provide services which can improve the satisfaction of visitors and make them last relationship with visitors. On the other hands, this study suggests that they managers along the using time of booth recommender systems. For example, in the early stage of the adoption, they should focus on the argument quality, perceived ease of use, and serendipity, so that improve the acceptance of booth recommender systems. After these stages, they should bridge the differences between expectation and perception for booth recommender systems, and lead continuous uses of visitors. However, this study has some limitations. We only use four factors which can influence the satisfaction of visitors. Therefore, we should development our model to consider important additional factors. And the exhibition in our experiments has small number of booths so that visitors may not need to booth recommender systems. In the future study, we will conduct experiments in the exhibition environment which has a larger scale.
https://doi.org/10.13088/jiis.2012.18.3.153 인용 PDF KSCI

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

Kam, Miah;Song, Min
- Journal of Intelligence and Information Systems
- /
- v.18 no.3
- /
- pp.53-77
- /
- 2012
This study analyses the difference of contents and tones of arguments among three Korean major newspapers, the Kyunghyang Shinmoon, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly deliver their own tone of arguments when they talk about some sensitive issues and topics. It could be controversial if readers of newspapers read the news without being aware of the type of tones of arguments because the contents and the tones of arguments can affect readers easily. Thus it is very desirable to have a new tool that can inform the readers of what tone of argument a newspaper has. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main subjects such as Culture, Politics, International, Editorial-opinion, Eco-business and National issues in newspapers, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of news articles. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make it easier to see the differences. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh and the Dong-A Ilbo and these are open to the public. This study used these three Korean major newspapers from KINDS. About 3,030 articles from 2008 to 2012 were used. International, national issues and politics sections were gathered with some specific issues. The International section was collected with the keyword of 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword of '4-major-river.' The Politics section was collected with the keyword of 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 of Eco-business, Culture and Editorial-opinion sections were also collected. All of the collected data were handled and edited into paragraphs. We got rid of stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the paired co-occurrence list of keywords in a paragraph. We made a co-occurrence matrix from the list. Once the co-occurrence matrix was built, we used the Cosine coefficient matrix as input for PFNet(Pathfinder Network). In order to analyze these three newspapers and find out the significant keywords in each paper, we analyzed the list of 10 highest frequency keywords and keyword-networks of 20 highest ranking frequency keywords to closely examine the relationships and show the detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was firstly handled to identify how the tone of argument of a newspaper is different from others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tones, Positive tone and Negative tone. To identify and classify all of the tones of paragraphs and articles we had collected, supervised learning technique was used. The Na$\ddot{i}$ve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in articles. After classification, Precision, Recall and F-value were used to evaluate the results of classification. Based on the results of this study, three subjects such as Culture, Eco-business and Politics showed some differences in contents and tones of arguments among these three newspapers. In addition, for the National issues, tones of arguments on 4-major-rivers project were different from each other. It seems three newspapers have their own specific tone of argument in those sections. And keyword-networks showed different shapes with each other in the same period in the same section. It means that frequently appeared keywords in articles are different and their contents are comprised with different keywords. And the Positive-Negative classification showed the possibility of classifying newspapers' tones of arguments compared to others. These results indicate that the approach in this study is promising to be extended as a new tool to identify the different tones of arguments of newspapers.
https://doi.org/10.13088/jiis.2012.18.3.053 인용 PDF KSCI

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

Kim, Minsung;Im, Il
- Journal of Intelligence and Information Systems
- /
- v.20 no.2
- /
- pp.137-148
- /
- 2014

Recommender system has become one of the most important technologies in e-commerce in these days. The ultimate reason to shop online, for many consumers, is to reduce the efforts for information search and purchase. Recommender system is a key technology to serve these needs. Many of the past studies about recommender systems have been devoted to developing and improving recommendation algorithms and collaborative filtering (CF) is known to be the most successful one. Despite its success, however, CF has several shortcomings such as cold-start, sparsity, gray sheep problems. In order to be able to generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who do not have any evaluations or preference information, therefore, CF cannot come up with recommendations (Cold-star problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (Sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (Gray sheep problem). This study proposes a new algorithm that utilizes Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We utilize 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and they are isolated from other users. Therefore, gray sheep can be identified by calculating degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. Then, different similarity measures and recommendation methods are applied to these two datasets. More detail algorithm is as follows: Step 1: Convert the initial data which is a two-mode network (user to item) into an one-mode network (user to user). Step 2: Calculate degree centrality of each node and separate those nodes having degree centrality values lower than the pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized. Step 3: Ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consist of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric. In order to test performance improvement by this new algorithm, an empirical study was conducted using a publically available dataset - the MovieLens data by GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm utilizing 'Best-N-neighbors' and 'Cosine' similarity method. The empirical results show that F measure was improved about 11% on average when the proposed algorithm was used

. Past studies to improve CF performance typically used additional information other than users' evaluations such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it used SNA to separate dataset. This study shows that performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. This study has several theoretical and practical implications. This study empirically shows that the characteristics of dataset can affect the performance of CF recommender systems. This helps researchers understand factors affecting performance of CF. This study also opens a door for future studies in the area of applying SNA to CF to analyze characteristics of dataset. In practice, this study provides guidelines to improve performance of CF recommender systems with a simple modification.

https://doi.org/10.13088/jiis.2014.20.2.137 인용 PDF KSCI

Scalable Collaborative Filtering Technique based on Adaptive Clustering (적응형 군집화 기반 확장 용이한 협업 필터링 기법)

Lee, O-Joun;Hong, Min-Sung;Lee, Won-Jin;Lee, Jae-Dong
- Journal of Intelligence and Information Systems
- /
- v.20 no.2
- /
- pp.73-92
- /
- 2014
An Adaptive Clustering-based Collaborative Filtering Technique was proposed to solve the fundamental problems of collaborative filtering, such as cold-start problems, scalability problems and data sparsity problems. Previous collaborative filtering techniques were carried out according to the recommendations based on the predicted preference of the user to a particular item using a similar item subset and a similar user subset composed based on the preference of users to items. For this reason, if the density of the user preference matrix is low, the reliability of the recommendation system will decrease rapidly. Therefore, the difficulty of creating a similar item subset and similar user subset will be increased. In addition, as the scale of service increases, the time needed to create a similar item subset and similar user subset increases geometrically, and the response time of the recommendation system is then increased. To solve these problems, this paper suggests a collaborative filtering technique that adapts a condition actively to the model and adopts the concepts of a context-based filtering technique. This technique consists of four major methodologies. First, items are made, the users are clustered according their feature vectors, and an inter-cluster preference between each item cluster and user cluster is then assumed. According to this method, the run-time for creating a similar item subset or user subset can be economized, the reliability of a recommendation system can be made higher than that using only the user preference information for creating a similar item subset or similar user subset, and the cold start problem can be partially solved. Second, recommendations are made using the prior composed item and user clusters and inter-cluster preference between each item cluster and user cluster. In this phase, a list of items is made for users by examining the item clusters in the order of the size of the inter-cluster preference of the user cluster, in which the user belongs, and selecting and ranking the items according to the predicted or recorded user preference information. Using this method, the creation of a recommendation model phase bears the highest load of the recommendation system, and it minimizes the load of the recommendation system in run-time. Therefore, the scalability problem and large scale recommendation system can be performed with collaborative filtering, which is highly reliable. Third, the missing user preference information is predicted using the item and user clusters. Using this method, the problem caused by the low density of the user preference matrix can be mitigated. Existing studies on this used an item-based prediction or user-based prediction. In this paper, Hao Ji's idea, which uses both an item-based prediction and user-based prediction, was improved. The reliability of the recommendation service can be improved by combining the predictive values of both techniques by applying the condition of the recommendation model. By predicting the user preference based on the item or user clusters, the time required to predict the user preference can be reduced, and missing user preference in run-time can be predicted. Fourth, the item and user feature vector can be made to learn the following input of the user feedback. This phase applied normalized user feedback to the item and user feature vector. This method can mitigate the problems caused by the use of the concepts of context-based filtering, such as the item and user feature vector based on the user profile and item properties. The problems with using the item and user feature vector are due to the limitation of quantifying the qualitative features of the items and users. Therefore, the elements of the user and item feature vectors are made to match one to one, and if user feedback to a particular item is obtained, it will be applied to the feature vector using the opposite one. Verification of this method was accomplished by comparing the performance with existing hybrid filtering techniques. Two methods were used for verification: MAE(Mean Absolute Error) and response time. Using MAE, this technique was confirmed to improve the reliability of the recommendation system. Using the response time, this technique was found to be suitable for a large scaled recommendation system. This paper suggested an Adaptive Clustering-based Collaborative Filtering Technique with high reliability and low time complexity, but it had some limitations. This technique focused on reducing the time complexity. Hence, an improvement in reliability was not expected. The next topic will be to improve this technique by rule-based filtering.
https://doi.org/10.13088/jiis.2014.20.2.073 인용 PDF KSCI

Bae, Kyoung-Yul
- Journal of Intelligence and Information Systems
- /
- v.19 no.4
- /
- pp.147-157
- /
- 2013
With the advent of the digital environment that can be accessed anytime, anywhere with the introduction of high-speed network, the free distribution and use of digital content were made possible. Ironically this environment is raising a variety of copyright infringement, and product images used in the online shopping mall are pirated frequently. There are many controversial issues whether shopping mall images are creative works or not. According to Supreme Court's decision in 2001, to ad pictures taken with ham products is simply a clone of the appearance of objects to deliver nothing but the decision was not only creative expression. But for the photographer's losses recognized in the advertising photo shoot takes the typical cost was estimated damages. According to Seoul District Court precedents in 2003, if there are the photographer's personality and creativity in the selection of the subject, the composition of the set, the direction and amount of light control, set the angle of the camera, shutter speed, shutter chance, other shooting methods for capturing, developing and printing process, the works should be protected by copyright law by the Court's sentence. In order to receive copyright protection of the shopping mall images by the law, it is simply not to convey the status of the product, the photographer's personality and creativity can be recognized that it requires effort. Accordingly, the cost of making the mall image increases, and the necessity for copyright protection becomes higher. The product images of the online shopping mall have a very unique configuration unlike the general pictures such as portraits and landscape photos and, therefore, the general image watermarking technique can not satisfy the requirements of the image watermarking. Because background of product images commonly used in shopping malls is white or black, or gray scale (gradient) color, it is difficult to utilize the space to embed a watermark and the area is very sensitive even a slight change. In this paper, the characteristics of images used in shopping malls are analyzed and a watermarking technology which is suitable to the shopping mall images is proposed. The proposed image watermarking technology divide a product image into smaller blocks, and the corresponding blocks are transformed by DCT (Discrete Cosine Transform), and then the watermark information was inserted into images using quantization of DCT coefficients. Because uniform treatment of the DCT coefficients for quantization cause visual blocking artifacts, the proposed algorithm used weighted mask which quantizes finely the coefficients located block boundaries and coarsely the coefficients located center area of the block. This mask improves subjective visual quality as well as the objective quality of the images. In addition, in order to improve the safety of the algorithm, the blocks which is embedded the watermark are randomly selected and the turbo code is used to reduce the BER when extracting the watermark. The PSNR(Peak Signal to Noise Ratio) of the shopping mall image watermarked by the proposed algorithm is 40.7~48.5[dB] and BER(Bit Error Rate) after JPEG with QF = 70 is 0. This means the watermarked image is high quality and the algorithm is robust to JPEG compression that is used generally at the online shopping malls. Also, for 40% change in size and 40 degrees of rotation, the BER is 0. In general, the shopping malls are used compressed images with QF which is higher than 90. Because the pirated image is used to replicate from original image, the proposed algorithm can identify the copyright infringement in the most cases. As shown the experimental results, the proposed algorithm is suitable to the shopping mall images with simple background. However, the future study should be carried out to enhance the robustness of the proposed algorithm because the robustness loss is occurred after mask process.
https://doi.org/10.13088/jiis.2013.19.4.147 인용 PDF KSCI

A Study on the Differences of Information Diffusion Based on the Type of Media and Information (매체와 정보유형에 따른 정보확산 차이에 대한 연구)

Lee, Sang-Gun;Kim, Jin-Hwa;Baek, Heon;Lee, Eui-Bang
- Journal of Intelligence and Information Systems
- /
- v.19 no.4
- /
- pp.133-146
- /
- 2013
While the use of internet is routine nowadays, users receive and share information through a variety of media. Through the use of internet, information delivery media is diversifying from traditional media of one-way communication, such as newspaper, TV, and radio, into media of two-way communication. In contrast of traditional media, blogs enable individuals to directly upload and share news, which can be considered to have a differential speed of information diffusion than news media that convey information unilaterally. Therefore this Study focused on the difference between online news and social media blogs. Moreover, there are variations in the speed of information diffusion because that information closely related to one person boosts communications between individuals. We believe that users' standard of evaluation would change based on the types of information. As well, the speed of information diffusion would change based on the level of proximity. Therefore, the purpose of this study is to examine the differences in information diffusion based on the types of media. And then information is segmentalized and an examination is done to see how information diffusion differentiates based on the types of information. This study used the Bass diffusion model, which has been frequently used because this model has higher explanatory power than other models by explaining diffusion of market through innovation effect and imitation effect. Also this model has been applied a lot in other information diffusion related studies. The Bass diffusion model includes an innovation effect and an imitation effect. Innovation effect measures the early-stage impact, while the imitation effect measures the impact of word of mouth at the later stage. According to Mahajan et al. (2000), Innovation effect is emphasized by usefulness and ease-of-use, as well Imitation effect is emphasized by subjective norm and word-of-mouth. Also, according to Lee et al. (2011), Innovation effect is emphasized by mass communication. According to Moore and Benbasat (1996), Innovation effect is emphasized by relative advantage. Because Imitation effect is adopted by within-group influences and Innovation effects is adopted by product's or service's innovation. Therefore, ours study compared online news and social media blogs to examine the differences between media. We also choose different types of information including entertainment related information "Psy Gentelman", Current affair news "Earthquake in Sichuan, China", and product related information "Galaxy S4" in order to examine the variations on information diffusion. We considered that users' information proximity alters based on the types of information. Hence, we chose the three types of information mentioned above, which have different level of proximity from users' standpoint, in order to examine the flow of information diffusion. The first conclusion of this study is that different media has similar effect on information diffusion, even the types of media of information provider are different. Information diffusion has only been distinguished by a disparity between proximity of information. Second, information diffusions differ based on types of information. From the standpoint of users, product and entertainment related information has high imitation effect because of word of mouth. On the other hand, imitation effect dominates innovation effect on Current affair news. From the results of this study, the flow changes of information diffusion is examined and be applied to practical use. This study has some limitations, and those limitations would be able to provide opportunities and suggestions for future research. Presenting the difference of Information diffusion according to media and proximity has difficulties for generalization of theory due to small sample size. Therefore, if further studies adopt to a request for an increase of sample size and media diversity, difference of the information diffusion according to media type and information proximity could be understood more detailed.
https://doi.org/10.13088/jiis.2013.19.4.133 인용 PDF KSCI

맨앞
이전
536
537
538현재
539
540
다음
맨뒤
538 / 547 pages

Search Result 5,469, Processing Time 0.032 seconds

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

Development of Standard Process for Private Information Protection of Medical Imaging Issuance (개인정보 보호를 위한 의료영상 발급 표준 업무절차 개발연구)

Relationships Among Employees' IT Personnel Competency, Personal Work Satisfaction, and Personal Work Performance: A Goal Orientation Perspective (조직구성원의 정보기술 인적역량과 개인 업무만족 및 업무성과 간의 관계: 목표지향성 관점)

Case Analysis of the Promotion Methodologies in the Smart Exhibition Environment (스마트 전시 환경에서 프로모션 적용 사례 및 분석)

Analyzing the User Intention of Booth Recommender System in Smart Exhibition Environment (스마트 전시환경에서 부스 추천시스템의 사용자 의도에 관한 조사연구)

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

Scalable Collaborative Filtering Technique based on Adaptive Clustering (적응형 군집화 기반 확장 용이한 협업 필터링 기법)

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

A Study on the Differences of Information Diffusion Based on the Type of Media and Information (매체와 정보유형에 따른 정보확산 차이에 대한 연구)

Search Result 5,469, Processing Time 0.032 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)