• Title/Summary/Keyword: 성과정보 활용 (utilization of performance information)

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

  • Kim, Minsung;Im, Il
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.137-148 / 2014
  • Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. CF therefore cannot produce recommendations for new users who have no evaluations or preference information (cold-start problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. Such a sparse dataset makes the computation of recommendations extremely hard (sparsity problem). Because CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem). This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality refers to the number of direct links to and from a node. In a network of users connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from them. Gray sheep can therefore be identified by calculating the degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. Different similarity measures and recommendation methods are then applied to the two datasets. The algorithm in more detail is as follows. Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user). Step 2: Calculate the degree centrality of each node and separate out the nodes whose degree centrality is lower than a pre-set threshold. The threshold value is determined by simulation such that the accuracy of CF on the remaining dataset is maximized. Step 3: Apply an ordinary CF algorithm to the remaining dataset. Step 4: Because the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them, so a 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by their numbers of nodes and summed to give the final performance metric. To test the performance improvement from this new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users of 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used. Past studies aiming to improve CF performance typically used information beyond users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to separate the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. The study has several theoretical and practical implications. It shows empirically that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens the door to future studies that apply SNA to CF in order to analyze the characteristics of datasets. In practice, it provides guidelines for improving the performance of CF recommender systems with a simple modification.
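The four steps described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the linking rule (two users are connected when they share at least one item), the toy `likes` data, the threshold value, and all function names are assumptions made for the example, and Step 3 (ordinary CF on the remaining users) is omitted.

```python
from collections import Counter
from itertools import combinations


def user_projection(likes):
    """Step 1: project a user->items (two-mode) mapping onto a user-user network.

    Assumed linking rule: two users are linked if they share at least one item.
    """
    neighbors = {user: set() for user in likes}
    for u, v in combinations(likes, 2):
        if likes[u] & likes[v]:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def split_gray_sheep(likes, threshold):
    """Step 2: users whose degree centrality is below the threshold are gray sheep."""
    neighbors = user_projection(likes)
    gray = {u for u, nbrs in neighbors.items() if len(nbrs) < threshold}
    return gray, set(likes) - gray


def popular_items(likes, user, n):
    """Step 4: recommend the n most frequently liked items the user does not have."""
    counts = Counter(item for items in likes.values() for item in items)
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in likes[user]][:n]


def weighted_f(f_gray, n_gray, f_rest, n_rest):
    """Final metric: F measures of the two groups weighted by their user counts."""
    return (f_gray * n_gray + f_rest * n_rest) / (n_gray + n_rest)


# Hypothetical toy data: "dan" shares no item with anyone else.
likes = {
    "ann": {"m1", "m2", "m3"},
    "bob": {"m1", "m2"},
    "cat": {"m2", "m3"},
    "dan": {"m9"},
}
gray, others = split_gray_sheep(likes, threshold=1)
recs = popular_items(likes, "dan", n=2)
```

In the abstract the threshold is chosen by simulation so that CF accuracy on the remaining users is maximized; here it is simply fixed at 1 to keep the example small.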

  • Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

    • Choi, Kyungbin;Nam, Kihwan
      • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
    • Online consumers browse products belonging to a particular product line or brand with the intent to purchase, or simply navigate widely and leave without making a purchase. Research on the behavior and purchases of online consumers has progressed steadily, and related services and applications based on consumer behavior data have been developed in practice. In recent years, customization strategies and recommendation systems have come into use thanks to the development of big data technology, and attempts are being made to optimize users' shopping experience. Even so, visits by online consumers rarely convert into purchases. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. To understand the behavior of online consumers, it is therefore important to analyze the various types of visits, not only visits made to purchase. In this study, we applied session-level clustering analysis to the clickstream data of an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, yielding over 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for clustering. Considering the size of the dataset, we performed the analysis with the Mini-Batch K-means algorithm, which has advantages in learning speed and efficiency while maintaining clustering performance similar to that of K-means. The optimal number of clusters was found to be four, and differences in session characteristics and purchase rates were identified for each cluster. An online consumer visits a website several times, learns about a product, and then decides whether to purchase it. To analyze the purchasing process across several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence contains the series of visits leading up to one purchase, and the items that make up a sequence are the cluster labels derived above. We built sequence data separately for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequences. The minimum support was set to 10%, and the frequent patterns consist of sequences of cluster labels. While some patterns are common to both sets of sequences, others are frequent in only one of them. Comparing the extracted frequent patterns, we found that consumers who made purchases showed a pattern of repeatedly visiting to search for a specific product before deciding to purchase it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven point of view. Most studies on the typology of online consumers have focused on the characteristics of each type and on the key factors that distinguish the types. In this study, we classified the behavior of online consumers into types and further analyzed the order in which those types follow one another to form a series of search patterns. In addition, online retailers will be able to try to improve purchase conversion through marketing strategies and recommendations tailored to the various types of visit, and to evaluate the effect of a strategy through changes in consumers' visit patterns.
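The frequent-pattern step in this abstract can be illustrated with a brute-force sketch. The abstract does not say which sequential pattern mining algorithm was used, so the enumeration below is a stand-in rather than the authors' method. The cluster labels (`browse`, `search`, `focus`), the toy sequences, and the `max_len` cap are hypothetical; only the 10% minimum support comes from the abstract.

```python
from itertools import combinations


def frequent_patterns(sequences, min_support, max_len=3):
    """Return {pattern: support} for ordered subsequences of cluster labels.

    Support is the fraction of sequences that contain the pattern as a
    (not necessarily contiguous) subsequence. Brute force: fine for short
    sequences, not for production-scale data.
    """
    counts = {}
    for seq in sequences:
        seen = set()
        for length in range(1, max_len + 1):
            # combinations() keeps the original order, so every tuple it
            # yields is a valid subsequence of seq.
            seen.update(combinations(seq, length))
        for pattern in seen:  # count each pattern once per sequence
            counts[pattern] = counts.get(pattern, 0) + 1
    total = len(sequences)
    return {p: c / total for p, c in counts.items() if c / total >= min_support}


# Hypothetical visit sequences; each item is the cluster label of one visit.
buyers = [
    ["browse", "search", "focus", "focus"],
    ["search", "focus", "focus"],
    ["browse", "focus"],
]
non_buyers = [
    ["browse", "browse"],
    ["browse", "search"],
    ["browse"],
]

buyer_patterns = frequent_patterns(buyers, min_support=0.10)
non_buyer_patterns = frequent_patterns(non_buyers, min_support=0.10)

# Patterns that are frequent only among purchasers.
only_buyers = set(buyer_patterns) - set(non_buyer_patterns)
```

Comparing the two pattern sets, as in the last line, mirrors the comparison between purchasers and non-purchasers described in the abstract.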

    Policies for Improving Thermal Environment Using Vulnerability Assessment - A Case Study of Daegu, Korea - (열취약성 평가를 통한 열환경 개선 정책 제시 - 대구광역시를 사례로 -)

    • KIM, Kwon;EUM, Jeong-Hee
      • Journal of the Korean Association of Geographic Information Studies / v.21 no.2 / pp.1-23 / 2018
    • This study proposes a method for evaluating thermal environment vulnerability that is linked to policies for improving the thermal environment. For this purpose, a variety of indices concerning thermal vulnerability assessment and the climate change adaptation policies of 17 Korean cities were reviewed. In the end, 15 indices associated with policies for improving the thermal environment were selected. The selected indices were applied to Daegu Metropolitan City, South Korea, as a case study. As a result, 15 vulnerability maps based on the standardized indices and a comprehensive map with four grades of thermal vulnerability were produced for Daegu Metropolitan City. The district with the largest first-grade area (most vulnerable to heat) was Dong-gu, followed by Dalseo-gu and Buk-gu, and the neighborhood with the highest ratio of first-grade area was Ansim-1-dong in Dong-gu. Based on the standardized indices, the thermal environment vulnerability of Ansim-1-dong was attributed to the number of basic livelihood security recipients, the number of cardiovascular disease deaths, the heat index, and the Earth's surface temperature. To reduce the thermal environment vulnerability of Ansim-1-dong, active policy implementation is required, including the expansion and maintenance of heat wave shelters, the establishment of a database of the population with diseases susceptible to high temperatures, and the expansion of shaded areas. This study shows the applicability of a vulnerability assessment method linked with policies and is expected to contribute to the strategic and effective establishment of thermal environment policies in urban master district plans.

    Comparison of Deep Learning Frameworks: About Theano, Tensorflow, and Cognitive Toolkit (딥러닝 프레임워크의 비교: 티아노, 텐서플로, CNTK를 중심으로)

    • Chung, Yeojin;Ahn, SungMahn;Yang, Jiheon;Lee, Jaejoon
      • Journal of Intelligence and Information Systems / v.23 no.2 / pp.1-17 / 2017
    • A deep learning framework is software designed to help develop deep learning models. Its important functions include "automatic differentiation" and "utilization of GPU". Popular deep learning frameworks include Caffe (BVLC) and Theano (University of Montreal). Recently, Microsoft's deep learning framework, Microsoft Cognitive Toolkit, was released under an open-source license, following Google's Tensorflow a year earlier. The early deep learning frameworks were developed mainly for research at universities. Beginning with the inception of Tensorflow, however, companies such as Microsoft and Facebook appear to have joined the competition in framework development. Given this trend, Google and other companies are expected to continue investing in deep learning frameworks to take the initiative in the artificial intelligence business. From this point of view, we think it is a good time to compare some deep learning frameworks. We therefore compare three deep learning frameworks that can be used as Python libraries: Google's Tensorflow, Microsoft's CNTK, and Theano, which is something of a predecessor of the other two. The most common and important function of deep learning frameworks is the ability to perform automatic differentiation. Essentially all the mathematical expressions of deep learning models can be represented as computational graphs, which consist of nodes and edges. Partial derivatives on each edge of a computational graph can then be obtained. With these partial derivatives, software can compute the derivative of any node with respect to any variable by applying the chain rule of calculus. In terms of coding convenience, the frameworks rank in the order CNTK, Tensorflow, and Theano. This criterion is based simply on the length of the code; the learning curve and the ease of coding are not the main concern. By this criterion, Theano was the most difficult to implement with, and CNTK and Tensorflow were somewhat easier. With Tensorflow, weight variables and biases must be defined explicitly. CNTK and Tensorflow are easier to implement with because they provide more abstraction than Theano. We should mention, however, that low-level coding is not always bad, because it gives flexibility. With low-level coding such as in Theano, we can implement and test any new deep learning model or any new search method we can think of. Our assessment of execution speed is that there is no meaningful difference between the frameworks. In the experiment, the execution speeds of Theano and Tensorflow were very similar, although the experiment was limited to a CNN model. In the case of CNTK, the experimental environment could not be kept the same: the CNTK code had to be run in a PC environment without a GPU, where code executes as much as 50 times slower than with a GPU. We nevertheless concluded that the difference in execution speed was within the range of variation caused by the different hardware setup. In this study, we compared three deep learning frameworks: Theano, Tensorflow, and CNTK. According to Wikipedia, there are 12 available deep learning frameworks, and 15 different attributes differentiate them. Important attributes include the interface language (Python, C++, Java, etc.) and the availability of libraries for various deep learning models such as CNN, RNN, and DBN. For a user implementing a large-scale deep learning model, support for multiple GPUs or multiple servers will also be important. For someone learning deep learning, it is also important that enough examples and references are available.
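The automatic differentiation described in this abstract (local partial derivatives stored on the edges of a computational graph, then combined by the chain rule) can be sketched in plain Python. This is a toy reverse-mode example written for illustration; it does not reproduce the API of Theano, Tensorflow, or CNTK, and the `Node` class and `backward` function are hypothetical names.

```python
class Node:
    """A value in a computational graph, with edges to the nodes it was built from."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each edge is (parent_node, partial derivative of this node w.r.t. parent).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, visited = [], set()

    def visit(node):
        # Post-order DFS: a node is appended only after all of its parents.
        if id(node) in visited:
            return
        visited.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order processes every node after all nodes that consume it,
    # so node.grad is complete before it is pushed along the edges (chain rule).
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x = Node(2.0)
w = Node(3.0)
b = Node(1.0)
y = w * x + b   # y = 7.0
z = y * y       # z = (w*x + b)^2 = 49.0
backward(z)
```

After `backward(z)`, `w.grad`, `x.grad`, and `b.grad` hold the derivatives of `z` with respect to each input, which is what a framework computes when it trains a model by gradient descent.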

    A Study on the Application and Development of Contents through Digitalizing Korean Patterns (한국문양의 디지털컨텐츠 개발과 활용에 관한 연구)

    • 박현택
      • Archives of design research / v.16 no.3 / pp.201-210 / 2003
    • The world is preparing for another unseen war: the cultural war of the digital economy, which will dominate the new millennium. As "contents", which are composed of various ingredients of media, gain vitality, the developed nations are preparing for this war with the weapons of the "cultural industry". Experts on the digital economy say that nations left behind will become economic colonies in the new millennium. The most important characteristic of the cultural industry is the unity of creativity and culture, which is further enhanced on the basis of a culture built upon knowledge. This leads to competition between nations and regions, and to survive, each has to develop its industrial structure through recognition of its own cultural value. Furthermore, what is needed is not short-term development of and investment in cultural products but a study of how to graft cultural value onto industry itself. The multimedia era does not accept an independent medium, and contents products are becoming the leading industry since it is proved that they last semi-permanently in the digital world. As the capability and variety of media technology advance in accordance with market principles, victory lies in the quality and quantity of contents. Since culture, science, and the economy are becoming one complex structure, all nations of the world are trying to evolve a unique design of their own culture on the basis of global universality. Consequently, we should excavate uniqueness from our cultural heritage and develop it into a Korean design that will be recognized in the world market. The value of our cultural property should not only be used for academic and research purposes but should be re-evaluated from a modern viewpoint, recognized as a core element that decides the quality of life, and developed into exclusive designs. Korean designs represent the form concept of our people, which evolves from the forms or shape alphabet of Korea. To meet the requirements of a changing world and to prepare for the age of cultural competition, it is never too early to build data on Korean designs through their analysis and evaluation.

    Purification Efficiency of Slop & Plane Water Treatment Part of SRT System Using Eco-Concrete (Eco-Concrete를 이용한 SRT System의 사면수처리부와 평면수처리부의 정화효율분석)

    • Jang, Won-Geun;Park, Jae-Young;Choi, I-Song;Chang, Jun-Young;Oh, Jong-Min
      • Proceedings of the Korea Water Resources Association Conference / 2006.05a / pp.1860-1864 / 2006
    • This study concerns the SRTS (Stormwater Runoff Treatment System), a method for reducing the pollutant load that enters streams through stormwater runoff and combined sewer overflows during rainfall. A pilot plant modeled on the high-water floodplain, the embankment slope, and the riverside terrace was built and operated continuously on a trial basis. The slope water treatment part and the plane water treatment part inside the SRT system were filled with porous concrete. Vegetation was established on top of the system so that the roots reach the water surface and absorb nutrients; for this purpose, detachable vegetation pots, regular hexagonal for the slope water treatment part and rectangular for the plane water treatment part, were devised. Inside the system, connecting pipes were attached between the soil and the water treatment tank so that the soil absorbs moisture by capillary action. The pilot plant was divided into an inlet, a slope water treatment part, a plane water treatment part, and an outlet. The inlet consisted of inflow pumps and a V-notch; two inflow pumps were installed and operated alternately at 1-hour intervals so that the flow rate could be controlled under continuous inflow. The plane water treatment part ($W\,1.0\,\mathrm{m} \times L\,2.4\,\mathrm{m} \times H\,0.6\,\mathrm{m}$) is a rectangular contact oxidation tank with a hopper at the bottom for sludge settling and storage, so that sludge can be collected and withdrawn smoothly. A rectangular weir was installed at the outlet. The pH of the stormwater runoff was $7.27 \sim 7.92$ and its DO concentration was $7.12 \sim 7.88$ mg/l. The secondary treated water had an average pH of 7.4 and a DO concentration ranging from 4.5 mg/l to 8.9 mg/l, with an average of 6.8 mg/l. The T-N and T-P concentrations of the stormwater runoff influent ranged over $17.5 \sim 22.5$ mg/l and $8.9 \sim 11.4$ mg/l, respectively, and were similar to those of the secondary treated water influent.

    Practical Study on Learning Effects of University e-Learning (대학 e-러닝 학습효과에 관한 실증연구)

    • Kim, Joon-Ho
      • Information Systems Review / v.12 no.3 / pp.19-48 / 2010
    • Drawing on prior conceptual and empirical studies of e-Learning, this study characterized the factors that help learners maintain their interest in learning and that maximize learning effects, the top-priority purpose of university e-Learning, and then examined those factors empirically. It also analyzed which factors have the greater influence on the learning effects of e-Learning in general. Moreover, whereas many existing studies examined learning effects only as a single factor, this study analyzed them in detail by dividing them into three items: learning satisfaction, learning transfer, and learning recommendation. To this end, it defined three factors, namely learning contents, instructional design, and user convenience, on the assumption that they have a significant influence on the learning effects of e-Learning. The learning contents factor includes three detailed elements (learning issue and objective, knowledge information, and consistency and propriety), and the instructional design factor includes four (interest and sympathy, interaction, contents presentation, and explanatory strategy). Lastly, the user convenience factor includes two detailed elements: screen configuration, and check-up of contents and teaching schedule. The analysis showed that all three factors (learning contents, instructional design, and user convenience) have a significant influence on the learning effects of e-Learning (i.e., learning satisfaction, learning transfer, and learning recommendation). In more detail, the learning issue and objective element of the learning contents factor has the greatest influence on learning satisfaction. To raise learning satisfaction, it is therefore most important to set the learning issue and objective with priority given to learners and to make the learning objective measurable. The contents presentation element of the instructional design factor has the greatest influence on learning transfer. To raise learning transfer, it is therefore most important to structure the mutual relations and presentation order of the contents so as to promote learning systematically, and to let learners access them. Moreover, the interest and sympathy element of the instructional design factor has the greatest influence on learning recommendation. It is thus most important to stimulate learners' interest as much as possible using well-timed media and to deliver lectures that arouse learners' sympathy.

    Citing Behavior of Korean Scientists on Foreign Journals in KSCD (KSCD를 활용한 국내 과학기술자의 해외 학술지 인용행태 연구)

    • Kim, Byung-Kyu;Kang, Mu-Yeong;Choi, Seon-Heui;Kim, Soon-Young;You, Beom-Jong;Shin, Jae-Do
      • Journal of the Korean Society for Information Management / v.28 no.2 / pp.117-133 / 2011
    • There has been little comprehensive research on the impact of foreign journals on Korean scientists. The main reason is that there was no extensive citation index database of domestic journals available for analysis. The Korea Institute of Science and Technology Information (KISTI) built the Korea Science Citation Database (KSCD) and has provided the Korea Science Citation Index (KSCI) and Korea Journal Citation Reports (KJCR) services. In this article, the citing behavior of Korean scientists toward foreign journals was examined using KSCD, which covers Korean core journals. This research covers (1) analysis of the types of foreign documents cited, (2) analysis of citation counts of foreign journals by subject and the ratio of citations to other disciplines, (3) analysis of the language and country of the foreign documents cited, (4) analysis of the publishers of journals and whether or not the journals are listed in global citation index services, and (5) analysis of the current state of subscriptions to foreign electronic journals in Korea. The results would be useful for establishing strategies for licensing foreign electronic journals and for information services. The research identified the immediacy citation rate (average 1.46%), peak time (average 3.9 years), and half-life (average 8 years) of cited foreign journals. It was also found that Korean scientists tend to cite journals covered in SCI(E) or SCOPUS, and that 90% of cited foreign journals have been licensed by institutions in Korea.

    Locates the Sunken Ship 'Dmitri Donskoi' using Marine Geophysical Survey Techniques in Deep Water (지구물리 탐사기법을 이용한 심해 Dmitri Donskoi호 확인)

    • Yoo, Hai-Soo;Kim, Su-Jeong;Park, Dong-Won
      • 한국지구물리탐사학회:학술대회논문집 / 2004.08a / pp.104-117 / 2004
    • The Dmitri Donskoi, which sank during the Russo-Japanese War 100 years ago, was found using geophysical exploration techniques at a water depth of 400 m in a submarine valley off Jeodong, Ulleung Island. In particular, in a submarine area with rugged seabed topography and volcanic seamounts, reliable seabed images were acquired using the mid-to-shallow multibeam exploration technique. The strength of corrosion (causticity) of the sunken Donskoi, measured by the electrochemical method, had decreased to 2/5 of the original strength.

    Forest Policy of Democratic People's Republic of Korea Represented in RodongShinmun (「로동신문」에 드러난 북한의 산림정책)

    • Song, Minkyung;Park, Mi-Sun;Youn, Yeo-Chang
      • Journal of Environmental Policy / v.11 no.3 / pp.123-148 / 2012
    • Deforestation and forest degradation in the Democratic People's Republic of Korea (DPRK) accelerated from the mid-1980s through the economic crisis of the 1990s and are still continuing. The DPRK has conducted afforestation and reforestation activities to counter this trend. However, there are few official documents on the achievements of forest rehabilitation in the DPRK. "Rodong Shinmun," the official newspaper published by the North Korean Workers' Party, represents governmental policies and is one of the few sources of information on the DPRK that are accessible in the Republic of Korea (ROK). This paper investigates the national forest policies of the DPRK as represented in Rodong Shinmun. A total of 499 articles with the words 'Sanlim (forest)' or 'Rimsan (forest product)' in the title were selected for content analysis. The national forest plans and forest policy instruments contained in the selected articles were analyzed. The subjects of the represented forest policies were classified into four groups: forestation, forest management, land management, and forest protection or conservation. The focus of forest policy changed from the economic utilization of forest resources, such as timber production, in the 1990s to forest protection in the 2000s. Rodong Shinmun reported more frequently on regulatory and informational instruments than on economic instruments. Official commendations and awards were the main incentives given to people who contributed to forestry achievements. In particular, forest policies were emphasized by Kim Il Sung and Kim Jong Il, and afforestation and forest protection were described as patriotic activities in Rodong Shinmun. In conclusion, this research revealed that Rodong Shinmun serves as a means of introducing, propagating, and instigating forest policies in DPRK society. The findings help in understanding the forest policies of the DPRK, which could be useful when designing development aid for the DPRK.
