• Title/Summary/Keyword: network management service

Search Result 2,056, Processing Time 0.032 seconds

Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

  • Kim, Minsung;Im, Il
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.137-148 / 2014
  • Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies on recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who have no evaluations or preference information, therefore, CF cannot come up with recommendations (cold-start problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem). This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from them. Therefore, gray sheep can be identified by calculating the degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. 
Then, different similarity measures and recommendation methods are applied to these two datasets. The algorithm in more detail is as follows: Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user). Step 2: Calculate the degree centrality of each node and separate those nodes having degree centrality values lower than the pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized. Step 3: An ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric. To test the performance improvement of this new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data by the GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm utilizing the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used. Past studies to improve CF performance typically used additional information other than users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it used SNA to separate the dataset. This study shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. This study has several theoretical and practical implications. It empirically shows that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens a door for future studies that apply SNA to CF to analyze the characteristics of a dataset. In practice, this study provides guidelines for improving the performance of CF recommender systems with a simple modification.
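Steps 1, 2, and 4 of the algorithm described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rule that two users are linked when they share at least one item, the function names, the threshold value, and the toy data are all assumptions made for illustration, and rating values are ignored. Step 3 (ordinary CF on the remaining users) is omitted.

```python
from collections import Counter
from itertools import combinations

def user_projection(ratings):
    """Step 1: project user-item data onto a user-user network.
    Assumed linking rule: two users are connected if they share an item."""
    item_users = {}
    for user, items in ratings.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)
    neighbors = {user: set() for user in ratings}
    for users in item_users.values():
        for a, b in combinations(sorted(users), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def split_gray_sheep(ratings, threshold):
    """Step 2: users with degree centrality below the threshold are gray sheep."""
    neighbors = user_projection(ratings)
    gray = {u for u, nbrs in neighbors.items() if len(nbrs) < threshold}
    return gray, set(ratings) - gray

def popular_items(ratings, user, n=2):
    """Step 4: recommend the most frequently rated items the user has not rated."""
    counts = Counter(item for items in ratings.values() for item in items)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in ratings[user]][:n]

# Toy data (hypothetical): user -> set of rated items.
ratings = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"B", "C"},
    "u4": {"Z"},  # shares no item with anyone, so its degree is 0
}
gray, mainstream = split_gray_sheep(ratings, threshold=1)
recommended = popular_items(ratings, "u4")
```

In the study the threshold is tuned by simulation so that CF accuracy on the remaining users is maximized; here it is simply fixed by hand.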

  • Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

    • Kim, Dasom;Kim, Namgyu
      • Journal of Intelligence and Information Systems / v.22 no.4 / pp.193-215 / 2016
    • In recent years, the proliferation of diverse social networking services has led users to use many mediums simultaneously depending on their individual purpose and taste. Besides, while collecting information about particular themes, they usually employ various mediums such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through diverse mediums is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the peculiar standard of each source. Likewise, with different viewpoints of definition and levels of specification for each source, similar categories can be named and structured differently in accordance with each source. To overcome these limitations, this study proposes a plan for conducting category mapping between different sources with various mediums while maintaining the existing category system of the medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the result of such a classification as extra attributes, this study proposes a logical layer by which users can search for a specific document from multiple heterogeneous sources with different category names as if they belong to the same source. Besides, by collecting 6,000 articles of news from two Internet news portals, experiments were conducted to compare accuracy among sources, supervised learning and semi-supervised learning, and homogeneous and heterogeneous learning data. 
It is particularly interesting that, in some categories, the classification accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning using homogeneous learning data. This study is significant in the following respects. First, it proposes a logical plan for establishing a system that integrates and manages heterogeneous media with different classification systems while keeping each existing physical classification system as it is. The results also show that classification accuracy varies considerably with the heterogeneity of the learning data; this is expected to spur further studies that enhance the performance of the proposed methodology by analyzing the characteristics of each category. In addition, with increasing demand for searching, collecting, and analyzing documents from diverse media, the scope of Internet search is no longer restricted to one medium. However, since each medium has a different category structure and naming, it is very difficult to search for a specific category across heterogeneous media. The proposed methodology is also significant in that, when users select a desired site, it lets them query all documents according to that site's category classification standards, while keeping each existing site's characteristics and structure as they are. The proposed methodology needs to be further complemented in the following respects. First, because only an indirect comparison and evaluation of the performance of the proposed methodology was made, future studies need to test its accuracy more directly. That is, after re-classifying the documents of the target source on the basis of the category system of the existing source, the accuracy of the classification needs to be verified through evaluation by actual users. 
In addition, classification accuracy needs to be increased by making the methodology more sophisticated. Furthermore, understanding the characteristics of the categories in which heterogeneous semi-supervised learning showed higher classification accuracy than supervised learning may help in obtaining heterogeneous documents from diverse media and in finding ways to use them to enhance document classification accuracy.
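The "logical layer" idea in this abstract, re-classifying each document under every source's category scheme and storing the results as extra attributes, can be sketched as follows. This is only an illustration under stated assumptions: the keyword-overlap classifier is a toy stand-in for the supervised and semi-supervised learners the study compares, and the source names, keyword profiles, and documents are hypothetical.

```python
def classify(text, keyword_profile):
    """Toy stand-in classifier: pick the category whose keywords overlap the text most."""
    words = set(text.lower().split())
    return max(sorted(keyword_profile), key=lambda cat: len(words & keyword_profile[cat]))

def add_mapped_categories(doc, profiles):
    """Store the document's category under every source's scheme as an extra
    attribute; the document's original category is left untouched."""
    doc["mapped"] = {src: classify(doc["text"], prof) for src, prof in profiles.items()}
    return doc

def search(docs, source, category):
    """Query all documents as if they belonged to `source`'s category system."""
    return [d["id"] for d in docs if d["mapped"][source] == category]

# Hypothetical category schemes of two sources.
profiles = {
    "portal_a": {"IT": {"app", "software"}, "Travel": {"trip", "flight", "hotel"}},
    "portal_b": {"Life and Culture": {"trip", "hotel", "food"},
                 "Technology": {"app", "software", "phone"}},
}
docs = [
    {"id": 1, "source": "portal_a", "category": "IT",
     "text": "New app update for phone software"},
    {"id": 2, "source": "portal_b", "category": "Life and Culture",
     "text": "Cheap flight and hotel for a weekend trip"},
]
docs = [add_mapped_categories(d, profiles) for d in docs]
travel_in_a = search(docs, "portal_a", "Travel")
```

Document 2 comes from `portal_b`, yet it can be retrieved through `portal_a`'s "Travel" category because the mapping is stored alongside the original label instead of replacing it.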

    Perception and Appraisal of Urban Park Users Using Text Mining of Google Maps Review - Cases of Seoul Forest, Boramae Park, Olympic Park - (구글맵리뷰 텍스트마이닝을 활용한 공원 이용자의 인식 및 평가 - 서울숲, 보라매공원, 올림픽공원을 대상으로 -)

    • Lee, Ju-Kyung;Son, Yong-Hoon
      • Journal of the Korean Institute of Landscape Architecture / v.49 no.4 / pp.15-29 / 2021
    • The study aims to grasp the perception and appraisal of urban park users through text analysis. This study used Google review data provided by Google Maps. Google Maps Review is an online review platform that provides information evaluating locations through social media and provides an understanding of locations from the perspective of general reviewers and regional guides who are registered as members of Google Maps. The study determined whether Google Maps reviews were useful for extracting meaningful information about user perceptions and appraisals for park management plans. The study chose three urban parks in Seoul, South Korea: Seoul Forest, Boramae Park, and Olympic Park. Review data for each of these three parks were collected via web crawling using Python. Through text analysis, the keywords and network structure characteristics for each park were analyzed. Park ratings were analyzed along with the text, and the reviews of residents and foreign tourists were compared. The common keywords found in the review comments for the three parks were "walking", "bicycle", "rest" and "picnic" for activities, "family", "child" and "dogs" for accompanying types, and "playground" and "walking trail" for park facilities. Looking at the characteristics of each park, Seoul Forest shows many nature-based outdoor activities, while the lack of parking spaces and congestion on weekends negatively affected users. Boramae Park has the appearance of a city park, with various facilities providing numerous activities, but reviewers often cited the park's complexity and negative aspects of dog-walking groups. At Olympic Park, large-scale complex facilities and cultural events were frequently mentioned, emphasizing its entertainment functions. Google Maps Review can serve as useful data for identifying park users' overall experiences and general feelings. 
Compared to data from other social media sites, Google Maps Review data provide ratings and help reveal the factors behind user satisfaction and dissatisfaction.
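The keyword and network analysis mentioned in this abstract can be illustrated with a small Python sketch that counts keyword frequencies and keyword co-occurrence edges per review. The keyword set is taken from the abstract; the review texts, the simple whitespace tokenization, and the function name are assumptions, and the study's actual text-mining pipeline is not described here.

```python
from collections import Counter
from itertools import combinations

# Keywords reported in the abstract (single-word ones only).
KEYWORDS = {"walking", "bicycle", "rest", "picnic", "family", "child", "dogs", "playground"}

def keyword_stats(reviews):
    """Return keyword frequencies and co-occurrence counts (network edges).
    Two keywords are linked when they appear in the same review."""
    freq, edges = Counter(), Counter()
    for review in reviews:
        tokens = {w.strip(".,!?").lower() for w in review.split()}
        found = sorted(tokens & KEYWORDS)
        freq.update(found)
        edges.update(combinations(found, 2))
    return freq, edges

# Hypothetical review texts, not real Google Maps data.
reviews = [
    "Great for walking with family.",
    "Family picnic and walking trail!",
    "Bicycle paths, good rest spots",
]
freq, edges = keyword_stats(reviews)
```

The `edges` counter is an edge list with weights, which is the usual input for computing network structure measures such as centrality.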

    A Study on the Determinants of Patent Citation Relationships among Companies : MR-QAP Analysis (기업 간 특허인용 관계 결정요인에 관한 연구 : MR-QAP분석)

    • Park, Jun Hyung;Kwahk, Kee-Young;Han, Heejun;Kim, Yunjeong
      • Journal of Intelligence and Information Systems / v.19 no.4 / pp.21-37 / 2013
    • With the advent of the knowledge-based society, more people are becoming interested in intellectual property. In particular, the ICT companies leading the high-tech industry are striving to manage intellectual property systematically. Patent information represents the intellectual capital of a company, and quantitative analysis of the continuously accumulating patent information has now become possible. Patent information can also be analyzed at various levels, from the patent level to the enterprise, industry, and country levels. Through patent information, we can identify the status of a technology and analyze the impact of performance. We can also trace the flow of knowledge through network analysis. In doing so, we can not only identify changes in technology but also predict the direction of future research. In the field of network analysis, there are two important analyses that use patent citation information: citation indicator analysis, which uses citation frequency, and network analysis, which is based on citation relationships. This study additionally analyzes whether company size affects patent citation relationships. Seventy-four S&P 500 companies that provide IT and communication services were selected for this study. To determine the patent citation relationships between the companies, patent citations in 2009 and 2010 were collected, and sociomatrices showing the patent citation relationships between the companies were created. In addition, the companies' total assets were collected as an index of company size. The distance between companies is defined as the absolute value of the difference between their total assets, and the simple (signed) difference is taken to describe the hierarchy between companies. 
QAP correlation analysis and MR-QAP analysis were carried out using the distance and hierarchy between companies together with the sociomatrices showing the patent citations in 2009 and 2010. In the QAP correlation analysis, the 2009 and 2010 inter-company patent citation networks show the highest correlation. In addition, a positive correlation is found between the patent citation relationships and the distance between companies, because patent citation relationships increase when there is a difference in size between companies. A negative correlation is found between the patent citation relationships and the hierarchy between companies, which indicates that the patents of higher-tier companies are relatively highly valued and influence lower-tier companies. The MR-QAP analysis was carried out as follows. The sociomatrix generated from the 2010 patent citation relationships is used as the dependent variable, and the 2009 patent citation network and the distance and hierarchy networks between the companies are used as the independent variables. The MR-QAP analysis was performed to find the main factors influencing the patent citation relationships between the companies in 2010. The results show that all independent variables positively influenced the 2010 patent citation relationships. In particular, the 2009 patent citation relationships have the most significant impact on those of 2010, which means that patent citation relationships persist over time. The results of the QAP correlation analysis and the MR-QAP analysis show that the patent citation relationships between companies are affected by the size of the companies. 
However, the most significant influence is the patent citation relationships established in the past. Maintaining patent citation relationships between companies matters because examining these relationships may be strategically important for sharing intellectual property and may also serve as an important aid in identifying partner companies to cooperate with.
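The distance and hierarchy matrices defined in this abstract, and a QAP correlation between two sociomatrices, can be sketched in plain Python as follows. This is a minimal illustration, not the authors' procedure or any particular software package: the toy asset and citation values are invented, the function names are assumptions, and MR-QAP (the regression variant) is not implemented. The QAP step permutes the rows and columns of one matrix together and compares the observed correlation with the permuted ones.

```python
import random

def distance_matrix(assets):
    """Distance: absolute difference in total assets."""
    return [[abs(x - y) for y in assets] for x in assets]

def hierarchy_matrix(assets):
    """Hierarchy: simple (signed) difference in total assets."""
    return [[x - y for y in assets] for x in assets]

def off_diagonal(matrix, order=None):
    """Flatten the off-diagonal cells, optionally relabelling nodes by `order`."""
    n = len(matrix)
    order = order or list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def qap_correlation(a, b, permutations=1000, seed=0):
    """Observed correlation plus the share of permutations at least as extreme."""
    rng = random.Random(seed)
    observed = pearson(off_diagonal(a), off_diagonal(b))
    order = list(range(len(a)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # same relabelling applied to rows and columns
        if abs(pearson(off_diagonal(a, order), off_diagonal(b))) >= abs(observed):
            hits += 1
    return observed, hits / permutations

# Toy values (hypothetical), not data from the study.
assets = [10.0, 40.0, 25.0, 90.0]
citations_2009 = [[0, 3, 1, 0], [2, 0, 0, 1], [1, 1, 0, 0], [5, 2, 1, 0]]
citations_2010 = [[0, 4, 1, 0], [2, 0, 1, 1], [1, 0, 0, 0], [6, 2, 2, 0]]
r, p = qap_correlation(citations_2009, citations_2010, permutations=200)
```

Permuting node labels instead of individual cells keeps each matrix's row and column structure intact, which is why QAP is preferred over an ordinary significance test for dyadic data.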

    Antecedents of Trust and Effects on Committment in B2B e-Marketplace (B2B 마켓플레이스에서 신뢰의 선행요인과 몰입에 미치는 영향)

    • Oh, Sang-Hyun;Kim, Sang-Hyeon
      • Journal of Distribution Research / v.13 no.1 / pp.1-33 / 2008
    • As interest in business-to-business (B2B) electronic commerce increases, many companies are participating in B2B e-Marketplaces. An e-Marketplace is defined as a virtual market in which many players take part to transact. The e-Marketplace has influenced the manner in which organizational buyers and sellers interact. As a result, it is important to develop an understanding of the behaviors of firms that use these electronic marketplaces. The purpose of this study is to develop a comprehensive model of trust and commitment in B2B e-Marketplaces and to examine their structural relationships empirically. Drawing on trust and commitment theory in the interorganizational relationship and B2B electronic commerce context, this study identifies network externality, interactivity, justice, quality of information sharing, and institutional assurance as the determinants of trust and commitment in e-Marketplaces. The proposed model hypothesizes that (1) trust is a function of network externality, interactivity, justice, quality of information sharing, and institutional assurance, (2) attitudinal and behavioral commitment are functions of trust, and (3) behavioral commitment is a function of attitudinal commitment. The proposed model is tested using organizational-level survey data from 187 buying organizations that conduct business in MRO e-Marketplaces. The data were analyzed with a reliability test, correlation analysis, exploratory factor analysis, confirmatory factor analysis, and covariance structure analysis. The results indicate that (1) trust is influenced by network externality, interactivity, justice, and institutional assurance, (2) attitudinal commitment and behavioral commitment are influenced by trust, and (3) behavioral commitment is influenced by attitudinal commitment. The empirical results also confirm that trust plays a strong, central role in determining e-Marketplace commitment. 
The key theoretical contribution of this research is that it begins to extend the interorganizational information system literature to areas such as the B2B Internet e-Marketplace. Managerially, this study contributes to the understanding of the role of B2B e-Marketplace providers in the Internet context. Limitations of this study and guidelines for future research are also discussed.

    A Study on the Implications of Korea Through the Policy Analysis of AI Start-up Companies in Major Countries (주요국 AI 창업기업 정책 분석을 통한 국내 시사점 연구)

    • Kim, Dong Jin;Lee, Seong Yeob
      • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.19 no.2 / pp.215-235 / 2024
    • As artificial intelligence (AI) technology is recognized as a key technology that will determine future national competitiveness, competition among major countries over AI technology and industry promotion policies is intensifying. This study aims to present implications for domestic policy making by analyzing the policies of major countries on AI start-ups, which are the basis of the AI industry ecosystem. The top four countries and the EU in terms of the number of companies newly attracting investment, according to the 2023 AI Index published by the HAI Research Institute at Stanford University in the United States, were selected. The United States enacted the National AI Initiative Act (NAIIA) in 2021. Through this law, the US government is promoting continued US leadership in AI R&D, developing reliable AI systems in the public and private sectors, building an AI system ecosystem across society, and strengthening database management of and access to AI policies carried out by all federal agencies. In the 14th Five-Year Plan (2021-2025) and 2035 Long-term Goals adopted in 2021, China specified AI as the first of seven strategic high-tech technologies, and it is developing policies aimed at becoming the No. 1 global AI powerhouse by 2030. The UK is investing in innovative R&D companies through the 'Future Fund Breakthrough' (2021) and is expanding related investments by preparing national strategies to become an AI leader, such as the 2022 implementation plan for its national AI strategy. Israel supports technology investment in start-ups centered on the Innovation Agency, which leads mid- to long-term investments of 2 to 15 years and regulatory reforms for new technologies. The EU is strengthening its digital innovation hub network and creating InvestEU (European Strategic Investment Fund) and an AI investment fund to support the use of AI by SMEs. 
This study aims to contribute by analyzing the AI start-up policies of major foreign countries and providing a basis for Korea's search for a strategy. The limitations of the study are the limited set of countries analyzed and the lack of a comparative analysis of the countries' policy environments under the same conditions.

    Introduction on the Products and the Quality Management Plans for GOCI-II (천리안 해양위성 2호 산출물 및 품질관리 계획)

    • Lee, Sun-Ju;Lee, Kyeong-Sang;Han, Tae Hyun;Moon, Jeong-Eon;Bae, Sujung;Choi, Jong-kuk
      • Korean Journal of Remote Sensing / v.37 no.5_2 / pp.1245-1257 / 2021
    • GOCI-II, succeeding the mission of GOCI, was launched in February 2020 and has been in regular operation since October 2020. The Korea Institute of Ocean Science and Technology (KIOST) processes and produces Level-1B and 26 Level-2 outputs in real time, which are then provided by the Korea Hydrographic and Oceanographic Agency (KHOA). We introduce the current status of regular GOCI-II operation and present future improvements. Basic GOCI-II products, including chlorophyll-a, total suspended materials, and colored dissolved organic matter concentration, are derived by the OC4 and YOC algorithms, which are described in detail. For the full disk (FD), the imaging schedule was established considering solar zenith angle and sun glint during the in-orbit test, and was then improved by further considering satellite zenith angle. The number of slots satisfying the 'Best Ocean' condition significantly increased from 15 to 78. GOCI-II calibration requirements, based on those of the European Space Agency (ESA), and candidate fixed locations for calibrating the local observation area were presented. The quality management of FD uses research ships and overseas bases of KIOST, but it is necessary to establish an international calibration/validation network. These results are expected to enhance users' understanding of output processing and help establish detailed plans for future quality management tasks.

    Cybertrap : Unknown Attack Detection System based on Virtual Honeynet (Cybertrap : 가상 허니넷 기반 신종공격 탐지시스템)

    • Kang, Dae-Kwon;Hyun, Mu-Yong;Kim, Chun-Suk
      • The Journal of the Korea institute of electronic communication sciences / v.8 no.6 / pp.863-871 / 2013
    • Recently, the application of open protocols and external network linkage to national critical infrastructure has been growing with the development of information and communication technologies. This trend means that national critical infrastructure is exposed to cyber attacks and can be seriously jeopardized if it is remotely operated or controlled by viruses, crackers, or cyber terrorists. In this paper, we propose a virtual Honeynet model that reduces the installation and operation resource problems of a Honeynet system. It maintains the merits of the Honeynet system while adopting virtualization technology. The model also minimizes operating cost through a data collection and analysis technique based on verification of attack intention and a focus-oriented analysis technique. Using the proposed model, we design and implement Cybertrap, a detection system for new types of attack based on the virtual Honeynet, with a host and data collection technique based on verification of attack intention and a network attack pattern visualization technique. To test the proposed system, we establish a test-bed and evaluate its functionality and performance through a series of experiments.

    Intelligent Web Crawler for Supporting Big Data Analysis Services (빅데이터 분석 서비스 지원을 위한 지능형 웹 크롤러)

    • Seo, Dongmin;Jung, Hanmin
      • The Journal of the Korea Contents Association / v.13 no.12 / pp.575-584 / 2013
    • The data types used for big-data analysis are very diverse, including news, blogs, SNS, papers, patents, and sensed data. In particular, the use of web documents that offer reliable data in real time is gradually increasing. Web crawlers that collect web documents automatically have grown in importance because big data is being used in many different fields and web data are growing exponentially every year. However, existing web crawlers cannot collect all the web documents in a web site because they follow only the URLs included in documents already collected from that site. Existing web crawlers may also collect web documents that other web crawlers have already collected, because information about the documents collected by each crawler is not managed efficiently across crawlers. Therefore, this paper proposes a distributed web crawler. To resolve these problems, the proposed web crawler collects web documents through the RSS of each web site and the Google search API. The web crawler provides fast crawling performance through a client-server model based on RMI and NIO that minimizes network traffic. Furthermore, the web crawler extracts the core content from a web document by comparing keyword similarity on the tags included in the document. Finally, to verify the superiority of our web crawler, we compare it with existing web crawlers in various experiments.
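Two ideas from this abstract, seeding the crawl from a site's RSS feed and sharing collection state so that crawlers do not re-collect each other's documents, can be sketched in Python as follows. This is an in-process illustration only: the paper's system is a client-server design based on RMI and NIO, whereas the `SharedRegistry` class, the function names, and the sample feed here are assumptions, and no network access is performed.

```python
import xml.etree.ElementTree as ET

def links_from_rss(rss_xml):
    """Extract document URLs from the <item><link> elements of an RSS feed."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("link").strip()
            for item in root.iter("item") if item.findtext("link")]

class SharedRegistry:
    """Stand-in for the server-side record of URLs already collected by any crawler."""
    def __init__(self):
        self._seen = set()

    def claim(self, url):
        """Return True only for the first crawler that asks for this URL."""
        if url in self._seen:
            return False
        self._seen.add(url)
        return True

def crawl(crawler_name, rss_xml, registry):
    """Collect only the feed URLs that no other crawler has claimed yet."""
    return [(crawler_name, url) for url in links_from_rss(rss_xml) if registry.claim(url)]

# Hypothetical feed content.
FEED = """<rss version="2.0"><channel>
<item><title>a</title><link>https://example.com/a</link></item>
<item><title>b</title><link>https://example.com/b</link></item>
</channel></rss>"""

registry = SharedRegistry()
first = crawl("crawler-1", FEED, registry)
second = crawl("crawler-2", FEED, registry)  # nothing left to claim
```

A real distributed deployment would need the registry to live in a shared service and to be safe under concurrent access, which this single-process sketch does not address.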

    Design and Implementation of a Web Server Using a Learning-based Dynamic Thread Pool Scheme (학습 기반의 동적 쓰레드 풀 기법을 적용한 웹 서버의 설계 및 구현)

    • Yoo, Seo-Hee;Kang, Dong-Hyun;Lee, Kwon-Yong;Park, Sung-Yong
      • Journal of KIISE:Computing Practices and Letters / v.16 no.1 / pp.23-34 / 2010
    • As the number of users increases with improvements in networks, multi-thread schemes are used to process the service requests of several users who are connected simultaneously. The static thread pool scheme has the problem of occupying a fixed amount of system resources. The dynamic thread pool scheme, on the other hand, can control the number of threads according to users' requests, but it cannot react to requests that exceed the assigned maximum value. In this paper, we propose a web server that uses a learning-based dynamic thread pool scheme for server programming in a multi-thread environment. The proposed scheme adds thread creation, based on predicting the next number of periodic requests with an Auto Regressive scheme, to the Apache web server's worker MPM (Multi-Processing Module). Unlike previous schemes, in order to set the exact number of threads needed while the number of work requests stays unchanged over a certain period, the K-Nearest Neighbor algorithm is used to learn in advance the number of threads for a given number of requests. The required number of threads is set by comparison with the previously learned objects: similar objects are selected to decide the number of threads for the request, and the threads are created accordingly. With the proposed scheme, response time decreased because the number of threads is modified dynamically, and system resources can be used more efficiently by managing the number of threads according to the requests.
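The two learning components named in this abstract can be illustrated with a short Python sketch: an autoregressive prediction of the next request count, and a K-Nearest Neighbor lookup that maps a request count to a thread count. This is a sketch under stated assumptions, not the paper's implementation: an AR(1) model fitted by least squares is assumed (the abstract does not give the model order), averaging the neighbors' thread counts is an assumed decision rule, and all numbers are invented.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (assumed model order: 1)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    phi = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return my - phi * mx, phi

def predict_next(series):
    """Predict the number of requests in the next period."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]

def knn_threads(requests, learned, k=3):
    """Average the thread counts of the k learned samples closest in request count."""
    nearest = sorted(learned, key=lambda pair: abs(pair[0] - requests))[:k]
    return round(sum(threads for _, threads in nearest) / len(nearest))

# Hypothetical history of requests per period and learned (requests, threads) samples.
history = [100, 120, 140, 160, 180]
learned = [(50, 5), (100, 10), (150, 15), (200, 20), (250, 25)]

expected_requests = predict_next(history)
pool_size = knn_threads(expected_requests, learned)
```

The predicted request count lets the pool grow before the load arrives, while the KNN lookup replaces a fixed maximum with a value learned from earlier observations.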

