• Title/Summary/Keyword: Intelligent Control

GIS-based Market Analysis and Sales Management System : The Case of a Telecommunication Company (시장분석 및 영업관리 역량 강화를 위한 통신사의 GIS 적용 사례)

  • Chang, Nam-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.2
    • /
    • pp.61-75
    • /
    • 2011
  • A Geographic Information System (GIS) is a system that captures, stores, analyzes, manages, and presents data with reference to geographic location. In the late 1990s and early 2000s its use was largely limited to government sectors such as public utility management, urban planning, landscape architecture, and environmental contamination control. However, a growing number of open-source packages running on a range of operating systems has enabled many private enterprises to explore viewing GIS-based sales and customer data on their own computer monitors. K telecommunication company has dominated the Korean telecommunication market by providing diverse services such as high-speed internet, PSTN (Public Switched Telephone Network), VoIP (Voice over Internet Protocol), and IPTV (Internet Protocol Television). Even though the telecommunication market in Korea is huge, competition among the major service providers is fiercer than ever. Service providers have struggled to acquire as many new customers as possible, attempted to cross-sell more products to their existing customers, and put more effort into retaining their best customers by offering unprecedented benefits. Most service providers, including K telecommunication company, tried to adopt customer relationship management (CRM) and to analyze customers' demographic and transactional data statistically in order to understand customer behavior. However, customer information management remained at a basic level, and neither the quality nor the quantity of customer data was sufficient to understand customers or to design marketing and sales strategies. For example, the 3,074 legal regional divisions then in use, originally defined by the government, were too broad for calculating subscription and cancellation ratios at the sub-regional level.
Additional external data such as house size, house price, and household demographics were also needed to measure sales potential. Furthermore, producing tables and reports was time-consuming, and they were insufficient for making a clear judgment about the market situation. In 2009, the company needed a dramatic shift in the way it conducted marketing and sales activities, and it developed a dedicated GIS-based market analysis and sales management system. This system greatly improved the efficiency with which the company could manage and organize all customer and sales-related information and access that information easily and visually. After the GIS information system was developed and applied to marketing and sales activities at the corporate level, the company reportedly increased its sales and market share substantially. This was because, by analyzing past market and sales initiatives, estimating sales potential, and targeting key markets, the system could make suggestions and enable the company to focus its resources on the demographics most likely to respond to a promotion. This paper reviews the subjective and unclear marketing and sales activities that K telecommunication company previously conducted and introduces the whole process of developing the GIS information system. The process consists of the following five modules: (1) customer profile cleansing and standardization, (2) internal/external DB enrichment, (3) segmentation of the 3,074 legal regions into 46,590 sub-regions called blocks, (4) GIS data mart design, and (5) GIS system construction. The objective of this case study is to emphasize the need for GIS systems and to show how they work in private enterprises by reviewing the development process of K company's market analysis and sales management system. We hope that this paper offers valuable guidelines to companies that are considering introducing or constructing a GIS information system.

A digital Audio Watermarking Algorithm using 2D Barcode (2차원 바코드를 이용한 오디오 워터마킹 알고리즘)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.2
    • /
    • pp.97-107
    • /
    • 2011
  • Copyright infringement is a growing problem on the Internet because digital content on the network can be copied and delivered easily, and a copy has the same quality as the original. Copyright owners and content providers therefore want a powerful solution to protect their content. One popular solution was DRM (digital rights management), which is based on encryption technology and rights control. However, DRM-free services were launched after Steve Jobs, then CEO of Apple, proposed a new music service paradigm without DRM, and DRM has since disappeared from the online music market. Even though online music services decided not to adopt DRM, copyright owners and content providers are still searching for a way to protect their content. One technology that can replace DRM is digital audio watermarking, which embeds copyright information into the music. In this paper, the author proposes a new audio watermarking algorithm based on two approaches. First, the watermark information is generated as a two-dimensional barcode that carries an error correction code, so the information can recover itself if the errors fall within the error tolerance. Second, the algorithm uses the chip sequences of CDMA (code division multiple access). Together these make the algorithm robust to several malicious attacks. There are many 2D barcodes. In particular, the QR code, one of the matrix barcodes, can express information more freely than other matrix barcodes. The QR code has double-square patterns at three corners, which indicate the boundary of the symbol. This feature makes the QR code well suited to expressing watermark information. That is, because the QR code is a 2D, nonlinear, matrix code, it can be modulated onto a spread spectrum and used in the watermarking algorithm.
The proposed algorithm assigns a different spread-spectrum sequence to each user. When the assigned code sequences are orthogonal, the watermark information of an individual user can be identified from an audio content. The algorithm uses the Walsh code as the orthogonal code. The watermark information is rearranged from the 2D barcode into a 1D sequence and modulated by the Walsh code. The modulated watermark information is embedded into the DCT (discrete cosine transform) domain of the original audio content. For the performance evaluation, three audio samples were used: "Amazing Grace", "Oh! Carol", and "Take Me Home, Country Roads". The attacks for the robustness test were MP3 compression, an echo attack, and a subwoofer boost. The MP3 compression was performed with Cool Edit Pro 2.0, using CBR (constant bit rate) at 128 kbps, 44,100 Hz, stereo. The echo attack used an echo with an initial volume of 70%, a decay of 75%, and a delay of 100 ms. The subwoofer boost attack modified the low-frequency part of the Fourier coefficients. The test results showed that the proposed algorithm is robust to these attacks. Under the MP3 attack, the strength of the watermark information was not affected, and the watermark could be detected in all of the audio samples. Under the subwoofer boost attack, the watermark was detected when the strength was 0.3. Under the echo attack, the watermark could be identified when the strength was greater than or equal to 0.5.
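
The per-user Walsh-code modulation and correlation-based detection described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the Sylvester-Hadamard construction, the all-zero `host` list standing in for DCT coefficients, the `strength` parameter, and the short bit sequences are all assumptions chosen for illustration.

```python
def walsh_codes(order):
    """Return 2**order mutually orthogonal +1/-1 codes (Sylvester-Hadamard construction)."""
    codes = [[1]]
    for _ in range(order):
        top = [row + row for row in codes]
        bottom = [row + [-x for x in row] for row in codes]
        codes = top + bottom
    return codes


def embed(coeffs, bits, code, strength):
    """Spread each watermark bit (+1 or -1) over len(code) coefficients and add it in."""
    marked = list(coeffs)
    n = len(code)
    for i, bit in enumerate(bits):
        for j, chip in enumerate(code):
            marked[i * n + j] += strength * bit * chip
    return marked


def detect(coeffs, code, n_bits):
    """Recover one user's bits by correlating each segment with that user's code."""
    n = len(code)
    bits = []
    for i in range(n_bits):
        corr = sum(coeffs[i * n + j] * code[j] for j in range(n))
        bits.append(1 if corr >= 0 else -1)
    return bits


codes = walsh_codes(3)       # eight codes of length 8
host = [0.0] * 32            # assumed stand-in for the DCT coefficients of the audio
bits_a = [1, -1, 1, 1]       # hypothetical flattened barcode modules of user A
bits_b = [-1, -1, 1, -1]     # hypothetical flattened barcode modules of user B
marked = embed(embed(host, bits_a, codes[1], 0.5), bits_b, codes[2], 0.5)
recovered_a = detect(marked, codes[1], 4)
recovered_b = detect(marked, codes[2], 4)
```

Because the two codes are orthogonal, each user's correlation cancels the other user's contribution. With a real host signal the host's own correlation acts as interference, which is presumably why the abstract reports detection thresholds in terms of watermark strength.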

Learning Material Bookmarking Service based on Collective Intelligence (집단지성 기반 학습자료 북마킹 서비스 시스템)

  • Jang, Jincheul;Jung, Sukhwan;Lee, Seulki;Jung, Chihoon;Yoon, Wan Chul;Yi, Mun Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.179-192
    • /
    • 2014
  • In line with recent changes in the information technology environment, online learning environments that support the participation of many users, such as MOOCs (Massive Open Online Courses), have become important. One of the largest professional associations in information technology, the IEEE Computer Society, announced that "Supporting New Learning Styles" is a crucial trend in 2014. Popular MOOC services such as Coursera and edX have continued to build active learning environments with a large number of lectures accessible anywhere on smart devices, and they are used by an increasing number of users. In addition, collaborative web services (e.g., blogs and Wikipedia) support the creation of various user-uploaded learning materials, so a vast amount of new lectures and learning materials is created every day in the online space. However, it is difficult for an online educational system to sustain a learner's motivation because learning occurs remotely, with limited capability to share knowledge among learners. Thus, it is essential to understand which materials each learner needs and how to motivate learners to participate actively in an online learning system. To overcome these issues, drawing on constructivism theory and collective intelligence, we have developed a social bookmarking system called WeStudy, which supports the sharing of learning materials among users and provides personalized learning material recommendations. Constructivism theory argues that knowledge is constructed as learners interact with the world.
Collective intelligence can be separated into two types: (1) collaborative collective intelligence, which is built on direct collaboration among participants (e.g., Wikipedia), and (2) integrative collective intelligence, which produces new forms of knowledge by combining independent and distributed information through highly advanced technologies and algorithms (e.g., Google PageRank, recommender systems). A recommender system, one example of integrative collective intelligence, uses the online activities of users to recommend what they may be interested in. Our system includes both collaborative and integrative collective intelligence functions. We analyzed well-known web services based on collective intelligence, such as Wikipedia, Slideshare, and Videolectures, to identify the main design factors that support collective intelligence. Based on this analysis, in addition to sharing online resources through social bookmarking, we selected three essential functions for our system: (1) multimodal visualization of learning materials in two forms (list and graph), (2) personalized recommendation of learning materials, and (3) explicit designation by learners of their interests. After developing the web-based WeStudy system, we conducted usability testing through a heuristic evaluation that included seven heuristic indices: features and functionality, cognitive page, navigation, search and filtering, control and feedback, forms, and context and text. We recruited 10 experts who had majored in Human-Computer Interaction and worked in the same field, and requested both quantitative and qualitative evaluations of the system. The evaluation results show that, relative to the other functions evaluated, the list/graph page produced higher scores on all indices except context and text. For context and text, the learning material page produced the best score.
In general, the explicit designation by learners of their interests, one of the system's distinctive functions, received lower scores on all usability indices because the functionality was unfamiliar to users. In summary, the evaluation results show that our system achieves high usability and good performance, with some minor issues that need to be fully addressed before the system is released publicly to large-scale users. The study findings provide practical guidelines for the design and development of various systems that utilize collective intelligence.

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

  • Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.147-161
    • /
    • 2010
  • With the recent convergence of broadcasting and communication, communication services have been joined to TV, and TV viewing has changed in many ways. IPTV (Internet Protocol Television) delivers information services, movie content, and broadcasts over the Internet, combining live programs with VOD (video on demand). Because it uses the communication network, it has become a new business area. It has also raised new technical issues, such as imaging technology for the service, networking technology that avoids video interruptions, and security technology to protect copyright. Through an IPTV network, users can watch the programs they want when they want. However, finding programs on IPTV is difficult, whether through search or through menus. The menu approach takes a long time to reach the desired program. The search approach fails when the title, genre, or names of actors are not known. In addition, entering text with a remote control is cumbersome. The bigger problem, however, is that users are often unaware of the services available to them. Thus, to resolve the difficulty of selecting VOD services in IPTV, a personalized recommendation service is proposed, which enhances users' satisfaction and makes efficient use of their time. To address these shortcomings of IPTV, this paper provides programs suited to individual users, saving them time, through a filtering and recommendation system. The proposed recommendation system collects TV program information, the user's preferred program genres and sub-genres, channels, watched programs, and viewing times from individual IPTV viewing records. To find similar programs, similarities are compared using an ontology for TV programs, because the similarity comparison makes it possible to measure the distance between programs. The TV program ontology used here is extracted from TV-Anytime metadata, which represents semantic properties.
The ontology also expresses program contents and features numerically. Vocabulary similarity is determined through WordNet. All the words that describe a program are expanded into their upper and lower classes to decide word similarity, and the average over the descriptive keywords is measured. Based on the calculated distance, similar programs are grouped by the K-medoids partitioning method, which divides objects into groups with similar characteristics. The K-medoids method sets K representative objects, and each remaining object is provisionally assigned to the cluster of the nearest representative object. To divide the initial n objects into K groups, the algorithm first selects representative objects provisionally and then searches for the optimal representative objects through repeated trials. In this way, similar programs are clustered. When programs are selected through cluster analysis, weights are applied to the recommendation as follows. When each cluster recommends programs, programs close to the representative object are recommended to the user. The distance is calculated with the same formula used to measure similarity, and it is the basic figure that determines the ranking of recommended programs. A weight is also calculated from the number of programs in the user's watch list: the more programs from a cluster the user has watched, the higher the weight. This is defined as the cluster weight. Using it, the TV programs that represent each cluster are selected. However, the cluster-representative TV programs contain errors, so weights for the user's TV program viewing preference are added to determine the final ranks. Content that the user is likely to prefer is then recommended on this basis. An experiment using the proposed method was carried out in a controlled environment.
The experiment shows that the proposed method outperforms existing methods.
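
The K-medoids partitioning step in this abstract can be sketched as below. This is a simple alternating variant (assign objects to the nearest medoid, then re-pick each medoid), not the paper's exact procedure; the function names, the naive initialisation, and the one-dimensional toy distances are assumptions, and in the paper the distance matrix would come from the ontology-based program similarity.

```python
def assign(dist, medoids):
    """Map each medoid to the indices of the objects nearest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Alternate assignment and medoid update on a precomputed distance matrix."""
    medoids = list(range(k))  # naive initialisation (assumption): the first k objects
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        updated = []
        for m, members in clusters.items():
            if not members:  # keep a medoid that attracted no objects
                updated.append(m)
                continue
            # new medoid: the member with the smallest total distance to its cluster
            updated.append(min(members, key=lambda c: sum(dist[c][j] for j in members)))
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Toy data: six "programs" placed on a line, forming two obvious groups.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, clusters = k_medoids(dist, 2)
```

Working on a precomputed distance matrix keeps the sketch independent of how program distance is defined, which matters here because the abstract derives distance from an ontology rather than from numeric coordinates.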

Personal Information Overload and User Resistance in the Big Data Age (빅데이터 시대의 개인정보 과잉이 사용자 저항에 미치는 영향)

  • Lee, Hwansoo;Lim, Dongwon;Zo, Hangjung
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.125-139
    • /
    • 2013
  • Big data refers to data that cannot be processed with conventional data technologies. As smart devices and social network services produce vast amounts of data, big data is attracting much attention from researchers. There is strong demand from governments and industries for big data, as it can create new value by drawing business insights from data. As various new technologies for processing big data have been introduced, academic communities have also shown much interest in the big data domain. Notable advances related to big data technology have been made in various fields. Big data technology makes it possible to access, collect, and store individuals' personal data. These technologies enable the analysis of huge amounts of data at lower cost and in less time, which is impossible with traditional methods. They can even detect personal information that people do not want to disclose. Therefore, people who use information technology such as the Internet or online services have some level of privacy concern, and such feelings can hinder continued use of information systems. For example, SNS offers various benefits, but users are sometimes highly exposed to privacy intrusions because they post too much personal information on it. Even though users post their personal information on the Internet themselves, the data is sometimes not under their control. Once private data is posted on the Internet, it can be transferred anywhere with a few clicks and can be abused to create a fake identity. In this way, privacy intrusion happens. This study investigates how perceived personal information overload in SNS affects users' risk perception and information privacy concerns. It also examines the relationship between these concerns and user resistance behavior. A survey and structural equation modeling are employed for data collection and analysis.
This study contributes meaningful insights for academic researchers and policy makers who are planning to develop guidelines for privacy protection. The study shows that information overload on social network services can significantly increase users' perceived level of privacy risk. In turn, perceived privacy risk leads to an increased level of privacy concern. If privacy concerns increase, users may form a negative or resistant attitude toward system use, and this resistant attitude may lead them to discontinue the use of social network services. Furthermore, the effect of information overload on privacy concerns is mediated by perceived risk rather than being direct. This implies that resistance to system use can be diminished by reducing users' perceived risks. Given that users' resistant behavior becomes salient when they have high privacy concerns, measures to alleviate users' privacy concerns should be devised. This study makes an academic contribution by integrating traditional information overload theory and user resistance theory to investigate perceived privacy concerns in current IS contexts. Because the research topic has only just emerged, little big data research has examined the technology with an empirical and behavioral approach. The study also makes practical contributions. Information overload is connected to an increased level of perceived privacy risk and to discontinued use of the information system. To keep users from leaving the system, organizations should develop systems in which private data can be controlled and managed with ease. This study suggests that actions to lower the level of perceived risk and privacy concern should be taken to support information systems continuance.

Development of System for Real-Time Object Recognition and Matching using Deep Learning at Simulated Lunar Surface Environment (딥러닝 기반 달 표면 모사 환경 실시간 객체 인식 및 매칭 시스템 개발)

  • Jong-Ho Na;Jun-Ho Gong;Su-Deuk Lee;Hyu-Soung Shin
    • Tunnel and Underground Space
    • /
    • v.33 no.4
    • /
    • pp.281-298
    • /
    • 2023
  • Continuous research efforts are being devoted to unmanned mobile platforms for lunar exploration. There is an ongoing demand for real-time information processing to accurately position and map areas of interest on the lunar surface. To apply deep learning processing and analysis techniques to practical rovers, research on software integration and optimization is imperative. In this study, a foundational investigation was conducted on real-time analysis of images of a virtual lunar base construction site, aimed at automatically quantifying the spatial information of key objects. The study transitioned from an existing region-based object recognition algorithm to a bounding box-based algorithm, which improved object recognition accuracy and inference speed. To facilitate object matching training on extensive data, the Batch Hard Triplet Mining technique was introduced, and both the training and inference processes were optimized. Furthermore, an improved software system for object recognition and identical-object matching was integrated, together with visualization software for automatically matching identical objects within input images. Using simulated satellite-captured video data for training and video data captured by a moving object for inference, training and inference for identical-object matching were successfully executed. The outcomes of this research suggest the feasibility of implementing 3D spatial information based on continuously captured video data from mobile platforms and using it to position objects within regions of interest. These findings are expected to contribute to the integration of an automated on-site system for video-based construction monitoring and control of significant target objects at future lunar base construction sites.
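
The abstract names Batch Hard Triplet Mining without describing it. As it is commonly formulated, each anchor in a batch is paired with its farthest same-label sample and its nearest different-label sample. The sketch below follows that common formulation; the Euclidean distance, the `margin` value, and the toy embeddings are assumptions and may differ from the authors' setup.

```python
import math


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean hinge loss over anchors, using the hardest positive and hardest negative."""
    losses = []
    for a, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [
            math.dist(anchor, e)
            for i, (e, l) in enumerate(zip(embeddings, labels))
            if l == label and i != a
        ]
        negatives = [
            math.dist(anchor, e)
            for e, l in zip(embeddings, labels)
            if l != label
        ]
        if not positives or not negatives:
            continue  # an anchor needs at least one positive and one negative
        # hardest positive = farthest same-label sample; hardest negative = nearest other-label sample
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    return sum(losses) / len(losses) if losses else 0.0


# Two well-separated objects: every hardest triplet already satisfies the margin.
separated = batch_hard_triplet_loss([(0, 0), (0, 1), (5, 0), (5, 1)], [0, 0, 1, 1])

# Interleaved objects: every anchor violates the margin by the same amount.
interleaved = batch_hard_triplet_loss([(0, 0), (3, 0), (1, 0), (4, 0)], [0, 0, 1, 1])
```

Mining only the hardest pair per anchor concentrates the training signal on the cases where two views of the same object are far apart or two different objects look alike, which is the situation an identical-object matcher has to resolve.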

Digital Archives of Cultural Archetype Contents: Its Problems and Direction (디지털 아카이브즈의 문제점과 방향 - 문화원형 콘텐츠를 중심으로 -)

  • Hahm, Han-Hee;Park, Soon-Cheol
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.17 no.2
    • /
    • pp.23-42
    • /
    • 2006
  • This is a study of the digital archives of Culturecontent.com, where 'Cultural Archetype Contents' are currently in service. One of the major purposes of our study is to point out problems in the current system and eventually propose improvements to the digital archives. The government launched a four-year project to develop cultural archetype content sources and establish related businesses, with the hope of enhancing the nation's competitiveness. More specifically, the project focuses on the production of source materials for cultural archetype contents on the subjects of Korea's history, tradition, everyday life, arts, and general geographical books. In addition, through this project, the government also intends to establish a proper distribution system for digitalized culture contents and to control copyright issues. This paper analyzes the digital archives system that stores the culture content data produced from 2002 to 2005 and evaluates the current system's weaknesses and strengths. Our findings are summarized as follows. First, the digital archives system does not contain a semantic search engine, and therefore its functionality lags. Second, similar data are not classified into the same categories but into different ones, which confuses and inconveniences users. Users who want to find source materials could be disappointed by the current distribution system. Our paper suggests a better digital archives system with text mining technology, which consists of five significant intelligent processes: keyword search, summarization, clustering, classification, and topic tracking. Our paper endeavors to develop the best technical environment for preserving and using culture contents data. With the new, upgraded digital settings, users of culture contents data will discover a world of new knowledge. The technology we introduce in this paper will lead to the highest achievable digital intelligence through a new framework.

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.147-157
    • /
    • 2013
  • With the advent of a digital environment that can be accessed anytime and anywhere over high-speed networks, the free distribution and use of digital content have become possible. Ironically, this environment has given rise to a variety of copyright infringements, and product images used in online shopping malls are frequently pirated. Whether shopping mall images are creative works is a controversial issue. According to a Supreme Court decision in 2001, advertising photographs of ham products merely reproduce the appearance of the objects and were not recognized as creative expression. The photographer's losses were nevertheless recognized, and damages were estimated at the typical cost of an advertising photo shoot. According to a Seoul District Court precedent in 2003, if the photographer's personality and creativity are present in the selection of the subject, the composition of the set, the control of the direction and amount of light, the camera angle, the shutter speed, the timing of the shot, other shooting methods, and the developing and printing process, the work should be protected by copyright law. For shopping mall images to receive copyright protection under the law, they must do more than simply convey the state of the product; effort is required so that the photographer's personality and creativity can be recognized. Accordingly, the cost of making shopping mall images increases, and the need for copyright protection grows. Product images in online shopping malls have a very distinctive composition, unlike general pictures such as portraits and landscape photos, and therefore general image watermarking techniques cannot satisfy their watermarking requirements.
Because the background of product images commonly used in shopping malls is white, black, or a grayscale gradient, it is difficult to use that space to embed a watermark, and the area is very sensitive to even a slight change. In this paper, the characteristics of images used in shopping malls are analyzed, and a watermarking technique suited to shopping mall images is proposed. The proposed technique divides a product image into smaller blocks, transforms the blocks with the DCT (discrete cosine transform), and inserts the watermark information by quantizing the DCT coefficients. Because uniform quantization of the DCT coefficients causes visible blocking artifacts, the proposed algorithm uses a weighted mask that quantizes the coefficients located at block boundaries finely and the coefficients located in the center of the block coarsely. This mask improves the subjective visual quality as well as the objective quality of the images. In addition, to improve the security of the algorithm, the blocks in which the watermark is embedded are selected randomly, and a turbo code is used to reduce the BER when the watermark is extracted. The PSNR (peak signal-to-noise ratio) of shopping mall images watermarked by the proposed algorithm is 40.7 to 48.5 dB, and the BER (bit error rate) after JPEG compression with QF = 70 is 0. This means that the watermarked image is of high quality and that the algorithm is robust to the JPEG compression generally used in online shopping malls. The BER is also 0 for a 40% change in size and for 40 degrees of rotation. In general, shopping malls use compressed images with a QF higher than 90. Because a pirated image is a replica of the original image, the proposed algorithm can identify copyright infringement in most cases. As the experimental results show, the proposed algorithm is suitable for shopping mall images with simple backgrounds.
However, future work should enhance the robustness of the proposed algorithm, because some robustness is lost after the mask process.
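
The abstract says the watermark is inserted by quantizing DCT coefficients but does not give the rule. One common way to do this is parity-based quantization index modulation, sketched below as an assumption rather than the author's scheme. The block DCT, the weighted mask (which would vary `step` per coefficient), the random block selection, and the turbo code are all omitted, and the coefficient values are made up.

```python
def embed_bit(coeff, bit, step):
    """Move coeff to the nearest quantization level whose index parity equals bit."""
    q = round(coeff / step)
    if q % 2 != bit:
        # step to the neighbouring level on the side the coefficient already leans toward
        q += 1 if coeff / step >= q else -1
    return q * step


def extract_bit(coeff, step):
    """Read the bit back as the parity of the nearest quantization index."""
    return round(coeff / step) % 2


step = 4.0                            # assumed quantization step
coeffs = [13.2, -7.9, 0.4, 25.0]      # hypothetical DCT coefficients
bits = [1, 0, 1, 0]                   # watermark bits to hide
marked = [embed_bit(c, b, step) for c, b in zip(coeffs, bits)]

# A distortion smaller than half a step leaves the extracted bits unchanged.
noisy = [c + 1.5 for c in marked]
recovered = [extract_bit(c, step) for c in noisy]
```

The step size sets the trade-off the abstract reports on: a larger step survives stronger compression but changes each coefficient more, lowering PSNR. A mask that uses fine steps near block boundaries and coarse steps in the block center is one way to shift that trade-off spatially.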

A Study on the Application of Outlier Analysis for Fraud Detection: Focused on Transactions of Auction Exception Agricultural Products (부정 탐지를 위한 이상치 분석 활용방안 연구 : 농수산 상장예외품목 거래를 대상으로)

  • Kim, Dongsung;Kim, Kitae;Kim, Jongwoo;Park, Steve
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.93-108
    • /
    • 2014
  • To support business decision making, interests and efforts to analyze and use transaction data in different perspectives are increasing. Such efforts are not only limited to customer management or marketing, but also used for monitoring and detecting fraud transactions. Fraud transactions are evolving into various patterns by taking advantage of information technology. To reflect the evolution of fraud transactions, there are many efforts on fraud detection methods and advanced application systems in order to improve the accuracy and ease of fraud detection. As a case of fraud detection, this study aims to provide effective fraud detection methods for auction exception agricultural products in the largest Korean agricultural wholesale market. Auction exception products policy exists to complement auction-based trades in agricultural wholesale market. That is, most trades on agricultural products are performed by auction; however, specific products are assigned as auction exception products when total volumes of products are relatively small, the number of wholesalers is small, or there are difficulties for wholesalers to purchase the products. However, auction exception products policy makes several problems on fairness and transparency of transaction, which requires help of fraud detection. In this study, to generate fraud detection rules, real huge agricultural products trade transaction data from 2008 to 2010 in the market are analyzed, which increase more than 1 million transactions and 1 billion US dollar in transaction volume. Agricultural transaction data has unique characteristics such as frequent changes in supply volumes and turbulent time-dependent changes in price. Since this was the first trial to identify fraud transactions in this domain, there was no training data set for supervised learning. So, fraud detection rules are generated using outlier detection approach. 
We assume that outlier transactions have more possibility of fraud transactions than normal transactions. The outlier transactions are identified to compare daily average unit price, weekly average unit price, and quarterly average unit price of product items. Also quarterly averages unit price of product items of the specific wholesalers are used to identify outlier transactions. The reliability of generated fraud detection rules are confirmed by domain experts. To determine whether a transaction is fraudulent or not, normal distribution and normalized Z-value concept are applied. That is, a unit price of a transaction is transformed to Z-value to calculate the occurrence probability when we approximate the distribution of unit prices to normal distribution. The modified Z-value of the unit price in the transaction is used rather than using the original Z-value of it. The reason is that in the case of auction exception agricultural products, Z-values are influenced by outlier fraud transactions themselves because the number of wholesalers is small. The modified Z-values are called Self-Eliminated Z-scores because they are calculated excluding the unit price of the specific transaction which is subject to check whether it is fraud transaction or not. To show the usefulness of the proposed approach, a prototype of fraud transaction detection system is developed using Delphi. The system consists of five main menus and related submenus. First functionalities of the system is to import transaction databases. Next important functions are to set up fraud detection parameters. By changing fraud detection parameters, system users can control the number of potential fraud transactions. Execution functions provide fraud detection results which are found based on fraud detection parameters. The potential fraud transactions can be viewed on screen or exported as files. The study is an initial trial to identify fraud transactions in Auction Exception Agricultural Products. 
Many research topics on this issue remain. First, the scope of the analyzed data was limited by data availability. It is necessary to include more data on transactions, wholesalers, and producers to detect fraudulent transactions more accurately. Next, the scope of fraud transaction detection needs to be extended to fishery products. There are also many possibilities for applying different data mining techniques to fraud detection. For example, a time series approach is a potential technique for this problem. Although outlier transactions are detected here based on the unit prices of transactions, it is also possible to derive fraud detection rules based on transaction volumes.
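
The Self-Eliminated Z-score described in this abstract can be illustrated with a minimal sketch. The abstract states only that the score excludes the unit price of the transaction being checked; everything else here is an assumption made for illustration: the function names `self_eliminated_z_scores` and `flag_outliers`, the use of the sample standard deviation, the threshold of 3.0, and the price values. The original prototype was written in Delphi, so this Python version is not the authors' implementation.

```python
import math
from statistics import mean, stdev


def self_eliminated_z_scores(unit_prices):
    """Z-score of each price against the mean and spread of all OTHER prices."""
    prices = list(unit_prices)
    if len(prices) < 3:
        raise ValueError("need at least three prices so the others have a spread")
    scores = []
    for i, price in enumerate(prices):
        others = prices[:i] + prices[i + 1:]  # leave the checked transaction out
        mu = mean(others)
        sigma = stdev(others)  # sample standard deviation (an assumption)
        if sigma == 0:
            # All other prices are identical: any deviation is unbounded.
            scores.append(0.0 if price == mu else math.copysign(math.inf, price - mu))
        else:
            scores.append((price - mu) / sigma)
    return scores


def flag_outliers(unit_prices, threshold=3.0):
    """Indices whose absolute self-eliminated Z exceeds a hypothetical threshold."""
    scores = self_eliminated_z_scores(unit_prices)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Illustrative unit prices only; these are not data from the study.
prices = [100, 102, 98, 101, 99, 300]
ordinary_z = (prices[-1] - mean(prices)) / stdev(prices)
print(round(ordinary_z, 2), round(self_eliminated_z_scores(prices)[-1], 2))
print(flag_outliers(prices))
```

With these six illustrative prices, the ordinary Z-value of the last price is only about 2.04 because the extreme price inflates both the mean and the standard deviation, whereas its self-eliminated score is about 126.5. This is the masking effect the abstract attributes to small numbers of wholesalers.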

Pareto Ratio and Inequality Level of Knowledge Sharing in Virtual Knowledge Collaboration: Analysis of Behaviors on Wikipedia (지식 공유의 파레토 비율 및 불평등 정도와 가상 지식 협업: 위키피디아 행위 데이터 분석)

  • Park, Hyun-Jung;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.19-43
    • /
    • 2014
  • The Pareto principle, also known as the 80-20 rule, states that roughly 80% of the effects come from 20% of the causes for many events, including natural phenomena. It has been recognized as a golden rule in business, with wide application of findings such as 20 percent of customers accounting for 80 percent of total sales. On the other hand, the Long Tail theory, which points out that "the trivial many" produce more value than "the vital few," has gained popularity in recent times as the development of ICT (Information and Communication Technology) has tremendously reduced distribution and inventory costs. This study set out to illuminate how these two primary business paradigms, the Pareto principle and the Long Tail theory, relate to the success of virtual knowledge collaboration. The importance of virtual knowledge collaboration is soaring in this era of globalization and virtualization, which transcends geographical and temporal constraints. Many previous studies on knowledge sharing have focused on the factors that affect knowledge sharing, seeking to boost individual knowledge sharing and to resolve the social dilemma arising from the fact that rational individuals are more likely to consume knowledge than to contribute it. Knowledge collaboration can be defined as the creation of knowledge not only by sharing knowledge but also by transforming and integrating it. From this perspective, the relative distribution of knowledge sharing among participants can count as much as the absolute amount of individual knowledge sharing. In particular, whether a larger contribution from the upper 20 percent of participants enhances the efficiency of overall knowledge collaboration is an issue of interest. This study deals with the effect of this kind of knowledge sharing distribution on the efficiency of knowledge collaboration and is extended to reflect work characteristics. 
All analyses were conducted on actual behavioral data instead of self-reported questionnaire surveys. More specifically, we analyzed the collaborative behaviors of the editors of 2,978 English Wikipedia featured articles, which are the highest quality grade of articles in English Wikipedia. To examine the effect of the Pareto principle, we adopted the Pareto ratio: the ratio of the number of knowledge contributions made by the upper 20 percent of participants to the total number of knowledge contributions made by all participants of an article group. In addition, the Gini coefficient, which is conventionally used to represent income inequality among a group of people, was applied to reveal the effect of inequality in knowledge contribution. Hypotheses were set up on the assumption that a higher ratio of knowledge contribution by more highly motivated participants leads to higher collaboration efficiency, but that if the ratio gets too high, collaboration efficiency deteriorates because overall informational diversity is threatened and the knowledge contribution of less motivated participants is discouraged. Cox regression models were formulated for each of the focal variables (Pareto ratio and Gini coefficient) with seven control variables, such as the number of editors involved in an article, the average length of time between successive edits of an article, and the number of sections a featured article has. The dependent variable of the Cox models is the time from article initiation to promotion to the featured article level, which indicates the efficiency of knowledge collaboration. To examine whether the effects of the focal variables vary with the characteristics of a group task, we classified the 2,978 featured articles into two categories: Academic and Non-academic. Academic articles are those that reference at least one paper published in an SCI, SSCI, A&HCI, or SCIE journal. 
We assumed that academic articles are more complex, entail more information processing and problem solving, and thus require more skill variety and expertise. The analysis results indicate the following. First, the Pareto ratio and the inequality of knowledge sharing relate in a curvilinear fashion to collaboration efficiency in an online community, promoting it up to an optimal point and undermining it thereafter. Second, the curvilinear effect of the Pareto ratio and the inequality of knowledge sharing on collaboration efficiency is more pronounced for more academic tasks in an online community.
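
The two focal variables of this abstract, the Pareto ratio and the Gini coefficient, can be computed from per-editor contribution counts as in the sketch below. It follows the definition of the Pareto ratio given in the abstract and uses the standard sorted-rank formula for the Gini coefficient. The function names, the rounding of the top 20 percent up to a whole editor, and the edit counts are assumptions for illustration, not details taken from the study.

```python
import math


def pareto_ratio(edit_counts, top_share=0.2):
    """Share of all contributions made by the top `top_share` of participants."""
    counts = sorted(edit_counts, reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        raise ValueError("need at least one participant with a contribution")
    # Round the group size up to a whole participant (an assumption); the small
    # epsilon guards against floating-point noise such as 3.0000000000000004.
    k = max(1, math.ceil(top_share * len(counts) - 1e-9))
    return sum(counts[:k]) / total


def gini_coefficient(edit_counts):
    """Gini coefficient: 0 for perfect equality, (n - 1) / n when one person does all."""
    counts = sorted(edit_counts)  # ascending order, ranks start at 1
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one participant with a contribution")
    weighted = sum(rank * x for rank, x in enumerate(counts, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical edit counts for the ten editors of one article.
edits = [50, 30, 5, 5, 3, 2, 2, 1, 1, 1]
print(pareto_ratio(edits), gini_coefficient(edits))
```

In this hypothetical article, the top two of ten editors made 80 of the 100 edits, so the Pareto ratio is 0.8. The Gini coefficient summarizes the same skew across the whole distribution of editors rather than at a single 20 percent cut-off.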