• Title/Summary/Keyword: web search system

Reliable Image-Text Fusion CAPTCHA to Improve User-Friendliness and Efficiency (사용자 편의성과 효율성을 증진하기 위한 신뢰도 높은 이미지-텍스트 융합 CAPTCHA)

  • Moon, Kwang-Ho;Kim, Yoo-Sung
    • The KIPS Transactions: Part C / v.17C no.1 / pp.27-36 / 2010
  • In Web registration pages and online polling applications, a CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is used to distinguish human users from automated programs. Text-based CAPTCHAs, which present distorted text, are widely used on popular Web sites. However, because advanced optical character recognition techniques can recognize the distorted text, their reliability has declined. Image-based CAPTCHAs have been proposed to improve on the reliability of text-based CAPTCHAs, but these systems also have known drawbacks. First, image-based CAPTCHA systems with only a small number of image files in their image dictionary are not very reliable, since an attacker can learn to recognize the images by repeatedly running machine learning programs. Second, users may be inconvenienced because they must retry the CAPTCHA test each time they fail to enter the correct keyword. Third, some image-based CAPTCHAs incur high communication costs because they must send several image files for a single CAPTCHA. To solve these problems, this paper proposes a new CAPTCHA based on both image and text. In this system, an image and keywords are integrated into one CAPTCHA image to give the user a hint for the answer keyword. The hint in the fused image helps users enter the answer keyword easily. The proposed system also reduces communication costs, since it uses only one fused image file per CAPTCHA. To improve the reliability of the image-text fusion CAPTCHA, we also propose a method for dynamically building a large image dictionary by gathering a large number of images from the Internet, with a filtering phase to preserve the correctness of the CAPTCHA images. Through experiments, we show that the proposed image-text fusion CAPTCHA offers users more convenience and higher reliability than an image-based CAPTCHA.

Text-mining Techniques for Metabolic Pathway Reconstruction (대사경로 재구축을 위한 텍스트 마이닝 기법)

  • Kwon, Hyuk-Ryul;Na, Jong-Hwa;Yoo, Jae-Soo;Cho, Wan-Sup
    • Journal of Korea Society of Industrial Information Systems / v.12 no.4 / pp.138-147 / 2007
  • A metabolic pathway is a series of chemical reactions occurring within a cell, and knowledge of such pathways can be used for drug development and for understanding life phenomena. Many biologists are trying to extract metabolic pathway information from a large body of literature for their studies of metabolic-circuit regulation. We propose a text-mining technique based on keywords and patterns. The proposed technique uses a web robot to collect a large number of papers and stores them in a local database. We use the Gene Ontology to increase the compound recognition rate and the NCBI Tokenizer library to recognize useful information without breaking compound names apart. Furthermore, we obtain useful sentence patterns representing metabolic pathways from papers and the KEGG database. We extracted 66 patterns from 20,000 documents for glycosphingolipid species from KEGG, a representative metabolic database. We verified our system on nineteen compounds among the glycosphingolipid species. The results show a recall of 95.1%, a precision of 96.3%, and a processing time of 15 seconds. The proposed text-mining system is expected to be useful for metabolic pathway reconstruction.
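
The pattern-based extraction and its evaluation by precision and recall can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the single sentence pattern, the compound and enzyme names, and the gold set are hypothetical, since the abstract does not list the 66 patterns or the evaluation data.

```python
import re

# Hypothetical sentence pattern; the paper's actual patterns are not given
# in the abstract, so this one is made up for illustration.
PATTERN = re.compile(
    r"(?P<substrate>[\w-]+) is converted to (?P<product>[\w-]+) by (?P<enzyme>[\w-]+)"
)


def extract_reactions(sentences):
    """Return the set of (substrate, product, enzyme) triples matched by PATTERN."""
    found = set()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            found.add(
                (match.group("substrate"), match.group("product"), match.group("enzyme"))
            )
    return found


def precision_recall(predicted, gold):
    """Precision = TP / |predicted|, recall = TP / |gold|."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy corpus and toy gold standard (placeholder names, not real data).
sentences = [
    "CompoundA is converted to CompoundB by EnzymeX.",
    "CompoundB is converted to CompoundC by EnzymeY.",
    "CompoundD is converted to CompoundE by EnzymeZ.",  # not in the gold set
]
gold = {
    ("CompoundA", "CompoundB", "EnzymeX"),
    ("CompoundB", "CompoundC", "EnzymeY"),
    ("CompoundC", "CompoundF", "EnzymeW"),  # never mentioned, so it is missed
}

predicted = extract_reactions(sentences)
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")  # precision=0.667 recall=0.667
```

In this toy run, two of the three extracted triples are in the gold set and two of the three gold triples are found, so both measures come out at 2/3.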

An Efficient USR system design and implementation based on the USN (USN을 이용한 효율적인 USR 시스템 설계 및 구현)

  • Jin, Woo-Jeong;Xiao, Huang;Jeong, Dae-Ryeong;Shin, Geuk-Jae;Jung, Hoe-Kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.451-453 / 2010
  • The world is rapidly evolving toward a future society based on intelligence. In this setting, the USN (Ubiquitous Sensor Network) has been highlighted as a key infrastructure for realizing the future ubiquitous society. To realize ubiquitous computing, the data sensed by each sensor must be collected in real time and transferred to application services so that it can be used to provide services to users. To offer a service that provides large amounts of USN sensor data, a service provider can publish its service in a standardized registry, which users can then search to find and use the service. However, previous studies indicate that using the Web services standards WS-Eventing and UDDI (Universal Description, Discovery, and Integration) as a USR (USN Service Registry) for USN application services is unnecessary or insufficient. In this paper, we designed and implemented a registry system in which service providers that supply sensor data register their services and service users search and view the registered information.

A Study on Availability of AtoM for Recording Korean Wave Culture Contents : A Case of K-Food Contents (한류문화콘텐츠의 기록화를 위한 AtoM 활용 방안에 관한 연구 K-Food 콘텐츠를 중심으로)

  • Shim, Gab-yong;Yoo, Hyeon-Gyeong;Moon, Sang-Hoon;Lee, Youn-Yong;Lee, Jeong-Hyeon;Kim, Yong
    • The Korean Journal of Archival Studies / no.43 / pp.5-42 / 2015
  • Korean Wave 3.0 focuses on 'K-Culture' as its keyword, which includes traditional culture and cultural arts as well as existing cultural content. It treats everything about Korean culture as material for Korean Wave cultural content. Because Korean Wave cultural content reflects contemporary society, it needs to be preserved as archives and records with important evidential value. Against this background, this study analyzes how such materials are currently managed and aims to implement an RMS based on AtoM that manages various kinds of Korean Wave cultural content. At present, organizations dealing with Korean culture, such as K-Pop, K-Food, and K-Movie, manage this content individually. However, the lack of coordination among organizations makes it difficult to accumulate information and to reproduce high-quality content. To solve these problems, this study proposes an RMS based on the open source software Access to Memory (AtoM) for managing and recording Korean Wave cultural content. AtoM provides various functions for managing records and archives, such as accumulation, classification, description, and browsing. Furthermore, as open source software, AtoM is free and easy to implement and use. This study therefore implemented an AtoM-based RMS that manages Korean Wave cultural content methodically in accordance with the functional requirements of an RMS. K-Food content was chosen as the material to collect, classify, and describe, and the ISAD(G) standard was selected for description.

Quality Dimensions Affecting the Effectiveness of a Semantic-Web Search Engine (검색 효과성에 영향을 미치는 시맨틱웹 검색시스템 품질요인에 관한 연구)

  • Han, Dong-Il;Hong, Il-Yoo
    • Asia Pacific Journal of Information Systems / v.19 no.1 / pp.1-31 / 2009
  • This paper empirically examines factors that potentially influence the success of a Web-based semantic search engine. Based on DeLone and McLean's (2003) information systems success model, a research model is proposed that shows the impact of quality-related factors on the effectiveness of a semantic search engine. An empirical study was conducted to test hypotheses formulated around the research model, and statistical methods were applied to analyze the gathered data and draw conclusions. Implications for academics and practitioners are offered based on the findings. The proposed model includes three quality dimensions of a Web-based semantic search engine, namely information quality, system quality, and service quality. Each dimension has measures designed to assess it collectively. The model is intended to examine the relationship between measures of these quality dimensions and measures of two dependent constructs, individuals' net benefit and user satisfaction. Individuals' net benefit was measured by the extent to which the user's information needs were adequately met, whereas user satisfaction was measured by a combination of perceived satisfaction with the search results and perceived satisfaction with the overall system. A total of 23 hypotheses were formulated around the model, and a questionnaire survey was conducted using a functional semantic search website created by KT and Hakia to collect data for validating the model. Copies of the questionnaire were handed out in person to 160 research associates and employees working on the design and development of semantic search engines. Of those who received the form, 148 returned valid responses. The survey form asked respondents to use the given website to answer questions concerning the system. The results of the empirical study indicate that, of the three quality dimensions, information quality has the strongest association with the effectiveness of a Web-based semantic search engine. This finding is consistent with the observation in the literature that aspects of information quality should serve as a basis for evaluating the search outcomes of a semantic search engine. The measures under the information quality dimension found to have a positive effect on informational gratification and user satisfaction were recall and currency. Under the system quality dimension, response time and interactivity were positively related to informational gratification. On the other hand, only one measure under the service quality dimension, reliability, was found to have a positive relationship with user satisfaction. The results were based on the seven hypotheses that were accepted. One may wonder why 15 out of the 23 hypotheses were rejected and question the theoretical soundness of the model. However, the correlations between independent variables and dependent variables were fairly high. This suggests that the structural equation model yielded results inconsistent with those of correlation coefficient analysis because the structural equation model examines the relationships among independent variables as well as the relationships between independent and dependent variables. The findings offer some useful implications for owners of a semantic search engine as far as the design and maintenance of the website are concerned. First, the system should be designed to respond to the user's query as fast as possible. It should also support the search process by recommending, revising, and choosing a search query, so as to maximize users' interactions with the system. Second, the system should present search results with maximum recall and currency to meet users' expectations effectively. Third, it should be capable of providing online services in a reliable and trustworthy manner. Finally, an effective increase in user satisfaction requires improving the quality factors associated with a semantic search engine, which would in turn help increase informational gratification for users. The proposed model can serve as a useful framework for measuring the success of a Web-based semantic search engine. Applying the search engine success framework to the measurement of search engine effectiveness has the potential to outline which areas of a semantic search engine need improvement in order to better meet users' information needs. Further research will be needed to make this idea a reality.

Analysis of Tourism Popularity Using T-map Search and Some Trend Data: Focusing on Chuncheon-city, Gangwon-province (T맵 검색지와 썸트랜드 데이터를 이용한 관광인기도분석: 강원도 춘천을 중심으로)

  • TaeWoo Kim;JaeHee Cho
    • Journal of Service Research and Studies / v.12 no.1 / pp.25-35 / 2022
  • Covid-19, whose first case in Korea was reported in January 2020, has affected many fields, and the tourism sector may have been hit the hardest. The damage is especially great in Gangwon-province, where a tourism-based industrial structure forms the foundation of the region and the tourism industry is the main source of income for small businesses and small enterprises. To assess the situation and extent of this damage, this study conducted an empirical data analysis of the Chuncheon region, which is the most accessible of the Gangwon regions, can be visited on a day trip by public transportation from Seoul and the metropolitan area, and is generally perceived as a low-cost destination. First, the general status of the region was checked using the Chuncheon visitor data provided by the tourist information system. Next, to gauge levels of interest in 2019, before Covid-19, and in 2020, after its outbreak, the study analyzed the general regional image of Chuncheon-city by comparing keywords collected from Sometrend, a web service of Vibe Company Inc., which specializes in keyword collection, with destination search data from T-map of SK Telecom, which provides in-vehicle navigation alongside its communication services. In addition, the study developed a tourism popularity index from the keyword data and the T-map destination search data and compared the two years to examine how much the Covid-19 situation affected the extent to which interest in the Chuncheon area led to actual visits. Big data analysis applying the tourism popularity index, performed after designing a data mart, confirmed that the effect of the Covid-19 situation on tourism popularity in Chuncheon-city, Gangwon-province was not significant, and confirmed the image of its tourist destinations based on regional characteristics. It is hoped that the results of this research can serve as useful reference data for tourism economic policy making.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has led to massive data generation, which greatly influences society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Trends in the issues discussed in SNS Big Data can serve as an important new source for creating value, because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) it provides the topic keyword set that corresponds to the daily ranking; (2) it visualizes the daily time-series graph of a topic over a month; (3) it shows the importance of a topic through a treemap based on a scoring system and frequency; and (4) it visualizes the daily time-series graph of a keyword retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. Such analysis also requires up-to-date big data technology that can rapidly process large amounts of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize big data processing, because Hadoop is designed to scale from a single node to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its primary goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. TITS therefore uses the d3.js library as its visualization tool. This library is designed to create Data-Driven Documents that bind the document object model (DOM) to arbitrary data; it makes interaction with data easy and is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets collected in Korea during March 2013.
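
The preprocessing and daily keyword ranking described in this abstract can be sketched roughly as follows. This is not the system's topic model: it swaps in plain keyword frequency counting per day, assumes English text split on whitespace (Korean tweets would need a morphological analyzer for noun extraction), and uses a made-up stop-word list and made-up tweets.

```python
import string
from collections import Counter, defaultdict

# Illustrative stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "is", "in", "of", "and", "to"}


def tokenize(text):
    """Split on whitespace, strip punctuation, lowercase, and drop stop words."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    return [token for token in tokens if token and token not in STOP_WORDS]


def daily_keyword_ranking(tweets, top_n=3):
    """tweets: iterable of (date_string, text) pairs.

    Returns {date: [(keyword, count), ...]} with the top_n keywords per day.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        counts[day].update(tokenize(text))
    return {day: counter.most_common(top_n) for day, counter in counts.items()}


# Made-up tweets for illustration only.
tweets = [
    ("2013-03-01", "The election is in the news"),
    ("2013-03-01", "Election results and the #election debate"),
    ("2013-03-02", "Baseball season opens"),
    ("2013-03-02", "baseball, baseball tickets!"),
]

ranking = daily_keyword_ranking(tweets)
for day, keywords in sorted(ranking.items()):
    print(day, keywords)
```

The per-day counts produced this way are the kind of series that a daily time-series graph of a keyword would plot.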

A Study on the Information Usage Behavior of Researchers in the Field of Ocean Science and Technology (해양과학기술 분야 연구자의 정보이용행태에 관한 연구)

  • Han, Jong Yup;Seo, Man Deok
    • Journal of the Korean Society for Information Management / v.31 no.1 / pp.163-187 / 2014
  • The purpose of this study is to explain the information usage behavior of researchers in the field of ocean science and technology. The study collected primary data to support the advancement of special library services and the establishment of personalized information services based on personal characteristics such as age, education level, and area of research. The data were collected over two weeks in January 2014 through a web survey sent to 348 researchers at national ocean research institutions in South Korea. A total of 115 researchers replied. The analysis showed that the most preferred type of information medium was the scholarly journal. Researchers used foreign journals more than Korean ones and favored digital formats over print. The top channels for information collection were 'web search' and 'affiliated libraries.' The most frequently cited difficulties in data collection were 'lack of variety of digital resources in affiliated libraries' and 'reluctance to use paid information.' The key elements of a satisfactory user experience were ranked in the order of 'digital library system,' 'library staff,' and 'library collection,' which shows the close relationship between library service and satisfaction with information usage. An assessment of demand in special libraries showed that 'personalized information search service,' 'project support service,' and 'research direction analysis service' should be implemented in the future.

A System for Automatic Classification of Traditional Culture Texts (전통문화 콘텐츠 표준체계를 활용한 자동 텍스트 분류 시스템)

  • Hur, YunA;Lee, DongYub;Kim, Kuekyeng;Yu, Wonhee;Lim, HeuiSeok
    • Journal of the Korea Convergence Society / v.8 no.12 / pp.39-47 / 2017
  • The Internet has increased the number of digital web documents related to the history and traditions of Korean culture. However, users who search for creators or materials related to traditional culture often cannot find the information they want, and the results are insufficient. Document classification is required to give effective access to this information. In the past, documents were classified manually, which costs a great deal of time and money. Therefore, this paper develops an automatic text classification model for traditional cultural content, based on Korean culture information data organized according to a systematic classification of traditional cultural content. This study applied a TF-IDF model, a Bag-of-Words model, and a combined TF-IDF/Bag-of-Words model to extract word frequencies from the 'Korea Traditional Culture' data. We then developed the automatic text classification model for traditional cultural content using the Support Vector Machine classification algorithm.
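
The TF-IDF weighting step can be sketched as below. To stay dependency-free, the sketch swaps the paper's Support Vector Machine for a much simpler stand-in, nearest labeled document by cosine similarity. The TF-IDF variant (term count divided by document length, times log(N/df)), the tokens, and the category labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def fit_tfidf(docs):
    """docs: list of token lists. Returns (vectors, idf).

    Assumed weighting: tf = count / document length, idf = log(N / df).
    Each vector is a {term: weight} dictionary.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = [vectorize(doc, idf) for doc in docs]
    return vectors, idf


def vectorize(tokens, idf):
    """TF-IDF vector for one document; terms unseen in training are ignored."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(tokens, train_vectors, labels, idf):
    """Stand-in classifier (not an SVM): label of the most similar training document."""
    query = vectorize(tokens, idf)
    best = max(range(len(train_vectors)), key=lambda i: cosine(query, train_vectors[i]))
    return labels[best]


# Toy training data; tokens and labels are made up.
train_docs = [
    ["hanbok", "silk", "dress"],
    ["hanbok", "jeogori", "fabric"],
    ["kimchi", "cabbage", "fermented"],
    ["bibimbap", "rice", "kimchi"],
]
train_labels = ["clothing", "clothing", "food", "food"]

train_vectors, idf = fit_tfidf(train_docs)
print(classify(["silk", "hanbok"], train_vectors, train_labels, idf))  # clothing
print(classify(["kimchi", "rice"], train_vectors, train_labels, idf))  # food
```

A query with no known terms has zero similarity to every document, and this stand-in then returns the first training label, so a real classifier would need an explicit fallback.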

Elicitation of Collective Intelligence by Fuzzy Relational Methodology (퍼지관계 이론에 의한 집단지성의 도출)

  • Joo, Young-Do
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.17-35 / 2011
  • Collective intelligence is commons-based production arising from the collaboration and competition of many peer individuals. In other words, it is the aggregation of individual intelligence that leads to the wisdom of crowds. Recently, the use of collective intelligence has become an emerging research area, since it has been adopted as an important principle of Web 2.0, which aims at openness, sharing, and participation. This paper introduces an approach that seeks collective intelligence through cognition of the relations and interactions among individual participants. It describes a methodology well suited to evaluating individual intelligence, with information retrieval and classification as the application field. The research investigates how to derive and represent such cognitive intelligence from individuals by applying fuzzy relational theory to personal construct theory and the knowledge grid technique. Crucial to this research is the formal implementation and interpretive processing of the cognitive knowledge of participants who engage in mutual relations and social interaction. What is needed is a technique for analyzing the structure of cognitive intelligence in the form of a Hasse diagram, which is an instantiation of this perceptive intelligence of human beings. The search for collective intelligence requires a theory of similarity to deal with two underlying problems: clustering individuals into social subgroups by identifying individual intelligence and the commonality among intelligences, and then eliciting collective intelligence by aggregating what is congruent or shared among all participants of the entire group. Unlike standard approaches to similarity based on statistical techniques, the method presented here employs a theory of fuzzy relational products, with the related computational procedures, to cover issues of similarity and dissimilarity.
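
One common form of fuzzy relational product can be sketched as follows. The abstract does not say which product or implication operator the paper uses, so this assumes the mean-based Bandler-Kohout triangle subproduct with the Łukasiewicz implication; the participant-by-construct membership grid is made up.

```python
def lukasiewicz_implication(a, b):
    """Łukasiewicz fuzzy implication: a -> b = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def triangle_subproduct(R, S):
    """Mean-based triangle subproduct of fuzzy relations R (X by Y) and S (Y by Z).

    Entry (x, z) is the average over y of R[x][y] -> S[y][z], read as the
    degree to which everything related to x by R is related to z by S.
    """
    n_y = len(S)
    n_z = len(S[0])
    return [
        [
            sum(lukasiewicz_implication(row[j], S[j][k]) for j in range(n_y)) / n_y
            for k in range(n_z)
        ]
        for row in R
    ]


def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(column) for column in zip(*M)]


# Hypothetical grid: rows are participants, columns are constructs, and each
# value is the degree to which a participant applies a construct.
grid = [
    [1.0, 0.5],  # participant 0
    [0.5, 0.0],  # participant 1
]

# Entry (i, k): degree to which participant i's memberships are contained
# in participant k's memberships.
containment = triangle_subproduct(grid, transpose(grid))
print(containment)  # [[1.0, 0.5], [1.0, 1.0]]
```

Here participant 1's memberships are fully contained in participant 0's (degree 1.0) but not the reverse (degree 0.5). An asymmetric containment relation of this kind is what a Hasse diagram could order.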