• Title/Summary/Keyword: Information Retrieval Technique

The Blog Polarity Classification Technique using Opinion Mining (오피니언 마이닝을 활용한 블로그의 극성 분류 기법)

  • Lee, Jong-Hyuk;Lee, Won-Sang;Park, Jea-Won;Choi, Jae-Hyun
    • Journal of Digital Contents Society / v.15 no.4 / pp.559-568 / 2014
  • Previous polarity classification based on sentiment analysis relies on sentence rules derived from the rating points of product reviews. This approach is difficult to apply to blogs, which carry no ratings, and product reviews can be fabricated by paid commenters and site managers, so reliable reviews of products and stores are hard to obtain. To address these problems, we analyze blogs, which contain personal and frank opinions, and classify their polarity, which makes it possible to understand opinions about a product or store correctly. This paper proposes extracting high-frequency vocabulary from blogs in several domains and selecting topic words. We then apply a sentiment analysis technique to classify the polarity of blog content. To evaluate the performance of the sentiment analysis, we use the Precision, Recall, and F-Score measures from the information retrieval field. The evaluation shows that the proposed sentiment analysis classifies polarity better than previous techniques that use sentence rules based on product reviews.
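
As an illustration of the evaluation measures named in the abstract above, the following minimal Python sketch computes Precision, Recall, and F-Score for one polarity class. The function name, the label strings, and the toy data are assumptions made for illustration; none of them come from the paper.

```python
def precision_recall_f1(gold, predicted, positive="positive"):
    """Compute precision, recall, and F1 for a single target class.

    gold and predicted are equal-length lists of labels (hypothetical data).
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against empty denominators so the sketch never divides by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
gold = ["positive", "positive", "negative", "positive", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive"]
p, r, f = precision_recall_f1(gold, predicted)
```

With these toy labels all three measures come out to 2/3, since the classifier makes one false-positive and one false-negative error against two correct positive predictions.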

Development of MPEG-7 Description-based Annotation Tool for Production of Semantic Multimedia Metadata (의미적 멀티미디어 메타데이터 생성을 위한 MPEG-7 기술기반 주석도구의 개발)

  • An, Hyoung-Geun;Koh, Jaw-Jin
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.35-44 / 2007
  • The recent growth in the quantity of multimedia data has created a new problem: the desired data must be retrieved quickly and accurately. An adequate representation of multimedia data is the key element for efficient retrieval. For this reason, the MPEG-7 standard was established for the description of multimedia data. In this paper, we propose a new approach to metadata production. The user can decompose a given content item into units and easily annotate each unit by adding basic information such as time and place, as well as classification information such as event and relationship, according to the MPEG-7 standard. The objective is to automatically build a purely semantic description in the form of a graph whose nodes are events and whose links describe the relationships among the events. Finally, we have implemented an annotation tool (SMAT) for semantic description based on the proposed technique and assess some of the experimental results. In conclusion, we can say that the proposed annotation tool is characterized by two important properties: reusability and extensibility.

Estimating Coverage of the Web Search Services Using Near-Uniform Sampling of Web Documents (균등한 웹 문서 샘플링을 이용한 웹 검색 서비스들의 커버리지 측정)

  • Jang, Sung-Soo;Kim, Kwang-Hyun;Lee, Joon-Ho
    • The KIPS Transactions:PartD / v.15D no.3 / pp.305-312 / 2008
  • Web documents with useful information are widely available on the internet and are accessible through web search services. For this reason, web search services study better ways to collect more web documents, but they have difficulty determining their coverage of these web pages. This paper evaluates current coverage assessment methods and suggests a more effective coverage assessment technique: sampling web documents near-uniformly and monitoring how they are classified by web search services, in order to assess both the absolute and the relative coverage of web search engines. The paper also compares Korean web search services using the suggested method. Both absolute and relative coverage were highest for Google, followed by Naver and Empas. The result is expected to help in estimating the coverage of web search services.
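
The sampling idea in the abstract above can be sketched as follows. This is only a rough illustration under stated assumptions: absolute coverage is taken to be the fraction of sampled documents that a service returns, and relative coverage is taken to be the same count divided by the number of sampled documents found by at least one service. The paper may define both measures differently, and the engine names and URLs are hypothetical.

```python
def estimate_coverage(sample, found_by_engine):
    """Estimate coverage per engine from a document sample.

    sample: list of sampled document identifiers (hypothetical).
    found_by_engine: dict mapping an engine name to the set of sampled
        documents that the engine was observed to return.
    Returns a dict: engine -> (absolute, relative) under the assumed
    definitions described in the lead-in.
    """
    sample_set = set(sample)
    # Documents from the sample that at least one engine knows about.
    union = set().union(*found_by_engine.values()) & sample_set

    result = {}
    for engine, found in found_by_engine.items():
        hits = len(found & sample_set)
        absolute = hits / len(sample_set)
        relative = hits / len(union) if union else 0.0
        result[engine] = (absolute, relative)
    return result


# Toy sample of ten documents and two hypothetical engines.
sample = [f"u{i}" for i in range(1, 11)]
found = {
    "engine_a": {f"u{i}" for i in range(1, 7)},   # u1..u6
    "engine_b": {f"u{i}" for i in range(4, 9)},   # u4..u8
}
coverage = estimate_coverage(sample, found)
```

In this toy case eight of the ten sampled documents are known to at least one engine, so `engine_a` gets an absolute estimate of 0.6 and a relative estimate of 0.75.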

Optimal Associative Neighborhood Mining using Representative Attribute (대표 속성을 이용한 최적 연관 이웃 마이닝)

  • Jung Kyung-Yong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.50-57 / 2006
  • In electronic commerce, most recent personalized recommender systems apply the collaborative filtering technique. This method calculates a similarity weight among users who have similar preferences in order to predict and recommend items that match each user's tastes. The Pearson Correlation Coefficient is commonly used for this purpose. However, the correlation can be calculated only when there are items that both users have rated in common, so the accuracy of prediction falls. The similarity weight affects not only the prediction of items that match a user's tastes but also the overall performance of the personalized recommender system. In this study, we observe how the similarity weight behaves when Vector similarity, Entropy, Inverse user frequency, and Default voting from the information retrieval field are applied, and we verify the improvement in prediction accuracy through an experiment. The result shows that the method combining the Entropy-based similarity weight with Default voting achieved the best performance.
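
The limitation described in the abstract above, that a Pearson-based similarity weight needs items rated by both users, can be seen in a minimal Python sketch. The function name and the rating dictionaries are hypothetical, and computing the means over co-rated items only is an assumption; some collaborative filtering formulations use each user's overall mean instead.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users over their co-rated items.

    ratings_a, ratings_b: dicts mapping item id -> rating (hypothetical).
    Returns None when the correlation is undefined, for example when the
    users share fewer than two items or one user's ratings do not vary.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None

    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)

    numerator = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b)
                    for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    denominator = sqrt(var_a) * sqrt(var_b)
    if denominator == 0:
        return None
    return numerator / denominator


user_a = {"i1": 5, "i2": 3, "i3": 4}
user_b = {"i1": 4, "i2": 2, "i3": 3, "i4": 5}
user_c = {"i9": 1}

sim_ab = pearson_similarity(user_a, user_b)   # perfectly correlated ratings
sim_ac = pearson_similarity(user_a, user_c)   # no co-rated items
```

Here `user_a` and `user_b` rate their three shared items in lockstep, so the weight is 1.0, while `user_a` and `user_c` share no items and no weight can be computed. Techniques such as default voting aim to fill exactly this gap.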

Study of the Haar Wavelet Feature Detector for Image Retrieval (이미지 검색을 위한 Haar 웨이블릿 특징 검출자에 대한 연구)

  • Peng, Shao-Hu;Kim, Hyun-Soo;Muzzammil, Khairul;Kim, Deok-Hwan
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.160-170 / 2010
  • This paper proposes a Haar Wavelet Feature Detector (HWFD) based on the Haar wavelet transform and an average box filter. By decomposing the original image with the Haar wavelet transform, the proposed detector obtains the variance information of the image, making it possible to extract more distinctive features from the original image. To detect interest points, which represent the regions whose variance is the highest among their neighboring regions, we apply the average box filter to evaluate the local variance information and use the integral image technique for fast computation. Owing to the use of the Haar wavelet transform and the average box filter, the proposed detector is robust to illumination change, scale change, and rotation of the image. Experimental results show that, even though the proposed method detects fewer interest points, it achieves higher repeatability, higher efficiency, and higher matching accuracy than the DoG detector and the Harris corner detector.
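
The integral image technique mentioned in the abstract above is what makes an average box filter cheap: once the cumulative table is built, the sum over any rectangle takes four lookups. The sketch below shows only that general technique in plain Python; it is not the HWFD detector itself, and the function names and the 3x3 test image are assumptions for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1.
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_mean(ii, top, left, bottom, right):
    """Mean of the pixels in rows top..bottom-1 and columns left..right-1."""
    total = (ii[bottom][right] - ii[top][right]
             - ii[bottom][left] + ii[top][left])
    return total / ((bottom - top) * (right - left))


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
whole = box_mean(ii, 0, 0, 3, 3)         # mean of all nine pixels
lower_right = box_mean(ii, 1, 1, 3, 3)   # mean of 5, 6, 8, 9
```

The cost of `box_mean` does not depend on the box size, which is why the same table can serve filters at several scales.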

Embeded-type Search Function with Feedback for Smartphone Applications (스마트폰 애플리케이션을 위한 임베디드형 피드백 지원 검색체)

  • Kang, Moonjoong;Hwang, Mintae
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.5 / pp.974-983 / 2017
  • In this paper, we discuss a search function that can be embedded and used in Android-based applications. We use BM25 to suppress insignificant and overly frequent words such as postpositions, the Pivoted Length Normalization technique to resolve the search priority problem related to each item's length, and Rocchio's method to pull items inferred to be related to the query closer to the query vector in the Vector Space Model, which supports an implicit feedback function. Indexing is divided into two methods: a simple index to support offline operation and a complex index for online operation. The implementation uses a query inference function that guesses the user's future input by collating the present input with the indexed data, which allows it to handle and correct user errors. The implementation can therefore be easily adopted by smartphone applications to improve their search functions.
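
To make the role of BM25 in the abstract above concrete, here is a minimal Python sketch of one widely used BM25 variant. The smoothed IDF form, the parameter values `k1=1.2` and `b=0.75`, the whitespace tokenization, and the toy documents are all assumptions for illustration; the paper's actual implementation and settings are not given here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25.

    query: list of query terms; docs: list of token lists (hypothetical).
    k1 and b are conventional defaults, not values from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            # Terms that occur in most documents get a small IDF weight.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            tf = term_freq[term]
            # Length normalization dampens the advantage of long documents.
            norm = tf + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the search function runs on the phone".split(),
    "the index supports offline search and online search".split(),
    "the weather is nice today".split(),
]
scores = bm25_scores(["search", "index"], docs)
```

The second document matches both query terms and ranks first, the first document matches one term, and the third document matches none and scores zero. A very frequent word such as "the" would add little even if it were part of the query, because its IDF weight is small.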

Prefetching Methods with Vehicle's Pattern in Location-Based Services (위치기반서비스에서 차량의 패턴을 고려한 프리페칭 기법)

  • Choi, In-Seon;Kim, Joo-Hwan;Lee, Dong-Chun
    • Convergence Security Journal / v.8 no.4 / pp.105-113 / 2008
  • In a mobile computing environment, it is known to be quite difficult to provide users with stable Quality of Service (QoS) because of vehicle mobility. To cope with the inherent characteristics of wireless networks, such as low bandwidth and high transmission delay, together with vehicle mobility, many studies have applied caching and prefetching methods. This paper presents a novel prefetching technique based on the owner's profile. The scheme uses the vehicle's velocity, the visit frequency and residence time in a specific region over a given past period of the vehicle owner, and a user profile that records the owner's personal inclination toward certain places. The proposed scheme shows relatively superior performance over previous methods in terms of the utilization ratio of prefetched information and the failure ratio of information retrieval.

Data hub system based on SQL/XMDR message using Wrapper for distributed data interoperability (분산 데이터 상호운용을 위한 SQL/XMDR 메시지 기반의 Wrapper를 이용한 데이터 허브 시스템)

  • Moon, Seok-Jae;Jung, Gye-Dong;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.11 / pp.2047-2058 / 2007
  • In geographically and spatially distributed business environments, enterprises find it difficult to eliminate redundancy, to filter the data sources that arise when data is integrated under standard rules and metadata, and to provide integrated data through a single view. In particular, it is of paramount concern to interchange data among heterogeneous systems and applications regardless of type and format, and to keep the integrated information continuously and exactly synchronized. This paper therefore proposes a data hub system based on SQL/XMDR messages to overcome the semantic interoperability problems that occur when legacy systems exchange or join data. The system uses the message mapping technique of a query transformation system to maintain, in real time, the data modified during cooperation. It consistently maintains data modified in real time while cooperating legacy systems exchange or join data, and it improves the clarity and availability of data by providing a single interface for data retrieval.

A Study of Performance Analysis on Effective Multiple Buffering and Packetizing Method of Multimedia Data for User-Demand Oriented RTSP Based Transmissions Between the PoC Box and a Terminal (PoC Box 단말의 RTSP 운용을 위한 사용자 요구 중심의 효율적인 다중 수신 버퍼링 기법 및 패킷화 방법에 대한 성능 분석에 관한 연구)

  • Bang, Ji-Woong;Kim, Dae-Won
    • Journal of Korea Multimedia Society / v.14 no.1 / pp.54-75 / 2011
  • PoC (Push-to-talk over Cellular) is a technology that integrates group voice calls, video calls, and internet-based multimedia services. If a PoC user cannot participate in a PoC session for reasons such as an emergency or low battery, the user can use the PoC Box, which offers functionality similar to the MM Box in MMS (Multimedia Messaging Service). RTSP (Real-Time Streaming Protocol) is recommended for transmission sessions between the PoC Box and a terminal. Because existing VOD services run over wired networks, the packet size of RTSP-based VOD services is very large, whereas the PoC service runs in a wireless communication environment, whose general characteristics affect how RTSP can be used. Packet loss is lower in a wired environment than in a wireless one; consequently, the PoC service suffers buffering latency caused by play-out delay, that is, asynchronous playback of audio and video content. These problems make it difficult for users to find the information they want while media content is being played out. This paper proposes the following techniques and methods and verifies their performance through testing: the cross-over dual reception buffering technique, the advance partition multi-reception buffering technique, and the on-demand multi-reception buffering technique, which are designed to let a user quickly pick up information in media content transmitted over RTSP while searching the media and to reduce playback delay; and the same-priority packetization transmission method and the priority-based packetization transmission method, which packetize media data for transmission. Functional evaluation by simulation shows that, with respect to media retrieval inclination, the proposed multiple receiving buffering and packetizing methods outperform the existing single receiving buffering method by 6-9 points in effectiveness and excellence. In particular, the on-demand multiple receiving buffering technique combined with the same-priority packetization transmission method responds promptly to users' media search requests, outperforming the other combinations by 3-24 points. In addition, users can find the information they want much more quickly, because a large amount of information is received within a short time during a focused media retrieval period.

Font Classification of English Printed Character using Non-negative Matrix Factorization (NMF를 이용한 영문자 활자체 폰트 분류)

  • Lee, Chang-Woo;Kang, Hyun;Jung, Kee-Chul;Kim, Hang-Joon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.65-76 / 2004
  • Today, most documents are produced electronically and their paleography is digitized by imaging, resulting in a tremendous number of electronic documents in the form of images. To process these document images, many methods of document structure analysis and recognition have been proposed, including font classification. Accordingly, this paper proposes a font classification method for document images that uses non-negative matrix factorization (NMF), which is able to learn part-based representations of objects. In the proposed method, spatially total features of font images are automatically extracted using NMF, and then the appropriateness of the features for specifying each font is investigated. The proposed method is expected to improve the performance of optical character recognition (OCR), document indexing, and retrieval systems when such systems adopt a font classifier as a preprocessor.
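
For readers unfamiliar with NMF, the following Python sketch factorizes a small non-negative matrix V into non-negative factors W and H using the classic multiplicative update rules for squared error. It illustrates the general technique only. The rank, iteration count, random initialization, and toy matrix are assumptions, and the paper's feature extraction from font images is not reproduced here.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative data; in a font setting each column might be one image.
v = [
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
    [2.0, 1.0, 5.0, 4.0],
]
w0, h0 = nmf(v, rank=2, iterations=0)    # random starting point
w, h = nmf(v, rank=2, iterations=200)    # after the updates
error_before = squared_error(v, w0, h0)
error_after = squared_error(v, w, h)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what encourages the additive, part-based representations mentioned in the abstract.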