• Title/Summary/Keyword: Automatic Information Extraction

Web Contents Mining System for Real-Time Monitoring of Opinion Information based on Web 2.0 (웹2.0에서 의견정보의 실시간 모니터링을 위한 웹 콘텐츠 마이닝 시스템)

  • Kim, Young-Choon;Joo, Hae-Jong;Choi, Hae-Gill;Cho, Moon-Taek;Kim, Young-Baek;Rhee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.1 / pp.68-79 / 2011
  • This paper presents an opinion information extraction and analysis system based on Web mining, which uses statistics collected from Web contents. Users' opinion information, which is scattered across several websites, can be automatically extracted and analyzed. The system provides an opinion information search service that lets users search for positive and negative opinions in real time and check their statistics. Users can also search for and monitor other opinion information in real time by entering keywords into the system. Comparison experiments with other techniques show that the proposed technique achieves excellent performance. Three evaluations are carried out: the extraction of positive/negative opinion information, the application of a dynamic window technique and a tokenizer technique for multilingual information retrieval, and the extraction of exact multilingual phonetic translations. As an application example, experiments are run on typical movie review sentences and Wikipedia data, and the results are analyzed.

Academic Conference Categorization According to Subjects Using Topical Information Extraction from Conference Websites (학회 웹사이트의 토픽 정보추출을 이용한 주제에 따른 학회 자동분류 기법)

  • Lee, Sue Kyoung;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.22 no.2 / pp.61-77 / 2017
  • The amount of academic conference information on the Internet has increased rapidly in recent years, and automatically classifying it by research subject helps researchers find relevant conferences efficiently. Information provided by most conference listing services is limited to title, date, location, and website URL. Among these features, only the title contains topical words, which causes an information insufficiency problem. We therefore propose methods that resolve this problem by utilizing web contents. Specifically, the proposed methods extract the main contents from an HTML document collected through the website URL. Based on the similarity between the title of a conference and its main contents, topical keywords are selected to reinforce the important keywords among the main contents. Experiments on a real-world dataset showed that the additional information extracted from conference websites improves conference classification performance. We plan to further improve classification accuracy by considering the structure of websites.
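
The title-to-content keyword selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: it assumes a bag-of-words cosine similarity between the title and each content sentence, and the names `tokens`, `cosine`, `topical_keywords` and the stopword list are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the illustration.
STOPWORDS = frozenset({"the", "of", "and", "on", "in", "for", "a", "to", "is"})

def tokens(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def topical_keywords(title, sentences, top_k=3):
    """Weight each content word by how similar its sentence is to the title."""
    title_vec = Counter(tokens(title))
    scores = Counter()
    for sentence in sentences:
        vec = Counter(tokens(sentence))
        sim = cosine(title_vec, vec)
        for word, count in vec.items():
            scores[word] += sim * count
    # Keep only words that came from sentences sharing vocabulary with the title.
    return [w for w, s in scores.most_common(top_k) if s > 0]

keywords = topical_keywords(
    "International Conference on Data Mining",
    ["Topics include data mining and text mining.",
     "The venue is a hotel near the airport."],
)
```

Words from sentences that share no vocabulary with the title receive a score of zero and are dropped, which is one simple way to keep navigational or logistical text out of the keyword set.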

Automatic Recognition and Normalization System of Korean Time Expression using the individual time units (시간의 단위별 처리를 이용한 자동화된 한국어 시간 표현 인식 및 정규화 시스템)

  • Seon, Choong-Nyoung;Kang, Sang-Woo;Seo, Jung-Yun
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.447-458 / 2010
  • Time expressions are a very important form of information in different types of data. Thus, the recognition of time expressions is an important factor in the field of information extraction. However, most previously designed systems consider only a specific domain, because time expressions do not have a regular form and frequently include different ellipsis phenomena. We present a two-level recognition method consisting of extraction and transformation phases to achieve generality and portability. In the extraction phase, time expressions are extracted by atomic time units for extensibility. Then, in the transformation phase, omitted information is restored using the basis time and prior knowledge. Finally, every complete atomic time unit is transformed into a normalized form. The proposed system can be used as a general-purpose system, because it has a language- and domain-independent architecture. In addition, the system performs robustly on noisy data such as SMS data, which include various errors. For SMS data, the proposed system achieves accuracies of 93.8% for time-expression extraction and 93.2% for time-expression normalization. On the basis of these experimental results, we conclude that the proposed system shows high performance on noisy data.
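
The two phases described in this abstract (extraction by atomic time units, then restoration of omitted units from a basis time) might look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' system: it only handles digit-plus-unit expressions, the rule for filling omitted units (larger units from the basis time, smaller units reset) is a guess, and the names `extract_units` and `normalize` are hypothetical.

```python
import re
from datetime import datetime

# Atomic Korean time units mapped to datetime field names.
UNITS = {"년": "year", "월": "month", "일": "day", "시": "hour", "분": "minute"}
ORDER = ["year", "month", "day", "hour", "minute"]
UNIT_RE = re.compile(r"(\d+)\s*(년|월|일|시|분)")

def extract_units(text):
    """Extraction phase: collect each atomic time unit found in the text."""
    return {UNITS[unit]: int(number) for number, unit in UNIT_RE.findall(text)}

def normalize(text, base):
    """Transformation phase: restore omitted units, then emit a normalized form."""
    found = extract_units(text)
    if not found:
        return None
    largest = min(ORDER.index(name) for name in found)
    fields = {}
    for i, name in enumerate(ORDER):
        if name in found:
            fields[name] = found[name]
        elif i < largest:
            # Assumption: omitted larger units are inherited from the basis time.
            fields[name] = getattr(base, name)
        else:
            # Assumption: omitted smaller units are reset to their minimum.
            fields[name] = 1 if name in ("month", "day") else 0
    return datetime(**fields).strftime("%Y-%m-%dT%H:%M")

base_time = datetime(2010, 6, 1, 9, 30)
full = normalize("3월 5일 14시에 만나요", base_time)   # year is omitted
partial = normalize("14시 30분", base_time)           # year, month, day are omitted
```

Relative expressions such as "tomorrow" and the prior knowledge mentioned in the abstract are outside the scope of this sketch.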

A Study On Face Feature Points Using Active Discrete Wavelet Transform (Active Discrete Wavelet Transform를 이용한 얼굴 특징 점 추출)

  • Chun, Soon-Yong;Zijing, Qian;Ji, Un-Ho
    • Journal of the Institute of Electronics Engineers of Korea SC / v.47 no.1 / pp.7-16 / 2010
  • Face recognition is an active subject in the area of computer pattern recognition, with a wide range of potential applications. Automatic extraction of feature points from a face image is an important step in automatic face recognition, and whether the facial features are extracted correctly directly influences recognition. In this paper, a new method of facial feature extraction based on the Discrete Wavelet Transform is proposed. First, the face image is captured with a PC camera. Second, the face image is decomposed using the discrete wavelet transform. Finally, horizontal and vertical projection methods are used to extract the features of the human face. Face recognition can then be performed from the extracted features. The results show that this method extracts feature points of the human face quickly and accurately. The system not only detects face feature points with great accuracy, but is also more robust than the traditional method for locating facial features.
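
The decompose-then-project steps in this abstract can be illustrated with a small sketch. It assumes a one-level Haar wavelet and treats the darkest row and column of the approximation sub-band as a candidate feature location; the abstract names neither the wavelet nor the decision rule, so both are assumptions, and `haar_approx`, `projections` and `darkest` are hypothetical names.

```python
def haar_approx(img):
    """One-level 2D Haar approximation (LL) sub-band of an even-sized grayscale image.

    With the orthonormal Haar filter, each output value is the sum of a 2x2 block divided by 2.
    """
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 2.0
             for c in range(0, len(img[0]), 2)]
            for r in range(0, len(img), 2)]

def projections(band):
    """Horizontal projection (one sum per row) and vertical projection (one sum per column)."""
    horizontal = [sum(row) for row in band]
    vertical = [sum(col) for col in zip(*band)]
    return horizontal, vertical

def darkest(profile):
    """Index of the minimum of a projection profile."""
    return min(range(len(profile)), key=profile.__getitem__)

# Toy 4x4 "image": bright background with a dark 2x2 patch at the lower left.
image = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [0,   0,   200, 200],
    [0,   0,   200, 200],
]
low = haar_approx(image)
rows, cols = projections(low)
location = (darkest(rows), darkest(cols))  # (row, column) in sub-band coordinates
```

Working on the approximation sub-band halves the resolution in each direction, so the projection profiles are shorter and less sensitive to pixel-level noise.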

Automatic Recognition of Analog and Digital Modulation Signals (아날로그 및 디지털 변조 신호의 자동 인식)

  • Seo Seunghan;Yoon Yeojong;Jin Younghwan;Seo Yongju;Lim Sunmin;Ahn Jaemin;Eun Chang-Soo;Jang Won;Nah Sunphil
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.1C / pp.73-81 / 2005
  • We propose an automatic modulation recognition scheme that extracts pre-defined key features from the received signal and then applies an equal gain combining method to determine the modulation used. We also compare the performance of the proposed algorithm with that of a decision-theoretic algorithm. Our scheme extracts five pre-defined key features from each data segment, the data unit for key feature extraction, and averages them over all segments to recognize the modulation according to the decision procedure. We check the performance of the proposed algorithm through computer simulations for analog modulations such as AM, FM, and SSB and for digital modulations such as FSK2, FSK4, PSK2, and PSK4, by measuring the recognition success rate while varying the SNR and the data collection time. The results show that the performance of the proposed scheme is comparable to that of the decision-theoretic algorithm with less complexity.

Video Automatic Editing Method and System based on Machine Learning (머신러닝 기반의 영상 자동 편집 방법 및 시스템)

  • Lee, Seung-Hwan;Park, Dea-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.235-237 / 2022
  • Video content is divided into long-form and short-form content according to its length. Long-form video content is 15 minutes or longer and includes all frames of the captured video without editing. Short-form video content is edited down to a length of 1 to 15 minutes and includes only some of the frames of the captured video. With the recent growth of the single-person broadcasting market, demand for short-form video content that attracts more viewers is increasing. There is therefore a need for research on content editing technology for editing and generating short-form video content. This study investigates technology that creates short-form videos of main scenes by capturing images, voices, and motions. Short-form videos of key scenes are produced with a highlight extraction model pre-trained through machine learning. An automatic video editing system and method that automatically generates a highlight video is a core technology for short-form video content. Research on machine learning-based automatic video editing methods and systems will contribute to competitive content activities by reducing the effort, cost, and time that single creators invest in video editing.

Development of Remote Radar/AIS Network System for Observing and Analyzing Vessel Traffic in Tokyo Bay

  • Hagiwara, Hideki;Shoji, Ruri;Tamaru, Hitoi;Liu, Shun;Okano, Tadashi
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / v.1 / pp.151-156 / 2006
  • Accurate vessel traffic observation is indispensable for vessel traffic management, the design of vessel traffic routes, the planning of port construction, etc. To observe vessel traffic accurately without great effort, such as using a ship or car equipped with a special radar observation system and preparing observation staff, the authors have been developing a completely automated remote radar/AIS network system covering the main traffic area in Tokyo Bay. The composite radar image observed at the Yokosuka and Kawasaki radar stations, together with AIS information, can be viewed on a website. In addition to the radar/AIS observation system, software for analyzing the observed vessel traffic flow has been developed. This software has various functions such as automatic tracking of ships' positions, automatic estimation of ships' sizes, automatic integration of radar images and AIS data, animation of ships' movements, and extraction of dangerous ship encounters. This paper first presents the configuration and functions of the developed remote radar/AIS network system. It then introduces the functions of the vessel traffic analysis software and describes some analysis results on vessel traffic in Tokyo Bay, demonstrating the effectiveness of the developed system.
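
The abstract does not say how dangerous encounters are extracted. One common way to flag them from tracked positions and velocities is the closest point of approach (CPA), sketched below as a stand-in technique. The function names, the flat x/y coordinates, and the limits (nautical miles and minutes) are all illustrative assumptions.

```python
import math

def cpa(pos_a, vel_a, pos_b, vel_b):
    """Return (time to CPA, distance at CPA) for two ships on straight courses."""
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]   # relative position
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]   # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0:
        # Same velocity: the separation never changes.
        return 0.0, math.hypot(rx, ry)
    # Clamp to zero so that ships already moving apart report their current distance.
    tcpa = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return tcpa, dcpa

def is_dangerous(pos_a, vel_a, pos_b, vel_b, dcpa_limit=0.5, tcpa_limit=10.0):
    """Flag an encounter when the ships will pass close together soon (hypothetical limits)."""
    tcpa, dcpa = cpa(pos_a, vel_a, pos_b, vel_b)
    return tcpa <= tcpa_limit and dcpa <= dcpa_limit

# Two ships on reciprocal courses, 10 units apart along x and 5 units apart along y.
encounter = cpa((0, 0), (1, 0), (10, 5), (-1, 0))
```

In practice the inputs would come from the radar tracks or AIS reports, converted to a local planar coordinate system before the calculation.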

Efficient Object Classification Scheme for Scanned Educational Book Image (교육용 도서 영상을 위한 효과적인 객체 자동 분류 기술)

  • Choi, Young-Ju;Kim, Ji-Hae;Lee, Young-Woon;Lee, Jong-Hyeok;Hong, Gwang-Soo;Kim, Byung-Gyu
    • Journal of Digital Contents Society / v.18 no.7 / pp.1323-1331 / 2017
  • Although copyright has grown into a large-scale business, many persistent problems remain, especially in image copyright. In this study, we propose an automatic object extraction and classification system for scanned educational book images that combines document image processing with intelligent information technology such as deep learning. First, the proposed technology removes noise components and then performs region separation based on a visual attention assessment. Next, we carry out a grouping operation on the extracted block areas and categorize each block as a picture or a character area. Finally, the caption area is extracted by searching around the classified picture area. In the performance evaluation, the system achieved an average accuracy of 83% in extracting the image and caption areas. For image region detection alone, an accuracy of up to 97% was verified.

Feature Extraction Using Trace Transform for Insect Footprint Recognition (곤충 발자국 패턴 인식을 위한 Trace Transform 기반의 특징값 추출)

  • Shin, Bok-Suk;Cho, Kyoung-Won;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.6 / pp.1095-1100 / 2008
  • In the process of insect footprint recognition, footprint segments, the basic areas for recognition, need to be extracted from scanned insect footprints, and appropriate features should be found from the segments in order to discriminate kinds of insects, because the characteristics of the features are important for classifying insects. In this paper, we propose methods for automatic footprint segmentation and feature extraction. We use a Trace transform method to find appropriate features from the extracted segments. The Trace transform method builds a new type of data structure from the segmented images by applying functions along parallel trace lines, and this data structure has characteristics invariant to translation, rotation, and reflection of images. The data structure is converted to Triple features by Diametric and Circus functions, and the Triple features are used for discriminating patterns of insect footprints. We show that the Triple features found by the proposed methods are sufficiently distinguishable and appropriate for classifying kinds of insects.
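
The chain of functionals behind a Triple feature (a trace functional along each line, a diametric functional across line offsets, a circus functional across angles) can be sketched as below. The particular functionals chosen here (sum, max, mean), the nearest-pixel sampling, and the restriction to angles that are multiples of 90 degrees on a square image with odd side length are simplifying assumptions, not the functions used in the paper.

```python
import math

def trace_matrix(img, n_angles=4):
    """Trace functional T (here: the sum of pixels) for every line (angle, offset)."""
    size = len(img)                  # assumes a square image with odd side length
    centre = (size - 1) / 2
    reach = size                     # range of offsets p and line parameters t
    rows = []
    for a in range(n_angles):
        theta = 2 * math.pi * a / n_angles
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        row = []
        for p in range(-reach, reach + 1):
            total = 0
            for t in range(-reach, reach + 1):
                # Point at offset p from the centre, moved t steps along the line.
                x = round(centre + p * cos_t - t * sin_t)
                y = round(centre + p * sin_t + t * cos_t)
                if 0 <= x < size and 0 <= y < size:
                    total += img[y][x]
            row.append(total)
        rows.append(row)
    return rows

def triple_feature(img, n_angles=4):
    """Diametric functional (max over offsets), then circus functional (mean over angles)."""
    diametric = [max(row) for row in trace_matrix(img, n_angles)]
    return sum(diametric) / len(diametric)

pattern = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 9, 5, 0, 0],
    [0, 0, 0, 0, 0],
]
rotated = [list(r) for r in zip(*pattern[::-1])]   # 90-degree rotation
mirrored = [row[::-1] for row in pattern]          # left-right reflection
```

With these choices the feature is unchanged by 90-degree rotations and reflections of the toy pattern. Finer angular sampling would need interpolation along each line.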

Extraction of Basic Insect Footprint Segments Using ART2 of Automatic Threshold Setting (자동 임계값 설정 ART2를 이용한 곤충 발자국의 인식 대상 영역 추출)

  • Shin, Bok-Suk;Cha, Eui-Young;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.8 / pp.1604-1611 / 2007
  • In the process of insect footprint recognition, basic footprint segments should be extracted from a whole insect footprint image in order to find appropriate features for classification. In this paper, we use a clustering method as a preprocessing stage for extracting basic insect footprint segments. In general, the sizes and strides of footprints differ according to the type and size of the insect to be recognized. We therefore propose an improved ART2 algorithm that extracts basic insect footprint segments regardless of the size and stride of the footprint pattern. In the proposed ART2 algorithm, the threshold value for clustering is determined automatically from the contour shape of the graph created by accumulating the distances between all the spots of a footprint pattern. In experiments applying the proposed method to two kinds of insect footprint patterns, all the clustering results were correct.
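
The idea of clustering footprint spots with an automatically chosen threshold can be sketched as follows. This is a simplified stand-in, not the improved ART2 algorithm of the paper: the clustering is a plain threshold-based (vigilance-style) centroid update, and the threshold is taken at the largest gap in the sorted pairwise distances, which is an assumed substitute for the contour-shape rule the abstract mentions. The names `auto_threshold` and `threshold_cluster` are hypothetical.

```python
import math

def auto_threshold(points):
    """Pick a threshold at the largest jump in the sorted pairwise distances.

    Needs at least three points so that there are two or more distances to compare.
    """
    dists = sorted(math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])
    gap, i = max((dists[k + 1] - dists[k], k) for k in range(len(dists) - 1))
    return (dists[i] + dists[i + 1]) / 2

def threshold_cluster(points, threshold):
    """Assign each spot to the nearest centroid within the threshold, else open a new cluster."""
    centroids, counts, labels = [], [], []
    for p in points:
        best = min(range(len(centroids)),
                   key=lambda k: math.dist(p, centroids[k]),
                   default=None)
        if best is not None and math.dist(p, centroids[best]) <= threshold:
            n = counts[best]
            # Move the winning centroid to the running mean of its members.
            centroids[best] = tuple((c * n + x) / (n + 1) for c, x in zip(centroids[best], p))
            counts[best] += 1
        else:
            centroids.append(tuple(p))
            counts.append(1)
            best = len(centroids) - 1
        labels.append(best)
    return labels, centroids

# Two toy footprint segments, each made of three spots.
spots = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
limit = auto_threshold(spots)
labels, centres = threshold_cluster(spots, limit)
```

Because the threshold is derived from the spot distances themselves, the same code groups spots into segments whether the pattern is small and tightly spaced or large and widely spaced, as long as the gaps between segments exceed the gaps within them.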