• Title/Summary/Keyword: Automatic Information Extraction


A Study on the Korean-English Semantic Thesaurus Construction for Knowledge Management System (지식관리시스템을 위한 의미형 한영 시소러스 구축에 관한 연구)

  • 남영준
    • Journal of Korean Library and Information Science Society
    • /
    • v.32 no.4
    • /
    • pp.77-98
    • /
    • 2001
  • As the role of the library has changed to that of an integrated knowledge management system, libraries need new information retrieval tools. The purpose of this study is to propose a method and principles for constructing a Korean-English semantic thesaurus for a knowledge management system. The method and principles are as follows: 1) in collecting terminology, I included not only internal documents but also external documents on the web as sources for descriptor extraction; 2) conceptual descriptors are needed more than semantic ones, and I also proposed authority files as a complement; 3) I proposed 15,000 descriptors as the appropriate scale for a thesaurus; and 4) I proposed a hybrid method that uses both manual and automatic processes in establishing the relationships.

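The paper does not prescribe a concrete data model, but the descriptor relationships it describes (broader/narrower/related terms, plus authority files for non-preferred synonyms) might be represented as in the minimal sketch below; all names and example entries are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Descriptor:
    korean: str                                   # preferred Korean term
    english: str                                  # preferred English equivalent
    broader: set = field(default_factory=set)     # BT relations
    narrower: set = field(default_factory=set)    # NT relations
    related: set = field(default_factory=set)     # RT relations
    use_for: set = field(default_factory=set)     # non-preferred synonyms (authority file)

thesaurus: dict[str, Descriptor] = {}

def add(korean: str, english: str) -> None:
    thesaurus[english] = Descriptor(korean, english)

def link_hierarchy(broader: str, narrower: str) -> None:
    """In the hybrid workflow, links proposed automatically would be
    confirmed manually before being committed here."""
    thesaurus[broader].narrower.add(narrower)
    thesaurus[narrower].broader.add(broader)

add("지식", "knowledge")
add("지식관리", "knowledge management")
link_hierarchy("knowledge", "knowledge management")
```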

Application of Digital Photogrammetry for The Automatic Extraction of Road Information (도로정보의 자동추출을 위한 수치사진측량기법의 적용)

  • 유환희
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.12 no.1
    • /
    • pp.89-94
    • /
    • 1994
  • A number of recent research projects focus on the development of real-time mapping systems. Typically, these devices are used to capture land-related information in digital form from airplanes or cars. The purpose of this paper is to automatically extract road information from digital images obtained using the so-called "GPS-Van" developed by the Center for Mapping at The Ohio State University, and to propose a method for the effective storage and management of the digital data. The edges of a road can be extracted from the digital image, and their real-time 3-dimensional positions can be determined by digital photogrammetry. This paper also proposes three storage levels for the data, consisting of a raster data level, an object-oriented data level, and a vector data level, together with a quadtree data structure for effective compression and search in data management.

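The quadtree mentioned above collapses uniform raster blocks into single leaves, which is what yields both compression and fast spatial search. Below is a minimal region-quadtree sketch under two simplifying assumptions not stated in the paper: a binary raster and a power-of-two image size.

```python
import numpy as np

def build_quadtree(img):
    """Region quadtree over a square binary raster whose side is a power of
    two. Homogeneous blocks collapse to a single leaf (the compression)."""
    if img.min() == img.max():
        return int(img[0, 0])                  # uniform block -> leaf
    h = img.shape[0] // 2
    return [build_quadtree(img[:h, :h]),       # NW
            build_quadtree(img[:h, h:]),       # NE
            build_quadtree(img[h:, :h]),       # SW
            build_quadtree(img[h:, h:])]       # SE

def query(tree, x, y, size):
    """Point lookup without decompressing the raster."""
    while isinstance(tree, list):
        size //= 2
        tree = tree[(0 if y < size else 2) + (0 if x < size else 1)]
        x, y = x % size, y % size
    return tree

road = np.zeros((8, 8), dtype=np.uint8)
road[3:5, :] = 1                               # a horizontal "road" strip
tree = build_quadtree(road)
assert query(tree, 4, 3, 8) == 1               # pixel (col 4, row 3) is road
```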

A Study of Automatic Vehicle Control by Image Processing (화상처리 기술을 이용한 자동차 교통 제어에 관한 연구)

  • Choe, Hyeong-Jin;Yang, Hae-Sul
    • The Transactions of the Korea Information Processing Society
    • /
    • v.1 no.3
    • /
    • pp.418-426
    • /
    • 1994
  • An auto-navigation system provides a vehicle driver with more driving information through a computer-based system that supplies advanced knowledge to both the driver and the vehicle driving automation system. In this paper, we propose a new algorithm for extracting a passing car that removes the background region using a series of images. First, we generate two difference images from three original images by taking the difference between each consecutive pair. Second, we generate two mask images from the two difference images. Finally, we extract the passing car using one original image and the two mask images. Using this algorithm, we can extract moving objects outdoors.

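The three-image scheme described in the abstract is a form of triple-frame differencing. A minimal NumPy sketch follows; the threshold value is an illustrative assumption, not taken from the paper.

```python
import numpy as np

def extract_moving_object(f0, f1, f2, thresh=25):
    """Triple-frame differencing in the spirit of the paper: two difference
    images from three consecutive grayscale frames, two binary masks, and
    the moving object taken from the middle frame where both masks agree."""
    d1 = np.abs(f1.astype(np.int16) - f0.astype(np.int16))
    d2 = np.abs(f2.astype(np.int16) - f1.astype(np.int16))
    m1 = d1 > thresh                  # mask image 1
    m2 = d2 > thresh                  # mask image 2
    moving = m1 & m2                  # static background cancels out
    out = np.zeros_like(f1)
    out[moving] = f1[moving]          # pixels of the passing car only
    return out
```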

Automatic Malware Detection Rule Generation and Verification System (악성코드 침입탐지시스템 탐지규칙 자동생성 및 검증시스템)

  • Kim, Sungho;Lee, Suchul
    • Journal of Internet Computing and Services
    • /
    • v.20 no.2
    • /
    • pp.9-19
    • /
    • 2019
  • Services and users on the Internet are increasing rapidly, and cyber attacks are increasing with them, resulting in information leakage and financial damage. Governments, public agencies, and companies use security systems with signature-based detection rules to respond to known malicious code. However, generating and validating signature-based detection rules takes a long time. In this paper, we propose and develop a signature-based detection rule generation and verification system using a signature extraction scheme based on the LDA (latent Dirichlet allocation) algorithm and traffic analysis techniques. Experimental results show that detection rules are generated and verified much more quickly than before.

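As a rough illustration of the LDA side of the proposed scheme, the sketch below treats payload strings as documents of character n-grams and reads the top terms per topic as candidate signature tokens. The tokenization, topic count, and toy payloads are assumptions for illustration; the paper's actual pipeline and parameters may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

payloads = [
    "GET /cmd.php?exec=nc HTTP/1.1",     # illustrative "malicious" traffic
    "GET /cmd.php?exec=sh HTTP/1.1",
    "GET /index.html HTTP/1.1",          # illustrative benign traffic
]

# Character 4-grams stand in for the traffic tokens.
vec = CountVectorizer(analyzer="char", ngram_range=(4, 4))
X = vec.fit_transform(payloads)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[-5:][::-1]     # most probable tokens per topic
    print(f"topic {k} signature candidates:", [terms[i] for i in top])
```
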
A Dual-scale Network with Spatial-temporal Attention for 12-lead ECG Classification

  • Shuo Xiao;Yiting Xu;Chaogang Tang;Zhenzhen Huang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.9
    • /
    • pp.2361-2376
    • /
    • 2023
  • The electrocardiogram (ECG) signal is commonly used to screen for and diagnose cardiovascular diseases. In recent years, deep neural networks have been regarded as an effective approach to automatic ECG disease diagnosis. The convolutional neural network is widely used for ECG feature extraction because it can capture different levels of information. However, most previous studies adopt single-scale convolution filters to extract ECG signal features, ignoring the complementarity between ECG features at different scales. In this paper, we propose a dual-scale network with convolution filters of different sizes for 12-lead ECG classification. Our model can extract and fuse ECG signal features at different scales. In addition, different spatial positions and time periods of the feature maps obtained from the 12-lead ECG may contribute differently to classification. Therefore, we add spatial-temporal attention to each scale sub-network to emphasize the representative local spatial and temporal features. Our approach is evaluated on the PTB-XL dataset and achieves 0.9307 macro-averaged ROC-AUC, 0.8152 maximum F1 score, and 89.11 mean accuracy. The experimental results show that our approach outperforms the baselines.

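A minimal PyTorch sketch of the dual-scale idea follows: two convolutional branches with different kernel sizes, each with a simple temporal attention, fused before classification. The kernel sizes, channel widths, and simplified attention are illustrative assumptions; the paper's spatial-temporal attention is more elaborate.

```python
import torch
import torch.nn as nn

class ScaleBranch(nn.Module):
    """One sub-network: convolutions at a single kernel size plus a simple
    temporal attention that reweights time steps."""
    def __init__(self, kernel, channels=32, leads=12):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(leads, channels, kernel, padding=kernel // 2),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel, padding=kernel // 2),
            nn.BatchNorm1d(channels), nn.ReLU())
        self.attn = nn.Conv1d(channels, 1, 1)        # per-time-step score

    def forward(self, x):                            # x: (batch, 12, time)
        h = self.conv(x)
        w = torch.softmax(self.attn(h), dim=-1)      # attention over time
        return (h * w).sum(dim=-1)                   # (batch, channels)

class DualScaleECG(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.fine = ScaleBranch(kernel=3)            # local morphology
        self.coarse = ScaleBranch(kernel=15)         # longer rhythm context
        self.fc = nn.Linear(64, n_classes)           # 32 + 32 fused features

    def forward(self, x):
        return self.fc(torch.cat([self.fine(x), self.coarse(x)], dim=1))

model = DualScaleECG()
logits = model(torch.randn(2, 12, 1000))             # 2 records, 12 leads
```
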
Detection and Analysis of the Liver Area and Liver Tumors in CT Scans (CT 영상에서의 간 영역과 간 종양 추출 및 분석)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems
    • /
    • v.13 no.1
    • /
    • pp.15-27
    • /
    • 2007
  • In Korea, hepatoma is the third most frequent cause of cancer death, accounting for 17.2% of all cancer deaths, with a death rate of about 21 persons per 100,000. This paper proposes an automatic method for extracting areas suspicious of hepatoma from CT scans and evaluates its usefulness as an auxiliary tool for hepatoma diagnosis. To detect tumors inside the liver, the liver area is first extracted from about 45~50 CT scans obtained at 2.5-mm intervals starting from the lower part of the chest. In extracting the liver area, after irrelevant areas outside the ribs are removed, the areas of the internal organs are separated and enlarged using intensity information from the CT scan. The liver area is then selected from among the separated areas using information on the position and morphology of the liver. Since hepatoma is a hypervascular tumor, the area corresponding to hepatoma appears brighter than its surroundings in contrast-enhanced CT scans, and when hepatoma shows expansile growth, the area has a spherical shape. For extracting hepatoma areas, regions within the liver that are brighter than their surroundings and globe-shaped are therefore selected as candidates, and candidates appearing at the same position in successive CT scans are identified as hepatoma. For performance evaluation, the results of applying the proposed method to CT scans were compared with diagnoses by radiologists. The evaluation showed that the liver areas and liver tumors were extracted accurately and that the proposed method is highly useful as an auxiliary diagnostic tool for identifying liver tumors.

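The candidate-selection step described above rests on two cues: brightness relative to the surroundings and a roughly spherical (circular per slice) shape. The sketch below implements those two cues for a single slice; the thresholds and the perimeter approximation are illustrative assumptions, and the cross-slice confirmation step is left out.

```python
import numpy as np
from scipy import ndimage

def tumor_candidates(liver_slice, liver_mask,
                     brightness_margin=30, min_circularity=0.7):
    """Select bright, roughly circular regions inside an already-extracted
    liver mask (one CT slice). All parameter values are illustrative."""
    inside = liver_slice[liver_mask]
    bright = (liver_slice > inside.mean() + brightness_margin) & liver_mask
    labels, n = ndimage.label(bright)
    candidates = []
    for i in range(1, n + 1):
        region = labels == i
        area = region.sum()
        # Boundary pixel count as a rough perimeter; circularity is
        # 4*pi*area / perimeter^2, which is 1.0 for a perfect disc.
        perimeter = region.sum() - ndimage.binary_erosion(region).sum()
        if perimeter and 4 * np.pi * area / perimeter ** 2 >= min_circularity:
            candidates.append(region)
    return candidates
```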

Automatic Container Placard Recognition System (컨테이너 플래카드 자동 인식 시스템)

  • Heo, Gyeongyong;Lee, Imgeun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.6
    • /
    • pp.659-665
    • /
    • 2019
  • Various placards are attached to the surface of a container depending on the risk of the loaded cargo, and containers with dangerous goods must be managed separately from ordinary containers. Therefore, as part of port automation systems, there is demand for automatic placard recognition. In this paper, we propose a system that automatically extracts the placard area based on the shape features of the placard and recognizes its contents. Various distortions can be caused by the surface curvature of the container; therefore, care must be taken in the area extraction and recognition steps. The proposed system can automatically extract the region of interest and recognize the placard using the facts that the placard is diamond-shaped and that the class number is written just above the lower vertex. When applied to real images, the system recognizes placards without error, and the techniques used can be applied to various image analysis systems.

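A minimal OpenCV sketch of the diamond-shape cue follows: find quadrilateral contours and accept those whose vertices sit near the midpoints of the bounding-box edges, which is where a rotated square's corners fall. The thresholds are guesses, and the class-number recognition above the lower vertex is omitted.

```python
import cv2
import numpy as np

def find_placards(image_bgr):
    """Locate diamond-shaped placard candidates by contour analysis."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 80, 160)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    placards = []
    for c in contours:
        approx = cv2.approxPolyDP(c, 0.03 * cv2.arcLength(c, True), True)
        if len(approx) != 4 or cv2.contourArea(approx) < 500:
            continue                             # want sizeable quadrilaterals
        pts = approx.reshape(4, 2).astype(float)
        x, y, w, h = cv2.boundingRect(approx)
        # Midpoints of the bounding-box edges: where a diamond's corners sit.
        mids = np.array([[x + w / 2, y], [x + w, y + h / 2],
                         [x + w / 2, y + h], [x, y + h / 2]])
        # Worst-case L1 distance from a vertex to its nearest edge midpoint.
        dist = np.abs(pts[:, None] - mids[None]).sum(-1).min(axis=1).max()
        if dist < 0.2 * max(w, h):
            placards.append((x, y, w, h))
    return placards
```
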
Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. Many analysts are also interested in text because the amount of such data is very large and it is relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification, which assigns documents to predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which condenses the main contents of one or several documents, have been actively studied. In particular, text summarization is actively applied in business through news summary services, privacy policy summary services, etc. Much research has also been done in academia on both the extraction approach, which selectively presents the main elements of a document, and the abstraction approach, which extracts elements of the document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as far as automatic summarization itself. Most existing studies on summarization quality evaluation manually summarized documents, used these as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, automatic summarization is performed from the full text using various techniques, and quality is measured by comparison with the reference document, which serves as an ideal summary. Reference documents are provided in two major ways; the most common is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in preparing the summary, it takes much time and cost, and the evaluation result may differ depending on the person writing the summary. To overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method has recently been devised to reduce the size of the full text and measure the similarity between the reduced full text and the automatic summary. In this method, the more frequently a term from the full text appears in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing omissions, a summary judged "good" by frequency alone is not always a good summary in this essential sense. To overcome the limitations of previous studies on summarization evaluation, this study proposes an automatic quality evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little duplicated content there is among the sentences of the summary, and completeness as an element indicating how little of the original content is missing from the summary.
In this paper, we propose a method for the automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. To evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor's hotel reviews, the reviews were summarized by hotel, and the results of experiments evaluating summary quality according to the proposed methodology are presented. The paper also provides a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and proposes a method for performing optimal summarization by varying the sentence-similarity threshold.

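A sketch of the two measures and their F-score combination follows, using TF-IDF cosine similarity as the sentence-similarity function. The specific similarity measure and the threshold value are assumptions for illustration; the paper's exact definitions may differ.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summary_quality(full_sents, summary_sents, sim_thresh=0.3):
    """Completeness: fraction of source sentences covered by some summary
    sentence. Succinctness: fraction of summary-sentence pairs that do NOT
    duplicate each other. Combined as an F-score."""
    vec = TfidfVectorizer().fit(full_sents + summary_sents)
    F = vec.transform(full_sents)
    S = vec.transform(summary_sents)
    cov = cosine_similarity(F, S).max(axis=1)      # best match per source sentence
    completeness = float((cov >= sim_thresh).mean())
    sim_ss = cosine_similarity(S)                  # summary-vs-summary similarity
    iu = np.triu_indices(len(summary_sents), k=1)
    succinctness = float((sim_ss[iu] < sim_thresh).mean()) if iu[0].size else 1.0
    denom = completeness + succinctness
    f_score = 2 * completeness * succinctness / denom if denom else 0.0
    return completeness, succinctness, f_score
```
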
Hierarchical Automatic Classification of News Articles based on Association Rules (연관규칙을 이용한 뉴스기사의 계층적 자동분류기법)

  • Joo, Kil-Hong;Shin, Eun-Young;Lee, Joo-Il;Lee, Won-Suk
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.6
    • /
    • pp.730-741
    • /
    • 2011
  • With the development of the Internet and computer technology, the amount of information on the Internet is increasing rapidly, and it is managed in document form. For this reason, research into methods for managing large numbers of documents effectively is necessary. Conventional document categorization methods used only the keywords of related documents for classification. In contrast, this paper proposes a keyword extraction method based on association rules. This method extracts a set of related keywords involved in a document's category and selects representative keywords using the classification rule proposed in this paper. In addition, this paper proposes a preprocessing method for efficient keyword creation and predicts a new document's category. We design the classifier and measure its performance through experiments in order to increase the profile's classification performance. When predicting a category, substituting all the classification rules one by one is the main cause of decreased processing performance in a profile. Finally, this paper suggests an automatic categorization scheme that can be applied to hierarchical category architectures, extended from a simple category architecture.

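As a rough illustration of the keyword-rule idea, the sketch below mines single-keyword -> category rules with support and confidence thresholds and predicts by confidence-weighted voting. The rule form, thresholds, and toy data are assumptions; the paper's rules and hierarchical extension are richer.

```python
from collections import Counter, defaultdict

def mine_rules(docs, min_support=0.25, min_confidence=0.6):
    """Mine keyword -> category rules from labeled documents.
    docs: list of (word_list, category) pairs."""
    n = len(docs)
    kw_count = Counter()                     # documents containing keyword
    kw_cat = defaultdict(Counter)            # keyword -> category counts
    for words, category in docs:
        for w in set(words):
            kw_count[w] += 1
            kw_cat[w][category] += 1
    rules = {}
    for w, cnt in kw_count.items():
        if cnt / n < min_support:
            continue                         # rule not frequent enough
        cat, hits = kw_cat[w].most_common(1)[0]
        if hits / cnt >= min_confidence:     # confidence of rule w -> cat
            rules[w] = (cat, hits / cnt)
    return rules

def predict(rules, words):
    votes = Counter()
    for w in set(words):
        if w in rules:
            cat, conf = rules[w]
            votes[cat] += conf               # confidence-weighted vote
    return votes.most_common(1)[0][0] if votes else None

docs = [(["economy", "stock"], "finance"), (["match", "goal"], "sports"),
        (["stock", "bank"], "finance"), (["goal", "league"], "sports")]
rules = mine_rules(docs)
print(predict(rules, ["stock", "market"]))   # -> finance
```
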
Realtime Facial Expression Data Tracking System using Color Information (컬러 정보를 이용한 실시간 표정 데이터 추적 시스템)

  • Lee, Yun-Jung;Kim, Young-Bong
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.7
    • /
    • pp.159-170
    • /
    • 2009
  • It is very important to extract expression data and capture a face image from video for online 3D face animation. Recently, there has been much research on vision-based approaches that capture an actor's expression in video and apply it to a 3D face model. In this paper, we propose an automatic data extraction system that extracts and tracks a face and expression data from real-time video input. Our system consists of three steps: face detection, facial feature extraction, and face tracking. In face detection, we detect skin pixels using a YCbCr skin color model and verify the face area using a Haar-based classifier. We use brightness and color information to extract the eye and lip data related to facial expression. We extract 10 feature points from the eye and lip areas, considering the FAPs defined in MPEG-4. Then, we track the displacement of the extracted features across consecutive frames using a color probability distribution model. Experiments showed that our system can track expression data at about 8 fps.
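
A minimal OpenCV sketch of the detection stage described above follows: YCbCr skin segmentation to narrow the search, then a Haar cascade to verify the face. The Cb/Cr bounds are common textbook values, not necessarily those used in the paper, and the feature-point extraction and tracking stages are omitted.

```python
import cv2

# Bundled frontal-face Haar cascade from opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(frame_bgr):
    # OpenCV stores the channels as (Y, Cr, Cb).
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    skin = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    candidates = cv2.bitwise_and(frame_bgr, frame_bgr, mask=skin)
    gray = cv2.cvtColor(candidates, cv2.COLOR_BGR2GRAY)
    # Verify candidate skin regions with the Haar-based classifier.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

cap = cv2.VideoCapture(0)                    # real-time input, as in the paper
ok, frame = cap.read()
if ok:
    print(detect_face(frame))                # (x, y, w, h) face boxes
cap.release()
```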