• Title/Summary/Keyword: Semantic-Based Information Extraction


An Effective Framework for Contented-Based Image Retrieval with Multi-Instance Learning Techniques

  • Peng, Yu;Wei, Kun-Juan;Zhang, Da-Li
    • Journal of Ubiquitous Convergence Technology / v.1 no.1 / pp.18-22 / 2007
  • Multi-Instance Learning (MIL) copes well with the inherent ambiguity of images in multimedia retrieval. This paper proposes an effective framework for Content-Based Image Retrieval (CBIR) with MIL techniques. The framework segments each image with an improved Mean Shift algorithm and processes the segmentation results with mathematical morphology, with the goal of detecting the semantic concepts contained in the query. Every detected sub-image is represented as a multi-feature vector, which is regarded as an instance, and each image becomes a bag comprising a variable number of instances. Several MIL algorithms are applied within this framework to perform the retrieval. Extensive experimental results show excellent performance in comparison with existing methods of CBIR with MIL.
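
The bag-of-instances representation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' framework: the feature vectors are invented, and the minimal Hausdorff distance used to rank bags is an assumption chosen because it is a common bag-level distance in MIL; the abstract does not name the MIL algorithms that were applied.

```python
import math
from typing import Dict, List, Sequence

Instance = Sequence[float]   # feature vector of one segmented region
Bag = List[Instance]         # one image = a variable number of instances


def euclidean(a: Instance, b: Instance) -> float:
    """Euclidean distance between two instance feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_hausdorff(bag_a: Bag, bag_b: Bag) -> float:
    """Distance between the closest pair of instances across two bags."""
    return min(euclidean(a, b) for a in bag_a for b in bag_b)


def rank_images(query: Bag, database: Dict[str, Bag]) -> List[str]:
    """Return image names ordered from most to least similar to the query bag."""
    return sorted(database, key=lambda name: min_hausdorff(query, database[name]))


# Hypothetical 2-D features; real region features would be much richer.
query_bag = [[0.9, 0.1], [0.2, 0.8]]
database = {
    "img_a": [[0.88, 0.12], [0.5, 0.5], [0.1, 0.1]],  # three regions
    "img_b": [[0.4, 0.4]],                             # one region
}
ranking = rank_images(query_bag, database)
```

Because bags may hold different numbers of instances, the distance is defined over pairs of instances rather than over fixed-length image vectors.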


Fast information extraction algorithm for object-based MPEG-4 conversion from MPEG-1,2 (MPEG-1,2로부터 객체 기반 MPEG-4 변환을 위한 고속 정보 추출 알고리즘)

  • 양종호;박성욱
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.3 / pp.91-102 / 2004
  • In this paper, a fast information extraction algorithm for converting MPEG-1,2 to object-based MPEG-4 is proposed. Object-based MPEG-4 conversion requires information such as the object image, shape image, macro-block motion vectors, and header information to be extracted from the MPEG-1,2 bit-stream; using this extracted information makes fast conversion to object-based MPEG-4 possible. The proposed object extraction algorithm relies on two key components: motion vector extraction from the MPEG-1,2 bit-stream and the watershed algorithm. The algorithm extracts objects with the user's assistance in the intra frame and tracks them in the following inter frames. If the result is unsatisfactory for a fast-moving object, the user can intervene to correct the segmentation. The algorithm consists of two steps: intra-frame object extraction and inter-frame object tracking. In the object extraction step, the user directly extracts a semantic object by using block classification and watersheds. In the object tracking step, the object is followed through the subsequent frames using a boundary fitting method based on motion vectors, the object mask, and modified watersheds. Experimental results show that the proposed method achieves fast conversion from the MPEG-1,2 bit-stream to the object-based MPEG-4 input.

An Experimental Study on the Relation Extraction from Biomedical Abstracts using Machine Learning (기계 학습을 이용한 바이오 분야 학술 문헌에서의 관계 추출에 대한 실험적 연구)

  • Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.50 no.2 / pp.309-336 / 2016
  • This paper introduces a relation extraction system that identifies and classifies semantic relations between biomedical entities in scientific texts using machine learning methods such as Support Vector Machines (SVM). The system includes many functions for extracting various linguistic features from sentences containing a pair of biomedical entities and for applying those features to train relation extraction models with maximal performance. Three globally representative collections from biomedical domains were used in the experiments, which demonstrate the system's superiority across those domains. The intensive experimental study conducted in this paper is therefore likely to provide a meaningful foundation for research on bio-text analysis based on machine learning.

An Efficient Algorithm for Detecting Tables in HTML Documents (HTML 문서의 테이블 식별을 위한 효율적인 알고리즘)

  • Kim Yeon-Seok;Lee Kyong-Ho
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1339-1353 / 2004
  • <TABLE> tags in HTML documents are widely used for formatting the layout of Web documents as well as for describing genuine tables with relational information. As a prerequisite for information extraction from the Web, this paper presents an efficient method for sophisticated table detection. The proposed method consists of two phases: preprocessing and attribute-value relation extraction. In the preprocessing phase, where clearly genuine or non-genuine tables are filtered out, appropriate rules are devised based on a careful examination of the general characteristics of <TABLE> tags. The remaining tables are classified in the attribute-value relation extraction phase. Specifically, a value area is extracted and checked for syntactic coherency. Furthermore, for tables where the syntactic coherency check may be inappropriate, the method looks for semantic coherency between the attribute area and the value area. Experimental results with 11,477 <TABLE> tags from 1,393 HTML documents show that the method performs better than previous work, with a precision of 97.54% and a recall of 99.22% on average.
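
The preprocessing idea in this abstract (filtering layout tables from candidate data tables) can be illustrated with a short sketch. It uses Python's standard `html.parser` module. The filtering rule in `looks_genuine` is a hypothetical heuristic invented for this example (no nested table, at least two rows, the same number of cells in every row); it is not the rule set devised in the paper.

```python
from html.parser import HTMLParser
from typing import Dict, List


class TableScanner(HTMLParser):
    """Collects, for every <table>, its per-row cell counts and a nesting flag."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[Dict] = []    # currently open tables, innermost last
        self.tables: List[Dict] = []   # finished tables, in closing order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TABLE> and <table> match alike.
        if tag == "table":
            if self.stack:
                self.stack[-1]["nested"] = True   # parent contains a table
            self.stack.append({"rows": [], "nested": False})
        elif tag == "tr" and self.stack:
            self.stack[-1]["rows"].append(0)
        elif tag in ("td", "th") and self.stack and self.stack[-1]["rows"]:
            self.stack[-1]["rows"][-1] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            self.tables.append(self.stack.pop())


def looks_genuine(table: Dict) -> bool:
    """Hypothetical rule: regular grid of >= 2 rows and >= 2 columns, no nesting."""
    rows = table["rows"]
    return (not table["nested"] and len(rows) >= 2
            and rows[0] >= 2 and len(set(rows)) == 1)


def classify_tables(html: str) -> List[bool]:
    scanner = TableScanner()
    scanner.feed(html)
    scanner.close()
    return [looks_genuine(t) for t in scanner.tables]


sample = (
    "<table><tr><td><table><tr><td>menu</td></tr></table></td></tr></table>"
    "<TABLE><TR><TH>Name</TH><TH>Price</TH></TR>"
    "<TR><TD>Pen</TD><TD>2</TD></TR></TABLE>"
)
flags = classify_tables(sample)
```

Tables are reported in the order their closing tags appear, so the inner layout table comes first, then its parent, then the data table.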


Recognition and Evaluation of Efficient Language Analysis Unit for Korean (한국어에서 실용적 언어분석 단위의 인식과 평가)

  • 박인철
    • Journal of the Korea Computer Industry Society / v.5 no.1 / pp.65-76 / 2004
  • In this paper, we observe the differences between the linguistic and computational aspects of the automatic processing of languages, which are the dominant means of representing information on the Internet. For efficient information retrieval, information extraction, and machine translation over massive document collections, we investigate analysis units for morphological, syntactic, and semantic analysis, and propose the longest syntactic analysis unit rather than the linguistically motivated morphological unit. By evaluating on massive document collections, we also show that the proposed analysis units can serve as a constraint that reduces the ambiguity arising in language processing.


A Technique for Extracting GeoSemantic Knowledge from Micro-blog (마이크로 블로그기반의 공간 지식 추출 기법연구)

  • Ha, Su-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Spatial Information Research / v.20 no.2 / pp.129-136 / 2012
  • Recently, international organizations such as ISO/TC211, OGC, and INSPIRE (Infrastructure for Spatial Information in Europe) have been making efforts to share geospatial data using semantic web technologies. In addition, smartphones and social networking services give participants community-based opportunities to share issues about social phenomena tied to a geographic area, and many researchers are trying to find methods of extracting issues from this data. However, serviceable spatial ontologies are still insufficient at the application level, and studies of spatial information extraction from SNS have focused on finding users' locations or on geocoding by text mining. Therefore, a study of extracting spatial phenomena from social media information and converting them into geosemantic knowledge is highly useful. In this paper, we propose a framework for extracting keywords from micro-blogs, one of the social media services, finding their relationships using data mining techniques, and converting them into spatiotemporal knowledge. The result of this study could be used to implement a related system, serving as a procedure and an ontology model for constructing geosemantic issues. This is expected to improve the effectiveness of finding, publishing, and analysing spatial issues.

Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention (딥러닝 기반의 Multi Scale Attention을 적용한 개선된 Pyramid Scene Parsing Network)

  • Kim, Jun-Hyeok;Lee, Sang-Hun;Han, Hyun-Ho
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.45-51 / 2021
  • With the development of deep learning, semantic segmentation methods are being studied in various fields. However, segmentation accuracy drops in fields that demand high accuracy, such as medical image analysis. Conventional deep learning based segmentation methods lower the resolution and lose object features during feature extraction and compression. Because of these losses, the edge and internal information of objects is lost and object segmentation accuracy decreases. To solve these problems, we improved PSPNet, a deep learning based semantic segmentation model, to minimize the loss of features during semantic segmentation. The proposed multi-scale attention was added to the conventional PSPNet to prevent the loss of object features. Features were refined by applying the attention method to the conventional PPM module, and suppressing unnecessary feature information improved the edge and texture information. The proposed method was trained on the Cityscapes dataset and evaluated quantitatively with the segmentation index MIoU. In the experiments, segmentation accuracy improved by about 1.5% compared with the conventional PSPNet.
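
For readers unfamiliar with the MIoU index mentioned in this abstract, a minimal sketch of the standard mean intersection-over-union computation follows. It assumes flattened per-pixel label lists and averages only over classes that occur in the prediction or the ground truth; the labels in the example are invented, and dataset-specific details such as ignored classes in Cityscapes are not handled.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean of per-class IoU = |pred ∩ target| / |pred ∪ target| over present classes."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    ious: List[float] = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Six hypothetical pixels, three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.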

A Study on Keywords Extraction based on Semantic Analysis of Document (문서의 의미론적 분석에 기반한 키워드 추출에 관한 연구)

  • Song, Min-Kyu;Bae, Il-Ju;Lee, Soo-Hong;Park, Ji-Hyung
    • Proceedings of the Korea Inteligent Information System Society Conference / 2007.11a / pp.586-591 / 2007
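  • Systems that handle documents, such as knowledge management systems, information retrieval systems, and digital library systems, require documents to be structured and stored. The first priority in extracting the information contained in a document is the selection of keywords. The most widely used algorithm in previous studies is the TF-IDF method, which combines TF (Term Frequency), a count of how often a word is used, with IDF (Inverse Document Frequency). However, the TF-IDF method is limited in that it does not reflect the meaning of the document. To compensate for this, this study uses three methods. The first is a document-structure technique that checks the position of words in the document and the usage of words that appear in specific parts such as the introduction and conclusion. The second designates specific phrases, such as emphatic and comparative expressions, as a controlled vocabulary. The last is a linguistic technique that analyzes the dictionary meaning of words and uses it as metadata. In this way, semantic analysis of the document is also performed during keyword extraction, which can improve the efficiency of keyword extraction.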
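
The TF-IDF baseline that this abstract builds on can be sketched in a few lines. The weighting variant is an assumption (relative term frequency multiplied by the unsmoothed natural-log inverse document frequency), since the abstract does not specify one, and the toy documents are invented.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score each term of each tokenized document with tf * log(N / df)."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))          # count each term once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (cnt / total) * math.log(n_docs / doc_freq[term])
            for term, cnt in counts.items()
        })
    return scores


docs = [
    ["ontology", "keyword", "keyword"],
    ["ontology", "retrieval"],
    ["ontology", "library"],
]
scores = tf_idf(docs)
top_keyword = max(scores[0], key=scores[0].get)
```

A term that occurs in every document ("ontology" here) gets a score of zero regardless of how important it is to the text, which illustrates why frequency alone cannot capture document meaning.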


Customized Knowledge Creation Framework using Context- and intensity-based Similarity (상황과 정보 집적도를 고려한 유사도 기반의 맞춤형 지식 생성프레임워크)

  • Sohn, Mye M.;Lee, Hyun-Jung
    • Journal of Internet Computing and Services / v.12 no.5 / pp.113-125 / 2011
  • As information resources have become more varied and more numerous, knowledge customization on the social web has become more difficult. To reduce this burden, we offer a framework for context-based similarity calculation for knowledge customization that uses an ontology on top of CBR. To this end, we developed new context- and intensity-based similarity calculation methods, which are applied to extract the most similar case, considering both semantic and syntactic similarity, and to create user-tailored knowledge effectively from the selected case. The process comprises converting unstructured web information into cases, extracting an appropriate case according to the user requirements, and customizing the knowledge using the selected case. In the experimental section, the effectiveness of the developed similarity methods is compared with that of other edge-counting similarity methods on pairs of classes. The results show that our framework yields higher similarity values for conceptually close classes than the other methods do.
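
The edge-counting similarity measures used as a comparison baseline in this abstract can be illustrated as follows. This sketch is not the context- and intensity-based method the paper proposes: it uses the simple form sim = 1 / (1 + shortest path length) over is-a links, which is one common edge-counting formulation chosen here as an assumption, and the small class hierarchy is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple

Graph = Dict[str, Set[str]]


def undirected(is_a_edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected graph from (child, parent) is-a pairs."""
    graph: Graph = {}
    for child, parent in is_a_edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def path_length(graph: Graph, start: str, goal: str) -> Optional[int]:
    """Number of edges on the shortest path, found by breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None                      # the classes are not connected


def edge_similarity(graph: Graph, a: str, b: str) -> float:
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


# Hypothetical ontology fragment.
ontology = undirected([
    ("hotel", "lodging"), ("hostel", "lodging"),
    ("lodging", "service"), ("restaurant", "service"),
])
close = edge_similarity(ontology, "hotel", "hostel")       # 2 edges apart
far = edge_similarity(ontology, "hotel", "restaurant")     # 3 edges apart
```

Sibling classes under the same parent score higher than classes that only share a more distant ancestor, which is the behavior the comparison in the abstract is concerned with.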

Integration of Ontology Model and Product Structure for the Requirement Management of Building Specification (건조사양서 요구사항의 추적을 위한 온톨로지 모델과 제품구조 통합 기초 연구)

  • Kim, Seung-Hyun;Lee, Jang-Hyun;Han, Eun-Jung
    • Journal of the Society of Naval Architects of Korea / v.48 no.3 / pp.207-214 / 2011
  • Ship design requirements described in the building specification should be reflected in the design process. This paper identifies the configuration of the requirements stated in the building specification using the ontology representation language OWL. An ontology-based semantic search system identifies the requirement items, and through this extraction the building specification entries are organized into a tree. The V model of systems engineering is also used to track the ship design requirements and the associated set of procedures. The semantic search engine, which combines a robot agent with the ontology, can search the requirements specification document and extract the design information. Finally, a tracking model for the design requirements is proposed that relates them to the associated BOM (bill of materials) and product structure.