• Title/Abstract/Keyword: Textual information

Search results: 240 items

XMT 저작용 MPEG-4 BIFS 인코더 개발 (Development of MPEG-4 BIFS Encoder for XMT Authoring)

  • 김상욱;차경애;김희선;이동훈;김광용;이명호
    • Korea Information Processing Society: Conference Proceedings / Proceedings of the 2000 KIPS Fall Conference (Vol. II) / pp.1333-1336 / 2000
  • This paper proposes an MPEG-4 BIFS encoder that generates media descriptions in XMT form and presents its development. XMT (Extensible MPEG-4 Textual Format) is a textual MPEG-4 scene description that can be used for broadcast audio/video editing and for developing user-oriented media content for mobile terminals. The paper presents a visual authoring method for XMT and a technique that generates XMT media descriptions from the information edited through it, namely scene descriptions, object descriptions, and interpreting information. The authored XMT media description can then be presented by a decoder.
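As a rough illustration of the kind of output such an encoder targets, the sketch below assembles a minimal XMT-A-style XML skeleton from a hypothetical list of authored media objects using Python's standard library. The element and attribute names are simplified assumptions, not the exact XMT schema or the authors' implementation.

```python
# Minimal sketch: serialize a hypothetical authored scene into an XMT-A-like
# XML document. Element names are simplified; the real XMT-A schema is richer.
import xml.etree.ElementTree as ET

def build_xmt_document(objects):
    """objects: list of dicts such as {"id": "video1", "url": "clip.mp4", "start": 0}."""
    root = ET.Element("XMT-A")
    body = ET.SubElement(root, "Body")
    replace = ET.SubElement(body, "Replace")      # initial scene update
    scene = ET.SubElement(replace, "Scene")
    group = ET.SubElement(scene, "Group")
    for obj in objects:
        ET.SubElement(group, "MovieTexture",      # one node per authored media object
                      attrib={"DEF": obj["id"],
                              "url": obj["url"],
                              "startTime": str(obj.get("start", 0))})
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(build_xmt_document([{"id": "video1", "url": "news_clip.mp4", "start": 0}]))
```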


SMIL 변환을 지원하는 XMT 저작도구 (A XMT Authoring Tool supporting SMIL Transformation)

  • 임영순;김희선;이숙영;김상욱
    • Korean Institute of Information Scientists and Engineers: Conference Proceedings / Proceedings of the 2003 KIISE Fall Conference, Vol.30 No.2 (II) / pp.475-477 / 2003
  • MPEG-4 is a multimedia standard that allows a scene composed of multimedia objects to be represented in units of content. The MPEG-4 system not only defines scene-composition information in a binary format called BIFS, but also defines a textual scene description called XMT (Extensible MPEG-4 Textual format). Because XMT is an XML format, an authored file can be transformed into other forms and exchanged across diverse playback environments. This paper introduces an authoring tool that generates the two file formats defined by XMT, XMT-α and XMT-Ω, and that transforms the XMT-Ω file format into SMIL to support diverse playback environments.
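Because XMT-Ω borrows its timing vocabulary (par, seq, media references) from SMIL, much of such a transformation can be a direct structural mapping. The sketch below converts a small illustrative XMT-Ω-like tree into a SMIL document; the input representation and tag mapping are assumptions for illustration, not the tool's actual conversion rules.

```python
# Minimal XMT-Ω -> SMIL style conversion sketch: recursively map a nested
# timing tree onto SMIL elements and wrap it in a <smil><body> document.
import xml.etree.ElementTree as ET

TAG_MAP = {"par": "par", "seq": "seq", "video": "video", "audio": "audio", "img": "img"}

def to_smil(node):
    """node: {"tag": "par", "attrs": {...}, "children": [...]} -> SMIL Element."""
    elem = ET.Element(TAG_MAP[node["tag"]], attrib=node.get("attrs", {}))
    for child in node.get("children", []):
        elem.append(to_smil(child))
    return elem

def wrap_smil(body_elem):
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    body.append(body_elem)
    return ET.tostring(smil, encoding="unicode")

if __name__ == "__main__":
    scene = {"tag": "par", "children": [
        {"tag": "video", "attrs": {"src": "clip.mp4", "dur": "10s"}},
        {"tag": "audio", "attrs": {"src": "narration.mp3"}}]}
    print(wrap_smil(to_smil(scene)))
```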


Citation-based Article Summarization using a Combination of Lexical Text Similarities: Evaluation with Computational Linguistics Literature Summarization Datasets

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information / Vol.24 No.7 / pp.31-37 / 2019
  • Citation-based article summarization creates a shortened text for an academic article that reflects the content of citing sentences, which contain others' thoughts about the target article being summarized. To deal with the problem, this study introduces an extractive summarization method based on a linear combination of various sentence salience scores, which represent the degrees to which a candidate sentence reflects the author's abstract text, readers' citing text, and the target article itself. In this study, salience scores are obtained by computing surface-level textual similarities. Experiments using CL-SciSumm datasets show that the proposed method parallels or outperforms previous approaches in ROUGE evaluations against SciSumm-2017 human summaries and SciSumm-2016/2017 community summaries.
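A minimal sketch of the extractive scheme described in the abstract: each candidate sentence is scored by a weighted (linear) combination of surface-level similarities to the author's abstract, the readers' citing sentences, and the article body, and the top-scoring sentences are selected. The term-frequency cosine representation and the weights are illustrative assumptions, not the paper's tuned configuration.

```python
# Sketch: extractive summarization by a linear combination of lexical similarities.
from collections import Counter
import math

def tf_vector(text):
    return Counter(text.lower().split())

def cosine(a, b):
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def summarize(candidates, abstract, citing_sentences, article_text,
              weights=(0.4, 0.4, 0.2), k=3):
    abs_v = tf_vector(abstract)
    cite_v = tf_vector(" ".join(citing_sentences))
    art_v = tf_vector(article_text)
    scored = []
    for s in candidates:
        v = tf_vector(s)
        salience = (weights[0] * cosine(v, abs_v) +     # similarity to author's abstract
                    weights[1] * cosine(v, cite_v) +    # similarity to citing sentences
                    weights[2] * cosine(v, art_v))      # similarity to the article itself
        scored.append((salience, s))
    return [s for _, s in sorted(scored, reverse=True)[:k]]
```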

Systematic Review on Chatbot Techniques and Applications

  • Park, Dong-Min;Jeong, Seong-Soo;Seo, Yeong-Seok
    • Journal of Information Processing Systems / Vol.18 No.1 / pp.26-47 / 2022
  • Chatbots have been an important research subject for a long time. A chatbot is a computer program or an artificial intelligence program that participates in a conversation through auditory or textual methods. As research on chatbots has progressed, the important issues surrounding them have changed over time, so it is necessary to review the technology with a focus on recent advancements and core research techniques. In this paper, we introduce five chatbot technologies: natural language processing, pattern matching, the semantic web, data mining, and context-aware computing. We also survey the latest technologies so that chatbot researchers can recognize the present state of the field and steer it in the right direction.
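Of the five techniques listed, pattern matching is the simplest to illustrate: recognized input patterns are mapped to canned (possibly templated) responses. The toy sketch below is illustrative only; production chatbots use AIML-style rule bases or learned models.

```python
# Toy pattern-matching chatbot: first matching rule wins, groups fill the template.
import re

RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you today?"),
    (re.compile(r"\bmy name is (\w+)", re.I), "Nice to meet you, {0}."),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye!"),
]

def respond(message):
    for pattern, template in RULES:
        m = pattern.search(message)
        if m:
            return template.format(*m.groups())
    return "I'm not sure I understand. Could you rephrase that?"

if __name__ == "__main__":
    print(respond("Hello, my name is Dana"))   # matches the greeting rule first
```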

Evaluation of Similarity Analysis of Newspaper Article Using Natural Language Processing

  • Ayako Ohshiro;Takeo Okazaki;Takashi Kano;Shinichiro Ueda
    • International Journal of Computer Science & Network Security / Vol.24 No.6 / pp.1-7 / 2024
  • Comparing text features involves evaluating the "similarity" between texts, and it is crucial to use appropriate similarity measures when doing so. This study used various techniques to assess the similarities between newspaper articles, including deep learning and a previously proposed method that combines Pointwise Mutual Information (PMI) and Word Pair Matching (WPM), denoted PMI+WPM. For performance comparison, law data from medical research in Japan were used as validation data when evaluating the PMI+WPM method. The comparative analysis revealed that the distribution of similarities in text data varies with the evaluation technique and the genre. For newspaper data, non-deep learning methods showed better similarity evaluation accuracy than deep learning methods, and evaluating similarities in law data proved more challenging than in newspaper articles. Although deep learning is the prevalent method for evaluating textual similarity, this study demonstrates that non-deep learning methods can be effective for Japanese-language texts.
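The exact Word Pair Matching formulation is not spelled out in the abstract, so the sketch below only illustrates the PMI side of such a similarity: estimate PMI from sentence-level co-occurrence in a background corpus, then average it over cross-document word pairs. Treat it as an assumption-laden illustration rather than the authors' PMI+WPM method.

```python
# Illustrative PMI-style similarity (an assumption, not the paper's PMI+WPM):
# estimate PMI(w1, w2) from sentence-level co-occurrence, then score two texts
# by the average PMI over their cross-document word pairs.
from collections import Counter
from itertools import combinations, product
import math

def pmi_table(corpus_sentences):
    word_counts, pair_counts, n = Counter(), Counter(), len(corpus_sentences)
    for sent in corpus_sentences:
        words = set(sent.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    def pmi(w1, w2):
        pair = frozenset((w1, w2))
        if pair not in pair_counts:
            return 0.0
        p_xy = pair_counts[pair] / n
        p_x, p_y = word_counts[w1] / n, word_counts[w2] / n
        return math.log(p_xy / (p_x * p_y))
    return pmi

def pmi_similarity(text_a, text_b, pmi):
    pairs = [(a, b) for a, b in product(set(text_a.lower().split()),
                                        set(text_b.lower().split())) if a != b]
    return sum(pmi(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
```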

다양한 재생 환경을 지원하는 XMT 저작 시스템 (An XMT Authoring System supporting Multiple Presentation Environments)

  • 김희선;임영순
    • Journal of KIISE: Computing Practices and Letters / Vol.10 No.3 / pp.251-258 / 2004
  • XMT is a textual MPEG-4 scene description language that can be used for broadcast audio/video editing and for developing user-oriented media content. This paper proposes an XMT authoring system that supports the interchange of content across diverse presentation environments. The system generates the two XMT file formats, XMT-α and XMT-Ω. Because the two formats represent the same objects differently, the system provides an authoring interface for the abstracted XMT-α and a separate interface for XMT-Ω. It also defines an internal data structure that can support both file formats and provides conversion of XMT-α into BIFS and of XMT-Ω into SMIL and XMT-α, thereby delivering the interchangeability of multimedia across diverse environments that characterizes XMT.
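One way to read the "internal data structure supporting both formats" is a format-neutral scene node that can be serialized either to low-level, BIFS-node-like XMT-α or to high-level, SMIL-like XMT-Ω. The field names and element mappings below are simplified assumptions about such a structure, not the system's actual design.

```python
# Sketch of a format-neutral internal scene node with two serializers.
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET

@dataclass
class MediaNode:
    node_id: str
    media_type: str                     # e.g. "video" or "audio"
    url: str
    children: List["MediaNode"] = field(default_factory=list)

    def to_xmt_alpha(self) -> ET.Element:
        """Low-level, BIFS-node-like rendering (XMT-α style)."""
        tag = {"video": "MovieTexture", "audio": "AudioClip"}.get(self.media_type, "Group")
        elem = ET.Element(tag, attrib={"DEF": self.node_id, "url": self.url})
        for child in self.children:
            elem.append(child.to_xmt_alpha())
        return elem

    def to_xmt_omega(self) -> ET.Element:
        """High-level, SMIL-like rendering (XMT-Ω style)."""
        elem = ET.Element(self.media_type, attrib={"id": self.node_id, "src": self.url})
        for child in self.children:
            elem.append(child.to_xmt_omega())
        return elem

if __name__ == "__main__":
    node = MediaNode("video1", "video", "clip.mp4")
    print(ET.tostring(node.to_xmt_alpha(), encoding="unicode"))
    print(ET.tostring(node.to_xmt_omega(), encoding="unicode"))
```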

Topic Level Disambiguation for Weak Queries

  • Zhang, Hui;Yang, Kiduk;Jacob, Elin
    • Journal of Information Science Theory and Practice / Vol.1 No.3 / pp.33-46 / 2013
  • Despite some success, today's information retrieval (IR) systems are neither intelligent nor reliable. IR systems return poor search results when users formulate their information needs as incomplete or ambiguous queries (i.e., weak queries). Therefore, one of the main challenges in modern IR research is to provide consistent results across all queries by improving performance on weak queries. However, existing IR approaches such as query expansion are not especially effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries, which are typically short. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, this study implemented a novel topic detection method that leverages both the language models and structural knowledge of Wikipedia, and systematically evaluated the effect of query disambiguation and topic-based retrieval on TREC collections. The results not only confirm the effectiveness of the proposed topic detection and topic-based retrieval approaches but also demonstrate that query disambiguation does not improve IR as expected.
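A minimal sketch of query-likelihood topic detection in the spirit described above: build a smoothed unigram language model for each candidate Wikipedia topic and pick the topic whose model assigns the query the highest likelihood. The tiny in-memory "articles", the additive smoothing, and the absence of Wikipedia's structural knowledge are all simplifying assumptions.

```python
# Sketch: pick the topic whose smoothed unigram language model best explains the query.
from collections import Counter
import math

def language_model(text, vocab, alpha=0.1):
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: (counts[w] + alpha) / (total + alpha * len(vocab)) for w in vocab}

def detect_topic(query, topic_articles):
    """topic_articles: dict mapping topic name -> representative article text."""
    vocab = {w for text in topic_articles.values() for w in text.lower().split()}
    vocab |= set(query.lower().split())
    best_topic, best_score = None, float("-inf")
    for topic, text in topic_articles.items():
        lm = language_model(text, vocab)
        score = sum(math.log(lm[w]) for w in query.lower().split())   # query log-likelihood
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic

if __name__ == "__main__":
    articles = {"Jaguar (animal)": "the jaguar is a large wild cat an animal native to the americas",
                "Jaguar Cars": "jaguar is a british manufacturer of luxury cars"}
    print(detect_topic("jaguar speed animal", articles))   # -> "Jaguar (animal)"
```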

Integrated Method for Text Detection in Natural Scene Images

  • Zheng, Yang;Liu, Jie;Liu, Heping;Li, Qing;Li, Gen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol.10 No.11 / pp.5583-5604 / 2016
  • In this paper, we present a novel image operator to extract textual information from natural scene images. First, a powerful refiner called the Stroke Color Extension, which extends the widely used Stroke Width Transform by incorporating the color information of strokes, is proposed to achieve significantly enhanced performance on intra-character connection and non-character removal. Second, a character classifier is trained using gradient features; the classifier not only eliminates non-character components but also retains a large number of characters. Third, an effective extractor called the Character Color Transform combines the color information of characters with geometry features and is used to extract potential characters that were not correctly extracted in the previous steps. Fourth, a Convolutional Neural Network model is used to verify text candidates, improving the performance of text detection. The proposed technique is tested on two public datasets, the ICDAR2011 and ICDAR2013 datasets, and the experimental results show that our approach achieves state-of-the-art performance.
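The filtering intuition behind combining stroke width and stroke color can be sketched very simply: genuine character components tend to have a consistent stroke width and a consistent stroke color, so components with high variance on either are rejected. The thresholds and the per-component representation below are illustrative assumptions; the paper's Stroke Color Extension, gradient-feature classifier, and CNN verification are far richer.

```python
# Toy component filter: accept a candidate component only if both its stroke
# widths and its stroke colors are reasonably consistent.
import numpy as np

def is_character_candidate(stroke_widths, stroke_colors,
                           max_width_cv=0.5, max_color_std=30.0):
    """stroke_widths: 1-D array of per-pixel stroke widths (from an SWT-like pass).
    stroke_colors: (N, 3) array of per-pixel RGB values inside the component."""
    widths = np.asarray(stroke_widths, dtype=float)
    colors = np.asarray(stroke_colors, dtype=float)
    width_cv = widths.std() / (widths.mean() + 1e-6)   # coefficient of variation of width
    color_std = colors.std(axis=0).mean()              # average per-channel color spread
    return width_cv <= max_width_cv and color_std <= max_color_std
```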

AUTOMATED HAZARD IDENTIFICATION FRAMEWORK FOR THE PROACTIVE CONSIDERATION OF CONSTRUCTION SAFETY

  • JunHyuk Kwon;Byungil Kim;SangHyun Lee;Hyoungkwan Kim
    • International Conference Proceedings / The 5th International Conference on Construction Engineering and Project Management / pp.60-65 / 2013
  • Introducing the concept of construction safety in the design/engineering phase can improve the efficiency and effectiveness of safety management on construction sites. In this sense, further safety improvements can be made in the design/engineering phase through the development of (1) an automated hazard identification process that depends little on user knowledge, (2) automated construction schedule generation to accommodate hazard information that varies over time, and (3) a visual representation of the results that is easy to understand. In this paper, we formulate an automated hazard identification framework for construction safety by extracting hazard information from related regulations to eliminate human intervention, and by utilizing a visualization technique to enhance users' understanding of hazard information. First, hazard information is automatically extracted from textual safety and health regulations (i.e., Occupational Safety and Health Administration (OSHA) standards) using natural language processing (NLP) techniques, without user interpretation. Next, scheduling and sequencing of the construction activities are automatically generated with respect to the 3D building model. Then, the extracted hazard information is integrated into the geometry data of construction elements in the Industry Foundation Classes (IFC) building model using a conformity-checking algorithm within open-source 3D computer graphics software. Preliminary results demonstrate that this approach is advantageous in that it can be used in the design/engineering phases of construction without manual interpretation by safety experts, facilitating designers' and engineers' proactive consideration of safety.
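A highly simplified sketch of the regulation-mining step: scan regulation sentences for hazard patterns and index them by the construction element or activity they mention, so that they can later be attached to matching IFC elements. The keyword lists, patterns, and the OSHA-style excerpt are illustrative assumptions, not the framework's actual NLP pipeline or conformity-checking algorithm.

```python
# Toy regulation-mining sketch: pair hazard patterns with construction-element
# keywords at the sentence level, producing element -> [hazard sentences].
import re
from collections import defaultdict

HAZARD_PATTERNS = [r"\bfall(?:ing|s)?\b", r"\bguardrails?\b", r"\bcave-ins?\b", r"\belectrocution\b"]
ELEMENT_KEYWORDS = ["roof", "scaffold", "ladder", "excavation", "opening"]

def extract_hazards(regulation_text):
    hazards = defaultdict(list)
    for sentence in re.split(r"(?<=[.;])\s+", regulation_text):
        if any(re.search(p, sentence, re.I) for p in HAZARD_PATTERNS):
            for element in ELEMENT_KEYWORDS:
                if re.search(rf"\b{element}\b", sentence, re.I):
                    hazards[element].append(sentence.strip())
    return dict(hazards)

if __name__ == "__main__":
    sample = ("Guardrails shall be installed on every open-sided roof surface. "
              "Each employee in an excavation shall be protected from cave-ins.")
    print(extract_hazards(sample))
```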


이메일에 포함된 감성정보 관련 메타데이터 추출에 관한 연구 (Recognizing Emotional Content of Emails as a byproduct of Natural Language Processing-based Metadata Extraction)

  • 백우진
    • Journal of the Korean Society for Information Management / Vol.23 No.2 / pp.167-183 / 2006
  • This study applied a natural language processing-based approach to extracting emotion-related metadata from emails. Personalization information was extracted from emails exchanged between investment analysts and their clients. Personalization means creating, growing, and sustaining online relationships by providing content to users in a personally meaningful way. For e-commerce and other online business settings, this study implemented a system that automatically extracts metadata from texts such as emails, discussion-board postings, and chat logs using natural language processing techniques, so that information meaningful to an individual can be selected from large volumes of data and used for personalization services. The value of the work lies in its attempt to automatically extract such emotion-related metadata in settings such as online business, where communication is central and where grasping the intent of exchanged messages and the other party's feelings is important.
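A minimal sketch of lexicon-based emotion-metadata extraction of the kind the abstract describes: match a small emotion lexicon against the email body and emit the result as metadata next to routine fields. The lexicon, field names, and the sample message are illustrative assumptions, not the study's actual system.

```python
# Toy emotion-metadata extractor: count lexicon hits per emotion and record
# the dominant emotion alongside routine email metadata.
import re
from collections import Counter

EMOTION_LEXICON = {
    "positive": {"thanks", "great", "pleased", "glad", "appreciate"},
    "negative": {"worried", "angry", "disappointed", "concerned", "upset"},
}

def extract_email_metadata(sender, subject, body):
    tokens = re.findall(r"[a-z']+", body.lower())
    counts = Counter()
    for emotion, words in EMOTION_LEXICON.items():
        counts[emotion] = sum(1 for t in tokens if t in words)
    dominant = counts.most_common(1)[0][0] if any(counts.values()) else "neutral"
    return {"sender": sender, "subject": subject,
            "emotion_counts": dict(counts), "dominant_emotion": dominant}

if __name__ == "__main__":
    meta = extract_email_metadata(
        "client@example.com", "Portfolio update",
        "I am worried about the recent drop, but I appreciate your quick reply.")
    print(meta)
```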