• Title/Summary/Keyword: Video abstraction

Search Results: 24

A Scheme for News Videos based on MPEG-7 and Its Summarization Mechanism by using the Key-Frames of Selected Shot Types (MPEG-7을 기반으로 한 뉴스 동영상 스키마 및 샷 종류별 키프레임을 이용한 요약 생성 방법)

  • Jeong, Jin-Guk;Sim, Jin-Sun;Nang, Jong-Ho;Kim, Gyung-Su;Ha, Myung-Hwan;Jung, Byung-Heei
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.8 no.5
    • /
    • pp.530-539
    • /
    • 2002
  • Recently, there has been a great deal of research on archive systems for news videos, which usually have a fixed structure. However, since the meta-data representation and storage schemes for news video differ between the previously proposed archive systems, it is very hard to exchange these meta-data. This paper proposes a scheme for news video based on MPEG-7 MDS, an international standard for representing the contents of multimedia, together with a summarization mechanism that reflects the characteristics of the shots in news videos. The proposed scheme uses MPEG-7 MDS constructs such as VideoSegment and TextAnnotation to keep the original structure of the news video, and the proposed summarization mechanism uses a slide-show-style presentation of key frames with associated audio to reduce the data size of the summary video.
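The MPEG-7 MDS description outlined in the abstract can be sketched as a minimal XML document. The element names VideoSegment, TextAnnotation, and FreeTextAnnotation come from MPEG-7 MDS, but the structure below is a simplified illustration without namespaces, not schema-valid MPEG-7:

```python
import xml.etree.ElementTree as ET

def build_news_description(shots):
    """Build a minimal MPEG-7-style description of a news video.

    `shots` is a list of (start_sec, duration_sec, annotation) tuples;
    the time encoding here is a plain-seconds simplification.
    """
    root = ET.Element("Mpeg7")
    video = ET.SubElement(root, "VideoSegment")  # whole-programme segment
    for start, dur, text in shots:
        seg = ET.SubElement(video, "VideoSegment")  # one shot
        time = ET.SubElement(seg, "MediaTime")
        ET.SubElement(time, "MediaTimePoint").text = str(start)
        ET.SubElement(time, "MediaDuration").text = str(dur)
        ann = ET.SubElement(seg, "TextAnnotation")
        ET.SubElement(ann, "FreeTextAnnotation").text = text
    return ET.tostring(root, encoding="unicode")

doc = build_news_description([(0, 12, "anchor intro"), (12, 45, "field report")])
```

A summary player could then render only the key frame of each annotated VideoSegment as a slide show while playing the segment's audio, as the abstract describes.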

A Study on Flexible Attribute Tree and Partial Result Matrix for Content-based Retrieval and Browsing of Video Data (비디오 데이터의 내용 기반 검색과 브라우징을 위한 유동 속성 트리 및 부분 결과 행렬의 이용 방법 연구)

  • 성인용;이원석
    • Journal of Korea Multimedia Society
    • /
    • v.3 no.1
    • /
    • pp.1-13
    • /
    • 2000
  • While various types of information can be mixed in a continuous video stream without any clear boundary, the meaning of a video scene can be interpreted at multiple levels of abstraction, and its description can vary among different users. Therefore, for content-based retrieval in video data, it is important for a user to be able to describe a scene flexibly while the descriptions given by different users are maintained consistently. This paper proposes an effective way to represent the different types of video information in conventional database models such as the relational and object-oriented models. Flexibly defined attributes and their values are organized as tree-structured dictionaries, while the description of video data is stored in a fixed database schema. We also introduce several browsing methods to assist a user. The dictionary browser simplifies the annotation process as well as the querying process, while the result browser helps a user analyze the results of a query in terms of various combinations of query conditions.
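The tree-structured attribute dictionary described above can be sketched with nested dictionaries; the attribute names below are illustrative examples, not the paper's actual schema:

```python
# Hypothetical flexible attribute tree: each node is an attribute or value,
# and deeper nodes refine shallower ones.
attribute_tree = {
    "scene": {
        "location": {"indoor": {}, "outdoor": {"street": {}, "park": {}}},
        "action": {"dialogue": {}, "chase": {}},
    }
}

def paths(tree, prefix=()):
    """Enumerate every attribute path in the dictionary tree."""
    for key, sub in tree.items():
        yield prefix + (key,)
        yield from paths(sub, prefix + (key,))

all_paths = sorted(paths(attribute_tree))
```

Annotations stored in a fixed schema would then reference such paths, so new attributes can be added to the dictionary without altering the database schema itself.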


Video Segmentation Method using Improved Adaptive Threshold Algorithm and Post-processing (개선된 적응적 임계값 결정 알고리즘과 후처리 기법을 적용한 동영상 분할 방법)

  • Won, In-Su;Lee, Jun-Woo;Lim, Dae-Kyu;Jeong, Dong-Seok
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.5
    • /
    • pp.663-673
    • /
    • 2010
  • Video segmentation, a core tool for video management, divides a video hierarchically and structurally, and can be applied to many tasks such as indexing, abstraction, and retrieval. Conventional video segmentation splits video with an adaptive threshold, comparing the difference between consecutive frames against a threshold computed over a fixed-size window. In this case, if the time between cuts is shorter than the window, or neighboring feature values vary widely, accurate detection is impossible. In this paper, an improved adaptive threshold algorithm is proposed to solve these problems; it determines the window size according to the video format and reacts sensitively to changes in neighboring features. A post-processing method is also applied to reduce errors caused by camera flashes and fast movement of large objects. Evaluation showed a 3.7% improvement in detection performance over the conventional method, and 95.5% reproducibility when the method was applied to modified videos. Therefore, the proposed method is more accurate than the conventional one and, since it remains reproducible under various video modifications, is applicable in many areas as a video management tool.
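The sliding-window adaptive threshold underlying this family of methods can be sketched as follows. The window size, the multiplier `alpha`, and the frame-difference values are illustrative assumptions, not the paper's actual parameters:

```python
def detect_cuts(frame_diffs, window=3, alpha=2.0):
    """Adaptive-threshold cut detection sketch.

    A frame is declared a cut when its difference value exceeds `alpha`
    times the mean difference of its neighbours inside a sliding window
    and is the local maximum of that window.
    """
    cuts = []
    for i, d in enumerate(frame_diffs):
        lo = max(0, i - window)
        hi = min(len(frame_diffs), i + window + 1)
        neighbours = frame_diffs[lo:i] + frame_diffs[i + 1:hi]
        mean = sum(neighbours) / len(neighbours)
        if d > alpha * mean and d == max(frame_diffs[lo:hi]):
            cuts.append(i)
    return cuts

cuts = detect_cuts([1, 1, 2, 9, 1, 1, 1, 8, 1, 1])  # spikes at indices 3 and 7
```

The failure mode the abstract points out is visible here: with a fixed window larger than the gap between two cuts, both spikes fall in one window and the second can be suppressed, which motivates adapting the window size.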

A Methodology for Extraction and Retrieval of Real-time Knowledge from Video Surveillance Systems by Incremental Abstraction of Trajectory and Relation Patterns (비디오 감시 시스템으로부터 객체 동선과 관계 패턴의 점진적 추상화에 의한 실시간 지식의 추출 및 복원 방법론)

  • Kim, Se-Jong;Kim, Tae-Ho;Lee, Moon-Kun
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.10c
    • /
    • pp.307-312
    • /
    • 2006
  • As multimedia grows in importance, each field of computer science is building practical applications and systems using its own techniques. However, because the individual movements of objects in multimedia video are represented and processed only as numeric values, interpreting their meaning is unnatural and it is difficult to detect behavior that matches exact numbers. This paper discusses a method that extracts basic behaviors from multimedia video and abstracts and formalizes them, guiding access toward higher levels so that multimedia data becomes easier to access.
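The incremental abstraction of trajectories named in the title can be sketched as quantizing a numeric path into symbols and then collapsing runs into a coarser level. The symbol set and the two abstraction levels below are illustrative assumptions, not the paper's actual notation:

```python
def abstract_trajectory(points):
    """Abstract a numeric (x, y) trajectory into coarse symbolic moves.

    Level 1: one direction symbol per step (E/W/N/S, or STAY).
    Level 2: runs of the same symbol collapsed into a single symbol.
    """
    symbols = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            symbols.append("STAY" if dx == 0 and dy == 0 else ("E" if dx > 0 else "W"))
        else:
            symbols.append("N" if dy > 0 else "S")
    collapsed = [s for i, s in enumerate(symbols) if i == 0 or s != symbols[i - 1]]
    return symbols, collapsed

syms, coarse = abstract_trajectory([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)])
```

Querying the symbolic levels ("moved east, then north") avoids matching against exact coordinates, which is the interpretability problem the abstract raises.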


Automatic Detection of Anchorperson Shots for News Video Abstraction (뉴스 동영상 요약을 위한 앵커 장면 자동 추출 알고리즘)

  • 정진국;이태연;낭종호;김경수;하명환;정병희
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.10b
    • /
    • pp.274-276
    • /
    • 2001
  • As the need for convenient retrieval and management of the large news video collections now in common use has grown, techniques are required that automatically analyze news video data and extract high-level content information from low-level information. Such techniques are especially useful for news summarization. Anchor, graphics, interview, field-report, and conference/speech shots can all serve as high-level content information in news video; among them, anchor shots carry particular importance as the high-level information that divides a news program into articles. This paper proposes a method for detecting anchor shots automatically. Detection exploits two common characteristics of anchor shots: first, the anchor hosting a given news program is the same person throughout; second, the shots are taken in the same studio. To identify the anchor, we use face detection and clothing-color histogram comparison. Applying the proposed algorithm to several KBS 9 o'clock news videos yielded both recall and precision above 96%.
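The face-plus-clothing-histogram test described in the abstract can be sketched as below; the histogram intersection measure, the threshold, and the reference histogram are illustrative assumptions, not the paper's actual values:

```python
def histogram_intersection(h1, h2):
    """Normalized histogram intersection in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / max(sum(h1), 1)

def is_anchor_shot(face_found, clothing_hist, anchor_hist, threshold=0.8):
    """A shot is taken as an anchor shot when a face is detected and the
    clothing-colour histogram matches the anchor's reference histogram
    (threshold illustrative)."""
    return face_found and histogram_intersection(clothing_hist, anchor_hist) >= threshold

# Hypothetical 4-bin clothing-colour histogram of the anchor.
ref = [10, 5, 0, 5]
```

Because the anchor and studio stay constant within one program, the reference histogram can be taken from the first detected anchor shot and reused, which is what makes this simple comparison viable.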


A Replay Shot Detection Algorithm for the Soccer Video Abstraction (축구 동영상 요약을 위한 재연 장면 자동 추출 알고리즘)

  • 정진국;김주영;낭종호;김경수;하명환;정병희
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.10b
    • /
    • pp.277-279
    • /
    • 2001
  • With the recent rapid growth of digital video data, techniques are needed that automatically extract high-level content information from low-level information. In a domain such as soccer, high-level content information about goals, free kicks, fouls, and similar events carries particular importance; the scenes judged most important are shown to viewers again as replays, so a soccer video summary must include them. This paper proposes a method for automatically detecting replay shots in soccer video. It relies on two soccer-specific characteristics: first, the length of a replay shot is neither very short nor very long; second, because a replay is played back slowly, its motion characteristics differ from those of ordinary shots. To distinguish object motion, replay shots are divided into two types: zoomed-in replays and zoomed-out replays. Experiments applying the proposed algorithm showed recall and precision both above 77%.
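The two shot-level cues in the abstract, shot length and slowed-down motion, suggest a simple filter; the thresholds and motion values below are illustrative assumptions, not the paper's actual parameters:

```python
def find_replay_shots(shots, min_len=2.0, max_len=12.0, slow_motion=0.5):
    """Replay-shot filter sketch.

    Each shot is (length_sec, motion_magnitude). Replays are neither very
    short nor very long, and their slowed-down playback gives a lower
    motion magnitude than ordinary live play.
    """
    return [i for i, (length, motion) in enumerate(shots)
            if min_len <= length <= max_len and motion <= slow_motion]

# Hypothetical shot list: (length in seconds, mean motion magnitude).
replays = find_replay_shots([(1.0, 0.9), (6.0, 0.3), (30.0, 0.2), (8.0, 0.4)])
```

A fuller implementation would additionally classify each surviving shot as a zoomed-in or zoomed-out replay, as the abstract describes, before adding it to the summary.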


TVML (TV program Making Language) - Automatic TV Program Generation from Text-based Script -

  • Masaki-HAYASHI;Hirotada-UEDA;Tsuneya-KURIHARA;Michiaki-YASUMURA
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 1999.06a
    • /
    • pp.151-158
    • /
    • 1999
  • This paper describes TVML (TV program Making Language) for automatically generating television programs from a text-based script. This language describes the contents of a television program using expressions with a high level of abstraction, such as "title #1" and "zoom-in". The software that reads a script written in TVML and automatically generates the program video and audio is called the TVML Player. The paper begins by describing the TVML language specification and the TVML Player. It then describes the "external control mode" of the TVML Player, which can be used to apply TVML to interactive applications. Finally, it describes the TVML Editor, a user interface we developed that enables users with no specialized knowledge of computer languages to write TVML scripts. In addition to its role as a television-program production tool, TVML is expected to have a wide range of applications in the network and multimedia fields.
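The player architecture, reading high-abstraction commands and emitting audiovisual events, can be sketched as a toy interpreter. The command names below are hypothetical illustrations inspired by the "title #1" and "zoom-in" expressions quoted in the abstract, not real TVML syntax:

```python
def play_script(script):
    """Toy interpreter for a TVML-like text script: each line is a
    high-level command plus an argument, mapped to a production event."""
    handlers = {
        "title": lambda arg: f"show title card {arg}",
        "camera": lambda arg: f"camera action: {arg}",
        "say": lambda arg: f"synthesized speech: {arg}",
    }
    events = []
    for line in script.strip().splitlines():
        cmd, _, arg = line.strip().partition(" ")
        events.append(handlers.get(cmd, lambda a: f"unknown command: {a}")(arg))
    return events

events = play_script("""
title #1
camera zoom-in
say Good evening, here is the news.
""")
```

The "external control mode" mentioned in the abstract would correspond to feeding such commands one at a time from another program instead of from a fixed script, which is what makes interactive applications possible.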

Automatic Extraction Techniques of Topic-relevant Visual Shots Using Realtime Brainwave Responses (실시간 뇌파반응을 이용한 주제관련 영상물 쇼트 자동추출기법 개발연구)

  • Kim, Yong Ho;Kim, Hyun Hee
    • Journal of Korea Multimedia Society
    • /
    • v.19 no.8
    • /
    • pp.1260-1274
    • /
    • 2016
  • To obtain good summarization algorithms, we first need to understand how people summarize videos. 'Semantic gap' refers to the gap between the semantics implied in video summarization algorithms and what people actually infer from watching videos. We hypothesized that ERP responses to real-time videos would show either N400 effects for topic-irrelevant shots in the 300∼500 ms range after stimulus onset or P600 effects for topic-relevant shots in the 500∼700 ms range. We recruited 32 participants for the EEG experiment, asking them to focus on the topic of short videos and to memorize the shots relevant to the topic of each video. After analysing the real-time videos based on the participants' rating information, t-tests showed N400 effects at the PF1, F7, F3, C3, Cz, T7, and FT7 positions on the left and central hemisphere, and P600 effects at PF1, C3, Cz, and FCz on the left and central hemisphere and at C4, FC4, P8, and TP8 on the right. A further 3-way MANOVA with repeated measures of topic relevance, hemisphere, and electrode position showed significant interaction effects, implying that the left hemisphere at central, frontal, and pre-frontal positions was sensitive in detecting topic-relevant shots while watching real-time videos.
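The hypothesis above compares mean amplitudes inside the 300∼500 ms and 500∼700 ms post-onset windows. A minimal sketch of that windowing, with a hypothetical single-channel epoch rather than real EEG data:

```python
def window_mean(samples, sr_hz, t0_ms, t1_ms):
    """Mean amplitude of `samples` (stimulus onset at index 0) inside the
    [t0_ms, t1_ms) window, for a sampling rate of `sr_hz`."""
    i0 = int(t0_ms / 1000 * sr_hz)
    i1 = int(t1_ms / 1000 * sr_hz)
    win = samples[i0:i1]
    return sum(win) / len(win)

# Hypothetical 1 kHz epoch: an N400-like negativity at 300-500 ms followed
# by a P600-like positivity at 500-700 ms (values are illustrative).
epoch = [0.0] * 300 + [-4.0] * 200 + [2.0] * 200
n400 = window_mean(epoch, 1000, 300, 500)
p600 = window_mean(epoch, 1000, 500, 700)
```

In the study itself these per-window means, taken per electrode and per condition, are what feed the t-tests and the 3-way MANOVA.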

Improvement of Character-net via Detection of Conversation Participant (대화 참여자 결정을 통한 Character-net의 개선)

  • Kim, Won-Taek;Park, Seung-Bo;Jo, Geun-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.10
    • /
    • pp.241-249
    • /
    • 2009
  • Recently, a number of studies on video annotation and representation have been proposed to analyze video for searching and abstraction. In this paper, we present a method that identifies the picture elements of conversational participants in video, together with an enhanced representation of the characters using those elements, collectively called Character-net. Because conversational participants were previously decided as the characters detected during a script's holding time, the previous Character-net suffers from a serious limitation: some listeners cannot be detected as participants. The participants who complete the story in a video are a very important factor in understanding the context of a conversation. The picture elements for detecting conversational participants consist of six elements: subtitle, scene, order of appearance, characters' eyes, patterns, and lip motion. In this paper, we present how to use these elements to detect conversational participants and how to improve the representation of the Character-net. The conversational participants can be detected accurately when the proposed elements are combined and satisfy the specified conditions. The experimental evaluation shows that the proposed method brings significant advantages both in improving the detection of conversational participants and in enhancing the representation of the Character-net.
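Combining the six picture elements into a participant decision can be sketched as follows; the required-element rule and the vote threshold are illustrative assumptions, not the paper's actual conditions:

```python
# The six picture elements named in the abstract.
ELEMENTS = ("subtitle", "scene", "appearance_order", "eyes", "pattern", "lip_motion")

def is_participant(evidence, required=("scene",), min_hits=3):
    """`evidence` maps element name -> bool. A character counts as a
    conversational participant when all required elements hold and
    enough elements agree overall (rule and threshold illustrative)."""
    if not all(evidence.get(e, False) for e in required):
        return False
    return sum(bool(evidence.get(e, False)) for e in ELEMENTS) >= min_hits

# A silent listener: no subtitle or lip motion, but present in the scene,
# making eye contact, and appearing in order.
listener = {"scene": True, "eyes": True, "lip_motion": False,
            "subtitle": False, "appearance_order": True, "pattern": False}
```

Note how this addresses the stated limitation: a listener with no subtitle and no lip motion can still be accepted on the strength of the other elements, which a subtitle-timing rule alone would miss.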

Abstraction of players action in tennis games over various platform (플랫폼에 따른 테니스 게임 플레이어 액션의 추상화 연구)

  • Chung, Don-Uk
    • Journal of Digital Contents Society
    • /
    • v.16 no.4
    • /
    • pp.635-643
    • /
    • 2015
  • This study conducted a case study across various platforms, centered on a tennis game, to examine what forms the movements of a game player take when they are abstracted in the game. In particular, it summarized the forms of player experience that can be attained from the abstracted tennis actions into four types: movement, swing, direction & intensity, and skill; and observed and schematized them in early video games, console games, mobile games, gesture-recognition games, and wearable games. In conclusion, the development of technology offers players a richer experience; for example, the change of platform turned simple button-pressing games into games played by actually swinging. Furthermore, the study found consistency in context even though slight differences in action were observed across interfaces.
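The four abstracted action types and their platform-specific inputs can be sketched as a simple mapping; the input descriptions below are illustrative examples, not the paper's actual schematization:

```python
# The four action types named in the abstract.
ACTION_TYPES = ("movement", "swing", "direction_intensity", "skill")

# Hypothetical mapping from platform to the raw inputs that express
# each abstracted action on that platform.
PLATFORM_INPUTS = {
    "early_video_game": {"movement": "d-pad", "swing": "button press"},
    "gesture_recognition": {"movement": "body position", "swing": "arm swing"},
    "wearable": {"swing": "wrist motion", "direction_intensity": "swing speed"},
}

def supported_actions(platform):
    """Action types a platform's interface can express, in canonical order."""
    return [a for a in ACTION_TYPES if a in PLATFORM_INPUTS.get(platform, {})]

acts = supported_actions("wearable")
```

The paper's finding of contextual consistency corresponds here to the action types staying fixed while only the input column changes from platform to platform.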