• Title/Summary/Keyword: word video extraction (단어 영상 추출)

Search Result 65, Processing Time 0.027 seconds

Design and Implementation of Movie Recommendation System Based on User Emotion (사용자 감성 기반 영화 추천 시스템의 설계 및 구현)

  • Byeon, Jaehee;Hong, Jongui;Yang, Janghun;Choi, Yoo-Joo
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.964-965 / 2013
  • In this study, we designed and implemented a movie recommendation system based on the user's emotion information. To this end, words denoting four basic emotions are extracted and classified from movie reviews, added to the metadata of the MovieLens Dataset, and movies are then recommended using collaborative filtering.
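The pipeline described above — emotion words folded into item metadata, then collaborative filtering — can be sketched as a user-based filter. The `ratings` matrix and the `cosine`/`predict` helpers below are hypothetical illustrations, not the authors' implementation; the emotion-tag metadata step is omitted.

```python
from math import sqrt

# Hypothetical ratings matrix: user -> {movie: rating}. In the paper,
# emotion tags extracted from reviews would be appended to each movie's
# metadata before similarities are computed.
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 4},
    "u2": {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "u3": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}

def cosine(a, b):
    """Cosine similarity over the movies two users both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[m] * b[m] for m in common)
    den = (sqrt(sum(a[m] ** 2 for m in common))
           * sqrt(sum(b[m] ** 2 for m in common)))
    return num / den

def predict(user, movie):
    """Similarity-weighted average of the other users' ratings for `movie`."""
    sims = [(cosine(ratings[user], r), r) for u, r in ratings.items()
            if u != user and movie in r]
    total = sum(s for s, _ in sims)
    return sum(s * r[movie] for s, r in sims) / total if total else 0.0
```

For example, `predict("u1", "m4")` leans toward u2's rating of 4, because u1's rating pattern is much closer to u2's than to u3's.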

Implementation of User Recommendation System based on Video Contents Story Analysis and Viewing Pattern Analysis (영상 스토리 분석과 시청 패턴 분석 기반의 추천 시스템 구현)

  • Lee, Hyoun-Sup;Kim, Minyoung;Lee, Ji-Hoon;Kim, Jin-Deog
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1567-1573 / 2020
  • The development of Internet technology has brought the era of one-person media. Individuals produce content on their own and upload it to online services, and many users watch that content on Internet-connected devices. Currently, most users find the content they want through the search functions provided by existing online services. These functions rely on information entered by the user who uploaded the content. In an environment where content must be retrieved from such limited word data, unwanted information appears in users' search results. To solve this problem, this paper presents a system that actively analyzes the videos in an online service and extracts and reflects the characteristics each video holds. The study extracts morphemes from the story content based on a video's voice data and analyzes them with big data technology.
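The transcript-to-tags idea can be sketched crudely as below. This is an invented stand-in: it tokenizes English words by frequency rather than performing the Korean morpheme analysis the paper describes, and `extract_tags` and the stopword list are illustrative only.

```python
import re
from collections import Counter

# Minimal illustrative stopword list, not a linguistic resource.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "it"}

def extract_tags(transcript, top_n=3):
    """Tokenize a speech transcript and keep the most frequent content
    words as searchable tags (a crude stand-in for morpheme analysis)."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(top_n)]
```

The extracted tags would then be indexed alongside the uploader-entered metadata so that search is no longer limited to the latter.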

Design of an Efficient VLSI Architecture and Verification using FPGA-implementation for HMM(Hidden Markov Model)-based Robust and Real-time Lip Reading (HMM(Hidden Markov Model) 기반의 견고한 실시간 립리딩을 위한 효율적인 VLSI 구조 설계 및 FPGA 구현을 이용한 검증)

  • Lee Chi-Geun;Kim Myung-Hun;Lee Sang-Seol;Jung Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.159-167 / 2006
  • Lipreading has been suggested as one of the methods to improve the performance of speech recognition in noisy environments. However, existing methods have been developed and implemented only in software. This paper suggests a hardware design for real-time lipreading. For real-time processing and feasible implementation, we decompose the lipreading system into three parts: an image acquisition module, a feature vector extraction module, and a recognition module. The image acquisition module captures input images using a CMOS image sensor. The feature vector extraction module extracts a feature vector from the input image using a parallel block matching algorithm, which is coded and simulated for an FPGA circuit. The recognition module uses an HMM-based recognition algorithm, which is coded and simulated on a DSP chip. The simulation results show that a real-time lipreading system can be implemented in hardware.

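The block matching step can be illustrated in software as a plain sum-of-absolute-differences (SAD) search; in the FPGA design, the candidate displacements would be evaluated by parallel processing elements instead of a sequential loop. `block_match` and the toy frames are invented for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block_match(ref, frame, bx, by, size, radius):
    """Find the displacement (dx, dy) within +/-radius that minimises the
    SAD between the reference block at (bx, by) and a candidate block in
    `frame`. In hardware each candidate would be scored in parallel."""
    ref_block = [row[bx:bx + size] for row in ref[by:by + size]]
    best = (0, 0, float("inf"))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or y + size > len(frame) or x + size > len(frame[0]):
                continue  # candidate window falls outside the frame
            cand = [row[x:x + size] for row in frame[y:y + size]]
            cost = sad(ref_block, cand)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best[:2]
```

Applied to a lip region, the per-block displacements form the motion feature vector passed to the HMM recognizer.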

Segmentation of Words from the Lines of Unconstrained Handwritten Text using Neural Networks (신경회로망을 이용한 제약 없이 쓰여진 필기체 문자열로부터 단어 분리 방법)

  • Kim, Gyeong-Hwan
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.7 / pp.27-35 / 1999
  • Research on the recognition of handwritten script has been conducted under the assumption that isolated recognition units are provided as inputs. However, in practical recognition system design, providing isolated recognition units is a challenge due to the variety of writing styles. This paper proposes an approach for segmenting words from lines of unconstrained handwritten text without the help of recognition. In contrast to conventional approaches based on physical gaps between connected components, clues that reflect the author's writing style in terms of spacing are extracted and used for segmentation with a simple neural network. The clues come from character segments and include the normalized heights and intervals of the segments. The effectiveness of the proposed approach, compared with conventional connected-component-based approaches in terms of word segmentation performance, was evaluated by experiments.

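The gap-classification idea — a small neural network fed normalized interval and height features, deciding inter-word versus intra-word gaps — can be sketched with a single perceptron. The feature values, `train_perceptron`, and `is_word_gap` are invented toy stand-ins, not the paper's network.

```python
def train_perceptron(samples, epochs=50, lr=0.1):
    """Train one perceptron on (normalized_gap, normalized_height)
    features; label 1 = inter-word gap, 0 = intra-word gap."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def is_word_gap(w, b, feat):
    """Classify a gap feature pair as an inter-word boundary."""
    return w[0] * feat[0] + w[1] * feat[1] + b > 0

# Toy training data: small gaps are intra-word (0), wide gaps inter-word (1).
samples = [((0.1, 0.5), 0), ((0.15, 0.6), 0), ((0.2, 0.4), 0),
           ((0.7, 0.5), 1), ((0.8, 0.6), 1), ((0.9, 0.4), 1)]
w, b = train_perceptron(samples)
```

A line is then split at every gap the classifier marks as an inter-word boundary.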

Robust Feature Extraction Based on Image-based Approach for Visual Speech Recognition (시각 음성인식을 위한 영상 기반 접근방법에 기반한 강인한 시각 특징 파라미터의 추출 방법)

  • Gyu, Song-Min;Pham, Thanh Trung;Min, So-Hee;Kim, Jing-Young;Na, Seung-You;Hwang, Sung-Taek
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.3 / pp.348-355 / 2010
  • In spite of developments in speech recognition technology, speech recognition in noisy environments is still a difficult task. To solve this problem, researchers have proposed methods that use visual information in addition to audio information for visual speech recognition. However, visual information contains visual noise just as audio information contains audio noise, and this visual noise degrades visual speech recognition. How to extract visual feature parameters that enhance visual speech recognition performance is therefore a field of interest. In this paper, we propose a method for visual feature parameter extraction based on an image-based approach to enhance the recognition performance of an HMM-based visual speech recognizer. For the experiments, we constructed an audio-visual database consisting of 105 speakers, each of whom uttered 62 words. We applied histogram matching, lip folding, RASTA filtering, a linear mask, DCT, and PCA. The experimental results show that the recognition performance of our proposed method improved by about 21% over the baseline method.
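The DCT stage of such an image-based feature pipeline can be sketched as below, keeping only the low-frequency coefficients of a lip-region block as a compact feature vector (the histogram matching, RASTA filtering, and PCA stages are omitted). `dct2_lowfreq` is a hypothetical helper, not the authors' code.

```python
from math import cos, pi

def dct2_lowfreq(block, k=4):
    """2-D DCT-II of a square pixel block, returning the top-left k x k
    low-frequency coefficients as a feature vector. In the paper, PCA
    would further reduce this vector."""
    n = len(block)

    def dct1(v):
        # Unnormalized 1-D DCT-II.
        return [sum(v[i] * cos(pi * (i + 0.5) * u / n) for i in range(n))
                for u in range(n)]

    R = [dct1(r) for r in block]                               # rows
    C = [dct1([R[i][u] for i in range(n)]) for u in range(n)]  # columns
    # C[u][v]: horizontal frequency u, vertical frequency v.
    return [C[u][v] for v in range(k) for u in range(k)]
```

A flat (constant-intensity) block yields only a DC term, which is why low-frequency coefficients capture the coarse lip shape while discarding pixel noise.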

Lip Reading Method Using CNN for Utterance Period Detection (발화구간 검출을 위해 학습된 CNN 기반 입 모양 인식 방법)

  • Kim, Yong-Ki;Lim, Jong Gwan;Kim, Mi-Hye
    • Journal of Digital Convergence / v.14 no.8 / pp.233-243 / 2016
  • Due to speech recognition problems in noisy environments, Audio Visual Speech Recognition (AVSR) systems, which combine speech information and visual information, have been proposed since the mid-1990s, and lip reading has played a significant role in AVSR systems. This study aims to enhance the recognition rate of uttered words using only lip shape detection for an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. The utterance period detection results show a 91% success rate, higher than general threshold methods. In lip reading recognition, the user-dependent experiment records an 88.5% recognition rate while the user-independent experiment shows 80.2%, both improved results compared to previous studies.
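The "general threshold methods" the CNN detector is compared against can be sketched on a per-frame mouth-openness signal. `detect_utterance_periods`, the threshold value, and the minimum run length are invented for illustration.

```python
def detect_utterance_periods(openness, threshold=0.3, min_len=3):
    """Mark frames whose mouth-openness exceeds a fixed threshold and
    merge consecutive frames into (start, end) utterance periods,
    dropping runs shorter than min_len frames."""
    periods, start = [], None
    for i, v in enumerate(openness):
        if v > threshold and start is None:
            start = i                      # utterance begins
        elif v <= threshold and start is not None:
            if i - start >= min_len:
                periods.append((start, i - 1))
            start = None                   # utterance ends (or was noise)
    if start is not None and len(openness) - start >= min_len:
        periods.append((start, len(openness) - 1))
    return periods
```

Such a fixed threshold is brittle against lighting and speaker variation, which is the motivation for replacing it with a learned CNN detector.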

A Study on the Semantic Network Structure of the Regime in the Image Contents (영상콘텐츠분야의 정권별 의미연결망 연구)

  • Hwang, Go-Eun;Moon, Shin-Jung
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.3 / pp.217-240 / 2017
  • The purpose of this study was to apply semantic network analysis to understand image contents research and to examine the degree to which words and word clusters contribute to the formation of a semantic map within the field. For this research, a total of 2,624 papers in the image contents field published from 1993 to 2016 were collected. The words appearing in the titles were analyzed as a social network using the R language on this big data. The results were as follows. First, the field of image contents is based on research related to 'image', 'media', and 'contents'. Second, there is a three-step flow ('education' -> 'media' -> 'contents') of research in the field. Third, research related to 'broadcasting', 'digital', 'technology', and 'production' has been carried out continuously. Finally, there were new research subjects for each regime.
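The core step before any network analysis of titles is building weighted word co-occurrence edges, which can be sketched as below. `cooccurrence_edges` is an invented illustration; the study itself performed the analysis in R.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(titles, min_count=2):
    """Build weighted edges of a word co-occurrence network from paper
    titles: two words are linked when they appear in the same title,
    keeping edges observed at least min_count times."""
    edges = Counter()
    for title in titles:
        words = sorted(set(title.lower().split()))  # dedupe within a title
        for a, b in combinations(words, 2):
            edges[(a, b)] += 1
    return {e: c for e, c in edges.items() if c >= min_count}
```

Centrality and clustering measures computed on the resulting graph then reveal which terms anchor the field in each period.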

Design and Implementation of a Real-Time Lipreading System Using PCA & HMM (PCA와 HMM을 이용한 실시간 립리딩 시스템의 설계 및 구현)

  • Lee Chi-geun;Lee Eun-suk;Jung Sung-tae;Lee Sang-seol
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1597-1609 / 2004
  • Many lipreading systems have been proposed to compensate for the drop in speech recognition rates in noisy environments. Previous lipreading systems work only under specific conditions such as artificial lighting and a predefined background color. In this paper, we propose a real-time lipreading system that allows the speaker to move and relaxes the restrictions on color and lighting conditions. The proposed system extracts the face and lip region and the essential visual information from an input video sequence captured with a common PC camera, and recognizes uttered words from that visual information in real time. It uses a hue histogram model to extract the face and lip region, the mean shift algorithm to track the face of a moving speaker, PCA (Principal Component Analysis) to extract the visual information for learning and testing, and an HMM (Hidden Markov Model) as the recognition algorithm. The experimental results show that our system achieves a recognition rate of 90% for speaker-dependent lipreading and increases the speech recognition rate by 40~85%, depending on the noise level, when combined with audio speech recognition.

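The PCA step that compresses lip-region pixels into a low-dimensional visual feature can be sketched with power iteration on the covariance matrix. `pca_first_component` is a minimal stand-in for illustration, not the authors' implementation, and it recovers only the first principal component.

```python
def pca_first_component(samples, iters=100):
    """Estimate the first principal component of the samples (rows of
    equal-length feature vectors) by power iteration on the covariance
    matrix, applied without ever forming the matrix explicitly."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    x = [[s[j] - mean[j] for j in range(d)] for s in samples]  # centre data
    v = [1.0] * d
    for _ in range(iters):
        xv = [sum(r[j] * v[j] for j in range(d)) for r in x]   # X v
        v = [sum(x[i][j] * xv[i] for i in range(n)) / n for j in range(d)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]                              # renormalise
    return v
```

Projecting each lip image onto the leading components gives the short feature vector sequence that is fed to the HMM.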

Affective Effect of Video Playback Style and its Assessment Tool Development (영상의 재생 스타일에 따른 감성적 효과와 감성 평가 도구의 개발)

  • Jeong, Kyeong Ah;Suk, Hyeon-Jeong
    • Science of Emotion and Sensibility / v.19 no.3 / pp.103-120 / 2016
  • This study investigated how video playback styles affect viewers' emotional responses to a video and then suggested an emotion assessment tool for playback-edited videos. The study involved two in-lab experiments. In the first experiment, observers were asked to express their feelings while watching videos in the original playback and an articulated playback simultaneously. By controlling speed, direction, and continuity, a total of twelve playback styles were created. Each of the twelve playback styles was applied to five kinds of original videos containing happy, angry, sad, relaxed, and neutral emotion. Thirty college students participated, and more than 3,800 words were collected. The collected words comprised 899 kinds of emotion terms, which were classified into 52 emotion categories. The second experiment was conducted to develop a proper emotion assessment tool for playback-edited video. A total of 38 emotion terms extracted from the 899 emotion terms of the first experiment were used as scales (given in Korean and scored on a 5-point Likert scale) to assess the affective quality of pre-made video materials. Eleven pre-made commercial videos applying different playback styles were collected and transformed to their initial (un-edited) condition, and participants evaluated the pre-made videos against the initial-condition videos simultaneously. Thirty college students evaluated the playback-edited videos in the second study. Based on the judgments, four factors were extracted through factor analysis and labelled "Happy", "Sad", "Reflective", and "Weird (funny and at the same time weird)". Unlike conventional emotion frameworks, the positivity and negativity of the valence dimension were treated independently, while the arousal aspect was only marginally recognized. With the four factors from the second experiment, an emotion assessment tool for playback-edited video was finally proposed, and its practical value and application were also discussed.

Extracting curved text lines using the chain composition and the expanded grouping method (체인 정합과 확장된 그룹핑 방법을 사용한 곡선형 텍스트 라인 추출)

  • Bai, Nguyen Noi;Yoon, Jin-Seon;Song, Young-Jun;Kim, Nam;Kim, Yong-Gi
    • The KIPS Transactions:PartB / v.14B no.6 / pp.453-460 / 2007
  • In this paper, we present a method to extract text lines from poorly structured documents. The text lines may have different orientations, considerably curved shapes, and possibly a few wide inter-word gaps within a line. Such text lines can be found in posters, address blocks, and artistic documents. Our method is based on traditional perceptual grouping, but we develop novel solutions to overcome the problems of insufficient seed points and varied orientations within a single line. We assume that text lines consist of connected components, where each connected component is a set of black pixels belonging to a letter or several touching letters. In our scheme, connected components closer than an iteratively incremented threshold join together into a chain. Elongated chains are identified as the seed chains of lines. The seed chains are then extended to the left and to the right according to the local orientations, which are reevaluated at each side of a chain as it is extended. Through this process, all text lines are finally constructed. In our experiments the proposed method extracted considerably curved text lines from logos and slogans well: 98% for straight-line extraction and 94% for curved-line extraction.
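The seed-chain construction — joining connected components under an iteratively raised distance threshold — can be sketched as below on component centres sorted left to right. `group_into_chains` and its threshold values are invented for illustration; the paper's orientation-aware chain extension is omitted.

```python
from math import hypot

def dist(a, b):
    """Euclidean distance between two component centres (x, y)."""
    return hypot(a[0] - b[0], a[1] - b[1])

def group_into_chains(centers, start_t=1.0, step=1.0, max_t=5.0):
    """Group component centres into chains by iteratively raising the
    join-distance threshold: tightly spaced components merge first, then
    progressively wider gaps up to max_t are bridged."""
    chains = [[c] for c in sorted(centers)]
    t = start_t
    while t <= max_t:
        chains.sort(key=lambda ch: ch[0])
        merged = [chains[0]]
        for ch in chains[1:]:
            if dist(merged[-1][-1], ch[0]) <= t:
                merged[-1] = merged[-1] + ch  # bridge the gap at this threshold
            else:
                merged.append(ch)
        chains = merged
        t += step
    return chains
```

Gaps wider than `max_t` (such as the space between two separate lines or slogans) are never bridged, so each surviving chain corresponds to one candidate text line.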