• Title/Summary/Keyword: word image extraction (단어 영상 추출)


A Study on Lip-reading Enhancement Using Time-domain Filter (시간영역 필터를 이용한 립리딩 성능향상에 관한 연구)

  • 신도성;김진영;최승호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.5
    • /
    • pp.375-382
    • /
    • 2003
  • Bimodal lip-reading aims to improve speech recognition rates in noisy environments, and detecting the lip image correctly is its most important step. Stable performance is hard to achieve in dynamic environments, however, because many factors degrade lip-reading: illumination changes, the speaker's pronunciation habits, the variability of lip shape, and rotation or size changes of the lips. In this paper, we propose IIR filtering in the time domain to stabilize performance. Digital filtering in the time domain is well suited to removing noise from speech and improving recognition performance. Because lip-reading on the whole lip image produces a massive amount of data, Principal Component Analysis (PCA) is applied as a preprocessing step to reduce the data quantity by extracting features without loss of image information. To evaluate speech recognition using image information alone, we ran recognition experiments on 22 words chosen from available car services, using a Hidden Markov Model as the recognition algorithm. As a result, the recognition rate of lip-reading using PCA was 64%, while applying the time-domain filter raised it to 72.4%.
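
The time-domain filtering idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the filter order or coefficients, so the first-order low-pass form, the `alpha` value, the function name `iir_smooth`, and the sample feature values below are all assumptions made for illustration, not the authors' design.

```python
from typing import List, Sequence


def iir_smooth(frames: Sequence[Sequence[float]], alpha: float = 0.5) -> List[List[float]]:
    """Apply a first-order IIR low-pass filter to each feature over time.

    y[t] = alpha * x[t] + (1 - alpha) * y[t-1], with y[0] = x[0].
    The filter form and alpha are illustrative assumptions.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[List[float]] = []
    for t, frame in enumerate(frames):
        if t == 0:
            # The first frame initializes the filter state.
            smoothed.append([float(v) for v in frame])
        else:
            prev = smoothed[-1]
            smoothed.append([alpha * x + (1.0 - alpha) * p for x, p in zip(frame, prev)])
    return smoothed


# Hypothetical 2-D PCA coefficients for five video frames; the third frame has a spike.
features = [[1.0, 0.0], [1.0, 0.0], [5.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
filtered = iir_smooth(features, alpha=0.5)
```

Each output frame depends on the previous output, which is what makes the filter recursive (IIR); a sudden one-frame spike in a feature trajectory is attenuated rather than passed straight to the recognizer.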

Influencer Attribute Analysis based Recommendation System (인플루언서 속성 분석 기반 추천 시스템)

  • Park, JeongReun;Park, Jiwon;Kim, Minwoo;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.11
    • /
    • pp.1321-1329
    • /
    • 2019
  • With the development of social information networks, marketing methods are also changing in various ways. Unlike earlier successful marketing methods based on existing celebrities and financial support, influencer-based marketing has become a major and widely recognized trend. In this paper, we first extract influencer features from more than 54 YouTube channels using a multi-dimensional qualitative analysis of YouTube meta information and comment data, and model representative themes to maximize personalized video satisfaction. In addition, this study aims to provide a supplementary means for successful promotion and marketing by creating and distributing videos of new items with reference to existing influencer features. To this end, we treat all comments on the videos of each channel as a single document and apply the TF-IDF (Term Frequency-Inverse Document Frequency) and LDA (Latent Dirichlet Allocation) algorithms to maximize the performance of the proposed scheme. The performance evaluation shows that the proposed scheme outperforms other schemes.
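
As a rough illustration of the "one channel's comments as one document" setup described above, the following sketch computes plain TF-IDF weights. It covers only the TF-IDF step (not LDA), uses the textbook formula tf × log(N / df), and the channel names and tokens are invented for the example; none of this is taken from the paper's implementation.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Return a TF-IDF weight for every term of every document.

    tf = term count / document length, idf = log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for tokens in documents.values():
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    weights: Dict[str, Dict[str, float]] = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical channels; each value stands for all comments of a channel, tokenized.
channels = {
    "cooking": "great recipe love this recipe".split(),
    "gaming": "great gameplay love this boss fight".split(),
}
weights = tf_idf(channels)
top_cooking_term = max(weights["cooking"], key=weights["cooking"].get)
```

Terms that appear in every channel, such as "great" here, receive a weight of zero, so the highest-weighted terms are those that distinguish one channel from the others.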

Image Compression Using DCT Map FSVQ and Single-side Distribution Huffman Tree (DCT 맵 FSVQ와 단방향 분포 허프만 트리를 이용한 영상 압축)

  • Cho, Seong-Hwan
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.10
    • /
    • pp.2615-2628
    • /
    • 1997
  • In this paper, a new codebook design algorithm is proposed. When a vector quantizer is designed for image transmission, it uses a DCT map based on the two-dimensional discrete cosine transform (2D DCT) together with a finite state vector quantizer (FSVQ). We build the map by dividing the input image according to edge quantity, and then use the map to extract the significant features of the training image with the 2D DCT. A master codebook of the FSVQ is generated by partitioning the training set with a tree-structured binary partition. The state codebook is constructed from the master codebook, and the index of the input image is then searched in the state codebook rather than the master codebook. Because index coding is an important part of high-speed digital transmission, fixed-length codes are converted to variable-length codes according to an entropy coding rule. Huffman coding assigns transmission codes to the codes of the codebook. This paper proposes a single-side growing Huffman tree to speed up the Huffman code generation process. Compared with the pairwise nearest neighbor (PNN) and classified VQ (CVQ) algorithms on the Einstein and Bridge images, the new algorithm shows better picture quality, with differences of 2.04 dB and 2.48 dB over PNN and 1.75 dB and 0.99 dB over CVQ, respectively.
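
The step of converting fixed-length codebook indices to variable-length codes can be sketched with conventional Huffman coding. This is the standard heap-based construction, not the single-side growing tree the paper proposes, and the index counts are made-up values used only to show the effect.

```python
import heapq
from itertools import count
from typing import Dict


def huffman_codes(freqs: Dict[int, int]) -> Dict[int, str]:
    """Build a standard Huffman code table from symbol frequencies."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single symbol still needs one bit.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes with 0 and 1.
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical usage counts of six codebook indices.
index_counts = {0: 45, 1: 13, 2: 12, 3: 16, 4: 9, 5: 5}
codes = huffman_codes(index_counts)
total_bits = sum(index_counts[s] * len(codes[s]) for s in index_counts)
```

With these counts a fixed-length code needs 3 bits per index (300 bits for 100 indices), while the Huffman table assigns the most frequent index a 1-bit code and needs fewer bits overall.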


A Study on the Feature Point Extraction Methodology based on XML for Searching Hidden Vault Anti-Forensics Apps (은닉형 Vault 안티포렌식 앱 탐색을 위한 XML 기반 특징점 추출 방법론 연구)

  • Kim, Dae-gyu;Kim, Chang-soo
    • Journal of Internet Computing and Services
    • /
    • v.23 no.2
    • /
    • pp.61-70
    • /
    • 2022
  • General smartphone users often use Vault apps to protect personal information such as their photos and videos. However, cases of criminals using Vault app functions for anti-forensic purposes, to hide illegal videos, are increasing. These apps are among those registered on Google Play. This paper proposes a methodology for extracting feature points through XML-based keyword frequency analysis in order to search for Vault apps used by criminals, applying text mining techniques to extract the feature points. XML syntax was compared and analyzed using the strings.xml files included in the apps, for 15 hidden Vault anti-forensics apps and for non-hidden Vault apps, respectively. In hidden Vault anti-forensics apps, more hidden-related words are found, and at a higher frequency, in the first and second rounds of terminology processing. Unlike most conventional methods, which statically analyze APK files from an engineering point of view, this paper is meaningful in that it takes a humanities and sociological point of view to find features for classifying anti-forensics apps. In conclusion, applying text mining techniques through XML parsing can provide basic data for searching for hidden Vault anti-forensics apps.
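
A keyword frequency count over a `strings.xml` file, as described above, might look like the following sketch. The sample XML content, the tokenization rule, and the function name `keyword_frequencies` are assumptions for illustration; the abstract does not describe the paper's actual term-processing rounds.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical strings.xml content, not taken from any real app.
SAMPLE_STRINGS_XML = """<resources>
    <string name="app_name">Calculator</string>
    <string name="hint_unlock">Enter PIN to unlock hidden vault</string>
    <string name="msg_hide">Hide photos and videos in the hidden vault</string>
</resources>
"""


def keyword_frequencies(strings_xml: str) -> Counter:
    """Count lowercase alphabetic words in all <string> resource values."""
    root = ET.fromstring(strings_xml)
    counts: Counter = Counter()
    for node in root.iter("string"):
        # itertext() also collects text nested inside inline markup.
        text = "".join(node.itertext()).lower()
        counts.update(re.findall(r"[a-z]+", text))
    return counts


freq = keyword_frequencies(SAMPLE_STRINGS_XML)
top_terms = freq.most_common(3)
```

Comparing such counts between the two groups of apps is one simple way to see which terms, for example "hidden" or "vault", are over-represented in one group.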

An Intelligent Chatbot Utilizing BERT Model and Knowledge Graph (BERT 모델과 지식 그래프를 활용한 지능형 챗봇)

  • Yoo, SoYeop;Jeong, OkRan
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.3
    • /
    • pp.87-98
    • /
    • 2019
  • As artificial intelligence is actively studied, it is being applied to various fields such as image, video, and natural language processing. Natural language processing, in particular, is being studied to enable computers to understand the languages spoken by people and is considered one of the most important areas of artificial intelligence technology. In natural language processing, it is a complex but important task to make computers learn to understand a person's common sense and generate results based on it. Knowledge graphs, which are linked using the relationships between words, have the advantage that computers can easily learn common sense from them. However, existing knowledge graphs are organized only around specific languages and fields and are limited in that they cannot respond to neologisms. In this paper, we propose an intelligent chatbot system that collects and analyzes data in real time to build an automatically scalable knowledge graph and uses it as the base data. In particular, a fine-tuned BERT-based model for relation extraction is applied to the auto-growing graph to improve performance. We have also developed a chatbot that can learn human common sense using the auto-growing knowledge graph, and use it to verify the availability and performance of the knowledge graph.

Sign Language Spotting Based on Semi-Markov Conditional Random Field (세미-마르코프 조건 랜덤 필드 기반의 수화 적출)

  • Cho, Seong-Sik;Lee, Seong-Whan
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.12
    • /
    • pp.1034-1037
    • /
    • 2009
  • Sign language spotting is the task of detecting the start and end points of signs in continuous data and recognizing the detected signs within a predefined vocabulary. The difficulty with sign language spotting is that instances of signs vary in both motion and shape. Moreover, signs have variable motion in terms of both trajectory and length. In particular, variable sign lengths cause problems when spotting signs in a video sequence, because short signs involve less information and fewer changes than long signs. In this paper, we propose a method for spotting variable-length signs based on semi-CRF (semi-Markov Conditional Random Field). We performed experiments with ASL (American Sign Language) and KSL (Korean Sign Language) datasets of continuous sign sentences to demonstrate the efficiency of the proposed method. Experimental results show that the proposed method outperforms both HMM and CRF.

Object Detection and Optical Character Recognition for Mobile-based Air Writing (모바일 기반 Air Writing을 위한 객체 탐지 및 광학 문자 인식 방법)

  • Kim, Tae-Il;Ko, Young-Jin;Kim, Tae-Young
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.15 no.5
    • /
    • pp.53-63
    • /
    • 2019
  • To provide a hand gesture interface through deep learning in mobile environments, research on lightweight networks is essential to achieve high recognition rates while preventing degradation of execution speed. This paper proposes a method for real-time recognition of characters written in the air with a finger on mobile devices, using a lightweight deep-learning model. Based on SSD (Single Shot Detector), an object detection model that uses MobileNet as a feature extractor, it detects the index finger and generates a result text image by following the fingertip path. The image is then sent to a server, which recognizes the characters with a trained OCR model. To verify our method, 12 users tested 1,000 words using a GALAXY S10+; the finger was recognized with an average accuracy of 88.6%, and the recognized text was printed within 124 ms, indicating that the method can be used in real time. The results of this research can be used to send simple text messages, memos, and air signatures using a finger in mobile environments.

Analysis of the Research Trends by Environmental Spatial-Information Using Text-Mining Technology (텍스트 마이닝 기법을 활용한 환경공간정보 연구 동향 분석)

  • OH, Kwan-Young;LEE, Moung-Jin;PARK, Bo-Young;LEE, Jung-Ho;YOON, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.20 no.1
    • /
    • pp.113-126
    • /
    • 2017
  • This study aimed to quantitatively analyze trends in environmental research that uses environmental geospatial information through text mining, one of the big data analysis technologies. The analysis was conducted on a total of 869 papers published in the Republic of Korea, collected from the National Digital Science Library (NDSL). On the basis of the classification scheme, the keywords extracted from the papers were recategorized into 10 environmental fields, including "general environment", "climate", and "air quality", and 20 environmental geospatial information fields, including "satellite image", "numerical map", and "disaster". With the recategorized keywords, their frequency levels and time series changes in the collected papers were analyzed, as well as the association rules between keywords. First, the frequency analysis showed that "general environment" (40.85%) and "satellite image" (24.87%) had the highest frequency levels among environmental fields and environmental geospatial information fields, respectively. Second, the time series analysis of environmental fields showed that the share of "climate" was high between 1996 and 2000, but that of "general environment" has increased since 2001. Among environmental geospatial information fields, the demand for "satellite image" was highest throughout the period analyzed, and its utilization share has also gradually increased. Third, a total of 80 association rules were generated for environmental fields and environmental geospatial information fields. Among environmental fields, "general environment" generated the highest number of association rules (17) with environmental geospatial information fields such as "satellite image" and "digital map".
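
The association rules between keywords mentioned above are usually scored by support and confidence. The sketch below computes both for one-to-one rules only; the thresholds, the function name `pair_rules`, and the four sample "papers" are illustrative assumptions and do not reproduce the study's data or its 80 rules.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple


def pair_rules(
    transactions: List[Set[str]], min_support: float, min_confidence: float
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return rules a -> b as {(a, b): (support, confidence)}.

    support = P(a and b), confidence = P(b | a), estimated over the transactions.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules: Dict[Tuple[str, str], Tuple[float, float]] = {}
    for a, b in permutations(items, 2):
        both = sum(1 for t in transactions if a in t and b in t)
        has_a = sum(1 for t in transactions if a in t)
        support = both / n
        confidence = both / has_a if has_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


# Hypothetical keyword categories assigned to four papers.
papers = [
    {"climate", "satellite image"},
    {"general environment", "satellite image"},
    {"general environment", "digital map"},
    {"general environment", "satellite image"},
]
rules = pair_rules(papers, min_support=0.5, min_confidence=0.6)
```

A rule such as "general environment" -> "satellite image" passes here because the two categories co-occur in half of the papers and in two of the three papers tagged "general environment".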

Making 2.5D with Vanishing Point in Photoshop (Photoshop Vanishing Point를 이용한 2.5D 제작에 관한연구)

  • Yoon, Young-Doo;Choi, Eun-Young
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.12
    • /
    • pp.146-153
    • /
    • 2009
  • Thanks to the development of computer graphics technology, graphic design programs are easily accessible to any home computer user today, free from the burden of the complicated algorithms or expensive graphic tools that were required in the past. The term 2.5D is commonly used by computer graphic designers to refer to 2D imagery presented as a form of pseudo-3D. In this study, I apply 2.5D, which was previously used to strengthen visual effects and engine efficiency, to motion graphics using Adobe Photoshop together with After Effects. Today, motion graphics dominate the advertisement and image markets. Since viewers have developed higher expectations, more dynamic 3D spatial graphics are preferred over the outdated 2D basis. In this study, I produce a 2.5D image, generated with the vanishing point filter of Adobe Photoshop and with After Effects, based on still image information captured at an axonometric projection angle. I also compare the effectiveness of the production process and the flexibility of the camera angle between the previous 3D process and the new 2.5D process.

Emotion Based Gesture Animation Generation Mobile System (감정 기반 모바일 손제스쳐 애니메이션 제작 시스템)

  • Lee, Jung-Suk;Byun, Hae-Won
    • 한국HCI학회:학술대회논문집
    • /
    • 2009.02a
    • /
    • pp.129-134
    • /
    • 2009
  • Recently, the percentage of people who use SMS services has been increasing. However, it is difficult to express one's complicated emotions with the text and emoticons of existing SMS services. This paper focuses on that point and makes practical use of character animation to express emotion and nuance correctly and amusingly. It also proposes an emotion-based gesture animation generation system that uses a character's facial expressions and gestures to deliver emotion more vividly and clearly than speech alone. Michel[1] investigated interview videos of a person whose gesturing style they wished to animate and proposed a gesture generation graph for stylized gesture animation. In this paper, extending Michel[1]'s research, we focus on analyzing and abstracting the emotional gestures of Disney animation characters and model these emotional gestures in 3D. To express a person's emotion, we propose an emotion gesture generation graph that reflects an emotion flow graph, which expresses the flow of emotion in terms of probability. We surveyed user reactions to assess the suitability of the proposed system and its suitability as an alternative.
