• Title/Summary/Keyword: text-to-image


Effective Morphological Layer Segmentation Based on Edge Information for Screen Image Coding (스크린 이미지 부호화를 위한 에지 정보 기반의 효과적인 형태학적 레이어 분할)

  • Park, Sang-Hyo;Lee, Si-Woong
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.12
    • /
    • pp.38-47
    • /
    • 2013
  • Image coding based on the MRC model, a multi-layer image model, first segments a screen image into foreground, mask, and background layers, and then compresses each layer with a codec suited to it. The mask layer defines the positions of foreground regions such as textual and graphical content, while the colour signal of the foreground (background) region is stored in the foreground (background) layer. The mask layer, which holds the segmentation of foreground and background regions, is critical because its accuracy directly affects the overall coding performance of the codec. This paper proposes a new layer segmentation algorithm for MRC-based image coding. The proposed method extracts text pixels from the background using morphological top-hat filtering. Whether the white or black top-hat transformation is applied to a local block is controlled by the relative brightness of the text compared to the background, and the text boundary information extracted from the block's edge map is used for a robust decision on that relative brightness. Simulation results show that the proposed method outperforms conventional methods.
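As a minimal sketch only (not the paper's implementation), the white/black top-hat selection described above can be illustrated with `scipy.ndimage`: the white top-hat extracts text brighter than the background, the black top-hat text darker than it. The block-level choice below uses a simple brightness comparison against the median as a stand-in for the paper's edge-based decision.

```python
import numpy as np
from scipy.ndimage import white_tophat, black_tophat

def extract_text_layer(block, size=5):
    """Extract small-scale text structures from an image block.

    Chooses the white or black top-hat depending on whether the
    block's text appears brighter or darker than its background
    (approximated here by comparing extremes to the median).
    """
    med = np.median(block)
    # Bright text on a dark background -> white top-hat
    if block.max() - med >= med - block.min():
        return white_tophat(block, size=size)
    # Dark text on a bright background -> black top-hat
    return black_tophat(block, size=size)

# Dark background with a small bright "text" patch
block = np.zeros((12, 12))
block[5:7, 5:7] = 1.0
text = extract_text_layer(block)   # recovers the 2x2 patch
```

Because the structuring element (size 5) is larger than the patch, the grey opening removes it entirely and the top-hat returns exactly the small bright structure.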

Text Area Detection of Road Sign Images based on IRBP Method (도로표지 영상에서 IRBP 기반의 문자 영역 추출)

  • Chong, Kyusoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.13 no.6
    • /
    • pp.1-9
    • /
    • 2014
  • Recently, studies have been conducted on image collection and the automatic detection of attribute information using mobile mapping systems. Detecting road sign attribute information is difficult because of the varied size and placement of signs and interference from other facilities such as trees. A text detection method that does not rely on a Korean character template is required to detect the target text successfully when texts of various sizes appear near it. To this end, the method of incremental right-to-left blob projection (IRBP) is suggested, and its potential and improvement are assessed. To evaluate the performance of the IRBP method, it was compared with an existing method that uses Korean templates on 60 road sign videos. The results verify that text detection can be improved with the IRBP method.
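The abstract does not spell out the IRBP algorithm; as a hedged illustration only, a basic right-to-left column projection over a binary text mask looks like the following (the incremental blob handling of the actual method is not reproduced):

```python
import numpy as np

def right_to_left_projection(binary):
    """Column-wise foreground counts, scanned from the right edge.

    Returns the projection profile in right-to-left order, so
    index 0 corresponds to the rightmost column of the image.
    """
    return binary.sum(axis=0)[::-1]

def text_columns(binary, min_pixels=1):
    """Indices (counted from the right) of columns containing text."""
    profile = right_to_left_projection(binary)
    return np.flatnonzero(profile >= min_pixels)

mask = np.zeros((4, 8), dtype=int)
mask[1:3, 5:7] = 1            # a small "blob" near the right edge
cols = text_columns(mask)     # columns 1 and 2 from the right
```

Scanning from the right matches the motivation above: on Korean road signs the target text tends to sit toward the right edge, so a right-to-left scan reaches it before unrelated text.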

Text Region Detection using Adaptive Character-Edge Map From Natural Image (자연영상에서 적응적 문자-에지 맵을 이용한 텍스트 영역 검출)

  • Park, Jong-Cheon;Hwang, Dong-Guk;Jun, Byoung-Min
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.8 no.5
    • /
    • pp.1135-1140
    • /
    • 2007
  • This paper proposes an edge-based text region detection algorithm using adaptive character-edge maps, which are independent of character size and character string orientation, in natural images. First, labeled images are obtained from edge images, and adaptive character-edge maps, applied by way of a grammar, are used to search the labeled images for characters. Next, the selected labeled images are clustered according to the distance between neighbors, yielding text region candidates. Finally, the candidates are verified using empirical rules and horizontal/vertical projection profiles based on the orientation of the text region. Experimental results show that the text region detection algorithm is robust to variations in character size, orientation, and background complexity.
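The labeling and projection-profile verification steps above can be sketched with `scipy.ndimage`; this is a generic stand-in, not the paper's character-edge maps or grammar:

```python
import numpy as np
from scipy.ndimage import label, find_objects

def candidate_regions(edge_map):
    """Label connected edge components and return their bounding
    slices, a rough stand-in for the labeling step."""
    labeled, n = label(edge_map)
    return find_objects(labeled)

def projection_profiles(region):
    """Horizontal and vertical projection profiles used to verify
    a candidate text region (row sums, column sums)."""
    return region.sum(axis=1), region.sum(axis=0)

img = np.zeros((6, 10), dtype=int)
img[2:4, 1:4] = 1    # one edge component
img[2:4, 6:9] = 1    # another, separated by a gap
slices = candidate_regions(img)   # two candidate regions
```

A real verifier would then test the profiles of each region against empirical rules (e.g. regular gaps between characters along the text orientation).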


Semantic Image Retrieval Using RDF Metadata Based on the Representation of Spatial Relationships (공간관계 표현 기반 RDF 메타데이터를 이용한 의미적 이미지 검색)

  • Hwang, Myung-Gwun;Kong, Hyun-Jang;Kim, Pan-Koo
    • The KIPS Transactions:PartB
    • /
    • v.11B no.5
    • /
    • pp.573-580
    • /
    • 2004
  • As modern techniques have improved, people increasingly store and manage information on the web. Image data in particular carries great weight among this information, owing to the development of scanners and the popularization of digital cameras and cell-phone cameras. However, most image retrieval systems are still based on text annotations, even though many images are created on the web every day. In this paper, we suggest a new approach to semantic image retrieval using RDF metadata based on the representation of spatial relationships. For semantic image retrieval, we first define new vocabularies to represent the spatial relationships between the objects in an image. Second, we write metadata about the image using RDF and the new vocabularies. Finally, our image retrieval system can be expected to return more accurate results.
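As an illustrative sketch only (the paper's actual vocabulary is not given here, so the predicate names below are assumptions), spatial relationships between image objects can be held as RDF-style subject-predicate-object triples and pattern-matched at query time:

```python
# Hypothetical spatial-relationship triples for one image,
# in RDF style: (subject, predicate, object).
triples = [
    ("image1.jpg#sun",  "spatial:above",  "image1.jpg#mountain"),
    ("image1.jpg#lake", "spatial:below",  "image1.jpg#mountain"),
    ("image1.jpg#tree", "spatial:leftOf", "image1.jpg#lake"),
]

def query(triples, predicate=None, obj=None):
    """Return subjects whose triples match the given pattern;
    None acts as a wildcard."""
    return [s for s, p, o in triples
            if (predicate is None or p == predicate)
            and (obj is None or o == obj)]

# "What is above the mountain?"
above = query(triples, "spatial:above", "image1.jpg#mountain")
```

A production system would serialize such triples in RDF/XML and use a proper triple store, but the retrieval idea (matching a spatial pattern rather than free-text annotations) is the same.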

Effects of Presentation Modalities of Television Moving Image and Print Text on Children's and Adult's Recall (TV동영상과 신문텍스트의 정보제시특성이 어린이와 성인의 정보기억에 미치는 영향)

  • Choi, E-Jung
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.7
    • /
    • pp.149-158
    • /
    • 2009
  • The major purpose of this study is to explore the effect of the presentation modalities of television and print on children's and adults' recall. An experiment compared children's and adults' recall of stories presented in three modalities: "television moving image 1 (auditory-visual redundancy)", "television moving image 2 (no auditory-visual redundancy)", and "print text". Results indicated that children remembered more information from the television moving images than from the print versions, regardless of auditory-visual redundancy. For adults, however, the advantage of television was found only for information accompanied by redundant pictures in the television moving image, providing support for the dual-coding hypothesis.

An Effective Method for Replacing Caption in Video Images (비디오 자막 문자의 효과적인 교환 방법)

  • Chun Byung-Tae;Kim Sook-Yeon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.2 s.34
    • /
    • pp.97-104
    • /
    • 2005
  • Caption texts are frequently inserted into produced video images to help the TV audience's understanding. In movies, caption texts can be replaced without any loss of the original image, because they have their own track in the film. In earlier replacement methods, new text was inserted into the caption area of the video image after the area had been filled with a certain color to remove the established caption text. However, these methods lose the original image in the caption area, which is a problem for the TV audience. In this paper, we propose a new method that replaces the caption text after recovering the original image in the caption area. In the experiments, results on complex images show some distortion after recovering the original image, but most results show good caption text with the recovered image. The new method is thus demonstrated to replace caption texts in video images effectively.


Text-to-Face Generation Using Multi-Scale Gradients Conditional Generative Adversarial Networks (다중 스케일 그라디언트 조건부 적대적 생성 신경망을 활용한 문장 기반 영상 생성 기법)

  • Bui, Nguyen P.;Le, Duc-Tai;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.764-767
    • /
    • 2021
  • While Generative Adversarial Networks (GANs) have seen huge success in image synthesis tasks, synthesizing high-quality images from text descriptions remains a challenging problem in computer vision. This paper proposes a method named Text-to-Face Generation Using Multi-Scale Gradients Conditional Generative Adversarial Networks (T2F-MSGGANs) that combines GANs with a natural language processing model to create human faces that have the features described in the input text. The proposed method addresses two problems of GANs, mode collapse and training instability, by investigating how gradients at multiple scales can be used to generate high-resolution images. We show that T2F-MSGGANs converge stably and generate good-quality images.
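The multi-scale idea above, where the discriminator sees images at several resolutions matching the generator's intermediate outputs, can be hinted at with a plain average-pooling pyramid. This is only a sketch of the multi-scale part, not the network itself:

```python
import numpy as np

def downsample(img):
    """2x2 average pooling (height and width must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def image_pyramid(img, levels):
    """Multi-scale copies of an image, full resolution first.

    In MSG-GANs, real images are fed to the discriminator at each
    of these scales, so gradients flow back to the generator's
    intermediate layers at the matching resolutions."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = downsample(img)
        pyramid.append(img)
    return pyramid

face = np.random.rand(64, 64)
scales = image_pyramid(face, levels=4)   # 64x64, 32x32, 16x16, 8x8
```

Average pooling preserves the global mean at every level, which is one reason it is a common choice for building such pyramids.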

A Novel Character Segmentation Method for Text Images Captured by Cameras

  • Lue, Hsin-Te;Wen, Ming-Gang;Cheng, Hsu-Yung;Fan, Kuo-Chin;Lin, Chih-Wei;Yu, Chih-Chang
    • ETRI Journal
    • /
    • v.32 no.5
    • /
    • pp.729-739
    • /
    • 2010
  • Due to the rapid development of mobile devices equipped with cameras, instant translation of text seen in any context is possible. Mobile devices can serve as translation tools by recognizing the text presented in captured scenes. Images captured by cameras embed external or unwanted effects that need not be considered in traditional optical character recognition (OCR). In this paper, we segment a text image captured by a mobile device into individual characters to facilitate OCR kernel processing. Before character segmentation, text detection and text line construction are performed. A novel character segmentation method that integrates touched-character filters is applied to text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed as inputs to support vector machines to calculate confidence values. In our experiments, the accuracy rate of the proposed character segmentation system is 94.90%, which demonstrates the effectiveness of the proposed method.
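The periphery features mentioned above are not fully specified in the abstract; a common variant measures, per row, the distance from each side of the character box to the first foreground pixel. A hedged numpy sketch of that variant (the resulting vectors would then be fed to the SVM as in the paper):

```python
import numpy as np

def periphery_features(char_img):
    """For each row of a binary character image, the distance from
    the left and right borders to the first text pixel (the row
    width if the row is empty)."""
    h, w = char_img.shape
    left = np.full(h, w)
    right = np.full(h, w)
    for r in range(h):
        cols = np.flatnonzero(char_img[r])
        if cols.size:
            left[r] = cols[0]
            right[r] = w - 1 - cols[-1]
    return left, right

# An "L"-like glyph on a 3x4 grid
glyph = np.array([[1, 0, 0, 0],
                  [1, 0, 0, 0],
                  [1, 1, 1, 0]])
left, right = periphery_features(glyph)
```

Such profiles are cheap to compute and sensitive to whether a segment contains one character or two touched characters, which is what the confidence classifier needs to decide.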

User Responses to the Formats and Product Properties of Contents Advertised on Facebook (페이스북 광고 콘텐츠 포맷과 제품 속성에 대한 사용자 반응)

  • Su-Jin, Woo;Yu-Jin, Kim
    • Science of Emotion and Sensibility
    • /
    • v.19 no.1
    • /
    • pp.111-126
    • /
    • 2016
  • As the marketing value of Facebook advertisements increases, companies seek to create successful Facebook advertisements to promote their brands or products. This research aims to identify the Facebook advertising factors that influence users' eye movements and attention, and thereby to investigate effective visual elements of Facebook advertising content. First, we identified two contributing factors influencing users' responses to Facebook advertisements: the format of the advertising content (Text, Text in Image, and Movie) and the product properties (Involvement, Think/Feel). Based on theoretical reviews, eye tracking tests and surveys were conducted to examine how these two factors affect users' responses on Facebook, i.e. visual perception and purchasing responses. Distinctive patterns of visual perception and purchasing behavior were found according to the format of the advertised content, while the advertised products' properties influenced only the users' purchasing responses. Finally, the key findings of this research offer helpful guidelines for providers and developers creating effective SNS advertisements.

Effective teaching using textbooks and AI web apps (교과서와 AI 웹앱을 활용한 효과적인 교육방식)

  • Sobirjon, Habibullaev;Yakhyo, Mamasoliev;Kim, Ki-Hawn
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.01a
    • /
    • pp.211-213
    • /
    • 2022
  • Images in textbooks influence the learning process. Students often see pictures before reading the text, and these pictures can enhance their power of imagination. The findings of some studies show that images in textbooks can increase students' creativity. However, when learning major subjects, reading a textbook or looking at a picture alone may not be enough to understand the topics and fully grasp the concepts. Studies show that viewers remember 95% of a message when watching a video, far more than when reading text. Combining textbooks and videos therefore promises an effective teaching method: the "TEXT + IMAGE + VIDEO (Animation)" concept could be more beneficial than ordinary ones. We provide a solution using machine learning image classification. This paper covers the features, approach, and detailed objectives of our project. For now, we have developed a prototype of this project as a web app that works only when accessed via smartphone. Once you have accessed the web app through your smartphone, it asks for permission to use the camera. When you bring the smartphone's camera close to a picture in the textbook, the app displays the video related to that picture below.
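The matching step above (recognizing which textbook picture the camera sees, then selecting its video) uses a trained image classifier in the actual project; as a simplified, hypothetical stand-in, nearest-neighbor matching on grey-level histograms shows the retrieval idea:

```python
import numpy as np

def grey_histogram(img, bins=8):
    """Normalized grey-level histogram as a simple image descriptor."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def match_picture(snapshot, reference_pages):
    """Return the key of the reference image whose histogram is
    closest (L1 distance) to the camera snapshot; the matched key
    would then select the video to play."""
    snap = grey_histogram(snapshot)
    return min(reference_pages,
               key=lambda k: np.abs(grey_histogram(reference_pages[k]) - snap).sum())

# Hypothetical reference figures (the names are illustrative)
pages = {"fig_3_1": np.full((8, 8), 0.2),
         "fig_3_2": np.full((8, 8), 0.9)}
best = match_picture(np.full((8, 8), 0.21), pages)   # matches "fig_3_1"
```

A real deployment would replace the histogram descriptor with a trained image classification model, but the app flow (snapshot in, matched figure key out, video looked up by key) stays the same.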
