• Title/Summary/Keyword: text-to-image

Adversarial Shade Generation and Training Text Recognition Algorithm that is Robust to Text in Brightness (밝기 변화에 강인한 적대적 음영 생성 및 훈련 글자 인식 알고리즘)

  • Seo, Minseok;Kim, Daehan;Choi, Dong-Geol
    • The Journal of Korea Robotics Society / v.16 no.3 / pp.276-282 / 2021
  • Systems for recognizing text in natural scenes have been applied in various industries. However, brightness changes that occur in nature, such as light reflection and shadow, significantly degrade text recognition performance. To solve this problem, we propose an adversarial shadow generation and training algorithm that is robust to shadow changes. The algorithm divides the entire image into a total of 9 grids and adjusts the brightness with 4 trainable parameters for each grid. Training is then conducted in an adversarial relationship between the text recognition model and the shaded image generator. As training progresses, increasingly difficult shaded grid combinations occur. Training with this curriculum-learning approach not only improved performance by more than 3% on the ICDAR2015 public benchmark dataset, but also improved performance on our own Android application text recognition dataset.
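
The grid-wise brightness adjustment described in this abstract can be illustrated with a minimal sketch. It assumes the 9 grids form a 3x3 layout and uses a single hypothetical gain per cell in place of the paper's 4 trainable parameters, whose exact form the abstract does not state. The adversarial training loop itself is not shown.

```python
def shade_grid(image, gains):
    """Scale brightness per cell of an assumed 3x3 grid.

    image: list of rows, each a list of grayscale values in 0..255.
    gains: 3x3 nested list of brightness multipliers (a hypothetical
           stand-in for the four trainable parameters per grid cell).
    """
    height, width = len(image), len(image[0])
    shaded = []
    for y, row in enumerate(image):
        gy = min(y * 3 // height, 2)  # grid row index (0..2)
        new_row = []
        for x, value in enumerate(row):
            gx = min(x * 3 // width, 2)  # grid column index (0..2)
            scaled = round(value * gains[gy][gx])
            new_row.append(max(0, min(255, scaled)))  # clamp to valid range
        shaded.append(new_row)
    return shaded


# Usage: darken the top-left cell and over-brighten the bottom-right one.
image = [[100] * 6 for _ in range(6)]
gains = [[0.5, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 3.0]]
shaded = shade_grid(image, gains)
```

In an adversarial setup, a generator would choose the gains that make recognition hardest while the recognizer is trained on the shaded output; here the gains are simply fixed by hand.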

Rectification of Perspective Text Images on Rectangular Planes

  • Le, Huy Phat;Madhubalan, Kavitha;Lee, Guee-Sang
    • International Journal of Contents / v.6 no.4 / pp.1-7 / 2010
  • Natural images often contain useful information about the scene, such as text or company logos placed on a rectangular plane. The 2D images captured from such objects by a camera are often distorted because of the effects of the perspective projection camera model. This distortion makes the acquisition of the text information difficult. In this study, we detect the rectangular object on which the text is written and then restore the image by removing the perspective distortion. The Hough transform is used to detect the boundary lines of the rectangular object, and a bilinear transformation is applied to restore the original image.
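
The restoration step named in this abstract, a bilinear transformation from a detected quadrilateral to a rectangle, can be sketched as follows. This is an illustrative sketch only: the corner order, the function names, and the nearest-neighbour sampling are assumptions, and the Hough-based corner detection is not shown.

```python
def bilinear_point(corners, u, v):
    """Map (u, v) in the unit square to a point inside the quadrilateral.

    corners: (top_left, top_right, bottom_right, bottom_left), each (x, y).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    x = (1 - u) * (1 - v) * x0 + u * (1 - v) * x1 + u * v * x2 + (1 - u) * v * x3
    y = (1 - u) * (1 - v) * y0 + u * (1 - v) * y1 + u * v * y2 + (1 - u) * v * y3
    return x, y


def rectify(image, corners, out_w, out_h):
    """Resample the quadrilateral region into an out_w x out_h rectangle."""
    src_h, src_w = len(image), len(image[0])
    out = []
    for j in range(out_h):
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            u = i / (out_w - 1) if out_w > 1 else 0.0
            x, y = bilinear_point(corners, u, v)
            # Nearest-neighbour sampling, clamped to the image bounds.
            sx = min(max(int(round(x)), 0), src_w - 1)
            sy = min(max(int(round(y)), 0), src_h - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out


# Usage: extract the inner 3x3 region of a 5x5 test image.
image = [[10 * y + x for x in range(5)] for y in range(5)]
patch = rectify(image, ((1, 1), (3, 1), (3, 3), (1, 3)), 3, 3)
```

A bilinear mapping only approximates a true perspective correction; a full projective (homography) warp would be needed to remove perspective foreshortening exactly.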

Text Extraction Algorithm in Complex Images using Adaptive Edge detection (복잡한 영상에서 적응적 에지검출을 이용한 텍스트 추출 알고리즘 연구)

  • Shin, Seong;Kim, Sung-Dong;Baek, Young-Hyun;Moon, Sung-Ryong
    • Proceedings of the IEEK Conference / 2007.07a / pp.251-252 / 2007
  • This paper proposes a text extraction algorithm that uses the Coiflet wavelet, the YCbCr color model, and the closed-curve edge features of an adaptive LoG operator. It is intended to compensate for the weaknesses of existing methods, which cope poorly with complex backgrounds, varying illumination, disordered lines, and similarity between text and background colors. The algorithm is simulated on natural images that contain text regions, regardless of image size, resolution, or slant. Compared with an existing extraction algorithm on the same images, the proposed algorithm is confirmed to perform better.

A Study on Localization of Text in Natural Scene Images (자연 영상에서의 정확한 문자 검출에 관한 연구)

  • Choi, Mi-Young;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.13 no.5 / pp.77-84 / 2008
  • This paper proposes a new approach to eliminating the reflectance component for the localization of text in natural scene images. Natural scene images normally have an illumination component as well as a reflectance component. It is well known that a reflectance component usually obstructs the task of detecting and recognizing objects such as text in the scene, since it blurs the overall image. We have developed an approach that efficiently removes reflectance components while preserving illumination components. To determine the lighting environment, we classified each input image as Normal or Polarized using a histogram of the red component. For a Normal image, we acquired the text region without additional processing. For a Polarized image, we removed light reflected from the object using homomorphic filtering. We then determined each text region based on a color merging technique and the saliency map. Finally, we localized the text region from these two candidate regions.

A feasibility study on new stimulation method in fMRI language examinations using custom designed images (기능적 자기공명영상의 언어기능검사 시 image를 이용한 자극방법의 타당성 연구)

  • Choi, Kwan-Woo;Son, Soon-Yong;Jeong, Mi-Ae;Min, Jung-Whan
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.11 / pp.5005-5011 / 2011
  • The purpose of this work is to assess the validity of a new stimulation method for cognitive functional imaging that uses custom-designed images corresponding to words or syllables, addressing the shortcomings of the existing text-based method. From March to May 2011, five subjects in need of language-related functional MRI scanning were selected, and scanning with both the text stimulation method and the image stimulation method was carried out three times each. Using a 3.0T Philips MRI machine and Invivo Co.'s Eloquence system, data acquisition was performed with the EPI-BOLD technique. Post-processing was performed with SPM 99, and the activated signals were determined within a 95 percent confidence level. The number of activation clusters and the activation ratio inside the ROI were compared. As a result, all of the subjects showed activation inside Broca's area, but it did not have statistical significance. In conclusion, the image stimulation method has potential because an image is a common means of recognition and can be recognized easily even when there is a language barrier. This stimulation method could replace the existing scanning method, especially for the elderly, infants, and foreigners who may not fully understand the examination.

Text Watermarking using Space Coding (Space Coding을 이용한 Text watermarking)

  • 황미란;추현곤;최종욱;김회율
    • Proceedings of the IEEK Conference / 2002.06d / pp.117-120 / 2002
  • In this paper, we propose a new text watermarking method using space coding and a PN sequence. A PN sequence generated from the user message modifies the space between words in each line. Detection can be done without the original text image by using the average space within the text. Experimental results show that the proposed method is imperceptible and robust to attacks such as the removal of words from the text.
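
The idea of modulating inter-word spaces with a PN sequence and detecting against the average space can be illustrated with a minimal sketch. Everything specific here is an assumption, not the paper's scheme: the 4-bit LFSR generator, the use of a numeric `seed` as the key, one embedded bit per line, and gap widths given as plain numbers.

```python
def pn_sequence(seed, length):
    """Generate a +/-1 pseudo-noise sequence from a 4-bit LFSR (period 15)."""
    state = (seed & 0xF) or 1  # avoid the all-zero lock-up state
    out = []
    for _ in range(length):
        out.append(1 if state & 1 else -1)
        feedback = (state ^ (state >> 1)) & 1
        state = (state >> 1) | (feedback << 3)
    return out


def embed_bit(base_gap, n_gaps, bit, seed, strength=1.0):
    """Return the gap widths of one line carrying a single watermark bit."""
    sign = 1 if bit else -1
    return [base_gap + strength * sign * p for p in pn_sequence(seed, n_gaps)]


def detect_bit(gaps, seed):
    """Recover the bit by correlating deviations from the average gap
    with the PN sequence; the original gap widths are not needed."""
    pn = pn_sequence(seed, len(gaps))
    mean = sum(gaps) / len(gaps)
    corr = sum((g - mean) * p for g, p in zip(gaps, pn))
    return corr > 0


# Usage: embed a bit in a line with 8 inter-word gaps, then detect it.
marked = embed_bit(10.0, 8, True, seed=1)
recovered = detect_bit(marked, seed=1)
```

Because detection uses only the line's own average gap as the reference, it works blind, which mirrors the property claimed in the abstract.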

A Study on Radiological Image Retrieval System (방사선 의료영상 검색 시스템에 관한 연구)

  • Park, Byung-Rae;Shin, Yong-Won
    • Journal of radiological science and technology / v.28 no.1 / pp.19-24 / 2005
  • The purpose of this study was to design and implement a useful annotation-based radiological image retrieval system that helps radiological technologists accurately access educational and image information. For better retrieval performance on large image databases, we presented an indexing technique that integrates the $B^+$-tree proposed by Bayer, for indexing simple attributes, with an inverted file structure for medical text keywords acquired from additional descriptive information about radiological images. We implemented the proposed retrieval system with Delphi under a Windows XP environment. End users, radiological technologists, can store simple attribute information such as doctor name, operator name, body part, and disease, along with additional text-based descriptive information and the radiological image itself. They can also retrieve the desired results from large image databases by using simple attributes and text keywords through a graphical user interface. Consequently, the proposed system can be used for effective clinical decisions on radiological images, reduction of education time by organizing knowledge, and well-organized education in clinical fields. In addition, it can be expected to develop into a decision support system by constructing a web-based integrated imaging system that includes general images and special contrast images in the future.
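
The inverted file structure for text keywords mentioned in this abstract can be sketched as a mapping from each keyword to the set of images whose description contains it. This is a minimal illustration with hypothetical record data and function names; the $B^+$-tree index for simple attributes is not shown.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Build keyword -> set of image ids.

    records: dict of image_id -> free-text description (hypothetical data).
    """
    index = defaultdict(set)
    for image_id, text in records.items():
        for word in text.lower().split():
            index[word].add(image_id)
    return index


def search(index, query):
    """Return ids of images whose description contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Usage with made-up descriptions.
records = {1: "chest PA fracture", 2: "chest lateral", 3: "skull fracture"}
index = build_inverted_index(records)
hits = search(index, "chest fracture")
```

Intersecting posting sets keeps multi-keyword queries cheap, since each lookup touches only the images that contain the keyword.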

Pill Identification Algorithm Based on Deep Learning Using Imprinted Text Feature (음각 정보를 이용한 딥러닝 기반의 알약 식별 알고리즘 연구)

  • Seon Min, Lee;Young Jae, Kim;Kwang Gi, Kim
    • Journal of Biomedical Engineering Research / v.43 no.6 / pp.441-447 / 2022
  • In this paper, we propose a pill identification model that uses engraved text features together with image features such as shape and color. We compare it with an identification model that does not use engraved text features, to verify whether improving the recognition rate of the engraved text can improve identification performance. The data consisted of 100 classes with 10 images per class. The engraved text features were acquired through deep learning-based Keras OCR and a 1D CNN, and the image features were acquired through a 2D CNN. According to the identification results, the accuracy of the text recognition model was 90%. The accuracies of the comparative model and the proposed model were 91.9% and 97.6%, respectively. The accuracy, precision, recall, and F1-score of the proposed model were better than those of the comparative model, with statistical significance. As a result, we confirmed that expanding the range of features improved the performance of the identification model.

Design and Development of a Multimodal Biomedical Information Retrieval System

  • Demner-Fushman, Dina;Antani, Sameer;Simpson, Matthew;Thoma, George R.
    • Journal of Computing Science and Engineering / v.6 no.2 / pp.168-177 / 2012
  • The search for relevant and actionable information is a key to achieving clinical and research goals in biomedicine. Biomedical information exists in different forms: as text and illustrations in journal articles and other documents, in images stored in databases, and as patients' cases in electronic health records. This paper presents ways to move beyond conventional text-based searching of these resources, by combining text and visual features in search queries and document representation. A combination of techniques and tools from the fields of natural language processing, information retrieval, and content-based image retrieval allows the development of building blocks for advanced information services. Such services enable searching by textual as well as visual queries, and retrieving documents enriched by relevant images, charts, and other illustrations from the journal literature, patient records and image databases.

A Design and Implementation of Generative AI-based Advertising Image Production Service Application

  • Chang Hee Ok;Hyun Sung Lee;Min Soo Jeong;Yu Jin Jeong;Ji An Choi;Young-Bok Cho;Won Joo Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.5 / pp.31-38 / 2024
  • In this paper, we propose ASAP (AI-driven Service for Advertisement Production), an application that provides a generative AI-based automatic advertising image production service. The application uses GPT-3.5 Turbo Instruct to generate a suitable background mood and promotional copy from user-entered keywords. It then uses OpenAI's DALL·E 3 model and Stability AI's SDXL model to generate background images and text images from these inputs. OCR technology is employed to improve the accuracy of the text images, and all generated outputs are combined to create the final advertisement. In addition, text boxes implemented with the Pillow and OpenCV libraries insert details such as phone numbers and business hours at the edges of the promotional material. The application offers a simple and cost-effective solution to small business owners who have difficulty producing advertisements.