• Title/Summary/Keyword: document image processing

Search Result 105, Processing Time 0.023 seconds

Optical Character Recognition for Hindi Language Using a Neural-network Approach

  • Yadav, Divakar;Sanchez-Cuadrado, Sonia;Morato, Jorge
    • Journal of Information Processing Systems / v.9 no.1 / pp.117-140 / 2013
  • Hindi is the most widely spoken language in India, with more than 300 million speakers. Because characters in Hindi text are not separated as they are in English, the Optical Character Recognition (OCR) systems developed for Hindi have a very poor recognition rate. In this paper we propose an OCR for printed Hindi text in Devanagari script, using an Artificial Neural Network (ANN), which improves its efficiency. One of the major reasons for the poor recognition rate is error in character segmentation. The presence of touching characters in scanned documents further complicates segmentation, creating a major problem when designing an effective character segmentation technique. A general OCR follows four major steps: preprocessing, character segmentation, feature extraction, and finally classification and recognition. The preprocessing tasks considered in the paper are conversion of grayscale images to binary images, image rectification, and segmentation of the document's textual contents into paragraphs, lines, words, and then basic symbols. The basic symbols, obtained as the fundamental unit from the segmentation process, are recognized by the neural classifier. In this work, three feature extraction techniques (histogram of projection based on mean distance, histogram of projection based on pixel value, and vertical zero crossing) are used to improve the recognition rate. These feature extraction techniques are powerful enough to extract features of even distorted characters/symbols. The neural classifier is a back-propagation neural network with two hidden layers. The classifier is trained and tested on printed Hindi texts. A correct recognition rate of approximately 90% is achieved.
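
Two of the feature types named in this abstract can be illustrated on a tiny binary glyph. This is only a minimal sketch of one plausible reading of "histogram of projection based on pixel value" (foreground-pixel counts per row and column) and "vertical zero crossing" (value changes along each column); the function names and the sample glyph are hypothetical and are not taken from the paper.

```python
def projection_histogram(img, axis):
    """Count foreground (1) pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in img]
    return [sum(col) for col in zip(*img)]


def vertical_zero_crossings(img):
    """For each column, count value changes (0->1 or 1->0) from top to bottom."""
    return [
        sum(1 for a, b in zip(col, col[1:]) if a != b)
        for col in zip(*img)
    ]


# Hypothetical 4x5 binary glyph: 1 = ink, 0 = background.
glyph = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

row_hist = projection_histogram(glyph, 0)
col_hist = projection_histogram(glyph, 1)
crossings = vertical_zero_crossings(glyph)

# One simple way to form a classifier input: concatenate the features.
feature_vector = row_hist + col_hist + crossings
```

A vector like `feature_vector` would be the kind of input fed to a classifier; how the paper actually normalizes or combines its features is not stated in the abstract.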

A Review of Access Conditions of the W3 and the Inline Image/Sound Processing of HTML Document for Utilizing of the Virtual Library (W3 가상도서관 활용을 위한 HTML 문서작성과 이미지/사운드 처리)

  • 유사라
    • Journal of the Korean Society for Information Management / v.12 no.1 / pp.45-66 / 1995
  • Information users in the mid-1990s, who know the Internet and its useful information services, now expect virtual library services. In particular, the increasing demand for hypertext and hypermedia information in Internet settings has centered on the W3 with the man-page information. Accordingly, the paper describes access methods along with brief concepts of the W3, and explains URLs and HTML. It also gives the retrieval layouts of unformatted data, including images and sounds, and then lists information sources and software for W3 clients and servers so that readers can keep up with the most recently posted version of the W3.

Extracting curved text lines using the chain composition and the expanded grouping method (체인 정합과 확장된 그룹핑 방법을 사용한 곡선형 텍스트 라인 추출)

  • Bai, Nguyen Noi;Yoon, Jin-Seon;Song, Young-Jun;Kim, Nam;Kim, Yong-Gi
    • The KIPS Transactions:PartB / v.14B no.6 / pp.453-460 / 2007
  • In this paper, we present a method to extract text lines from poorly structured documents. The text lines may have different orientations and considerably curved shapes, and a line may contain a few wide inter-word gaps. Such text lines can be found in posters, address blocks, and artistic documents. Our method is based on traditional perceptual grouping, but we develop novel solutions to overcome the problems of insufficient seed points and varied orientations in a single line. We assume that text lines contain connected components, where each connected component is a set of black pixels within a letter or within several touching letters. In our scheme, connected components closer than an iteratively incremented threshold are joined into a chain. Elongated chains are identified as the seed chains of lines. The seed chains are then extended to the left and the right according to the local orientations, which are reevaluated at each side of a chain whenever it is extended. By this process, all text lines are finally constructed. In our experiments on logos and slogans, the proposed method extracts considerably curved text lines well: the extraction rates are 98% for straight lines and 94% for curved lines.
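
The chaining step described in this abstract can be sketched with a union-find pass over component positions. This is a minimal illustration under stated assumptions, not the authors' algorithm: components are reduced to hypothetical centroid points, closeness is plain Euclidean distance, and the threshold schedule is made up for the example.

```python
import math


def build_chains(centroids, threshold):
    """Group components whose centroids lie within `threshold` of each other."""
    parent = list(range(len(centroids)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= threshold:
                parent[find(i)] = find(j)

    chains = {}
    for i in range(len(centroids)):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())


# Hypothetical centroids: two words separated by a wide gap.
centroids = [(0, 0), (1, 0), (2, 0.2), (6, 0), (7, 0.1)]

# Iteratively incremented threshold (values chosen only for illustration).
chains_by_threshold = {t: build_chains(centroids, t) for t in (0.5, 1.5, 4.5)}
```

With the smallest threshold every component stays alone, the middle one yields two word-like chains, and the largest merges them across the gap. Seed selection by elongation and the orientation-guided extension are not covered by this sketch.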

A New Mobile Content Adaptation Based on Content Provider-Specified Web Clipping (컨텐츠 제공자 지정 웹 클리핑 방식의 이동 인터넷 컨텐츠 변환)

  • Yang, Seo-Min;Lee, Hyuk-Joon
    • The KIPS Transactions:PartB / v.11B no.1 / pp.35-44 / 2004
  • Web contents created for desktop screens give rise to problems when they are displayed on the small screens of mobile terminals. In some cases, some objects of a page may not be displayable because the browser lacks the capability; in other cases, the entire page may not be displayable because it is incompatible with the browser. In this paper, we introduce a new mobile content adaptation approach based on web clipping, which transforms an original page into one that is optimally displayed on a mobile terminal. In this method, a source page is automatically clipped and transformed according to the clip specification made by the content provider using a clip editing tool. The clip editing tool allows the user to specify group clips, multi-level clips, and dynamic clips as well as simple clips, and the presentation layout, through a graphical user interface. Based on the clip specifications, each clip is transformed into an intermediate meta-language document, which in turn is transformed into a presentation page in the target markup language. Transcoding of image objects in major image file formats is also supported.

Design and Implementation of Automatic Linking Support System for Efficient Generating and Retrieving Integrated Documents Based on Web (웹 통합문서의 효율적 생성과 검색을 위한 자동링크지원 시스템의 설계 및 구축)

  • Lee, Won-Jung;Jung, Eun-Jae;Joo, Su-Chong;Lee, Seung-Yong
    • The KIPS Transactions:PartA / v.10A no.2 / pp.93-100 / 2003
  • With the advent of distributed computing and Web service technologies, many users require services that conveniently obtain and/or support well-assembled Web-based information. For this reason, we construct an Automatic Linking Support System that generates Web-based integrated information and supports information retrieval according to users' various requirements. Our system organization is based on a client/server system. The server environment consists of an automatic linking engine that provides lexical analysis, query processing, and integrated document generation functions, and of databases made up of dictionaries, images, and URL contents. The client environments consist of a Web editor that generates integrated documents and a Web helper that retrieves them via the automatic linking engine and databases. To keep client interfaces user-friendly, the Web editor and helper programs can be executed directly by downloading them from the server, without being set up on the clients beforehand. To reduce server overhead, parts of the server's executing modules are distributed to the clients, where they can be executed. For the implementation, we use JDK 1.3 and SWING for user interfaces such as the Web editor and helper, the RMI mechanism for interaction between clients and the server, and SQL Server 7.0 for database development. Finally, we show the procedures for accessing the automatic document linking engine and databases from the Web editor or Web helper, and the results that appear on their screens.

Adaptive Data Mining Model using Fuzzy Performance Measures (퍼지 성능 측정자를 이용한 적응 데이터 마이닝 모델)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.541-546 / 2006
  • Data mining is the process of finding hidden patterns inside a large data set. Cluster analysis is a popular technique for data mining. It is a fundamental process of data analysis and has played an important role in solving many problems in pattern recognition and image processing. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental decision on the number of clusters in the data. This is related to the cluster validity problem, that is, how well the analysis has identified the structure that is present in the data. In this paper, we design an adaptive data mining model using fuzzy performance measures. It discovers clusters through an unsupervised neural network model based on a fuzzy objective function and evaluates the clustering results by a fuzzy performance measure. We also present experimental results on newsgroup data. They show that the proposed model can be used as a document classifier.
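
The abstract does not say which fuzzy performance measure the paper uses. As a stand-in, the sketch below computes Bezdek's partition coefficient, a well-known fuzzy cluster validity index, on hypothetical membership matrices; it is shown only to illustrate how a fuzzy measure can score a clustering when choosing the number of clusters.

```python
def partition_coefficient(memberships):
    """Mean of squared memberships over all samples.

    `memberships[i][k]` is the degree to which sample i belongs to cluster k;
    each row is assumed to sum to 1. The value ranges from 1/c (maximally
    fuzzy, c clusters) to 1.0 (crisp assignment).
    """
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


# Hypothetical membership matrices for two samples and two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]

pc_crisp = partition_coefficient(crisp)
pc_fuzzy = partition_coefficient(fuzzy)
```

A higher score suggests a cleaner partition, so candidate cluster counts could be compared by running the clustering for each count and keeping the one with the best score.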

Deep Learning OCR based document processing platform and its application in financial domain (금융 특화 딥러닝 광학문자인식 기반 문서 처리 플랫폼 구축 및 금융권 내 활용)

  • Dongyoung Kim;Doohyung Kim;Myungsung Kwak;Hyunsoo Son;Dongwon Sohn;Mingi Lim;Yeji Shin;Hyeonjung Lee;Chandong Park;Mihyang Kim;Dongwon Choi
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.143-174 / 2023
  • With the development of deep learning technologies, Artificial Intelligence powered Optical Character Recognition (AI-OCR) has evolved to read multiple languages accurately from various forms of images. For the financial industry, where a large number of diverse documents are processed manually, the potential for using AI-OCR is great. In this study, we present a configuration and a design of an AI-OCR modality for use in the financial industry and discuss the platform construction with application cases. Since the use of financial domain data is prohibited under the Personal Information Protection Act, we developed a deep learning-based data generation approach and used it to train the AI-OCR models. The AI-OCR models are trained for image preprocessing, text recognition, and language processing, and are configured as a microservice-architected platform to process a broad variety of documents. We have demonstrated the AI-OCR platform by applying it to the financial domain tasks of document sorting, document verification, and typing assistance. The demonstrations confirm increased work efficiency and convenience.

Multimedia Network Teaching System based on SMIL (SMIL을 기반으로 한 멀티미디어 네트워크 교육시스템)

  • Yu, Lei;Cao, Ke-Rang;Bang, Jin-Suk;Cho, Tae-Beom;Jung, Hoe-Kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.10a / pp.524-527 / 2008
  • Recently, digital technology and the Internet have spread throughout the world, and with the development of multimedia processing and information and communication technology, the demand for Internet-based education is rapidly increasing. We can also easily use information with fewer restrictions of time and space. However, demand has proliferated for ways to integrate and represent several kinds of multimedia data, such as audio and video. Therefore, in 1998, the W3C presented an international standard, SMIL, to solve multimedia object representation and synchronization problems. By using SMIL, various multimedia elements can be integrated into a multimedia document with a proper view in space and time. Using such a SMIL document, we can create a new Internet radio broadcasting service that delivers not only audio data but also various text, image, and video content. With the system presented in this paper, teachers can easily create multimedia courseware and broadcast their lectures live over the network, and students can receive the teacher's audio-video information and the screen display of the teacher's computer. Moreover, students can communicate with the teacher simultaneously through text editor windows. Students can also request courseware after class.

Mulseon-Jinsang Related Document Analysis in First Half of the 18th Century (18세기 전반 물선진상 관련 자료 분석 - 『진상별단등록』을 중심으로 -)

  • Jeon, Sang-wuk
    • Korean Journal of Heritage: History & Science / v.47 no.4 / pp.178-191 / 2014
  • Jin-Sang is the donation of local specialties to the palace. Such donations are classified as Jehyang, Bangmul, Mulseon, or Medicine according to their characteristics, timing, and use. Among these, Mulseon Jin-Sang consisted mostly of foods. Kings reduced Mulseon Jin-Sang in order to cultivate a good image of the king. King Suk-Jong frequently reduced Mulseon, but the frequent changes of goods were not reflected in documents, so the types and quantities of goods in the early 18th century are not clear. In 1728, King Yeong-Jo published the Jinsangbyeoldandeungnok to clarify the types and quantities of goods. This book records the area, timing, and quantity of Mulseon; among these, the types and quantities of goods are the most important. The book lists 176 kinds of goods. Most of these goods were fishery products, and raw materials accounted for a large share. In addition, various creatures were processed, for example dried or pickled. An analysis of the regional allocation shows that the most types were assigned to Gyeongsang-do, Hamgyeong-do, and Gangwon-do, in that order. These areas face the East Sea, so many fishery products are recorded for them. Gyeongsang-do and Cholla-do accounted for a large portion of the fruit, and Jeju Island was assigned oranges. Finally, distant areas were assigned dried and pickled foods rather than fresh ones.

The Implementation of the Digital watermarking for 3D Polygonal Model (3차원 형상 모델의 디지털 워터마킹 구현)

  • Kim, Sun-Hyung;Lee, Sun-Heum;Kim, Gee-Seog;Ahn, Deog-Sang
    • The KIPS Transactions:PartD / v.9D no.5 / pp.925-930 / 2002
  • This paper discusses techniques for embedding data into 3D polygonal models of geometry. Much research on watermarking has been done as an element technology of DRM (digital rights management), but little of it has addressed 3D polygonal models. Most research is limited to text documents, 2D images, animation, music, and so on. RP systems are suited to small-quantity production of a wide variety of goods, and they are widely used in industry because they make it possible to produce a prototype and find errors or incongruent factors at an early design stage of product development. This paper studies a method that inserts a watermark into an STL (stereolithography) file containing a 3D shape model. The proposed algorithm inserts the watermark in the normal vector region and the facet's interior region of the 3D shape data. As a result, the 3D shape is not distorted and the watermark remains invisible. Experimental results of inserting and extracting a watermark in the normal vector region and the facet's interior region of 3D shape data with the proposed algorithm show that the 3D shape is not affected at all and that insertion and extraction of the watermark are possible.