• 제목/요약/키워드: text information

검색결과 4,361건 처리시간 0.029초

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

  • P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권6호
    • /
    • pp.1778-1799
    • /
    • 2022
  • Automatic text summarization is a procedure that packs enormous content into a more limited book that incorporates significant data. Malayalam is one of the toughest languages utilized in certain areas of India, most normally in Kerala and in Lakshadweep. Natural language processing in the Malayalam language is relatively low due to the complexity of the language as well as the scarcity of available resources. In this paper, a way is proposed to deal with the text summarization process in Malayalam documents by training a model based on the Support Vector Machine classification algorithm. Different features of the text are taken into account for training the machine so that the system can output the most important data from the input text. The classifier can classify the most important, important, average, and least significant sentences into separate classes and based on this, the machine will be able to create a summary of the input document. The user can select a compression ratio so that the system will output that much fraction of the summary. The model performance is measured by using different genres of Malayalam documents as well as documents from the same domain. The model is evaluated by considering content evaluation measures precision, recall, F score, and relative utility. Obtained precision and recall value shows that the model is trustable and found to be more relevant compared to the other summarizers.

On The Full-Text Database Retrieval and Indexing Language

  • Chang, Hye-Rhan
    • 정보관리학회지
    • /
    • 제4권1호
    • /
    • pp.24-46
    • /
    • 1987
  • 최근 원문 데이타베이스의 증가는 주제접근의 새로운 가능성을 제시하였다. 온라인 정보검색은 근본적으로 색인언어와 컴퓨터 기술의 문제이다. 본 연구의 목적은 전통적인 서지 데이타베이스 검색과 비교하여 원문 데이터 베이스 검색의 특징과 성능을 규명하는데 있다. 색인언어에 따른 검색효율, 현재 응용되고 있는 원문 데이타베이스 탐색 시스템, 통제어휘의 새로운 역할 등을 살펴보았다. 이 논문은 또한 원문 데이타베이스의 검색성능 실험에 대한 리뷰를 포함한다.

  • PDF

Development Status and Prospects of Graphical Password Authentication System in Korea

  • Yang, Gi-Chul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권11호
    • /
    • pp.5755-5772
    • /
    • 2019
  • Security is becoming more important as society changes rapidly. In addition, today's ICT environment demands changes in existing security technologies. As a result, password authentication methods are also changing. The authentication method most often used for security is password authentication. The most-commonly used passwords are text-based. Security enhancement requires longer and more complex passwords, but long, complex, text-based passwords are hard to remember and inconvenient to use. Therefore, authentication techniques that can replace text-based passwords are required today. Graphical passwords are more difficult to steal than text-based passwords and are easier for users to remember. In recent years, researches into graphical passwords that can replace existing text-based passwords are being actively conducting in various places throughout the world. This article surveys recent research and development directions of graphical password authentication systems in Korea. For this purpose, security authentication methods using graphical passwords are categorized into technical groups and the research associated with graphical passwords performed in Korea is explored. In addition, the advantages and disadvantages of all investigated graphical password authentication methods were analyzed along with their characteristics.

Rectification of Perspective Text Images on Rectangular Planes

  • Le, Huy Phat;Madhubalan, Kavitha;Lee, Guee-Sang
    • International Journal of Contents
    • /
    • 제6권4호
    • /
    • pp.1-7
    • /
    • 2010
  • Natural images often contain useful information about the scene such as text or company logos placed on a rectangular shaped plane. The 2D images captured from such objects by a camera are often distorted, because of the effects of the perspective projection camera model. This distortion makes the acquisition of the text information difficult. In this study, we detect the rectangular object on which the text is written, then the image is restored by removing the perspective distortion. The Hough transform is used to detect the boundary lines of the rectangular object and a bilinear transformation is applied to restore the original image.

Automatic Name Line Detection for Person Indexing Based on Overlay Text

  • Lee, Sanghee;Ahn, Jungil;Jo, Kanghyun
    • Journal of Multimedia Information System
    • /
    • 제2권1호
    • /
    • pp.163-170
    • /
    • 2015
  • Many overlay texts are artificially superimposed on the broadcasting videos by humans. These texts provide additional information to the audiovisual content. Especially, the overlay text in news videos contains concise and direct description of the content. Therefore, it is most reliable clue for constructing a news video indexing system. To make the automatic person indexing of interview video in the TV news program, this paper proposes the method to only detect the name text line among the whole overlay texts in one frame. The experimental results on Korean television news videos show that the proposed framework efficiently detects the overlaid name text line.

Title Extraction from Book Cover Images Using Histogram of Oriented Gradients and Color Information

  • Do, Yen;Kim, Soo Hyung;Na, In Seop
    • International Journal of Contents
    • /
    • 제8권4호
    • /
    • pp.95-102
    • /
    • 2012
  • In this paper, we present a technique to extract the title areas from book cover images. A typical book cover image may contain text, pictures, diagrams as well as complex and irregular background. In addition, the high variability of character features such as thickness, font, position, background and tilt of the text also makes the text extraction task more complicated. Therefore, we propose a two steps efficient method that uses Histogram of Oriented Gradients and color information to find the title areas. Firstly, text localization is carried out to find the title candidates. Finally, refinement process is performed to find the sufficient components of title areas. To obtain the best result, we also use other constraints about the size, ratio between the length and width of the title. We achieve encouraging results of extracted title regions from book cover images which prove the advantages and efficiency of the proposed method.

Text Detection in Scene Images Based on Interest Points

  • Nguyen, Minh Hieu;Lee, Gueesang
    • Journal of Information Processing Systems
    • /
    • 제11권4호
    • /
    • pp.528-537
    • /
    • 2015
  • Text in images is one of the most important cues for understanding a scene. In this paper, we propose a novel approach based on interest points to localize text in natural scene images. The main ideas of this approach are as follows: first we used interest point detection techniques, which extract the corner points of characters and center points of edge connected components, to select candidate regions. Second, these candidate regions were verified by using tensor voting, which is capable of extracting perceptual structures from noisy data. Finally, area, orientation, and aspect ratio were used to filter out non-text regions. The proposed method was tested on the ICDAR 2003 dataset and images of wine labels. The experiment results show the validity of this approach.

Citation-based Article Summarization using a Combination of Lexical Text Similarities: Evaluation with Computational Linguistics Literature Summarization Datasets

  • Kang, In-Su
    • 한국컴퓨터정보학회논문지
    • /
    • 제24권7호
    • /
    • pp.31-37
    • /
    • 2019
  • Citation-based article summarization is to create a shortened text for an academic article, reflecting the content of citing sentences which contain other's thoughts about the target article to be summarized. To deal with the problem, this study introduces an extractive summarization method based on calculating a linear combination of various sentence salience scores, which represent the degrees to which a candidate sentence reflects the content of author's abstract text, reader's citing text, and the target article to be summarized. In the current study, salience scores are obtained by computing surface-level textual similarities. Experiments using CL-SciSumm datasets show that the proposed method parallels or outperforms the previous approaches in ROUGE evaluations against SciSumm-2017 human summaries and SciSumm-2016/2017 community summaries.

신호의 복원된 위상 공간을 이용한 오디오 상황 인지 (A new approach technique on Speech-to-Speech Translation)

  • ;이승룡
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2009년도 추계학술발표대회
    • /
    • pp.239-240
    • /
    • 2009
  • We live in a flat world in which globalization fosters communication, travel, and trade among more than 150 countries and thousands of languages. To surmount the barriers among these languages, translation is required; Speech-to-Speech translation will automate the process. Thanks to recent advances in Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), one can now utilize a system to translate a speech of source language to a speech of target language and vice versa in affordable manner. The three phase process establishes that the source speech be transcribed into a (set of) text of the source language (ASR) before the source text is translated into the target text (MT). Finally, the target speech is synthesized from the target text (TTS).