• Title/Summary/Keyword: text image

Development of Vaccine with Artificial Intelligence: By Analyzing OP Code Features Based on Text and Image Dataset (OP Code 특징 기반의 텍스트와 이미지 데이터셋 연구를 통한 인공지능 백신 개발)

  • Choi, Hyo-Kyung;Lee, Se-Eun;Lee, Ju-Hyun;Hong, Rae-Young;Choi, Won-Hyok;Kim, Hyung-Jong
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.5 / pp.1019-1026 / 2019
  • Because existing methods are limited in detecting newly introduced malware, the development of artificial intelligence vaccines (antivirus software) is becoming increasingly important. Existing artificial intelligence vaccines have low detection accuracy because they do not scan all parts of a file. In this paper, we propose an enhanced malware detection method based on the unique OP Code features found in malware files. Specifically, we tested the method with a Random Forest model trained on text datasets and with an Inception V3 model trained on image datasets. The highest detection accuracy achieved was about 80%.

Multi-type object detection-based de-identification technique for personal information protection (개인정보보호를 위한 다중 유형 객체 탐지 기반 비식별화 기법)

  • Ye-Seul Kil;Hyo-Jin Lee;Jung-Hwa Ryu;Il-Gu Lee
    • Convergence Security Journal / v.22 no.5 / pp.11-20 / 2022
  • As the Internet and web technology develop around mobile devices, image data increasingly contains various types of sensitive information such as people, text, and places. In addition, as the use of SNS grows, the damage caused by exposure and abuse of personal information online is increasing. However, research on de-identification technology based on multi-type object detection for personal information protection is insufficient. Therefore, this paper proposes an artificial intelligence model that detects and de-identifies multiple types of objects by running existing single-type object detection models in parallel. Using CutMix, images in which person and text objects appear together were created as training data, and person and text objects, which have different characteristics, were detected and de-identified. The proposed model achieves a precision of 0.724 and an mAP@.5 of 0.745 when both object types are present at the same time. After de-identification, mAP@.5 was 0.224 for all objects, a decrease of 0.4 or more.
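
The mAP@.5 metric quoted above counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching test, assuming axis-aligned boxes given as (x1, y1, x2, y2). The function names and box layout are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """True when a predicted box counts as a hit at the given IoU threshold."""
    return iou(pred_box, truth_box) >= threshold
```

A de-identified (blurred or masked) object that the detector no longer localizes well fails this test, which is why a lower mAP@.5 after de-identification indicates that the objects have become harder to detect.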

A Content Analysis for Website Usefulness Evaluation: Utilizing Text Mining Technique

  • Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Internet Computing and Services / v.16 no.4 / pp.71-81 / 2015
  • With the increasing influence of online media, company websites have become important communication channels between companies and customers. Companies use their websites as a marketing tool for a variety of purposes, including enhancing their image and selling products or services. Many researchers have examined the criteria, methods, and tools for website evaluation, but most have focused on usability. Prior content analyses have focused not on text content but on website components, an approach likely to produce subjective evaluations. This study attempts to objectively evaluate company websites by utilizing text mining. We analyze the usefulness of company websites by presenting visualized outputs from a business perspective, allowing practitioners to easily understand the results of the website evaluation and use them in decision making. To demonstrate our method empirically, we selected a company with a number of affiliates in Korea and analyzed the text content of their websites to assess their usefulness using natural language processing and graphics packages in R. Practitioners can easily employ our objective evaluation method, and researchers can use it to gain a new perspective on website evaluation.

Text Region Extraction and OCR on Camera Based Images (카메라 영상 위에서의 문자 영역 추출 및 OCR)

  • Shin, Hyun-Kyung
    • The KIPS Transactions:PartD / v.17D no.1 / pp.59-66 / 2010
  • Traditional OCR engines are designed for scanned documents produced in a calibrated environment. Three-dimensional perspective distortion and smooth distortion are critical problems in images captured by uncalibrated devices such as smartphones. To meet the growing demand for recognizing text embedded in photos taken with uncalibrated hand-held devices, we address the problem in three respects: a rotation-invariant method of text region extraction, a scale-invariant method of text line segmentation, and three-dimensional perspective mapping. By integrating these methods, we developed an OCR system for camera-captured images.

A novel, reversible, Chinese text information hiding scheme based on lookalike traditional and simplified Chinese characters

  • Feng, Bin;Wang, Zhi-Hui;Wang, Duo;Chang, Ching-Yun;Li, Ming-Chu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.1 / pp.269-281 / 2014
  • Compared with hiding information in digital images, hiding information in digital text files requires less storage space and less bandwidth for data transmission, and it is widely applicable. However, text files have low redundancy, so it is more difficult to hide information in them. To overcome this difficulty, Wang et al. proposed a reversible information hiding scheme using left-right and up-down representations of Chinese characters, but in practice it does not provide good visual steganographic effectiveness, and the embedding and extracting processes are too complicated to carry out with reasonable effort and cost. We observed that many traditional and simplified Chinese characters look nearly the same (so-called lookalikes), and we use this feature to propose a novel information hiding scheme that hides secret data in lookalike Chinese characters. Compared with Wang et al.'s scheme, the proposed scheme significantly simplifies the embedding and extracting procedures and improves visual steganographic effectiveness. The experimental results demonstrate the advantages of the proposed scheme.
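
The general idea of hiding bits in the choice between a simplified character and its traditional lookalike can be sketched as follows. This is not the authors' scheme: the pair table, the function names, and the assumption that the cover text is written entirely in simplified characters are all illustrative.

```python
# Hypothetical lookalike table (simplified -> traditional); the paper's
# actual character table is not reproduced here.
PAIRS = {"门": "門", "见": "見", "贝": "貝", "车": "車", "马": "馬", "鱼": "魚", "鸟": "鳥"}
REVERSE = {trad: simp for simp, trad in PAIRS.items()}


def embed(cover, bits):
    """Hide a bit string in a simplified-Chinese cover text.

    Bit 0 keeps the simplified form; bit 1 swaps in the traditional lookalike.
    """
    remaining = list(bits)
    out = []
    for ch in cover:
        if ch in PAIRS and remaining:
            bit = remaining.pop(0)
            out.append(PAIRS[ch] if bit == "1" else ch)
        else:
            out.append(ch)
    if remaining:
        raise ValueError("cover text has too few lookalike characters")
    return "".join(out)


def extract(stego, n_bits):
    """Return (hidden bits, restored cover text) from a stego text."""
    bits = []
    cover = []
    for ch in stego:
        if ch in REVERSE:
            if len(bits) < n_bits:
                bits.append("1")
            cover.append(REVERSE[ch])  # map back to the simplified form
        else:
            if ch in PAIRS and len(bits) < n_bits:
                bits.append("0")
            cover.append(ch)
    return "".join(bits), "".join(cover)
```

For example, `embed("马车见门", "1010")` yields `"馬车見门"`, and `extract("馬车見门", 4)` returns the bits together with the original cover text, which is what makes this toy version reversible under the all-simplified assumption.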

Urdu News Classification using Application of Machine Learning Algorithms on News Headline

  • Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.229-237 / 2021
  • Our modern 'information-hungry' age demands that information be delivered at unprecedented speed. Timely delivery of noteworthy information about recent events can help people from all walks of life in a number of ways. As the world has become a global village, the volume and speed of the news flow require machines to help humans handle the enormous amount of data. News is presented to the public as video, audio, images, and text. News text available on the Internet is a source of knowledge for billions of Internet users. Urdu is spoken and understood by millions of people from the Indian subcontinent. The availability of online Urdu news enables these speakers to improve their understanding of the world and inform their decisions. This paper uses available online Urdu news data to train machines to categorize news automatically. Various machine learning algorithms were trained on news headlines, and the results demonstrate that the Bernoulli Naïve Bayes (Bernoulli NB) and Multinomial Naïve Bayes (Multinomial NB) algorithms outperformed the other algorithms on all performance measures. The highest accuracy achieved on the dataset was 94.278% by the Multinomial NB classifier, followed by the Bernoulli NB classifier at 94.274%, when Urdu stop words were removed from the dataset. The results suggest that the short text of news headlines can be used as input for text categorization.
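
To illustrate how a Multinomial Naïve Bayes classifier categorizes short headlines, here is a minimal standard-library sketch with Laplace smoothing and optional stop-word removal. The function names, the whitespace tokenizer, and the English placeholder headlines used below are assumptions for illustration; the paper's Urdu dataset and preprocessing are not reproduced.

```python
import math
from collections import Counter, defaultdict


def tokenize(text, stop_words=frozenset()):
    """Whitespace tokenizer with optional stop-word removal (an assumption)."""
    return [t for t in text.lower().split() if t not in stop_words]


def train_multinomial_nb(headlines, labels, stop_words=frozenset()):
    """Collect per-class document counts and per-class word counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(headlines, labels):
        tokens = tokenize(text, stop_words)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "n_docs": len(labels), "stop_words": stop_words}


def predict(model, headline, alpha=1.0):
    """Return the class with the highest log posterior (Laplace smoothing alpha)."""
    tokens = [t for t in tokenize(headline, model["stop_words"])
              if t in model["vocab"]]  # unseen words are ignored
    best_label, best_score = None, -math.inf
    for label, n in model["class_docs"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + alpha * len(model["vocab"])
        score = math.log(n / model["n_docs"])  # log prior
        for t in tokens:
            score += math.log((counts[t] + alpha) / denom)  # log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Placeholder English headlines standing in for Urdu news data.
toy_headlines = ["team wins cricket final", "captain scores century in match",
                 "stock market falls sharply", "central bank raises interest rate"]
toy_labels = ["sports", "sports", "business", "business"]
toy_model = train_multinomial_nb(toy_headlines, toy_labels)
```

With this toy model, `predict(toy_model, "cricket match today")` selects the sports class because "cricket" and "match" occur only in sports headlines. A Bernoulli variant would instead model the presence or absence of each vocabulary word per headline.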

A Study on the Development of E-book Contents for Fashion Online Entrepreneurship Education (패션온라인창업 교육을 위한 전자책 콘텐츠 개발에 대한 연구)

  • Hwa-Yeon Jeong;Eun-Hee Hong
    • Journal of the Korea Fashion and Costume Design Association / v.26 no.1 / pp.33-44 / 2024
  • This study developed e-book content in order to use e-books as a tool to provide more efficient classes to learners who are familiar with smart devices and online spaces. E-book contents were produced using Sigil-0.9.10. The development process is as follows. Before e-book development, it is necessary to prepare manuscript files, image files to be inserted, fonts to be used, and e-book covers. After inserting the book cover images, it is necessary to register the table of contents using the title tag and register the free fonts. Also, a style must be created for text or images used in the main text connected to a file containing the entire text. Then, after separating the entire text file into separate files according to each chapter, the text is completed in turn. E-books were produced focusing on hyperlink functions so that educational content and various example images could be accessed. Currently, there is a lack of research on e-books as textbooks in universities within the fashion design major. In the future, if e-book contents are developed according to the characteristics of courses and the level of learners, they can be used as effective teaching tools.

BADA-IV/I$^2$R: Design & Implementation of an Efficient Content-based Image Retrieval System using a High-Dimensional Image Index Structure (바다-IV/I$^2$R: 고차원 이미지 색인 구조를 이용한 효율적인 내용 기반 이미지 검색 시스템의 설계와 구현)

  • Kim, Yeong-Gyun;Lee, Jang-Seon;Lee, Hun-Sun;Kim, Wan-Seok;Kim, Myeong-Jun
    • The Transactions of the Korea Information Processing Society / v.7 no.2S / pp.678-691 / 2000
  • A variety of multimedia applications require multimedia database management systems to manage multimedia data such as text, images, and video, as well as to support content-based image or video retrieval. In this paper we design and implement a content-based image retrieval system, BADA-IV/I$^2$R (Image Information Retrieval), built on the BADA-IV multimedia database management system. In this system, image databases can be efficiently constructed and searched using visual features of images such as color, shape, and texture. We extend SQL statements to define image queries based on both annotations and visual features of images. A high-dimensional index structure, called the CIR-tree, is also employed to provide an efficient access method to image databases. We show that BADA-IV/I$^2$R provides a flexible way to define image retrieval queries and retrieves image data quickly and effectively. Retrieval effectiveness is shown using BEP (Bull's Eye Performance), the measure of retrieval effectiveness used in MPEG-7, and performance is shown by comparing the CIR-tree with the X-tree and the TV-tree.

Paralinguistic Communication of the Image on Cartoon and Comics (만화에서 이미지가 주는 언어적 커뮤니케이션)

  • Lee, Won-Seok
    • The Journal of the Korea Contents Association / v.11 no.1 / pp.83-91 / 2011
  • The defining attribute of cartoons and comics is generally understood to be the combination of image and text. This is borne out by the form of the earliest newspaper comics. Yet comics can also be read without words. Such works are not difficult to read, because their meaning is conveyed to readers through the images of the figures. How, then, does the image communicate with readers? This study examines visual image communication and the attributes of wordless comics.

Digital Watermarking Scheme Adopting Variable Spreading Sequence in Wireless Image Transmission (무선 이미지 전송에서 가변확산부호를 적용한 Digital Watermarking 기법)

  • 조복은;노재성;조성준
    • Proceedings of the IEEK Conference / 2002.06d / pp.109-112 / 2002
  • In this paper, we propose an efficient digital watermarking scheme for effectively transmitting compressed medical images embedded with watermark data over a mobile Internet access channel. The wireless channel error caused by multiple access interference (MAI) is closely related to the length of the spreading sequence in a CDMA system. In addition, the fixed-length coded medical image with a watermark bit stream can be classified by the significance of the source image. In the simulation, we compare the peak signal-to-noise ratio (PSNR) performance of an image watermarked with a simple symbol and an image watermarked with a text file, each transmitted using variable-length spreading sequences when the spreading sequence length is limited.
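
PSNR, the quality measure used in the simulation above, is conventionally defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between the original and received pixel values. The following is a minimal sketch, assuming 8-bit grayscale images flattened into equal-length sequences; the function name and the input layout are illustrative assumptions.

```python
import math


def psnr(original, received, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(received) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, received)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the received watermarked image is closer to the transmitted one, so channel errors from a short spreading sequence show up as a lower value.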