• Title/Summary/Keyword: Text data


Implementation of an Efficient Pocket PC-based Hangul Matching System (Pocket PC기반의 효율적인 한글 정합 시스템 구현)

  • Park Jong-Min;Cho Beom-Joon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.7
    • /
    • pp.1546-1552
    • /
    • 2004
  • Electronic ink is data stored as handwritten text or script, without conversion to ASCII by handwriting recognition, on pen-based computers and personal digital assistants (Pocket PCs) to support natural and convenient data input. One of the most important issues is how to search electronic ink so that it can be used. We propose and implement a script matching algorithm for electronic ink. The proposed algorithm separates each input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of the strokes in the database using dynamic programming.
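The dynamic-programming distance step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the cost function or recursion, so the Euclidean local cost and the dynamic-time-warping style alignment are assumptions, and the names `stroke_distance` and `dp_match_distance` are hypothetical.

```python
import math


def stroke_distance(a, b):
    """Euclidean distance between two primitive-stroke feature vectors (assumed cost)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dp_match_distance(query, candidate):
    """Alignment cost between two sequences of stroke feature vectors.

    Uses a DTW-style recursion (an assumption): each cell adds the local
    cost to the cheapest of the three neighbouring alignments.
    """
    n, m = len(query), len(candidate)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = stroke_distance(query[i - 1], candidate[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query stroke repeated
                cost[i][j - 1],      # candidate stroke repeated
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]
```

A database search under these assumptions would compute `dp_match_distance` between the query ink and each stored ink and return the entries with the smallest cost.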

Framework for Content-Based Image Identification with Standardized Multiview Features

  • Das, Rik;Thepade, Sudeep;Ghosh, Saurav
    • ETRI Journal
    • /
    • v.38 no.1
    • /
    • pp.174-184
    • /
    • 2016
  • Information identification with image data by means of low-level visual features has evolved as a challenging research domain. Conventional text-based mapping of image data has been gradually replaced by content-based techniques of image identification. Feature extraction from image content plays a crucial role in facilitating content-based detection processes. In this paper, the authors have proposed four different techniques for multiview feature extraction from images. The efficiency of extracted feature vectors for content-based image classification and retrieval is evaluated by means of fusion-based and data standardization-based techniques. It is observed that the latter surpasses the former. The proposed methods outclass state-of-the-art techniques for content-based image identification and show an average increase in precision of 17.71% and 22.78% for classification and retrieval, respectively. Three public datasets - Wang; Oliva and Torralba (OT-Scene); and Corel - are used for verification purposes. The research findings are statistically validated by conducting a paired t-test.
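As a rough illustration of what a data standardization step on extracted feature vectors might look like, the sketch below applies z-score standardization per feature dimension. The abstract does not say which standardization the authors use, so z-scoring is an assumption, and `zscore_columns` is a hypothetical helper name.

```python
import statistics


def zscore_columns(features):
    """Standardize each feature dimension to zero mean and unit variance.

    `features` is a list of equal-length feature vectors (one per image).
    Constant dimensions are left at zero rather than divided by zero.
    """
    columns = list(zip(*features))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(x - mu) / sd for x, mu, sd in zip(row, means, stdevs)]
        for row in features
    ]
```

Standardizing before combining features from different extraction techniques keeps any single dimension with a large numeric range from dominating distance computations.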

Study on Development of Journal and Article Visualization Services (학술정보 시각화 서비스 개발에 관한 연구)

  • Cho, Sung-Nam;Seo, Tae-Sul
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.50 no.2
    • /
    • pp.183-196
    • /
    • 2016
  • The academic journal is an important medium for carrying newly discovered knowledge in various disciplines. Visualizing journal and article information can make it more insightful and effective than text-based information. In this study, a visualization service platform for journal and article information is developed. A TagCloud is included in the infographics for both journals and articles. Each word in the TagCloud is interlinked with DBpedia using Linked Open Data (LOD) techniques.

A Study on Job Satisfaction/Retention Factors and Job Unsatisfaction/Turnover Factors by Industries using Job Reviews (직무 리뷰 분석을 통한 산업군별 직무만족/존속 요인 및 직무불만족/이직 요인에 관한 연구)

  • Lee, Jongseo;Kim, Sunggeun;Kang, Juyoung
    • Journal of Information Technology Services
    • /
    • v.16 no.1
    • /
    • pp.1-26
    • /
    • 2017
  • Keeping good, talented people is one of the most significant factors in a company's success. HR analytics is an important area for applying big data analysis techniques to human resources. It provides organizational insight that enables effective management of employees, allowing management to reach their business goals quickly and efficiently. Job satisfaction and employee turnover analysis are the keys to HR analytics. Job review web services have become popular. Because people exchange information about job satisfaction and turnover through these web services, useful information for HR analytics accumulates on job review websites. In this paper, we identified factors of employee retention by analyzing a Job Satisfaction/Retention group, and the factors of employee turnover by analyzing a Job Unsatisfaction/Turnover group. In order to do this, we first classified employees according to whether their self-reported job satisfaction or turnover was true. We collected and analyzed data from Jobplanet, a popular job review site. Through dominance analysis and LDA topic modeling, we found major factors, topics, and keywords of the classified groups by IT, service, and manufacturing domains. Our approach is a novel model to apply the analysis of reviews and text mining to the HR domain, and it will be practically helpful for setting new strategies that improve job satisfaction.

Performance Analysis of Multimedia File System

  • Park, Jinyoun;Youjip Won;Jaideep Srivastava
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.04a
    • /
    • pp.100-102
    • /
    • 2001
  • The intensive I/O bandwidth demand of multimedia streaming services puts a significant burden on the file system. Unlike legacy text-based or image data, the semantics of multimedia data can be significantly affected if a data block is not delivered by its predefined deadline. The legacy file system used in Unix and Unix-like environments is designed to efficiently handle files whose sizes range from a few hundred bytes to several tens of gigabytes. This fundamental design philosophy results in a file system based on a multi-level skewed tree structure. The multi-level i-node structure has a significant drawback when an application performs sequential read operations. In this article, we present the results of a performance study of a file system specifically designed for handling multimedia streams. We implemented the file system in a Linux environment and examined its performance under a streaming I/O workload. The results show that the proposed file system performs much more efficiently than the Linux ext2 file system.

A Cryptography Algorithm using Telescoping Series (망원급수를 이용한 암호화 알고리즘)

  • Choi, Eun Jung;Sakong, Yung;Park, Wang Keun
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.9 no.4
    • /
    • pp.103-110
    • /
    • 2013
  • In the information technology era, various IT technologies such as Big Data are emerging as the amount of information increases. The number of consultations regarding violations of personal data protection is also increasing every year, amounting to over 160,000 in 2012. According to the Korean Privacy Act, appropriate measures such as encipherment should be taken when handling unique personal identification information. Encipherment technologies are the most basic countermeasure against the invasion of personal data and a base element of information technology. Various cryptography algorithms therefore exist and are used for encipherment, and studies on safer new cryptography algorithms are being carried out. Cryptography algorithms started from classical substitution ciphers and developed into computationally secure ciphers with increased complexity. Nowadays, various mathematical theories such as 'prime factorization', 'extracting square roots', 'discrete logarithms', and 'elliptic curves' are adapted to cryptography algorithms. The RSA public-key cryptography algorithm, which is based on prime factorization, is the most representative one. This paper suggests an algorithm that uses telescoping series as a safer cryptography algorithm that can maximize complexity. A telescoping series is a type of infinite series that can generate various types of functions for a given value, the plain text. Among these generated functions, one can be selected as an original equation, and some part of this equation can be defined as a key. The original equation can then be transformed into a final equation by increasing its complexity through the "FullSimplify" command of the "Mathematica" software.
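The idea that many different series can sum to the same given value can be illustrated with the generic telescoping identity. This is a textbook identity shown for orientation only, not the paper's construction; the function f and the worked example are assumptions chosen for illustration.

```latex
\sum_{n=1}^{N} \bigl( f(n) - f(n+1) \bigr) = f(1) - f(N+1),
\qquad \text{for example} \qquad
\sum_{n=1}^{N} \frac{1}{n(n+1)}
  = \sum_{n=1}^{N} \left( \frac{1}{n} - \frac{1}{n+1} \right)
  = 1 - \frac{1}{N+1}
  \;\longrightarrow\; 1 \quad (N \to \infty).
```

Any f for which f(1) minus the limit of f(N+1) equals the target value yields a different series with the same sum, which is presumably the freedom the abstract relies on when it speaks of generating various functions for a given value.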

Direct Geo-referencing for Laser Mapping System

  • Kim, Seong-Baek;Lee, Seung-yong;Kim, Min-Soo
    • Proceedings of the KSRS Conference
    • /
    • 2002.10a
    • /
    • pp.423-427
    • /
    • 2002
  • In contrast to traditional text-based information, 4S (GIS, GNSS, SIIS, ITS) information can contribute to citizens' welfare in the coming era. Recently, GSIS (Geo-Spatial Information System) has been applied and emphasized in various fields. Analysis of data from the GSIS arena shows that the position information of objects and targets is crucial. Therefore, several methods of obtaining position have been proposed and developed. From this perspective, position collection and processing are the heart of 4S technology. We develop the 4S-Van, which enables real-time acquisition of position and attribute information and accurate image data at remote sites. In this study, the configuration of the 4S-Van equipped with GPS, INS, CCD, and an eye-safe laser scanner is shown, and the merits of the DGPS/INS integration approach for geo-referencing are briefly discussed. The DGPS/INS integration algorithm for determining the six parameters of motion is essential in the 4S-Van to avoid or simplify complicated computation such as photogrammetric triangulation. The 4S-Van serves as a Laser-Mobile Mapping System for three-dimensional data acquisition that merges texture information from the CCD camera. The technique is also applied in the fields of virtual reality, car navigation, computer games, planning and management, city transportation, mobile communication, etc.

Copyright Protection of E-books by Data Hiding Based on Integer Factorization

  • Wu, Da-Chun;Hsieh, Ping-Yu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.9
    • /
    • pp.3421-3443
    • /
    • 2021
  • A data hiding method based on integer factorization via e-books in the EPUB format with XHTML and CSS files for copyright protection is proposed. First, a fixed number m of leading bits in a message are transformed into an integer, which is then factorized to yield k results. One of the k factorizations is chosen according to the decimal value of a number n of the subsequent message bits, with n being decided as the binary logarithm of k. Next, the chosen factorization, denoted as a × b, is utilized to create a combined use of two elements in the XHTML files to embed the m + n message bits by including into the two elements a class selector named according to the value of a as well as a text segment with b characters. The class selector is created by the use of a CSS pseudo-element. The resulting web pages are of no visual difference from the original, achieving a steganographic effect. The security of the embedded message is also considered by randomizing the message bits before they are embedded. Good experimental results and comparisons with existing methods show the feasibility of the proposed method for copyright protection of e-books.
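The bit-to-factorization selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how factorizations are enumerated, how a zero value is handled, or how a non-integer logarithm is rounded, so the ordered pairs in ascending a, the +1 offset, and the floor of log2(k) are all assumptions, and the names `factor_pairs` and `choose_factorization` are hypothetical.

```python
def factor_pairs(value):
    """All ordered pairs (a, b) of positive integers with a * b == value, by ascending a."""
    return [(a, value // a) for a in range(1, value + 1) if value % a == 0]


def choose_factorization(bits, m):
    """Map the next m + n message bits to one factorization a x b.

    `bits` is a string of '0'/'1' characters. Returns ((a, b), bits_used).
    """
    if len(bits) < m:
        raise ValueError("not enough message bits for the leading block")
    value = int(bits[:m], 2) + 1      # assumption: +1 avoids factorizing 0
    pairs = factor_pairs(value)
    k = len(pairs)
    n = k.bit_length() - 1            # assumption: n = floor(log2(k))
    if len(bits) < m + n:
        raise ValueError("not enough message bits to select a factorization")
    index = int(bits[m:m + n], 2) if n > 0 else 0
    return pairs[index], m + n
```

For example, with m = 4 and message bits `101101`, the leading bits `1011` give 11, offset to 12, which has six ordered factor pairs, so n = 2 and the next bits `01` select the pair (2, 6). In the scheme described by the abstract, a would then name a class selector and b would set the length of a text segment.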

Favorable analysis of users through the social data analysis based on sentimental analysis (소셜데이터 감성분석을 통한 사용자의 호감도 분석)

  • Lee, Min-gyu;Sohn, Hyo-jung;Seong, Baek-min;Kim, Jong-bae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2014.10a
    • /
    • pp.438-440
    • /
    • 2014
  • Recently, data from SNS services has been actively used for commercial purposes. In this paper, we therefore propose a method that can accurately analyze information related to the reputation of companies and products in a real-time SNS environment. The method identifies relationships between words by performing morphological analysis on text data gathered by crawling SNS. In addition, morphemes extracted from sentences are analyzed statistically using an established sentiment dictionary, and the results are visualized. We also propose an algorithm that automatically adds an extracted word to the sentiment dictionary if it does not already exist there.

An Ontology-Based Labeling of Influential Topics Using Topic Network Analysis

  • Kim, Hyon Hee;Rhee, Hey Young
    • Journal of Information Processing Systems
    • /
    • v.15 no.5
    • /
    • pp.1096-1107
    • /
    • 2019
  • In this paper, we present an ontology-based approach to labeling influential topics of scientific articles. First, to find influential topics in scientific articles, topic modeling is performed, and then social network analysis is applied to the selected topic models. Abstracts of research papers related to data mining published over the 20 years from 1995 to 2015 are collected and analyzed in this research. Second, to interpret and explain the selected influential topics, the UniDM ontology is constructed from Wikipedia and serves as concept hierarchies of topic models. Our experimental results show that the subjects of data management and queries are identified in the most interrelated topic among other topics, followed by those of recommender systems and text mining. Also, the subjects of recommender systems and context-aware systems belong to the most influential topic, and the subject of k-nearest neighbor classifier belongs to the topic closest to other topics. The proposed framework provides a general model for interpreting topics in topic models, which plays an important role in overcoming ambiguous and arbitrary interpretation of topics in topic modeling.