• Title/Summary/Keyword: Text data

Data Transition Minimization Algorithm for Text Image (텍스트 영상에 대한 데이터 천이 최소화 알고리즘)

  • Hwang, Bo-Hyun;Park, Byoung-Soo;Choi, Myung-Ryul
    • Journal of Digital Convergence
    • /
    • v.10 no.11
    • /
    • pp.371-376
    • /
    • 2012
  • In this paper, we propose a new data coding method and its circuits for minimizing data transitions in text images. The proposed circuits solve the synchronization problem between input data and output data in the modified LVDS algorithm. The proposed algorithm transmits two data signals through an additional serial data coding method in order to minimize data transitions in text images, and it can halve the operating frequency. This addresses the EMI (electromagnetic interference) problem and reduces power consumption. The simulation results show that the proposed algorithm and circuits provide enhanced data transition minimization in text images and solve the synchronization problem between input data and output data.
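
The abstract does not spell out the coding scheme, so the following is only a minimal sketch of the quantity being minimized: the number of bit transitions in a serial stream. The function names `serialize` and `count_transitions` and the 8-bit word width are assumptions made for illustration, not part of the paper.

```python
def serialize(words, width=8):
    """Flatten integer words into one serial bit list, most significant bit first."""
    bits = []
    for w in words:
        for i in range(width - 1, -1, -1):
            bits.append((w >> i) & 1)
    return bits


def count_transitions(bits):
    """Count positions where a serial bit differs from the bit before it."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev != cur)


# Hypothetical sample data: long runs of identical bits versus an alternating pattern.
flat = serialize([0x00, 0xFF, 0xFF, 0x00])
noisy = serialize([0x55, 0xAA, 0x55, 0xAA])

print(count_transitions(flat), count_transitions(noisy))
```

A coding method that lowers this count for typical text-image data toggles the serial line less often, which is the effect the abstract links to lower EMI and power consumption.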

Text Mining in Online Social Networks: A Systematic Review

  • Alhazmi, Huda N
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.3
    • /
    • pp.396-404
    • /
    • 2022
  • Online social networks contain a large amount of data that can be converted into valuable and insightful information. Text mining approaches allow large-scale data to be explored efficiently. This study therefore reviews the recent literature on text mining in online social networks in a way that produces valid and valuable knowledge for further research. The review identifies the text mining techniques used in social networking, the data used, the tools, and the challenges. Research questions were formulated, then the search strategy and selection criteria were defined, followed by an analysis of each paper to extract the data relevant to the research questions. The results show that the social media platforms most often used as data sources are Twitter and Facebook. The most common text mining techniques were sentiment analysis and topic modeling. Classification and clustering were the most common approaches applied by the studies. The challenges include the need to process huge volumes of data, the noise, and the dynamic nature of the data. The study explores recent developments in text mining approaches in social networking by providing a general view of the state of work done in this research area.

Analysis of User Requirements Prioritization Using Text Mining : Focused on Online Game (텍스트마이닝을 활용한 사용자 요구사항 우선순위 도출 방법론 : 온라인 게임을 중심으로)

  • Jeong, Mi Yeon;Heo, Sun-Woo;Baek, Dong Hyun
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.43 no.3
    • /
    • pp.112-121
    • /
    • 2020
  • As internet usage increases, the amount of text data generated is also increasing. Because this text data includes users' comments, it can help capture users' opinions more efficiently and effectively. Text mining has been actively studied recently, but the work primarily focuses either on content analysis or on techniques for improving the performance of the mining algorithms. The objective of this study is to propose a novel method of analyzing users' requirements by utilizing text mining. To complement existing survey techniques, this study seeks to extract customer requirements efficiently from text data and to present their priorities. It identifies users' requirements, derives the priorities of the requirements, and identifies the detailed causes of high-priority requirements. The implications of this study are as follows. First, this study tried to overcome the limitations of traditional investigations such as surveys and VOCs through text mining of online text data. Second, decision makers can derive and prioritize users' requirements without having to analyze large amounts of text data manually. Third, user priorities can be derived on a quantitative basis.

Using Highly Secure Data Encryption Method for Text File Cryptography

  • Abu-Faraj, Mua'ad M.;Alqadi, Ziad A.
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.12
    • /
    • pp.53-60
    • /
    • 2021
  • Many standard methods are used for the cryptography of secret text files and secret short messages. These methods are efficient when the text to be encrypted is small, but their efficiency decreases rapidly as the text size increases. They also sometimes provide a low level of security, which depends on the PK length, and they may sometimes be hacked. In this paper, a new method is introduced to improve the data protection level by using a changeable secret speech file to generate the PK. The Highly Secure Data Encryption (HSDE) method is implemented and tested for data quality levels to ensure that HSDE destroys the data in the encryption phase and recovers the original data in the decryption phase. Some standard methods of data cryptography are also implemented, and comparisons are made to justify the enhancements provided by the proposed method.

Unstructured Data Quantification Scheme Based on Text Mining for User Feedback Extraction (사용자 의견 추출을 위한 텍스트 마이닝 기반 비정형 데이터 정량화 방안)

  • Jo, Jung-Heum;Chung, Yong-Taek;Choi, Seong-Wook;Ok, Changsoo
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.41 no.4
    • /
    • pp.131-137
    • /
    • 2018
  • People write reviews of numerous products and services on the Internet, in their blogs or on community bulletin boards. These unstructured data contain the authors' emotions and opinions about a product or service, which can provide important information for future product design or marketing. However, this text-based information cannot be evaluated quantitatively, so it is difficult to apply to mathematical models or optimization problems for product design and improvement. Therefore, this study proposes a method to quantitatively extract users' opinions or preferences about a specific product or service from the large amount of text-based information available online. The extracted unstructured text is decomposed into basic word units, and the positive rate is evaluated using existing emotional dictionaries and additional lists proposed in this study. This can be a way to effectively utilize unstructured text data, which is being generated and stored in vast quantities, in product or service design. Finally, to verify the effectiveness of the proposed method, a case study was conducted using movie review data retrieved from a portal website. By comparing the positive rates calculated by the proposed framework with user ratings for movies, a guideline on text-mining-based evaluation of unstructured data is provided.
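
As a rough illustration of dictionary-based scoring, the sketch below splits a review into words and computes a positive rate. The abstract does not define the rate, so the formula here (positive matches divided by all dictionary matches), the tiny English word lists, and the name `positive_rate` are all assumptions; the study itself works with existing emotional dictionaries plus its own additional lists.

```python
import re

# Hypothetical miniature sentiment dictionaries, for illustration only.
POSITIVE = {"great", "good", "fun", "moving"}
NEGATIVE = {"boring", "bad", "slow"}


def positive_rate(review, positive=POSITIVE, negative=NEGATIVE):
    """Return positive matches / all dictionary matches, or None if nothing matched."""
    words = re.findall(r"[a-z']+", review.lower())
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    matched = pos + neg
    return pos / matched if matched else None


rate = positive_rate("Great cast and a moving story, but the middle is slow.")
print(rate)
```

Returning `None` when no dictionary word appears keeps such reviews distinct from reviews that are genuinely all-negative, which would score 0.0.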

A study on unstructured text mining algorithm through R programming based on data dictionary (Data Dictionary 기반의 R Programming을 통한 비정형 Text Mining Algorithm 연구)

  • Lee, Jong Hwa;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.20 no.2
    • /
    • pp.113-124
    • /
    • 2015
  • Unlike structured data, which are gathered and saved in a predefined structure, unstructured text data, which are mostly written in natural language, have found wider application recently due to the emergence of Web 2.0. Text mining, which extracts meaningful information from text, is one of the most important big data analysis techniques, not only because the amount of text data has increased but also because text expresses human emotion directly. In this study, we used R, an open-source software environment for statistical analysis, and studied algorithm implementation to conduct analyses such as frequency analysis, cluster analysis, word clouds, and social network analysis. In particular, to stay within our research scope, we used a keyword extraction method based on a data dictionary. By applying the approach to real cases, we found that R is very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.

Text Augmentation Using Hierarchy-based Word Replacement

  • Kim, Museong;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.1
    • /
    • pp.57-67
    • /
    • 2021
  • Recently, multi-modal deep learning techniques that combine heterogeneous data for deep learning analysis have been widely used. In particular, studies on text-to-image synthesis, which automatically generates images from text, are being actively conducted. Deep learning for image synthesis requires a vast amount of data consisting of pairs of images and text describing each image. Therefore, various data augmentation techniques have been devised to generate a large amount of data from a small amount of data. A number of text augmentation techniques based on synonym replacement have been proposed so far. However, these techniques share a common limitation: replacing a noun with a synonym may generate text that does not match the content of the image. In this study, we propose a text augmentation method that replaces nouns using word hierarchy information. Additionally, we performed experiments using MSCOCO data to evaluate the performance of the proposed methodology.
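
The abstract does not state the exact replacement rule, so the sketch below assumes one plausible reading: each noun is swapped for its parent in a word hierarchy, on the assumption that a more general term still describes the paired image. The `HYPERNYM` table and the `augment` function are hypothetical stand-ins, not the authors' resources.

```python
# Hypothetical child -> parent (hypernym) table standing in for a real word hierarchy.
HYPERNYM = {
    "puppy": "dog",
    "dog": "animal",
    "kitten": "cat",
    "cat": "animal",
    "sofa": "furniture",
}


def augment(caption, hypernym=HYPERNYM):
    """Return one new caption per replaceable word, each with a single word generalized."""
    tokens = caption.split()
    variants = []
    for i, tok in enumerate(tokens):
        parent = hypernym.get(tok.lower())
        if parent is not None:
            variants.append(" ".join(tokens[:i] + [parent] + tokens[i + 1:]))
    return variants


captions = augment("a puppy sleeps on the sofa")
print(captions)
```

Replacing one word at a time keeps every generated caption close to the original, so each augmented pair can still be matched with the same image.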

The Prefix Array for Multimedia Information Retrieval in the Real-Time Stenograph (실시간 속기 자막 환경에서 멀티미디어 정보 검색을 위한 Prefix Array)

  • Kim, Dong-Joo;Kim, Han-Woo
    • Proceedings of the KIEE Conference
    • /
    • 2006.10c
    • /
    • pp.521-523
    • /
    • 2006
  • This paper proposes an algorithm and its data structure to support real-time full-text search of streamed or broadcast multimedia data containing real-time stenograph text. The traditional indexing methods used in information retrieval rely on linguistic information, which is costly. We therefore propose an algorithm and data structure based on the suffix array, which is a simple data structure with low space complexity. The suffix array is often useful for searching huge texts. However, the subtitle text of multimedia data grows longer over time, so a suffix array must be reconstructed as the subtitle text continually changes. We propose a data structure called the prefix array and a search algorithm that uses it.
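
For background, here is a minimal sketch of a plain suffix array with binary search; it is not the paper's prefix array, whose details the abstract does not give. The construction is the naive sort-based one, and the function names are chosen for illustration.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix they begin."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Binary-search the suffix array for every suffix that starts with pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa, find_occurrences(text, sa, "ana"))

# Appending one character reorders existing suffixes, so the array cannot simply be extended.
print(build_suffix_array(text + "s"))
```

In this toy case the suffixes starting at positions 5, 3, and 1 sort in that order for "banana" but in the order 1, 3, 5 for "bananas", which illustrates why a growing subtitle stream forces a suffix array to be rebuilt.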

Impact of Instance Selection on kNN-Based Text Categorization

  • Barigou, Fatiha
    • Journal of Information Processing Systems
    • /
    • v.14 no.2
    • /
    • pp.418-434
    • /
    • 2018
  • With the increasing use of the Internet and electronic documents, automatic text categorization has become imperative. Several machine learning algorithms have been proposed for text categorization. The k-nearest neighbor algorithm (kNN) is known to be one of the best state-of-the-art classifiers for text categorization. However, kNN suffers from limitations such as the high computational cost of classifying new instances. Instance selection techniques have emerged as highly competitive methods for improving kNN through data reduction. However, previous works have evaluated those approaches only on structured datasets. In addition, their performance has not been examined in the text categorization domain, where the dimensionality and size of the dataset are very high. Motivated by these observations, this paper investigates and analyzes the impact of instance selection on kNN-based text categorization in terms of classification accuracy, classification efficiency, and data reduction.
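
The abstract does not name the instance selection methods it evaluates. As one well-known example of the idea, the sketch below applies Hart's condensed nearest neighbor rule with a 1-NN classifier; the 2-D points are hypothetical stand-ins for document vectors, and all names are chosen for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(store, x):
    """1-NN prediction: label of the stored instance closest to x."""
    return min(store, key=lambda item: dist2(item[0], x))[1]


def condense(train):
    """Condensed nearest neighbor: keep only instances the current store misclassifies."""
    store = [train[0]]
    changed = True
    while changed:
        changed = False
        for x, y in train:
            if (x, y) not in store and nearest_label(store, x) != y:
                store.append((x, y))
                changed = True
    return store


# Hypothetical toy data: two well-separated classes of three points each.
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b"),
]
store = condense(train)
print(len(train), "->", len(store))
```

Each kNN query then compares against the reduced store instead of the full training set, which is the data reduction and efficiency trade-off the abstract refers to.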

A Review on Expressive Materials and Approaches to Text Visualization (텍스트 데이터 시각화의 표현 재료와 접근 방식에 관한 고찰)

  • Kim, Hyoyoung;Park, Jin Wan
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.1
    • /
    • pp.64-72
    • /
    • 2013
  • In this study, we examined the types, essence, and characteristics of text data, which is the material for visual expression in text visualization, a part of data visualization research, and also analyzed the multidirectional expressive approaches to it. Studies of text visualization have spread dramatically under the influence of advances in computing, open data, the wide use of visualization tools, and so on. For these reasons, text visualization works have been created as artworks or research output through various interdisciplinary, convergent research spanning engineering, art, the humanities, sociology, and other fields. Nevertheless, theoretical studies of text data itself and its visualization, and systematic analyses of its approaches, are rare. Data is an object of understanding and interpretation, and it holds unlimited information and possibilities depending on how it is processed and approached. Considering the status data may attain in future human society, text visualization, a convergent academic field that starts from the understanding and interpretation of data, needs further methodological research and theoretical accumulation.