• Title/Summary/Keyword: Text data


Text Visualization and Concordance Search Using Gutenberg Project Text Data (구텐베르그 프로젝트 텍스트 데이터를 활용한 시각화 및 용례 검색)

  • Kim, Dongsung;Shin, Yeonsu;Lee, Jian;Yu, Jimin
    • 한국어정보학회:학술대회논문집 / 2017.10a / pp.175-178 / 2017
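  • This study builds a system that combines macro-level big-data humanities with micro-level linguistic text search, and uses it to examine, in chronological order, the dynamic changes of culture as reflected in language. The ultimate goal is interdisciplinary, convergent research spanning 'humanities + information science + social science', in the spirit of Culturomics (文化體學), which treats culture as an entity that changes like a living organism and studies its constituent elements. Through this system, we visualize humanistic insights drawn from big text data, the record of human history. Google's achievement in this area lies in broadening the horizons of the humanities themselves through the convergence of the humanities and information technology, transforming the social sciences, and realigning the relationship between industry and the ivory tower [1].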


A Study on Indexing Method using Text Partition (텍스트분할에 의한 색인방법 연구)

  • 강무영;이상구
    • Journal of the Korean Society for Information Management / v.16 no.4 / pp.75-94 / 1999
  • Indexing is a prerequisite for an information retrieval system to retrieve documents stored in a database effectively. As the volume of digital data grows with advances in computing, the number of documents stored in databases has also increased greatly. Retrieving documents from such large collections requires substantial system resources and processing time. In this paper, we propose an improved indexing method using text partitioning. The method can retrieve documents from large collections in a short processing time. We applied the proposed indexing method to a real information retrieval system and demonstrated its effectiveness.


Implementation of information access embedded system using two-dimensional bar code and TTS (2D 바코드와 TTS를 활용한 정보접근 임베디드 시스템 구현)

  • Lee, Jae-Kyun;Kim, Si-Woo;Lee, Chae-Wook;Lee, Dong-In
    • IEMEK Journal of Embedded Systems and Applications / v.1 no.2 / pp.31-36 / 2006
  • Because two-dimensional bar codes can capture data and information quickly, they are widely used and recognized as a useful tool in many industrial applications. However, their information capacity is still limited. Recently, the two-dimensional AD bar code (analog-digital code) has been developed, which widens the range of applications and overcomes this capacity limitation. In this paper, we implement a system that converts text information into speech using the two-dimensional AD bar code and TTS (text-to-speech). The information can be delivered to blind people by capturing the AD bar code printed on paper or in books.


The Measurement of Skilled Typist's Typing Position for Developments of New Text Entry Input Device (새로운 문자입력장치 개발을 위한 숙련타이피스트의 타이핑 위치 측정)

  • 김진영;이호길;황성호;최혁렬
    • Proceedings of the Korean Society of Precision Engineering Conference / 2001.04a / pp.125-130 / 2001
  • Skilled typists can type characters or words without looking at the keyboard, relying on the relative positions of their fingers. If these relative positions can be identified, a virtual keyboard may be realized by applying the concept of a "DataGlove" or "FingerRing". Such a virtual keyboard may be an efficient new mobile input device that supports the QWERTY layout. To investigate skilled typing patterns, in this paper the touch positions of the fingers are measured with a touchscreen while five skilled typists type a long sentence. The measurements show that the touch positions form groups that correspond to alphabet characters. Although some groups overlap, examining how touch position changes with touch time reveals constant distances by which the groups can be discriminated. Based on this analysis, a prediction algorithm for the constant distance is proposed and evaluated, which is useful for realizing a portable virtual keyboard.


Ternary Decomposition and Dictionary Extension for Khmer Word Segmentation

  • Sung, Thaileang;Hwang, Insoo
    • Journal of Information Technology Applications and Management / v.23 no.2 / pp.11-28 / 2016
  • In this paper, we propose a dictionary extension and a ternary decomposition technique to improve the effectiveness of Khmer word segmentation. Most word segmentation approaches depend on a dictionary. However, the dictionary in use is not fully reliable and cannot cover all the words of the Khmer language, which causes the problem of unknown, or out-of-vocabulary, words. Our approach is to extend the original dictionary with new words to make it more reliable. In addition, we use ternary decomposition for the segmentation process. We also use the invisible space of Khmer Unicode (the zero-width space, U+200B) to segment our training corpus. With our segmentation algorithm, based on ternary decomposition and the invisible space, we can extract new words from the training text and add them to the dictionary. To test on unannotated text, we used the extended wordlist and a segmentation algorithm that does not rely on the invisible space. Our results remarkably outperformed other approaches: we achieved precision, recall, and F-measure of 88.8%, 91.8%, and 90.6%, respectively.
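
The dictionary-extension step in this abstract can be sketched roughly as follows. This is a minimal Python sketch, not the authors' implementation: it assumes the training corpus marks word boundaries with the zero-width space (U+200B), and the function names `extract_new_words` and `extend_dictionary` are hypothetical. The ternary decomposition step is not shown, since the abstract does not describe it.

```python
ZWSP = "\u200b"  # zero-width space, assumed to mark word boundaries in the corpus


def extract_new_words(training_text, dictionary):
    """Return words in the ZWSP-delimited text that the dictionary lacks."""
    tokens = (token.strip() for token in training_text.split(ZWSP))
    return {token for token in tokens if token and token not in dictionary}


def extend_dictionary(training_text, dictionary):
    """Return a new dictionary that also contains the newly extracted words."""
    return set(dictionary) | extract_new_words(training_text, dictionary)


if __name__ == "__main__":
    # Toy corpus of three Khmer words separated by zero-width spaces.
    corpus = "ខ្ញុំ" + ZWSP + "ទៅ" + ZWSP + "សាលា"
    base = {"ខ្ញុំ"}
    print(len(extend_dictionary(corpus, base)))
```

The original dictionary is left unmodified and a new set is returned, so the base and extended wordlists can be compared afterwards.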

Text-driven Speech Animation with Emotion Control

  • Chae, Wonseok;Kim, Yejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3473-3487 / 2020
  • In this paper, we present a new approach to creating speech animation with emotional expressions using a small set of example models. To generate realistic facial animation, two example models called key visemes and expressions are used for lip-synchronization and facial expressions, respectively. The key visemes represent lip shapes of phonemes such as vowels and consonants while the key expressions represent basic emotions of a face. Our approach utilizes a text-to-speech (TTS) system to create a phonetic transcript for the speech animation. Based on a phonetic transcript, a sequence of speech animation is synthesized by interpolating the corresponding sequence of key visemes. Using an input parameter vector, the key expressions are blended by a method of scattered data interpolation. During the synthesizing process, an importance-based scheme is introduced to combine both lip-synchronization and facial expressions into one animation sequence in real time (over 120Hz). The proposed approach can be applied to diverse types of digital content and applications that use facial animation with high accuracy (over 90%) in speech recognition.
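
Interpolating between key visemes can be illustrated with a small sketch. The abstract does not state which interpolation scheme is used, so the linear blend below is an assumption, and the names `lerp_viseme` and `sample_animation`, the keyframe format, and the data are all hypothetical.

```python
from bisect import bisect_right


def lerp_viseme(a, b, t):
    """Linearly blend two key visemes given as flat lists of coordinates."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def sample_animation(keyframes, time):
    """Sample a viseme track at `time`.

    `keyframes` is a list of (timestamp, viseme) pairs sorted by strictly
    increasing timestamp, e.g. derived from a phonetic transcript
    (assumed format).
    """
    times = [ts for ts, _ in keyframes]
    if time <= times[0]:
        return list(keyframes[0][1])
    if time >= times[-1]:
        return list(keyframes[-1][1])
    i = bisect_right(times, time)  # index of the first keyframe after `time`
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    return lerp_viseme(v0, v1, (time - t0) / (t1 - t0))


if __name__ == "__main__":
    track = [(0.0, [0.0, 0.0]), (1.0, [2.0, 4.0])]
    print(sample_animation(track, 0.5))
```

Times outside the track are clamped to the first or last key viseme, which keeps the sketch total over all inputs.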

Analysis of Admission Officer System Research Trends of Domestic Journals : Application of Network Text Analysis (입학사정관제 관련 국내 학술지 연구 경향 분석: 네트워크 텍스트 분석방법 적용)

  • Lee, Sang-Yong
    • Journal of Fisheries and Marine Sciences Education / v.24 no.5 / pp.643-652 / 2012
  • The admission officer system has been a research topic only since its introduction six years ago, so there are not yet enough domestic journal articles for a research trend analysis. With the first government financial support project for the admission officer system (2007-2012) coming to an end, it is important to summarize the results of existing studies and consider their implications. This study should help other researchers choose research directions. The implications of the research are as follows. First, the range of research is widening and research is advancing. Second, articles will appear more frequently in journals related to high schools. Finally, as data accumulate, more in-depth analyses will also appear.

Cell Phone Addiction in Highschool Students and Its Predictors (고등학생의 휴대전화 중독과 예측 요인)

  • Koo, Hyun-Young
    • Child Health Nursing Research / v.16 no.3 / pp.203-210 / 2010
  • Purpose: This study was done to identify cell phone addiction in high school students and the variables predicting this addiction. Methods: The participants were 469 adolescents from four high schools. Data were collected through self-report questionnaires and analyzed using the SPSS program. Results: Of the high school students, 88.4% reported being average users, 7.5% heavy users, and 4.1% cell phone addicted. Cell phone addiction was significantly correlated with immediate self-control, self-efficacy, depression, and peer support. Predictors of cell phone addiction were the following: receiving text messages on weekends, immediate self-control, main use (text messages), minutes per call on weekdays, listening to music, gender (female), monthly call charges, depression, person called (friends), and self-efficacy. These factors explained 39% of the variance in cell phone addiction. Conclusion: These findings indicate that cell phone addiction in high school students was influenced by gender, cell phone use, and psychological factors. Effective cell phone addiction management programs for high school students should therefore take these variables into account.

A Semantic Content Retrieval and Browsing System Based on Associative Relation in Video Databases

  • Bok Kyoung-Soo;Yoo Jae-Soo
    • International Journal of Contents / v.2 no.1 / pp.22-28 / 2006
  • In this paper, we propose a new semantic content model that uses individual features, associative relations, and visual features to efficiently support browsing and retrieval of the semantic content of video. We also design and implement a browsing and retrieval system based on this model. The browsing system supports annotation-based information, keyframe-based visual information, associative relations, and text-based semantic information using a tree-based browsing technique. The retrieval system supports text-based, visual-feature-based, and associative-relation-based retrieval, according to the retrieval type of the semantic content.


A Query Classification Method for Question Answering on a Large-Scale Text Data (대규모 문서 데이터 집합에서 Q&A를 위한 질의문 분류 기법)

  • 엄재홍;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.253-255 / 2000
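  • When a user wants a specific answer to a question, the problem with conventional information retrieval is that the search result is not the answer the user is looking for but a set of documents that may (or may not) contain it. For users to get the information they want quickly, without reading every candidate document, a system is needed that provides the actual answer rather than presenting a set of documents. To this end, conventional TF-IDF (Term Frequency-Inverse Document Frequency) based information retrieval can be combined with question classification using natural language processing (NLP) and pre-tagging of documents. In this study, we use the experimental setting of the 'Question & Answer' task of TREC-8, held in 1999 as part of the Text REtrieval Conference (TREC) organized annually by NIST (National Institute of Standards & Technology) and DARPA (Defense Advanced Research Projects Agency). We show experimentally that an information retrieval system that classifies questions by their features and uses pre-tagging of the target documents can present answers to users' questions more accurately and efficiently.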

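
As a rough illustration of the TF-IDF retrieval baseline mentioned in the abstract above, the following Python sketch ranks a few toy documents against a question. It is not the authors' system: the tokenizer, the `log(N / df)` weighting variant, the function names, and the sample documents are assumptions, and question classification and document tagging are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and trim basic punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def build_idf(docs):
    """Compute idf(term) = log(N / df) over the document collection."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


def score(query, doc, idf):
    """Sum tf * idf over the query terms; unseen terms contribute zero."""
    tf = Counter(tokenize(doc))
    return sum(tf[term] * idf.get(term, 0.0) for term in tokenize(query))


def rank(query, docs):
    """Return document indices ordered from most to least relevant."""
    idf = build_idf(docs)
    return sorted(range(len(docs)),
                  key=lambda i: score(query, docs[i], idf),
                  reverse=True)


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "The Eiffel Tower is in Paris.",
    ]
    print(rank("What is the capital of France?", docs))
```

Note that this returns ranked documents, not answers, which is exactly the limitation the abstract sets out to address.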