• Title/Summary/Keyword: text research (텍스트 연구)


LGG-based Phrase-Pattern Dictionaries of Non-Standard Tokens that contain Bound Nouns in Social Media Texts (SNS 텍스트의 비정규토큰 분석 성능 향상을 위한 의존명사 내포 어형의 LGG 기반 패턴문법 사전)

  • Choi, Seong-Yong;Shin, Dong-Hyok;Hwang, Chang-Hoe;Yoo, Gwang-Hoon;Nam, Jee-Sun
    • Annual Conference on Human and Language Technology / 2018.10a / pp.394-399 / 2018
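  • This study aims to build LGG-based pattern-grammar dictionaries that recognize the morphemes of word forms containing bound nouns, a high-frequency type among the non-standard tokens in social media (SNS) texts that morphological analyzers fail to analyze, and to evaluate their performance. Unlike conventional well-formed text, SNS text contains a very high frequency of unanalyzed words caused by spacing errors; types that include bound nouns were the most frequent, accounting for more than 20%. To handle the spacing errors of non-standard tokens containing bound nouns effectively, we built pattern-grammar dictionaries based on the Local Grammar Graph (LGG) framework. Applied to an SNS corpus, they achieved a precision of 91.28%, a recall of 89%, and a harmonic mean (F-measure) of 90.13%, demonstrating the usefulness of the approach and the practical value of the constructed resources.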
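
The harmonic mean reported in this abstract can be checked from the precision and recall figures. The following is a minimal sketch, assuming the standard balanced F-measure; the function name `f_measure` is illustrative and not taken from the paper, and the recall is used as the rounded 89% given in the abstract.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures from the abstract: precision 91.28%, recall 89% (rounded).
f1 = f_measure(0.9128, 0.89)
print(f"{f1 * 100:.2f}%")
```

With these inputs the result agrees with the reported 90.13% to two decimal places.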


A Study on Improvement of Image Classification Accuracy Using Image-Text Pairs (이미지-텍스트 쌍을 활용한 이미지 분류 정확도 향상에 관한 연구)

  • Mi-Hui Kim;Ju-Hyeok Lee
    • Journal of IKEEE / v.27 no.4 / pp.561-566 / 2023
  • With the development of deep learning, it is possible to solve various computer non-specialized problems such as image processing. However, most image processing methods use only the visual information of the image. Text data related to images, such as descriptions and annotations, may provide additional tactile and visual information that is difficult to obtain from the image itself. In this paper, we aim to improve image classification accuracy with a deep learning model that analyzes images and texts using image-text pairs. The proposed model improved classification accuracy by approximately 11% over a deep learning model that uses only image information.

Text summarization of dialogue based on BERT

  • Nam, Wongyung;Lee, Jisoo;Jang, Beakcheol
    • Journal of the Korea Society of Computer and Information / v.27 no.8 / pp.41-47 / 2022
  • In this paper, we propose a method for summarizing colloquial data that are not clearly organized. We used SAMSum, a colloquial dialogue dataset, and applied the BERTSumExtAbs model proposed in previous work on automatic summarization. More than 70% of the SAMSum dataset consists of conversations between two people, and the remaining roughly 30% consists of conversations between three or more people. Applying the automatic text summarization model to colloquial data yielded a ROUGE-1 (R-1) score of 42.43 or higher. In addition, fine-tuning the previously proposed BERTSum text summarization model yielded a high score of 45.81. This study demonstrates the performance of abstractive summarization on colloquial data, and we hope it will serve as basic data for helping computers understand human natural language as it is and solve various tasks.
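
The R-1 scores in this abstract refer to ROUGE-1, which measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch, assuming lowercased whitespace tokenization with no stemming; the function name `rouge_1` and the example sentences are made up for illustration and are not from the paper or from SAMSum.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 using clipped token counts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical dialogue summary pair (illustrative only).
reference = "amanda will bring jerry some cookies tomorrow"
candidate = "amanda will bring cookies"
scores = rouge_1(candidate, reference)
print({name: round(100 * value, 2) for name, value in scores.items()})
```

Scores are multiplied by 100 when printed because ROUGE results such as 42.43 are conventionally reported on a 0 to 100 scale.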

Case Study on Public Document Classification System That Utilizes Text-Mining Technique in BigData Environment (빅데이터 환경에서 텍스트마이닝 기법을 활용한 공공문서 분류체계의 적용사례 연구)

  • Shim, Jang-sup;Lee, Kang-wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.1085-1089 / 2015
  • In the past, text-mining techniques were difficult to implement as analysis algorithms because of the complexity of text and the degrees of freedom of the variables within it. Although the algorithms demanded considerable effort to produce meaningful results, mechanical text analysis took more time than human text analysis. However, with advances in hardware and analysis algorithms, big data technology has emerged. Thanks to big data technology, the problems mentioned above have been solved, and analysis through text mining has come to be recognized as valuable. Nevertheless, applying text mining to Korean text is still at an early stage because of the linguistic characteristics of the Korean language. If text mining makes analysis possible in addition to data search, the savings in human and material resources required for text analysis will lead to efficient resource utilization in many areas of public work. In this paper, we therefore compare and evaluate manual public document classification against classification that uses text-mining-based word frequency (TF-IDF) and the cosine similarity between documents in a big data environment.
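
The two measures named in this abstract, TF-IDF weighting and cosine similarity between documents, can be sketched in a few lines. This is a minimal illustration, assuming pre-tokenized documents, relative term frequency, and an unsmoothed logarithmic IDF; the helper names and the toy documents are hypothetical and do not reproduce the authors' system.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy "public documents", already tokenized (illustrative only).
docs = [
    ["budget", "plan", "road", "repair"],
    ["road", "repair", "schedule"],
    ["school", "meal", "budget"],
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]), cosine(vecs[1], vecs[2]))
```

One way to turn these similarities into a classifier would be to assign each new document the category of its most similar labeled document, although the abstract does not say which decision rule was used.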


A Study on the Postmodern Identity in Madonna Costume -Focusing on the intertextuality- (마돈나 의상에 나타난 포스트모던 정체성)

  • 김주영;양숙희
    • Journal of the Korean Society of Costume / v.51 no.8 / pp.123-139 / 2001
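  • This paper studies the postmodern identity expressed in the body, costume, and imagery of the music videos and performances of Madonna, an icon of 20th-century popular culture, through intertextuality, that is, the parallel quotation of subcultural texts of space, time, class, and religion, in order to understand the autonomous female identity and aesthetic subjectivity that run through contemporary media culture. First, inter-spatial-textual costume, such as the Spanish look, Thai look, geisha look, techno-punk look, and techno-cowgirl look, presents diverse, non-authoritarian gazes through a sense of geographical alienation between East and West, thereby offering an aesthetic experience expanded along with multinational capitalism. Second, inter-temporal-textual costume pursued a playful utopia through simultaneous, dreamlike images such as the medieval empire dress, the robe à la française of the 18th-century Rococo period, and the futuristic "third species" look. Third, inter-class-textual costume, such as the graffiti look, punk look, kitsch look, Monroe look, voguing, and Evita look, quoted in parallel the upper and lower classes, subcultures, wealth and poverty, and the presence or absence of power, deconstructing the dichotomies of good/bad taste and whore/saint and building a declassed, ambivalent identity that celebrates both anti-bourgeois resistance and materialism. Fourth, inter-religious-textual costume quotes the texts of Catholicism, a symbolic patriarch, and through the funky Christian look, the erotic Christian look, and similar styles deconstructs the dichotomies of good/evil, sacredness/sensuality, beauty/ugliness, and modesty/immodesty, as well as tragic beauty, asserting a hedonistic, anti-authoritarian identity that emphasizes the autonomy of art and the unconscious. By becoming the subject of the gaze, power, and pleasure in terms of sexuality, and by deconstructing the dichotomies of good and evil, beauty and ugliness, and modesty and immodesty in terms of aesthetic categories, the postmodern identity of Madonna's costume, which constitutes a fluid self, offers women expanded possibilities and aims at an open form of dress deconstructed from within.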


A Study of Hangul Text Steganography based on Genetic Algorithm (유전 알고리즘 기반 한글 텍스트 스테가노그래피의 연구)

  • Ji, Seon-Su
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.7-12 / 2016
  • In a hostile Internet environment, steganography focuses on hiding a secret message inside a cover medium to increase security; it complements encryption. This paper presents a text steganography technique that uses Hangul text. To enhance the security level, secret messages are first encrypted through the crossover operator of a genetic algorithm and then embedded into a cover text to form the stego text without noticeably changing its properties and structure. The experiments show that, with the capacity of the cover medium maintained at 3.69%, the size of the stego text increased by up to 14%.
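
The abstract does not describe how the crossover operator is applied to the secret message, so the following is only a hypothetical sketch of the general idea: single-point crossover between adjacent bytes, with the crossover point acting as a shared key. The function names, the byte-pair grouping, and the 8-bit width are all assumptions, and this scrambling is not a secure cipher on its own.

```python
def crossover_pair(a: int, b: int, point: int, width: int = 8):
    """Single-point crossover: keep the top `point` bits, swap the remaining low bits."""
    if not 0 < point < width:
        raise ValueError("point must fall strictly inside the bit width")
    low_mask = (1 << (width - point)) - 1
    child_a = (a & ~low_mask) | (b & low_mask)
    child_b = (b & ~low_mask) | (a & low_mask)
    return child_a, child_b


def scramble(data: bytes, point: int) -> bytes:
    """Apply crossover to each adjacent byte pair; an odd trailing byte is left unchanged."""
    out = bytearray(data)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = crossover_pair(out[i], out[i + 1], point)
    return bytes(out)


# Hypothetical secret message in Hangul, encoded as UTF-8 bytes.
secret = "한글".encode("utf-8")
scrambled = scramble(secret, point=3)
restored = scramble(scrambled, point=3)
print(scrambled.hex(), restored.decode("utf-8"))
```

Because swapping the same bit ranges twice restores the original bytes, the receiver can undo the step with the same crossover point before extracting the message.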

A Corpus-based Study of Translation Universals in English Translations of Korean Newspaper Texts (한국 신문의 영어 번역에 나타난 번역 보편소의 코퍼스 기반 분석)

  • Goh, Gwang-Yoon;Lee, Younghee (Cheri)
    • Cross-Cultural Studies / v.45 / pp.109-143 / 2016
  • This article examines distinctive linguistic shifts of translational English in an effort to verify the validity of the translation universals hypotheses, including simplification, explicitation, normalization and leveling-out, which have been most heavily explored to date. A large-scale study involving comparable corpora of translated and non-translated English newspaper texts has been carried out to typify particular linguistic attributes inherent in translated texts. The main findings are as follows. First, by employing the parameters of STTR, top-to-bottom frequency words, and mean values of sentence lengths, the translational instances of simplification have been detected across the translated English newspaper corpora. In contrast, the portion of function words produced contrary results, which in turn suggests that this feature might not constitute an effective test of the hypothesis. Second, it was found that the use of connectives was more salient in original English newspaper texts than in translated English texts, which is incompatible with the explicitation hypothesis. Third, as an indicator of translational normalization, lexical bundles were found to be more pervasive in translated texts than in non-translated texts, which is expected from, and therefore supports, the normalization hypothesis. Finally, the standard deviations of both STTR and mean sentence lengths turned out to be higher in translated texts, indicating that the translated English newspaper texts were less leveled out within the same corpus group, which is opposed to what the leveling-out hypothesis postulates. Overall, the results suggest that not all four hypotheses may qualify for the label translation universals, or at least that some translational predictors are not feasible enough to evaluate the effectiveness of the translation universals hypotheses.
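
STTR (standardized type-token ratio), one of the simplification measures named in this abstract, averages the type-token ratio over consecutive fixed-size chunks so that texts of different lengths can be compared. The following is a minimal sketch, assuming whitespace-tokenized input and that a trailing partial chunk is discarded; the chunk size used in the study is not stated in the abstract, so the default of 1,000 tokens is an assumption.

```python
def sttr(tokens, chunk_size=1000):
    """Mean type-token ratio (in percent) over consecutive full chunks."""
    full_chunks = len(tokens) // chunk_size
    if full_chunks == 0:
        raise ValueError("text is shorter than one chunk")
    ratios = []
    for i in range(full_chunks):
        chunk = tokens[i * chunk_size:(i + 1) * chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)  # distinct types per chunk
    return 100 * sum(ratios) / len(ratios)


# Tiny illustrative text with a chunk size of 4 tokens.
tokens = "the cat saw the dog and the dog saw the cat again".split()
print(round(sttr(tokens, chunk_size=4), 2))
```

A lower STTR indicates less lexical variety, which is the direction the simplification hypothesis predicts for translated texts.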

Narrative Characteristics in High School Students' Geological Field Trip Reports: the Relationship Between the Narrative Mode of Thought and the Academic Achievement (지질 답사 보고서에 나타난 고등학생들의 내러티브 특성: 내러티브적 사고와 학업 성취도의 관계)

  • Chung, Sue-Im;Shin, Dong-Hee
    • Journal of The Korean Association For Science Education / v.35 no.4 / pp.735-750 / 2015
  • The purpose of this study is to draw educational implications by analyzing the context of narrative texts, students' narrative thinking, and their academic achievement. We investigated the text types in students' geological field trip reports, the reasons why students favor narrative texts, the relationship between narrative texts and scientific knowledge recall, and the relationship between narrative thought and academic achievement. All students used expository texts, 82% used argumentative texts, and 36% used narrative texts. Students likely used more narrative texts because they were engaged in an outdoor activity, where their emotions were more activated than during lab activities. The academic characteristics of earth science also seemed to contribute to the narrative texts in students' reports. The post-test revealed that students whose reports contained narrative texts recalled better than the others. On the other hand, there were no statistically meaningful differences in academic achievement between the two groups. However, female students whose reports contained narrative texts achieved significantly higher scores than female students whose reports did not. From in-depth interviews, we found that students who properly used both the paradigmatic and narrative modes of thought were in a more advantageous position than those who used narrative thought only. We also found that some narratively thinking students tended to feel uncomfortable with the way science is learned or with evaluation questions about science. In the future, a complementary approach combining the narrative and paradigmatic modes of thought should be encouraged, based on an understanding of students' thinking tendencies.

The Present Status of Picture Book Reading Activities and Utilization of Picture Book Peritexts of Early Childhood Teachers (유아교사의 그림책 읽기활동 현황 및 주변텍스트에 대한 인식과 활용)

  • Nam, A Reum;Kim, Sang Lim
    • Korean Journal of Child Education & Care / v.19 no.3 / pp.157-170 / 2019
  • Objective: The purpose of the study was to investigate the current status of early childhood teachers' picture book reading activities and their knowledge and utilization of picture book peritexts. Methods: The subjects were 276 early childhood teachers in the Seoul metropolitan area. A survey was conducted to investigate the teachers' current picture book reading activities as well as their knowledge and utilization of picture book peritexts. The collected data were analyzed with SPSS Statistics 21.0 to obtain descriptive statistics such as frequencies and percentages. Results: Most early childhood teachers recognized that reading picture books to young children was very important and responded that the purpose of reading picture books was to develop children's imagination and creativity. Regarding the teachers' knowledge of 12 peritexts, some peritexts such as 'title', 'cover', and 'title page' were recognized at a high level, while others such as typography and layout were recognized at a low level. In addition, the teachers' level of peritext utilization was relatively low compared with their level of knowledge. Conclusion/Implications: The results imply that early childhood teachers need to be informed of the concepts of picture book peritexts and encouraged to utilize peritexts while reading picture books to young children.

A Study on Proposal of Emotional expression for Online instant Messenger (온라인 인스턴트 메신저의 감정표현 방법 제안에 관한 연구)

  • Kim, Ju-Yong;Soh, Yeon-Jung
    • 한국HCI학회:학술대회논문집 / 2007.02b / pp.48-54 / 2007
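  • As computer networks have developed and the Internet has become part of everyday life, communication via online instant messengers has been expanding. These messengers offer various ways to express emotion, such as emoticons, Flashcons, and winks. However, online emotional expression is still largely entertainment-oriented and is limited compared with the variety of offline emotional expression. A new form of emotional communication is therefore needed that can convey users' diverse offline emotional expressions online. This study seeks to identify user requirements by understanding how online instant messengers are used, and to explore communication methods that can be proposed on that basis. It proposes a new emotional communication method by applying the ways users express emotion in real life to online messengers, and it proposes a concept for emotional expression based on user surveys of emotion vocabulary and expression methods. First, a user FGI (focus group interview) was conducted to derive messenger usage behavior and requirements. From the usage behavior obtained through the FGI, 'Textcon', which uses text and motion, was derived as the concept idea. To apply motion to the idea, it was divided into behaviors arising from emotion and somatosensory elements. An online survey was conducted to extract concrete expression methods and applicable elements, yielding emotion vocabulary and expression methods (behavior and somatic sensation). Using the 'user participatory observation' technique, the vocabulary and behaviors were organized and made concrete, and the results were applied to 'Textcon'. This study confirmed that users prefer text when expressing emotion through messengers and have requirements concerning text, and it derived the emotional-expression concept idea 'Textcon'. Its significance lies in using the 'user participatory observation' technique to apply the body language and behaviors that users express and perceive in real life as a way to supplement existing emotional-expression methods centered on graphic icons, typified by emoticons. In addition, by dividing the physiological response elements caused by emotion into external and internal responses and taking the internal response called 'somatic sensation' as the central means of expression, the study was able to approach the problem from a perspective different from existing ones.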
