• Title/Summary/Keyword: text generation

Search results: 366

Study on Difference of Wordvectors Analysis Induced by Text Preprocessing for Deep Learning (딥러닝을 위한 텍스트 전처리에 따른 단어벡터 분석의 차이 연구)

  • Ko, Kwang-Ho
    • The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.489-495 / 2022
  • The results of LSTM deep learning (D/L) for language model construction differ as the corpus preprocessing changes. In this study, an LSTM model was trained on a corpus of well-known literary poems (Ki Hyung-do's work). Once training completed, two sets of word vectors were obtained, one for the original text and one for the text with word endings erased. The two corpora were compared in terms of the results of similarity/analogy operations, the positions of the word vectors in the 2D plane, and the texts generated by the language models. The words suggested by the similarity/analogy operations differ between the corpora, but they are well related given the character of the corpus as a literary work. The positions of the word vectors differ between the corpora, but the words retain their basic meanings; the generated texts also differ, but they keep the flavor of the original style. The analysis results suggest that a D/L language model can be a useful tool for enjoying literature objectively and in diverse ways.
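
The similarity and analogy operations mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the study's code: the three-dimensional `toy_vectors` and the helper names `cosine`, `most_similar`, and `analogy` are assumptions made for the example, whereas the study's vectors came from its LSTM training.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(vectors, query, exclude=()):
    """Return the word whose vector is closest to `query` by cosine similarity."""
    candidates = [(word, cosine(query, vec))
                  for word, vec in vectors.items() if word not in exclude]
    return max(candidates, key=lambda item: item[1])[0]

def analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return most_similar(vectors, target, exclude={a, b, c})

# Hypothetical toy vectors, chosen by hand for illustration only.
toy_vectors = {
    "night": [0.9, 0.1, 0.0],
    "dark": [0.8, 0.2, 0.1],
    "day": [0.1, 0.9, 0.0],
    "bright": [0.2, 0.8, 0.1],
    "window": [0.0, 0.1, 0.9],
}

nearest_to_night = most_similar(toy_vectors, toy_vectors["night"], exclude={"night"})
analogy_answer = analogy(toy_vectors, "night", "dark", "day")
```

Running the same queries against vectors trained on two differently preprocessed corpora is one way to observe the kind of differences the abstract reports.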

An Analysis of the Social-Cultural Meaning of Korean Girl Groups' Appearances -Focusing on the Change of Girl Groups' Appearances across Generations- (국내 걸그룹 외모에 나타난 사회문화적 의미 분석 - 세대별 걸그룹 외모 변화를 중심으로 -)

  • Han, Cha-young
    • Journal of Fashion Business / v.21 no.1 / pp.12-31 / 2017
  • Commercially organized Korean girl groups became prominent in the late 1990s. By the late 2000s, however, girl groups had an even more profound effect on Korean popular music than before. This study aimed to analyze the social-cultural meaning of the changing appearance of girl groups between the first and second generations. For this purpose, it analyzed media images and text about 13 girl groups, based on a social-cultural context. The results are as follows. First, while the first-generation girl groups tended to maintain girlish/sexy images that catered to male desire, the second-generation girl groups strategically showed various sexual identities, such as femininity, masculinity, and androgyny, along with contextual sexual images. Girl groups increased the number of strategic images featuring various sexual identities in order to appeal to a wide, diverse audience. Second, the second-generation girl groups had slim bodies with great athleticism, basically due to the trainee system. Because of this, their semiotic body images have been used commercially to promote consumption. Third, the second-generation girl groups were bigger stars than the first-generation girl groups because their members worked in many different fields. Therefore, the group members' images were successfully consumed directly and then reproduced symbolically. Fourth, each member of the second-generation girl groups was characterized by appearing in diverse yet familiar images through various media sources. Although the intention was to gain recognition and popularity, it became difficult for members to change their image once one particular image was deemed popular.

AJFCode: An Approach for Full Aspect-Oriented Code Generation from Reusable Aspect Models

  • Mehmood, Abid;Jawawi, Dayang N.A.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1973-1993 / 2022
  • Model-driven engineering (MDE) and aspect-oriented software development (AOSD) contribute to the common goal of development of high-quality code in reduced time. To complement each approach with the benefits of the other, various methods of integration of the two approaches were proposed in the past. Aspect-oriented code generation, which targets obtaining aspect-oriented code directly from aspect models, offers some unique advantages over the other integration approaches. However, the existing aspect-oriented code generation approaches do not comprehensively address all aspects of a model-driven code generation system, such as a textual representation of graphical models, conceptual mapping, and incorporation of behavioral diagrams. These problems limit the worth of generated code, especially in practical use. Here, we propose AJFCode, an approach for aspect-oriented model-driven code generation, which comprehensively addresses the various aspects including the graphical models and their text-based representation, mapping between visual model elements and code, and the behavioral code generation. Experiments are conducted to compare the maintainability and reusability characteristics of the aspect-oriented code generated using the AJFCode with the most comprehensive object-oriented code generation approach. AJFCode performs well in terms of all metrics related to maintainability and reusability of code. However, the most significant improvement is noticed in the separation of concerns, coupling, and cohesion. For instance, AJFCode yields significant improvement in concern diffusion over operations (19 vs 51), coupling between components (0 vs 6), and lack of cohesion in operations (5 vs 9) for one of the experimented concerns.

Randomized Block Size (RBS) Model for Secure Data Storage in Distributed Server

  • Sinha, Keshav;Paul, Partha;Amritanjali, Amritanjali
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4508-4530 / 2021
  • Distributed data storage services are widely used today. However, the lack of proper security measures leaves user data vulnerable. In this work, we propose a Randomized Block Size (RBS) model for secure data storage in distributed environments. The model works with multifold block sizes encrypted with the Chinese Remainder Theorem-based RSA (C-RSA) technique for end-to-end security of multimedia data. The proposed RBS model has a key generation phase (KGP) for constructing asymmetric keys, and a rand generation phase (RGP) for applying optimal asymmetric encryption padding (OAEP) to the original message. The experimental results obtained with text and image files show that the post-encryption file size is not much affected, and data is efficiently encrypted while being stored at the distributed storage server (DSS). Ciphertext size, encryption time, and throughput were considered for performance evaluation, whereas statistical analyses such as similarity measurement, correlation coefficient, histogram, and entropy analysis were used to check the deviation of image pixels. The number of pixels change rate (NPCR) and unified averaged changed intensity (UACI) were used to check the strength of the proposed encryption technique. The proposed model is robust, with high resilience against eavesdropping, insider attacks, and chosen-plaintext attacks.
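
To make the CRT-based RSA idea concrete, here is a minimal sketch of the textbook technique of speeding up RSA decryption with the Chinese Remainder Theorem. It is not the paper's C-RSA or RBS implementation: the tiny primes, the function names `make_keys`, `encrypt`, and `decrypt_crt`, and the omission of OAEP padding and block-size randomization are all assumptions made to keep the example short.

```python
# Toy RSA with CRT-accelerated decryption. The primes are tiny and no padding
# is applied, so this is for illustration only and is NOT secure.

def make_keys(p, q, e):
    """Build a public key and a CRT-form private key from primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    public = {"n": n, "e": e}
    private = {
        "p": p,
        "q": q,
        "d": d,                   # kept only to compare against plain RSA
        "dp": d % (p - 1),        # exponent used modulo p
        "dq": d % (q - 1),        # exponent used modulo q
        "q_inv": pow(q, -1, p),   # inverse of q modulo p
    }
    return public, private

def encrypt(public, m):
    """Textbook RSA encryption of an integer m < n."""
    return pow(m, public["e"], public["n"])

def decrypt_crt(private, c):
    """Decrypt with two small exponentiations, then recombine (Garner's formula)."""
    p, q = private["p"], private["q"]
    m1 = pow(c, private["dp"], p)
    m2 = pow(c, private["dq"], q)
    h = (private["q_inv"] * (m1 - m2)) % p
    return m2 + h * q

public_key, private_key = make_keys(p=61, q=53, e=17)
message = 65
ciphertext = encrypt(public_key, message)
recovered = decrypt_crt(private_key, ciphertext)
```

The two exponentiations modulo `p` and `q` use roughly half-length operands, which is why CRT decryption is commonly preferred over a single exponentiation modulo `n`.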

The Contribution of The Research on "Somunchajujipso(素問次注集疏)" and "Somun(素問)" (《소문차주집소(素問次注集疏)》 대(對) 《소문(素問)》 연구적공헌(硏究的貢獻))

  • Guo, Xiu-Mei
    • Journal of Korean Medical classics / v.22 no.4 / pp.51-54 / 2009
  • To study "Somun(素問)", one must consult the Wangbing(王冰) Note, which later generations in turn have to understand by reading the Sin-gyojeong(新校正) of Imeok(林億) from the Song dynasty. In the late Edo period in Japan, the renowned Han medicine scholar Yamada(山田) Gyoukou(業廣) took a completely new approach and compiled "Somunchajujipso(素問次注集疏)", a work of notes and commentaries that combines the original text of "Somun(素問)", the Wangbing(王冰) Note, and the Sin-gyojeong(新校正), drawing on medical books and annotations from successive generations in both China and Japan. Many books across generations have annotated "Somun(素問)", but few of them annotate the original text, and they touch on the Wangbing Note only slightly at most. In "Somunchajujipso", textual research and notes are given, exceptionally, for the foreword, original text, and explanation parts of the Song dynasty edition of "Somun". In particular, no one to date has explained the foreword of Imeok(林億) in more detail than Gyoukou(業廣). This amply demonstrates Gyoukou's(業廣) rich knowledge, accumulated through years of hard research in Confucian classics, history, and medical books, which makes the work a valuable reference. The publication of "Somunchajujipso(素問次注集疏)" opens a new area for research on "Somun" and represents new progress in "Somun" research in Japan.

Automatic Pronunciation Generator Using Selection Procedure for Exceptional Pronunciation Words (예외 단어 선별 작업을 이용한 자동 발음열 생성 시스템)

  • 안주은;김순협;김선희
    • The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.248-252 / 2004
  • Cultural, social, economic, and various other environmental factors affect our language, and different words and terminology are used and coined for different contexts, resulting in quantitative change in vocabulary. This paper presents an automatic pronunciation generator that uses a selection procedure for exceptional pronunciation words from an added text corpus, which reflects this dynamic nature of language. For our experiment, we used the text corpus released by ETRI for speech recognition, consisting of 53,750 sentences (740,497 Eojols), and obtained a 100% performance level with the proposed automatic pronunciation generator.

A Semiotic Analysis of Starcraft : Sense Analysis by Greimas's Carre Semiotique (스타크래프트에 관한 기호학적 분석 : 그레마스의 기호 사각형을 응용한 의미분석)

  • Park, Tae-Soon
    • Journal of Korea Game Society / v.7 no.1 / pp.21-29 / 2007
  • This paper attempts to analyze Starcraft using Greimas's Carre Semiotique and the theory of structure generation semiotics, which are useful for non-verbal text as well as verbal text. First, using Christian Metz's grand syntagma theory and principle, this study articulated the text of Starcraft. The analysis revealed that Starcraft has war as its axis of sense and production and destruction as its primary sense categories. The sense of Starcraft is generated by this axis of sense and these sense categories. This analysis is expected to be a stepping stone for further analysis at the narrative and discursive levels.

Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis (키워드 기반 주제중심 분석을 이용한 비정형데이터 처리)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.521-526 / 2017
  • The data formats of big data are diverse and vast, and the data is generated very quickly, requiring new management and analysis methods rather than traditional data processing methods. Text mining techniques can be used to extract useful information from unstructured text written in human language in online documents on social networks. Identifying trends in the messages about politics, economy, and culture left behind on social media helps in understanding which topics people are interested in. In this study, text mining was performed on online news related to a given keyword using a topic-oriented analysis technique. We use Latent Dirichlet Allocation (LDA) to extract information from web documents and analyze which subjects are of interest for a given keyword and which topics relate to which core values.

A Stochastic Text Structuring using Simulated Annealing (자연스러운 텍스트 생성을 위한 추계적 텍스트 구조화)

  • Roh, Ji-Eun;Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 2002.10e / pp.199-206 / 2002
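  • Text generation, which produces text describing a variety of non-linguistic knowledge sources, proceeds through several complex, staged processes. Among the stages needed to generate natural text, text structuring is the process of appropriately determining the order of the pieces of information selected from the knowledge source for inclusion in the text. Because text structuring largely determines the coherence of the generated text, producing high-quality text requires a sophisticated methodology for handling it. This paper proposes a stochastic text structuring method based on the simulated annealing (SA) algorithm. In particular, it proposes four methods for the SA evaluation function: preference among center transition types based on centering theory, preference among transition types based on inference cost, preference based on weights assigned to determine the opening sentence, and preference based on the similarity between adjacent sentences. Experiments demonstrated their usefulness.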
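
As a rough illustration of stochastic text structuring, the sketch below uses simulated annealing to search over sentence orderings. It is not the paper's method: the evaluation function here covers only adjacent-sentence similarity, approximated with Jaccard word overlap, and the function names, the example sentences, the swap move, and the cooling schedule are all assumptions made for the example.

```python
import math
import random

def jaccard(a, b):
    """Word-overlap similarity between two sentences (assumed stand-in metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def score(order, sentences):
    """Evaluation function: total similarity between adjacent sentences."""
    return sum(jaccard(sentences[i], sentences[j])
               for i, j in zip(order, order[1:]))

def anneal(sentences, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a high-scoring ordering by swapping two positions per step."""
    rng = random.Random(seed)
    order = list(range(len(sentences)))
    rng.shuffle(order)
    current = score(order, sentences)
    best_order, best = order[:], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        candidate = score(order, sentences)
        # Always accept improvements; accept worse orders with Boltzmann probability.
        if candidate >= current or rng.random() < math.exp((candidate - current) / t):
            current = candidate
            if current > best:
                best_order, best = order[:], current
        else:
            order[i], order[j] = order[j], order[i]  # undo the rejected swap
    return best_order, best

# Hypothetical facts to be ordered; any list of two or more sentences works.
sentences = [
    "the vase was made in the Joseon period",
    "the museum acquired the vase in 1990",
    "the Joseon period lasted about five centuries",
    "the museum is located in Seoul",
]
best_order, best_score = anneal(sentences)
ordered_text = [sentences[i] for i in best_order]
```

Other preferences named in the abstract, such as centering transitions or an opening-sentence weight, would enter as additional terms in `score`; how to weight them against each other is left open here.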

COVID-19 recommender system based on an annotated multilingual corpus

  • Barros, Marcia;Ruas, Pedro;Sousa, Diana;Bangash, Ali Haider;Couto, Francisco M.
    • Genomics & Informatics / v.19 no.3 / pp.24.1-24.7 / 2021
  • Tracking the most recent advances in Coronavirus disease 2019 (COVID-19)-related research is essential, given the disease's novelty and its impact on society. However, with the publication pace speeding up, researchers and clinicians require automatic approaches to keep up with the incoming information regarding this disease. A solution to this problem requires the development of text mining pipelines, the efficiency of which strongly depends on the availability of curated corpora. However, there is a lack of COVID-19-related corpora, and even more so for languages other than English. This project's main contribution was the annotation of a multilingual parallel corpus and the generation of a recommendation dataset (EN-PT and EN-ES) regarding relevant entities, their relations, and recommendations, providing this resource to the community to improve text mining research on COVID-19-related literature. This work was developed during the 7th Biomedical Linked Annotation Hackathon (BLAH7).