• Title/Summary/Keyword: R language (R 언어)

A study on unstructured text mining algorithm through R programming based on data dictionary (Data Dictionary 기반의 R Programming을 통한 비정형 Text Mining Algorithm 연구)

  • Lee, Jong Hwa;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.20 no.2 / pp.113-124 / 2015
  • Unlike structured data, which are gathered and saved in a predefined structure, unstructured text data, mostly written in natural language, have found wider application since the emergence of Web 2.0. Text mining, which extracts meaningful information from text, has become one of the most important big data analysis techniques, not only because the amount of text data has increased but also because text expresses human emotion directly. In this study, we used R, an open source software environment for statistical analysis, and studied the implementation of algorithms for analyses such as frequency analysis, cluster analysis, word clouds, and social network analysis. In particular, to keep the analysis within our research scope, we used a keyword extraction method based on a data dictionary. By applying the approach to real cases, we found that R is very useful as statistical analysis software that runs on a variety of operating systems and interfaces with other languages.
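
The dictionary-based keyword extraction and frequency analysis described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the authors' algorithm: the function name `keyword_frequencies`, the sample sentences, and the three-term dictionary are all hypothetical, chosen only to show how a data dictionary restricts which tokens are counted.

```python
from collections import Counter
import re


def keyword_frequencies(documents, dictionary):
    """Count only the tokens that appear in the data dictionary (hypothetical helper)."""
    allowed = {term.lower() for term in dictionary}
    counts = Counter()
    for text in documents:
        # Lowercase the text and split it into word tokens.
        tokens = re.findall(r"\w+", text.lower())
        # Tokens outside the dictionary are ignored entirely.
        counts.update(tok for tok in tokens if tok in allowed)
    return counts


# Toy input, invented for illustration only.
docs = [
    "The service was fast and the price was fair.",
    "Fast delivery, but the price is too high.",
]
data_dictionary = ["fast", "price", "delivery"]

freq = keyword_frequencies(docs, data_dictionary)
for term, count in freq.most_common():
    print(term, count)
```

Restricting the count to dictionary terms keeps out-of-scope words such as "service" out of the result, which is the role the abstract assigns to the data dictionary.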

A Study on the Improvement Elements of Tourism Preparedness for International Tourist Using Revised-IPA: Focusing on Comparison by Tourist Type and Time Period (R-IPA분석을 적용한 외래관광객의 관광수용태세 개선 요소 분석: 관광객 유형 및 시기별 비교를 중심으로)

  • Lee, Seung-Hun
    • Journal of Digital Convergence / v.16 no.6 / pp.9-18 / 2018
  • Recently, the need for and interest in improving tourism preparedness to raise the quality of foreign tourism have been increasing, but related research is insufficient. The purpose of this study is to identify the priority improvement elements of tourism preparedness for foreign tourists. To do this, we applied Revised-IPA (R-IPA) to analyze and compare the elements affecting tourism preparedness by tourist type and time period. In the R-IPA results for all tourists, the elements whose current quality levels should be maintained were food, security, transit, shopping, and tourist attractiveness, and the elements needing improvement but with low priority were language communication, travel expenses, and tourist information service. In the R-IPA results by tourist type, for individual tourists the current quality levels of transit, food, shopping, tourist attractiveness, and security should be maintained. For group tourists, the current quality levels of accommodation, shopping, tourist attractiveness, and tourist information service should be maintained, but food needs urgent improvement.

Analysis of University Department Name using the R (R을 이용한 대학의 학과 명칭 분석)

  • Ban, ChaeHoon;Kim, Dong Hyun;Ha, JongSoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.6 / pp.829-834 / 2018
  • As information technology progresses, big data is becoming more important and is being exploited across various industries. R is a language and environment for analyzing big data. Universities, the highest level of academic organization, keep opening and maintaining departments in anticipation of emerging trends. By analyzing the names of the departments offered at universities, it is possible to identify the requirements and needs reflected in recent trends. In this paper, we analyze the names of the departments offered at four-year universities using R. To do this, we collect the department names and measure their frequency to determine which majors are most frequently offered.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (빅데이터 분석도구 R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo;Kim, Dong Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.166-171 / 2020
  • Big data processing technology that can store and analyze data and obtain new knowledge has grown in importance in many fields of society. Big data is emerging as an important issue in the field of information and communication technology, and interest in the technology continues to rise. R, a tool that can analyze big data, is a language and environment that enables statistics-based information analysis. In this paper, we use it to analyze Bible data, specifically the four Gospels of the New Testament. We collect the Bible data and filter it for analysis. R is used to investigate the frequency distribution of the text and to analyze the Bible through social network analysis, in which words from a sentence are paired and the relationships between words are analyzed for accurate data analysis.
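
The step of pairing words from a sentence for social network analysis can be sketched as a co-occurrence count. The sketch is in Python rather than R and is not the authors' code: `cooccurrence_edges`, the stopword set, and the two sample sentences are hypothetical. It assumes that two words are linked whenever they appear in the same sentence.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_edges(sentences, stopwords=frozenset()):
    """Count how often each unordered word pair shares a sentence (hypothetical helper)."""
    edges = Counter()
    for sentence in sentences:
        # Unique lowercase words in the sentence, minus stopwords, in sorted order
        # so that each pair is always stored as (smaller, larger).
        words = sorted(set(re.findall(r"[a-z]+", sentence.lower())) - stopwords)
        edges.update(combinations(words, 2))
    return edges


# Toy sentences, invented for illustration only.
sentences = [
    "Light shines in darkness",
    "Light overcomes darkness",
]
edges = cooccurrence_edges(sentences, stopwords=frozenset({"in"}))
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

Each counted pair can serve as a weighted edge in a word network, with the count as the edge weight.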

Logosphère de G. Bachelard et les rêveries de langue (바슐라르의 Logosphère와 언어적 몽상)

  • HONG, Myung-Hee
    • Cross-Cultural Studies / v.25 / pp.679-694 / 2011
  • Language is one of the privileged elements of reverie in Bachelard. Language is a fundamental force of the imagination. On the one hand, it keeps its own value in the process of imagination, and on the other hand it forms its own image. The priority of language in Bachelard is in fact linked to the notion of Logos, which had long been treated as eternal truth in Western metaphysics. However, Bachelard's notion of logos differs from that of Western metaphysics. Whereas traditional metaphysics treats the logos as an eternal goal of its meditation, Bachelard emphasizes the linguistic and imaginative capacity of the logos. The logosphere is one example that clearly shows the difference between Bachelard's notion of logos and that of traditional metaphysics. The logosphere is a neologism of Bachelard's, coined to designate the verbal atmosphere that radio broadcasting has created in contemporary society. Bachelard understands the phenomenon of radio as a realization of the Psyche in everyday life. It is thanks to modern technology that we can reach the universe of language more easily than in previous centuries. According to Bachelard, radio is not a simple instrument of communication. It is a door into universal reverie. Radio is a voice of the world that expresses our unconscious.
When a dreamer dreams, the reverie develops in discussion with the world. So when we dream, we speak to the world and we listen to the world, so that we become citizens of the logosphere. In his work Of Grammatology, J. Derrida criticizes Western metaphysics by calling it logocentrism. Derrida thinks that Western philosophy has the presence of logos as its final goal. This presence of logos can be realized only through spoken language, not through written language. Hence the logocentrism, or phonocentrism, of Western metaphysics. But Derrida thinks that logocentrism is only another aspect of the narrow ethnocentrism of the West. Bachelard's notion of the logosphere superficially resembles logocentrism. However, the two differ fundamentally from their starting points. Whereas logocentrism treats speech as a mode of expression of reason, which is a fundamental power of man, Bachelard thinks that speech is the result of an opposition and fusion of our reason and speech. Bachelard thinks that speech is a realization of the image, which is the essence of our psyche. For him, speech, the quintessence of the logosphere, is the field of imagination from which images spring. That is why the logosphere stands at the opposite pole from logocentrism. The logosphere provides us with a space for reverie of language.
Our contemporary society, stuffed with hollow visual images, is increasingly stripped of spaces for reverie. This is one of the reasons why Bachelard's logosphere must be reactivated today.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • Recently, as word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Among these topics, cross-language transfer, which enables semantic exchange between different languages, is growing along with the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized domains and generalized domains. In other words, it is expected to make it possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, since linear vector alignment, which has been the main focus of academic study, assumes statistical linearity, it tends to simplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of data. The proposed methodology consists of sequential learning of a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space to the general embedding space. Finally, through the inference of the two trained models, the specialized vocabulary can be aligned in the general space.
To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. As a result, it was confirmed that the proposed methodology showed superior performance in terms of cosine similarity compared to the existing linear vector alignment.
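
The evaluation criterion named in this abstract, cosine similarity between aligned vectors and their targets in the general space, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: `mean_cosine` is a hypothetical helper, the two-dimensional vectors are invented, and a fixed 90-degree rotation stands in for a learned alignment model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_cosine(align, source_vectors, target_vectors):
    """Average cosine similarity between aligned source vectors and their targets."""
    scores = [
        cosine_similarity(align(s), t)
        for s, t in zip(source_vectors, target_vectors)
    ]
    return sum(scores) / len(scores)


def rotate_90(v):
    """Stand-in alignment: rotate a 2-D vector by 90 degrees (a linear map)."""
    x, y = v
    return (-y, x)


def identity(v):
    """Baseline: no alignment at all."""
    return v


# Toy vectors, invented for illustration only.
source = [(1.0, 0.0), (0.0, 2.0)]
target = [(0.0, 3.0), (-1.0, 0.0)]

score_aligned = mean_cosine(rotate_90, source, target)
score_unaligned = mean_cosine(identity, source, target)
print(score_aligned, score_unaligned)
```

Any alignment function, linear or nonlinear, can be passed as `align`, so two methods can be compared on the same vector pairs by their mean cosine similarity.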

A Benchmark of AI Application based on Open Source for Data Mining Environmental Variables in Smart Farm (스마트 시설환경 환경변수 분석을 위한 Open source 기반 인공지능 활용법 분석)

  • Min, Jae-Ki;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.159-159 / 2017
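  • A smart facility environment can be described as a facility-based production environment that introduces information and communication and data analysis technologies into various kinds of agricultural sites, typically horticulture and livestock farming. To use correctly and appropriately the vast growth and environmental data produced by smart facility environments, whose hardware has proliferated rapidly in recent years, analysis techniques distinct from those of general industrial settings are required. Mechanically applying big data processing techniques developed in software engineering to agricultural big data may have limits. This is because the various environmental variables inside and outside a facility are very difficult subjects for predictive modeling, owing to the complexity, irreversibility, indeterminacy, and irregular patterns of time series data. In this study, we applied Tensorflow (www.tensorflow.org), artificial neural network research software that has recently attracted rapidly growing interest, and OpenNN (www.openn.net), a representative open source package, to the analysis of correlations among environmental variables in a smart facility environment. As for operating environments, Tensorflow can be used on Linux (Ubuntu 16.04.4), Mac OS X (El Capitan 10.11), and Windows (x86 compatible), whereas OpenNN provides no binaries for specific operating environments but provides its entire source code, so it can be used after compiling a binary in the target environment. As for development languages, Tensorflow's primary language is Python, and development is carried out inside a Python (v2.7 or v3.N) virtual environment. A point that deserves attention is that, because of these development environment constraints, high-speed computation, one of Tensorflow's main advantages, is available only in some operating environments. The hardware acceleration provided by a GPU (Graphics Processing Unit) is available on the Linux operating system. Because running in a virtual development environment limits real-time information processing, this must be taken into account. Meanwhile, Tensorflow API r1.0, released recently (2017.03), newly supports the Go language alongside Python, C++, and Java, greatly widening the range of use for developers. OpenNN is provided primarily in C++ and can be used in any development environment that supports a C++ compiler. Its distinguishing feature is that it partly overcomes the lack of hardware acceleration by integrating with a clustering platform. Using the two packages, we experimentally applied a large-scale linear model (with hourly, daily, and weekly segmentation) to temperature, humidity, illuminance, and CO2 data acquired from February to May 2016 inside a strawberry greenhouse in Eumseong-gun, Chungbuk, and performed predictive modeling of the environmental variables of adjacent segments. When training under identical conditions, Tensorflow was far superior in development time and training speed. Achieving comparable performance with OpenNN would require parallel clustering technology. Research is needed to find alternatives to artificial neural network modeling techniques limited to offline batch processing and to high-performance computing hardware that cannot be deployed in the field.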

Correlation between Self-evaluation Factor and Academic Achievement of Medical Students according to Introduction of Explanation Meeting in Cadaveric Dissection (해부설명회의 도입에 따른 의학전문대학원생들의 자기 평가 요인과 학업성취도 상관관계 분석)

  • Park, Jeong-Hyun;Kim, Jee-Hee;Kim, Kwang-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.7 / pp.2475-2482 / 2010
  • This study aimed to evaluate the correlation between self-evaluation factors (satisfaction, linkage with major, suitability of management, verbal and non-verbal communication) and the academic achievement of medical students following the introduction of an explanation meeting in cadaveric dissection. In the study, medical students (n=57) explained cadaveric structures to allied health sciences students and discussed them with each other. Just after the meeting, the medical students filled out a questionnaire on the impact of the self-evaluation factors and communication. We analyzed these factors and their scores using frequency analysis, t-tests, and analysis of variance. Regardless of gender, age, or previous experience, the majority of the students gave high scores on all of the self-evaluation factors. Among them, only the verbal communication factor was closely related to academic achievement (p<0.05). Verbal and non-verbal communication also had a high correlation of 0.673 (p<0.01). The explanation meeting gave medical students a chance to learn further with a positive attitude and motivated them academically. Additionally, they realized that communication skills play a key role in transmitting medical knowledge to others. Therefore, introducing a communication-based explanation meeting would be a very useful tool for improving educational efficiency.

Chaucer's Extraordinary Fabliau: The Merchant's Tale

  • Thomas, Paul R.
    • Lingua Humanitatis / v.2 no.2 / pp.109-128 / 2002
  • The six fabliaux of the Canterbury Tales are a notable artistic achievement. Of all of them, however, the Merchant's Tale is the most notable to show Chaucer's development of the scope of this genre. We will look briefly at the characters of the fabliau narrators who are crucial to Chaucer's drama of relationships in the course of the Canterbury pilgrimage framework. To distinguish the accomplishment of the Merchant's Tale, we will consider the relative merits of each of the other five fabliaux in the Canterbury Tales. The least flawed of the fabliau narrators, the Merchant will tell a powerful tale about an old man's lust turned into a hasty marriage gone wrong that aims its satire at the noble ruling class of the land, not the usual targets of Chaucer's or most other writers' fabliaux. Further, unlike the light-hearted and dismissable endings of the other Chaucerian fabliaux, the Merchant's Tale has what we will call an Act 6 of continued deception at all corners of the love triangle represented by the senex amans January, his young wife May, perhaps now pregnant after her tryst with Damyan in the pear tree, and the still present young lover Damyan. This triangle of mutual deception will continue into the unknown future under the male and female forces at odds as personified in the king and queen of fairies, Pluto and Proserpina.

Children's Early English Education and the Factors on their Bilingual Language Development (유아의 조기영어교육과 이중언어발달에 영향을 주는 요인)

  • Hwang, Hae-Shin
    • Korean Journal of Human Ecology / v.16 no.4 / pp.699-710 / 2007
  • This study aims to explore the effects of children's individual characteristics and home environments on their bilingual language acquisition, that is, to examine whether their English language competency differs from their Korean language competency depending on those variables. Thus the English and Korean language competencies of children who had had early exposure to English learning were studied in terms of the child's individual characteristics, such as age, gender, period of exposure to English, intelligence, and experience of visiting English-speaking countries, and home environment, such as parental age, educational level, income level, parents' perceived English competency, their perceived significance of English and Korean, and the frequency of using English at home. Seventy-two children who attended English kindergartens were tested with the Peabody Picture Vocabulary Test-Revised (PPVT-R) in its Korean and English versions. The results show that the child's intelligence and experience of visiting English-speaking countries influence Korean language competency. The child's age, period of exposure to English, and experience of visiting English-speaking countries influence English language competency. Moreover, the mother's educational background, the father's English fluency, the mother's English fluency, and the frequency of using English at home influence the child's English language competency, whereas none of these variables influenced the child's Korean language competency. Accordingly, the child's English and Korean language competencies are related to each other.