• Title/Summary/Keyword: Keyword frequency analysis (키워드빈도분석)

Search Result 353, Processing Time 0.025 seconds

Web Document Classification Using User Intention Information (사용자 의도 정보를 사용한 웹문서 분류)

  • Jang, Yeong-Cheol
    • Proceedings of the Korea Society for Industrial Systems Conference / 2008.10b / pp.292-297 / 2008
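  • Accurately categorizing web documents that contain complex semantics, and automating that process, requires standardized, intelligent, and automated document representation and classification techniques that can accommodate human knowledge systems. To this end, we introduced a categorization and adjustment process that applies user intention information to keyword frequency, the relatedness of keywords within a document, the use of a thesaurus, and probabilistic techniques. For the web document classification process, we designed a framework that allows intention information to be used both in knowledge-base document classification, which relies on resources such as a thesaurus, and in similarity-based document classification, which uses unsupervised learning and has no a priori knowledge system; the differences between the two methods are then reconciled in a hybrid adjustment process. The HDCI (Hybrid Document Classification with Intention) model designed in this study consists of this web document classification process and a user intention analysis process that controls and assists it. In the intention analysis process, the user intentions supplied together with keywords are organized into an intention hierarchy tree using domain knowledge, and this tree acts as a constraint or guide during document classification, from which a user intention profile or the keywords representing document characteristics are extracted. HDCI plays a controlling and guiding role in the bottom-up probabilistic approach based on inter-document similarity, and it improves the accuracy of the relationships established among keywords in the knowledge-base (thesaurus) approach, where diversity is limited.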

An Analysis on Major Keyword & Relationship in the Studies of Superintendent (교육감 관련 연구들의 주요 핵심어와 그들 간의 관계성 분석)

  • Kwon, Choong-Hoon
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.177-178 / 2019
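  • This study analyzed the major keywords of research on the 'superintendent of education', the core of local educational autonomy, and the relationships among those keywords. Applying keyword network analysis to a total of 93 prior studies on the 'superintendent of education' from 2009 to 2018 (10 years), we extracted the major keywords, presented a word cloud, and analyzed the relationships (semantic network) among the major keywords. The major keywords of domestic research on the 'superintendent of education' over the last 10 years were superintendent election, direct election by residents, election system, improvement plan, comparative study, educational autonomy, problems, local autonomy, Minister of Education, and board of education member. The major keywords (those with the highest occurrence frequency) formed a mutual network with high density and a high degree of connection. The results can serve as basic data for selecting new research topics and setting diverse directions in follow-up studies on the 'superintendent of education'.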

Text Mining and Association Rules Analysis to a Self-Introduction Letter of Freshman at Korea National College of Agricultural and Fisheries (2) (한국농수산대학 신입생 자기소개서의 텍스트 마이닝과 연관규칙 분석 (2))

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Shin, Y.K.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.22 no.2 / pp.99-114 / 2020
  • This study applied text mining to the self-introduction letters of 2020 freshmen at Korea National College of Agriculture and Fisheries (KNCAF) and carried out topic analysis and correlation analysis. The analysis covered the 3rd and 4th questions; the 4th question asked about the motivation for applying to the college, the academic plan, and the career plan. For the 3rd question, text mining showed that 'friends' was by far the most frequent keyword, followed by 'thought', 'time', 'opinion', 'activity', and 'club'. For the 4th question, frequent keywords included 'thought', 'agriculture', 'KNCAF', 'farm', and 'father'. In the association rules analysis for each question, the rules with the highest support, which indicates the frequency and importance of a rule, were {friend} <=> {thought} and {thought} <=> {KNCAF}. Confidence, the strength of the correlation between keywords, was highest for the rules {teacher} => {friend} and {agriculture, KNCAF} => {thought}. Lift, which indicates the closeness of two words, was highest for the rules {friend} <=> {teacher} and {knowledge} <=> {professional}. These keywords were also found to play very important roles in the betweenness centrality and degree centrality analyses between keywords. The results of the frequency analysis and association analysis were visualized with word clouds and correlation graphs to make them easier to understand.
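
The three rule measures named in this abstract (support, confidence, lift) can be illustrated with a minimal sketch using their standard textbook definitions. The function name `rule_metrics` and the toy keyword sets are hypothetical and are not taken from the paper's data or software.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent.

    transactions: list of keyword sets, one per document (hypothetical data).
    """
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)         # documents containing the antecedent
    count_c = sum(1 for t in transactions if c <= t)         # documents containing the consequent
    count_ac = sum(1 for t in transactions if (a | c) <= t)  # documents containing both
    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Toy keyword sets, invented for illustration only.
letters = [
    {"friend", "thought", "teacher"},
    {"friend", "thought"},
    {"friend", "club"},
    {"thought", "time"},
]
support, confidence, lift = rule_metrics(letters, {"teacher"}, {"friend"})
print(support, confidence, lift)
```

A lift above 1 means the two keywords appear together more often than they would if they were independent, which is one way to read the "closeness" the abstract describes.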

Investigating Trends of Gifted Education in Domestic and Foreign Countries through Social Network Analysis from 2010 to 2015 (2010~2015년 사회네트워크분석(SNA) 방법 활용 국내외 영재교육 연구동향 분석)

  • Yoon, Jin A;Kim, Su Jin;Seo, Hae Ae
    • Journal of Gifted/Talented Education / v.26 no.2 / pp.347-363 / 2016
  • The purpose of this study was to analyze trends in domestic and international gifted education research over the last six years (2010-2015) using social network analysis methods. For gifted education papers in Korea, two KCI (Korea Citation Index) rated journals, 'Gifted/Talented Education' (The Korean Society for the Gifted) and 'Gifted and Talented Education' (The Korean Society for the Gifted and Talented Education), were selected, and the 457 papers published in the two journals were collected. For foreign papers, 347 papers published in three SSCI-rated journals, 'The Gifted Child Quarterly', 'Journal for the Education of the Gifted', and 'High Ability Studies', were selected. English keywords were extracted from the 457 papers in Korean journals and the 347 papers in foreign journals, and Social Network Analysis (SNA) was used for keyword frequency and network centrality analyses. Domestic and foreign papers shared 'academically gifted', 'science gifted', and 'gifted' as central keywords by frequency, and 'achievement', 'identification', and 'intelligence' appeared as the most frequent keywords. For domestic papers, 'creativity', 'gifted education', and 'gifted education teacher' were the most frequent keywords, while 'foreign countries' and 'student attitudes' were the most frequent for foreign papers. When the papers from all five journals were analyzed as one group, 'identification', 'intelligence', and 'achievement' were the most important common keywords, and 'cognitive', 'motivation', and 'self-concept' also appeared as important keywords. The trend of gifted education research in Korea appears to differ from that of foreign countries: domestic papers rarely included the keywords 'foreign examples', 'student attitudes', and 'gender differences'. Consequently, gifted education research in Korea calls for more varied research perspectives.

The Periodical Trend of Urban Regeneration through Mass Media - Focused on the 1920s and 1990s - (매스미디어를 통해 본 도시재생의 시대적 동향 - 1920년대~1990년대를 중심으로 -)

  • Kim, Sa-rang;Lee, Jeong
    • Journal of the Korean Institute of Landscape Architecture / v.47 no.2 / pp.28-48 / 2019
  • This research aims to identify perceptions of urban regeneration and to draw policy implications for future directions by analyzing the trend of urban regeneration as depicted in the mass media, using SNA (Semantic-Network Analysis) techniques. As the number of articles increased, the analysis showed that interrelationships between social phenomena and issues combined to form the meaning of urban regeneration. The keywords 'urban' and 'regeneration' appeared in different periods, with 'urban' becoming closely related to 'regeneration' from 1970, when urbanization was becoming more prevalent. The keyword 'urban' appeared more frequently in the early 1990s, while the frequency of 'rural' decreased sharply. Until the 1990s, the slums and the recession that appeared as side effects of urban problem-solving policies were mostly concentrated in cities. Policy discussions were conducted with the goal of improving the physical environment of cities rather than concentrating on the surrounding rural areas. The keywords 'development' and 'regeneration' have increased in number since the 1970s, and urban polarization has surged with the outward growth of cities, mirroring the trend of accelerating environmental threats. In particular, the keywords for 'regeneration' emerged mainly in relation to environmental problems, which led to the need for urban regeneration and for environmentally and ecologically friendly development. The emergence of 'urban', 'regeneration', and 'environment' as keywords related to urban regeneration grew in the 1990s. This suggests that urban regeneration is now linked to 'environment', which has become a social issue.

Learning User Interest using Hierarchical Concept indexing based on Ontology (온톨로지 기반의 계층적 개념 인덱싱을 이용한 사용자 관심사 학습)

  • Park Ji-Hyun;Kim Heung-Nam;Jo Geun-Sik
    • Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.646-648 / 2005
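  • With the rapid growth of the Internet, users can obtain a great deal of information online and access the latest news in real time. Accordingly, many methods have been studied for effectively retrieving, from this vast amount of information, the information that matches a user's interests. However, many prior studies learn user interests with a keyword vector model based on word frequency. Such a keyword vector model cannot clearly describe user preferences, and a keyword-based feature vector makes it difficult to find relationships between concepts. To address this, this paper proposes a method for building a personalized, ontology-style user profile using Hierarchical Concept Indexing. The generated user profile takes into account the similarity between concepts and the user's degree of interest in each concept, so that articles better suited to individual preferences are provided. In the experiments, the proposed method was compared with the WebMate system, an existing learning method based on the keyword vector model, to evaluate its performance. The proposed method showed better performance than the learning method using keyword vectors.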
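
The limitation of a word-frequency keyword vector model mentioned in this abstract can be seen in a small sketch. This is an illustration of a generic term-frequency vector with cosine similarity, not the paper's method or the WebMate system; the profile and article tokens are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors stored as Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical user profile and articles, each reduced to a bag of keywords.
profile = Counter(["soccer", "soccer", "league"])
article_a = Counter(["soccer", "league", "goal"])
article_b = Counter(["football", "match"])

sim_a = cosine(profile, article_a)  # shares surface keywords with the profile
sim_b = cosine(profile, article_b)  # related topic, but no shared keyword
print(sim_a, sim_b)
```

Because `article_b` shares no literal keyword with the profile, its similarity is zero even though the topic is related, which is the kind of missed concept relationship that a concept hierarchy is meant to capture.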

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has affected not only many parts of daily life but also many areas of the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are said to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which is known to cause enormous disruptions to the behavior and psychological well-being of many. Long-term negative emotions can reduce people's immunity and destroy their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 pandemic requires psychological as well as physical quarantine. It also presents how an easier method of analyzing social media networks can be applied to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in communication science as a summary of the core content that the media want to convey to their audience. The news data used in this study were collected using "Bigkinds", which was created by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis between keywords. Text mining was performed with the KrKwic program, a Korean semantic network analysis tool, and word frequencies were calculated to identify keywords. The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud; 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared with high frequency in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used to analyze connection centrality and to perform cluster analysis, and the graph was visualized using Net Draw. The analysis of connection centrality showed that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The co-occurrence network of the keywords appearing in news headlines was visualized as a graph. The thickness of a line in the graph is proportional to the co-occurrence frequency, so two words that frequently appear together are joined by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs with relatively thick lines. 'Blue' in relation to COVID-19 means depression, confirming that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at a glance, the study can give medical policy managers a variety of perspectives when identifying and researching the phenomenon. The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
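
The co-occurrence network and connection (degree) centrality described in this abstract can be sketched in a few lines. This is a simplified illustration, not the KrKwic or UCINET procedure; the function names and the toy headlines are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(headlines):
    """Count how often each unordered keyword pair appears in the same headline."""
    pairs = Counter()
    for words in headlines:
        for a, b in combinations(sorted(set(words)), 2):
            pairs[(a, b)] += 1
    return pairs


def degree_centrality(pairs):
    """For each keyword, the share of the other keywords it is linked to."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {w: len(nb) / (n - 1) for w, nb in neighbors.items()}


# Toy tokenized headlines, invented for illustration only.
headlines = [
    ["COVID", "blue", "anxiety"],
    ["COVID", "blue"],
    ["COVID", "emotion"],
]
pairs = cooccurrence(headlines)
centrality = degree_centrality(pairs)
print(pairs.most_common(1), centrality)
```

In a drawn graph, the count stored for each pair would set the line thickness, so the pair with the largest count gets the thickest edge.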

News Big Data Analysis System for Public Issue Extraction (공공이슈 추출을 위한 뉴스 빅데이터 분석 시스템)

  • Kim, Seung Ju;Yoon, Chang Geun;Lee, Cha Hun;Park, Dong Hwan;Lee, Hae Jun;Park, Hyeok Ju;Lee, Yong Kyu
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.17-20 / 2018
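  • Research that analyzes various kinds of big data is under way to identify public issues, that is, matters of public interest. However, existing studies count only the number of times a keyword is exposed and reflect that in the results. This paper proposes an analysis method that uses news big data collected per news outlet from a portal site and reflects the exposure frequency, number of comments, and number of recommendations for each keyword. The keywords obtained by extracting public issues are visualized as word clouds and Sankey diagrams and provided to the user. With the proposed method, analysis results that reflect public reaction can be obtained.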

Twitter Sentiment Analysis for the Recent Trend Extracted from the Newspaper Article (신문기사로부터 추출한 최근동향에 대한 트위터 감성분석)

  • Lee, Gyoung Ho;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.731-738 / 2013
  • We analyze public opinion via sentiment analysis of tweets collected using recent topic keywords extracted from newspaper articles. Newspaper articles collected within a certain period are clustered with the K-means algorithm, and topic keywords for each cluster are extracted using term frequency. A sentiment analyzer trained with a machine learning method classifies tweets according to their polarity values. We assume that tweets collected with these topic keywords deal with the same topics as the newspaper articles if the tweets and the articles are generated around the same time, and we tried to verify the validity of this assumption.
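
The term-frequency step in this abstract, picking topic keywords for each cluster, can be sketched as follows. The sketch assumes the cluster labels have already been produced (for example by K-means) and shows only the keyword extraction; the function name `topic_keywords`, the `top_k` parameter, and the toy articles are hypothetical.

```python
from collections import Counter


def topic_keywords(articles, labels, top_k=2):
    """Return the top_k most frequent terms for each cluster label."""
    counts = {}
    for tokens, label in zip(articles, labels):
        counts.setdefault(label, Counter()).update(tokens)
    return {label: [w for w, _ in c.most_common(top_k)] for label, c in counts.items()}


# Toy tokenized articles with assumed, precomputed cluster labels.
articles = [
    ["election", "vote", "election"],
    ["vote", "election", "poll"],
    ["typhoon", "rain", "typhoon"],
]
labels = [0, 0, 1]
keywords = topic_keywords(articles, labels)
print(keywords)
```

The resulting keyword lists are what would then be used as search terms when collecting tweets on the same topics.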

Keyword trends analysis related to the aviation industry during the Covid-19 period using text mining (텍스트마이닝을 활용한 Covid-19 기간 동안의 항공산업 관련 키워드 트렌드 분석)

  • Choi, Donghyun;Song, Bomi;Park, Dahyeon;Lee, Sungwoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.115-128 / 2022
  • The purpose of this study is to conduct a keyword trend analysis using article data on the impact of Covid-19 on the aviation industry. Related articles were extracted around the keyword "Airline", with the period divided into the 6 months before and after the Covid-19 outbreak. Topic modeling (LDA) was then performed. Through this, the main topics that arise in the event of an epidemic such as Covid-19 were extracted, and the results are expected to be used as primary data for predicting the impact on the aviation industry when an event like Covid-19 occurs.