• Title/Summary/Keyword: keyword analysis (주요어 분석)

Search Result 392, Processing Time 0.024 seconds

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. As information and communication technology has given rise to many kinds of online news media, the volume of news about social events has grown greatly. Automatically summarizing key events from massive amounts of news data therefore helps users survey many events at a glance. In addition, an event network built on the relevance between events can greatly help readers understand current affairs. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles published from March 2016 to March 2017, then preprocessed them with NPMI and Word2Vec to keep only meaningful words and merge synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to compute the topic distribution by date, and events were detected at the peaks of that distribution. A total of 32 topics were extracted, and the time of occurrence of each event was inferred from the point at which the corresponding topic distribution surged. A total of 85 events were detected initially; after filtering with Gaussian smoothing, the final 16 events were presented. We then calculated relevance scores between the detected events to construct the event network: using the cosine coefficient between co-occurring events, we computed the relevance between events and connected them. Finally, we built the event network with each event as a vertex and the relevance score between events assigned to the edge connecting the vertices.
The event network constructed with our method helped us arrange the major political and social events of the past year in Korea in chronological order and, at the same time, identify which events are related to one another. Our approach differs from existing event detection methods in that LDA topic modeling makes it easy to analyze large amounts of data and to identify relevance between events, which was difficult to detect with existing methods. In text preprocessing, we applied various text mining techniques together with Word2Vec to improve the accuracy of extracting proper nouns and compound nouns, which has been difficult in the analysis of Korean text. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, an unsupervised learning method, can easily extract topics, topic words, and their distributions from huge amounts of data. In addition, by using the date information of the collected news articles, the distribution of each topic can be expressed as a time series. Second, by calculating relevance scores from the co-occurrence of topics and constructing an event network, we can present the connections between events in a summarized form, which is difficult to grasp with existing event detection. This is supported by the fact that the relevance-based event network proposed in this study was in fact laid out in order of occurrence time. The event network also makes it possible to identify which event was the starting point of a series of events. A limitation of this study is that LDA topic modeling yields different results depending on the initial parameters and the number of topics, and that the topic and event names in the analysis results must be assigned by the subjective judgment of the researcher.
Also, since each topic is assumed to be exclusive and independent, the method does not take into account the relevance between topics. Subsequent studies need to calculate the relevance between events that are not covered in this study or that belong to the same topic.
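
The detection and linking steps described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the kernel width `sigma`, the peak threshold, and the toy series are all assumptions, since the abstract reports none of these values. The sketch smooths a daily topic-share series with a Gaussian kernel, keeps local maxima above a threshold as candidate events, and scores the relevance of two events with the cosine coefficient of their co-occurrence vectors.

```python
import math

def gaussian_smooth(series, sigma=2.0):
    """Smooth a series with a truncated Gaussian kernel (sigma is an assumed value)."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    smoothed = []
    for i in range(len(series)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(series):  # renormalise the kernel at the edges
                num += w * series[j]
                den += w
        smoothed.append(num / den)
    return smoothed

def find_peaks(series, threshold):
    """Return indices that are strict local maxima above the threshold."""
    return [i for i in range(1, len(series) - 1)
            if series[i] > threshold and series[i - 1] < series[i] > series[i + 1]]

def cosine(u, v):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical daily share of one topic: a one-day noise spike and a real surge.
raw = [0.02] * 23
raw[3] = 0.08
raw[10], raw[11], raw[12] = 0.05, 0.30, 0.08
smoothed = gaussian_smooth(raw, sigma=2.0)

print(find_peaks(raw, 0.05))       # the noise spike and the surge both register
print(find_peaks(smoothed, 0.05))  # only the surge survives smoothing

# Hypothetical co-occurrence vectors for two events (1 = event present in a time window).
print(round(cosine([1, 1, 0, 1], [1, 0, 0, 1]), 3))
```

In this toy series the one-day spike is flattened below the threshold while the sustained surge remains, which is the kind of filtering that reduces a large set of raw peaks to a smaller set of events.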

The Trends of Artiodactyla Researches in Korea, China and Japan using Text-mining and Co-occurrence Analysis of Words (텍스트마이닝과 동시출현단어분석을 이용한 한국, 중국, 일본의 우제목 연구 동향 분석)

  • Lee, Byeong-Ju;Kim, Baek-Jun;Lee, Jae Min;Eo, Soo Hyung
    • Korean Journal of Environment and Ecology / v.33 no.1 / pp.9-15 / 2019
  • Artiodactyla, the even-toed mammals, are distributed worldwide. In recent years, wild Artiodactyla species have attracted public attention because of the rapid increase in crop damage and roadkill involving species such as water deer and wild boar, and the decline of some species such as long-tailed goral and musk deer. Despite this attention, there have been few studies on Artiodactyla in Korea, and none have analyzed research trends on Artiodactyla, which makes it difficult to understand the actual problems. Many recent trend studies have used text-mining and co-occurrence analysis to classify research subjects more objectively, by extracting keywords that appear in the literature and quantifying the relevance between words. In this study, we analyzed texts from research articles of three countries (Korea, China, and Japan) through text-mining and co-occurrence analysis and compared the research subjects in each country. We extracted 199 words from 665 articles related to Artiodactyla in the three countries through text-mining. Co-occurrence analysis of the extracted words produced three word clusters. We determined that cluster 1 was related to "habitat condition and ecology", cluster 2 to "disease", and cluster 3 to "conservation genetics and molecular ecology". A comparison of the occurrence rates of the word clusters in each country showed that they were relatively even in China and Japan, whereas in Korea cluster 2, related to "disease", prevailed (69%). In the regression analysis of the number of words per year in each cluster, the number of words increased evenly by year across clusters in both China and Japan, whereas in Korea the rate of increase for cluster 2 was five times greater than that of the other clusters.
The results indicate that Korean research on Artiodactyla has tended to focus on diseases more than research in China and Japan, and that few researchers have considered other subjects such as habitat characteristics, behavior, and molecular ecology. To control the damage caused by Artiodactyla and to establish a reasonable policy for the protection of endangered species, it is necessary to accumulate basic ecological data through more research on wild Artiodactyla.
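
As a rough illustration of the co-occurrence step, the sketch below counts how often each pair of keywords appears in the same article. It is a minimal example under stated assumptions: the keyword lists are invented, and the clustering the authors applied to the resulting counts is not reproduced here.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(docs):
    """Count, for every keyword pair, the number of documents containing both."""
    counts = Counter()
    for keywords in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always stored in the same (alphabetical) order.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical keyword lists, one per article.
articles = [
    ["habitat", "deer", "ecology"],
    ["disease", "boar"],
    ["habitat", "ecology"],
]
pairs = cooccurrence_counts(articles)
print(pairs.most_common(3))
```

Pairs with high counts are the ones that a clustering step would tend to place in the same word cluster.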

Text Mining of Successful Casebook of Agricultural Settlement in Graduates of Korea National College of Agriculture and Fisheries - Frequency Analysis and Word Cloud of Key Words - (한국농수산대학 졸업생 영농정착 성공 사례집의 Text Mining - 주요단어의 빈도 분석 및 word cloud -)

  • Joo, J.S.;Kim, J.S.;Park, S.Y.;Song, C.Y.
    • Journal of Practical Agriculture & Fisheries Research / v.20 no.2 / pp.57-72 / 2018
  • To extract meaningful information from the casebook of successful farming settlement by young farmers published by KNCAF, we studied its key words with text mining and created a word cloud for visualization. First, in the text mining results for the entire sample, 'CEO', 'corporate executive', 'think', 'self', 'start', 'mind', and 'effort' were among the most frequent of the top 50 core words. These words reflect the graduates' ability to think, judge, and push ahead on their own, which shows their capacity as managers. They also express how these farmers work to achieve their dreams without giving up. The high frequency of words such as 'father' and 'parent' is due to the high proportion of cases involving cooperation with, or succession from, parents. 'KNCAF', 'university', 'graduation' and 'study' reflect their strong awareness of education, and 'organic farming' and 'eco-friendly' reflect their interest in eco-friendly agriculture. In addition, words related to the 6th industry such as 'sales' and 'experience' represent their efforts to revitalize farming and fishing villages. Meanwhile, 'internet', 'blog', 'online', 'SNS', 'ICT', 'composite' and 'smart' were not included in the top 50. However, the fact that all of these words were extracted shows that young farmers are increasingly interested in making agriculture and fisheries more scientific and high-tech. Next, when the top 50 key words were grouped by crop, 'facilities' was extracted as a main word for livestock, vegetables and aquatic crops, and 'equipment' and 'machine' for food crops. 'Eco-friendly' and 'organic' appeared in vegetable crops and food crops, and 'organic' appeared in fruit crops. 'Worm', associated with eco-friendly farming methods, appeared in the food crops, and 'certification', which signifies excellent agricultural and marine products, appeared only in the fishery crops.
'Production', which is related to the '6th industry', appeared in all crops, 'processing' and 'distribution' appeared in the fruit crops, and 'experience' appeared in the vegetable crops, food crops and fruit crops. To visualize the words extracted by text mining, we created word clouds for the entire sample and for each crop sample. As a result, we could gauge the meaning of the excellent cases, which are unstructured text, from the character size of the words.
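
The frequency analysis behind a top-50 list can be sketched as follows. This is a minimal example, assuming the text has already been split into word tokens (the morphological analysis needed for Korean is outside its scope); the token list and stopword set are invented for illustration.

```python
from collections import Counter

def top_keywords(tokens, n=50, stopwords=frozenset()):
    """Return the n most frequent tokens, ignoring stopwords."""
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Hypothetical token stream from a casebook.
tokens = ["CEO", "effort", "CEO", "the", "start", "effort", "CEO"]
ranking = top_keywords(tokens, n=2, stopwords={"the"})
print(ranking)
```

A word cloud then draws each word at a size that grows with its count, which is why character size can be read as a proxy for frequency.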

Assessing the repeatability of reflection seismic data in the presence of complex near-surface conditions CO2CRC Otway Project, Victoria, Australia (복잡한 천부구조하에서 반사법 탄성파자료의 반복성에 대한 평가, 호주, 빅토리아, CO2CRC Otway 프로젝트)

  • Al-Jabri, Yousuf;Urosevic, Milovan
    • Geophysics and Geophysical Exploration / v.13 no.1 / pp.24-30 / 2010
  • This study utilises repeated numerical tests to understand the effects of variable near-surface conditions on time-lapse seismic surveys. The numerical tests were aimed at reproducing the significant scattering observed in field experiments conducted at the Naylor site in the Otway Basin for the purpose of $CO_2$ sequestration. In particular, the variation in elastic properties of both the top soil and the deeper rugose clay/limestone interface as a function of water saturation was investigated. Such tests simulate measurements conducted in dry and wet seasons and evaluate the contribution of these seasonal variations to the non-repeatability of seismic measurements. Full elastic pre-stack modelling experiments were carried out to quantify these effects and evaluate their individual contributions. The results show that the relatively simple scattering effects of the corrugated near-surface clay/limestone interface can have a profound effect on time-lapse surveys. The experiments also show that changes in top soil saturation could potentially affect the seismic signature even more than the corrugated deeper surface. The overall agreement between numerically predicted and in situ measured normalised root-mean-square (NRMS) differences between repeated (time-lapse) 2D seismic surveys warrants further investigation. Future field studies will include in situ measurements of the elastic properties of the weathered zone through the use of 'micro Vertical Seismic Profiling (VSP)' arrays and very dense refraction surveys. The results of this work may be relevant to areas not associated with $CO_2$ sequestration, such as imaging oil production where producing fields suffer from a karstic topography, as in the Middle East and Australia.
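
For readers unfamiliar with the NRMS repeatability metric mentioned in this abstract, the sketch below uses the definition commonly applied in time-lapse seismic work: 200 times the RMS of the difference trace divided by the sum of the RMS values of the two traces, giving a percentage between 0 (identical traces) and 200 (polarity-reversed traces). The abstract itself does not state the formula, so treating this as the one the authors used is an assumption, and the sample traces are invented.

```python
import math

def rms(trace):
    """Root-mean-square amplitude of a trace (a sequence of samples)."""
    return math.sqrt(sum(x * x for x in trace) / len(trace))

def nrms(base, monitor):
    """NRMS difference in percent: 200 * RMS(monitor - base) / (RMS(base) + RMS(monitor))."""
    if len(base) != len(monitor):
        raise ValueError("traces must have the same number of samples")
    diff = [m - b for b, m in zip(base, monitor)]
    denom = rms(base) + rms(monitor)
    return 200.0 * rms(diff) / denom if denom else 0.0

# Hypothetical baseline trace and a slightly perturbed repeat survey.
base = [0.0, 1.0, 0.5, -0.8, -0.2, 0.3]
monitor = [0.0, 0.9, 0.6, -0.7, -0.3, 0.3]
print(round(nrms(base, monitor), 1))
```

Lower values mean better repeatability, so near-surface changes between dry and wet seasons would show up as an increase in this number.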

WellnessWordNet: A Word Net for Unconstrained Subjective Well-Being Monitoring Based on Unstructured Data and Contextual Polarity (웰니스워드넷: 비정형데이터와 상황적 긍부정성에 기반하여 주관적 웰빙 상태를 무구속적으로 모니터링하기 위한 워드넷 개발)

  • Song, Yeongeun;Nam, Suhyun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.1-21 / 2016
  • IT-based subjective well-being (SWB) services, a main part of wellness IT, should measure the SWB state of individuals in an unrestrained, cost-effective manner. The dictionaries for sentiment analysis available in the market may be useful for this purpose, but obtaining proper sentiment values using only words from the sentiment lexicon is impossible; therefore, a new dictionary including wellness vocabulary is needed. The existing sentiment dictionaries link only a single sentiment value to a single sentiment word, although sentiment values may vary depending on personal traits. In this study, we develop an extended version of the SenticNet sentiment dictionary dubbed WellnessWordNet. SenticNet is considered the best and most expressive among the already existing sentiment dictionaries. Using the information provided by SenticNet, we created a database including the wellness states (estimated values) of stress, depression, and anger to develop the WellnessWordNet system. The accuracy of the system was validated through actual tests with live subjects. This study is unique and unprecedented in that i) an extended sentiment dictionary, WellnessWordNet, is developed; ii) values for wellness state language are offered; and iii) different sentiment values, namely contextual polarity, for people of the same gender or age group are suggested.

Distribution Pattern of dominant Benthic Diatoms on the Mangyung-Dongjin Tidal Flat, West Coast of Korea (서해 만경-동진 조간대의 주요 우점 저서 규조류의 분포)

  • 오상희;고철환
    • 한국해양학회지 / v.26 no.1 / pp.24-37 / 1991
  • Marine benthic diatoms and environmental factors were studied at 60 sites on the Mangyung-Dongjin tidal flat on the west coast of Korea. Sediment samples were taken quantitatively from the upper 5 mm layer to obtain a representative estimate of the epipelic and epipsammic cell concentration. Surface sediments taken simultaneously with the quantitative diatom samples were analysed for grain size. The exposure duration of the study sites was calculated from tide data recorded at Kunsan Outer-Harbour. Coarse sediments dominated mainly on the offshore coastal and lower tidal flat, whereas fine sediments occurred on the inner and higher tidal flat. A total of 371 diatom taxa were collected, and the genera represented by the greatest number of taxa were Navicula and Nitzschia. The 16 abundant species, each accounting for more than 1% of the total cell number, are the following: Paralia sulcata, Navicula sp. 1, Navicula arenaria, Cymatosira belgica, Amphora holsatica, Amphora coffeaeformis, Achnanthes hauckiana, Rhaphoneis amphiceros, Thalassionema nitzschioides, Navicula sp. 2, Dimeregramma minor, Amphora sp. 1, Cyclotella atomus, C. striata, Nitzschia kuetzingiana, Stephanodiscus sp. 1. The distribution pattern of these dominant species is described in relation to habitat condition. Most of these species showed high densities in fine sediments. However, they also occurred in silty sand and sandy sediments in low abundance. The epipsammic forms belonging to the Araphidineae and Monoraphidineae were restricted to the lower tidal flat. The typical species found in coarse sediments were: Cocconeis sp. 1, Opephora martyi, Amphora sabyii, Dimeregramma minor var. nana, Fragilaria virescens var. oblongella, F. virescens, Cocconeis grata. The higher tidal flat, consisting of fine sediments, showed relatively higher cell numbers than the lower tidal flat. The river mouth region was the highest in abundance.


Exploring Users' Desired Emotion in Product Light Focusing on the Refrigerator (제품 조명에 기대하는 소구 감성 탐색: 냉장고 사례를 중심으로)

  • Jeong, Kyeong Ah;Suk, Hyeon-Jeong
    • Science of Emotion and Sensibility / v.21 no.3 / pp.3-16 / 2018
  • Despite the substantial changes made in the product design field to adopt light as an essential design element, there has been little effort to define how customers respond emotionally to the light design of products. It is therefore necessary to analyze the emotional effect of light as a new design element. However, previous research has focused solely on deriving optimal lighting conditions to achieve particular emotional effects. This paper instead investigates the emotional effects that customers desire from a product's light design. We studied refrigerators that use light as the main design element of the product. We applied mixed methods, combining closed-ended questions with an open-ended question, to derive the desired emotions efficiently. Participants were asked to choose the most favorable refrigerator image in each of twelve image groups and to indicate in a short-answer survey form why they chose that image. Approximately one thousand terms were collected, and those terms were classified into 29 groups using thesaurus relationships. The term groups were then classified into four broad emotion categories, labelled "abstract quality," "light property," "space perception," and "visual comfort." In addition, because we observed that light properties were related to the three other categories, we proposed a model of the relationship between desired light style and light properties. This study used mixed methods to identify the emotional value of a new design element. We suggest that the derived emotional categories and the proposed relationship model could be used to evaluate a product's light design.

Implementation of RTOS Simulator With Execution Time Estimation (실행시간 추정 가능한 RTOS 시뮬레이터의 구현)

  • 김방현;류성준;김종현;남영광;이광용
    • Proceedings of the Korea Society for Simulation Conference / 2002.05a / pp.125-129 / 2002
  • An RTOS simulator, one of the tools provided by a real-time operating system (RTOS) development environment, offers a target simulation environment in which applications can be developed and debugged on the host even when the target hardware is not connected to it. This helps developers build applications quickly and lets them do so even before hardware development is complete. For this reason, most commercial RTOS development environments currently provide an RTOS simulator. However, most current commercial RTOS simulators implement only the functional aspects of the RTOS on the host, so they cannot estimate the actual time taken when the RTOS or an RTOS application runs on the real target. Given that a real-time system must produce its results within a fixed time, this is the most serious shortcoming of an RTOS simulator, and a simulator is needed that can estimate execution time while remaining practical to deploy. In this study, we addressed this problem by researching and developing a machine instruction-based RTOS simulator that can estimate the execution time of the RTOS and RTOS applications on the real target and that is suitable for commercialization. We further improved the accuracy of the execution time estimate by taking into account the effects of the pipeline and the cache, which are major factors in execution time. The RTOS used in this study is Q+, developed in 2000 by the Electronics and Telecommunications Research Institute (ETRI), and the target hardware on which Q+ runs is the EBSA-285 board, equipped with a StrongARM SA-110 microprocessor of the ARM family and a 21285 main controller.
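
The idea of instruction-based execution time estimation with cache effects can be shown with a toy model. This sketch is not the simulator described in the abstract: the per-instruction cycle counts, the miss penalty, and the direct-mapped cache geometry are all hypothetical values chosen for illustration, and pipeline effects are reduced to a fixed cost per instruction class.

```python
# All constants below are hypothetical, not StrongARM SA-110 figures.
BASE_CYCLES = {"alu": 1, "load": 2, "store": 1, "branch": 3}
MISS_PENALTY = 10   # extra cycles charged on a cache miss
LINE_SIZE = 32      # bytes per cache line
NUM_LINES = 4       # lines in the direct-mapped cache

def estimate_cycles(trace):
    """Estimate cycles for a trace of (kind, address) pairs; address is None if no memory access."""
    cache = [None] * NUM_LINES  # one stored block number per cache line
    cycles = 0
    for kind, addr in trace:
        cycles += BASE_CYCLES[kind]
        if addr is not None:
            block = addr // LINE_SIZE
            index = block % NUM_LINES
            if cache[index] != block:  # miss: pay the penalty and fill the line
                cycles += MISS_PENALTY
                cache[index] = block
    return cycles

# Hypothetical trace: addresses 0 and 128 map to the same cache line and evict each other.
trace = [("load", 0), ("alu", None), ("load", 4), ("load", 128), ("load", 0)]
total = estimate_cycles(trace)
print(total)
```

A purely functional simulator would count only that five instructions ran. Charging cycles per instruction and per miss is what makes a time estimate possible.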


Studies of application of artificial ground freezing for a subsea tunnel under high water pressure - focused on case histories - (고수압 해저터널 건설을 위한 동결공법 적용성에 관한 연구 - 사례를 중심으로 -)

  • Son, Young-Jin;Lee, Kyu-Won;Ko, Tae Young
    • Journal of Korean Tunnelling and Underground Space Association / v.16 no.5 / pp.431-443 / 2014
  • In this paper, case studies of artificial ground freezing, which has not yet been applied in Korea, are investigated for water cut-off in a subsea tunnel under high water pressure, and the most commonly used cooling media, brine and liquid nitrogen, are examined. Because seawater under pressure has a lower freezing point than pure water, a lower-temperature cooling medium is required for subsea tunnel applications. The cooling medium must also be safe for refrigeration and able to reduce construction time. A brine freezing system can reuse its cooling medium and is safer than liquid nitrogen freezing, but it takes more time to freeze the ground and needs a complex circulation plant. A liquid nitrogen freezing system, on the other hand, cannot recycle its cooling medium and may cause breathing problems or asphyxiation through oxygen deficiency, but it freezes the ground quickly and requires only simple refrigeration equipment. The principal design elements for ground freezing in a subsea tunnel have been extracted, and these elements need further research.

Fishing Experiment on Selectivity of Trawl Net (트로올 어구의 어획 선택성에 관한 연구)

  • 박시환
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.26 no.3 / pp.244-253 / 1990
  • To investigate the selective action of the trawl net, a series of fishing experiments was carried out aboard the M. S. Pusan 402 during 1986~1987, using a trawl net fitted with several pocket nets on each part of the bagnet. The author analyzed the experimental data and derived the following results. 1. In a total of 43 trawl operations, 58 species of aquatic animals were caught, and 33 of these species did not escape at all through the netting of the bagnet. 2. Sardinops melanosticta, Harengula zunasi, Thrissa kammalensis, Engraulis japonicus, small Trachurus japonicus, Sphyraena pinguis, Trichiurus lepturus, and Psenopsis anomala escaped easily through the netting after being caught inside the codend. In particular, Engraulis japonicus escaped readily not only through the netting of the codend but also through the netting of the square and the baiting. 3. With a codend mesh size of 60 mm, Pampus argenteus, Doederleinia berycoides and Trachurus japonicus were all caught at sizes of less than 10 cm.
