• Title/Summary/Keyword: 수행도 예측 (performance prediction)

Analysis of Greenhouse Thermal Environment by Model Simulation (시뮬레이션 모형에 의한 온실의 열환경 분석)

  • 서원명;윤용철
    • Journal of Bio-Environment Control / v.5 no.2 / pp.215-235 / 1996
  • Thermal analysis by mathematical model simulation makes it possible to predict, with reasonable accuracy, the heating and cooling requirements of greenhouses located in various geographical and climatic environments. A further advantage of model simulation is that it helps in selecting an appropriate heating system, setting up an energy utilization strategy, scheduling seasonal cropping patterns, and determining new greenhouse ranges. In this study, the control pattern for the greenhouse microclimate is categorized into cooling and heating. A dynamic model was adopted to simulate heating requirements and energy conservation effectiveness, including the energy saved by a night-time thermal curtain, the estimation of Heating Degree-Hours (HDH), and the long-term prediction of greenhouse thermal behavior. The cooling effects of ventilation, shading, and a pad & fan system, on the other hand, were partly analyzed by a static model. Experiments with a small model greenhouse of 1.2 m $\times$ 2.4 m showed that spraying cold water directly on the greenhouse cover surface, or recirculating cold water through heat exchangers, would be effective for summer cooling. The mathematical model developed for greenhouse simulation is highly applicable because it can reflect various climatic factors such as temperature, humidity, beam and diffuse solar radiation, and wind velocity. The model was closely verified against weather data obtained through long-term greenhouse experiments. Most of the results on greenhouse heating and cooling components were obtained by simulating the model greenhouse mathematically with typical-year (1987) data for Jinju, Gyeongnam. Some of the results on greenhouse cooling were obtained from model experiments, which included analyzing the cooling effect of water sprayed directly on the greenhouse roof surface. The results are summarized as follows:
    1. The heating requirements of the model greenhouse were closely related to the minimum temperature set for the greenhouse. The night-time set temperature has a much greater influence on heating energy requirements than the day-time set temperature. It is therefore highly recommended that the night-time set temperature be carefully determined and controlled.
    2. HDH data obtained by the conventional method are estimated from long-term average temperatures together with a standard base temperature (usually 18.3$^{\circ}C$). Such data can serve only as a relative criterion for comparing heating loads; they are not applicable to calculating greenhouse heating requirements because they consider few climatic factors and use an inappropriate base temperature. Comparing the HDH data with the simulation results shows that a heating system designed from HDH data will probably overshoot the actual heating requirement.
    3. The energy saving effect of the night-time thermal curtain, as well as the estimated heating requirement, was found to be sensitive to weather conditions. The thermal curtain adopted for the simulation was highly effective, saving more than 50% of the annual heating requirement.
    4. Ventilation performance during warm seasons is mainly influenced by the air exchange rate, although it varies somewhat with greenhouse structure, weather, and cropping conditions. For air exchange rates above 1 volume per minute, the reduction in temperature rise in both types of greenhouse considered becomes modest as ventilation capacity is increased further. The desirable ventilation capacity is therefore assumed to be 1 air change per minute, which is the ventilation rate commonly recommended for greenhouses.
    5. In a glass-covered greenhouse with full production, under clear weather at 50% RH and continuous ventilation of 1 air change per minute, the temperature drop was 2.6$^{\circ}C$ in the 50% shaded greenhouse and 6.1$^{\circ}C$ in the greenhouse with a pad & fan system. The temperature in the control greenhouse under continuous air change at that time was 36.6$^{\circ}C$, which was 5.3$^{\circ}C$ above ambient temperature. As a result, the greenhouse temperature can be maintained 3$^{\circ}C$ below ambient temperature. When RH is 80%, however, it was impossible to bring the greenhouse temperature below ambient, because the possible temperature reduction by the pad & fan system is then no more than 2.4$^{\circ}C$.
    6. For the three months of the hot summer season, assuming the greenhouse is cooled only when its temperature rises above 27$^{\circ}C$, the relationship between the RH of ambient air and the greenhouse temperature drop ($\Delta T$) was formulated as $\Delta T = -0.077\,RH + 7.7$.
    7. The time-dependent cooling effects of ventilation, 50% shading, and a pad & fan system of 80% efficiency, operated individually or in combination, were predicted continuously over one typical summer day. When the greenhouse was cooled only by 1 air change per minute, greenhouse air temperature was 5$^{\circ}C$ above outdoor temperature. Neither method alone could bring greenhouse air temperature below outdoor temperature, even under fully cropped conditions. When both systems were operated together, however, greenhouse air temperature could be controlled to about 2.0-2.3$^{\circ}C$ below ambient temperature.
    8. When cool water of 6.5-8.5$^{\circ}C$ was sprayed on the greenhouse roof surface at a flow rate of 1.3 liter/min per unit greenhouse floor area, greenhouse air temperature dropped to 16.5-18.0$^{\circ}C$, which is about 10$^{\circ}C$ below the ambient temperature of 26.5-28.0$^{\circ}C$ at that time.
    The most important requirement for cooling greenhouse air effectively by water spray may be securing a plentiful source of cool water, such as ground water or cold water produced by a heat pump. Future work will focus not only on analyzing the feasibility of heat pump operation but also on finding the relationships among greenhouse air temperature ($T_{g}$), spray water temperature ($T_{w}$), water flow rate ($Q$), and ambient temperature ($T_{o}$).
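
The regression between ambient RH and temperature drop reported in this abstract, and the degree-hour idea behind HDH, can be illustrated with a small sketch. The function names and the sample hourly temperatures are hypothetical; only the coefficients -0.077 and 7.7 and the 18.3 °C base temperature come from the abstract. The HDH function shows the conventional definition only, which the abstract itself considers unsuitable for sizing greenhouse heating systems.

```python
def temperature_drop(rh_percent):
    """Greenhouse temperature drop (deg C) from the abstract's regression
    dT = -0.077 * RH + 7.7, with RH given in percent."""
    return -0.077 * rh_percent + 7.7


def heating_degree_hours(hourly_temps_c, base_c=18.3):
    """Conventional heating degree-hours: the sum of hourly deficits
    below the base temperature (18.3 deg C is the value cited above)."""
    return sum(max(0.0, base_c - t) for t in hourly_temps_c)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for rh in (50, 80):
        print(f"RH {rh}% -> dT = {temperature_drop(rh):.2f} deg C")
    print(heating_degree_hours([10.0, 20.0, 18.3]))
```

Under this regression the predicted drop shrinks linearly with humidity and reaches zero at 100% RH, which matches the abstract's observation that evaporative cooling loses effectiveness in humid air.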

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Given a set of example sentences, a language model predicts the next word or character from sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently because they are probabilistic models based on the frequency of each unit in the training set. With recent advances in deep learning, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used as neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependencies between objects that are entered into the model sequentially (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, because a training set of sentences generally includes a huge number of words or morphemes, the dictionary is very large, which increases model complexity. In addition, word-level or morpheme-level models can generate only the vocabulary contained in the training set. Furthermore, for morphologically rich languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to make errors in the decomposition process (Lankinen et al., 2016). This paper therefore proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit that makes up Korean text. We construct the language model using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was carried out on Old Testament texts using the deep learning package Keras, which is based on Theano. 
After pre-processing the texts, the dataset contained 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed each input vector from 20 consecutive characters, with the following 21st character as the output. In total, 1,023,411 input-output pairs were included in the dataset, and we divided them into training, validation, and test sets in the proportion 70:15:15. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated on the validation set, the perplexity evaluated on the test set, and the time taken to train each model. All the optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, which were clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm also took the longest to train for both the 3- and 4-layer LSTM models. On average, the 4-layer LSTM model took 69% longer to train than the 3-layer LSTM model, yet its validation loss and perplexity did not improve significantly and even became worse under specific conditions. On the other hand, when the automatically generated sentences were compared, the 4-layer LSTM model tended to generate sentences closer to natural language than the 3-layer LSTM model. Although the completeness of the generated sentences differed slightly between the models, sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost grammatically perfect. The results of this study are expected to be widely used for Korean language processing in the fields of language processing and speech recognition, which form the basis of artificial intelligence systems.
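
The data preparation described in this abstract (phoneme-level text, 20-character inputs with the 21st character as target, and a 70:15:15 split) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the authors decomposed syllables, so Unicode NFD normalization is used here as a stand-in, and all function names are hypothetical.

```python
import unicodedata


def to_phonemes(text):
    """Decompose Hangul syllables into consonant/vowel jamo.
    Assumption: Unicode NFD decomposition approximates the paper's
    phoneme-level preprocessing."""
    return list(unicodedata.normalize("NFD", text))


def make_windows(seq, window=20):
    """Pair each run of `window` consecutive symbols with the symbol
    that follows it (the prediction target)."""
    return [(seq[i:i + window], seq[i + window])
            for i in range(len(seq) - window)]


def split_70_15_15(pairs):
    """Split pairs into training, validation, and test sets (70:15:15)."""
    n = len(pairs)
    a, b = n * 70 // 100, n * 85 // 100
    return pairs[:a], pairs[a:b], pairs[b:]


if __name__ == "__main__":
    print(len(to_phonemes("한")))          # initial, vowel, final
    pairs = make_windows(list(range(120)))  # toy sequence, not real text
    train, val, test = split_70_15_15(pairs)
    print(len(train), len(val), len(test))
```

Working at the jamo level keeps the symbol inventory small (the abstract reports 74 unique characters), which is the main motivation given for avoiding word-level or morpheme-level dictionaries.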

A Study on the Observation of Soil Moisture Conditions and its Applied Possibility in Agriculture Using Land Surface Temperature and NDVI from Landsat-8 OLI/TIRS Satellite Image (Landsat-8 OLI/TIRS 위성영상의 지표온도와 식생지수를 이용한 토양의 수분 상태 관측 및 농업분야에의 응용 가능성 연구)

  • Chae, Sung-Ho;Park, Sung-Hwan;Lee, Moung-Jin
    • Korean Journal of Remote Sensing / v.33 no.6_1 / pp.931-946 / 2017
  • The purpose of this study is to observe and analyze soil moisture conditions at high resolution and to evaluate the feasibility of applying the results to agriculture. For this purpose, we used three Landsat-8 OLI (Operational Land Imager)/TIRS (Thermal Infrared Sensor) optical and thermal infrared satellite images taken from May to June of 2015, 2016, and 2017, covering the rural areas of Jeollabuk-do, where 46% of agricultural areas are located. The soil moisture conditions on each date in the study area can be obtained effectively from the SPI3 (Standardized Precipitation Index) drought index, and the three images correspond to near normal, moderately wet, and moderately dry soil moisture conditions. The temperature vegetation dryness index (TVDI) was calculated to observe the soil moisture status from the Landsat-8 OLI/TIRS images with different soil moisture conditions and to compare it with the soil moisture conditions obtained from the SPI3 drought index. TVDI is estimated from the relationship between LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index) calculated from the Landsat-8 OLI/TIRS satellite images. The maximum and minimum values of LST for each NDVI are extracted from the distribution of pixels in the LST-NDVI feature space, and the dry and wet edges of LST as a function of NDVI are determined by linear regression. The TVDI value is obtained as the ratio of the position of a pixel's LST between the two edges. We classified the relative soil moisture conditions from the TVDI values into five stages (very wet, wet, normal, dry, and very dry) and compared them with the soil moisture conditions obtained from SPI3. Because May to June is the rice-planting season, 62% of the whole image area was classified as wet or very wet, owing to the paddy fields that make up the largest proportion of the images. The pixels classified as normal were attributed to the influence of the field areas in the images. 
The TVDI classification results for the whole image roughly corresponded to the SPI3 soil moisture condition, but they did not correspond in the subdivided classes of very dry, wet, and very wet. In addition, after the agricultural areas were extracted and classified into paddy field and field, the paddy field area did not correspond to the SPI3 drought index in the very dry, normal, and very wet classes, and the field area did not correspond to the SPI3 drought index in the normal class. This is considered to result from problems in dry/wet edge estimation caused by outliers such as extremely dry bare soil, very wet paddy fields, water, clouds, and mountain topography effects (shadow). However, in the agricultural area, especially the field area, it was possible to observe soil moisture conditions effectively at the subdivided level from May to June. This method is expected to be applicable to observing temporal and spatial changes in soil moisture status in agricultural areas with high-spatial-resolution optical satellites and to forecasting agricultural production.
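
The TVDI calculation described in this abstract, in which a pixel's LST is positioned between the wet and dry edges at its NDVI, can be sketched as below. The edge coefficients, the sample pixel values, and the equal-width class boundaries are all hypothetical assumptions for illustration; the abstract gives neither the fitted edges nor the thresholds used for the five classes.

```python
def tvdi(lst, ndvi, dry_edge, wet_edge):
    """TVDI = (LST - LST_wet) / (LST_dry - LST_wet), where each edge is a
    linear function of NDVI given as (intercept, slope)."""
    lst_dry = dry_edge[0] + dry_edge[1] * ndvi
    lst_wet = wet_edge[0] + wet_edge[1] * ndvi
    return (lst - lst_wet) / (lst_dry - lst_wet)


def classify(value):
    """Map TVDI to five classes. Assumption: equal 0.2-wide bins."""
    if value < 0.2:
        return "very wet"
    if value < 0.4:
        return "wet"
    if value < 0.6:
        return "normal"
    if value < 0.8:
        return "dry"
    return "very dry"


if __name__ == "__main__":
    # Hypothetical edges (Kelvin): the dry edge falls as NDVI rises.
    dry, wet = (320.0, -20.0), (290.0, 0.0)
    value = tvdi(lst=300.0, ndvi=0.5, dry_edge=dry, wet_edge=wet)
    print(value, classify(value))
```

Because both edges come from regression over the pixel cloud, outliers such as water, cloud, or shadow pixels shift the edges and hence every TVDI value, which is the failure mode the abstract points to.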

Effects of Temperature Conditions on the Growth and Oviposition of Brown Planthopper, Nilaparvata lugens Stål (온도조건(溫度條件)이 벼멸구의 발육(發育) 및 산란(産卵)에 미치는 영향(影響)에 관한 연구(硏究))

  • Bae, Soon-Do;Song, Yoo-Han;Park, Yeong-Do
    • Korean journal of applied entomology / v.26 no.1 s.70 / pp.13-23 / 1987
  • This study was conducted to determine the effects of temperature conditions on the growth and oviposition of the brown planthopper (BPH), Nilaparvata lugens Stål. The results are intended to help predict the timing of BPH control by estimating the population dynamics of the BPH in response to temperature fluctuations after the insects migrate into paddy fields. Developmental and ovipositional rates under constant and alternating temperature conditions were observed in a plant growth cabinet. Egg hatchability was highest at $25^{\circ}C$ and decreased below or above this optimum temperature. The egg period was shortest at $27.5^{\circ}C$ and was prolonged with decreasing temperature, but development was also retarded at temperatures above $30^{\circ}C$. Adult emergence rates were highest at $27.5^{\circ}C$ and declined with decreasing temperature, and no adults emerged at $32.5^{\circ}C$ or $35^{\circ}C$. The nymphal developmental period was shortest at both $27.5^{\circ}C$ and $30^{\circ}C$ and was extended with decreasing temperature. Female longevity increased with decreasing temperature, and male longevity was shortest at $27.5^{\circ}C$. The preoviposition period was shortest at $32.5^{\circ}C$ and was prolonged with decreasing temperature; it was about 6.5 times longer at $17.5^{\circ}C$ than at $32.5^{\circ}C$. The number of eggs oviposited per female was greatest at $25^{\circ}C$ and decreased at temperatures below or above this optimum. Under the same total effective day-degrees, hatchability at the alternating temperature was about 10% higher than at the constant temperature, but the egg period at the alternating temperature was nearly identical to that at the constant temperature. Under the $22^{\circ}C$ condition, the emergence rate was about 8% higher at the alternating temperature than at the constant temperature, whereas under the $28^{\circ}C$ condition the rate was about 8% higher at the constant temperature than at the alternating temperature. 
The nymphal period was about $4{\sim}6$ days longer at the alternating temperature than at the constant temperature. Under the same total effective day-degrees in the adult stage, both longevity and oviposition period were longer at the alternating temperature than at the constant temperature. The number of eggs oviposited per female was also higher at the alternating temperature. Females reared at a constant temperature of $28^{\circ}C$ lived longest regardless of the temperature to which they were exposed after emergence. This result seems to indicate that female longevity is greatly influenced by the temperature to which the insects were exposed during the immature stages. The preoviposition period was affected by the temperature experienced during both the nymphal and adult stages, whereas the number of eggs oviposited was affected by the temperature during the adult stage only. Based on the results of this study, the developmental threshold temperatures seem to be $14.12^{\circ}C$ for eggs, $14.76^{\circ}C$ for nymphs, $9.62^{\circ}C$ for adults, and $15.95^{\circ}C$ for the preoviposition period. The estimated total effective temperatures for completing each stage were 141.25 day-degrees for eggs, 167.83 day-degrees for nymphs, 349.64 day-degrees for adults, and 58.60 day-degrees for preoviposition.
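
The threshold temperatures and day-degree totals reported at the end of this abstract are the two parameters of a linear degree-day model, in which a stage is complete once the accumulated excess of temperature over the threshold reaches the stage total. The sketch below applies that standard model to the reported values; the function names and the sample temperatures are hypothetical, and the linear model should not be expected to hold at the high temperatures where the abstract reports retarded development.

```python
# (threshold deg C, total effective day-degrees), as reported in the abstract
STAGES = {
    "egg": (14.12, 141.25),
    "nymph": (14.76, 167.83),
    "adult": (9.62, 349.64),
    "preoviposition": (15.95, 58.60),
}


def development_days(temp_c, threshold_c, degree_days):
    """Days to complete a stage at a constant temperature, or None if the
    temperature does not exceed the developmental threshold."""
    effective = temp_c - threshold_c
    if effective <= 0:
        return None
    return degree_days / effective


def days_to_complete(daily_mean_temps_c, threshold_c, degree_days):
    """First day on which accumulated day-degrees reach the stage total,
    or None if the temperature series is too short."""
    accumulated = 0.0
    for day, temp in enumerate(daily_mean_temps_c, start=1):
        accumulated += max(0.0, temp - threshold_c)
        if accumulated >= degree_days:
            return day
    return None


if __name__ == "__main__":
    t0, k = STAGES["egg"]
    print(round(development_days(25.0, t0, k), 2))
```

Summing daily excess temperature in this way is what allows field temperature records to be turned into a forecast of when each stage, and hence the control window, will occur.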

Relationships between Eating Behavior, Dietary Self-Efficacy, and Nutrition Knowledge of Elementary School Students by Food Service Type in Gangwon Province (강원지역 초등학생들의 급식유형(도시형, 농어촌형 및 도서벽지형) 별식행동과 식이자기효능감 및 영양지식과의 관계)

  • Won, Hyang-Rye;Shin, Gi-Beum
    • Journal of the Korean Society of Food Science and Nutrition / v.41 no.5 / pp.638-646 / 2012
  • The purpose of this study was to find the relationships among eating behavior, dietary self-efficacy, and nutrition knowledge by comparing these items among elementary school students according to food service type. The survey was conducted by questionnaire with 759 sixth-grade students from 39 elementary schools in Gangwon Province. For the questions about the application of nutrition knowledge and the frequency of eating out, the average eating behavior score was highest for the urban type, followed by the agri-fishery type and then the remote island and country type. The average nutrition knowledge score differed significantly by food service type for the questions about eating snacks before going to sleep and weight gain, and about calorie comparisons between foods. In the correlation analysis of eating behavior, dietary self-efficacy, and nutrition knowledge, the agri-fishery type showed significant positive correlations among all three items. In the remote island and country type, there were positive relationships between nutrition knowledge and dietary self-efficacy, and between eating behavior and dietary self-efficacy, but no significant correlation between nutrition knowledge and eating behavior. To identify the predictors of eating behavior, a stepwise regression analysis was performed with dietary self-efficacy and nutrition knowledge, which showed significant relationships with eating behavior, as independent variables. The results showed that, in the urban type, both dietary self-efficacy and nutrition knowledge affected eating behavior, whereas in the agri-fishery type and the remote island and country type, only dietary self-efficacy affected eating behavior.

Evaluation for Rock Cleavage Using Distribution of Microcrack Spacings (III) (미세균열의 간격 분포를 이용한 결의 평가 (III))

  • Park, Deok-Won
    • The Journal of the Petrological Society of Korea / v.25 no.4 / pp.311-324 / 2016
  • The characteristics of the rock cleavage in Jurassic granite from Geochang were analysed. Three quarrying planes and three rock cleavages were evaluated using the following parameters: (1) reduction ratio between the value of spacing and the value of length, (2) microcrack spacing frequency (N), (3) total spacing (${\leq}1$ mm), (4) exponential constant (a), (5) magnitude of exponent (${\lambda}$), (6) mean spacing ($S_{mean}$), (7) difference value ($S_{mean}-S_{median}$) between mean spacing and median spacing ($S_{median}$), and (8) density of spacing. In particular, a close dependence between the above spacing parameters and the parameters from the spacing-cumulative frequency diagrams was derived. The discrimination factors representing the three quarrying planes and three rock cleavages were acquired through these mutual contrasts. The results of the analysis are summarized as follows. First, the reduction ratios of frequency (N), mean value, median value, the above difference value ($S_{mean}-S_{median}$), and density for the three rock cleavages are in the orders of G(grain, (G1 + G2)/2) < H(hardway, (H1 + H2)/2) < R(rift, (R1 + R2)/2), H < G $\ll$ R, H < G $\ll$ R, H < G < R and H < G $\ll$ R, respectively. The values of the above five parameters for the three planes show the various orders of R'(rift plane) $\ll$ H'(hardway plane) < G'(grain plane), R' $\ll$ G' < H', R' < H' < G', R' < G' < H' and R' $\ll$ H' < G', respectively. Second, the values of (I) parameters (2, 3, 4 and 5) and (II) parameters (6, 7 and 8) are in the orders of (I) H < G < R and (II) R < G < H. In contrast, the values of the above two groups (I and II) of parameters for the three planes show the reverse orders. Third, regarding the overall arrangement of the six diagrams, the related chart shows an order of R2 < R1 < G2 < G1 < H2 < H1. In other words, the six diagrams can be summarized in the order rift (R1 + R2) < grain (G1 + G2) < hardway (H1 + H2). 
These results indicate the relative magnitude of rock cleavage in relation to microcrack spacing. In particular, two parameters for each diagram, the above difference value ($S_{mean}-S_{median}$) and the mean spacing, could provide advance information for predicting the order of arrangement among the diagrams. Finally, a general chart for the three planes and three rock cleavages was made. In the related chart, the three exponential straight lines for the three rock cleavages show an order of R(R1 + R2) < G(G1 + G2) < H(H1 + H2). In contrast, the three lines for the three planes show an order of H'(R2 + G2) < G'(R1 + H2) < R'(G1 + H1). Consequently, a mutually reverse order between the three planes and the three rock cleavages can be drawn from the related chart.

Assessment of Nutrient Intakes of Lunch Meals for the Aged Customers at the Elderly Care Facilities Through Measuring Cooking Yield Factor and the Weighed Plate Waste (조리 중량 변화 계수 및 잔반계측법을 이용한 노인복지시설 이용자의 점심식사 영양섭취평가)

  • Chang, Hye-Ja;Yi, Na-Young;Kim, Tae-Hee
    • Journal of Nutrition and Health / v.42 no.7 / pp.650-663 / 2009
  • The purposes of this study were to investigate the portion sizes of menus served and to evaluate the nutrient intake from lunch at three elderly care facility food services located in Seoul. A weighed plate waste method was employed to measure plate waste and consumption of the menus served. Yield factors were calculated from cooking experiments based on standardized recipes and were used to evaluate nutrient intake. One hundred elderly people participated in the plate waste measurement and were asked to complete a questionnaire. Nutrient analyses of the served and consumed meals were performed using the CAN program. The yield factors after cooking were 2.4 for rice dishes regardless of type, 1.58 for thick soups, 0.60 to 0.70 for meat dishes, and 1.0 to 1.25 for blanched vegetables. Average consumption was 235.97 g for rice, 248.53 g for soup, 72.83 g for meat dishes, 39.80 g for vegetables, and 28.36 g for Kimchi. The average plate waste rate was 14.0%, with the highest plate waste percentages for Kimchi (26.2%) and meat/fish dishes (17.3%). The NAR (Nutrition Adequacy Ratio) results showed that iron (0.12), calcium (0.64), riboflavin (0.80), and folic acid (0.97) were below 1.0 in both the male and female elderly groups, and NAR differed significantly among the three facilities. Compared with one third of the Dietary Reference Intakes (DRIs) for the elderly, the nutrient intake analysis showed that intakes of calcium (100%) and iron (100%), followed by riboflavin, vitamin A, and vitamin B6, did not meet one third of the EAR (Estimated Average Requirement). For nutritious meal management, a professional dietitian should be placed at each elderly care center to develop standardized recipes that take into account yield factors and the health and nutrition status of the elderly.
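
The arithmetic behind the quantities in this abstract can be sketched briefly. The formulas below are the usual definitions (yield factor as cooked weight divided by raw weight, plate waste rate as the wasted share of the served weight, and NAR as intake divided by the reference intake); the abstract does not spell them out, so treat them, the function names, and the sample served and wasted weights as assumptions.

```python
def raw_equivalent(cooked_g, yield_factor):
    """Raw ingredient weight corresponding to a cooked weight, assuming
    yield factor = cooked weight / raw weight."""
    return cooked_g / yield_factor


def plate_waste_rate(served_g, waste_g):
    """Plate waste as a percentage of the served weight."""
    return 100.0 * waste_g / served_g


def nutrient_adequacy_ratio(intake, reference):
    """NAR: nutrient intake divided by its reference intake."""
    return intake / reference


if __name__ == "__main__":
    # 235.97 g of cooked rice and the 2.4 yield factor are from the abstract.
    print(round(raw_equivalent(235.97, 2.4), 2))
    # Hypothetical served and wasted weights.
    print(plate_waste_rate(served_g=200.0, waste_g=28.0))
```

Converting consumed cooked weights back to raw weights is what lets intake be looked up in food composition data that are tabulated for raw ingredients.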

Adaptive RFID anti-collision scheme using collision information and m-bit identification (충돌 정보와 m-bit인식을 이용한 적응형 RFID 충돌 방지 기법)

  • Lee, Je-Yul;Shin, Jongmin;Yang, Dongmin
    • Journal of Internet Computing and Services / v.14 no.5 / pp.1-10 / 2013
  • An RFID (Radio Frequency Identification) system is a non-contact identification technology. A basic RFID system consists of a reader and a set of tags. RFID tags can be divided into active and passive tags. Active tags have their own power source and can operate on their own, whereas passive tags are small and low-cost, so passive tags are better suited to the distribution industry than active tags. A reader processes the information received from tags. An RFID system achieves fast identification of multiple tags using radio frequency. RFID systems have been applied in a variety of fields such as distribution, logistics, transportation, inventory management, access control, and finance. To encourage the introduction of RFID systems, several problems (price, size, power consumption, security) should be resolved. In this paper, we propose an algorithm that significantly alleviates the collision problem caused by simultaneous responses of multiple tags. Anti-collision schemes for RFID systems fall into three categories: probabilistic, deterministic, and hybrid. We introduce ALOHA-based protocols as the probabilistic method and Tree-based protocols as the deterministic one. In ALOHA-based protocols, time is divided into multiple slots; tags randomly select a slot and transmit their IDs in it. However, ALOHA-based protocols cannot guarantee that all tags are identified because they are probabilistic. In contrast, Tree-based protocols guarantee that a reader identifies all tags within its transmission range. In Tree-based protocols, a reader sends a query and tags respond to it with their own IDs. When a reader sends a query and two or more tags respond, a collision occurs, and the reader then constructs and sends a new query. Frequent collisions degrade identification performance. Therefore, to identify tags quickly, it is necessary to reduce collisions efficiently. Each RFID tag has a 96-bit EPC (Electronic Product Code) as its ID. 
The tags of a given company or manufacturer have similar IDs that share the same prefix. As a result, unnecessary collisions occur when multiple tags are identified with the Query Tree protocol. This increases the number of query-responses and the idle time, so the identification time increases significantly. To solve this problem, the Collision Tree protocol and the M-ary Query Tree protocol have been proposed. However, in the Collision Tree protocol and the Query Tree protocol, only one bit is identified per query-response, and when similar tag IDs exist, the M-ary Query Tree protocol generates unnecessary query-responses. In this paper, we propose the Adaptive M-ary Query Tree protocol, which improves identification performance using m-bit recognition, collision information of tag IDs, and a prediction technique. We compare our proposed scheme with other Tree-based protocols under the same conditions and show that it outperforms them in terms of identification time and identification efficiency.
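
To make the collision problem concrete, the baseline Query Tree protocol mentioned in this abstract can be simulated in a few lines. This sketch is not the proposed Adaptive M-ary Query Tree protocol, whose details the abstract does not give; it only shows the plain binary Query Tree, with short hypothetical tag IDs standing in for 96-bit EPCs and with unique, equal-length IDs assumed.

```python
from collections import deque


def query_tree(tag_ids):
    """Simulate the basic binary Query Tree protocol.
    The reader broadcasts a prefix; every tag whose ID starts with that
    prefix responds. One response identifies a tag, several responses
    collide and the prefix is extended by 0 and 1, none is an idle query."""
    queue = deque([""])
    identified = []
    stats = {"queries": 0, "collisions": 0, "idle": 0}
    while queue:
        prefix = queue.popleft()
        stats["queries"] += 1
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])
        elif len(responders) > 1:
            stats["collisions"] += 1
            queue.append(prefix + "0")
            queue.append(prefix + "1")
        else:
            stats["idle"] += 1
    return identified, stats


if __name__ == "__main__":
    # Two of the three hypothetical tags share the prefix "00".
    tags = ["0001", "0010", "1100"]
    print(query_tree(tags))
```

In this toy run the two tags sharing a prefix need three collided queries to separate, one bit at a time, which is the overhead that motivates resolving several bits per query.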

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing them to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining, not only because it produces massive unstructured textual data in real time but also because it serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets of Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing the candidates' names and 'election' on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects social trends effectively. The system also retrieves the list of terms that co-occur with given query terms. 
We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of pattern, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track an issue faster than other media such as newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users can form relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all the candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network that is unique to Twitter.
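
Term co-occurrence retrieval, one of the system functions listed in this abstract, can be illustrated with a minimal counter. This is a sketch under simplifying assumptions rather than the authors' system: tokens are split on whitespace (Korean text would normally need morphological analysis), the toy tweets are invented, and the function name is hypothetical.

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=10):
    """Count the terms that appear in the same tweet as the query term
    and return the most frequent ones."""
    query = query.lower()
    counts = Counter()
    for tweet in tweets:
        tokens = set(tweet.lower().split())  # each term counted once per tweet
        if query in tokens:
            counts.update(tokens - {query})
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Invented toy tweets, for illustration only.
    tweets = ["moon poll rising", "moon poll debate", "park poll"]
    print(cooccurring_terms(tweets, "moon", top_n=1))
```

Running the same query against each candidate's name and comparing the ranked lists is what separates general election terms from terms specific to one candidate.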

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.113-125
    • /
    • 2013
  • Every company wants to know its customers' requirements and makes an effort to meet them. Because of this, communication between customers and the company has become a core competitive factor in business, and its importance continues to increase. There are several strategies for identifying customers' needs, but VOC (Voice of Customer) is one of the most powerful communication tools, and gathering VOC through channels such as telephone, post, e-mail, and websites is highly meaningful. Consequently, almost every company gathers VOC and operates a VOC system. VOC is important not only to business organizations but also to public organizations, such as government bodies, educational institutes, and medical centers, that need to raise public service quality and customer satisfaction. Accordingly, they build systems for gathering and analyzing VOC and use them to create new products and services and to improve existing ones. In recent years, innovations in the internet and ICT have created diverse channels for collecting VOC data, such as SNS, mobile, websites, and call centers. Although a large amount of VOC data is collected through these channels, using it properly remains difficult, because VOC data consists of highly emotional content expressed in informal voice or text and because its volume is very large. Such unstructured big data is difficult for humans to store and analyze. Organizations therefore need a system that automatically collects, stores, classifies, and analyzes unstructured big VOC data. This study proposes an intelligent VOC analyzing system based on opinion mining that automatically classifies unstructured VOC data and determines both the polarity and the type of each VOC. The basis of the VOC opinion analyzing system, a domain-oriented sentiment dictionary, is then created, and the corresponding stages are presented in detail. 
The experiment was conducted with 4,300 VOC records collected from a medical website to measure the effectiveness of the proposed system; the records were used to develop the sentiment dictionary by determining the domain-specific sentiment vocabulary and its polarity values in the medical domain. The experiment shows that positive terms such as "칭찬 (praise), 친절함 (kindness), 감사 (thanks), 무사히 (safely), 잘해 (doing well), 감동 (being moved), 미소 (smile)" have high positive opinion values, and negative terms such as "퉁명 (curt), 뭡니까 ('what is this?'), 말하더군요 ('so they said'), 무시하는 (dismissive)" have strongly negative opinion values. These terms are in general use, and the experimental results suggest that they indicate opinion polarity with high probability. Furthermore, the accuracy of the proposed VOC classification model was compared, and the highest classification accuracy of 77.8% was confirmed at an opinion classification threshold of -0.50. With the proposed intelligent VOC analyzing system, the real-time opinion classification and response priority of VOC can be predicted. Ultimately, the system is expected to help catch customer complaints at an early stage and deal with them quickly, with fewer staff needed to operate the VOC system. This can free up human resources and time in the customer service department. Above all, this study is a new attempt to analyze unstructured VOC data automatically using opinion mining, and it shows that the system's output could be used as a variable to classify the positive or negative polarity of VOC opinions. It is expected to suggest a practical framework for VOC analysis across diverse uses, and the model could serve as a real VOC analyzing system once it is implemented. Despite these results and expectations, this study has several limitations. First of all, the sample data was collected only from one hospital website, which means that the sentiment dictionary built from the sample data may lean too heavily toward that hospital and website. 
Future research should therefore cover several channels, such as call centers and SNS, and other domains, such as government, financial companies, and educational institutes.
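
The dictionary-based polarity classification described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the polarity values below are hypothetical (the abstract does not report them), terms are matched by simple substring search rather than Korean morphological analysis, and the way the -0.50 threshold is applied (mean polarity at or below the threshold counts as negative) is a guess at one plausible reading, not the paper's confirmed procedure.

```python
# Terms come from the abstract; the numeric polarity values are hypothetical.
SENTIMENT_DICT = {
    "칭찬": 1.0,      # praise
    "감사": 0.9,      # thanks
    "친절함": 0.8,    # kindness
    "퉁명": -0.9,     # curt
    "무시하는": -1.0,  # dismissive
    "뭡니까": -0.8,    # "what is this?"
}

def opinion_score(text, dictionary=SENTIMENT_DICT):
    """Mean polarity of the dictionary terms found in `text` (0.0 if none)."""
    hits = [value for term, value in dictionary.items() if term in text]
    return sum(hits) / len(hits) if hits else 0.0

def classify(text, threshold=-0.50):
    """Label a VOC entry as negative when its score is at or below the threshold.

    Assumption: text without any dictionary term scores 0.0 and therefore
    falls on the positive side of the threshold.
    """
    return "negative" if opinion_score(text) <= threshold else "positive"

complaint = "간호사가 퉁명하게 말하고 무시하는 태도였습니다"
compliment = "친절함에 감사 드립니다"

complaint_label = classify(complaint)
compliment_label = classify(compliment)
```

In a response-priority setting, entries labeled negative would be routed to staff first; tuning `threshold` against labeled data is how a value such as -0.50 would be selected.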