• Title/Summary/Keyword: 연관정보

Search Result 3,818, Processing Time 0.033 seconds

Secondary School Students' Images of Doing-Science-Well (과학을 잘 하는 모습에 대한 고등학생의 인식)

  • Lee, Wang-Suk;Kim, Hee-Kyong;Song, Jin-Woong
    • Journal of The Korean Association For Science Education
    • /
    • v.28 no.1
    • /
    • pp.1-14
    • /
    • 2008
  • The image of science is one of the recurrent topics in science education research. In particular, we believe that students' images of Doing-Science-Well could be used for identifying not only students' perceived goals of science learning, but also practical guidelines of effective science teaching. In this study, the students' images of Doing-Science-Well were investigated with the following two research questions: (i) what are student's images of Doing-Science-Well?; (ii) in what contexts do students perceive that someone is doing science well? Thirty seven students in a high school in Seoul, Korea were asked to write their personal experiences by which they realized that someone was doing science well. The main results of the study are the following: Firstly, the images of Doing-Science-Well could be categorized into 'Einstein type', 'Socrates type', 'MacGyver type' and six more types. Secondly, with regard to contexts, students tended to realize that somebody is doing science well in terms of two kinds of contexts: 4 physical contexts and 6 psychological contexts. The findings led us to develop a frame of judging Doing-Science-Well, which combines the types and two kinds of contexts. The frame illustrates the multiplicity of the images of Doing-Science-Well.

Risk Factors for Binge-eating and Food Addiction : Analysis with Propensity-Score Matching and Logistic Regression (폭식행동 및 음식중독의 위험요인 분석: 성향점수매칭과 로지스틱 회귀모델을 이용한 분석)

  • Jake Jeong;Whanhee Lee;Jung In Choi;Young Hye Cho;Kwangyeol Baek
    • Journal of the Korean Applied Science and Technology
    • /
    • v.40 no.4
    • /
    • pp.685-698
    • /
    • 2023
  • This study aimed to identify binge-eating behavior and food addiction in Korean population and to determine their associations with obesity, eating behaviors, mental health and cognitive characteristics. We collected clinical questionnaire scores related to eating problems (e.g. binge eating, food addiction, food cravings), mental health (e.g. depression), and cognitive functions (e.g. impulsivity, emotion regulation) in 257 Korean adults in the normal and the obese weight ranges. Binge-eating and food addiction were most frequent in obese women (binge-eating: 46.6%, food addiction: 29.3%) when we divided the participants into 4 groups depending on gender and obesity status. The independence test using the data with propensity score matching confirmed that binge-eating and food addiction were more prevalent in obese individuals. Finally, we constructed the logistic regression models using forward selection method to evaluate the influence of various clinical questionnaire scores on binge-eating and food addiction respectively. Binge-eating was significantly associated with the clinical scales of eating disorders, food craving, state anxiety, and emotion regulation (cognitive reappraisal) as well as food addiction. Food addiction demonstrated the significant effect of food craving, binge-eating, the interaction of obesity and age, and years of education. In conclusion, we found that binge-eating and food addiction are much more frequent in females and obese individuals. Both binge-eating and food addiction commonly involved eating problems (e.g. food craving), but there was difference in mental health and cognitive risk factors. Therefore, it is required to distinguish food addiction from binge-eating and investigate intrinsic and environmental risk factors for each pathology.

Phenotypic Variation in the Breast of Live Broiler Chickens Over Time (시간에 따른 생축 육계 가슴살의 표현형 변이)

  • Ji-Won Kim;Chang-Ho Han;Seul-Gy Lee;Jun-Ho Lee;Su-Yong Jang;Jeong-Uk Eom;Kang-Jin Jeong;Jae-Cheol Jang;Hyun-Wook Kim;Han-Sul Yang;Sea-Hwan Sohn;Sang-Hyon Oh
    • Korean Journal of Poultry Science
    • /
    • v.51 no.2
    • /
    • pp.97-106
    • /
    • 2024
  • This study utilized the non-invasive MyotonPRO® device to analyze the stiffness in breast muscles of commercial broilers (Ross 308 and Arbor Acres) and compared these findings with data reported for Ross 708, where Woody Breast (WB) symptoms had been previously documented. The research revealed that Ross 308 and Arbor Acres displayed relatively lower stiffness values compared to Ross 708, suggesting a lack of WB expression. These results indicate differentiation in breast muscle traits across strains and underscore the necessity for further research into factors influencing WB manifestation. The study also measured additional muscle tone characteristics such as Frequency, Decrement, Relaxation, and Creep across various growth stages (2, 4, 6, and 8 weeks), finding significant variations with pronounced severity at weeks 2 and 8. An increase in stiffness was observed as the broilers aged, pointing to potential growth-related or stress-induced changes affecting WB severity. A strong positive correlation was established between increased breast meat weight and WB severity, highlighting that heavier breast meat could exacerbate the condition. This correlation is vital for the poultry industry, suggesting that weight management could help mitigate WB effects. Moreover, the potential for genetic selection and breeding strategies to reduce WB occurrence was emphasized, which could aid in enhancing management practices in commercial poultry production. Collectively, these insights contribute to a deeper understanding of WB in broilers and propose avenues for future research and practical strategies to minimize its impact.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.123-138
    • /
    • 2017
  • Since the stock market is driven by the expectation of traders, studies have been conducted to predict stock price movements through analysis of various sources of text data. In order to predict stock price movements, research has been conducted not only on the relationship between text data and fluctuations in stock prices, but also on the trading stocks based on news articles and social media responses. Studies that predict the movements of stock prices have also applied classification algorithms with constructing term-document matrix in the same way as other text mining approaches. Because the document contains a lot of words, it is better to select words that contribute more for building a term-document matrix. Based on the frequency of words, words that show too little frequency or importance are removed. It also selects words according to their contribution by measuring the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix was to collect all the documents to be analyzed and to select and use the words that have an influence on the classification. In this study, we analyze the documents for each individual item and select the words that are irrelevant for all categories as neutral words. We extract the words around the selected neutral word and use it to generate the term-document matrix. The neutral word itself starts with the idea that the stock movement is less related to the existence of the neutral words, and that the surrounding words of the neutral word are more likely to affect the stock price movements. And apply it to the algorithm that classifies the stock price fluctuations with the generated term-document matrix. In this study, we firstly removed stop words and selected neutral words for each stock. And we used a method to exclude words that are included in news articles for other stocks among the selected words. Through the online news portal, we collected four months of news articles on the top 10 market cap stocks. We split the news articles into 3 month news data as training data and apply the remaining one month news articles to the model to predict the stock price movements of the next day. We used SVM, Boosting and Random Forest for building models and predicting the movements of stock prices. The stock market opened for four months (2016/02/01 ~ 2016/05/31) for a total of 80 days, using the initial 60 days as a training set and the remaining 20 days as a test set. The proposed word - based algorithm in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price volatility by collecting and analyzing news articles of the top 10 stocks in market cap. We used the term - document matrix based classification model to estimate the stock price fluctuations and compared the performance of the existing sparse - based word extraction method and the suggested method of removing words from the term - document matrix. The suggested method differs from the word extraction method in that it uses not only the news articles for the corresponding stock but also other news items to determine the words to extract. In other words, it removed not only the words that appeared in all the increase and decrease but also the words that appeared common in the news for other stocks. When the prediction accuracy was compared, the suggested method showed higher accuracy. The limitation of this study is that the stock price prediction was set up to classify the rise and fall, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show the investment performance because stock price fluctuation and profit rate may be different. Therefore, it is necessary to study the research using more stocks and the yield prediction through trading simulation.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.69-92
    • /
    • 2015
  • The explosion of social media data has led to apply text-mining techniques to analyze big social media data in a more rigorous manner. Even if social media text analysis algorithms were improved, previous approaches to social media text analysis have some limitations. In the field of sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common approach. Some studies have been conducted by adding grammatical factors to feature sets for training classification model. The other approach adopts the semantic analysis method to sentiment analysis, but this approach is mainly applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm which is an extension of the neural network algorithms to deal with more extensive semantic features that were underestimated in existing sentiment analysis. The result from adopting the Word2Vec algorithm is compared to the result from co-occurrence analysis to identify the difference between two approaches. The results show that the distribution related word extracted by Word2Vec algorithm in that the words represent some emotion about the keyword used are three times more than extracted by co-occurrence analysis. The reason of the difference between two results comes from Word2Vec's semantic features vectorization. Therefore, it is possible to say that Word2Vec algorithm is able to catch the hidden related words which have not been found in traditional analysis. In addition, Part Of Speech (POS) tagging for Korean is used to detect adjective as "emotional word" in Korean. In addition, the emotion words extracted from the text are converted into word vector by the Word2Vec algorithm to find related words. Among these related words, noun words are selected because each word of them would have causal relationship with "emotional word" in the sentence. The process of extracting these trigger factor of emotional word is named "Emotion Trigger" in this study. As a case study, the datasets used in the study are collected by searching using three keywords: professor, prosecutor, and doctor in that these keywords contain rich public emotion and opinion. Advanced data collecting was conducted to select secondary keywords for data gathering. The secondary keywords for each keyword used to gather the data to be used in actual analysis are followed: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate) Prosecutor (lewd behavior, sponsor). The size of the text data is about to 100,000(Professor: 25720, Doctor: 35110, Prosecutor: 43225) and the data are gathered from news, blog, and twitter to reflect various level of public emotion into text data analysis. As a visualization method, Gephi (http://gephi.github.io) was used and every program used in text processing and analysis are java coding. The contributions of this study are as follows: First, different approaches for sentiment analysis are integrated to overcome the limitations of existing approaches. Secondly, finding Emotion Trigger can detect the hidden connections to public emotion which existing method cannot detect. Finally, the approach used in this study could be generalized regardless of types of text data. The limitation of this study is that it is hard to say the word extracted by Emotion Trigger processing has significantly causal relationship with emotional word in a sentence. The future study will be conducted to clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing with the relationships manually tagged. Furthermore, the text data used in Emotion Trigger are twitter, so the data have a number of distinct features which we did not deal with in this study. These features will be considered in further study.

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.97-117
    • /
    • 2020
  • Product evaluation criteria is an indicator describing attributes or values of products, which enable users or manufacturers measure and understand the products. When companies analyze their products or compare them with competitors, appropriate criteria must be selected for objective evaluation. The criteria should show the features of products that consumers considered when they purchased, used and evaluated the products. However, current evaluation criteria do not reflect different consumers' opinion from product to product. Previous studies tried to used online reviews from e-commerce sites that reflect consumer opinions to extract the features and topics of products and use them as evaluation criteria. However, there is still a limit that they produce irrelevant criteria to products due to extracted or improper words are not refined. To overcome this limitation, this research suggests LDA-k-NN model which extracts possible criteria words from online reviews by using LDA and refines them with k-nearest neighbor. Proposed approach starts with preparation phase, which is constructed with 6 steps. At first, it collects review data from e-commerce websites. Most e-commerce websites classify their selling items by high-level, middle-level, and low-level categories. Review data for preparation phase are gathered from each middle-level category and collapsed later, which is to present single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from reviews by getting part of speech information using morpheme analysis module. After preprocessing, words per each topic from review are shown with LDA and only nouns in topic words are chosen as potential words for criteria. Then, words are tagged based on possibility of criteria for each middle-level category. Next, every tagged word is vectorized by pre-trained word embedding model. Finally, k-nearest neighbor case-based approach is used to classify each word with tags. After setting up preparation phase, criteria extraction phase is conducted with low-level categories. This phase starts with crawling reviews in the corresponding low-level category. Same preprocessing as preparation phase is conducted using morpheme analysis module and LDA. Possible criteria words are extracted by getting nouns from the data and vectorized by pre-trained word embedding model. Finally, evaluation criteria are extracted by refining possible criteria words using k-nearest neighbor approach and reference proportion of each word in the words set. To evaluate the performance of the proposed model, an experiment was conducted with review on '11st', one of the biggest e-commerce companies in Korea. Review data were from 'Electronics/Digital' section, one of high-level categories in 11st. For performance evaluation of suggested model, three other models were used for comparing with the suggested model; actual criteria of 11st, a model that extracts nouns by morpheme analysis module and refines them according to word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The performance evaluation was set to predict evaluation criteria of 10 low-level categories with the suggested model and 3 models above. Criteria words extracted from each model were combined into a single words set and it was used for survey questionnaires. In the survey, respondents chose every item they consider as appropriate criteria for each category. Each model got its score when chosen words were extracted from that model. The suggested model had higher scores than other models in 8 out of 10 low-level categories. By conducting paired t-tests on scores of each model, we confirmed that the suggested model shows better performance in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy. This research proposes evaluation criteria extracting method that combines topic extraction using LDA and refinement with k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improve review analysis for deriving business insights in e-commerce market.

Structural Properties of Social Network and Diffusion of Product WOM: A Sociocultural Approach (사회적 네트워크 구조특성과 제품구전의 확산: 사회문화적 접근)

  • Yoon, Sung-Joon;Han, Hee-Eun
    • Journal of Distribution Research
    • /
    • v.16 no.1
    • /
    • pp.141-177
    • /
    • 2011
  • I. Research Objectives: Most of the previous studies on diffusion have concentrated on efficacy of WOM communication with the use of variables at individual level (Iacobucci 1996; Midgley et al. 1992). However, there is a paucity of studies which investigated network's structural properties as antecedents of WOM from the perspective of consumers' sociocultural propensities. Against this research backbone, this study attempted to link the network's structural properties and consumer' WOM behavior on cross-national basis. The major research objective of this study was to examine the relationship between network properties and WOM by comparing Korean and Chinese consumers. Specific objectives of this research are threefold; firstly, it sought to examine whether network properties (i.e., tie strength, centrality, range) affect WOM (WOM intention and quality of WOM). Secondly, it aimed to explore the moderating effects of cutural orientation (uncertainty avoidance and individuality) on the relationship between network properties and WOM. Thirdly, it substantiates the role of innovativeness as antecedents to both network properties and WOM. II. Research Hypotheses: Based on the above research objectives, the study put forth the following research hypotheses to validate. ${\cdot}$ H 1-1 : The Strength of tie between two counterparts within network will positively influence WOM effectivenes ${\cdot}$ H 1-2 : The network centrality will positively influence the WOM effectiveness ${\cdot}$ H 1-3 : The network range will positively influence the WOM effectiveness ${\cdot}$ H 2-1 : The consumer's uncertainty avoidance tendency will moderate the relationship between network properties and WOM effectiveness ${\cdot}$ H 2-2 : The consumer's individualism tendency will moderate the relationship between network properties and WOM effectiveness ${\cdot}$ H 3-1 : The consumer's innovativeness will positively influence the social network properties ${\cdot}$ H 3-2 : The consumer's innovativeness will positively influence WOM effectiveness III. Methodology: Through a pilot study and back-translation, two versions of questionnaire were prepared, one in Korean and the other in Chinese. The chinese data were collected from the chinese students enrolled in language schools in Suwon city in Korea, while Korean data were collected from students taking classes in a major university in Seoul. A total of 277 questionnaire were used for analysis of Korean data and 212 for Chinese data. The reason why Chinese students living in Korea rather than in China were selected was based on two factors: one was to neutralize the differences (ie, retail channel availability) that may arise from living in separate countries and the second was to minimize the difference in communication venues such as internet accessibility and cell phone usability. SPSS 12.0 and AMOS 7.0 were used for analysis. IV. Results: Prior to hypothesis verification, mean differences between the two countries in terms of major constructs were performed with the following result; As for network properties (tie strength, centrality and range), Koreans showed higher scores in all three constructs. For cultural orientation traits, Koreans scored higher only on uncertainty avoidance trait than Chinese. As a result of verifying the first research objective, confirming the relationship between network properties and WOM effectiveness, on Korean side, tie strength(Beta=.116; t=1.785) and centrality (Beta=.499; t=6.776) significantly influenced on WOM intention, and similar finding was obtained for Chinese side, with tie strength (Beta=.246; t=3.544) and centrality (Beta=.247; t=3.538) being significant. However, with regard to WOM argument quality, Korean data yielded only centrality (Beta=.82; t=7.600) having a significant impact on WOM, whereas China showed both tie strength(Beat=.142; t=2.052) and centrality(Beta=.348; t=5.031) being influential. To answer for the second research objective addressing the moderating role of cultural orientation, moderated regression anaylsis was performed and the result showed that uncertainty avoidance moderated between network range and WOM intention for both Korea and China, But for Korea, the uncertainty avoidance moderated between tie strength and WOM quality, while for China it moderated between network range and WOM intention. And innovativeness moderated between tie strength and WOM intention for Korea but it moderated between network range and WOM intention for China. As a result of analysing for third research objective, we found that for Korea, innovativeness positively influenced centrality only (Beta=.546; t=10.808), while for China it influenced both tie strength (Beta=.203; t=2.998) and centrality(Beta=.518; t=8.782). But for both countries alike, the innovativeness influenced positively on WOM (WOM intention and WOM quality). V. Implications: The study yields the two practical implications. Firstly, the result suggests that companies targeting multinational customers need to identify segments which are susceptible to the positive WOM and WOM information based on individual traits such as uncertainty avoidance and individualism and based on that, develop marketing communication strategy. Secondly, the companies need to divide the market on Roger's five innovation stages and based on this information, enforce marketing strategy which utilizes social networking tools such as public media and WOM. For instance, innovator and early adopters, if provided with new product information, will be able to capitalize upon the network advantages and thus add informational value to network operations using SNS or corporate blog.

  • PDF

A Study on the Establishment of Comparison System between the Statement of Military Reports and Related Laws (군(軍) 보고서 등장 문장과 관련 법령 간 비교 시스템 구축 방안 연구)

  • Jung, Jiin;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.109-125
    • /
    • 2020
  • The Ministry of National Defense is pushing for the Defense Acquisition Program to build strong defense capabilities, and it spends more than 10 trillion won annually on defense improvement. As the Defense Acquisition Program is directly related to the security of the nation as well as the lives and property of the people, it must be carried out very transparently and efficiently by experts. However, the excessive diversification of laws and regulations related to the Defense Acquisition Program has made it challenging for many working-level officials to carry out the Defense Acquisition Program smoothly. It is even known that many people realize that there are related regulations that they were unaware of until they push ahead with their work. In addition, the statutory statements related to the Defense Acquisition Program have the tendency to cause serious issues even if only a single expression is wrong within the sentence. Despite this, efforts to establish a sentence comparison system to correct this issue in real time have been minimal. Therefore, this paper tries to propose a "Comparison System between the Statement of Military Reports and Related Laws" implementation plan that uses the Siamese Network-based artificial neural network, a model in the field of natural language processing (NLP), to observe the similarity between sentences that are likely to appear in the Defense Acquisition Program related documents and those from related statutory provisions to determine and classify the risk of illegality and to make users aware of the consequences. Various artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM) were studied using 3,442 pairs of "Original Sentence"(described in actual statutes) and "Edited Sentence"(edited sentences derived from "Original Sentence"). Among many Defense Acquisition Program related statutes, DEFENSE ACQUISITION PROGRAM ACT, ENFORCEMENT RULE OF THE DEFENSE ACQUISITION PROGRAM ACT, and ENFORCEMENT DECREE OF THE DEFENSE ACQUISITION PROGRAM ACT were selected. Furthermore, "Original Sentence" has the 83 provisions that actually appear in the Act. "Original Sentence" has the main 83 clauses most accessible to working-level officials in their work. "Edited Sentence" is comprised of 30 to 50 similar sentences that are likely to appear modified in the county report for each clause("Original Sentence"). During the creation of the edited sentences, the original sentences were modified using 12 certain rules, and these sentences were produced in proportion to the number of such rules, as it was the case for the original sentences. After conducting 1 : 1 sentence similarity performance evaluation experiments, it was possible to classify each "Edited Sentence" as legal or illegal with considerable accuracy. In addition, the "Edited Sentence" dataset used to train the neural network models contains a variety of actual statutory statements("Original Sentence"), which are characterized by the 12 rules. On the other hand, the models are not able to effectively classify other sentences, which appear in actual military reports, when only the "Original Sentence" and "Edited Sentence" dataset have been fed to them. The dataset is not ample enough for the model to recognize other incoming new sentences. Hence, the performance of the model was reassessed by writing an additional 120 new sentences that have better resemblance to those in the actual military report and still have association with the original sentences. Thereafter, we were able to check that the models' performances surpassed a certain level even when they were trained merely with "Original Sentence" and "Edited Sentence" data. If sufficient model learning is achieved through the improvement and expansion of the full set of learning data with the addition of the actual report appearance sentences, the models will be able to better classify other sentences coming from military reports as legal or illegal. Based on the experimental results, this study confirms the possibility and value of building "Real-Time Automated Comparison System Between Military Documents and Related Laws". The research conducted in this experiment can verify which specific clause, of several that appear in related law clause is most similar to the sentence that appears in the Defense Acquisition Program-related military reports. This helps determine whether the contents in the military report sentences are at the risk of illegality when they are compared with those in the law clauses.

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.163-176
    • /
    • 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs, such as Twitter, have gained in popularity because of its ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers users' efforts and investment for content generation by recommending shorter posts. There has been a lot research into capturing the social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method to measure TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users are interacting with each other while watching television or movies, or visiting a new place. In order to measure TV ratings, some features are significant during certain hours of the day, or days of the week, whereas these same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model capturing the time sensitive relevance is required to estimate TV ratings. Therefore, modeling time-related characteristics of features should be a key when measuring the TV ratings through microblogs. We show that capturing time-dependency of features in measuring TV ratings is vitally necessary for improving their accuracy. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our data set for the experiment. After excluding data such as adverting or promoted tweets, we selected 149 thousand tweets for analysis. The number of tweets reaches its maximum level on the broadcasting day and increases rapidly around the broadcasting time. This result is stems from the characteristics of the public channel, which broadcasts the program at the predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This result implies that a simple tweet rate does not reflect the satisfaction or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons or newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find that there is a time-dependency in the correlation of features between the before and after broadcasting time. Since the TV program is broadcast at the predetermined time regularly, users post tweets expressing their expectation for the program or disappointment over not being able to watch the program. The highly correlated features before the broadcast are different from the features after broadcasting. This result explains that the relevance of words with TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have the highest correlation before the broadcasting time, whereas 68 words reach the highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show a high relevance, despite containing a negative meaning. Understanding the time-dependency of features can be helpful in improving the accuracy of TV ratings measurement. This research contributes a basis to estimate the response to or satisfaction with the broadcasted programs using the time dependency of words in Twitter chatter. More research is needed to refine the methodology for predicting or measuring TV ratings.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.