• Title/Summary/Keyword: textual analysis

Search Result 201

Analyzing data-related policy programs in Korea using text mining and network cluster analysis (텍스트 마이닝과 네트워크 군집 분석을 활용한 한국의 데이터 관련 정책사업 분석)

  • Sungjun Choi;Kiyoon Shin;Yoonhwan Oh
    • Journal of Korea Society of Industrial Information Systems / v.28 no.6 / pp.63-81 / 2023
  • This study endeavors to classify and categorize similar policy programs through network clustering analysis, using textual information from data-related policy programs in Korea. To achieve this, descriptions of data-related budgetary programs in South Korea in 2022 were collected, and keywords from the program contents were extracted. Subsequently, the similarity between each pair of programs was derived using TF-IDF, and a policy program network was constructed accordingly. Following this, the structural characteristics of the network were analyzed, and similar policy programs were clustered and categorized through network clustering. Upon analyzing a total of 97 programs, 7 major clusters were identified, signifying that programs with analogous themes or objectives were categorized based on application area or services utilizing data. The findings of this research illuminate the current status of data-related policy programs in Korea, providing policy implications for a strategic approach to planning future national data strategies and programs, and contributing to the establishment of evidence-based policies.
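The pipeline described in this abstract (keywords, TF-IDF similarity, network, clusters) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so the code below assumes a simple stand-in: programs are linked when their cosine similarity exceeds a hypothetical `threshold`, and clusters are the connected components of that network. The toy keyword lists are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_programs(docs, threshold=0.1):
    """Link programs above the (assumed) similarity threshold; return connected components."""
    vecs = tfidf_vectors(docs)
    n = len(docs)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vecs[i], vecs[j]) > threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(component))
    return clusters

# Invented keyword lists standing in for program descriptions.
programs = [
    ["health", "data", "platform"],
    ["health", "data", "hospital"],
    ["traffic", "sensor", "data"],
    ["traffic", "sensor", "road"],
]
clusters = cluster_programs(programs)
print(clusters)
```

A community-detection method on the weighted network would be a closer match to "network clustering" than connected components; the component approach is used here only because it needs no external library.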

Body discourse on DE&I in the fashion industry analyzed through The New York Times (뉴욕타임즈를 통해 분석한 패션산업 내 DE&I에 관한 신체담론)

  • Myeongseon Yi;Eunhyuk Yim
    • The Research Journal of the Costume Culture / v.32 no.2 / pp.164-180 / 2024
  • In the context of a globalized society where diversity, equity, and inclusion (DE&I) have emerged as pivotal values, the fashion industry is undergoing scrutiny for its practices related to body DE&I. This study examines the nature of the discourse surrounding body DE&I within the fashion industry, focusing on how such discussions are shaped, disseminated, and manifested in both the industry and broader society. Critical discourse analysis is applied to content from the New York Times, using Fairclough's analytical framework encompassing textual, discursive, and social practices. The findings indicate that the New York Times emphasizes diversity, with a significant focus on the shapes and sizes of women's bodies, developing a narrative centered around women's bodies through visible and representative domains. The analysis suggests conflicted discourse, with prevailing critiques against the fashion industry's standardization of beauty and superficial inclusivity efforts. Moreover, the industry's adaptation to social demands for body DE&I is observed as sporadic, often leveraging non-normative bodies as a marketing strategy rather than genuinely embracing diversity. This study highlights the importance of continuous, in-depth discourse and social practices regarding DE&I within the fashion industry, as well as the need for systemic changes and policies that genuinely reflect societal demands for inclusivity. The findings provide a foundation for future investigations into the multifaceted relationship between fashion discourse, DE&I, and social practices, advocating for a more inclusive and critically aware fashion industry.

Analyzing Contextual Polarity of Unstructured Data for Measuring Subjective Well-Being (주관적 웰빙 상태 측정을 위한 비정형 데이터의 상황기반 긍부정성 분석 방법)

  • Choi, Sukjae;Song, Yeongeun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.83-105 / 2016
  • Measuring an individual's subjective wellbeing in an accurate, unobtrusive, and cost-effective manner is a core success factor of the wellbeing support system, which is a type of medical IT service. However, measurements with a self-report questionnaire and wearable sensors are cost-intensive and obtrusive when the wellbeing support system should be running in real-time, despite being very accurate. Recently, reasoning the state of subjective wellbeing with conventional sentiment analysis and unstructured data has been proposed as an alternative to resolve the drawbacks of the self-report questionnaire and wearable sensors. However, this approach does not consider contextual polarity, which results in lower measurement accuracy. Moreover, there is no sentimental word net or ontology for the subjective wellbeing area. Hence, this paper proposes a method to extract keywords and their contextual polarity representing the subjective wellbeing state from the unstructured text in online websites in order to improve the reasoning accuracy of the sentiment analysis. The proposed method is as follows. First, a set of general sentimental words is proposed. SentiWordNet was adopted; this is the most widely used dictionary and contains about 100,000 words such as nouns, verbs, adjectives, and adverbs with polarities from -1.0 (extremely negative) to 1.0 (extremely positive). Second, corpora on subjective wellbeing (SWB corpora) were obtained by crawling online text. A survey was conducted to prepare a learning dataset that includes an individual's opinion and the level of self-report wellness, such as stress and depression. The participants were asked to respond with their feelings about online news on two topics. Next, three data sources were extracted from the SWB corpora: demographic information, psychographic information, and the structural characteristics of the text (e.g., the number of words used in the text, simple statistics on the special characters used). 
These were considered to adjust the level of a specific SWB. Finally, a set of reasoning rules was generated for each wellbeing factor to estimate the SWB of an individual based on the text written by the individual. The experimental results suggested that using contextual polarity for each SWB factor (e.g., stress, depression) significantly improved the estimation accuracy compared to conventional sentiment analysis methods incorporating SentiWordNet. Even though literature is available on Korean sentiment analysis, such studies used only a limited set of sentimental words. Due to the small number of words, many sentences are overlooked and ignored when estimating the level of sentiment. However, the proposed method can identify multiple sentiment-neutral words as sentiment words in the context of a specific SWB factor. The results also suggest that a specific type of senti-word dictionary containing contextual polarity needs to be constructed along with a dictionary based on common sense such as SenticNet. These efforts will enrich and enlarge the application area of sentic computing. The study is helpful to practitioners and managers of wellness services in that a couple of characteristics of unstructured text have been identified for improving SWB. Consistent with the literature, the results showed that gender and age affect the SWB state when the individual is exposed to an identical cue from the online text. In addition, the length of the textual response and the usage pattern of special characters were found to indicate the individual's SWB. These findings imply that better SWB measurement should involve collecting the textual structure and the individual's demographic conditions. In the future, the proposed method should be improved by automated identification of the contextual polarity in order to enlarge the vocabulary in a cost-effective manner.
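The core idea, that a word neutral in a general lexicon can carry polarity for a specific SWB factor, can be sketched as follows. This is not the authors' rule set: both lexicons, their scores, and the averaging rule are hypothetical and chosen only to show how a context-specific dictionary overrides general polarity.

```python
# Hypothetical general-purpose lexicon (SentiWordNet-style scores in [-1.0, 1.0]).
GENERAL_LEXICON = {"happy": 0.8, "tired": -0.4, "deadline": 0.0, "overtime": 0.0}

# Hypothetical context-specific polarities, keyed by SWB factor.
CONTEXT_LEXICON = {
    "stress": {"deadline": -0.6, "overtime": -0.7},
}

def polarity(tokens, factor=None):
    """Mean polarity of known tokens; context scores override general ones."""
    context = CONTEXT_LEXICON.get(factor, {})
    scores = [context.get(t, GENERAL_LEXICON.get(t)) for t in tokens]
    scores = [s for s in scores if s is not None]  # drop out-of-lexicon tokens
    return sum(scores) / len(scores) if scores else 0.0

text = "another deadline and more overtime so tired".split()
general_score = polarity(text)            # "deadline" and "overtime" count as neutral
stress_score = polarity(text, "stress")   # both become negative in the stress context
print(general_score, stress_score)
```

With the general lexicon alone the sentence looks only mildly negative, while the stress-specific scores make it clearly negative, which is the effect the abstract attributes to contextual polarity.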

A Method for Evaluating News Value based on Supply and Demand of Information Using Text Analysis (텍스트 분석을 활용한 정보의 수요 공급 기반 뉴스 가치 평가 방안)

  • Lee, Donghoon;Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.45-67 / 2016
  • Given the recent development of smart devices, users are producing, sharing, and acquiring a variety of information via the Internet and social network services (SNSs). Because users tend to use multiple media simultaneously according to their goals and preferences, domestic SNS users use around 2.09 media concurrently on average. Since the information provided by such media is usually textually represented, recent studies have been actively conducting textual analysis in order to understand users more deeply. Earlier studies using textual analysis focused on analyzing a document's contents without substantive consideration of the diverse characteristics of the source medium. However, current studies argue that analytical and interpretive approaches should be applied differently according to the characteristics of a document's source. Documents can be classified into the following types: informative documents for delivering information, expressive documents for expressing emotions and aesthetics, operational documents for inducing the recipient's behavior, and audiovisual media documents for supplementing the above three functions through images and music. Further, documents can be classified according to their contents, which comprise facts, concepts, procedures, principles, rules, stories, opinions, and descriptions. Documents have unique characteristics according to the source media by which they are distributed. In terms of newspapers, only highly trained people tend to write articles for public dissemination. In contrast, with SNSs, various types of users can freely write any message and such messages are distributed in an unpredictable way. Again, in the case of newspapers, each article exists independently and does not tend to have any relation to other articles. However, messages (original tweets) on Twitter, for example, are highly organized and regularly duplicated and repeated through replies and retweets. 
There have been many studies focusing on the different characteristics between newspapers and SNSs. However, it is difficult to find a study that focuses on the difference between the two media from the perspective of supply and demand. We can regard the articles of newspapers as a kind of information supply, whereas messages on various SNSs represent a demand for information. By investigating traditional newspapers and SNSs from the perspective of supply and demand of information, we can explore and explain the information dilemma more clearly. For example, there may be superfluous issues that are heavily reported in newspaper articles despite the fact that users seldom have much interest in these issues. Such overproduced information is not only a waste of media resources but also makes it difficult to find valuable, in-demand information. Further, some issues that are covered by only a few newspapers may be of high interest to SNS users. To alleviate the deleterious effects of information asymmetries, it is necessary to analyze the supply and demand of each information source and, accordingly, provide information flexibly. Such an approach would allow the value of information to be explored and approximated on the basis of the supply-demand balance. Conceptually, this is very similar to the price of goods or services being determined by the supply-demand relationship. Adopting this concept, media companies could focus on the production of highly in-demand issues that are in short supply. In this study, we selected Internet news sites and Twitter as representative media for investigating information supply and demand, respectively. We present the notion of News Value Index (NVI), which evaluates the value of news information in terms of the magnitude of Twitter messages associated with it. In addition, we visualize the change of information value over time using the NVI. We conducted an analysis using 387,014 news articles and 31,674,795 Twitter messages. 
The analysis results revealed interesting patterns: most issues show a lower NVI than the average across all issues, whereas a few issues show a steadily higher NVI than the average.
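The abstract does not give the NVI formula. As an illustration of the supply-demand idea only, the sketch below assumes that an issue's index is its share of Twitter messages (demand) divided by its share of news articles (supply), so values above 1 mark in-demand issues that are in short supply. The function name, the formula, and the counts are all assumptions.

```python
def news_value_index(articles, tweets):
    """Assumed NVI: an issue's share of tweets divided by its share of articles."""
    total_articles = sum(articles.values())
    total_tweets = sum(tweets.values())
    nvi = {}
    for issue, n_articles in articles.items():
        if n_articles == 0:
            continue  # no supply to compare against
        supply = n_articles / total_articles
        demand = tweets.get(issue, 0) / total_tweets
        nvi[issue] = demand / supply
    return nvi

# Invented counts for three issues.
articles = {"election": 500, "weather": 300, "local_festival": 200}
tweets = {"election": 20000, "weather": 10000, "local_festival": 70000}

nvi = news_value_index(articles, tweets)
for issue, value in sorted(nvi.items(), key=lambda item: item[1], reverse=True):
    print(f"{issue}: {value:.2f}")
```

Computing the same index per time window would give the kind of over-time view the abstract mentions.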

Sensory and Mechanical Characteristics of Surichwi-injeulmi by Adding Surichwi Contents (수리취 인절미의 수리취 첨가량에 따른 텍스쳐 특성)

  • 이숙미;조정순
    • Korean journal of food and cookery science / v.17 no.1 / pp.1-6 / 2001
  • The purpose of this study was to investigate the sensory and mechanical characteristics of Surichwi-injeulmi at different levels of added Surichwi. In the sensory evaluation, acceptance was best for color, flavor, and overall quality when 20% Surichwi was added. As the proportion of Surichwi increased, lightness and yellowness decreased, whereas redness increased in the negative direction. In the textural analysis during storage at 20°C, hardness, chewiness, gumminess, and cohesiveness decreased as the Surichwi content increased. Hardness, chewiness, gumminess, and cohesiveness also decreased with increasing storage time, whereas elasticity increased. During storage at 20°C, the textural characteristics of Surichwi-injeulmi with 30% and 40% Surichwi changed less than those of the 0% and 10% groups.


A Comparative Analysis of the Linguistic Features of Texts used in the unit of Volcano and Earthquake in Korean Elementary and Secondary School Science Textbooks (초.중등 과학 교과서 화산과 지진 관련 단원 글의 언어 구조 비교 분석)

  • Shin, Myung-Hwan;Maeng, Seung-Ho;Kim, Chan-Jong
    • Journal of the Korean earth science society / v.31 no.1 / pp.36-50 / 2010
  • The purpose of this study is to investigate how the texts in elementary and secondary school science textbooks vary across grade levels in terms of linguistic features. Data included some of the written texts related to 'Volcano and Earthquake' in Korean elementary and secondary school science textbooks in the seventh National Curriculum. The written texts were comparatively analyzed in terms of textual meaning, interpersonal meaning, and ideational meaning. Results revealed that the structures and linguistic features of the texts in school science textbooks differed depending on the grade level. Therefore, we argue that the differences identified in this study may make science textbooks feel difficult and unfamiliar to students when they read and try to understand them. We suggest that science teachers need to play the role of a mediator between students' understanding and the structural features of the scientific language in science learning.

Visualization of unstructured personal narratives of perterm birth using text network analysis (텍스트 네트워크 분석을 이용한 조산 경험 이야기의 시각화)

  • Kim, Jeung-Im
    • Women's Health Nursing / v.26 no.3 / pp.205-212 / 2020
  • Purpose: This study aimed to identify the components of preterm birth (PTB) through women's personal narratives and to visualize clinical symptom expressions (CSEs). Methods: The participants were 11 women who gave birth before 37 weeks of gestational age. Personal narratives were collected by interactive unstructured storytelling via individual interviews, from August 8 to December 4, 2019, after receiving approval from the Institutional Review Board. The textual data were converted to PDF and analyzed using the MAXQDA program (VERBI Software). Results: The participants' mean age was 34.6 (±2.98) years, and five participants had a spontaneous vaginal birth. The following nine components of PTB were identified: obstetric condition, emotional condition, physical condition, medical condition, hospital environment, life-related stress, pregnancy-related stress, spousal support, and informational support. The top three codes were preterm labor, personal characteristics, and premature rupture of membranes, and the codes found for more than half of the participants were short cervix, fear of PTB, concern about fetal well-being, sleep difficulty, insufficient spousal and informational support, and physical difficulties. The top six CSEs were stress, hydramnios, false labor, concern about fetal well-being, true labor pain, and uterine contraction. "Stress" was ranked first in terms of frequency and "uterine contraction" had individual attributes. Conclusion: The text network analysis of narratives from women who gave birth preterm yielded nine PTB components and six CSEs. These nine components should be included when developing a reliable and valid scale for PTB risk and stress. The CSEs can be applied for assessing preterm labor, as well as considered as strategies for students in women's health nursing practicum.
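The study itself used MAXQDA; as a rough illustration of what a text network over coded narratives involves, the sketch below counts how often pairs of codes occur in the same narrative, which gives the weighted edges of a co-occurrence network. The three narratives are invented toy data that merely reuse CSE labels from the abstract, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(coded_narratives):
    """Count how many narratives each pair of codes appears in together."""
    edges = Counter()
    for codes in coded_narratives:
        # sorted(set(...)) gives each unordered pair one canonical key
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges

# Invented coded narratives; not data from the study.
narratives = [
    ["stress", "uterine contraction", "false labor"],
    ["stress", "false labor"],
    ["stress", "hydramnios"],
]

edges = cooccurrence_edges(narratives)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

The resulting edge weights can be passed to any graph-drawing tool to produce a visualization in which frequently co-occurring expressions sit close together.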

Opinion Retrieval in Twitter Considering Syntactic Relations of Sentiment Phrase (의견 어구의 구문 관계를 고려한 트위터 의견 검색)

  • Kim, Yoonsung;Yang, Min-Chul;Lee, Seung-Wook;Rim, Hae-Chang
    • KIISE Transactions on Computing Practices / v.20 no.9 / pp.492-497 / 2014
  • In this paper, we propose a method of retrieving opinionated tweets on Twitter, one of the most popular social network services, on which various users share diverse opinions. Typical opinion retrieval systems may treat the presence of sentiment phrases (subjectivity) as an important factor even if the subjective phrases are not related to the given query or speaker. To alleviate this problem, we utilized the syntactic structure of a sentence to identify 1) the subjectivity-query relationship, 2) the subjectivity-speaker relationship, and 3) the syntactic role of the subjectivity. In addition, our learning-to-rank approach is trained to retrieve opinionated tweets based on query relevance, textual features, user information, and Twitter-specific features. Experimental results on real-world data show that our proposed method achieves better performance than several baseline methods in terms of precision and nDCG.
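The two evaluation measures named in this abstract can be computed as in the following sketch. It uses the common formulation of DCG with a log2 rank discount and takes the ideal ordering from the same judged list; the paper may have used a different variant or cutoff, and the relevance labels below are invented.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with a log2(rank + 1) discount (ranks start at 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    """DCG of the top k results divided by the DCG of the ideal ordering."""
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

def precision_at_k(relevances, k):
    """Fraction of the top k results that are relevant (relevance > 0)."""
    return sum(1 for rel in relevances[:k] if rel > 0) / k

# Invented judgments for one query: 1 = relevant opinionated tweet, 0 = not.
ranked = [1, 0, 1, 0, 0]
print(precision_at_k(ranked, 5))
print(round(ndcg_at_k(ranked, 5), 4))
```

Precision treats every position in the top k equally, while nDCG rewards placing relevant tweets nearer the top, which is why ranking papers often report both.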

Images of Female and Male Business Leaders in Newspaper Photographs (신문보도사진에 나타난 남녀 경제리더의 이미지 분석)

  • Kim, Heejin;Lee, Su-Min
    • The Journal of the Korea Contents Association / v.12 no.12 / pp.80-92 / 2012
  • News plays a role similar to that of myths in modern society. Myths provide the framework through which people understand and experience the world, and news media play an important role in constructing these myths. In this context, this study examined how female and male business leaders have been represented in newspaper photographs through quantitative and qualitative textual analyses. Photographs of female business leaders that appeared in news interviews and profile news stories from 1990 to 2011 were analyzed, and an equal number of photographs of male leaders were also investigated. As a result, gendered perspectives were found. While male business leaders were portrayed as active and serious figures in connection with their professional workplaces, mostly in bust shots, female business leaders were represented as passive objects artificially posed for the news in a context separated from their business, and often in full-length shots.

Grading System of Movie Review through the Use of An Appraisal Dictionary and Computation of Semantic Segments (감정어휘 평가사전과 의미마디 연산을 이용한 영화평 등급화 시스템)

  • Ko, Min-Su;Shin, Hyo-Pil
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.669-696 / 2010
  • Assuming that the whole meaning of a document is a composition of the meanings of each part, this paper proposes to study the automatic grading of movie reviews which contain sentimental expressions. This will be accomplished by calculating the values of semantic segments and performing data classification for each review. The ARSSA (Automatic Rating System for Sentiment analysis using an Appraisal dictionary) system is an effort to model decision-making processes in a manner similar to that of the human mind. It aims to resolve the discontinuity between the numerical ranking and textual rationalization present in the binary structure of the current review rating system: {rate: review}. This model can be realized by performing analysis on the abstract means extracted from each review. The performance of this system was experimentally calculated by performing a 10-fold cross-validation test on 1,000 reviews obtained from the Naver Movie site. The system achieved an 85% F1 score when compared to predefined values using a predefined appraisal dictionary.
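The evaluation protocol mentioned here, 10-fold cross-validation scored by F1, can be sketched as below. The fold layout (contiguous, unshuffled) and the binary F1 definition are assumptions for illustration; the abstract does not say how the folds were drawn or how F1 was aggregated across rating grades.

```python
def k_fold_indices(n_items, k):
    """Yield (train, test) index lists for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size

def f1_score(gold, predicted, positive=1):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 1,000 reviews split into 10 folds: each fold tests on 100 and trains on 900.
folds = list(k_fold_indices(1000, 10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))

# Toy labels showing the metric on one fold's predictions.
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a full run, a classifier would be trained on each `train` split, scored on the matching `test` split, and the ten F1 values averaged.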
