• Title/Summary/Keyword: TextMining

Analysis of English abstracts in Journal of the Korean Data & Information Science Society using topic models and social network analysis (토픽 모형 및 사회연결망 분석을 이용한 한국데이터정보과학회지 영문초록 분석)

  • Kim, Gyuha;Park, Cheolyong
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.151-159 / 2015
  • This article analyzes the English abstracts of articles published in the Journal of the Korean Data & Information Science Society using text mining techniques. First, term-document matrices are formed by various methods and then visualized through social network analysis. LDA (latent Dirichlet allocation) and CTM (correlated topic model) are also employed to extract topics from the abstracts. The performance of the topic models is compared via entropy across several numbers of topics and several weighting methods used to form the term-document matrices.
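
The pipeline this abstract describes (build a term-document matrix, apply a weighting, score topic distributions by entropy) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: whitespace tokenization, TF-IDF as the weighting, and Shannon entropy over a document's topic proportions are all assumptions, since the abstract does not give the exact definitions used in the paper.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term i in document j."""
    # Assumption: simple lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def tfidf_weight(matrix):
    """Re-weight raw counts by log(N / document frequency), one possible weighting."""
    n_docs = len(matrix[0])
    weighted = []
    for row in matrix:
        df = sum(1 for x in row if x > 0)  # number of documents containing the term
        idf = math.log(n_docs / df)
        weighted.append([x * idf for x in row])
    return weighted


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector such as topic proportions."""
    return -sum(p * math.log(p) for p in probs if p > 0)


docs = ["topic model topic", "network analysis", "topic network"]
vocab, tdm = term_document_matrix(docs)
weighted = tfidf_weight(tdm)
```

Under this reading, a lower entropy means a document's probability mass is concentrated on fewer topics.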

Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis (키워드 기반 주제중심 분석을 이용한 비정형데이터 처리)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.521-526 / 2017
  • Big data comes in diverse formats and vast volumes and is generated very quickly, so it requires new management and analysis methods rather than traditional data processing methods. Text mining techniques can be used to extract useful information from unstructured text written in human language in online documents and on social networks. Identifying trends in the political, economic, and cultural messages left on social media helps in understanding which topics people are interested in. In this study, text mining was performed on online news related to a given keyword using a topic-oriented analysis technique. We use Latent Dirichlet Allocation (LDA) to extract information from web documents and to analyze which subjects are of interest for a given keyword and which topics are related to which core values.

Ethical Fashion Research Trend Using Text Mining: Network Analysis of the Published Literature 2009-2019 (텍스트 마이닝을 활용한 윤리적 패션 연구동향: 2009-2019 연구 네트워크 분석)

  • Choi, Yeong-Hyeon;Lee, Kyu-Hye
    • The Korean Fashion and Textile Research Journal / v.22 no.2 / pp.181-191 / 2020
  • The fashion industry has faced environmental, social, and ethical issues due to increased interest in ethical consumption. Numerous ethical studies have been conducted in the fashion industry. This study looked at the current state of research by year, academic journal, and detail in major related papers published in Scopus and KCI between 2009 and 2019. Ethical fashion studies began to appear in 2009, were concentrated in certain academic journals, and focused on fashion marketing and fashion design. Topics in ethical fashion were terms such as sustainable, eco-friendly, up-cycling, recycling, eco, zero-waste, and organic. In ethical fashion studies, environmental studies were conducted most often; in addition, the terms used along with ethical fashion tend to be used frequently within each particular major. Looking at the keywords used in research by period, the study showed that research was most diverse between 2016 and 2019. In particular, environmental and social issues of ethical fashion and convergence with the animal protection, new distribution, and science and technology sectors were newly added between 2016 and 2019. This study used text mining and network analysis to understand the overall trends of ethical fashion studies in Korea. In conclusion, it is important to understand the relationships between the main words along with the analysis of the current status.

A Study on the User Perception in Fashion Design through Social Media Text-Mining (소셜미디어 텍스트마이닝을 통한 패션디자인 사용자 인식 조사)

  • An, Hyosun;Park, Minjung
    • Journal of the Korean Society of Clothing and Textiles / v.41 no.6 / pp.1060-1070 / 2017
  • This study seeks methods to analyze users' perception of fashion designs shown in social media using text mining. 'Men's stripe shirts' were selected as the subject, and related texts were collected mainly from blogs. Texts from 13,648 posts from November 1st, 2015 to October 31st, 2016 were analyzed by applying the LDA algorithm and content analysis. As a result, the wearing status per season and the subjects of men's stripe shirts were derived. Across the entire period, the main topics discussed by users were pattern, customized suits, brands, coordination, and purchase information. In terms of seasons, spring showed the sharing of information on coordinating daily looks or boyfriend looks, and during the winter season the information shared was about shirts suitable for special occasions such as job interviews and stripe shirts that match suits. The study results showed that text-mining analysis is capable of analyzing context and providing a user-centered index that responds to demands newly mentioned by users along with the rapid changes in fashion design trends.

Text Mining and Network Analysis of News Articles for Deriving Socio-Economic Damage Types of Heat Wave Events in Korea: 2012~2016 Cases (뉴스 기사 텍스트 마이닝과 네트워크 분석을 통한 폭염의 사회·경제적 영향 유형 도출: 2012~2016년 사례)

  • Jung, Jae In;Lee, Kyoungjun;Kim, Seungbum
    • Atmosphere / v.30 no.3 / pp.237-248 / 2020
  • In order to prepare effectively for damage caused by weather events, it is important to proactively identify the possible impacts of weather phenomena on the domestic society and economy. Text mining and network analysis are used in this paper to build a database of damage types and levels caused by heat waves. We collect news articles about heat waves from the SBS news website and determine the primary and secondary effects of heat waves through network analysis. In addition, based on the frequency with which each impact keyword is mentioned, we estimate how much influence each factor has. As a result, the types of impacts caused by heat waves are efficiently derived. Among these types of impacts, we find that people in South Korea are mainly interested in algae and heat-related illness. Since this analysis technique can be applied not only to news articles but also to social media content, such as Twitter and Facebook, it is expected to be a useful tool for building weather impact databases.
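
The two quantities this abstract relies on, how often each impact keyword is mentioned and which keywords appear together, can be sketched as below. This is only an illustration under stated assumptions: the sample articles and keyword list are hypothetical, matching is by plain substring, and frequency is counted as the number of articles mentioning a keyword, none of which is confirmed as the paper's procedure.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(articles, keywords):
    """Count per-keyword article frequency and keyword co-occurrence edges."""
    freq = Counter()   # keyword -> number of articles that mention it
    edges = Counter()  # (keyword_a, keyword_b) -> number of articles mentioning both
    for text in articles:
        lowered = text.lower()
        # Assumption: a keyword is "mentioned" if it occurs as a substring.
        present = sorted(k for k in keywords if k in lowered)
        freq.update(present)
        edges.update(combinations(present, 2))
    return freq, edges


# Hypothetical sample data, not taken from the paper.
articles = [
    "Heat wave causes algae bloom and power outage",
    "Algae bloom spreads during heat wave",
    "Heat-related illness rises",
]
keywords = ["algae", "illness", "outage"]
freq, edges = keyword_stats(articles, keywords)
```

The `edges` counter is an edge list for a weighted co-occurrence network, and `freq` gives a simple ranking of how much attention each impact type receives.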

Time Series Analysis of Patent Keywords for Forecasting Emerging Technology (특허 키워드 시계열 분석을 통한 부상 기술 예측)

  • Kim, Jong-Chan;Lee, Joon-Hyuck;Kim, Gab-Jo;Park, Sang-Sung;Jang, Dong-Sick
    • KIPS Transactions on Software and Data Engineering / v.3 no.9 / pp.355-360 / 2014
  • Forecasting emerging technology plays an important role in business strategy and R&D investment. There are various approaches to technology forecasting, including patent analysis. Qualitative analysis methods based on experts' evaluations and opinions have mainly been used for technology forecasting with patents. However, qualitative methods do not ensure the objectivity of analysis results and require high cost and a long time. To make up for these weaknesses, patent data can be analyzed quantitatively and statistically using text mining techniques. In this paper, we suggest a new method of technology forecasting using text mining and ARIMA analysis.
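
To show how an ARIMA-style forecast could be applied to a keyword's yearly frequency, here is a deliberately small sketch of an ARIMA(1,1,0) model without a constant, fitted by least squares on the differenced series. The model order, the fitting method, and the keyword counts are all assumptions for illustration; the abstract does not state which ARIMA specification the authors used.

```python
def fit_ar1_on_differences(series):
    """Fit phi in d[t] = phi * d[t-1] by least squares, where d is the first difference."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    numerator = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    denominator = sum(prev * prev for prev in diffs[:-1])
    return numerator / denominator, diffs


def forecast_next(series):
    """One-step-ahead forecast: last value plus the predicted next difference."""
    phi, diffs = fit_ar1_on_differences(series)
    return series[-1] + phi * diffs[-1]


# Hypothetical yearly counts of one patent keyword.
keyword_counts = [10.0, 12.0, 13.0, 13.5, 13.75]
phi, _ = fit_ar1_on_differences(keyword_counts)
next_value = forecast_next(keyword_counts)
```

In practice a statistics library would also select the model order and estimate a constant; this sketch only shows the differencing and autoregressive steps.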

Quantifying the Process of Patent Right Quality Evaluation : Combined Application of AHP, Text Mining and Regression Analysis (특허권리성의 정량적 평가방법에 대한 연구 : AHP, 텍스트 마이닝, 회귀분석의 활용)

  • Yoon, Janghyeok;Song, Jaeguk;Ryu, Tae-Kyu
    • Journal of Korean Society of Industrial and Systems Engineering / v.38 no.2 / pp.17-30 / 2015
  • Technology-oriented national R&D programs produce intellectual property as their final result. Patents, as typical industrial intellectual property, are therefore considered an important factor when evaluating the outcome of R&D programs. Among the main components of patent evaluation, in particular, the patent right quality is a key component constituting patent value, together with marketability and usability. Current approaches for patent right quality evaluation rely mostly on intrinsic knowledge of patent attorneys, and the recent rapid increase of national R&D patents is making expert-based evaluation costly and time-consuming. Therefore, this study defines a hierarchy of patent right quality and then proposes how to quantify the evaluation process of patent right quality by combining text mining and regression analysis. This study will contribute to understanding of the systemic view of the patent right quality evaluation, as well as be an efficient aid for evaluating patents in R&D program assessment processes.
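
The AHP step named in the title turns pairwise importance judgments into criterion weights. A minimal sketch follows, using the row geometric mean method, a common approximation of AHP's principal-eigenvector weights. The three criteria and the comparison values are hypothetical and do not come from the paper's hierarchy.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments: pairwise[i][j] says how much more important
# criterion i is than criterion j (reciprocal matrix).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
```

For a perfectly consistent matrix like this one, the geometric mean weights coincide with the eigenvector weights (here 4/7, 2/7, and 1/7).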

Analysis of the Contents of Hanbok in the 「Home Life and Safety」 section of the High School Technical Family Textbook: Content Analysis and Text Mining Techniques are utilized (고등학교 기술·가정 교과서 「가정생활과 안전」 영역의 한복 내용 분석)

  • Shim, Joon Young;Baek, Min Kyung
    • Human Ecology Research / v.59 no.2 / pp.261-273 / 2021
  • Hanbok carries not just the meaning of a costume but also a cultural function, including the emotions associated with it. As the interest of youths has increased recently, the importance of traditional costume education has been growing. Therefore, this study aims to analyze the contents of Hanbok in the 2015 revised high school technology and home economics textbooks using content analysis and text mining techniques. The results are as follows. First, the textbooks dealt with the symbolic meaning, characteristics, and beauty of Hanbok as practiced in daily life, the value found in the excellence of Hanbok, and the modernization of Hanbok. Second, most of the illustrations related to traditional costumes were presented in various ways, but they fell short in quantity and quality. Third, apart from the types of clothing, the words used to explain traditional costumes included culture, excellence, tradition, modernity, harmony, and succession. The results and discussion derived from this study are expected to help schools select and use textbooks efficiently, and to support a correct understanding of traditional culture when selecting traditional culture contents and illustrations.

Recognizing hanbok in youth through text mining (텍스트 마이닝을 통해 살펴본 청소년의 한복 인식)

  • Shim, Joonyoung
    • The Research Journal of the Costume Culture / v.27 no.3 / pp.239-250 / 2019
  • Recently, young people wearing hanbok are highly visible in the palace and in Hanok Village. However, there is much controversy regarding whether the hanbok the young people are wearing is traditional. Young people in Korea are exposed to hanbok through a variety of ways such as school education, games, webtoons, television shows, and movies. In this study, we presented teenagers with illustrations of hanbok to see which they preferred and which if any they recognized as traditional. The study respondents most preferred the hanbok from the 18th century, but they considered the hanbok from the 20th century to be the traditional style. We next used text mining to analyze the students' freely written, open-ended responses regarding the hanbok they preferred and the one they considered traditional. The hanbok from the 18th century, the one the teenagers preferred, was a sexy, cool style related to gisaeng that emphasized the waist, whereas the hanbok they believed was traditional, the 20th-century hanbok, was simple, neat, comfortable, and plain. Among the young people's responses regarding which hanbok was traditional, the text mining extracted the following repeated words related to both the 18th- and 20th-century hanbok: "dramas," "mass media," "historical dramas," and "movies." For the 18th-century hanbok only, we extracted "webtoons" and "Hanok Village," and for only the 20th-century hanbok, we extracted "textbooks."

OryzaGP: rice gene and protein dataset for named-entity recognition

  • Larmande, Pierre;Do, Huy;Wang, Yue
    • Genomics & Informatics / v.17 no.2 / pp.17.1-17.3 / 2019
  • Text mining has become an important research method in biology, with its original purpose being to extract biological entities, such as genes, proteins, and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development for plant molecular biology data have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extracting information on gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts extracted from scientific papers focusing on the rice species and downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.