• Title/Summary/Keyword: Text data


Analysis of Descriptive Lectures Evaluation using Text Mining: Comparative analysis pre and post COVID-19

  • Lee, Sang-Chul
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.211-222 / 2022
  • The purpose of this study is to indicate the direction of future university classes in the post-COVID era by comparing and analyzing lecture evaluations from before and after COVID-19. To this end, four years of data were used: 2018 to 2019 for pre-COVID-19 and 2020 to 2021 for post-COVID-19. The results were as follows. In liberal arts courses, "assignments" was the word with the highest frequency and degree centrality (DC) both before and after COVID-19. In major courses, "understanding" appeared as the most important word. The ego network analysis indicated that "video lecture" and "non-face-to-face classes" were perceived as difficult and that "interaction" between the professor and the students was important. As a result, it is important to reduce the weight of assignments and increase interaction with students in liberal arts classes. In major courses, it is necessary to hold face-to-face classes rather than non-face-to-face classes, and to organize video content so that it is not difficult to follow.
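
The degree centrality mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes a co-occurrence network in which two words are linked when they appear in the same comment; the abstract does not state how the study built its network, and the sample comments below are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Return normalized degree centrality for each word.

    Assumption: two words are neighbors if they co-occur in at least
    one document. The score is (number of neighbors) / (n - 1).
    """
    neighbors = defaultdict(set)
    for tokens in documents:
        unique = set(tokens)
        for word in unique:
            neighbors[word]  # register the word even if it has no neighbors
        for a, b in combinations(sorted(unique), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {word: 0.0 for word in neighbors}
    return {word: len(adj) / (n - 1) for word, adj in neighbors.items()}


# Hypothetical tokenized lecture comments, not data from the study.
comments = [
    ["assignments", "many", "difficult"],
    ["assignments", "interaction", "professor"],
    ["video", "lecture"],
]
dc = degree_centrality(comments)
top_word = max(dc, key=dc.get)
print(top_word, round(dc[top_word], 3))
```

A word that co-occurs with many distinct words scores high even if its raw frequency is modest, which is why frequency and degree centrality are usually reported side by side.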

Analysis of Meta Fashion Meaning Structure using Big Data: Focusing on the keywords 'Metaverse' + 'Fashion design' (빅데이터를 활용한 메타패션 의미구조 분석에 관한 연구: '메타버스' + '패션디자인' 키워드를 중심으로)

  • Ji-Yeon Kim;Shin-Young Lee
    • Fashion & Textile Research Journal / v.25 no.5 / pp.549-559 / 2023
  • Along with the transition to the fourth industrial revolution, the possibility of metaverse-based innovation in the fashion field has been confirmed, and various applications are being sought. Therefore, this study performs meaning structure analysis and discusses the prospects of meta fashion using big data. From 2020 to 2022, data including the keyword "metaverse + fashion design" were collected from portal sites (Naver, Daum, and Google), and the results of keyword frequency, N-gram, and TF-IDF analyses were derived using text mining. Furthermore, network visualization and CONCOR analysis were performed using Ucinet 6 to understand the interconnected structure between keywords and their essential meanings. The results were as follows: The main keywords appeared in the following order: fashion, metaverse, design, 3D, platform, apparel, and virtual. In the N-gram analysis, the density between fashion and metaverse words was high, and in the TF-IDF analysis results, the importance of content- and technology-related words such as 3D, apparel, platform, NFT, education, AI, avatar, MCM, and meta-fashion was confirmed. Through network visualization and CONCOR analysis using Ucinet 6, three cluster results were derived from the top emerging words: "metaverse fashion design and industry," "metaverse fashion design and education," and "metaverse fashion design platform." CONCOR analysis was also used to derive differentiated analysis results for middle and lower words. The results of this study provide useful information to strengthen competitiveness in the field of metaverse fashion design.
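
As a rough illustration of the N-gram and TF-IDF steps named in this abstract, the following Python sketch counts bigrams and computes one common TF-IDF variant (tf = count / document length, idf = log(N / df)). The weighting formula and the sample documents are assumptions; the abstract does not say which variant or tool settings were used.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent word pairs (N-grams with N = 2)."""
    return list(zip(tokens, tokens[1:]))


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / len(document), idf = log(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized posts, invented for illustration.
docs = [
    ["metaverse", "fashion", "design"],
    ["metaverse", "fashion", "platform"],
    ["metaverse", "nft", "avatar"],
]
bigram_counts = Counter(pair for doc in docs for pair in bigrams(doc))
weights = tf_idf(docs)
print(bigram_counts.most_common(1))
print(weights[0])
```

Under this weighting, a word that appears in every document (here "metaverse") gets a weight of zero, while words concentrated in few documents are weighted up.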

Multi-Label Classification for Corporate Review Text: A Local Grammar Approach (머신러닝 기반의 기업 리뷰 다중 분류: 부분 문법 적용을 중심으로)

  • HyeYeon Baek;Young Kyun Chang
    • Information Systems Review / v.25 no.3 / pp.27-41 / 2023
  • Unlike the previous works focusing on the state-of-the-art methodologies to improve the performance of machine learning models, this study improves the 'quality' of training data used in machine learning. We propose a method to enhance the quality of training data through the processing of 'local grammar,' frequently used in corpus analysis. We collected a vast amount of unstructured corporate review text data posted by employees working in the top 100 companies in Korea. After improving the data quality using the local grammar process, we confirmed that the classification model with local grammar outperformed the model without it in terms of classification performance. We defined five factors of work engagement as classification categories, and analyzed how the pattern of reviews changed before and after the COVID-19 pandemic. Through this study, we provide evidence that shows the value of the local grammar-based automatic identification and classification of employee experiences, and offer some clues for significant organizational cultural phenomena.

Exploring the phenomenon of veganphobia in vegan food and vegan fashion (비건 음식과 비건 패션에서 나타난 비건포비아 현상에 대한 탐구)

  • Yeong-Hyeon Choi;Sangyung Lee
    • The Research Journal of the Costume Culture / v.32 no.3 / pp.381-397 / 2024
  • This study investigates the negative perceptions (veganphobia) held by consumers toward vegan diets and fashion and aims to foster a genuine acceptance of ethical veganism in consumption. The textual data were web-crawled from Korean online posts, including news articles, blogs, forums, and tweets, containing keywords such as "contradiction," "dilemma," "conflict," "issues," "vegan food" and "vegan fashion" from 2013 to 2021. Data analysis was conducted through text mining, network analysis, and clustering analysis using Python and NodeXL programs. The analysis revealed distinct negative perceptions regarding vegan food. Key issues included the perception of hypocrisy among vegetarians, associations with specific political leanings, conflicts between environmental and animal rights, and contradictions between views on companion animals and livestock. Regarding the vegan fashion industry, the eco-friendliness of material selection and design processes were seen as the pivotal factors shaping negative attitudes. Furthermore, the study identified a shared negative perception regarding vegan food and vegan fashion. This negativity was characterized by confusion and conflicts between animal and environmental rights, biased perceptions linked to specific political affiliations, perceived self-righteousness among vegetarians, and general discomfort toward them. These factors collectively contributed to a broader negative perception of vegan consumption. In conclusion, this study is significant in understanding the complex perceptions and attitudes that consumers hold toward vegan food and fashion. The insights gained from this research can aid in the design of more effective campaign strategies aimed at promoting vegan consumerism, ultimately contributing to a more widespread acceptance of ethical veganism in society.

Study on Application of Big Data in Packaging (패키징(Packaging) 분야에서의 빅데이터(Big data) 적용방안 연구)

  • Kang, WookGeon;Ko, Euisuk;Shim, Woncheol;Lee, Hakrae;Kim, Jaineung
    • KOREAN JOURNAL OF PACKAGING SCIENCE & TECHNOLOGY / v.23 no.3 / pp.201-209 / 2017
  • Big Data, an element of the Fourth Industrial Revolution, has drawn attention since the Fourth Industrial Revolution was discussed at the 2016 World Economic Forum. Big Data is used in various fields because it can predict the near future and create new business. However, its utilization and research in the field of packaging are lacking. Today, packaging is expected to carry marketing elements that affect consumer choice, and big data is actively used in marketing, where it can be used to analyze sales information and consumer reactions to produce meaningful results. Therefore, this study proposes a method of applying big data in the field of packaging, focusing on marketing. The study suggests utilizing private data and community data to analyze the interaction between consumers and products. Using social big data makes it possible to understand preferred packaging and consumer perceptions and emotions within the same product line. It can also be used to analyze the effect of packaging among the various components of a product. Because packaging is only one of many components of a product, it is not easy to understand the impact of a single packaging element. Nevertheless, this study presents the possibility of using Big Data to analyze consumers' perceptions and feelings about packaging.

Detection of Protein Subcellular Localization based on Syntactic Dependency Paths (구문 의존 경로에 기반한 단백질의 세포 내 위치 인식)

  • Kim, Mi-Young
    • The KIPS Transactions:PartB / v.15B no.4 / pp.375-382 / 2008
  • A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and WordNet thesaurus. In the first step, we constructed syntactic dependency paths from each protein to its location candidate, and then converted the syntactic dependency paths into dependency trees. In the second step, we retrieved root information of the syntactic dependency trees. In the final step, we extracted syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extracted syntactic category and syntactic direction as syntactic information, and synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees from the training data, we extracted (protein, localization) pairs from the test sentences. Even with no biomolecular knowledge, our method showed reasonable performance in experimental results using Medline abstract data. 
Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods by 12-25%.
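
For readers unfamiliar with the F-measure reported here, the sketch below shows how precision, recall, and the balanced F-measure (F1) are typically computed over extracted (protein, localization) pairs. It assumes the balanced form, which the abstract does not specify, and the pairs are placeholders rather than data from the paper.

```python
def f_measure(predicted, gold):
    """Return (precision, recall, F1) for two collections of extracted pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Placeholder (protein, localization) pairs, invented for illustration.
gold_pairs = {
    ("ProteinA", "nucleus"),
    ("ProteinB", "membrane"),
    ("ProteinE", "cytoplasm"),
}
predicted_pairs = {
    ("ProteinA", "nucleus"),
    ("ProteinB", "membrane"),
    ("ProteinC", "cytoplasm"),
    ("ProteinD", "nucleus"),
}
precision, recall, f1 = f_measure(predicted_pairs, gold_pairs)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are reasonably high.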

The Development of an Automatic Indexing System based on a Thesaurus (시소러스를 기반으로 하는 자동색인 시스템에 관한 연구)

  • 임형묵;정상철
    • Korean Journal of Cognitive Science / v.4 no.1 / pp.213-242 / 1993
  • During the past decades, several automatic indexing systems have been developed, such as single-term indexing, phrase indexing, and thesaurus-based indexing systems. Among these, single-term indexing has been known to be superior to the others despite the simplicity of how it extracts meaningful terms. Thesaurus-based indexing, on the other hand, has been regarded as producing a low retrieval rate, mainly because thesauri usually do not have enough index terms, so that much of the text data fails to be indexed if it does not match any index term in the thesaurus. This paper develops a thesaurus-based indexing system, THINS, that yields a higher retrieval rate than other systems by syntactically analyzing text data and partially matching it with index terms in thesauri. First, the system analyzes the input text syntactically using the machine translation system MATES/EK and extracts noun phrases. After deleting stop words from the noun phrases and stemming the remaining words, it tries to index them with similar index terms in the thesaurus as much as possible. We conduct an experiment with the CACM data set that compares the retrieval effectiveness of THINS with that of a single-term-based system under HYKIS, a thesaurus-based information retrieval system. THINS yields about 10 percent higher precision than the single-term-based system, while showing 8 to 9 percent lower recall. This is a marked improvement over previous systems, which yield 25 or 30 percent lower precision than the single-term-based one. We also argue that the relatively lower recall is caused by the fact that CRCS, the thesaurus included in the CACM data set, is very incomplete, having only a little over one thousand terms; THINS is thus expected to produce a much higher rate if it is used with a currently available large thesaurus.
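
The stop-word removal, stemming, and partial matching described in this abstract can be sketched roughly as follows. This is not the THINS implementation: the stop-word list, the suffix-stripping stemmer, the overlap threshold `min_overlap`, and the thesaurus entries are all assumptions made for illustration, and the noun phrases are taken as given rather than produced by a syntactic analyzer.

```python
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "and"}  # assumed, tiny list


def crude_stem(word):
    """Strip a few English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(phrase):
    """Lowercase, drop stop words, and stem the remaining words."""
    return {crude_stem(w) for w in phrase.lower().split() if w not in STOP_WORDS}


def index_phrase(noun_phrase, thesaurus_terms, min_overlap=0.5):
    """Return thesaurus terms that share at least min_overlap of their words with the phrase."""
    phrase_words = normalize(noun_phrase)
    matches = []
    for term in thesaurus_terms:
        term_words = normalize(term)
        if not term_words:
            continue
        overlap = len(phrase_words & term_words) / len(term_words)
        if overlap >= min_overlap:
            matches.append(term)
    return matches


# Hypothetical thesaurus entries, not taken from CRCS.
thesaurus = ["information retrieval", "sorting algorithm", "operating system"]
partial = index_phrase("the retrieval of documents", thesaurus)
stemmed = index_phrase("sorting algorithms", thesaurus)
print(partial, stemmed)
```

Exact string matching would index neither phrase, which is the coverage problem the abstract attributes to thesaurus-based indexing; partial matching trades some precision for that coverage.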

Visualizing Spatial Information of Climate Change Impacts on Social Infrastructure using Text-Mining Method (텍스트마이닝 기법을 활용한 사회기반시설 기후변화 영향의 공간정보 표출)

  • Shin, Hana;Ryu, Jaena
    • Korean Journal of Remote Sensing / v.33 no.5_3 / pp.773-786 / 2017
  • This study analyzed data on climate change impacts on social infrastructure using a text-mining methodology and visualized the spatial information by integrating the results with regional data layers. First, the study identified that the following types of social infrastructure were affected by five climate factors (heat wave, cold wave, heavy rain, heavy snow, and strong wind): power, oil and resource management, transport and urban, environment, and water supply infrastructure. Climate change impacts on social infrastructure were then analyzed and visualized by region. The analysis showed that, among all types of infrastructure, transport and urban infrastructure was highly impacted by climate change, and that the most severe climate factors affecting social infrastructure were heavy rain and heavy snow. In addition, social infrastructure located in Seoul and the Gangwon-do region was found to be relatively heavily affected by climate change. The significance of this study is that atypical data from the media were used to analyze climate change impacts on social infrastructure, and that the results were translated into spatial information to analyze and visualize those impacts by region.

A Design of the OOPP(Optimized Online Portfolio Platform) using Enterprise Competency Information (기업 직무 정보를 활용한 OOPP(Optimized Online Portfolio Platform)설계)

  • Jung, Bogeun;Park, Jinuk;Lee, ByungKwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.493-506 / 2018
  • This paper proposes the design of the OOPP (Optimized Online Portfolio Platform), which lets job seekers search for the job competencies necessary for employment and write and manage portfolios online efficiently. The OOPP consists of three modules. First, the JDCM (Job Data Collection Module) stores the help-wanted advertisements of job information sites in a spreadsheet. Second, the CSM (Competency Statistical Model) classifies core competencies for each job by text-mining the collected help-wanted ads. Third, the OBBM (Optimize Browser Behavior Module) lets users look up data rapidly by improving the processing speed of the browser. The OBBM in turn consists of the PSES (Parallel Search Engine Sub-Module), which optimizes the computation of a search engine, and the OILS (Optimized Image Loading Sub-Module), which optimizes the loading of images, text, etc. The performance analysis of the CSM shows a data accuracy of 99.4~100%, meaning there is little difference between the CSM output and the actual advertisements. When browser optimization is performed using the OBBM, working time is reduced by about 68.37%. Therefore, the OOPP lets users rapidly look up the analyzed results on a web page by accurately analyzing the help-wanted ads of job information sites.

An Artificial Neural Network Based Phrase Network Construction Method for Structuring Facility Error Types (설비 오류 유형 구조화를 위한 인공신경망 기반 구절 네트워크 구축 방법)

  • Roh, Younghoon;Choi, Eunyoung;Choi, Yerim
    • Journal of Internet Computing and Services / v.19 no.6 / pp.21-29 / 2018
  • In the era of the fourth industrial revolution, the concept of the smart factory is emerging. There are efforts to use data analysis to predict the occurrence of facility errors, which have negative effects on utilization and productivity. Such prediction requires data describing the situation in which a facility error occurred and the type of the error, called the facility error log. However, in many manufacturing companies, the types of facility errors are not precisely defined and categorized. The worker who operates the facilities records the type of facility error as unstructured text based on his or her empirical judgement, which makes it impossible to analyze the data. Therefore, this paper proposes a framework for constructing a phrase network to support the identification and classification of facility error types using facility error logs written by operators. Specifically, phrases indicating the error types are extracted from text data using a dictionary that classifies terms by their usage. Then, a phrase network is constructed by calculating the similarity between the extracted phrases. The performance of the proposed method was evaluated using real-world facility error logs. It is expected that the proposed method will contribute to the accurate identification of error types and to the prediction of facility errors.
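
The last step described in this abstract, building a phrase network from pairwise similarity, can be sketched in Python as follows. The sketch assumes each phrase already has a numeric vector (for example from a neural embedding model), uses cosine similarity with a hypothetical threshold, and works on invented phrases and vectors; the abstract does not specify the similarity measure or the threshold the authors used.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_phrase_network(phrase_vectors, threshold=0.8):
    """Return edges (phrase_a, phrase_b, similarity) whose similarity meets the threshold."""
    edges = []
    for a, b in combinations(sorted(phrase_vectors), 2):
        sim = cosine(phrase_vectors[a], phrase_vectors[b])
        if sim >= threshold:
            edges.append((a, b, sim))
    return edges


# Hypothetical error phrases with made-up 3-dimensional vectors.
phrase_vectors = {
    "motor overheating": [0.9, 0.1, 0.0],
    "motor overheat alarm": [0.8, 0.2, 0.1],
    "belt slipped": [0.0, 0.1, 0.9],
}
edges = build_phrase_network(phrase_vectors)
for a, b, sim in edges:
    print(f"{a} -- {b} ({sim:.2f})")
```

Connected groups of phrases in such a network are candidates for a single error type, which is how differently worded operator entries could be grouped before classification.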