• Title/Summary/Keyword: Text data

Semantic analysis via application of deep learning using Naver movie review data (네이버 영화 리뷰 데이터를 이용한 의미 분석(semantic analysis))

  • Kim, Sojin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.19-33 / 2022
  • With the explosive growth of social media, the abundant text data generated by web users has become an important source for data analysis. For example, online reviews on 'Naver Movie' often influence whether the general public decides to watch a movie. This study analyzes Naver Movie's text review data to predict the actual ratings. After examining the distribution of movie ratings, we performed semantic analysis using Korean natural language processing. We sought the best review rating prediction model by comparing machine learning and deep learning models, and we compared various regression and classification models in the 2-class and multi-class cases. Lastly, we explain the causes of review misclassification in terms of the characteristics of movie review data.

Recommendation System for Research Field of R&D Project Using Machine Learning (머신러닝을 이용한 R&D과제의 연구분야 추천 서비스)

  • Kim, Yunjeong;Shin, Donggu;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1809-1816 / 2021
  • Identifying the latest research trends from national R&D project data, and producing and using meaningful information from it, requires automatic classification technology in the national R&D information service. We therefore studied how to automatically classify and recommend research fields. About 450,000 national R&D project records from 2013 to 2020 were collected and used for training and evaluation. A model was selected after pre-processing, analysis, and performance evaluation on the valid records among the collected data. The performance of Word2vec, GloVe, and fastText was compared to derive the optimal model combination. In the experiment, the accuracy for the subcategories alone, which are required items of the project information, was 90.11%. The model is expected to be applicable to automatic classification studies of other classification systems whose hierarchical structure is similar to that of the research fields in the national science and technology standard classification.

A Study on Health Care Service Design for the Improvement of Cognitive Abilities of the Senior Citizens: Focusing on Unstructured Data Analysis (노인 인지능력 개선을 위한 헬스케어 서비스디자인 연구: 비정형 데이터 분석을 중심으로)

  • Seongho Kim;Hyeob Kim
    • Knowledge Management Research / v.23 no.4 / pp.69-89 / 2022
  • As we enter a super-aged society, senior citizens' health issues are affecting a variety of fields, including medicine, economics, society, and culture. In this study, we draw implications from unstructured data analysis, such as text mining and social network analysis, and apply them to digital health care service design for improving the cognitive abilities of senior citizens. The research procedure adapts the service design methodology into a six-step process suited to the analysis of unstructured data. Related keywords on social media, focusing on cognitive improvement and healthcare for senior citizens, were collected and analyzed, and the results were used to derive a direction for healthcare service design that improves the cognitive abilities of senior citizens. The results are expected to have academic and practical implications for broadening the use of big data analysis methods and improving existing healthcare service development methodologies.

Big data text mining analysis to identify non-face-to-face education problems (비대면 교육 문제점 파악을 위한 빅데이터 텍스트 마이닝 분석)

  • Park, Sung Jae;Hwang, Ug-Sun
    • Korean Educational Research Journal / v.43 no.1 / pp.1-27 / 2022
  • As COVID-19 spread worldwide, non-face-to-face contact was implemented in various ways, and the education system also drew much attention because of the rapid shift to non-face-to-face contact. The purpose of this study is to analyze the direction of non-face-to-face education in line with the continuously changing educational environment. Social network big data reflecting various opinions were collected and visualized using the Textom and Ucinet6 analysis tools. Keywords related to "COVID-19" were dominant, and high-frequency keywords such as "article" and "news" also appeared. The analysis identified various issues related to non-face-to-face education, such as network failures and security issues. Based on the analysis, the direction of the non-face-to-face education system was studied in light of the growth of the education market and changes in the educational environment. The big data analysis also indicates a need to strengthen security and feedback on teaching methods in non-face-to-face education.
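The keyword-frequency step mentioned in this abstract can be illustrated with a minimal sketch. It is not the Textom or Ucinet6 workflow used in the study. The sample posts, the stopword list, and the function name `keyword_frequencies` are all hypothetical, and the tokenizer is a simple regular expression chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real study would use a curated one.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for"}


def keyword_frequencies(documents, top_n=5):
    """Count how often each non-stopword token appears across all documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase and keep runs of letters, digits, and hyphens (so "COVID-19" stays whole).
        tokens = re.findall(r"[a-z0-9\-]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical social media posts, invented for this example.
posts = [
    "COVID-19 news: online class network failure",
    "News article on COVID-19 and online class security",
]

print(keyword_frequencies(posts, top_n=4))
```

Keywords that rank high only because of how the data were collected (such as "news" or "article") can then be reviewed and filtered before interpretation.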

Video Summarization Using Eye Tracking and Electroencephalogram (EEG) Data (시선추적-뇌파 기반의 비디오 요약 생성 방안 연구)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science / v.56 no.1 / pp.95-117 / 2022
  • This study developed and evaluated audio-visual (AV) semantics-based video summarization methods using eye tracking and electroencephalography (EEG) data. Twenty-seven university students participated in the eye tracking and EEG experiments. The evaluation showed that the average recall (0.73) obtained by using both EEG and pupil diameter data to construct a video summary was higher than the recall obtained with EEG data alone (0.50) or pupil diameter data alone (0.68). In addition, this study reports the reasons why the average recall (0.57) of the AV semantics-based personalized video summaries was lower than that (0.69) of the AV semantics-based generic video summaries. The differences between, and characteristics of, the AV semantics-based and text semantics-based video summarization methods were compared and analyzed.
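The recall figures quoted in this abstract can be read against the standard definition of recall: the share of relevant items that the summary actually contains. The sketch below assumes that definition; the abstract does not state the exact evaluation protocol, and the shot identifiers and the function name `recall` are hypothetical.

```python
def recall(selected, relevant):
    """Fraction of relevant items that appear in the selected set."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant items")
    return len(set(selected) & relevant) / len(relevant)


# Hypothetical shot IDs: four shots judged relevant, three shots picked for the summary.
relevant_shots = [1, 2, 3, 4]
summary_shots = [2, 3, 7]

print(recall(summary_shots, relevant_shots))  # 2 of 4 relevant shots found -> 0.5
```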

Text-Independent Speaker Identification System Based On Vowel And Incremental Learning Neural Networks

  • Heo, Kwang-Seung;Lee, Dong-Wook;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems (제어로봇시스템학회): Conference Proceedings / 2003.10a / pp.1042-1045 / 2003
  • In this paper, we propose a speaker identification system that uses vowels, which carry a speaker's characteristics. The system is divided into a speech feature extraction part and a speaker identification part. The speech feature extraction part extracts the speaker's features. Voiced speech has characteristics that distinguish speakers. For vowel extraction, formants are obtained from voiced speech through frequency analysis. The vowel 'a', which has different formants, is extracted from the text. Pitch, formants, intensity, log area ratio, LP coefficients, and cepstral coefficients can be used to derive these characteristics. The cepstral coefficients, which show the best performance in speaker identification among these methods, are used. The speaker identification part distinguishes speakers using a neural network. Twelfth-order cepstral coefficients are used as the training input data. The neural network is an MLP, and the learning algorithm is BP (backpropagation). Hidden nodes and output nodes are added incrementally. The nodes in the incremental learning neural network are interconnected via weighted links, and each node in a layer is generally connected to each node in the succeeding layer, with the output nodes providing the output of the network. Through vowel extraction and incremental learning, the proposed system uses little training data, reduces training time, and improves the identification rate.
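The combination of an MLP, backpropagation, and incrementally added nodes described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The class name `GrowableMLP`, the sigmoid activations, the squared-error loss, the learning rate, and the random initialization are all assumptions. Only the 12-dimensional input (the twelfth-order cepstral coefficients) comes from the abstract.

```python
import math
import random


class GrowableMLP:
    """One-hidden-layer MLP trained by backpropagation; nodes can be added later.

    Hypothetical sketch: activations, loss, and initialization are assumptions.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        self.rng = random.Random(seed)
        self.n_in = n_in
        # Each row: one weight per input, plus a bias as the last element.
        self.w_hidden = [self._init(n_in + 1) for _ in range(n_hidden)]
        # Each row: one weight per hidden node, plus a bias as the last element.
        self.w_out = [self._init(n_hidden + 1) for _ in range(n_out)]

    def _init(self, n):
        return [self.rng.uniform(-0.5, 0.5) for _ in range(n)]

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(self, x):
        # zip() stops at the shorter sequence, so the trailing bias is added separately.
        hidden = [self._sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1])
                  for ws in self.w_hidden]
        out = [self._sigmoid(sum(w * h for w, h in zip(ws, hidden)) + ws[-1])
               for ws in self.w_out]
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One backpropagation update; returns the squared-error loss before the update."""
        hidden, out = self.forward(x)
        d_out = [(o - t) * o * (1 - o) for o, t in zip(out, target)]
        # Hidden deltas use the output weights as they were before this update.
        d_hid = [h * (1 - h) * sum(d * self.w_out[k][j] for k, d in enumerate(d_out))
                 for j, h in enumerate(hidden)]
        for k, d in enumerate(d_out):
            for j, h in enumerate(hidden):
                self.w_out[k][j] -= lr * d * h
            self.w_out[k][-1] -= lr * d
        for j, d in enumerate(d_hid):
            for i, v in enumerate(x):
                self.w_hidden[j][i] -= lr * d * v
            self.w_hidden[j][-1] -= lr * d
        return sum((o - t) ** 2 for o, t in zip(out, target)) / 2

    def add_hidden_node(self):
        """Grow the hidden layer by one node, keeping existing weights."""
        self.w_hidden.append(self._init(self.n_in + 1))
        for ws in self.w_out:
            ws.insert(len(ws) - 1, self.rng.uniform(-0.5, 0.5))  # insert before the bias

    def add_output_node(self):
        """Add an output node, for example when a new speaker is enrolled."""
        self.w_out.append(self._init(len(self.w_hidden) + 1))


# 12 inputs stand in for twelfth-order cepstral coefficients (values here are made up).
net = GrowableMLP(n_in=12, n_hidden=4, n_out=2)
features = [0.1 * i for i in range(12)]
net.add_hidden_node()
net.add_output_node()
print(len(net.forward(features)[1]))  # number of output nodes after growth
```

Because existing weights are kept when a node is added, the network does not need to be retrained from scratch, which is one plausible reading of how incremental learning reduces training time.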

Illness and Experiences of the Body Among Aged Women (만성질환을 지닌 여성 노인의 몸 체험)

  • Cho, Myung Ok
    • Korean Journal of Adult Nursing / v.19 no.3 / pp.365-378 / 2007
  • Purpose: The purpose of the present study was to discover how aged women who have had disease experience their bodies. The researcher therefore explored the informants' perceptions and the context in which these perceptions emerged. Methods: Nine aged women who had disease or trauma were recruited by snowball and theoretical sampling. The iterative data collection and analysis process took place between September 1999 and January 2005. Questions posed to the informants included: "What major change in your body comes from the disease?" and "How did you feel about yourself after having had disease?" Data from interviews and participant observation were treated as text. The text was analyzed using an ongoing process of qualitative content analysis and Spradley's taxonomy. Results: Disease gives aged women a chance to reinforce the meaning of their body: the body as the least valued component of a human, the body as a holistic field in which each component of a human interacts with the others and with the natural environment and cosmos, and the body as a source of group identity. These meanings were constructed in their life world by the rules of hierarchy, reciprocity, and group cohesiveness. Conclusions: The human body is constructed as a cultural being by a social process. Nursing is concerned with the biological body and the social body. The results of this study can help in understanding the socialization of the body and in constructing a somology of nursing.

Construction of Two-Dimensional Database of Korean Traditional Shoes for the Development of Cultural Contents(1) (문화콘텐츠개발을 위한 한국 전통신발의 2D데이터베이스 구축(1))

  • Park, Hea-Ryung
    • Fashion & Textile Research Journal / v.12 no.6 / pp.796-811 / 2010
  • Research materials on Korean traditional shoes have so far consisted mainly of literary descriptions, flat pictures drawn from those descriptions, and photographs of incomplete excavated relics, which makes it difficult to observe the shoes visually and to apply them in product design. This project aims to establish a database of textual data and visual data using 3D for Korean traditional shoes, in order to lay the foundation for developing culture industry contents based on them. According to the initial research plan, the goals of the first year were as follows. First, Korean traditional shoes were analyzed and arranged by period, sex, and function; their form, composition, materials, patterns, and colors were categorized; and the materials were then entered into a text database. Second, visual image materials covering the forms, composition, materials, patterns, and colors of traditional shoes were built into a database using a scanner, a digital camera, and 2D computer tools. The resulting database can serve as an important foundation for developing culture industry contents and digital culture contents based on traditional shoes, and for teaching Korean traditional culture.

A Big Data Study on Viewers' Response and Success Factors in the D2C Era Focused on tvN's Web-real Variety 'SinSeoYuGi' and Naver TV Cast Programming

  • Oh, Sejong;Ahn, Sunghun;Byun, Jungmin
    • International Journal of Advanced Culture Technology / v.4 no.2 / pp.7-18 / 2016
  • The first D2C-era web-real variety show in Korea was broadcast via tvN of CJ E&M. The web-real variety program 'SinSeoYuGi' accumulated 54 million views, along with 50 million views on the Chinese portal site QQ. This study carries out a text mining analysis that extracts portal site blog posts, Twitter page views, and associated terms. In addition, it derives viewers' responses by extracting keywords with opinion mining techniques that separate positive, neutral, and negative words through customer sentiment analysis. The success factors of the web-real variety show were found to be reduced appearance fees and production costs, harmony between the actual cast members and the scenario characters, mobile TV programming, and pre-roll advertising. Web-real variety broadcasting is expected to increase in value as web content in the future and to become established as a new genre, with the job of 'technical marketer' growing as well.
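The step of dividing keywords into positive, neutral, and negative words can be illustrated with a simple lexicon lookup. This is a minimal sketch under stated assumptions, not the opinion mining technique used in the study. The two word lists, the sample keywords, and the function names `classify_word` and `polarity_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical sentiment lexicons, invented for illustration only.
POSITIVE_WORDS = {"fun", "fresh", "love"}
NEGATIVE_WORDS = {"boring", "awkward"}


def classify_word(word):
    """Label a word as positive, negative, or neutral by lexicon lookup."""
    w = word.lower()
    if w in POSITIVE_WORDS:
        return "positive"
    if w in NEGATIVE_WORDS:
        return "negative"
    return "neutral"  # anything not in either lexicon


def polarity_counts(words):
    """Count how many extracted keywords fall into each polarity class."""
    return Counter(classify_word(w) for w in words)


# Hypothetical keywords extracted from viewer comments.
keywords = ["fun", "episode", "boring", "love", "cast"]
print(polarity_counts(keywords))
```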

A Study on Design of the Electric Sign Board System using Embedded ARM Board (내장형 ARM 보드를 이용한 전광판 시스템 설계에 관한 연구)

  • 최재우
    • Journal of the Korea Academia-Industrial cooperation Society / v.5 no.3 / pp.241-246 / 2004
  • We designed an LED display system using the ARM7TDMI processor and implemented Hangul input and output. The system is easily extensible because the controller board and the LED matrix board were designed as a single module. Text can be input to the LED display system from a PC, a PDA, or a remote controller over wired or wireless communication. We ported Qt/Embedded 2.3.7 with touch-panel input to an embedded board based on the PXA255 processor running Linux 2.4.18. The Qt application we wrote can input display text over Ethernet on the embedded system. A large amount of display text can be saved because only the Korean alphabet codes are stored for the text that users want to display.
