• Title/Summary/Keyword: Text features

Automatic Linkage Model of Classification Systems Based on a Pretraining Language Model for Interconnecting Science and Technology with Job Information

  • Jeong, Hyun Ji;Jang, Gwangseon;Shin, Donggu;Kim, Tae Hyun
    • Journal of Information Science Theory and Practice
    • /
    • v.10 no.spc
    • /
    • pp.39-45
    • /
    • 2022
  • For national industrial development in the Fourth Industrial Revolution, it is necessary to provide researchers with appropriate job information. This can be achieved by interconnecting the National Science and Technology Standard Classification System used for management of research activity with the Korean Employment Classification of Occupations used for job information management. In the present study, an automatic linkage model of classification systems is introduced based on a pre-trained language model for interconnecting science and technology information with job information. We propose for the first time an automatic model for linkage of classification systems. Our model effectively maps similar classes between the National Science & Technology Standard Classification System and Korean Employment Classification of Occupations. Moreover, the model increases interconnection performance by considering hierarchical features of classification systems. Experimental results show that precision and recall of the proposed model are about 0.82 and 0.84, respectively.
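The precision and recall reported above can be read as set-overlap measures between predicted and reference class links. The following is a minimal sketch under that assumption; the class codes and the function name `precision_recall` are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of (source_class, target_class) links."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # links present in both sets
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical class codes, for illustration only.
gold_links = {("ST-A", "JOB-1"), ("ST-B", "JOB-2"), ("ST-C", "JOB-3")}
predicted_links = {("ST-A", "JOB-1"), ("ST-B", "JOB-2"), ("ST-D", "JOB-4")}

precision, recall = precision_recall(predicted_links, gold_links)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three predicted links are correct and two of the three reference links are recovered, so both measures come out at two thirds.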

Fake News Detection on Social Media using Video Information: Focused on YouTube (영상정보를 활용한 소셜 미디어상에서의 가짜 뉴스 탐지: 유튜브를 중심으로)

  • Chang, Yoon Ho;Choi, Byoung Gu
    • The Journal of Information Systems
    • /
    • v.32 no.2
    • /
    • pp.87-108
    • /
    • 2023
  • Purpose: The main purpose of this study is to improve fake news detection performance by using video information, overcoming the limitations of existing text- and image-oriented studies that do not reflect the latest news consumption trends. Design/methodology/approach: This study collected video clips and related information, including news scripts, speakers' facial expressions, and video metadata, from YouTube to develop a fake news detection model. Based on the collected data, seven combinations of related information (i.e., scripts; video metadata; facial expression; scripts and video metadata; scripts and facial expression; video metadata and facial expression; and scripts, video metadata, and facial expression) were used as input for training and evaluation. The input data were analyzed using six models, including a support vector machine and a deep neural network. The area under the curve (AUC) was used to evaluate the performance of the classification models. Findings: The results showed that the AUC and accuracy values of the three-feature combination (scripts, video metadata, and facial expression) were the highest in the logistic regression, naïve Bayes, and deep neural network models. This result implies that fake news detection can be improved by using video information (video metadata and facial expression). The sample size of this study was relatively small; the generalizability of the results would be enhanced with a larger sample.
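Three feature groups yield 2^3 - 1 = 7 non-empty combinations, which matches the count of seven input sets stated in the abstract. The sketch below only enumerates those combinations; the variable names are illustrative assumptions, and it does not reproduce the study's models or data.

```python
from itertools import combinations

# The three feature groups named in the abstract.
feature_groups = ["scripts", "video metadata", "facial expression"]

# Every non-empty subset: singles, pairs, then the full triple.
feature_sets = [
    combo
    for size in range(1, len(feature_groups) + 1)
    for combo in combinations(feature_groups, size)
]

for combo in feature_sets:
    print(" + ".join(combo))
```

Each resulting tuple would correspond to one input configuration for training and evaluation.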

Empowering Agriculture: Exploring User Sentiments and Suggestions for Plantix, a Smart Farming Application

  • Mee Qi Siow;Mu Moung Cho Han;Yu Na Lee;Seon Yeong Yu;Mi Jin Noh;Yang Sok Kim
    • Smart Media Journal
    • /
    • v.12 no.10
    • /
    • pp.38-46
    • /
    • 2023
  • Farming activities are transforming from traditional skill-based agriculture into knowledge-based and technology-driven digital agriculture. The use of intelligent information and communication technology introduces the idea of smart farming, which enables farmers to collect weather data, monitor crop growth remotely, and detect crop diseases easily. The introduction of Plantix, a pest and disease management tool in the form of a mobile application, has allowed farmers to identify crop pests and diseases using their mobile devices. This study therefore collected reviews of Plantix from the Google Play Store to explore users' responses to the application through Latent Dirichlet Allocation (LDA) topic modeling. Results indicate four latent topics in the reviews: two positive evaluations (compliments, appreciation) and two suggestions (plant options, recommendations). We found that users suggested adding more plant options and additional features to the application that might help farmers with their difficulties. In addition, the application is expected to benefit farmers more by providing early disease alerts, various substitutes, and a list of components for remedial measures.

Handwritten Indic Digit Recognition using Deep Hybrid Capsule Network

  • Mohammad Reduanul Haque;Rubaiya Hafiz;Mohammad Zahidul Islam;Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.2
    • /
    • pp.89-94
    • /
    • 2024
  • The Indian subcontinent is home to multilingual people, and documents such as job application forms, passports, and number plates are composed of text written in different languages/scripts. These scripts may appear as different Indic numerals within a single document page. For this reason, a generic recognizer capable of recognizing handwritten Indic digits written by diverse writers is needed. Much work has been done on various non-Indic numerals, particularly Roman, but research on Indic digits is limited. Moreover, most research focuses only on the MNIST dataset or on a single dataset, either because of time constraints or because the model is tailored to a specific task. In this work, a hybrid model is proposed to recognize all available Indic handwritten digit images using the existing benchmark datasets. The proposed method combines the automatically learned features of a Capsule Network with the hand-crafted Bag of Features (BoF) extraction method. Along the way, we (1) analyze the successes and (2) explore whether this method performs well under more difficult conditions, i.e., noise, color, affine transformations, intra-class variation, and natural scenes. Experimental results show that the hybrid method gives better accuracy than the Capsule Network.

Measuring Hotel Service Quality Using Social Media Analytics: The Moderating Effects of Brand of Origin

  • Byounggu Choi;Shin-Hyeok Kang
    • Asia pacific journal of information systems
    • /
    • v.33 no.3
    • /
    • pp.677-701
    • /
    • 2023
  • With the rapid advancement of social media analytics and artificial intelligence, many studies have used online customer reviews as an important source to measure service quality in many industries, including the hotel industry. However, these studies have failed to identify the relative importance of different dimensions of service quality and their role in customer satisfaction. To fill this research gap, this study aims to identify the effects of service quality on hotel customer satisfaction from the multidimensional perspectives using sentiment analysis with self-training on online reviews. Additionally, the moderating role of the brand of origin for each service quality dimension is also investigated. Drawing on the SERVQUAL model and brand of origin concept, this study develops 12 hypotheses and empirically tests them using 30,070 online customer hotel reviews collected from TripAdvisor.com. The results indicated that overall service quality and each dimension of SERVQUAL significantly influenced customer satisfaction of hotels. The results also confirmed the moderating effects of brand of origin on overall service quality. However, the moderating effects of brand of origin for the tangible, reliability, and empathy dimensions of service quality were significant, whereas the effects for responsiveness and assurance were not. This study sheds new light on service quality measurement by analyzing the multidimensional features of service quality and the role of brand of origin in the hotel service context.

How Long Will Your Videos Remain Popular? Empirical Study with Deep Learning and Survival Analysis

  • Min Gyeong Choi;Jae Hong Park
    • Asia pacific journal of information systems
    • /
    • v.33 no.2
    • /
    • pp.282-297
    • /
    • 2023
  • One of the emerging trends in the marketing field is digital video marketing. Online videos offer rich content, typically containing more information than any other type of content (e.g., audible or textual content). Accordingly, previous researchers have examined factors influencing videos' popularity. However, few studies have examined what causes a video to remain popular. Some videos achieve continuous, ongoing popularity, while others fade out quickly. For practitioners, videos in the recommendation slots may serve as strong communication channels, as many potential consumers are exposed to such videos. This study therefore provides practitioners with important advice on how to choose videos that will survive as long-lasting favorites, allowing them to advertise in a cost-effective manner. Using deep learning techniques, this study extracts text from videos and measures the videos' tones, including factual and emotional tones. Additionally, we measure the aesthetic score by analyzing the thumbnail images in the data. We then empirically show that the cognitive features of a video, such as the tone of a message and the aesthetic assessment of a thumbnail image, play an important role in determining videos' long-term popularity. We believe that this is the first study of its kind to examine new factors that help a video remain popular using both deep learning and econometric methodologies.

A Study on the Shopping Life through Mobile Visual Search

  • Tungyun Liu;Sijun Sung;Heeju Chae
    • Asia-Pacific Journal of Business
    • /
    • v.15 no.1
    • /
    • pp.45-69
    • /
    • 2024
  • Purpose - To examine the influence of mobile visual search as a strategic technology service on consumers' perceived economic value and customer commitment, which in turn affect consumers' usage intention of mobile visual search. This study also explores the moderating effect of different levels of consumer online shopping orientation. Design/methodology/approach - One-on-one, open-ended, in-depth interviews were first conducted with 15 Korean consumers to identify the features of mobile visual search. A conceptual model was then built to verify the hypotheses on the impact of mobile visual search on consumers' perceived economic value and customer commitment, which further influence consumers' usage intention. Findings - The results show that Convenience, Information quality, Personalization, Text-free search interface design, and Visual communication of mobile visual search positively influence consumers' perceived economic value and customer commitment, and in turn positively affect consumers' usage intention. Moreover, different levels of consumer online shopping orientation were also found to have different effects on consumers' perception and behavior when using mobile visual search in online fashion shopping. Research implications or Originality - The present study verified that mobile visual search is a service tool that consumers want to use in the online fashion shopping journey because it provides economic benefits.

A study on the improving and constructing the content for the Sijo database in the Period of Modern Enlightenment (계몽기·근대시조 DB의 개선 및 콘텐츠화 방안 연구)

  • Chang, Chung-Soo
    • Sijohaknonchong
    • /
    • v.44
    • /
    • pp.105-138
    • /
    • 2016
  • The "XML Digital Collection of Sijo Texts in the Period of Modern Enlightenment" database has recently been made available, with search functions, through the Korean Research Memory (http://www.krm.or.kr), laying the foundation for developing content from the Sijo texts of the Modern Enlightenment period. In this paper, I review the characteristics and problems of this digital collection, look for improvements, and seek ways to turn it into content. The database is significant first of all because it integrates, and gives an overview of, the vast body of Sijo from the Modern Enlightenment period, amounting to 12,500 pieces. It is also the first Sijo database to provide a variety of search features, by literature, poet name, work title, original text, period, and so on. However, the database has limits for verifying the overall aspects of Sijo in the Modern Enlightenment period. Titles and original texts written in archaic words or Chinese characters cannot be searched, because standardized modern-language text is not provided. In addition, works and individual Sijo works released after 1945 are missing from the database. It is inconvenient to extract data by poet, because poets are identified in various ways, such as by real name or pen name. To solve these problems and improve the usability of the database, I propose providing standardized modern-language text, adding index terms for content, providing information on the form of each work, and so on. Furthermore, if a Sijo database for the Modern Enlightenment period with the character of a Sijo Culture Information System could be built, it could be connected with academic and educational content. As specific plans, I suggest (1) learning support materials on modern history and the perception of national territory in the modern age, (2) source materials for studying indigenous animals and plants and for creating commercial characters, and (3) use as a Sijo learning tool, such as a Sijo game.

A Korean Document Sentiment Classification System based on Semantic Properties of Sentiment Words (감정 단어의 의미적 특성을 반영한 한국어 문서 감정분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.4
    • /
    • pp.317-322
    • /
    • 2010
  • This paper proposes a method to improve the performance of a Korean document sentiment-classification system using semantic properties of sentiment words. A sentiment word is a word that carries sentiment, and sentiment features are defined by a set of sentiment words, which are an important lexical resource for sentiment classification. A sentiment feature can have different sentiment intensity in the general field and in a specific domain. In the general field, the sentiment intensity can be estimated using snippets from a search engine, while in a specific domain, training data can be used for this estimation. The estimated sentiment intensity of the sentiment features is called semantic orientation and is used to estimate the sentiment intensity of the sentences in text documents. After estimating the sentiment intensity of the sentences, we apply it to the weights of the sentiment features. We evaluate our system with a support vector machine in three cases: general, domain-specific, and combined general/domain-specific semantic orientation. Our experimental results show improved performance in all cases; in particular, with general/domain-specific semantic orientation, the proposed method performs 3.1% better than a baseline system indexed by content words only.
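The abstract does not give the formulas used, so the following is only one possible way to weight sentiment features by sentence-level intensity. It assumes a hypothetical lexicon `ORIENTATION` of semantic-orientation scores, takes a sentence's intensity as the sum of its words' scores, and accumulates the absolute intensity onto each sentiment word in that sentence. The resulting weights are the kind of feature values that could be passed to a classifier such as a support vector machine.

```python
from collections import defaultdict

# Hypothetical semantic-orientation scores (positive > 0, negative < 0).
# These values are invented for illustration, not taken from the paper.
ORIENTATION = {"great": 1.0, "good": 0.5, "boring": -0.8}


def sentence_intensity(tokens):
    """Sum the orientation scores of the sentiment words in one sentence."""
    return sum(ORIENTATION.get(token, 0.0) for token in tokens)


def weighted_features(sentences):
    """Weight each sentiment word by the absolute intensity of its sentences."""
    weights = defaultdict(float)
    for tokens in sentences:
        intensity = abs(sentence_intensity(tokens))
        for token in tokens:
            if token in ORIENTATION:  # only sentiment words become features
                weights[token] += intensity
    return dict(weights)


document = [
    ["the", "plot", "was", "great"],
    ["good", "but", "boring", "ending"],
]
features = weighted_features(document)
print(features)
```

In this toy document, the word in the strongly positive first sentence receives a larger weight than the words in the mixed second sentence, whose positive and negative scores partly cancel.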

The Identification Framework for source code author using Authorship Analysis and CNN (작성자 분석과 CNN을 적용한 소스 코드 작성자 식별 프레임워크)

  • Shin, Gun-Yoon;Kim, Dong-Wook;Hong, Sung-sam;Han, Myung-Mook
    • Journal of Internet Computing and Services
    • /
    • v.19 no.5
    • /
    • pp.33-41
    • /
    • 2018
  • With the recent development of Internet technology, various programs are being created, and code is therefore being written by many authors. As a result, some authors pass off programs or code written by others as their own, use other writers' code indiscriminately, or fail to indicate exactly which code they have used. This makes it increasingly difficult to protect code. In this paper, we propose an author identification framework using Authorship Analysis theory and Natural Language Processing (NLP) based on a Convolutional Neural Network (CNN). We apply Authorship Analysis theory to extract features for author identification from source code, and combine them with features used in text mining to perform author identification using machine learning. In addition, we apply a CNN-based natural language processing method to source code for code author classification. We therefore propose a framework for author identification using Authorship Analysis theory and the CNN. To identify an author, features specific to that author are needed, and the CNN-based NLP method can be applied to a language with a special structure, such as source code, to identify the author. The identification accuracy based on Authorship Analysis theory is 95.1%, and the identification accuracy with the CNN is 98%.