• Title/Summary/Keyword: Text data

EFTG: Efficient and Flexible Top-K Geo-textual Publish/Subscribe

  • Zhu, Hong;Li, Hongbo;Cui, Zongmin;Cao, Zhongsheng;Xie, Meiyi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.5877-5897 / 2018
  • With the popularity of mobile networks and smartphones, geo-textual publish/subscribe messaging has attracted wide attention. Unlike the traditional publish/subscribe format, geo-textual data is published and subscribed to as a dynamic data flow in the mobile network. This difference creates stricter requirements for efficiency and flexibility. However, most existing Top-k geo-textual publish/subscribe schemes have the following deficiencies: (1) All publications have to be scored for each subscription, which is inefficient. (2) A user must take time to set a threshold for each subscription, which is inflexible. Therefore, we propose an efficient and flexible Top-k geo-textual publish/subscribe scheme. First, our scheme groups publications and subscriptions based on text classification. Thus, only a small set of related publications needs to be scored for each subscription, which significantly enhances efficiency. Second, our scheme proposes an adaptive publish/subscribe matching algorithm. The algorithm does not require the user to set a threshold. It can adaptively return Top-k results to the user for each subscription, which significantly enhances flexibility. Finally, theoretical analysis and experimental evaluation verify the efficiency and effectiveness of our scheme.
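
The two ideas in this abstract, scoring only the publications in a subscription's text category and returning the k best without a user-set threshold, can be illustrated with a minimal sketch. This is not the EFTG algorithm itself: the record layout, the `score` function (an equal-weight mix of Jaccard term overlap and inverse distance), and the name `top_k_by_category` are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict


def score(pub, sub, alpha=0.5):
    """Hypothetical relevance score: text overlap blended with spatial proximity."""
    union = pub["terms"] | sub["terms"]
    text_sim = len(pub["terms"] & sub["terms"]) / len(union) if union else 0.0
    dx = pub["location"][0] - sub["location"][0]
    dy = pub["location"][1] - sub["location"][1]
    proximity = 1.0 / (1.0 + math.hypot(dx, dy))
    return alpha * text_sim + (1.0 - alpha) * proximity


def top_k_by_category(publications, subscription, k):
    """Return ids of the k best publications in the subscription's category."""
    if k <= 0:
        return []

    # Group publications by category so unrelated ones are never scored.
    groups = defaultdict(list)
    for pub in publications:
        groups[pub["category"]].append(pub)

    # A bounded min-heap keeps the current top k; no threshold is supplied by
    # the user, the smallest retained score acts as a moving cut-off.
    heap = []
    for pub in groups.get(subscription["category"], []):
        s = score(pub, subscription)
        if len(heap) < k:
            heapq.heappush(heap, (s, pub["id"]))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, pub["id"]))

    # Best score first.
    return [pub_id for _, pub_id in sorted(heap, reverse=True)]
```

Each publication here is assumed to be a dict with `id`, `category`, `location` (an `(x, y)` pair), and `terms` (a set of words). The heap bounds the work per subscription to the candidates in one category and O(log k) per candidate.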

A Study on Effective Sentiment Analysis through News Classification in Bankruptcy Prediction Model (부도예측 모형에서 뉴스 분류를 통한 효과적인 감성분석에 관한 연구)

  • Kim, Chansong;Shin, Minsoo
    • Journal of Information Technology Services / v.18 no.1 / pp.187-200 / 2019
  • Bankruptcy prediction has consistently attracted interest in various fields. Recently, as technology for handling unstructured data has developed, research applying text mining to business prediction models has become active, and studies using this method are also increasing in bankruptcy prediction. In particular, researchers are actively trying to improve bankruptcy prediction by analyzing news data dealing with the external environment of the corporation. However, there has been a lack of study on which news, among the news mass-produced in real time, is effective in bankruptcy prediction. The purpose of this study was to evaluate which news has a high impact on bankruptcy prediction. Therefore, we classified news according to type and collection period, and analyzed the impact on bankruptcy prediction based on sentiment analysis. As a result, the artificial neural network was the most effective among the algorithms used, and commentary-type news was the most effective in bankruptcy prediction. Column and straight-type news were also significant, but photo-type news was not significant. By collection period, news from the 4 months before the bankruptcy was the most effective in bankruptcy prediction. In this study, we propose a news classification method for sentiment analysis that is effective for bankruptcy prediction models.

A Deep Learning Model for Extracting Consumer Sentiments using Recurrent Neural Network Techniques

  • Ranjan, Roop;Daniel, AK
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.238-246 / 2021
  • The rapid rise of the Internet and social media has resulted in a large number of text-based reviews being posted online. In the age of social media, utilizing machine learning technologies to analyze the emotional context of comments aids in the understanding of Quality of Service (QoS) for any product or service. The classification and analysis of user reviews aids in the improvement of QoS. Unlike traditional categorization models, which are based on a set of rules, machine learning algorithms have evolved into a powerful tool for analyzing user sentiment. In sentiment categorization, Bidirectional Long Short-Term Memory (BiLSTM) has shown significant results, and Convolutional Neural Network (CNN) has shown promising results. Using convolution and pooling layers, CNN can successfully extract local information. BiLSTM uses dual LSTM orientations to increase the amount of background knowledge available to deep learning models. The suggested hybrid model combines the benefits of these two deep learning-based algorithms. The data source for analysis and classification was user reviews of Indian Railway Services on Twitter. The suggested hybrid model uses the Keras Embedding technique as an input source. The suggested model takes in data and generates lower-dimensional features that lead to a classification result. The suggested hybrid model's performance was compared using Keras and Word2Vec, and the proposed model showed a significant improvement, with an accuracy of 95.19 percent.

Topics and Sentiment Analysis Based on Reviews of Omni-Channel Retailing

  • KIM, Soon-Hong;YOO, Byong-Kook
    • Journal of Distribution Science / v.19 no.4 / pp.25-35 / 2021
  • Purpose: This study aims to analyze the factors affecting customer satisfaction in customer reviews of omni-channel retailing posted on Internet blogs, cafes, and YouTube, using text mining analysis. Research, data, and Methodology: In this study, frequency analysis is performed and LDA (Latent Dirichlet Allocation) is used to analyze social big data and capture reviewers' reactions in reviews of the omni-channel shopping recently opened by L Shopping Company. Additionally, based on the topic analysis, we conduct a sentiment analysis on purchase reviews and analyze the characteristics of each topic with respect to the positive or negative sentiments of omni-channel app users. Results: The topic analysis yields four main topics: delivery and events, economic value, recommendations and convenience, and product quality and brand awareness. The sentiment analysis reveals that the reviewers have many positive evaluations for price policy and product promotion, but negative evaluations for app use, delivery, and product quality. Conclusions: Retailers can establish customized marketing strategies by identifying the customer's major interests through text mining analysis. Additionally, the analysis of sentiment by topic becomes an important indicator for developing products and services that customers want by identifying areas that satisfy customers and areas that evoke negative reactions.

Changes in the Perception of Second-hand Fashion Consumption in the Post-pandemic Era (포스트 팬데믹 시대의 중고 패션 소비 인식 변화)

  • Kim, Habin;Lee, Ha Kyung
    • Fashion & Textile Research Journal / v.24 no.1 / pp.66-80 / 2022
  • Even before the Covid-19 outbreak, the second-hand fashion market had been growing as the fashion industry strove towards sustainability. Its growth has also accelerated due to the economic contraction caused by the pandemic. The second-hand market has been studied steadily in previous work; however, the research is insufficient relative to how diversified the market has become. Therefore, this study investigates changes in consumers' perception of the second-hand fashion market affected by Covid-19. This study collected text data with the keyword 'second-hand fashion' from various blogs. We analyzed 24,000 posts from before and after the Covid-19 outbreak by applying the LDA algorithm for topic modeling and content analysis. Seven topics were derived for the period before the pandemic and nine for the period after it. The results revealed that during the pandemic consumers realized the practical value of sustainability in their daily lives more than they did before the pandemic. Furthermore, they tried to minimize transaction anxiety by using diverse platforms with advanced technology. They also realized economic value by buying and selling sneakers in the popular sneaker resale market. The results could help in understanding the rapidly growing second-hand fashion market during Covid-19.

Keyword trends analysis related to the aviation industry during the Covid-19 period using text mining (텍스트마이닝을 활용한 Covid-19 기간 동안의 항공산업 관련 키워드 트렌드 분석)

  • Choi, Donghyun;Song, Bomi;Park, Dahyeon;Lee, Sungwoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.115-128 / 2022
  • The purpose of this study is to conduct a keyword trend analysis using article data on the impact of Covid-19 on the aviation industry. Related articles were extracted using the keyword "Airline" and divided into the 6 months before and the 6 months after the Covid-19 outbreak. Topic modeling (LDA) was then performed to extract the main topics that arise during an epidemic such as Covid-19. The results are expected to serve as primary data for predicting the impact on the aviation industry when an event like Covid-19 occurs.

Investigation of Trend in Virtual Reality-based Workplace Convergence Research: Using Pathfinder Network and Parallel Neighbor Clustering Methodology (가상현실 기반 업무공간 융복합 분야 연구 동향 분석 : 패스파인더 네트워크와 병렬 최근접 이웃 클러스터링 방법론 활용)

  • Ha, Jae Been;Kang, Ju Young
    • The Journal of Information Systems / v.31 no.2 / pp.19-43 / 2022
  • Purpose: Due to the COVID-19 pandemic, many companies are building virtual workplaces based on virtual reality technology. Through this study, we intend to identify the trend of convergence research between virtual reality technology and the workplace, and to suggest promising future fields based on it. Design/methodology/approach: For this purpose, 12,250 bibliographic records of research papers related to Virtual Reality (VR) and Workplace, published from 1982 to 2021, were collected from Scopus. The bibliographic data of the collected papers were analyzed using Text Mining, Pathfinder Network, Parallel Neighbor Clustering, Nearest Neighbor Centrality, and Triangle Betweenness Centrality. Through this, the relationships between keywords in each period were identified, and network analysis and visualization were performed for virtual reality-based workplace research. Findings: This study is expected to reveal the flow of the main keyword knowledge structure in virtual reality-based workplace convergence research, and the relationships identified between keywords can serve as a major guide for setting directions in subsequent studies.

LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19

  • Ouyang, Sizhuo;Wang, Yuxing;Zhou, Kaiyin;Xia, Jingbo
    • Genomics & Informatics / v.19 no.3 / pp.23.1-23.7 / 2021
  • The coronavirus disease 2019 (COVID-19) literature has been increasing dramatically, and the increased amount of text makes it possible to perform large-scale text mining and knowledge discovery. Therefore, curation of these texts has become a crucial issue for the Bio-medical Natural Language Processing (BioNLP) community, so as to retrieve the important information about the mechanism of COVID-19. PubAnnotation is an aligned annotation system which provides an efficient platform for biological curators to upload their annotations or merge other external annotations. Inspired by the integration among multiple useful COVID-19 annotations, we merged three annotation resources into the LitCovid data set and constructed a cross-annotated corpus, LitCovid-AGAC. This corpus consists of 12 labels, including Mutation, Species, Gene, and Disease from PubTator; GO and CHEBI from OGER; and Var, MPA, CPA, NegReg, PosReg, and Reg from AGAC, applied to 50,018 COVID-19 abstracts in LitCovid. The corpus contains abundant information that may help unveil the hidden knowledge in the pathological mechanism of COVID-19.

A Study on Social Perceptions of Public Libraries Utilizing the sentiment analysis

  • Noh, Younghee;Kim, Dongseok
    • International Journal of Knowledge Content Development & Technology / v.12 no.4 / pp.41-65 / 2022
  • This study seeks to understand society's overall perception of public libraries by analyzing texts related to public libraries using semantic network analysis and sentiment analysis. For this purpose, this study collected five years of data, from January 1, 2016 through November 30, 2020, with the keywords 'Library' and 'Lifelong Learning Center' from the blogs and cafés of major domestic portal sites. With the collected data, text mining, keyword centrality, network structure, structural equivalence, and sentiment analyses were conducted. The analysis produced the following results. First, 'reading' and 'book' were identified as representative keywords that form the social perception of public libraries. Second, keywords related to library use and contactless ("untact") services appeared due to the recent spread of COVID-19. Third, the keywords found to have positive meanings suggest that, in planning the development of public libraries, it is necessary to create continuous services that can form a new image of the library, breaking away from its existing fixed role and image, and to increase the convenience of use. Fourth, facilities for library services were perceived from a neutral point of view. Fifth, the spread of infectious diseases, social distancing, and the temporary closure of libraries were negatively related to public libraries, and keywords concerning awareness of librarians were also identified as negative.

Burmese Sentiment Analysis Based on Transfer Learning

  • Mao, Cunli;Man, Zhibo;Yu, Zhengtao;Wu, Xia;Liang, Haoyuan
    • Journal of Information Processing Systems / v.18 no.4 / pp.535-548 / 2022
  • Using a rich resource language to classify sentiments in a language with few resources is a popular subject of research in natural language processing. Burmese is a low-resource language. In light of the scarcity of labeled training data for sentiment classification in Burmese, in this study, we propose a method of transfer learning for sentiment analysis of a language that uses the feature transfer technique on sentiments in English. This method generates a cross-language word-embedding representation of Burmese vocabulary to map Burmese text to the semantic space of English text. A model to classify sentiments in English is then pre-trained using a convolutional neural network and an attention mechanism, where the network shares the model for sentiment analysis of English. The parameters of the network layer are used to learn the cross-language features of the sentiments, which are then transferred to the model to classify sentiments in Burmese. Finally, the model was tuned using the labeled Burmese data. The results of the experiments show that the proposed method can significantly improve the classification of sentiments in Burmese compared to a model trained using only a Burmese corpus.