• Title/Summary/Keyword: online algorithm

A Sentiment Classification Approach of Sentences Clustering in Webcast Barrages

  • Li, Jun;Huang, Guimin;Zhou, Ya
    • Journal of Information Processing Systems / v.16 no.3 / pp.718-732 / 2020
  • Sentiment analysis and opinion mining are challenging tasks in natural language processing. Many sentiment analysis and opinion mining applications focus on product reviews, social media reviews, forums, and microblogs, whose reviews are topic-similar and opinion-rich. In this paper, we analyze the sentiment of sentences from online webcast reviews that scroll across the screen, which we call live barrages. In contrast to social media comments or product reviews, the topics in live barrages are more fragmented, and there are many invalid comments that must be removed in the preprocessing phase. To extract evaluative sentiment sentences, we propose a novel approach that clusters the barrages from the same commenter to address the problem that information is scattered across individual barrages. The method contains two subtasks: in the data preprocessing phase, we cluster the sentences from the same commenter and remove unusable sentences; we then use a semi-supervised machine learning approach, the naïve Bayes algorithm, to analyze the sentiment of the barrages. Our experimental results show that this method performs well in analyzing the sentiment of online webcast barrages.
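
The two subtasks described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes barrages arrive as `(commenter, text)` pairs, uses a plain supervised multinomial naïve Bayes with add-one smoothing rather than the semi-supervised variant the abstract mentions, and the function names and toy sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cluster_by_commenter(barrages):
    """Join all barrages written by the same commenter into one document."""
    grouped = defaultdict(list)
    for commenter, text in barrages:
        grouped[commenter].append(text)
    return {commenter: " ".join(texts) for commenter, texts in grouped.items()}


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in doc.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_docs = ["great stream love it", "so funny and great",
              "boring and bad", "terrible lag bad stream"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)

barrages = [("u1", "great"), ("u2", "bad lag"), ("u1", "love it"), ("u2", "boring")]
clustered = cluster_by_commenter(barrages)
predictions = {user: model.predict(doc) for user, doc in clustered.items()}
```

Grouping first gives the classifier several words per commenter instead of a one-word fragment, which is the motivation the abstract gives for clustering.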

Urdu News Classification using Application of Machine Learning Algorithms on News Headline

  • Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security / v.21 no.2 / pp.229-237 / 2021
  • Our modern 'information-hungry' age demands delivery of information at unprecedented speed. Timely delivery of noteworthy information about recent events can help people from different walks of life in a number of ways. As the world has become a global village, the volume and speed of the news flow demand the involvement of machines to help humans handle the enormous amount of data. News is presented to the public in the form of video, audio, images, and text. News text available on the internet is a source of knowledge for billions of internet users. Urdu is spoken and understood by millions of people from the Indian subcontinent. The availability of online Urdu news enables these people to improve their understanding of the world and make their decisions. This paper uses available online Urdu news data to train machines to automatically categorize news. Various machine learning algorithms were trained on news headlines, and the results demonstrate that the Bernoulli Naïve Bayes (Bernoulli NB) and Multinomial Naïve Bayes (Multinomial NB) algorithms outperformed the other algorithms on all performance measures. The maximum accuracy achieved for the dataset was 94.278% by the Multinomial NB classifier, followed by the Bernoulli NB classifier with an accuracy of 94.274%, when Urdu stop words were removed from the dataset. The results suggest that the short text of news headlines can be used as input for text categorization.

Mining Loot Box News : Analysis of Keyword Similarities Using Word2Vec (확률형 아이템 뉴스 마이닝 : Word2Vec 활용한 키워드 유사도 분석)

  • Kim, Taekyung;Son, Wonseok;Jeon, Seongmin
    • Journal of Information Technology Services / v.20 no.2 / pp.77-90 / 2021
  • Online and mobile games represent digital entertainment. Not only have games grown fast, but they have also been noted for unique business models such as subscription revenue and free-to-play with partial payment. However, a recent revenue mechanism called the loot-box system has been criticized for overspending, weak protection of teenagers, and, moreover, gambling-like features. Policy makers and research communities have relied on expert opinions, review boards, and temporal survey studies to build countermeasures that minimize the negative effects of online and mobile games. In this process, speed has not been seriously considered. In this study, we attempt to use a big data source to give policy makers and researchers a way of observing trends. Specifically, we apply the Word2Vec data mining algorithm to news repositories. The findings indicate that the suggested design would be effective in highlighting issues in a timely and precise manner. This study contributes to digital entertainment service communities by providing a practical method for following trends, thus helping practitioners establish concrete grounds for balancing public concerns and business purposes.
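
Keyword similarity over Word2Vec embeddings is usually measured with cosine similarity. The sketch below assumes the word vectors have already been trained elsewhere; the three-dimensional `toy_vectors` are invented for illustration and are not values from the study, and `most_similar` is a hypothetical helper name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=3):
    """Rank every other keyword by cosine similarity to `word`."""
    target = vectors[word]
    scored = [(other, cosine_similarity(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Illustrative vectors only; real embeddings would come from a trained model.
toy_vectors = {
    "loot_box": [0.9, 0.8, 0.1],
    "gambling": [0.8, 0.9, 0.2],
    "subscription": [0.1, 0.2, 0.9],
}
ranking = most_similar("loot_box", toy_vectors)
```

With these toy values, `ranking` lists "gambling" ahead of "subscription" because its vector points in nearly the same direction as the one for "loot_box".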

Language Matters: A Systemic Functional Linguistics-Enhanced Machine Learning Framework for Cyberbullying Detection

  • Raghad Altowairgi;Ala Eshamwi;Lobna Hsairi
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.192-198 / 2023
  • Cyberbullying is a growing problem among adolescents and can have serious psychological and emotional consequences for the victims. In recent years, machine learning techniques have emerged as a promising approach for detecting instances of cyberbullying in online communication. This paper focuses on developing machine learning models that can detect cyberbullying, including support vector machines, naïve Bayes, and random forests. The study uses a dataset of real-world examples of cyberbullying collected from Twitter, extracts features that represent the ideational metafunction, and then evaluates the performance of each algorithm before and after applying the theory of systemic functional linguistics, in terms of precision, recall, and F1-score. The results indicate that all three algorithms are effective at detecting cyberbullying, with an accuracy of 92% for naïve Bayes and 93% for both SVM and random forests. However, the study also highlights the challenges of accurately detecting cyberbullying, particularly given the nuanced and context-dependent nature of online communication. The paper concludes by discussing the implications of these findings for future research and the development of practical tools for cyberbullying prevention and intervention.
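
The evaluation metrics named in this abstract follow standard definitions. The sketch below computes them for a binary task; the function name, the label convention (1 = cyberbullying), and the five toy labels are assumptions for illustration, not data from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here precision and recall are both 2/3, so their harmonic mean, the F1-score, is also 2/3.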

Time Series Data Cleaning Method Based on Optimized ELM Prediction Constraints

  • Guohui Ding;Yueyi Zhu;Chenyang Li;Jinwei Wang;Ru Wei;Zhaoyu Liu
    • Journal of Information Processing Systems / v.19 no.2 / pp.149-163 / 2023
  • Errors caused by external factors are common in time series data collected by sensors. The traditional method of cleaning errors by constraining the speed change rate can perform well. However, because its constraint rules are fixed, it is limited to data whose speed changes steadily. In practice, data with unevenly changing speed is common. To solve this problem, this paper proposes an online cleaning algorithm for time series data based on dynamic speed change rate constraints. Since time series data usually changes periodically, we use the extreme learning machine to learn the law of speed changes from past data and predict time-varying speed ranges, which are used to detect erroneous data. To realize online data repair, a dual-window mechanism is proposed to transform the global optimum into a local optimum, and the traditional minimum change principle and median theorem are applied in selecting the repair strategy. Because the repair method based on the minimum change principle cannot correct consecutive abnormal points, quantitative analysis suggests that the repair strategy should be the boundary of the repair candidate set. The experimental results obtained on the dataset show that the proposed method achieves a better repair effect.
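
A speed-constraint check can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the `speed_range` callback is a hypothetical stand-in for the learned predictor of time-varying speed ranges, the first point is trusted, timestamps are assumed strictly increasing, and there is no dual-window mechanism.

```python
def clean_stream(points, speed_range):
    """Repair a stream of (t, x) points one at a time.

    speed_range(t) returns the allowed (s_min, s_max) change per unit time
    at timestamp t. A point whose speed relative to the previous repaired
    point falls outside that range is moved to the nearest boundary.
    """
    cleaned = []
    for t, x in points:
        if cleaned:
            t_prev, x_prev = cleaned[-1]
            s_min, s_max = speed_range(t)
            dt = t - t_prev  # assumed > 0
            low, high = x_prev + s_min * dt, x_prev + s_max * dt
            x = min(max(x, low), high)  # clamp into the candidate range
        cleaned.append((t, x))
    return cleaned


# Fixed range used here for simplicity; a dynamic predictor would vary with t.
raw = [(0, 0.0), (1, 1.0), (2, 10.0), (3, 3.0)]
repaired = clean_stream(raw, lambda t: (-2.0, 2.0))
```

In this toy run the spike at `t = 2` is pulled down to 3.0, the upper boundary reachable from the previous point, and the remaining points are left unchanged.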

Online Identification for Normal and Abnormal Status of Water Quality on Ocean USN (해양 USN 환경에서 수질환경의 온라인 정상·비정상 상태 구분)

  • Jeoung, Sin-Chul;Ceong, Hee-Taek
    • The Journal of the Korea institute of electronic communication sciences / v.7 no.4 / pp.905-915 / 2012
  • This paper proposes an online method for identifying the normal and abnormal states of water quality in an ocean USN. To define the normal state of ocean water quality, we use the negative selection algorithm of the artificial immune system, which can distinguish self from nonself. To detect an abnormal status, a set of normal states of ocean water quality must first be defined. For this purpose, we generate the normal-state set based on mutations of each data item and mutations of the data as a logical product. These mutated normal (or self) sets are used to identify abnormal water quality status. We present experimental results for the mutated self set using the Gaussian function. By deploying the method on an ocean sensor logger, we can monitor online whether the ocean water quality is in a normal or abnormal state.
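
The core idea of negative selection can be sketched in one dimension. The sketch makes several assumptions that are not in the abstract: readings are normalized to [0, 1], matching is a simple distance threshold, and candidate detectors come from an evenly spaced grid (random candidates are more typical) so the result is reproducible. It does not implement the mutation-based self set or the Gaussian function the abstract refers to.

```python
def matches(detector, sample, radius):
    """A detector matches a sample that lies within `radius` of it."""
    return abs(detector - sample) <= radius


def generate_detectors(self_set, candidates, radius):
    """Keep only candidates that match no self (normal) sample."""
    return [c for c in candidates
            if not any(matches(c, s, radius) for s in self_set)]


def is_abnormal(sample, detectors, radius):
    """A sample is nonself (abnormal) if any detector matches it."""
    return any(matches(d, sample, radius) for d in detectors)


# Hypothetical normalized readings that define the normal state.
self_set = [0.45, 0.50, 0.55]
candidates = [i / 100 for i in range(101)]
detectors = generate_detectors(self_set, candidates, radius=0.1)
```

A reading such as 0.9 is then flagged by `is_abnormal`, while 0.5 is not, because every candidate near the self samples was discarded during detector generation.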

Implementation of Smart E-learning based on Blended Learning (혼합형 학습 기반 스마트 이러닝 구현)

  • Hong, YouSik
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.171-178 / 2020
  • Many countries are establishing and operating blended learning, which combines the advantages of online and offline education. However, lecture-based online MOOC courses have a very low completion level, with a graduation rate of less than 5-10%. Therefore, to increase the graduation rate of students taking web-based MOOC distance lectures, which anyone can easily take anytime and anywhere, it is necessary to introduce automatic analysis of students' level of understanding of lectures and an automatic academic warning system. Moreover, to become an advanced country in education, it is necessary to develop software for automatically judging the wrong-answer rate, automatically summarizing lectures, and automatically analyzing lecture-based weak subjects according to mixed learning levels. To address this problem, in this paper we propose and simulate an automatic summarization system for lecture content, an automatic warning system for incorrect answers, and an automatic judgment algorithm for weak subjects.

Adaptive stochastic gradient method under two mixing heterogenous models (두 이종 혼합 모형에서의 수정된 경사 하강법)

  • Moon, Sang Jun;Jeon, Jong-June
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1245-1255 / 2017
  • Online learning is the process of obtaining the solution of a given objective function when data accumulate in real time or in batches. The stochastic gradient descent method is one of the most widely used methods for online learning. It is not only easy to implement, but its solution also has good properties under the assumption that the data-generating model is homogeneous. However, the stochastic gradient method can severely mislead online learning when homogeneity is violated. We assume that there are two heterogeneous generating models in the observations and propose a new stochastic gradient method that mitigates the problem of heterogeneous models. We introduce a robust mini-batch optimization method using statistical tests and investigate the convergence radius of the solution of the proposed method. Moreover, the theoretical results are confirmed by numerical simulations.
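
For reference, the plain mini-batch stochastic gradient update that this abstract takes as its starting point can be sketched as below. This is the homogeneous baseline only; it does not include the statistical tests of the proposed robust method. The squared-error linear model, the learning rate, and the noiseless toy data (generated from y = 2x + 1) are assumptions chosen for illustration.

```python
def sgd_step(w, b, batch, lr):
    """One mini-batch gradient step for the model y = w * x + b
    under the loss 0.5 * mean squared error."""
    n = len(batch)
    grad_w = sum((w * x + b - y) * x for x, y in batch) / n
    grad_b = sum((w * x + b - y) for x, y in batch) / n
    return w - lr * grad_w, b - lr * grad_b


def online_fit(batches, lr=0.1, w=0.0, b=0.0):
    """Consume batches in arrival order, updating the parameters each time."""
    for batch in batches:
        w, b = sgd_step(w, b, batch, lr)
    return w, b


# Noiseless toy stream; the three mini-batches are replayed 5000 times.
data = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0), (1.5, 4.0), (2.0, 5.0)]
batches = [data[i:i + 2] for i in range(0, len(data), 2)] * 5000
w, b = online_fit(batches)
```

Because every batch here comes from the same generating model, the estimates settle near w = 2 and b = 1. The abstract's concern is the case where batches come from two different models, in which this plain update can be misled.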

Social Network Spam Detection using Recursive Structure Features (소셜 네트워크 상에서의 재귀적 네트워크 구조 특성을 활용한 스팸탐지 기법)

  • Jang, Boyeon;Jeong, Sihyun;Kim, Chongkwon
    • Journal of KIISE / v.44 no.11 / pp.1231-1235 / 2017
  • Given the network structure of an online social network, it is important to find a way to distinguish spam accounts using network features. In online social networks, service providers attempt to detect social spamming to maintain service quality. However, spammer groups change their strategies to avoid detection. Even though spammers attempt to act like legitimate users, certain distinguishable structural features are not easily changed. In this paper, we investigate a way to generate meaningful network structure features and propose a spammer detection method using recursive structural features. In an experiment on a real-world dataset, we found that the proposed algorithm improved classification performance by about 8%.
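
The abstract does not specify which features are used, so the sketch below shows one common way to build recursive structural features and should be read as an assumption: each node starts with its degree, and at every round the mean of its neighbors' current feature vectors is appended. The adjacency-dictionary representation and the star-shaped toy graph are also illustrative.

```python
def recursive_features(adj, depth=2):
    """Build per-node feature vectors by recursive neighbor aggregation.

    adj maps each node to a list of its neighbors. The base feature is
    the degree; each round appends the neighbor mean of all features
    computed so far, so the vector length doubles per round.
    """
    feats = {node: [float(len(nbrs))] for node, nbrs in adj.items()}
    for _ in range(depth):
        updated = {}
        for node, nbrs in adj.items():
            width = len(feats[node])
            if nbrs:
                agg = [sum(feats[m][i] for m in nbrs) / len(nbrs)
                       for i in range(width)]
            else:
                agg = [0.0] * width  # isolated node: nothing to aggregate
            updated[node] = feats[node] + agg
        feats = updated
    return feats


# Toy star graph: one hub connected to three leaves.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
features = recursive_features(star, depth=2)
```

The resulting vectors could then be passed to any classifier. The hub and the leaves receive clearly different vectors even though the only raw input is the graph structure.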

Continuum Mechanics-Based Environment Modeling for Telemanipulation of Soft Tissues in a Telepalpation System (생체조직의 원격촉진시스템을 위한 연속체역학 기반의 환경 모델링)

  • Kim, Jung-Sik;Kim, Jung
    • Transactions of the Korean Society of Mechanical Engineers B / v.35 no.11 / pp.1199-1204 / 2011
  • The capability to bilaterally telemanipulate soft-tissues for medical applications could increase the quality of telemanipulation systems. Since most soft-tissue manipulation tasks include constrained motion interacting with an unknown and dynamic bioenvironment through contact, bilateral telemanipulation raises problems due to stability and transparency issues. It is well understood that knowledge of environments plays an important role in pursuing transparent telemanipulation and achieving telepresence, and in particular, online estimation of environmental parameters with an explicit environment model can improve these systems' performance. In this study, a continuum mechanics-based environment model with an online environmental property estimation algorithm and an adaptive telemanipulation control scheme is proposed. The proposed method can improve the telemanipulation performance in terms of stability and transparency and can offer valuable information (e.g., elastic modulus of soft tissues) pertaining to diagnostic examinations.
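
Online estimation of an environmental parameter can be illustrated with scalar recursive least squares. The sketch substitutes a linear spring model (force = stiffness × displacement) for the continuum mechanics-based model in the abstract, so it is a simplified stand-in, not the proposed algorithm. The class name, initial covariance, and toy measurements are hypothetical.

```python
class StiffnessEstimator:
    """Scalar recursive least squares for force = stiffness * displacement."""

    def __init__(self, stiffness=0.0, covariance=1e6, forgetting=1.0):
        self.k = stiffness       # current stiffness estimate
        self.p = covariance      # uncertainty of the estimate
        self.lam = forgetting    # 1.0 weights all samples equally

    def update(self, displacement, force):
        x = displacement
        gain = self.p * x / (self.lam + x * self.p * x)
        self.k += gain * (force - self.k * x)  # correct by prediction error
        self.p = (self.p - gain * x * self.p) / self.lam
        return self.k


# Toy measurements generated from a stiffness of 5.0, without noise.
estimator = StiffnessEstimator()
for x in [0.5, 1.0, 1.5, 2.0]:
    estimate = estimator.update(x, 5.0 * x)
```

Each new contact measurement refines the estimate without storing past samples, which is what makes this kind of update usable inside a real-time control loop.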