• Title/Summary/Keyword: Unsupervised

K-Means Clustering with Content Based Doctor Recommendation for Cancer

  • Kumar, Rethina;Ganapathy, Gopinath;Kang, Jeong-Jin
    • International Journal of Advanced Culture Technology / v.8 no.4 / pp.167-176 / 2020
  • Recommendation systems are in high demand among people and researchers who need suggestions tailored to their personal needs, such as sorting doctors and suggesting one to a patient. Most rating prediction in recommendation systems is based on patients' feedback and information about their treatment. A patient's preferences are inferred from the historical behaviour of similar patients. The similarity between patients is generally measured from their feedback about the doctor, the treatment methods, and the success rates. This paper presents a new method of predicting top-ranked doctors in recommendation systems. The proposed recommendation system starts by identifying similar doctors based on the patients' health requirements and clusters them using K-Means clustering. The proposed K-Means Clustering with Content Based Doctor Recommendation for Cancer (KMC-CBD) helps users find an optimal solution. The core component of KMC-CBD suggests top recommended doctors to a patient, drawing on similar patients who have already been treated by those doctors, and supports the choice of doctor and hospital given the patient's requirements and health condition. The system first applies K-Means clustering, an unsupervised learning method, to the doctors according to their profiles and lists the doctors according to their medical profiles. The content-based doctor recommendation system then generates a top-rated list of doctors for the given patient profile by exploiting health data shared by the internet community. Patients can find the most similar patients, analyze how they were treated for similar diseases, and send and receive suggestions to solve their health issues. To improve the efficiency of the recommendation system, patients can express their health information in a natural-language sentence. The system analyzes the sentence, identifies the most relevant medical area for that specific case, and uses this information for the recommendation task. Information provided by users, as well as by the recommendation system, is used to suggest the right doctors for a specific health problem. The proposed system is implemented in Python with the necessary functions and dataset.
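
The two stages described in this abstract (clustering doctors by profile, then ranking doctors against a patient profile) can be sketched as follows. This is a minimal illustration, not the authors' KMC-CBD implementation: the doctor names, the four feature columns, the patient vector, and the use of cosine similarity for the content-based ranking are all assumptions made for the example.

```python
import numpy as np

def assign(X, centroids):
    """Return the index of the nearest centroid (Euclidean) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(X, centroids)

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(patient, names, profiles, centroids, labels, top_n=2):
    """Pick the cluster nearest to the patient, then rank its doctors by cosine similarity."""
    cluster = int(np.linalg.norm(centroids - patient, axis=1).argmin())
    candidates = [i for i in range(len(names)) if labels[i] == cluster]
    ranked = sorted(candidates, key=lambda i: cosine(patient, profiles[i]), reverse=True)
    return [names[i] for i in ranked[:top_n]]

# Hypothetical doctor profiles; columns (assumed): breast cancer, lung cancer, surgery, chemotherapy
names = ["D0", "D1", "D2", "D3", "D4", "D5"]
profiles = np.array([
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.7, 0.3],
    [0.9, 0.0, 0.6, 0.4],
    [0.1, 0.9, 0.2, 0.8],
    [0.2, 0.8, 0.1, 0.9],
    [0.0, 0.9, 0.3, 0.7],
])
patient = np.array([0.1, 1.0, 0.2, 0.8])  # hypothetical patient requirement vector

centroids, labels = kmeans(profiles, k=2)
top_doctors = recommend(patient, names, profiles, centroids, labels)
print(top_doctors)
```

Restricting the ranking to one cluster keeps the candidate list short; whether the paper ranks within a cluster or across all doctors is not stated in the abstract.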

Building Energy Time Series Data Mining for Behavior Analytics and Forecasting Energy consumption

  • Balachander, K;Paulraj, D
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.1957-1980 / 2021
  • The main aim of this research is to evaluate mechanisms for efficient, context-aware energy use by in-home devices, thereby improving the information that smart metering systems provide about energy usage in selected homes and its time of use. Advances in information processing are commonly used to analyze very large volumes of building operation data in order to improve the operating efficiency of building energy systems. Here, several smart data mining models are offered to measure and predict energy time series in order to expose different temporal patterns of energy use. These patterns describe how appliances are used in relation to time, such as hour of day, time of day, week, month, and year, within a household; they are key components in gathering and separating the effect of consumer behavior on energy use and on energy prediction. It is necessary to determine the multiple relations among the usage of different appliances from simultaneous information flows. Specific relations among interval-based events, where the use of multiple appliances continues for a certain duration, are especially difficult to determine. To resolve these difficulties, unsupervised clustering of energy time-series data and frequent pattern mining are presented, together with a deep learning technique for estimating energy use. A broad test was conducted using real data sets that are rich in smart meter data. The appliance usage patterns recognized by the proposed model were processed by Deep Convolutional Neural Networks (CNN) and Recurrent Neural Networks (LSTM and GRU) at each stage, with consolidated accuracies of 94.79%, 97.99%, and 99.61% for 25%, 50%, and 75%, respectively.
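
As a rough illustration of the frequent pattern mining step mentioned in this abstract, the sketch below counts which pairs of appliances are active in the same time window and keeps the pairs whose support meets a threshold. The appliance names, the windows, and the threshold are hypothetical, and only pairs are counted; the paper's actual mining procedure is not specified here.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(windows, min_support):
    """Return {pair: support} for appliance pairs co-active in at least min_support of windows."""
    counts = Counter()
    for active in windows:
        # Sort so that each pair has one canonical ordering
        for pair in combinations(sorted(active), 2):
            counts[pair] += 1
    n = len(windows)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical sets of appliances that were on during each one-hour window
windows = [
    {"kettle", "toaster"},
    {"kettle", "toaster", "tv"},
    {"tv", "heater"},
    {"kettle", "toaster"},
    {"heater", "tv"},
]

pairs = frequent_pairs(windows, min_support=0.4)
print(pairs)
```

Longer itemsets could be built from the surviving pairs in the usual Apriori fashion, since any frequent triple must consist of frequent pairs.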

Sensitivity of abacus and Chasdaq in the Chinese stock market through analysis of Weibo sentiment related to Corona-19 (코로나-19관련 웨이보 정서 분석을 통한 중국 주식시장의 주판 및 차스닥의 민감도 예측 기법)

  • Li, Jiaqi;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.1-7 / 2021
  • Investor mood from social media is gaining increasing attention as a leading indicator of price movements in the stock market. Based on behavioral finance theory, this study argues that sentiment extracted from social media using big data techniques can predict real-time (short-run) price momentum in the Chinese stock market. Sina Weibo posts related to COVID-19 are collected using a keyword method, and a daily influence-weighted sentiment factor is extracted from the sizable raw data of over 2 million posts. We examine one supervised and four unsupervised sentiment analysis models, and use the best-performing word-frequency and BiLSTM models. The test results show a similar movement between stock price changes and the sentiment factor. This indicates that public mood extracted from social media can to some extent represent investors' sentiment and can make a difference in stock market fluctuation when people are concentrating on special events that can affect the stock market.

A Reconstruction of Classification for Iris Species Using Euclidean Distance Based on a Machine Learning (머신러닝 기반 유클리드 거리를 이용한 붓꽃 품종 분류 재구성)

  • Nam, Soo-Tai;Shin, Seong-Yoon;Jin, Chan-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.225-230 / 2020
  • Machine learning refers to algorithms that train a computer on data so that the computer can identify trends in the data and predict the output for new input data. Machine learning can be classified into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning trains a machine on data with given labels. In other words, a function of the system is inferred from pairs of data and labels, and the inferred function is then used to predict the result for new input data. If the predicted value is continuous, the task is regression; if the predicted value is discrete, it is classification. As a result of the analysis, no. 8 (5, 3.4, setosa), 27 (5, 3.4, setosa), 41 (5, 3.5, setosa), 44 (5, 3.5, setosa), and 40 (5.1, 3.4, setosa) in Table 3 were classified as the most similar Iris flowers. Theoretical and practical implications are suggested accordingly.
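
A minimal sketch of ranking samples by Euclidean distance, using the five samples listed in this abstract. Two things are assumptions: that the two numbers per sample are sepal length and sepal width, and the query point, which is hypothetical because the abstract does not state which new flower was classified.

```python
import math

# Sample number -> (first value, second value) as listed in the abstract;
# interpreting them as (sepal length, sepal width) is an assumption.
samples = {
    8: (5.0, 3.4),
    27: (5.0, 3.4),
    41: (5.0, 3.5),
    44: (5.0, 3.5),
    40: (5.1, 3.4),
}

def nearest(query, samples, k):
    """Return the k (sample id, distance) pairs closest to query by Euclidean distance."""
    scored = [(sid, math.dist(query, point)) for sid, point in samples.items()]
    scored.sort(key=lambda item: item[1])  # stable sort keeps listing order on ties
    return scored[:k]

query = (5.0, 3.4)  # hypothetical new measurement
closest = nearest(query, samples, k=2)
for sid, dist in closest:
    print(f"no. {sid}: distance {dist:.3f}")
```

All five samples carry the label setosa, so a nearest-neighbor vote over them would classify this query as setosa regardless of k.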

Multiple Sclerosis Lesion Detection using 3D Autoencoder in Brain Magnetic Resonance Images (3D 오토인코더 기반의 뇌 자기공명영상에서 다발성 경화증 병변 검출)

  • Choi, Wonjune;Park, Seongsu;Kim, Yunsoo;Gahm, Jin Kyu
    • Journal of Korea Multimedia Society / v.24 no.8 / pp.979-987 / 2021
  • Multiple Sclerosis (MS) can be diagnosed early by detecting lesions in brain magnetic resonance images (MRI). Unsupervised anomaly detection methods based on autoencoders have recently been proposed for automated detection of MS lesions. However, these autoencoder-based methods were developed only for 2D images (e.g. 2D cross-sectional slices) of MRI, and so do not utilize the full 3D information of MRI. In this paper, therefore, we propose a novel 3D autoencoder-based framework for detecting the lesion volume of MS in MRI. We first define a 3D convolutional neural network (CNN) for full MRI volumes, and build each encoder and decoder layer of the 3D autoencoder on the 3D CNN. We also add a skip connection between the encoder and decoder layers for effective data reconstruction. In the experiments, we compare the 3D autoencoder-based method with 2D autoencoder models, using training datasets of 80 healthy subjects from the Human Connectome Project (HCP) and testing datasets of 25 MS patients from the Longitudinal multiple sclerosis lesion segmentation challenge, and show that the proposed method achieves superior performance in predicting MS lesions, by up to 15%.
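
The principle behind autoencoder-based unsupervised anomaly detection (fit a reconstruction model on healthy data only, then flag inputs that reconstruct poorly) can be sketched with a much simpler model. The example below swaps in PCA, which acts as a linear autoencoder, for the paper's 3D convolutional autoencoder; the synthetic 4-dimensional data and the 99th-percentile threshold are assumptions made for illustration.

```python
import numpy as np

class LinearAutoencoder:
    """PCA used as a linear encoder/decoder (a stand-in for a neural autoencoder)."""

    def fit(self, X, n_components):
        self.mean_ = X.mean(axis=0)
        # Rows of Vt are the principal directions of the centered data
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[:n_components]
        return self

    def reconstruct(self, X):
        codes = (X - self.mean_) @ self.components_.T   # encode
        return codes @ self.components_ + self.mean_    # decode

    def error(self, X):
        """Per-sample reconstruction error (Euclidean norm of the residual)."""
        return np.linalg.norm(X - self.reconstruct(X), axis=1)

rng = np.random.default_rng(0)
direction = np.array([0.5, 0.5, 0.5, 0.5])              # unit vector
t = rng.uniform(-1.0, 1.0, size=(200, 1))
healthy = t * direction + rng.normal(0.0, 0.01, size=(200, 4))  # synthetic "healthy" data

model = LinearAutoencoder().fit(healthy, n_components=1)
threshold = np.percentile(model.error(healthy), 99)     # assumed thresholding rule

test = np.array([
    [0.5, 0.5, 0.5, 0.5],     # lies on the healthy pattern
    [1.0, -1.0, 1.0, -1.0],   # off-pattern sample
])
flags = model.error(test) > threshold
print(flags)
```

Because the model only learns to reproduce healthy structure, anything it cannot reproduce stands out by its residual, which is the same reasoning the abstract applies to lesion voxels.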

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology / v.12 no.1 / pp.31-38 / 2022
  • This paper applies MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder), an expert-independent, unsupervised neural network model for multivariate time series analysis. Because MSCRED is based on an autoencoder, its training data must not be contaminated; to overcome this limitation, the paper uses a training-data sampling technique called Subset Sampling Validation. Using labeled vibration data from power plant equipment, the classification performance of MSCRED is evaluated with the anomaly score in two cases: 1) abnormal data is mixed into the training data, and 2) the abnormal data is removed from the training data of case 1. Through this, the paper presents an expert-independent anomaly diagnosis framework that is robust to erroneous data, and offers a concise and accurate solution for various fields that use multivariate time series data.

A review of gene selection methods based on machine learning approaches (기계학습 접근법에 기반한 유전자 선택 방법들에 대한 리뷰)

  • Lee, Hajoung;Kim, Jaejik
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.667-684 / 2022
  • Gene expression data present the level of mRNA abundance of each gene, and analyses of gene expression have provided key ideas for understanding the mechanisms of diseases and developing new drugs and therapies. High-throughput technologies such as DNA microarrays and RNA sequencing have enabled the simultaneous measurement of thousands of gene expressions, giving rise to a characteristic of gene expression data known as high dimensionality. Because of this high dimensionality, learning models that analyze gene expression data are prone to overfitting, and dimension reduction or feature selection techniques are commonly used as a preprocessing step to address the issue. In particular, gene selection methods can remove irrelevant and redundant genes and identify important genes in the preprocessing step. Various gene selection methods have been developed in the context of machine learning so far. In this paper, we intensively review recent work on gene selection methods that use machine learning approaches. In addition, the underlying difficulties with current gene selection methods and future research directions are discussed.

Why Should I Ban You! : X-FDS (Explainable FDS) Model Based on Online Game Payment Log (X-FDS : 게임 결제 로그 기반 XAI적용 이상 거래탐지 모델 연구)

  • Lee, Young Hun;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.1 / pp.25-38 / 2022
  • With the diversification of payment methods and games, related financial incidents are causing serious problems for users and game companies. Recently, game companies have introduced a Fraud Detection System (FDS) for game payment systems to prevent financial incidents. However, such an FDS is ineffective because it requires constant changes to its detection patterns, and it cannot provide substantial evidence for its judgments. In this paper, we analyze abnormal transactions in the payment log data of real game companies to generate related features. An autoencoder, one of the unsupervised learning models, was used to build a model for detecting abnormal transactions, which achieved over 85% accuracy. Using X-FDS (Explainable FDS) with XAI-SHAP, we found that the variables with the highest explanatory power for anomaly detection were the transaction amount, the transaction medium, and the age of the user. Based on X-FDS, an improved detection model with an accuracy of 94% was finally derived by fine-tuning the importance of features that adversely affect the proposed model.

Deep Learning-based Depth Map Estimation: A Review

  • Abdullah, Jan;Safran, Khan;Suyoung, Seo
    • Korean Journal of Remote Sensing / v.39 no.1 / pp.1-21 / 2023
  • In this technologically advanced era, we are surrounded by smartphones, computers, and cameras, which help us store visual information in 2D image planes. However, such images lack 3D spatial information about the scene, which is very useful for scientists, surveyors, engineers, and even robots. To tackle this problem, depth maps are generated for the respective image planes. A depth map, or depth image, is a single image that carries three-dimensional information, i.e., xyz coordinates, where z is the object's distance from the camera. Depth estimation is a fundamental task for many applications, including augmented reality, object tracking, segmentation, scene reconstruction, distance measurement, autonomous navigation, and autonomous driving. Much work has been done on calculating depth maps. We review the status of depth map estimation using different techniques from several papers, study areas, and models applied over the last 20 years. We survey depth-mapping techniques based on both traditional approaches and newly developed deep learning methods. The primary purpose of this study is to present a detailed review of state-of-the-art traditional depth mapping techniques and recent deep learning methodologies. The study covers the critical points of each method from different perspectives, such as datasets, procedures performed, types of algorithms, loss functions, and well-known evaluation metrics. The paper also discusses the subdomains of each method, such as supervised, unsupervised, and semi-supervised methods, and elaborates on the challenges of the different methods. In conclusion, we discuss new ideas for future research on depth maps.
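
The abstract refers to well-known evaluation metrics without naming them. As an assumption about which ones are meant, the sketch below computes three metrics that are commonly reported for depth estimation: absolute relative error, root mean squared error, and the threshold accuracy δ < 1.25. The predicted and ground-truth depth values are hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common depth-estimation metrics over valid pixels (both inputs must be positive)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    abs_rel = np.mean(np.abs(pred - gt) / gt)        # mean absolute relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))        # root mean squared error
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)                   # share of pixels within a 1.25x ratio
    return {"abs_rel": float(abs_rel), "rmse": float(rmse), "delta1": float(delta1)}

# Hypothetical depths in metres for three pixels
pred = [1.0, 2.0, 4.0]
gt = [1.0, 2.2, 2.0]

metrics = depth_metrics(pred, gt)
print(metrics)
```

Lower is better for the two error metrics, while higher is better for the threshold accuracy; reporting them together shows both average error and how many pixels are roughly correct.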

Development of Security Anomaly Detection Algorithms using Machine Learning (기계 학습을 활용한 보안 이상징후 식별 알고리즘 개발)

  • Hwangbo, Hyunwoo;Kim, Jae Kyung
    • The Journal of Society for e-Business Studies / v.27 no.1 / pp.1-13 / 2022
  • With the development of network technologies, security that protects organizational resources from internal and external intrusions and threats is becoming more important. Therefore, in recent years, anomaly detection algorithms that detect and prevent security threats across various security log events have been actively studied. Security anomaly detection algorithms, which were previously based on rules or statistical learning, are gradually evolving toward modeling based on machine learning and deep learning. In this study, after applying various machine learning analysis methodologies, we propose a deep-autoencoder model that transforms the LSTM-autoencoder as the optimal algorithm for detecting insider threats in advance. This study has academic significance in that it improved the possibility of adaptive security by developing an anomaly detection algorithm based on unsupervised learning, and reduced the false positive rate compared to the existing algorithm through supervised true positive labeling.