• Title/Summary/Keyword: Random Walk (랜덤 워크)


A History-based Scheduler for Dynamic Load Balancing on Distributed VOD Server Environments (분산 VOD 서버 환경에서 히스토리 기반의 동적 부하분산 스케줄러)

  • Moon, Jongbae
    • Proceedings of the Korea Information Processing Society Conference / 2010.04a / pp.210-213 / 2010
  • Recently, growing user demand for multimedia has driven the development of VOD (Video-on-Demand) services. VOD is used in many fields, including entertainment, distance education, advertising, and information delivery. VOD services require heavy disk I/O and network I/O and, compared with conventional web server systems, must sustain service over long periods. They also demand large network and disk bandwidth and are sensitive to service QoS: as user response time grows, the cancellation rate of user requests rises, so the increase in unsatisfactory service only adds network load. In a VOD service environment, whose load pattern differs from that of conventional web servers, distributing the load evenly to raise service QoS is therefore very important. This paper proposes a hierarchical distributed VOD system model, together with a scheduler based on the history of user request patterns and a genetic algorithm, for efficient load distribution in a distributed VOD system environment. In the proposed hierarchical model, servers are distributed by region and a control server installed in each region manages the VOD servers in that region. A history-based genetic algorithm distributes user requests within a regional server group: the history information is applied to the genetic algorithm's fitness function, from which the genetic algorithm and genetic operators for the VOD system are implemented. The proposed load balancing algorithm can predict the load imposed by user requests more accurately and distribute it accordingly. To test the performance of the load balancing algorithm in the proposed hierarchical distributed VOD system, an OPNET-based simulator is implemented, and the algorithm is evaluated through comparison experiments against round-robin and random schemes. The comparison experiments show that the proposed algorithm provides more stable QoS.
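As a rough illustration of the genetic-algorithm idea in this entry, the sketch below evolves an assignment of requests to servers so that per-server load is balanced. The server count, per-request load weights, fitness function, and GA settings are all illustrative assumptions, not the scheduler actually proposed in the paper:

```python
import random
import statistics

random.seed(42)

NUM_SERVERS = 4
# Hypothetical per-request load weights, as if estimated from request history.
REQUEST_LOADS = [random.uniform(1.0, 5.0) for _ in range(20)]

def fitness(assignment):
    """Lower is better: spread (population std dev) of per-server load."""
    loads = [0.0] * NUM_SERVERS
    for load, server in zip(REQUEST_LOADS, assignment):
        loads[server] += load
    return statistics.pstdev(loads)

def evolve(pop_size=30, generations=60, mutation_rate=0.1):
    pop = [[random.randrange(NUM_SERVERS) for _ in REQUEST_LOADS]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]              # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(a))        # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(len(child)):              # per-gene mutation
                if random.random() < mutation_rate:
                    child[i] = random.randrange(NUM_SERVERS)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
```

A history-driven version would replace the fixed `REQUEST_LOADS` with load estimates learned from past request patterns.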

Development of a Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies are being introduced for diabetes prediction, but these techniques require large amounts of data for good performance, and their learning cost is high because of complex data models. In this study, we examine the claim that a DNN using the Pima dataset with k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, and KNN, together with various ensemble techniques, were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system achieved its best results with the XGBoost classifier combined with the ADASYN oversampling method: an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.
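The three metrics reported in this entry can be computed directly; below is a minimal stdlib sketch of accuracy, F1, and AUC on toy labels and scores, not the paper's pipeline or data:

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def auc(y_true, scores):
    """AUC as the probability a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and classifier scores, purely for illustration.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

Note that F1 uses hard predictions at a threshold while AUC ranks the raw scores, which is why a model can have similar F1 and AUC values, as reported above.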

Two-Stage Neural Network Optimization for Robust Solar Photovoltaic Forecasting (강건한 태양광 발전량 예측을 위한 2단계 신경망 최적화)

  • Jinyeong Oh;Dayeong So;Jihoon Moon
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.31-34 / 2024
  • Solar energy has drawn much attention as a key means of achieving carbon neutrality. Because photovoltaic (PV) generation can vary greatly with environmental factors, accurate generation forecasting is fundamental to power network stability and efficient energy management. Neural networks, a representative AI technique, can effectively learn unstable environmental variables and their complex interactions, and have achieved strong performance in PV generation forecasting. However, optimizing a network's architecture and hyperparameters is a complex and time-consuming task, which limits real industrial adoption in the energy sector. This paper proposes a PV generation forecasting method based on two-stage neural network optimization. First, the PV generation dataset is split into training and test sets. On the training set, several neural network models with different numbers of hidden layers are built, and Optuna is applied to each model to select optimal hyperparameter values. Next, the optimized model for each hidden-layer configuration produces generation estimates on the training set via 5-fold cross-validation and predictions on the test set. Finally, a stacking ensemble is adopted: a random forest, which performs well even with default hyperparameter values, is trained on the estimates and takes the test-set predictions as input to produce the final PV generation forecast. Experiments on the Incheon region show that the proposed approach is not only simple to model but also outperforms several neural network models, and it is expected to contribute to the domestic energy industry.
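The out-of-fold estimation step at the heart of this entry's stacking design can be sketched as follows. The data and base learners below are deliberately trivial stand-ins (the paper uses Optuna-optimized neural networks and a random forest meta-learner):

```python
import random
random.seed(0)

# Toy regression data standing in for hourly PV output (hypothetical values).
X = [[random.uniform(0, 1)] for _ in range(50)]
y = [2.0 * x[0] + random.gauss(0, 0.1) for x in X]

class MeanModel:
    """Trivial base learner: always predicts the training-set mean."""
    def fit(self, X, y):
        self.mean = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean] * len(X)

class SlopeModel:
    """Trivial base learner: least-squares slope through the origin."""
    def fit(self, X, y):
        self.w = (sum(a[0] * b for a, b in zip(X, y))
                  / sum(a[0] ** 2 for a in X))
        return self
    def predict(self, X):
        return [self.w * a[0] for a in X]

def oof_predictions(model_cls, X, y, k=5):
    """5-fold out-of-fold estimates: each point is predicted by a model
    that never saw it during training."""
    preds = [0.0] * len(X)
    for f in range(k):
        fold = list(range(f, len(X), k))
        train = [i for i in range(len(X)) if i not in fold]
        model = model_cls().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(fold, model.predict([X[i] for i in fold])):
            preds[i] = p
    return preds

def mse(pred, truth):
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

oof_a = oof_predictions(MeanModel, X, y)
oof_b = oof_predictions(SlopeModel, X, y)
# The paper trains a random forest meta-learner on these estimates;
# a plain average is used here as a stand-in for that step.
stacked = [(a + b) / 2 for a, b in zip(oof_a, oof_b)]
mse_a, mse_b = mse(oof_a, y), mse(oof_b, y)
```

Using out-of-fold estimates (rather than in-sample fits) as the meta-learner's training input is what keeps the stacking stage from overfitting to the base models.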


NEAR-INFRARED VARIABILITY OF OPTICALLY BRIGHT TYPE 1 AGN (가시광에서 밝은 1형 활동은하핵의 근적외선 변광)

  • JEON, WOOYEOL;SHIM, HYUNJIN;KIM, MINJIN
    • Publications of The Korean Astronomical Society / v.36 no.3 / pp.47-63 / 2021
  • Variability is one of the major characteristics of Active Galactic Nuclei (AGN), and it is used for understanding the energy generation mechanism in the center of AGN and related physical phenomena. It is known that there exists a time lag between AGN light curves observed simultaneously at different wavelengths, which can be used as a tool to estimate the size of the region that produces the radiation. In this paper, we present the long-term near-infrared variability of optically bright type 1 AGN using Wide-field Infrared Survey Explorer (WISE) data. From the Milliquas catalogue v6.4, 73 type 1 QSOs/AGN and 140 quasar candidates were selected that are brighter than 18 mag in the optical and located within 5 degrees of the ecliptic poles. Light curves in the W1 band (3.4 ㎛) and W2 band (4.6 ㎛) over the period 2010-2019 were constructed for these objects by extracting multi-epoch photometry from the WISE and NEOWISE all-sky survey databases. Variability was analyzed based on the excess variance and the probability Pvar. Applying both criteria, the numbers of variable objects are 19 (i.e., 26%) for confirmed AGN and 12 (i.e., 9%) for AGN candidates. The characteristic time scale of the variability (τ) and the variability amplitude (σ) were derived by fitting the DRW model to the W1 and W2 light curves. No significant correlation is found between the W1/W2 magnitudes and the derived variability parameters. Based on the subsample identified in the X-ray source catalog, there is little correlation between X-ray luminosity and the variability parameters. We also found four AGN with changing W1-W2 color.
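The DRW (damped random walk) model fitted in this entry can be illustrated by simulating a light curve with its exact discrete update; the cadence, timescale, and amplitude below are illustrative choices, not fitted values from the paper:

```python
import math
import random
import statistics

random.seed(1)

def simulate_drw(n_steps, dt, tau, sigma_inf, mean=0.0):
    """Exact discrete update of a damped random walk (an OU process):
    each step decays toward the mean and adds Gaussian scatter so the
    process has asymptotic standard deviation sigma_inf."""
    decay = math.exp(-dt / tau)
    step_sd = sigma_inf * math.sqrt(1.0 - decay ** 2)
    x = [mean]
    for _ in range(n_steps - 1):
        x.append(mean + (x[-1] - mean) * decay + random.gauss(0.0, step_sd))
    return x

# Illustrative parameters: daily cadence, tau = 100 d, amplitude 0.2 mag.
lc = simulate_drw(n_steps=2000, dt=1.0, tau=100.0, sigma_inf=0.2)
lc_sd = statistics.pstdev(lc)
```

Fitting reverses this: given an observed light curve, τ and σ are the DRW parameters that make simulated curves like this one statistically most consistent with the data.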

The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market predictions. However, these statistical methods have not produced superior performance. In recent years, machine learning techniques have been widely used in stock market prediction, including artificial neural networks, SVM, and genetic algorithms. In particular, a case-based reasoning method known as k-nearest neighbor is also widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs, and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it tends to search for a fixed number of neighbors in the observation space and always selects the same number of neighbors rather than the best similar neighbors for the target case, so it may have to take more cases into account even when fewer cases are applicable. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for various target cases, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability with k-nearest neighbor, and compares the predictability of k-nearest neighbor with that of the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted by dividing the learning dataset into two types. For the prediction of the next day's closing price, we used four variables: opening value, daily high, daily low, and daily close.
In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used. The test data cover January 1, 2018 to August 31, 2018 in both experiments. We compared the performance of k-NN with the random walk model using the two learning datasets. With the smaller learning dataset (the second experiment), the mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN. With the larger learning dataset (the first experiment), the MAPE for the random walk model was again 1.3497, while k-NN improved to 1.2928. These results show that prediction power is higher when more learning data are used. This paper thus shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets, but does not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. Also, to produce better results, it is recommended that k-nearest neighbor find its neighbors using a second-step filtering method that considers fundamental economic variables as well as a sufficient amount of learning data.
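The comparison in this entry can be sketched minimally. The synthetic bars below stand in for the Samsung Electronics data, and the distance measure and neighbor handling are simple assumptions; the paper's exact feature handling may differ:

```python
import random
random.seed(7)

# Synthetic daily bars [open, high, low, close] standing in for real prices.
prices, close = [], 100.0
for _ in range(300):
    o = close
    c = o * (1 + random.gauss(0, 0.01))
    prices.append([o, max(o, c) * 1.005, min(o, c) * 0.995, c])
    close = c

def knn_predict(history, query, k=5):
    """Predict the next close as the mean next-day close of the k nearest bars."""
    scored = []
    for i in range(len(history) - 1):            # each bar needs a next day
        dist = sum((a - b) ** 2 for a, b in zip(history[i], query))
        scored.append((dist, history[i + 1][3]))
    scored.sort(key=lambda t: t[0])
    return sum(c for _, c in scored[:k]) / k

def mape(actual, pred):
    return 100 * sum(abs(a - p) / a for a, p in zip(actual, pred)) / len(actual)

actual, knn_preds, rw_preds = [], [], []
for t in range(250, len(prices) - 1):
    knn_preds.append(knn_predict(prices[:t], prices[t]))
    rw_preds.append(prices[t][3])                # random walk: tomorrow = today
    actual.append(prices[t + 1][3])

knn_mape, rw_mape = mape(actual, knn_preds), mape(actual, rw_preds)
```

Note that the random walk baseline ignores the training window entirely (tomorrow's forecast is just today's close), which is why its MAPE is identical across the two experiments in the abstract.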

A Study on the Optimal Location Selection for Hydrogen Refueling Stations on a Highway using Machine Learning (머신러닝 기반 고속도로 내 수소충전소 최적입지 선정 연구)

  • Jo, Jae-Hyeok;Kim, Sungsu
    • Journal of Cadastre & Land InformatiX / v.51 no.2 / pp.83-106 / 2021
  • Interest in clean fuels has been soaring because of environmental problems such as air pollution and global warming. Unlike fossil fuels, hydrogen has attracted public attention as an eco-friendly energy source because it releases only water when burned. Various policy efforts have been made to establish a hydrogen-based transportation network. Stations that supply hydrogen to hydrogen-powered trucks are essential for building a hydrogen-based logistics system, so determining the optimal location of refueling stations is an important topic in the network. Although previous studies have mostly applied optimization-based methodologies, this paper adopts machine learning to review the spatial attributes of candidate locations when selecting the optimal positions of refueling stations. Machine learning shows outstanding performance in various fields, but it has not yet been applied to the optimal location selection problem for hydrogen refueling stations. Therefore, several machine learning models are applied and compared in performance, using variables relevant to the locations of highway rest areas and random points on a highway. The results show that the Random Forest model is superior in terms of F1-score. We believe this work can be a starting point for using machine learning based methods as a preliminary review of optimal station sites before optimization is applied.

Trip Assignment for Transport Card Based Seoul Metropolitan Subway Using Monte Carlo Method (Monte Carlo 기법을 이용한 교통카드기반 수도권 지하철 통행배정)

  • Meeyoung Lee;Doohee Nam
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.2 / pp.64-79 / 2023
  • This study reviewed the process of applying the Monte Carlo simulation technique to the traffic assignment problem of metropolitan subways. The analysis assumed a normal distribution, with the travel time information of the inter-station samples forming the basis of the probit model. From this, the mean and standard deviation are calculated separately for the traffic between each station pair. A plan was proposed to apply the simulation with weights for the in-vehicle time of individual links and the walking and headway intervals of transfers. For long-distance traffic with 50 or fewer samples, a method of analyzing the characteristics of similar traffic was evaluated. The research results were reviewed in two directions by applying them to the Seoul Metropolitan Subway network. The travel time between single stations on the Seolleung-Seongsu route was verified by applying random sampling to the in-vehicle time and transfer time. The assumption of a normal distribution was accepted for inter-station flows with samples of more than 50 across the entire Seoul Metropolitan Subway. For long-distance traffic with fewer than 50 samples, the minimum distance between stations was 122 km. Therefore, it was judged that sample variance homogeneity was achieved and that the inter-station mean and standard deviation of the transport card data for stations at this distance could be applied.
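The random-sampling step can be sketched as follows: draw each link's in-vehicle time and the transfer time from normal distributions and accumulate. The segment means and standard deviations below are made-up minutes, not measured values for the Seolleung-Seongsu route:

```python
import random
import statistics

random.seed(3)

# Hypothetical (mean, sd) pairs in minutes; not measured route values.
IN_VEHICLE = [(4.0, 0.5), (3.0, 0.4), (5.0, 0.6)]   # per-link ride times
TRANSFER = (6.0, 2.0)                               # transfer walk + headway wait

def sample_trip():
    """One Monte Carlo draw of total station-to-station travel time."""
    total = sum(random.gauss(mu, sd) for mu, sd in IN_VEHICLE)
    return total + random.gauss(*TRANSFER)

samples = [sample_trip() for _ in range(10_000)]
est_mean = statistics.fmean(samples)
est_sd = statistics.pstdev(samples)
```

With independent normal segments the totals are themselves normal (mean = sum of means, variance = sum of variances), so the simulated mean and standard deviation converge to those closed-form values as the sample count grows.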

5G Network Resource Allocation and Traffic Prediction based on DDPG and Federated Learning (DDPG 및 연합학습 기반 5G 네트워크 자원 할당과 트래픽 예측)

  • Seok-Woo Park;Oh-Sung Lee;In-Ho Ra
    • Smart Media Journal / v.13 no.4 / pp.33-48 / 2024
  • With the advent of 5G, characterized by Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low Latency Communications (URLLC), and Massive Machine Type Communications (mMTC), efficient network management and service provision are becoming increasingly critical. This paper proposes a novel approach to address key challenges of 5G networks, namely ultra-high speed, ultra-low latency, and ultra-reliability, while dynamically optimizing network slicing and resource allocation using machine learning (ML) and deep learning (DL) techniques. The proposed methodology utilizes prediction models for network traffic and resource allocation, and employs Federated Learning (FL) techniques to simultaneously optimize network bandwidth, latency, and enhance privacy and security. Specifically, this paper extensively covers the implementation methods of various algorithms and models such as Random Forest and LSTM, thereby presenting methodologies for the automation and intelligence of 5G network operations. Finally, the performance enhancement effects achievable by applying ML and DL to 5G networks are validated through performance evaluation and analysis, and solutions for network slicing and resource management optimization are proposed for various industrial applications.
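The federated learning component can be illustrated with a minimal FedAvg loop on a one-parameter model. The model, client data, and training settings below are illustrative assumptions, not the paper's DDPG/LSTM setup:

```python
import random
random.seed(5)

def make_client(n=20):
    """Private local data for one client, drawn from the shared trend y = 3x."""
    xs = [random.uniform(0, 1) for _ in range(n)]
    return [(x, 3.0 * x + random.gauss(0, 0.05)) for x in xs]

def local_update(w, data, lr=0.1, epochs=5):
    """Local SGD on a one-parameter linear model y = w * x.
    Raw data never leaves the client; only the weight is returned."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x
    return w

def fedavg(client_datasets, rounds=20):
    """Server loop: broadcast the global weight, collect locally trained
    weights, and average them (unweighted for equal-sized clients)."""
    w_global = 0.0
    for _ in range(rounds):
        local_ws = [local_update(w_global, d) for d in client_datasets]
        w_global = sum(local_ws) / len(local_ws)
    return w_global

clients = [make_client() for _ in range(4)]
w = fedavg(clients)
```

The privacy benefit cited in the abstract comes from this exchange pattern: only model parameters, never raw traffic data, cross the network.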

Analysis of the impact of mathematics education research using explainable AI (설명가능한 인공지능을 활용한 수학교육 연구의 영향력 분석)

  • Oh, Se Jun
    • The Mathematical Education / v.62 no.3 / pp.435-455 / 2023
  • This study primarily focused on the development of an Explainable Artificial Intelligence (XAI) model to discern and analyze papers with significant impact in the field of mathematics education. To achieve this, meta-information from 29 domestic and international mathematics education journals was utilized to construct a comprehensive academic research network in mathematics education. This academic network was built by integrating five sub-networks: 'paper and its citation network', 'paper and author network', 'paper and journal network', 'co-authorship network', and 'author and affiliation network'. The Random Forest machine learning model was employed to evaluate the impact of individual papers within the mathematics education research network, and SHAP, an XAI technique, was used to analyze the reasons behind the AI's assessment of impactful papers. Key features identified through XAI for determining impactful papers in the field of mathematics education included 'paper network PageRank', 'changes in citations per paper', 'total citations', 'changes in the author's h-index', and 'citations per paper of the journal'. It became evident that papers, authors, and journals all play significant roles when evaluating individual papers. When comparing domestic and international mathematics education research, variations in these discernment patterns were observed; notably, the significance of 'co-authorship network PageRank' was emphasized in domestic mathematics education research. The XAI model proposed in this study serves as a tool for determining the impact of papers using AI, providing researchers with strategic direction when writing papers. For instance, expanding the paper network, presenting at academic conferences, and activating the author network through co-authorship were identified as major elements enhancing the impact of a paper.
Based on these findings, researchers can have a clear understanding of how their work is perceived and evaluated in academia and identify the key factors influencing these evaluations. This study offers a novel approach to evaluating the impact of mathematics education papers using an explainable AI model, traditionally a process that consumed significant time and resources. This approach not only presents a new paradigm that can be applied to evaluations in various academic fields beyond mathematics education but also is expected to substantially enhance the efficiency and effectiveness of research activities.
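The 'paper network PageRank' feature named above can be computed with plain power iteration; the toy citation graph below is hypothetical, purely to show the mechanics:

```python
def pagerank(graph, damping=0.85, iters=50):
    """Plain power-iteration PageRank over an adjacency dict."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n, outs in graph.items():
            targets = outs if outs else nodes    # dangling mass spread evenly
            share = damping * rank[n] / len(targets)
            for m in targets:
                new[m] += share
        rank = new
    return rank

# Toy citation graph (hypothetical): an edge points from citing to cited paper.
citations = {"A": ["D"], "B": ["D"], "C": ["D", "A"], "D": [], "E": ["A"]}
ranks = pagerank(citations)
```

Here paper D, cited by three others, ends up with the highest rank; in the study this score becomes one input feature of the Random Forest impact model. The damping factor 0.85 is the conventional choice.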

Increasing Accuracy of Stock Price Pattern Prediction through Data Augmentation for Deep Learning (데이터 증강을 통한 딥러닝 기반 주가 패턴 예측 정확도 향상 방안)

  • Kim, Youngjun;Kim, Yeojeong;Lee, Insun;Lee, Hong Joo
    • The Journal of Bigdata / v.4 no.2 / pp.1-12 / 2019
  • As Artificial Intelligence (AI) technology develops, it is applied to various fields such as image, voice, and text, and has shown fine results in certain areas. Researchers have also tried to predict the stock market using artificial intelligence. Predicting the stock market is known to be a difficult problem, since the stock market is affected by various factors such as the economy and politics. In the field of AI, there are attempts to predict the ups and downs of stock prices by studying stock price patterns with various machine learning techniques. This study suggests a way of predicting stock price patterns based on the Convolutional Neural Network (CNN). A CNN classifies images by extracting features through convolutional layers, so this study classifies candlestick images made from stock data in order to predict patterns. The study has two objectives. The first, referred to as Case 1, is to predict patterns from images made from the same day's stock price data. The second, referred to as Case 2, is to predict the next day's stock price patterns from images produced from daily stock price data. In Case 1, data augmentation methods - random modification and Gaussian noise - are applied to generate more training data, and the generated images are used to fit the model. Given that deep learning requires a large amount of data, this study suggests a method of data augmentation for candlestick images, and compares the accuracies of images with Gaussian noise across different classification problems. All data in this study were collected through the OpenAPI provided by DaiShin Securities. Case 1 has five labels depending on the pattern: up with up closing, up with down closing, down with up closing, down with down closing, and staying.
The images in Case 1 are created by removing the last candle (-1 candle), the last two candles (-2 candles), or the last three candles (-3 candles) from 60-minute, 30-minute, 10-minute, and 5-minute candle charts. In a 60-minute candle chart, one candle in the image holds 60 minutes of information: an open price, high price, low price, and close price. Case 2 has two labels, up and down, and its images were generated from 60-minute, 30-minute, 10-minute, and 5-minute candle charts without removing any candles. Considering the nature of stock data, moving the candles in the images is suggested instead of existing data augmentation techniques. How much the candles are moved is defined as the modified value. The average difference in closing prices between candles was 0.0029, so 0.003, 0.002, 0.001, and 0.00025 are used as modified values. The number of images was doubled after data augmentation. For the Gaussian noise, the mean was 0 and the variance was 0.01. For both Case 1 and Case 2, the model is based on VGG-Net16, which has 16 layers. As a result, the 10-minute -1 candle setting showed the best accuracy among the 60-minute, 30-minute, 10-minute, and 5-minute candle charts, so 10-minute images were used for the rest of the Case 1 experiments. The images with three candles removed were selected for data augmentation and for applying Gaussian noise. The 10-minute -3 candle setting resulted in 79.72% accuracy; the images with a modified value of 0.00025 and 100% of candles changed reached 79.92%, and applying Gaussian noise raised the accuracy to 80.98%. According to the outcomes of Case 2, 60-minute candle charts could predict the next day's patterns with 82.60% accuracy. To sum up, this study is expected to contribute to further studies on the prediction of stock price patterns using images, and it provides a possible method of data augmentation for stock data.
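The two augmentation operations described in this entry (shifting candles by a modified value, and adding Gaussian noise with mean 0 and variance 0.01) can be sketched on raw candle values as follows; the candle data and the exact shift mechanics are illustrative assumptions, since the study applies them when rendering images:

```python
import random
random.seed(9)

MODIFIED_VALUES = [0.003, 0.002, 0.001, 0.00025]   # shift sizes from the study

def shift_candles(candles, modified_value, change_ratio=1.0):
    """Randomly shift each candle up or down by modified_value (relative);
    change_ratio controls what fraction of candles is changed."""
    out = []
    for candle in candles:
        if random.random() < change_ratio:
            delta = random.choice([-1.0, 1.0]) * modified_value
            out.append([v * (1.0 + delta) for v in candle])
        else:
            out.append(list(candle))
    return out

def add_gaussian_noise(candles, mean=0.0, var=0.01):
    """Add per-value Gaussian noise (the study uses mean 0, variance 0.01)."""
    sd = var ** 0.5
    return [[v + random.gauss(mean, sd) for v in candle] for candle in candles]

# Two hypothetical [open, high, low, close] candles.
chart = [[100.0, 101.0, 99.5, 100.5], [100.5, 102.0, 100.2, 101.7]]
augmented = shift_candles(chart, MODIFIED_VALUES[-1]) + add_gaussian_noise(chart)
```

Concatenating the modified copies with the originals is what doubles the number of training images, as the abstract notes.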
