• Title/Summary/Keyword: machine-learning


IBN-based: AI-driven Multi-Domain e2e Network Orchestration Approach (IBN 기반: AI 기반 멀티 도메인 네트워크 슬라이싱 접근법)

  • Khan, Talha Ahmed;Muhammad, Afaq;Abbas, Khizar;Song, Wang-Cheol
    • KNOM Review / v.23 no.2 / pp.29-41 / 2020
  • Networks are growing faster than ever, which creates multi-domain complexity. The diversity, variety, and dynamic nature of network traffic and services require enhanced orchestration and management approaches, yet the many standard orchestrators and network operators involved increase the complexity of handling E2E slice orchestration. In addition, E2E slice orchestration spans multiple domains, including the access, edge, transport, and core networks, each with its own challenges. Handling multi-domain, multi-platform, and multi-operator networking environments manually therefore requires specialized experts, and a manual approach cannot handle dynamic changes in the network at runtime. Manual handling of such complexity is also error-prone and tedious. Hence, this work proposes an automated and abstracted solution for handling E2E slice orchestration using an intent-based approach. It abstracts the domains from the operators and enables them to express their orchestration intentions as high-level intents. It also actively monitors the orchestrated resources and, based on current monitoring statistics, uses machine learning to predict future resource utilization for updating the system state. The result is a closed-loop, automated E2E network orchestration and management system.

Improved Estimation of Hourly Surface Ozone Concentrations using Stacking Ensemble-based Spatial Interpolation (스태킹 앙상블 모델을 이용한 시간별 지상 오존 공간내삽 정확도 향상)

  • KIM, Ye-Jin;KANG, Eun-Jin;CHO, Dong-Jin;LEE, Si-Woo;IM, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.3 / pp.74-99 / 2022
  • Surface ozone is produced by photochemical reactions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) emitted from vehicles and industrial sites, and it adversely affects vegetation and the human body. In South Korea, ozone is monitored in real time at stations (i.e., point measurements), but it is difficult to monitor and analyze its continuous spatial distribution. In this study, hourly surface ozone concentrations were interpolated to a spatial resolution of 1.5 km using the stacking ensemble technique, followed by 5-fold cross-validation. The base models for the stacking ensemble were cokriging, multi-linear regression (MLR), random forest (RF), and support vector regression (SVR), while MLR was used as the meta model, taking all base model results as additional input variables. The stacking ensemble model performed better than the individual base models, with an average R of 0.76 and RMSE of 0.0065 ppm over the 2020 study period. Compared with the base models, the surface ozone concentration distribution generated by the stacking ensemble model had a wider range and a spatial pattern similar to those of the terrain and urbanization variables. The proposed model is expected not only to produce the hourly spatial distribution of ozone but also to be highly applicable to calculating daily maximum 8-hour ozone concentrations.
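
As an illustration of the last point, the following minimal sketch derives a daily maximum 8-hour mean from 24 hourly values at one location. It is not taken from the paper: the function name `daily_max_8h_mean` and the sample values are hypothetical, and it assumes the simplest definition (the largest running 8-hour mean that lies entirely within one day, with no missing hours). Regulatory definitions differ in how they treat windows that cross midnight and incomplete data.

```python
from typing import List


def daily_max_8h_mean(hourly_ppm: List[float], window: int = 8) -> float:
    """Return the largest running `window`-hour mean within one day.

    Assumes `hourly_ppm` holds consecutive hourly values with no gaps.
    """
    if len(hourly_ppm) < window:
        raise ValueError("need at least `window` hourly values")

    # Start with the sum of the first window, then slide one hour at a time.
    running_sum = sum(hourly_ppm[:window])
    best_sum = running_sum
    for i in range(window, len(hourly_ppm)):
        running_sum += hourly_ppm[i] - hourly_ppm[i - window]
        best_sum = max(best_sum, running_sum)
    return best_sum / window


# Hypothetical day: low overnight values with an 8-hour afternoon peak.
hourly = [0.02] * 10 + [0.06] * 8 + [0.02] * 6
print(round(daily_max_8h_mean(hourly), 4))
```

Applied per grid cell, the same calculation would turn a stack of 24 hourly maps into one daily map.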

Comparative Study of Anomaly Detection Accuracy of Intrusion Detection Systems Based on Various Data Preprocessing Techniques (다양한 데이터 전처리 기법 기반 침입탐지 시스템의 이상탐지 정확도 비교 연구)

  • Park, Kyungseon;Kim, Kangseok
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.449-456 / 2021
  • An intrusion detection system is a technology that detects abnormal behavior that violates security and prevents attacks on a system. Existing intrusion detection systems have been designed using statistical analysis or anomaly detection techniques for traffic patterns, but because of rapidly advancing technologies, modern systems generate a variety of traffic that differs from that of earlier systems, so the existing methods have limitations. To overcome these limitations, intrusion detection methods that apply various machine learning techniques are being actively studied. This study compared data preprocessing techniques that can improve the accuracy of anomaly detection, using NGIDS-DS (Next Generation IDS Database), a dataset generated by simulation equipment for traffic in various network environments. Padding and sliding windows were used for data preprocessing, and oversampling with an Adversarial Auto-Encoder (AAE) was applied to address the imbalance between the proportions of normal and abnormal data. In addition, an improvement in detection accuracy was confirmed when Skip-gram, one of the Word2Vec techniques, was used to extract feature vectors from the preprocessed sequence data. PCA-SVM and GRU were used as the models for the comparative experiments, and the results showed better performance when the sliding window, Skip-gram, AAE, and GRU were applied.
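
The two preprocessing steps named in this abstract, padding and the sliding window, can be sketched as follows. This is a generic illustration, not the authors' code: the function names, the pad value of 0, and the integer-encoded event sequence are assumptions made for the example.

```python
from typing import List, Sequence


def pad_sequence(seq: Sequence[int], length: int, pad_value: int = 0) -> List[int]:
    """Right-pad (or truncate) a sequence to exactly `length` items."""
    items = list(seq[:length])
    return items + [pad_value] * (length - len(items))


def sliding_windows(seq: Sequence[int], size: int, stride: int = 1) -> List[List[int]]:
    """Split a sequence into fixed-size windows that overlap when stride < size.

    A sequence no longer than `size` is padded into a single window.
    Trailing items that do not fill a complete window are dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(seq) <= size:
        return [pad_sequence(seq, size)]
    return [list(seq[i:i + size]) for i in range(0, len(seq) - size + 1, stride)]


# Hypothetical integer-encoded event sequence.
events = [1, 2, 3, 4, 5]
print(sliding_windows(events, size=3))   # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(sliding_windows([7, 8], size=3))   # [[7, 8, 0]]
```

Each window can then be treated as one fixed-length sample for an embedding step or a sequence model.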

Study on Zero-shot based Quality Estimation (Zero-Shot 기반 기계번역 품질 예측 연구)

  • Eo, Sugyeong;Park, Chanjun;Seo, Jaehyung;Moon, Hyeonseok;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.35-43 / 2021
  • Recently, there has been growing interest in zero-shot cross-lingual transfer, which leverages cross-lingual language models (CLLMs) to perform downstream tasks in a language for which the task was not trained. In this paper, we point out the limitations of the data-centric aspect of quality estimation (QE) and perform zero-shot cross-lingual transfer even in environments where it is difficult to construct QE data. Few studies have dealt with zero-shot transfer in QE. After fine-tuning on the English-German QE dataset, we perform zero-shot transfer leveraging CLLMs and conduct a comparative analysis of various CLLMs. We also perform zero-shot transfer on language pairs with differently sized resources and analyze the results based on the linguistic characteristics of each language. Experimental results showed the highest performance with multilingual BART and multilingual BERT, and we enabled QE to be performed even when no QE training was done for a specific language pair.

A study of artificial neural network for in-situ air temperature mapping using satellite data in urban area (위성 정보를 활용한 도심 지역 기온자료 지도화를 위한 인공신경망 적용 연구)

  • Jeon, Hyunho;Jeong, Jaehwan;Cho, Seongkeun;Choi, Minha
    • Journal of Korea Water Resources Association / v.55 no.11 / pp.855-863 / 2022
  • In this study, an Artificial Neural Network (ANN) was used to map air temperature in Seoul. MODerate resolution Imaging Spectroradiometer (MODIS) data were used as auxiliary data for mapping. To optimize the ANN network topology, scatterplots and statistical analysis were used, and the input data were classified and combined according to their correlation with air temperature: surface temperature, Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), time (satellite observation time, day of year), location (latitude, longitude), and data quality (cloudiness). When machine learning was conducted only with data highly correlated with air temperature, the average values of the correlation coefficient (r) and Root Mean Squared Error (RMSE) were 0.967 and 2.708℃. The performance improved as other data were added, and when all data were used the average values of r and RMSE were 0.9840 and 1.883℃, which was the best performance. In the Seoul air temperature map produced by the ANN model, the air temperature was appropriately calculated for each pixel's topographic characteristics, and it will be possible to analyze the air temperature distribution at the city and national levels by expanding the research area and diversifying the satellite data.
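
The two scores reported here, the correlation coefficient (r) and the RMSE, can be computed from paired observed and predicted temperatures as in the sketch below. The function names and sample values are hypothetical, and the abstract does not say how the scores were averaged across runs or stations, so this shows only the standard per-sample definitions.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient; raises ZeroDivisionError if either input is constant."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical temperatures in degrees Celsius: predictions are biased by +1.
obs = [10.0, 12.0, 14.0, 16.0]
pred = [11.0, 13.0, 15.0, 17.0]
print(pearson_r(obs, pred), rmse(obs, pred))
```

The example also shows why both scores are reported: a constant bias leaves r at its maximum while the RMSE still registers the error.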

Effective Capacity Planning of Capital Market IT System: Reflecting Sentiment Index (자본시장 IT시스템 효율적 용량계획 모델: 심리지수 활용을 중심으로)

  • Lee, Kukhyung;Kim, Miyea;Park, Jaeyoung;Kim, Beomsoo
    • Knowledge Management Research / v.23 no.1 / pp.89-109 / 2022
  • Due to COVID-19 and the soaring participation of individual investors, large-scale transactions exceeding system capacity limits have been reported frequently in the capital market. Capital market IT systems, for which system failures have a critical impact, encountered unexpectedly large transaction volumes in 2020, resulting in a sharp increase in system failures. Although many companies maintained large-scale system capacity planning policies, the recent transaction influx suggests that a new approach to capacity planning is required. Therefore, this study developed capital market IT system capacity planning models using machine learning techniques and analyzed their performance. In addition, the performance of the best proposed model was improved by using a sentiment index that can promptly reflect investor behavior. The model uses empirical data that include the COVID-19 period, and its performance and stability are high enough for practical use. In terms of practical significance, this study not only maximizes a company's cost-efficiency but also presents optimal parameters that take into account the practical constraints involved in changing the system. Additionally, by showing that the sentiment index can be used as a major variable in system capacity planning, it indicates that the sentiment index can be actively used for various other forecasting needs.

Applicability Analysis on Estimation of Spectral Induced Polarization Parameters Based on Multi-objective Optimization (다중목적함수 최적화에 기초한 광대역 유도분극 변수 예측 적용성 분석)

  • Kim, Bitnarae;Jeong, Ju Yeon;Min, Baehyun;Nam, Myung Jin
    • Geophysics and Geophysical Exploration / v.25 no.3 / pp.99-108 / 2022
  • Among induced polarization (IP) methods, spectral IP (SIP) uses alternating current as a transmission source to measure the amplitude and phase of complex electrical resistivity at each source frequency, both of which disperse with respect to source frequency. This frequency dependence, which can be explained by a relaxation model such as the Cole-Cole model or equivalent models, is analyzed to estimate SIP parameters from dispersion curves of complex resistivity using multi-objective optimization (MOO). The estimation uses a genetic algorithm to optimize two objective functions that minimize the data misfits of amplitude and phase based on the Cole-Cole model, which is the most widely used model for explaining IP relaxation effects. The MOO-based estimation properly recovered the Cole-Cole model parameters for synthetic examples but hardly fitted the real laboratory measurements, which have relatively small phase values (less than about 10 mrad). Because the data misfits of amplitude and phase, which are used as parameters of the MOO method, differ in scale, it is necessary to employ other methods, such as machine learning, that can deal with these discrepancies when estimating SIP parameters from dispersion curves of complex resistivity.
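
To make the two objective functions concrete, the sketch below evaluates a Cole-Cole complex resistivity and the separate amplitude and phase misfits against observed data. It rests on assumptions the abstract does not state: the widely used Pelton-type form rho(w) = rho0 * [1 - m * (1 - 1 / (1 + (i*w*tau)^c))], and a root-mean-square misfit. The function names and parameter values are hypothetical, and the optimizer itself is not shown.

```python
import cmath
import math
from typing import Sequence, Tuple


def cole_cole_resistivity(omega: float, rho0: float, m: float, tau: float, c: float) -> complex:
    """Complex resistivity of the (assumed) Pelton-type Cole-Cole model.

    omega: angular frequency (rad/s), rho0: DC resistivity (ohm-m),
    m: chargeability, tau: time constant (s), c: frequency exponent.
    """
    return rho0 * (1.0 - m * (1.0 - 1.0 / (1.0 + (1j * omega * tau) ** c)))


def amplitude_phase_misfits(
    omegas: Sequence[float],
    obs_amplitude: Sequence[float],
    obs_phase: Sequence[float],
    rho0: float, m: float, tau: float, c: float,
) -> Tuple[float, float]:
    """Return (amplitude RMS misfit in ohm-m, phase RMS misfit in rad)."""
    amp_sq = 0.0
    phase_sq = 0.0
    for w, amp, phase in zip(omegas, obs_amplitude, obs_phase):
        rho = cole_cole_resistivity(w, rho0, m, tau, c)
        amp_sq += (abs(rho) - amp) ** 2
        phase_sq += (cmath.phase(rho) - phase) ** 2
    n = len(omegas)
    return math.sqrt(amp_sq / n), math.sqrt(phase_sq / n)


# Hypothetical parameters: omega * tau = 1 and c = 0.5.
rho = cole_cole_resistivity(omega=1.0, rho0=100.0, m=0.2, tau=1.0, c=0.5)
print(rho, cmath.phase(rho) * 1000.0)  # about (90 - 4.14j) ohm-m, about -46 mrad
```

A multi-objective optimizer would search over (rho0, m, tau, c) to minimize both returned values at once. Because one is in ohm-m and the other in radians, their magnitudes can differ by orders of magnitude, which is the scale discrepancy the abstract points to.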

Quantitative Estimation Method for ML Model Performance Change, Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An;Hoon-Suk Lee;Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model once it is in the business service stage. As a result, the operational department cannot manage model performance effectively. Academically, various studies have been conducted on concept drift detection methods to determine whether the model status is appropriate. The operational department wants to know the performance of the operating model quantitatively, but concept drift detection can only indicate the state of the model in relation to the data; it cannot estimate the model's performance quantitatively. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision from concept drift statistics. The proposed model induces artificial drift in sample data extracted from the training data, measures the precision on the sample data, creates a dataset of drift and precision, and learns from it. Then, the difference between the actual precision and the predicted precision is compared on the test data to correct the error of the performance prediction model. The proposed PPM was applied to two models that can be used in real business, a loan underwriting model and a credit card fraud detection model, and it was confirmed that the precision was predicted effectively.

Study on Water Quality Predictability through Machine Learning Techniques in Non-point Pollutant Management Area (비점오염원관리지역의 머신러닝 기법을 통한 수질 예측 가능성 연구)

  • Yeong Na Yu;Min Hwan Shin;Dong Hyuk Kum;Kyoung Jae Lim;Jong Gun Kim
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.467-467 / 2023
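  • Because water quality data on non-point pollutants generated by rainfall are insufficient, it is difficult to devise measures to improve water quality in watersheds where non-point pollution sources are a problem. The automatic monitoring network operated by the Ministry of Environment accumulates data at 1-hour intervals, but it is either not installed in watersheds where non-point pollution sources are a problem or measures only field parameters such as water temperature, DO, and pH, so water quality analysis parameters that can represent stream pollution, such as T-P and SS, are absent. As a result, it is difficult to identify the status of pollution sources, which is needed to establish watershed water quality improvement measures. Therefore, this study analyzed the correlations among water quality parameters in the Goljicheon (Golji Stream) watershed, one of the non-point pollution source management areas, and investigated the predictability of water quality from measured data using machine learning techniques such as DT, MLP, SVM, RF, GB, and XGB. The correlation analysis showed that turbidity, an input variable, was clearly correlated with the predicted water quality, whereas the other parameters showed weak or no correlation. In the machine learning predictions, RF showed the best predictability at Geommu Bridge, Taebong 2 Bridge, and Yeoryang 1 Bridge, with a coefficient of determination (R2) of 0.57 to 0.86 and an RMSE of 16.49 to 175.60. Gwanmal Bridge was best predicted by SVM (R2 0.65, RMSE 57.69) and Songgye Bridge by XGB (R2 0.74, RMSE 282.86). These results indicate that water quality can be predicted with machine learning techniques; however, a comparison of the R2 values of the best-performing techniques showed lower values than at the other sites at Yeoryang 1 Bridge, which has a large watershed area (0.57), and at Gwanmal Bridge, which has a small one (0.65). A comparison of RMSE showed large errors between predicted and measured values at Taebong 2 Bridge (175.60), where muddy water occurs most frequently because of localized heavy rain in the upstream mountainous area, and at Songgye Bridge (282.86), where the priority management areas converge. These findings suggest that predicting stream water quality requires applying machine learning techniques with additional basic data on watershed area or watershed characteristics. In addition, further input variables beyond the water quality parameters predicted in this study should be secured to examine the predictability of water quality.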


Development of an Ensemble-Based Multi-Region Integrated Odor Concentration Prediction Model (앙상블 기반의 악취 농도 다지역 통합 예측 모델 개발)

  • Seong-Ju Cho;Woo-seok Choi;Sang-hyun Choi
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.383-400 / 2023
  • Air pollution-related diseases are escalating worldwide, with the World Health Organization (WHO) estimating approximately 7 million annual deaths in 2022. The rapid expansion of industrial facilities, increased emissions from various sources, and uncontrolled release of odorous substances have brought air pollution to the forefront of societal concerns. In South Korea, odor is categorized as an independent environmental pollutant, alongside air and water pollution, directly impacting the health of local residents by causing discomfort and aversion. However, the current odor management system in Korea remains inadequate, necessitating improvements. This study aims to enhance the odor management system by analyzing 1,010,749 data points collected from odor sensors located in Osong, Chungcheongbuk-do, using an Ensemble-Based Multi-Region Integrated Odor Concentration Prediction Model. The research results demonstrate that the model based on the XGBoost algorithm exhibited superior performance, with an RMSE of 0.0096, significantly outperforming the single-region model (0.0146) with a 51.9% reduction in mean error size. This underscores the potential for increasing data volume, improving accuracy, and enabling odor prediction in diverse regions using a unified model through the standardization of odor concentration data collected from various regions.