• Title/Summary/Keyword: Secure Machine Learning

Improvement of Wave Height Mid-term Forecast for Maintenance Activities in Southwest Offshore Wind Farm (서남권 해상풍력단지 유지보수 활동을 위한 중기 파고 예보 개선)

  • Ji-Young Kim;Ho-Yeop Lee;In-Seon Suh;Da-Jeong Park;Keum-Seok Kang
    • Journal of Wind Energy / v.14 no.3 / pp.25-33 / 2023
  • To secure the safety of increasing offshore activities such as offshore wind farm maintenance and fishing, IMPACT, a mid-term marine weather forecasting system that predicts marine weather up to 7 days in advance, was established. Forecast data from the Korea Hydrographic and Oceanographic Agency (KHOA), which provides the most reliable marine meteorological service in Korea, were used, but wind speed and wave height forecast errors grew as the forecast lead time increased, so the accuracy of the model results needed improvement. The Model Output Statistics (MOS) method, a post-correction method using statistical machine learning, was applied to improve the prediction accuracy of wave height, which is an important factor in forecasting the risk of marine activities. Compared with observed data, the uncorrected model's wave height predictions for 6 to 7 days ahead showed an RMSE of 0.692 m and an R of 0.591, with a tendency to underestimate high waves. After correction with the MOS technique, the RMSE was 0.554 m and R was 0.732, confirming that accuracy was significantly improved.
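
The post-correction idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible MOS form, a one-predictor least-squares line fitted from raw forecasts to observations; the abstract does not state which predictors or model the authors used, and the wave heights below are made-up toy values, not data from the paper.

```python
import math

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson_r(x, y):
    """Pearson correlation coefficient R."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_linear_mos(forecast, observed):
    """Least-squares fit of observed ~ a * forecast + b (assumed MOS form)."""
    n = len(forecast)
    mf, mo = sum(forecast) / n, sum(observed) / n
    sxx = sum((f - mf) ** 2 for f in forecast)
    sxy = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed))
    a = sxy / sxx
    return a, mo - a * mf

def apply_mos(forecast, a, b):
    """Apply the fitted correction to raw forecasts."""
    return [a * f + b for f in forecast]

# Hypothetical wave heights in metres; the raw forecast underestimates high waves.
raw_forecast = [0.5, 0.8, 1.0, 1.5, 2.0]
observed = [0.6, 1.0, 1.3, 2.1, 2.9]

a, b = fit_linear_mos(raw_forecast, observed)
corrected = apply_mos(raw_forecast, a, b)
print("RMSE before:", rmse(raw_forecast, observed))
print("RMSE after: ", rmse(corrected, observed))
print("R before/after:", pearson_r(raw_forecast, observed), pearson_r(corrected, observed))
```

A single-predictor linear correction only rescales and shifts the forecast, so it lowers RMSE but leaves R unchanged. An improvement in R like the one reported would require additional predictors or a nonlinear model.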

A Study on the Improvement of Source Code Static Analysis Using Machine Learning (기계학습을 이용한 소스코드 정적 분석 개선에 관한 연구)

  • Park, Yang-Hwan;Choi, Jin-Young
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1131-1139 / 2020
  • Static analysis of source code aims to find the security weaknesses remaining in a wide range of source code. A static analysis tool produces the results, and a static analysis expert then reviews them to separate true positives from false positives. Because the volume of results is large and the false positive rate is high, this process takes a lot of time and effort, and a more efficient method of analysis is required. In addition, experts rarely analyze only the source line where the defect occurred when performing true/false positive analysis; depending on the type of defect, they analyze the surrounding source code as well before delivering the final result. To ease the difficulty experts face in discriminating true positives from false positives in static analysis tool output, this paper proposes a method in which artificial intelligence, rather than an expert, determines whether a security weakness found by the static analysis tool is a true positive. In addition, an experiment examined how the size of the training data (the source code around the defects) used for machine learning affects performance, and the optimal size was confirmed. This result is expected to help static analysis experts classify true and false positives after static analysis.
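
As a rough illustration of "source code around the defects", the sketch below extracts a window of lines centered on a reported defect line. The function name `context_window`, the `radius` parameter, and the C-like sample are hypothetical; the abstract does not say how the authors sized or encoded the window.

```python
def context_window(source_lines, defect_line, radius):
    """Return the lines within `radius` lines of the 1-based `defect_line`.

    The window is clipped at the start and end of the file.
    """
    start = max(0, defect_line - 1 - radius)
    end = min(len(source_lines), defect_line + radius)
    return source_lines[start:end]

# Hypothetical snippet in which a tool flags line 4 (the memcpy call).
code = [
    "char buf[8];",
    "int n = read_len();",
    "if (n < 8) {",
    "    memcpy(buf, src, n);",
    "}",
]

window = context_window(code, defect_line=4, radius=1)
print("\n".join(window))
```

With `radius=1` the window includes the bounds check on the preceding line, which is the kind of surrounding context a reviewer or classifier would need to judge whether the report is a true positive.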

An exercise algorithm for mezzanine products using artificial neural networks (인공신경망을 이용한 메자닌 상품의 행사 알고리즘)

  • Jae Pil, Yu
    • Journal of Korea Society of Industrial Information Systems / v.28 no.1 / pp.47-56 / 2023
  • Mezzanine products are financial investment products with both bond and stock characteristics, mainly issued by low-grade companies in the financial market to secure liquidity. Bondholders investing in mezzanine products must therefore decide not only whether to invest in the mezzanine products issued by a company but also when to convert them to stocks. In this paper, data on completed stock conversions across major industries are divided into 2,000 training samples and 200 test samples, and mezzanine exercise algorithms are designed and their performance analyzed using artificial neural network models. This study is meaningful in that it proposes a methodology that applies artificial neural network technology to scientifically address the difficulty of exercising mezzanine products, which are of high interest in the financial field.

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being created in a wide variety of fields such as medical care, manufacturing, logistics, sales sites, and SNS, and dataset characteristics are correspondingly diverse. To secure their competitiveness, companies need to improve decision-making capacity using classification algorithms. However, most do not have sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task that requires expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class data. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration index, to replace the Imbalance Ratio (IR). We also developed a new index called the Reverse ReLU Silhouette Score and added it to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. Each dataset was classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross-validation; 10% to 100% oversampling was applied to each fold, and the meta-features of the dataset were measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The HHI meta-feature proposed in this study had a significant effect on classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at a significance level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. The results of the analysis by classification algorithm were also consistent. In the regression analysis by classification algorithm, the number of variables did not have a significant effect for the Naïve Bayes algorithm, unlike the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and Reverse ReLU Silhouette Score) were shown to be significant, and (2) the effects of data characteristics on classification performance were investigated using meta-features. The practical contributions are as follows. (1) The results can be utilized in developing a classification algorithm recommendation system based on dataset characteristics. (2) Because data characteristics differ, many data scientists search for the optimal algorithm for a situation by repeatedly adjusting algorithm parameters, a process in which resources such as hardware, cost, time, and manpower are wasted excessively. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
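
To make the HHI meta-feature concrete, here is a small sketch that applies the standard market-concentration definition (the sum of squared shares) to class proportions. Treating each class's share of the samples as a "market share" is an assumption about how the authors adapted the index, and the label lists are invented for illustration.

```python
from collections import Counter

def hhi(labels):
    """Herfindahl-Hirschman Index of a label list: sum of squared class shares.

    Ranges from 1/k (k perfectly balanced classes) up to 1.0 (a single class).
    """
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) ** 2 for c in counts.values())

# Hypothetical four-class datasets with 100 samples each.
balanced = ["a", "b", "c", "d"] * 25
skewed = ["a"] * 97 + ["b", "c", "d"]

print("balanced HHI:", hhi(balanced))
print("skewed HHI:  ", hhi(skewed))
```

Unlike a ratio of the largest to the smallest class, this measure uses every class's share, which may be why it was considered as a replacement for the imbalance ratio on multi-class data.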

A Study on the Quality Monitoring and Prediction of OTT Traffic in ISP (ISP의 OTT 트래픽 품질모니터링과 예측에 관한 연구)

  • Nam, Chang-Sup
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.2 / pp.115-121 / 2021
  • This paper used big data and artificial intelligence technology to predict rapidly increasing Internet traffic. There have been various studies on traffic prediction in the past, but they have not been able to reflect the growing factors that have induced huge Internet traffic in recent years, such as smartphones and streaming. In addition, event-like factors such as the release of large-capacity popular games or the provision of new content by OTT (Over the Top) operators are more difficult to predict in advance. Due to these characteristics, it was impossible for an ISP (Internet Service Provider) to reflect real-time service quality management or traffic forecasts in the network business environment with the existing method. To solve this problem, this study constructed an Internet traffic collection system, separate from the existing NMS, that searches, discriminates, and collects traffic data in real time. This secures the flexibility and elasticity to automatically register the data of the collection target and makes real-time network quality monitoring possible. In addition, a large amount of traffic data collected from the system was analyzed by machine learning (AI) to predict future traffic of OTT operators. This made more scientific and systematic prediction possible and, in addition, made it possible to optimize the interworking between ISP operators and to secure the quality of large-scale OTT services.

Vehicle Detection and Ship Stability Calculation using Image Processing Technique (영상처리기법을 활용한 차량 검출 및 선박복원성 계산)

  • Kim, Deug-Bong;Heo, Jun-Hyeog;Kim, Ga-Lam;Seo, Chang-Beom;Lee, Woo-Jun
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.7 / pp.1044-1050 / 2021
  • After the occurrence of several passenger ship accidents in Korea, various systems are being developed for passenger ship safety management. A total of 162 passenger ships operate along the coast of Korea, of which 105 (65%) are car-ferries with open vehicle decks. A car-ferry typically follows a route that passes through 2 to 4 islands. Safety inspections at the departure point (home port) are carried out by the crew, the operation supervisor of the operation management office, and the maritime safety supervisor. At layovers, safety inspections are in some cases carried out as self-inspections. As with any system, there are institutional and practical limitations. To address them, this study suggests a method of detecting vehicles using image processing and linking the detection to ship stability calculations. For vehicle detection, a method using a difference image and a method using machine learning were used. However, these methods had the limitation that vehicles could not be identified when the camera was backlit, such as during sunset or at night, because of strong background lighting from the pier and the ship. It appears necessary to secure sufficient image data and upgrade the program for stable image processing.
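
The difference-image approach mentioned in this abstract can be sketched as follows. This is a toy version on nested lists of grayscale values, with a hypothetical `threshold` of 30 and a made-up 4x4 "deck"; the paper's actual image pipeline and its link to the stability calculation are not described in the abstract.

```python
def changed_fraction(background, frame, threshold=30):
    """Fraction of pixels whose grayscale value differs from the background
    image by more than `threshold` (an assumed, tunable value)."""
    changed = 0
    total = 0
    for bg_row, fr_row in zip(background, frame):
        for bg, fr in zip(bg_row, fr_row):
            total += 1
            if abs(fr - bg) > threshold:
                changed += 1
    return changed / total

# Hypothetical 4x4 grayscale images: an empty deck and the same deck
# with a brighter 2x2 object (a "vehicle") in the middle.
empty_deck = [[100] * 4 for _ in range(4)]
with_vehicle = [row[:] for row in empty_deck]
for r in (1, 2):
    for c in (1, 2):
        with_vehicle[r][c] = 200

print("changed fraction:", changed_fraction(empty_deck, with_vehicle))
```

Because the method compares raw pixel intensities against a reference image, a strongly backlit scene shifts many pixels at once, which is consistent with the limitation the authors report for sunset and night conditions.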

Analyzing K-POP idol popularity factors using music charts and new media data using machine learning (머신러닝을 활용한 음원 차트와 뉴미디어 데이터를 활용한 K-POP 아이돌 인기 요인 분석)

  • Jiwon Choi;Dayeon Jung;Kangkyu Choi;Taein Lim;Daehoon Kim;Jongkyn Jung;Seunmin Rho
    • Journal of Platform Technology / v.12 no.1 / pp.55-66 / 2024
  • The K-POP market has become influential not only in culture but also in society as a whole, including diplomacy and environmental movements. As a result, various machine learning-based studies have been conducted to identify the success factors of idols using traditional data such as music and recordings. However, previous studies have the limitation that they do not reflect the influence of new media platforms such as Instagram releases, YouTube shorts, TikTok, and Twitter on the popularity of idols. Because the existing studies do not consider daily changing media trends, it is difficult to clarify the causal relationships behind recent idol success factors. To solve these problems, this paper proposes a data collection system and analysis methodology for idol-related data. By developing a container-based real-time data collection automation system that reflects the specificity of idol data, we secure the stability and scalability of idol data collection, and we compare and analyze the clusters of successful idols through a K-Means clustering-based outlier detection model. As a result, we were able to identify commonalities among successful idols such as gender, time of success after album release, and association with new media. Through this, we expect that optimal comeback promotions can be planned for each idol, album type, and comeback period to improve the chances of idol success.

Traffic Flooding Attack Detection on SNMP MIB Using SVM (SVM을 이용한 SNMP MIB에서의 트래픽 폭주 공격 탐지)

  • Yu, Jae-Hak;Park, Jun-Sang;Lee, Han-Sung;Kim, Myung-Sup;Park, Dai-Hee
    • The KIPS Transactions:PartC / v.15C no.5 / pp.351-358 / 2008
  • Recently, as network flooding attacks such as DoS/DDoS and Internet worms have posed devastating threats to network services, rapid detection and proper response mechanisms have become the major concern for secure and reliable network services. However, most current Intrusion Detection Systems (IDSs) focus on detailed analysis of packet data, which results in late detection and a high system burden when coping with high-speed network environments. In this paper, we propose a lightweight and fast detection mechanism for traffic flooding attacks. First, we use SNMP MIB statistical data gathered from SNMP agents instead of raw packet data from network links. Second, we use a machine learning approach based on a Support Vector Machine (SVM) for attack classification. Using MIB and SVM, we achieved fast detection with high accuracy, minimal system burden, and extensibility for system deployment. The proposed mechanism has a hierarchical structure, which first distinguishes attack traffic from normal traffic and then determines the type of attack in detail. Using MIB data sets collected from real experiments involving a DDoS attack, we validate the feasibility of our approach. The results show that network attacks are detected with high efficiency and classified with a low false alarm rate.

Fault Pattern Extraction Via Adjustable Time Segmentation Considering Inflection Points of Sensor Signals for Aircraft Engine Monitoring (센서 데이터 변곡점에 따른 Time Segmentation 기반 항공기 엔진의 고장 패턴 추출)

  • Baek, Sujeong
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.3 / pp.86-97 / 2021
  • As mechatronic systems have various, complex functions and require high performance, automatic fault detection is necessary for secure operation in manufacturing processes. To conduct automatic, real-time fault detection in modern mechatronic systems, multiple sensor signals are collected using Internet of Things technologies. Traditional statistical control charts and machine learning approaches show significant results with unified and solid density models under normal operating states but have limitations with scattered signal models under normal states, so many pattern extraction and matching approaches have received attention. Signal discretization-based pattern extraction is one popular kind of signal analysis; it reduces the size of the given dataset as much as possible while highlighting significant and inherent signal behaviors. Because general pattern extraction methods usually use a fixed time segment size, they can easily cut off significant behaviors, which reduces the performance of the extracted fault patterns. In this regard, adjustable time segmentation is proposed to extract more meaningful fault patterns from multiple sensor signals. By considering the inflection points of the signals, we determine the optimal cut-points of the time segments in each sensor signal. In addition, to clarify the inflection points, we apply a Savitzky-Golay filter to the original datasets. To validate and verify the performance of the proposed segmentation, a dataset collected from an aircraft engine (provided by the NASA prognostics center) is used for fault pattern extraction. As a result, the proposed adjustable time segmentation shows better performance in fault pattern extraction.
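
The combination of Savitzky-Golay smoothing and inflection-point cut-points can be sketched in a few lines. The example assumes a fixed 5-point quadratic/cubic Savitzky-Golay window (weights -3, 12, 17, 12, -3 over 35) and treats every sign change of the discrete second difference as a candidate cut-point. The paper's window size, polynomial order, and rule for choosing the "optimal" cut-points are not given in the abstract, and the test signal is synthetic.

```python
# 5-point quadratic/cubic Savitzky-Golay smoothing weights (to be divided by 35).
SG_COEFFS = (-3, 12, 17, 12, -3)

def savgol5(signal):
    """Smooth interior samples with the 5-point Savitzky-Golay filter.

    The first and last two samples are left unchanged (a simplification).
    """
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(c * signal[i + k - 2] for k, c in enumerate(SG_COEFFS)) / 35
    return out

def inflection_cut_points(signal):
    """Indices where the second difference of the smoothed signal changes sign."""
    smooth = savgol5(signal)
    # d2[j] is the second difference centered on sample j + 1.
    d2 = [smooth[i - 1] - 2 * smooth[i] + smooth[i + 1]
          for i in range(1, len(smooth) - 1)]
    cuts = []
    for j in range(1, len(d2)):
        if d2[j - 1] * d2[j] < 0:
            cuts.append(j + 1)  # first sample on the new-curvature side
    return cuts

# Synthetic signal: a cubic whose curvature flips between samples 4 and 5.
signal = [(i - 4.5) ** 3 for i in range(10)]
cuts = inflection_cut_points(signal)
print("cut points:", cuts)
```

Segments would then run between consecutive cut-points, so their lengths follow the signal's shape rather than a fixed window size.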

Effects of mining activities on Nano-soil management using artificial intelligence models of ANN and ELM

  • Liu, Qi;Peng, Kang;Zeng, Jie;Marzouki, Riadh;Majdi, Ali;Jan, Amin;Salameh, Anas A.;Assilzadeh, Hamid
    • Advances in nano research / v.12 no.6 / pp.549-566 / 2022
  • Mining of ore minerals (sphalerite, cinnabar, and chalcopyrite) from the old mine has led to significant environmental effects such as contamination of soils and plants and acidification of water. Nanoparticles (NPs) have also gained global importance because of their widespread usage in daily life, unique properties, and the rapid development of nanotechnology. Given their usage in various fields, it is suggested that soil is the final environmental sink for NPs. Nanoparticles with high reactivity and deliverability may be applied as amendments to enhance soil quality, mitigate soil contamination, ensure safe land application of conventional amendment materials, and improve soil erosion control. Meanwhile, there is no record of the use of nano-enhanced materials for mine soil reclamation. In this study, five soil specimens were tested at four sites within the mine area (<100 m) to study zeolite and iron sulfide nanoparticles. Using an Artificial Neural Network (ANN) and an Extreme Learning Machine (ELM), this study also attempted to accurately estimate the mechanical properties of soil under the effect of these nanoparticles. Considering the RMSE and R2 values, zeolite nanomaterials could improve mine soil quality by increasing the clay-silt fractions, increasing the water holding capacity, removing toxins, and improving nutrient levels. In contrast, adding iron sulfide minerals to the soils could exacerbate the soil acidity problems at a mining site.