• Title/Summary/Keyword: empirical type I error


An Integrated Model based on Genetic Algorithms for Implementing Cost-Effective Intelligent Intrusion Detection Systems (비용효율적 지능형 침입탐지시스템 구현을 위한 유전자 알고리즘 기반 통합 모형)

  • Lee, Hyeon-Uk;Kim, Ji-Hun;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.125-141 / 2012
  • Malicious attacks on networked systems are increasing dramatically, and their patterns change rapidly. Handling these attacks appropriately has therefore become more important, and there is considerable interest in and demand for effective network security systems such as intrusion detection systems. Intrusion detection systems are network security systems that detect, identify, and respond appropriately to unauthorized or abnormal activities. Conventional intrusion detection systems have generally been designed using experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. Although they perform very well under normal conditions, they cannot handle new or unknown attack patterns. As a result, recent studies on intrusion detection systems use artificial intelligence techniques, which can respond proactively to unknown threats. Researchers have long adopted and tested various artificial intelligence techniques, such as artificial neural networks, decision trees, and support vector machines, to detect network intrusions. However, most studies have applied these techniques individually, even though combining them may lead to better detection. For this reason, we propose a new integrated model for intrusion detection. Our model combines the prediction results of four different binary classification models, namely logistic regression (LOGIT), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM), which may complement each other. Genetic algorithms (GA) are used to find the optimal combining weights. The proposed model is built in two steps. In the first step, it generates the optimal integration model, that is, the one with the lowest prediction error (i.e., misclassification rate). In the second step, it searches for the optimal classification threshold for determining intrusions, that is, the threshold that minimizes the total misclassification cost. Calculating the total misclassification cost of an intrusion detection system requires understanding its asymmetric error cost scheme. There are generally two common types of error in intrusion detection. The first is the False-Positive Error (FPE), in which normal activity is wrongly judged to be an intrusion, which may result in unnecessary corrective action. The second is the False-Negative Error (FNE), in which malicious activity is misjudged as normal. FNE is more fatal than FPE, so the total misclassification cost is affected more by FNE than by FPE. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 10,000 samples from them by random sampling. We also compared the results of our model with those of the single techniques to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN was run using Neuroshell R4.0. For SVM, LIBSVM v2.90, a free tool for training SVM classifiers, was used. Empirical results showed that our proposed GA-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that the proposed model outperformed all the other comparative models in terms of total misclassification cost. Consequently, our study is expected to contribute to building cost-effective intelligent intrusion detection systems.
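
The second step described in this abstract (choosing a threshold that minimizes an asymmetric misclassification cost) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the scores, the equal combining weights (the paper finds them with a GA), the cost values `cost_fp=1.0` and `cost_fn=5.0`, and the simple grid search over thresholds.

```python
def combine_scores(model_scores, weights):
    """Weighted average of the per-model intrusion scores for one sample."""
    return sum(w * s for w, s in zip(weights, model_scores)) / sum(weights)


def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total misclassification cost at a threshold (label 1 = intrusion)."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += cost_fp  # false positive: normal traffic flagged
        elif predicted == 0 and label == 1:
            cost += cost_fn  # false negative: intrusion missed
    return cost


def best_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0, steps=100):
    """Grid search over [0, 1]; returns the first threshold with minimal cost."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))


# Hypothetical outputs of LOGIT, DT, ANN, SVM for four samples.
model_outputs = [
    [0.90, 0.80, 0.70, 0.95],
    [0.20, 0.10, 0.30, 0.15],
    [0.60, 0.50, 0.70, 0.60],
    [0.40, 0.50, 0.30, 0.40],
]
labels = [1, 0, 1, 0]
weights = [0.25, 0.25, 0.25, 0.25]  # placeholder for GA-optimized weights

combined = [combine_scores(row, weights) for row in model_outputs]
threshold = best_threshold(combined, labels)
```

Because a missed intrusion is costed higher than a false alarm, raising `cost_fn` relative to `cost_fp` tends to push the selected threshold downward on data where the classes overlap.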

Drainage Performance of Various Subsurface Drain Materials (배수개선공법개발에 관한 연구(I) -각종 지하배수용 암거재료의 배수성능-)

  • 김철회;이근후;유시조;서원명
    • Magazine of the Korean Society of Agricultural Engineers / v.21 no.3 / pp.104-120 / 1979
  • I. Title of the Study: Studies on the Development of Improved Subsurface Drainage Methods: Drainage Performance of Various Subsurface Drain Materials. II. Object of the Study: Studies were carried out to select the drain material with the highest drainage performance, and to develop the water budget model needed for planning drainage projects and establishing water management standards in waterlogged paddy fields. III. Content and Scope of the Study: 1. The experiment was carried out in the laboratory using a sand tank model. The drainage performance of various drain materials was compared and evaluated. 2. A water budget model was established. The parameters necessary for the model were investigated by analyzing existing data and data measured in the experimental field. The adaptability of the model was evaluated by comparing the estimated values with the field data. IV. Results and Recommendations: 1. A corrugated tube enveloped with gravel or mat showed the highest drainage performance among the eight materials submitted for the experiment. 2. The drainage performance of the long cement tile (50 cm long) was higher than that of the short cement tile (25 cm long). 3. Rice bran was superior to gravel in its drainage performance. 4. No difference was shown between a gravel envelope and a P.V.C. wool mat in drainage performance. Continued investigation is needed to clarify the envelope performance. 5. All the results described above were obtained from laboratory tests. A field test is recommended to confirm them. 6. As a water balance model of a given soil profile, the soil moisture depletion, D, could be represented as follows: $$D=\sum_{t=1}^{n}(Et-R_{\ell}-I+W_d) \qquad (17)$$ 7. Among the various empirical formulae for potential evapotranspiration, Penman's formula best fit the data observed with the evaporation pans in the Jinju area. A high degree of positive correlation between Penman's predicted data and the observed data was confirmed. The regression equation was Y=1.4X-22.86, where Y represents the evaporation rate from the small pan, in mm/100 days, and X represents the potential evapotranspiration rate estimated by Penman's formula. The coefficient of correlation was r=0.94. 8. To estimate evapotranspiration in the field, the consumptive use coefficient, Kc, was introduced. Kc was defined as a function of the characteristics of the crop and soil as follows: $Kc=Kco{\cdot}Ka+Ks \qquad (20)$ where Kco, Ka, and Ks represent the crop coefficient, the soil moisture coefficient, and the correction coefficient, respectively. The values of Kco and Ka were obtained from Fig. 16 and Fig. 17, respectively. If $Kco{\cdot}Ka{\geq}1.0$, then Ks=0; otherwise, Ks was estimated using the relation $Ks=1-Kco{\cdot}Ka$. 9. The Into type formula, $r_t=\frac{R_{24}}{24}\left(\frac{b}{\sqrt{t}+a}\right)$, was the best fit for estimating the probable rainfall intensity when daily rainfall and rainfall duration are given as input data. The coefficients a and b are shown in Table 16. 10. The Japanese type formula, $I_t=\frac{b}{\sqrt{t}+a}$, was the best fit for estimating the probable rainfall intensity when only the rainfall duration is given. The coefficients a and b are shown in Table 17. 11. Effective rainfall, Re, was estimated using the following relationships: Re=D if $R-D\geq 0$; otherwise, Re=R. 12. The difference between the rainfall amount and the soil moisture depletion was considered the amount of drainage required. In this case, when Wd=0, Equation 24 was used; otherwise, a lag time of two to three days was considered and a correction was made using the storage coefficient. 13. To evaluate the model, measured and estimated data were compared and the relative error was computed. The relative error was 5.5 percent. 14. Considering the water budget in the Jinju area, the evaporation amount was shown to be greater than the rainfall during the period from October to March of the following year. This was the reasoning behind the need to improve the surface drainage system in the Jinju area.
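
The relations quoted in items 6, 8, 10, and 11 of this abstract can be written out as a small sketch. The function and argument names are hypothetical, the input values are invented for illustration, and the coefficients a and b come from tables that are not reproduced in the abstract, so placeholder numbers are used.

```python
import math


def soil_moisture_depletion(records):
    """Eq. (17): D = sum over t of (Et - R_l - I + W_d).

    Each record is a tuple (Et, R_l, I, W_d) for one time step.
    """
    return sum(et - r_l - i + w_d for et, r_l, i, w_d in records)


def consumptive_use_coefficient(kco, ka):
    """Eq. (20): Kc = Kco*Ka + Ks, with Ks = 0 if Kco*Ka >= 1,
    otherwise Ks = 1 - Kco*Ka."""
    product = kco * ka
    ks = 0.0 if product >= 1.0 else 1.0 - product
    return product + ks


def effective_rainfall(rainfall, depletion):
    """Item 11: Re = D if R - D >= 0, otherwise Re = R."""
    return depletion if rainfall - depletion >= 0 else rainfall


def japanese_type_intensity(t, a, b):
    """Item 10: I_t = b / (sqrt(t) + a)."""
    return b / (math.sqrt(t) + a)


# Illustrative values only; none are taken from the paper.
kc_high = consumptive_use_coefficient(1.2, 1.0)
kc_low = consumptive_use_coefficient(0.5, 0.8)
re_wet = effective_rainfall(30.0, 20.0)
re_dry = effective_rainfall(10.0, 20.0)
intensity = japanese_type_intensity(4.0, a=1.0, b=30.0)
depletion = soil_moisture_depletion([(5.0, 2.0, 1.0, 0.0), (4.0, 0.0, 0.0, 1.0)])
```

Read this way, the Ks rule means Kc never falls below 1.0 as stated, and the effective rainfall rule is equivalent to taking the smaller of R and D.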


Parameter Estimation of Intensity-Duration-Frequency Curve Using Genetic Algorithm (I): Comparison Study of Existing Estimation Method (유전자알고리즘을 이용한 강우강도식 매개변수 추정에 관한 연구(I): 기존 매개변수 추정방법과의 비교)

  • Kim, Tae-Son;Shin, Ju-Young;Kim, Soo-Young;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.40 no.10 / pp.811-821 / 2007
  • The intensity-duration-frequency (IDF) curves based on the Talbot, Sherman, and Japanese type formulas are widely used in South Korea because their parameters are easily estimated. However, these IDF curves are less accurate than those developed by Lee et al. (1993) and Heo et al. (1999), and different parameters must be computed for each given return period. In this study, a parameter estimation method using a genetic algorithm (GA) is suggested for the IDF curve of Heo et al. (1999). Quantiles computed by at-site frequency analysis of the rainfall data from 22 rainfall gauges operated by the Korea Meteorological Administration are used to estimate the parameters of the IDF curves, and minimization of the root mean squared error (RMSE) and the relative RMSE (RRMSE) between observed and computed quantiles is used as the objective function of the GA. A comparison between the empirical regression analysis and the suggested method shows that the IDF curve whose parameters are estimated by GA using RRMSE as the objective function is superior to the IDF curves using RMSE.
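
The two objective functions named in this abstract can be sketched as follows. The abstract does not give the formulas, so the common definitions are assumed here: RMSE as the root of the mean squared difference, and RRMSE as the root of the mean squared difference divided by the observed value. The quantile values are invented for illustration.

```python
import math


def rmse(observed, computed):
    """Root mean squared error between observed and computed quantiles."""
    n = len(observed)
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(observed, computed)) / n)


def rrmse(observed, computed):
    """Relative RMSE: each error is scaled by its observed quantile (assumed form)."""
    n = len(observed)
    return math.sqrt(
        sum(((o - c) / o) ** 2 for o, c in zip(observed, computed)) / n
    )


# Hypothetical quantiles: one large value and one small value,
# each underestimated by 10 percent.
observed = [100.0, 10.0]
computed = [90.0, 9.0]

abs_error = rmse(observed, computed)
rel_error = rrmse(observed, computed)
```

In this example both quantiles are off by the same 10 percent, yet the larger quantile dominates RMSE, while RRMSE weights the two equally. That difference is one plausible reason an RRMSE objective could fit the full range of quantiles more evenly.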

A Study on the Development of Algorithm for the Representative Unit Hydrograph of a Watershed as a Closed Linear System (폐선형계로 본 유역대표 단위유량도의 유도를 위한 알고리즘의 개발에 관한 연구)

  • 김재한;이원환
    • Water for future / v.13 no.2 / pp.35-47 / 1980
  • An algorithm is developed to derive a representative 1 hr-unit hydrograph through an analysis of the rainfall-runoff relations of a watershed as a closed system. For the base flow separation of a flood hydrograph, the multi-deflection method is proposed herein, which gave better results than the existing empirical methods. A modified $\Phi$-index method is also proposed in this study to determine the time distribution of the rainfall excess of a rainstorm; it is essentially a modification of the commonly used $\Phi$-index method of rainfall separation. With the rainfall excess hyetograph and the direct runoff hydrograph so obtained, the ordinates of the 1 hr-unit hydrograph were computed by trial and error so that the synthesized flood hydrograph closely approximates the observed one, resulting in a unit hydrograph of a piecewise exponential function type. To verify the validity of this study, the 1 hr-unit hydrographs for Imha and Dongchon in the Nagdong River basin and Yongdam in the Geum River basin were derived by this algorithm, and the results were compared with those from the conventional synthetic unit hydrograph method and the Nakayasu method. The validity of this study was also tested by comparing the observed hydrograph with the one computed by applying the unit hydrograph to a specific rainfall event. To generalize the results of this study, a computer program consisting of a main program and three subprograms (for rainfall excess estimation, convolution summation, and sorting) was developed as a package, which is believed to be applicable to other watersheds for similar purposes.
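
Two of the building blocks mentioned in this abstract, rainfall excess estimation and convolution summation, can be sketched in a few lines. This is not the paper's modified $\Phi$-index method, whose details are not given in the abstract; the standard $\Phi$-index (a constant loss rate) is used instead, and the hyetograph, loss rate, and unit hydrograph ordinates are hypothetical.

```python
def rainfall_excess(hyetograph, phi):
    """Standard phi-index: excess in each interval is max(P - phi, 0)."""
    return [max(p - phi, 0.0) for p in hyetograph]


def convolve(excess, unit_hydrograph):
    """Convolution summation: runoff[n] = sum over m of excess[m] * U[n - m]."""
    runoff = [0.0] * (len(excess) + len(unit_hydrograph) - 1)
    for m, p in enumerate(excess):
        for k, u in enumerate(unit_hydrograph):
            runoff[m + k] += p * u
    return runoff


# Hypothetical hourly rainfall depths and a constant loss rate.
hyetograph = [5.0, 15.0, 10.0]
phi = 5.0

# Hypothetical 1 hr-unit hydrograph ordinates per unit of rainfall excess.
unit_hydrograph = [0.1, 0.3, 0.2]

excess = rainfall_excess(hyetograph, phi)
direct_runoff = convolve(excess, unit_hydrograph)
```

A trial-and-error derivation like the one described would adjust the ordinates in `unit_hydrograph` until `direct_runoff` closely approximates the observed direct runoff hydrograph.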


The Inter-correlation Analysis between Oil Prices and Dry Bulk Freight Rates (유가와 벌크선 운임의 상관관계 분석에 관한 연구)

  • Ahn, Byoung-Churl;Lee, Kee-Hwan;Kim, Myoung-Hee
    • Journal of Navigation and Port Research / v.46 no.3 / pp.289-296 / 2022
  • The purpose of this study was to investigate the inter-correlation between crude oil prices and dry bulk freight rates. Eco-friendly shipping fuels have been actively developed to reduce carbon emissions. However, given the present pace of development, carbon neutrality will take longer than anticipated. Because of COVID-19 and the Russian invasion of Ukraine, crude oil price fluctuations have been exacerbated. We must therefore examine the impact that oil prices have had on dry bulk freight rates, because oil prices play a major role in shipping fuels. Using the VAR (Vector Autoregressive) model with monthly data on crude oil prices (Brent, Dubai, and WTI) and dry bulk freight rates (BDI, BCI, and BPI) from October 2008 to February 2022, the empirical analysis documents that oil prices have an impact on dry bulk freight rates. In the forecast error variance decomposition, WTI has the largest explanatory relationship with the BDI, Dubai ranks second, and Brent ranks third. In conclusion, WTI and Dubai have the largest impact on the BDI, although there are some differences according to ship type.
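
The forecast error variance decomposition mentioned in this abstract can be illustrated on a toy two-variable VAR(1). The abstract does not report the lag order, the variable ordering, or any estimates, so the coefficient matrix, the error covariance, and the Cholesky ordering (oil price first, freight rate second) below are assumptions chosen only to show the mechanics.

```python
import math


def matmul(a, b):
    """Plain matrix product for small lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cholesky(sigma):
    """Lower-triangular factor L with L * L^T = sigma."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                lower[i][j] = (sigma[i][j] - s) / lower[j][j]
    return lower


def fevd(coef, sigma, horizon):
    """Share of each variable's forecast error variance due to each shock.

    For y_t = coef * y_{t-1} + u_t, the orthogonalized impulse responses are
    coef^i * L for i = 0 .. horizon-1, where L is the Cholesky factor of sigma.
    """
    n = len(coef)
    chol = cholesky(sigma)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    contrib = [[0.0] * n for _ in range(n)]
    for _ in range(horizon):
        theta = matmul(power, chol)
        for i in range(n):
            for j in range(n):
                contrib[i][j] += theta[i][j] ** 2
        power = matmul(coef, power)
    return [[c / sum(row) for c in row] for row in contrib]


# Hypothetical VAR(1): variable 0 = oil price, variable 1 = freight rate.
coef = [[0.5, 0.0],
        [0.3, 0.4]]
sigma = [[1.0, 0.2],
         [0.2, 1.0]]

shares_1 = fevd(coef, sigma, horizon=1)
shares_12 = fevd(coef, sigma, horizon=12)
```

Here `shares_12[1][0]` is the fraction of the freight rate's 12-step forecast error variance attributed to the oil price shock. With a Cholesky factorization the result depends on the ordering of the variables, so a different ordering can change the ranking.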