• Title/Abstract/Keywords: Cross-Validation

Search results: 994 items (processing time: 0.026 s)

Restricted support vector quantile regression without crossing

  • Shim, Joo-Yong;Lee, Jang-Taek
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 21, No. 6
    • /
    • pp.1319-1325
    • /
    • 2010
  • Quantile regression provides a more complete statistical analysis of the stochastic relationships among random variables. However, quantile functions estimated at different orders can sometimes cross each other. We propose a new non-crossing quantile regression method, restricted support vector quantile regression, which applies support vector median regression to restricted regression quantiles. The proposed method provides a satisfactory solution for estimating non-crossing quantile functions when multiple quantiles are needed for high-dimensional data. We also present a model selection method that employs cross-validation techniques to choose the parameters that affect the performance of the proposed method. One real example and one simulated example are provided to show the usefulness of the proposed method.
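
As a rough illustration of the two ideas in this abstract, the sketch below scores quantile estimates with the check (pinball) loss and counts the points where two estimated quantile curves cross. It is not the authors' restricted support vector method. The function names and the numbers are hypothetical, and using the check loss as the cross-validation criterion is an assumption, since the abstract does not name one.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def count_crossings(lower_curve, upper_curve):
    """Count points where the lower-order quantile estimate exceeds the higher-order one."""
    return sum(1 for lo, hi in zip(lower_curve, upper_curve) if lo > hi)


# Hypothetical fitted values at five inputs (illustrative numbers only).
q25 = [1.0, 1.4, 2.1, 2.9, 3.2]
q75 = [1.8, 2.0, 2.0, 3.5, 4.1]
y = [1.5, 1.6, 2.2, 3.0, 3.9]

crossings = count_crossings(q25, q75)   # the third point crosses
loss_q25 = pinball_loss(y, q25, 0.25)   # held-out score for the 0.25 quantile
```

A non-crossing method would keep `count_crossings` at zero for every pair of quantile levels.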

Machine Learning Based Hybrid Approach to Detect Intrusion in Cyber Communication

  • Neha Pathak;Bobby Sharma
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 11
    • /
    • pp.190-194
    • /
    • 2023
  • Given the importance of communication, data delivery, and data access in the governmental, business, and individual sectors, it is essential to identify faults and flaws in cyber communication. Cyber security is needed to protect personal, governmental, and business data from misuse by numerous advanced attacks. Information security protects both the host machine and the network. Learning methods are used to analyze and prevent various attacks. Machine learning, a branch of Artificial Intelligence, offers promising techniques for detecting cyber-attacks. In the proposed methodology, the Decision Tree (DT), a supervised learning model, is combined with different cross-validation methods to measure the accuracy and execution time of identifying cyber-attacks in the UNSW-NB15 dataset, a recent dataset of network traffic covering different network attack activities. It is a hybrid method in which different DT split criteria, including the Gini Index and Entropy, are implemented separately to identify the most accurate intrusion detection procedure with respect to execution time. The DT variants implemented are DT using the Gini Index, DT using the train-split method, and DT using information entropy, each further subdivided by K-Fold validation and Stratified K-Fold validation.
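
The building blocks named in this abstract can be sketched in plain Python: the two impurity measures a decision tree can split on, and a stratified fold assignment that keeps class proportions similar across folds. This is a minimal illustration, not the authors' pipeline. The function names and the toy labels are assumptions, and the data does not come from UNSW-NB15.

```python
import math
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def stratified_folds(labels, k):
    """Assign sample indices to k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy labels: 8 "normal" and 4 "attack" records (made up for illustration).
labels = ["normal"] * 8 + ["attack"] * 4
folds = stratified_folds(labels, 4)
```

With these toy labels every fold receives two "normal" records and one "attack" record, whereas a plain K-Fold split over unshuffled data could place all attacks in a single fold.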

Finding Unexpected Test Accuracy by Cross Validation in Machine Learning

  • Yoon, Hoijin
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 12spc
    • /
    • pp.549-555
    • /
    • 2021
  • Machine Learning (ML) typically splits data into three parts: 60% for training, 20% for validation, and 20% for testing. The split is purely quantitative rather than selecting each set by a criterion, even though such a criterion is an important concept for the adequacy of test data. ML measures a model's accuracy on the validation data and revises the model until the validation accuracy reaches a certain level. After the validation process, the completed model is tested with the test data, which the model has not yet seen. If the test data covers the model's attributes well, the test accuracy will be close to the validation accuracy. To check whether ML's test data works adequately, we design an experiment to see whether a model's test accuracy is always close to its validation accuracy, as expected. The experiment builds 100 different SVM models for each of six data sets published in the UCI ML repository. From the test and validation accuracies of the 600 cases, we find some unexpected cases in which the test accuracy differs greatly from the validation accuracy. Consequently, it is not always true that ML's test data is adequate to assure a model's quality.
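
The purely quantitative 60/20/20 split described above can be sketched as follows. The function name and the seed parameter are hypothetical. The split is random rather than criterion-based, which is the practice the abstract questions.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle a copy of items and cut it into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_60_20_20(range(100))
```

Repeating the split with different seeds and comparing validation accuracy with test accuracy each time is one way to look for the mismatches the abstract reports.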

The Cross-validation of Satellite OMI and OMPS Total Ozone with Pandora Measurement

  • 백강현;김재환;김준
    • 대한원격탐사학회지
    • /
    • Vol. 36, No. 3
    • /
    • pp.461-474
    • /
    • 2020
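  • Korea plans to launch the Geostationary Environment Monitoring Spectrometer (GEMS), the world's first geostationary environmental satellite payload, to monitor air pollutants over Northeast Asia in real time. Because satellite observation of air pollutants involves solving an inverse problem that is ill-posed, satellite retrievals contain errors. For GEMS products to be reliable, cross-comparison and validation against ground-based observations or other satellites are therefore essential. This study evaluated the data by intercomparing total column ozone (TCO) measured by OMI and OMPS, the satellites to be used for validating future GEMS ozone observations, and by Pandora, the ground-based instruments installed in Seoul and Busan. The accuracy of the ground-based observations was evaluated by exploiting the fact that satellites provide ozone data with globally consistent accuracy. As a result, a serious instrument error was found in the data from Seoul Pandora #29, showing that reverse validation of ground data using satellite data is possible. Next, in the validation of OMPS data against ground-based Pandora, OMPS TCO showed a correlation of 0.97, an RMSE of ~1.8 DU, and a positive bias of 4% relative to Pandora TCO, and this bias showed no dependence on SZA, cross-track position, TCO, or seasonal variation. In addition, because cloud filtering is not properly applied to Pandora TCO, the study showed that appropriate boundary conditions matched to the spatial resolution of each satellite sensor must be used when Pandora data are used to validate satellite data.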
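
The three comparison statistics quoted in this abstract (correlation, RMSE, and bias) can be computed as in the sketch below. The exact definitions used in the study are not given in the abstract, so the formulas here are common conventions assumed for illustration, and the ozone values are made up.

```python
import math


def compare_tco(satellite, ground):
    """Return (Pearson r, RMSE, mean relative bias in %) of satellite vs. ground values."""
    n = len(satellite)
    mean_s = sum(satellite) / n
    mean_g = sum(ground) / n
    cov = sum((s - mean_s) * (g - mean_g) for s, g in zip(satellite, ground))
    var_s = sum((s - mean_s) ** 2 for s in satellite)
    var_g = sum((g - mean_g) ** 2 for g in ground)
    r = cov / math.sqrt(var_s * var_g)
    rmse = math.sqrt(sum((s - g) ** 2 for s, g in zip(satellite, ground)) / n)
    # Positive bias means the satellite reads higher than the ground instrument.
    bias_pct = 100.0 * sum((s - g) / g for s, g in zip(satellite, ground)) / n
    return r, rmse, bias_pct


# Made-up total column ozone values in Dobson units (DU), not data from the study.
ground = [300.0, 310.0, 320.0, 330.0]
satellite = [g * 1.04 for g in ground]
r, rmse, bias_pct = compare_tco(satellite, ground)
```

A dependence check like the one in the abstract would repeat this calculation within bins of SZA, cross-track position, TCO, or season.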

Experimental Validation on Underwater Sound Speed Measurement Method Using Cross-Correlation of Time-Domain Acoustic Signals in a Reverberant Water Tank

  • 이주엽;김국현;박성주;조대승
    • 대한조선학회논문집
    • /
    • Vol. 61, No. 1
    • /
    • pp.1-7
    • /
    • 2024
  • Underwater sound speed is an important analysis parameter in estimating the underwater radiated noise (URN) emitted from vessels. This paper presents an underwater sound speed measurement procedure using the cross-correlation of time-domain acoustic signals and validates the procedure through an experiment in a reverberant water tank. For this purpose, time-domain acoustic signals transmitted by Gaussian pulse excitation from an acoustic projector were measured at 20 hydrophone positions in the reverberant water tank. The sound speed in water was then calculated by linear regression over 190 cross-correlation cases of distances and time lags between the received signals, and the result was compared with estimates from existing empirical formulae. The results indicate that the presented experimental procedure is reliably applicable for measuring underwater sound speed, provided the time resolution of the measurement is sufficiently high.
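
The 190 cases follow from pairing the 20 hydrophone positions, since C(20, 2) = 190. The sketch below shows the regression step only, under simplifying assumptions. The time lags are taken as known arrival-time differences, whereas the paper obtains them from cross-correlation peaks. The geometry, the 1480 m/s value, and the function name are all hypothetical.

```python
from itertools import combinations


def fit_sound_speed(path_lengths_m, arrival_times_s):
    """Least-squares slope of pairwise path-length difference against pairwise time lag."""
    pairs = list(combinations(range(len(path_lengths_m)), 2))
    lags = [arrival_times_s[j] - arrival_times_s[i] for i, j in pairs]
    dists = [path_lengths_m[j] - path_lengths_m[i] for i, j in pairs]
    mean_lag = sum(lags) / len(lags)
    mean_dist = sum(dists) / len(dists)
    sxy = sum((t - mean_lag) * (d - mean_dist) for t, d in zip(lags, dists))
    sxx = sum((t - mean_lag) ** 2 for t in lags)
    # The slope has units of m/s and is the sound speed estimate.
    return sxy / sxx, len(pairs)


# Synthetic setup: 20 positions, noise-free arrivals at an assumed 1480 m/s.
path_lengths = [1.0 + 0.1 * k for k in range(20)]
arrivals = [d / 1480.0 for d in path_lengths]
speed, n_pairs = fit_sound_speed(path_lengths, arrivals)
```

With real signals the lag resolution is limited by the sampling interval, which is consistent with the abstract's remark that the time resolution must be sufficiently high.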

Analysis and Cut-off Adjustment of Dried Blood Spot 17alpha-hydroxyprogesterone Concentration by Birth Weight

  • 박승만;권애린;양송현;박은아;최재황;황미정;남현경;이은희
    • 대한유전성대사질환학회지
    • /
    • Vol. 14, No. 2
    • /
    • pp.150-155
    • /
    • 2014
  • The measurement of $17\alpha$-hydroxyprogesterone ($17\alpha$-OHP) in a dried blood spot on filter paper is important for screening for congenital adrenal hyperplasia (CAH). Since high levels of $17\alpha$-OHP are frequently observed in premature infants without CAH, we evaluated cut-offs based on birth weight and validated them. Birth weight and $17\alpha$-OHP concentration data of 292,204 newborn screening subjects at Greencross laboratories were analyzed. The cut-off values based on birth weight were newly evaluated and validated with the original data. The mean $17\alpha$-OHP concentrations were 7.25 ng/mL in the very low birth weight (VLBW) group, 4.02 ng/mL in the low birth weight (LBW) group, 2.53 ng/mL in the normal birth weight (NBW) group, and 2.24 ng/mL in the heavy birth weight (HBW) group. The cut-offs for CAH were set as follows: 21.12 ng/mL for the VLBW and LBW groups and 11.14 ng/mL for the NBW and HBW groups. When the new cut-offs were applied to the original data, positive rates decreased in the VLBW and LBW groups and increased in the NBW and HBW groups. Cut-offs based on birth weight should be used in screening for CAH. We believe that our new cut-offs reduce the false positive and false negative rates, and that our experience in setting up and validating cut-offs will be helpful for other laboratories performing newborn screening tests.
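
The group-specific cut-offs reported above can be applied as in this small sketch. The cut-off values are taken from the abstract. The function name, the use of group labels as input, and the choice of "greater than or equal" as the positive rule are assumptions, since the abstract gives neither the weight boundaries of the groups nor the comparison rule.

```python
# Cut-offs in ng/mL as reported in the abstract, keyed by birth-weight group.
CUTOFF_NG_PER_ML = {"VLBW": 21.12, "LBW": 21.12, "NBW": 11.14, "HBW": 11.14}


def is_screen_positive(group, concentration_ng_per_ml):
    """Flag a sample whose 17alpha-OHP level reaches its group's cut-off (assumed rule: >=)."""
    return concentration_ng_per_ml >= CUTOFF_NG_PER_ML[group]


# The same hypothetical reading is negative for an LBW infant but positive for an NBW infant.
lbw_result = is_screen_positive("LBW", 15.0)
nbw_result = is_screen_positive("NBW", 15.0)
```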

Detection of Distinctive Points in Impedance Cardiogram during Exercise by Cross-Correlation Method

  • 오인식;송철규
    • 대한의용생체공학회:의공학회지
    • /
    • Vol. 12, No. 4
    • /
    • pp.261-266
    • /
    • 1991
  • As the ensemble-averaged dz/dt signal during exercise becomes smoothed, it is difficult to find the distinctive marks needed to estimate stroke volume. The cross-correlation function was used to estimate these marks from the ensemble-averaged dz/dt signal for automatic calculation by computer. LVET (Left Ventricular Ejection Time) and stroke volume were estimated from parameters calculated at the characteristic points. To validate accuracy, the LVET and stroke volume calculated by hand, by the ensemble average, and by cross-correlation were compared.

Prediction of concrete compressive strength using non-destructive test results

  • Erdal, Hamit;Erdal, Mursel;Simsek, Osman;Erdal, Halil Ibrahim
    • Computers and Concrete
    • /
    • Vol. 21, No. 4
    • /
    • pp.407-417
    • /
    • 2018
  • Concrete, a composite material, is one of the most important construction materials. Compressive strength is a commonly used parameter for assessing concrete quality, so accurate prediction of concrete compressive strength is an important issue. In this study, we used an experimental procedure to assess concrete quality. First, the concrete mix was prepared according to C 20 type concrete, and the slump of the fresh concrete was about 20 cm. After the fresh concrete was placed in formworks, it was compacted using a vibrating screed. After a 28-day period, a total of 100 core samples with a 75 mm diameter were extracted. Pulse velocity determination tests and compressive strength tests were performed on the core samples. In addition, Windsor probe penetration tests and Schmidt hammer tests were performed. After the data set was assembled, twelve artificial intelligence (AI) models were compared for predicting the concrete compressive strength. These models can be divided into three categories: (i) Functions (i.e., Linear Regression, Simple Linear Regression, Multilayer Perceptron, Support Vector Regression), (ii) Lazy-Learning Algorithms (i.e., IBk Linear NN Search, KStar, Locally Weighted Learning), and (iii) Tree-Based Learning Algorithms (i.e., Decision Stump, Model Trees Regression, Random Forest, Random Tree, Reduced Error Pruning Tree). Four evaluation processes, that is, four validation implements (10-fold cross validation, 5-fold cross validation, 10% split sample validation, and 20% split sample validation), are used to examine the performance of the predictive models. This study shows that machine learning regression techniques are promising tools for predicting the compressive strength of concrete.

Cleaning Validation Studies for Multi-Purpose Facility: Vial Filling Machine

  • 최한곤;양호준;김영란;성준호;황마로;김종오;용철순
    • Journal of Pharmaceutical Investigation
    • /
    • Vol. 39, No. 4
    • /
    • pp.263-267
    • /
    • 2009
  • The purpose of this study is to evaluate the efficacy of the stipulated cleaning process and the prevention of the cross-contamination and microbiological contamination that inadequate cleaning in multi-product manufacturing could cause, through cleaning validation of a multi-purpose facility used to produce five biopharmaceutical products as sterile injections. After production of five biopharmaceutical products (hGH, rhGCSF, rhEPO, rhFSH and rhIFN) using a vial filling machine, cleaning validation was carried out, comprising residual analysis of active ingredients or human serum albumin, measurement of total organic carbon (TOC), residual analysis of detergent, and testing for microbiological contamination. In the rhGH and rhGCSF cleaning validations, no drug residues were detected. Furthermore, in the rhEPO, rhFSH and rhIFN cleaning validations, no human serum albumin residues were detected. In the TOC analysis, all cleaning validations gave an average TOC of about 137.93%, not more than the acceptance criterion of 150%. In the sodium analysis used to check for cleaning agent residues, no sodium residues were detected. The sterility test showed no microbiological contamination by bacteria or fungi. Thus, this cleaning validation was judged successful in preventing cross-contamination and ensuring safety in the multi-purpose facility.