• Title/Summary/Keyword: Cross - Validation

Classification of Ovarian Cancer Microarray Data based on Intelligent Systems with Marker gene (선별 시스템 기반 표지 유전자를 포함한 난소암 마이크로어레이 데이터 분류)

  • Park, Su-Young;Jung, Chai-Yeoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.3
    • /
    • pp.747-752
    • /
    • 2011
  • Microarray classification typically has two striking attributes: (1) classifier design and error estimation are based on remarkably small samples, and (2) cross-validation error estimation is employed in the majority of papers. Microarray data for ovarian cancer consist of the expression levels of tens of thousands of genes, and there is no systematic procedure for analyzing this information quickly. In this paper, marker genes are selected by ranking genes according to statistics, and popular classification rules (linear discriminant analysis, k-nearest-neighbor, and decision trees) are applied to compare classification accuracy with and without marker gene selection. Applying linear classification analysis to the microarray data set containing marker genes selected by the ANOVA method yields the highest classification accuracy, 97.78%, and the lowest prediction error estimate.
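The cross-validation error estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbor classifier and made-up two-feature toy data; the function names (`k_fold_indices`, `nearest_neighbor_predict`, `cross_validation_error`) are hypothetical, and the sketch does not reproduce the paper's ANOVA ranking, linear discriminant analysis, or decision trees.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query` (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]

def cross_validation_error(x, y, k=5, seed=0):
    """Estimate the misclassification rate by k-fold cross-validation."""
    errors = 0
    for fold in k_fold_indices(len(x), k, seed):
        held_out = set(fold)
        # Train only on the samples that are not in the held-out fold.
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in fold:
            if nearest_neighbor_predict(train_x, train_y, x[i]) != y[i]:
                errors += 1
    return errors / len(x)

# Toy data (illustrative only): two well-separated groups of samples.
toy_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2],
         [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.3, 5.2], [5.1, 5.0]]
toy_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

cv_error = cross_validation_error(toy_x, toy_y, k=5)
print(f"5-fold CV error estimate: {cv_error:.2f}")
```

With small samples, as the abstract notes, each fold holds only a few samples, so the estimate can vary noticeably with the choice of `k` and `seed`.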

Realization of home appliance classification system using deep learning (딥러닝을 이용한 가전제품 분류 시스템 구현)

  • Son, Chang-Woo;Lee, Sang-Bae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.9
    • /
    • pp.1718-1724
    • /
    • 2017
  • Recently, IoT (Internet of Things) smart plugs for real-time monitoring of household appliances have come into use. With them, consumers can save energy by monitoring energy consumption in real time and can reduce power consumption through alarm functions based on their own settings. In this paper, we measure the alternating current from a wall power outlet for real-time monitoring. The current pattern of each household appliance was classified, and deep learning was used to determine which appliance is operating. We used a cross-validation method and a bootstrap verification method to evaluate the classification performance by appliance type. We also confirmed that the cost function and the learning success rate are the same for the training data and the test data.

Named Entity Recognition for Patent Documents Based on Conditional Random Fields (조건부 랜덤 필드를 이용한 특허 문서의 개체명 인식)

  • Lee, Tae Seok;Shin, Su Mi;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.5 no.9
    • /
    • pp.419-424
    • /
    • 2016
  • Named entity recognition is required to improve the retrieval accuracy of patent documents or similar patents in the claims and patent descriptions. In this paper, we propose an automatic named entity recognition method for patents using conditional random fields, one of the best methods in machine learning research. The named entity recognition system was built from a tagged training corpus of 660,000 words, and 70,000 words were used as a test set for evaluation. The experiment shows an accuracy of 93.6% and a Kappa coefficient of 0.67 between manual tagging and the automatic tagging system. This figure is better than the Kappa coefficient of 0.6 for manually tagged results, and it shows that automatic named entity tagging can be used in practice for patent documents in place of manual tagging.
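The Kappa coefficient reported in this abstract measures agreement between two taggers beyond what chance would produce. The sketch below assumes the standard Cohen's kappa for two raters (the abstract does not say which variant was used); the function name `cohen_kappa` and the tag sequences are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two taggers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each tagger's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both taggers used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Made-up tag sequences for ten tokens (illustrative only).
manual_tags = ["ORG", "ORG", "O", "O", "O", "PER", "O", "O", "ORG", "O"]
auto_tags = ["ORG", "O", "O", "O", "O", "PER", "O", "O", "ORG", "O"]

kappa = cohen_kappa(manual_tags, auto_tags)
print(f"kappa = {kappa:.3f}")
```

In this toy case the taggers agree on 9 of 10 tokens, but because the tag "O" dominates, chance agreement is already 0.49, which is why kappa is lower than raw accuracy.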

Genetic Function Approximation and Bayesian Models for the Discovery of Future HDAC8 Inhibitors

  • Thangapandian, Sundarapandian;John, Shalini;Lee, Keun-Woo
    • Interdisciplinary Bio Central
    • /
    • v.3 no.4
    • /
    • pp.15.1-15.11
    • /
    • 2011
  • Background: Histone deacetylase (HDAC) 8, a member of the HDAC family, catalyzes the removal of acetyl groups from N-terminal lysine residues of histone proteins and thereby restricts transcription factors from being expressed. Inhibition of HDAC8 has become an emerging and effective anti-cancer therapy for various cancers. Applying computational methodologies may help identify the key components that can be used in developing future potent HDAC8 inhibitors. Results: To facilitate the discovery of novel and potential chemical scaffolds as starting points for future HDAC8 inhibitor design, quantitative structure-activity relationship models were generated from 30 training set compounds using genetic function approximation (GFA) and Bayesian algorithms. Six GFA models were selected based on the significant statistical parameters calculated during model development. A Bayesian model using fingerprints was developed with a receiver operating characteristic curve cross-validation value of 0.902. An external test set of 54 diverse compounds was used to validate the models. Conclusions: Two of the six models were selected as the final GFA models based on their predictive ability over the test set compounds. The Bayesian model displayed a high classifying ability with the same test set compounds, and it also revealed the positively and negatively contributing molecular fingerprints. The effectively contributing physicochemical properties and molecular fingerprints from a set of known HDAC8 inhibitors were identified and can be used in designing future HDAC8 inhibitors.

Prediction of the turning and zig-zag maneuvering performance of a surface combatant with URANS

  • Duman, Suleyman;Bal, Sakir
    • Ocean Systems Engineering
    • /
    • v.7 no.4
    • /
    • pp.435-460
    • /
    • 2017
  • The main objective of this study is to investigate the turning and zig-zag maneuvering performance of the well-known naval surface combatant DTMB (David Taylor Model Basin) 5415 hull with the URANS (Unsteady Reynolds-averaged Navier-Stokes) method. Numerical simulations of static drift tests have been performed in an unsteady manner by a commercial RANS solver based on the finite volume method (FVM). The fluid flow is considered 3-D, incompressible, and fully turbulent. Hydrodynamic analyses have been carried out for a fixed Froude number of 0.28. During the analyses, the free surface effects have been taken into account using the VOF (Volume of Fluid) method, and the hull is considered fixed. First, the code has been validated with the available experimental data in the literature. After validation, static drift, static rudder, and drift-and-rudder tests have been simulated. The forces and moments acting on the hull have been computed with the URANS approach. Numerical results have been used to determine the hydrodynamic maneuvering coefficients, such as the velocity terms and rudder terms. The acceleration, angular velocity, and cross-coupled terms have been taken from the available experimental data. A computer program has been developed to apply a fast maneuvering simulation technique. Abkowitz's non-linear mathematical model has been used to calculate the forces and moment acting on the hull during the maneuvering motion, and the Euler method has been applied to solve the simultaneous differential equations. Turning and zig-zag maneuvering simulations have been carried out, the maneuvering characteristics have been determined, and the numerical simulation results have been compared with the available data in the literature. In addition, viscous effects have been investigated using the Eulerian approach for several static drift cases.
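The Euler time-stepping of simultaneous differential equations mentioned in this abstract can be sketched as follows. This is only an illustration of the explicit Euler method: the two-state yaw model, its constants (`GAIN`, `TIME_CONSTANT`, `RUDDER_ANGLE`), and all function names are hypothetical stand-ins, not Abkowitz's non-linear model or any coefficient from the paper.

```python
def euler_integrate(derivative, state, t0, t_end, dt):
    """Integrate d(state)/dt = derivative(t, state) with the explicit Euler method."""
    n_steps = int(round((t_end - t0) / dt))
    t = t0
    state = list(state)
    for _ in range(n_steps):
        rates = derivative(t, state)
        # Advance every state variable with the same step, using the old state.
        state = [s + dt * r for s, r in zip(state, rates)]
        t += dt
    return state

# Hypothetical first-order yaw response, used only as a stand-in system.
GAIN = 0.2            # assumed rudder-to-yaw-rate gain, 1/s
TIME_CONSTANT = 10.0  # assumed time constant, s
RUDDER_ANGLE = 0.35   # assumed constant rudder angle, rad

def yaw_model(t, state):
    """Return [d(yaw_rate)/dt, d(heading)/dt] for the stand-in model."""
    yaw_rate, heading = state
    return [(GAIN * RUDDER_ANGLE - yaw_rate) / TIME_CONSTANT, yaw_rate]

final_rate, final_heading = euler_integrate(yaw_model, [0.0, 0.0], 0.0, 100.0, 0.1)
print(f"yaw rate after 100 s: {final_rate:.4f} rad/s")
print(f"heading after 100 s: {final_heading:.3f} rad")
```

The explicit Euler method is first-order accurate, so halving `dt` roughly halves the integration error; the step size here is an arbitrary choice for the toy system.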

Modelling of dissolved oxygen (DO) in a reservoir using artificial neural networks: Amir Kabir Reservoir, Iran

  • Asadollahfardi, Gholamreza;Aria, Shiva Homayoun;Abaei, Mehrdad
    • Advances in environmental research
    • /
    • v.5 no.3
    • /
    • pp.153-167
    • /
    • 2016
  • We applied multilayer perceptron (MLP) and radial basis function (RBF) neural networks at upstream and downstream water quality stations of the Karaj Reservoir in Iran. For both neural networks, the inputs were pH, turbidity, temperature, chlorophyll-a, biochemical oxygen demand (BOD), and nitrate, and the output was dissolved oxygen (DO). We used an MLP neural network with two hidden layers: 15 and 33 neurons in the first and second hidden layers, respectively, for the upstream station, and 16 and 21 neurons for the downstream station, which gave the minimum error. During the learning process, 6-fold cross-validation was applied to avoid overfitting. The best results were obtained from the RBF model, for which the mean bias error (MBE) and root mean squared error (RMSE) were 0.063 and 0.10 for the upstream station and 0.0126 and 0.099 for the downstream station. The coefficient of determination ($R^2$) between the observed and predicted data for the upstream and downstream stations was 0.801 and 0.904, respectively, for the MLP, and 0.962 and 0.97, respectively, for the RBF network. The MLP neural network gave acceptable results; however, the results of the RBF network were more accurate. A sensitivity analysis for the MLP neural network indicated that temperature was the most influential parameter, pH the second, and nitrate the least influential factor affecting the prediction of DO concentrations. The results demonstrated the workability and accuracy of the RBF model in predicting DO.
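The error measures named in this abstract (MBE, RMSE, and $R^2$) can be computed as in the sketch below. It assumes the common definitions: MBE as the mean of predicted minus observed values, and $R^2$ as one minus the ratio of residual to total sum of squares. The abstract does not state which conventions the authors used, and the DO values here are made up.

```python
import math

def mean_bias_error(observed, predicted):
    """Mean of (predicted - observed); positive values mean over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def root_mean_squared_error(observed, predicted):
    """Square root of the mean squared prediction error."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up dissolved oxygen values in mg/L (illustrative only).
observed_do = [8.0, 7.5, 9.1, 6.8]
predicted_do = [8.2, 7.4, 9.0, 7.0]

mbe = mean_bias_error(observed_do, predicted_do)
rmse = root_mean_squared_error(observed_do, predicted_do)
r2 = r_squared(observed_do, predicted_do)
print(f"MBE = {mbe:.3f}, RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```

MBE can be near zero even when individual errors are large, because positive and negative errors cancel, which is why it is usually reported together with RMSE.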

Temperature analysis of a long-span suspension bridge based on a time-varying solar radiation model

  • Xia, Qi;Liu, Senlin;Zhang, Jian
    • Smart Structures and Systems
    • /
    • v.25 no.1
    • /
    • pp.23-35
    • /
    • 2020
  • It is important to take thermal behavior into account when assessing the structural condition of bridges. An effective method of studying the temperature effect on long-span bridges is numerical simulation based on solar radiation models. This study aims to develop a time-varying solar radiation model that can account for real-time weather changes, such as cloud cover. First, a statistical analysis of the long-term monitoring data is performed, especially on the temperature data from the south and north anchors of the bridge, to confirm that the temperature difference can be used to describe real-time weather changes. Second, a defect in the traditional solar radiation model is identified in the temperature field simulation: the value of the turbidity coefficient tu is subjective and cannot describe weather changes in real time. Therefore, a new solar radiation model with a modified turbidity coefficient γ is established based on the temperature difference between the south and north anchors. Third, the temperature data of several days are selected for model validation. The results show that the simulated temperature distribution is in good agreement with the measured temperature, while the results calculated by the traditional model had minor errors because the turbidity coefficient tu is uncertain. In addition, the vertical and transverse temperature gradients of a typical cross-section and the temperature distribution of the tower are also studied.

Wind Speed Prediction in Complex Terrain Using a Commercial CFD Code (상용 CFD 프로그램을 이용한 복잡지형에서의 풍속 예측)

  • Woo, Jae-Kyoon;Kim, Hyeon-Gi;Paek, In-Su;Yoo, Neung-Soo;Nam, Yoon-Su
    • Journal of the Korean Solar Energy Society
    • /
    • v.31 no.6
    • /
    • pp.8-22
    • /
    • 2011
  • Modeling methods for WindSim, a CFD wind resource prediction program, were investigated using field measurements in order to obtain accurate wind speed predictions. Meteorological masts with heights of 40 m and 50 m were installed at two different sites in complex terrain. The wind speeds and directions were monitored by sensors installed on the masts and recorded for one year. Modeling parameters among the WindSim input variables were investigated by performing cross predictions of wind speeds at the masts using the measured data. Four parameters that most affect wind speed prediction in WindSim (the size of the topographical map, the cell sizes in the x and y directions, the height distribution factors, and the roughness lengths) were studied to find more suitable input parameters for better wind speed predictions. The parameters were then applied in WindSim to predict the wind speed at another location in complex terrain in Korea for validation. The predicted annual wind speeds were compared with the one-year averaged data measured at the meteorological masts installed for this study, and the errors were within 6.9%. The results of this practical study are expected to provide useful guidelines to wind engineers for obtaining more accurate predictions and saving time when predicting wind speeds in complex terrain, which are used to predict the annual energy production of a virtual wind farm in complex terrain.

A Study on the Management Innovation of the Spot (현장 중심적 경영혁신에 관한 연구)

  • Kim, Gee-Jung
    • Industry Promotion Research
    • /
    • v.2 no.2
    • /
    • pp.23-29
    • /
    • 2017
  • This study aims to suggest a new method for escaping the crisis in the manufacturing industry. We focused on the performance of field-oriented management innovation, divided into qualitative performance and quantitative performance. Site-centered management innovation is an organizational approach that helps managers quickly solve the problems faced by field management personnel by switching from manager-centered management to management oriented toward on-site personnel. It identifies the waste factors in the work, improves the process, and sets process standards according to the customer's needs. Before working on site, employees should know exactly the process standard for the work they will do and should frequently inspect the raw materials to make sure they meet the specifications. After the self-inspection is carried out, the standard is reviewed. If there is no abnormality, cross-validation is carried out when the work is passed on to the next step. The results of this study can be used to enhance the competitiveness of manufacturing companies. New management innovation techniques are required to adapt to the rapidly changing international business environment, and this study has implications for such research.

Cross Validation of Attention-Deficit/Hyperactivity Disorder-After School Checklist

  • Lee, Sukhyun;Kim, Bongseog;Yoo, Hanik K.;Huh, Hannah;Roh, Jaewoo
    • Journal of the Korean Academy of Child and Adolescent Psychiatry
    • /
    • v.29 no.3
    • /
    • pp.129-136
    • /
    • 2018
  • Objectives: This study aimed to evaluate the efficacy of the attention-deficit/hyperactivity disorder (ADHD)-After School Checklist (ASK) by comparing its results with those of the Comprehensive Attention Test (CAT) and the Clinical Global Impression-Severity (CGI-S) Scale and then by calculating the area under the receiver operating characteristic (ROC) curve. Methods: We performed correlation analyses on the ASK and CAT results and then on the ASK and CGI-S results. We created a ROC curve and evaluated the performance of the ASK as a diagnostic tool. We then analyzed the test results of 1348 subjects (male 56.8%), including 1201 subjects from the general population and 147 ADHD subjects, aged 6-15 years, from kindergarten to middle school in Seoul and Gyeonggi province, South Korea. Results: According to the correlation analyses, ASK scores and the Attention Quotient (AQ) of CAT scores showed a significant correlation of -0.20 to -0.29 (p<0.05). The t-test between ADHD scores and CGI-S also showed a significant correlation (t=-2.55, p<0.05). The area under the ROC curve was calculated as 0.81, indicating good efficacy of the ASK, and the cut-off score was calculated as 15.5. Conclusion: The ASK can be used as a valid tool not only to evaluate functional impairment in children and adolescents with ADHD but also to screen for ADHD.
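The area under the ROC curve and the use of a cut-off score described in this abstract can be sketched as follows. The sketch computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count one half). The checklist scores and labels are made up, the function names are hypothetical, and the assumption that higher scores indicate ADHD is ours; only the cut-off value 15.5 comes from the abstract.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up checklist scores (illustrative only); label 1 = ADHD, 0 = control.
toy_scores = [22, 18, 14, 19, 9, 12, 16, 7]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=15.5)
print(f"AUC = {auc:.4f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The pairwise loop is quadratic in the number of subjects, which is fine for a toy example; for a sample of the size in the abstract, a rank-based computation would be the usual choice.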