Tuning Backpropagation Networks for Analyzing NIR Data

  • M. A. Hana (Department of Biological and Agricultural Engineering, North Carolina State University)
  • W. F. McClure (Department of Biological and Agricultural Engineering, North Carolina State University)
  • T. B. Whitaker (Department of Biological and Agricultural Engineering, North Carolina State University)
  • Published: 2001.06.01


Designing (specifying the number of nodes in each layer) and training (calibrating and validating) a back-propagation (BP) network for analyzing NIR data can be an arduous and time-consuming task. Training itself is somewhat trivial: a BP network may be trained by randomly dividing the data set (DS) into two parts, training the network with one part and checking its performance with the other. However, this procedure is plagued by a lack of objective information about network characteristics - the required number of nodes in the hidden layer(s) and the number of epochs needed to train for optimal performance. The work reported in this paper compares a BP network tuning procedure with a conventional reference (training and testing) procedure. The tuning procedure, believed to have several novel attributes, involved randomly dividing a data set into five groups. Each of the five groups was randomly subdivided into two groups, with 80% in a training set and 20% in a tuning set. Training was interrupted after every 100 epochs. During each interruption, network performance was checked against the tuning set, each time recording the mean-squared error (MSE) and the number of epochs (K) needed to reach that point. This procedure continued until a plot of MSE vs. total epochs identified a minimum MSE. The number of epochs required to achieve the minimum MSE was noted. Once optimized (or tuned), the network's performance was determined by testing it with all available data. One nice feature of the tuning method is that the entire process can be executed without user input - i.e., the whole process of developing and training a BP network becomes objective. Four different near-infrared data sets (A, B, C and D) were used in this work. Two of the data sets (A and B) were used to determine the concentration of nicotine in tobacco samples. The other two sets (C and D) were used as a basis for classifying tobaccos.
The optimum BP architectures for the four data sets consisted of 1, 5, 2 and 1 hidden units in the hidden layer, respectively. The suggested tuning method improved, though marginally in some cases, the true performances of all calibration models as well as their standard deviations. Since this work depended upon the artificial neural network (ANN) literature, a glossary of terms is given at the end of the paper. Results indicate improved performance using the tuning procedure. In addition, BP network calibrations were better than multiple linear regression (MLR) calibrations on the same data.
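
The core of the tuning procedure (an 80%/20% split into training and tuning sets, a check of the tuning-set MSE after every 100 epochs, and selection of the epoch count K with the lowest MSE) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tiny one-input network `TinyBP`, the function names, the learning rate, the fixed epoch budget `max_epochs`, and the synthetic sine data are all assumptions made for illustration; the sketch handles a single random split only and does not reproduce the five-group division or any NIR data.

```python
import math
import random

def split_train_tune(samples, train_fraction=0.8, seed=0):
    """Randomly divide samples into a training set and a tuning set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

class TinyBP:
    """Hypothetical one-input, one-output network with one tanh hidden layer."""
    def __init__(self, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, x):
        return [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(v * h for v, h in zip(self.w2, self.hidden(x))) + self.b2

    def train_epoch(self, data, lr=0.05):
        """One epoch of per-sample gradient descent on squared error."""
        for x, y in data:
            h = self.hidden(x)
            err = sum(v * hj for v, hj in zip(self.w2, h)) + self.b2 - y
            for j, hj in enumerate(h):
                # Back-propagate through the tanh unit before updating w2[j].
                grad_pre = err * self.w2[j] * (1.0 - hj * hj)
                self.w2[j] -= lr * err * hj
                self.w1[j] -= lr * grad_pre * x
                self.b1[j] -= lr * grad_pre
            self.b2 -= lr * err

def mse(net, data):
    return sum((net.predict(x) - y) ** 2 for x, y in data) / len(data)

def tune(net, train, tuning, check_every=100, max_epochs=1000):
    """Interrupt training every `check_every` epochs and record tuning MSE.

    Returns (K, minimum MSE, history), where K is the total number of
    epochs at which the tuning-set MSE was lowest within the budget.
    """
    history = []
    for k in range(check_every, max_epochs + 1, check_every):
        for _ in range(check_every):
            net.train_epoch(train)
        history.append((k, mse(net, tuning)))
    best_epochs, best_mse = min(history, key=lambda item: item[1])
    return best_epochs, best_mse, history

# Synthetic stand-in data (assumption): noisy samples of sin(x).
rng = random.Random(1)
xs = [rng.uniform(-3.0, 3.0) for _ in range(50)]
samples = [(x, math.sin(x) + rng.gauss(0.0, 0.1)) for x in xs]

train, tuning = split_train_tune(samples)
best_epochs, best_mse, history = tune(TinyBP(n_hidden=5), train, tuning)
print("K =", best_epochs, "minimum tuning MSE =", best_mse)
```

Because the selection of K depends only on the recorded (epochs, MSE) pairs, no user judgment enters the loop, which is the sense in which the procedure is objective. A fixed epoch budget is used here for simplicity; stopping as soon as the MSE curve has clearly passed its minimum would be an equally plausible reading of the procedure described above.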