International Journal of Machine Tools & Manufacture 42 (2002) 663–674

Selecting an artificial neural network for efficient modeling and accurate simulation of the milling process
Jorge F. Briceno a, Hazim El-Mounayri a,∗, Snehasis Mukhopadhyay b

a Mechanical Engineering Department at Indiana University, Purdue University, Indianapolis (IUPUI), 723 W. Michigan Street, SL 260, Indianapolis, IN 46202-5132, USA
b Department of Computer and Information Science at IUPUI, 723 W. Michigan Street, Indianapolis, IN, USA

Received 16 August 2001; accepted 15 January 2002

Abstract

In this paper, two supervised neural networks are used to estimate the forces developed during milling. These two Artificial Neural Networks (ANNs) are compared based on a cost function that relates the size of the training data to the accuracy of the model. Training experiments are screened based on design of experiments. Verification experiments are conducted to evaluate the two models. It is shown that the Radial Basis Network model is superior in this particular case. Orthogonal design, and specifically equally spaced dimensioning, proved to be a good way to select the training experiments. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: End milling; Artificial neural networks; Back propagation; Radial basis

1. Introduction

As one of the most useful methods of metal cutting, the milling process removes material through chip formation by the two continuous motions of a tool and a workpiece (see Fig. 1).

Fig. 1. Flat-end milling process.

∗ Corresponding author. Tel.: +1-317-278-3320; fax: +1-317-274-9744. E-mail address: [email protected] (H. El-Mounayri).

In this case, the tool has a rotational motion (expressed by the spindle speed) and the workpiece a linear movement (expressed by the feed rate). The cutting edge is in contact with the material at many points, which change depending on the position of the edge relative to the material. This makes the process involved in terms of operational variables. Many parameters have to be defined to conduct this operation. Among the principal ones are spindle speed (tool rotational velocity), feed rate (workpiece velocity), tool diameter, helix angle, radial depth of cut (RDC), axial depth of cut (ADC), rake angle, clearance angle and number of flutes. These variables, together with the tool and workpiece materials, define the state of cutting, which controls the process parameters; the latter include tool wear, tool life, surface finish, etc. The forces developed during the milling process can be used to directly or indirectly measure or estimate such process parameters. In general, excessive cutting forces result in low product quality, while very small cutting forces often indicate low machining efficiency [1]. Thus, controlling these forces is of paramount importance. The majority of milling operations have been carried out based on cutting conditions determined from previous experience and/or existing machining data. On the

0890-6955/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0890-6955(02)00008-1


other hand, researchers have been trying to develop mathematical models that would predict the cutting forces based on the geometry and physical characteristics of the process. Such predictions could then be used to optimize the process. However, due to its complexity, the milling process still represents a challenge to the modeling and simulation research effort. In fact, most of the research work reported in this regard, which is based on either analytical or semi-empirical approaches, has in general shown only limited levels of accuracy and/or generality. In the present paper, a different approach, based on advanced artificial intelligence techniques, is implemented and tested. More specifically, two different neural networks are used to predict the forces developed during end milling. The networks are then compared and the "best" network is selected based on certain criteria.

2. Literature review

The relatively new methodology of Artificial Neural Networks (ANNs), inspired by biological nervous systems, has found application in many real-world problems. One of the first engineering milestones was the work of Minsky and Papert on perceptrons in 1969. The field then stayed dormant until about 1986, when the "PDP group" of Rumelhart and McClelland [2] published a two-volume book on explorations in the microstructure of cognition. It is only in the past few years that this methodology has been applied to metal-cutting operations. In [3], a feed-forward neural network algorithm is implemented to predict flank wear in orthogonal turning; feed rate, cutting speed and force ratio are used as inputs. Liu and Wang [4] also propose a back propagation (BP) ANN for on-line modeling of the milling system. However, this study has several limitations, the most important of which is the use of a single machining parameter as the variable input. In [5], a more efficient model is created using a BP ANN (with the Levenberg–Marquardt approach). In this case, three inputs are considered, with several levels for each parameter. This approach has the disadvantage of requiring too many experiments to train the ANN, which, in terms of industrial usability, is unattractive and expensive. Radial Basis Networks (RBNs), a neural network architecture different from the multi-layer BP ANN, have been used mainly for pattern recognition. However, recent studies have indicated that this important network can be successfully used as a function modeler as well. Cook and Chiu [6] used a radial basis network as a framework to establish some network improvements considering a time series model of a manufacturing process. Cheng and Lin [7] used three ANNs to estimate bending angles formed by laser; the RBN proved to be superior to the other models. Elanayar and Shin [8] utilized an RBN to predict tool wear based on certain machining conditions. A more general representation of the milling process cannot be found in the literature. In addition, no work has been conducted yet to evaluate and compare different artificial neural networks used to model the milling process.

3. Artificial neural network models of the milling process

In the current work, two supervised neural networks for modeling the milling process are compared. The first is a back propagation neural network (BP) with log-sigmoid transfer functions in the hidden layers and a linear transfer function in the output layer; the second is a radial basis network (RBN) with Gaussian activation functions. The first ANN is very popular, especially in the area of manufacturing modeling, as its design and operation are relatively simple. The radial basis network has some additional advantages, such as rapid convergence and smaller error. In particular, the most commonly used RBNs involve fixed basis functions with linearly appearing unknown parameters in the output layer. In contrast, multi-layer BP ANNs involve adjustable basis functions, which result in nonlinearly appearing unknown parameters. It is commonly known that the linearity in parameters in an RBN allows the use of least-squares-error-based updating schemes that converge faster than the gradient-descent methods used to update the nonlinear parameters of a multi-layer BP ANN. On the other hand, it is also known that the use of fixed basis functions in an RBN results in exponential complexity in terms of the number of parameters, while the adjustable basis functions of a BP ANN can lead to much lower complexity in terms of the number of parameters or network size [9]. However, in practice, the number of parameters in an RBN starts becoming unmanageably large only when the number of input features increases beyond about 10 or 20, which is not the case in our study. Hence, the use of an RBN was practically possible for our problem. The MATLAB Neural Network Toolbox was used as a platform to create the networks.

3.1. Back-propagation neural network (BPNN)

Since the objective is to evolve a model that relates selected inputs to outputs, the BPNN constitutes an excellent tool to approximate such a function. The general network topology is shown in Fig. 2. This network is composed of several neurons or processing elements (PEs) operating in parallel. The PEs are arranged in different sections or layers: an input layer, hidden layer(s) and an output layer. Each layer is connected to other layers through the weight lines that


come from each PE. The architecture of each PE is shown in Fig. 3. In general terms, the operation of this type of network can be described in terms of two major phases: the feed-forward phase and the back-propagation phase.

Fig. 2. Back-propagation network topology.

3.1.1. Feed-forward phase

The input patterns are represented by the input PEs; here no calculation is made. The next set of neurons is found in the hidden layer(s). From the ith input PE, the information is conducted to the jth PE in the hidden layer through the weight $W_{ij}$. As depicted in Fig. 3, the incoming data at such an element is represented by

$a_j = \sum_{i=0}^{n} W_{ij} I_i$  (1)

where $a_j$, the linear combination of each $I_i$ multiplied by $W_{ij}$, is the value fed to the activation function; $I_i$ is the ith input; $W_{ij}$ is the weight from the ith input PE to the jth hidden PE; and $n$ is the number of connections incoming to the jth PE. The squashing function applied to $a_j$ gives the output of the jth PE to the next layer(s). This output is given by

$Y_j = SF_l(a_j)$  (2)

where $Y_j$ is the output value of the jth element and $SF_l$ is the squashing function (or activation function) of the lth hidden layer. In this paper, the squashing functions used in the hidden and output layers are the log-sigmoid transfer function and the linear transfer function, respectively. The value of $Y_j$ is propagated through each further layer until the output is generated.

Fig. 3. Architecture of an individual PE for BP.

3.1.2. Back-propagation phase

In this phase the learning process is conducted. In general terms, the implementation of BP consists of updating the network weights in the direction in which the performance function decreases most rapidly. Once the output ($Y_j$) is calculated, it is compared with the target value ($t_j$). Then the following error is computed:

$e_j = \frac{1}{2}(t_j - Y_j)^2$  (3)

This error $e_j$ corresponds to just one output PE. Therefore the overall error (the vector $E$) is expressed by

$E = (e_1, \ldots, e_j, \ldots, e_k)$  (4)

where $k$ is the number of outputs. The error is then transmitted backwards from the output layer to the input layer. The connection weights are updated at each PE, leading the network to converge. Several techniques can be used to conduct this back-propagation. One of the most widely used is the Levenberg–Marquardt technique, which approximates the Hessian matrix with the product of the Jacobian matrix and its transpose. In this way, the weight update is based on the following equation:

$W_{ij}^{new} = W_{ij}^{old} - \frac{J^T \delta}{J^T J + \mu I}$  (5)

where $W_{ij}^{new}$ is the corrected weight for the jth PE coming from the previous layer; $W_{ij}^{old}$ is the previous weight for the jth PE from the previous layer; $J$ is the Jacobian matrix containing the first derivatives of the network errors with respect to the network weights; $\delta$ is the error signal for the jth PE; and $\mu$ is a scalar factor (when equal to zero, the method is second-order Newton's, while when set to a large number, it becomes gradient descent with a small step size).

This network offers a good generalization methodology and fast convergence using the Levenberg–Marquardt algorithm. In the same way, regularization is used to improve generalization, through automated regularization based on a Bayesian framework. For this particular case, since the size of the data is relatively


small and based on White's theorem [10] (which states that one layer with non-linear activation functions is sufficient to map any non-linear functional relationship with a reasonable level of accuracy), a single hidden layer neural network was utilized and the number of weights was kept at around 3/4 of the number of experiments:

Number of weights = (Number of experiments) × (3/4)

Normally this factor is about 1/10, but due to the small size of the data in this particular case a factor of 3/4 was used, which still resulted in more data points than unknown weights. The effect of topology is also studied by considering different cases. The topologies are varied by varying the number of neurons in the hidden layer ($n$ in Fig. 2) between a lower limit of 2 and an upper limit of 3/4 of the total number of experiments. The lower limit was selected based on the fact that one neuron in the hidden layer represents a model in which a linear relation is implied between the inputs and outputs. The following notation is used to describe the topology: 3.n.4, which means 3 inputs, $n$ neurons in the hidden layer and 4 outputs.
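To make the feed-forward relations concrete, the following minimal Python sketch (an illustration added here; the original work used the MATLAB Neural Network Toolbox, so all names are assumptions) evaluates one pass of a 3.n.4 network. Note that the tabulated weight counts w = 14, 21, 28, … match (3 + 4)n, i.e. the connection weights of Eqs. (1)–(2) without biases.

```python
import numpy as np

def logsig(a):
    """Log-sigmoid squashing function used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-a))

def bp_forward(x, W1, W2):
    """Feed-forward pass of a 3.n.4 network.
    Eq. (1): a_j = sum_i W_ij * I_i ; Eq. (2): Y_j = SF(a_j).
    The output layer is linear, as in the paper."""
    a_hidden = W1 @ x            # Eq. (1) for every hidden PE at once
    y_hidden = logsig(a_hidden)  # Eq. (2), log-sigmoid hidden layer
    return W2 @ y_hidden         # linear output layer -> [MAX, MIN, MEAN, STDV]

# Topology 3.2.4: (3 + 4) * 2 = 14 weights, matching w = 14 in Tables 1 and 4
rng = np.random.default_rng(0)
x = rng.random(3)                 # normalized [feed rate, spindle speed, RDC]
W1 = rng.standard_normal((2, 3))  # input -> hidden weights
W2 = rng.standard_normal((4, 2))  # hidden -> output weights
print(bp_forward(x, W1, W2))
```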

3.2. Radial basis network

This neural network utilizes the Gaussian curve to map values. The RBN works considerably well in function approximation: it converges very quickly and is very simple to define in terms of a small number of characteristic parameters. A radial basis network (RBN), or radial basis function network, is a two-layer, fully interconnected neural network. It has two general characteristics: first, it may require more neurons than the standard feed-forward BP networks; second, it can be designed in a fraction of the time that it takes to train the aforementioned BP. A typical RBN is shown in Fig. 4. The network has n inputs and k outputs. The first layer is connected with the second or internal layer by weights that come from the input elements and the bias element. Weights from the internal layer to the outputs are also defined. Each element in the internal layer receives an input pattern vector and compares it with the mean weight vector that connects the input with the second layer. This weight vector determines the position of the center of the radial hidden element in the input space. Here, the activation function is similar to a Gaussian density function. This function is defined as follows:
$Y_{ki} = e^{-\sum_h (u_{ih} - a_{ih})^2 C / V^2}$  (6)

Here $Y_{ki}$ is the response of the ith element in the hidden layer. The weights $u_{ih}$ define the mean value vector associated with each hidden PE, and $a_{ih}$ represent the inputs. The parameter $V$ shapes the form of the squashing function and is called the spread factor; $C$ is a constant. The architecture of the hidden-layer PEs can be seen in Fig. 5. Finally, the connection weights between the second layer and the output layer are multiplied by the outputs of the internal elements (linear summation function), giving the output value to be compared with the target vectors:
$z_{kj} = \sum_{i=0}^{p} W_{ij} Y_{ki}$  (7)
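A hedged Python sketch of this forward pass, taking the constant $C$ of Eq. (6) as 1 for simplicity (an assumption of this sketch, not stated in the paper), is:

```python
import numpy as np

def rbn_forward(x, centers, W_out, spread):
    """RBN forward pass. Eq. (6): Gaussian response of each hidden PE
    (with C assumed equal to 1); Eq. (7): linear summation into the outputs."""
    d2 = ((centers - x) ** 2).sum(axis=1)  # sum_h (u_ih - a_ih)^2 per hidden PE
    y_hidden = np.exp(-d2 / spread ** 2)   # Eq. (6), spread factor V
    return W_out @ y_hidden                # Eq. (7)

# Example: 5 hidden PEs in the 3D (feed, speed, RDC) input space, 4 outputs
rng = np.random.default_rng(1)
centers = rng.random((5, 3))
W_out = rng.standard_normal((4, 5))
print(rbn_forward(rng.random(3), centers, W_out, spread=0.2))
```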

The radial basis network is a very efficient network when function approximation is needed. This artificial neural network has the following characteristics:
1. it is very fast in comparison to back-propagation;
2. it has the ability to represent nonlinear functions;
3. it does not experience the local minima problems of back-propagation.
RBN is being used for an increasing number of applications, providing a very helpful modeling tool.

Fig. 4. Radial basis network architecture.

Fig. 5. Radial basis neuron.


In summary, two parameters need to be defined: the spread factor and the goal factor. The spread factor $V$ has to be specified depending on the particular case at hand; it has to be smaller than the highest limit of the input data and larger than the lowest limit [11]. Based on this, and given that all the training data (as will be explained in a later section) are mapped between 0 and 1, three values are considered: 0.2, 0.5 and 0.8. The goal factor is set to zero, since error is a decisive factor in this study.
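A zero goal factor forces the network to reproduce its training set exactly. A minimal sketch of such an exact design (one Gaussian PE per training point with the output weights obtained from a linear solve — the general idea behind an exact RBN design, not the Toolbox's literal implementation) is:

```python
import numpy as np

def exact_rbn_design(X, T, spread):
    """Place one Gaussian PE on every training point and solve the linear
    output layer so the training error (the 'goal') is exactly zero."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    Phi = np.exp(-d2 / spread ** 2)                      # hidden responses, m x m
    return np.linalg.solve(Phi, T)                       # zero-error output weights

# 27 training points (first set) mapped to [0, 1], 4 outputs, spread = 0.2
rng = np.random.default_rng(2)
X, T = rng.random((27, 3)), rng.random((27, 4))
W = exact_rbn_design(X, T, 0.2)
```

This construction is also why the RBN training fit reported later (Section 5.1) is perfect (R = 1) for every set.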

4. Experimental data for training the ANN models

4.1. Experimental set-up

The three components of the cutting force are measured using a Kistler 9257B dynamometer. They were sampled at 2500 Hz for 10 s each and stored in spreadsheet-format files. The machine tool used for all the experiments in this work is a FADAL VMC3016L 4-axis CNC milling machine. The experiments were conducted using a 1/4 in. diameter, 2-flute, HSS, Do-All end mill. The tool geometry parameters were a 14° rake angle, a 16° primary clearance angle, and a 37.5° helix angle. This tool is designed specifically for non-ferrous metals like aluminum and has a higher rake angle. The data acquisition package used was LabVIEW. The set-up can be seen in Fig. 6.

4.2. Design of experiments

Design of experiments (DOE) is utilized here to determine the optimum number of experiments needed to successfully model the process within the required accuracy. This technique came into the picture as a link between statistical design and engineering knowledge. The literature on experimental design is extensive, and this paper is not intended to cover experimental design techniques in detail; detailed information can be found in Ross [12]. Experimental design is made up of three stages. First, system design: in this phase, the flat-end milling experimental set-up is built, including the dynamometer to measure the required forces. Second, parameter design: here the variables that are involved in the process are evaluated; in this particular case, orthogonal arrays are used to host the variations of process parameters. Third, tolerance design, which is not considered here, as this study aims at comparing two artificial neural networks. The present work constitutes a first step, and eventually further enhancements and refinements would be needed.

4.3. Set of experiments

As noted earlier, there are a number of machining parameters that significantly affect the milling process. Of these parameters, spindle speed, feed rate and depth of cut have been varied in the current experiments and the cutting force variation with time recorded. Other parameters such as tool diameter, rake angle, etc. are kept constant for the scope of this study. In fact, the selected parameters are very critical in the flat-end milling process and should provide a basis for meaningful results for comparing the two models. In order to select the data to be used in the training phase, several experimental sets were designed. All these sets represent states or points in a 3D space, since only 3 parameters were selected.

4.3.1. First set of experiments

The first set consists of 27 experiments. Three values were selected for each parameter, which gives 3³ = 27 experiments (full factorial). The ranges of values were selected based on recommendations given in [13]. Next, DOE is applied. Since no sensitivity relation is known at this stage, equally spaced division is used to set the particular values. This results in the following: feed rate (mm/min): range 100–200, selected values 100, 150 and 200; spindle speed (rpm): range 600–1800, selected values 600, 1200 and 1800; radial depth of cut (%D): range 0–100, selected values 25, 62.5 and 100. The corresponding space is shown in Fig. 7.

4.3.2. Second set of experiments

For this set, the work-space is divided as shown in Fig. 8 (bold points are in different RDC planes). Again an equally spaced division is used. In this set, the number of states in the work-space has increased. In fact, the total number of experiments (full factorial) is then 5³ = 125, 27 of which are already in the first set. This

Fig. 6. Experimental set-up.


Fig. 7. First set of experiments.

Fig. 8. Second set of experiments.

Fig. 9. Third set of experiments.

Fig. 10. Fourth set of experiments.

reduces the number of additional experiments needed to reach a full factorial to 98. As mentioned above, the second set is represented by equally spaced states inside the defined range, and subsequent sets partially cover the rest of the space. The second set has in total 35 experiments: 27 from the first set plus an additional 8 experiments (see Fig. 8). It is important to point out that these eight additional experiments are in different RDC (radial depth of cut) planes from the ones used in the previous set. Again, they are equally spaced.

4.3.3. Third set of experiments

This set consists of 12 additional experiments (see Fig. 9), which results in a total of 47 experiments.

4.3.4. Fourth set of experiments

Eighteen additional points inside the range, as shown in Fig. 10, are considered, resulting in a total of 65 experiments. In summary, four different experimental sets are defined to be used in the training phase. Each set is used to train each one of the ANN models.
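For concreteness, the first training set can be generated as a full factorial over the three equally spaced levels quoted above (a Python sketch; the five-level grid of the complete work-space is obtained the same way):

```python
from itertools import product

# First set: 3 equally spaced levels per parameter -> 3^3 = 27 experiments
feed_rates     = [100, 150, 200]    # mm/min
spindle_speeds = [600, 1200, 1800]  # rpm
radial_docs    = [25, 62.5, 100]    # radial depth of cut, % of tool diameter

first_set = list(product(feed_rates, spindle_speeds, radial_docs))
print(len(first_set))  # 27; with 5 levels per axis the full factorial is 5^3 = 125
```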

4.3.5. Validation set

This set (made of 20 new experiments) is used to compare the measured values with the ones predicted by the ANNs. These experiments also support the determination of the optimum amount of representative training data. All experiments were performed using the above-mentioned milling machine. Forces in the X-, Y- and Z-directions were measured and found to be periodic.

4.4. Data pre-processing

After collecting the force components, the resultant force R was calculated using the following equation:

$R = \sqrt{F_x^2 + F_y^2 + F_z^2}$  (8)

The maximum (MAX), minimum (MIN), mean (MEAN) and standard deviation (STDV) values of this resultant force are calculated for each experiment (as they represent important characteristics of a continuous force pattern). Next, the data is normalized in order to make it suitable for the training process [11]. This was done by mapping each term to a value between 0 and 1 using the following formula:


Table 2
Error values of BP network, 1st experimental set, topology 3.2.4

MAX     MIN     MEAN    STDV
0.1293  0.0353  0.5958  0.0832
0.4576  0.048   0.1146  0.1396
0.7888  0.088   0.0775  0.2374
0.6513  0.0241  0.2677  0.2231
0.002   0.0002  0.3962  0.0399
0.0837  0.0356  0.2639  0.0494
0.6995  0.2055  0.0747  0.2582
0.5869  0.4021  0.0476  0.164
0.3958  0.2684  0.167   0.1484
0.4416  0.0148  0.4065  0.0945
0.5721  0.0447  0.1487  0.2264
0.1314  0.473   0.1748  0.1092
0.721   0.178   0.0093  0.1913
0.0028  0.0217  0.4942  0.0382
0.5879  1.4483  0.9077  0.109
0.0794  0.0158  0.3518  0.0318
0.4278  0.2741  0.205   0.2454
0.6645  0.0548  0.0371  0.2152
0.4569  0.2025  0.1407  0.1116
0.9767  0.0218  0.1056  0.3863

Fig. 11. General ANN topology.

$N = \frac{(R - R_{min})(N_{max} - N_{min})}{R_{max} - R_{min}} + N_{min}$  (9)

where $N$ is the normalized value of the real variable; $N_{min}$ and $N_{max}$ are the minimum and maximum values of normalization, respectively; $R$ is the real value of the variable; and $R_{min}$ and $R_{max}$ are the minimum and maximum values of the real variable, respectively. The normalized data were utilized as the inputs (machining conditions) and outputs (characteristics of the resultant force) to train the ANN. In other words, two vectors are formed in order to train the neural network (see Fig. 11): Input = [feed rate; spindle speed; radial depth of cut]; Output = [MAX; MIN; MEAN; STDV].
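Eqs. (8) and (9) amount to a few lines of array arithmetic. A hedged sketch of this pre-processing follows (function and variable names, including the `records` iterable, are illustrative and not from the original code):

```python
import numpy as np

def force_features(fx, fy, fz):
    """Reduce one sampled force record to the four training targets."""
    r = np.sqrt(fx**2 + fy**2 + fz**2)  # Eq. (8): resultant force
    return np.array([r.max(), r.min(), r.mean(), r.std()])  # MAX, MIN, MEAN, STDV

def normalize(v, n_min=0.0, n_max=1.0):
    """Eq. (9): map each variable linearly into [n_min, n_max],
    applied column-wise over all experiments."""
    rng_span = v.max(axis=0) - v.min(axis=0)
    return (v - v.min(axis=0)) * (n_max - n_min) / rng_span + n_min

# e.g. outputs = normalize(np.array([force_features(*rec) for rec in records]))
```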

Table 3
Values to report

        MAXIMUM  MINIMUM  MEAN    STDV
Mean    0.4428   0.1928   0.2493  0.1551
Stdv    0.2853   0.3261   0.2235  0.0924

5. Results

5.1. Training results

Table 1
Linear regression (R) for training phase (using BP)

Set   W    MAX    MIN    MEAN   STDV
1st   14   0.987  0.981  0.984  0.992
      21   0.995  0.983  0.993  0.995
2nd   14   0.973  0.965  0.974  0.973
      21   0.983  0.97   0.981  0.974
      28   0.984  0.985  0.986  0.975
3rd   14   0.966  0.917  0.974  0.951
      21   0.973  0.922  0.977  0.958
      28   0.973  0.952  0.987  0.959
      35   0.977  0.955  0.988  0.968
4th   14   0.958  0.912  0.974  0.946
      21   0.967  0.924  0.978  0.953
      28   0.968  0.945  0.983  0.953
      35   0.969  0.933  0.987  0.955
      42   0.97   0.951  0.986  0.956
      49   0.97   0.951  0.988  0.955

Each experimental set (except the validation set) is used to train each network. This training is repeated for each topology. The performance is measured by the linear regression (R) of each output. With this analysis it is possible to determine the response of the network with respect to the targets: a value of 1 indicates that the network perfectly simulates the training set, while 0 means the opposite. For all the cases in this study, the values of R (for all output sets) are shown in Table 1. The RBN showed a perfect fitting pattern (R = 1 for all cases), as expected, since the goal error factor is set to zero.

5.2. Validation results of the BP model and RBN model

For each network, the difference between the real value and the predicted value is calculated, producing a matrix of 20 by 4 elements: 20 validation experiments (rows) and 4 output parameters (columns).
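The R of Table 1 can be computed per output as the correlation between network responses and targets (a sketch; the paper's MATLAB tooling provides this through its regression analysis, so this Python form is an equivalent stand-in, not the original code):

```python
import numpy as np

def regression_R(predicted, targets):
    """One R per output column: 1 means the network reproduces the
    training targets exactly, 0 means no linear relation."""
    return np.array([np.corrcoef(predicted[:, j], targets[:, j])[0, 1]
                     for j in range(targets.shape[1])])
```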


Table 4
Results from BP. ERROR (mean and stdv), (real − predicted) × 10²

                  MAXIMUM          MINIMUM          MEAN             STDV
Topology          Mean    Stdv     Mean    Stdv     Mean    Stdv     Mean    Stdv
1st Set — 27 Experiments
3.2.4 (w=14)      0.4428  0.2853   0.1928  0.3261   0.2493  0.2235   0.1551  0.0924
3.3.4 (w=21)      0.4328  0.2346   0.2161  0.3018   0.2168  0.1636   0.1672  0.0874
2nd Set — 35 Experiments
3.2.4 (w=14)      0.3396  0.2053   0.13    0.3133   0.2128  0.2173   0.1073  0.0896
3.3.4 (w=21)      0.2726  0.1802   0.1194  0.2981   0.1332  0.1711   0.1082  0.0628
3.4.4 (w=28)      0.2554  0.1783   0.133   0.2325   0.1404  0.1497   0.1035  0.0591
3rd Set — 47 Experiments
3.2.4 (w=14)      0.3524  0.2144   0.1127  0.2194   0.2078  0.1887   0.1009  0.0844
3.3.4 (w=21)      0.2293  0.2002   0.1128  0.1855   0.1309  0.1397   0.09    0.0485
3.4.4 (w=28)      0.253   0.1815   0.1215  0.1132   0.079   0.0945   0.0843  0.0471
3.5.4 (w=35)      0.2394  0.2018   0.1309  0.1092   0.0906  0.1131   0.0769  0.0478
4th Set — 65 Experiments
3.2.4 (w=14)      0.2891  0.2028   0.0859  0.212    0.2262  0.1934   0.079   0.0628
3.3.4 (w=21)      0.1905  0.187    0.0794  0.1981   0.138   0.1469   0.0569  0.0424
3.4.4 (w=28)      0.1884  0.1742   0.0998  0.1167   0.1255  0.1321   0.0517  0.0429
3.5.4 (w=35)      0.1923  0.1661   0.1307  0.1147   0.0978  0.1098   0.0515  0.0332
3.6.4 (w=42)      0.1935  0.1733   0.0809  0.091    0.0974  0.0797   0.0551  0.0394
3.7.4 (w=49)      0.1988  0.1715   0.086   0.0893   0.0892  0.0624   0.0533  0.0388

For each column, the mean and standard deviation are calculated. These two values represent the mean error and the standard deviation of the error for each output element, respectively. In this way, a vector of two elements is used to make the comparison. To illustrate the calculations, an example is presented. For the back-propagation network, using the 1st experimental set with topology 3.2.4, the error is calculated as follows:

$e_{ij} = |m_{ij} - p_{ij}|, \quad i = 1 \ldots 20, \; j = 1 \ldots 4$  (10)

where $i$ refers to the experiment number and $j$ refers to the jth output of the network; $e_{ij}$ is the error value of the ith machining condition state for the jth output; $m_{ij}$ is the measured value; and $p_{ij}$ is the predicted value. The calculated errors are shown in Table 2. From this table, the mean and standard deviation are calculated for each column; the reported results are shown in Table 3. This was done for each model, for each topology (in BP) and for each spread-factor combination (in the RB network). The results are shown in Table 4 (BP) and Table 5 (RBN).
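The whole validation report of Tables 2 and 3 reduces to Eq. (10) plus column statistics; a minimal sketch, assuming the sample (n − 1) definition of the standard deviation:

```python
import numpy as np

def validation_report(measured, predicted):
    """Eq. (10): absolute errors for the 20 x 4 validation matrix, then the
    column-wise mean and standard deviation reported in Table 3."""
    e = np.abs(measured - predicted)              # e_ij = |m_ij - p_ij|
    return e.mean(axis=0), e.std(axis=0, ddof=1)  # per-output mean error and stdv
```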

6. Methodology used to compare the two artificial neural networks

The selection of the "best network" is carried out in terms of accuracy and efficiency. The latter is measured by selecting the minimum number of training experiments that results in a sufficiently accurate model. It is known that the larger the training set, the more accurate the evolved model. Consequently, a cost function is needed to evaluate the simultaneous influence of the training set's size and the model's accuracy.


Table 5
Results from RBN. ERROR (mean and stdv), (real − predicted) × 10²

                  MAXIMUM          MINIMUM          MEAN             STDV
Spread            Mean    Stdv     Mean    Stdv     Mean    Stdv     Mean    Stdv
1st Set — 27 Experiments
0.2               0.3758  0.2893   0.3143  0.2858   0.2954  0.1277   0.1911  0.1071
0.5               0.5111  0.28     0.2909  0.2207   0.1925  0.193    0.1685  0.0819
0.8               0.7133  0.5511   0.2084  0.2551   0.8924  0.3299   0.1149  0.1062
2nd Set — 35 Experiments
0.2               0.224   0.1981   0.1147  0.1621   0.1497  0.0895   0.09    0.0825
0.5               0.2602  0.2129   0.2183  0.1809   0.1813  0.1537   0.0892  0.0793
0.8               0.2532  0.2188   0.2048  0.1717   0.1754  0.1452   0.0917  0.0754
3rd Set — 47 Experiments
0.2               0.2794  0.263    0.0753  0.0989   0.1379  0.1046   0.0928  0.0982
0.5               0.4704  0.4654   0.2839  0.2399   0.2028  0.1552   0.1197  0.1107
0.8               0.5866  0.5228   0.2995  0.235    0.2051  0.1761   0.1346  0.1151
4th Set — 65 Experiments
0.2               0.2272  0.2402   0.0337  0.0401   0.1506  0.1068   0.0621  0.0586
0.5               2.2885  2.5721   0.7389  0.8868   0.3053  0.2742   0.8367  0.9921
0.8               4.7221  5.2183   1.4937  1.7041   0.3416  0.2734   1.8317  2.0805

6.1. Establishment of the cost function

The cost function (C) is set to relate the following parameters:
1. the number of experiments (NE);
2. the error of prediction, in terms of two important variables: the maximum resultant force (EMAX) and the mean resultant force (EMEAN).

Therefore, the overall cost function is given by

$C = \gamma_1 \frac{NE}{N} + \gamma_2 \frac{E_{MAX}}{E} + \gamma_3 \frac{E_{MEAN}}{E}$  (11)

where $\gamma_i$ (i = 1, 2, 3) are the weights of the corresponding terms, $N$ is the maximum number of possible experiments (in this case 125, which represents the full factorial condition), and $E$ is the maximum allowed error (set to 30 N). The last value was selected because this error is relatively small compared to the magnitude of the forces developed during the milling experiments conducted here.
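Eq. (11) is straightforward to evaluate; the following sketch (argument values other than N = 125 and E = 30 N are placeholders) computes the cost for a candidate model:

```python
def cost(ne, e_max, e_mean, g1=0.8, g2=0.5, g3=0.5, n=125, e_lim=30.0):
    """Eq. (11): unitless cost combining training-set size and accuracy.
    ne: number of training experiments; e_max, e_mean: prediction errors in N."""
    return g1 * ne / n + g2 * e_max / e_lim + g3 * e_mean / e_lim

# e.g. cost(35, e_max=25.0, e_mean=12.0) for a hypothetical 35-experiment model
```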

Eq. (11) shows that the closer NE is to N, the higher the value of C. This is compensated by the fact that the error would then be much smaller than E. On the other hand, using a small NE, the cost is reduced by the first term but augmented by the last two terms, since accuracy would be compromised. In addition, the equation is set to be unitless in order to provide a fair basis of comparison. The experimental set that gives the least cost is the set selected for the particular ANN model. Then the two networks are compared.

The weights of each parameter (Eq. (11)) are selected based on the needs of this study. Previous studies have been criticized for the number of experiments required in the training phase: normally the use of artificial neural networks requires a large number of experiments for training. For this very reason, the heaviest weight is γ1 (the term that determines the influence of the number of experiments in the cost function), while γ2 and γ3 are set to smaller, equal values. The reason for choosing equal values for γ2 and γ3 is that EMAX and EMEAN represent forces that have the same relevance in terms of cost. The maximum force is important in this study due to the great significance that this particular variable has in tool breakage, while the mean force indicates the average force experienced during a cutting cycle, giving an indication of the total power used. Based on the above, the weights are selected as follows: γ1 = 0.8 and γ2 = γ3 = 0.5 (initially). Since the effect of EMAX and EMEAN on the cost function is important, a sensitivity analysis is carried out to see how the value of the cost function varies when the weights of these terms (γ2, γ3) are set to different values. These values are bounded above by γ1 = 0.8 (since this factor is the largest weight in Eq. (11)) and below by 0 (which corresponds to zero participation). Therefore, three values are considered for γ2 and γ3: 0.5, 0.7 and 0.2. The cost is then calculated based on Eq. (11), using the values of EMAX and EMEAN from Tables 4 and 5 for BP and RBN, respectively, for all the values of γ2 and γ3 (jointly denoted by γ below). The results are shown in Table 6 (BP) and Table 7 (RBN).

Table 6
Cost values (BP)

Set   W    Cost (γ=0.5)  Cost (γ=0.7)  Cost (γ=0.2)
1st   14   1.3263        1.7877        0.6342
      21   1.25546667    1.68853333    0.60586667
      28   N/A           N/A           N/A
      35   N/A           N/A           N/A
      42   N/A           N/A           N/A
      49   N/A           N/A           N/A
2nd   14   1.14466667    1.51293333    0.59226667
      21   0.90033333    1.17086667    0.49453333
      28   0.88366667    1.14753333    0.48786667
      35   N/A           N/A           N/A
      42   N/A           N/A           N/A
      49   N/A           N/A           N/A
3rd   14   1.23446667    1.60793333    0.67426667
      21   0.90113333    1.14126667    0.54093333
      28   0.85413333    1.07546667    0.52213333
      35   0.8508        1.0708        0.5208
      42   N/A           N/A           N/A
      49   N/A           N/A           N/A
4th   14   1.274833      1.618367      0.759533
      21   0.9635        1.1825        0.635
      28   0.939167      1.148433      0.625267
      35   0.8995        1.0929        0.6094
      42   0.900833      1.094767      0.609933
      49   0.896         1.088         0.608

Table 7
Cost values (RBN) — results correspond to spread factor = 0.2

Set   Cost (γ=0.5)  Cost (γ=0.7)  Cost (γ=0.2)
1st   1.29146667    1.73893333    0.62026667
2nd   0.84683333    1.09596667    0.47313333
3rd   0.9963        1.2745        0.579
4th   1.04566667    1.29753333    0.66786667

7. Discussion of results

For the training phase, Table 1 shows the effectiveness of the selected ANN architecture for BP: all R-values are over 0.9. The table shows that the more neurons in the hidden layer (higher W), the better the representation (higher R). By the same token, an increased number of experiments results in a reduction in the value of R, which is compensated by the addition of more PEs in the hidden layer. This tendency was expected because, with a larger training set, more neurons are needed to establish a good model. Since all R-values are sufficiently high, it is possible to conclude that any of these W combinations, in any set, can be used to successfully train the neural network. The same applies to the RBN, where all R-values are 1. Furthermore, this indicates that the DOE methodology can be applied with good results. From these results, it is possible to state that, based on training performance, the RBN is better than BP.

Table 4 shows that the 4th set produces smaller errors than the 1st set, because the former contains more experiments (more information about the process) than the latter. The table also shows the effect of increasing the number of neurons (PEs) in the hidden layer: the more neurons, the more accurate the model for a given experiment set. The magnitudes of the errors are relatively small compared with the forces developed during milling, which in this study vary between 200 and 1000 N. The results given by the radial basis network indicate that the smallest errors are reached when the spread value is equal to 0.2, for all sets. For this reason the cost calculation, and therefore the model comparison, is conducted using this particular value.

Tables 6 and 7 show interesting results pertaining to the cost values for BP and RBN, respectively. Each model has a different lowest-cost set. For BP, the set with the lowest cost is the third set, with 0.8508 and 1.0708 for γ=0.5 and γ=0.7, respectively; both values occur at W=35. For γ=0.2, the lowest cost corresponds to the second set at W=28. This trend is shown in Fig. 12, where cost is plotted against set. The tendency is due to the lower weight given to the error terms (γ=0.2) compared with the weight given to the number of experiments (NE). From these results, γ=0.5 can be considered as the basis of comparison, since this value does not underestimate the contribution of the error terms to the cost function. Therefore, the best model from this network is the one with 47 experiments, with a cost value of 0.8508 and a topology of 3 inputs, 5 neurons in the hidden layer and 4 outputs. For the radial basis network, the 2nd set gives the lowest cost regardless of the value of γ (see Fig. 13). Again, γ=0.5 is selected for comparison purposes. For this network, the best model is the one with 35 experiments, a cost value of 0.8468, and a spread value of 0.2.


Fig. 12. Cost vs sets (back-propagation network, using the lowest cost of each experimental set).

Fig. 13. Cost vs sets (RB network).

Based on these results (BP: cost = 0.8508 with 47 experiments; RBN: cost = 0.8468 with 35 experiments), and the fact that the radial basis network is, for this particular case, about 3 times faster to train than the back-propagation network, the model that best represents the functional relation between the considered milling parameters is the radial basis network. The selected network not only gave the lowest cost, but could also be trained with fewer experiments and is much faster. This indicates that, for this particular case, the RBN is more efficient than BP.

8. Conclusions and future work

In this paper, two supervised neural networks are used to successfully estimate the forces developed during the milling process. Design of experiments, specifically orthogonal arrangement, is used to select the experiments to perform and to establish the different sets that are considered

in each ANN model. DOE contributed to increasing the efficiency of the system by drastically reducing the amount of experimental data needed for successful training. Based on the results of this study, it is possible to conclude that, taking 5 equally spaced values of the selected milling parameters and applying an orthogonal arrangement, 35 experiments (out of 125) are enough to train and evolve an accurate ANN model of the end milling process. In back-propagation networks, the use of a single hidden layer proved to work sufficiently well for the process in consideration. However, it is shown that the radial basis network is superior to the back-propagation network in predicting the milling forces, when evaluated in terms of a cost function that combines the cost of experiments with accuracy. In this study, the cost function is defined based on specific needs; this fitness function can be refined in the future in order to represent the characteristics of the milling process more extensively. In the same way, it is


possible to design a more systematic methodology to select the spread factor in the radial basis network. This could increase the accuracy of the model.

References
[1] H.Y. Feng, N. Su, A mechanistic cutting force model for ball-end milling, Journal of Manufacturing Science and Engineering, November (1998).
[2] D.E. Rumelhart, J.L. McClelland, PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 2 vols., MIT Press, Cambridge, MA, 1986.
[3] Q. Liu, Y. Altintas, On-line monitoring of flank wear in turning with multi-layered feed-forward neural network, International Journal of Machine Tools & Manufacture 39 (1999) 1945–1959.
[4] Y. Liu, C. Wang, Neural network based adaptive control and optimisation in the milling process, International Journal of Advanced Manufacturing Technology 15 (11) (1999) 791–795.
[5] V. Tandon, Closing the gap between CAD/CAM and optimized CNC end milling, MSME Thesis, Purdue School of Engineering & Technology, 2000.

[6] D. Cook, C. Chiu, Combining a radial basis neural network with time series analysis techniques to predict manufacturing process parameters, Applied Artificial Intelligence 9 (6) (1995) 623–631.
[7] P.J. Cheng, S.C. Lin, Using neural networks to predict bending angle of sheet metal formed by laser, International Journal of Machine Tools & Manufacture 40 (1999) 1185–1197.
[8] S. Elanayar, Y.C. Shin, Design and implementation of tool wear monitoring with radial basis function neural networks, in: Proceedings of the 1995 American Control Conference, Part 3 (of 6), 1995, pp. 1722–1726.
[9] A.R. Barron, Neural net approximation, in: Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, 1992, pp. 68–72.
[10] R.C. Eberhart, P. Simpson, R. Dobbins, Computational Intelligence PC Tools, AP Professional, New York, 1996.
[11] H. Demuth, M. Beale, Neural Network Toolbox v3 User's Guide, The MathWorks Inc., USA, 1999.
[12] P.J. Ross, Taguchi Techniques for Quality Engineering, McGraw-Hill, New York, 1988.
[13] R.A. Walsh, McGraw-Hill Machining and Metalworking Handbook, McGraw-Hill, New York, 1994.
