NOTE: my own summary and understanding of neural networks, put together with the official documentation.
```matlab
[x, t] = house_dataset;
Q = size(x, 2);
Q1 = floor(Q * 0.90);
Q2 = Q - Q1;
ind = randperm(Q);
ind1 = ind(1 : Q1);
ind2 = ind(Q1 + (1 : Q2));
x1 = x(:, ind1);
t1 = t(:, ind1);
x2 = x(:, ind2);
t2 = t(:, ind2);
net = feedforwardnet(10);
numNN = 10;
NN = cell(1, numNN);
perfs = zeros(1, numNN);
for i = 1 : numNN
    disp(['Training ' num2str(i) '/' num2str(numNN)])
    NN{i} = train(net, x1, t1);
    y2 = NN{i}(x2);
    perfs(i) = mse(net, t2, y2);
end
```
The flow of the program above: hold out 10% of the data as an independent test set, train ten networks on the remaining 90%, and record each network's MSE on the held-out set.
The idea is simply that every training run starts from different random initial weights, so training several times and keeping the best network raises the chance of a good solution.
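As a small follow-up (my addition), the trained networks and their test-set errors are already collected in NN and perfs, so selecting the best run is a one-liner; the variable names below are only illustrative:

```matlab
% Sketch: keep the network with the lowest test-set MSE from the loop above.
[bestPerf, bestIdx] = min(perfs);   % smallest error over the 10 runs
bestNet = NN{bestIdx};              % the selected network
y2best  = bestNet(x2);              % its predictions on the held-out data
```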
In this technique the available data is divided into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. When the validation error increases for a specified number of iterations (net.trainParam.max_fail), the training is stopped, and the weights and biases at the minimum of the validation error are returned.
The important point: the validation error is computed when the validation set is used to check the net during training; if it keeps rising for more than max_fail consecutive checks, training stops. The default is 6, and in my opinion there is little point in setting a much larger value.
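For reference, a minimal sketch of where these settings live on the network object; the ratios shown are the toolbox defaults and simplefit_dataset is only a stand-in dataset:

```matlab
% Sketch: early stopping driven by a validation split (default ratios shown).
[x, t] = simplefit_dataset;
net = feedforwardnet(10);
net.divideFcn = 'dividerand';          % random division into train/val/test
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;     % monitored for early stopping
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail    = 6;        % stop after 6 consecutive validation-error increases
[net, tr] = train(net, x, t);          % tr.best_epoch marks the minimum validation error
```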
For multilayer network creation functions, such as feedforwardnet, the default input processing functions are removeconstantrows and mapminmax. For outputs, the default processing functions are also removeconstantrows and mapminmax.
The following table lists the most common preprocessing and postprocessing functions. In most cases, you will not need to use them directly, since the preprocessing steps become part of the network object. When you simulate or train the network, the preprocessing and postprocessing will be done automatically.
Many Chinese posts online preprocess the input and output data by hand. In recent versions of the Neural Network Toolbox, all of this processing is built into the net object (MATLAB is object-oriented); you can of course substitute your own processing functions and parameters, but the defaults are enough for typical use.
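A minimal sketch of inspecting and overriding these processing functions on the net object; swapping mapminmax for mapstd here is only an illustration of the customization mentioned above, not a recommendation:

```matlab
% Sketch: the pre/post-processing steps live inside the network object.
net = feedforwardnet(10);
net.inputs{1}.processFcns            % {'removeconstantrows','mapminmax'} by default
net.outputs{2}.processFcns           % same defaults on the output side
% They can be replaced, e.g. standardize inputs to zero mean / unit variance:
net.inputs{1}.processFcns = {'removeconstantrows', 'mapstd'};
```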
It is possible to improve generalization if you modify the performance function by adding a term that consists of the mean of the sum of squares of the network weights and biases:

msereg = γ · mse + (1 − γ) · msw,

where γ is the performance ratio and msw is the mean of the squared network weights and biases.
The default performance function of a network is MSE; in statistics, MSE measures how far the model's outputs deviate from the target values, and smaller is better. Here the performance function is modified to blend mse with msw, weighted by the ratio γ.
Consider the code:
```matlab
[x, t] = simplefit_dataset;
net = feedforwardnet(10, 'trainbfg');
net.divideFcn = '';
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-5;
net.performParam.regularization = 0.5;
net = train(net, x, t);
```
To summarize the program above: net.divideFcn = '' disables data division, so no validation set is split off and early stopping is not used; net.performParam.regularization = 0.5 switches the performance function to the regularized form, which is what controls overfitting in this example.
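To make the formula concrete, here is a small hand computation of the two terms (my sketch, reusing the variables from the code above; since the ratio is 0.5, both terms carry equal weight, so the exact γ-versus-(1 − γ) assignment does not matter here):

```matlab
% Sketch: the two ingredients of the regularized performance for the trained net.
y = net(x);
mseTerm = mean((t - y).^2);           % mean squared error on the training data
mswTerm = mean(getwb(net).^2);        % mean of the squared weights and biases
msereg  = 0.5*mseTerm + 0.5*mswTerm   % with regularization = 0.5 the terms are weighted equally
```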
The problem with regularization is that it is difficult to determine the optimum value for the performance ratio parameter. If you make this parameter too large, you might get overfitting. If the ratio is too small, the network does not adequately fit the training data. The next section describes a routine that automatically sets the regularization parameters.
Analyzing this statement: in msereg = γ · mse + (1 − γ) · msw, a ratio close to 1 puts almost all the weight on the data error, so training behaves like plain MSE and can overfit; a ratio close to 0 puts the weight on the penalty term, so the weights are forced to stay small and the network may not fit the training data well. Picking the ratio by hand is therefore trial and error, which motivates the automatic approach described next.
Bayesian regularization has been implemented in the function trainbr.
I will not go deep into the algorithm in this note; Bayesian regularization will get its own post, analyzed together with the relevant papers. For now it is enough to know that trainbr wraps all of this up:
```matlab
x = -1 : 0.05 : 1;
t = sin(2 * pi * x) + 0.1 * randn(size(x));
net = feedforwardnet(20, 'trainbr');
net = train(net, x, t);
```
Analysis of the program above:
One feature of this algorithm is that it provides a measure of how many network parameters (weights and biases) are being effectively used by the network. In this case, the final trained network uses approximately 12 parameters (indicated by #Par in the printout) out of the 61 total weights and biases in the 1-20-1 network. This effective number of parameters should remain approximately the same, no matter how large the number of parameters in the network becomes. (This assumes that the network has been trained for a sufficient number of iterations to ensure convergence.)
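As a quick check (my addition, not part of the original post), the total number of weights and biases can be read off the trained network with getwb, which returns them all as a single vector; the effective parameter count (#Par) shows up in trainbr's training printout, and the training record returned by train can be inspected for it (the exact field name varies by release, so treat that part as an assumption):

```matlab
% Sketch: total parameter count for the 1-20-1 trainbr network trained above.
[net, tr] = train(net, x, t);     % keep the training record this time
totalParams = numel(getwb(net))   % 61 weights and biases for a 1-20-1 network
% trainbr's effective number of parameters (#Par) is tracked in the training
% record; list its fields to locate it (e.g. tr.gamk in some releases -- assumption).
disp(fieldnames(tr))
```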
A small test: does NARX automatically include the same data preprocessing?
```matlab
trainFcn = 'trainlm';   % training function (trainlm is narxnet's default)
inputDelays = 1:1;
feedbackDelays = 1:1;
hiddenLayerSize = 20;
net = narxnet(inputDelays, feedbackDelays, hiddenLayerSize, 'open', trainFcn);
process = net.inputs{1}.processFcns
```

```
process =

    'removeconstantrows'    'mapminmax'
```
The answer is yes.
The performance of a trained network can be measured to some extent by the errors on the training, validation, and test sets, but it is often useful to investigate the network response in more detail. One option is to perform a regression analysis between the network response and the corresponding targets
```matlab
x = [-1:.05:1];
t = sin(2*pi*x) + 0.1*randn(size(x));
net = feedforwardnet(10);
net = train(net, x, t);
y = net(x);
[r, m, b] = regression(t, y)
```
A brief analysis of the program: regression(t, y) fits a linear regression of the network outputs against the targets; r is the correlation coefficient between them, and m and b are the slope and intercept of the fitted line. An r close to 1 with m ≈ 1 and b ≈ 0 means the outputs track the targets closely.
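For a visual version of the same check, the toolbox also provides plotregression, which plots the network outputs against the targets together with the fitted line and the R value; the sketch reuses t and y from the code above:

```matlab
% Sketch: scatter of outputs vs. targets with the fitted regression line.
plotregression(t, y)
```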
Source: http://blog.csdn.net/bushipeien/article/details/78395290