Explaining Neural Networks Through Their Function-Fitting Ability

birdskyws

Published 2019-03-04 10:07:53


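There are two ways to understand neural networks: as functions or as probability models. In the function view, a neural network fits a complex function and thereby produces a model of the object being studied. This article uses examples to illustrate the function-fitting ability of neural networks and the role of activation functions. Each feature of an object is converted to a number, the features together form a vector, and the label is also converted to a number; training a model then means fitting, on the sample data, a function that maps vectors to labels.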

Nonlinear functions

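A single-layer neural network is described by the following formula:

$$y = f(Wx + b)$$

Without the nonlinear function $f$, two stacked layers are

$$y_1 = W_1 x + b_1, \qquad y_2 = W_2 y_1 + b_2$$

Substituting $y_1$ into $y_2$ gives

$$y_2 = W_2 (W_1 x + b_1) + b_2 = (W_2 W_1) x + (W_2 b_1 + b_2)$$

which is still a linear transformation.

Commonly used nonlinear functions include: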

relu
sigmoid

sigmoid is prone to the vanishing-gradient problem, so relu is used more often.
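The vanishing-gradient remark can be made concrete with a small sketch in plain Python. It is only an illustration under a simplifying assumption: the weights are ignored and only the activation derivatives are multiplied along a chain of layers. The helper names and the depth of 10 are chosen for this example and do not come from the original code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of sigmoid: s * (1 - s); its maximum is 0.25, reached at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    # Derivative of relu: 1 for positive inputs, 0 otherwise.
    return 1.0 if z > 0 else 0.0

depth = 10  # hypothetical number of stacked layers

# Best case for sigmoid: every layer sits at z = 0, where the derivative peaks.
sigmoid_chain = sigmoid_grad(0.0) ** depth
# relu with positive pre-activations passes the gradient through unchanged.
relu_chain = relu_grad(1.0) ** depth

print("sigmoid chain factor:", sigmoid_chain)
print("relu chain factor:", relu_chain)
```

Even in its best case the sigmoid factor shrinks by at least a factor of four per layer, while the relu factor stays at 1 along active units.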

Plotting relu and sigmoid with TensorFlow:

import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Uses the TensorFlow 1.x graph API (tf.placeholder / tf.Session).
plt.rcParams['axes.unicode_minus'] = False  # use an ASCII hyphen for minus signs on the axes
x = tf.placeholder(dtype=tf.float32, shape=[None, 1])
relu_y = tf.nn.relu(x)
sigmoid_y = tf.nn.sigmoid(x)

sess = tf.Session()
# 1000 evenly spaced inputs in [-5, 5], shaped as a column vector
input_x = np.linspace(-5, 5, 1000).reshape(-1, 1)
relu = sess.run(relu_y, feed_dict={x: input_x})
sigmoid = sess.run(sigmoid_y, feed_dict={x: input_x})
plt.plot(input_x, relu)
plt.plot(input_x, sigmoid)
plt.show()

Figure: relu

Figure: sigmoid

Curve fitting

In the figures, the blue curve is the target function ($y = x^2 + 1$).

1. One layer with 1 hidden unit

The code is as follows:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow.contrib.slim as slim  # tf.contrib exists only in TensorFlow 1.x

# Training data: 1000 samples of the target function y = x^2 + 1 on [-2, 5]
X = np.linspace(-2,5,1000).reshape(-1,1)
d = np.square(X)+1
d = d.reshape(-1,1)
x = tf.placeholder(dtype=tf.float32,shape=[None,1],name="input_X")
y = tf.placeholder(dtype=tf.float32,shape=[None,1],name="input_Y")
# Define the network: two stacked linear layers, no activation
net = slim.fully_connected(x,1,activation_fn=None)
net = slim.fully_connected(net,1,activation_fn=None)
loss = tf.reduce_mean(tf.square(net-y))  # mean squared error
#train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
l = []  # loss on the full data set, recorded every 100 steps
for itr in range(5000):
    idx = np.random.randint(0,1000,512)  # random mini-batch of 512 sample indices
    inx = X[idx]
    iny = d[idx]
    sess.run(train_step,feed_dict={x:inx,y:iny})
    if itr%100==0:
        print("step:{}".format(itr))
        l_var  = sess.run(loss,feed_dict={x:X,y:d})
        l.append(l_var)
# Plot the fitted curve (red) over the training data, then the loss history
plt_x = np.linspace(-2,5,200).reshape(-1,1)
plt_y = sess.run(net,feed_dict={x:plt_x})
plt.plot(plt_x,plt_y,color='#FF0000')
plt.scatter(X,d)
plt.show()
plt.plot(l)
plt.show()

linear.png
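Since a network without activation functions can only represent a straight line, the best it can do under a squared-error loss is the ordinary least-squares line. The sketch below, in plain Python with no TensorFlow, computes that line in closed form for the same 1000 sample points. It is an illustration of the limit the linear network can approach, not a reproduction of the author's training run; all variable names are chosen for this example.

```python
# Same sample points as np.linspace(-2, 5, 1000) in the code above.
n = 1000
xs = [-2.0 + 7.0 * i / (n - 1) for i in range(n)]
ys = [x * x + 1.0 for x in xs]  # target function y = x^2 + 1

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / n

# Closed-form least-squares line y = slope * x + intercept
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# Mean squared error of the best line: no linear network can go below this.
mse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n
print("slope:", slope, "intercept:", intercept, "mse:", mse)
```

The remaining error is far from zero, which is why the red line cannot follow the parabola however long the linear network is trained.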

2. One hidden layer with 16 neurons

No activation function is used. The key code is:

net = slim.fully_connected(x,16,activation_fn=None)
net = slim.fully_connected(net,1,activation_fn=None)
loss = tf.reduce_mean(tf.square(net-y))

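As the figure below shows, without a nonlinear function the fitted curve is still a straight line; only its slope and intercept change.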

Figure_1.png
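The reason extra linear units do not help can be checked numerically. The following sketch, in plain Python, uses hypothetical weights for a small 1 -> 3 -> 1 network without activation (the numbers are made up for illustration, not trained values) and shows that it equals a single line whose slope and intercept are obtained by multiplying the layers out.

```python
# Hypothetical weights of a 1 -> 3 -> 1 network with no activation function.
w1 = [0.5, -1.0, 2.0]    # input -> hidden weights
b1 = [0.1, 0.2, -0.3]    # hidden biases
w2 = [1.5, 0.5, -0.25]   # hidden -> output weights
b2 = 0.7                 # output bias

def two_layer(x):
    hidden = [w * x + b for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

# Collapse the two layers into a single line: y = slope * x + intercept.
slope = sum(v * w for v, w in zip(w2, w1))
intercept = sum(v * b for v, b in zip(w2, b1)) + b2

def one_layer(x):
    return slope * x + intercept

# The two forms agree (up to rounding) at every test input.
max_gap = max(abs(two_layer(x) - one_layer(x)) for x in [-2.0, 0.0, 1.5, 5.0])
print("slope:", slope, "intercept:", intercept, "max gap:", max_gap)
```

Whatever the hidden width, the composition collapses the same way, so only the slope and intercept of the line can change.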

3. One hidden layer with 2 neurons and the relu activation function

net = slim.fully_connected(x,2,activation_fn=tf.nn.relu)
net = slim.fully_connected(net,1,activation_fn=None)
loss = tf.reduce_mean(tf.square(net-y))

Figure_1.png

In the figure above we can already see a polyline. We would like it to have more "bends" so that it follows the target curve more closely.

4. One hidden layer with 4 neurons and an activation function

net = slim.fully_connected(x,4,activation_fn=tf.nn.relu)
net = slim.fully_connected(net,1,activation_fn=None)
loss = tf.reduce_mean(tf.square(net-y))

In the figure below, three turning points are easy to see, and the red curve now roughly fits the blue curve.

Figure_1.png
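The link between the number of relu units and the number of bends can be sketched without any training. The plain-Python example below builds the weights of a one-hidden-layer relu network by hand so that it interpolates $y = x^2 + 1$ at evenly spaced knots on [-2, 5]. This construction is an assumption made for illustration (a trained network will generally place its bends elsewhere), and the function names are hypothetical.

```python
def relu(z):
    return max(0.0, z)

def target(x):
    return x * x + 1.0

def build_relu_net(units, lo=-2.0, hi=5.0):
    """Hand-built one-hidden-layer relu net that interpolates target() at
    evenly spaced knots. Hidden unit i computes relu(x - knots[i])."""
    step = (hi - lo) / units
    knots = [lo + i * step for i in range(units)]
    # Secant slope of the target on each segment
    slopes = [(target(t + step) - target(t)) / step for t in knots]
    # Each output weight is the change in slope introduced at that knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, units)]
    return target(lo), knots, coeffs

def predict(net, x):
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - t) for c, t in zip(coeffs, knots))

grid = [-2.0 + 0.01 * i for i in range(701)]
max_err = {}
for units in (2, 4, 16):
    net = build_relu_net(units)
    max_err[units] = max(abs(predict(net, x) - target(x)) for x in grid)
print(max_err)
```

In this construction the first unit is active over the whole interval, so a network with k units has k - 1 interior bends: one bend for 2 units and three bends for 4 units, which matches the figures. The maximum error shrinks as units are added.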

5. Two hidden layers with 2 neurons each and an activation function

# Define the network
net = slim.fully_connected(x,2,activation_fn=tf.nn.relu)
net = slim.fully_connected(net,2,activation_fn=tf.nn.relu)
net = slim.fully_connected(net,1,activation_fn=None)
loss = tf.reduce_mean(tf.square(net-y))

The figure below shows a better fit than the network with a single hidden layer.

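Summary

Machine learning problems come in two kinds: regression and classification. Regression problems are usually better suited to the function interpretation, while classification problems are usually given a probabilistic interpretation. A binary classification problem can use (0, 1) labels or (-1, +1) labels, and a neural network model for classification can also be explained in functional terms: the function describes a surface in a multidimensional space whose points are (feature 1, feature 2, ..., label 1, label 2, ...), where multidimensional labels correspond to a multi-class problem with one_hot encoding. Fitting a surface in a multidimensional space is one way to explain deep learning.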
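As a small illustration of the multidimensional labels mentioned above, the sketch below encodes class indices as one_hot vectors in plain Python. The helper `one_hot` and the three-class example are assumptions made for this illustration; the original article does not include this code.

```python
def one_hot(label, num_classes):
    """Return a vector with 1.0 at position `label` and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# Hypothetical three-class problem: each label becomes a 3-dimensional target.
labels = [0, 2, 1]
targets = [one_hot(label, 3) for label in labels]
for label, vec in zip(labels, targets):
    print(label, vec)
```

With such targets, a network with three outputs fits a function from the feature vector to a point in label space, which is the surface described in the summary.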

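This article is part of the Tencent Cloud self-media syndication program and is shared from the author's personal site/blog.

Originally published: 2019.02.08. In case of infringement, contact cloudcommunity@tencent.com for removal.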
