Implementing a Deep Belief Network (DBN) for MNIST Handwritten Digit Recognition in TensorFlow with Python 3

By 青年夏日 | Last modified 2021-04-19

Based on: Deep Learning with TensorFlow, IBM Cognitive Class ML0120EN, Module 5 (Autoencoders)

Using a DBN to recognize handwritten digits. One problem with traditional multilayer perceptrons and neural networks is that backpropagation can get stuck in local minima: when the error surface contains many valleys, gradient descent will not necessarily find the deepest one. Below we will see how a DBN addresses this problem.

Deep Belief Networks

A deep belief network sidesteps the local-minimum problem with an extra pre-training procedure. Pre-training runs before backpropagation, so the network starts out near a good solution and backpropagation only has to fine-tune it, gradually driving the error down. A DBN consists of two parts. The first is a stack of restricted Boltzmann machines (RBMs), used to pre-train the network; the second is a feed-forward backpropagation network that refines the result of the stacked RBMs.
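
Schematically, training proceeds in two phases. Here is a rough sketch using the RBM and NN classes defined in the sections below (the helper name train_dbn is illustrative, not part of the original code):

def train_dbn(rbms, net, X):
    # Phase 1: greedy, layer-wise unsupervised pre-training.
    # Each RBM learns to model the representation produced by the layer below.
    data = X
    for rbm in rbms:
        rbm.train(data)
        data = rbm.rbm_outpt(data)
    # Phase 2: seed a feed-forward network with the pre-trained weights
    # and fine-tune the whole stack with backpropagation.
    net.load_from_rbms([rbm._output_size for rbm in rbms], rbms)
    net.train()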

1. Load the required libraries

# urllib is used to download the utils file from deeplearning.net
import urllib.request
response = urllib.request.urlopen('http://deeplearning.net/tutorial/code/utils.py')
content = response.read().decode('utf-8')
target = open('utils.py', 'w')
target.write(content)
target.close()
# Import the math function for calculations
import math
# Tensorflow library. Used to implement machine learning models
import tensorflow as tf
# Numpy contains helpful functions for efficient mathematical calculations
import numpy as np
# Image library for image manipulation
from PIL import Image
# Utils file
from utils import tile_raster_images

2. Build the RBM layer

For details on RBMs, see https://blog.csdn.net/sinat_28371057/article/details/115795086

To implement a DBN in TensorFlow, we define an RBM class below.

class RBM(object):
    def __init__(self, input_size, output_size):
        # Defining the hyperparameters
        self._input_size = input_size  # Size of input
        self._output_size = output_size  # Size of output
        self.epochs = 5  # Amount of training iterations
        self.learning_rate = 1.0  # The step used in gradient descent
        self.batchsize = 100  # The size of how much data will be used for training per sub iteration

        # Initializing weights and biases as matrices full of zeroes
        self.w = np.zeros([input_size, output_size], np.float32)  # Creates and initializes the weights with 0
        self.hb = np.zeros([output_size], np.float32)  # Creates and initializes the hidden biases with 0
        self.vb = np.zeros([input_size], np.float32)  # Creates and initializes the visible biases with 0

    # Fits the result from the weighted visible layer plus the bias into a sigmoid curve
    def prob_h_given_v(self, visible, w, hb):
        # Sigmoid
        return tf.nn.sigmoid(tf.matmul(visible, w) + hb)

    # Fits the result from the weighted hidden layer plus the bias into a sigmoid curve
    def prob_v_given_h(self, hidden, w, vb):
        return tf.nn.sigmoid(tf.matmul(hidden, tf.transpose(w)) + vb)

    # Draw a Bernoulli sample from the given activation probabilities:
    # sign(p - u) is +1 where u < p and -1 elsewhere; relu maps that to {1, 0}
    def sample_prob(self, probs):
        return tf.nn.relu(tf.sign(probs - tf.random_uniform(tf.shape(probs))))

    # Training method for the model
    def train(self, X):
        # Create the placeholders for our parameters
        _w = tf.placeholder("float", [self._input_size, self._output_size])
        _hb = tf.placeholder("float", [self._output_size])
        _vb = tf.placeholder("float", [self._input_size])

        prv_w = np.zeros([self._input_size, self._output_size],
                         np.float32)  # Creates and initializes the weights with 0
        prv_hb = np.zeros([self._output_size], np.float32)  # Creates and initializes the hidden biases with 0
        prv_vb = np.zeros([self._input_size], np.float32)  # Creates and initializes the visible biases with 0

        cur_w = np.zeros([self._input_size, self._output_size], np.float32)
        cur_hb = np.zeros([self._output_size], np.float32)
        cur_vb = np.zeros([self._input_size], np.float32)
        v0 = tf.placeholder("float", [None, self._input_size])

        # Initialize with sample probabilities
        h0 = self.sample_prob(self.prob_h_given_v(v0, _w, _hb))
        v1 = self.sample_prob(self.prob_v_given_h(h0, _w, _vb))
        h1 = self.prob_h_given_v(v1, _w, _hb)

        # Create the Gradients
        positive_grad = tf.matmul(tf.transpose(v0), h0)
        negative_grad = tf.matmul(tf.transpose(v1), h1)

        # Contrastive-divergence (CD-1) update rules for the weights and biases
        update_w = _w + self.learning_rate * (positive_grad - negative_grad) / tf.to_float(tf.shape(v0)[0])
        update_vb = _vb + self.learning_rate * tf.reduce_mean(v0 - v1, 0)
        update_hb = _hb + self.learning_rate * tf.reduce_mean(h0 - h1, 0)

        # Mean squared reconstruction error
        err = tf.reduce_mean(tf.square(v0 - v1))

        # Training loop
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            # For each epoch
            for epoch in range(self.epochs):
                # For each step/batch
                for start, end in zip(range(0, len(X), self.batchsize), range(self.batchsize, len(X), self.batchsize)):
                    batch = X[start:end]
                    # Update the rates
                    cur_w = sess.run(update_w, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    cur_hb = sess.run(update_hb, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    cur_vb = sess.run(update_vb, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    prv_w = cur_w
                    prv_hb = cur_hb
                    prv_vb = cur_vb
                error = sess.run(err, feed_dict={v0: X, _w: cur_w, _vb: cur_vb, _hb: cur_hb})
                print('Epoch: %d' % epoch, 'reconstruction error: %f' % error)
            self.w = prv_w
            self.hb = prv_hb
            self.vb = prv_vb

    # Create expected output for our DBN
    def rbm_outpt(self, X):
        input_X = tf.constant(X)
        _w = tf.constant(self.w)
        _hb = tf.constant(self.hb)
        out = tf.nn.sigmoid(tf.matmul(input_X, _w) + _hb)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            return sess.run(out)
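
The train() method above implements single-step contrastive divergence (CD-1): sample h0 from v0, reconstruct v1 from h0, recompute h1, and nudge the parameters toward the data statistics and away from the reconstruction statistics. As a quick sanity check, here is a minimal, hypothetical usage on random data (toy_X and the layer sizes are illustrative):

import numpy as np

toy_X = np.random.rand(500, 784).astype(np.float32)  # 500 fake flattened 28x28 images
toy_rbm = RBM(784, 64)
toy_rbm.train(toy_X)                 # runs 5 epochs of CD-1 updates
features = toy_rbm.rbm_outpt(toy_X)  # hidden-layer activations
print(features.shape)                # (500, 64)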

3. Import the MNIST data

Load the MNIST image data with one-hot encoded labels.

# Getting the MNIST data provided by Tensorflow
from tensorflow.examples.tutorials.mnist import input_data

# Loading in the mnist data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
trX, trY, teX, teY = mnist.train.images, mnist.train.labels, mnist.test.images,\
    mnist.test.labels

Output:
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
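
With one_hot=True, each label is returned as a 10-dimensional indicator vector rather than an integer. A quick illustration (the values here are just for demonstration):

import numpy as np

label = 3
print(np.eye(10, dtype=np.float32)[label])
# [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]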

4. Build the DBN

RBM_hidden_sizes = [500, 200, 50]  # create 3 RBMs with layer sizes 784-500-200-50

#Since we are training, set input as training data
inpX = trX

#Create list to hold our RBMs
rbm_list = []

#Size of inputs is the number of inputs in the training set
input_size = inpX.shape[1]

#For each RBM we want to generate
for i, size in enumerate(RBM_hidden_sizes):
    print('RBM: ',i,' ',input_size,'->', size)
    rbm_list.append(RBM(input_size, size))
    input_size = size

Output:
RBM:  0   784 -> 500
RBM:  1   500 -> 200
RBM:  2   200 -> 50

With the RBM class defined and the data loaded, we can build the DBN. In this example we use three RBMs: the first has 500 hidden units, the second 200, and the third 50. The goal is to learn a deep, hierarchical representation of the training data.

5. Train the RBMs

We start the pre-training step with ***rbm.train()***, training each RBM in the stack on its own and feeding the output of the current RBM into the next RBM as its input.

#For each RBM in our list
for rbm in rbm_list:
    print('New RBM:')
    #Train a new one
    rbm.train(inpX) 
    #Return the output layer
    inpX = rbm.rbm_outpt(inpX)

Output:
New RBM:
Epoch: 0 reconstruction error: 0.061174
Epoch: 1 reconstruction error: 0.052962
Epoch: 2 reconstruction error: 0.049679
Epoch: 3 reconstruction error: 0.047683
Epoch: 4 reconstruction error: 0.045691
New RBM:
Epoch: 0 reconstruction error: 0.035260
Epoch: 1 reconstruction error: 0.030811
Epoch: 2 reconstruction error: 0.028873
Epoch: 3 reconstruction error: 0.027428
Epoch: 4 reconstruction error: 0.026980
New RBM:
Epoch: 0 reconstruction error: 0.059593
Epoch: 1 reconstruction error: 0.056837
Epoch: 2 reconstruction error: 0.055571
Epoch: 3 reconstruction error: 0.053817
Epoch: 4 reconstruction error: 0.054142

We can now turn the learned representation of the input data into a supervised prediction, for instance with a linear classifier. Specifically, we use the output of the last layer of this shallow neural network to classify the digits.

6. The neural network

The class below builds a neural network on top of the RBMs pre-trained above.

import numpy as np
import math
import tensorflow as tf


class NN(object):

    def __init__(self, sizes, X, Y):
        # Initialize hyperparameters
        self._sizes = sizes
        self._X = X
        self._Y = Y
        self.w_list = []
        self.b_list = []
        self._learning_rate = 1.0
        self._momentum = 0.0
        self._epoches = 10
        self._batchsize = 100
        input_size = X.shape[1]

        # initialization loop
        for size in self._sizes + [Y.shape[1]]:
            # Define upper limit for the uniform distribution range
            max_range = 4 * math.sqrt(6. / (input_size + size))

            # Initialize weights through a random uniform distribution
            self.w_list.append(
                np.random.uniform(-max_range, max_range, [input_size, size]).astype(np.float32))

            # Initialize bias as zeroes
            self.b_list.append(np.zeros([size], np.float32))
            input_size = size

    # load data from rbm
    def load_from_rbms(self, dbn_sizes, rbm_list):
        # Check if expected sizes are correct
        assert len(dbn_sizes) == len(self._sizes)

        for i in range(len(self._sizes)):
            # Check that for each RBM the expected sizes are correct
            assert dbn_sizes[i] == self._sizes[i]

        # If everything is correct, bring over the weights and biases
        for i in range(len(self._sizes)):
            self.w_list[i] = rbm_list[i].w
            self.b_list[i] = rbm_list[i].hb

    # Training method
    def train(self):
        # Create placeholders for input, weights, biases, output
        _a = [None] * (len(self._sizes) + 2)
        _w = [None] * (len(self._sizes) + 1)
        _b = [None] * (len(self._sizes) + 1)
        _a[0] = tf.placeholder("float", [None, self._X.shape[1]])
        y = tf.placeholder("float", [None, self._Y.shape[1]])

        # Define variables and activation function
        for i in range(len(self._sizes) + 1):
            _w[i] = tf.Variable(self.w_list[i])
            _b[i] = tf.Variable(self.b_list[i])
        for i in range(1, len(self._sizes) + 2):
            _a[i] = tf.nn.sigmoid(tf.matmul(_a[i - 1], _w[i - 1]) + _b[i - 1])

        # Define the cost function
        cost = tf.reduce_mean(tf.square(_a[-1] - y))

        # Define the training operation (Momentum Optimizer minimizing the Cost function)
        train_op = tf.train.MomentumOptimizer(
            self._learning_rate, self._momentum).minimize(cost)

        # Prediction operation
        predict_op = tf.argmax(_a[-1], 1)

        # Training Loop
        with tf.Session() as sess:
            # Initialize Variables
            sess.run(tf.global_variables_initializer())

            # For each epoch
            for i in range(self._epoches):

                # For each step
                for start, end in zip(
                        range(0, len(self._X), self._batchsize), range(self._batchsize, len(self._X), self._batchsize)):
                    # Run the training operation on the input data
                    sess.run(train_op, feed_dict={
                        _a[0]: self._X[start:end], y: self._Y[start:end]})

                for j in range(len(self._sizes) + 1):
                    # Retrieve weights and biases
                    self.w_list[j] = sess.run(_w[j])
                    self.b_list[j] = sess.run(_b[j])

                print("Accuracy rating for epoch " + str(i) + ": " + str(np.mean(np.argmax(self._Y, axis=1) == \
                                                                                 sess.run(predict_op, feed_dict={_a[0]: self._X, y: self._Y}))))
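
One detail worth noting: the weights are initialized with the Xavier/Glorot uniform bound scaled by 4, a variant commonly recommended for sigmoid units. A quick check of the bound for the first layer (the numbers are purely illustrative):

import math

# w ~ U(-r, r) with r = 4 * sqrt(6 / (fan_in + fan_out))
fan_in, fan_out = 784, 500          # first layer of this network
r = 4 * math.sqrt(6.0 / (fan_in + fan_out))
print(round(r, 4))                  # 0.2734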

7. Run

nNet = NN(RBM_hidden_sizes, trX, trY)
nNet.load_from_rbms(RBM_hidden_sizes,rbm_list)
nNet.train()

Output:
Accuracy rating for epoch 0: 0.46683636363636366
Accuracy rating for epoch 1: 0.6561272727272728
Accuracy rating for epoch 2: 0.7678363636363637
Accuracy rating for epoch 3: 0.8370727272727273
Accuracy rating for epoch 4: 0.8684181818181819
Accuracy rating for epoch 5: 0.885
Accuracy rating for epoch 6: 0.8947636363636363
Accuracy rating for epoch 7: 0.9024909090909091
Accuracy rating for epoch 8: 0.9080363636363636
Accuracy rating for epoch 9: 0.9124181818181818
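
The loop above reports accuracy on the training set only; the test split (teX, teY) is loaded but never used. As a minimal sketch, we can replay the forward pass in NumPy with the fine-tuned weights to evaluate on it (nn_predict is a hypothetical helper, not part of the original code):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nn_predict(net, X):
    # Replay the network's forward pass using the weights saved by train()
    a = X
    for w, b in zip(net.w_list, net.b_list):
        a = sigmoid(a.dot(w) + b)
    return np.argmax(a, axis=1)

test_acc = np.mean(nn_predict(nNet, teX) == np.argmax(teY, axis=1))
print("Test accuracy:", test_acc)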

Full code

The code below targets the TensorFlow 1.x API (tf.placeholder, tf.Session), so install a matching version first:

pip install tensorflow==1.13.1

# Import the math function for calculations
import math
# Tensorflow library. Used to implement machine learning models
import tensorflow as tf
# Numpy contains helpful functions for efficient mathematical calculations
import numpy as np
# Getting the MNIST data provided by Tensorflow
from tensorflow.examples.tutorials.mnist import input_data

""" This file contains different utility functions that are not connected
in anyway to the networks presented in the tutorials, but rather help in
processing the outputs into a more understandable way.

For example ``tile_raster_images`` helps in generating a easy to grasp
image from a set of samples or weights.
"""

import numpy


def scale_to_unit_interval(ndar, eps=1e-8):
    """ Scales all values in the ndarray ndar to be between 0 and 1 """
    ndar = ndar.copy()
    ndar -= ndar.min()
    ndar *= 1.0 / (ndar.max() + eps)
    return ndar


def tile_raster_images(X, img_shape, tile_shape, tile_spacing=(0, 0),
                       scale_rows_to_unit_interval=True,
                       output_pixel_vals=True):
    """
    Transform an array with one flattened image per row, into an array in
    which images are reshaped and laid out like tiles on a floor.

    This function is useful for visualizing datasets whose rows are images,
    and also columns of matrices for transforming those rows
    (such as the first layer of a neural net).

    :type X: a 2-D ndarray or a tuple of 4 channels, elements of which can
    be 2-D ndarrays or None;
    :param X: a 2-D array in which every row is a flattened image.

    :type img_shape: tuple; (height, width)
    :param img_shape: the original shape of each image

    :type tile_shape: tuple; (rows, cols)
    :param tile_shape: the number of images to tile (rows, cols)

    :param output_pixel_vals: if output should be pixel values (i.e. int8
    values) or floats

    :param scale_rows_to_unit_interval: if the values need to be scaled before
    being plotted to [0,1] or not


    :returns: array suitable for viewing as an image.
    (See:`Image.fromarray`.)
    :rtype: a 2-d array with same dtype as X.

    """

    assert len(img_shape) == 2
    assert len(tile_shape) == 2
    assert len(tile_spacing) == 2

    # The expression below can be re-written in a more C style as
    # follows :
    #
    # out_shape    = [0,0]
    # out_shape[0] = (img_shape[0]+tile_spacing[0])*tile_shape[0] -
    #                tile_spacing[0]
    # out_shape[1] = (img_shape[1]+tile_spacing[1])*tile_shape[1] -
    #                tile_spacing[1]
    out_shape = [
        (ishp + tsp) * tshp - tsp
        for ishp, tshp, tsp in zip(img_shape, tile_shape, tile_spacing)
    ]

    if isinstance(X, tuple):
        assert len(X) == 4
        # Create an output numpy ndarray to store the image
        if output_pixel_vals:
            out_array = numpy.zeros((out_shape[0], out_shape[1], 4),
                                    dtype='uint8')
        else:
            out_array = numpy.zeros((out_shape[0], out_shape[1], 4),
                                    dtype=X.dtype)

        #colors default to 0, alpha defaults to 1 (opaque)
        if output_pixel_vals:
            channel_defaults = [0, 0, 0, 255]
        else:
            channel_defaults = [0., 0., 0., 1.]

        for i in range(4):
            if X[i] is None:
                # if channel is None, fill it with zeros of the correct
                # dtype
                dt = out_array.dtype
                if output_pixel_vals:
                    dt = 'uint8'
                out_array[:, :, i] = numpy.zeros(
                    out_shape,
                    dtype=dt
                ) + channel_defaults[i]
            else:
                # use a recurrent call to compute the channel and store it
                # in the output
                out_array[:, :, i] = tile_raster_images(
                    X[i], img_shape, tile_shape, tile_spacing,
                    scale_rows_to_unit_interval, output_pixel_vals)
        return out_array

    else:
        # if we are dealing with only one channel
        H, W = img_shape
        Hs, Ws = tile_spacing

        # generate a matrix to store the output
        dt = X.dtype
        if output_pixel_vals:
            dt = 'uint8'
        out_array = numpy.zeros(out_shape, dtype=dt)

        for tile_row in range(tile_shape[0]):
            for tile_col in range(tile_shape[1]):
                if tile_row * tile_shape[1] + tile_col < X.shape[0]:
                    this_x = X[tile_row * tile_shape[1] + tile_col]
                    if scale_rows_to_unit_interval:
                        # if we should scale values to be between 0 and 1
                        # do this by calling the `scale_to_unit_interval`
                        # function
                        this_img = scale_to_unit_interval(
                            this_x.reshape(img_shape))
                    else:
                        this_img = this_x.reshape(img_shape)
                    # add the slice to the corresponding position in the
                    # output array
                    c = 1
                    if output_pixel_vals:
                        c = 255
                    out_array[
                        tile_row * (H + Hs): tile_row * (H + Hs) + H,
                        tile_col * (W + Ws): tile_col * (W + Ws) + W
                    ] = this_img * c
        return out_array

# Class that defines the behavior of the RBM
class RBM(object):
    def __init__(self, input_size, output_size):
        # Defining the hyperparameters
        self._input_size = input_size  # Size of input
        self._output_size = output_size  # Size of output
        self.epochs = 5  # Amount of training iterations
        self.learning_rate = 1.0  # The step used in gradient descent
        self.batchsize = 100  # The size of how much data will be used for training per sub iteration

        # Initializing weights and biases as matrices full of zeroes
        self.w = np.zeros([input_size, output_size], np.float32)  # Creates and initializes the weights with 0
        self.hb = np.zeros([output_size], np.float32)  # Creates and initializes the hidden biases with 0
        self.vb = np.zeros([input_size], np.float32)  # Creates and initializes the visible biases with 0

    # Fits the result from the weighted visible layer plus the bias into a sigmoid curve
    def prob_h_given_v(self, visible, w, hb):
        # Sigmoid
        return tf.nn.sigmoid(tf.matmul(visible, w) + hb)

    # Fits the result from the weighted hidden layer plus the bias into a sigmoid curve
    def prob_v_given_h(self, hidden, w, vb):
        return tf.nn.sigmoid(tf.matmul(hidden, tf.transpose(w)) + vb)

    # Draw a Bernoulli sample from the given activation probabilities:
    # sign(p - u) is +1 where u < p and -1 elsewhere; relu maps that to {1, 0}
    def sample_prob(self, probs):
        return tf.nn.relu(tf.sign(probs - tf.random_uniform(tf.shape(probs))))

    # Training method for the model
    def train(self, X):
        # Create the placeholders for our parameters
        _w = tf.placeholder("float", [self._input_size, self._output_size])
        _hb = tf.placeholder("float", [self._output_size])
        _vb = tf.placeholder("float", [self._input_size])

        prv_w = np.zeros([self._input_size, self._output_size],
                         np.float32)  # Creates and initializes the weights with 0
        prv_hb = np.zeros([self._output_size], np.float32)  # Creates and initializes the hidden biases with 0
        prv_vb = np.zeros([self._input_size], np.float32)  # Creates and initializes the visible biases with 0

        cur_w = np.zeros([self._input_size, self._output_size], np.float32)
        cur_hb = np.zeros([self._output_size], np.float32)
        cur_vb = np.zeros([self._input_size], np.float32)
        v0 = tf.placeholder("float", [None, self._input_size])

        # Initialize with sample probabilities
        h0 = self.sample_prob(self.prob_h_given_v(v0, _w, _hb))
        v1 = self.sample_prob(self.prob_v_given_h(h0, _w, _vb))
        h1 = self.prob_h_given_v(v1, _w, _hb)

        # Create the Gradients
        positive_grad = tf.matmul(tf.transpose(v0), h0)
        negative_grad = tf.matmul(tf.transpose(v1), h1)

        # Contrastive-divergence (CD-1) update rules for the weights and biases
        update_w = _w + self.learning_rate * (positive_grad - negative_grad) / tf.to_float(tf.shape(v0)[0])
        update_vb = _vb + self.learning_rate * tf.reduce_mean(v0 - v1, 0)
        update_hb = _hb + self.learning_rate * tf.reduce_mean(h0 - h1, 0)

        # Mean squared reconstruction error
        err = tf.reduce_mean(tf.square(v0 - v1))

        # Training loop
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            # For each epoch
            for epoch in range(self.epochs):
                # For each step/batch
                for start, end in zip(range(0, len(X), self.batchsize), range(self.batchsize, len(X), self.batchsize)):
                    batch = X[start:end]
                    # Update the rates
                    cur_w = sess.run(update_w, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    cur_hb = sess.run(update_hb, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    cur_vb = sess.run(update_vb, feed_dict={v0: batch, _w: prv_w, _hb: prv_hb, _vb: prv_vb})
                    prv_w = cur_w
                    prv_hb = cur_hb
                    prv_vb = cur_vb
                error = sess.run(err, feed_dict={v0: X, _w: cur_w, _vb: cur_vb, _hb: cur_hb})
                print('Epoch: %d' % epoch, 'reconstruction error: %f' % error)
            self.w = prv_w
            self.hb = prv_hb
            self.vb = prv_vb

    # Create expected output for our DBN
    def rbm_outpt(self, X):
        input_X = tf.constant(X)
        _w = tf.constant(self.w)
        _hb = tf.constant(self.hb)
        out = tf.nn.sigmoid(tf.matmul(input_X, _w) + _hb)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            return sess.run(out)

class NN(object):

    def __init__(self, sizes, X, Y):
        # Initialize hyperparameters
        self._sizes = sizes
        self._X = X
        self._Y = Y
        self.w_list = []
        self.b_list = []
        self._learning_rate = 1.0
        self._momentum = 0.0
        self._epoches = 10
        self._batchsize = 100
        input_size = X.shape[1]

        # initialization loop
        for size in self._sizes + [Y.shape[1]]:
            # Define upper limit for the uniform distribution range
            max_range = 4 * math.sqrt(6. / (input_size + size))

            # Initialize weights through a random uniform distribution
            self.w_list.append(
                np.random.uniform(-max_range, max_range, [input_size, size]).astype(np.float32))

            # Initialize bias as zeroes
            self.b_list.append(np.zeros([size], np.float32))
            input_size = size

    # load data from rbm
    def load_from_rbms(self, dbn_sizes, rbm_list):
        # Check if expected sizes are correct
        assert len(dbn_sizes) == len(self._sizes)

        for i in range(len(self._sizes)):
            # Check that for each RBM the expected sizes are correct
            assert dbn_sizes[i] == self._sizes[i]

        # If everything is correct, bring over the weights and biases
        for i in range(len(self._sizes)):
            self.w_list[i] = rbm_list[i].w
            self.b_list[i] = rbm_list[i].hb

    # Training method
    def train(self):
        # Create placeholders for input, weights, biases, output
        _a = [None] * (len(self._sizes) + 2)
        _w = [None] * (len(self._sizes) + 1)
        _b = [None] * (len(self._sizes) + 1)
        _a[0] = tf.placeholder("float", [None, self._X.shape[1]])
        y = tf.placeholder("float", [None, self._Y.shape[1]])

        # Define variables and activation function
        for i in range(len(self._sizes) + 1):
            _w[i] = tf.Variable(self.w_list[i])
            _b[i] = tf.Variable(self.b_list[i])
        for i in range(1, len(self._sizes) + 2):
            _a[i] = tf.nn.sigmoid(tf.matmul(_a[i - 1], _w[i - 1]) + _b[i - 1])

        # Define the cost function
        cost = tf.reduce_mean(tf.square(_a[-1] - y))

        # Define the training operation (Momentum Optimizer minimizing the Cost function)
        train_op = tf.train.MomentumOptimizer(
            self._learning_rate, self._momentum).minimize(cost)

        # Prediction operation
        predict_op = tf.argmax(_a[-1], 1)

        # Training Loop
        with tf.Session() as sess:
            # Initialize Variables
            sess.run(tf.global_variables_initializer())

            # For each epoch
            for i in range(self._epoches):

                # For each step
                for start, end in zip(
                        range(0, len(self._X), self._batchsize), range(self._batchsize, len(self._X), self._batchsize)):
                    # Run the training operation on the input data
                    sess.run(train_op, feed_dict={
                        _a[0]: self._X[start:end], y: self._Y[start:end]})

                for j in range(len(self._sizes) + 1):
                    # Retrieve weights and biases
                    self.w_list[j] = sess.run(_w[j])
                    self.b_list[j] = sess.run(_b[j])

                print("Accuracy rating for epoch " + str(i) + ": " + str(np.mean(np.argmax(self._Y, axis=1) == \
                                                                                 sess.run(predict_op, feed_dict={_a[0]: self._X, y: self._Y}))))


if __name__ == '__main__':
    # Loading in the mnist data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    trX, trY, teX, teY = mnist.train.images, mnist.train.labels, mnist.test.images,\
        mnist.test.labels

    RBM_hidden_sizes = [500, 200, 50]  # create 3 RBMs with layer sizes 784-500-200-50
    # Since we are training, set input as training data
    inpX = trX
    # Create list to hold our RBMs
    rbm_list = []
    # Size of inputs is the number of inputs in the training set
    input_size = inpX.shape[1]

    # For each RBM we want to generate
    for i, size in enumerate(RBM_hidden_sizes):
        print('RBM: ', i, ' ', input_size, '->', size)
        rbm_list.append(RBM(input_size, size))
        input_size = size

    # For each RBM in our list
    for rbm in rbm_list:
        print('New RBM:')
        # Train a new one
        rbm.train(inpX)
        # Return the output layer
        inpX = rbm.rbm_outpt(inpX)

    nNet = NN(RBM_hidden_sizes, trX, trY)
    nNet.load_from_rbms(RBM_hidden_sizes, rbm_list)
    nNet.train()

Original content: published on the Tencent Cloud Developer Community with the author's authorization; reproduction without permission is prohibited. For takedown requests, contact cloudcommunity@tencent.com.
