
Deep Learning - Multi-layer Neural Networks

By freesan44 · Last modified 2021-10-12 17:56:56

3 - Initialization

You will write two helper functions to initialize the parameters of your model. The first function will be used to initialize the parameters of a two-layer model. The second one generalizes this initialization process to L layers.

3.1 - 2-layer Neural Network

**Exercise**: Create and initialize the parameters of the 2-layer neural network.

**Instructions**:

* The model's structure is: *LINEAR -> RELU -> LINEAR -> SIGMOID*.

* Use random initialization for the weight matrices: `np.random.randn(shape) * 0.01` with the correct shape.

* Use zero initialization for the biases: `np.zeros(shape)`.

```python
# GRADED FUNCTION: initialize_parameters

import numpy as np


def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """

    np.random.seed(1)

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```

3.2 - L-layer Neural Network

The initialization for a deeper L-layer neural network is more complicated because there are many more weight matrices and bias vectors. When completing initialize_parameters_deep, you should make sure that your dimensions match between adjacent layers. Recall that $n^{[l]}$ is the number of units in layer $l$. For example, if the size of the input $X$ is $(12288, 209)$ (with $m = 209$ examples), then $W^{[1]}$ has shape $(n^{[1]}, 12288)$ and $b^{[1]}$ has shape $(n^{[1]}, 1)$; in general, $W^{[l]}$ has shape $(n^{[l]}, n^{[l-1]})$ and $b^{[l]}$ has shape $(n^{[l]}, 1)$.
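As a quick sanity check of these shapes, here is a tiny loop (the hidden-layer sizes after the 12288 input features are made up for illustration):

```python
# Hypothetical layer sizes: 12288 input features, two hidden layers, one output unit
layer_dims = [12288, 20, 7, 1]

for l in range(1, len(layer_dims)):
    W_shape = (layer_dims[l], layer_dims[l - 1])  # W[l] has shape (n[l], n[l-1])
    b_shape = (layer_dims[l], 1)                  # b[l] has shape (n[l], 1)
    print("W" + str(l) + ":", W_shape, " b" + str(l) + ":", b_shape)
```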

```python
# GRADED FUNCTION: initialize_parameters_deep

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """

    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)            # number of layers in the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###

        assert (parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l - 1]))
        assert (parameters['b' + str(l)].shape == (layer_dims[l], 1))

    return parameters
```

4 - Forward Propagation Module

4.1 - Linear Forward

Now that you have initialized your parameters, you will implement the forward propagation module. You will start by implementing some basic functions that you will use again later when implementing the model. You will complete three functions in this order:

* LINEAR (its formula is given right after this list)

* LINEAR -> ACTIVATION, where ACTIVATION will be either ReLU or Sigmoid.

* [LINEAR -> RELU] × (L-1) -> LINEAR -> SIGMOID (the whole model)
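The LINEAR step, vectorized over all $m$ examples, computes the pre-activation

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[0]} = X,$$

which is exactly what `linear_forward` below computes with `np.dot(W, A) + b`.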

```python
# GRADED FUNCTION: linear_forward

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """

    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b
    ### END CODE HERE ###

    assert (Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)

    return Z, cache
```

4.2 - Linear-Activation Forward

You will use two activation functions:

* **Sigmoid**: $\sigma(Z) = \frac{1}{1 + e^{-Z}}$. We provide the `sigmoid` function for you. This function returns **two** items: the activation value `A` and a `cache` containing `Z` (which we will feed into the corresponding backward function). To use it you can just call:

```python
A, activation_cache = sigmoid(Z)
```

* **ReLU**: the mathematical formula for ReLU is $A = \mathrm{ReLU}(Z) = \max(0, Z)$. We provide the `relu` function for you. This function returns **two** items: the activation value `A` and a `cache` containing `Z` (which we will feed into the corresponding backward function). To use it you can just call:

```python
A, activation_cache = relu(Z)
```
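Neither helper is defined in this article (they come with the assignment code), but a minimal sketch consistent with how they are called above, returning the activation together with a cache of `Z`, could look like this:

```python
import numpy as np

def sigmoid(Z):
    # Sigmoid activation; returns the activation A and caches Z for the backward pass.
    A = 1 / (1 + np.exp(-Z))
    cache = Z
    return A, cache

def relu(Z):
    # ReLU activation; returns the activation A and caches Z for the backward pass.
    A = np.maximum(0, Z)
    cache = Z
    return A, cache
```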
```python
# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value
    cache -- a python tuple containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """

    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###

    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###

    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache
```

d) L-Layer Model

For even more convenience when implementing the L-layer neural network, you will need a function that replicates the previous one (`linear_activation_forward` with RELU) L-1 times, then follows that with one `linear_activation_forward` with SIGMOID.
```python
# GRADED FUNCTION: L_model_forward

def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
                every cache of linear_activation_forward() (there are L of them, indexed from 0 to L-1)
    """

    caches = []
    A = X
    L = len(parameters) // 2                  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A
        ### START CODE HERE ### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], activation='relu')
        caches.append(cache)
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], activation='sigmoid')
    caches.append(cache)
    ### END CODE HERE ###

    assert (AL.shape == (1, X.shape[1]))

    return AL, caches
```

5 - Cost Function

Now you will implement the forward and backward propagation. You need to compute the cost, because you want to check whether your model is actually learning.
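The cost computed here is the cross-entropy cost referred to as equation (7) in the docstring below:

$$J = -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \log\big(a^{[L](i)}\big) + \big(1 - y^{(i)}\big) \log\big(1 - a^{[L](i)}\big) \Big]$$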
```python
# GRADED FUNCTION: compute_cost

def compute_cost(AL, Y):
    """
    Implement the cost function defined by equation (7).

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (for example: containing 0 if non-cat, 1 if cat), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """

    m = Y.shape[1]

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 lines of code)
    cost = -1 / m * np.sum(np.multiply(Y, np.log(AL)) + np.multiply(1 - Y, np.log(1 - AL)))
    ### END CODE HERE ###

    cost = np.squeeze(cost)      # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    assert (cost.shape == ())

    return cost
```

6 - Backward Propagation Module

Just as you did for forward propagation, you will implement helper functions for backpropagation. Remember that backpropagation is used to compute the gradients of the loss function with respect to the parameters.

6.1 - Linear Backward

Given dZ, compute the other derivatives dW, db and dA_prev.
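For layer $l$, these three outputs are computed from $dZ^{[l]}$ and the cached forward values, matching the code below:

$$dW^{[l]} = \frac{1}{m} \, dZ^{[l]} A^{[l-1]T}, \qquad db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}, \qquad dA^{[l-1]} = W^{[l]T} dZ^{[l]}$$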

```python
# GRADED FUNCTION: linear_backward

def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]

    ### START CODE HERE ### (≈ 3 lines of code)
    dW = 1 / m * np.dot(dZ, A_prev.T)
    db = 1 / m * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)
    ### END CODE HERE ###

    assert (dA_prev.shape == A_prev.shape)
    assert (dW.shape == W.shape)
    assert (db.shape == b.shape)

    return dA_prev, dW, db
```

6.2 - Linear-Activation Backward

Next, you will create a function, **linear_activation_backward**, that merges the linear backward step (**linear_backward**) with the backward step for the activation.

To help you implement linear_activation_backward, we provide two backward functions:

* **sigmoid_backward**: implements the backward propagation for the SIGMOID unit. You can call it as follows:

```python
dZ = sigmoid_backward(dA, activation_cache)
```

* **relu_backward**: implements the backward propagation for the RELU unit. You can call it as follows:

```python
dZ = relu_backward(dA, activation_cache)
```
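Like sigmoid and relu, these two helpers ship with the assignment and are not shown in this article. A minimal sketch consistent with the calls above, computing $dZ = dA \cdot g'(Z)$ from the cached $Z$, might look like:

```python
import numpy as np

def relu_backward(dA, cache):
    # Backward pass for ReLU: the gradient only flows where Z > 0.
    Z = cache
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

def sigmoid_backward(dA, cache):
    # Backward pass for sigmoid: dZ = dA * s * (1 - s), with s = sigmoid(Z).
    Z = cache
    s = 1 / (1 + np.exp(-Z))
    dZ = dA * s * (1 - s)
    return dZ
```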
```python
# GRADED FUNCTION: linear_activation_backward

def linear_activation_backward(dA, cache, activation):
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.

    Arguments:
    dA -- post-activation gradient for current layer l
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    linear_cache, activation_cache = cache

    if activation == "relu":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    elif activation == "sigmoid":
        ### START CODE HERE ### (≈ 2 lines of code)
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        ### END CODE HERE ###

    return dA_prev, dW, db
```

6.3 - L-Model Backward

Now you will implement the backward function for the whole network. Recall that when you implemented the L_model_forward function, at each iteration you stored a cache containing (X, W, b, and Z). In the backpropagation module, you will use those variables to compute the gradients. Therefore, in the L_model_backward function, you will iterate through all the hidden layers backward, starting from layer L. At each step, you will use the cached values for layer l to backpropagate through layer l (Figure 5 of the original notebook illustrates this backward pass).
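To initialize the backpropagation you need the derivative of the cross-entropy cost with respect to the output activation $A^{[L]}$:

$$dA^{[L]} = -\left(\frac{Y}{A^{[L]}} - \frac{1 - Y}{1 - A^{[L]}}\right),$$

which is exactly the `dAL` line inside the code below.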
```python
# GRADED FUNCTION: L_model_backward

def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group

    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat)
    caches -- list of caches containing:
                every cache of linear_activation_forward() with "relu" (it's caches[l], for l in range(L-1) i.e l = 0...L-2)
                the cache of linear_activation_forward() with "sigmoid" (it's caches[L-1])

    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ...
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ...
    """
    grads = {}
    L = len(caches)  # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape)  # after this line, Y is the same shape as AL

    # Initializing the backpropagation
    ### START CODE HERE ### (1 line of code)
    dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    ### END CODE HERE ###

    # Lth layer (SIGMOID -> LINEAR) gradients. Inputs: "dAL, current_cache". Outputs: "grads["dAL-1"], grads["dWL"], grads["dbL"]"
    ### START CODE HERE ### (approx. 2 lines)
    current_cache = caches[-1]
    grads["dA" + str(L - 1)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dAL, current_cache, activation="sigmoid")
    ### END CODE HERE ###

    # Loop from l=L-2 to l=0
    for l in reversed(range(L - 1)):
        # lth layer: (RELU -> LINEAR) gradients.
        # Inputs: "grads["dA" + str(l + 1)], current_cache". Outputs: "grads["dA" + str(l)], grads["dW" + str(l + 1)], grads["db" + str(l + 1)]"
        ### START CODE HERE ### (approx. 5 lines)
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads["dA" + str(l + 1)], current_cache, activation="relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
        ### END CODE HERE ###

    return grads
```

6.4 - Update Parameters

In this section, you will update the parameters of the model using gradient descent:
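For every layer $l = 1, \ldots, L$, with learning rate $\alpha$:

$$W^{[l]} := W^{[l]} - \alpha \, dW^{[l]}, \qquad b^{[l]} := b^{[l]} - \alpha \, db^{[l]}$$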
```python
# GRADED FUNCTION: update_parameters

def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients, output of L_model_backward

    Returns:
    parameters -- python dictionary containing your updated parameters
                  parameters["W" + str(l)] = ...
                  parameters["b" + str(l)] = ...
    """

    L = len(parameters) // 2  # number of layers in the neural network

    # Update rule for each parameter. Use a for loop.
    ### START CODE HERE ### (≈ 3 lines of code)
    for l in range(L):
        parameters["W" + str(l + 1)] = parameters["W" + str(l + 1)] - learning_rate * grads["dW" + str(l + 1)]
        parameters["b" + str(l + 1)] = parameters["b" + str(l + 1)] - learning_rate * grads["db" + str(l + 1)]
    ### END CODE HERE ###
    return parameters
```

3 - Architecture of the Model

Now that you are familiar with the dataset, it is time to build a deep neural network to distinguish cat images from non-cat images.

You will build two different models:

* A 2-layer neural network

* An L-layer deep neural network

You will then compare the performance of these models, and also try out different values for L.

Let's look at the two architectures.

Detailed architecture of the 2-layer network:

1. The input is a (64, 64, 3) image which is flattened to a vector of size (12288, 1).

2. The corresponding vector $[x_0, x_1, \ldots, x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$ of size $(n^{[1]}, 12288)$.

3. You then add a bias term and take the ReLU to get the following vector: $[a^{[1]}_0, a^{[1]}_1, \ldots, a^{[1]}_{n^{[1]}-1}]^T$.

4. You then repeat the same process.

5. You multiply the resulting vector by $W^{[2]}$ and add your intercept (bias).

6. Finally, you take the sigmoid of the result. If it is greater than 0.5, you classify it as a cat. These steps are written out as formulas right after this list.
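In equations, using the notation above:

$$z^{[1]} = W^{[1]} x + b^{[1]}, \quad a^{[1]} = \mathrm{ReLU}(z^{[1]}), \quad z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \quad \hat{y} = \sigma(z^{[2]})$$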

Detailed architecture of the L-layer network:

1. The input is a (64, 64, 3) image which is flattened to a vector of size (12288, 1).

2. The corresponding vector $[x_0, x_1, \ldots, x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$, and then you add the intercept $b^{[1]}$. The result is called the linear unit.

3. Next, you take the ReLU of the linear unit. This process can be repeated several times, once for each $(W^{[l]}, b^{[l]})$, depending on the model architecture.

4. Finally, you take the sigmoid of the final linear unit. If it is greater than 0.5, you classify it as a cat.

3.3 - General Methodology

As usual, you will follow the deep learning methodology to build the model:

1. Initialize parameters / Define hyperparameters
2. Loop for num_iterations:
   a. Forward propagation
   b. Compute cost function
   c. Backward propagation
   d. Update parameters (using parameters, and grads from backprop)
3. Use trained parameters to predict labels

4 - Two-layer Neural Network

**Question**: Use the helper functions you implemented previously to build a 2-layer neural network with the following structure: *LINEAR -> RELU -> LINEAR -> SIGMOID*. The functions and their inputs that you may need are listed below:
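These are the helpers defined earlier in this article:

```python
def initialize_parameters(n_x, n_h, n_y):
    ...
def linear_activation_forward(A_prev, W, b, activation):
    ...
def compute_cost(AL, Y):
    ...
def linear_activation_backward(dA, cache, activation):
    ...
def update_parameters(parameters, grads, learning_rate):
    ...
```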

```python
# GRADED FUNCTION: two_layer_model

def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- If set to True, this will print the cost every 100 iterations

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """

    np.random.seed(1)
    grads = {}
    costs = []  # to keep track of the cost
    m = X.shape[1]  # number of examples
    (n_x, n_h, n_y) = layers_dims

    # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1, W2, b2". Output: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, activation='relu')
        A2, cache2 = linear_activation_forward(A1, W2, b2, activation='sigmoid')
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y)
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))

        # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, activation='sigmoid')
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, activation='relu')
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1
        grads['db1'] = db1
        grads['dW2'] = dW2
        grads['db2'] = db2

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
```

5 - L-layer Neural Network

```python
# GRADED FUNCTION: L_layer_model

def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):  # lr was 0.009
    """
    Implements an L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """

    np.random.seed(1)
    costs = []  # keep track of cost

    # Parameters initialization. (≈ 1 line of code)
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
```
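The predict helper used below is provided with the assignment and is not defined in this article. A minimal sketch, assuming it simply runs L_model_forward with the learned parameters, thresholds the output at 0.5 and prints the accuracy, could be:

```python
import numpy as np

def predict(X, y, parameters):
    # Hypothetical re-implementation of the provided helper: forward pass + 0.5 threshold.
    m = X.shape[1]
    probas, _ = L_model_forward(X, parameters)   # shape (1, m)
    p = (probas > 0.5).astype(int)               # predicted labels
    print("Accuracy: " + str(np.sum(p == y) / m))
    return p
```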
```python
pred_train = predict(train_x, train_y, parameters)
# Accuracy: 0.985645933014

pred_test = predict(test_x, test_y, parameters)
# Accuracy: 0.8
```
