
[PyTorch][Repost] Implementing a Two-Layer Neural Network with PyTorch

Author: 云未归来 | Published 2025-07-18 14:56:38

PyTorch: Tensors

This time we use raw PyTorch Tensors to implement the forward pass of the network, compute the loss, and run backpropagation by hand.

A PyTorch Tensor is very similar to a NumPy ndarray. The biggest difference is that a PyTorch Tensor can run on either the CPU or the GPU; to compute on the GPU, you need to move the Tensor to a CUDA device.
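
As a minimal sketch of that last point (assuming a CUDA-capable GPU is available; this snippet is not part of the original post), moving a Tensor to the GPU looks like this:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.randn(3, 3)      # created on the CPU by default
a_gpu = a.to(device)       # copies the tensor to the GPU when one is available
b = a_gpu * 2              # this operation now runs on the selected device
print(b.device)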

import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1

    w2 -= learning_rate * grad_w2

Simple autograd

# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3

# Compute gradients.
y.backward()

# Print out the gradients.
print(x.grad)    # x.grad = 2
print(w.grad)    # w.grad = 1
print(b.grad)    # b.grad = 1
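
One detail that matters for the training loops later in this post: gradients accumulate across calls to backward() rather than being overwritten. A minimal sketch (not part of the original example):

import torch

x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)

for _ in range(2):
    y = w * x        # rebuild the graph on each iteration
    y.backward()     # each call adds dy/dw = x = 1 into w.grad

print(w.grad)        # tensor(2.) -- the two backward passes accumulated
w.grad.zero_()       # reset the gradient, as the training loops below do
print(w.grad)        # tensor(0.)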

PyTorch: Tensors and autograd

An important feature of PyTorch is autograd: once the forward pass is defined and the loss has been computed, PyTorch can automatically compute the gradients of the loss with respect to all model parameters.

A PyTorch Tensor represents a node in a computational graph. If x is a Tensor with x.requires_grad=True, then x.grad is another Tensor that holds the current gradient of some scalar value (usually the loss) with respect to x.
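
To make that concrete, here is a small illustration (not from the original post) showing that x.grad has the same shape as x and holds the gradient of the scalar with respect to x:

import torch

x = torch.randn(5, requires_grad=True)
loss = (x ** 2).sum()                  # a scalar built from x
loss.backward()
print(x.grad.shape)                    # torch.Size([5]) -- same shape as x
print(torch.allclose(x.grad, 2 * x))   # True: d/dx of sum(x^2) is 2x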

import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold the inputs and outputs.
# Leaving requires_grad at its default of False means we do not need gradients
# with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for the weights.
# Setting requires_grad=True means we want gradients with respect to these
# Tensors to be computed during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors. This is the
    # same forward pass as before, but we no longer need to keep references to
    # the intermediate values, because we are not writing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute the loss from the forward pass.
    # loss is a Tensor holding a single value; loss.item() returns that value
    # as a Python number.
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    # Use autograd to run the backward pass. For every Tensor with
    # requires_grad=True, this call computes the gradient of the loss with
    # respect to that Tensor. Afterwards, w1.grad and w2.grad hold the gradients
    # of the loss with respect to w1 and w2.
    loss.backward()

    # Manually update the weights with gradient descent (later we will use an
    # optimizer to do this for us). We wrap the updates in torch.no_grad()
    # because w1 and w2 have requires_grad=True, but we do not want autograd to
    # track the update step itself.
    # An alternative is to operate on weight.data and weight.grad.data, which
    # does not affect the recorded gradients: tensor.data gives a tensor that
    # shares the same storage as the original but does not record history in
    # the computation graph.
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
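
The comment above mentions operating on weight.data as an alternative to torch.no_grad(). A minimal, self-contained sketch of that older style (the torch.no_grad() form above is the one to prefer):

import torch

learning_rate = 1e-2
w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()

# .data shares storage with w but is not tracked by autograd,
# so this update does not get recorded in the computation graph.
w.data -= learning_rate * w.grad.data
w.grad.data.zero_()
print(w.requires_grad)  # True -- w still tracks gradients for future passes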

PyTorch: nn

This time we use the nn package in PyTorch to build the network. We still rely on PyTorch autograd to build the computational graph and compute the gradients automatically; nn lets us express the model as a sequence of layers instead of raw Tensor operations.
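
As a quick illustration of two points from the code below (a Linear module holds its own weight and bias Tensors, and Modules are callable like functions), here is a small sketch that is not part of the original code:

import torch

layer = torch.nn.Linear(3, 2)          # a single Linear module
print(layer.weight.shape)              # torch.Size([2, 3])
print(layer.bias.shape)                # torch.Size([2])
print(layer.weight.requires_grad)      # True -- parameters are learnable by default
out = layer(torch.randn(4, 3))         # __call__ applies the affine transformation
print(out.shape)                       # torch.Size([4, 2])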

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. Each Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(x)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

PyTorch: optim

This time, instead of updating the model's weights by hand, we use the optim package to update the parameters for us. The optim package provides implementations of many different optimization algorithms, including SGD+momentum, RMSprop, Adam, and so on.
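
For reference, the optimizers mentioned above differ mainly in their constructor arguments; a minimal sketch (the hyperparameter values here are placeholders, not recommendations):

import torch

params = [torch.randn(2, 2, requires_grad=True)]

sgd = torch.optim.SGD(params, lr=1e-2, momentum=0.9)   # SGD with momentum
rmsprop = torch.optim.RMSprop(params, lr=1e-3)         # RMSprop
adam = torch.optim.Adam(params, lr=1e-3)               # Adam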

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()

PyTorch: Custom nn Modules

We can also define our model as a class that inherits from nn.Module. Whenever the model is more complicated than a plain Sequential stack of layers, subclassing nn.Module is the way to define it.
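
For example, a forward pass that reuses its input (a skip connection) cannot be written with nn.Sequential alone; a minimal sketch (the ResidualBlock name is made up for illustration, not part of the original post):

import torch

class ResidualBlock(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = torch.nn.Linear(dim, dim)

    def forward(self, x):
        # The input is added back to the layer's output -- arbitrary Tensor
        # operations and control flow are allowed inside forward().
        return x + torch.relu(self.linear(x))

block = ResidualBlock(8)
print(block(torch.randn(4, 8)).shape)  # torch.Size([4, 8])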

import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Shared from the author's personal site/blog via the Tencent Cloud self-media syndication program. Originally published on 2020-04-23.
