Setting Up a Deep Learning Environment from Scratch on Windows: TensorFlow + PyTorch (with Three Classic Books for Getting Started in Deep Learning)

自学气象人

Published 2023-06-21 15:05:47

This article is included in the column: 自学气象人

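Installing the Environment

Installing Anaconda

Start by installing a Python environment. I recommend Anaconda + Jupyter rather than PyCharm.

1. Download Anaconda:

https://www.anaconda.com/products/distribution#download-section

2. After downloading and installing it, open Anaconda Prompt.

3. Configure the package mirrors by adding the Aliyun mirror. First run:

conda config --set show_channel_urls yes

You can then find the .condarc file in C:\Users\<your user name>.

Open it in Notepad and change its contents to: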

channels:
  - defaults
show_channel_urls: true
default_channels:
  - http://mirrors.aliyun.com/anaconda/pkgs/main
  - http://mirrors.aliyun.com/anaconda/pkgs/r
  - http://mirrors.aliyun.com/anaconda/pkgs/msys2
custom_channels:
  conda-forge: http://mirrors.aliyun.com/anaconda/cloud
  msys2: http://mirrors.aliyun.com/anaconda/cloud
  bioconda: http://mirrors.aliyun.com/anaconda/cloud
  menpo: http://mirrors.aliyun.com/anaconda/cloud
  pytorch: http://mirrors.aliyun.com/anaconda/cloud
  simpleitk: http://mirrors.aliyun.com/anaconda/cloud

Then clear the index cache:

conda clean -i
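
As a quick sanity check after editing the file, the sketch below reads .condarc and lists the mirror URLs it contains. It assumes the file sits in the default location (the user's home folder); the helper name mirror_urls is made up for this example.

```python
import re
from pathlib import Path


def mirror_urls(condarc_text):
    """Return every http(s) URL found in the text of a .condarc file."""
    return re.findall(r"https?://\S+", condarc_text)


if __name__ == "__main__":
    # Assumption: .condarc is in the default location, the user's home folder.
    condarc = Path.home() / ".condarc"
    if condarc.exists():
        for url in mirror_urls(condarc.read_text(encoding="utf-8")):
            print(url)
    else:
        print("No .condarc found at", condarc)
```

If the printed list does not contain the mirrors.aliyun.com entries, the edit was probably saved to a different file.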

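Create a dedicated environment for deep learning so that packages from different projects do not conflict. I currently keep four environments: geemap, plotting, geospatial libraries, and deep learning.

# 1. List the Python versions available for installation
conda search --full-name python
# 2. Create a new environment named DL
conda create --name DL python=3.8.12
# To delete the environment later, run:
# conda remove -n DL --all
# Activate the environment
conda activate DL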
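
To confirm that the new environment is really the active one, a small check like the following can be run with the environment's Python. It is only a sketch: it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and describe_environment is a name invented for this example.

```python
import os
import sys


def describe_environment(environ=None):
    """Return the active conda environment name and the Python version string."""
    environ = os.environ if environ is None else environ
    # CONDA_DEFAULT_ENV is set by `conda activate`; it is absent outside conda.
    env_name = environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")
    version = ".".join(str(part) for part in sys.version_info[:3])
    return env_name, version


if __name__ == "__main__":
    name, version = describe_environment()
    print("conda env:", name)
    print("python   :", version)
```

With the DL environment active, the first line should show DL and the second the Python version requested when the environment was created.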

If your machine has no supported GPU, you can install TensorFlow directly:

pip install tensorflow

Then confirm in Python that it imports:

import tensorflow as tf

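Checking the GPU Environment

Search for Device Manager in the Windows search box.

Your graphics card is listed under Display adapters.

Next, check whether your graphics card model supports CUDA: https://developer.nvidia.com/zh-cn/cuda-gpus

My graphics card is on the supported list.

Then install the graphics driver from the official download page: https://www.nvidia.com/Download/index.aspx?lang=en-us

Select your graphics card model there and click Search.

On the page that opens, click Download.

Once the download finishes, double-click the installer and accept the defaults at each step.

You may want to restart the computer after the installation finishes.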

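Choosing Versions

The CUDA version you can install depends on the graphics driver version, so check the driver version first: search for NVIDIA Control Panel in the Windows search box.

My driver version is 531.41.

Official reference: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

According to the driver requirements listed there, driver 531.41 is new enough for CUDA 12.1.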
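
The same comparison can be scripted. The sketch below asks nvidia-smi for the driver version and compares it with a minimum version. The figure 531.14 as the minimum Windows driver for CUDA 12.1 is an assumption to confirm against the release notes linked above, and the function names are invented for this example.

```python
import subprocess


def parse_version(text):
    """Turn a dotted version such as '531.41' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_new_enough(driver, minimum):
    """True if the installed driver version is at least the required minimum."""
    return parse_version(driver) >= parse_version(minimum)


def query_driver_version():
    """Ask nvidia-smi for the driver version; returns None if it is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    # Assumption: 531.14 is the minimum Windows driver for CUDA 12.1;
    # check the release notes before relying on this number.
    minimum_for_cuda_12_1 = "531.14"
    installed = query_driver_version() or "531.41"  # fall back to the article's value
    print(installed, "new enough for CUDA 12.1:",
          driver_is_new_enough(installed, minimum_for_cuda_12_1))
```

Comparing tuples of integers rather than strings avoids mistakes such as treating 531.9 as newer than 531.41.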

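Choosing cuDNN and TensorFlow Versions

Official reference: https://tensorflow.google.cn/install/source_windows#gpu

Check the table on the official page. It has no row for CUDA 12, so I relied on backward compatibility and picked cuDNN 8 with tensorflow_gpu-2.6.0. Note that the table pairs tensorflow_gpu-2.6.0 with CUDA 11.2 and cuDNN 8.1, and that the environment I finally used (listed under Installing TensorFlow) is based on CUDA 11.6.1.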

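Installing CUDA and cuDNN

CUDA

Download: https://developer.nvidia.com/cuda-toolkit-archive

I chose the exe (local) installer.

After the download finishes, double-click the exe file to install it.

When the installation is complete, check that the environment variables include these paths: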

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\lib\x64

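Mine were added automatically. If these two paths are missing, add them manually.

Then run the following in CMD:

nvcc -V

If it prints version information, CUDA is installed correctly.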
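
For a scripted version of this check, the sketch below runs nvcc -V and pulls out the release number. It assumes the output contains the text "release X.Y", which is how nvcc normally reports its version; the function names are invented for this example.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract the 'release X.Y' number from nvcc -V output, or None."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc -V and return the release string; None if nvcc cannot be run."""
    try:
        result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA release reported by nvcc:", installed_cuda_release())
```

A result of None usually means nvcc is not on PATH, which points back to the environment variables checked earlier.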

cuDNN

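Downloading cuDNN requires going through a lengthy login process:

https://developer.nvidia.com/zh-cn/cudnn

Extract the downloaded archive to see its contents.

Copy everything in it except LICENSE into the CUDA installation directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1).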
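
The copy step can also be done from Python. This is a minimal sketch, assuming the extracted archive holds a few subfolders plus a top-level LICENSE file; copy_cudnn is a name invented here, and the script needs Python 3.8 or later for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def copy_cudnn(cudnn_dir, cuda_dir):
    """Merge every subfolder of the extracted cuDNN archive into the CUDA folder."""
    cudnn_dir, cuda_dir = Path(cudnn_dir), Path(cuda_dir)
    copied = []
    for entry in sorted(cudnn_dir.iterdir()):
        if not entry.is_dir():
            continue  # skips top-level files such as LICENSE
        # dirs_exist_ok=True merges into folders that already exist (Python 3.8+).
        shutil.copytree(entry, cuda_dir / entry.name, dirs_exist_ok=True)
        copied.append(entry.name)
    return copied
```

A call would pass the folder where the archive was extracted and the CUDA directory named above. Writing under C:\Program Files typically requires running the prompt as administrator.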

Installing TensorFlow

The environment I finally settled on (for reference):

python3.8.12
cuda_11.6.1_511.6
cudnn_8.3.2.44
tensorflow-gpu 2.7.0
keras 2.7.0

pip install tensorflow-gpu==2.7.0

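Test:

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))

At first this raised an error that asked me to downgrade the protobuf package:

pip install protobuf==3.20.*

After that the test succeeded.

On success it prints the following; if no GPU is detected it prints only []: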

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
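
When version conflicts like the protobuf one appear, it helps to see which versions actually ended up installed. The sketch below uses the standard library's importlib.metadata (Python 3.8+); installed_version is a name invented for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the environment list and the pip commands above.
    for name in ("protobuf", "tensorflow-gpu", "keras"):
        print(name, installed_version(name))
```

Run it inside the activated environment so that it reports on the packages TensorFlow will actually import.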

If you see an error like this:

2023-03-27 12:06:54.443860: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2023-03-27 12:06:54.444235: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

The message means a DLL is missing. Copy the corresponding DLL into:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin
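
Before copying anything, it can be useful to see whether the DLL already exists in some folder on PATH. The sketch below does that lookup; it assumes the library is found through the folders listed in PATH, and find_on_path is a name invented for this example.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return the PATH directories that contain the given file name."""
    path_value = os.environ.get("PATH", "") if path_value is None else path_value
    hits = []
    for folder in path_value.split(os.pathsep):
        if folder and (Path(folder) / filename).is_file():
            hits.append(folder)
    return hits


if __name__ == "__main__":
    # The DLL name is taken from the error message above.
    print(find_on_path("cudart64_110.dll") or "cudart64_110.dll is not on PATH")
```

An empty result suggests that either the DLL is not installed or its folder is missing from PATH.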

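Installing PyTorch

Besides PyTorch itself, torchvision is another very useful package that provides image-related functionality.
The matching torch and torchvision versions are listed in the table below (kept up to date at https://github.com/pytorch/vision):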

torch	torchvision	python
main / nightly	main / nightly	>=3.8, <=3.11
2.0.0	0.15.1	>=3.8, <=3.11
1.13.0	0.14.0	>=3.7.2, <=3.10
1.12.0	0.13.0	>=3.7, <=3.10
1.11.0	0.12.0	>=3.7, <=3.10
1.10.2	0.11.3	>=3.6, <=3.9
1.10.1	0.11.2	>=3.6, <=3.9
1.10.0	0.11.1	>=3.6, <=3.9
1.9.1	0.10.1	>=3.6, <=3.9
1.9.0	0.10.0	>=3.6, <=3.9
1.8.2	0.9.2	>=3.6, <=3.9
1.8.1	0.9.1	>=3.6, <=3.9
1.8.0	0.9.0	>=3.6, <=3.9
1.7.1	0.8.2	>=3.6, <=3.9
1.7.0	0.8.1	>=3.6, <=3.8
1.7.0	0.8.0	>=3.6, <=3.8
1.6.0	0.7.0	>=3.6, <=3.8
1.5.1	0.6.1	>=3.5, <=3.8
1.5.0	0.6.0	>=3.5, <=3.8
1.4.0	0.5.0	==2.7, >=3.5, <=3.8
1.3.1	0.4.2	==2.7, >=3.5, <=3.7
1.3.0	0.4.1	==2.7, >=3.5, <=3.7
1.2.0	0.4.0	==2.7, >=3.5, <=3.7
1.1.0	0.3.0	==2.7, >=3.5, <=3.7
<=1.0.1	0.2.2	==2.7, >=3.5, <=3.7

First decide which PyTorch version to use:

https://download.pytorch.org/whl/torch_stable.html

My setup is:

python3.8.12
cuda_11.6.1_511.6

so I picked the wheels tagged:

cp38
cu116

According to the table, torch 1.12.0 pairs with torchvision 0.13.0.
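
The tags combine into the wheel file name to look for on the download page. The sketch below builds that name from the version, the CUDA tag, and the Python tag. The helper names are invented for this example, and the pattern is inferred from the Windows file names on that page.

```python
import sys


def python_tag(version_info=None):
    """Return the CPython wheel tag, for example 'cp38' for Python 3.8."""
    version_info = sys.version_info if version_info is None else version_info
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_name(package, version, cuda_tag, py_tag, platform="win_amd64"):
    """Build a wheel file name following the pattern used on the download page."""
    return f"{package}-{version}+{cuda_tag}-{py_tag}-{py_tag}-{platform}.whl"


if __name__ == "__main__":
    tag = python_tag((3, 8, 12))  # the Python version used in this article
    print(wheel_name("torch", "1.12.0", "cu116", tag))
    print(wheel_name("torchvision", "0.13.0", "cu116", tag))
```

Calling python_tag() with no argument returns the tag for whichever interpreter runs the script, which is a quick way to avoid downloading a wheel for the wrong Python version.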

I recommend a separate environment for each deep learning framework:

conda create --name torch python=3.8.12
conda activate torch

Install torch and torchvision from the downloaded wheel files:

pip install torch-1.12.0+cu116-cp38-cp38-win_amd64.whl
pip install torchvision-0.13.0+cu116-cp38-cp38-win_amd64.whl

Test Results

TensorFlow

Compare the running time on the CPU and on the GPU:

import tensorflow as tf
import timeit


def cpu_run():
    with tf.device('/cpu:0'):
        cpu_a = tf.random.normal([10000, 1000])
        cpu_b = tf.random.normal([1000, 2000])
        c = tf.matmul(cpu_a, cpu_b)
    return c


def gpu_run():
    with tf.device('/gpu:0'):
        gpu_a = tf.random.normal([10000, 1000])
        gpu_b = tf.random.normal([1000, 2000])
        c = tf.matmul(gpu_a, gpu_b)
    return c


cpu_time = timeit.timeit(cpu_run, number=10)
gpu_time = timeit.timeit(gpu_run, number=10)
print("cpu:", cpu_time, "  gpu:", gpu_time)

The difference between the two timings is obvious.
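
The first call on a device often includes one-time setup cost, so a fairer comparison runs each function once before timing it. The helper below is a generic sketch of that idea using only the standard library. time_after_warmup is a name invented here, and the workload is a stand-in so the example runs without TensorFlow.

```python
import timeit


def time_after_warmup(fn, number=10, warmup=1):
    """Call fn a few times untimed, then return the total time of `number` timed calls."""
    for _ in range(warmup):
        fn()
    return timeit.timeit(fn, number=number)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without TensorFlow installed.
    def workload():
        return sum(i * i for i in range(10_000))

    print("seconds for 10 calls:", time_after_warmup(workload))
```

In the benchmark above, the two timeit calls could be replaced with time_after_warmup(cpu_run) and time_after_warmup(gpu_run).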

PyTorch

import torch

flag = torch.cuda.is_available()
print(flag)

ngpu = 1
# Decide which device we want to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")
print(device)
print(torch.cuda.get_device_name(0))
print(torch.rand(3, 3).cuda())

The output shows that the GPU environment is working:

True
cuda:0
NVIDIA GeForce RTX 3080 Laptop GPU
tensor([[0.2823, 0.0544, 0.1159],
        [0.8368, 0.2139, 0.7360],
        [0.7613, 0.5881, 0.5153]], device='cuda:0')

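This article is part of the Tencent Cloud self-media syndication program and is shared from a WeChat official account.

Originally published: 2023-04-06. In case of infringement, contact cloudcommunity@tencent.com to have it removed.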
