Installing the nnDetection Environment on CentOS 7

2023-12-15 00:58:38

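Because of updates to nnU-Net and nnDetection, a default installation may end up unable to use the GPU. This post records the nnDetection environment setup in some detail.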

1. Create a virtual environment:

Please note that nndetection requires Python 3.8+. Please use PyTorch 1.X version for now and not 2.0

This requires Python 3.8 or later and a PyTorch 1.x release (1.0 or later, but below 2.0). On balance I chose Python 3.9:

conda create --name xxx python==3.9
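The two constraints quoted above can be written down as a small check. This is a minimal sketch, not part of nnDetection: the function names python_supported and torch_supported are hypothetical, and the rule "major version equals 1" is my reading of "PyTorch 1.X ... and not 2.0".

```python
import sys
from typing import Sequence


def python_supported(version_info: Sequence = sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def torch_supported(version: str) -> bool:
    """Return True for PyTorch 1.x version strings such as '1.12.1+cu102'."""
    major = int(version.split(".")[0])
    return major == 1


# Check the Python chosen in this post and two candidate torch versions.
print(python_supported((3, 9, 0)))
print(torch_supported("1.12.1+cu102"))
print(torch_supported("2.0.0"))
```

Calling python_supported() with no argument checks the interpreter that is currently running, which is a quick way to confirm that the new conda environment is the active one.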

2. Install PyTorch in the virtual environment.

Install CUDA (>10.1) and cudnn (make sure to select compatible versions!)

This requires CUDA (>10.1) together with a matching cuDNN. CUDA was already installed on this machine, so I first checked the local CUDA version:

(xxxxxxx) [xxxx@xxxxxxxxx ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0

That meets the requirement.
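When this check has to be repeated on several machines, the release number can be pulled out of the nvcc output programmatically. The following is a minimal sketch under the assumption that the output contains a "release X.Y" field, as it does in the transcript above; parse_nvcc_release is a hypothetical helper name.

```python
import re
from typing import Tuple


def parse_nvcc_release(output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the 'release X.Y' field of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return int(match.group(1)), int(match.group(2))


# Line copied from the nvcc output shown above.
sample = "Cuda compilation tools, release 11.1, V11.1.74"
release = parse_nvcc_release(sample)

# nnDetection asks for CUDA > 10.1; tuple comparison handles major and minor together.
print(release, release > (10, 1))
```

On a real system the text would come from running nvcc --version, for example through subprocess.run with capture_output=True.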

[Optional] Depending on your GPU you might need to set TORCH_CUDA_ARCH_LIST, check compute capabilities.

This step is optional; I had no need to check the GPU's compute capability.

Install torch (make sure to match the pytorch and CUDA versions!) (requires pytorch >1.10+) and torchvision(make sure to match the versions!).

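This raises the PyTorch requirement further, to 1.10 or later, and the build must also match the CUDA version. On the official site I found only one entry that qualifies, torch 1.10.0 built for cu111, so I installed that version of PyTorch: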

Collecting torch==1.10.0+cu111
  Downloading https://download.pytorch.org/whl/cu111/torch-1.10.0%2Bcu111-cp39-cp39-linux_x86_64.whl (2137.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━ 1.8/2.1 GB 6.4 kB/s eta 12:30:51
ERROR: Wheel 'torch' located at /tmp/pip-unpack-zyhz8qfc/torch-1.10.0+cu111-cp39-cp39-linux_x86_64.whl is invalid.

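The download failed. However, a wheel fetched this way appears to bundle its own CUDA runtime (cu111 here), so could I download a build that bundles cu102 instead? (I ruled out the cu12 builds mainly because the local NVIDIA driver reports NVIDIA-SMI 450.57, Driver Version: 450.57, CUDA Version: 11.0.) Here is the attempt: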

So I installed this version of PyTorch:

(xxxxxxx) [xxxx@xxxxxxxxx ~]$ pip install torch==1.12.1+cu102 torchvision==0.13.1+cu102 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu102
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu102
Collecting torch==1.12.1+cu102
  Downloading https://download.pytorch.org/whl/cu102/torch-1.12.1%2Bcu102-cp39-cp39-linux_x86_64.whl (776.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 776.4/776.4 MB 1.3 MB/s eta 0:00:00
Collecting torchvision==0.13.1+cu102
  Downloading https://download.pytorch.org/whl/cu102/torchvision-0.13.1%2Bcu102-cp39-cp39-linux_x86_64.whl (19.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 1.5 MB/s eta 0:00:00
Collecting torchaudio==0.12.1
  Downloading https://download.pytorch.org/whl/cu102/torchaudio-0.12.1%2Bcu102-cp39-cp39-linux_x86_64.whl (3.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.7/3.7 MB 1.3 MB/s eta 0:00:00
Collecting typing-extensions (from torch==1.12.1+cu102)
  Downloading typing_extensions-4.9.0-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy (from torchvision==0.13.1+cu102)
  Using cached numpy-1.26.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting requests (from torchvision==0.13.1+cu102)
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision==0.13.1+cu102)
  Using cached Pillow-10.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.5 kB)
Collecting charset-normalizer<4,>=2 (from requests->torchvision==0.13.1+cu102)
  Using cached charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests->torchvision==0.13.1+cu102)
  Using cached idna-3.6-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests->torchvision==0.13.1+cu102)
  Using cached urllib3-2.1.0-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests->torchvision==0.13.1+cu102)
  Using cached certifi-2023.11.17-py3-none-any.whl.metadata (2.2 kB)
Using cached Pillow-10.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.5 MB)
Using cached numpy-1.26.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Downloading typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Using cached certifi-2023.11.17-py3-none-any.whl (162 kB)
Using cached charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)
Using cached idna-3.6-py3-none-any.whl (61 kB)
Using cached urllib3-2.1.0-py3-none-any.whl (104 kB)
Installing collected packages: urllib3, typing-extensions, pillow, numpy, idna, charset-normalizer, certifi, torch, requests, torchvision, torchaudio
Successfully installed certifi-2023.11.17 charset-normalizer-3.3.2 idna-3.6 numpy-1.26.2 pillow-10.1.0 requests-2.31.0 torch-1.12.1+cu102 torchaudio-0.12.1+cu102 torchvision-0.13.1+cu102 typing-extensions-4.9.0 urllib3-2.1.0

PyTorch 1.12.1 installed successfully. Next, check whether CUDA is usable with this build:

Python 3.9.0 (default, Nov 15 2020, 14:28:56) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True

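As shown above, it works. It seems that even when a server has no CUDA toolkit installed, a torch wheel that bundles a precompiled CUDA runtime can stand in for it (although the bundled version probably still must not exceed the CUDA Version: 11.0 reported by the driver).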
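The guess in the previous paragraph can be expressed as a small comparison between the CUDA tag in the wheel version and the CUDA version the driver reports. This is only a sketch of the post's tentative rule, not an authoritative compatibility check; cuda_from_wheel_tag and runs_on_driver are hypothetical names, and the tag format "+cuXYZ" (last digit is the minor version) is assumed from the wheel names seen above.

```python
import re
from typing import Tuple


def cuda_from_wheel_tag(version: str) -> Tuple[int, int]:
    """Map a wheel version such as '1.12.1+cu102' to its CUDA (major, minor)."""
    match = re.search(r"\+cu(\d+)(\d)$", version)
    if match is None:
        raise ValueError(f"no +cuXYZ tag in {version!r}")
    return int(match.group(1)), int(match.group(2))


def runs_on_driver(wheel_version: str, driver_cuda: Tuple[int, int]) -> bool:
    """Tentative rule from this post: bundled CUDA must not exceed the driver's CUDA."""
    return cuda_from_wheel_tag(wheel_version) <= driver_cuda


# "CUDA Version: 11.0" as reported by nvidia-smi on the machine in this post.
driver_cuda = (11, 0)
for wheel in ("1.10.0+cu111", "1.12.1+cu102"):
    print(wheel, cuda_from_wheel_tag(wheel), runs_on_driver(wheel, driver_cuda))
```

Under this rule the cu102 wheel passes and the cu111 wheel does not, which matches the choice made above. Whether a newer bundled runtime really fails on an older driver was not tested in this post.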

3. Install the nndet framework

Clone nnDetection, cd [path_to_repo] and pip install -e .

(lungdoc) [pacs@localhost ~]$ git clone https://github.com/MIC-DKFZ/nnDetection.git
Cloning into 'nnDetection'...
remote: Enumerating objects: 1449, done.
remote: Counting objects: 100% (154/154), done.
remote: Compressing objects: 100% (65/65), done.
remote: Total 1449 (delta 119), reused 89 (delta 89), pack-reused 1295
Receiving objects: 100% (1449/1449), 1.29 MiB | 1.72 MiB/s, done.
Resolving deltas: 100% (848/848), done.
(lungdoc) [pacs@localhost ~]$ cd nnDetection
(lungdoc) [pacs@localhost nnDetection]$ pip install -e .
Obtaining file:///home/pacs/nnDetection
  Preparing metadata (setup.py) ... done
...........................................
    RuntimeError:
    The detected CUDA version (11.1) mismatches the version that was used to compile
    PyTorch (10.2). Please make sure to use the same CUDA versions.
    
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.

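The other dependencies installed without trouble, but compiling nms.cu used the CUDA toolkit installed locally on the machine (11.1), which does not match the CUDA version PyTorch was built with (10.2).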
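The failure comes down to comparing two version numbers: the local toolkit that nvcc reports and the CUDA version the installed PyTorch was compiled with (available at runtime as torch.version.cuda). The sketch below is a simplified model of that comparison, not PyTorch's own check; classify_cuda_pair is a hypothetical name, and which kinds of mismatch are fatal may depend on the PyTorch version.

```python
from typing import Tuple


def classify_cuda_pair(toolkit: Tuple[int, int], torch_built_with: Tuple[int, int]) -> str:
    """Classify how the local CUDA toolkit relates to the CUDA PyTorch was built with."""
    if toolkit == torch_built_with:
        return "match"
    if toolkit[0] == torch_built_with[0]:
        return "minor-mismatch"
    return "major-mismatch"


# Values from this post: nvcc reports 11.1, the installed wheel is torch 1.12.1+cu102.
print(classify_cuda_pair((11, 1), (10, 2)))
```

Running a check like this before pip install -e . makes the problem visible without waiting for the extension build to fail.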

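To avoid redoing the earlier steps, I installed CUDA 10.2 and the matching cuDNN library on this machine. (The cu102 runtime inside the virtual environment is presumably limited to running precompiled code and cannot be used to compile extensions on the spot.) The installers were already on this machine, so I did not download them again. Then I ran the installation again.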

It still failed, frustratingly. I then checked an older environment on another system: PyTorch 1.9.1 with CUDA 11.1.

Uninstall PyTorch again:

torch                   1.12.1+cu102
torchaudio              0.12.1+cu102
torchvision             0.13.1+cu102

Install:

torch                   1.12.1

Compiling again still failed with the earlier error.

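Searching on the error message led me to the Zhihu guide 【BUG】关于Pytoch中CUDA扩展的本地安装 ("[BUG] On locally installing CUDA extensions in PyTorch"), which I tried: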

## The error came from cloneable.h in the torch API. After some experimentation, the three statements on lines 46, 58, and 70 of cloneable.h,

copy->parameters_.size() == parameters_.size()

copy->buffers_.size() == buffers_.size()

copy->children_.size() == children_.size()
## were changed, respectively, to

copy->parameters_.size() == this->parameters_.size()

copy->buffers_.size() == this->buffers_.size()

copy->children_.size() == this->children_.size()
## After saving the file, the installation succeeded.

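Hats off to the author of that guide. A commenter there pointed out that the root cause is a gcc version that is too old; gcc 5.4 on this machine triggers the same error.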

Reinstall torchaudio and torchvision:

(lungdoc) [pacs@localhost software10.2]$ pip install torch==1.12.1+cu102 torchvision==0.13.1+cu102 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu102

4. Set the environment variables:

Set environment variables (more info can be found below):

det_data: [required] Path to the source directory where all the data will be located
det_models: [required] Path to directory where all models will be saved
OMP_NUM_THREADS=1: [required] Needs to be set! Otherwise bad things will happen... Refer to batchgenerators documentation.
det_num_threads: [recommended] Number processes to use for augmentation (at least 6, default 12)
det_verbose: [optional] Can be used to deactivate progress bars (activated by default)
MLFLOW_TRACKING_URI: [optional] Specify the logging directory of mlflow. Refer to the mlflow documentation for more information.
```
export det_data='/media/XXX/nnDetection_file/data1'
export det_models='/media/XXX/nnDetection_file/model1'
export OMP_NUM_THREADS=1
export det_num_threads=2
```
With these lines in .bashrc, run source .bashrc to apply them, and the setup is complete.
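Before starting a long preprocessing or training run, it can help to confirm that the required variables are actually visible to Python. The following is a minimal sketch based only on the variable list quoted above; check_nndet_env is a hypothetical helper and not part of nnDetection.

```python
import os
from typing import List, Mapping

# Variables marked [required] in the list quoted above.
REQUIRED = ("det_data", "det_models", "OMP_NUM_THREADS")


def check_nndet_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    omp = env.get("OMP_NUM_THREADS")
    if omp and omp != "1":
        problems.append(f"OMP_NUM_THREADS is {omp}, expected 1")
    return problems


# The values exported in this post, written as a plain dict for illustration.
example_env = {
    "det_data": "/media/XXX/nnDetection_file/data1",
    "det_models": "/media/XXX/nnDetection_file/model1",
    "OMP_NUM_THREADS": "1",
    "det_num_threads": "2",
}
print(check_nndet_env(example_env))

# Check the variables of the current shell session.
print(check_nndet_env(os.environ))
```

If the second call reports missing variables in a new terminal, the exports are probably not being loaded from .bashrc.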

Source: https://blog.csdn.net/qq_36401512/article/details/134925670