Unable to run NVIDIA Docker image on Azure

I set up a Data Science Virtual Machine for Linux (Ubuntu) on Azure and want to verify the GPU installation by following these TensorFlow directions. The first command shows that a GPU, a Tesla M60, is available:

$ lspci | grep -i nvidia

db4d:00:00.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)

The second command fails with a cryptic message:

$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused "process_linux.go:385: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=10.0 brand=tesla,driver>=384,driver<385 --pid=31149 /data/docker/overlay2/16e2b65fa0831681029432e3936005fa2796afd6d5a50c297d6bc0693e57a0b0/merged]\\nnvidia-container-cli: requirement error: unsatisfied condition: driver < 385\\n\""": unknown.

How can I set up a machine to run the NVIDIA Docker image?

asked Jan 21 at 16:40

mmorin


linux gpu nvidia-graphics-card docker azure


1 Answer

This NVIDIA GitHub issue and this part of the error message:

--require=cuda>=10.0 brand=tesla,driver>=384,driver<385

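suggest it is a driver issue. As far as I can tell, the image accepts either a driver recent enough for CUDA 10.0 or a Tesla-brand 384.x driver, and the 396.44 driver on this VM (see the nvidia-smi output further down) is neither.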
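One way to read that requirement string is sketched below. It assumes the convention documented for nvidia-container-runtime, where space-separated constraints are alternatives and comma-separated constraints must all hold, and it assumes that cuda>=10.0 corresponds to a Linux driver of at least 410.x. The function name image_requirement_met is hypothetical and only illustrates the logic; it is not part of any NVIDIA tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: would a host driver satisfy
#   cuda>=10.0 brand=tesla,driver>=384,driver<385
# Assumption: space = OR, comma = AND.
# Assumption: "cuda>=10.0" is met by Linux drivers 410.x and newer.
image_requirement_met() {
  local driver_version=$1
  local brand=${2:-tesla}
  local major=${driver_version%%.*}   # "396.44" -> "396"

  # Alternative 1: the driver is new enough for CUDA 10.0.
  if [ "$major" -ge 410 ]; then
    return 0
  fi

  # Alternative 2: Tesla-brand GPU with a 384.x driver.
  if [ "$brand" = "tesla" ] && [ "$major" -ge 384 ] && [ "$major" -lt 385 ]; then
    return 0
  fi

  return 1
}

# The driver version reported on this VM.
if image_requirement_met 396.44; then
  echo "satisfied"
else
  echo "unsatisfied"
fi
```

With 396.44 this prints "unsatisfied", which matches the "unsatisfied condition: driver < 385" in the error. On a real host, the version could be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader instead of being hard-coded.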

Solution using Docker, but without your image

The simplest solution is to use a different Azure image: both NVIDIA GPU Cloud Image and NVIDIA GPU Cloud Image for Deep Learning and HPC will run that Docker image.

Solution using your image, but without Docker

Alternatively, you can still use Data Science Virtual Machine for Linux (Ubuntu) but without the containerisation of Docker. Conda, for example, can set up an environment (where the initial yes | answers yes to the prompts to install the packages):

yes | conda create -n TF python=2.7 scipy==1.0.0 tensorflow-gpu==1.8 Keras==2.1.3 pandas==0.22.0 numpy==1.14.0 matplotlib scikit-learn

export PATH=$PATH:/data/anaconda/envs/TF/bin

export PATH=$PATH:/data/anaconda/envs/py35/bin
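Re-running those export lines in the same shell appends the same directories again. A minimal sketch of a guard against that follows; path_append is a hypothetical helper name, and the directories are the ones used above. Because the entries are appended, any python found earlier on PATH still takes precedence, which command -v python can confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to PATH only if it is not
# already listed, so repeated calls do not create duplicate entries.
path_append() {
  local dir=$1
  case ":$PATH:" in
    *":$dir:"*) ;;                 # already present, nothing to do
    *) PATH="$PATH:$dir" ;;
  esac
  export PATH
}

path_append /data/anaconda/envs/TF/bin
path_append /data/anaconda/envs/py35/bin

# Show which interpreter the shell resolves after the change, if any.
command -v python || true
```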

These commands pull the official models from TensorFlow:

git clone https://github.com/tensorflow/models.git

export PYTHONPATH="$PYTHONPATH:./models"

A first call to nvidia-smi shows that the GPU has no running processes:

$ nvidia-smi
Mon Jan 21 16:26:02 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44                 Driver Version: 396.44                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           On   | 0000DB4D:00:00.0 Off |                  Off |
| N/A   39C    P8    14W / 150W |      0MiB /  8129MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

When you leave the official MNIST model running in the background for a little while, you will see one process using the GPU:

$ python models/official/mnist/mnist.py &
[1] 25967

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44                 Driver Version: 396.44                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           On   | 0000DB4D:00:00.0 Off |                  Off |
| N/A   37C    P0    77W / 150W |   7851MiB /  8129MiB |     93%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     26077      C   python                                      7840MiB |
+-----------------------------------------------------------------------------+
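The same before-and-after comparison can be done without reading the table by eye. The sketch below assumes the CSV query mode of nvidia-smi (--query-gpu with --format=csv,noheader,nounits), which is expected to print one line per GPU such as "93, 7851". The function name gpu_in_use is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads lines of "utilization, memory_used_MiB"
# on stdin (one per GPU) and succeeds if any GPU shows activity.
gpu_in_use() {
  awk -F', *' '$1 > 0 || $2 > 0 { busy = 1 } END { exit (busy ? 0 : 1) }'
}

# Only query the GPU when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  if nvidia-smi --query-gpu=utilization.gpu,memory.used \
       --format=csv,noheader,nounits | gpu_in_use; then
    echo "GPU is in use"
  else
    echo "GPU is idle"
  fi
fi
```

Fed the values from the two tables above, "0, 0" would be reported as idle and "93, 7851" as in use.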

answered Jan 21 at 16:40

mmorin




