Unable to run NVIDIA Docker image on Azure
I set up a Data Science Virtual Machine for Linux (Ubuntu) on Azure and want to verify the GPU installation by following these TensorFlow directions. The first command shows that a Tesla M60 GPU is available:
$ lspci | grep -i nvidia
db4d:00:00.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)
The second command fails with a cryptic message:
$ sudo docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused "process_linux.go:385: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=10.0 brand=tesla,driver>=384,driver<385 --pid=31149 /data/docker/overlay2/16e2b65fa0831681029432e3936005fa2796afd6d5a50c297d6bc0693e57a0b0/merged]\\nnvidia-container-cli: requirement error: unsatisfied condition: driver < 385\\n\""": unknown.
How can I set up a machine to run the NVIDIA Docker image?
linux gpu nvidia-graphics-card docker azure
asked Jan 21 at 16:40
mmorin
1 Answer
This NVIDIA GitHub issue and this part of the error message:
--require=cuda>=10.0 brand=tesla,driver>=384,driver<385
suggest it is a driver issue. I don't quite understand why.
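One way to read that requirement string, assuming it follows the semantics documented for the NVIDIA container runtime's NVIDIA_REQUIRE_* variables (space-separated groups are ORed, comma-separated conditions inside a group are ANDed): the container needs either CUDA 10.0 or newer, or a Tesla-branded GPU with a 384.x driver. The host here runs driver 396.44 (see the nvidia-smi output further down), which according to NVIDIA's release notes tops out at CUDA 9.2, so neither branch holds. The sketch below is a simplified illustration of that evaluation, not the actual nvidia-container-cli logic; the function names are hypothetical, only the keys seen in the error message are handled, and the CUDA version 9.2 is passed in by hand as an assumption.

```shell
# Hypothetical helpers; relies on GNU "sort -V" for version ordering.

# version_cmp HAVE OP WANT: compare two dotted version strings.
version_cmp() {
  local have op want lowest
  have=$1; op=$2; want=$3
  lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
  case $op in
    '>=') [ "$lowest" = "$want" ] ;;
    '<')  [ "$have" != "$want" ] && [ "$lowest" = "$have" ] ;;
    *)    return 2 ;;
  esac
}

# check_condition COND DRIVER CUDA BRAND: evaluate one condition.
check_condition() {
  local cond
  cond=$1
  case $cond in
    'driver>='*) version_cmp "$2" '>=' "${cond#*=}" ;;
    'driver<'*)  version_cmp "$2" '<'  "${cond#*<}" ;;
    'cuda>='*)   version_cmp "$3" '>=' "${cond#*=}" ;;
    'brand='*)   [ "$4" = "${cond#*=}" ] ;;
    *)           return 2 ;;   # key not covered by this sketch
  esac
}

# check_requirement REQ DRIVER CUDA BRAND: succeed if any space-separated
# group has all of its comma-separated conditions satisfied.
check_requirement() {
  local req driver cuda brand group cond ok old_ifs
  req=$1; driver=$2; cuda=$3; brand=$4
  for group in $req; do          # unquoted on purpose: split on spaces
    ok=1
    old_ifs=$IFS
    IFS=,
    for cond in $group; do       # split the group on commas
      check_condition "$cond" "$driver" "$cuda" "$brand" || ok=0
    done
    IFS=$old_ifs
    if [ "$ok" -eq 1 ]; then return 0; fi
  done
  return 1
}

req='cuda>=10.0 brand=tesla,driver>=384,driver<385'
if check_requirement "$req" 396.44 9.2 tesla; then
  echo "satisfied"
else
  echo "unsatisfied"
fi
```

With the values from this machine the check reports unsatisfied: the first group fails on cuda>=10.0 and the second fails on driver<385, which matches the condition named in the error message.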
Solution using Docker, but without your image
The simplest solution is to use different Azure images: both NVIDIA GPU Cloud Image and NVIDIA GPU Cloud Image for Deep Learning and HPC will run that Docker image.
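Before provisioning a different image, it can help to confirm that the driver really is the limiting factor. The sketch below compares a driver version against 410.48, which NVIDIA's CUDA release notes list as the minimum Linux driver for CUDA 10.0. The function name driver_supports_cuda10 is hypothetical, and the version is hard-coded from the nvidia-smi output in this answer; on a real VM it could be read with the nvidia-smi query shown in the comment.

```shell
# driver_supports_cuda10 VERSION: succeed if VERSION is at least 410.48.
# Hypothetical helper; relies on GNU "sort -V" for version ordering.
driver_supports_cuda10() {
  local min lowest
  min=410.48
  lowest=$(printf '%s\n%s\n' "$1" "$min" | sort -V | head -n 1)
  [ "$lowest" = "$min" ]
}

# On the VM itself the version could be read with:
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
driver=396.44   # value taken from the nvidia-smi output in this answer

if driver_supports_cuda10 "$driver"; then
  echo "driver $driver: new enough for CUDA 10.0 images"
else
  echo "driver $driver: too old for CUDA 10.0 images"
fi
```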
Solution using your image, but without Docker
Alternatively, you can keep the Data Science Virtual Machine for Linux (Ubuntu) and skip Docker's containerisation. Conda, for example, can set up an environment (the leading yes | answers yes to the package installation prompts):
yes | conda create -n TF python=2.7 scipy==1.0.0 tensorflow-gpu==1.8 Keras==2.1.3 pandas==0.22.0 numpy==1.14.0 matplotlib scikit-learn
export PATH=$PATH:/data/anaconda/envs/TF/bin
export PATH=$PATH:/data/anaconda/envs/py35/bin
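Re-running the export lines above in the same shell appends the same directories again. A minimal sketch of an idempotent alternative, assuming a POSIX shell; path_append is a hypothetical helper, not something conda provides:

```shell
# path_append DIR: append DIR to PATH unless it is already listed.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

path_append /data/anaconda/envs/TF/bin
path_append /data/anaconda/envs/TF/bin   # second call leaves PATH unchanged
```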
These commands clone the official TensorFlow models repository and add it to PYTHONPATH:
git clone https://github.com/tensorflow/models.git
export PYTHONPATH="$PYTHONPATH:./models"
A first call to nvidia-smi shows that the GPU has no running processes:
$ nvidia-smi
Mon Jan 21 16:26:02 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44 Driver Version: 396.44 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M60 On | 0000DB4D:00:00.0 Off | Off |
| N/A 39C P8 14W / 150W | 0MiB / 8129MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
When you leave the official MNIST model running in the background for a little while, you will see one process using the GPU:
$ python models/official/mnist/mnist.py &
[1] 25967
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.44 Driver Version: 396.44 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M60 On | 0000DB4D:00:00.0 Off | Off |
| N/A 37C P0 77W / 150W | 7851MiB / 8129MiB | 93% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 26077 C python 7840MiB |
+-----------------------------------------------------------------------------+
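For scripting, the same check can be done without reading the table by eye. The sketch below assumes nvidia-smi's CSV query mode (--query-compute-apps with --format=csv,noheader) and that each output line has the form "pid, process_name, used_memory"; gpu_pids is a hypothetical helper, and the sample line is built from the values in the table above rather than captured from a real run.

```shell
# gpu_pids: read "pid, process_name, used_memory" CSV lines on stdin
# and print only the PIDs, one per line.
gpu_pids() {
  awk -F', *' '$1 ~ /^[0-9]+$/ { print $1 }'
}

# On the VM:
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory \
#              --format=csv,noheader | gpu_pids
# Here, with a sample line matching the table above:
printf '26077, python, 7840 MiB\n' | gpu_pids
```

An empty result means no compute process currently holds the GPU.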
answered Jan 21 at 16:40
mmorin