Kubernetes includes experimental support for managing NVIDIA GPUs spread across nodes. This page describes how users can consume GPUs and the current limitations.

To enable GPU support:

- Run `nvidia-docker-plugin` to confirm that all drivers have been loaded.
- The alpha feature gate `Accelerators` has to be set to true across the system: `--feature-gates="Accelerators=true"`.
- Nodes must use the `docker` engine as the container runtime.

Once these conditions are met, the nodes will automatically discover and expose all Nvidia GPUs as a schedulable resource.
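One way to set the gate on a node, sketched here using the same `/etc/default/kubelet` pattern as the node-labeling workflow later on this page (the file path and the `KUBELET_OPTS` variable are distribution-specific assumptions; the gate must also be set on the master components):

```shell
# Append the alpha feature gate to the kubelet's flags (a sketch; adjust
# the config file location to match your distribution or init system).
source /etc/default/kubelet
KUBELET_OPTS="$KUBELET_OPTS --feature-gates=Accelerators=true"
echo "KUBELET_OPTS=$KUBELET_OPTS" > /etc/default/kubelet
```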
Nvidia GPUs can be consumed via container-level resource requirements using the resource name `alpha.kubernetes.io/nvidia-gpu`.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: gpu-container-1
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 2 # requesting 2 GPUs
    - name: gpu-container-2
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 3 # requesting 3 GPUs
```
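Once such a pod is created, you can confirm that a node advertises GPU capacity. A sketch, assuming the manifest above was saved as `gpu-pod.yaml` (a hypothetical filename) and that `<node-name>` is replaced with one of your GPU nodes:

```shell
kubectl create -f gpu-pod.yaml
# The GPU resource appears under Capacity/Allocatable on GPU nodes.
kubectl describe node <node-name> | grep alpha.kubernetes.io/nvidia-gpu
```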
GPUs can only be specified in the `limits` section, which means:

- You can specify GPU `limits` without specifying `requests`, because Kubernetes will use the limit as the request value by default.
- You can specify GPU in both `limits` and `requests`, but these two values must be equal.
- You cannot specify GPU `requests` without specifying `limits`.
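For example, the following is the one valid way to include `requests`; a minimal sketch:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-equal
spec:
  containers:
    - name: gpu-container
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 1
        requests:
          alpha.kubernetes.io/nvidia-gpu: 1 # must equal the limit
```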
If your nodes are running different types of GPUs, use Node Labels and Node Selectors to schedule pods to appropriate GPUs. The following is an illustration of this workflow:
As part of your Node bootstrapping, identify the GPU hardware type on your nodes and expose it as a node label.
```shell
NVIDIA_GPU_NAME=$(nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 | sed -e 's/ /-/g')
source /etc/default/kubelet
KUBELET_OPTS="$KUBELET_OPTS --node-labels='alpha.kubernetes.io/nvidia-gpu-name=$NVIDIA_GPU_NAME'"
echo "KUBELET_OPTS=$KUBELET_OPTS" > /etc/default/kubelet
```
Specify the GPU types a pod can use via Node Affinity rules.
```yaml
kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/affinity: >
      {
        "nodeAffinity": {
          "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
              {
                "matchExpressions": [
                  {
                    "key": "alpha.kubernetes.io/nvidia-gpu-name",
                    "operator": "In",
                    "values": ["Tesla-K80", "Tesla-P100"]
                  }
                ]
              }
            ]
          }
        }
      }
spec:
  containers:
    - name: gpu-container-1
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 2
```
This ensures that the pod is scheduled to a node that has a Tesla K80 or a Tesla P100 Nvidia GPU. Note that the label values use dashes (`Tesla-K80`) because the bootstrapping script above replaces spaces in the GPU name.
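If only a single GPU type is acceptable, the same node label can be matched with a plain `nodeSelector` instead of the affinity annotation; a minimal sketch (the pod name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-k80
spec:
  nodeSelector:
    alpha.kubernetes.io/nvidia-gpu-name: Tesla-K80
  containers:
    - name: gpu-container-1
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 2
```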
The API presented here will change in an upcoming release to better support GPUs, and hardware accelerators in general, in Kubernetes.
As of now, CUDA libraries are expected to be pre-installed on the nodes, and the driver's default installation location may not be accessible from containers. To mitigate this, you can copy the libraries to a more permissive folder in `/var/lib/` or change the permissions directly. (Future releases will automatically perform this operation.)
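A sketch of the manual copy, assuming the `nvidia-375` driver layout used in the example below:

```shell
# Copy the driver libraries to a world-readable location under /var/lib/.
mkdir -p /var/lib/nvidia-375
cp -r /usr/lib/nvidia-375/* /var/lib/nvidia-375/
chmod -R a+rX /var/lib/nvidia-375
```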
Pods can then access the libraries using `hostPath` volumes.
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: gpu-pod
spec:
  containers:
    - name: gpu-container-1
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 1
      volumeMounts:
        - mountPath: /usr/local/nvidia/bin
          name: bin
        - mountPath: /usr/lib/nvidia
          name: lib
  volumes:
    - hostPath:
        path: /usr/lib/nvidia-375/bin
      name: bin
    - hostPath:
        path: /usr/lib/nvidia-375
      name: lib
```
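To verify that a container can actually reach the driver through these mounts, a hypothetical check (assumes a stock `debian:jessie` image, the `nvidia-375` paths above, and that `LD_LIBRARY_PATH` must point at the mounted libraries):

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: gpu-verify
spec:
  restartPolicy: Never
  containers:
    - name: verify
      image: debian:jessie
      # nvidia-smi and its libraries come entirely from the host mounts.
      command: ["/bin/sh", "-c", "LD_LIBRARY_PATH=/usr/lib/nvidia /usr/local/nvidia/bin/nvidia-smi"]
      resources:
        limits:
          alpha.kubernetes.io/nvidia-gpu: 1
      volumeMounts:
        - mountPath: /usr/local/nvidia/bin
          name: bin
        - mountPath: /usr/lib/nvidia
          name: lib
  volumes:
    - hostPath:
        path: /usr/lib/nvidia-375/bin
      name: bin
    - hostPath:
        path: /usr/lib/nvidia-375
      name: lib
```

If the mounts and drivers are healthy, `kubectl logs gpu-verify` should show the familiar `nvidia-smi` device table.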