Dynamic Allocation of GPU Resources¶

This feature provides dynamic allocation of GPU resources, allowing you to make real-time changes to allocated vGPU resources without the need to reload, reset, or restart the entire runtime environment. The goal is to minimize the impact on business operations, ensuring continuous and stable operation while flexibly adjusting GPU resources based on actual needs.

Use Cases¶

Elastic Resource Allocation : Quickly adjust GPU resources to meet new performance requirements when business needs or workloads change.
Immediate Response : Swiftly increase GPU resources in response to sudden high loads or business demands without interrupting operations, ensuring service stability and performance.

Steps¶

Below is a specific operational example demonstrating how to dynamically adjust the compute and memory resources of a vGPU without restarting the vGPU Pod:

Create a vGPU Pod¶

First, we create a vGPU Pod using the following YAML, with initial unlimited compute power and a memory limit of 200Mb.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: gpu-burn-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-burn-test
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: gpu-burn-test
    spec:
      containers:
        - name: container-1
          image: docker.io/chrstnhntschl/gpu_burn:latest
          command:
            - sleep
            - '100000'
          resources:
            limits:
              cpu: 1m
              memory: 1Gi
              nvidia.com/gpucores: '0'
              nvidia.com/gpumem: '200'
              nvidia.com/vgpu: '1'

Dynamic Allocation of Compute Power¶

To modify the compute power to 10%, follow these steps:

Enter the container:

kubectl exec -it <pod-name> -- /bin/bash

Execute:
```
export CUDA_DEVICE_SM_LIMIT=10
```
Run directly in the current terminal:
```
./gpu_burn 60
```
The program will take effect immediately. Note that you should not exit the current bash terminal.

Dynamic Allocation of Memory¶

To modify the memory to 300Mb, follow these steps: