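Configure Auto-Scaling
Configure auto-scaling for AI applications, including automatic scaling of vGPU resource requests, limits, and so on.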
[Under Construction]
apiVersion: tensor-fusion.ai/v1
kind: WorkloadProfile
metadata:
  name: auto-scale-template
spec:
  qos: medium
  autoRequests: true
  autoLimits: true
  autoReplicas: true
  # When autoReplicas is enabled, this value is only the initial replica count.
  # Later changes to this field do not change the replica count; it follows the actual load.
  replicas: 2
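
For contrast, here is a minimal sketch of a profile that scales resource requests and limits but keeps the replica count fixed. It uses only the fields shown above. The profile name `fixed-replicas-template` is hypothetical, and the behavior of `autoReplicas: false` is inferred from the comment in the example above rather than confirmed by this page, so treat it as an assumption.

```yaml
apiVersion: tensor-fusion.ai/v1
kind: WorkloadProfile
metadata:
  # Hypothetical name, chosen for illustration only.
  name: fixed-replicas-template
spec:
  qos: medium
  autoRequests: true
  autoLimits: true
  # Assumption: with autoReplicas disabled, replicas is treated as a
  # fixed count instead of an initial value adjusted by load.
  autoReplicas: false
  replicas: 2
```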