WorkloadProfile
WorkloadProfile is the Schema for the workloadprofiles API.
Resource Information
| Field | Value |
|---|---|
| API Version | tensor-fusion.ai/v1 |
| Kind | WorkloadProfile |
| Scope | Namespaced |
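Since the resource is namespaced, instances can be listed with `kubectl get workloadprofiles -n <namespace>` (or fully qualified as `workloadprofiles.tensor-fusion.ai`).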
Spec
WorkloadProfileSpec defines the desired state of WorkloadProfile.
| Property | Type | Description |
|---|---|---|
| autoScalingConfig | object | AutoScalingConfig configured here overrides the Pool's schedulingConfig. This field cannot be fully expressed through annotations; to enable auto-scaling via annotations, set `tensor-fusion.ai/auto-resources` |
| gpuCount | integer<int32> | The number of GPUs used by the workload; defaults to 1 |
| gpuModel | string | GPUModel specifies the required GPU model (e.g., "A100", "H100") |
| isLocalGPU | boolean | Schedule the workload onto the same GPU server that runs its vGPU worker for best performance; defaults to false |
| nodeAffinity | object | NodeAffinity specifies the node affinity requirements for the workload |
| poolName | string | Name of the GPU pool that the workload is scheduled into |
| qos | string | QoS defines the quality-of-service level for the client. Allowed values: low, medium, high, critical |
| replicas | integer<int32> | If replicas is not set, the replica count is dynamic, based on pending Pods. If isLocalGPU is true, replicas must be dynamic and this field is ignored |
| resources | object | |
| sidecarWorker | boolean | When true, the worker runs in sidecar mode: always local-GPU mode and hard-isolated via shared memory. Defaults to false, meaning the workload's embedded worker runs in the same process and is soft-isolated |
| workerPodTemplate | object | WorkerPodTemplate is the template for the worker pod; only takes effect in remote vGPU mode |
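Below is a minimal sketch of a WorkloadProfile manifest using only the scalar fields documented above; the metadata and field values (name, pool name, model, replica count) are illustrative placeholders, not values taken from the project.

```yaml
apiVersion: tensor-fusion.ai/v1
kind: WorkloadProfile
metadata:
  name: inference-profile    # placeholder name
  namespace: default         # WorkloadProfile is namespaced
spec:
  poolName: shared-pool      # placeholder GPU pool name
  gpuCount: 1                # defaults to 1 when omitted
  gpuModel: "A100"           # require a specific GPU model
  qos: medium                # one of: low, medium, high, critical
  isLocalGPU: false          # remote vGPU mode, so workerPodTemplate would apply
  replicas: 2                # omit to let the count track pending Pods
```

The object-typed fields (autoScalingConfig, nodeAffinity, resources, workerPodTemplate) accept nested structures whose sub-schemas are not documented in this table.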
Status
WorkloadProfileStatus defines the observed state of WorkloadProfile.