TensorFusion Documentation
TensorFusionWorkload

TensorFusionWorkload is the Schema for the tensorfusionworkloads API.

Resource Information

| Field | Value |
| --- | --- |
| API Version | tensor-fusion.ai/v1 |
| Kind | TensorFusionWorkload |
| Scope | Namespaced |

Spec

WorkloadProfileSpec defines the desired state of WorkloadProfile.

| Property | Type | Description |
| --- | --- | --- |
| autoScalingConfig | object | An AutoScalingConfig set here overrides the Pool's schedulingConfig. This field cannot be fully expressed through annotations; to enable auto-scaling via annotations, set `tensor-fusion.ai/auto-resources` instead. |
| gpuCount | integer (int32) | The number of GPUs used by the workload; defaults to 1. |
| gpuModel | string | GPUModel specifies the required GPU model (e.g. "A100", "H100"). |
| isLocalGPU | boolean | Schedule the workload onto the same GPU server that runs its vGPU worker, for best performance; defaults to false. |
| nodeAffinity | object | NodeAffinity specifies the node affinity requirements for the workload. |
| poolName | string | |
| qos | string | Qos defines the quality-of-service level for the client. Allowed values: low, medium, high, critical. |
| replicas | integer (int32) | If replicas is not set, it is determined dynamically from pending Pods. If isLocalGPU is true, replicas must be dynamic and this field is ignored. |
| resources | object | |
| sidecarWorker | boolean | When set, the workload runs in sidecar worker mode: always Local GPU mode, hard-isolated with shared memory. Defaults to false, meaning the workload's embedded worker shares the same process and is soft-isolated. |
| workerPodTemplate | object | WorkerPodTemplate is the template for the worker pod; it only takes effect in remote vGPU mode. |
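The Spec fields above can be illustrated with a minimal manifest. The `apiVersion`, `kind`, and field names come from this reference; the metadata name, pool name, and all values below are hypothetical placeholders, not a documented configuration:

```yaml
# Illustrative sketch only; names and values are placeholders.
apiVersion: tensor-fusion.ai/v1
kind: TensorFusionWorkload
metadata:
  name: example-workload        # hypothetical name
  namespace: default
spec:
  poolName: example-pool        # hypothetical pool
  gpuCount: 1                   # defaults to 1 if omitted
  gpuModel: "A100"              # required GPU model, per the table above
  qos: medium                   # one of: low, medium, high, critical
  replicas: 2                   # ignored when isLocalGPU is true
  isLocalGPU: false
```

Note that `replicas` and `isLocalGPU` interact: per the table above, setting `isLocalGPU: true` forces dynamic replicas and the explicit `replicas` value is ignored.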

Status

TensorFusionWorkloadStatus defines the observed state of TensorFusionWorkload.

| Property | Type | Description |
| --- | --- | --- |
| activeCronScalingRule | object | The currently active cron scaling rule. |
| appliedRecommendedReplicas | integer (int32) | The number of replicas currently applied based on the latest recommendation. |
| conditions | array | Represents the latest available observations of the workload's current state. |
| phase | string | (default: Pending) Allowed values: Pending, Running, Failed, Unknown. |
| podTemplateHash | string | Hash of the pod template used to create worker pods. |
| readyWorkers | integer (int32) | readyWorkers is the number of vGPU workers that are ready. |
| recommendation | object | The GPU resources most recently recommended by the autoscaler. |
| workerCount* | integer (int32) | workerCount is the number of vGPU workers. |
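For orientation, an observed status might look like the following sketch. Only the field names come from the table above; every value (and the condition type) is invented for illustration:

```yaml
# Hypothetical status snapshot; values are placeholders, not real output.
status:
  phase: Running
  readyWorkers: 2
  workerCount: 2
  appliedRecommendedReplicas: 2
  podTemplateHash: "5f8d9c"    # placeholder hash
  conditions:
    - type: Ready              # hypothetical condition type
      status: "True"
```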

Workload Configuration

This page explains how to allocate vGPU resources to AI applications using Pod annotations and the WorkloadProfile custom resource.
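As a sketch of the annotation-based path: the `tensor-fusion.ai/auto-resources` annotation key is mentioned in the Spec table above, but the value shown here and the surrounding Pod are assumptions for illustration, not a documented format:

```yaml
# Sketch of attaching TensorFusion settings to a Pod via annotations.
# The annotation key comes from the Spec reference above; the value is a
# hypothetical placeholder.
apiVersion: v1
kind: Pod
metadata:
  name: inference-pod                         # hypothetical name
  annotations:
    tensor-fusion.ai/auto-resources: "true"   # value format is assumed
spec:
  containers:
    - name: app
      image: example.com/inference:latest     # placeholder image
```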

TensorFusionConnection

TensorFusionConnection is the Schema for the tensorfusionconnections API.

Contents

Resource Information
Spec
Status