# TensorFusionWorkload
TensorFusionWorkload is the Schema for the tensorfusionworkloads API.
## Resource Information
| Field | Value |
|---|---|
| API Version | tensor-fusion.ai/v1 |
| Kind | TensorFusionWorkload |
| Scope | Namespaced |
## Spec
The workload's spec follows WorkloadProfileSpec, which defines the desired state of a WorkloadProfile.
| Property | Type | Description |
|---|---|---|
| autoScalingConfig | object | An AutoScalingConfig set here overrides the Pool's schedulingConfig. This field cannot be fully expressed via annotations; to enable auto-scaling through annotations, set `tensor-fusion.ai/auto-resources` |
| gpuCount | integer<int32> | The number of GPUs used by the workload; defaults to 1 |
| gpuModel | string | GPUModel specifies the required GPU model (e.g., "A100", "H100") |
| isLocalGPU | boolean | Schedule the workload onto the same GPU server that runs the vGPU worker, for best performance; defaults to false |
| nodeAffinity | object | NodeAffinity specifies the node affinity requirements for the workload |
| poolName | string | |
| qos | string | QoS defines the quality-of-service level for the client. Allowed values: low, medium, high, critical |
| replicas | integer<int32> | If replicas is not set, the replica count is dynamic, based on pending Pods. If isLocalGPU is true, replicas must be dynamic and this field is ignored |
| resources | object | |
| sidecarWorker | boolean | When true, the worker runs in sidecar mode, which is always local-GPU mode and hard-isolated via shared memory. Defaults to false, meaning the workload's embedded worker runs in the same process and is soft-isolated |
| workerPodTemplate | object | WorkerPodTemplate is the template for the worker Pod; it only takes effect in remote vGPU mode |
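As an illustrative sketch of how the spec fields above fit together (the metadata and all field values here are hypothetical; only the field names and the API version/kind come from this page), a minimal TensorFusionWorkload manifest might look like:

```yaml
apiVersion: tensor-fusion.ai/v1
kind: TensorFusionWorkload
metadata:
  name: example-workload      # hypothetical name
  namespace: default
spec:
  poolName: shared-gpu-pool   # hypothetical pool name
  gpuCount: 1                 # defaults to 1 if omitted
  gpuModel: "A100"
  replicas: 2                 # omit to let the count track pending Pods
  qos: medium                 # one of: low, medium, high, critical
  isLocalGPU: false           # remote vGPU mode
```

Note that because `isLocalGPU` is false here, a `workerPodTemplate` could also be supplied; with `isLocalGPU: true`, `replicas` would be ignored and the count kept dynamic.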
## Status
TensorFusionWorkloadStatus defines the observed state of TensorFusionWorkload.
| Property | Type | Description |
|---|---|---|
| activeCronScalingRule | object | The currently active cron scaling rule |
| appliedRecommendedReplicas | integer<int32> | The number of replicas currently applied based on the latest recommendation |
| conditions | array | Represents the latest available observations of the workload's current state. |
| phase | string | (default: Pending) Allowed values: Pending, Running, Failed, Unknown |
| podTemplateHash | string | Hash of the pod template used to create worker pods |
| readyWorkers | integer<int32> | The number of vGPU workers that are ready |
| recommendation | object | The GPU resources most recently recommended by the autoscaler |
| workerCount * | integer<int32> | The total number of vGPU workers |
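For reference, the observed status of a running workload might look like the following sketch (all values are illustrative, not taken from a real cluster):

```yaml
status:
  phase: Running        # one of: Pending, Running, Failed, Unknown
  workerCount: 2        # total vGPU workers
  readyWorkers: 2       # workers that are ready
  conditions:
    - type: Ready       # hypothetical condition type
      status: "True"
```

The status can be inspected with a standard kubectl query against the custom resource, e.g. `kubectl get tensorfusionworkload example-workload -o yaml` (resource name hypothetical).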