TensorFusionCluster

TensorFusionCluster is the Schema for the tensorfusionclusters API.

Resource Information

| Field | Value |
| --- | --- |
| API Version | `tensor-fusion.ai/v1` |
| Kind | `TensorFusionCluster` |
| Scope | Cluster |

Spec

TensorFusionClusterSpec defines the desired state of TensorFusionCluster.

| Property | Type | Description |
| --- | --- | --- |
| computingVendor | object | ComputingVendorConfig defines the Cloud vendor connection such as AWS, GCP, Azure etc. |
| gpuPools | array | |
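
As a minimal sketch of how the resource information and spec fields above fit together, the following Python snippet assembles a manifest as a plain dictionary and prints it as JSON, which `kubectl` also accepts. The helper name `build_cluster_manifest` and the resource name `example-cluster` are hypothetical, and the contents of `gpuPools` items and of `computingVendor` are not documented on this page, so they are passed through untouched.

```python
import json

# Values taken from the Resource Information table.
API_VERSION = "tensor-fusion.ai/v1"
KIND = "TensorFusionCluster"


def build_cluster_manifest(name, gpu_pools=None, computing_vendor=None):
    """Return a TensorFusionCluster manifest as a plain dict.

    Hypothetical helper for illustration; the item schema of gpuPools and
    the fields of computingVendor are not described here, so both are
    copied as given.
    """
    spec = {}
    if gpu_pools is not None:
        spec["gpuPools"] = list(gpu_pools)
    if computing_vendor is not None:
        spec["computingVendor"] = dict(computing_vendor)
    return {
        "apiVersion": API_VERSION,
        "kind": KIND,
        # The resource is Cluster-scoped, so metadata has no namespace.
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    # "example-cluster" is a placeholder name.
    print(json.dumps(build_cluster_manifest("example-cluster"), indent=2))
```

Because the resource is Cluster-scoped, the sketch deliberately leaves `metadata.namespace` out.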

Status

TensorFusionClusterStatus defines the observed state of TensorFusionCluster.

| Property | Type | Description |
| --- | --- | --- |
| allocatedTFlopsPercent | string | |
| allocatedVRAMPercent | string | |
| availableTFlops * | any | `pattern: ^(+` |
| availableVRAM * | any | `pattern: ^(+` |
| cloudVendorConfigHash | string | |
| conditions | array | |
| notReadyGPUPools | array | |
| phase | string | TensorFusionClusterPhase represents the phase of the TensorFusionCluster resource. Default: Pending. Allowed values: Pending, Running, Updating, Destroying, Unknown |
| potentialSavingsPerMonth | string | |
| readyGPUPools | array | |
| retryCount * | integer<int64> | Default: 0 |
| savedCostsPerMonth | string | |
| totalGPUs * | integer<int32> | |
| totalNodes * | integer<int32> | |
| totalPools * | integer<int32> | |
| totalTFlops * | any | `pattern: ^(+` |
| totalVRAM * | any | `pattern: ^(+` |
| utilizedTFlopsPercent | string | |
| utilizedVRAMPercent | string | |
| virtualAvailableTFlops | any | `pattern: ^(+` |
| virtualAvailableVRAM | any | `pattern: ^(+` |
| virtualTFlops * | any | `pattern: ^(+` |
| virtualVRAM * | any | `pattern: ^(+` |
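
The status fields above can be read programmatically once the resource has been fetched as JSON. The sketch below is a hedged illustration, not part of TensorFusion: `summarize_status` is a hypothetical helper that takes the `status` object as a dictionary, applies the documented defaults for `phase` and `retryCount`, checks `phase` against the documented allowed values, and counts the entries in `readyGPUPools` and `notReadyGPUPools`. The items in those arrays are treated as opaque because their structure is not described here.

```python
# Allowed values of status.phase, as listed in the Status table.
ALLOWED_PHASES = ("Pending", "Running", "Updating", "Destroying", "Unknown")


def summarize_status(status):
    """Summarize a TensorFusionCluster status dict (hypothetical helper)."""
    # Documented default for phase is "Pending".
    phase = status.get("phase", "Pending")
    if phase not in ALLOWED_PHASES:
        raise ValueError(f"unexpected phase: {phase!r}")

    # Pool entries are only counted, never inspected.
    ready = status.get("readyGPUPools") or []
    not_ready = status.get("notReadyGPUPools") or []

    return {
        "phase": phase,
        "running": phase == "Running",
        "ready_pools": len(ready),
        "not_ready_pools": len(not_ready),
        # Documented default for retryCount is 0.
        "retry_count": status.get("retryCount", 0),
    }


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = {"phase": "Running", "readyGPUPools": ["pool-a"], "retryCount": 1}
    print(summarize_status(sample))
```

Rejecting phases outside the documented list is a design choice of this sketch; a more lenient reader might map unrecognized values to `Unknown` instead.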
