Compare with HAMi
Compare TensorFusion with the HAMi project
HAMi is a popular GPU pool management solution that offers fractional GPU and dynamic MIG features for multi-vendor GPUs and NPUs.
Compared to HAMi, TensorFusion offers not only fractional GPU but also real GPU virtualization, isolation, remote sharing, and live migration. It is built on a completely different technology and adds more enterprise-grade features and cloud vendor integration.
Features
| Feature | TensorFusion | HAMi |
|---|---|---|
| Basic Features | | |
| Fractional GPU | ✅ | ✅ |
| GPU Pooling | ✅ | ✅ |
| GPU Scheduling & Allocation | ✅ | ✅ |
| Remote GPU Sharing | ✅ | ❌ |
| Advanced Features | | |
| Seamless Onboarding for Existing Workloads | ✅ | ✅ |
| Monitoring & Alerting | ✅ | ✅ |
| GPU Resource Oversubscription | ✅ | ✅ |
| GPU VRAM Expansion and hot/warm/cold tiering | ✅ | ❌ |
| GPU-first Autoscaling Policies | ✅ | ❌ |
| Support different QoS levels | ✅ | ❌ |
| Request Multiple vGPUs | ✅ | ✅ |
| GPU Node Auto Provisioning/Termination | ✅ | ❌ |
| GPU Compaction/Bin-packing | 🚧 | 🚧 |
| Dynamic MIG (Multi-Instance GPU) | 👋 | ✅ |
| Centralized Dashboard & Control Plane | ✅ | ✅ |
| Support Non-NVIDIA GPU | 🚧 | ✅ |
| Enterprise Features | | |
| Windows/Linux VM vGPU | ✅ | ❌ |
| OpenGL Virtualization | ✅ | ❌ |
| GPU Live Migration | 🚧 | ❌ |
| Advanced observability, CUDA Call Profiling/Tracing | 🚧 | ❌ |
| AI Model Preloading | 🚧 | ❌ |
| Advanced auto-scaling policies, scale to zero, rebalancing | 🚧 | ❓ |
| Monetization of your GPU cluster | 🚧 | ❌ |
Notes:
- ✅ means supported
- ❌ means not supported
- 🚧 means work in progress
- ❓ means unknown
- 👋 means no longer necessary
In summary, both TensorFusion and HAMi offer fractional GPU and a distributed scheduler on Kubernetes. TensorFusion offers more features, while HAMi supports more GPU vendors.
The Fractional GPU feature also differs in design: HAMi uses a percentage-based limit unit, while TensorFusion uses FP16 TFLOPS. A percentage-based limit can lead to unpredictable behavior, because 1% of a GPU card from five years ago delivers far less compute than 1% of a GPU card today.
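The difference between the two units can be illustrated with some simple arithmetic. The sketch below is not taken from either project: the card names and FP16 throughput figures are hypothetical values chosen for illustration, and the helper functions are assumptions, not TensorFusion or HAMi APIs. It shows that the same percentage yields different absolute compute on different cards, while a fixed TFLOPS request means the same thing everywhere.

```python
# Hypothetical cards with illustrative FP16 throughput (NOT vendor specifications).
CARD_FP16_TFLOPS = {
    "older-card": 60.0,
    "newer-card": 300.0,
}


def tflops_from_percent(card: str, percent: float) -> float:
    """Absolute FP16 TFLOPS that a percentage-based limit grants on a card."""
    return CARD_FP16_TFLOPS[card] * percent / 100.0


def percent_from_tflops(card: str, tflops: float) -> float:
    """Share of a card (in percent) that an absolute TFLOPS request occupies."""
    return tflops / CARD_FP16_TFLOPS[card] * 100.0


if __name__ == "__main__":
    for card in CARD_FP16_TFLOPS:
        granted = tflops_from_percent(card, 10.0)   # same "10%" limit on each card
        share = percent_from_tflops(card, 15.0)     # same 15 TFLOPS request on each card
        print(f"{card}: 10% grants {granted:.1f} TFLOPS; 15 TFLOPS occupies {share:.1f}%")
```

Under these assumed figures, a workload limited to "10%" receives five times more compute on the newer card than on the older one, whereas a request for 15 TFLOPS describes the same amount of compute on both and only the occupied share of the card changes.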