Execution Providers
UniFace uses ONNX Runtime for model inference, which supports multiple hardware acceleration backends (execution providers).
Automatic Provider Selection
UniFace automatically selects the optimal execution provider based on available hardware:
Priority order:
1. CUDAExecutionProvider - NVIDIA GPU
2. CoreMLExecutionProvider - Apple Silicon
3. CPUExecutionProvider - Fallback
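The selection amounts to walking this priority list and taking the first provider that ONNX Runtime reports as available. A minimal sketch of that logic (illustrative only, not UniFace's actual implementation):
import onnxruntime as ort

PRIORITY = ["CUDAExecutionProvider", "CoreMLExecutionProvider", "CPUExecutionProvider"]

def pick_provider() -> str:
    # Return the highest-priority provider supported by this onnxruntime build
    available = ort.get_available_providers()
    for provider in PRIORITY:
        if provider in available:
            return provider
    return "CPUExecutionProvider"  # always present as a last resort

print(pick_provider())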
Check Available Providers
import onnxruntime as ort
providers = ort.get_available_providers()
print("Available providers:", providers)
Typical outputs (the exact list depends on your hardware and the installed onnxruntime build):
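- NVIDIA GPU with onnxruntime-gpu: ['CUDAExecutionProvider', 'CPUExecutionProvider']
- Apple Silicon: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
- CPU-only machine: ['CPUExecutionProvider']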
Platform-Specific Setup
Apple Silicon (M1/M2/M3/M4)
No additional setup is required; ARM64 optimizations are built into onnxruntime.
Verify ARM64:
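One way to check is to inspect the interpreter's architecture with the standard library (a quick sketch; native ARM64 Python reports arm64, while Python running under Rosetta reports x86_64):
import platform

print(platform.machine())  # 'arm64' on native Apple Silicon Python, 'x86_64' under Rosetta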
Performance
Apple Silicon Macs use CoreML acceleration automatically, providing excellent performance for face analysis tasks.
NVIDIA GPU (CUDA)
Install with GPU support:
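One common approach (a sketch; it assumes you manage ONNX Runtime directly with pip rather than through a packaged UniFace extra) is to replace the CPU build of ONNX Runtime with the GPU build:
pip uninstall -y onnxruntime
pip install onnxruntime-gpu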
Requirements:
- CUDA 11.x or 12.x
- cuDNN 8.x
- Compatible NVIDIA driver
Verify CUDA:
import onnxruntime as ort

if 'CUDAExecutionProvider' in ort.get_available_providers():
    print("CUDA is available!")
else:
    print("CUDA not available, using CPU")
CPU Fallback
CPU execution is always available. For example, ONNX Runtime always reports the CPU provider, even on machines with no GPU:
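import onnxruntime as ort

# The CPU provider ships with every onnxruntime build, so this always holds
assert "CPUExecutionProvider" in ort.get_available_providers()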
CPU inference works on all platforms without additional configuration.
Internal API
For advanced use cases, you can access the provider utilities:
from uniface.onnx_utils import get_available_providers, create_onnx_session
# Check available providers
providers = get_available_providers()
print(f"Available: {providers}")
# Models use create_onnx_session() internally
# which auto-selects the best provider
Performance Tips
1. Use GPU When Available
For batch processing or real-time applications, GPU acceleration provides significant speedups.
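A rough way to gauge throughput on your own hardware is to time the detector over a set of frames (a sketch; images stands in for a list of frames you have already loaded, e.g. with cv2.imread):
import time

start = time.perf_counter()
for image in images:
    faces = detector.detect(image)
elapsed = time.perf_counter() - start
print(f"Throughput: {len(images) / elapsed:.1f} images/sec")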
2. Optimize Input Size
Smaller input sizes are faster but may reduce accuracy:
from uniface import RetinaFace
# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))
# Balanced (default)
detector = RetinaFace(input_size=(640, 640))
3. Batch Processing
Process multiple images to maximize GPU utilization:
import cv2

# Process a list of images with a single detector instance
# (the model loads once and the GPU stays busy across iterations)
for image_path in image_paths:
    image = cv2.imread(image_path)
    faces = detector.detect(image)
    # ... downstream processing
Troubleshooting
CUDA Not Detected
- Verify the CUDA installation (see the command sketch after this list)
- Check that your CUDA version is compatible with your installed ONNX Runtime release
- Reinstall ONNX Runtime with GPU support (also covered in the sketch below)
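A minimal set of checks, assuming the standard NVIDIA and pip tooling is on your PATH (these are generic CUDA commands, not UniFace-specific):
nvidia-smi          # confirms the driver can see the GPU
nvcc --version      # reports the CUDA toolkit version (if the toolkit is installed)
pip install --upgrade --force-reinstall onnxruntime-gpu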
Slow Performance on Mac
Verify you're using ARM64 Python (not Rosetta):
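A quick check from the terminal (the same platform check shown in the Apple Silicon section):
python -c "import platform; print(platform.machine())"
# arm64   -> native Apple Silicon Python
# x86_64  -> running under Rosetta; install a native ARM64 Python build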
Next Steps
- Model Cache & Offline - Model management
- Thresholds & Calibration - Tuning parameters