Execution Providers
UniFace automatically runs each model on the best available hardware — Apple Silicon (CoreML), NVIDIA GPU (CUDA), or CPU. Under the hood this is handled by ONNX Runtime execution providers.
Automatic Provider Selection
UniFace automatically selects the optimal execution provider based on available hardware:
from uniface.detection import RetinaFace
# Automatically uses best available provider
detector = RetinaFace()
Priority order:
- CoreMLExecutionProvider - Apple Silicon
- CUDAExecutionProvider - NVIDIA GPU
- CPUExecutionProvider - Fallback
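The priority order above amounts to a first-match lookup against whatever ONNX Runtime reports as available. The following is a minimal sketch of that idea, not UniFace's actual implementation; `PRIORITY` and `pick_provider` are hypothetical names used only for illustration.

```python
# Hypothetical illustration of first-match provider selection;
# this is not UniFace's internal code.
PRIORITY = [
    "CoreMLExecutionProvider",  # Apple Silicon
    "CUDAExecutionProvider",    # NVIDIA GPU
    "CPUExecutionProvider",     # fallback
]


def pick_provider(available, priority=PRIORITY):
    """Return the first provider in `priority` that appears in `available`."""
    for name in priority:
        if name in available:
            return name
    # Assumption: fall back to CPU when nothing in the priority list matches.
    return "CPUExecutionProvider"


print(pick_provider(["CUDAExecutionProvider", "CPUExecutionProvider"]))
```

In practice, `available` would be the list returned by `onnxruntime.get_available_providers()`.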
Explicit Provider Selection
You can specify which execution provider to use by passing the providers parameter:
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# Force CPU execution (even if GPU is available)
detector = RetinaFace(providers=['CPUExecutionProvider'])
recognizer = ArcFace(providers=['CPUExecutionProvider'])
# Use CUDA with CPU fallback
detector = RetinaFace(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
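When the same script runs on machines with different hardware, it can help to filter an explicit list against what is actually installed before passing it to a model. This is a sketch under stated assumptions: `usable_providers` is a hypothetical helper, not part of UniFace or ONNX Runtime.

```python
def usable_providers(requested, available):
    """Keep the requested providers that are available, preserving order,
    and make sure the CPU provider is present as the last resort."""
    chosen = [p for p in requested if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# Example: CUDA requested on a machine that only reports the CPU provider.
print(usable_providers(
    ["CUDAExecutionProvider", "CPUExecutionProvider"],
    ["CPUExecutionProvider"],
))
```

The result could then be passed as the `providers` argument, with `available` taken from `onnxruntime.get_available_providers()`.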
All ONNX-based model classes accept the providers parameter:
- Detection: RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face
- Recognition: ArcFace, AdaFace, MobileFace, SphereFace
- Landmarks: Landmark106, PIPNet
- Gaze: MobileGaze
- Parsing: BiSeNet, XSeg
- Attributes: AgeGender, FairFace
- Anti-Spoofing: MiniFASNet
Non-ONNX components
- Emotion uses TorchScript and selects its device automatically (mps/cuda/cpu). It does not accept the providers parameter.
- BlurFace is a pure OpenCV utility and does not load any model.
Check Available Providers
import onnxruntime as ort
providers = ort.get_available_providers()
print("Available providers:", providers)
Example outputs:
Platform-Specific Setup
Apple Silicon (M1/M2/M3/M4)
No additional setup required. ARM64 optimizations are built into onnxruntime:
Verify ARM64:
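A quick check from Python, using only the standard library: `platform.machine()` reports `arm64` for a native Apple Silicon interpreter and `x86_64` when Python runs under Rosetta. The helper name `is_native_arm64` is hypothetical and used only for this sketch.

```python
import platform
import sys


def is_native_arm64():
    """True when this interpreter is a native ARM64 build on macOS."""
    return sys.platform == "darwin" and platform.machine() == "arm64"


print("machine:", platform.machine())
print("native ARM64 macOS Python:", is_native_arm64())
```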
Performance
Apple Silicon Macs use CoreML acceleration automatically, so no provider needs to be specified for face analysis tasks.
NVIDIA GPU (CUDA)
Install with GPU support (this installs onnxruntime-gpu, which already includes CPU fallback):
Requirements:
- CUDA 11.x or 12.x
- cuDNN 8.x
- Compatible NVIDIA driver
Verify CUDA:
import onnxruntime as ort
if 'CUDAExecutionProvider' in ort.get_available_providers():
    print("CUDA is available!")
else:
    print("CUDA not available, using CPU")
CPU Fallback
CPU execution is always available:
Works on all platforms without additional configuration.
Internal API
For advanced use cases, you can access the provider utilities:
from uniface.onnx_utils import get_available_providers, create_onnx_session
# Check available providers
providers = get_available_providers()
print(f"Available: {providers}")
# Models use create_onnx_session() internally
# which auto-selects the best provider
Performance Tips
1. Use GPU When Available
For batch processing or real-time applications, GPU acceleration provides significant speedups:
2. Optimize Input Size
Smaller input sizes are faster but may reduce accuracy:
from uniface.detection import RetinaFace
# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))
# Balanced (default)
detector = RetinaFace(input_size=(640, 640))
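How much speed a smaller input buys depends on the hardware and the model, so it is worth measuring on the target machine. This is a minimal timing sketch that assumes only the standard library; `median_latency_ms` is a hypothetical helper, and the stand-in workload should be replaced with a real call such as `lambda: detector.detect(image)`.

```python
import statistics
import time


def median_latency_ms(fn, warmup=2, repeats=10):
    """Run `fn` repeatedly and return its median latency in milliseconds."""
    for _ in range(warmup):  # warm-up runs absorb one-time setup costs
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs anywhere; swap in a real detector call.
latency = median_latency_ms(lambda: sum(range(10_000)))
print(f"median latency: {latency:.3f} ms")
```

Running the helper once per `input_size` setting gives a like-for-like comparison; the median is used because it is less sensitive to occasional slow runs than the mean.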
3. Batch Processing
Process multiple images to maximize GPU utilization:
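import cv2
from uniface.detection import RetinaFace

detector = RetinaFace()  # create once, reuse for every image
image_paths = ["face1.jpg", "face2.jpg"]  # placeholder paths
# Process the images one by one with the same detector
for image_path in image_paths:
    image = cv2.imread(image_path)
    faces = detector.detect(image)
    # ...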
Troubleshooting
CUDA Not Detected
- Verify CUDA installation:
- Check CUDA version compatibility with ONNX Runtime
- Reinstall with GPU support:
Slow Performance on Mac
Verify you're using ARM64 Python (not Rosetta):
Next Steps
- Model Cache & Offline - Model management
- Thresholds & Calibration - Tuning parameters