Video by Yan Krukau: https://www.pexels.com/video/male-teacher-with-his-students-8617126/
This project performs gaze estimation with several deep learning backbones, including ResNet, MobileNet V2, and MobileOne. It supports both classification and regression for predicting gaze direction. Built on top of L2CS-Net, it adds further pre-trained models and refined code aimed at better performance and flexibility.
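To illustrate how classification and regression can be combined for gaze prediction, the sketch below follows the general L2CS-Net idea: the network outputs logits over angle bins, and a continuous angle is recovered as the softmax-weighted expectation of the bin indices. The function name `logits_to_angle`, the bin count, the bin width, and the offset are assumptions chosen for illustration and may differ from this project's actual configuration.

```python
import math


def logits_to_angle(logits, bin_width=4.0, angle_offset=180.0):
    """Turn per-bin logits into one continuous angle in degrees.

    Assumed convention (not taken from this repository): bin i maps to
    the angle i * bin_width - angle_offset.
    """
    # Numerically stable softmax over the bins.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Expected (fractional) bin index under the softmax distribution.
    expected_bin = sum(i * p for i, p in enumerate(probs))

    # Map the bin index back to degrees.
    return expected_bin * bin_width - angle_offset


# Example: 90 bins with all probability mass on bin 45.
logits = [-1e9] * 90
logits[45] = 0.0
angle = logits_to_angle(logits)  # 45 * 4 - 180 = 0 degrees
```

Taking the expectation rather than the argmax keeps the output continuous, so the prediction is not limited to the bin resolution.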
git clone https://github.com/yakyo/gaze-estimation.git
cd gaze-estimation
pip install -r requirements.txt
Download weight files:
a) Download weights from the following links:
| Model | Weights | Size | Epochs | MAE |
|---|---|---|---|---|
| ResNet-18 | resnet18.pt | 43 MB | 200 | 12.84 |
| ResNet-34 | resnet34.pt | 81.6 MB | 200 | 11.33 |
| ResNet-50 | resnet50.pt | 91.3 MB | 200 | 11.34 |
| MobileNet V2 | mobilenetv2.pt | 9.59 MB | 200 | 13.07 |
| MobileOne S0 | mobileone_s0_fused.pt | 4.8 MB | 200 | 12.58 |
| MobileOne S1 | mobileone_s1_fused.pt | xx MB | 200 | * |
| MobileOne S2 | mobileone_s2_fused.pt | xx MB | 200 | * |
| MobileOne S3 | mobileone_s3_fused.pt | xx MB | 200 | * |
| MobileOne S4 | mobileone_s4_fused.pt | xx MB | 200 | * |
`*` - to be uploaded soon. Due to limited computing resources, the remaining weights have not been published yet, but you can still train these models with the provided code.
b) Run the command below to download weights to the `weights` directory (Linux):
sh download.sh [model_name]
resnet18
resnet34
resnet50
mobilenetv2
mobileone_s0
mobileone_s1
mobileone_s2
mobileone_s3
mobileone_s4
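The model names above map onto the weight file names listed in the table. A minimal sketch of that lookup is shown below; the helper `weights_path` is hypothetical, and it assumes the files are stored under the `weights` directory with the same names as in the table.

```python
from pathlib import Path

# File names copied from the weights table; MobileOne files carry a "_fused" suffix.
WEIGHT_FILES = {
    "resnet18": "resnet18.pt",
    "resnet34": "resnet34.pt",
    "resnet50": "resnet50.pt",
    "mobilenetv2": "mobilenetv2.pt",
    "mobileone_s0": "mobileone_s0_fused.pt",
    "mobileone_s1": "mobileone_s1_fused.pt",
    "mobileone_s2": "mobileone_s2_fused.pt",
    "mobileone_s3": "mobileone_s3_fused.pt",
    "mobileone_s4": "mobileone_s4_fused.pt",
}


def weights_path(model_name, weights_dir="weights"):
    """Return the assumed local path of the weight file for a model name."""
    try:
        filename = WEIGHT_FILES[model_name]
    except KeyError:
        known = ", ".join(sorted(WEIGHT_FILES))
        raise ValueError(f"Unknown model '{model_name}'. Known models: {known}") from None
    return Path(weights_dir) / filename


resnet_path = weights_path("resnet18")
mobileone_path = weights_path("mobileone_s0")
```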
Dataset folder structure:
data/
├── Gaze360/
│   ├── Image/
│   └── Label/
└── MPIIFaceGaze/
    ├── Image/
    └── Label/
Gaze360
MPIIGaze
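Before training, it can help to confirm that the data directory matches the folder structure shown above. The sketch below is one way to do that; the helper name `missing_dataset_dirs` is hypothetical and is not part of this repository.

```python
from pathlib import Path

# Sub-folders expected inside every dataset directory, per the layout above.
EXPECTED_SUBDIRS = ("Image", "Label")


def missing_dataset_dirs(data_root, dataset_dirs=("Gaze360", "MPIIFaceGaze")):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing = []
    for dataset in dataset_dirs:
        for sub in EXPECTED_SUBDIRS:
            candidate = root / dataset / sub
            if not candidate.is_dir():
                missing.append(candidate)
    return missing


# Report anything absent from a local "data" folder.
for path in missing_dataset_dirs("data"):
    print(f"Missing: {path}")
```

If only one dataset is used, pass just that name, for example `missing_dataset_dirs("data", dataset_dirs=("Gaze360",))`.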
python main.py --data [dataset_path] --dataset [dataset_name] --arch [architecture_name]
`main.py` arguments:
usage: main.py [-h] [--data DATA] [--dataset DATASET] [--output OUTPUT] [--checkpoint CHECKPOINT] [--num-epochs NUM_EPOCHS] [--batch-size BATCH_SIZE] [--arch ARCH] [--alpha ALPHA] [--lr LR] [--num-workers NUM_WORKERS]
Gaze estimation training.
options:
-h, --help show this help message and exit
--data DATA Directory path for gaze images.
--dataset DATASET Dataset name, available `gaze360`, `mpiigaze`.
--output OUTPUT Path of output models.
--checkpoint CHECKPOINT
Path to checkpoint for resuming training.
--num-epochs NUM_EPOCHS
Maximum number of training epochs.
--batch-size BATCH_SIZE
Batch size.
--arch ARCH Network architecture, currently available: resnet18/34/50, mobilenetv2, mobileone_s0-s4.
--alpha ALPHA Regression loss coefficient.
--lr LR Base learning rate.
--num-workers NUM_WORKERS
Number of workers for data loading.
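The `--alpha` option is described as the regression loss coefficient. One plausible reading, following the L2CS-Net formulation, is that the training loss adds a cross-entropy term over angle bins to `alpha` times a mean-squared-error term on the expected angle. The sketch below shows that combination for a single angle in plain Python; the function name, bin width, and offset are assumptions, and the repository's actual loss may differ.

```python
import math


def combined_gaze_loss(logits, target_bin, target_angle,
                       alpha=1.0, bin_width=4.0, angle_offset=180.0):
    """Cross-entropy on the bin label plus alpha * squared error on the angle.

    Assumption: bin i corresponds to the angle i * bin_width - angle_offset.
    """
    # Stable softmax pieces.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Classification term, computed via log-softmax to avoid log(0).
    cross_entropy = -((logits[target_bin] - peak) - math.log(total))

    # Regression term on the expected angle.
    predicted_angle = sum(i * p for i, p in enumerate(probs)) * bin_width - angle_offset
    squared_error = (predicted_angle - target_angle) ** 2

    return cross_entropy + alpha * squared_error


# Uniform logits over 90 bins give an expected angle of -2 degrees.
loss = combined_gaze_loss([0.0] * 90, target_bin=0, target_angle=0.0, alpha=2.0)
```

A larger `alpha` weights the continuous angle error more heavily relative to the bin classification error.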
python evaluate.py --data [dataset_path] --dataset [dataset_name] --weights [weights_path] --arch [architecture_name]
`evaluate.py` arguments:
usage: evaluate.py [-h] [--data DATA] [--dataset DATASET] [--weights WEIGHTS] [--batch-size BATCH_SIZE] [--arch ARCH] [--num-workers NUM_WORKERS]
Gaze estimation evaluation.
options:
-h, --help show this help message and exit
--data DATA Directory path for gaze images.
--dataset DATASET Dataset name, available `gaze360`, `mpiigaze`
--weights WEIGHTS Path to model weight for evaluation.
--batch-size BATCH_SIZE
Batch size.
--arch ARCH Network architecture, currently available: resnet18/34/50, mobilenetv2, mobileone_s0-s4.
--num-workers NUM_WORKERS
Number of workers for data loading.
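Gaze estimators are commonly evaluated by the angle between predicted and ground-truth gaze directions. Assuming the MAE column in the weights table is a mean angular error in degrees (an assumption, not something stated here), the metric can be sketched as follows. The pitch/yaw to 3D vector convention used below is also an assumption and may not match the one in `evaluate.py`.

```python
import math


def gaze_to_vector(pitch, yaw):
    """Convert pitch and yaw (radians) to a unit 3D gaze vector (assumed convention)."""
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)


def angular_error_deg(pred, target):
    """Angle in degrees between two gaze directions given as (pitch, yaw)."""
    a = gaze_to_vector(*pred)
    b = gaze_to_vector(*target)
    dot = sum(p * q for p, q in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))


def mean_angular_error_deg(preds, targets):
    """Average angular error over paired predictions and targets."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)


preds = [(0.0, 0.0), (0.0, 0.0)]
targets = [(0.0, 0.0), (0.0, math.pi / 2)]
mae = mean_angular_error_deg(preds, targets)  # errors of 0 and 90 degrees
```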
python detect.py --arch [arch_name] --gaze-weights [path_gaze_estimation_weights] --face-weights [face_det_weights] --view --input [input_file] --output [output_file] --dataset [dataset_name]
`detect.py` arguments:
usage: detect.py [-h] [--arch ARCH] [--gaze-weights GAZE_WEIGHTS] [--face-weights FACE_WEIGHTS] [--view] [--input INPUT] [--output OUTPUT] [--dataset DATASET]
Gaze Estimation Inference Arguments
options:
-h, --help show this help message and exit
--arch ARCH Model name, default `resnet18`
--gaze-weights GAZE_WEIGHTS
Path to gaze estimation model weights
--face-weights FACE_WEIGHTS
Path to face detection model weights
--view Display the inference results
--input INPUT Path to input video file
--output OUTPUT Path to save output file
--dataset DATASET Dataset name to get dataset related configs
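When running inference from another script, the `detect.py` flags listed above can be assembled programmatically. The sketch below only builds the argument list; the helper `build_detect_command` and all file paths in the example are hypothetical placeholders.

```python
import shlex
import sys


def build_detect_command(arch, gaze_weights, face_weights,
                         input_path, output_path, dataset, view=False):
    """Build an argument list for detect.py using the flags from its usage text."""
    cmd = [
        sys.executable, "detect.py",
        "--arch", arch,
        "--gaze-weights", gaze_weights,
        "--face-weights", face_weights,
        "--input", input_path,
        "--output", output_path,
        "--dataset", dataset,
    ]
    if view:
        # --view is a switch with no value; it displays the inference results.
        cmd.append("--view")
    return cmd


# Placeholder paths for illustration only.
cmd = build_detect_command(
    arch="resnet18",
    gaze_weights="weights/resnet18.pt",
    face_weights="weights/face_detector.pt",
    input_path="input.mp4",
    output_path="output.mp4",
    dataset="gaze360",
    view=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.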
If you use this work in your research, please cite it as:
Valikhujaev, Y. (2024). MobileGaze: Pre-trained mobile nets for Gaze-Estimation. Zenodo. https://doi.org/10.5281/zenodo.14257640
Alternatively, in BibTeX format:
@misc{valikhujaev2024mobilegaze,
author = {Valikhujaev, Y.},
title = {MobileGaze: Pre-trained mobile nets for Gaze-Estimation},
year = {2024},
publisher = {Zenodo},
doi = {10.5281/zenodo.14257640},
url = {https://doi.org/10.5281/zenodo.14257640}
}