(Full-time, onsite role in Riyadh, Saudi Arabia)
Position Objectives:
- Focus on LLM fine-tuning, model compression, quantisation, and inference acceleration.
Job Description and Responsibilities:
- Fine-tune large-scale models using multi-GPU training frameworks (DeepSpeed, FSDP)
- Convert models to ONNX/TF and optimise them for serving
- Apply quantisation, pruning, or distillation to reduce model size and latency (an illustrative sketch follows this list)
- Collaborate on GPU resource planning and benchmarking
- Communicate clearly: explain models, present results, and write documentation
- Align cross-functionally with AI, MLOps, Data, and App teams
- Debug models, optimise pipelines, and devise creative solutions
- Navigate fast-changing tools, frameworks, and use cases
- Deliver prototypes, deployments, and iterations on time
- Understand requirements and build usable AI solutions
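As a rough illustration of the export-and-quantise work mentioned above, the sketch below assumes PyTorch and ONNX Runtime are available; the TinyClassifier model, tensor shapes, and file names are placeholder assumptions rather than anything specified by the role.

```python
# Minimal sketch: export a PyTorch model to ONNX, then apply post-training
# dynamic quantisation and sanity-check the result with ONNX Runtime.
# The model, shapes, and file names below are illustrative placeholders.
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType


class TinyClassifier(nn.Module):
    """Placeholder standing in for a fine-tuned network."""
    def __init__(self, dim=256, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 512), nn.ReLU(), nn.Linear(512, num_classes))

    def forward(self, x):
        return self.net(x)


model = TinyClassifier().eval()
dummy = torch.randn(1, 256)

# 1) Export to ONNX with a dynamic batch dimension for serving.
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# 2) Post-training dynamic quantisation: weights stored as INT8.
quantize_dynamic("model.onnx", "model.int8.onnx", weight_type=QuantType.QInt8)

# 3) Sanity-check the quantised model with ONNX Runtime.
session = ort.InferenceSession("model.int8.onnx", providers=["CPUExecutionProvider"])
logits = session.run(None, {"input": np.random.randn(4, 256).astype(np.float32)})[0]
print(logits.shape)  # (4, 10)
```

Dynamic quantisation stores the linear-layer weights in INT8 while keeping activations in floating point, which typically shrinks the artefact and speeds up CPU serving with little accuracy loss.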
Qualifications & Experience:
- Master's or PhD in Computer Science, Deep Learning, or a related engineering discipline
- 6+ years of experience in deep learning and high-performance model training
- Experience with distributed training and optimisation frameworks
- Knowledge of ONNX, TensorRT, or OpenVINO is a plus
- Familiarity with the general concepts used in OpenShift AI platforms
- Proficiency in English; Arabic is preferred