Summary
As the leading delivery platform in the region, we have a unique responsibility and opportunity to positively impact millions of customers, restaurant partners, and riders. To achieve our mission, we must scale and continuously evolve our machine learning capabilities, including cutting-edge Generative AI (genAI) initiatives. This demands robust, efficient, and scalable ML platforms that empower our teams to rapidly develop, deploy, and operate intelligent systems.
As an ML Platform Engineer, your mission is to design, build, and enhance the infrastructure and tooling that accelerates the development, deployment, and monitoring of traditional ML and genAI models at scale. You'll collaborate closely with data scientists, ML engineers, genAI specialists, and product teams to deliver seamless ML workflows, from experimentation to production serving, ensuring operational excellence across our ML and genAI systems.
Qualifications :
Responsibilities
Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring, for both traditional and generative AI models.
Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles.
Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI.
Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems.
Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices.
Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation.
Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments.
Drive infrastructure optimization for generative AI use cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale.
Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap.
Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement.
Requirements
Technical Experience
Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads.
Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain).
Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm).
Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker).
Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems.
Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms.
Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations.
Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently.
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field; advanced degree is a plus.
3 years of experience in ML platform engineering, ML infrastructure, generative AI, or closely related roles.
Proven track record of successfully building and operating ML infrastructure at scale, ideally supporting generative AI use cases and complex inference scenarios.
Strategic mindset with strong problem-solving skills and effective technical decision-making abilities.
Excellent communication and collaboration skills; comfortable working cross-functionally across diverse teams and stakeholders.
Remote Work :
No
Employment Type :
Full-time