AI Cloud Engineer

https://www.lockedinai.com/

17 hours ago

Full-time

Remote

Worldwide

$140,000 - $195,000 USD yearly

Engineering & science

LockedIn AI is hiring an AI Cloud Engineer to design and operate the cloud infrastructure powering real-time AI systems used by over 1 million users worldwide.

About the Role

We are looking for a cloud-native, AI-infrastructure-focused engineer to build and scale the backbone of our AI systems. This role sits at the intersection of cloud engineering, DevOps, and machine learning infrastructure, where you will design the environments that train, serve, and optimize large-scale AI models in production.

You will own the full lifecycle of AI infrastructure — from GPU clusters for training to low-latency inference systems powering real-time interview assistance.

Key Responsibilities

AI-Optimized Cloud Architecture

Design scalable cloud infrastructure for ML training, fine-tuning, and inference
Architect GPU-based compute environments optimized for AI workloads
Build multi-environment systems (training, staging, production) with proper isolation
Implement auto-scaling systems for dynamic AI workloads

Model Serving & Inference Infrastructure

Build production-grade inference systems for real-time AI responses
Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, TGI, etc.)
Optimize latency, throughput, batching, and GPU utilization
Design load balancing, routing, and failover systems for AI APIs

GPU Compute & Training Systems

Manage GPU clusters for model training and evaluation
Configure distributed training (multi-node, multi-GPU setups)
Optimize spot/preemptible instance usage for cost efficiency
Operate managed ML platforms (SageMaker, Vertex AI, Azure ML, etc.)

Cloud Cost Optimization (FinOps for AI)

Monitor and optimize cloud spend across GPU, storage, and API usage
Implement cost dashboards and alerts for infrastructure usage
Optimize LLM usage, token consumption, and inference efficiency
Reduce idle compute and improve GPU utilization rates

Networking, Security & Reliability

Design secure VPCs, private endpoints, and high-performance networking
Implement IAM policies, encryption, and secrets management
Ensure compliance readiness (SOC 2, GDPR, CCPA)
Build resilient systems with high availability and fault tolerance

Infrastructure as Code & Observability

Build all infrastructure using Terraform, Pulumi, or CloudFormation
Implement GitOps workflows for reproducible deployments
Develop monitoring systems for GPU health, latency, and system performance
Build alerting systems for failures, spikes, and anomalies

Required Qualifications

Experience

3+ years in cloud engineering, DevOps, or infrastructure roles
Experience with ML/AI workloads in production environments
Hands-on experience with GPU-based compute systems
Startup or high-growth environment experience preferred

Technical Skills

Strong proficiency in Python, Go, or Bash
Deep experience with AWS, GCP, or Azure
Strong Kubernetes expertise (GPU scheduling, autoscaling, Helm, etc.)
Experience with model serving systems (vLLM, Triton, TensorRT, etc.)
Infrastructure as Code (Terraform, Pulumi, CloudFormation)
Monitoring tools (Prometheus, Grafana, Datadog, CloudWatch, etc.)

Soft Skills

Strong systems thinking and cloud architecture mindset
Cost-conscious engineering approach
Clear communication and documentation skills
Ability to work independently in fast-paced environments

Preferred Qualifications

Experience with large-scale LLM inference systems
Multi-GPU distributed training expertise
Knowledge of real-time streaming or low-latency systems
Experience with RDMA, InfiniBand, or high-performance networking
Background in SaaS, edtech, or AI product companies
Open-source or startup experience

What We Offer

Competitive equity in a fast-growing AI company
Compensation range of $140,000 - $195,000 USD per year
Remote-first work model (US-based, NYC optional hybrid)
Opportunity to build infrastructure used by 1M+ users
High-impact engineering ownership from day one
Fast-paced, AI-native development environment

About LockedIn AI

LockedIn AI is building the world’s leading real-time AI interview and meeting copilot. Our platform helps users succeed in interviews, assessments, and professional conversations using advanced AI systems operating at scale.

How to Apply

Submit:

Resume/CV
Short note covering:
- Why you want to join LockedIn AI
- Experience with cloud or AI infrastructure
- Ideas for improving AI system performance or scalability
Optional: GitHub, portfolio, or technical writeups
