AI Cloud Engineer

https://www.lockedinai.com/
17 hours ago
Full-time
Remote
Worldwide
$140,000 - $195,000 USD yearly
Engineering & science

LockedIn AI is hiring an AI Cloud Engineer to design and operate the cloud infrastructure powering real-time AI systems used by over 1 million users worldwide.

About the Role

We are looking for a cloud-native, AI-infrastructure-focused engineer to build and scale the backbone of our AI systems. This role sits at the intersection of cloud engineering, DevOps, and machine learning infrastructure, where you will design the environments that train, serve, and optimize large-scale AI models in production.

You will own the full lifecycle of AI infrastructure, from GPU clusters for training to low-latency inference systems powering real-time interview assistance.

Key Responsibilities

AI-Optimized Cloud Architecture

  • Design scalable cloud infrastructure for ML training, fine-tuning, and inference
  • Architect GPU-based compute environments optimized for AI workloads
  • Build multi-environment systems (training, staging, production) with proper isolation
  • Implement auto-scaling systems for dynamic AI workloads

Model Serving & Inference Infrastructure

  • Build production-grade inference systems for real-time AI responses
  • Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, TGI, etc.)
  • Optimize latency, throughput, batching, and GPU utilization
  • Design load balancing, routing, and failover systems for AI APIs

GPU Compute & Training Systems

  • Manage GPU clusters for model training and evaluation
  • Configure distributed training (multi-node, multi-GPU setups)
  • Optimize spot/preemptible instance usage for cost efficiency
  • Operate managed ML platforms (SageMaker, Vertex AI, Azure ML, etc.)

Cloud Cost Optimization (FinOps for AI)

  • Monitor and optimize cloud spend across GPU, storage, and API usage
  • Implement cost dashboards and alerts for infrastructure usage
  • Optimize LLM usage, token consumption, and inference efficiency
  • Reduce idle compute and improve GPU utilization rates

Networking, Security & Reliability

  • Design secure VPCs, private endpoints, and high-performance networking
  • Implement IAM policies, encryption, and secrets management
  • Ensure compliance readiness (SOC2, GDPR, CCPA)
  • Build resilient systems with high availability and fault tolerance

Infrastructure as Code & Observability

  • Build all infrastructure using Terraform, Pulumi, or CloudFormation
  • Implement GitOps workflows for reproducible deployments
  • Develop monitoring systems for GPU health, latency, and system performance
  • Build alerting systems for failures, spikes, and anomalies

Required Qualifications

Experience

  • 3+ years in cloud engineering, DevOps, or infrastructure roles
  • Experience with ML/AI workloads in production environments
  • Hands-on experience with GPU-based compute systems
  • Startup or high-growth environment experience preferred

Technical Skills

  • Strong proficiency in Python, Go, or Bash
  • Deep experience with AWS, GCP, or Azure
  • Strong Kubernetes expertise (GPU scheduling, autoscaling, Helm, etc.)
  • Experience with model serving systems (vLLM, Triton, TensorRT, etc.)
  • Infrastructure as Code (Terraform, Pulumi, CloudFormation)
  • Monitoring tools (Prometheus, Grafana, Datadog, CloudWatch, etc.)

Soft Skills

  • Strong systems thinking and cloud architecture mindset
  • Cost-conscious engineering approach
  • Clear communication and documentation skills
  • Ability to work independently in fast-paced environments

Preferred Qualifications

  • Experience with large-scale LLM inference systems
  • Multi-GPU distributed training expertise
  • Knowledge of real-time streaming or low-latency systems
  • Experience with RDMA, InfiniBand, or high-performance networking
  • Background in SaaS, edtech, or AI product companies
  • Open-source or startup experience

What We Offer

  • Competitive equity in a fast-growing AI company
  • $140,000 – $195,000 USD / year compensation range
  • Remote-first work model (US-based, with optional hybrid work in NYC)
  • Opportunity to build infrastructure used by 1M+ users
  • High-impact engineering ownership from day one
  • Fast-paced, AI-native development environment

About LockedIn AI

LockedIn AI is building the world's leading real-time AI interview and meeting copilot. Our platform helps users succeed in interviews, assessments, and professional conversations using advanced AI systems operating at scale.

How to Apply

Submit:

  • Resume/CV
  • Short note covering:
    • Why you want to join LockedIn AI
    • Experience with cloud or AI infrastructure
    • Ideas for improving AI system performance or scalability
  • Optional: GitHub, portfolio, or technical writeups