[Remote] Lead Site Reliability Engineer

Work from home | Full-time role | Hiring

Note: This is a remote position open to candidates in the USA. Gradle Technologies is an AI-native company focused on transforming software development through its Develocity platform. It is seeking a Lead Site Reliability Engineer to define the SRE vision, set operational standards, and ensure reliability across production services while mentoring a growing team.

Responsibilities

  • Operate and maintain all Develocity instances and supporting services in production
  • Define and evolve SRE standards, practices, and operating models, including on-call, incident response, postmortems, and SLOs
  • Participate in a follow-the-sun on-call rotation, acting as a technical escalation point for complex or high-severity incidents
  • Lead incident response and blameless retrospectives, ensuring learnings result in measurable reliability improvements
  • Set reliability priorities using risk, customer impact, business goals, SLOs, and error budgets
  • Identify systemic reliability risks and continuously evolve Develocity’s SaaS operations as the platform and customer base grow
  • Lead and influence architectural and design reviews to ensure reliability, scalability, and operability
  • Drive automation across deployment, upgrades, monitoring, self-healing, recovery, and operational workflows
  • Build and maintain comprehensive observability for all managed services, including logging, metrics, tracing, and alerting
  • Own disaster recovery, backups, and business continuity planning and execution
  • Partner with engineering leadership to balance feature delivery with reliability and operational excellence
  • Mentor and coach SREs, supporting technical growth and strong operational practices
  • Help onboard new SREs and contribute to hiring by defining and assessing SRE excellence at Develocity
  • Communicate clearly with customers during incidents and maintenance windows
  • Optimize performance, resource utilization, and operational costs

Skills

  • 7+ years in SRE, DevOps, or an equivalent role operating production services at scale
  • Experience leading reliability initiatives across multiple teams or services
  • Demonstrated ability to influence technical direction without direct authority
  • Experience designing and operating systems with SLOs and error budgets, and exercising strong judgment in balancing reliability, velocity, and cost
  • Strong Kubernetes experience in production environments
  • Cloud infrastructure expertise, preferably AWS (EKS, RDS, S3, EC2)
  • Proficiency with observability tools (Prometheus, Grafana) and Infrastructure as Code (Terraform)
  • Track record of incident management and response in a 24/7 on-call environment
  • Scripting proficiency (Python, Bash) for automation
  • Strong written and verbal English communication skills
  • Experience as a founding or early SRE establishing practices in a growing SaaS organization
  • Familiarity with Develocity
  • JVM language experience (Java, Kotlin)
  • Experience with customer-facing and executive-level incident communications

Benefits

  • A ground-floor role in a new SRE team - you'll shape how we do things, not inherit someone else's decisions.
  • Real ownership of production systems used by engineers at companies you've heard of.
  • Direct interaction with customers when things go wrong (and when they go right).
  • A culture that values automation over heroics.
  • In-person meetings, such as our annual company offsite and team meetings.
  • Work from home in a remote-first environment.
  • Competitive salaries and equity grants.

Company Overview

  • Gradle Technologies is the award-winning developer productivity company behind Gradle Build Tool (one of the most used build systems in the world) and Develocity®, the leading developer observability platform. It was founded in 2014 and is headquartered in San Francisco, California, USA, with a workforce of 51-200 employees. Its website is https://gradle.com/.

Company H1B Sponsorship

  • Gradle Technologies has a track record of offering H1B sponsorships, with 1 in 2025, 1 in 2024, and 2 in 2022. Please note that this does not guarantee sponsorship for this specific role.