Site Reliability Engineer

LogDNA | U.S. Only | 10 months ago

Intermediate (2-5 years) | Long-term | Over 30 hrs/wk

At LogDNA you’ll help us build a fast and modern log management platform that offers the flexibility of an amazing developer experience with the trust of enterprise-grade infrastructure. We strive to help developers pinpoint production issues by aggregating all system and application logs into one platform. Today, LogDNA is used by over 3,000 teams including IBM, OpenAI, Instacart, and Lime Bike. We’re building a future where developers don’t have to dread the tools they use at work, starting with log management. Our revenue grew 300% over the last year, and we're just getting started.

We're Y-Combinator alumni, venture-backed by Emergence Capital (Salesforce, Box, and Zoom) and Initialized Capital (Reddit, Coinbase, and Patreon). Our team comes from a wide variety of backgrounds and experiences, having worked on products at Heroku, Facebook, WhatsApp, Udacity, and Ripple, among others.

Our team is responsible for keeping LogDNA’s systems running smoothly 24x7x365, leveraging our mixture of specialties. We are currently looking for a passionate and motivated engineer who is enthusiastic about joining a distributed team and shares our commitment to growing our platform together. A successful candidate should be an energetic self-starter with a passion for continuous improvement and a desire to positively impact a growing venture-backed, Y-Combinator alum start-up.

Responsibilities

Join an on-call rotation to respond to surprises
Support our support engineers on customer incidents
Engage with your distributed teammates
Manage our infrastructure with Terraform, Ansible, and Kubernetes
Focus monitoring and alerting improvements on early detection and reduced pager fatigue
Continuously improve our tooling and pipelines for building and maintaining our products
Work to make production as boring as possible