Engineering Manager, Site Reliability



Other Engineering
United States
Posted on Monday, October 30, 2023
About Attentive:
Attentive® is the AI marketing platform for leading brands, designed to optimize message performance through 1:1 SMS and email interactions. Infusing intelligence at every stage of the consumer's purchasing journey, Attentive empowers businesses to achieve hyper-personalized communication with their customers on a large scale. Leveraging AI-powered tools, a mobile-first approach, two-way conversations, and enterprise-grade technology, Attentive drives billions in online revenue for brands around the globe. Trusted by over 8,000 leading brands such as CB2, Urban Outfitters, GUESS, Dickey’s Barbecue Pit, and Wyndham Resort, Attentive is the go-to solution for delivering powerful commerce experiences for consumers with the brands they love.
Attentive’s growth has been recognized by Deloitte’s Fast 500, LinkedIn’s Top Startups, and Forbes Cloud 100, all thanks to the hard work of our global employees!
Who we are
We’re looking for an Engineering Manager to lead our SRE team as it works to improve our Incident Management and reliability practices. That might mean implementing tooling, like a service catalog, or embedding directly with teams to understand their pain points, then using that knowledge to build scalable solutions. This role will partner closely with departments like Client Strategy, Engineering leadership, and Product Management to ensure our processes work for all of them. This is a hands-on role where you are expected to balance people management, technical oversight, and cross-functional leadership.

Why Attentive needs you

  • You will be an effective people manager who also has the capability to contribute as a highly technical hands-on infrastructure engineer when the need arises
  • You will collaborate with our product management team to craft plans to achieve our business goals
  • You'll serve as a technical domain expert in observability best practices, incident management, and ensuring high availability and performance of our applications
  • You are an effective people manager who can manage high-performing teams and help them reach their highest potential

About you

  • You have helped guide product teams to develop meaningful SLOs for their systems
  • You have utilized modern monitoring tools like Datadog, New Relic, Prometheus, Splunk, etc.
  • You can build automation with various scripting languages (Python, Shell, etc.)
  • You have experience managing production workloads running in Kubernetes and in AWS
  • You like building productive relationships with vendors: getting updates on open issues, pushing for new features, and leveraging training
  • You have 5 years of industry experience in infrastructure, solving real-world reliability and scaling problems
  • You care about software quality and have a track record of building reliable, high-uptime systems
  • You are an excellent coach, mentor, and developer of engineers
  • You translate business needs into clearly scoped projects, and take a hands-on approach to steer solution design and implementation
  • You are excited by new technologies, yet know how to evaluate and choose the right tools for the right reasons

Our scale

  • 8,000 brands powered by Attentive sent over 2.2 billion text messages during Cyber Week 2023 (Black Friday/Cyber Monday), representing growth of 31% from 2022
  • We sent 32 billion SMS messages in 2023, up 32% YoY. That’s an average of 87 million per day
  • Our production cluster contains over 18,000 containers which serve 200+ services
  • Our streaming services process over 80 billion events per month

What we use

  • Our infrastructure runs primarily in Kubernetes, hosted on AWS EKS
  • Infrastructure tooling includes Istio, Datadog, Terraform, Cloudflare, and Helm
  • Our backend is Java / Spring Boot microservices, built with Gradle, coupled with things like DynamoDB, Pulsar, Airflow, Postgres, PlanetScale, and Redis, hosted via AWS
  • Our frontend is built with React and TypeScript, and uses tools like GraphQL, Storybook, Radix UI, Vite, esbuild, and Playwright
You'll get competitive perks and benefits, from health & wellness to equity, to help you bring your best self to work.
For US based applicants:
- The US base salary range for this full-time position is $163,200 - $260,000 annually + equity + benefits
- Our salary ranges are determined by role, level and location
Attentive Company Values
Default to Action - Move swiftly and with purpose
Be One Unstoppable Team - Rally as each other’s champions
Champion the Customer - Our success is defined by our customers' success
Act Like an Owner - Take responsibility for Attentive’s success
Learn more about AWAKE, Attentive’s collective of employee resource groups.
If you do not meet all the requirements listed here, we still encourage you to apply! No job description is perfect, and we may also have another opportunity that closely matches your skills and experience.
At Attentive, we know that our Company's strength lies in the diversity of our employees. Attentive is an Equal Opportunity Employer and we welcome applicants from all backgrounds. Our policy is to provide equal employment opportunities for all employees, applicants and covered individuals regardless of protected characteristics. We prioritize and maintain a fair, inclusive and equitable workplace free from discrimination, harassment, and retaliation.