  • Location: Remote
  • Type: Direct Hire
  • Job #344


As Lead Observability Engineer, you will lead the technical direction and implementation of the software pipeline, focusing on the monitoring and observability platform. This platform is essential: it provides the tooling, practices, and visibility that our development engineering teams rely on to deliver, observe, and maintain the product suite. You will ensure that development teams have a low-friction, highly secure, highly available, cost-effective, and consistent toolchain for delivering software, along with effective tools to monitor, observe, and operate that software, all within the framework of large-enterprise compliance and governance needs. You will work closely with the Director, software development leadership, technical operations leadership, architecture, the PMO, and vendors from around the globe to deliver these services.



  • Work with the Director of Development to assess the needs of Software Engineering against current offerings
  • Participate in, and occasionally lead, the daily, weekly, and sprint-cycle team ceremonies, and keep the team's activities efficient and aligned to goals
  • Provide mentoring to team members
  • Assess and size the effort associated with the work backlog, and participate in backlog grooming
  • Advise and inform a program of work to rationalize and mature the monitoring and observability toolchain
  • Lead technical discussions and requirements analysis with partners and customers to drive alignment with customer needs
  • Define best practices around making systems and services measurable and work with various teams to get those best practices applied
  • Collect, aggregate, and visualize metrics to give integrators and consumers actionable insight
  • Interview candidates and help build a team to establish a center of excellence and expertise in the Monitoring and Observability domains
  • Participate in regular planning cycles to align business priorities to programs of work within your organization
  • Inform recommendations, including resourcing, for strategic projects to mature and improve the monitoring and observability platform
  • Lead proofs of concept, engineering, and implementation projects
  • Deliver Managed Services that cover the following domains:
      • Log Collection and Analysis
      • Operational Metrics
      • Distributed Tracing
      • Code and Artifact Repositories
      • Build, Test, and Deployment Automation
      • Secrets Management
      • Configuration Repository and Management
      • Execution Bus
      • Container Build Standards and Automation
      • Continuous Delivery



  • Background in both Systems and Software Engineering
  • Competent developing software in one or more of Python, Golang, Ruby, or C, including defensive programming, error handling, and testing frameworks
  • Experience designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
  • Expertise and experience in delivering large-scale systems using big data technologies, including but not limited to enterprise-scale Kafka, Splunk, Prometheus, and other time-series databases
  • Up to date with industry standards and trends related to telemetry and software pipelines
  • Experience rationalizing and implementing a monitoring and observability toolchain at enterprise scale
  • Previous experience with public cloud (AWS) and infrastructure as code (Terraform)
  • Knowledge of and experience with containers and Kubernetes clusters
  • Hands-on experience consolidating application and system logs at enterprise scale
  • Experience with automation tools (Chef, Ansible)
  • Experience working in a remote team across multiple regions and time zones
  • Experience with incident management

