As Lead Observability Engineer, you will lead the technical direction and implementation of the software pipeline, focusing on the monitoring and observability platform. This platform is essential: it provides the tooling, practices, and visibility that our development engineering teams rely on to deliver, observe, and maintain the product suite. You will ensure that development teams have a low-friction, highly secure, highly available, cost-effective, and consistent toolchain for delivering software, along with effective tools to monitor, observe, and operate that software, all within the framework of large-enterprise compliance and governance needs. You will work closely with the Director, software development leadership, technical operations leadership, architecture, the PMO, and vendors from around the globe to deliver these services.
ROLE RESPONSIBILITIES:
Work with the Director of Development to assess the needs of Software Engineering against current offerings
Participate in and occasionally lead the daily, weekly, and sprint-cycle team ceremonies, and ensure the team's activities are efficient and aligned to goals
Provide mentoring to team members
Assess and size the effort associated with the work backlog and participate in backlog grooming
Advise and inform a program of work to rationalize and mature the monitoring and observability toolchain
Lead technical discussions and requirements analysis with partners and customers to drive alignment with customer needs
Define best practices around making systems and services measurable and work with various teams to get those best practices applied
Collect, aggregate, and visualize metrics to provide integrators and consumers with actionable insight
Interview candidates and participate in building a team to establish a center of excellence and expertise in the Monitoring and Observability domains
Participate in regular planning cycles to align business priorities to programs of work within your organization
Inform recommendations, including resourcing, for strategic projects to mature and improve the monitoring and observability platform
Lead proofs of concept, engineering, and implementation projects
Deliver Managed Services that cover the following domains:
Log Collection and Analysis
Operational Metrics
Distributed Tracing
Code and Artifact Repositories
Build, Test, and Deployment Automation
Secrets Management
Configuration Repository and Management
Execution Bus
Container Build Standards and Automation
Continuous Delivery
TECHNICAL REQUIREMENTS:
Background in both Systems and Software Engineering
Competence developing software in one or more of Python, Golang, Ruby, or C, including defensive programming, error handling, and use of testing frameworks
Experience designing, automating, maintaining, and optimizing observability platforms (logging, metrics, and tracing)
Expertise and experience in delivering large-scale systems using big data technologies including, but not limited to, enterprise-scale Kafka, Splunk, Prometheus, and other time-series databases
Up-to-date knowledge of industry standards and trends related to telemetry and software pipelines
Experience rationalizing and implementing a monitoring and observability toolchain at enterprise scale
Previous experience with public clouds (AWS) and infrastructure as code (Terraform)
Knowledge of and experience with containers and Kubernetes clusters
Hands-on experience consolidating application and system logs at enterprise scale
Experience with automation tools (Chef, Ansible)
Experience working in a remote team across multiple regions and time zones