Lead Site Reliability Consultant


We are looking for a Lead Site Reliability Consultant with a proven background in reliability to join our network of more than 5000 consultants in 20 countries. We offer numerous new and ongoing projects, ranging from enterprise content solutions, mobile apps, and data services to experience-driven enterprise e-commerce, user experience, and marketing, as well as global opportunities to work with different industry verticals: retail & luxury, financial, healthcare, travel & hospitality, automotive, media & entertainment.

Valtech is a founding member of the MACH Alliance, a group that educates enterprises on best-of-breed Microservices, APIs, Cloud, and Headless (MACH) technology. Projects delivered include work for Levi's, BMW Online Store, and Aerolíneas Argentinas.

The ideal candidate's skills include:

/ A proven background in site reliability, covering observability, resilience and performance
/ Commercial experience of observability, from data ingestion to alerting and issue resolution. Elastic is highly desirable, but AppDynamics, Datadog, etc. are also considered
/ Comfortable with logging frameworks, log manipulation and shipping
/ Capable of designing, testing and implementing resilience on new and existing AWS solutions
/ Comfortable coding in one or more languages such as Python, Go, Java or Node.js
/ Commercial experience of using AWS Services including EC2, ECS, Serverless (Lambda)
/ Able to debug complex, high-availability production environments, with knowledge of networking, load balancing, TCP/HTTP, etc.
/ Comfortable with Linux operating systems and able to create and maintain shell scripts
/ Demonstrable range of in-depth technical knowledge/experience of handling complex software and platform architectures
/ In-depth level of technical knowledge/experience in building cloud solutions that have security, reliability, scalability, high availability and concurrency built-in from the outset
/ Background and relevant current experience in a hands-on Observability/SRE/Platform Engineering role is needed
/ Knowledge of IaaS deployment tools such as Terraform
/ Competent in using source control, preferably Git-based
/ Upper-Intermediate English level
/ Accountability, strong communication, collaboration, and time-management skills
/ Strong desire to learn more about the business every day


Nice to have:

/ Elastic Observability or OpenTelemetry experience

/ Working knowledge of continuous integration systems such as Jenkins and GitLab

/ Elasticsearch internals experience a big plus

/ Performance Test experience

/ AWS Certification

/ Docker development experience is desirable.

The Role

As a Lead Site Reliability Consultant, you’ll be expected to:

/ Develop and maintain Observability solutions using Elastic Observability and OpenTelemetry
/ Design, test and implement resilience on new and existing AWS solutions
/ Assist tribes with performance testing
/ Build and maintain solutions developed on AWS
/ Help the tribes enable observability features, develop solutions where none currently exist, and document the process for future reference
/ Assist with the creation and maintenance of pipelines to manage the observability components
/ Monitor and report on usage of our cloud solutions
/ Advise on the selection of the most appropriate technologies for the task
/ Ensure the delivery pipeline for your IaaS code has optimal quality controls built in to support testing, deployment, reporting and task management
/ Select appropriate quality controls to complete assigned tasks, including code-driven deployment, infrastructure deployment, automated testing, and effective operational monitoring, alerting and incident response
/ Supply appropriate information and analysis to support the resolution of issues and incidents with the tribes' observability

As a consultant and a link between developers and our clients, you are expected to develop expertise both in technology and in communicating complex concepts and rationale to non-technical audiences. We’ll encourage and support this with frequent opportunities to share ideas internally. We also have consultants who frequently speak at regional, national and global conferences.


Mental and physical health:

/ 20 working days of vacation

/ national holidays

/ sick leave (up to 20 days)

/ unpaid leave (up to 20 days)

/ medical insurance

/ sports reimbursement

/ maternity & paternity leave support

Personal and professional development:

/ Internal workshops & learning initiatives
/ English language classes compensation
/ Professional certifications reimbursement
/ Participation in professional local & global communities
/ Growth Framework to manage expectations and define the steps to move towards the selected career
/ Mentoring program with the ability to become a mentor or a mentee to grow to a higher position

You can not only become part of a constant evolution but also lead the change. The more we grow, the more opportunities there are to take responsibility, implement your creative ideas, and be an innovator and driver rather than a task executor.

Say hello to your future. Apply!
