Principal Site Reliability Engineer

ID: 15460
Location: Dublin, Ireland
Role Type: Permanent

Principal TechOps Engineer – SRE

Overview

We are seeking a Principal TechOps Engineer (SRE) to play a key role in designing, building, and operating highly available cloud infrastructure. This position involves close collaboration with engineering teams to drive initiatives from concept through to production.

You will work within a modern, multi-region Kubernetes environment (AWS EKS) supporting mission-critical workloads, helping to shape infrastructure strategy and improve reliability, scalability, and automation across the platform.

This is a high-impact opportunity to influence cloud architecture, deployment practices, and operational excellence in a fast-paced, collaborative environment.

Key Responsibilities

  • Partner with engineering teams to deliver infrastructure and platform initiatives end-to-end
  • Design and operate highly available, secure, and scalable cloud-native systems
  • Manage and optimize Kubernetes environments (AWS EKS) across multiple regions and availability zones
  • Lead efforts in infrastructure automation and infrastructure-as-code (IaC)
  • Build and maintain CI/CD pipelines and deployment frameworks
  • Define and implement monitoring, logging, and alerting strategies
  • Drive adoption of DevOps best practices and an automation-first mindset
  • Provide technical leadership and mentorship to SRE / Cloud Engineering teams
  • Collaborate cross-functionally with product, engineering, and risk stakeholders
  • Champion reliability, performance, and operational excellence across all systems

Required Skills & Experience

  • 5+ years of hands-on experience with AWS in production environments
  • Strong experience with Docker and containerized workloads
  • Proven experience running and managing Kubernetes workloads (preferably AWS EKS)
  • Experience deploying and managing Kubernetes clusters
  • Hands-on experience with CI/CD tools (Jenkins preferred)
  • Experience creating and managing Helm charts and libraries
  • Strong knowledge of monitoring and observability tools (e.g., CloudWatch, Datadog, Splunk)
  • Solid experience with UNIX/Linux systems and shell scripting
  • Experience working in large-scale AWS environments (multi-account, IAM, SSO)
  • Strong communication skills with the ability to engage across all levels
  • Ability to work independently and take ownership of initiatives

Preferred Experience

  • Infrastructure-as-code experience (Terraform preferred)
  • Programming experience (Python preferred)
  • Experience with Git or other distributed version control systems
  • Experience with Kafka / Confluent Kafka
  • Familiarity with agile methodologies (Kanban preferred)
  • Experience with CDN providers (e.g., Akamai)

Desirable Traits

  • Strong automation mindset – sees problems as opportunities to improve processes
  • Proven leadership experience within SRE / Cloud Engineering teams
  • Passion for building resilient, scalable systems
  • Ability to thrive in a fast-moving, evolving environment

Team & Environment

You will join a highly skilled Technical Operations team focused on cloud transformation, reliability engineering, and scalable infrastructure.

The team operates with a strong DevOps culture, emphasizing:

  • Infrastructure-as-code
  • Automation and continuous delivery
  • Security and resilience
  • High availability and system reliability

 
