Site Reliability Engineer
The role is almost fully remote, with travel to the London office required just once a month.
We are AMS. We are a global total workforce solutions firm; we enable organisations to thrive in an age of constant change by building, re-shaping, and optimising workforces. Our Contingent Workforce Solutions (CWS) is one of our service offerings; we act as an extension of our clients' recruitment team and provide professional interim and temporary resources.
Our client is a Big Four consultancy firm with a global presence, operating in over 150 countries. This organisation works with many public and private companies spanning multiple industries. Its advisory work spans audit, accountancy, tax, corporate finance and consulting.
On behalf of this organisation, AMS are looking for a Site Reliability Engineer for a 3-month contract (with potential extension) based in London or remote.
Purpose of the Role:
A Site Reliability Engineer (SRE) is a specialized role within a technology company or IT department that focuses on ensuring the reliable and efficient operation of complex software systems and services. The primary goal of an SRE is to bridge the gap between traditional software development and IT operations by applying software engineering principles to the management of large-scale, distributed, and critical production systems.
Responsibilities of the role:
As a Site Reliability Engineer you will be responsible for:
- System Monitoring and Incident Management:
  - Implement and maintain monitoring, alerting, and logging systems to proactively identify and resolve issues.
  - Respond to incidents, troubleshoot problems, and perform root cause analysis to prevent future occurrences.
  - Participate in on-call rotations to provide 24/7 support for critical systems.
- Automation and Tooling:
  - Develop and maintain automation scripts and tools to streamline operational tasks and improve system efficiency.
  - Create and manage deployment pipelines to ensure reliable and consistent software releases.
- Capacity Planning and Performance Optimization:
  - Monitor system performance metrics and conduct capacity planning to anticipate future resource needs.
  - Optimize the performance of applications and infrastructure through load testing and tuning.
- Infrastructure Management:
  - Work with cloud services and infrastructure-as-code tools to manage and scale the organization's cloud infrastructure.
  - Ensure high availability and fault tolerance through redundancy and disaster recovery planning.
- Security and Compliance:
  - Collaborate with security teams to implement and enforce security best practices.
  - Assist with compliance efforts by ensuring systems meet relevant industry standards and regulations.
- Continuous Improvement:
  - Continuously identify areas for improvement and implement changes to increase system reliability and efficiency.
  - Participate in post-mortems and contribute to knowledge sharing within the team.
What we require from the candidate:
- Ability to code in Python - essential.
- Linux administration (system administration and network configuration) - essential.
- Kubernetes Administration.
- Debugging and troubleshooting production performance issues (application and infrastructure).
- Knowledge of message queues (MQ), e.g., Kafka, RabbitMQ.
- CI/CD tooling and DevOps automation.
- Knowledge of modern infrastructure and cloud technologies (e.g., AWS, Azure, Google Cloud).
- Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Solid understanding of networking concepts and protocols.
This client will only accept workers operating via an Umbrella or PAYE engagement model.
If you are interested in applying for this position and meet the criteria outlined above, please click the link to apply and we will contact you with an update in due course.
AMS, a Recruitment Process Outsourcing Company, may in the delivery of some of its services be deemed to operate as an Employment Agency or an Employment Business.
Contact Name: AMS
Job ID: 3286995