Internet site designer
Title posted on Jobillico: Manager, Site Reliability Engineering (SRE) - eCommerce
Posted on October 29, 2024 by The Home Depot Canada
Job details
With a career at The Home Depot, you can be yourself and also be part of something bigger.

Position Overview:
The Manager, SRE will lead a team of Site Reliability Engineers to ensure the reliability, performance, and operational support of our eCommerce systems, with a focus on Google Cloud Platform (GCP) environments. This role requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support, with emphasis on DevOps principles and GCP expertise.

Responsibilities:

Leadership & Management:
Lead and mentor a team of Site Reliability Engineers
Foster a culture of continuous improvement and innovation
Collaborate with cross-functional teams to align SRE practices with business objectives

Reliability & Performance:
Conduct reliability reviews to identify areas for improvement and implement solutions to enhance system reliability, particularly in GCP environments
Implement and promote performance engineering practices to ensure optimal system performance on GCP
Develop and maintain service level objectives (SLOs) and error budgets

Production Engineering & Operational Support:
Oversee production engineering efforts to ensure systems are designed for operational excellence and reliability, leveraging GCP services and best practices
Manage incident response and post-incident reviews to minimize downtime and improve system resilience
Implement monitoring, alerting, and observability solutions to proactively identify and address issues
Develop and maintain runbooks and playbooks for common operational tasks
Coordinate with security teams to ensure compliance with security policies and best practices

DevOps & Continuous Improvement:
Drive DevOps initiatives to improve collaboration between development and operations teams, with a focus on GCP-native tools and services
Implement and maintain CI/CD pipelines to streamline deployment processes in GCP environments
Identify and implement automation opportunities to reduce manual tasks and improve efficiency
Promote the use of Infrastructure as Code (IaC) to manage and provision cloud resources
Continuously evaluate and integrate new tools and technologies to enhance DevOps practices

Release Management:
Implement and maintain release management best practices to minimize disruptions and maximize system stability
Collaborate with DevOps teams to integrate release management into CI/CD pipelines
Oversee release schedules, ensuring minimal impact on business operations
Ensure there is a rigorous release readiness process in place that includes reviews and post-release retrospectives
Maintain a release calendar and communicate release plans to stakeholders

Strategic Planning:
Create and maintain a strategic roadmap for SRE initiatives, aligning with business goals and technological advancements
Refine and standardize Standard Operating Procedures (SOPs) to enhance operational efficiency and consistency
Address customer pain points by developing and implementing solutions that improve user experience and system reliability
Engage with stakeholders to understand their needs and incorporate feedback into strategic planning and execution
Monitor industry trends and best practices to ensure the SRE team remains at the forefront of technology

Experience:
Bachelor's degree in Computer Science, Engineering, or a related field
Strong problem-solving and analytical abilities
Excellent communication and collaboration skills
4-6 years of relevant work experience, including significant experience with GCP
Extensive experience with cloud infrastructure, GCP services and architecture
Proven track record of managing and optimizing large-scale systems on GCP
Proven ability to effectively communicate with individuals at all levels of the organization
Ab
Location: Toronto, ON
Workplace information: On site
Salary: Not available
Starts as soon as possible
Vacancies: 1 vacancy
Source: Jobillico #14742788
Advertised until
2024-11-26
Important notice: This job posting has been provided by a partner site. Job Bank is not responsible for this content.