systems operations manager - computer systems
Title posted on CareerBeacon -
Manager, Site Reliability Engineering (SRE) - eCommerce
Posted on
December 11, 2024
by
Employer details
Home Depot
Job details
With a career at The Home Depot, you can be yourself and also be part of something bigger.

Position Overview:
The Manager, SRE will lead a team of Site Reliability Engineers to ensure the reliability, performance, and operational support of our eCommerce systems, with a focus on Google Cloud Platform (GCP) environments. This role requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support, with emphasis on DevOps principles and GCP expertise.

Responsibilities:

Leadership & Management:
Lead and mentor a team of Site Reliability Engineers
Foster a culture of continuous improvement and innovation
Collaborate with cross-functional teams to align SRE practices with business objectives

Reliability & Performance:
Conduct reliability reviews to identify areas for improvement and implement solutions to enhance system reliability, particularly in GCP environments
Implement and promote performance engineering practices to ensure optimal system performance on GCP
Develop and maintain service level objectives (SLOs) and error budgets

Production Engineering & Operational Support:
Oversee production engineering efforts to ensure systems are designed for operational excellence and reliability, leveraging GCP services and best practices
Manage incident response and post-incident reviews to minimize downtime and improve system resilience
Implement monitoring, alerting, and observability solutions to proactively identify and address issues
Develop and maintain runbooks and playbooks for common operational tasks
Coordinate with security teams to ensure compliance with security policies and best practices

DevOps & Continuous Improvement:
Drive DevOps initiatives to improve collaboration between development and operations teams, with a focus on GCP-native tools and services
Implement and maintain CI/CD pipelines to streamline deployment processes in GCP environments
Identify and implement automation opportunities to reduce manual tasks and improve efficiency
Promote the use of Infrastructure as Code (IaC) to manage and provision cloud resources
Continuously evaluate and integrate new tools and technologies to enhance DevOps practices

Release Management:
Implement and maintain release management best practices to minimize disruptions and maximize system stability
Collaborate with DevOps teams to integrate release management into CI/CD pipelines
Oversee release schedules, ensuring minimal impact on business operations
Ensure there is a rigorous release readiness process in place that includes reviews and post-release retrospectives
Maintain a release calendar and communicate release plans to stakeholders

Strategic Planning:
Create and maintain a strategic roadmap for SRE initiatives, aligning with business goals and technological advancements
Refine and standardize Standard Operating Procedures (SOPs) to enhance operational efficiency and consistency
Address customer pain points by developing and implementing solutions that improve user experience and system reliability
Engage with stakeholders to understand their needs and incorporate feedback into strategic planning and execution
Monitor industry trends and best practices to ensure the SRE team remains at the forefront of technology

Experience:
Bachelor's degree in Computer Science, Engineering, or a related field
Strong problem-solving and analytical abilities
Excellent communication and collaboration skills
4-6 years of relevant work experience, including significant experience with GCP
Extensive experience with cloud infrastructure, GCP services and architecture
Proven track record of managing and optimizing large-scale systems on GCP
Proven ability to effectively communicate with individuals at all levels of the organization
Ability to maintain relationships and negotiate with vendors
Ability to operate in and leverage resources in a matrixed environment
Ability to analyze and present data to support ideas
Ability to clearly c
-
Location
Toronto, ON
-
Workplace information
On site
-
Salary
Not available
-
Terms of employment
Permanent employment
Full time
-
Starts as soon as possible
-
Vacancies
1 vacancy
-
Source
CareerBeacon
#2120636
Advertised until
2025-01-10
Important notice: This job posting has been provided by a partner site. Job Bank is not responsible for this content.