Accountabilities & Key Roles:
We are looking for a motivated and experienced Team Leader to manage our Microservices Application Team, which works on applications deployed on Anthos. The ideal candidate will lead a team that handles DevOps, CI/CD pipelines, deployment activities, production support, and impact analysis for mission-critical services. This role requires strong leadership, technical expertise in containerized environments, and a solid background in managing modern application architectures.
Key Responsibilities:
- Lead and manage the day-to-day activities of the Microservices Application Team.
- Oversee deployment and release management for microservices hosted on Anthos and other container platforms.
- Drive and support DevOps practices including CI/CD automation, infrastructure-as-code, monitoring, and alerting.
- Ensure timely and efficient handling of production support issues and incident resolution.
- Perform and coordinate impact analysis for code changes, upgrades, and new feature rollouts.
- Collaborate with development, QA, infrastructure, and business teams to ensure seamless delivery and operational excellence.
- Enforce best practices for microservices design, observability, reliability, and scalability.
- Track and report on team performance and support KPIs, and participate in continuous improvement initiatives.
- Mentor and support team members’ professional development.
Job Requirements:
Education:
- Graduate degree in Computer Science, Software Engineering, or CIS from a recognized university.
Experience:
- Minimum of 5 years of experience in microservices application support and deployment, including at least 2 years in a leadership role.
- Strong hands-on experience with Anthos, Kubernetes, or similar container orchestration platforms.
- Solid understanding of DevOps practices, CI/CD tools (e.g., GitLab CI, Jenkins), and containerized environments.
- Experience in production support, incident management, and root cause analysis.
- Strong analytical and communication skills, with a focus on problem-solving and collaboration.
Competencies:
- Experience with service mesh, observability tools (e.g., Prometheus, Grafana), and logging frameworks.
- Familiarity with cloud platforms (GCP preferred) and hybrid cloud environments.
- Understanding of Site Reliability Engineering (SRE) principles is a plus.