Cloud Operations Lead

Type: Passive opening. Submit your resume for future consideration.

Job Description


This is a remote position.

Main Responsibilities

Leadership Role:

  • Own the cloud service design, roadmap and delivery
  • Enhance customer experience for cloud customers
  • Resolve critical issues in production
  • Lead and motivate team members within the department and beyond
  • Advise on and drive process-based improvements with the help of coordinators

Technical Role:

  • Work with product engineers and the design team to maintain and secure cloud environments at the lowest cost of ownership
  • Develop cloud automation for continuous delivery
  • Manage operations in all public and private cloud environments
  • Incident management and support
  • Logging, monitoring and event management
  • Overall health management of all environments
  • Be available on call as part of the team maintaining 24x7 SaaS availability

Management Role:

  • Capacity Management
  • Change Management
  • Organisation and People Management
  • Hire and train a high-performing team
  • Develop a succession plan, delegate and elevate
  • Ensure that the departmental goals and scorecard metrics are met
  • Establish and effectively communicate department goals

 

Required Knowledge & Experience

 

  • 12+ years of progressive experience in the design and implementation of IT technologies and operations
  • 5+ years managing or leading cloud-based environments supporting IaaS and SaaS that serve millions of users
  • Experience in security management is good to have
  • Expert knowledge of Linux or Unix based system administration
  • Good knowledge of compute, storage and network architecture
  • Strong expertise in managing cloud-based infrastructure
  • Experience in troubleshooting hardware, OS and application-related issues
  • Public cloud knowledge: GCP, AWS, Azure or SoftLayer
  • Deployment and configuration management: Puppet, Chef, Salt or Ansible
  • Scripting language: Shell, Python, PHP, Go or Perl
  • Log management: Elasticsearch, Syslog or Logstash
  • Monitoring: Graphite, Nagios or Ganglia