What You’ll Do
You will be a member of an SRE team that operates our OpenStack private cloud and develops tools and integrations for a portfolio of cloud infrastructure services running NexTurn’s critical business services. You will use your OpenStack knowledge to drive improvements in operations and releases through code. You will use common open-source observability tooling such as ELK and Grafana for proactive alerting and to measure and maintain Service Level Objectives. You will work with the Tier 1 team on escalations and use those escalations as opportunities to automate.
- Collaborate with other SRE services team members to define roadmaps, write clear user stories with well-defined acceptance criteria, design, and build solutions
- Develop and deliver software required for building and improving the functionality, reliability, availability, and manageability of applications and cloud platforms using a DevOps model for on-prem (OpenStack) environments
- Automate the development, testing, and deployment processes through CI/CD pipelines (Git, GitLab, Jenkins, Helm, ArgoCD)
- Work with Tier 1 support on system and customer escalations
- Determine and set SLOs for the infrastructure, and create adequate monitoring and logging for features so that the SLOs can be measured successfully
Required Skills and Experience
- Solid OpenStack cloud infrastructure background, with operational troubleshooting and problem-solving experience
- Experience across the software development lifecycle, including design, development, testing, packaging, deployment, upgrade, and support
- OpenStack development and operations experience, including familiarity with major OpenStack components such as Keystone, Nova, Neutron, and Glance
- Software development experience in Python
- Ability to write patches for OpenStack in Python and contribute them to the community
- Experience working with the open-source community on bug fixes, enhancements, etc.
- Experience supporting software-defined storage with Ceph or other cloud-based storage
- Hypervisor technologies including KVM
- Red Hat Enterprise Linux and/or CentOS build, development, and operations
- Experience in building and maintaining code distribution through automated pipelines
- Experience with Ansible or Puppet for configuration management
- Software-defined network technologies including OVS, OVN, NFV, etc.
- IaC experience: Terraform, Ansible, Git, GitLab, Jenkins, Helm, ArgoCD, Conjur/Vault
- Agile software development practices.
- Ability to work with geographically distributed teams
- Understanding of IT processes, including design, implementation, and operations
- Open-source development experience
- Ambitious, able and willing to help where help is needed
- Able to build relationships, be culturally sensitive, align on goals, and learn quickly
- Ready to rock and build an amazing product
Location: Remote-first, but where the client mandates working from the office, the candidate will need to relocate.