Lead Site Reliability Engineer at Lightspeed Systems
Lightspeed Systems is looking for an enthusiastic engineer to join our team as a Lead Site Reliability Engineer. As a member of this team, you will focus on software development and infrastructure design, building services to manage, scale and monitor our shared core infrastructure. The infrastructure and services that this team is responsible for include databases, message queues, monitoring solutions, security and networking in the cloud and in physical data centers. Engineers on this team will be challenged in a fast-paced environment and will steer the advancement of efficient, resilient and scalable shared resources used by many of our core production services. This role may be based in either our Austin, TX or Portland, OR office.
ABOUT THE ROLE:
- Streamline and enhance the day-to-day operational workflows of shared services in a 24x7x365 environment spanning AWS and physical data centers.
- Build tools to enhance performance, scalability and observability of resources shared between multiple projects in production.
- Utilize a wide variety of open source technologies to create fault-tolerant, scalable and secure high-performance services and pipelines on a global scale.
- Interact with other teams across the organization to define KPIs and evangelize the adoption of best practices in relation to performance and reliability.
- Continuously improve observability to ensure the uptime and reliability of our applications and infrastructure.
- Troubleshoot issues across the entire stack (hardware, software, application and network) within physical data center and cloud-based environments.
- Provide on-call support for shared services and infrastructure.
- Mentor and manage a team of 1-3 people, providing technical guidance and expertise.
- Provide project management, oversight and reporting for your team.
QUALIFICATIONS:
- Proven track record of designing, building, optimizing, and maintaining infrastructure on a large scale.
- Software development experience using Go, Python and Ruby.
- A deep understanding of the Linux operating system, from the console to the kernel.
- Ability to work as part of a distributed team.
- Knowledge of CI/CD best practices.
Preferred Qualifications include:
- Experience with containers and container orchestration tools (Docker, Kubernetes and Spinnaker experience preferred).
- Experience working in the AWS environment.
- Experience with AWS, Google Cloud and Azure.
- Experience with Postgres, DynamoDB, Redis, and/or Memcached and other AWS Services.
We require all qualified applicants to complete a set of assessments as part of the application process. We invite you to jump-start your application for this role by completing the assessment at the following link (it will only take 7-10 minutes): https://assess.predictiveindex.com/1nrxV
Education is undergoing a technology revolution, with new devices and tools being added to the classroom every day, and IT departments are responsible for keeping all this technology managed, safe and working. That is where we come in! Lightspeed Systems, an ed-tech provider and leader in K-12 device filtering for 20 years, partners with schools to make learning safe, managed and mobile. Learn more at www.lightspeedsystems.com.
We love our employees, and we show it. A sneak peek into our BENEFITS & PERKS includes:
- Health -- Medical, dental and vision insurance with healthy company contribution toward premiums.
- Wellness -- Lightspeed kicks cash into your HSA if you participate in our HDHP. Employees are provided an adjustable desk and onsite gyms at some offices. Healthy holiday and PTO policy.
- Retirement -- 401(k) matching up to 6%.
- Perks -- Fully stocked kitchen with snacks and beverages. Some lunches provided as well!