Principal Site Reliability Engineer (SRE), Production Systems at ThousandEyes (part of Cisco)
ThousandEyes empowers enterprises to see, understand, and improve digital experiences for their customers and employees. The ThousandEyes cloud platform offers unmatched vantage points throughout the global Internet and cloud providers, delivering immediate visibility into the digital experience for every user, application, website, or service, over any network. ThousandEyes is central to the global operations of the world's largest and fastest-growing brands, including Comcast, eBay, HP, 120+ of the Global 2000, 65+ of the Fortune 500, 6 of the 7 top US banks, and 20 of the 25 top SaaS companies.
Engineering at ThousandEyes
At ThousandEyes, we use cutting-edge technologies and innovative techniques to study and visualize networks on a global scale. ThousandEyes engineers are focused on continuous improvement — of our product, our codebase, our knowledge, and our skills. We believe in innovation, simplicity, and elegance. We work in small, cross-functional teams where everyone has a voice.
Learn more about engineering at ThousandEyes: https://www.youtube.com/watch?v=b9a_c8yJyzc
About the Role
The goal of the Production Systems team is a fully automated, self-service, highly scalable, efficient, and reliable infrastructure. As a Principal Site Reliability Engineer on this team, you will ensure that the team reliably builds and operates infrastructure as code that keeps the ThousandEyes platform always available.
What we're looking for:
- 8+ years of SRE/DevOps/infrastructure experience
- A fast learner
- Comfortable working with new technologies
- Exceptional with Go and/or Python and familiar with algorithms, data structures, and complexity analysis
- Exceptional with Unix/Linux systems, with experience working with the shell, the kernel, system libraries, file systems, and client-server protocols
- Experience with network protocols and theory (TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, overlay networks, OSI layers, load balancing, etc.)
- Experience with configuration management systems
- Experience with containers at large scale
- Experience with databases and/or performance tuning
- Experience with distributed systems and highly scalable systems
- Willingness to contribute to open source projects