Senior Site Reliability Engineer
Optimizely is the world's leader in customer experience optimization, allowing businesses to dramatically drive up the value of their digital products, commerce and campaigns through its best-in-class experimentation software platform. By replacing digital guesswork with evidence-based results, Optimizely enables product and marketing professionals to accelerate innovation, lower the risk of new features, and drive up the return on investment from digital by up to 10X. Over 26 of the Fortune 100 companies choose Optimizely to power their global digital experiences. Optimizely’s impressive customer list includes eBay, FOX, IBM, The New York Times and many more global enterprises.
Job Description
Optimizely’s Site Reliability Engineers work on improving the availability, scalability, performance and reliability of our production data platform. Our distributed event processing and compute platform powers the results and analytics for all of our Experimentation and Personalization products. This platform processes billions of events a day and is relied on by many Fortune 100 global businesses. We value observability, monitoring, actionable alerting based on SLOs, blameless postmortems and efficient incident response. We work in both the application and systems worlds, instrumenting key parts of core architecture while supporting developers as they do the same.
How you will make an impact
- As a member of the Data Infrastructure team, you will directly impact the reliability and performance of all of Optimizely’s products.
- You will deep dive into gnarly operational issues within software deployments, operating systems, network I/O, and Linux processes.
- You will also work on projects to move away from operational toil and towards improved fault tolerance, automation and SLO-driven priorities.
- Work closely with distributed systems engineers developing new features and services within our data platform
- Build and scale new infrastructure to meet demand
- Document system design and procedures
- Participate in production on-call rotation
- Contribute to improvements of infra and application monitoring and alerting
- Develop and improve disaster recovery procedures and automation
- Work closely with security engineers to develop and improve network ACLs and tests
- Engage in service capacity planning and demand forecasting
Qualifications
- 5+ years running 24x7x365 production environments
- 5+ years of building infrastructure with automation
- Experience with Cloud Infrastructure (ideally AWS)
- Deep understanding of Linux
- You’ve worked with messaging and data storage systems like Kafka, HBase, Spark, Cassandra or similar.
- Solid grasp of a modern programming language: Java, Python, Ruby, Go, Rust or similar.
- Proficiency with configuration management and orchestration tools like Puppet, Chef, Ansible or Terraform.
- Solid understanding of fundamental networking technologies and concepts.
- Knowledge of best practices related to security, performance, and disaster recovery.
- Experience with infrastructure monitoring, network design, and high-availability systems.
- Strong interpersonal communication skills and ability to work well in a diverse, team-focused environment with other Engineers, Product Managers, etc.
- Minimum BA/BS degree in Computer Science, Engineering, or a related field
At Optimizely, we embody inclusion and embrace diversity. We believe in work/life balance and bringing our true selves to work. To that end, we offer best-in-class perks and benefits that support our Optinauts along their career journey with us. Read more about our culture at optimizely.com/careers.
Optimizely is an equal opportunity employer and makes employment decisions on the basis of merit. Optimizely prohibits discrimination based on race, color, religion, sex, sexual identity, gender identity, marital status, veteran status, nationality, citizenship, age, disability, medical condition, pregnancy, or any other unlawful consideration. All your information will be kept confidential according to EEO guidelines.
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.