Senior Data Reliability Engineer at EZ Texting
EZ Texting is the #1 text communications technology company delivering fast, easy, and effective solutions for businesses across a wide variety of industries. Dreamers first, we are at the forefront of revolutionizing the way businesses communicate with their customers and believe personal relationships can transform an organization’s ability to grow.
Our employees are our greatest strength. We’re expanding quickly and scaling our teams to help accelerate growth while remaining committed to hiring exceptional, values-aligned talent. We have consistently been rated a Top 100 workplace and are committed to being a best-in-class employer for remote work — with benefits to match!
We are open to hiring in CA, NY, TX, OR, WA, GA, PA, & TN, but welcome top applicants nationwide as we expand our operating boundaries.
Role Overview:
After our employees and culture, our data is our greatest asset. We extract valuable insights from our data to help our customers achieve their desired outcomes quickly and easily. Data Reliability Engineers (DRE) are responsible for keeping database systems that support all user-facing services and many other production systems running smoothly 24/7/365. DREs are a blend of database engineers, software developers, and automation gurus that apply sound engineering principles, operational discipline and mature automation, while specializing in data storage solutions. In that capacity, DREs are peers to SREs and bring database and data expertise to the SRE and engineering teams.
Primary Responsibilities:
Identify changes to the product architecture that improve reliability, performance and availability, using a data-driven approach.
Proactively work on efficiency and capacity planning to set clear requirements and reduce resource usage.
Improve the performance of the system by making better use of resources, distributing load or reducing latency.
Identify the SLOs (Service Level Objectives) that will align the team to meet the availability and latency objectives.
Deliver projects, design solutions, identify potential issues, tradeoffs and risks.
Deliver production solutions that scale, identify automation points, and propose ideas on how to improve efficiency.
Provide emergency response either by being on-call or by responding to symptoms surfaced by monitoring.
Identify parts of the system that do not scale, provide immediate palliative measures and drive long-term resolution of these incidents.
Perform and run blameless RCAs on incidents and outages.
Lead and mentor by setting the example.
Improve documentation all around.
Requirements:
Bachelor’s degree in Computer Science, Computer Engineering or relevant field.
5+ years of experience in a similar role.
5+ years of experience running MySQL in production environments.
3+ years of experience running Elasticsearch in production environments.
3+ years of experience in a software engineering role (Python, Ruby, Java or Go).
2+ years of experience working with microservices architecture.
2+ years of experience with infrastructure automation and configuration management (Terraform, Ansible, Chef, Puppet).
Able to review SQL statements and guide developers with best practices.
Strong knowledge of database structure systems and data mining.
Have an urge to collaborate and communicate asynchronously.
Have an urge to document all the things so you don't need to learn the same thing twice.
Process oriented, driven to iterate on existing processes or create new ones.
Have the ability to orchestrate and automate complex administrative tasks.
Have a passion for stable and secure systems management practices.
Constantly improve product quality, security, and performance.
Excellent written and verbal English communication skills.
Have a proactive, grab-a-shovel and go-for-it attitude.
Have an urge for delivering quickly and iterating fast.
Excellent organizational and analytical abilities.
Outstanding problem solver.
Preferred Qualifications:
Understanding of analytical data warehouses, like Snowflake.
Hands-on experience implementing ETL (or ELT) best practices at scale.
Hands-on experience with any of the following tools or technologies: Glue, Airflow, Dataflow, BigQuery, Tableau, PowerBI, Ignite, Jupyter.
Experience publishing/consuming data to/from SaaS application APIs.
Experience with Java.
Data modeling skills.
Benefits available to EZ Texting team members include, but are not limited to:
100% paid medical, vision, dental and life insurance for self (generous coverage for families)
Paid vacation and unlimited sick leave
Paid parental leave
Annual personalized learning reimbursement
Quarterly wellness reimbursement
Remote-work optimization benefits including:
Monthly internet reimbursement
Monthly flexible remote work stipend, including DoorDash subscription
Annual home office enhancement stipend
Direct-billing ordering for supplies
EZ Texting is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.