Senior Data Engineer (Massachusetts)
Rapid7 (Nasdaq: RPD) is advancing security with visibility, analytics, and automation delivered through our Insight cloud. Our solutions simplify the complex, allowing security teams to work more effectively with IT and development to reduce vulnerabilities, monitor for malicious behavior, investigate and shut down attacks, and automate routine tasks. Over 9,300 customers rely on Rapid7 technology, services, and research to improve security outcomes and securely advance their organization. For more information, visit our website, check out our blog, or follow us on LinkedIn.
The Opportunity
Rapid7 seeks a Sr. Data Engineer to build and maintain data infrastructure within the Business Intelligence team's data platform. You will be responsible for deploying data pipelines and machine learning models in the cloud, implementing DevOps practices and developing data models within Snowflake. You will work closely with AWS, Snowflake and DevOps applications such as GitHub Actions, Terraform, Jenkins and more. In this role you will bridge the gap between data engineering and DevOps by building monitoring systems, cloud applications and CI/CD pipelines that support the data engineering team's efforts.
The ideal candidate has hands-on experience performing DevOps work in a cloud environment, and has worked closely with databases and data pipelines. It's critical that you are able to translate business objectives into data required to support key analyses. You will collaborate with a creative, analytical and data-driven team to bring a single source of truth and self-service analytics to the entire company.
In this role you will:
Transform Rapid7's Business Intelligence and Analytics data platform by applying DevOps best practices of automation, monitoring, CI/CD and configuration management
Build and maintain the applications that ingest, analyze and store Rapid7's enterprise data
Mentor and provide guidance to peer data engineers based on your experiences and technical expertise
Productionize data and machine learning pipelines with Docker containerization and clustering tools (ECS/Kubernetes)
Develop effective monitoring and alerting systems to provide real-time visibility into the health of data infrastructure, cloud applications and data/machine learning pipelines
Build an environment that enables data scientists to easily develop and productionize Python, R and Spark code on top of a Snowflake data warehouse
Automate existing code and processes using scripting, CI/CD, infrastructure-as-code and configuration management tools
Deliver data engineering projects within Snowflake such as developing data pipelines, data models and metadata management solutions
Collaborate with stakeholders in product, business and IT to deliver data products
Work closely with leadership to drive adoption of the latest DevOps and DataOps trends and technologies
Partner with the IT, Infrastructure and Engineering teams on integration efforts between systems that affect data and analytics
In return you will bring:
3+ years of experience with a major cloud provider (preferably AWS) including hands-on experience with code deployment in cloud environments using tools such as Docker, Kubernetes, EC2 and Terraform
3+ years of experience working with a modern cloud data warehouse (preferably Snowflake) and SQL, plus 1+ years of experience with orchestration tools (preferably Airflow)
3+ years of experience in at least one programming language such as Python, Java or Scala
Experience as a leader within a data engineering team and ability to mentor teammates
Strong written and verbal communication skills
Highly collaborative in working with teammates and stakeholders
Experience with a CI/CD tool such as GitHub Actions or AWS CodePipeline
Working knowledge of data architecture, data warehousing, and metadata management
BS or MS in Computer Science, Analytics, Statistics, Informatics, Information Systems or another quantitative field; equivalent experience and certifications will be considered
#LI-REMOTE