Software Engineer II at Rapid7
The Data Science team, as part of the Office of the CTO, works with large amounts of data to improve products, conduct security research, and report on the state of the internet at large.
In this role, you’ll apply your expertise to build and maintain the tools that enable and support this research and distribute its results. You will work with partners across the company to understand and gain access to different types of data. You will work with the data scientists on our team to build tools and POCs that demonstrate our findings. The tools and infrastructure you build will drive research, enhance products, and empower the community.
Essential Responsibilities
- Understand the data
  - Where is the data coming from?
  - How is it generated?
  - What does it mean?
  - Are there any noteworthy characteristics of the data (e.g., one column never varies)?
- Get the data
  - Work with data owners to get access to data
  - ETL samples, small slices and full sets of data
- Build and enhance tools that analyze the data
  - Research tools
- Engage with other teams to enable and empower them using our research
  - Work with engineers on other teams to adopt methods derived from research
- Minimum of 2 years of industry experience
- Experience with Scala, Python, and SQL, and with distributed processing using Apache Spark, Hadoop, or Hive
- Proficiency with AWS services such as EC2, SQS, VPC networking, S3, Athena, and Glue
- Experience automating infrastructure through Terraform or CloudFormation, Chef, and Docker/Kubernetes
- Strong communication skills
- Strong programming skills in Python, Java, Scala, and Bash
- Strong debugging skills, including the ability to reproduce a bug given limited information and/or time
- Experience with the Git version control system
- Experience with test frameworks across a variety of programming languages