Data Engineer
SparkCognition, named in the top 20 of CNBC’s Top 50 Disruptors of 2017, is one of the fastest-growing AI companies in the world. Our technology is deployed across dozens of enterprise environments with some of the largest global organizations, covering oil and gas, utilities, manufacturing, finance, government, defense, and aerospace.
Description
We are looking for a versatile Data Engineer who is enthusiastic about working with the latest technologies to streamline, automate, and optimize large-scale data solutions. You are a designer, builder, and manager of information. You have a deep understanding of both software development and database technologies, allowing you to select and integrate tools, frameworks, and database systems into corporate products and other initiatives. You keep abreast of new tools and techniques to get the job done. You are self-motivated and can work effectively both independently and as part of a team.
Essential Duties and Responsibilities
As a Data Engineer you will:
Select, develop, and integrate tools, frameworks, and database technologies that support highly scalable data processing, with a focus on reliability, performance, and data quality
Install, configure, and maintain databases
Backup/Restore
Maintenance patches
Security Administration
Optimization/Performance tuning
Define and implement data retention policies and disaster recovery systems
Model data for various data stores
Understanding of normalization, data warehouses, distributed databases, and column stores
Understanding of relational, non-relational (NoSQL), data stream, and time series data
Design, develop, and implement ETL/ELT processes
Working with different file formats on both import and export
Understanding of bulk import/export, BCP, data link, and virtualization techniques
Knowledge of and ability to use scripting and/or tools (SSIS, Informatica, Kettle, ODI, etc.)
Data Analysis
Analyze data quality and integrity
Using SQL, PL/SQL, T-SQL, and other scripting techniques, find matching patterns and data points across distributed systems
Identify entities and attributes for facts/dimensions in a datastore or data warehouse
Required Skills/Experience
Bachelor's degree in a related field with 7-10 years' relevant work experience
Advanced SQL skills and database experience in Oracle, SQL Server, or Postgres
Advanced experience with scripting languages like Python, JavaScript, PHP, or Perl
Understanding of data storage techniques and formats: data lakes, XML, JSON, Parquet, column stores, row stores, horizontal/vertical partitioning, distributed storage, etc.
Ability to multitask in a fast-paced environment with rapidly changing priorities
Ability to diagnose performance issues and address them
Preferred Skills/Experience (Any of these is a plus)
Hands-on experience with NoSQL database technologies like MongoDB, DynamoDB, or HBase
Experience with various messaging systems, such as Kafka or RabbitMQ
Experience working with cloud ecosystems such as GCP, AWS, or Azure
*US Citizenship required