This role will complement our existing team of analytics experts. The candidate will be responsible for optimizing our data architecture and for managing the flow and collection of data from disparate sources into a format usable by cross-functional teams. The ideal candidate will have subject-matter expertise in building and optimizing big data systems from the ground up. They will be self-driven and will work with stakeholders across the organization, including software engineers, data analysts, and data scientists. The candidate will have the opportunity to shape our data stack, ensuring that our ever-flowing data is adequately collected, organized, and made accessible for advanced analytics and beyond.
Duties and responsibilities:
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of sources into a central location using SQL or other 'big data' technologies
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
Implement industry-standard data governance and security practices
Qualifications:
Advanced working knowledge of SQL and experience with relational databases, including query authoring and familiarity with a variety of database systems.
Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
Strong project management and organizational skills.
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark Streaming, etc.
Familiarity with data management and visualization tools such as Tableau
Coursework in machine learning, data science, data mining, big data, and/or statistical inference is a plus
Comfortable with Git version control
Excellent written and verbal communication skills in English
A DevOps attitude – you build it, run it, and maintain it
A Bachelor’s or advanced degree in Computer Science or related field
At least 5 years’ experience
Willingness and ability to travel and be away for extended periods at a time