Data Engineering Intern

The Role at a Glance

We are looking for a motivated intern to join our Data Engineering team and help build scalable, cloud-native data solutions on AWS. You will work on data ingestion, transformation, and analytics pipelines leveraging Amazon S3, AWS Glue, Glue Data Catalog, and query engines like Presto and Hive. Additionally, you will gain exposure to EC2 and Amazon EKS environments that host Dataiku, our enterprise data science platform.

What you'll be doing

  • Design and implement ETL pipelines using AWS Glue for batch data processing, and register metadata in the Glue Data Catalog.
  • Manage the data lake on Amazon S3, ensuring proper partitioning, compression, and lifecycle policies.
  • Develop analytical queries using Presto and Hive for data validation and reporting.
  • Support Dataiku workflows by integrating curated datasets and enabling model deployment pipelines on EC2/EKS.
  • Automate infrastructure provisioning for data pipelines using CloudFormation.
  • Monitor and optimize the performance of ETL jobs and queries using CloudWatch and cost analysis tools.
  • Collaborate with data scientists to ensure smooth data access and governance within Dataiku.
  • Document architecture, data flows, and operational runbooks for reproducibility.

What we’re looking for

  • Rising junior
  • Python (data processing, Glue scripts, boto3 basics)
  • SQL (joins, aggregations, window functions)
  • Foundational knowledge of cloud computing architecture

Nice-to-haves:

  • AWS Core Services: S3 (bucket structure, partitioning, lifecycle policies), Glue (ETL jobs, Glue Data Catalog), IAM (roles, policies)
  • Query Engines: Basic experience with Presto or Hive for analytical queries
  • Linux & EC2 Basics: Familiarity with SSH, file transfers, and basic EC2 configuration
  • Version Control: Git (branching, pull requests)
  • Containers & Orchestration (Intro): Awareness of Docker and Amazon EKS concepts