Senior Data Engineer, GenAI Platform

Cellebrite

Posted on May 24, 2026

Description

Company Overview



Cellebrite’s (Nasdaq: CLBT) mission is to enable its global customers to protect and save lives by enhancing digital investigations and intelligence gathering to accelerate justice in communities around the world. Cellebrite’s AI-powered Digital Investigation Platform enables customers to lawfully access, collect, analyze and share digital evidence in legally sanctioned investigations while preserving data privacy. Thousands of public safety organizations, intelligence agencies and businesses rely on Cellebrite’s digital forensic and investigative solutions—available via cloud, on‑premises and hybrid deployments—to close cases faster and safeguard communities.

To learn more, visit www.cellebrite.com and follow us on social media @Cellebrite.

Position Overview



We are assembling an elite, small-scale team of innovators focused on transforming generative AI from breakthrough concepts into real-world products. As a Senior Data Engineer, you will serve as the data backbone of this GenAI innovation group, enabling rapid experimentation, research, and prototyping. In this role, you will transform complex, raw data into high-quality, AI-ready assets that directly power Cellebrite’s next generation of digital intelligence capabilities.

Key Responsibilities



  • Design, build, and maintain scalable data architectures that support generative AI research and rapid prototyping
  • Prepare, structure, optimize, and curate diverse datasets for AI and machine learning model training
  • Develop flexible, automated data pipelines to accelerate GenAI experimentation and development cycles
  • Partner closely with AI researchers and engineers to understand evolving data requirements
  • Conduct deep data exploration to uncover insights and identify new opportunities for GenAI applications
  • Ensure data quality, reliability, performance, and accessibility across multiple data domains
  • Optimize data processing workflows for large-scale and complex datasets
  • Apply strong data modeling and transformation practices to support advanced analytics and AI use cases

Technical Capabilities



  • Deep understanding of data requirements for machine learning and generative AI systems
  • Strong expertise in cloud-based data platforms (AWS, Google Cloud, or Azure)
  • Advanced proficiency in SQL and experience with both relational and NoSQL databases
  • Strong Python skills with a focus on data processing and automation
  • Experience building and optimizing data pipelines for AI/ML workloads
  • Hands-on experience with big data technologies and distributed data processing
  • Knowledge of performance tuning and data infrastructure optimization techniques
  • Experience integrating with BigQuery is an advantage

Research and Innovation Skills



  • Proven ability to derive meaningful insights from complex, large-scale datasets
  • Creative and analytical approach to data preparation and feature engineering
  • Strong experimental mindset with rigorous analytical thinking
  • Ability to identify unique data-driven opportunities that can inspire new GenAI initiatives

Requirements

  • Bachelor’s degree in Computer Science, Data Science, or a related field
  • 5+ years of progressive experience in data engineering or related roles
  • Demonstrated experience working with cloud platforms, big data technologies, and AI-focused data pipelines