Streaks & Bestie Points — Data Pipeline Design

Overview

This document presents a production-grade architecture for Luzia's Streaks & Bestie Points (BP) gamification system. The system rewards millions of users worldwide for consistent daily engagement with the Luzia personal assistant app.

| Requirement | Solution |
| --- | --- |
| Streak tracking | Per-user daily activity detection with timezone-aware reset logic |
| Bestie Points | Event-driven point accrual from conversations and tool usage |
| Scale | Millions of DAU, global distribution, sub-10ms reads |
| Real-time | Instant streak/BP updates reflected in the mobile app |
| Analytics | Historical data warehouse for product and business intelligence |
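
To make the timezone-aware reset logic concrete, here is a minimal sketch of one way a streak update could work. It is not the project's `streak_processor.py`. The names `StreakState` and `update_streak` are hypothetical, and the rule itself is an assumption: activity on the next local calendar day extends the streak, a longer gap resets it to 1, and repeated or late events for an already-counted day change nothing.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, tzinfo
from typing import Optional


@dataclass(frozen=True)
class StreakState:
    """Hypothetical per-user streak record (illustrative only)."""
    current_streak: int
    last_active_local_date: Optional[date]


def update_streak(state: StreakState, event_utc: datetime, user_tz: tzinfo) -> StreakState:
    """Return the new streak state after one activity event.

    Assumed rule: the day boundary is midnight in the user's own timezone.
    """
    if event_utc.tzinfo is None:
        raise ValueError("event_utc must be timezone-aware")

    # Convert the event instant to the user's local calendar day.
    local_day = event_utc.astimezone(user_tz).date()
    last = state.last_active_local_date

    # Same day (or an out-of-order event for an earlier day): nothing to do.
    if last is not None and local_day <= last:
        return state

    # Activity on the very next local day extends the streak.
    if last is not None and local_day - last == timedelta(days=1):
        return StreakState(state.current_streak + 1, local_day)

    # First activity ever, or a gap of more than one day: start over at 1.
    return StreakState(1, local_day)
```

In practice `user_tz` would likely be a `zoneinfo.ZoneInfo` built from the user's IANA zone name, so that daylight-saving transitions are handled by the standard library rather than by fixed offsets.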

Tech Stack


Design Document Sections

1. Data Modeling & System Design

Database design, table schemas, DynamoDB access patterns, caching with ElastiCache, and performance optimization for peak-load scenarios.
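
As an illustration of how a cache in front of the primary store can keep reads fast, the sketch below shows a cache-aside read path with a TTL. This is an assumption about the caching pattern, not a description of the actual design: the `CacheAside` class, the `load_from_store` callback, and the in-memory dictionary standing in for ElastiCache are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Illustrative cache-aside wrapper; a dict stands in for the real cache."""

    def __init__(
        self,
        load_from_store: Callable[[str], dict],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_store      # e.g. a lookup against the primary table
        self._ttl = ttl_seconds
        self._clock = clock               # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, user_id: str) -> dict:
        now = self._clock()
        hit = self._entries.get(user_id)
        # Serve from cache while the entry has not expired.
        if hit is not None and hit[0] > now:
            return hit[1]
        # Miss or expired: read the source of truth and repopulate the cache.
        value = self._load(user_id)
        self._entries[user_id] = (now + self._ttl, value)
        return value

    def invalidate(self, user_id: str) -> None:
        """Drop a cached entry, e.g. right after a streak or BP write."""
        self._entries.pop(user_id, None)
```

Invalidating on write, as in `invalidate`, is one common way to keep the app's view of streaks and points fresh; writing the new value straight into the cache is an alternative with different consistency trade-offs.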

2. ETL & Data Pipeline Design

End-to-end pipeline from mobile events to the analytical data warehouse — real-time and batch paths, timezone handling, and data transformation.

3. Data Integrity & Anomaly Detection

Idempotency, deduplication, consistency validation, anomaly detection, and security controls to keep streak data trustworthy.
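
The idea behind idempotent point accrual can be sketched in a few lines: each event carries a unique ID, and points are applied only the first time that ID is seen. The `PointsLedger` class and its field names are hypothetical, and the in-memory set is a stand-in for a durable store. With DynamoDB, the same check is commonly expressed as a conditional write using `attribute_not_exists` on the event key.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PointsLedger:
    """Illustrative in-memory ledger; not the project's implementation."""
    balances: Dict[str, int] = field(default_factory=dict)
    seen_event_ids: Set[str] = field(default_factory=set)

    def apply(self, event_id: str, user_id: str, points: int) -> bool:
        """Credit points once per event_id; return False for a duplicate."""
        if event_id in self.seen_event_ids:
            return False  # retry or replay: already applied, so do nothing
        self.seen_event_ids.add(event_id)
        self.balances[user_id] = self.balances.get(user_id, 0) + points
        return True
```

Because a replayed event is a no-op, upstream components can retry freely after timeouts without inflating a user's balance.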


Code Samples

Working implementations are in the code-samples/ directory:

| File | Purpose |
| --- | --- |
| dynamodb/table_definitions.py | Boto3 DynamoDB table creation with GSIs |
| lambda/streak_processor.py | Streak update logic with timezone handling |
| lambda/event_ingestion.py | API event validation and ingestion |
| glue/etl_job.py | PySpark dedup & transform job |
| sql/redshift_schema.sql | Analytical tables DDL |