1. Data Modeling & System Design¶

Database Choice: DynamoDB + Redshift Hybrid¶

The system uses a dual-database architecture — DynamoDB for the operational (OLTP) layer and Redshift for the analytical (OLAP) layer.

Why DynamoDB for the Operational Layer¶

Requirement	DynamoDB Capability
Sub-10ms latency	Single-digit millisecond reads/writes at any scale
Automatic scaling	On-demand capacity mode handles traffic spikes without provisioning
Global reach	Global Tables for multi-region replication (Luzia serves users worldwide)
Conditional writes	Built-in optimistic locking for streak consistency
Event streaming	DynamoDB Streams for change data capture (CDC)

Why Not a Relational Database?¶

A relational DB like Aurora would work for moderate scale, but DynamoDB is the stronger choice because:

No connection pooling bottleneck — DynamoDB is HTTP-based, no connection limits
Horizontal scaling is automatic — no read replicas to manage for read-heavy streak lookups
Cost model fits the pattern — millions of small reads/writes are cheaper on DynamoDB on-demand than Aurora
Schema flexibility — BP reward types can evolve without ALTER TABLE migrations

Table Design¶

Access Pattern Analysis¶

Before designing tables, identify the access patterns:

Access Pattern	Operation	Frequency
Get user's current streak & BP	`GetItem` by `user_id`	Every app open (~millions/day)
Update streak on daily activity	`UpdateItem` conditional	Every unique daily activity
Record a BP-earning event	`PutItem`	Every interaction
Get user's BP history (last 30 days)	`Query` by `user_id` + date range	On profile view
Leaderboard: top streaks	`Query` GSI	Periodic / on-demand
Detect duplicate events	`GetItem` by idempotency key	Every incoming event

Entity Relationship Diagram¶

erDiagram
    USER_STREAKS {
        string user_id PK
        int current_streak
        int longest_streak
        string last_activity_date
        string timezone
        int total_bestie_points
        string updated_at
    }

    BP_EVENTS {
        string user_id PK
        string event_id SK
        string event_type
        int points_awarded
        string source
        string created_at
        string idempotency_key
        int ttl
    }

    DAILY_ACTIVITY {
        string user_id PK
        string activity_date SK
        string first_event_id
        int events_count
        int points_earned
        string created_at
    }

    USER_STREAKS ||--o{ BP_EVENTS : "earns"
    USER_STREAKS ||--o{ DAILY_ACTIVITY : "tracks"

Table 1: `UserStreaks`¶

The hot table — queried on every app open to display the user's streak and BP balance.

Attribute	Type	Key	Description
`user_id`	String	PK	Unique user identifier
`current_streak`	Number		Consecutive days count
`longest_streak`	Number		All-time max streak
`last_activity_date`	String		`YYYY-MM-DD` in user's local TZ
`timezone`	String		IANA timezone (e.g. `America/Sao_Paulo`)
`total_bestie_points`	Number		Lifetime BP balance
`streak_updated_at`	String		ISO 8601 timestamp
`version`	Number		Optimistic locking counter
`recent_tz_changes`	List		Tracks timezone changes for abuse detection

GSI: StreakLeaderboard

Key	Attribute
PK	`leaderboard_partition` (fixed value `"GLOBAL"` or regional shard)
SK	`current_streak` (sort descending)

This GSI enables leaderboard queries. The partition key is sharded (e.g., by country) to avoid hot partitions.

Table 2: `BPEvents`¶

Append-only event log for every point-earning interaction.

Attribute	Type	Key	Description
`user_id`	String	PK	User identifier
`event_id`	String	SK	`{timestamp}#{uuid}` for sort order
`event_type`	String		`CONVERSATION_START`, `TOOL_USE`, `DAILY_OPEN`, etc.
`points_awarded`	Number		Points for this event
`source`	String		`mobile_app`, `web`, `whatsapp`
`idempotency_key`	String		Client-generated dedup key
`created_at`	String		ISO 8601
`ttl`	Number		Epoch seconds — auto-delete after 90 days

GSI: IdempotencyIndex

Key	Attribute
PK	`idempotency_key`

Enables O(1) duplicate detection before writing.

Table 3: `DailyActivity`¶

One record per user per day — the source of truth for streak calculation.

Attribute	Type	Key	Description
`user_id`	String	PK	User identifier
`activity_date`	String	SK	`YYYY-MM-DD` in user's local TZ
`first_event_id`	String		Reference to the triggering event
`events_count`	Number		Total interactions that day
`points_earned`	Number		Total BP earned that day
`created_at`	String		ISO 8601

Performance Optimization¶

Caching Strategy (ElastiCache Redis)¶

graph LR
    App[Mobile App] --> API[API Gateway]
    API --> Lambda[Lambda]
    Lambda --> Cache{ElastiCache<br/>Redis}
    Cache -->|Cache Hit| Lambda
    Cache -->|Cache Miss| DDB[DynamoDB]
    DDB --> Lambda
    Lambda -->|Write-through| Cache

What gets cached:

Data	Cache Key	TTL	Invalidation
User streak + BP	`streak:{user_id}`	5 min	Write-through on update
Daily activity flag	`active:{user_id}:{date}`	24h	None (immutable once set)
Leaderboard top 100	`leaderboard:{region}`	1 min	Scheduled refresh

Why write-through instead of write-behind: streak data must be immediately consistent — a user updating their streak should see the new count instantly.

DynamoDB Performance Tuning¶

On-demand capacity mode — no need to predict traffic; handles spikes automatically during push notifications or marketing campaigns
ElastiCache Redis — caches streak data, rate-limit counters, and computed aggregates. Chosen over DAX because the system needs more than DynamoDB read acceleration — Redis supports rate limiting (INCR/EXPIRE), processed-event sets (SADD/SISMEMBER), and arbitrary cached computations that DAX cannot provide
Partition key design — user_id provides natural distribution across partitions (assuming UUIDs). No hot partition risk since each user's data is independent
Item size optimization — keep items small (<1 KB) for maximum throughput per partition

Load Balancing¶

API Gateway handles request distribution across Lambda functions automatically
Lambda concurrency — set reserved concurrency per function to prevent one function from starving others
DynamoDB auto-scaling (if using provisioned mode) — CloudWatch alarms trigger capacity increases before throttling occurs
Regional failover — DynamoDB Global Tables + Route 53 health checks enable active-active multi-region

System Architecture Overview¶

graph TB
    subgraph Client
        App[Mobile App]
    end

    subgraph "AWS — Real-Time Path"
        APIGW[API Gateway]
        LambdaIngest[Lambda:<br/>Event Ingestion]
        DDB[(DynamoDB)]
        Redis[ElastiCache Redis]
        Streams[DynamoDB Streams]
        LambdaProcess[Lambda:<br/>Streak Processor]
    end

    subgraph "AWS — Batch / Analytics Path"
        Firehose[Kinesis Data Firehose]
        S3Raw[(S3: Raw Events)]
        Glue[AWS Glue]
        S3Curated[(S3: Curated)]
        Redshift[(Redshift)]
    end

    App --> APIGW
    APIGW --> LambdaIngest
    LambdaIngest --> DDB
    LambdaIngest --> Redis
    DDB --> Streams
    Streams --> LambdaProcess
    LambdaProcess --> DDB
    LambdaProcess --> Redis

    Streams --> Firehose
    Firehose --> S3Raw
    S3Raw --> Glue
    Glue --> S3Curated
    S3Curated --> Redshift

Code Samples

See code-samples/dynamodb/table_definitions.py for the full Boto3 implementation of these table designs.