MediaPulse

System Flow

System Architecture Overview

The system uses a modular, event-driven architecture where agents run independently on their own schedules, communicating through a shared data store and event queue. This design provides:

  • Scalability: Agents can be scaled independently based on load
  • Reliability: Failures in one agent don't block others
  • Flexibility: Each agent runs on its optimal schedule
  • Modularity: Agents are loosely coupled and can be updated independently
  • Extensibility: Plugin-based architecture allows dynamic addition/removal of capabilities (e.g., analysis types) without code changes

Agent Schedules & Independence

Independent Agent Flows

Query Strategy Agent

Triggers:

  • Schedule
  • Manual trigger

Schedule:

  • Schedules are created by admins via the admin interface
  • Each schedule references a JobTemplate that defines the agent and parameters
  • Parameter expansion can be configured (e.g., expand to all tickers)
  • Examples:
    • Entity Graph Refresh: Weekly schedule with ticker expansion
    • Query Optimization: Daily schedule with ticker expansion
    • Query Generation: On-demand or triggered by events
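
As an illustration of how a schedule and its job template might fit together, a weekly entity-graph refresh with ticker expansion could look roughly like the sketch below. The field names are assumptions for illustration, not the actual schema.

```typescript
// Hypothetical shapes for JobTemplate and Schedule records; shown only to
// illustrate how a schedule references a template and how parameter
// expansion is expressed.
interface JobTemplate {
  id: string;
  agent: "query-strategy" | "data-collection" | "newsletter-pipeline";
  task: string;                        // e.g. "entity-graph-refresh"
  defaultParams: Record<string, unknown>;
  expansion?: { over: "tickers" | "user-ticker-subscriptions" };
}

interface Schedule {
  id: string;
  jobTemplateId: string;
  cron?: string;                       // cron-style timing
  intervalMinutes?: number;            // or interval-based timing
  enabled: boolean;
}

// Example: weekly entity graph refresh, expanded to every tracked ticker.
const entityGraphTemplate: JobTemplate = {
  id: "tpl-entity-graph",
  agent: "query-strategy",
  task: "entity-graph-refresh",
  defaultParams: { depth: 2 },
  expansion: { over: "tickers" },
};

const weeklyRefresh: Schedule = {
  id: "sched-entity-graph-weekly",
  jobTemplateId: entityGraphTemplate.id,
  cron: "0 3 * * 1",                   // Mondays at 03:00
  enabled: true,
};
```

The Data Collection Agent's interval-based schedules follow the same pattern, using intervalMinutes instead of cron.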

Data Collection Agent

Triggers:

  • Schedule
  • Manual trigger

Schedule:

  • Schedules are created by admins via the admin interface
  • Each schedule references a JobTemplate for the Data Collection agent
  • Parameter expansion can be configured to process multiple tickers
  • Runs independently and doesn't wait for Query Strategy
  • Example: Interval-based schedule (e.g., every 4 hours) with ticker expansion

Newsletter Generation Pipeline

Triggers:

  • Schedule
  • Manual trigger

Schedule:

  • Pipeline is defined in the Pipeline table by admins
  • Schedule references the pipeline (stored in Schedule table)
  • Admin-configurable timing (cron, interval, or on-demand)
  • Parameter expansion can be configured (e.g., expand to all user-ticker subscriptions)
  • User-defined schedules can be created per user-ticker subscription
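
A per-user-ticker newsletter schedule might be represented along these lines; again, the field names are illustrative only.

```typescript
// Illustrative only: a Pipeline row plus a schedule that references it and
// expands over every active user-ticker subscription.
interface Pipeline {
  id: string;
  name: string;
  stages: string[];                    // e.g. analysis -> content -> QA -> delivery
}

interface PipelineSchedule {
  id: string;
  pipelineId: string;
  cron: string;
  expansion?: { over: "user-ticker-subscriptions" };
}

const newsletterPipeline: Pipeline = {
  id: "pipe-newsletter",
  name: "Newsletter Generation",
  stages: ["analysis", "content-generation", "qa", "delivery"],
};

const dailyNewsletter: PipelineSchedule = {
  id: "sched-newsletter-daily",
  pipelineId: newsletterPipeline.id,
  cron: "0 6 * * *",                   // every day at 06:00
  expansion: { over: "user-ticker-subscriptions" },
};
```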

Event-Driven Communication

Data Flow & Storage

Note: All agent outputs written to the database include an agentVersion field that specifies the semantic version (e.g., "1.2.3") of the agent that generated the output. This enables traceability, debugging, and supports the agent versioning and experimentation system. Agents read their active version from the AgentVersionDeployment table during initialization and include it in all outputs.
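
A sketch of how an agent might read its deployed version once at startup and stamp it onto every output row. The version-store interface and output shape are assumptions, not the actual data layer.

```typescript
// Minimal sketch: read the active semantic version for this agent at
// initialization and attach it to every record the agent writes.
interface VersionStore {
  getActiveVersion(agentName: string): Promise<string>; // e.g. "1.2.3"
}

class VersionedAgent {
  private agentVersion = "unknown";

  constructor(private readonly name: string, private readonly versions: VersionStore) {}

  async init(): Promise<void> {
    // Reads the active version from the AgentVersionDeployment table
    // (via whatever data layer the project uses).
    this.agentVersion = await this.versions.getActiveVersion(this.name);
  }

  // Every output row carries the version that produced it.
  protected stamp<T extends object>(output: T): T & { agentVersion: string } {
    return { ...output, agentVersion: this.agentVersion };
  }
}
```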

Key Design Principles

1. Loose Coupling

  • Agents don't directly call each other
  • Agents don't receive data from other agents
  • All communication through shared database
  • Agents read from and write to the database independently
  • Agents can be updated/deployed independently
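
For example, the Data Collection Agent writes articles to the database and the analysis stage later reads them back on its own schedule; neither calls the other. The repository interface below is illustrative.

```typescript
// Illustrative repository interface: both sides depend only on the shared
// store, never on each other.
interface Article {
  id: string;
  ticker: string;
  collectedAt: Date;
  body: string;
  agentVersion: string;
}

interface ArticleStore {
  save(article: Article): Promise<void>;
  findSince(ticker: string, since: Date): Promise<Article[]>;
}

// Data Collection writes...
async function collect(store: ArticleStore, article: Article): Promise<void> {
  await store.save(article);
}

// ...and Analysis reads independently, on its own schedule.
async function loadRecent(store: ArticleStore, ticker: string): Promise<Article[]> {
  const cutoff = new Date(Date.now() - 4 * 60 * 60 * 1000); // e.g. last 4 hours
  return store.findSince(ticker, cutoff);
}
```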

2. Independent Scheduling

  • The Scheduler/Orchestrator manages all schedules (it is not an agent)
  • Each agent has its own optimal schedule (admin-configurable)
  • Schedules are stored in the database (Schedule table)
  • Job templates define agent parameters and expansion rules
  • Query Strategy: Admin-configurable schedules (e.g., weekly entity graph, daily optimization)
  • Data Collection: Admin-configurable intervals (e.g., every 1-4 hours)
  • Newsletter Generation: Admin-configurable pipeline schedules
  • Learning: Admin-configurable schedule (e.g., daily at midnight)
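
A rough sketch of how the Scheduler/Orchestrator might turn a due schedule into queued jobs, expanding one job per ticker. The queue name, record shapes, and expansion logic are assumptions; BullMQ is the queue named later in this document.

```typescript
import { Queue } from "bullmq";

// Assumed shape of a schedule that has come due (see the earlier sketch).
interface DueSchedule {
  agent: string;
  task: string;
  defaultParams: Record<string, unknown>;
  expandOverTickers: boolean;
}

const jobs = new Queue("agent-jobs", { connection: { host: "localhost", port: 6379 } });

// When a schedule fires, expand it into one job per ticker (or a single job).
async function enqueue(schedule: DueSchedule, tickers: string[]): Promise<void> {
  const targets = schedule.expandOverTickers ? tickers : [undefined];
  for (const ticker of targets) {
    await jobs.add(schedule.task, {
      agent: schedule.agent,
      params: { ...schedule.defaultParams, ...(ticker ? { ticker } : {}) },
    });
  }
}
```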

3. Data Freshness Management

  • Data Collection Agent maintains fresh data in database
  • Newsletter pipeline checks data freshness before analysis
  • Can trigger immediate collection if data is stale
  • Timestamps on all data for freshness tracking
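
A minimal freshness check of the kind the newsletter pipeline might run before analysis; the 4-hour threshold and store interface are assumptions.

```typescript
// Sketch: decide whether to reuse existing data or request a fresh collection.
interface FreshnessStore {
  latestCollectionTime(ticker: string): Promise<Date | null>;
}

const MAX_AGE_MS = 4 * 60 * 60 * 1000; // assumed staleness threshold

async function ensureFreshData(
  store: FreshnessStore,
  ticker: string,
  triggerCollection: (ticker: string) => Promise<void>,
): Promise<void> {
  const last = await store.latestCollectionTime(ticker);
  const stale = !last || Date.now() - last.getTime() > MAX_AGE_MS;
  if (stale) {
    // Trigger an immediate collection run; the pipeline can still proceed
    // with existing data if this fails (see Error Handling below).
    await triggerCollection(ticker);
  }
}
```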

4. Scalability

  • Agent Instances: Multiple instances of the same agent version can run in parallel for horizontal scaling
  • Job Distribution: Orchestrator distributes jobs across available agent instances using load balancing
  • Job Division: Large jobs (e.g., 100 keywords) can be divided into smaller sub-jobs and distributed across instances
  • Instance Management: Each agent instance registers itself and reports capacity/load for intelligent job distribution
  • Database connection pooling
  • Caching layer for frequently accessed data
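
Dividing a large job can be as simple as chunking its parameter list before enqueueing, as in the sketch below; the chunk size is an illustrative value.

```typescript
// Sketch: split a 100-keyword collection job into sub-jobs of at most 20
// keywords so multiple agent instances can work on it in parallel.
function divideJob<T>(items: T[], chunkSize = 20): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// Example: 100 keywords -> 5 sub-jobs, each enqueued separately.
const subJobs = divideJob(["earnings", "guidance" /* ... */], 20);
```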

5. Reliability

  • Failed jobs are retried with exponential backoff
  • Agents continue running even if others fail
  • Data is never lost (persisted before processing)
  • Circuit breakers for external API calls

6. Modularity

  • Each agent is a separate service/worker
  • Can be deployed independently
  • Configuration stored in database (hot-reloadable)
  • Versioned agent deployments
  • Plugin-based architecture: Analysis types are plugins registered in the database, allowing addition/removal without code changes
  • Dynamic component loading: Agents discover and load components (analysis types, sections) at runtime
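
A sketch of database-registered analysis plugins being discovered and run at runtime; the registry shape and loading mechanism are assumptions.

```typescript
// Illustrative plugin registry: analysis types are rows in the database,
// and the Analysis agent loads whichever ones are currently enabled.
interface AnalysisPlugin {
  name: string;
  analyze(articles: string[]): Promise<Record<string, unknown>>;
}

interface PluginRegistry {
  // Returns the enabled analysis types registered in the database.
  listEnabled(): Promise<AnalysisPlugin[]>;
}

async function runAnalyses(registry: PluginRegistry, articles: string[]) {
  const plugins = await registry.listEnabled();
  const results: Record<string, unknown> = {};
  for (const plugin of plugins) {
    results[plugin.name] = await plugin.analyze(articles);
  }
  return results;
}
```

Adding a new analysis type is then a matter of registering a new plugin row; no orchestration code changes are required.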

Error Handling & Resilience

Per-Agent Error Handling

  • Query Strategy: Retries entity discovery, falls back to cached entity graph, logs errors to database
  • Data Collection: Continues with other sources if one fails, retries with exponential backoff, marks failed sources for health monitoring
  • Analysis: Returns partial results if some analysis types fail, logs errors per plugin, continues with successful analyses
  • Content Generation: Retries with different prompts on failure (up to max retries), falls back to simpler templates if needed
  • QA: Flags for manual review if repeated failures, writes detailed error information to database
  • Delivery: Retries with exponential backoff, dead-letter queue for permanently failed deliveries, tracks bounce rates
  • Learning: Continues analysis even if some metrics are unavailable, logs warnings for incomplete data
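
For instance, the Analysis agent's partial-results behavior might look roughly like this; the plugin and logging interfaces are assumptions.

```typescript
// Sketch: run every analysis plugin, keep successful results, and log
// per-plugin failures instead of failing the whole job.
async function runWithPartialResults(
  plugins: { name: string; analyze(): Promise<unknown> }[],
  logError: (plugin: string, err: unknown) => Promise<void>,
): Promise<Record<string, unknown>> {
  const results: Record<string, unknown> = {};
  for (const plugin of plugins) {
    try {
      results[plugin.name] = await plugin.analyze();
    } catch (err) {
      await logError(plugin.name, err); // recorded per plugin in the database
    }
  }
  return results; // may be partial; downstream stages handle missing keys
}
```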

Job Queue Error Handling

  • BullMQ Retry Logic: All jobs have configurable retry attempts with exponential backoff
  • Failed Jobs: Jobs that exceed max retries are moved to dead-letter queue for manual review
  • Job Dependencies: Failed jobs don't block dependent jobs indefinitely; dependencies can time out
  • Error Logging: All errors are logged to database with context for debugging
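
With BullMQ, retry-with-exponential-backoff is configured per job. The attempt count, delay, queue name, and payload below are illustrative values, not the project's actual settings.

```typescript
import { Queue } from "bullmq";

const queue = new Queue("agent-jobs", { connection: { host: "localhost", port: 6379 } });

// Illustrative retry settings: up to 5 attempts with exponential backoff
// (1s, 2s, 4s, ...); failed jobs are kept so they can be reviewed manually.
async function enqueueCollection(ticker: string): Promise<void> {
  await queue.add(
    "data-collection",
    { ticker },
    {
      attempts: 5,
      backoff: { type: "exponential", delay: 1000 },
      removeOnFail: false,
    },
  );
}
```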

System-Level Resilience

  • Database Failures: Agents queue jobs for retry when DB is available, use connection pooling with retries
  • Queue Failures: Jobs persisted to database as backup, Redis replication for high availability
  • Agent Failures: Other agents continue operating independently, failed agent jobs are retried automatically
  • Data Staleness: Newsletter pipeline triggers fresh collection if needed, can proceed with existing data if collection fails
  • Partial Failures: System continues operating with partial data (e.g., some analysis types fail but others succeed)
  • Circuit Breakers: External API calls use circuit breakers to prevent cascade failures
  • Health Monitoring: All agents report health status, failed agents are automatically restarted
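
A minimal circuit breaker around an external API call might look like the sketch below; the thresholds are illustrative, and a library such as opossum could be used instead of hand-rolling one.

```typescript
// Minimal circuit breaker sketch: after N consecutive failures, stop calling
// the external API for a cool-down period instead of letting errors cascade.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private maxFailures = 5, private coolDownMs = 60_000) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.failures >= this.maxFailures && Date.now() - this.openedAt < this.coolDownMs) {
      throw new Error("circuit open: skipping external call");
    }
    try {
      const result = await fn();
      this.failures = 0; // success closes the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.maxFailures) this.openedAt = Date.now();
      throw err;
    }
  }
}
```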

Configuration & Monitoring

Agent Configuration

  • All configs stored in AgentConfig table
  • Hot-reloadable without restart
  • Per-ticker, per-user, and system-wide configs
  • Versioned for rollback capability
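
Hot reloading can be as simple as re-reading the relevant AgentConfig row on an interval and swapping the in-memory copy; the interval and config shape below are assumptions.

```typescript
// Sketch: periodically refresh config from the AgentConfig table so changes
// take effect without restarting the agent.
interface ConfigStore {
  load(agent: string, scope: { ticker?: string; userId?: string }): Promise<Record<string, unknown>>;
}

function watchConfig(
  store: ConfigStore,
  agent: string,
  onChange: (cfg: Record<string, unknown>) => void,
  intervalMs = 30_000,
): NodeJS.Timeout {
  return setInterval(async () => {
    onChange(await store.load(agent, {})); // system-wide scope in this example
  }, intervalMs);
}
```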

Monitoring & Observability

  • Each agent emits metrics (execution time, success rate, data quality)
  • Metrics stored in AgentMetrics table
  • Learning Agent analyzes metrics and optimizes configs
  • Dashboard shows agent health, data freshness, pipeline status
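
A sketch of the metric record an agent might write after each run; the field names are assumptions rather than the actual AgentMetrics schema.

```typescript
// Illustrative metric row written to the AgentMetrics table after each run.
interface AgentMetric {
  agent: string;
  agentVersion: string;
  executionMs: number;
  success: boolean;
  dataQualityScore?: number; // e.g. fraction of sources that returned usable data
  recordedAt: Date;
}

async function recordRun(
  save: (m: AgentMetric) => Promise<void>,
  agent: string,
  agentVersion: string,
  startedAt: number,
  success: boolean,
): Promise<void> {
  await save({
    agent,
    agentVersion,
    executionMs: Date.now() - startedAt,
    success,
    recordedAt: new Date(),
  });
}
```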

Benefits of This Architecture

  1. Scalability: Scale Data Collection independently from Newsletter Generation
  2. Reliability: Data Collection failures don't block Newsletter Generation (uses cached data)
  3. Efficiency: Data Collection runs frequently to keep data fresh, Newsletter uses latest data
  4. Flexibility: Easy to add new agents or modify schedules
  5. Performance: Parallel execution where possible, caching for speed
  6. Maintainability: Clear separation of concerns, easy to debug and update