MediaPulse

Architecture

MediaPulse is a monorepo made up of web apps, supporting API services, one app per agent, and shared packages.

Project Structure

mediapulse/
├── apps/
│   ├── user/                    # Next.js web app for users
│   │   ├── app/
│   │   │   ├── (auth)/
│   │   │   ├── dashboard/
│   │   │   ├── newsletters/
│   │   │   └── api/
│   │   └── components/
│   ├── admin/                   # Next.js web app for admins
│   │   ├── app/
│   │   │   ├── (auth)/
│   │   │   ├── dashboard/
│   │   │   ├── agents/
│   │   │   ├── versions/
│   │   │   ├── config/
│   │   │   ├── users/
│   │   │   ├── monitoring/
│   │   │   └── api/
│   │   └── components/
│   ├── hermes-scheduler/        # Long-running app that executes schedules and orchestrates agents
│   │   └── src/
│   │       ├── scheduler.ts     # Main scheduler implementation
│   │       ├── agent-client.ts  # HTTP client for agent invocations
│   │       └── ...
│   ├── agent-auth-api/          # Server (Next.js or other) to manage API keys used by agents
│   │   └── src/
│   │       ├── api/
│   │       │   ├── add.ts      # Add a new API key
│   │       │   ├── delete.ts   # Delete an API key
│   │       │   ├── list.ts     # List all API keys
│   │       │   ├── revoke.ts   # Revoke an API key
│   │       │   ├── rotate.ts   # Rotate an API key
│   │       │   ├── view.ts     # View an API key
│   │       │   ├── update.ts   # Update an API key
│   │       │   ├── enable.ts   # Enable an API key
│   │       │   ├── disable.ts  # Disable an API key
│   │       │   └── ...
│   │       └── ...
│   ├── agent-registry-api/      # Server (Next.js or other) with endpoints for agents to report status
│   │   └── src/
│   │       ├── api/
│   │       │   ├── register.ts  # Agent registration endpoint
│   │       │   ├── heartbeat.ts # Agent heartbeat/status reporting
│   │       │   └── ...
│   │       └── ...
│   ├── agent-data-api/          # Server (Next.js or other) with endpoints for agents to read and write data
│   │   └── src/
│   │       ├── api/
│   │       │   ├── read.ts    # Read data endpoint
│   │       │   ├── write.ts   # Write data endpoint
│   │       │   └── ...
│   │       └── ...
│   ├── query-strategy/          # Single endpoint server for query strategy agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       └── ...
│   ├── data-collection/         # Single endpoint server for data collection agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       └── ...
│   ├── analysis-agent/          # Single endpoint server for analysis agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       ├── plugins/         # Analysis plugin implementations
│   │       └── ...
│   ├── content-generation/      # Single endpoint server for content generation agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       └── ...
│   ├── quality-assurance/       # Single endpoint server for QA agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       └── ...
│   ├── delivery/                # Single endpoint server for delivery agent
│   │   └── src/
│   │       ├── index.ts         # Agent implementation
│   │       └── ...
│   └── learning/                # Single endpoint server for learning agent
│       └── src/
│           ├── index.ts         # Agent implementation
│           └── ...
├── packages/
│   ├── shared/                  # Shared types and utilities
│   ├── database/                # Prisma schema and migrations
│   ├── agents/                  # Shared agent utilities and base classes (optional)
│   ├── ai/                      # AI/LLM integration layer
│   └── email/                   # Email templates and sending
├── prisma/
│   └── schema.prisma
└── package.json                 # Turborepo monorepo config

System Components

Hermes (Orchestrator/Scheduler)

Hermes, the orchestrator/scheduler, is not an agent; it is a core system component that orchestrates agent execution. It:

  • Discovers agents dynamically from the database
  • Automatically spawns and manages agent instances based on demand and load
  • Scales instances up/down automatically (creates new instances when needed, terminates idle ones)
  • Executes schedules and pipelines
  • Distributes jobs across agent instances using load balancing
  • Monitors job status and handles retries
  • Manages job division (splitting large jobs into smaller sub-jobs)
  • Monitors instance health and automatically replaces failed instances

The orchestrator is a separate service (apps/hermes-scheduler/) that invokes agents via HTTP endpoints. See Scheduler/Orchestrator Documentation for details.
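The job division step listed above can be pictured as chunking a job's work items into sub-jobs. This is a minimal sketch, assuming a job carries a list of tickers; the `CollectionJob` shape, the `maxTickersPerJob` limit, and the sub-job ID scheme are hypothetical and not taken from the Hermes implementation.

```typescript
// Hypothetical job shape; the real job payload is not specified in this document.
interface CollectionJob {
  id: string;
  agentId: string;
  tickers: string[];
}

// Split one large job into sub-jobs of at most `maxTickersPerJob` tickers each.
function divideJob(job: CollectionJob, maxTickersPerJob: number): CollectionJob[] {
  if (!Number.isInteger(maxTickersPerJob) || maxTickersPerJob < 1) {
    throw new Error("maxTickersPerJob must be a positive integer");
  }
  const subJobs: CollectionJob[] = [];
  for (let i = 0; i < job.tickers.length; i += maxTickersPerJob) {
    subJobs.push({
      id: `${job.id}-part-${subJobs.length + 1}`, // assumed naming scheme
      agentId: job.agentId,
      tickers: job.tickers.slice(i, i + maxTickersPerJob),
    });
  }
  return subJobs;
}
```

Each sub-job keeps the parent's `agentId`, so it can be distributed to any instance of the same agent type.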

Agent Auth API

The Agent Auth API (apps/agent-auth-api/) is a server that manages the API keys used by agents to authenticate themselves.

  • POST /api/add/ - Add a new API key
  • POST /api/delete/ - Delete an API key
  • POST /api/list/ - List all API keys
  • POST /api/revoke/ - Revoke an API key
  • POST /api/rotate/ - Rotate an API key
  • POST /api/view/ - View an API key
  • POST /api/update/ - Update an API key
  • POST /api/enable/ - Enable an API key
  • POST /api/disable/ - Disable an API key
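Because every endpoint above is a POST under /api/, a caller could build requests with one small helper. The sketch below only constructs a request description; the base URL, the JSON payload shape, and the `buildAuthApiRequest` name are assumptions, since the request and response formats are not defined here.

```typescript
// Actions taken from the endpoint list above; every one of them is a POST.
const AUTH_API_ACTIONS = [
  "add", "delete", "list", "revoke", "rotate", "view", "update", "enable", "disable",
] as const;
type AuthApiAction = (typeof AUTH_API_ACTIONS)[number];

interface AuthApiRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build a request description for one action.
// `baseUrl` and the payload shape are assumptions made for illustration.
function buildAuthApiRequest(
  baseUrl: string,
  action: AuthApiAction,
  payload: Record<string, unknown> = {},
): AuthApiRequest {
  const trimmed = baseUrl.replace(/\/+$/, ""); // avoid a double slash before /api
  return {
    method: "POST",
    url: `${trimmed}/api/${action}/`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

For example, `buildAuthApiRequest("http://localhost:3000", "rotate", { keyId: "key_123" })` targets `http://localhost:3000/api/rotate/`; the host and the `keyId` field are placeholders.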

Agent Registry API

The Agent Registry API (apps/agent-registry-api/) is a server that provides endpoints for agents to:

  • Register themselves when they start up (agent type, version, endpoint URL, capacity)
  • Report status via heartbeat mechanism (current load, health status, availability)
  • Update capacity and load information in real-time
  • Deregister when shutting down

The orchestrator uses this API to:

  • Discover available agent instances
  • Monitor agent health and availability
  • Track instance capacity and current load for load balancing
  • Automatically detect and handle failed instances

The Agent Registry API maintains the AgentInstance table in the database, which tracks all running agent instances and their status. This enables the orchestrator to make informed decisions about job distribution and instance management.
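The register, heartbeat, and deregister operations described above can be sketched with an in-memory stand-in for the AgentInstance table. Field and method names here are assumptions chosen to mirror the bullets; the real API persists to the database and its payloads are not specified in this document.

```typescript
// Fields follow the registration and heartbeat bullets above; names are assumptions.
interface RegisteredInstance {
  instanceId: string;
  agentId: string;
  agentVersion: string;
  endpointUrl: string;
  capacity: number;
  currentLoad: number;
  healthy: boolean;
  lastHeartbeat: number; // epoch milliseconds
}

type Registration = Pick<
  RegisteredInstance,
  "instanceId" | "agentId" | "agentVersion" | "endpointUrl" | "capacity"
>;

class InMemoryAgentRegistry {
  private readonly instances = new Map<string, RegisteredInstance>();

  // Called when an agent instance starts up.
  register(reg: Registration, now: number): void {
    this.instances.set(reg.instanceId, {
      ...reg,
      currentLoad: 0,
      healthy: true,
      lastHeartbeat: now,
    });
  }

  // Returns false when the instance is unknown (for example, already deregistered).
  heartbeat(instanceId: string, currentLoad: number, healthy: boolean, now: number): boolean {
    const instance = this.instances.get(instanceId);
    if (!instance) return false;
    instance.currentLoad = currentLoad;
    instance.healthy = healthy;
    instance.lastHeartbeat = now;
    return true;
  }

  // Called when an agent instance shuts down.
  deregister(instanceId: string): boolean {
    return this.instances.delete(instanceId);
  }

  // What the orchestrator would query to discover instances of one agent type.
  list(agentId: string): RegisteredInstance[] {
    return Array.from(this.instances.values()).filter((i) => i.agentId === agentId);
  }
}
```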

Agent Data API

The Agent Data API (apps/agent-data-api/) is a server that provides endpoints through which agents read and write data. Each agent submits its output to a dedicated endpoint:

  • POST /api/query-strategy/ - The output of the Query Strategy Agent
  • POST /api/data-collection/ - The output of the Data Collection Agent
  • POST /api/analysis-agent/ - The output of the Analysis Agent
  • POST /api/content-generation/ - The output of the Content Generation Agent
  • POST /api/quality-assurance/ - The output of the Quality Assurance Agent
  • POST /api/delivery/ - The output of the Delivery Agent
  • POST /api/learning/ - The output of the Learning Agent

Agents

Agents are language-agnostic services that expose HTTP endpoints. Each agent type has its own dedicated app:

  • apps/query-strategy/ - Query Strategy Agent
  • apps/data-collection/ - Data Collection Agent
  • apps/analysis-agent/ - Analysis Agent
  • apps/content-generation/ - Content Generation Agent
  • apps/quality-assurance/ - Quality Assurance Agent
  • apps/delivery/ - Delivery Agent
  • apps/learning/ - Learning Agent

Agent Versioning

  • Multiple versions exist per agent type (stored in AgentVersion table)
  • Only one version can be active in production per agent type (determined by AgentVersionDeployment table)
  • When a new version is promoted to production, it replaces the previous production version
  • Agents read their active version from AgentVersionDeployment during initialization
  • All agent outputs include an agentVersion field for traceability
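The rule that only one version per agent type is active in production can be illustrated with a pure function over deployment rows. This is a sketch under assumptions: the row shape and the "staging" environment value are hypothetical, and a real implementation would apply the change inside a database transaction.

```typescript
// "staging" is an assumed non-production value; only 'production' appears in this document.
type Environment = "production" | "staging";

interface VersionDeployment {
  agentId: string;
  version: string;
  environment: Environment;
}

// Promote `version`, replacing whatever was in production for the same agent type.
function promoteToProduction(
  deployments: VersionDeployment[],
  agentId: string,
  version: string,
): VersionDeployment[] {
  const kept = deployments.filter(
    (d) => !(d.agentId === agentId && d.environment === "production"),
  );
  return [...kept, { agentId, version, environment: "production" }];
}

// What an agent would look up during initialization.
function activeVersion(deployments: VersionDeployment[], agentId: string): string | undefined {
  return deployments.find(
    (d) => d.agentId === agentId && d.environment === "production",
  )?.version;
}
```

Rows for other agent types and other environments pass through untouched, so promoting one agent never affects another.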

Agent Instances

  • The orchestrator automatically spawns and manages agent instances - admins do not need to manually start instances
  • Multiple instances of the same agent version can run in parallel for horizontal scaling
  • The orchestrator automatically:
    • Spawns new instances when job queue grows or existing instances are at capacity
    • Scales down instances when load decreases (terminates idle instances)
    • Replaces failed or unhealthy instances automatically
    • Monitors instance health and capacity
  • Each instance is tracked in the AgentInstance table with:
    • Unique instance ID (generated by orchestrator)
    • Agent type and version it's running (from AgentVersionDeployment)
    • Instance-specific endpoint URL (assigned by orchestrator)
    • Capacity (max concurrent jobs)
    • Current load tracking
  • The orchestrator distributes jobs across available instances using load balancing
  • Instance lifecycle is fully managed by the orchestrator
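The scale-up and scale-down rules above could be expressed as a pure decision function. The sketch assumes the orchestrator knows how many jobs are queued; the `decideScaling` name, the `minInstances` floor, and the sizing heuristic are illustrative assumptions rather than Hermes's actual policy.

```typescript
interface InstanceLoad {
  id: string;
  capacity: number;    // max concurrent jobs
  currentLoad: number; // jobs currently running
}

type ScalingDecision =
  | { action: "spawn"; count: number }
  | { action: "terminate"; instanceIds: string[] }
  | { action: "none" };

function decideScaling(
  instances: InstanceLoad[],
  queuedJobs: number,
  minInstances = 1, // assumed floor so the pool never drops to zero
): ScalingDecision {
  const freeSlots = instances.reduce(
    (sum, i) => sum + Math.max(0, i.capacity - i.currentLoad),
    0,
  );

  // Scale up: the queue does not fit into the remaining capacity.
  if (queuedJobs > freeSlots) {
    // Assume a new instance matches the largest existing capacity (or 1 if none exist).
    const perInstance = Math.max(1, ...instances.map((i) => i.capacity));
    return { action: "spawn", count: Math.ceil((queuedJobs - freeSlots) / perInstance) };
  }

  // Scale down: nothing is queued, so idle instances above the floor can go.
  if (queuedJobs === 0) {
    const idle = instances.filter((i) => i.currentLoad === 0).map((i) => i.id);
    const removable = idle.slice(0, Math.max(0, instances.length - minInstances));
    if (removable.length > 0) return { action: "terminate", instanceIds: removable };
  }

  return { action: "none" };
}
```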

Database Schema

Key entities:

  • User - Recipients with preferences
  • Ticker - Company identifiers being monitored
  • UserTicker - User-company subscriptions with custom schedules
  • DataSource - Scraped articles, social posts, market data
  • Newsletter - Generated newsletter instances
  • NewsletterContent - Sections and insights within newsletters
  • UserFeedback - Ratings, clicks, engagement metrics
  • AgentMetrics - Performance tracking for self-improvement
  • ABTest - A/B testing configurations and results
  • AgentVersion - Agent version history and metadata (configs include prompts embedded within them)
  • AgentVersionDeployment - Production deployment assignments (which version is active) - Only one version per agentId can have environment: 'production'
  • AgentExperiment - Experimental runs and version comparisons
  • AgentValidation - Validation checks before version promotion
  • AnalysisTypeRegistry - Registered analysis types and their metadata
  • APIKey - API keys used by agents to authenticate themselves

Orchestrator Tables

  • AgentRegistry - Agent type metadata and registration (agents register themselves here with their HTTP endpoints); the metadata is per agent type, not per version
  • AgentInstance - Running agent instances (tracks multiple instances of the same agent version)
  • AgentJobExecution - Tracks individual agent job executions and status
  • Schedule - Admin-created schedules (cron, interval, or on-demand)
  • JobTemplate - Reusable job definitions with default parameters and expansion config
  • Pipeline - Multi-agent workflow definitions with dependencies
  • ScheduleExecution - Schedule execution history and tracking

AgentInstance Table

The AgentInstance table tracks running instances of agents:

interface AgentInstance {
  id: string;                    // Unique instance ID (e.g., 'data-collection-instance-1')
  agentId: string;                // Agent type (e.g., 'data-collection')
  agentVersion: string;           // Version this instance runs (e.g., '1.2.3')
  endpoint: {
    url: string;                   // Instance-specific endpoint URL
    method: 'POST';
    timeout: number;
  };
  status: 'active' | 'inactive' | 'unhealthy';
  capacity: number;                // Max concurrent jobs this instance can handle
  currentLoad: number;             // Current number of running jobs
  lastHeartbeat: Date;             // Last health check timestamp
  metadata?: {
    region?: string;
    zone?: string;
    deployment?: string;           // 'container', 'lambda', etc.
  };
  createdAt: Date;
  updatedAt: Date;
}

Key Points:

  • Multiple instances of the same agent version can exist simultaneously
  • The orchestrator uses this table to discover available instances and distribute jobs
  • Instances update their currentLoad and lastHeartbeat to indicate availability
  • Instances with stale heartbeats are considered unavailable
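One way the orchestrator could use these fields when distributing a job is to filter out unavailable instances and pick the least-loaded one. This is a sketch, not the scheduler's actual algorithm: the 30-second staleness threshold and the lowest-load-ratio rule are assumptions.

```typescript
type InstanceStatus = "active" | "inactive" | "unhealthy";

// Subset of the AgentInstance fields needed for job placement.
interface InstanceSnapshot {
  id: string;
  status: InstanceStatus;
  capacity: number;
  currentLoad: number;
  lastHeartbeat: Date;
}

function pickInstance(
  instances: InstanceSnapshot[],
  now: Date,
  staleAfterMs = 30000, // assumed threshold; the document does not define "stale"
): InstanceSnapshot | undefined {
  const available = instances.filter(
    (i) =>
      i.status === "active" &&
      i.currentLoad < i.capacity &&
      now.getTime() - i.lastHeartbeat.getTime() <= staleAfterMs,
  );
  // Lowest load ratio first, so larger instances absorb proportionally more work.
  available.sort((a, b) => a.currentLoad / a.capacity - b.currentLoad / b.capacity);
  return available[0];
}
```

The function returns `undefined` when no instance qualifies, which is the point at which an orchestrator might queue the job or spawn a new instance.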

Analysis Plugin System

The Analysis Agent uses a plugin-based architecture in which analysis types are registered, configured, and enabled or disabled through the database, so changing which analyses run does not require a code deployment.

Important Distinction: Plugin metadata (schemas, templates, enable/disable flags) is stored in the database, but plugin implementation code must be part of the codebase. The database registry controls which plugins are active, but does not store executable code.

Plugin Registry

Analysis types are registered in the AnalysisTypeRegistry database table, which stores:

  • Analysis Type Metadata:
    • id: string - Unique identifier (e.g., 'sentiment', 'competitive', 'event')
    • name: string - Human-readable name
    • version: string - Plugin version
    • description: string - What the analysis does
    • enabled: boolean - Whether the analysis type is active
    • configSchema: object - Zod schema for configuration validation (JSON representation)
    • outputSchema: object - Zod schema for output validation (JSON representation)
    • sectionTemplate?: string - Optional template for content generation

Dynamic Loading Mechanism

  1. Plugin Discovery: On initialization, the Analysis Agent queries AnalysisTypeRegistry for all enabled analysis types
  2. Plugin Code Loading: Plugin implementations are loaded from the codebase (not from database) based on the registry IDs. See Plugin Code Location below for details.
  3. Configuration Loading: Each registered analysis type loads its configuration from AgentConfig
  4. Runtime Execution: Analysis types are executed in parallel based on the analysisTypes parameter
  5. Hot Reload: Changes to the registry (enable/disable) or configurations trigger agent re-initialization without restart
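Step 4 (parallel execution driven by the analysisTypes parameter) might look like the sketch below. The `AnalysisRunner` and `AnalysisOutcome` shapes are simplified stand-ins, and isolating failures so that one failing plugin does not discard the others' results is a design assumption, not documented behavior.

```typescript
// Simplified stand-in for a loaded plugin; only what this sketch needs.
interface AnalysisRunner {
  id: string;
  execute: (data: unknown) => Promise<unknown>;
}

interface AnalysisOutcome {
  id: string;
  ok: boolean;
  result?: unknown;
  error?: string;
}

// Run the requested analysis types concurrently and collect one outcome per plugin.
async function runAnalyses(
  plugins: AnalysisRunner[],
  analysisTypes: string[],
  data: unknown,
): Promise<AnalysisOutcome[]> {
  const requested = new Set(analysisTypes);
  const selected = plugins.filter((p) => requested.has(p.id));

  return Promise.all(
    selected.map(async (p): Promise<AnalysisOutcome> => {
      try {
        return { id: p.id, ok: true, result: await p.execute(data) };
      } catch (err) {
        // Capture the failure instead of rejecting the whole batch.
        return { id: p.id, ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    }),
  );
}
```

Outcomes come back in the order of the `plugins` array, because `Promise.all` preserves input order regardless of which plugin finishes first.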

Plugin Code Location

Analysis plugin implementation code must be located in the codebase. The recommended structure is:

apps/
  analysis-agent/              # Analysis Agent app
    src/
      plugins/
        sentiment/
          index.ts             # Plugin implementation
          schemas.ts           # Config and output schemas
        competitive/
          index.ts
          schemas.ts
        event/
          index.ts
          schemas.ts
        index.ts               # Plugin registry/loader

Plugin Discovery Process:

  1. Registry Query: Analysis Agent queries AnalysisTypeRegistry for enabled plugins (gets plugin IDs like 'sentiment', 'competitive', 'event')
  2. Code Mapping: Agent maps registry IDs to code locations:
    • Convention: plugins/{pluginId}/index.ts exports the plugin implementation
    • Or: Explicit mapping in plugin registry file
  3. Dynamic Import: Agent dynamically imports plugin code using the mapped path
  4. Validation: Agent validates that imported plugin implements AnalysisPlugin interface
  5. Registration: Valid plugins are registered and available for execution

Example Plugin Loader:

// plugins/index.ts
import type { AnalysisPlugin } from '@mediapulse/types'

// A plugin module may expose the plugin as its default export or as named exports
type PluginModule = { default?: AnalysisPlugin } & Partial<AnalysisPlugin>

const pluginRegistry = new Map<string, () => Promise<PluginModule>>()

// Register plugin loaders (each plugin is imported lazily, on first load)
pluginRegistry.set('sentiment', () => import('./sentiment'))
pluginRegistry.set('competitive', () => import('./competitive'))
pluginRegistry.set('event', () => import('./event'))

// Load one plugin by registry ID; returns null if no loader is registered
export async function loadPlugin(pluginId: string): Promise<AnalysisPlugin | null> {
  const loader = pluginRegistry.get(pluginId)
  if (!loader) {
    console.warn(`Plugin ${pluginId} not found in registry`)
    return null
  }

  const mod = await loader()
  const plugin = mod.default ?? mod

  // Validate the required members of the AnalysisPlugin interface
  if (
    typeof plugin.id !== 'string' ||
    typeof plugin.execute !== 'function' ||
    typeof plugin.validateConfig !== 'function' ||
    typeof plugin.getOutputSchema !== 'function'
  ) {
    throw new Error(`Plugin ${pluginId} does not implement AnalysisPlugin interface`)
  }

  return plugin as AnalysisPlugin
}

// Load several plugins in parallel, skipping IDs that have no registered loader
export async function loadAllPlugins(pluginIds: string[]): Promise<AnalysisPlugin[]> {
  const plugins = await Promise.all(pluginIds.map(id => loadPlugin(id)))
  return plugins.filter((p): p is AnalysisPlugin => p !== null)
}

Adding New Plugins:

  1. Create plugin directory: plugins/{your-plugin-id}/
  2. Implement plugin in index.ts following AnalysisPlugin interface
  3. Add plugin loader to plugins/index.ts registry
  4. Register plugin metadata in AnalysisTypeRegistry database table
  5. Deploy code changes
  6. Plugin is automatically available after agent re-initialization

Note: The plugin registry file (plugins/index.ts) can be auto-generated from the AnalysisTypeRegistry table if desired, or manually maintained. The key requirement is that plugin code exists in the codebase and can be imported by the Analysis Agent.

Plugin Interface

All analysis plugins must implement the standard interface:

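// CollectedData, AnalysisConfig, and AnalysisResult are project types defined elsewhere
interface AnalysisPlugin {
  id: string                        // Matches the id registered in AnalysisTypeRegistry
  name: string
  version: string
  execute: (data: CollectedData, config: AnalysisConfig) => Promise<AnalysisResult>
  validateConfig: (config: unknown) => boolean // unknown: the plugin must check the shape itself
  getOutputSchema: () => ZodSchema
  getSectionTemplate?: () => string // Optional: for content generation
}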
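To make the interface concrete, here is a toy plugin that satisfies it. Everything beyond the interface members is an assumption for illustration: the `CollectedData`, `AnalysisConfig`, and `AnalysisResult` shapes are invented stand-ins, `SchemaLike` replaces `ZodSchema` so the sketch has no dependencies, and keyword counting is not how the real sentiment analysis is claimed to work.

```typescript
// Stand-in types: the real shapes are not shown in this document.
type CollectedData = { texts: string[] };
type AnalysisConfig = { positiveWords: string[]; negativeWords: string[] };
type AnalysisResult = { pluginId: string; score: number };
// Minimal stand-in for ZodSchema (Zod schemas expose a parse method).
type SchemaLike = { parse: (value: unknown) => unknown };

interface AnalysisPlugin {
  id: string;
  name: string;
  version: string;
  execute: (data: CollectedData, config: AnalysisConfig) => Promise<AnalysisResult>;
  validateConfig: (config: unknown) => boolean;
  getOutputSchema: () => SchemaLike;
  getSectionTemplate?: () => string;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

const keywordSentimentPlugin: AnalysisPlugin = {
  id: "sentiment",
  name: "Keyword Sentiment (illustrative)",
  version: "0.1.0",

  validateConfig(config) {
    if (typeof config !== "object" || config === null) return false;
    const c = config as Record<string, unknown>;
    return isStringArray(c["positiveWords"]) && isStringArray(c["negativeWords"]);
  },

  // Score = positive keyword hits minus negative keyword hits across all texts.
  // Keywords are expected in lowercase because texts are lowercased before matching.
  async execute(data, config) {
    const positive = new Set(config.positiveWords);
    const negative = new Set(config.negativeWords);
    let score = 0;
    for (const text of data.texts) {
      for (const word of text.toLowerCase().split(/\W+/)) {
        if (positive.has(word)) score += 1;
        if (negative.has(word)) score -= 1;
      }
    }
    return { pluginId: "sentiment", score };
  },

  getOutputSchema() {
    return {
      parse(value: unknown) {
        const v = value as Partial<AnalysisResult> | null;
        if (!v || typeof v.pluginId !== "string" || typeof v.score !== "number") {
          throw new Error("Invalid AnalysisResult");
        }
        return v;
      },
    };
  },
};
```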

Adding New Analysis Types

New analysis types require two steps:

  1. Code Implementation: Write and deploy the plugin implementation code as part of the application
  2. Database Registration: Register the plugin metadata through either route below, then add its configuration:
    • Admin Interface: Register via /admin/agents/analysis-types dashboard
    • Database Direct: Insert into AnalysisTypeRegistry table
    • Configuration: Add analysis type config to AgentConfig for the analysis agent

Once the code is deployed and the metadata is registered, the analysis type becomes available as soon as the agent re-initializes (no restart required). Registration itself does not need a code deployment, but the plugin code must already be deployed.

Configuration Integration

Analysis type configurations are stored in AgentConfig with the following structure:

{
  analysis: {
    enabledTypes: string[], // Which analysis types to run
    [analysisTypeId: string]: {
      // Type-specific configuration
      enabled: boolean,
      // ... type-specific settings
    }
  }
}

The configuration hierarchy applies, from highest to lowest priority:

  • User-specific analysis preferences (can enable/disable specific types)
  • Ticker-specific overrides (different analyses per company)
  • Agent-specific defaults from AgentConfig
  • System-wide defaults from SystemConfig
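Reading the list above as highest priority first (the order given under Configuration Hierarchy later in this document), resolving an effective configuration amounts to layering objects so that more specific layers win. This is a shallow-merge sketch; the `resolveConfig` name and the layer object shapes are assumptions, and nested settings would need a deep merge.

```typescript
type ConfigLayer = Record<string, unknown>;

interface ConfigLayers {
  system: ConfigLayer;  // SystemConfig defaults (lowest priority)
  agent: ConfigLayer;   // AgentConfig defaults
  ticker?: ConfigLayer; // ticker-specific overrides
  user?: ConfigLayer;   // user-specific preferences (highest priority)
}

// Later spreads overwrite earlier ones, so layers are applied lowest priority first.
function resolveConfig(layers: ConfigLayers): ConfigLayer {
  return {
    ...layers.system,
    ...layers.agent,
    ...(layers.ticker ?? {}),
    ...(layers.user ?? {}),
  };
}
```

With `system = { enabled: true, depth: "basic" }`, `agent = { depth: "standard" }`, and `user = { enabled: false }`, the result is `{ enabled: false, depth: "standard" }`; the `enabled` and `depth` keys are placeholders.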

Configuration Management System

All configurations are stored in the database and loaded dynamically; no values are hardcoded.

Agent Output Versioning

All agent outputs include version information for traceability and debugging. Every agent output contains an agentVersion field that specifies the semantic version (e.g., "1.2.3") of the agent that generated the output. This enables:

  • Traceability: Track which version of an agent generated specific outputs
  • Debugging: Identify version-specific issues by correlating outputs with agent versions
  • Audit Trail: Maintain a complete record of which agent version was responsible for each result
  • Experimentation: Compare outputs from different agent versions during A/B testing

Agents read their active version from the AgentVersionDeployment table during initialization and include it in all outputs, including error outputs and partial results.
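A small factory is one way to guarantee that success, partial, and error outputs all carry the same agentVersion. Only the `agentVersion` field comes from this document; the other envelope fields, the `createOutputFactory` name, and the strict MAJOR.MINOR.PATCH check (which rejects pre-release tags that full semantic versioning allows) are assumptions.

```typescript
interface AgentOutput<T> {
  agentId: string;
  agentVersion: string; // semantic version, e.g. "1.2.3"
  status: "success" | "partial" | "error";
  data?: T;
  error?: string;
  generatedAt: string; // ISO 8601 timestamp
}

// Create the factory once at initialization, after reading the active version.
function createOutputFactory(agentId: string, agentVersion: string) {
  if (!/^\d+\.\d+\.\d+$/.test(agentVersion)) {
    throw new Error(`Invalid agent version: ${agentVersion}`);
  }
  const base = () => ({ agentId, agentVersion, generatedAt: new Date().toISOString() });

  return {
    success<T>(data: T): AgentOutput<T> {
      return { ...base(), status: "success", data };
    },
    partial<T>(data: T, error: string): AgentOutput<T> {
      return { ...base(), status: "partial", data, error };
    },
    failure(error: string): AgentOutput<never> {
      return { ...base(), status: "error", error };
    },
  };
}
```

Because the version is captured in the closure, individual handlers cannot forget to include it.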

Configuration Storage

  • Database Tables:

    • AgentConfig - Per-agent configurations (JSONB column with versioning) - Runtime configurations (includes prompts embedded within config)
    • AgentVersion - Agent version snapshots (configs include prompts embedded within them) - Historical versions for rollback
    • SourceConfig - Data source configurations (news sites, APIs, etc.)
    • UserPreferences - User-specific preferences
    • SystemConfig - Global system settings
    • ABTestConfig - A/B testing configurations
    • AnalysisTypeRegistry - Registered analysis types and their metadata (see Analysis Plugin System)
  • Configuration Loading:

    • Configs loaded at agent initialization from database (AgentConfig table)
    • Active version determined by AgentVersionDeployment table
    • Configs cached with TTL (configurable via SystemConfig)
    • Hot-reloadable via admin API endpoint (triggers agent re-initialization)
    • Versioned for rollback capability (via AgentVersion table)
    • Config changes trigger agent re-initialization (no restart required)
  • Configuration vs Versioning:

    • AgentConfig: Runtime configuration that agents read during execution. This is the source of truth at runtime.
    • AgentVersion: Historical snapshots of configurations for version control and rollback. Stores configuration snapshots but does not replace AgentConfig.
    • AgentVersionDeployment: Determines which version is active in production (only one version per agentId can be marked as production)
    • Version Promotion Flow: When a version is promoted to production:
      1. The AgentVersionDeployment table is updated to mark the new version as active
      2. The AgentConfig table is updated to match the version's configuration from AgentVersion.config
      3. Agents reload configuration from the database (hot-reload without restart)
    • Runtime Behavior: Agents always read from AgentConfig at runtime, not directly from AgentVersion. The AgentVersion table serves as a historical record and rollback point, but AgentConfig is the active configuration source.
    • Version Promotion: Only one version per agent type can be active in production at a time. Promoting a new version automatically demotes the previous production version.
  • Configuration Hierarchy (priority order):

    1. User-specific preferences (highest priority)
    2. Ticker-specific overrides
    3. Agent-specific runtime overrides
    4. Agent-specific defaults from AgentConfig table
    5. System-wide defaults from SystemConfig table (lowest priority)
  • Configuration Schema:

    • All configs stored as JSONB in PostgreSQL
    • Schema validation using Zod schemas
    • Type-safe config loading with TypeScript
    • Config migration system for version updates
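The TTL caching and hot-reload behavior under Configuration Loading can be sketched as a small cache with explicit invalidation. This is an illustration under assumptions: the `TtlConfigCache` class is hypothetical, the loader is synchronous for brevity (a real loader would query the database asynchronously), and the clock is injectable only to make the behavior testable.

```typescript
class TtlConfigCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly load: (key: string) => T;
  private readonly now: () => number;

  constructor(
    ttlMs: number,                            // would come from SystemConfig
    load: (key: string) => T,                 // stand-in for a database read
    now: () => number = () => Date.now(),
  ) {
    this.ttlMs = ttlMs;
    this.load = load;
    this.now = now;
  }

  // Return the cached config while it is fresh; otherwise reload and re-cache it.
  get(key: string): T {
    const cached = this.entries.get(key);
    if (cached && cached.expiresAt > this.now()) return cached.value;
    const value = this.load(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Hot reload: drop one entry, or everything when no key is given.
  invalidate(key?: string): void {
    if (key === undefined) this.entries.clear();
    else this.entries.delete(key);
  }
}
```

An admin-triggered reload would call `invalidate("analysis-agent")` so the next `get` reads fresh data instead of waiting for the TTL to expire.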