Milestone 1 - Foundation & Basic Pipeline

Summary

Establish the technical foundation and build the minimal agent framework to support the pipeline.

Deliver the database foundation, a basic agent framework, the API services (Agent Auth, Agent Registry, Agent Data), an agent versioning system, and a minimal Hermes Orchestrator that can discover agents and execute a simple pipeline.

Deliverables

Database & Infrastructure

✅ Monorepo structure with Turborepo configured
✅ PostgreSQL database with Prisma ORM setup
✅ Minimal database schema (only essential tables):
- User - Basic user information
- Ticker - Stock symbols
- UserTicker - User-ticker subscriptions
- Newsletter - Generated newsletters (with createdAt timestamp for freshness tracking)
- NewsletterContent - Newsletter content (simple text field)
- AgentConfig - Basic configuration storage
- AgentRegistry - Agent type metadata (not version-specific)
- AgentVersion - Agent version snapshots and metadata
- AgentVersionDeployment - Active version tracking (only one version per agentId can be production)
- AgentInstance - Running agent instance tracking (capacity, load, status)
- AgentJobExecution - Individual agent job execution tracking
- Schedule - Admin-created schedules
- JobTemplate - Reusable job definitions
- Pipeline - Multi-agent workflow definitions
- ScheduleExecution - Schedule execution history
- APIKey - API keys used by agents to authenticate themselves
- Note: Job tracking is also handled by BullMQ/Redis
✅ Database migrations and seed scripts
✅ Redis setup for BullMQ queue system
✅ Environment variable management
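
The AgentVersionDeployment rule above (only one version per agentId can be production) can be illustrated with a small in-memory sketch. This is not the actual schema: the VersionDeployment shape, the environment field, and both function names are assumptions made for illustration. In PostgreSQL the same rule would more likely be enforced with a unique constraint or partial unique index rather than in application code.

```typescript
// Hypothetical in-memory mirror of an AgentVersionDeployment row.
// Field names other than agentId are assumptions for illustration only.
interface VersionDeployment {
  agentId: string;
  version: string;
  environment: "production" | "staging";
  deployedAt: Date;
}

// Deploys `version` to production for `agentId`. Any existing production
// row for the same agent is dropped first, so at most one remains.
function deployToProduction(
  deployments: readonly VersionDeployment[],
  agentId: string,
  version: string,
  now: Date = new Date(),
): VersionDeployment[] {
  const kept = deployments.filter(
    (d) => !(d.agentId === agentId && d.environment === "production"),
  );
  return [...kept, { agentId, version, environment: "production", deployedAt: now }];
}

// Returns the active production version for an agent, if one is deployed.
function activeProductionVersion(
  deployments: readonly VersionDeployment[],
  agentId: string,
): string | undefined {
  return deployments.find(
    (d) => d.agentId === agentId && d.environment === "production",
  )?.version;
}
```

Deploying a second version for the same agent replaces the first, so a lookup by agentId never has to choose between two production rows.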

Agent Framework (Minimal)

✅ Base Agent class with:
- Error handling
- Basic logging
- Configuration loading from database (AgentConfig table)
- Agent version reading from AgentVersionDeployment table
- Version information included in all outputs (agentVersion field)
- HTTP endpoint exposure (agents are language-agnostic services)
✅ Agent versioning foundation:
- Agents read active version from AgentVersionDeployment table during initialization
- All agent outputs include agentVersion field for traceability
- Basic version management (create version, deploy to production)
✅ Event-driven communication foundation:
- Agents communicate through shared database and Agent Data API (not direct calls)
- Data timestamping for freshness tracking (using createdAt/updatedAt fields on data tables)
- Agents read from database and write outputs via Agent Data API
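
A minimal sketch of how the Base Agent class described above might fit together, assuming the database and Agent Data API are reached through injected functions. The AgentDeps interface, its method names, and the AgentOutput envelope are all hypothetical; only the agentVersion field and the table names come from this document.

```typescript
// Envelope wrapped around every agent output. Only agentVersion is named in
// this document; the other fields are assumptions.
interface AgentOutput<T> {
  agentId: string;
  agentVersion: string;
  createdAt: string;
  payload: T;
}

// Hypothetical dependencies standing in for AgentConfig lookups,
// AgentVersionDeployment lookups, the Agent Data API, and logging.
interface AgentDeps {
  loadConfig(agentId: string): Promise<Record<string, unknown>>;
  loadActiveVersion(agentId: string): Promise<string>;
  writeOutput(output: AgentOutput<unknown>): Promise<void>;
  log(message: string): void;
}

abstract class BaseAgent<TInput, TOutput> {
  protected readonly agentId: string;
  protected config: Record<string, unknown> = {};
  private readonly deps: AgentDeps;
  private version: string | undefined;

  constructor(agentId: string, deps: AgentDeps) {
    this.agentId = agentId;
    this.deps = deps;
  }

  // Reads configuration and the active version once, during initialization.
  async init(): Promise<void> {
    this.config = await this.deps.loadConfig(this.agentId);
    this.version = await this.deps.loadActiveVersion(this.agentId);
  }

  // Subclasses implement the agent-specific work here.
  protected abstract execute(input: TInput): Promise<TOutput>;

  // Runs the agent, stamps the output with agentVersion, and writes it out.
  async run(input: TInput): Promise<AgentOutput<TOutput>> {
    if (this.version === undefined) {
      throw new Error(`Agent ${this.agentId}: init() must be called before run()`);
    }
    try {
      const payload = await this.execute(input);
      const output: AgentOutput<TOutput> = {
        agentId: this.agentId,
        agentVersion: this.version,
        createdAt: new Date().toISOString(),
        payload,
      };
      await this.deps.writeOutput(output);
      return output;
    } catch (err) {
      this.deps.log(`Agent ${this.agentId} failed: ${String(err)}`);
      throw err;
    }
  }
}
```

Keeping the version stamp in run() rather than in each subclass means a placeholder agent cannot forget to include agentVersion.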

Hermes Orchestrator (Core System Component - Not an Agent)

✅ Basic orchestrator (apps/hermes-scheduler/) that can:
- Load schedules from database (Schedule table)
- Load pipelines from database (Pipeline table)
- Discover agents from database (AgentRegistry table)
- Query AgentVersionDeployment to determine active production version for each agent
- Query AgentInstance table to discover available agent instances
- Execute schedules (cron, interval, or once) - configured via admin interface
- Support data source expansion for any agent parameter (e.g., db:ticker:all:id?enabled=true)
- Execute a simple 3-stage newsletter pipeline sequentially:
  1. Data Collection (placeholder - invoked via HTTP endpoint, writes via Agent Data API)
  2. Content Generation (placeholder - invoked via HTTP endpoint, writes via Agent Data API)
  3. Delivery (placeholder - invoked via HTTP endpoint, writes via Agent Data API)
- Basic instance management:
  - Spawn agent instances when needed (creates instance records in AgentInstance table)
  - Terminate idle instances
  - Track instance capacity and current load
- Invoke agent HTTP endpoints with proper authentication (API keys from Agent Auth API)
- Sequential execution (no parallelism yet)
- Basic error handling (errors are logged; retries rely on BullMQ defaults)
✅ Long-running BullMQ worker process
✅ Database tables: AgentRegistry, AgentVersion, AgentVersionDeployment, AgentInstance, AgentJobExecution, Schedule, Pipeline, ScheduleExecution
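
The data source expansion example above (db:ticker:all:id?enabled=true) suggests a small expression format. The sketch below assumes the four colon-separated segments mean source, table, selector, and field, followed by optional query-style filters. That reading is a guess from the single example, not a documented grammar, and the function and type names are hypothetical.

```typescript
// Assumed structure of a data source expression: source:table:selector:field?filters
interface DataSourceExpr {
  source: string;
  table: string;
  selector: string;
  field: string;
  filters: Record<string, string>;
}

type Row = Record<string, string | number | boolean>;

// Parses an expression such as "db:ticker:all:id?enabled=true".
function parseDataSource(expr: string): DataSourceExpr {
  const q = expr.indexOf("?");
  const path = q === -1 ? expr : expr.slice(0, q);
  const query = q === -1 ? "" : expr.slice(q + 1);

  const parts = path.split(":");
  if (parts.length !== 4 || parts.some((p) => p === "")) {
    throw new Error(`Invalid data source expression: ${expr}`);
  }
  const [source, table, selector, field] = parts as [string, string, string, string];

  const filters: Record<string, string> = {};
  for (const pair of query.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    filters[key] = eq === -1 ? "" : pair.slice(eq + 1);
  }
  return { source, table, selector, field, filters };
}

// Expands a parsed expression against rows already loaded from the table,
// returning the requested field of every row that matches all filters.
function expandDataSource(
  expr: DataSourceExpr,
  rows: readonly Row[],
): Array<string | number | boolean> {
  const values: Array<string | number | boolean> = [];
  for (const row of rows) {
    const matches = Object.keys(expr.filters).every(
      (key) => String(row[key]) === expr.filters[key],
    );
    const value = row[expr.field];
    if (matches && value !== undefined) values.push(value);
  }
  return values;
}
```

Under these assumptions, the orchestrator would run one job per expanded value, for example once per enabled ticker id.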

Agent Auth API

✅ Separate server (apps/agent-auth-api/) with basic CRUD endpoints:
- POST /api/add/ - Add a new API key
- POST /api/delete/ - Delete an API key
- POST /api/list/ - List all API keys
- POST /api/view/ - View an API key
- POST /api/update/ - Update an API key
- POST /api/enable/ - Enable an API key
- POST /api/disable/ - Disable an API key
- POST /api/revoke/ - Revoke an API key
- POST /api/rotate/ - Rotate an API key
✅ Authentication endpoint for agents to validate API keys
✅ Integration with APIKey database table
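
The enable, disable, revoke, and rotate endpoints above imply a small key lifecycle. The sketch below shows one possible reading, assuming that disabling is reversible, revoking is permanent, and rotating revokes the old key while issuing a replacement. None of these semantics are specified in this document, and all type and function names are hypothetical.

```typescript
type ApiKeyStatus = "enabled" | "disabled" | "revoked";
type ApiKeyAction = "enable" | "disable" | "revoke";

// Hypothetical shape of an APIKey row; only the table name comes from the spec.
interface ApiKeyRecord {
  id: string;
  agentId: string;
  status: ApiKeyStatus;
}

// Applies a lifecycle action. Assumption: a revoked key can never change state.
function applyKeyAction(key: ApiKeyRecord, action: ApiKeyAction): ApiKeyRecord {
  if (key.status === "revoked") {
    throw new Error(`API key ${key.id} is revoked and cannot be changed`);
  }
  switch (action) {
    case "enable":
      return { ...key, status: "enabled" };
    case "disable":
      return { ...key, status: "disabled" };
    case "revoke":
      return { ...key, status: "revoked" };
  }
}

// Assumption: rotation revokes the old key and issues an enabled replacement
// for the same agent. The caller supplies the new key id.
function rotateKey(
  key: ApiKeyRecord,
  newId: string,
): { revoked: ApiKeyRecord; replacement: ApiKeyRecord } {
  return {
    revoked: applyKeyAction(key, "revoke"),
    replacement: { id: newId, agentId: key.agentId, status: "enabled" },
  };
}

// The authentication endpoint would accept only enabled keys.
function isKeyUsable(key: ApiKeyRecord): boolean {
  return key.status === "enabled";
}
```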

Agent Registry API

✅ Separate server (apps/agent-registry-api/) with endpoints:
- POST /api/registry/register/ - Register or update agent type metadata
- POST /api/register/ - Register a new agent instance
- POST /api/heartbeat/ - Update instance status and load
- POST /api/deregister/ - Deregister an agent instance
- GET /api/instances/ - Query available agent instances
- GET /api/registry/ - Query agent type registry
✅ Maintains AgentRegistry table (agent type metadata)
✅ Maintains AgentInstance table (running instance tracking)
✅ Basic health monitoring (heartbeat tracking)
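
Heartbeat tracking and load reporting together let the orchestrator choose a healthy instance. The following sketch assumes an instance counts as stale when its last heartbeat is older than 30 seconds and prefers the instance with the most free capacity. The threshold, the InstanceRecord field names, and the selection policy are assumptions, not part of this milestone's specification.

```typescript
// Hypothetical subset of an AgentInstance row.
interface InstanceRecord {
  instanceId: string;
  agentId: string;
  endpoint: string;
  capacity: number;
  currentLoad: number;
  lastHeartbeatAt: Date;
}

// Picks the instance of `agentId` with the most free capacity, ignoring
// instances that are full or whose heartbeat is older than `staleAfterMs`.
function pickInstance(
  instances: readonly InstanceRecord[],
  agentId: string,
  now: Date,
  staleAfterMs: number = 30000, // assumed threshold
): InstanceRecord | undefined {
  let best: InstanceRecord | undefined;
  for (const inst of instances) {
    if (inst.agentId !== agentId) continue;
    const heartbeatAge = now.getTime() - inst.lastHeartbeatAt.getTime();
    if (heartbeatAge > staleAfterMs) continue;
    const free = inst.capacity - inst.currentLoad;
    if (free <= 0) continue;
    if (best === undefined || free > best.capacity - best.currentLoad) {
      best = inst;
    }
  }
  return best;
}
```

When the function returns undefined, the orchestrator would be the component that decides whether to spawn a new instance.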

Agent Data API

✅ Separate server (apps/agent-data-api/) with write endpoints:
- POST /api/query-strategy/ - Receive output from Query Strategy Agent
- POST /api/data-collection/ - Receive output from Data Collection Agent
- POST /api/analysis-agent/ - Receive output from Analysis Agent
- POST /api/content-generation/ - Receive output from Content Generation Agent
- POST /api/quality-assurance/ - Receive output from Quality Assurance Agent
- POST /api/delivery/ - Receive output from Delivery Agent
- POST /api/learning/ - Receive output from Learning Agent
✅ Authentication using API keys (validates via Agent Auth API)
✅ Basic validation and storage of agent outputs
✅ All outputs include agentVersion field for traceability
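
As a rough illustration of the basic validation step, the write endpoints could reject any body that lacks agentVersion before storing it. The envelope shape below (agentId, agentVersion, payload) and the function name are assumptions; only the agentVersion requirement is stated in this document.

```typescript
// Assumed minimal envelope for an agent output request body.
interface AgentOutputEnvelope {
  agentId: string;
  agentVersion: string;
  payload: unknown;
}

type ValidationResult =
  | { ok: true; value: AgentOutputEnvelope }
  | { ok: false; errors: string[] };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim() !== "";
}

// Checks that a parsed JSON body carries the fields needed for traceability.
function validateAgentOutput(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const record = body as Record<string, unknown>;
  const errors: string[] = [];

  const agentId = record["agentId"];
  const agentVersion = record["agentVersion"];
  if (!isNonEmptyString(agentId)) errors.push("agentId is required");
  if (!isNonEmptyString(agentVersion)) errors.push("agentVersion is required");
  if (!("payload" in record)) errors.push("payload is required");

  if (errors.length > 0 || !isNonEmptyString(agentId) || !isNonEmptyString(agentVersion)) {
    return { ok: false, errors };
  }
  return { ok: true, value: { agentId, agentVersion, payload: record["payload"] } };
}
```

Collecting all errors before returning lets an agent author fix a malformed request in one pass.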

Web Application (Minimal)

✅ Next.js 16+ App Router application (apps/web/ - user dashboard, separate from apps/marketing/)
✅ NextAuth.js authentication (email/password only)
✅ Basic dashboard page (/dashboard) that:
- Shows authenticated user information
- Lists the user's newsletters, if any exist (filtered by Newsletter.userId)
- Displays newsletter content in simple text format
✅ Database package (packages/database/) with Prisma client (shared package used by all apps)

Shared Packages

✅ packages/shared/ - Basic types and utilities
✅ Basic logging infrastructure

Task Timeline

Limitations (Acceptable for This Milestone)

No actual data collection (uses hardcoded data)
No analysis stage (skipped in 3-stage pipeline)
No quality assurance (skipped in 3-stage pipeline)
No email delivery (only saves to database)
No personalization (same content for all users)
Sequential execution only (no parallelism - processes one user-ticker combination at a time)
Basic error handling only (errors logged but no retry logic beyond BullMQ defaults)
No job status tracking UI (jobs tracked in Redis/BullMQ and AgentJobExecution table)
Basic instance management only (spawn/terminate, no advanced scaling yet)
Agent versioning is basic (create/deploy, no rollback UI yet)
API services have minimal validation and error handling

Success Criteria

✅ Database schema created and migrations run successfully (including versioning tables)
✅ Agent Auth API is functional and can manage API keys
✅ Agent Registry API is functional and can register agent types and instances
✅ Agent Data API is functional and can receive agent outputs
✅ Hermes Orchestrator can discover agents from AgentRegistry table
✅ Orchestrator can execute schedules and run the 3-stage pipeline end-to-end
✅ Orchestrator can spawn and track agent instances in AgentInstance table
✅ Placeholder agents execute via HTTP endpoints and produce output (each stage completes successfully)
✅ Agents include agentVersion in all outputs
✅ Agents write outputs via Agent Data API (not directly to database)
✅ Newsletter is saved to database with correct userId association
✅ User can log in via NextAuth.js and see their dashboard
✅ Newsletter appears in user's dashboard (filtered by user, even if content is placeholder)
✅ Data flows correctly between pipeline stages via Agent Data API (Data Collection → Content Generation → Delivery)
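
The end-to-end criteria above can be pictured as a small sequential runner. In this sketch the orchestrator passes only a job context to each stage and stops at the first failure; stage outputs travel through the Agent Data API rather than being handed from one stage to the next. The JobContext fields, the InvokeStage callback (standing in for an authenticated HTTP call to an agent endpoint), and the StageExecution record are hypothetical.

```typescript
// Assumed job context: one user-ticker combination per run.
interface JobContext {
  userId: string;
  tickerId: string;
}

// Hypothetical per-stage record, loosely modelled on AgentJobExecution.
interface StageExecution {
  stage: string;
  status: "completed" | "failed";
  error?: string;
}

// Stand-in for invoking an agent's HTTP endpoint with its API key.
type InvokeStage = (stage: string, ctx: JobContext) => Promise<void>;

// Runs stages strictly one after another and stops at the first failure.
async function runPipelineSequentially(
  stages: readonly string[],
  ctx: JobContext,
  invoke: InvokeStage,
): Promise<StageExecution[]> {
  const executions: StageExecution[] = [];
  for (const stage of stages) {
    try {
      await invoke(stage, ctx);
      executions.push({ stage, status: "completed" });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      executions.push({ stage, status: "failed", error: message });
      break; // later stages depend on this one's output
    }
  }
  return executions;
}

// The three placeholder stages of this milestone, in order.
const NEWSLETTER_STAGES = ["data-collection", "content-generation", "delivery"] as const;
```

Retries are deliberately left out of the sketch, since this milestone relies on BullMQ defaults for them.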

Next Steps

After this milestone, the system can run end-to-end (with placeholder data). Milestone 2 will replace placeholders with actual functionality.