The ResearchManagerAgent is the orchestration backbone of the Hedge Fund Backend’s quantitative research system. When you submit a natural language query, the manager parses it for symbol tickers, timeframe hints, and date ranges, then sequentially dispatches six specialist agents, each building on the outputs of those before it. Every agent writes its results back into a shared AgentContext that flows through the entire pipeline, accumulating feature IDs, model artefacts, backtest runs, and governance flags as research progresses.
Agent Pipeline
The six specialist agents are always executed in the following order. Each agent is stateless; all mutable research state lives in AgentContext and is threaded through by the manager.
| # | Agent | Role value | Responsibility |
|---|---|---|---|
| 1 | FeatureDiscoveryAgent | feature_discovery | Parses query for hints (rsi, sentiment, momentum, etc.), selects candidate feature plugins from the registry, runs a lightweight Random Forest + SHAP pass to prune low-signal features, and persists winning Feature rows to the database. |
| 2 | ModelDiscoveryAgent | model_discovery | Runs an AutoML leaderboard across ml.xgboost, ml.lightgbm, and ml.random_forest (or a model hinted in the query). Scores each candidate with 3-fold CV and selects the best plugin_key, then persists an MLModel row. |
| 3 | HyperparameterAgent | hyperparameter | Runs an Optuna study (default 20 trials) against the winning model using the plugin’s registered search space. Updates the MLModel row with tuned parameters and writes best_model_params back into context. |
| 4 | BacktestAgent | backtest | Trains the tuned model on the full assembled dataset, then creates and executes a Backtest record using the vectorbt engine with default capital of $100,000, 0.05% commission, and 0.05% slippage. |
| 5 | ValidationAgent | validation | Runs 5-fold rolling walk-forward analysis. If walk-forward passes, proceeds to Combinatorial Purged Cross-Validation (CPCV, 6 splits / 2 test splits) to compute Probability of Backtest Overfitting (PBO) and deflated Sharpe. |
| 6 | GovernanceAgent | governance | Checks for overfitting (IS/OOS Sharpe ratio, PBO), data leakage (future-peeking feature keys, look-ahead bias heuristics), parameter instability (learning rate bounds, tree depth, fold Sharpe variance), and minimum sample size. The pipeline halts immediately if any CRITICAL flag is raised. |
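The fixed ordering and the halt-on-CRITICAL rule can be illustrated with a minimal sketch. It is not the backend's actual code: the trimmed-down `AgentContext`, the `run_pipeline` helper, and the assumption that a critical flag is a string starting with `CRITICAL` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Role values in the order given by the table above.
PIPELINE_ORDER = [
    "feature_discovery",
    "model_discovery",
    "hyperparameter",
    "backtest",
    "validation",
    "governance",
]


@dataclass
class AgentContext:
    """Hypothetical, trimmed-down stand-in for the real AgentContext."""
    symbols: List[str] = field(default_factory=list)
    governance_flags: List[str] = field(default_factory=list)
    completed_roles: List[str] = field(default_factory=list)


# An agent is modelled as a stateless callable that mutates the shared context.
Agent = Callable[[AgentContext], None]


def run_pipeline(agents: Dict[str, Agent], ctx: AgentContext) -> bool:
    """Run agents in fixed order; return False if a CRITICAL flag halts the run."""
    for role in PIPELINE_ORDER:
        agents[role](ctx)
        ctx.completed_roles.append(role)
        # Assumption: severity is encoded as a prefix on each flag string.
        if any(flag.startswith("CRITICAL") for flag in ctx.governance_flags):
            return False
    return True
```

Because the agents hold no state of their own, everything a later agent needs has to be written into the context by an earlier one, which is why the order is fixed.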
Query Parsing
The manager applies a lightweight regex parser to every incoming query string before agents run. It extracts:
- Symbols: uppercase 1–5 letter words that look like tickers (e.g. AAPL, TSLA). Common stop-words and indicator names (RSI, MACD, ATR) are filtered out. Up to five symbols are captured.
- Timeframe: looks for weekly/1w → "1w", hourly/1h → "1h", otherwise defaults to "1d".
- Date range: if start_date/end_date are omitted from the request body, the manager defaults to the last three years ending at the current UTC time.
Parsed values are written into AgentContext only when not already set by the request body, so explicit body fields always take priority.
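The parsing rules above can be sketched roughly as follows. This is an illustrative approximation rather than the manager's real parser: the function names, the contents of `NON_TICKERS` beyond RSI, MACD, and ATR, and the use of 3 × 365 days for "three years" are assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed filter list: the docs name RSI, MACD and ATR; the stop-words are illustrative.
NON_TICKERS = {"RSI", "MACD", "ATR", "A", "I", "AND", "ON", "THE", "USING"}


def parse_query(query: str, now=None) -> dict:
    # Uppercase words of 1-5 letters, in order of appearance, de-duplicated, max five.
    symbols = []
    for word in re.findall(r"\b[A-Z]{1,5}\b", query):
        if word not in NON_TICKERS and word not in symbols:
            symbols.append(word)
    symbols = symbols[:5]

    lowered = query.lower()
    if re.search(r"\b(weekly|1w)\b", lowered):
        timeframe = "1w"
    elif re.search(r"\b(hourly|1h)\b", lowered):
        timeframe = "1h"
    else:
        timeframe = "1d"

    # Default window: roughly three years ending now (UTC).
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=3 * 365)
    return {"symbols": symbols, "timeframe": timeframe,
            "start_date": start, "end_date": end}


def merge_into_context(context: dict, parsed: dict) -> dict:
    """Fill only the fields the request body left unset (None or empty)."""
    for key, value in parsed.items():
        if not context.get(key):
            context[key] = value
    return context
```

Calling `merge_into_context` after `parse_query` is what gives explicit body fields priority: a value already present in the context is never overwritten by a parsed one.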
Research Pipeline
Run Full Pipeline
/api/v1/agents/research
For long-running pipelines, particularly when Optuna tuning with many trials or CPCV validation is involved, prefer the streaming endpoint /api/v1/agents/research/stream to avoid HTTP gateway timeouts.
Request Body
Natural language research instruction. The manager parses symbols, timeframe, and date range from this string. Example: "Build a momentum strategy using RSI and news sentiment on AAPL"
Explicit list of ticker symbols. When provided, these override any tickers parsed from the query. Defaults to an empty list (parser fills it from the query).
Bar timeframe string. Supported values: "1d" (daily), "1h" (hourly), "1w" (weekly).
ISO 8601 datetime marking the beginning of the data window. Defaults to three years before end_date if omitted.
ISO 8601 datetime marking the end of the data window. Defaults to current UTC time if omitted.
UUID string of an existing Strategy row. If omitted, the manager creates a new Strategy record named after the first 60 characters of the query.
Response
UUID identifying this research session. Pass this to /chat or /sessions/{session_id} for follow-up queries.
true if the pipeline completed without any CRITICAL governance flags.
One-line human-readable summary: session prefix, universe, selected model, walk-forward pass status, and governance flag count.
Aggregated error strings from any agent that failed. Empty on full success.
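A request to this endpoint might be assembled as in the sketch below. Several details are assumptions, because this page does not state them: the HTTP method (POST), the JSON body encoding, the body keys `symbols` and `timeframe`, and the host `http://localhost:8000`. The helper `build_research_request` is hypothetical and only builds the request without sending it.

```python
import json
import urllib.request


def build_research_request(base_url: str, query: str, **optional) -> urllib.request.Request:
    """Build (but do not send) a request for the research endpoint."""
    body = {"query": query}
    # Only include optional fields the caller actually set, so server defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/agents/research",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the HTTP method is not stated on this page
    )


req = build_research_request(
    "http://localhost:8000",  # hypothetical host
    "Build a momentum strategy using RSI and news sentiment on AAPL",
    symbols=["AAPL"],   # assumed field name
    timeframe="1d",     # assumed field name
    start_date=None,    # omitted from the body, so the server default applies
)
```

Passing `req` to `urllib.request.urlopen` would send it. Leaving the date fields out relies on the documented defaults of a three-year window ending at the current UTC time.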
Stream Research Pipeline (SSE)
/api/v1/agents/research/stream
Runs the same pipeline as /research but responds with a text/event-stream (Server-Sent Events) connection. The server emits one JSON event per agent lifecycle transition so you can display real-time progress in a UI or log pipeline state without polling.
Use this endpoint for pipelines that include Optuna hyperparameter tuning or CPCV validation, which can run for several minutes. The synchronous /research endpoint holds the HTTP connection open for the entire duration and may be terminated by an upstream gateway timeout (typically 60–90 seconds).
Request Body
Same fields as Run Full Pipeline.
Response: text/event-stream
Each line is data: <json>\n\n. The JSON payload varies by event type:
| Event | When emitted | Key fields |
|---|---|---|
| start | Pipeline begins | session_id, strategy_id, message |
| agent_start | Before each agent runs | role, message |
| agent_done | After each agent completes | role, success, summary, details, errors |
| pipeline_halted | GovernanceAgent raises a CRITICAL flag | reason, flags |
| complete | All agents finished successfully | session_id, strategy_id, summary, context |
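A client can consume the stream by decoding each `data:` line as JSON and branching on the event type. The sketch below works on any iterable of text lines, so it can be fed from an HTTP response or from a recorded log. One detail is an assumption: this page does not say which JSON key carries the event type, so the code reads a hypothetical `"event"` key. The helper names are illustrative too.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per `data:` line; skip blank separators."""
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def follow_pipeline(lines: Iterable[str]) -> Tuple[bool, List[str]]:
    """Return (finished_ok, roles_done) from a stream of SSE lines."""
    done: List[str] = []
    for event in iter_sse_events(lines):
        kind = event.get("event")  # assumption: event type lives under "event"
        if kind == "agent_done":
            done.append(event["role"])
        elif kind == "pipeline_halted":
            return False, done
        elif kind == "complete":
            return True, done
    # Stream ended without a terminal event (e.g. the connection dropped).
    return False, done
```

Treating a stream that ends without `complete` as a failure is a deliberate choice here, since a dropped connection is otherwise indistinguishable from a pipeline that is still running.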
Chat
Single-Turn AI Researcher Chat
/api/v1/agents/chat
Sends a single query to the ResearchManagerAgent without necessarily running the full six-agent pipeline. When a session_id is provided, the agent rehydrates the prior AgentContext (restoring symbols, feature IDs, model ID, governance flags, and strategy ID) so follow-up questions work correctly within an ongoing research session.
Request Body
Natural language question or instruction for the AI Researcher.
UUID of a previous research session. When provided, the existing context is restored so follow-up questions can reference prior pipeline outputs such as the selected model or feature set.
Key-value pairs that override specific AgentContext fields before the query is processed. Useful for adjusting symbols or timeframe without running a new full pipeline.
Response
UUID for this chat session (new UUID if no session_id was passed in the request).
The manager’s one-line summary response to the query.
Full agent result details including the updated context snapshot.
true if the manager completed without critical errors.
Sessions
Retrieve Session Context
/api/v1/agents/sessions/{session_id}
Returns the AgentContext that was saved at the end of a research or chat run. Use this to inspect pipeline artefact IDs, the selected model, or governance flags from a prior session without re-running anything.
Path Parameters
UUID of the session to retrieve.
Response
UUID of the session.
Instrument tickers used in this session.
Bar timeframe (e.g. "1d").
UUID of the Strategy row created or used by this session.
UUIDs of Feature rows discovered and persisted during this session.
UUID of the winning MLModel row.
Plugin key of the winning model (e.g. "ml.xgboost").
Tuned hyperparameter dictionary for the winning model.
UUIDs of Backtest rows executed during this session.
All governance flag strings (severity-prefixed) raised by the GovernanceAgent.
true if both walk-forward and CPCV validation passed.
List All Active Sessions
/api/v1/agents/sessions
Sessions are stored in an in-process Python dictionary (_sessions). They are not persisted to the database and will be lost on server restart. For production deployments, the session store should be replaced with a Redis backend to survive worker restarts and support horizontal scaling.
Response
List of all active session UUID strings.
Total number of active sessions.
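One way to make the session store swappable is to hide it behind a small interface, so a Redis-backed class can later replace the dictionary without touching the endpoints. The sketch below shows only the in-memory side. The `SessionStore` protocol, the class and method names, and the response keys `sessions` and `count` are all assumptions made for illustration.

```python
import json
from typing import Dict, List, Optional, Protocol


class SessionStore(Protocol):
    """Hypothetical interface that any session backend would satisfy."""

    def save(self, session_id: str, context: dict) -> None: ...
    def load(self, session_id: str) -> Optional[dict]: ...
    def list_ids(self) -> List[str]: ...


class InMemorySessionStore:
    """Dictionary-backed store; contents vanish when the process exits."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def save(self, session_id: str, context: dict) -> None:
        # Store JSON text rather than live objects, as an external store would require.
        self._sessions[session_id] = json.dumps(context)

    def load(self, session_id: str) -> Optional[dict]:
        raw = self._sessions.get(session_id)
        return json.loads(raw) if raw is not None else None

    def list_ids(self) -> List[str]:
        return list(self._sessions)


def list_sessions_response(store: SessionStore) -> dict:
    """Shape of the list-sessions payload; key names are assumptions."""
    ids = store.list_ids()
    return {"sessions": ids, "count": len(ids)}
```

Serialising to JSON even in memory surfaces any non-serialisable context values early, before a move to an external store would expose them.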