Reproducible Quantitative Research – Beyond Pure MCP Workflows

The article “Reproducible Quantitative Research – Beyond Pure MCP Workflows” was originally published on the Deltaray blog.

Reproducibility is the cornerstone of credible quantitative research. In both academic papers and proprietary trading strategy development, results mean little if others cannot replicate them. Yet in quantitative finance, reproducibility remains challenging due to proprietary data, complex methodologies, and now, increasingly autonomous AI agents.

The latest AI coding assistants, such as Claude Code, Google Gemini, and OpenAI’s Codex, use the Model Context Protocol (MCP) to reach external tools, and they have revolutionized research workflows.

They can compress weeks of development into hours.

But this power comes with a hidden cost: when AI agents operate autonomously, they can undermine the very reproducibility that makes research credible.

In this post, we’ll explore how modern AI tools are transforming quantitative research, why pure agentic workflows threaten reproducibility, and a better approach to address these challenges.

The Evolution: From Text Generator to Active Researcher

AI assistants have rapidly evolved from simple code completion tools into active research partners. Early large language models could only suggest text based on their training.

Today’s AI Agents can autonomously:

Fetch and analyze historical market data
Execute complex multi-step research workflows
Run backtests and statistical tests
Generate visualizations and reports
Commit results to version control

This transformation was enabled by giving LLMs tool-use capabilities. Claude Code was one of the first widely used tools to do this: since its initial version, it has not just suggested code but actively taken actions on your behalf. It maintains project-wide awareness, navigates documentation, and performs complex tasks from natural language prompts.

By now, both OpenAI and Google have caught up: Codex and Gemini CLI offer functionality comparable to Claude Code.

To generalize tool use, Anthropic introduced the Model Context Protocol (MCP) in late 2024.

Understanding MCP: Power and Pitfalls

The Model Context Protocol (MCP) is Anthropic’s open-source standard that provides a uniform API through which AI models interact with external tools and services. Instead of hard-coding specific integrations, MCP servers expose tools that AI agents can invoke as needed.
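To make the idea of "invoking a tool" concrete, here is a minimal sketch of the message an agent sends when it calls a tool. MCP messages follow JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and its arguments. The tool name `get_daily_bars` and its argument names are hypothetical and exist only for illustration; a real server defines its own tools.

```python
import json
from itertools import count

# Monotonically increasing request IDs, as JSON-RPC expects one ID per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # sort_keys gives a stable text form, which is convenient for logging and diffing.
    return json.dumps(message, sort_keys=True)


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    "get_daily_bars",
    {"symbol": "SPY", "start": "2024-01-02", "end": "2024-01-31"},
)
print(request)
```

Because every call is an explicit, serializable message, it can in principle be captured and replayed, which matters for the reproducibility concerns discussed later in this post.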

MCP in Quantitative Research

Common MCP applications include:

Query financial databases for market data (like Polygon.io’s recent MCP connector)
Execute trades through broker APIs such as Alpaca, Tasty, or IBKR
Fetch social sentiment from Reddit or Twitter
Screen and analyze indicators for trading signals: TradingView MCP
Read and analyze research papers from arXiv
And many more

Zen-MCP: The Multi-Model Orchestrator

While not strictly related to quantitative finance, zen-mcp is worth mentioning (and using).

This open-source orchestrator extends agentic coding tools to enable multi-model AI workflows. In practice, this means you can use models from OpenAI, Google, Anthropic, and many other providers in the same session. For example, one model can design the task, another can implement it, and a third can review it.

The different models can collaborate and chat with each other, which is impressive to see in action.

The Reproducibility Problem

While powerful, autonomous AI agents introduce several reproducibility challenges.

1. The Opacity Problem

When AI agents autonomously fetch data, perform analysis, and generate results, the exact steps often remain hidden. Unlike executing a script – where every transformation is visible – AI agent workflows can be black boxes. You might get results, but understanding how those results were obtained becomes difficult or impossible.

2. Non-Deterministic Execution

AI models may take different approaches to solving the same problem across runs. This non-determinism means:

The same research question might yield different methodologies
Data processing steps may vary
Rate limits or tier changes can trigger model fallbacks (e.g., Opus → Sonnet), altering tool choices and outputs

Note: For more on defeating nondeterminism in LLM inference, see the Thinking Machines blog.

3. Hidden State and Dependencies

Jupyter notebooks have a notorious “hidden state” problem, where cell execution order affects results. AI agents compound this problem by:

Dynamically choosing data sources without documentation
Using different libraries or methods without explicit tracking
Making assumptions that aren’t recorded
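One way to reduce hidden dependencies is to record the environment explicitly alongside each result. The sketch below uses only the Python standard library to capture the interpreter version, platform, and installed versions of named packages. The function name `environment_manifest` and the package list are our own assumptions for illustration, not part of any particular tool.

```python
import json
import platform
import sys
from importlib import metadata


def environment_manifest(packages: list[str]) -> dict:
    """Record interpreter, platform, and package versions for a research run."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly instead of silently skipping it.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": versions,
    }


# The package names here are examples; list whatever your analysis imports.
manifest = environment_manifest(["numpy", "pandas"])
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Saving this output next to the results makes it possible to tell later whether a changed number came from a changed library rather than a changed method.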

Danger Zone: Agentic Trading

Several open-source projects use MCP for quantitative analysis and trading:

Maverick MCP: “financial data analysis, technical indicators, and portfolio optimization tools directly to your Claude Desktop”
PrimoAgent: “multi agent AI stock analysis system … to provide comprehensive daily trading insights and next-day price predictions”
Alpaca’s example of building an MCP-based trading workflow

As teaching demos, they are excellent: they reduce integration friction and demonstrate how quickly you can reach a working prototype. In practice, however, running trading strategies this way is too risky.

Danger Zone

In these projects, the trading logic relies on the model’s output, which is inherently non-deterministic and can change over time. To make things worse, you may get downgraded to a cheaper model mid-session due to rate limits or quota exhaustion.

This means the exact same results cannot be reproduced, even if the rules, data, and environment are fixed.

A Simple Solution

Instead of relying on MCP agents to research and trade, use them to generate code that you review, version-control, and run in a controlled environment. Over time, the accumulated code can be curated into a strategy or research library.

Based on our experience with hundreds of hours of AI-assisted strategy and product development, we recommend:

1. Treat AI as a Code Generator, Not an Autonomous Agent

Use AI to generate reproducible scripts and analysis code
Review AI-generated plans and code before execution
Maintain human oversight of critical decisions
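As an illustration of what a reviewable, reproducible script can look like, here is a minimal sketch of a deterministic signal calculation. The moving-average crossover rule, the window lengths, and the price series are made-up placeholders rather than a recommended strategy. The point is that the same inputs always produce the same outputs, and that a fingerprint of the result can be stored and compared across runs.

```python
import hashlib
import json


def sma(values, window):
    """Simple moving average; None until `window` observations exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def crossover_signals(prices, fast=2, slow=3):
    """Return 1 where the fast SMA is above the slow SMA, else 0."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    return [
        int(f is not None and s is not None and f > s)
        for f, s in zip(fast_ma, slow_ma)
    ]


def fingerprint(obj) -> str:
    """SHA-256 of a canonical JSON encoding, used to compare runs."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Made-up placeholder prices, not real market data.
prices = [100.0, 101.0, 102.0, 101.0, 99.0, 100.0]
signals = crossover_signals(prices)
print(signals)
print(fingerprint({"prices": prices, "signals": signals}))
```

An agent asked the same question twice may choose different methods, but a committed script like this gives the same fingerprint on every run, so any change in the output can be traced to a change in code or data.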

2. Version Control Everything

Regularly commit all analysis scripts, strategies, and utilities to version control
Include the models and prompts used to generate the code (e.g., in the pull request description)
Document data sources, extraction timestamps, and filters in a data catalog
Persist backtest results, metrics, and visualizations in structured storage
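The items above can be tied together in a small run manifest that is committed with the code. The following is a sketch under stated assumptions: the field names, the function `build_run_manifest`, the sample file, and the model ID are all hypothetical, and a real data catalog would likely carry more detail.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash a file in chunks so large data snapshots do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_run_manifest(data_path: Path, source: str, filters: dict,
                       model_id: str, prompt: str) -> dict:
    """Describe which data, model, and prompt produced a piece of analysis code."""
    return {
        "data_file": data_path.name,
        "data_sha256": sha256_of_file(data_path),
        "data_source": source,
        "filters": filters,
        "model_id": model_id,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


# Demonstration with a throwaway file; all values below are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_prices.csv"
    sample.write_text("date,close\n2024-01-02,100.0\n", encoding="utf-8")
    manifest = build_run_manifest(
        sample,
        source="example-vendor",          # hypothetical data source label
        filters={"symbol": "SPY"},
        model_id="example-model-v1",      # hypothetical model ID
        prompt="Write a script that loads daily closes.",
    )
    print(json.dumps(manifest, indent=2, sort_keys=True))
```

Storing hashes rather than raw data keeps the manifest small while still letting you detect when a data snapshot or prompt has changed.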

3. Use the Right Model for the Task

Use the most capable model to create a detailed plan for the task (e.g., Anthropic Opus or GPT-5 at the time of writing)
Review the plan using different models to gain confidence (e.g., OpenAI’s o3-pro or Gemini 2.5 Pro)
Use the detailed implementation plan to generate the code; a less capable model can be used here (e.g., Anthropic Sonnet)
Review the generated code using different models (e.g., Gemini 2.5 Pro, r1, and o3-pro)
Conduct the final review of the generated code using human expertise

Note:

Over time, we hope every MCP server and agentic tool will generate audit logs of every tool call, including inputs, outputs, model IDs, and timestamps. This would resolve several reproducibility issues.
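Until such logging is built in, a thin wrapper around your own tool functions can approximate it. The sketch below is one possible shape, not a feature of any existing MCP server: the decorator `audited`, the in-memory list used as a log, the stand-in tool `get_last_close`, and the model ID are all hypothetical.

```python
import functools
import json
from datetime import datetime, timezone


def audited(log: list, model_id: str):
    """Decorator factory: append one record per tool call to `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            result = func(**kwargs)
            log.append({
                "tool": func.__name__,
                "inputs": kwargs,
                "output": result,
                "model_id": model_id,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            return result
        return wrapper
    return decorator


audit_log = []


@audited(audit_log, model_id="example-model-v1")  # hypothetical model ID
def get_last_close(symbol: str) -> float:
    """Stand-in tool that returns a made-up value instead of real market data."""
    return {"SPY": 500.0}.get(symbol, 0.0)


get_last_close(symbol="SPY")
for record in audit_log:
    # One JSON object per line is easy to append to a file and to grep later.
    print(json.dumps(record, sort_keys=True))
```

The wrapper accepts keyword arguments only so that every logged input carries its parameter name. In practice the records would be appended to a file or database rather than kept in memory.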

Conclusion

Pure MCP agentic workflows are productivity rockets, but they are also reproducibility traps. For credible research, treat agents as compilers and planners rather than autonomous researchers. Generate code, pin environments and data, and log every run.

If a result can’t be reproduced from code, config, data snapshot, and a manifest, it’s not research—it’s a demo.



Disclosure: Interactive Brokers Third Party

Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from Deltaray and is being posted with its permission. The views expressed in this material are solely those of the author and/or Deltaray and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

Disclosure: API Examples Discussed

Throughout the lesson, please keep in mind that the examples discussed are purely for technical demonstration purposes, and do not constitute trading advice. Also, it is important to remember that placing trades in a paper account is recommended before any live trading.
