Posted October 9, 2025 at 11:47 am
The article “Reproducible Quantitative Research – Beyond Pure MCP Workflows” was originally published on Deltaray blog.
Reproducibility is the cornerstone of credible quantitative research. In both academic papers and proprietary trading strategy development, results mean little if others cannot replicate them. Yet in quantitative finance, reproducibility remains challenging due to proprietary data, complex methodologies, and now, increasingly autonomous AI agents.
The latest AI coding assistants, such as Claude Code, Google Gemini, and OpenAI’s Codex, have revolutionized research workflows, particularly when combined with the Model Context Protocol (MCP).
They can compress weeks of development into hours.
But this power comes with a hidden cost: when AI agents operate autonomously, they can undermine the very reproducibility that makes research credible.
In this post, we’ll explore how modern AI tools are transforming quantitative research, why pure agentic workflows threaten reproducibility, and a better approach to address these challenges.
AI assistants have rapidly evolved from simple code completion tools into active research partners. Early large language models could only suggest text based on their training.
Today’s AI Agents can autonomously:
This transformation was enabled by giving LLMs tool-use capabilities. Claude Code was the first to achieve this: since its initial version, it has not just suggested code but actively taken actions on your behalf. It maintains project-wide awareness, navigates documentation, and performs complex tasks from natural language prompts.
By now, both OpenAI (with Codex) and Google (with Gemini CLI) have caught up, matching the functionality of Claude Code.
To generalize tool use, Anthropic introduced the Model Context Protocol (MCP) in late 2024.
MCP is Anthropic’s open-source standard that provides a uniform interface through which AI models interact with external tools and services. Instead of hard-coding specific integrations, MCP servers expose tools that AI agents can invoke as needed.
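To make the idea concrete, here is a minimal sketch of the pattern in plain Python. MCP messages are JSON-RPC 2.0 and a tool invocation uses the `tools/call` method; everything else here (the `TOOLS` registry, the `get_last_price` tool with its static prices, and the in-process `handle_request` dispatcher) is a hypothetical stand-in for a real MCP server and does not use the official SDK.

```python
import json

# Hypothetical tool registry: a real MCP server would expose these over
# stdio or HTTP; here they are plain Python callables.
TOOLS = {}

def tool(func):
    """Register a function as an invocable tool under its own name."""
    TOOLS[func.__name__] = func
    return func

@tool
def get_last_price(symbol: str) -> float:
    # Illustrative static data, not a real market data feed.
    prices = {"SPY": 500.0, "QQQ": 400.0}
    return prices[symbol]

def handle_request(raw: str) -> str:
    """Dispatch a JSON-RPC 2.0 'tools/call' request to a registered tool."""
    request = json.loads(raw)
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    params = request["params"]
    result = TOOLS[params["name"]](**params["arguments"])
    # Tool results are returned as a list of content items.
    content = [{"type": "text", "text": json.dumps(result)}]
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": {"content": content}})

# The agent side: ask the server to run a tool by name with arguments.
request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "get_last_price", "arguments": {"symbol": "SPY"}},
})
response = json.loads(handle_request(request))
```

The agent only needs the tool's name and argument schema, which is why the same server can be shared by different models without custom integration code.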
Common MCP applications include:
While not strictly related to quantitative finance, zen-mcp is worth mentioning (and using).
This open-source orchestrator extends agentic coding tools to enable multi-model AI workflows. In practice, this means you can use models from OpenAI, Google, Anthropic, and many other providers in the same session. For example, one model can design the task, another can implement it, and a third can review it.
The different models can collaborate and chat with each other, which is impressive to see in action.
While powerful, autonomous AI agents introduce several reproducibility challenges.
When AI agents autonomously fetch data, perform analysis, and generate results, the exact steps often remain hidden. Unlike executing a script – where every transformation is visible – AI agent workflows can be black boxes. You might get results, but understanding how those results were obtained becomes difficult or impossible.
AI models may take different approaches to solving the same problem across runs. This non-determinism means:
Note: Read more about defeating nondeterminism in LLM inference on the Thinking Machines blog
Similar to the notorious “hidden state” problem in Jupyter notebooks where cell execution order affects results, AI agents compound this by:
Several open-source projects use MCP for quantitative analysis and trading:
As teaching demos, they are excellent: they reduce integration friction and demonstrate how quickly you can reach a working prototype. In practice, however, running trading strategies this way is too risky.
In these projects, the trading logic relies on the model’s output, which is inherently non-deterministic and can change over time. To make things worse, you may get downgraded to a cheaper model mid-session due to rate limits or quota exhaustion.
This means the exact same results cannot be reproduced, even if the rules, data, and environment are fixed.
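One way to make such drift visible is to fingerprint each run's output and compare the fingerprints across runs. The following is a minimal sketch, assuming results are JSON-serializable dictionaries; the `fingerprint` helper, the sample results, and the rounding precision are all hypothetical choices for illustration.

```python
import hashlib
import json

def _round_floats(value, ndigits):
    """Recursively round floats so tiny numeric noise does not change the hash."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {k: _round_floats(v, ndigits) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [_round_floats(v, ndigits) for v in value]
    return value

def fingerprint(result: dict, ndigits: int = 8) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the result."""
    canonical = json.dumps(
        _round_floats(result, ndigits), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Illustrative results from three hypothetical runs of the same prompt.
run_a = {"sharpe": 1.25, "trades": 42, "params": {"lookback": 20}}
run_b = {"params": {"lookback": 20}, "trades": 42, "sharpe": 1.25}  # same content, different key order
run_c = {"sharpe": 1.31, "trades": 40, "params": {"lookback": 30}}  # the agent chose another lookback

same = fingerprint(run_a) == fingerprint(run_b)
drifted = fingerprint(run_a) != fingerprint(run_c)
```

A mismatch does not say which step changed, only that something did. That is usually enough to stop and investigate before trusting the new numbers.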
Instead of relying on MCP agents to research and trade, use them to generate code that you review, version-control, and run in a controlled environment. Over time, the accumulated code can be curated into a strategy or research library.
Based on our experience with hundreds of hours of AI-assisted strategy and product development, we recommend:
Note:
Over time, we hope every MCP server and agentic tool will generate audit logs of every tool call, including inputs, outputs, model IDs, and timestamps. This would resolve several reproducibility issues.
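Until tools provide this natively, a thin wrapper can approximate it for code you control. The sketch below is an assumption about what such a log could look like, not an existing MCP feature: the `audited` decorator, the in-memory `AUDIT_LOG`, the record field names, and the model identifier are all hypothetical.

```python
import functools
import json
from datetime import datetime, timezone

# In practice this would be an append-only file; a list keeps the sketch self-contained.
AUDIT_LOG = []

def audited(model_id: str):
    """Wrap a tool function so each successful call records inputs, output, model ID, and timestamp."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            output = func(*args, **kwargs)
            record = {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "tool": func.__name__,
                "model_id": model_id,
                "inputs": {"args": list(args), "kwargs": kwargs},
                "output": output,
            }
            # default=str keeps the log writable even for non-JSON types.
            AUDIT_LOG.append(json.dumps(record, sort_keys=True, default=str))
            return output
        return wrapper
    return decorator

@audited(model_id="example-model-v1")  # hypothetical model identifier
def moving_average(prices, window):
    """Average of the last `window` prices."""
    return sum(prices[-window:]) / window

value = moving_average([10.0, 11.0, 12.0, 13.0], window=2)
entry = json.loads(AUDIT_LOG[-1])
```

Each entry is a single JSON line, so the log can be appended to a file and diffed between runs. This sketch only records calls that return normally; a fuller version would also log exceptions.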
Pure MCP agentic workflows are productivity rockets, but also reproducibility traps. For credible research, treat agents as compilers and planners rather than autonomous researchers. Generate code, pin environments and data, and log every run.
If a result can’t be reproduced from code, config, data snapshot, and a manifest, it’s not research—it’s a demo.
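As a rough illustration of what such a manifest might contain, the sketch below hashes the code file, the data snapshot, and the configuration, and records the interpreter version. The field names, the `build_manifest` helper, and the sample files are assumptions made for this example rather than a prescribed format.

```python
import hashlib
import json
import platform
import sys
import tempfile
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large data snapshots do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(code_path: Path, data_path: Path, config: dict) -> dict:
    """Collect the identifiers needed to tie a result to its exact inputs."""
    config_json = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return {
        "code_sha256": sha256_file(code_path),
        "data_sha256": sha256_file(data_path),
        "config_sha256": hashlib.sha256(config_json.encode("utf-8")).hexdigest(),
        "config": config,
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }

# Hypothetical strategy file and data snapshot, created in a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    code = Path(tmp) / "strategy.py"
    data = Path(tmp) / "prices.csv"
    code.write_text("def signal(price, mean):\n    return price < mean\n")
    data.write_text("date,close\n2025-01-02,100.0\n")
    config = {"lookback": 20, "seed": 7}

    first = build_manifest(code, data, config)
    second = build_manifest(code, data, config)
    data.write_text("date,close\n2025-01-02,100.5\n")  # simulate a changed data snapshot
    third = build_manifest(code, data, config)
```

Here `first` and `second` are identical, while `third` differs only in `data_sha256`, which pinpoints the data snapshot as the input that changed. Storing the manifest next to each result gives a reviewer something concrete to check.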
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from Deltaray and is being posted with its permission. The views expressed in this material are solely those of the author and/or Deltaray and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
Throughout the lesson, please keep in mind that the examples discussed are purely for technical demonstration purposes, and do not constitute trading advice. Also, it is important to remember that placing trades in a paper account is recommended before any live trading.