[Figure: A technical diagram showing the Gemini Deep Research agent connecting to various external data sources via the Model Context Protocol.]

Google Releases Gemini Deep Research Max with Arbitrary MCP Support

Google has officially moved its autonomous research capabilities from a closed-beta experiment into a full-scale developer platform. The release of Gemini Deep Research and Deep Research Max via the new Interactions API marks a shift from simple web-searching chatbots to autonomous agents capable of navigating proprietary data silos and executing complex, multi-step reasoning chains.

This release is anchored by two major technical milestones: the integration of the Model Context Protocol (MCP) for arbitrary tool support, and a new state-of-the-art result on the Humanity’s Last Exam (HLE) benchmark, where the “Max” variant achieved a score of 54.6%, surpassing both GPT-5 Pro and Gemini 3 Deep Think on the HLE leaderboard.

The Two-Tier Agent Strategy

Google is splitting the research experience into two distinct modes to balance the classic trade-off between latency and depth. Both are powered by the Gemini 3.1 Pro reasoning core, which features a 1-million-token context window and a 64K-token output limit (Google Blog).

  1. Deep Research: Optimized for speed and interactive use. It is designed for real-time integration into user-facing applications where a response is needed in seconds or minutes rather than hours. It replaces the initial December 2024 preview with significantly lower per-query costs.
  2. Deep Research Max: This is the “heavy lifter.” It uses extended test-time compute to iteratively reason, search, and refine. It is intended for asynchronous workflows—like generating a 20-page due diligence report overnight.

Arbitrary Tool Support via MCP

Perhaps the most significant update for engineers is the native support for the Model Context Protocol (MCP). This allows the Deep Research agent to step outside the open web and query specialized, gated, or local data repositories.

By using a client-host-server architecture based on JSON-RPC 2.0, developers can connect the agent to custom MCP servers that wrap internal databases, Jira instances, or local file systems. This transforms the agent from a general-purpose searcher into a specialized analyst that can cross-reference public market trends with a company’s internal Gmail and Drive data (Google AI for Developers).
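To make the JSON-RPC 2.0 plumbing concrete, here is a minimal sketch of the message shapes an MCP server handles. This is hand-rolled for illustration, not the official MCP SDK, and the `lookup_ticket` tool and its data are hypothetical stand-ins for an internal Jira wrapper.

```python
import json

# Hypothetical internal data the MCP server exposes to the agent.
TICKETS = {"PROJ-101": "Migrate billing service to gRPC"}

def handle(message: dict) -> dict:
    """Dispatch a single JSON-RPC 2.0 request and return the response."""
    method, params = message["method"], message.get("params", {})
    if method == "tools/list":
        # Advertise available tools so the agent can plan calls.
        result = {"tools": [{
            "name": "lookup_ticket",
            "description": "Fetch an internal Jira-style ticket summary",
            "inputSchema": {"type": "object",
                            "properties": {"key": {"type": "string"}}},
        }]}
    elif method == "tools/call" and params.get("name") == "lookup_ticket":
        key = params["arguments"]["key"]
        summary = TICKETS.get(key, "not found")
        result = {"content": [{"type": "text", "text": summary}]}
    else:
        return {"jsonrpc": "2.0", "id": message["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": message["id"], "result": result}

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "lookup_ticket",
                      "arguments": {"key": "PROJ-101"}}}
print(json.dumps(handle(request)))
```

A production server would speak this protocol over a stdio or HTTP transport supplied by an MCP SDK; the dispatch logic above is the part the wrapper around your internal system actually has to provide.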

Technical Implementation

Access is provided through the Interactions API, a stateful gateway designed for long-running tasks. Unlike the standard generate_content endpoint, the Interactions API allows for background execution and polling.


# Example conceptual workflow for triggering a research task
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Trigger an asynchronous research task
interaction = client.interactions.create(
    model="deep-research-max-preview-04-2026",
    input="Analyze the impact of LPU architectures on edge inference costs.",
    config={"background": True}
)

# The agent can now use MCP tools to query internal cost databases
print(f"Task ID: {interaction.id}")
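Since a background interaction returns immediately with a task ID, the client is expected to poll for completion. The sketch below shows a generic polling loop with an injectable status callable; the actual Interactions API method and status field names are assumptions, so the real call is only referenced in a comment.

```python
import time

def poll_until_done(get_status, interval=5.0, timeout=3600.0):
    """Poll get_status() until it reports a terminal state or we time out.

    get_status is any zero-argument callable returning a status string.
    Against the Interactions API this would wrap something like
    client.interactions.get(interaction.id).status (names assumed).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError("research task did not finish in time")

# Usage with a stubbed status source standing in for the real client:
states = iter(["running", "running", "completed"])
print(poll_until_done(lambda: next(states), interval=0.01))
```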

Benchmarking the “Thinking” Gap

Google also open-sourced DeepSearchQA, a benchmark containing 900 “causal chain” tasks across 17 fields. These tasks are designed so that step B cannot be completed without successfully analyzing the results of step A.
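The causal-chain structure implies a scoring rule stricter than per-question accuracy: a step can only earn credit if every step it depends on was also solved. A toy scorer along those lines (the task data is invented for illustration, not drawn from DeepSearchQA itself) might look like:

```python
def chain_score(steps, answers):
    """Score a causal-chain task: a step earns credit only if every
    earlier step in the chain was also answered correctly.

    steps   -- ordered list of (question_id, gold_answer) pairs
    answers -- dict mapping question_id -> model answer
    Returns the fraction of steps credited.
    """
    credited = 0
    for qid, gold in steps:
        if answers.get(qid) != gold:
            break  # chain is broken; later steps get no credit
        credited += 1
    return credited / len(steps)

# Hypothetical 3-step chain: B needs A's result, C needs B's.
task = [("A", "2019"), ("B", "$4.2B"), ("C", "down")]
print(chain_score(task, {"A": "2019", "B": "$4.2B", "C": "up"}))  # 2 of 3 credited
```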

Model                         HLE     DeepSearchQA
Gemini Deep Research Max      54.6%   93.3%
Gemini 3 Deep Think           48.4%   —
GPT-5 Pro (High Reasoning)    38.9%   65.2%
OpenAI Deep Research (o3)     26.6%   44.2%

Source: DeepSearchQA: Bridging the Comprehensiveness Gap

The “Max” version’s 93.3% on DeepSearchQA suggests that scaling test-time compute—allowing the model to “think” longer and explore parallel search trajectories—is currently the most effective way to close the gap on complex information retrieval.
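One simple way to picture "parallel search trajectories" is best-of-N selection: launch several independent research runs and keep the draft a verifier scores highest. The sketch below uses stub functions in place of model calls; the scoring and trajectory functions are entirely hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def best_of_n(run_trajectory, score, n=4):
    """Explore n independent trajectories in parallel and keep the one
    the verifier scores highest. run_trajectory(seed) -> draft and
    score(draft) -> float are both stand-ins for model calls."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        drafts = list(pool.map(run_trajectory, range(n)))
    return max(drafts, key=score)

# Stub trajectories: in this toy verifier, longer "drafts" score higher.
drafts = {0: "brief", 1: "a fuller analysis", 2: "ok", 3: "short"}
print(best_of_n(lambda seed: drafts[seed], score=len))
```

More compute buys more trajectories (larger n) and longer per-trajectory reasoning, which is the knob the "Max" tier appears to turn.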

Pricing and Availability

Google is pricing this aggressively to capture market share from OpenAI. While OpenAI’s Deep Research is bundled into the $200/mo ChatGPT Pro tier, Google is offering the standard Deep Research agent within the $20/mo Gemini Advanced tier.

For developers using the API, the costs are roughly $2.00 per 1M input tokens and $12.00 per 1M output tokens, with a $14 charge per 1,000 Google Search queries (though the first 5,000 searches per month are free) (Gemini API Pricing). Early reports suggest a typical comprehensive research query costs approximately $3.00 total.
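Plugging the published rates into a small estimator shows how a query lands near that $3.00 figure; the token and search counts below are illustrative assumptions, not measured values.

```python
def research_query_cost(input_tokens, output_tokens, searches,
                        free_searches_used=0):
    """Estimate API cost in USD from the published rates: $2.00 / 1M input
    tokens, $12.00 / 1M output tokens, and $14 per 1,000 searches beyond
    the 5,000 free searches per month."""
    FREE_SEARCHES = 5_000
    cost = input_tokens / 1e6 * 2.00 + output_tokens / 1e6 * 12.00
    remaining_free = max(0, FREE_SEARCHES - free_searches_used)
    billable = max(0, searches - remaining_free)
    return cost + billable / 1_000 * 14.00

# A hypothetical deep query: 900K tokens read, 100K tokens written,
# 40 searches still inside the free monthly allowance.
print(round(research_query_cost(900_000, 100_000, 40), 2))  # 3.0
```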

Takeaways

  • MCP is the new standard: By adopting the Model Context Protocol, Google is making it trivial to point an autonomous agent at your own private data without building custom scrapers.
  • Test-time compute scales: The jump from GPT-5 Pro’s 65.2% to Max’s 93.3% on DeepSearchQA suggests that for research, “thinking longer” beats “training bigger.”
  • Enterprise-ready outputs: The addition of native infographic generation and Google Docs export suggests Google is targeting the “analyst-in-a-box” market directly.
  • API-first agents: The move to the Interactions API shows that Google expects these agents to be embedded in third-party apps, not just used in a chat window.
