Stop Chatting. Start Acting.
The ScrapingBee MCP Server connects your AI models to the live web, empowering them to browse, search, and extract data using our robust API that handles proxies, CAPTCHAs, and JavaScript rendering.
The Power of a Universal Language for AI
The Model Context Protocol (MCP) is an open standard that lets AI models reliably interact with external tools. It acts as a universal language between an AI (the client) and a tool provider like us (the server).
Instead of being limited to static training data, an MCP-enabled AI can request real-time actions and receive structured data back. This is the key to building powerful, autonomous agents.
Live Data Access
Access real-time information from any website.
Perform Actions
Go beyond text generation. Search, extract, and more.
Standardized & Reliable
A robust protocol ensures predictable results.
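Under the hood, MCP messages use the JSON-RPC 2.0 envelope. As a minimal sketch (the tool name and arguments below are illustrative placeholders, not a definitive reference), a client's `tools/call` request can be built like this:

```python
import json

# Build a minimal MCP `tools/call` request envelope (JSON-RPC 2.0).
# The tool name and arguments passed in are illustrative placeholders.
def build_tool_call(request_id: str, tool_name: str, arguments: dict) -> str:
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)

print(build_tool_call("1", "get_page_text", {"url": "https://example.com"}))
```

The server replies with a matching JSON-RPC response (or a Server-Sent Events stream) carrying the structured result, which is what makes the protocol predictable for autonomous agents.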
A Comprehensive Suite of Tools
get_page_text
Core
Scrapes a URL and returns clean text or Markdown. Use premium_proxy or stealth_proxy options to bypass advanced blocking.
get_page_html
Core
Retrieves the full, raw HTML content of a webpage, essential for tasks that require parsing the DOM or analyzing site structure.
get_screenshot
Core
Captures a visual screenshot of a webpage. Can capture the full page or a specific element identified by a CSS selector.
get_file
Core
A versatile tool to fetch any file from a URL, including images, PDFs, or other documents.
get_google_search_results
Search
Performs a Google search with support for classic results, news, images, maps, Google Lens, and even Google's AI mode.
get_amazon_product_details
E-commerce
Fetches clean, detailed information for any Amazon product using its ASIN, including pricing, ratings, and review counts.
get_amazon_search_results
E-commerce
Scrapes Amazon search result pages for a given query, providing a structured list of products to power e-commerce analysis.
get_walmart_product_details
E-commerce
Retrieves structured data for a specific Walmart product, including price and availability, localized by store or zip code.
get_walmart_search_results
E-commerce
Performs a product search on Walmart and returns a structured list of results with filtering and sorting capabilities.
ask_chatgpt
AI
Sends a prompt to a ChatGPT model, with an option to enhance the response with live web search results for up-to-date answers.
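Every tool above accepts a JSON arguments object. As a hedged sketch of what those payloads might look like (parameter names such as `url`, `selector`, and `query` are assumptions inferred from the descriptions, not a definitive API reference):

```python
# Illustrative argument payloads for a few tools. Parameter names here
# are assumptions based on the tool descriptions above, not guaranteed
# to match the server's exact schema; use tools/list to discover it.
get_page_text_args = {
    "url": "https://example.com",
    "premium_proxy": True,  # escalate only if the default setup is blocked
}
get_screenshot_args = {
    "url": "https://example.com",
    "selector": "#main-content",  # omit to capture the full page
}
search_args = {"query": "mechanical keyboards"}

all_args = [get_page_text_args, get_screenshot_args, search_args]
```

In practice, a client should read each tool's `inputSchema` from the `tools/list` response rather than hard-coding parameter names, which is exactly what the custom client below does.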
Connect to Your AI in Minutes
Connect to Claude
Integrate our MCP server with clients like Claude Desktop by adding it to your configuration file.
// In claude_desktop_config.json
{
  "mcpServers": {
    "scrapingbee": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://mcp.scrapingbee.com/mcp?api_key=<your_api_key>"
      ]
    }
  }
}
Create a Custom MCP Client
Build a simple Python client with `httpx` and Google's Gemini SDK (`google-generativeai`) to interact with the ScrapingBee MCP server programmatically.
# custom_mcp_client.py
import os
import re
import json
import asyncio

import httpx
import google.generativeai as genai
from typing import Any, Dict, List

# ---------------------------------------------------------------------
# 🔧 Configuration
# ---------------------------------------------------------------------
MCP_SERVER_URL = "https://mcp.scrapingbee.com/mcp"
SCRAPINGBEE_API_KEY = os.environ.get("SCRAPINGBEE_API_KEY")
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")

# ---------------------------------------------------------------------
# ⚙️ MCP Client
# ---------------------------------------------------------------------
class MCPClient:
    """A minimal async client to communicate with an MCP server."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = f"{base_url}?api_key={api_key}"
        self.session_id = None
        self.http_client = httpx.AsyncClient(timeout=300)
        self.headers = {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "User-Agent": "Minimal-MCP-Client/1.0",
        }

    async def _send_request(self, payload: Dict[str, Any]) -> Dict[str, Any] | None:
        """Send a JSON-RPC request and handle JSON or SSE responses."""
        headers = self.headers.copy()
        if self.session_id:
            headers["Mcp-Session-Id"] = self.session_id
        try:
            response = await self.http_client.post(self.base_url, json=payload, headers=headers)
            response.raise_for_status()
            text_str = response.text
            # Detect Server-Sent Events (SSE) and extract the JSON payload
            if "data:" in text_str:
                match = re.search(r"data:\s*({.*})", text_str, re.DOTALL)
                if match:
                    return json.loads(match.group(1))
            # Some notifications return no body
            if response.status_code in (202, 204) or not text_str:
                return {"status": "ok"}
            return json.loads(text_str)
        except Exception as e:
            print(f"❌ Request error: {e}")
            return None

    async def initialize(self) -> List[Dict[str, Any]] | None:
        """Perform the MCP handshake and list available tools."""
        # 1️⃣ Initialize the session
        init_payload = {
            "jsonrpc": "2.0",
            "id": "1",
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",
                "capabilities": {},
                "clientInfo": {"name": "MinimalClient"},
            },
        }
        async with self.http_client.stream("POST", self.base_url, json=init_payload, headers=self.headers) as response:
            self.session_id = response.headers.get("Mcp-Session-Id")
        if not self.session_id:
            print("❌ Could not establish MCP session.")
            return None
        print(f"✅ MCP Session established: {self.session_id}")

        # 2️⃣ Send `notifications/initialized`
        notify_payload = {"jsonrpc": "2.0", "method": "notifications/initialized", "params": {}}
        await self._send_request(notify_payload)

        # 3️⃣ List the available tools
        list_payload = {"jsonrpc": "2.0", "id": "2", "method": "tools/list", "params": {}}
        response_data = await self._send_request(list_payload)
        if response_data and "tools" in response_data.get("result", {}):
            tools = response_data["result"]["tools"]
            print(f"🔧 {len(tools)} tools fetched successfully from MCP.")
            return tools
        print("❌ Failed to retrieve tool list.")
        return None

    async def call_tool(self, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any] | None:
        """Execute a specific MCP tool."""
        payload = {
            "jsonrpc": "2.0",
            "id": "3",
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
        return await self._send_request(payload)

# ---------------------------------------------------------------------
# 🧩 Gemini Tool Schema Conversion
# ---------------------------------------------------------------------
def format_tools_for_gemini(tools: List[Dict[str, Any]]) -> List[genai.protos.Tool]:
    """
    Convert MCP tool schemas into Gemini FunctionDeclarations,
    allowing Gemini to reason about which tool to call.
    """
    gemini_tools = []
    type_map = {
        "string": genai.protos.Type.STRING,
        "integer": genai.protos.Type.INTEGER,
        "number": genai.protos.Type.NUMBER,
        "boolean": genai.protos.Type.BOOLEAN,
    }
    for tool in tools:
        properties = {}
        input_schema = tool.get("inputSchema", {})
        # Build the parameter schema
        for name, prop in input_schema.get("properties", {}).items():
            properties[name] = genai.protos.Schema(
                type=type_map.get(prop.get("type", "string"), genai.protos.Type.STRING),
                description=prop.get("description", ""),
            )
        func = genai.protos.FunctionDeclaration(
            name=tool["name"],
            description=tool.get("description", ""),
            parameters=genai.protos.Schema(
                type=genai.protos.Type.OBJECT,
                properties=properties,
                required=input_schema.get("required", []),
            ),
        )
        gemini_tools.append(genai.protos.Tool(function_declarations=[func]))
    return gemini_tools

# ---------------------------------------------------------------------
# 🤖 Main Program Flow
# ---------------------------------------------------------------------
async def main():
    """Run the MCP–Gemini integration demo."""
    if not SCRAPINGBEE_API_KEY or not GEMINI_API_KEY:
        print("❌ Please set the SCRAPINGBEE_API_KEY and GEMINI_API_KEY environment variables.")
        return

    # 1️⃣ Connect to MCP and list the tools
    mcp_client = MCPClient(MCP_SERVER_URL, SCRAPINGBEE_API_KEY)
    tools = await mcp_client.initialize()
    if not tools:
        return

    # 2️⃣ Configure Gemini with the MCP tool schemas
    genai.configure(api_key=GEMINI_API_KEY)
    gemini_tools = format_tools_for_gemini(tools)
    model = genai.GenerativeModel(model_name="gemini-2.5-pro", tools=gemini_tools)
    print("\n✨ Gemini is now configured with MCP tools.")

    # 3️⃣ Get the user request and let Gemini pick a tool
    user_prompt = input("\nAsk the AI to perform a task (e.g., 'Search Walmart for iPhones'):\n> ")
    if not user_prompt.strip():
        return

    print("\n🤖 Asking Gemini which MCP tool to use...")
    response = model.generate_content(user_prompt)
    function_call = response.candidates[0].content.parts[0].function_call
    if not function_call.name:
        print("\n💬 Gemini Response (no tool selected):")
        print(response.text)
        return

    tool_name = function_call.name
    tool_args = dict(function_call.args)
    print(f"🧠 Gemini chose tool: '{tool_name}' with arguments: {tool_args}")

    # 4️⃣ Execute Gemini's selected tool
    print(f"\n⚙️ Running '{tool_name}' on MCP...")
    result = await mcp_client.call_tool(tool_name, tool_args)

    # 5️⃣ Display the result
    print("\n📄 --- MCP Tool Result ---")
    print(json.dumps(result, indent=2) if result else "No result returned.")

# ---------------------------------------------------------------------
# 🚀 Entry Point
# ---------------------------------------------------------------------
if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nExiting gracefully.")
Frequently Asked Questions
How do I get a ScrapingBee API key?
An API key is required to use these tools. You can get a free key with 1,000 credits by registering here. The key should be passed as a query parameter in the MCP URL.
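Constructing that URL in Python is a one-liner, mirroring the format used in the Claude Desktop config earlier on this page:

```python
import os

# The ScrapingBee MCP endpoint expects the API key as a query parameter.
MCP_BASE = "https://mcp.scrapingbee.com/mcp"

def mcp_url(api_key: str) -> str:
    """Append the API key as the api_key query parameter."""
    return f"{MCP_BASE}?api_key={api_key}"

url = mcp_url(os.environ.get("SCRAPINGBEE_API_KEY", "<your_api_key>"))
```

Keeping the key in an environment variable (rather than hard-coding it) avoids leaking it into version control.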
When should I use premium_proxy or stealth_proxy?
Start without them. If you get blocked (e.g., you receive an error or a CAPTCHA page), retry with premium_proxy: true. If you are still blocked, escalate to stealth_proxy: true.
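That escalation strategy can be sketched in a few lines, assuming a `call_tool(name, args)` helper like the one in the custom Python client on this page (the `premium_proxy` and `stealth_proxy` argument names come from the tool descriptions; the error-detection check is an assumption about the result shape):

```python
# Try the cheapest configuration first, escalating only when blocked.
# `call_tool` is assumed to return a dict, with an "error" key on failure.
async def fetch_with_escalation(call_tool, url: str):
    for extra in ({}, {"premium_proxy": True}, {"stealth_proxy": True}):
        result = await call_tool("get_page_text", {"url": url, **extra})
        if result is not None and "error" not in result:
            return result  # success: stop escalating
    return None  # still blocked even with stealth_proxy
```

Since premium and stealth requests cost more credits, escalating lazily like this keeps costs down on sites that don't need it.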
Can I scrape more than just text?
Yes. The server provides `get_page_html` to retrieve the full raw HTML, `get_screenshot` to capture the page (or a specific element), and `get_file` to download files such as images or PDFs.
How much does each request cost?
Each request is charged based on the proxy and features you use. For more details, you can refer to this link.
Ready to Build?
Get your free API key with 1,000 credits to start building powerful AI agents today.
Start for Free