Stop Chatting. Start Acting.

The ScrapingBee MCP Server connects your AI models to the live web, empowering them to browse, search, and extract data using our robust API that handles proxies, CAPTCHAs, and JavaScript rendering.

MODEL CONTEXT PROTOCOL

The Power of a Universal Language for AI

The Model Context Protocol (MCP) is an open-source specification that allows AI models to reliably interact with external tools. It's a universal language between an AI (the client) and a tool provider like us (the server).

Instead of being limited to static training data, an MCP-enabled AI can request real-time actions and receive structured data back. This is the key to building powerful, autonomous agents.
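Concretely, each action is a JSON-RPC 2.0 message: the client sends a tools/call request naming a tool and its arguments, and the server returns the tool's structured result. A minimal sketch of such a request, written as the Python dictionary our example client below sends (the argument shape is illustrative):

# A tools/call request as an MCP client sends it (argument names are illustrative)
request = {
    "jsonrpc": "2.0",
    "id": "1",
    "method": "tools/call",
    "params": {
        "name": "get_page_text",                       # one of the tools listed below
        "arguments": {"url": "https://example.com"},   # assumed argument shape
    },
}
# The server replies with a JSON-RPC "result" object containing the tool's output.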

Live Data Access

Access real-time information from any website.

Perform Actions

Go beyond text generation. Search, extract, and more.

Standardized & Reliable

A robust protocol ensures predictable results.

THE TOOLBOX

A Comprehensive Suite of Tools

get_page_text Core

Scrapes a URL and returns clean text or Markdown. Use premium_proxy or stealth_proxy options to bypass advanced blocking.

get_page_html Core

Retrieves the full, raw HTML content of a webpage, essential for tasks that require parsing the DOM or analyzing site structure.

get_screenshot Core

Captures a visual screenshot of a webpage. Can capture the full page or a specific element identified by a CSS selector.

get_file Core

A versatile tool to fetch any file from a URL, including images, PDFs, or other documents.

get_google_search_results Search

Performs a Google search with support for classic results, news, images, maps, Google Lens, and even Google's AI mode.

get_amazon_product_details E-commerce

Fetches clean, detailed information for any Amazon product using its ASIN, including pricing, ratings, and review counts.

get_amazon_search_results E-commerce

Scrapes Amazon search result pages for a given query, providing a structured list of products to power e-commerce analysis.

get_walmart_product_details E-commerce

Retrieves structured data for a specific Walmart product, including price and availability, localized by store or zip code.

get_walmart_search_results E-commerce

Performs a product search on Walmart and returns a structured list of results with filtering and sorting capabilities.

ask_chatgpt AI

Sends a prompt to a ChatGPT model, with an option to enhance the response with live web search results for up-to-date answers.
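Every tool above is invoked the same way: a tools/call request with the tool's name and an arguments object. A hedged sketch of what a few of those argument payloads might look like as Python dictionaries (parameter names beyond premium_proxy and stealth_proxy are assumptions; the authoritative schemas come back from a tools/list call):

# Illustrative argument payloads; confirm exact field names via tools/list.
get_page_text_args = {
    "url": "https://example.com/pricing",
    "premium_proxy": True,              # escalate only if the default request is blocked
}
get_google_search_results_args = {
    "query": "web scraping api",        # assumed parameter name for the search terms
}
get_amazon_product_details_args = {
    "asin": "B0EXAMPLE123",             # the product's ASIN, as described above
}
# e.g. result = await mcp_client.call_tool("get_page_text", get_page_text_args)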

GETTING STARTED

Connect to Your AI in Minutes

Claude
Custom Python Client

Connect to Claude

Integrate our MCP server with clients like Claude Desktop by adding it to your configuration file.

// In claude_desktop_config.json
{
  "mcpServers": {
    "scrapingbee": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://mcp.scrapingbee.com/mcp?api_key=<your_api_key>"
      ]
    }
  }
}

Create a Custom MCP Client

Build a simple Python client that uses `httpx` to talk to the ScrapingBee MCP server and Google's `google-generativeai` SDK to let Gemini decide which tool to call.

# custom_mcp_client.py
import os
import re
import json
import asyncio
import httpx
import google.generativeai as genai
from typing import Dict, Any, List

# ---------------------------------------------------------------------
# 🔧 Configuration
# ---------------------------------------------------------------------
MCP_SERVER_URL = "https://mcp.scrapingbee.com/mcp"
SCRAPINGBEE_API_KEY = os.environ.get("SCRAPINGBEE_API_KEY")
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")


# ---------------------------------------------------------------------
# ⚙️ MCP Client
# ---------------------------------------------------------------------
class MCPClient:
    """A minimal async client to communicate with an MCP server."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = f"{base_url}?api_key={api_key}"
        self.session_id = None
        self.http_client = httpx.AsyncClient(timeout=300)
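        # Accept both plain JSON and Server-Sent Events, since the MCP server
        # may stream its responses (handled in _send_request below).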
        self.headers = {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            "User-Agent": "Minimal-MCP-Client/1.0",
        }

    async def _send_request(self, payload: Dict[str, Any]) -> Dict[str, Any] | None:
        """Send a JSON-RPC request and handle JSON or SSE responses."""
        headers = self.headers.copy()
        if self.session_id:
            headers["Mcp-Session-Id"] = self.session_id

        try:
            response = await self.http_client.post(self.base_url, json=payload, headers=headers)
            response.raise_for_status()

            text = await response.aread()
            text_str = text.decode("utf-8")

            # Detect Server-Sent Events (SSE) and extract JSON
            if "data:" in text_str:
                match = re.search(r"data:\s*({.*})", text_str, re.DOTALL)
                if match:
                    return json.loads(match.group(1))

            # Some notifications may have no body
            if response.status_code in [202, 204] or not text_str:
                return {"status": "ok"}

            return json.loads(text_str)

        except Exception as e:
            print(f"❌ Request error: {e}")
            return None

    async def initialize(self) -> List[Dict[str, Any]] | None:
        """Perform MCP handshake and list available tools."""
        # 1️⃣ Initialize session
        init_payload = {
            "jsonrpc": "2.0",
            "id": "1",
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",
                "capabilities": {},
                "clientInfo": {"name": "MinimalClient"},
            },
        }

        async with self.http_client.stream("POST", self.base_url, json=init_payload, headers=self.headers) as response:
            self.session_id = response.headers.get("Mcp-Session-Id")

        if not self.session_id:
            print("❌ Could not establish MCP session.")
            return None

        print(f"✅ MCP Session established: {self.session_id}")

        # 2️⃣ Send `notifications/initialized`
        notify_payload = {"jsonrpc": "2.0", "method": "notifications/initialized", "params": {}}
        await self._send_request(notify_payload)

        # 3️⃣ List available tools
        list_payload = {"jsonrpc": "2.0", "id": "2", "method": "tools/list", "params": {}}
        response_data = await self._send_request(list_payload)

        if response_data and "result" in response_data and "tools" in response_data["result"]:
            tool_count = len(response_data["result"]["tools"])
            print(f"🔧 {tool_count} tools fetched successfully from MCP.")
            return response_data["result"]["tools"]

        print("❌ Failed to retrieve tool list.")
        return None

    async def call_tool(self, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any] | None:
        """Execute a specific MCP tool."""
        payload = {
            "jsonrpc": "2.0",
            "id": "3",
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
        return await self._send_request(payload)


# ---------------------------------------------------------------------
# 🧩 Gemini Tool Schema Conversion
# ---------------------------------------------------------------------
def format_tools_for_gemini(tools: List[Dict[str, Any]]) -> List[genai.protos.Tool]:
    """
    Converts MCP tool schemas into Gemini FunctionDeclarations,
    allowing Gemini to reason about which tool to call.
    """
    gemini_tools = []
    type_map = {
        "string": genai.protos.Type.STRING,
        "integer": genai.protos.Type.INTEGER,
        "number": genai.protos.Type.NUMBER,
        "boolean": genai.protos.Type.BOOLEAN,
    }

    for tool in tools:
        properties = {}
        input_schema = tool.get("inputSchema", {})

        # Build parameter schema
        if "properties" in input_schema:
            for name, prop in input_schema["properties"].items():
                properties[name] = genai.protos.Schema(
                    type=type_map.get(prop.get("type", "string"), genai.protos.Type.STRING),
                    description=prop.get("description", ""),
                )

        func = genai.protos.FunctionDeclaration(
            name=tool["name"],
            description=tool["description"],
            parameters=genai.protos.Schema(
                type=genai.protos.Type.OBJECT,
                properties=properties,
                required=input_schema.get("required", []),
            ),
        )

        gemini_tools.append(genai.protos.Tool(function_declarations=[func]))

    return gemini_tools


# ---------------------------------------------------------------------
# 🤖 Main Program Flow
# ---------------------------------------------------------------------
async def main():
    """Runs the MCP–Gemini integration demo."""
    if not SCRAPINGBEE_API_KEY or not GEMINI_API_KEY:
        print("❌ Please set SCRAPINGBEE_API_KEY and GEMINI_API_KEY environment variables.")
        return

    # 1️⃣ Connect to MCP and list tools
    mcp_client = MCPClient(MCP_SERVER_URL, SCRAPINGBEE_API_KEY)
    tools = await mcp_client.initialize()
    if not tools:
        return

    # 2️⃣ Configure Gemini with MCP tool schemas
    genai.configure(api_key=GEMINI_API_KEY)
    gemini_tools = format_tools_for_gemini(tools)
    model = genai.GenerativeModel(model_name="gemini-2.5-pro", tools=gemini_tools)
    print("\n✨ Gemini is now configured with MCP tools.")

    # 3️⃣ Get user request and let Gemini pick a tool
    user_prompt = input("\nAsk the AI to perform a task (e.g., 'Search Walmart for iPhones'):\n> ")
    if not user_prompt.strip():
        return

    print("\n🤖 Asking Gemini which MCP tool to use...")
    response = model.generate_content(user_prompt)
    function_call = response.candidates[0].content.parts[0].function_call

    if not function_call.name:
        print("\n💬 Gemini Response (no tool selected):")
        print(response.text)
        return

    tool_name = function_call.name
    tool_args = dict(function_call.args)
    print(f"🧠 Gemini chose tool: '{tool_name}' with arguments: {tool_args}")

    # 4️⃣ Execute Gemini’s selected tool
    print(f"\n⚙️ Running '{tool_name}' on MCP...")
    result = await mcp_client.call_tool(tool_name, tool_args)

    # 5️⃣ Display the result
    print("\n📄 --- MCP Tool Result ---")
    print(json.dumps(result, indent=2) if result else "No result returned.")

# ---------------------------------------------------------------------
# 🚀 Entry Point
# ---------------------------------------------------------------------
if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\nExiting gracefully.")
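
Once the handshake succeeds, you can also skip the LLM step and call a tool directly. A minimal sketch reusing the MCPClient class above (the url argument name is an assumption; inspect the schemas returned by tools/list for the exact shape):

# direct_call.py -- minimal sketch reusing MCPClient from custom_mcp_client.py
import os
import json
import asyncio

from custom_mcp_client import MCPClient, MCP_SERVER_URL

async def run():
    client = MCPClient(MCP_SERVER_URL, os.environ["SCRAPINGBEE_API_KEY"])
    if not await client.initialize():  # handshake + tool discovery
        return
    # Argument names are illustrative; check tools/list for the published schema.
    result = await client.call_tool("get_page_text", {"url": "https://example.com"})
    print(json.dumps(result, indent=2))

if __name__ == "__main__":
    asyncio.run(run())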

QUESTIONS?

Frequently Asked Questions

How do I get a ScrapingBee API key?

An API key is required to use these tools. You can get a free key with 1,000 credits by signing up for a ScrapingBee account. The key is passed as the api_key query parameter in the MCP URL.

When should I use premium_proxy or stealth_proxy?

Start without them. If you get blocked (e.g., you receive an error or a CAPTCHA page), retry with premium_proxy: true. If that still fails, escalate to stealth_proxy: true.
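A rough sketch of that escalation, reusing the MCPClient from the custom client example (the url argument name is an assumption, and the failure check depends on the actual response shape):

# Retry with progressively stronger proxies only when the cheaper request fails.
async def fetch_with_fallback(client, url):
    for extra in ({}, {"premium_proxy": True}, {"stealth_proxy": True}):
        result = await client.call_tool("get_page_text", {"url": url, **extra})
        if result and "error" not in result:  # adapt this check to the real response format
            return result
    return None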

Can I scrape more than just text?

Yes. The server provides `get_page_html` to retrieve a page's full HTML, `get_screenshot` to capture the page (or a specific element), and `get_file` to download files such as images or PDFs.
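For instance, a hedged sketch of capturing one element and downloading a PDF with the custom client (the selector argument name is an assumption; confirm it via tools/list):

# Illustrative calls only; argument names such as "selector" are assumptions.
async def grab_assets(client):
    shot = await client.call_tool(
        "get_screenshot",
        {"url": "https://example.com", "selector": "#pricing-table"},
    )
    pdf = await client.call_tool(
        "get_file",
        {"url": "https://example.com/whitepaper.pdf"},
    )
    return shot, pdf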

How much does each request cost?

Each request is charged in API credits based on the proxy and features you use; see the ScrapingBee documentation for detailed credit costs.

Ready to Build?

Get your free API key with 1,000 credits to start building powerful AI agents today.

Start for Free