

Website Inspector

Overview

The Website Inspector is an MCP server that bridges the gap between AI agents and content on the web. It provides a robust tool that can crawl a website, process its content, and perform a semantic search to find the information most relevant to a user's natural language query.

This server acts as a specialized Retrieval-Augmented Generation (RAG) tool, allowing an AI to "read" a website and answer questions about it without having to process the entire site from scratch for every query.

Key Features

  • Deep Crawling: Can navigate and index content across multiple pages of a target website, up to a configurable depth.
  • Intelligent Caching: Automatically caches website content. If the cached content is fresh, it serves results instantly; if it's outdated, it re-crawls to ensure the AI has the most current information (see the sketch after this list).
  • Semantic Search: Uses advanced vector embeddings (text-embedding-005) to understand the meaning behind a query, not just keywords, leading to highly relevant results.
  • Content Extraction: Focuses on the main content of pages, stripping away irrelevant boilerplate like navbars, footers, and ads.
  • Configurable & Secure: Offers control over crawl depth, page limits, and cache duration. Maintained by Portal One and supports standard MCP authentication protocols.
  • Clear Progress Notifications: Provides detailed, real-time feedback on the progress of a request, from crawling and embedding to searching.
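
The caching behavior above comes down to a simple age check against maxAge. The sketch below is conceptual only and not the server's actual implementation; the cachedAt value and isCacheFresh helper are hypothetical names used for illustration.

```typescript
// Conceptual sketch of the cache-freshness check described above.
// `cachedAt` and `isCacheFresh` are hypothetical; they are not part of the server's API.
function isCacheFresh(cachedAt: number, maxAgeMs: number = 3_600_000): boolean {
  // Content counts as fresh if it was indexed less than `maxAgeMs` milliseconds ago.
  return Date.now() - cachedAt < maxAgeMs;
}

// Example: content indexed 30 minutes ago is still fresh under the default 1-hour maxAge,
// so cached results would be served instead of triggering a re-crawl.
const indexedThirtyMinutesAgo = Date.now() - 30 * 60 * 1000;
console.log(isCacheFresh(indexedThirtyMinutesAgo)); // true
```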

Use Cases

  • AI-Powered Research: An AI agent can use this tool to research topics from specific, authoritative websites.
  • Customer Support Automation: An AI can answer user questions by consulting a company's official documentation or help center.
  • Content Summarization: An agent can find the most relevant sections of a long article or blog post to create a concise summary.
  • Competitive Analysis: An AI can be tasked to find information about a competitor's products or services from their website.

Server Details

  • Maintainer: Portal One
  • Maintainer Site: portal.one
  • MCP URL: https://mcp.website-inspector.portal.one/mcp
  • Authentication: OAuth2 Dynamic Client Registration

Tool: website_search

This is the primary tool provided by the Website Inspector server.

Description: Performs a semantic search of a given website and returns the most relevant text chunks. It will crawl the site if the content is not already cached or if the cached content is expired.
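
For orientation, calling website_search from a client built with the official TypeScript MCP SDK might look roughly like the sketch below. This is a minimal example under assumptions: it uses the @modelcontextprotocol/sdk package with its Streamable HTTP transport and omits the OAuth2 Dynamic Client Registration flow that the server requires, so treat it as a starting point rather than a complete client.

```typescript
// Minimal sketch: call website_search over MCP using the TypeScript SDK.
// Assumes @modelcontextprotocol/sdk; the required OAuth2 Dynamic Client
// Registration flow is omitted for brevity.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

async function main() {
  const transport = new StreamableHTTPClientTransport(
    new URL("https://mcp.website-inspector.portal.one/mcp"),
  );
  const client = new Client({ name: "example-client", version: "1.0.0" });
  await client.connect(transport);

  // Only the two required parameters are set here; optional ones use their defaults.
  const result = await client.callTool({
    name: "website_search",
    arguments: {
      url: "https://portal.one/faq",
      query: "How do I sign up?",
    },
  });

  console.log(JSON.stringify(result, null, 2));
  await client.close();
}

main().catch(console.error);
```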

Input Parameters

  • url (string, required): The URL of the website to crawl and search.
  • query (string, required): The natural language query to run against the website content.
  • k (integer, optional, default 3): The number of search results (text chunks) to return.
  • maxDepth (integer, optional, default 2): Maximum depth to crawl. 0 scrapes only the entered URL.
  • limit (integer, optional, default 50): The maximum number of pages to crawl during the indexing process.
  • maxAge (integer, optional, default 3600000): Maximum age in milliseconds (1 hour) for cached content before re-crawling.
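
As an illustration, an arguments object that overrides the optional parameters alongside the required ones might look like the following; the specific values are examples, not recommendations.

```typescript
// Illustrative website_search arguments; defaults are k=3, maxDepth=2, limit=50, maxAge=3600000.
const args = {
  url: "https://portal.one",                      // required: site to crawl and search
  query: "How does AI workflow scheduling work?", // required: natural language question
  k: 5,                   // return the 5 most relevant chunks
  maxDepth: 1,            // follow links one level from the entered URL
  limit: 20,              // index at most 20 pages
  maxAge: 15 * 60 * 1000, // accept cached content up to 15 minutes old
};
```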

Output Schema

On a successful run, the tool returns a structuredContent object with the following format:

```json
{
  "results": [
    {
      "pageContent": "A string containing the relevant text chunk found on the page.",
      "metadata": {
        "title": "The title of the source page.",
        "description": "The meta description of the source page.",
        "url": "The exact URL where the text chunk was found.",
        "loc": { "lines": { "from": 1, "to": 10 } }
      }
    }
  ]
}
```
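
Expressed as TypeScript types, the documented shape looks roughly like this; the interface names are hypothetical, and only the field names and nesting come from the schema above.

```typescript
// Hypothetical type names describing the structuredContent shape documented above.
interface WebsiteSearchOutput {
  results: WebsiteSearchResult[];
}

interface WebsiteSearchResult {
  pageContent: string; // the relevant text chunk found on the page
  metadata: {
    title: string;       // title of the source page
    description: string; // meta description of the source page
    url: string;         // exact URL where the chunk was found
    loc: { lines: { from: number; to: number } }; // line range of the chunk
  };
}
```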

Example Workflow

Scenario: A user asks an AI, "How do I sign up for Portal One?" The AI uses the website_search tool to find the answer on the portal.one website.

1. Tool Call (Request from AI):

```json
{
  "tool": "website_search",
  "args": {
    "url": "https://portal.one/faq",
    "query": "How do I sign up?"
  }
}
```

2. Progress Notifications (from Server):

The user would see a series of clear progress steps:

[1/6] Initializing search for https://portal.one/faq...
[2/6] Cached content was not found. Crawling fresh data...
[3/6] Processing 1 page from https://portal.one/faq...
[4/6] Embedding content for search...
[5/6] Searching for "How do I sign up?"...
[6/6] Successfully completed search.

3. Successful Response (from Server):

```json
{
  "structuredContent": {
    "results": [
      {
        "pageContent": "### How do I sign up or get started with Portal One?\n\nGetting started is easy! Visit our website at [portal.one](https://portal.one/) and look for the \"Get Started Now\" button. The process is quick, and you'll be on your way to commanding your AI agents.",
        "metadata": {
          "title": "Portal One - Frequently Asked Questions",
          "description": "Find answers to common questions about Portal One, AI agent management, LLMs, MCP servers, AI workflow scheduling, and human oversight.",
          "url": "https://portal.one/faq/",
          "loc": {
            "lines": {
              "from": 21,
              "to": 27
            }
          }
        }
      }
    ]
  }
}
```
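
A client might turn a response like this into an answer with sources by combining pageContent and metadata from each result. The sketch below is illustrative; the SearchResult type and formatAnswer helper are not part of the server's API.

```typescript
// Sketch: format search results into an answer followed by its sources.
type SearchResult = {
  pageContent: string;
  metadata: { title: string; url: string };
};

function formatAnswer(results: SearchResult[]): string {
  return results
    .map((r) => `${r.pageContent}\n\nSource: ${r.metadata.title} (${r.metadata.url})`)
    .join("\n\n---\n\n");
}

// With the example response above, this yields the sign-up instructions followed by
// "Source: Portal One - Frequently Asked Questions (https://portal.one/faq/)".
```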

Error Handling

  • If the target website cannot be crawled (e.g., due to a firewall or if it's offline), the tool will return a clear error message like: Could not crawl any content from [URL].
  • If no relevant results are found for the query, it will return: No results found for query "[query]" at [URL].
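
How these messages surface depends on the client; the loose sketch below assumes the error text is available as a plain string and simply matches the documented prefixes. Both the helper and this delivery assumption are illustrative rather than part of the server's contract.

```typescript
// Loose sketch: classify the documented error messages by their prefixes.
// Assumes the error text arrives as a plain string; the actual delivery mechanism
// (error content vs. thrown exception) is not specified here.
function classifySearchError(message: string): "crawl_failed" | "no_results" | "unknown" {
  if (message.startsWith("Could not crawl any content from")) return "crawl_failed";
  if (message.startsWith("No results found for query")) return "no_results";
  return "unknown";
}
```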