Website Inspector
Overview
The Website Inspector is an MCP server that connects AI agents to the content of the web. It provides a tool that crawls a website, processes its content, and runs a semantic search to return the passages most relevant to a user's natural language query.
This server acts as a specialized Retrieval-Augmented Generation (RAG) tool, allowing an AI to "read" a website and answer questions about it without having to process the entire site from scratch for every query.
Key Features
- Deep Crawling: Can navigate and index content across multiple pages of a target website, up to a configurable depth.
- Intelligent Caching: Automatically caches website content. If the cached content is still fresh, results are served from the cache without re-crawling. If it is outdated, the site is re-crawled so the AI has current information.
- Semantic Search: Uses advanced vector embeddings (text-embedding-005) to understand the meaning behind a query, not just keywords, leading to highly relevant results.
- Content Extraction: Focuses on the main content of pages, stripping away irrelevant boilerplate like navbars, footers, and ads.
- Configurable & Secure: Offers control over crawl depth, page limits, and cache duration. Maintained by Portal One and supports standard MCP authentication protocols.
- Clear Progress Notifications: Provides detailed, real-time feedback on the progress of a request, from crawling and embedding to searching.
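To illustrate the Semantic Search feature, here is a minimal sketch of ranking text chunks by vector similarity. It assumes cosine similarity as the scoring function, which this document does not confirm, and it uses hand-written toy vectors in place of real text-embedding-005 output. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs, top_k=3):
    """Return the top_k chunk texts, most similar to the query first."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy 3-dimensional "embeddings" (assumption: real embeddings are much larger).
chunks = {
    "How to sign up": [0.9, 0.1, 0.0],
    "Pricing details": [0.1, 0.9, 0.0],
    "Company history": [0.0, 0.2, 0.9],
}
top = rank_chunks([1.0, 0.0, 0.0], chunks, top_k=2)
print(top)
```

Because the comparison happens in vector space, a query and a chunk can match even when they share no keywords, provided their embeddings point in similar directions.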
Use Cases
- AI-Powered Research: An AI agent can use this tool to research topics from specific, authoritative websites.
- Customer Support Automation: An AI can answer user questions by consulting a company's official documentation or help center.
- Content Summarization: An agent can find the most relevant sections of a long article or blog post to create a concise summary.
- Competitive Analysis: An AI can be tasked to find information about a competitor's products or services from their website.
Server Details
- Maintainer: Portal One
- Maintainer Site: portal.one
- MCP URL: https://mcp.website-inspector.portal.one/mcp
- Authentication: OAuth2, Dynamic Client Registration
Tool: website_search
This is the primary tool provided by the Website Inspector server.
Description: Performs a semantic search of a given website and returns the most relevant text chunks. It will crawl the site if the content is not already cached or if the cached content is expired.
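The crawl-or-cache decision described above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the in-memory dictionary cache, the `max_age_seconds` parameter, and the `crawl` callback are all hypothetical names.

```python
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    pages: list
    fetched_at: float  # epoch seconds when the crawl finished


def get_pages(url, cache, crawl, max_age_seconds, now=None):
    """Return (pages, source), where source is "cache" or "crawl"."""
    now = time.time() if now is None else now
    entry = cache.get(url)
    if entry is not None and now - entry.fetched_at <= max_age_seconds:
        return entry.pages, "cache"  # fresh enough: skip the crawl
    pages = crawl(url)  # missing or expired: crawl again
    cache[url] = CacheEntry(pages=pages, fetched_at=now)
    return pages, "crawl"


# Usage with a stand-in crawler and explicit timestamps.
cache = {}
fake_crawl = lambda url: [f"content of {url}"]
_, first = get_pages("https://example.com", cache, fake_crawl, 3600, now=1000.0)
_, second = get_pages("https://example.com", cache, fake_crawl, 3600, now=2000.0)
_, third = get_pages("https://example.com", cache, fake_crawl, 3600, now=5000.0)
print(first, second, third)
```

Passing `now` explicitly keeps the freshness check easy to test without waiting for real time to elapse.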
Input Parameters
Output Schema
On a successful run, the tool returns a structuredContent object with the following format:
json
{
"results": [
{
"pageContent": "A string containing the relevant text chunk found on the page.",
"metadata": {
"title": "The title of the source page.",
"description": "The meta description of the source page.",
"url": "The exact URL where the text chunk was found.",
"loc": { "lines": { "from": 1, "to": 10 } }
}
}
]
}
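A client might convert this structure into typed records before using it. The sketch below assumes only the fields shown in the schema above; the `SearchResult` class and `parse_results` function are hypothetical helpers, and missing fields fall back to empty defaults rather than raising.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    text: str
    title: str
    description: str
    url: str
    line_from: int
    line_to: int


def parse_results(structured_content):
    """Flatten the structuredContent object into SearchResult records."""
    results = []
    for item in structured_content.get("results", []):
        meta = item.get("metadata", {})
        lines = meta.get("loc", {}).get("lines", {})
        results.append(
            SearchResult(
                text=item.get("pageContent", ""),
                title=meta.get("title", ""),
                description=meta.get("description", ""),
                url=meta.get("url", ""),
                line_from=lines.get("from", 0),
                line_to=lines.get("to", 0),
            )
        )
    return results


sample = {
    "results": [
        {
            "pageContent": "Some relevant text.",
            "metadata": {
                "title": "Example page",
                "description": "An example.",
                "url": "https://example.com/page",
                "loc": {"lines": {"from": 1, "to": 10}},
            },
        }
    ]
}
parsed = parse_results(sample)
print(parsed[0].url, parsed[0].line_from, parsed[0].line_to)
```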
Example Workflow
Scenario: A user asks an AI, "How do I sign up for Portal One?" The AI uses the website_search tool to find the answer on the portal.one website.
1. Tool Call (Request from AI):
json
{
"tool": "website_search",
"args": {
"url": "https://portal.one/faq",
"query": "How do I sign up?"
}
}
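The request above is shown in an abbreviated form. On the wire, MCP tool calls are JSON-RPC 2.0 requests using the tools/call method, so a client would typically wrap the same information as in the sketch below. Only the `url` and `query` arguments from the example are used; the `build_tool_call` helper is hypothetical.

```python
import json


def build_tool_call(request_id, url, query):
    """Build a JSON-RPC 2.0 tools/call request for website_search."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "website_search",
            "arguments": {"url": url, "query": query},
        },
    }


payload = build_tool_call(1, "https://portal.one/faq", "How do I sign up?")
print(json.dumps(payload, indent=2))
```

In practice an MCP client library usually builds this envelope for you, so application code only supplies the tool name and arguments.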
2. Progress Notifications (from Server):
The user sees a series of progress steps:
[1/6] Initializing search for https://portal.one/faq...
[2/6] Cached content was not found. Crawling fresh data...
[3/6] Processing 1 page from https://portal.one/faq...
[4/6] Embedding content for search...
[5/6] Searching for "How do I sign up?"...
[6/6] Successfully completed search.
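If a client wants to drive a progress bar from these messages, it could parse the step counter out of each line. The sketch below assumes the "[step/total] message" text format shown above; a real server may also deliver the step and total as structured fields, in which case no parsing is needed. The `parse_progress` helper is hypothetical.

```python
import re

# Matches lines such as "[3/6] Processing 1 page from ..."
PROGRESS_RE = re.compile(r"^\[(\d+)/(\d+)\]\s+(.*)$")


def parse_progress(line):
    """Return (step, total, message), or None if the line is not a progress line."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None
    step, total, message = match.groups()
    return int(step), int(total), message


parsed_line = parse_progress("[3/6] Processing 1 page from https://portal.one/faq...")
print(parsed_line)
```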
3. Successful Response (from Server):
json
{
"structuredContent": {
"results": [
{
"pageContent": "### How do I sign up or get started with Portal One?\n\nGetting started is easy! Visit our website at [portal.one](https://portal.one/) and look for the \"Get Started Now\" button. The process is quick, and you'll be on your way to commanding your AI agents.",
"metadata": {
"title": "Portal One - Frequently Asked Questions",
"description": "Find answers to common questions about Portal One, AI agent management, LLMs, MCP servers, AI workflow scheduling, and human oversight.",
"url": "https://portal.one/faq/",
"loc": {
"lines": {
"from": 21,
"to": 27
}
}
}
}
]
}
}
Error Handling
- If the target website cannot be crawled (e.g., due to a firewall or if it's offline), the tool will return a clear error message like:
Could not crawl any content from [URL].
- If no relevant results are found for the query, it will return:
No results found for query "[query]" at [URL].
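A calling agent may want to tell these two outcomes apart, since a failed crawl might be worth retrying while an empty result set suggests rephrasing the query. The sketch below assumes the messages arrive as plain text starting with the exact phrases shown above; the `classify_outcome` helper and its return labels are hypothetical.

```python
def classify_outcome(message):
    """Map a tool error message to a coarse outcome label."""
    if message.startswith("Could not crawl any content from"):
        return "crawl_failed"  # site unreachable, blocked, or offline
    if message.startswith("No results found for query"):
        return "no_results"  # crawl worked, but nothing matched the query
    return "other"


outcomes = [
    classify_outcome("Could not crawl any content from https://example.com."),
    classify_outcome('No results found for query "pricing" at https://example.com.'),
    classify_outcome("Unexpected failure"),
]
print(outcomes)
```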