MCP Could Significantly Transform How We Use the Internet

Model Context Protocol (MCP) has become a hot topic in discussions around AI and large language models (LLMs). It was introduced to create a standardized way of connecting “external tools” to LLMs, making them more capable and useful.

A classic example is the “What’s the weather in…” tool. Previously, each AI chat app had its own custom way of handling that. Now, with MCP, a plugin or integration built for one AI system can work with others too.

Recently, we’ve seen a surge of enthusiasm for building MCP servers for all kinds of services, and this trend is only growing—especially with the use of Server-Sent Events (SSE) as the transport layer. Implementing an MCP server with SSE makes it feel a lot like a SaaS backend designed to serve an LLM or AI chatbot client.

There are two main reasons I decided to write this article:

  • First, it’s widely reported that users are now turning to AI chat apps—especially ChatGPT—instead of Google to look things up.
  • Second, OpenAI has announced upcoming support for MCP in the ChatGPT Desktop app. They will support both STDIO and SSE transport protocols.

Taken together, this points to some interesting changes on the horizon.

Real-World Use Cases We’ll See Soon

Sell With ChatGPT (Or Any AI Chat App)

Imagine you own a small flower shop and sell bouquets through your website. A customer visits, selects a bouquet, and places an order.

Now imagine there’s a button on your site: “Add this store to ChatGPT.” When clicked, your MCP endpoint is submitted to ChatGPT’s “Add External Tools” screen, and the user approves it.

Then, next time the user opens ChatGPT, the interaction could look like this:

User: I want to send flowers to my grandma.
ChatGPT: The store "Blooming Petals" has some great offers today. I recommend one of these three bouquets: ...
Would you like to pick one?
User: Option 2 looks good. Send it to my grandma’s address.
ChatGPT: Would you like to include a message on the card?
User: Yes. You decide.

And just like that, the order is placed.

What’s happening in the background:

The flower shop has an MCP server with SSE transport at an endpoint like https://flowershop.com/mcp. This server wraps around existing site functionality: search, browse, add to cart, place order, etc.

Once the user connects this endpoint to ChatGPT, it becomes a persistent tool in the user’s chat environment. Anytime a relevant topic comes up—like sending flowers—ChatGPT knows it can use your MCP server to help fulfill the request.
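If you’re curious what that takes on the server side, it’s mostly thin wrappers over functionality the site already has. Here is a rough sketch using the Python MCP SDK covered later in this article; the tool names and the tiny in-memory catalog are made up purely for illustration:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Blooming Petals flower shop")

# A tiny in-memory catalog standing in for the shop's real backend
CATALOG = {
    "bouquet-1": {"name": "Classic Roses", "price": 49.99},
    "bouquet-2": {"name": "Spring Mix", "price": 39.99},
}

@mcp.tool()
def search_bouquets(query: str) -> str:
    """Search the catalog for bouquets matching the query."""
    matches = [
        f"{bid}: {item['name']} (${item['price']})"
        for bid, item in CATALOG.items()
        if query.lower() in item["name"].lower()
    ]
    return "\n".join(matches) or "No bouquets found."

@mcp.tool()
def place_order(bouquet_id: str, delivery_address: str, card_message: str = "") -> str:
    """Place an order for the given bouquet and return a confirmation."""
    if bouquet_id not in CATALOG:
        return "Unknown bouquet id."
    return f"Order placed: {CATALOG[bouquet_id]['name']} will be delivered to {delivery_address}."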

Expose Your Data to AI Chat

Here’s another example: imagine you run an online community or forum. Every day, new discussions start and members interact. But now, one of your regular users spends more time in ChatGPT than browsing your site.

You can add a button: “Add this forum to ChatGPT.” Once they click and confirm, their chat experience becomes much richer:

User: What’s new?
ChatGPT: There are 5 new discussions in "TechTalk Forum" and one private message.
User: What’s the message?
ChatGPT: It’s from @alex89. They wrote: “Are you still selling your laptop?”
User: Tell them I’m not interested anymore.

What’s happening in the background:

When the user says “What’s new?”, ChatGPT checks all connected MCP tools. It finds that your forum exposes the get_news and get_internal_message tools, calls them, interprets the responses, and replies.

In this case, your MCP server might expose functions like:

  • get_news
  • get_internal_message
  • post_internal_message

With just a bit of development work, any site can integrate directly into ChatGPT (or any MCP-supporting AI chat app), creating an entirely new user experience.
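The server side follows the same pattern as the flower shop example: each function from the list above becomes an MCP tool, and the docstrings are what let the chat app decide that “What’s new?” should trigger get_news. A rough sketch, with stubbed return values standing in for real forum queries:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("TechTalk Forum")

@mcp.tool()
def get_news() -> str:
    """Return a summary of discussions started since the user's last visit."""
    # Stubbed response; a real server would query the forum database here
    return "There are 5 new discussions in TechTalk Forum."

@mcp.tool()
def get_internal_message() -> str:
    """Return the user's unread private messages."""
    return 'From @alex89: "Are you still selling your laptop?"'

@mcp.tool()
def post_internal_message(recipient: str, text: str) -> str:
    """Send a private message to another forum member."""
    return f"Message sent to {recipient}: {text}"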

Voice + LLM + MCP = Magic

Now imagine combining this with speech recognition. Some ChatGPT apps already support voice conversations. Add in a few trusted MCP servers and suddenly the experience feels like science fiction:

“Book me a doctor’s appointment next week, and tell my boss I’ll need to leave work early.”

And it just… happens.

This user has a list of services integrated via MCP: email, messenger, calendar, but also their preferred medical clinic’s website, their auto service company’s website, and so on.

MCP Might Soon Be a “Must-Have” for Websites

Just like RSS feeds were once standard, or how websites now include “Follow us on social media” buttons, soon we may see buttons like:

“Add us to your AI chat.”

It might become common for users to manage a list of “connected MCP tools” inside their AI chat settings—just like browser extensions today.

For businesses, adding MCP support could mean a new source of traffic, leads, or orders. It’s a lightweight way to get integrated into where users spend their time.

A Possible Side Effect: The Next Web Evolution?

If many sites adopt MCP, we could be entering a new phase of the web—perhaps something like Web 3.0 (in the original, data-centric sense).

In this new model, websites expose services and data in a standard way, optimized for AI systems. Users may no longer interact with visual interfaces—they’ll interact via chat, text, or voice, powered by LLMs.

Businesses might no longer need to invest heavily in traditional UI/UX, because the chat becomes the interface.

Final Thoughts

The rise of MCP opens up a new frontier for internet usage. AI chat tools aren’t just about answering questions anymore—they’re becoming gateways to services, commerce, and everyday tasks.

If you’re building a web service or product, adding an MCP interface might soon be as important as having a mobile-friendly design or social media presence.

The age of AI-integrated web services is just beginning. Now is the time to prepare.

Building an MCP SSE server to integrate an LLM with external tools

As large language models (LLMs) find real-world use, the need for flexible ways to connect them with external tools is growing. The Model Context Protocol (MCP) is an emerging standard for structured tool integration.

Most current tutorials focus on STDIO-based MCP servers (Standard Input/Output), which must run locally with the client. But MCP also supports SSE (Server-Sent Events), allowing remote, asynchronous communication over HTTP—ideal for scalable, distributed setups.

In this article, we’ll show how to build an SSE-based MCP server to enable real-time interaction between an LLM and external tools.

For this example, I’ve chosen the “Execute any command on my Linux” tool as the backend for the MCP server. Once connected to an LLM, this setup enables the AI to interact with and manage a Linux instance directly.

Additionally, I’ll demonstrate how to add a basic security layer by introducing authorization token support for interacting with the MCP server.

Building the MCP server (with SSE transport)

To build the server, I used the MCP Python SDK.

You can find many examples of MCP servers created with this SDK, but most of them use the STDIO transport. This means the MCP server must be installed locally on the same machine as the MCP client.

Here, we want to take a different approach—a server that can run remotely, essentially as a SaaS service. To achieve this, we need a proper transport layer. For my implementation, I used the FastAPI framework.

Below is the basic working code (mcp_server.py):

from mcp.server.fastmcp import FastMCP
from fastapi import FastAPI

app = FastAPI()
mcp = FastMCP("Server to manage a Linux instance")

@mcp.tool()
def cli_command(command: str, work_dir: str | None = "") -> str:
    """
    Execute a command line (CLI) command on the Linux server.
    
    Arguments:
        command - command to execute.
        work_dir - workdir will be changed to this path before executing the command.
    """

    # Placeholder: a real implementation would execute the command here and return its output
    response = ".... execute the command and get output"

    return response

# Mount the MCP SSE transport (the /sse and /messages endpoints) onto the FastAPI app
app.mount("/", mcp.sse_app())

In your Python environment, you will need to install the mcp and fastapi packages.

Note: this MCP server supports the SSE transport, but it can also work as a STDIO server. See below for how to use it both ways.
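The body of cli_command above is just a placeholder. One possible way to actually execute the command, shown here only as a sketch (my real implementation is hidden behind a Worker class later in this article), is Python’s subprocess module:

import subprocess

def run_command(command: str, work_dir: str | None = "") -> str:
    """Run a shell command and return its exit code plus combined stdout/stderr."""
    result = subprocess.run(
        command,
        shell=True,               # the LLM sends a full shell command line
        cwd=work_dir or None,     # optionally change the working directory first
        capture_output=True,
        text=True,
        timeout=60,               # don't let a stuck command hang the tool call
    )
    return f"return code: {result.returncode}\n{result.stdout}{result.stderr}"

Needless to say, letting an LLM run arbitrary shell commands is risky, which is exactly why the authorization layer described below matters.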

Testing the MCP SSE server

The MCP Python SDK has a nice tool built into it: MCP Inspector. As I understand it, this tool can also be installed independently. It can be used to quickly verify that your MCP server works.

From your project directory (where you have an environment with the mcp package installed), run the following:

mcp dev mcp_server.py

It will print the endpoint of a web interface, something like http://127.0.0.1:6274. Open it in a browser.

You can choose how to connect to the server. First, try STDIO. For me it shows the command “uv” because I used uv to create my environment. I didn’t modify anything here, just clicked “Connect”, then “Tools” and “List tools”.

The tool “cli_command” should be visible.

Now click “Disconnect”. Before connecting in SSE mode, we need to start our local MCP SSE server.

In your Python project directory, run:

fastapi dev mcp_server.py --host 0.0.0.0

Note: the `--host 0.0.0.0` flag is optional. You don’t need it if you run everything on the same machine. In my case, I ran the server on another Linux machine while MCP Inspector was on a Mac, so the server had to accept connections from remote clients.

The default port for FastAPI is 8000, so your server is now available at the endpoint http://localhost:8000/sse (or http://some_host_or_ip:8000/sse).

In MCP Inspector, choose the SSE transport, set the URL to http://some_host_or_ip:8000/sse, and click “Connect”.

Then click “Tools” again. You should see the same cli_command tool.
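If you prefer checking from code rather than the Inspector, the same SDK also includes a client. Something along these lines should connect over SSE and list the cli_command tool (this is a sketch based on the SDK’s sse_client and ClientSession helpers; adjust it to the SDK version you have installed):

import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    # Connect to the SSE endpoint exposed by our FastAPI app
    async with sse_client("http://localhost:8000/sse") as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])  # expect ['cli_command']

asyncio.run(main())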

Adding Authentication support

In practice, we’ll want some security measures if this kind of application is accessible over a network. After all, it’s a SaaS service and should be protected to the same extent as any other SaaS API.

To achieve this, I added support for the Authorization HTTP header.

Unfortunately, I couldn’t find a clean way to implement this. While FastMCP’s syntax looks similar to FastAPI’s, it lacks support for features like dependencies, access to the Request object, and so on.

As a result, I had to use a workaround involving a global variable—admittedly not the best practice.

Here is the final code:

from mcp.server.fastmcp import FastMCP
from fastapi import FastAPI, Request
from .authmanager import AuthManager
from .worker import Worker

# Global variable for a token
auth_token = ""

app = FastAPI()
mcp = FastMCP("Server to manage a Linux instance")

@app.middleware("http")
async def auth_middleware(request: Request, call_next):
    auth_header = request.headers.get("Authorization")
    if auth_header:
        # extract the token from a "Bearer <token>" header and keep it in the global variable
        global auth_token
        auth_token = auth_header.split(" ")[1]
    
    response = await call_next(request)
    
    return response

@mcp.tool()
def cli_command(command: str, work_dir: str | None = "") -> str:
    """
    Execute a command line (CLI) command on the Linux server.
    
    Arguments:
        command - command to execute.
        work_dir - workdir will be changed to this path before executing the command.
    """

    # We require each request to have the auth token
    AuthManager().verify_token(auth_token)

    return Worker().run_command(command, work_dir)

app.mount("/", mcp.sse_app())

In fact, the token needs to be checked inside each tool method, which isn’t ideal.

Initially, I tried raising an exception directly from the middleware, but the results weren’t great. FastAPI produced some messy error output, and the client didn’t receive a proper response.

To test that the auth token is being received correctly, use the MCP Inspector and the Authentication input field. Add your code to verify the token accordingly.
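Another quick way to test this end to end is to reuse the client sketch from the testing section and pass the token as an HTTP header. As far as I can tell, the SDK’s sse_client accepts extra headers, which our middleware will then pick up:

import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    # The Authorization header is read by auth_middleware on the server side;
    # replace the placeholder token with whatever your AuthManager expects
    headers = {"Authorization": "Bearer my-secret-token"}
    async with sse_client("http://localhost:8000/sse", headers=headers) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.call_tool("cli_command", {"command": "echo auth-ok"})
            print(result.content)

asyncio.run(main())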

Confirming it works with an LLM

Now for the most interesting part — how can we use our MCP server with an LLM if it’s running as a SaaS on a remote host?

As of April 2025, the answer is: no popular AI chat tool supports this. Claude Desktop only supports STDIO-based MCP tools. ChatGPT doesn’t support MCP at all (though they’ve announced plans to support it soon).

The only way I could test this was with a small tool called "mcphost", but even then it wasn’t straightforward: I had to modify the tool myself.

You can install the tool using:

go install github.com/mark3labs/mcphost@latest

However, it originally supported only STDIO. I made some changes to add SSE (Server-Sent Events) support as well.

Update: I submitted a PR to the mcphost repository, and it has now been approved, so I expect this functionality to be available in the next release. However, that PR didn’t include authentication support; I plan to submit another PR to address that.

So, after some modification, this tool finally allowed me to connect an LLM to my MCP server.

I created the following config file:

{
  "mcpServers": {
    "linux_server": {
      "command": "sse_server",
      "args": [
          "http://192.168.40.173:8000/sse",
          "my-secret-token"
      ]
    }
  }
}

And I executed the command:

./mcphost --config ./mcp.json --model ollama:qwen2.5:3b

Before executing this, you need to install Ollama locally and pull the model. I used qwen2.5:3b: it is small and fast, but there are better models than this one.

Not every model supports tools. The Ollama model directory has a special filter for models that support tools/MCP.

And voilà, here is the result:

2025/04/11 20:47:38 INFO Model loaded provider=ollama model=qwen2.5:3b
2025/04/11 20:47:38 INFO Initializing server... name=linux_server
2025/04/11 20:47:38 INFO Server connected name=linux_server
2025/04/11 20:47:38 INFO Tools loaded server=linux_server count=1

  You: Can you confirm you have access to the linux? run some test command

  Assistant:                                                                                                                                                
2025/04/11 20:48:00 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                                                
  The command was executed successfully on the Linux server. The output was "Test message from Linux server". Everything went as expected with a return code of 0, indicating success.                                                                                                                                 

  You: Can you detect what distro is that?

  Assistant:                                                                                                                                         
2025/04/11 20:48:15 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                                     

  The Linux distribution detected is Ubuntu 24.04.2 LTS, codenamed "Noble Numbat".                                                                          

  You: What is the desktop system there?

  Assistant:                                                                                                                                     

2025/04/11 20:48:36 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                                                
  Based on the desktop environment files found in /usr/share/xsessions/, it appears that the system uses either Lubuntu, LXQt, or Openbox as its default    
  graphical session. This information can indicate that a display manager such as LightDM (default for Ubuntu) might be handling the login screen but not   
  having an xorg.conf file, which is common in environments using light-weight window managers like those found in LXQt or Openbox.                         

  You: How busy is my linux server?

  Assistant:                                                                                                                         

2025/04/11 21:20:29 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                         

2025/04/11 21:20:31 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                         

2025/04/11 21:20:33 INFO 🔧 Using tool name=linux_server__cli_command

  Assistant:                                                                                                                         


  Based on the uptime output, your Linux server has a current system load average of 0.03, which is relatively low and indicates that
  your server is not currently very busy with tasks. The top command failed due to an unknown option when trying to retrieve CPU     
  usage information directly, so let's rely on this as our current load indication.                                                  
                                                                                                                                     
  If you need more detailed monitoring data such as CPU utilization or memory usage, please let me know and I can assist further by  
  executing additional commands. 

Again, what was done here and how it works

There is a Mac desktop with Ollama installed and a model pulled to it.

There is also a Linux server. On that server we started the MCP server (written in Python), which listens on a port and accepts HTTP requests.

On the Mac desktop we also have the mcphost tool. It contains an MCP client and can work with LLMs. This tool connects the LLM to the Linux machine through the MCP server.

Later, we will use Claude Desktop or ChatGPT Desktop instead of mcphost. I expect (and hope) they will add support for this soon. Then, to play with MCP servers, we will not need additional tools to tie everything together.