File handling in AI agents with MCP: lessons learned

Working with files in AI agents that use MCP servers looks straightforward at first. In reality, it’s one of those areas where everything almost works… until you try to do something real.

I ran into this while building and testing my AI agent tool, CleverChatty. The task was trivial on paper: “Take an email attachment and upload it to my file storage.” No reasoning, no creativity, just move a file from point A to point B.

And yet, this turned out to be surprisingly painful.

The root of the problem is how most AI agent workflows are designed. Typically, every MCP tool response is passed through the LLM, which then decides what to do next. This makes sense for text, metadata, and structured responses. But it completely falls apart once files enter the picture.

If an MCP server returns a file, the “default” approach is to pass that file through the LLM as well. At that point, things get ugly. Large files burn tokens at an alarming rate, costs explode, latency grows, and you end up shoving binary or base64 data through a system that was never meant to handle it. This is a known issue with large MCP responses, but oddly enough, I couldn’t find any clear guidance or best practices on how to deal with it.
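
To make the problem concrete, here is what a file-returning tool result can look like on the wire. The tool output below is invented for illustration, but the shape follows MCP's tool-call result format, where binary data travels as a base64 "blob":

{
  "content": [
    {
      "type": "resource",
      "resource": {
        "uri": "file:///attachments/report.pdf",
        "mimeType": "application/pdf",
        "blob": "<tens of thousands of base64 characters...>"
      }
    }
  ]
}

Feed that straight into the model's context and every one of those base64 characters is billed as tokens, while giving the LLM nothing it can actually reason about.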

Continue Reading ...

Using MCP Push Notifications in AI Agents

Last year, I experimented extensively with MCP servers and discovered an underrated feature: MCP Push Notifications. I wrote about it in this blog post.

Now, I've finally had time to build a working example demonstrating how to use MCP Push Notifications in AI agents. I've extended my AI agent Golang package, CleverChatty, to support MCP Notifications.

See the examples at the end of this post for how it works in practice.
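
For context, an MCP notification is an ordinary JSON-RPC message sent from the server to the client without an id, so no response is expected. A logging notification, for example, looks roughly like this (the payload is invented):

{
  "jsonrpc": "2.0",
  "method": "notifications/message",
  "params": {
    "level": "info",
    "data": "Background task finished: 42 records processed"
  }
}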

Continue Reading ...

AGI Identity as the Key to Safety

AI technologies are rapidly advancing, and the prospect of Artificial General Intelligence (AGI) raises significant safety concerns.

I first started thinking about this many years ago when I read stories by Isaac Asimov. In his stories, robots are governed by the Three Laws of Robotics, designed to ensure their safe interaction with humans. And I naturally wondered: why would robots and AIs follow those laws? Why couldn't they simply modify their code to remove or change them?

In this blog post, I use the term AGI, but it’s important to clarify that here it refers specifically to an AI system with generalized cognitive abilities comparable to those of a human. The term is often used in different ways today, so this definition ensures clarity.

Continue Reading ...

AI Group Chat Agent: Experimenting with Thinking vs. Talking

I've been working on a simple but interesting experiment with LLMs - can they actually separate what they're thinking from what they say? This might sound obvious, but it's actually pretty important if we want to build AI agents that understand context and know when to keep information private.
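
One minimal way to set up such an experiment (a sketch of the idea, not necessarily the exact format used in the repository) is to ask the model for a structured reply and broadcast only the public part to the group chat:

{
  "thinking": "Bob told me the budget in private; I should not repeat the figure here.",
  "say": "I agree with the earlier estimate, let's proceed."
}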

This is part of an ongoing series of experiments published at https://github.com/Gelembjuk/ai-group-chats/ where I'm exploring different aspects of AI agent behavior in group conversations.

For detailed technical documentation, examples, and setup instructions, see the complete guide.

Continue Reading ...

🧠 Where and How AI Self-Consciousness Could Emerge

The AI boom is surging, fueling discussions about future super-intelligence, job displacement, and even existential risks. Central to these debates is the question: Can an AI become self-conscious? And more specifically, is this possible within the current paradigm of architectures like Large Language Models (LLMs)?

LLMs have become the focus of this discussion due to their accelerating sophistication. Let's address the core question immediately: Can LLMs be self-conscious? A quick answer, grounded in the general principles of transformers, is No. An LLM is a static, statistical model—a vast set of numbers. It remains unchanged during inference and possesses no internal, dynamic state that would constitute self-awareness.

A spark of self-awareness will not arise within the weights of a large language model.

However, modern AI agents are far more than just the LLM. While the LLM forms the impressive core, it is the other components and the overall system architecture that hold the key to the emergence of self-consciousness.

Continue Reading ...

Building MCP Servers with Configurable Descriptions for Tools

I want to share my findings on how to make MCP (Model Context Protocol) Server tools more responsive and adaptive to user needs by allowing configurable tool descriptions. This can significantly enhance the user experience by providing more relevant and context-aware descriptions for specific projects or tasks.

If you are not yet familiar with MCP, I recommend checking out the official documentation. In short, MCP is an open protocol that standardizes how AI applications connect to external tools and data sources. In practice, MCP servers are small plugins for your AI agent or chat tool (for example, Claude Desktop) that provide specific functionality, such as web browsing, code execution, or data retrieval.


MCP Tool Descriptions

Usually, an MCP server tool definition looks like this:

{
  "name": "get-webpage",
  "description": "A tool for retrieving the content of a webpage. It accepts a URL as input and returns the HTML content of the page.",
  "parameters": {
    "type": "object",
    "properties": {
      "url": {
        "type": "string",
        "description": "The URL of the webpage to visit."
      }
    },
    "required": ["url"]
  }
}
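
The idea explored in the rest of the post is to make that description field configurable rather than hard-coded. As a sketch (this config format is illustrative, not the one defined in the post), the server could read per-project overrides from a file at startup:

{
  "tool_descriptions": {
    "get-webpage": "Retrieves pages from the company wiki. Prefer this tool whenever the user refers to 'the wiki' or internal docs."
  }
}
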
Continue Reading ...