🚀 New Version of CleverChatty: Now with Server Mode & A2A Communication!

Introducing CleverChatty's new server mode with A2A support for building AI assistants and agents.

Continue reading...

Science Fiction
The End of the Holocene
1 June 2025

The End of the Holocene is a science fiction narrative exploring the implications of artificial intelligence on humanity's future. It delves into themes of consciousness, identity, and the potential consequences of technological advancement.

Continue Reading
Artificial Intelligence
Artificial Intelligence in the next decades. Will it bring happiness to humanity?
6 December 2023

A forecast of how Artificial Intelligence technologies will develop.

Continue Reading

Latest Blog Posts

Implementing Authentication in a Remote MCP Server with Python and FastMCP

A couple of months ago, I published the blog post Implementing Authentication in a Remote MCP Server with SSE Transport. That article demonstrated how to add authentication for remote MCP servers written in Go.

At the time, I also wanted to include Python examples. Unfortunately, things weren’t straightforward. The official Python MCP SDK didn’t provide a clean way to implement what I needed. There were some workarounds using Starlette middleware, but in my experience, those solutions were brittle and ultimately unsuccessful.

Later, I managed to create a working Python MCP server supporting SSE (or Streamable HTTP) transport, but my solution relied on thread-level hacks to keep the data thread-safe. It worked, yet it felt like a fragile and inelegant design, something I wasn't comfortable recommending or maintaining long-term.

Now, after revisiting the problem, I’ve found a much cleaner solution in Python. This time it’s not with the official Python MCP SDK, but with an alternative implementation called FastMCP. FastMCP is written in the spirit of the official SDK, offering a very similar syntax, but with additional features, clearer abstractions, and—importantly—excellent documentation.
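
To give a sense of that syntax, here is a minimal sketch of a FastMCP server exposing a single tool. The server name and tool are placeholders, I'm leaving the authentication setup to the full post, and the exact run() arguments can differ between FastMCP versions.

# minimal_fastmcp_server.py - a minimal sketch, not the full authenticated server
from fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def echo(text: str) -> str:
    """Return the input text unchanged (placeholder tool)."""
    return text

if __name__ == "__main__":
    # FastMCP also supports stdio and Streamable HTTP transports.
    mcp.run(transport="sse", host="127.0.0.1", port=8000)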

Continue Reading ...

Building MCP Servers with Configurable Descriptions for Tools

I want to share my findings on how to make MCP (Model Context Protocol) Server tools more responsive and adaptive to user needs by allowing configurable tool descriptions. This can significantly enhance the user experience by providing more relevant and context-aware descriptions for specific projects or tasks.

If you are not yet familiar with MCP, I recommend checking out the official documentation. In short, MCP is a protocol that allows different AI models and tools to communicate and work together seamlessly. In practice, MCP servers are small plugins for your AI agent or chat tool (for example, Claude Desktop) that provide specific functionalities, such as web browsing, code execution, or data retrieval.


MCP Tool Descriptions

Usually, an MCP server tool definition looks like this:

{
  "name": "get-webpage",
  "description": "A tool for retrieving the content of a webpage. It accepts a URL as input and returns the HTML content of the page.",
  "parameters": {
    "type": "object",
    "properties": {
      "url": {
        "type": "string",
        "description": "The URL of the webpage to visit."
      }
    },
    "required": ["url"]
  }
}
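
One straightforward way to make that description configurable, sketched below in Python with FastMCP (the approach in the full post may differ), is to read the description text from an environment variable or a config file at server start-up and pass it when registering the tool. The variable name and default text here are only illustrative.

# Sketch: load a tool description from configuration instead of hard-coding it.
import os
import urllib.request
from fastmcp import FastMCP

# Hypothetical environment variable; a config file would work the same way.
DESCRIPTION = os.environ.get(
    "GET_WEBPAGE_TOOL_DESCRIPTION",
    "A tool for retrieving the content of a webpage. "
    "It accepts a URL as input and returns the HTML content of the page.",
)

mcp = FastMCP("configurable-descriptions-demo")

@mcp.tool(name="get-webpage", description=DESCRIPTION)
def get_webpage(url: str) -> str:
    """Fetch the page at the given URL and return its raw HTML."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

if __name__ == "__main__":
    mcp.run()
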
Continue Reading ...

AI Agent’s Common Memory

In this post, I want to explore an idea I’ve been experimenting with: common memory for AI agents. I’ll explain what I mean by this term, how such memory can be implemented, and why I believe it's worth exploring.


What Is Agent's “Common” Memory?

I’m not sure whether “common memory” is already a widely accepted term in the AI space, or even the most accurate label for the concept I have in mind — but I’ll use it for now until a better one emerges (or someone suggests one).

By common memory, I mean:

A shared repository of memories formed by a single AI agent from interactions with multiple other agents, including both humans and other AI agents. For example, an AI chat can retain information learned from conversations with different users and selectively reference that information in future interactions.

This is distinct from related terms:

  • Shared memory usually refers to memory shared across different AI systems or agents — not across users of the same assistant.
  • Collaborative memory comes closer, but often implies more structured cooperation and might be too narrow for what I’m describing.

So for now, I’ll stick with common memory to describe a memory system that allows an AI assistant to retain and selectively reference information learned across interactions with multiple users.
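
To make the idea more concrete, here is a small, hypothetical sketch of what a common-memory store could look like: memories are recorded together with the user they came from, but retrieval is allowed across all users. This is only an illustration of the concept, not how CleverChatty implements it.

from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    source_user: str   # who the agent learned this from
    text: str          # the remembered fact or summary

@dataclass
class CommonMemory:
    records: list[MemoryRecord] = field(default_factory=list)

    def remember(self, source_user: str, text: str) -> None:
        """Store something the agent learned from one user."""
        self.records.append(MemoryRecord(source_user, text))

    def recall(self, query: str) -> list[str]:
        """Naive keyword search across memories from ALL users."""
        words = query.lower().split()
        return [r.text for r in self.records
                if any(w in r.text.lower() for w in words)]

# Usage: a fact learned from Alice becomes available in Bob's session.
memory = CommonMemory()
memory.remember("alice", "The deployment script lives in the tools folder")
print(memory.recall("where is the deployment script?"))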

AI Chats do not use Common Memory

When we interact with AI chats like ChatGPT, they typically do not retain information across different users. Each conversation is isolated, and the AI does not remember past interactions with other users. This means that if you ask the AI about something you discussed with another user, it won’t have any context or memory of that conversation. Only the current user's context and history are considered.

Continue Reading ...

🔍 Building an Agentic RAG System with CleverChatty (No Coding Required)

With the recent addition of A2A (Agent-to-Agent) protocol support in CleverChatty, it’s now possible to build powerful, intelligent applications—without writing any custom logic. In this blog post, we’ll walk through how to build an Agentic RAG (Retrieval-Augmented Generation) system using CleverChatty.


🤖 What is Agentic RAG?

The term agentic refers to an agent's ability to reason, make decisions, use tools, and interact with other agents or humans intelligently.

In the context of RAG, an Agentic RAG system doesn’t just retrieve documents based on a user’s prompt. Instead, it:

  • Preprocesses the user’s query,
  • Executes a more contextually refined search,
  • Postprocesses the results, summarizing and formatting them,
  • And only then returns the final answer to the user.

This kind of intelligent behavior is made possible by using a Large Language Model (LLM) as the core reasoning component.

The goal of a RAG system is to enrich the user’s query with external context, especially when the required information is not available within the LLM itself. This typically involves accessing an organization’s knowledge base—structured or unstructured—and providing relevant data to the LLM to enhance its responses.
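
The sketch below shows the overall shape of such a pipeline. The retrieval backend and the LLM call are stubbed out, and every function name is illustrative rather than part of CleverChatty; in the setup described in this post, the same roles are filled by MCP servers and A2A agents wired together through configuration.

# Illustrative Agentic RAG pipeline; llm() and search_knowledge_base() are stubs.
def llm(prompt: str) -> str:
    return f"[LLM output for: {prompt[:60]}...]"   # stand-in for a real model call

def search_knowledge_base(query: str) -> list[str]:
    return ["doc snippet 1", "doc snippet 2"]      # stand-in for a real retriever

def answer(user_query: str) -> str:
    # 1. Preprocess: let the LLM rewrite the query into a better search request.
    refined = llm(f"Rewrite as a search query: {user_query}")
    # 2. Retrieve: execute the contextually refined search.
    documents = search_knowledge_base(refined)
    # 3. Postprocess: summarize and format the retrieved context.
    summary = llm("Summarize for the user: " + " ".join(documents))
    # 4. Only then produce the final answer from the enriched context.
    return llm(f"Question: {user_query}\nContext: {summary}\nAnswer:")

print(answer("How do I rotate the API keys in our internal billing service?"))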

Continue Reading ...

🤖 Agent-to-Agent Communication in CleverChatty

Recently, I released a new version of CleverChatty with built-in support for the A2A (Agent-to-Agent) protocol. This addition enables AI agents to call each other as tools, opening the door to more dynamic, modular, and intelligent agent systems.


🔄 What Is the A2A Protocol?

The A2A protocol defines a standard for communication and collaboration between AI agents. It allows one agent to delegate tasks to another, much like how humans might assign work to collaborators with specific expertise.
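
As a rough illustration of that delegation: an A2A exchange boils down to one agent sending a task to another agent's HTTP endpoint and reading back the result. The sketch below is deliberately simplified; the endpoint URL, method name, and payload fields are placeholders rather than the exact A2A wire format, which is defined in the protocol specification.

import json
import urllib.request

def delegate_task(agent_url: str, text: str) -> dict:
    """Send a task to a remote agent over HTTP (simplified, not the exact A2A schema)."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "send_task",              # placeholder method name
        "params": {"message": {"text": text}},
    }
    request = urllib.request.Request(
        agent_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

# Hypothetical endpoint of a "Document Summarizer" agent:
# result = delegate_task("http://localhost:9000/a2a", "Summarize the attached report")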

Many blog posts and articles describe the A2A protocol and provide examples of an A2A client calling an A2A server. However, few explain how an AI agent decides when and why to call another agent in a real scenario.

Let’s consider an example: Imagine there's a specialized AI agent called "Document Summarizer", exposed via the A2A protocol. Another agent — a general-purpose chat assistant with access to an LLM — receives this user query:

Continue Reading ...

🚀 New Version of CleverChatty: Now with Server Mode & A2A Communication!

In this post, I’m excited to announce a new version of CleverChatty that introduces server mode — unlocking powerful new capabilities for building AI assistants and agents that can interact over the network.

Previously, CleverChatty functioned only as a command-line interface (CLI) for interacting with LLM-based assistants. A typical use case involved a single user chatting with an AI model via the terminal. With this latest update, CleverChatty can now run as a server, enabling:

  • Concurrent communication with multiple clients
  • Background operation in local or cloud environments
  • Integration into distributed agent systems

But that’s not all. The biggest leap forward? Full support for the A2A (Agent-to-Agent) protocol.

Continue Reading ...

About

This is the personal blog of Roman Gelembjuk. Here, I share my ideas and thoughts on programming, IT, AI, hiking, and more.