Adding Support for Retrieval-Augmented Generation (RAG) to AI Orchestrator

Good news! I've extended my lightweight AI orchestrator, CleverChatty, to support Retrieval-Augmented Generation (RAG), integrated over the Model Context Protocol (MCP).

Quick Recap

  • RAG (Retrieval-Augmented Generation) is an AI technique that enhances language models by retrieving relevant external documents (e.g., from databases or vector stores) based on a user’s query. These documents are then used as additional context during response generation, enabling more accurate, up-to-date, and grounded outputs.

  • MCP (Model Context Protocol) is a standard for how external systems—such as tools, memory, or document retrievers—communicate with language models. It enables structured, portable, and extensible context exchange, making it ideal for building complex AI systems like assistants, copilots, or agents.

  • CleverChatty is a simple AI orchestrator that connects LLMs with tools over MCP and supports external memory. My goal is to expand it to work with modern AI infrastructure—RAG, memory, tools, agent-to-agent (A2A) interaction, and beyond. It’s provided as a library, and you can explore it via the CLI interface: CleverChatty CLI.
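At its core, the RAG step is simple: retrieve the documents most relevant to the query, then inject them into the prompt before calling the LLM. Here is a minimal Python sketch of that flow (the keyword-overlap scoring and all names are illustrative stand-ins for a real vector-store lookup behind an MCP server):

```python
def retrieve(query, documents, top_k=2):
    # Stand-in scoring: rank documents by word overlap with the query.
    # A real RAG setup would query a vector store (e.g. over MCP) instead.
    q_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents):
    # Inject the retrieved documents as extra context for the LLM.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The orchestrator's only new responsibility is calling the retriever and prepending its results; the LLM interaction itself is unchanged.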

Continue Reading ...

The End of The Holocene: 2. The Hideout

This is the second part of the novella "The End of the Holocene". The first part was published earlier.


"Life, although it may only be an accumulation of suffering, is dear to me, and I will defend it." — Mary Shelley, Frankenstein

April 2030. San Francisco.

Michael Kravchenko returned to his place of power on the ocean shore near San Francisco. A light mist had almost completely swallowed the Golden Gate Bridge. Michael missed this view, these scents. He hadn’t been here in almost half a year. A cascade of events that followed the launch of the general artificial intelligence Suffragium, developed with his participation, had brought a dark streak into his life.

During that time, Michael had to justify himself a thousand times before various committees, proving that there had been no malicious intent in his actions. That the responsibility couldn’t be laid on the engineers. Sometimes, science encounters failures. Ultimately, it’s all experience. And there hadn’t been any serious problems—aside from the financial losses. Yes, the global network was unstable for a while. But everything was resolved eventually.

Continue Reading ...

Inside the LLM Black Box: What Goes Into Context and Why It Matters

Large Language Models (LLMs) such as GPT-4, Claude, Mistral, and others seem intelligent in their responses, but the real magic lies in how they perceive and interpret context. Understanding what goes into an LLM's context and how it shapes the output is critically important for developers, researchers, and product designers working with generative AI.

In this post, I want to explore the components of context, its structure, its limits, and how it interacts with the most common usage scenarios, such as tool use (Tools, MCP) and the inclusion of additional knowledge via Retrieval-Augmented Generation (RAG).


Continue Reading ...

Implementing the Most Universal MCP Server Ever

It seems the MCP hype is starting to slow down a bit. After 6–8 months of high enthusiasm, the community is beginning to realize that MCP is not a magic bullet. In some MCP listings, you’ll find more than 10,000 servers doing all sorts of things. Naturally, many of them are useless—spun up by enthusiasts just to see what MCP is all about.

But some of these servers are actually useful.

In this post, I want to share my thoughts on building the most universal MCP server—one that can adapt to almost any use case.

Continue Reading ...

Building More Independent AI Agents: Let Them Plan for Themselves

I continue to explore one of my favorite topics: how to make AI agents more independent. This blog is my way of organizing ideas and gradually shaping a clear vision of what this might look like in practice.

The Dream That Started It All

When large language models (LLMs) and AI chat tools first started delivering truly impressive results, it felt like we were entering a new era of automation. Back then, I believed it wouldn’t be long before we could hand off any intellectual task to an AI—from a single prompt.

I imagined saying something like:

"Translate this 500-page novel from French to Ukrainian, preserving its original literary style."

And the AI would just do it.

But that dream quickly ran into reality. The context window was a major limitation, and most chat-based AIs had no memory of what they'd done before. Sure, you could translate one page. But across an entire novel? The tone would shift, the style would break, and continuity would be lost.

Continue Reading ...

Inside the LLM Black Box: What Goes Into Context and Why It Matters

Large Language Models (LLMs) like GPT-4, Claude, and Mistral appear to produce intelligent responses — but the magic lies in how they consume and interpret context. Understanding what goes into an LLM's context and how it shapes output is critical for developers, researchers, and product designers working with generative AI.

This post explores the components of context, how it's structured, how it's limited, and how advanced use cases like tool usage and retrieval-augmented generation (RAG) interact with it.
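As a rough illustration, the assembly of context can be sketched as building a message list under a fixed budget. This is my own simplification (real systems count tokens rather than characters, and the names here are hypothetical):

```python
def assemble_context(system_prompt, history, user_message, max_chars=2000):
    # Build the message list the LLM actually sees: system prompt first,
    # then as much recent history as the budget allows, then the new message.
    messages = [{"role": "system", "content": system_prompt}]
    budget = max_chars - len(system_prompt) - len(user_message)
    kept = []
    for msg in reversed(history):            # newest turns get priority
        if budget - len(msg["content"]) < 0:
            break                            # older history is silently dropped
        budget -= len(msg["content"])
        kept.append(msg)
    messages.extend(reversed(kept))          # restore chronological order
    messages.append({"role": "user", "content": user_message})
    return messages
```

The key point the sketch makes visible: once the budget runs out, older turns simply vanish from the model's view, which is why long conversations "forget" their beginnings.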


Continue Reading ...

Easily Switch Transport Protocols in MCP Servers

I would like to highlight one more benefit of the Model Context Protocol (MCP): the ability to easily change the transport protocol. Three different transport protocols are available now, and each has its own benefits and drawbacks.

However, if an MCP server is implemented properly using a good SDK, then switching to another transport protocol is easy.

Quick Recap: What is MCP?

  • Model Context Protocol (MCP) is a new standard for integrating external tools with AI chat applications. For example, you can add Google Search as an MCP server to Claude Desktop, allowing the LLM to perform live searches to improve its responses. In this case, Claude Desktop is the MCP Host.

There are three common types of MCP server transports:

  • STDIO Transport: The MCP server runs locally on the same machine as the MCP Host. Users download a small application (the MCP server), install it, and configure the MCP Host to communicate with it via standard input/output.

  • SSE Transport: The MCP server runs as a network service, typically on a remote server (though it can also run on localhost). It is essentially a web service that the MCP Host connects to via Server-Sent Events (SSE).
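To see why switching is cheap when the server logic is kept transport-agnostic, here is a toy sketch (this is not a real MCP SDK; the class names and the list-based simulation are mine):

```python
class StdioTransport:
    # Local transport: plain request/response over standard I/O (simulated).
    def serve(self, handler, requests):
        return [handler(r) for r in requests]

class SseTransport:
    # Network transport: each response framed as a Server-Sent Event.
    def serve(self, handler, requests):
        return [f"data: {handler(r)}\n\n" for r in requests]

TRANSPORTS = {"stdio": StdioTransport, "sse": SseTransport}

def run_server(transport_name, handler, requests):
    # The handler (the tool logic) never changes; swapping transports
    # is a one-line configuration change.
    return TRANSPORTS[transport_name]().serve(handler, requests)
```

Because the handler knows nothing about framing or connections, a well-structured MCP server can move from STDIO to SSE by changing a single configuration value.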

Continue Reading ...

An Underrated Feature of MCP Servers: Client Notifications

In recent months, the Model Context Protocol (MCP) has gained a lot of traction as a powerful foundation for building AI assistants. While many developers are familiar with its core request-response flow, there's one feature that I believe remains underappreciated: the ability of MCP servers to send notifications to clients.

Let’s quickly recap the typical flow used by most MCP-based assistants:

  • A user sends a prompt to the assistant.
  • The assistant attaches a list of available tools and forwards the prompt to the LLM.
  • The LLM generates a response, possibly requesting the use of certain tools for additional context.
  • The assistant invokes those tools and gathers their responses.
  • These tool responses are sent back to the LLM.
  • The LLM returns a final answer, which the assistant presents to the user.

This user-initiated flow is incredibly effective—and it’s what powers many AI assistants today.
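The steps above can be condensed into a small loop. The `llm` and tool interfaces here are toy stand-ins of my own, not a real MCP client:

```python
def run_turn(prompt, llm, tools):
    # One user-initiated turn: forward the prompt with the tool list,
    # execute any requested tool calls, and return the final answer.
    reply = llm(prompt, tool_names=list(tools))
    while "tool_call" in reply:                  # the model wants a tool
        name, args = reply["tool_call"]
        tool_output = tools[name](**args)        # the assistant invokes it
        reply = llm(prompt, tool_result=tool_output, tool_names=list(tools))
    return reply["answer"]

# Toy stand-ins so the loop is runnable without a real LLM or MCP server.
def fake_llm(prompt, tool_names, tool_result=None):
    if tool_result is None:
        return {"tool_call": ("search", {"query": prompt})}
    return {"answer": f"Based on '{tool_result}': done."}

tools = {"search": lambda query: f"results for {query}"}
```

Note that every iteration of the loop starts from the client side: nothing happens until the user (or the model, via a tool call) asks for it.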

However, MCP also supports a less obvious but equally powerful capability: tool-initiated communication. That is, tools can trigger actions that cause the MCP server to send real-time notifications to the client, even when the user hasn’t sent a new prompt.
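A minimal sketch of that idea, assuming a publish/subscribe shape (all names here are hypothetical, not part of any real SDK):

```python
class NotifyingServer:
    # Sketch of server-to-client notifications: clients register callbacks,
    # and any tool running on the server can push an event to them at any
    # time, without waiting for a new user prompt.
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def notify(self, event):
        for cb in self._subscribers:
            cb(event)

server = NotifyingServer()
received = []
server.subscribe(received.append)
# A long-running tool finishes and pushes a notification on its own.
server.notify({"type": "task_finished", "detail": "index rebuilt"})
```

This inversion, where the server initiates the message, is what enables background tasks, progress updates, and proactive assistants.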

Continue Reading ...

Hello, World!

This is the 5th version of my personal blog.

I still have backups of the 4th, 3rd, and 2nd versions. Later I want to bring them back up on separate hosts.

The first version, from 2003, seems to be lost. But I still hope to find it somewhere in the archives.

Continue Reading ...