Building an AI Memory Ecosystem

A file format is just the foundation. Here's how the AIF-BIN ecosystem fits together — from file creation to semantic search to AI agent integration.


The Stack

Building portable AI memory requires more than a file format. You need tools for every step of the workflow:

  1. Creation — Convert documents into .aif-bin files with embeddings
  2. Storage — Organize files into collections
  3. Search — Query collections with semantic understanding
  4. Integration — Connect to AI agents and applications

The AIF-BIN ecosystem provides open-source tools for each layer. Here's how they work together.

Layer 1: File Creation

AIF-BIN Lite

For basic file operations

Read, write, and convert AIF-BIN files. Suitable for scripting and automation. Minimal dependencies.

pip install aifbin-lite

Use Lite when you need programmatic access to the format without the full CLI. It's the core library that other tools build on.

from aifbin_lite import AIFBINFile

# Create a file
file = AIFBINFile()
file.add_chunk("Your text here", embedding=[...])
file.save("output.aif-bin")

# Read a file
file = AIFBINFile.load("input.aif-bin")
for chunk in file.chunks:
    print(chunk.text)

AIF-BIN Pro

For production workflows

Full CLI with batch processing, watch mode, AI extraction for PDFs/images, and multiple embedding model support.

pip install aifbin-pro

Pro is what you use for real document pipelines:

# Convert a directory of documents
aifbin-pro migrate ./documents -o ./memories -r -p

# Watch for changes and auto-convert
aifbin-pro watch ./documents -o ./memories

# Use a specific embedding model
aifbin-pro migrate ./docs -m bge-base -o ./output

Pro handles the complexity: chunking strategies, embedding model selection, parallel processing, and format conversion.
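To make "chunking strategies" concrete, here is a minimal fixed-size splitter with overlap, the simplest of the strategies a tool like Pro might apply. This is an illustrative sketch, not Pro's actual implementation; the function name and the `size`/`overlap` parameters are assumptions.

```python
def chunk_text(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into fixed-size chunks. Adjacent chunks share `overlap`
    characters so sentences near a boundary appear in both neighbors."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

# 1000 characters with size=512 and overlap=64 yields three chunks:
# [0:512], [448:960], [896:1000].
chunks = chunk_text("a" * 1000, size=512, overlap=64)
```

Overlap trades a little storage for recall quality: a query that matches text straddling a boundary still finds a chunk containing the full passage.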

Layer 2: Search and Retrieval

AIF-BIN Recall

Memory server for AI agents

Index collections, run semantic searches, serve results via HTTP API and MCP protocol.

npm install -g @terronex/aifbin-recall

Recall bridges the gap between files and applications:

# Index your memories
aifbin-recall index ./memories --collection my-project

# Start the server
aifbin-recall serve

# Search via HTTP
curl "http://localhost:3847/search?q=pricing+decisions"

# Or use the MCP protocol for AI agents
aifbin-recall mcp

Why a separate server? Files are great for storage and portability. But for real-time search across thousands of chunks, you need an index. Recall builds that index from your .aif-bin files and keeps the embeddings portable — if you move the files, you can rebuild the index in seconds.
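What the index accelerates is, at its core, nearest-neighbor search over chunk embeddings. A brute-force version of that search is easy to sketch; this is a generic illustration of the technique, not Recall's actual algorithm.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def search(query_vec: list[float], chunks: list[tuple[str, list[float]]],
           top_k: int = 3) -> list[str]:
    """Rank (text, embedding) pairs by similarity to the query vector."""
    scored = [(cosine(query_vec, emb), text) for text, emb in chunks]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]
```

This scan is O(n) per query, which is fine for hundreds of chunks but not for thousands of queries against thousands of chunks; that is the gap a server-side index fills. And because the embeddings stay in the files, the index is always disposable and rebuildable.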

Layer 3: AI Integration

Bot-BIN

Persistent memory for chatbots

Give your AI assistant memory that persists across conversations and sessions.

pip install bot-bin

Bot-BIN specializes in conversational memory:

from bot_bin import BotMemory

memory = BotMemory("./bot-memories")

# Store conversation context
memory.remember("User prefers concise answers")
memory.remember("Last discussed project: AIF-BIN")

# Recall relevant context
context = memory.recall("What does the user prefer?")
# Returns: "User prefers concise answers"

Under the hood, Bot-BIN uses AIF-BIN files. The memory is portable — move it to a new server, share it between bot instances, or back it up like any other file.
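Because the store is just files, the portability claim reduces to ordinary file operations. A minimal sketch, assuming a hypothetical layout where each memory is one file under the memory directory (not necessarily Bot-BIN's actual layout):

```python
import pathlib
import shutil
import tempfile

# Hypothetical memory directory with one stored memory file.
root = pathlib.Path(tempfile.mkdtemp())
src = root / "bot-memories"
src.mkdir()
(src / "note.aif-bin").write_bytes(b"placeholder contents")

# Backing up, migrating, or sharing the memory is a plain directory copy;
# no export step, no database dump.
dst = root / "bot-memories-backup"
shutil.copytree(src, dst)
```

The same property makes the memory trivial to version, sync, or restore with whatever file tooling you already use.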

Layer 4: SDKs

For building custom integrations, SDKs are available in 8 languages.

All SDKs implement the same AIF-BIN v2 specification, so files are interoperable across languages.

How the Pieces Fit Together

A typical workflow:

  1. Ingest — Use AIF-BIN Pro to convert your documents into .aif-bin files
  2. Organize — Store files in directories by project or topic
  3. Index — Use Recall to build a searchable index
  4. Query — Connect your AI agent via MCP or HTTP
  5. Maintain — Update files with Pro's watch mode; Recall re-indexes automatically

The key insight: embeddings live in the files, not the index. Move your .aif-bin files anywhere, re-index in seconds. No re-embedding, no API costs, no vendor lock-in.

Comparison with Alternatives

How does this compare to other approaches?

vs. Pinecone/Weaviate/Chroma: Vector databases are powerful but require running services and lock your embeddings in their format. AIF-BIN is file-based — no services required, portable by design.

vs. LangChain/LlamaIndex: These frameworks are great for orchestration but rely on external vector stores. AIF-BIN can serve as the storage layer, making your RAG pipeline fully portable.

vs. JSON + embeddings: It works, but the files come out roughly 50% larger and are slower to parse. AIF-BIN is purpose-built for the use case.
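The size gap is easy to see for the embedding vectors themselves: a float serialized as JSON text costs roughly 18 to 20 bytes, while the same value packed as a binary float32 costs 4. A quick comparison (illustrative only; this is not the actual AIF-BIN layout):

```python
import json
import random
import struct

random.seed(0)
embedding = [random.uniform(-1, 1) for _ in range(768)]  # a common embedding size

as_json = json.dumps(embedding).encode()                   # text digits per float
as_binary = struct.pack(f"{len(embedding)}f", *embedding)  # 4 bytes per float32

# as_binary is exactly 768 * 4 = 3072 bytes; the JSON encoding is several
# times larger before any of the surrounding structure is counted.
print(len(as_json), len(as_binary))
```

Binary packing also skips string-to-float parsing on load, which is where much of the read-time difference comes from.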

Getting Started

The fastest path to working AI memory:

# 1. Install the tools
pip install aifbin-pro
npm install -g @terronex/aifbin-recall

# 2. Convert your documents
aifbin-pro migrate ./documents -o ./memories -r

# 3. Start the search server
aifbin-recall index ./memories --collection my-docs
aifbin-recall serve

# 4. Query
curl "http://localhost:3847/search?q=your+question"

That's it. Your documents are now searchable by meaning, running entirely on your machine, with no cloud dependencies.
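The same query from the curl example can be issued from a script with only the standard library. The URL builder below just mirrors that request; the port and the `q` parameter come from the example above, while the commented-out fetch assumes a running server.

```python
import urllib.parse
import urllib.request

def search_url(query: str, base: str = "http://localhost:3847") -> str:
    """Build the search URL with a properly encoded query string."""
    return f"{base}/search?" + urllib.parse.urlencode({"q": query})

url = search_url("pricing decisions")

# To actually fetch results (requires `aifbin-recall serve` running):
# with urllib.request.urlopen(url) as resp:
#     print(resp.read().decode())
```

Using `urlencode` matters once queries contain spaces or punctuation that the hand-written curl string would mishandle.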

What's Next

The ecosystem is actively developing, and more tools are planned, all built on the same portable file format. Contributions are welcome at github.com/Terronex-dev.