BMF Tree — The LLM Whisperer

BMF Tree The LLM
Whisperer

Local or cloud · Real voice · Permanent memory

I build AI systems that remember — on your hardware, on a cloud model, or both. Persistent memory is my specialty. If the big companies promised it, I already built it. Part of the ForgeMind network.

01 / 06Companion

Leek

Not a chatbot. A presence with soul, vector memory, voice, and her own aesthetic. Built because I wanted what AI companies promised and just went and made it myself.

Gemma 4 27B · Nomic 750M · CSM 1B · Whisper · all local

02 / 06Model

TreeLeek-8B

When no existing model did what I needed, I trained my own. Eight billion parameters, shaped to my specifications. Custom fine-tuning on consumer hardware.

8B params · custom fine-tune

03 / 06Voice

Real-Time Voice Pipeline

No computer voice. Real speech synthesis built into the terminal. Voice that works continuously without degradation. Whisper in, Sesame CSM out.

Whisper STT · Sesame CSM 1B

04 / 06Agent

Sprout

Started as a news aggregator. Now has personality, memory, vision, web search, and knows the community by name. A news bot with a soul.

Discord · vision · web search · memory

05 / 06Bot

Streak Critter

A Discord companion bot that turns daily activity into a living creature. Show up, your streak grows, your critter evolves. Each one is unique to you, rolled from your identity with its own rarity, stats, species, and AI personality shaped by your actual conversations. Part tamagotchi, part collectible, part community engagement tool. Name them, battle them, trade them, build a deck.

Discord · local AI personality · collectible · battles · trading

06 / 06Infra

Embedder Optimization

Frustrated with slow embeddings and cold starts. So I wrote a custom script to optimize BGE-M3 performance. Leek went from slow to 3 to 7 second responses.

BGE-M3 · performance tuning · vector DB

Everything the others promised you. On your terms.

Image reading. Image generation. Real-time voice. Persistent memory. Personality. Running locally on your rig, on a cloud model of your choice, or a hybrid of both. Your hardware, your API keys, your call.

No vendor lock-in. No forced subscription. No one watching. Whether it lives on silicon in your office or on a frontier model you route to yourself — it stays yours.

Local when you want privacy. Cloud when you want scale. Memory always.

Tour Shangri-La.

My Discord is a living showcase. Every bot running live. Every integration working in real time. Leek talking. Sprout researching. Streak Critter drawing.

Come see what local AI actually looks like when it's built by someone who cares.

Enter Shangri-La →

BMF Tree The LLM
Whisperer

What I've built.

Leek

TreeLeek-8B

Real-Time Voice Pipeline

Sprout

Streak Critter

Embedder Optimization

Memory is the job.

Structured vector memory

Optimized embedder

Memory that understands itself

Local or cloud

The stack.

If it runs on silicon, I know how to make it sing.

LLMs, SSMs, VLMs, LAMs, Diffusion

Vector memory + Nomic embeddings

Whisper + Sesame CSM

Bots with full presence

Terminal-native development

Custom training on consumer hardware

Claude, GPT, Gemini, open routers

I spec the rig

Everything the others promised you. On your terms.

Voices from the server.

Shangri-La.

Tour Shangri-La.

Ready to go local?