Untitled @rostglukhov - Tumblr Blog

Token budgeting, fallback models, and caching strategies that cut LLM API bills. With real numbers, hardware break-even analysis, and working Python code.

#LLM #AI #Cost #Optimization #Local #Inference

How to design short-term, long-term, and structured memory for AI assistants, with retrieval mechanics, tradeoffs, failure modes, and real patterns from OpenAI, LangGraph, Hermes, and OpenClaw.

#Hermes #OpenClaw #Architecture #LLM #AI #RAG #SelfHosting

Build self-hosted AI systems with OpenClaw, Hermes, RAG, and local LLM infrastructure. Learn to orchestrate assistants with memory, retrieval, routing, and observability.

#AI #LLM #SelfHosting #OpenClaw #Hermes #RAG #Observability

A deep technical guide to AI assistant architecture: LLMs, memory, tools, routing, and observability, with real tradeoffs, failure modes, and design patterns.

#Hermes #OpenClaw #Architecture #LLM #AI #Coding #Dev #DevOps #RAG

A practical guide to AI-augmented knowledge management, from summarisation and extraction to semantic linking, local models, APIs, and review loops.

#LLM #AI #knowledge-management #rag

Explore shared database, separate schema, and database-per-tenant patterns for multi-tenant apps. Learn trade-offs, security, and when to use each approach, with examples in Go.

#SQL #DevOps #Dev #Privacy #Go #Architecture

Parallel execution of table-driven tests in Go: Learn best practices, avoid race conditions, and optimize test performance with t.Parallel() and subtests.

#Go #Golang #Dev #DevOps

Master Go unit testing with the built-in testing package, table-driven tests, mocks, coverage analysis, and industry best practices for robust Go applications.

#Go #Golang #Dev #DevOps

A practical Zettelkasten guide for developers: write atomic notes, link concepts to code, avoid folder traps, and build a useful knowledge system.

#Obsidian #Logseq #Knowledge-Management

Real-world OpenClaw production setups combining plugins and skills by user type, with practical architecture patterns for reliability, workflows, and scale.

#openclaw #Architecture #SelfHosting #LLM #AI #Privacy #Security

Full data: 20 AI agent repos ranked by GitHub stars, OpenRouter daily tokens, npm/PyPI downloads, CVE history, ecosystem size, and Reddit sentiment.

#Hermes #OpenClaw #Community #SelfHosting #LLM #AI

Benchmark results for Qwen 3.6 27B and 35B MTP speculative decoding in llama.cpp on RTX 4080 16GB. Token speed, VRAM cost, and optimal --spec-draft-n-max settings.

#SelfHosting #LLM #AI #llama.cpp #NVidia #Hardware

Learn how to unload every loaded llama.cpp router model with curl and jq, free VRAM safely, and avoid restarting llama-server in local LLM workflows.

#Cheatsheet #Self-Hosting #SelfHosting #LLM #AI #DevOps #llama.cpp

RAG retrieves fragments on demand. LLM Wiki compiles structured knowledge before any question is asked. Learn when ingest-time synthesis beats query-time retrieval, and when it does not.

#wiki #knowledge-management #rag #ai-systems #knowledge-systems #agentic-ai #Architecture #LLM #AI

Compare PKM, RAG, wikis, and AI memory systems by structure, retrieval, ownership, evolution, and real-world use cases.

#rag #wiki #ai #knowledge-management #Architecture

Personal Knowledge Management: what it is, its goals, and the methods and tools to use in 2025.

#Offline #knowledge-management
