2 posts tagged with "rag"

Rust Concurrency Patterns for AI Agents

January 16, 2026 · 7 min read

fr4nk

Software Engineer

Production patterns for building fast, concurrent AI agents in Rust.

Production-Ready Text Embeddings with WebAssembly: WasmEdge + GGML

November 9, 2025 · 8 min read

fr4nk

Software Engineer

Building production ML inference services that run anywhere, from a Raspberry Pi to the cloud edge, requires a different approach. This article walks through a complete implementation of a text embedding API using WasmEdge, GGML, and Rust: a 136 KB WASM module paired with a 1.8 MB async HTTP server that computes embeddings in roughly 100-200 ms per request.

Full implementation: github.com/porameht/wasmedge-ggml-llama-embedding