Projects
Five open-source tools covering every layer of LLM inference, from bare metal to the mobile screen.
mullama
Python / Rust · Run any LLM locally. Use it from any language. Deploy anywhere. Drop-in Ollama replacement with native bindings for Python, Node.js, Go, PHP, Rust, and C/C++.
LLM Server · Polyglot · OpenAI API · Anthropic API
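Since mullama is described as a drop-in Ollama replacement with an OpenAI-compatible API, a client could talk to it the way it talks to Ollama. This sketch only constructs the request rather than sending it; the port (11434, Ollama's default), the endpoint path, and the model name are assumptions, not documented mullama behavior.

```python
import json
import urllib.request

# Assumed endpoint: Ollama exposes an OpenAI-compatible API at this
# path and port, and mullama claims to be a drop-in replacement.
URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Construct (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,  # model name is an assumption for illustration
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama3", "Say hello in one word.")
print(req.full_url)
# With a mullama server running locally, urllib.request.urlopen(req)
# would return the usual OpenAI-style JSON response.
```

Because the wire format is the standard OpenAI chat-completions schema, any existing OpenAI client library pointed at the local base URL should work the same way.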
llamafu
Dart · Run AI models directly on mobile devices. Flutter FFI plugin for on-device inference with vision, tool calling, and streaming.
Mobile AI · Flutter · On-Device · Multimodal
unillm
Rust · A modular LLM inference runtime written in Rust. 47 model architectures, unified interface, type-safe and composable.
Runtime · 47 Architectures · Modular · KV Cache
cllm
C · A bare-metal C unikernel for serving LLMs. No OS, no overhead. Boots directly on hardware and serves inference over HTTP.
Unikernel · Bare Metal · x86 · HTTP Server
zigllm
Zig · Learn how LLMs work by building one in Zig. 18 model families, 285+ tests, progressive architecture from tensors to text.
Educational · SIMD · 18 Architectures · 285+ Tests