zigllm
Learn how LLMs work by building one in Zig. 18 model families, 285+ tests, progressive architecture from tensors to text generation.
Key Features
- 📚 Progressive Architecture — 6 layers from tensors to text generation; each layer builds on the last.
- 🏗️ 18 Model Families — LLaMA, Mistral, GPT-2, Falcon, Mamba, BERT, Gemma, StarCoder, and more.
- ✅ 285+ Tests — every test is executable documentation: each demonstrates a concept and validates the math.
- ⚡ SIMD Acceleration — first-class SIMD intrinsics for a 3-5x speedup on matrix operations.
- 📉 18+ Quantization Formats — K-quantization and IQ-quantization, with up to 95% memory reduction.
- 🔧 Zig for ML/AI — comptime generics, manual memory control, no runtime, no garbage collector.
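To make the quantization idea concrete, here is a simplified symmetric 8-bit block format in the spirit of Q8_0: each block of 32 f32 weights becomes one f32 scale plus 32 i8 values. This is a sketch of the general technique, not one of zigllm's actual formats or its API.

```zig
const std = @import("std");

// One quantized block: 32 weights stored as a shared f32 scale + i8 values.
// ~132 bytes vs 128 f32 bytes... wait, vs 32 * 4 = 128 bytes of f32 this is
// 4 + 32 = 36 bytes, roughly a 3.6x reduction.
const Block = struct {
    scale: f32,
    qs: [32]i8,
};

// Quantize: pick a scale so the largest magnitude maps to ±127,
// then round each weight to the nearest i8 step.
fn quantize(weights: [32]f32) Block {
    var max_abs: f32 = 0;
    for (weights) |w| max_abs = @max(max_abs, @abs(w));
    const scale: f32 = if (max_abs == 0) 1.0 else max_abs / 127.0;
    var block = Block{ .scale = scale, .qs = undefined };
    for (weights, 0..) |w, i| {
        block.qs[i] = @intFromFloat(@round(w / scale));
    }
    return block;
}

// Dequantize a single element back to f32.
fn dequantize(block: Block, i: usize) f32 {
    return @as(f32, @floatFromInt(block.qs[i])) * block.scale;
}

test "block quantization round-trips within one quantization step" {
    var w: [32]f32 = undefined;
    for (&w, 0..) |*x, i| x.* = @as(f32, @floatFromInt(i)) - 16.0;
    const b = quantize(w);
    try std.testing.expectApproxEqAbs(@as(f32, -16.0), dequantize(b, 0), 0.1);
}
```

zigllm's K- and IQ-quantization formats refine this idea with sub-blocks, per-block minimums, and lower bit widths, which is how they reach much higher compression ratios.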
Quick Start
git clone https://github.com/cognisoc/zigllm.git
cd zigllm
zig build test # Run all 285+ tests
Prerequisites
- Zig 0.14+
- A modern CPU (AVX/AVX2 recommended but not required)
Progressive Architecture
zigllm builds understanding through 6 layers:
6. Inference — Text generation, sampling, KV caching, streaming
5. Models — LLaMA, GPT-2, Mistral, Falcon, GGUF loading, tokenization
4. Transformers — Multi-head attention, feed-forward networks, full blocks
3. Neural Primitives — Activations (SwiGLU, GELU), normalization (RMSNorm), RoPE
2. Linear Algebra — SIMD matrix ops, K-quantization, IQ-quantization (18+ formats)
1. Foundation — Tensors, memory management, memory mapping
Each layer only depends on the layers below it. Start at the bottom and work up.
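Layer 2's SIMD matrix ops rest on Zig's built-in `@Vector` type, which compiles directly to hardware SIMD instructions. A minimal dot product in that style — a sketch of the technique, not zigllm's actual API:

```zig
const std = @import("std");

// Dot product over a SIMD vector: element-wise multiply, then a horizontal
// add via @reduce. The length N is comptime, so the vector width is fixed
// at compile time and the compiler can emit AVX/NEON instructions directly.
fn dotSimd(comptime N: usize, a: [N]f32, b: [N]f32) f32 {
    const V = @Vector(N, f32);
    const va: V = a; // fixed-size arrays coerce to vectors
    const vb: V = b;
    return @reduce(.Add, va * vb);
}

test "simd dot product" {
    const a = [4]f32{ 1, 2, 3, 4 };
    const b = [4]f32{ 1, 1, 1, 1 };
    try std.testing.expectEqual(@as(f32, 10), dotSimd(4, a, b));
}
```

A real matmul kernel tiles the matrices and processes one SIMD vector of columns at a time, but the inner loop is exactly this pattern.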
Model Architectures
| Category | Architectures |
|---|---|
| Core LLMs | LLaMA/LLaMA2, Mistral, GPT-2, Falcon, Qwen, Phi, GPT-J, GPT-NeoX, BLOOM |
| Specialized | Mamba (state-space), BERT (bidirectional), Gemma, StarCoder (code) |
| Advanced | Mixture of Experts (MoE), Multi-modal (vision-language), BLAS integration |
Key Capabilities
- KV Caching — 20x speedup for autoregressive generation
- SIMD Acceleration — 3-5x speedup on matrix operations
- 18+ Quantization Formats — Up to 95% memory reduction
- Memory-Mapped Loading — Efficient model loading for large files
- Sampling — Greedy, top-k, top-p, temperature, Mirostat, grammar-constrained generation
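Greedy sampling, the simplest strategy in that list, just picks the token with the highest logit at each step. A minimal sketch (zigllm's actual sampler interface may differ):

```zig
const std = @import("std");

// Greedy sampling: return the index of the largest logit.
// Top-k, top-p, and temperature sampling all start from the same logits
// and add filtering or rescaling before the final pick.
fn argmax(logits: []const f32) usize {
    std.debug.assert(logits.len > 0); // caller must pass a non-empty slice
    var best: usize = 0;
    for (logits, 0..) |v, i| {
        if (v > logits[best]) best = i;
    }
    return best;
}

test "greedy picks the highest logit" {
    const logits = [_]f32{ 0.1, 2.5, -1.0, 0.7 };
    try std.testing.expectEqual(@as(usize, 1), argmax(&logits));
}
```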
Why Zig for ML?
Zig offers unique advantages for ML/AI workloads:
- Comptime generics — Type-safe tensor operations resolved at compile time
- First-class SIMD — Direct intrinsics without wrapper libraries
- Manual memory control — Deterministic allocation, no GC pauses
- No hidden allocations — Every allocation is explicit and trackable
- Cross-compilation — Build for any target from any host
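The comptime-generics point can be made concrete: encode matrix shapes in the type, and a shape mismatch in matmul becomes a compile error rather than a runtime crash. This is a hypothetical sketch, not zigllm's actual tensor API:

```zig
const std = @import("std");

// A type-level matrix: Matrix(2, 3) and Matrix(3, 2) are distinct types,
// so the compiler enforces shape compatibility for free.
fn Matrix(comptime R: usize, comptime C: usize) type {
    return struct {
        data: [R * C]f32,

        const Self = @This();

        // (R x C) * (C x K) -> (R x K). Passing a Matrix whose row count
        // is not C simply does not type-check.
        fn mul(self: Self, comptime K: usize, other: Matrix(C, K)) Matrix(R, K) {
            var out = Matrix(R, K){ .data = [_]f32{0} ** (R * K) };
            for (0..R) |r| {
                for (0..K) |k| {
                    var sum: f32 = 0;
                    for (0..C) |c| {
                        sum += self.data[r * C + c] * other.data[c * K + k];
                    }
                    out.data[r * K + k] = sum;
                }
            }
            return out;
        }
    };
}

test "shape-checked matmul" {
    const a = Matrix(2, 2){ .data = .{ 1, 0, 0, 1 } }; // identity
    const b = Matrix(2, 2){ .data = .{ 3, 4, 5, 6 } };
    const c = a.mul(2, b);
    try std.testing.expectEqual(@as(f32, 3), c.data[0]);
    try std.testing.expectEqual(@as(f32, 6), c.data[3]);
}
```

Because `Matrix` is an ordinary function returning a type, all of this resolves at compile time with zero runtime overhead.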