Our Mission

AI inference should run everywhere.

We believe large language models shouldn't be locked behind cloud APIs. Every device — from a mobile phone to a bare-metal server — should be able to run AI locally. That's why we're building the full inference stack, open source.

The Problem

  • Cloud LLM APIs create vendor lock-in and unpredictable costs
  • Sensitive data must leave your infrastructure for every inference call
  • Existing local tools only solve one layer of the stack
  • Most languages have no native LLM bindings — HTTP is the only option
  • Mobile and edge devices are underserved by current inference tooling

Our Approach

  • Build every layer of the inference stack as open source
  • Provide native bindings for 6+ programming languages
  • Target every deployment surface: servers, desktops, mobile, bare metal
  • Maintain API compatibility with existing ecosystems (Ollama, OpenAI); see the sketch after this list
  • Invest in education to grow the community of local-AI practitioners
  • Explore open hardware reference designs purpose-built for local inference
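
Compatibility with the OpenAI wire format means existing client code can point at a local endpoint instead of a hosted one. The sketch below is illustrative only: it assumes a Cognisoc server such as mullama is listening locally and exposes the standard OpenAI-style /v1/chat/completions route; the port, model name, and crate choices (reqwest, serde_json) are placeholders rather than documented Cognisoc defaults.

```rust
// Sketch: calling a local, OpenAI-compatible chat endpoint from Rust.
// Assumes a server such as mullama is running on this machine; the port,
// route, and model name are placeholders, not documented Cognisoc defaults.
use serde_json::{json, Value};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let request = json!({
        "model": "llama3",
        "messages": [{ "role": "user", "content": "Summarize why local inference matters." }]
    });

    // Same request shape a hosted OpenAI-style API accepts,
    // but the data never leaves the machine.
    let response: Value = reqwest::blocking::Client::new()
        .post("http://localhost:11434/v1/chat/completions")
        .json(&request)
        .send()?
        .error_for_status()?
        .json()?;

    println!("{}", response["choices"][0]["message"]["content"]);
    Ok(())
}
```

Because the wire format matches, moving an existing OpenAI-style client between a cloud endpoint and a local one is typically just a base-URL change.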

The Cognisoc Stack

Five projects, each purpose-built for a specific layer of the inference problem.

unillm

Modular Rust inference runtime — the engine that powers everything

mullama

Local LLM server with polyglot bindings — making inference accessible

llamafu

Mobile inference via Flutter — AI in every pocket

cllm

Bare-metal unikernel — pushing inference to the silicon

zigllm

Educational implementation — growing the next generation of ML engineers

By the Numbers

  • 47 model architectures
  • 6 language bindings
  • 7 GPU backends
  • 5 deployment targets

What's Next: Open Hardware

Software is only half the stack. We're exploring open hardware reference designs optimized for local LLM inference — purpose-built boards and configurations designed to run Cognisoc software from boot.

Inference Accelerators

Single-board designs with NPUs and RISC-V cores, running cllm directly on bare metal.

FPGA Capes

Reconfigurable accelerator boards for custom quantization and novel attention mechanisms.

Cluster Blueprints

GPU cluster rack designs with optimized networking for distributed inference with unillm.

Open schematics. Open firmware. Open software. If you're in the hardware space, reach out.

Open Source, Open Future

Every project in the Cognisoc ecosystem is open source under MIT or Apache-2.0 licenses. We believe the infrastructure layer for AI inference should be a public good — not a proprietary moat.

We welcome contributions from developers, researchers, and organizations who share our vision of democratizing LLM inference. Whether it's adding a new model architecture to unillm, improving mobile performance in llamafu, or writing educational content for zigllm — there's a place for you.

Get Involved

Whether you're a developer, investor, or cloud architect — we'd love to hear from you.