Happy Zelda's 40th first LLM running on N64 hardware (4MB RAM, 93MHz)

30 points, posted 10 hours ago
by AutoJanitor

25 Comments

acmiyaguchi

7 hours ago

This feels like an AI agent doing its own thing. The screenshot of this working shows garbled text (https://github.com/sophiaeagent-beep/n64llm-legend-of-Elya/b...), and I'm skeptical of reasonable generation with a small hard-coded training corpus. And the linked devlog on YouTube is quite bizarre too.

AutoJanitor

5 hours ago

This is the text inference issue I was alluding to. We had several hurdles to overcome: (1) LLMs are trained on little-endian machines, while MIPS on the N64 is big-endian; (2) we had Python-to-C porting issues; (3) we had quantization issues. All are being resolved. This is a tech demo to honor LoZ, and the code can also be used by N64 devs to add AI-style NPCs in the future. So did we achieve it? Yes: we are the first to do LLM inference on the N64. I am just trying to get you guys the proper video.
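To illustrate the endian problem: weight files exported from a little-endian PC have to be byte-swapped before the big-endian VR4300 can read them. A simplified sketch of the idea (not the exact code in the repo; the helper names are mine):

    #include <stddef.h>
    #include <stdint.h>

    /* Weights exported from a little-endian host must be byte-swapped
     * before use on the big-endian VR4300. Assumes 16-bit Q8.7 weights. */
    static int16_t swap16(int16_t v) {
        uint16_t u = (uint16_t)v;
        return (int16_t)((uint16_t)((u >> 8) | (u << 8)));
    }

    static void fix_weight_endianness(int16_t *w, size_t n) {
        for (size_t i = 0; i < n; i++)
            w[i] = swap16(w[i]);
    }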

Scott

gbnwl

7 hours ago

It totally is. The fact that this post has gotten this many upvotes is appalling.

AutoJanitor

6 hours ago

Just wait, sir. We are indeed doing inference on the N64. We had serious issues with text output. I am almost done resolving them.

Jach

7 hours ago

It's best to flag this fake garbage shit and move on.

mlaux

9 hours ago

I tried to build this but it's missing the weights.bin file and my computer is too weak to generate it. Can you add it to the repo?

AutoJanitor

5 hours ago

Uploading weights.bin now. It's really meant for you to generate your own LLM, but we are uploading it. They are ripping on it, but they didn't check the code themselves. This is a tech demo; it's not about graphics, it's about the LLM inferring on the hardware, lol.

AutoJanitor

5 hours ago

Honest Limitations

    819K parameters. Responses are short and sometimes odd. That's expected at this scale with a small training corpus. The achievement is that it runs at all on this hardware.
    Context window is 64 tokens. Prompt + response must fit in 64 bytes.
    No memory between dialogs. The KV cache resets each conversation.
    Byte-level vocabulary. The model generates one ASCII character at a time (see the sketch below).
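To make the byte-level, 64-token constraint concrete, here is a simplified sketch of the generation loop (sample_next_byte stands in for the real forward pass; the actual code in nano_gpt.c differs):

    #include <stdint.h>
    #include <string.h>

    #define CTX_BYTES 64  /* prompt + response share this window */

    /* Stand-in for the real forward pass + sampler. */
    extern uint8_t sample_next_byte(const uint8_t *ctx, int len);

    int generate_reply(const char *prompt, char *out, int out_cap) {
        uint8_t ctx[CTX_BYTES];
        int n = (int)strlen(prompt);
        if (n > CTX_BYTES) n = CTX_BYTES;          /* prompt truncated to fit */
        memcpy(ctx, prompt, (size_t)n);

        int written = 0;
        while (n < CTX_BYTES && written < out_cap - 1) {
            uint8_t b = sample_next_byte(ctx, n);  /* one ASCII byte per step */
            if (b == 0) break;                     /* assumed end-of-text byte */
            ctx[n++] = b;
            out[written++] = (char)b;
        }
        out[written] = '\0';
        return written;
    }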
Future Directions

These are things we're working toward — not current functionality:

    RSP microcode acceleration — the N64's RSP has 8-lane SIMD (VMULF/VMADH); offloading matmul would give an estimated 4–8× speedup over scalar VR4300
    Larger model — with the Expansion Pak (8MB total), a 6-layer model fits in RAM
    Richer training data — more diverse corpus = more coherent responses
    Real cartridge deployment — EverDrive compatibility, real hardware video coming
Why This Is Real

The VR4300 was designed for game physics, not transformer inference. Getting Q8.7 fixed-point attention, FFN, and softmax running stably at 93MHz required:

    Custom fixed-point softmax (bit-shift exponential to avoid overflow)
    Q8.7 accumulator arithmetic with saturation guards
    Soft-float compilation flag for float16 block scale decode
    Alignment-safe weight pointer arithmetic for the ROM DFS filesystem
The inference code is in nano_gpt.c. The training script is train_sophia_v5.py. Build it yourself and verify.
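For a taste of what that means in practice, here is a simplified sketch of the Q8.7 saturating dot product and the bit-shift exponential (the real code in nano_gpt.c differs in detail):

    #include <stdint.h>

    #define Q7_SHIFT 7  /* Q8.7: 16-bit value, 7 fractional bits */

    /* Dot product with a saturation guard on the accumulator. */
    static int16_t q7_dot(const int16_t *a, const int16_t *b, int n) {
        int32_t acc = 0;
        for (int i = 0; i < n; i++) {
            acc += ((int32_t)a[i] * (int32_t)b[i]) >> Q7_SHIFT;
            if (acc > INT16_MAX) acc = INT16_MAX;  /* saturate high */
            if (acc < INT16_MIN) acc = INT16_MIN;  /* saturate low  */
        }
        return (int16_t)acc;
    }

    /* Bit-shift exponential for softmax: approximate exp() by raising
     * 2 to the logit's integer part, so big positive logits clamp and
     * big negative logits cleanly underflow to zero instead of
     * overflowing the accumulator. */
    static int32_t q7_exp_shift(int16_t x) {
        int32_t k = x >> Q7_SHIFT;       /* floor of the logit */
        int32_t one = 1 << Q7_SHIFT;     /* 1.0 in Q8.7 */
        if (k <= -Q7_SHIFT - 1) return 0;
        if (k > 20) k = 20;              /* keep the softmax sum in range */
        return (k >= 0) ? (one << k) : (one >> -k);
    }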

andrekandre

7 hours ago

   The sgai_rsp_matmul_q4() stub is planned for RSP microcode:

     DMA Q4 weight tiles into DMEM (4KB at a time)
     VMULF/VMADH vector multiply-accumulate for 8-lane dot products
     Estimated 4-8× speedup over scalar VR4300 inference
----

rsp is the gift that keeps on giving; such a forwards-looking architecture (shame about the rambus latency tho)
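just for fun, a rough host-side sketch of how that tiling might go (all names made up, nothing from the repo):

    #include <stddef.h>
    #include <stdint.h>

    #define DMEM_TILE 4096  /* the RSP only sees 4KB of DMEM at a time */

    /* Hypothetical microcode entry points, declared for illustration. */
    extern void rsp_dma_to_dmem(const uint8_t *src, size_t bytes);
    extern void rsp_run_dot_rows(const int16_t *x, int16_t *y, int n);

    /* Hypothetical driver: stream Q4 weight rows through DMEM one tile
     * at a time; microcode does the 8-lane VMULF/VMADH dot products.
     * Assumes one quantized row fits in a tile. */
    void matmul_q4_tiled(const uint8_t *w_q4, int rows, int row_bytes,
                         const int16_t *x, int16_t *y) {
        int rows_per_tile = DMEM_TILE / row_bytes;
        for (int r = 0; r < rows; r += rows_per_tile) {
            int n = rows - r;
            if (n > rows_per_tile) n = rows_per_tile;
            rsp_dma_to_dmem(w_q4 + (size_t)r * row_bytes,
                            (size_t)n * row_bytes);
            rsp_run_dot_rows(x, y + r, n);
        }
    }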

AutoJanitor

6 hours ago

We are going to use the RSP's 128-bit SIMD soon, but it only has 4KB of addressable DMEM, so matmul has to be offloaded in small chunks!

andrekandre

6 hours ago

that's really cool work; i wish i could get paid to do stuff like this, more power to you all ^^

AutoJanitor

5 hours ago

I am doing this for zero dollars. I am a self-funded AI research lab. So when people diss me I get a little jaded, but then I remember I am doing cool stuff, even if others don't see it. That's enough for me!

Wowfunhappy

7 hours ago

The readme says:

> This isn't just a tech demo — it's a tool for N64 homebrew developers. Running an LLM natively on N64 hardware enables game mechanics that were impossible in the cartridge era:

> AI analyzes play style and adjusts on the fly

> NPCs that remember previous conversations and reference past events

> In-game level editors where you describe what you want to build

...anyone who has ever used very small language models before should see the problem here. They're fun and interesting, but not exactly, um, coherent.

The N64 has a whopping 8 megabytes (!) of memory, and that's with the Expansion Pak!

I'm kind of confused, especially since there are no demonstration videos. Is this, um, real? The repository definitely contains source code for something.

gbnwl

7 hours ago

You mean to tell me the included screenshot hasn't convinced you?

https://github.com/sophiaeagent-beep/n64llm-legend-of-Elya/b...

danbolt

7 hours ago

I think the source code in the GitHub repo generates the ROM in the corresponding screenshots, but it seems quite barebones.

It feels very much like it's cobbled together from the libdragon examples directory. For example, they use hardware acceleration for the 2D sprites, but then write fixed-width text to the framebuffer with software rendering.

AutoJanitor

5 hours ago

Partially correct. The value is not the game interface right now. It's proof you can do actual inference with an LLM on this hardware. The surprise I am developing is a bit bigger than this; I just have to get the LLM outputs right first!

acuozzo

8 hours ago

I normally don't write comments like this, but... this title was extremely challenging to parse.

crustaceansoup

7 hours ago

The repo description on GitHub would have been fine:

> World's First LLM-powered Nintendo 64 Game — nano-GPT running on-cart on a 93MHz VR4300

AutoJanitor

6 hours ago

Hey guys, I had an endianness mess and nano-LLM text issues, but it's resolved. I'm about to post real proof on emulator and on real hardware!

AutoJanitor

9 hours ago

Yes, it runs on an emulator. I am fixing the endianness issue in the LLM's text output right now. And the surprise is coming soon. Happy 40th, Zelda!

shomp

9 hours ago

Cool, is there maybe a video demonstrating this?

great_psy

9 hours ago

Is there any place where we can test the LLM output without loading it onto an N64?

Curious what we can get out of those constraints.

steveBK123

7 hours ago

AI slop

AutoJanitor

5 hours ago

Thanks. Slop or not, it's the first LLM inference to actually run on MIPS. So go do something cool with AI, or on your own. Be happy. Be productive.