MNIST neural network implemented in pure x86 assembly from scratch

5 points posted 3 months ago
by mghaderi

3 Comments

mghaderi

3 months ago

I implemented a neural network from scratch in x86 assembly (no frameworks, no Python) to recognize handwritten digits from MNIST. It uses AVX-512 SIMD for parallel float32 operations (~7× faster than NumPy) and runs in a lightweight Debian Slim Docker container. The goal was to understand neural networks at the CPU level. Feedback on performance optimizations or next steps is welcome.
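For anyone wondering what "parallel float32 ops" looks like in practice, here is a minimal sketch in portable C. It is not the project's assembly. It only models the idea that one 512-bit register holds 16 float32 values, so a dot product can keep 16 partial sums at once. The names `dot_f32` and `dense_relu` are hypothetical, and the ReLU activation is an assumption, since the post does not say which activation the network uses.

```c
#include <stddef.h>

/* One 512-bit AVX-512 register holds 16 float32 values. */
#define LANES 16

/* Dot product of two float32 vectors. It keeps LANES independent partial
   sums, mirroring how a SIMD loop keeps one running sum per lane. */
float dot_f32(const float *a, const float *b, size_t n)
{
    float acc[LANES] = {0};
    size_t i = 0;

    /* Main loop: each outer iteration models one vector multiply-add. */
    for (; i + LANES <= n; i += LANES)
        for (size_t l = 0; l < LANES; ++l)
            acc[l] += a[i + l] * b[i + l];

    /* Horizontal reduction: fold the per-lane sums into one scalar. */
    float sum = 0.0f;
    for (size_t l = 0; l < LANES; ++l)
        sum += acc[l];

    /* Scalar tail for lengths that are not a multiple of LANES. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}

/* Dense layer with row-major weights:
   out[j] = max(0, bias[j] + dot(row j of w, x)). ReLU is assumed. */
void dense_relu(const float *w, const float *bias, const float *x,
                float *out, size_t n_out, size_t n_in)
{
    for (size_t j = 0; j < n_out; ++j) {
        float z = bias[j] + dot_f32(w + j * n_in, x, n_in);
        out[j] = z > 0.0f ? z : 0.0f;
    }
}
```

A compiler may or may not auto-vectorize the inner loop. In hand-written assembly the same structure would presumably map to one fused multiply-add per 16 elements, followed by a single reduction at the end.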

checker659

3 months ago

> ~7× faster than NumPy

Is that on the CPU? (I'm not sure whether NumPy has a GPU backend.)

mghaderi

3 months ago

Yes, on the CPU, with the same resources and the same implementation.