BIO – The Bao I/O Co-Processor

4 pointsposted 12 hours ago
by hasheddan

2 Comments

theamk

an hour ago

It's a very nice write-up, but this part makes me uneasy:

> So long as all the computation in the loop finishes before the next quantum, the timing requirements [...] are met.

Seems like we are back to cycle counting then? but instead of having just 32 1-IPC instructions, we have up to 4K instructions with various latency, and there is C compiler too, so even if you had enough cycles in budget now, the things might break when compiler is upgraded.

I am wondering if the original PIO approach was still salvageable if the binary compatibility is not a goal. Because while co-processors are useful, people did some amazing things with PIO, like fully software DVI.

jauntywundrkind

10 hours ago

Really glad to get this write-up, adds a very nice broad picture & does a good job introducing the queue too.

I'm an unranked unwashed neophyte at hardware design, but I did spend some time looking at BIO. One particular thing that caught my eye a while ago was Streaming Semantic Registers, which is an instruction set extension for risc-v where load and store are implicit, with data pointers that automatically walk on each instruction. This greatly increases code density, allowing for DSP like capabilities on risc-v. https://arxiv.org/abs/1911.08356

I forget how exactly I was convinced, but after spending a while chatting with the LLM, I became somewhat convinced that the FIFO queues here gave a lot of similar capabilities. With additional interesting use for decoupling multiple systems. Register mapped data arrays, that can be used without having to load/store each word. I felt then and felt now that I still have a good bit to learn about how exactly each of the FIFO registers works, but it was cool to see, and I love this idea of code that can run without having to issue endless load/stores all the time.