Maia 200: The AI accelerator built for inference

4 points, posted 9 hours ago
by boulos

1 comment

jauntywundrkind

9 hours ago

The deep dive is very fun.

10 PFLOPS @ 4-bit is sick, in a 750 W envelope.
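
Back-of-the-envelope on that (the 10 PFLOPS and 750 W figures are as quoted; the efficiency calculation is mine):

    # Quoted specs: 10 PFLOPS at 4-bit precision in a 750 W envelope
    pflops = 10.0                  # 10 * 10^15 ops/s
    watts = 750.0
    tflops_per_watt = pflops * 1000 / watts
    print(f"{tflops_per_watt:.1f} TFLOPS/W")  # ~13.3 TFLOPS/W at 4-bit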

I'm particularly impressed by the networking. On-chip networking good for 2.8 TB/s, which works out to 22,400 Gb/s of Ethernet-equivalent bandwidth. Eat your heart out, 800GbE! With their own RDMA-centric AI transport layer!
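
Sanity-checking the units (the 2.8 TB/s figure is from the deep dive; the conversion is mine):

    # 2.8 TB/s of on-die network bandwidth, in Ethernet-equivalent terms
    tb_per_s = 2.8
    gb_per_s = tb_per_s * 8 * 1000     # TB/s -> Gb/s
    print(gb_per_s)                    # 22400.0, i.e. "22,400 GbE"
    print(gb_per_s / 800)              # 28.0 -> 28x a single 800GbE port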

The hyperscalers are going to devour chip making at this rate. No one else can connect chips like they can. The commercial market, with Ultra Ethernet or whatever else, is just being totally left in the dust: a discrete NIC adds huge cost (and board space), whereas the smart, obvious thing is to have more of the network on the chip itself.