Show HN: Τ³-Bench is out – can agents handle complex docs and live calls?

9 points posted 4 hours ago
by victorbarres

Item id: 47520448

1 Comment

sohamray19

3 hours ago

Was talking to some AI labs yesterday that use their own voicified version of tau bench, one that is half-duplex with clean audio. Hopefully we can move to tau voice for more representative environments.

Also brought up questions about how multimodal models handle knowledge and context rot, and it seems like an open question so far.