newest
Agent Judge: Solving Long-Context Evals for Production Agents
2 points, posted 5 hours ago
by gmays
Predicting the world cup winners with ML – live coding with Claude and Hopsworks
1 point, posted 5 hours ago
by jamesblonde
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity
1 point, posted 5 hours ago
by rndsignals
Show HN: Amanuensis – a local-first AI persona that won't fabricate facts
1 point, posted 5 hours ago
by msalsas
I built a tool that matches songs to driving speed
2 points, posted 5 hours ago
by hspeiser
Show HN: Preflight-UX – OSS toolkit to run UX product critiques
1 point, posted 5 hours ago
by Sparckix
Not having an opinion on SpaceX is going to cost you
1 point, posted 5 hours ago
by nutjob2
In plain sight: A new pink-fruited species of Actaea L. from New York
1 point, posted 5 hours ago
by PaulHoule
Smallweb Is Becoming an Archipelago
1 point, posted 5 hours ago
by speckx
UK Ranks Second-to-Last in New NATO Ranking
1 point, posted 5 hours ago
by jimjohnny123
PixelRAG – Retrieve and Read Web Pages as Screenshots Instead of HTML
2 points, posted 5 hours ago
by yichuan
ECB reins in Revolut over rapid-fire product launches
1 point, posted 5 hours ago
by nixass
Enjoyable Tasks, Contracting and Automation [pdf]
1 point, posted 5 hours ago
by pupperino
Fable in a Data Analytics Harness
1 point, posted 5 hours ago
by robertclaus
History of WYSIWYG editors and CMS: a timeline (2022)
1 point, posted 5 hours ago
by peter_d_sherman
The Missing Link Between Agents and Applications
1 point, posted 5 hours ago
by cbromann
The White House Freakout over the Epstein Files
13 points, posted 5 hours ago
by JumpCrisscross
Claude Fable 5 missed a bug that Sonnet 4.6 caught
3 points, posted 5 hours ago
by startages
SoftBank Attempt to Get $6B OpenAI Margin Loan Stalls
3 points, posted 5 hours ago
by adriand
The first century Roman aqueduct at Segovia carried water into the 1970s
5 points, posted 5 hours ago
by dxs
Building a serializable database in Rust, and measuring what it costs
2 points, posted 5 hours ago
by YahyaEhsansub
Speed Is a Signal: When Faster Replies Increase Hiring Likelihood
13 points, posted 5 hours ago
by speckx
Show HN: Which F1 Team Website Is Fastest?
2 points, posted 5 hours ago
by defly
Dimidium – terminal color scheme crafted with science
2 points, posted 5 hours ago
by microflash
Reviewing Code in the Agent Era
2 points, posted 5 hours ago
by cristinacordova
Infinite precision intermediate arithmetic: how much would break?
3 points, posted 6 hours ago
by logickkk1
Social media platforms warned over role in fuelling Belfast riots
3 points, posted 6 hours ago
by mmarian