JanitorBench: A new LLM benchmark for multi-turn chats

25 points posted 10 hours ago
by shep101

27 Comments

tomhow

3 hours ago

We've killed all the green accounts that voted and commented on this, and moved the comments to a stub to hide them. We've also killed the post.

The HN guidelines and FAQ make it clear that it's not OK to ask people to upvote or comment on your stuff:

https://news.ycombinator.com/newsguidelines.html

https://news.ycombinator.com/newsfaq.html

And the HN community is very quick to notice it, flag the post and comments, and email us, all of which happened here.

There's no need to try to game HN like this. We're always looking for interesting new projects to showcase via Show HN, and we routinely spend substantial amounts of time helping people polish their Show HN posts. If we think the community may find a post interesting, we'll put it in the second chance pool (https://news.ycombinator.com/pool, explained here: https://news.ycombinator.com/item?id=26998308), which guarantees it a bit of front page time without the bad vibes and negative sentiment you attract from trying to game HN.

tomhow

3 hours ago

[stub for green-account comments]

user

7 hours ago

[deleted]

Sahadia

9 hours ago

[dead]

hugorsmith

9 hours ago

You can see the jllm benchmarks in the table as well ('janitor-llm').

user

4 hours ago

[deleted]