Hey HN, I'm the author.
The problem I tried to work on: you're evaluating 5 promising papers for your project. Which one actually works for your use case? Right now, the only way to find out is to implement all 5 yourself, which takes days to weeks.
Tomea auto-generates PyTorch implementations from arXiv papers, runs experiments on cloud GPUs with self-healing, and returns comparative benchmarks. The idea: see what works before committing engineering time.
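To make the self-healing part concrete, here's a heavily simplified sketch of the core loop: run the generated script, and if it crashes, feed the traceback back to the model and retry. Function names (run_experiment, regenerate, etc.) are illustrative, not the actual code:

    import subprocess
    import sys

    MAX_ATTEMPTS = 3  # illustrative retry budget

    def run_experiment(script_path: str) -> subprocess.CompletedProcess:
        # Run the generated training script in a subprocess, capturing stderr.
        return subprocess.run(
            [sys.executable, script_path],
            capture_output=True, text=True, timeout=3600,
        )

    def self_heal(script_path: str, regenerate) -> bool:
        # regenerate(source, error) -> new source; stands in for the LLM repair call.
        for _ in range(MAX_ATTEMPTS):
            result = run_experiment(script_path)
            if result.returncode == 0:
                return True  # run succeeded; benchmarks can be collected
            # Feed the failure back to the model and rewrite the script.
            with open(script_path) as f:
                source = f.read()
            with open(script_path, "w") as f:
                f.write(regenerate(source, result.stderr))
        return False  # give up after the retry budget is exhausted

The real system is more involved than this, but that's the basic shape of the loop.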
Reality check: when I asked some ML engineers I know, reactions were mixed. Some want exactly this to filter papers fast; others think manual implementation is part of the learning. It probably depends on whether you're doing research or applied work.
I'm 18 and this was my first attempt at agentic dev tools. Would love honest feedback and criticism: would you use something like this? Why or why not?
Technical feedback especially appreciated on the self-healing approach and benchmarking.