Show HN: I Scraped 2,200 Software Engineering Jobs from Career Pages Using LLMs

8 points, posted 3 days ago
by kylem866

5 Comments

wbakst

19 hours ago

Cool stuff! I wish there were a fuzzy search or filter bar to make it easier to find more specific things.

I'm also curious, what are you using to structure the outputs?

toomuchtodo

3 days ago

You should connect with the person building https://hiring.cafe. They are scraping something like 1.6M jobs using ChatGPT, so there might be some collaboration opportunity or knowledge transfer.

https://news.ycombinator.com/item?id=42806956

Worst case, proven pattern to emulate. Wishing you success!

kylem866

3 days ago

Thanks! I've actually already sent a message to the hiring cafe creator and didn't hear back. Might be worth another shot.

spicy_ranch

3 days ago

I really enjoy the simple, elegant design and look of this site. Well done!

I did notice that the mid-level jobs are returning mainly senior roles though.

kylem866

3 days ago

Thanks! Yeah, I have noticed accuracy problems with the seniority too. I'm using 4o-mini + structured output to extract the seniority. Currently the seniority output is defined as an array to handle edge cases where a job could technically be either mid-level or senior. In practice, though, the LLM is overeager at assigning multiple seniorities: it frequently gives a mid-level seniority to jobs that literally have 'Senior' in the title. I'll work on it!
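
One possible way to rein this in, sketched here under assumptions rather than taken from the site's actual code: keep the array output, but run a small post-processing pass that lets an explicit keyword in the job title override the model's multi-label guess. The label set `LEVELS`, the keyword map `TITLE_HINTS`, and the function name `reconcile_seniority` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical seniority labels; the site's real label set is not stated in the thread.
LEVELS = ["intern", "junior", "mid", "senior", "staff", "principal"]

# Title keywords that pin a level outright (illustrative, not exhaustive).
TITLE_HINTS = {
    "intern": "intern",
    "junior": "junior",
    "jr": "junior",
    "senior": "senior",
    "sr": "senior",
    "staff": "staff",
    "principal": "principal",
}


def reconcile_seniority(title, llm_levels):
    """Prefer explicit title keywords over the model's multi-label output."""
    # Split the title into lowercase alphabetic words so that
    # "Internal" does not match "intern" and "Sr." matches "sr".
    words = re.findall(r"[a-z]+", title.lower())
    pinned = {TITLE_HINTS[w] for w in words if w in TITLE_HINTS}
    if pinned:
        # The title names a level explicitly: ignore the model's extra labels.
        return [lvl for lvl in LEVELS if lvl in pinned]
    # No keyword in the title: keep the model's labels, dropping unknown ones.
    return [lvl for lvl in LEVELS if lvl in llm_levels]


if __name__ == "__main__":
    print(reconcile_seniority("Senior Software Engineer", ["mid", "senior"]))
    print(reconcile_seniority("Software Engineer II", ["senior", "mid"]))
```

With this rule, a title containing 'Senior' keeps only the senior label, while ambiguous titles such as "Software Engineer II" still fall back to whatever the model returned. An alternative would be to change the structured-output schema to a single value instead of an array, at the cost of losing the mid/senior edge case the array was meant to cover.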