Show HN: Automated tech news site with custom multi-LLM agent pipelines

3 points, posted 6 hours ago
by siddkgn

3 Comments

siddkgn

6 hours ago

I built this autonomous pipeline to see if agentic orchestration could replicate a high-quality editorial desk with zero manual overhead. It is a tech news stream that removes the "noise" (deals, opinions, fluff) using a multi-model agentic approach.

The Agentic Pipeline (runs every 2 hours):

I custom-coded the orchestration to swap LLMs based on their specific strengths:

1. Discovery: Scrapes raw feeds, removes duplicates, and checks against the published cache.

2. Classification (default: Gemini): Filters out non-tech news and "opinion" pieces. Gemini's large context window makes it a good fit for high-volume filtering.

3. Prioritization: Selects the top 5 most impactful stories from the filtered list.

4. Authoring (default: GPT-4o): Drafts the report based on the raw facts provided by the Discovery agent.

5. Proofreading (default: Sonnet 3.5): Handles the final edit to ensure a human-like tone and fact-checks the draft against the source.
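For readers who want something concrete, here is a minimal sketch of how the five stages above could be chained with a per-stage default model. It is not the actual code behind the site. The `Story` type, the `llm` callable, the prompts, and the model labels in `STAGE_MODELS` are all assumptions made for illustration, and prioritization is reduced to a simple slice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical signature: (model_label, prompt) -> completion text.
# A real client per provider would be wrapped to match it.
LLMCall = Callable[[str, str], str]

# Per-stage defaults from the list above. These are illustrative labels,
# not real API model identifiers.
STAGE_MODELS: Dict[str, str] = {
    "classification": "gemini",
    "authoring": "gpt-4o",
    "proofreading": "sonnet-3.5",
}


@dataclass
class Story:
    url: str
    title: str
    body: str


def discover(raw: List[Story], published: Set[str]) -> List[Story]:
    """Stage 1: drop duplicates and anything already published."""
    seen: Set[str] = set()
    fresh: List[Story] = []
    for story in raw:
        if story.url in seen or story.url in published:
            continue
        seen.add(story.url)
        fresh.append(story)
    return fresh


def classify(stories: List[Story], llm: LLMCall) -> List[Story]:
    """Stage 2: keep only stories the classifier model answers YES for."""
    kept: List[Story] = []
    for story in stories:
        prompt = (
            "Is this tech news rather than a deal or an opinion piece? "
            "Answer YES or NO.\n"
            f"Title: {story.title}"
        )
        answer = llm(STAGE_MODELS["classification"], prompt)
        if answer.strip().upper().startswith("YES"):
            kept.append(story)
    return kept


def run_cycle(raw: List[Story], published: Set[str], llm: LLMCall,
              top_n: int = 5) -> List[str]:
    """Run one cycle and return the finished reports."""
    candidates = classify(discover(raw, published), llm)
    selected = candidates[:top_n]  # Stage 3 placeholder: first N survivors
    reports: List[str] = []
    for story in selected:
        draft = llm(STAGE_MODELS["authoring"],
                    f"Write a short report from these facts:\n{story.body}")
        final = llm(STAGE_MODELS["proofreading"],
                    f"Edit this draft and check it against the source.\n"
                    f"DRAFT:\n{draft}\nSOURCE:\n{story.body}")
        reports.append(final)
    return reports
```

Passing the model call in as a plain function keeps the stages testable with a stub, and swapping a stage's model is a one-line change in `STAGE_MODELS`.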

The Lean Tech Stack:

- Backend: Custom Python orchestration.

- Publishing: WordPress API (Website) + X API (Twitter) + Zapier (LinkedIn).

- Stateless: I bypass a local database entirely, using the WordPress REST API as my primary content store.

- Optimized: A "Non-News Cache" prevents re-processing URLs already identified as noise, which saves on token costs.
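One way a "Non-News Cache" like this could work is a small set of normalized URLs persisted to a JSON file. This is only a sketch under assumptions: the post does not say where the cache lives, and `NonNewsCache`, `normalize_url`, and `filter_unseen` are hypothetical names.

```python
import json
from pathlib import Path
from typing import List, Set
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and trailing slash."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), parts.query, ""))


class NonNewsCache:
    """Hypothetical cache of URLs already classified as noise."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        self.urls: Set[str] = set()
        if self.path.exists():
            self.urls = set(json.loads(self.path.read_text(encoding="utf-8")))

    def is_noise(self, url: str) -> bool:
        return normalize_url(url) in self.urls

    def mark_noise(self, url: str) -> None:
        self.urls.add(normalize_url(url))

    def save(self) -> None:
        self.path.write_text(json.dumps(sorted(self.urls)), encoding="utf-8")


def filter_unseen(urls: List[str], cache: NonNewsCache) -> List[str]:
    """Return only URLs that still need an LLM classification call."""
    return [u for u in urls if not cache.is_noise(u)]
```

Checking the cache before the classification step means a URL rejected in one cycle costs no tokens in later cycles.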

Every post starts with a disclaimer and cites the original sources. Currently, it's 100% automated and has grown to 50 organic followers.

I'd love to hear feedback on the "agentic" logic or how I can better handle potential classification hallucinations!

vivzkestrel

6 hours ago

- define "impactful"? How do you decide what is impactful and what is not, and where is the threshold for it?

siddkgn

4 hours ago

Currently, 'impactful' is defined by category filtering and the removal of deal/opinion content within each 2-hour cycle. Because it runs frequently, it prioritizes the top news at that moment based on confidence score. I recognize that 'impact' is subjective, and by the end of the day it will have covered most stories anyway. So I'd love to get your feedback on the stories it picked today: which felt useful to you?
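To make the confidence-score selection concrete, here is a rough sketch of picking the top stories per cycle. It assumes each filtered story arrives as a `(title, confidence)` pair. The `pick_top` name and the optional `min_conf` threshold are hypothetical, since no explicit cutoff is described above.

```python
import heapq
from typing import List, Tuple


def pick_top(scored: List[Tuple[str, float]], n: int = 5,
             min_conf: float = 0.0) -> List[str]:
    """Return up to n titles with the highest confidence scores.

    min_conf is an assumed, optional cutoff: stories scoring below it
    are dropped even if fewer than n remain.
    """
    eligible = [(title, conf) for title, conf in scored if conf >= min_conf]
    best = heapq.nlargest(n, eligible, key=lambda pair: pair[1])
    return [title for title, _ in best]
```

With an explicit `min_conf`, a slow news cycle publishes fewer than five stories instead of padding the list with low-confidence picks.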