Please don't do this. Ask HN isn't your blogging platform. Per the guidelines, it's for asking questions of the community.
Appreciate the feedback. To be transparent: I originally submitted this as a standard text post, but after it hit a spam filter, the HN moderators kindly restored it and moved it to /ask themselves to help with visibility.
I'm definitely here for the dialogue, specifically looking to compare notes on graph algorithms with other IaC engineers.
Author here. A few implementation notes:
1. We use NetworkX for the graph operations. Tarjan's SCC detection is O(V+E), so it scales well even for large accounts.
2. The trickiest part isn't the algorithm — it's mapping AWS API responses to graph edges. AWS APIs are... inconsistent. Some resources return IDs, some ARNs, some Names. Security Groups can reference themselves, reference by ID or by name, and have rules scattered across inline blocks and separate resources. Normalizing this soup into a clean adjacency matrix is where 80% of the engineering work lives.
3. For those wondering about the "Shell & Fill" naming: it's essentially forcing Terraform's create_before_destroy lifecycle behavior manually, by decoupling the resource identity from its configuration.
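To make notes 1 and 2 concrete, here is a minimal, stdlib-only sketch: normalize mixed ID/ARN/name references to one canonical key, then run Tarjan's algorithm over the result. The resource records, the reference list, and every function name are made up for illustration and are not taken from the actual tool; NetworkX exposes the same SCC step as `strongly_connected_components`.

```python
# Hypothetical inventory: every identifier a resource may be referenced by.
RESOURCES = [
    {"id": "sg-0a1", "name": "web",
     "arn": "arn:aws:ec2:us-east-1:111122223333:security-group/sg-0a1"},
    {"id": "sg-0b2", "name": "db",
     "arn": "arn:aws:ec2:us-east-1:111122223333:security-group/sg-0b2"},
    {"id": "sg-0c3", "name": "cache",
     "arn": "arn:aws:ec2:us-east-1:111122223333:security-group/sg-0c3"},
]

# Hypothetical raw references as (source, target), in mixed identifier styles.
RAW_REFS = [
    ("sg-0a1", "db"),                                   # by name
    ("sg-0b2", "arn:aws:ec2:us-east-1:111122223333:security-group/sg-0a1"),
    ("sg-0b2", "sg-0b2"),                               # self-reference
    ("sg-0c3", "sg-0a1"),                               # by ID
]


def build_graph(resources, raw_refs):
    """Map every alias (id, ARN, name) to the canonical id, then build edges."""
    alias = {r[key]: r["id"] for r in resources for key in ("id", "arn", "name")}
    graph = {r["id"]: set() for r in resources}
    for source, target in raw_refs:
        graph[alias[source]].add(alias[target])
    return graph


def tarjan_scc(graph):
    """Return the strongly connected components of graph as a list of sets."""
    index_of, lowlink = {}, {}
    stack, on_stack, components = [], set(), []
    counter = 0

    def visit(v):  # recursive for brevity; large graphs need an iterative form
        nonlocal counter
        index_of[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index_of:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        if lowlink[v] == index_of[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index_of:
            visit(v)
    return components


def cyclic_components(graph):
    """Keep only components that contain a cycle (multi-node or self-loop)."""
    return [c for c in tarjan_scc(graph)
            if len(c) > 1 or any(v in graph[v] for v in c)]


graph = build_graph(RESOURCES, RAW_REFS)
print(cyclic_components(graph))
```

In this toy data the "web" and "db" groups reference each other through different identifier styles, so the cycle only shows up after normalization.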
Would love to hear if others have hit similar graph problems with other IaC tools (Pulumi, CDK, CloudFormation).
Not IaC, but I've been using a similar trick to sequence adding type annotations to Python code.
E.g., take the module graph, break the SCCs in a similar manner, then take a reverse topological sort of the imports (now a DAG by construction).
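A rough sketch of that sequencing, using only the standard library. The import graph is invented, and the cycle-breaking rule here (drop every DFS back edge) is just one simple choice, not necessarily the one described above. `graphlib.TopologicalSorter` takes a mapping of node to its dependencies and yields dependencies first, which is the reverse topological order of the importer-to-imported edges.

```python
from graphlib import TopologicalSorter

# Hypothetical import graph: module -> modules it imports.
IMPORTS = {
    "app": {"models", "utils"},
    "models": {"services", "utils"},
    "services": {"models"},  # cycle: models <-> services
    "utils": set(),
}


def break_cycles(graph):
    """Return (dag, removed_edges) after dropping every DFS back edge."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS path, finished
    color = {node: WHITE for node in graph}
    dag = {node: set() for node in graph}
    removed = []

    def visit(node):
        color[node] = GREY
        for dep in sorted(graph[node]):  # sorted for deterministic output
            if color[dep] == GREY:
                # dep is an ancestor on the current path: this edge closes a cycle
                removed.append((node, dep))
                continue
            dag[node].add(dep)
            if color[dep] == WHITE:
                visit(dep)
        color[node] = BLACK

    for node in sorted(graph):
        if color[node] == WHITE:
            visit(node)
    return dag, removed


def annotation_order(graph):
    """Order modules so each one comes after everything it still imports."""
    dag, removed = break_cycles(graph)
    return list(TopologicalSorter(dag).static_order()), removed


order, removed = annotation_order(IMPORTS)
print("annotate in this order:", order)
print("edges dropped to break cycles:", removed)
```

Leaf modules come out first, so each module is annotated only after the modules it imports already have types; the dropped edges are the places that would need a forward reference or similar.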
That's a spot-on parallel! Python circular imports (especially for type hinting) are basically the software equivalent of this infrastructure deadlock.
Do you use string-based forward references ("ClassName") to break the cycles? That's essentially our "empty shell" trick — decoupling the resource identity from its configuration to satisfy the graph.
Did you stick with Tarjan's for the SCC detection on the module graph?
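For anyone unfamiliar with the forward-reference trick asked about above, a tiny illustration (the class names are invented): the string annotation lets `Node` name `Tree` before `Tree` exists, much like creating an empty shell first and filling in the details later.

```python
from typing import Optional


class Node:
    # "Tree" is a string forward reference: Tree is not defined yet at this
    # point, so the name is recorded as text instead of being looked up now.
    def __init__(self, owner: Optional["Tree"] = None) -> None:
        self.owner = owner


class Tree:
    # By the time Tree is defined, Node already exists, so the two classes
    # can refer to each other without a definition-order deadlock.
    def __init__(self) -> None:
        self.root = Node(owner=self)


tree = Tree()
print(tree.root.owner is tree)
```

Across modules, the same idea is usually paired with an `if TYPE_CHECKING:` import so the cyclic import exists only for the type checker.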