ethbr1
3 months ago
As someone who spent most of a career in process automation, I've decided doing it well is mostly about state limitation.
Exceptions or edge cases add additional states.
To fight the explosion of state count (and the intermediate states those generate), you have a couple powerful tools:
1. Identifying and routing out divergent items (aka ensuring items get more similar as they progress through automation)
2. Reunifying divergent paths, instead of building branches
Well-designed automation should look like a funnel, rather than a subway map.
If you want to go back and automate a class of work that's being routed out, write a new automation flow explicitly targeting it. Don't try to kludge it into some giant spaghetti monolith that can handle everything.
PS: This also has the side effect of simplifying and concluding discussions about "What should we do in this circumstance?" with other stakeholders. Which for more complex multi-type cases can be never-ending.
PPS: And for god's sake, never target automation of 100% of incoming workload. Ever. Iteratively approach it, but accept reaching it may be impossible.
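A minimal sketch of the funnel idea, for illustration only. Everything here is hypothetical (the `Funnel` class, the `RoutedOut` exception, the order fields): each stage either makes the item look more like every other item or routes it out for a human, and nothing branches into its own downstream path.

```python
from dataclasses import dataclass, field


class RoutedOut(Exception):
    """Raised by a stage when an item does not fit the automated path."""


def require_amount(order):
    # Divergent items are routed out early instead of handled by a special case.
    if "amount" not in order:
        raise RoutedOut("missing amount")
    return order


def normalize_to_cents(order):
    # Reunify instead of branching: every order leaves this stage in cents.
    if order.get("unit") == "dollars":
        return {**order, "amount": order["amount"] * 100, "unit": "cents"}
    if order.get("unit") == "cents":
        return order
    raise RoutedOut("unknown unit")


@dataclass
class Funnel:
    stages: list  # callables: item -> normalized item, or raise RoutedOut
    routed_out: list = field(default_factory=list)

    def run(self, items):
        done = []
        for original in items:
            item = original
            try:
                for stage in self.stages:
                    item = stage(item)
            except RoutedOut as reason:
                # Keep the untouched original so a human sees what actually arrived.
                self.routed_out.append((original, str(reason)))
            else:
                done.append(item)
        return done
```

Whatever lands in `routed_out` is the manual workload. If one reason starts to dominate it, that is the candidate for a separate, explicitly targeted flow.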
carlmr
3 months ago
I also like the way the Toyota production system puts it with Jidoka / automate with a human touch. [1]
1. Only automate the steps you know how to execute manually very well. This is a prerequisite. This kind of goes in the direction of not automating everything but looking at the parts you can automate well first.
2. Enable human intervention in the automation. Enable humans to stop the line when something is wrong. Then you figure out what's wrong and improve your automation. This implies the iterative approach.
[1] https://global.toyota/en/company/vision-and-philosophy/produ...
MichaelZuo
3 months ago
Or to put it more concisely, trust but verify the ‘manual process’, that includes any verbal or written explanations of the process.
Only 100% trust the automation after every possible step, procedure, and word has been 100% verified. (Which is to say almost never…)
bloopernova
3 months ago
> Well-designed automation should look like a funnel, rather than a subway map.
Do you have any examples of this? Like an example project?
> never target automation of 100% of incoming workload. Ever.
A new product owner came in last year and wanted to "automate everything". They wanted a web page where builds, branches, commits, and more were all on one page that showed exactly which commit was deployed where, by whom, etc etc. They wanted this extravaganza for a 2-person application that was in maintenance with no new features.
They're also the kind of person who consults chatgpt or copilot on a subject while you're explaining that subject to them, to check that the LLM agrees with what you are saying. They'll even challenge people to prove some copilot output is incorrect. It seems to me that they consider LLMs more reliable than people.
bee_rider
3 months ago
> They're also the kind of person who consults chatgpt or copilot on a subject while you're explaining that subject to them, to check that the LLM agrees with what you are saying. They'll even challenge people to prove some copilot output is incorrect. It seems to me that they consider LLMs more reliable than people.
Dear lord, these tools have just come out, how can they already have invented a new type of asshole?
whstl
3 months ago
We had a product manager that made requirements based mostly on ChatGPT.
It would output completely nonsensical stuff like QR-Code formats that don't exist, or asking to connect to hallucinated APIs.
It was often caught by lead devs quite quickly: the documentation wasn't a link or a PDF but rather some block of text.
But in the cases it wasn't, it was super costly: some developer would spend hours trying to make the API work to no avail, or, in the case of the QR code, it would reach QA which would be puzzled about how to test it.
So yes there is a new type of asshole.
johnm
3 months ago
No, those are the same that have been around forever. They just have a new tool to "justify" their crappy behavior.
mitjam
3 months ago
I experienced this, as well. It’s a whole new level of „I know enough to be dangerous“.
whstl
3 months ago
This is as fun as the business or product person that "knows how to code".
bee_rider
3 months ago
Hah, thanks to LLMs we’ve drastically reduced the barrier to entry, in terms of knowing enough to be dangerous. Hopefully there’s a corresponding reduction in the level of knowledge required to be useful…
SatvikBeri
3 months ago
> Do you have any examples of this? Like an example project?
Not the OP, but I worked on loans once. The application originally required tax returns to check income. Then we added the option to upload recent bank statements instead of tax returns. But there were a lot of places where the assumption of tax returns was hard-coded, so to save time the developers basically added an entirely separate code path for bank statements. Do this n times, and you have 2^n code paths.
When we eventually refactored and fixed it, we replaced the hardcoded references to tax returns with a concept of estimated income. Then we were able to reduce the branch to simply saying "if tax returns are present, estimated income = get_estimated_income_from_tax_returns(), otherwise estimated income = get_estimated_income_from_bank_statement()".
That's the core idea – collapse a branch as soon as possible.
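Roughly what that looks like in code. This is a sketch, not the real system: the two function names come from the description above, but the `Application` fields, the annualizing formula, and the 30% rule are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Application:
    # Hypothetical fields; an application carries one document type or the other.
    tax_return_income: Optional[float] = None
    monthly_deposits: Optional[List[float]] = None


def get_estimated_income_from_tax_returns(app):
    return app.tax_return_income


def get_estimated_income_from_bank_statement(app):
    # Assumed formula: annualize the average monthly deposit.
    return 12 * sum(app.monthly_deposits) / len(app.monthly_deposits)


def estimated_income(app):
    # The only place that knows which documents were uploaded.
    if app.tax_return_income is not None:
        return get_estimated_income_from_tax_returns(app)
    if app.monthly_deposits:
        return get_estimated_income_from_bank_statement(app)
    raise ValueError("no income documents provided")


def passes_income_check(app, annual_payment):
    # Downstream code only sees estimated income, never the document type.
    return annual_payment <= 0.3 * estimated_income(app)
```

Adding a third document type then means one more branch in `estimated_income`, not another copy of everything downstream.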
chrsig
3 months ago
> A new product owner came in last year and wanted to "automate everything". They wanted a web page where builds, branches, commits, and more were all on one page that showed exactly which commit was deployed where, by whom, etc etc. They wanted this extravaganza for a 2-person application that was in maintenance with no new features.
you know, this seems very reasonable until that last sentence.
pc86
3 months ago
I would argue that if you are non-technical enough that you need this type of data on a web page for you, you probably don't actually need this data and I'd be wary that you're just looking for a way to find out who broke something so you can be shitty to them.
If you really want to know this type of commit-level data you can get it from git pretty easily, even if you're not particularly good with git but can search half-decently. If you don't have the skills to use git, it's extremely unlikely that knowing what the current branch and commit status of the repository is will meaningfully help you do your job.
Aeolun
3 months ago
I want this information and I can easily pull it out of git. I still want the webpage too because I don’t want to take 15 manual steps and open twenty different Github pages every time I want to find out.
ykonstant
3 months ago
Why a web page and not directly the git log? If style is necessary, reformat the log data with some fancy ASCII art?
teqsun
3 months ago
I'm assuming the PO isn't technical so git-log, git-blame, etc. are over their head.
Which itself raises the question of why they'd need this level of detail on the codebase.
marcosdumay
3 months ago
It's hard to put your current ops configuration inside the git log. If you found some way to do that that fits well in the philosophy of a stream of immutable changes, I'm interested in reading your ideas.
andiveloper
3 months ago
We are using git tags on the commit to figure out what is currently deployed where, e.g. "dev-stage", "test-stage" etc.
chrsig
3 months ago
the "which ones are deployed where" bit is nice. if you're managing a lot of repos and deployments, yeah, that kind of thing can get really messy.
i don't care how it's presented, via webpage or cli tool or whatever -- just saying that when you are working at larger scale, those are very reasonable things to want to see in one spot at a glance.
the need dissipates as you scale down.
stackskipton
3 months ago
Sure, but that's best handled by the application reporting it via a monitoring system. For example, at my company, we embed the git commit, version, and the branch that was last merged to main in container environment variables. Prometheus then exposes that as labels so we can just look any time it comes up. If we wanted to build a Grafana dashboard, that could be done easily as well.
I'm sure most monitoring systems have some way of loading that into their system.
chrsig
3 months ago
Sure, and the commenter's PO didn't specify how to get him a webpage. All very reasonable, see?
Smeevy
3 months ago
Oh my goodness, I wouldn't last a day with someone who did that. That sort of casual disrespect while you're talking to someone is wholly unacceptable behavior.
There's only one person you work with that's like that, right? Right?
bryanrasmussen
3 months ago
this reminds me of the last manager I had who used to be a developer, I think he was probably a pretty good developer - but not as good as he thought he was, because he thought he was good enough that he could tell you how things worked and why you were wrong without any knowledge of the code base or experience actually using the various 3rd party services that were being integrated.
I tried to develop the skill of nodding and then doing it correctly later, but it would always come up, because a ticket I had written would get assigned to another dev and I had to explain to them why it was the way I had specified. Then he would correct me, and I would say yes, part of what you say is correct and needs to be considered (as I said, I think he was a good developer at one time), but not all of it. He would insist he was correct, and I had to go talk it over later with the dev as to why it worked the way I specified.
bornfreddy
3 months ago
That sounds awful. The manager is clearly not doing their job well, so avoiding their opinion and covering their mistakes is imho counterproductive. Instead, I would let my reservations be known and proceed exactly as they suggested. If it works, great, I have learned something. If not, let's scrap this and do it right, this time with manager's knowledge.
But in the end, if you can't work with the manager, it's time to head to greener pastures.
_rm
3 months ago
The problem with this head nod approach is it won't lead to reining such types in.
Only their boss can rein them in, and so you have to use techniques to shine a light on them to their superiors.
Think "I want you to record your order" from HBO's Chernobyl, but more surreptitious.
supriyo-biswas
3 months ago
One other way of reining in such a manager is to bring someone into the meeting who the manager trusts and who you too have good rapport with, and have them say the same points that you would have made otherwise.
Effectively a form of trust and reputation arbitrage, but it was effective for dealing with a particularly difficult manager who didn’t accept certain things about the design of an API, and yet when the other guy told him the same things, he just asked a few mild follow ups and accepted what I was telling him all along.
_rm
3 months ago
Yeah absolutely, I've used this for improvement suggestions too. Decide what you want to do, and then find the biggest-name source who's said basically the same thing, and then quote them, pretending you're just relaying their message.
bloopernova
3 months ago
So far they are the only one.
They're definitely "leadership material"!
shakna
3 months ago
> They wanted a web page where builds, branches, commits, and more were all on one page that showed exactly which commit was deployed where, by whom, etc etc.
Sounds like they would have been a fan of ungit [0], which I have seen used for that kind of flow overview, though it has looked more impressive than actually proved helpful in my experience.
Arch485
3 months ago
> Do you have any examples of this?
Not GP, but at my old job making games we had a truly horrendous asset import pipeline. A big chunk of it was a human going through a menial 30 minute process that was identical for each set of assets. I took on the task of automating it.
I made a CLI application that took a folder path, and then did the asset import for you. It was structured into "layers" basically, where the first layer would make sure all the files were present and correct any file names, the next layer would ensure that all of the textures were the right size, etc. etc.
This funneled the state of "any possible file tree" to "100% verified valid set of game assets", hence the funnel approach.
It didn't accept 100% of """valid""" inputs, but adding cases to handle ones I'd missed was pretty easy because the control flow was very straightforward. (lots of quotes on "valid" because what I thought should be acceptable v.s. what the people making the assets thought should be acceptable were very different)
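A sketch of what such a layered importer could look like. The real tool worked on a folder path; this version works on an in-memory dict of file name to metadata so it stays self-contained, and the layer names, required files, and texture size are all invented for illustration.

```python
REQUIRED_FILES = {"model.fbx", "diffuse_map.png"}  # hypothetical asset set
TEXTURE_SIZE = (1024, 1024)                        # hypothetical requirement


class RejectedAssets(Exception):
    """Raised when a layer cannot bring the input into a valid state."""


def normalize_names(files):
    # Fix what can be fixed mechanically: lowercase, spaces to underscores.
    return {name.lower().replace(" ", "_"): meta for name, meta in files.items()}


def require_files(files):
    missing = REQUIRED_FILES - files.keys()
    if missing:
        raise RejectedAssets(f"missing files: {sorted(missing)}")
    return files


def check_texture_sizes(files):
    for name, meta in files.items():
        if name.endswith(".png") and meta.get("size") != TEXTURE_SIZE:
            raise RejectedAssets(f"{name}: expected size {TEXTURE_SIZE}")
    return files


# Each layer narrows "any possible file tree" a little further.
LAYERS = [normalize_names, require_files, check_texture_sizes]


def import_assets(files):
    for layer in LAYERS:
        files = layer(files)
    return files
```

Handling a case you missed means adding or adjusting one layer; the control flow stays a straight line.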
initplus
3 months ago
One example is that instead of adding support for edge case mutations/changes late in a process, it's sometimes better to force those records to be thrown away and reset with a new record from the start of the process. You avoid chasing down flow on effects of late unexpected changes in different parts of the application.
To give a contrived/trivial example, imagine a TLS handshake. Rather than building support to allow hosts to retry with a different cert, it's better to fail the connection and let the client start from scratch. Same principle can be applied to more complex process automation tasks in business. Imagine a leave tracking system. It might be better to not support changing dates of an existing leave application, and instead supporting cancel & re-apply. Best part is that the user facing part of both versions can be exactly the same.
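The leave tracking case might be sketched like this. The `LeaveTracker` class and its fields are hypothetical; the point is only that the user-facing "change dates" action is a cancel plus a fresh application, so a request's dates never mutate after creation.

```python
import itertools
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class LeaveRequest:
    id: int
    employee: str
    start: date
    end: date
    status: str = "pending"


class LeaveTracker:
    def __init__(self):
        self._ids = itertools.count(1)
        self.requests = {}

    def apply(self, employee, start, end):
        if end < start:
            raise ValueError("end date is before start date")
        request = LeaveRequest(next(self._ids), employee, start, end)
        self.requests[request.id] = request
        return request

    def cancel(self, request_id):
        old = self.requests[request_id]
        self.requests[request_id] = replace(old, status="cancelled")

    def change_dates(self, request_id, start, end):
        # Looks like an edit to the user; internally it is re-apply, then cancel.
        # Applying first means a rejected change leaves the old request intact.
        old = self.requests[request_id]
        new = self.apply(old.employee, start, end)
        self.cancel(request_id)
        return new
```

Every request that reaches approval has gone through `apply`, so validation and anything downstream never has to consider a request whose dates changed midway.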
jimkoen
3 months ago
> wanted to "automate everything".
With all due respect, this is preached by pretty much every book you read on cloud administration. I'd argue that if the process is decent enough, it'll work with major cloud providers, because their API's are rich enough to enable this already.
The thing with most automation tools though is, a) they're abysmal for most of the workflows preached (thinking of ansible and im shuddering) and b) to reach the degree of automation described in most literature, you need the API's of $MAJOR_CLOUD_PROVIDER.
internet101010
3 months ago
An example of this would be filtering on different attributes of data. You have a complex decision tree interface but ultimately the output is constructed into a single way that can be passed along to a single class/function.
airbreather
3 months ago
They are also the sort of person that thinks the problem can be defined by thinking about and describing wanted behaviour alone.
Aeolun
3 months ago
My boss does this once in a while. While I understand the impulse, it always makes me feel a bit redundant when they’ll ask me to explain why what ChatGPT spits out won’t work for our situation.
Isn’t that the whole point of hiring experts?! So that you can ask them, instead of the computer for advice?
perrygeo
3 months ago
> never target automation of 100% of incoming workload... Iteratively approach it.
This is critical. So many people jump from "it works" to "let's automate it entirely" without understanding all the nuances, all the various conditions and intermediate states that exist in the real world. The result is a brittle system that is unreliable and requires constant manual intervention.
A better approach is to semi-automate things first. Write scripts with manual QA checkpoints and playbooks that get executed directly by devs. Do the thing yourself until it's so boring it hurts. Then automate it.
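One way to sketch a script with manual QA checkpoints. The `run_playbook` helper and its step format are hypothetical: each step runs automatically, but a human has to approve the result before the next one starts.

```python
def run_playbook(steps, confirm=input):
    """Run (description, action) steps in order, pausing for a human after each.

    `confirm` takes a prompt string and returns the operator's answer; it
    defaults to the built-in input() so the script can be run from a terminal.
    Returns the descriptions of the steps the operator approved.
    """
    approved = []
    for description, action in steps:
        result = action()
        answer = confirm(f"{description} -> {result!r}. Continue? [y/N] ")
        if answer.strip().lower() != "y":
            break  # stop the line; later steps never run
        approved.append(description)
    return approved
```

Once the operator has typed "y" at a checkpoint enough times without ever having to think, that checkpoint can be replaced by an automated check.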
screye
3 months ago
This is genius. I skimmed it the first time, and it took me a good 30 minutes to appreciate the wide applicability of your insight.
A whole family of problems across domains : supply chains, manufacturing lines, AI pipelines, resilient software, physical security, etc. come down to effective state limitation.
In my day job, the vast bulk of my arguments with PMs come down to a lack of planning allocation for post-launch code cleanup. I haven't been able to find a succinct articulation for the utility of 'code cleanup' to a wider audience. 'State-limitation' fits nicely. Exactly the concept I was looking for.
It draws from the better known 'less-is-more' adage, but adds a lot of implicit detail to the generic (almost cliche) adage.
_rm
3 months ago
To be honest, for most low or moderately performing organisations, the best technique is just to not talk about it and just do it.
So long as it's done silently, blended in with other things, and cloaked under clever wording (e.g. "this blocks that other thing you want" rather than "this will improve the codebase"), things will go quite well.
As soon as you speak to them as you would another engineer, you provide them material to use against you in prevention of you taking proper action.
screye
3 months ago
That only works if the whole team coordinates.
If one person writes broken code in half the time, while you take twice as much cleaning the mess.....then you're going to be perceived as ineffective.
achillesheels
3 months ago
“A whole family of problems across domains…come down to effective state limitation.”
A fancy way of saying, “simplicity is the mark of truth.”
Or
“Less mechanical points of failure the better.”
I concur. Eliminate design risk.
HatchedLake721
3 months ago
With your experience, anything you'd recommend to read in the process automation space?
(I'm a founder of an automation SaaS where we've made "human interface" one of the core features of the product)
trod123
3 months ago
I'm not aware of any books that have covered this appropriately.
The biggest part of automation in my experience is boiling down the inputs to a 'unique' state that automation can then use as inputs and be run on.
For computation to do work, it requires a property of consistency, and computers can only operate accurately and do work when properties of determinism are met. Also as the OP mentions, state can explode leading to spaghetti, this is why he mentions a sieve-like approach based on similarity.
Some problem spaces can be fundamentally inconsistent, such as with some approximations (common methods used for such), which falls back to what amounts to guesses, heuristics, and checks in terms of exception handling. There are problem scopes that cannot be characterized too so no amount of exception handling will resolve the entire scope, which is why you need fallbacks in a resilient design.
If inputs cannot be controlled and uniquely differentiated, the automation fails in brittle ways, especially with regards to external change.
The main interface (with regards to your core features) would be language, or communication. There exist words right now that can have contradictory and different meanings, where the same word may mean the opposite depending on context, and this is not a general consensus but an individual one (where the individual may be misusing it).
That breaks the 1:1 mapping required for determinism, and AI weights mimicking neurons have a narrow approximation where it may work under a narrow set of circumstances but no computer today can differentiate when the inputs are the same but have two or more, different states mixed in (many people forget that absence of a state is a state too) and then decompose them. Abstract decomposition seems to be something only humans are good at, and I'm glad this is the case otherwise none of us would have jobs.
seanlinehan
3 months ago
Spot on.
I was an early eng and first VP of Product at Flexport. Global logistics is inherently complicated and involves coordinating many disparate parties. To complete any step in the workflow, you're generally taking in input data from a bunch of different companies, each of which have varying formats and quality of data. A very challenging context if your goal is process automation.
The only way to make progress was exactly the way you described. At each step of the workflow, you need to design at least 2 potential resolution pathways:
1. Automated
2. Manual
For the manual case, you have to actually build the interfaces for an operator to do the manual work and encode the results of their work as either:
1. Input into the automated step
2. Or, in the same format as the output of the automated case
In either case, this is precisely aligned with your "reunifying divergent paths" framing.
In the automated case, you actually may wind up with N different automation pathways for each workflow step. For an example at Flexport: if we needed to ingest some information from an ocean carrier, we often had to build custom processors for each of the big carriers. And if the volume with a given trading partner didn't justify that investment, then it went to the manual case.
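A rough sketch of that shape, with invented names throughout (the `Arrival` record, the carrier key, and the pipe-delimited format are not Flexport's). Per-carrier processors and the manual operator interface both emit the same record type, so the next workflow step cannot tell which path produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrival:
    container: str
    port: str
    resolved_by: str  # "automated" or "manual", kept only for auditing


def parse_carrier_a(raw):
    # Hypothetical format for one large carrier: "CONTAINER|PORT".
    container, port = raw.split("|")
    return Arrival(container, port, "automated")


# One custom processor per carrier whose volume justifies building it.
PROCESSORS = {"carrier_a": parse_carrier_a}


def resolve(carrier, raw, manual_queue):
    processor = PROCESSORS.get(carrier)
    if processor is None:
        manual_queue.append((carrier, raw))  # an operator will handle it
        return None
    return processor(raw)


def record_manual_result(container, port):
    # What the operator interface saves: same output format as the automated case.
    return Arrival(container, port, "manual")
```

Promoting a carrier from manual to automated is then one new entry in `PROCESSORS`, with no change to anything downstream.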
From the software engineering framing, it's not that different from building a micro-services architecture. You encapsulate complexity and expose standard inputs and outputs. This avoids creating an incomprehensible mess and also allows the work to be subdivided for individual teams to solve.
All that said – doing this in practice at a scaling organization is tough. The micro-services framing is hard to explain to people who haven't internalized the message.
But yeah, 100% automation is a wild-goose chase. Maybe you eventually get it, maybe not. But you have to start with the assumption that you won't or you never will.
initplus
3 months ago
Sounds like a really interesting problem space. I'm curious if you have any comments about how you approached dealing with inconsistencies between information sources? System A says X, system B says Y. I suppose best approach is again just to bail out to manual resolution?
seanlinehan
3 months ago
In the early days, we bailed out to manual resolution. In the later days, we had enough disparate data sources that we built oracles to choose which of the conflicting data was most likely to be correct.
For example, we integrated with a data source that used OCR to scan container numbers as they passed through various waypoints while they were on trains. The tech wasn't perfect. We frequently got reports from the rail data source that a train was, for example, passing through the middle of the country when we knew with 100% certainty that it was currently in the middle of the Pacific Ocean on a boat. That spurious data could be safely thrown out on logical grounds. Other cases were not as straightforward!
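A toy version of the easy case, to make the idea concrete. The `Sighting` record, the source names, and the trust scores are all assumptions for illustration; a real oracle would weigh far more signals than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    source: str
    container: str
    location: str
    mode: str  # "rail" or "ocean"


def resolve_location(sightings, known_mode, trust):
    """Pick the most trusted sighting that is consistent with what we know.

    `known_mode` is the transport leg the container is certainly on;
    `trust` maps source name to a hypothetical reliability score.
    Returns None when nothing plausible remains, i.e. bail out to manual.
    """
    plausible = [s for s in sightings if s.mode == known_mode]
    if not plausible:
        return None
    return max(plausible, key=lambda s: trust.get(s.source, 0.0))
```

Note that the rail report is dropped even if its source is normally trusted more, because the hard constraint is applied before any scoring.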
davedx
3 months ago
This has a corollary in software and refactoring: don’t refactor duplicated code too early; keep it separate until it’s very clear the refactoring is correct
chrsig
3 months ago
I wouldn't consider my career being in process automation, but I feel like you just described my approach to managing new product development on top of existing machinery while ensuring no breakage of what exists.
InDubioProRubio
3 months ago
airbreather
3 months ago
Implicit state is the enemy.