I don't mind tabs or spaces, and I don't even know which I use in a given project. I use the defaults from rustfmt in Rust, eslint or biome in TypeScript/JavaScript, or ruff in Python. I also have some .editorconfig files so my editor is automatically configured correctly.
Importantly, I do tell the agents to always run the linters and use the auto-formatters if needed. They get it. Don't make them format code by hand; it's slow. The auto-formatters are better and they should use them.
I also don't like any or unknown and I have some eslint rules against those, and the agents will notice and fix their code automatically, without being asked. I guess you can also find some linting rules against ternaries, or against variable names that are too short. I like eslint-plugin-unicorn, for example, and I think it has rules for that by default.
It's not very different from working with other people.
I can't address all your points, but if you add your linter as a pre-commit hook, the AI won't be able to commit, let alone open a PR, with code that doesn't pass your linter. That could catch the tabs-versus-spaces issue.
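To make that concrete, here is a minimal Python sketch of the kind of check such a hook could run for the tabs-versus-spaces case. The function name `space_indented_lines` is hypothetical, and in practice an existing linter or formatter rule is probably a better fit than a hand-rolled script.

```python
def space_indented_lines(source: str) -> list[int]:
    """Return the 1-based numbers of lines whose indentation contains a space.

    Assumes a tabs-only project. Lines that use a tab followed by spaces
    for alignment are flagged too.
    """
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indent is everything before the first non-whitespace character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        # Skip whitespace-only lines; they carry no indentation decision.
        if " " in indent and line.strip():
            flagged.append(number)
    return flagged
```

A hook could call this on each staged file and exit with a non-zero status if any line is flagged, which is what makes git abort the commit.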
> I use tabs instead of spaces for indentation. It seems like the model is massively weighted on code written using spaces (duh)
LLMs by nature are not very good at peeing against the wind. Also, on average they are only as good as the average codebase they have been trained on. By design.
For me, it's tabs-vs-spaces, but doesn't every codebase have its own peeing-against-the-wind patterns that are necessary because of some historical reason or another? What's the way to mitigate against this trend towards the center other than throwing up my hands and admitting defeat?
Absolutely not. Every codebase has some nuances, but the majority follows very similar rules and patterns. There is nothing to mitigate; just don’t expect that something trained on A would suddenly be good at B.
A pre-commit hook that runs a linter and type-checking is absolutely vital for maintaining the code formatting of AI-generated commits.
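For anyone who hasn't set one up, a rough sketch of such a hook in Python follows. The entries in `CHECKS` are assumptions (ruff for linting and formatting, mypy for type-checking); swap in whatever your project actually uses. The helper names `run_checks` and `main` are made up for illustration.

```python
import subprocess
import sys

# Assumed tooling: replace with the project's own linter and type checker.
CHECKS = [
    ["ruff", "check", "."],
    ["ruff", "format", "--check", "."],
    ["mypy", "."],
]


def run_checks(checks):
    """Run every command and return the list of commands that failed."""
    failed = []
    for command in checks:
        try:
            ok = subprocess.run(command).returncode == 0
        except FileNotFoundError:
            # A tool that is not installed counts as a failed check.
            ok = False
        if not ok:
            failed.append(command)
    return failed


def main(checks=CHECKS) -> int:
    """Return 0 if all checks pass, 1 otherwise (git aborts on non-zero)."""
    failed = run_checks(checks)
    for command in failed:
        print("check failed:", " ".join(command), file=sys.stderr)
    return 1 if failed else 0

# In the hook file itself, finish with: sys.exit(main())
```

Saved as an executable `.git/hooks/pre-commit` with that final `sys.exit(main())` line, this blocks any commit where a check fails. Note that `git commit --no-verify` skips the hook, so running the same checks in CI is a sensible backstop.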
One observation I’ve made working with LLMs is that sometimes it’s worth being flexible and conforming to the LLM’s code style and patterns.
Frankly the code doesn’t need to be elegant or follow arbitrary guidelines (in reality nobody cares if it uses spaces or tabs, what matters is the result).
In the past (pre-LLM) I used to nitpick people in code reviews, calling out a bunch of stylistic preferences that I believed would keep the codebase “consistent” and “elegant”. The idea was that if the codebase is uniform it’s easier for other engineers to iterate on it or debug.
Today I don’t care in the slightest.
I’m not the one writing the code, nor am I the one actively debugging it; that has been delegated to the AI.
Furthermore very seldom am I actually reading any of the generated code unless it’s mission critical. I treat the code it generates as a black box until I can’t, and nowhere throughout that process do I worry about aesthetics.
Try to put aside all code vanity, and accept that not all code will be aesthetically pleasing or elegantly written. Focus on delivering the end goal, not the syntactical minutiae.