Git-absorb: Git commit –fixup, but automatic

410 pointsposted a day ago
by striking

239 Comments

burntsushi

20 hours ago

The negativity in the comments here is unwarranted in my opinion. I've been using `git absorb` for years and it works amazingly well. I use it in addition to manual fixups. My most common uses of git-absorb, but definitely not the only, are when I submit a PR with multiple commits and it fails CI for whatever reason. If fixing CI requires changes across multiple commits (say, lint violations), then git-absorb will almost always find the right commit for each change automatically. It saves the tedium of finding the right commit for each change. False positives are virtually non-existent in my experience. False negatives do happen, but then you just fall back to the manual approach.

It seems like some would reply and say PRs should just be one commit. Or that they will be squashed anyway. And sure, that is sometimes the case. But not always. I tend to prefer logically small commits when possible, and it's not always practical to break them up across multiple PRs. Perhaps partially due to how GitHub works.

I use this workflow on all of my projects on GitHub.

menaerus

9 hours ago

What is wrong with simply pushing a "Fix linting issues" in a new commit? It's self-contained and very well describes the (single) purpose of the commit.

I share the sentiment about the "logical small commits", and hence I don't see that adding a new fix commit is problematic as long as it is self-contained and purposeful, but perhaps I don't understand what is the problem that this tool is trying to solve.

It says

> You have fixes for the bugs, but you don't want to shove them all into an opaque commit that says fixes

So my understanding is that bugs were found in the PR and now you want to fix those bugs by not introducing separate "Fix A", "Fix B" and "Fix C" commits but you want to rewrite the history of existing N commits into N' commits so that it blends those A, B and C fixes in.

Maybe I can see this being somewhat useful only ever if those N commits are pushed either directly as series as patches to the main branch or as a single merge commit, and you want to keep the S/N ratio reasonable.

But otherwise I think it's a little bit problematic since it makes it harder for the reviewer to inspect and understand what changes have been done. But it also makes it harder for a developer since not all fixes are context-free, e.g. a fix for an issue found in the PR cannot always be attributed to the self-contained and single commit in your branch but it's the composition of multiple commits that actually makes this bug appear.

cryptonector

7 hours ago

The issue isn't `Fix linting issues`, the issue is a `Fix linting issues` commit for issues you introduced in the code you'll be pushing with that `Fix linting issues` commit. Nobody wants to see you pre-push dev history -- it's not at all interesting. So rebase and squash/fixup such commits, split/merge commits -- do what you have to to make the history you do push to the upstream useful.

Now, if you find linting issues in the upstream and fix those, then a `Fix linting issues` commit is perfectly fine.

homebrewer

7 hours ago

It makes git bisect more difficult than it needs to be.

menaerus

5 hours ago

merge-commits or patch-series are already making it vastly more difficult for git-bisect than linear history with single self-contained commits. I used both and git-bisect on merge-commits is a nightmare.

WorldMaker

4 hours ago

An option is `git bisect --first-parent` to start from your integration points. (Then drill down into that branch if you need to.)

codetrotter

6 hours ago

Yeah. But for that we can squash everything into a single commit on merge. Instead of spending a bunch on time making every single commit in an MR perfect both in isolation and all-together. And if squashing everything in the MR causes a problem with the git history being too coarse, then that is almost certainly because the MR itself should have been split up into multiple MRs. Not because of how you organized the individual commits that belonged to one MR.

burntsushi

5 hours ago

The goal isn't perfection. Splitting PRs has overhead, especially depending on what code hosting platform you're using.

Do you advocate for making code easier to read and understand by humans? What would you do if someone told you, "no I don't want to waste my time trying to make it perfect." It's the same misunderstanding. I try to treat code history like I treat code: I do my best to make it comprehensible to humans. And yes, sometimes multiple PRs is better. But sometimes multiple commits in one PR are better. I use both strategies.

codetrotter

5 hours ago

I make multiple commits in the PR and squash it all on merge.

If the resulting squash is too big / touches too many things, it’s because I didn’t split the MR where I probably should have.

Because to me, the bigger waste of time is fiddling with all of the commits in the MR just to make it so that the MR can be merged without squashing.

If someone needs fine-grained history about how exactly a specific feature or bug fix came to be, they can always go to the MR itself and look at what happened there.

burntsushi

4 hours ago

> I make multiple commits in the PR and squash it all on merge.

Sometimes I do that. But sometimes I want the history I've curated in my PR to be preserved without the overhead of creating separate PRs (which then often need to be stacked, introducing problems of its own). In which case, I avoid things like "fixup lint" commits. And that's where git-absorb shines. And saving time is exactly the point! git-absorb helps you curate history faster.

> If someone needs fine-grained history about how exactly a specific feature or bug fix came to be, they can always go to the MR itself and look at what happened there.

That's a lot more work than just reading the commit log. But I agree, one can do this ifneedbe. But it's nice to avoid when possible.

codetrotter

4 hours ago

> That's a lot more work than just reading the commit log.

Yeah, that’s true. I suppose that we may be working in somewhat different environments.

For example, when someone has big public projects that get a lot of eyeballs on them, like your ripgrep and other projects, it makes a lot of sense to spend extra time making the git log a thing that can be read on its own completely offline.

For the things I work on at my job, there are just a few of us writing the code itself and those coworkers that work on the repos I do will usually check every MR from everyone else. And anyone else in the company working on other repos is usually likely to browse code via the GitLab anyway, if they are interested in actually looking at any of the code that are in the repos of my team. Onboarding new coworkers on the team is also mostly centered around the docs we have, and talking with other people in the team and the company, rather than digging through the git history.

And for me when I am looking to understand some code from our internal repos it’s usually that way too, that I use the GitLab UI a lot, and I look at the MRs that were merged, and I check the associated Jira ticket, and I might ask some coworkers on Slack to explain something.

Most of the time, no one outside of our team is very interested in the code of our repos at all. Discussions revolve around other aspects like the APIs we expose, and the schema of the data we store in tables in the databases, and the Kafka events we produce. That sort of thing.

And everyone at my job will necessarily need to be online when they are working anyways, so that they can use Slack etc. So we are one click away from our GitLab instance when we work. On the off-chance that someone is trying to do some work offline while being on the go and without Internet access available, they probably have some specific kind of work in mind that is sufficiently independent of specific details that they can do without that history.

burntsushi

4 hours ago

Yeah folks work differently. I'm just mostly responding to this idea that (roughly paraphrasing) "hey we don't need this tool, you should just be squash merging instead."

FWIW, at my previous role, where I worked on an internal proprietary codebase, we all followed this same philosophy of curating commits. It was primarily to facilitate code review though, and not source history. Stacking PRs on GitHub is really truly annoying in my experience. (Although there is some tooling out there that purports to improve the experience.)

baq

9 hours ago

> What is wrong with simply pushing a "Fix linting issues" in a new commit?

if you want every individual commit be buildable then it is a no-go. it's also a no-go if you don't squash your prs.

menaerus

5 hours ago

Red main branch is not what I am talking about at all. I am referring to the code being developed in a feature/bug-fix branch that is yet to be merged with main branch. OP believes that even in the development branch you should not fix your WIP code by adding "Fix linting issues". I thought that was implied by the nature of discussion since rewriting history of the code that resides already on the main branch would be beyond my understanding.

recursive

7 hours ago

What do the semantic commit purists do when a rebase causes some arbitrary commit to go from red to green? I've always wondered about that. Commit A becomes A' in a rebase. A is good, but A' is not. A' might be 20 commits ago in a feature branch.

burntsushi

7 hours ago

It's hard to follow your example. You say "go from red to green" which I read as "go from failing to passing," but then you go on to say "A becomes A'" where "A is good, but A' is not." Either way, the answer is that if that commit is in your patch series that hasn't been merged yet, that it might make sense to just rewrite A'. But it might not. It could be a ton of work. But also, 20 commits in one branch is rather large, and maybe that's probably wrong too.

I suppose a purist might say you should go clean up the history so that you have no failing commits, but I don't know anyone who is a true purist. You don't have to be.

Instead of living in a word of black-or-white, join me in embracing the grey. I don't treat commit history as something that must be perfect. I just try to treat history like I treat the source code itself: I write it for other humans. Is the history always perfectly comprehensible? Nope. Just like the source code.

recursive

7 hours ago

Yeah, that's what I meant.

I guess at the end it's all weighted tradeoffs for me too. I just put less weight on legibility and more on the ability to work on branches co-operatively without force-pushing.

burntsushi

6 hours ago

If you're working on a branch cooperatively, then yes, don't rewrite history. But maybe you do rewrite history before merging to main to clean it up, depending.

And this is also why workflow questions are hard. Because "working on branches co-operatively" was previously unstated. It's not universal. I rarely work on branches co-operatively in a long term sort of way. Sometimes I might "take over" a branch to get it merged or similar, but rewriting history in that context is still fine.

It is always the case, even among so-called purists, that you don't rewrite history that you're sharing with others. (Unless you've both agreed to it. But even then, it better be short-lived because it's annoying.)

nothrabannosir

5 hours ago

“Fix lint” commits also taint git blame.

You could perhaps add some kind of filter to git blame to automatically skip commits whose message is “fix lint” but at some point the cure is worse than the disease.

I also see people argue that merge commits make git bisect harder than squashing in the first place but there is a third option: rebase and fast forward. If every commit in your branch is clean and stand-alone that’s viable. Linter fix commits break that paradigm.

paulddraper

29 minutes ago

> What is wrong with simply pushing a "Fix linting issues" in a new commit?

Everything.

1. git blame is obfuscated. "Fix lint" is not helpful or relevant. Tell me what actually changed.

2. git log is noisy. More stuff is harder to read than less stuff, and you're making me read more stuff.

3. git bisect is difficult. Interspersed broken commits requires more time to sift through. (Lint is a poor example, say it's "fix server config")

4. git cherry-pick is tedious. When you copy the dev change to a release branch (let's say you maintain 1.x, 2.x, etc), you must include errata.

beart

9 hours ago

In some teams, you are not allowed to submit any commit that breaks the build, and a lint failure would be considered a broken build.

codetrotter

5 hours ago

Do these teams run the pipeline on all the commits when you push multiple commits at the same time to your branch?

Say I have an open MR from a branch that I’ve pushed three commits to so far. I pushed these three commits individually and the pipeline ran green each time.

A coworker of mine points out some errors to me.

I have to touch files that I previously touched in the past three commits again.

I am tempted to commit these changes in one commit. But I decide to try git absorb instead.

So instead of adding one fourth, green commit to my MR, my use of git absorb rewrites all three of my previous commits.

But actually, the changes I was about to put in the fourth commit only work when taken together.

Splitting them up and rewriting the previous three commits will result in build failure if you try to build from the new first commit or the new second commit.

I don’t notice that because I’m on the new third commit.

I force push the three new commits to my branch for my MR. Gitlab runs the pipeline on the last of the commits. Everything looks fine to me.

Your team lead approves the MR and pushes the merge button.

Three months later he scolds me for a broken bisect.

barbazoo

8 hours ago

You're right, most places I've worked this applies only to a subset of branches, usually main/master and sometimes other branches considered "stable" such as for the staging env.

menaerus

5 hours ago

We are talking about the commit, or series of commits thereof, during the development of code in feature/bug-fix branch and not about the commit that you push as a post-fix because one of your previous commits broke something. That was not the discussion as far as my understanding goes.

burntsushi

8 hours ago

> What is wrong with simply pushing a "Fix linting issues" in a new commit? It's self-contained and very well describes the (single) purpose of the commit.

Because I care a lot more about the logical history of the source code versus the actual history of the source code. Commits like "fix linting issues" are more like artifacts of the development process. The commit exists likely because you forgot to run the linter earlier, and only found the issue after creating the commit. Logically speaking, those fixes belong with the relevant change.

Now, if you're just going to squash the PR you're working on, then sure. The extra commit totally doesn't matter. Make as many of those commits as you want.

But if your PR is 5 commits that you want to "rebase & merge" as-is (which is something I do pretty frequently), and you care about the history being easy to navigate, then it can totally make sense to do fixups before merging instead of creating new "fix linting issues" commits. If the follow the latter, then your commit history winds up being littered with those sorts of things. And fewer commits will likely pass tests (because you created new commits to fix tests).

I want to be VERY CLEAR, that I do not agree with your use of the word "wrong." I do not want to call your workflow wrong. Communicating the nuances and subtleties of workflows over text here is very difficult. For example, one take-away from my comment here is that you might think I always work this way. But I don't! I use "squash & merge" plenty myself, in which case, git-absorb is less useful. And sometimes breaking a change down into small atomic commits is actually a ton of work, and in that case, I might judge it to not be worth it. And still yet other times, I might stack PRs (although that's rare because I find it very annoying). It always depends.

> But otherwise I think it's a little bit problematic since it makes it harder for the reviewer to inspect and understand what changes have been done. But it also makes it harder for a developer since not all fixes are context-free, e.g. a fix for an issue found in the PR cannot always be attributed to the self-contained and single commit in your branch but it's the composition of multiple commits that actually makes this bug appear.

I'm having a hard time understanding your concern about the reviewer. But one aspect of breaking things down into commits is absolutely to make it easier for the reviewer. Most of my PRs should be read as a sequence of changes, and not as one single diff. This is extremely useful when, in order to make a change, you first need to do some refactoring. When possible (but not always, because sometimes it's hard), I like to split the refactor into one commit and then the actual interesting change into another commit. I believe this makes it easier for reviewers, because now you don't need to mentally separate "okay this change is just refactoring, so no behavior changes" and "okay this part is where the behavioral change is."

In the case where you can't easily attribute a fix to one commit, then absolutely, create a new commit! There aren't any hard rules here. It doesn't have to be perfect. I just personally don't want to live in a world where there are tons of "fix lint" commits scattered through the project's history. But of course, this is in tension with workflow. Because there are multiple ways to avoid "fix lint" commits. One of those is "squash & merge." But if you don't want to use "squash & merge" in a particular case, then the only way to avoid "fix lint" commits is to do fixups. And git-absorb will help you find the commits to create fixups for automatically. That's it.

masklinn

19 hours ago

I’ve been using autofixup for this and it’s been ok but not great, it can be quite slow as things grown, and it doesn’t say anything when there was no match so it’s easy to miss. How does absorb surface that?

> Perhaps partially due to how GitHub works.

That’s definitely a major factor, I’d like to use stacked PRs they sound really neat, but GitHub.

Also even with stacked PRs I figure sometimes you’re at the top of the stack and you change things which affect multiple commits / prs anyway, surely in that case you need to fixup other commits than the ToS?

burntsushi

11 hours ago

> I’ve been using autofixup for this and it’s been ok but not great, it can be quite slow as things grown, and it doesn’t say anything when there was no match so it’s easy to miss. How does absorb surface that?

I haven't used autofixup, but:

* git-absorb has always been pretty snappy. I don't think it scales with repository size.

* If there's no match, then the things that don't match stay in the staging area and don't make it into a commit. git-absorb will also note this in its output after running it.

Degorath

16 hours ago

GitHub's quirks definitely make life much harder than it needs to be, but I've been using `git machete` for months now with great success in my team. The __one__ thing GitHub has that makes it all work is the fact that if you merge the parentmost branch, its immediate child will retarget its base branch.

I think if I had full "control" over my company's SCM workflows I would use a tool that considers a branch as a workspace and every commit in the branch becomes its own PR (personal preference, but in my experience it also motivates people to split changes more), but alas.

keybored

16 hours ago

The term Stacked PRs already sounds like a term that was invented specifically in order to communicate in a GitHub-influenced context. Because Stacked PRs are just a reinvention of being able to review a commit at a time (the stack part is straightforward).

travisb

4 hours ago

Stacked PRs are like being able to review a commit at a time, but add an additional layer of sequencing. It's most simply thought of as a patch series, where the evolution of each 'patch' is retained.

That additional layer allows finer grained history and to mostly avoid (unreviewed) rebasing. Many teams find those properties valuable.

masklinn

11 hours ago

Stacked PRs is a way to surface the lifetime of the proposition as they get fixed or updated following reviews.

It has nothing to do with github.

keybored

8 hours ago

It has nothing to do with GitHub in the sense that GitHub does not support it (I guess, I’m not up to touch). It does have something to do with GitHub in the sense that the name (PRs) and the benefits are framed from the standpoint of This is What GitHub Lacks.

Which is a GitHub-centric perspective.

https://news.ycombinator.com/item?id=41514663

Zitrax

17 hours ago

I assume you refer to https://github.com/torbiak/git-autofixup. I have also used it, and its ok but not perfect.

krobelus

17 hours ago

I use git autofixup; it was much better than git absorb last time I checked

> it doesn’t say anything when there was no match

that's what it should do

> it can be quite slow as things grown

How? All the slowness (on large repos) I've seen has been fixed.

masklinn

11 hours ago

> that's what it should do

No it is not.

> How?

I don’t know, that’s just an observation from using it, semi regularly I autofixup changes and it takes a while to do anything.

JeremyNT

5 hours ago

I am (what I assumed to be) an extensive user of fixup (usually invoked via interactive rebase). I'm intrigued by this but curious as to how it can really save so much time.

Are people fixup'ing a lot more than I do? I might do it once or twice per MR and it's never a large burden to fix the right commit.

If things get really out of hand such that the whole thing is a mess I just squash. Whatever history I had before is almost by definition gross and wrong in this scenario, and I don't mind losing it.

burntsushi

5 hours ago

It's same kind of thing people tell me about ripgrep. "Why bother with ripgrep, grep has always been fast enough for me." That might well be true. Or maybe we value time differently. Or maybe none of your use cases involve searching more than 100K lines of code, in which case, the speed difference between GNU grep and ripgrep is likely imperceptible. Or maybe being faster unlocks different workflows. (The last one is my favorite. I think it's super common but little is written about it. Once something becomes fast enough, it often changes the way you interact with it in fundamental and powerful ways.)

Because I have git-absorb, I tend to be a bit more fearless when it comes to fixups. It works well enough that I can be pretty confident that it will take most of the tedium away. So in practice, maybe git-absorb means I wind up with fewer squashes and fewer "fix lint" commits because I don't want to deal with finding the right commit.

My use of git-absorb tends to be bursty. Usually when I'm trying to prepare a PR for review or something along those lines. It is especially useful when I have changes that I want to be fixed up into more than one distinct commit. git-absorb will just do it automatically for me. The manual approach is not just about finding the right commit. It also means I need to go through git-add in patch mode to select the right things to put into a fixup commit, and then select the other things for the other commits. So it actually winds up saving a fair bit of work in some cases.

Another example is renaming. Maybe PR review revealed the names of some functions were bad. Maybe I introduced two functions to be renamed in two distinct commits. I could do

first rename -> first commit -> second rename -> second commit

Or I could do:

both renames -> git add -p -> first commit -> second commit

Or I could do:

both renames -> git absorb

In practice, my workflow involves all three of these things. But that last git-absorb option eats into some of the first two options and makes it much nicer.

notlinus

17 hours ago

Criticism isn't negativity. We're not Pollyannas here, we're adults who can handle critique.

burntsushi

11 hours ago

Can you give me an example of criticism that is not negative? As far as I know, all forms of criticisms involve pointing out a flaw or fault. There's constructive criticism, but it's still fundamentally negative.

Either way, feel free to replace the word "negative" with "criticism" in my comment if you want. It expresses the same thing I intended to express: I disagree with the criticism.

If we're not Pollyannas and we're all "adults who can handle criticism," then you should also be able to handle criticism of criticism. It goes both ways.

beart

8 hours ago

I've been thinking about your comment that all criticism is negative and I'm not sure where to go with that in my mind. I think it very much depends on your definitions of "criticism" and "negative". For example, a movie critic could praise a movie, 100% score, no faults. That's still a criticism by many definitions.

But the interesting thing about this is "the eye of the beholder". I suppose to some folks, all criticism does seem to be negative... which is why code reviews can turn into nightmares... and helps explain why I recently received a very angry phone call from another engineer because of some changes that were made to "his code".

For the purposes of this discussion, the oddity seems to be the folks jumping to defend the software they had no part in creating. Why does the criticism result in negative feelings for them? I can understand the author taking issue with it, but someone else's criticism should not impact another's ability to utilize the software for whatever purposes.

burntsushi

8 hours ago

I don't want to play a definitions game and dissect word meanings here. I think it's usually a waste of time. That's why I implored folks to just accept a revision in wording in my previous comment. What I intended to convey was this:

1. There were many comments providing criticism that I disagreed with.

2. I felt that the vibe given by the comments at the time was not commensurate with how much utility the tool brought me in my own workflow.

So it seemed useful to point this out. That is, that there was a gap in the commentary that I thought I could fill.

Obviously I won't be using this phrasing again because it spawned off a huge meta sub-thread that I think is generally a huge waste of time and space.

> For the purposes of this discussion, the oddity seems to be the folks jumping to defend the software they had no part in creating. Why does the criticism result in negative feelings for them? I can understand the author taking issue with it, but someone else's criticism should not impact another's ability to utilize the software for whatever purposes.

Again, it cuts both ways. Why isn't it odd for folks to jump in and express criticism of software they had no part in creating? If folks can express negative criticism, then why can't I express positive feedback?

I didn't say people shouldn't make those comments. I said it was unwarranted. Or in other words, that it was undeserved and that I disagreed with it. Or that some balance was appropriate. This has literally nothing to do with my ability to utilize the software. I don't know why you're going down that road.

j_maffe

10 hours ago

Negative comments are not always a product of negativity. Sometimes it's positive feedback to improve something that has potential.

burntsushi

10 hours ago

I think that can be true and my top-level comment can be true simultaneously. I've also clarified what I meant at this point.

cryptonector

7 hours ago

I think GP is saying that you didn't need to focus on the negativity (which is in itself negativity), just say the substantive thing that you wanted to say without editorializing about negativity. Your complaint (negativity) about negativity might be over-done, and anyways not useful. We can all shake our own heads at others' possibly-unnecessary negativity without having to be calling it out all the time. Meta arguments are not always useful.

Besides, consider this negative comment:

  https://news.ycombinator.com/item?id=41653797
is it one of the negative comments you didn't like? It seems like a possibly-useful negative comment -- "it didn't work for me because ..." is useful unless it was really a matter of undiagnosed PEBKAC. Would you rather than comment not have been posted? Surely not.

burntsushi

6 hours ago

I think I've clarified my position in other follow-up comments sufficiently that everything you said here has already been addressed.

I even already said I would use different wording next time.

And I never said folks shouldn't make those comments. I clarified that too.

Y_Y

14 hours ago

I'd be in favour of auto-stickying this. I see a lot of e-ink spilled over arguments that boil down to whether or not it's ok to comment about not liking some aspect of the subject under discussion. There are good reasons not to criticize in some situations, but I don't think they apply here. Either way the arguments are tiresome. We should agree to ban criticism, or agree not to argue about it (barring special circumstances).

tpoacher

16 hours ago

I had to look up the reference, and based on the wikipedia plot summary at least, I admit I don't quite get the relevance. I expected a plot where someone handles criticism quite badly and suffers as a result, but in fact the plot was actually about someone who handled criticism very well instead, and improved the lives of others as a result?

So now I'm curious! In what way does Pollyanna relate to adults who can't handle critique? Have I got the wrong Pollyanna by any chance? xD

talideon

14 hours ago

A Pollyanna is somebody who's cheerful and optimistic to a fault, i.e., even when it's unjustified.

The plot summary of the book is likely not what you should be reading as it's become an idiom. Something like Wiktionary or another dictionary would be a better place to look it up.

In this case, it's not about being able to receive criticism, but about being reticent about _giving_ it.

Y_Y

14 hours ago

> The plot summary of the book is likely not what you should be reading as it's become an idiom.

This is a good and perhaps under-appreciated point. When I first read the term "Polyanna" I made the same mistake as GP. I think if you read "The Prince" to find out what "Machiavellian" meant you'd be no better than when you started. Even terms like "Kafkaesque" have taken on lives of their own and are probably better not thought of as mere literary references.

ekidd

12 hours ago

Machiavelli's "The Prince" will give you a decent understanding of what people usually mean by "Machiavellian". The book explains what methods would allow an absolute ruler to stay in control of state. It does not generally make moral judgments about those methods.

Machiavelli's "Discourses" is the one that will really confuse a reader looking to understand the colloquial meaning of "Machiavellian". In this book, Machiavelli lays out a vision of a healthy "republic" (or more precisely, res publica) which benefits the people who live in it. Among other things, Machiavelli argues that republics actually benefit from multiple competing factions, and from some kind of checks and balances. Apparently these ideas influenced several of the people who helped draft the Constitution of the United States.

Now why Machiavelli had two such different books on how governments worked is another interesting question...

Y_Y

12 hours ago

> Machiavellian

> adjective

> uk /ˌmæk.i.əˈvel.i.ən/ us /ˌmæk.i.əˈvel.i.ən/

> using clever but often dishonest methods that deceive people so that you can win power or control

(from https://dictionary.cambridge.org/dictionary/english/machiave... )

Ymmv, but I think that's far from the point of the book, and isn't even the main topic. It's hard for me to imagine taking a person who'd never heard the term, letting them read the book, and then asking them to propose a definition, would produce anything like the above.

unkulunkulu

16 hours ago

I’ve also thought about that based on the same info, i.e. not reading the source completely.

There is a notion of a “Pollyanna mode” in schematherapy. What it means is ignoring negative facts and challenges with an outwardly positive attitude and avoiding addressing the issues themselves.

This certainly can be harmful to oneself. Another harmful thing is hating and bashing oneself for mistakes and faults and I won’t make a comparative judgement, but a healthy way is supposed to be along the lines of speaking up openly about what bothers you and thinking what can be done about it if at all.

tpoacher

16 hours ago

This makes sense. Thank you for your comment! Learnt something new today :)

metadat

4 hours ago

I haven't used `git --fixup' or `git rebase --autosquash' before, but they sound pretty handy.

https://jordanelver.co.uk/blog/2020/06/04/fixing-commits-wit...

git-absorb appears to take it one level further. However from the README I'm not clear on exactly what it will do in specific situations:

Does git-absorb automatically associate the most recent commit unique to a branch for a given file and apply the diff to said commit? Or what is the precise workflow and outcome in terms of which edits go into which commits?

AndrewHampton

a day ago

FWIW, I've been using this alias for the past couple years for fixup commits, and I've been happy with it:

> gfx='git commit --fixup $(git log $(git merge-base main HEAD)..HEAD --oneline| fzf| cut -d" " -f1)'

It shows you the commits on the current branch and lets you select one via fzf. It then creates the fixup commit based on the commit you selected.

jonS90

2 hours ago

Funny that I've been doing something nearly identical, but with way more boilerplate.

    fzfCommit() {
      local FZF_PROMPT="${FZF_PROMPT:=Commit: }"
      git log --oneline | fzf --border --prompt="$FZF_PROMPT" --height=10 --preview="git show {+1} --color=always" --no-sort --reverse | cut -d' ' -f1 | tr '\n' ' ' | sed 's/[[:space:]]$//';
    }
    function gfixup {
      local commit=$(FZF_PROMPT='Fixup Commit: ' fzfCommit)
      if [[ -z "$commit" ]]; then
        return 1
      fi
      set -x
      git commit --fixup "$commit" --allow-empty > /dev/null || return 1
      git rebase --interactive "$commit"~ --autosquash || return 1
    }

psadauskas

7 hours ago

I have this one in mine: https://github.com/paul/dotfiles/blob/master/git/.gitconfig#...

    # make a fixup commit for the last time the file was modified
    cff   = "!f() { [ -n $@ ] && git add $@ && git commit --fixup $(git last-sha $@); }; f"
    # Get latest sha for file(s)
    last-sha = log -n1 --pretty=format:%h --grep 'fixup!' --invert-grep
Given a file like `git cff path/to/file.rb`, It'll find the last commit that touched that file, and make a fixup commit for it. Its great for the try-this-change-on-CI cycle: Change the Dockerfile, git cff Dockerfile, push, repeat, without it screwing up because you changed a different file while you were working on it.

mdaniel

6 hours ago

It won't matter until it does, but $@ expands arguments into separate words but those expansions are themselves only word-preserved if the $@ is itself quoted. The [ will probably get away with what you want, but its use in $(git add) and $(git last-sha) almost certainly will not

  $ cat > tmp.sh <<'FOO'
  for a in $@;   do echo "a=$a"; done
  echo
  for b in "$@"; do echo "b=$b"; done
  echo
  for c in $*;   do echo "c=$c"; done
  echo
  for d in "$*"; do echo "d=$d"; done
  echo
  FOO

  $ bash tmp.sh 'alpha beta' $'charlie\ndelta'
  a=alpha
  a=beta
  a=charlie
  a=delta

  b=alpha beta
  b=charlie
  delta

  c=alpha
  c=beta
  c=charlie
  c=delta

  d=alpha beta charlie
  delta

psadauskas

3 hours ago

Yeah, you're probably right. I guess I haven't run it on any files with spaces in the 6 years since I added it to my dotfiles.

kccqzy

a day ago

If it only lets you select one, that's strictly less powerful. What if I want some parts of it into one commit and another parts into another? The `hg absorb` works for this case.

AndrewHampton

12 hours ago

Yeah, it's definitely less powerful that what absorb is doing. I wasn't trying to argue that it was equivalent. I just wanted to share a bash one-liner that I've had success with in case others find it helpful.

> What if I want some parts of it into one commit and another parts into another?

Looks like absorb will automatically break out every hunk into a separate fixup commit. My one-liner will create 1 fixup commit for everything that's staged. That's typically what I need, but on the occasions it's not, I use `git add -p`, as kadoban mentioned, to stage exactly what I want for each commit.

mst

10 hours ago

Oh, hrm, looking at this description and the one liner, I rather like.

Once you mentioned `git add -p` I realised that this is pretty much what I do already, except with a far more efficient way of selecting the relevant commit to do it to.

Muchas gracias.

AndrewHampton

9 hours ago

Yeah, I use about a dozen git aliases in my normal workflow. In case it's helpful, here are the relevant ones for this flow:

  alias git_main_branch='git rev-parse --abbrev-ref origin/HEAD | cut -d/ -f2'
  alias gapa='git add --patch'
  alias grbm='git rebase -i --autosquash $(git_main_branch)'
  alias gfx='git commit --fixup $(git log $(git_main_branch)..HEAD --oneline| fzf| cut -d" " -f1)'
Another favorite is:

  alias gmru="git for-each-ref --sort=-committerdate --count=50 refs/heads/ --format='%(HEAD) %(refname:short) | %(committerdate:relative) | %(contents:subject)'| fzf | sed -e 's/^[^[[:alnum:]]]*[[:space:]]*//' | cut -d' ' -f1| xargs -I _ git checkout _"
gmru (git most recently used) will show you the branches you've been working on recently and let you use fzf to select one to check out.

atq2119

14 hours ago

Then you use `git gui`, which is part of the git distribution itself, or `tig` if TUIs are your thing. I have a key binding for `git commit --squash=%(commit)` in my tig config, so I can interactively select lines or hunks to stage and then the target commit for the squash.

kadoban

21 hours ago

That sounds like what `git add -p` is for, stage part of the current changes.

kccqzy

8 hours ago

That still requires you to manually select hunks. The point of `hg absorb` is to automatically select hunks even if these hunks are to be squashed into different commits.

k3vinw

20 hours ago

Might be able to use the multimode flag in the fzf command above and it should let you select more than one using Tab and Shift+Tab.

emersion

7 hours ago

With a shell such as fish, one can "git commit --fixup <tab>" and a list of commits will be displayed.

johnnypangs

18 hours ago

I’ve been using this:

alias gfixup="git commit -v --fixup HEAD && GIT_SEQUENCE_EDITOR=touch git rebase -i --stat --autosquash --autostash HEAD~2"

From what I understand it does the same thing as this crate for the most part. All I do after is:

git push —force-with-lease

Not sure what you get from the crate otherwise

masklinn

18 hours ago

Your alias seems like a completely unecessary complexity. If you want to meld new changes into your branch head you can just alias “git commit --amend”, you don’t need that mess.

Absorb will find the commits to fix up for each change in the working copy, it doesn’t just merge everything into the head.

johnnypangs

14 hours ago

I see, the reason it’s that long complicated alias was that I didn’t want to open up the editor to change the commit every time I updated. “git commit —amend” does that.

I read the rough how it works and it now makes sense. I might give it a try. Thanks!

johnnypangs

14 hours ago

Seems like you can add —no-edit and get the same behavior, now I can delete that alias. Thanks again :)

(Edit: typo)

masklinn

11 hours ago

That is correct, and there is a `--edit` to revert that, so my personal alias is to `git ci --amend --no-edit` such that by default it just merges the staging into the HEAD, and then tacking on `-e` will open the commit message in an editor to expand it.

cryptonector

7 hours ago

You can also set `EDITOR=true` for that `git commit --amend` if you forget about `--no-edit`.

johnnypangs

18 hours ago

I guess the crate version is easier to soft reset?

mckn1ght

a day ago

sounds like how magit lets you create fixup commits in emacs

AndrewHampton

12 hours ago

This was 100% my inspiration. I used emacs+magit for years. After switching away from emacs for dev work, I still cracked it open for git interactions for another year or so. Eventually, I moved entirely to the shell with a bunch of aliases to replicate my magit workflow.

masklinn

19 hours ago

Worse in fact, since magit lets you fixup, squash, or instafix.

imiric

21 hours ago

I tried using this tool after seeing recommendations for it, but IME it got the parent commit wrong enough times that the work to undo the damage was more than if I had looked up the commit myself and used `--fixup` instead. So I moved back to this manual workflow pretty quickly.

I prefer having full control over my commit history, and this tool is too much magic for my taste. I'm sure that it could be improved so that mistakes are rare, but I'm not sure I would trust it enough to not have to review its work anyway.

3eb7988a1663

21 hours ago

I use lazygit for this. Keyboard driven TUI which lets you easily re-order/squash commits with minimum fuss. Being able to see them all laid out means no futzing with identifying the right ids.

imiric

21 hours ago

I rarely refer to commits by their IDs in these cases, though. More often than not I use the relative `HEAD~n`, when it's easy to determine `n` visually. With a few aliases, the manual fixup workflow really doesn't take much effort.

arcanemachiner

19 hours ago

I love lazygit so much. I use it all the time, and it's constantly improving my workflow.

insane_dreamer

10 hours ago

lazygit also my go-to for this and git in general

montag

20 hours ago

This sounds just right.

adastra22

21 hours ago

If it just automatically created the fix up commit and moved it to the proper place in the commit tree, I’d be happy with that. You can then run a one-liner to merge the fix up commits after reviewing.

BeeOnRope

20 hours ago

That's exactly what it does by default.

Only if you pass --and-rebase does it actually do the autosquash rebase for you.

globular-toast

17 hours ago

The more worrying thing for me is how would you know whether it got it right or wrong? I suspect if you're using it the point is you're not going to go and inspect each commit manually. IMO something that looks right is far more dangerous than something that's merely wrong.

bradley13

18 hours ago

Maybe I am being to much of a purist, but retroactively modifying commits and history? Why? Stuff happens, so do mistakes. Fix the mistakes, make another commit, and go on with your life.

strunz

18 hours ago

Commits should be a contained change that can be understood as a logical piece of history, and reverted if necessary. If you have to look at multiple commits for 1 logical change to the code, it's much more difficult to figured out what the intention was, if it was correct, and how it can be reverted.

Terr_

13 hours ago

I'd also point out that it's bad to have commits in the main branch where compilation or unit tests would fail, even when each bad-commit always arrives with an immediate "oops, fixed" commit trailing it.

It messes up the team's ability to use tools like git-bisect to pinpoint behavior changes, and is generally more noise than signal.

At a bare minimum, those spans should get squashed before they leave the PR or feature branch.

notlinus

17 hours ago

Do you always make all of your logical code changes in single, atomic commits? What if you have to modify a feature later on? I'm sorry, but this is ridiculous. There's nothing wrong with having multiple commits in a row modifying something. That's how git works. GitHub's PR merge workflow really messed up the meta game, I tell you what...

trashburger

15 hours ago

Yes, always. Of course I make mistakes, but the idea is that any patch I put up for review, if it cleanly applies, should not break any user and any tests. It's much more liberating knowing that bisects actually work and each commit does exactly what it says on the tin (no followup "fix"/"minor" commits that actually do something completely different). Incremental improvements are compatible with this approach. Also see https://gist.github.com/thoughtpolice/9c45287550a56b2047c631... .

brigandish

17 hours ago

I don't think you're arguing against the point being made.

Logical change 1, implement Hello World:

    puts "Hello, World!"
Logical change 2, make it a method:

    def hello_world
      puts "Hello, World"
    end
Logical change 3, take an argument for the greeting:

    def hello_world(greeting)
      puts "#{greeting}, World"
    end
Logical change 4, take an argument for the recipient:

    def hello_world(greeting, recipient)
      puts "#{greeting}, #{recipient}"
    end
Logical change 5, improve the method name by showing the intent:

    def greet_recipient(greeting, recipient)
      puts "#{greeting}, #{recipient}"
    end

Logical change 6, defaults…:

    def greet_recipient(greeting="Hello", recipient)
      puts "#{greeting}, #{recipient}"
    end
And so on. These could each be their own feature or part of a feature etc. What they are, though, is logically atomic. If you want to add a default recipient later, or next, that is its own logical commit. If you roll it back, each rollback will lead you to a logical difference, not some typo fix, e.g.

    def greet_recipient(greeting="Hell", recipient)
      puts "#{greeting}, #{recipient}"
    end
plus the missing "o"

    def greet_recipient(greeting="Hello", recipient)
      puts "#{greeting}, #{recipient}"
    end
Are one logical commit, for which I'd use a fixup or `git commit --amend` or whatever. What one considers logically atomic may differ from person to person, language to language, feature to feature… but they can be differentiated from things like typos quite easily.

Personally, I make numerous commits in a feature branch, (transparently, too, my typos included - I'm not proud:) and before requesting review I'll clean up using rebase into as few logical commits as possible, and upon acceptance either squash it all to one or leave it as is, depending on the team's culture/needs.

Terr_

13 hours ago

If those are all different commits, now imagine the fun of interactively rebasing them on top of a main branch where some other thing changed "Hello, World" to "Greetings dear Globe."

brigandish

12 hours ago

Why is someone else writing my feature? Do we not have a ticket system? ;-)

More importantly, when using a feature branch one should really rebase master into the feature branch regularly, and any other branches that might touch the same places, it takes care of little surprises like this.

Terr_

5 hours ago

To clarify, I meant a scenario where that very simplest version is already upstream, and your feature-branch contains 5-6 commits which are all different refactoring-steps or making the string literals override-able or abstracted away.

When you need to rebase that onto main/master, there would be 5-6 stops to fix conflicts in a 3-way diff, and I don't think the kind of work in that example is compelling enough to justify that as opposed to having one (squashed) commit that simply states that you improved the method in several ways.

P.S.: Sometimes more commits can make refactoring easier instead, often when it comes to splitting file rename/move operations from changing their content.

recursive

7 hours ago

> Do we not have a ticket system?

Maybe we do, maybe we don't. Maybe there are two tickets with a mutual dependence.

Even without that, if you add a parameter to a method, and later someone uses it in a totally different PR, then you effectively can't revert the creation of the parameter. How are you going to track that dependency. I've heard the arguments about logical commits and the ability to cleanly revert them. I just haven't seen any code where that would actually work, even if the trouble was taking artisanally curate the commit history.

vasergen

14 hours ago

I'd say pull request should be reverted if necessary, but an idea that each commit should be revertable and expectation that project should work after it is unneeded complexity for me.

ozim

17 hours ago

I am not history purist in that way. I don’t care about fixes for mistakes, if they are in your local branch do the cleanup before publish and if it is your own published branch force push is ok.

Unless of course your changes were merged to common code already - then they become not touchable - just make a new commit with fix don’t try to be smart, common code rolls only forward and fixes are new commits.

chippiewill

7 hours ago

Amusingly it actually makes you less of a "purist" to not modify commits. Git was created by Linus Torvalds to support Linux Kernel development

The Linux Kernel is heavily built around reviewing each commit individually via patchsets sent as emails and rewriting history to make it cleaner.

Git history is incredibly valuable if its curated, at my previous employer we used Gerrit and pretty much every change would have a well crafted multi-line commit message that explained what was being changed and most importantly _why_. If I wanted to understand why a bit of the codebase had been authored in a particular way it was incredibly easy because I could just git blame the line and there'd be a detailed commit message explaining the rationale.

In the Pull Request world I do agree with what you say - but only if you squash merge. Commits that say "fix typo" or "oops" or "wip" aren't useful long term.

gorset

18 hours ago

The use case is when you look at a branch as a series of patches.

Reviewing a clean set of commits is much easier than a branch full of mistakes and wrong paths taken.

Useful when we optimize for reviewing and good history for future maintenance. This has been important and useful when I’ve worked on big mission critical backend system, but I also understand it might not be the most important factor for a new project’s success.

keybored

14 hours ago

A branch with some base is already a series of commits. I don’t get where the conceptual re-imagining is here.

gorset

13 hours ago

Patch series comes from the linux kernel workflow, which git was developed to support. https://kernelnewbies.org/PatchSeries

In this workflow you review every commit and not just the branch diff. Each commit is crafted carefully, and a well crafter series of commits can make even very large changes a brief to review.

It takes a certain skill to do this well. As the page above says > Crafting patches is one of the core activities in contributing code to the kernel, it takes practice and thought.

This is in contrast to using git more as a distributed filesystem where you don't care particularly much about the history, and you typically squash commits before merging etc. It's simpler and easier to work this way, but you lose some of the nice attributes of the linux kernel workflow.

keybored

8 hours ago

That’s a nice summary.

What I don’t like about the Git documentation as I’ve read it is that they go between “patch” and “commit” in some places without stopping and explaining what the difference is. It makes sense to them. It’s obvious. But it isn’t necessarily obvious to most people.

A patch is a patch proper plus a commit message encoded in a format that git am understands. That’s fine. And the core developers understand that you cannot transmit a commit snapshot via email (or you shouldn’t). But I prefer to mostly stick to “commit” in the abstract sense, whether that to-be-commit is from a pull or from an email (or: it’s in the form of an email and it could be applied as a commit).

git rebase talks about “patch series” I think. Without explaining it. Why not “commit series”?

Sometimes it seems like talking about your changes by the way it happened to be transmitted. It’s like talking about “attachments” instead of commits because you happened to send them via email as an attachment (instead of inline).

Then you now have “stacked diffs” or “stacked commits”. Which are just a series of commits. Or a branch of commits (implicitly grounded by a base commit). For a while I was wondering what stacked diffs/stacked PRs/stacked patches and if I was missing out. When it just turned out to be, as you explain, essentially the Linux Kernel style of being able to review a commit in isolation. But in a sort of context that pull request inhabitants can understand.

I prefer to mostly talk about these things as “commits”.

(At several times writing those paragraphs above I wondered if I would be able to string together them in a coherent way)

travisb

an hour ago

I think part of the confusion is because 'patch' and 'commit' (really snapshot) are duals of each other, but in practice have important technical differences. When speaking abstractly about 'changes' it often doesn't much matter which term is used, but most interactions are with 'commits' so that tends to be the default term to use.

However, sometimes the details matter. For example, a 'patch' (diff + description) tends to be small enough to transfer conveniently, and human friendly. Patches do not describe their relationship to other patches, so it makes sense to talk about a series of patches which must be applied in sequence to accomplish some larger goal. 'patch' doesn't imply any particular storage format, so sometimes saying 'attachment' or 'patch file' or 'email body' is an important distinction. Git documentation assumes you know what you want. It might help to think of patches as "outside time", though of course any particular version of a patch will only apply to a subset of all snapshots.

A 'commit' (snapshot + description), on the other hand, tends to be large in that it describes the entire state of the codebase and contains (implicit or explicit) metadata describing its predecessor(s) -- possibly a large graph of them. People talk about a 'commit series' all the time, but use the terms 'history' and 'branch' instead.

Stacked PRs are built on top of commits and branches to gain the advantages of a patch series while retaining the advantages of snapshots. There's no industry consensus on if they are preferable, let alone how to best implement them (eg. rebase, rebase+squash, merge-squash, merge+first-parent, etc.), so different people have different ideas about what they look like. It isn't correct to say they are just a series of commits, because sometimes they are implemented as a series of (implicit or anonymous) branches. One of the few agreed-upon features of stacked PRs (or its other names) is that it is a sequence of 'changes' which are ordered to both satisfy change dependencies and broken into smaller pieces which "tells a story" to reviewers.

dclowd9901

17 hours ago

So this wouldn’t work very well in workflows that flatten merges to a trunk?

chippiewill

7 hours ago

Tools where they allow you to squash merge tend to be fairly incompatible with stack diff workflows generally.

It's tricky to review individual commits on Github and Gitlab and they don't even allow you to leave comments on the commit messages.

In areas where people do review individual commits they tend to use tools that support that workflow. Git uses email patches where each commit is sent as a separate email. Tools like Gerrit do code review directly on a commit by commit basis and the usual strategy to get it into trunk is to either "fast forward" (advance the HEAD of the trunk branch to that commit) or to cherry-pick it.

mst

10 hours ago

I like for non-trivial stuff to have a branch with a series of logical commits, cleanly rebased atop main, then use -no-ff to force a merge commit anyway. That way you can the whole branch appears as a single commit/diff atop main in primary history but you can dig in to the original components of it if/when that's useful.

The primary obstacle to doing this for me is, if I'm honest, not having automated it sufficiently that I can't forget to do that.

travisjungroth

17 hours ago

It would mostly be unnecessary. The separate commits won’t matter after the PR if they’re getting squashed.

Debatably, if you’re making changes during a PR review, it could be helpful to make those changes in relevant commits. That way if someone goes through them during the PR, they get one clean “story”, rather than see the pre-PR commits and the conversation after.

Izkata

11 hours ago

For me it's not purism, but practicality: The top comment right now mentions using this to apply linting changes to the original commit that introduced the linting violation, but I've seen linting commits introduce bugs often enough that I think those should always be a new separate commit, so a future maintainer can easily see why that kind of bug was introduced and what the fix is.

theptip

11 hours ago

It sounds like you have never heard of the Stacked Diffs workflow.

Here is an overview: https://newsletter.pragmaticengineer.com/p/stacked-diffs

It’s a perfectly reasonable preference to eschew this workflow, but definitely worth understanding the pros and cons.

dpkirchner

10 hours ago

Every explanation of stacked diffs I've seen (including this one) makes me think there is some secret magic not being shared that will make the whole idea "click".

If the problem is devs having to wait too long for PRs to be reviewed, breaking tasks up in to smaller diffs should exacerbate the problem, wouldn't it?

If diff 2..n follow diff 1 then a review of 1 blocks everything else you're doing, eh? At least the traditional route, branching everything straight off of main, means you can merge 2..n and rebase 1 once it passes, leaving you less blocked.

There must be something I'm missing because there are a lot of people promoting stacking diffs.

travisb

4 hours ago

In my experience with stacked diffs the smaller diffs doesn't exacerbate the problem because they are pipelined.

When getting merge approval takes at least one or two business days (say because of time zones or your colleagues are doing deep work which they don't want to drop every fifteen minutes to do a review) the pipelining is necessary to be unblocked.

johnmaguire

9 hours ago

> If the problem is devs having to wait too long for PRs to be reviewed, breaking tasks up in to smaller diffs should exacerbate the problem, wouldn't it?

I find small diffs to be much easier to review and as a result I review them much practically as soon as they are posted. And I'm less likely to miss something.

dpkirchner

8 hours ago

For sure, I agree. However there is also a context switching cost and IMO switching between your IC work and, say, a dozen small diffs is rougher than a couple larger diffs. That opinion said I haven't experienced fast reviews myself, so maybe that's what I'm missing.

steveklabnik

4 hours ago

I recently mentally came around to stacked PRs, though I currently work mostly solo, and so don't use them. But I'm happy to try and talk with you about it. Maybe because I learned so recently it might help.

> If diff 2..n follow diff 1 then a review of 1 blocks everything else you're doing, eh? At least the traditional route, branching everything straight off of main, means you can merge 2..n and rebase 1 once it passes, leaving you less blocked.

I am a little confused about the scenario you're describing here: with the PR based flow, you can't merge the second half of the PR while waiting on some change to the first commit.

So let me try to suggest a scenario. We're working on a web app. We're going to add a new feature. This requires some backend changes, and then some front end changes that use those back end changes. For the immediate purposes here, we're going to presume these are in the same PR. I'll talk about them as two PRs in a moment. We're also going to assume for the sake of simplicity that there's two commits in this PR: one for backend, one for frontend.

With the PR workflow, you're asking for review from two people: a backend expert, and a frontend expert. Both are going to request changes, and you're gonna respond to all of them, and eventually things get to a good place and the PR is merged. Let's say that process takes a week.

There's a few possible issues here: the first is, both people are going to get notifications for every change in the PR, even if it's not in the code that they need to review. This creates additional review fatigue for both reviewers. They also may need to check what's happened since the last time they reviewed things, and github doesn't always make that easy, though they do help sometimes in some circumstances. Another issue is, say that the backend changes are good to go, but there's more frontend work to do: because it's in the same PR, the backend change will not be integrated until the frontend change is done. And the backend reviewer will still be getting those pings. Notably, if the frontend change is ready first, they'll also be getting pings from more backend reviews too.

In the stacked diff flow, and using those tools, each reviewer only needs to review their half individually. They will not get pinged for changes to the other half. They will be presented with only the diffs that affect the parts that they need to review. And, because you can land each diff in the stack whenever it's at the top (or bottom, depending on how you visualize things, I guess) of the stack, as soon as the backend commit is good to go, it can land, and the frontend commit can land at a later time. We've popped it off the stack. And no matter who finishes first, they won't get review pings for the other part's work.

So, you might say: yeah, that's exactly why I'd make them as two separate PRs. And that may be a good idea. But to do that, you have two choices: wait until the backend stuff is fully done, and then start on the frontend stuff. That can work, sure, but maybe what you want to do needs the frontend work to influence the backend work, so treating it as one logical change before any of it hits master makes sense. So to do that, you make a frontend PR that, instead of branching off of master, branches off of the backend PR. Well, now you've reinvented stacked diffs, but you don't have any of the tooling to make that nice! Once the backend lands, you'll have to rebase the frontend off of master, stuff like that.

... does any of that make sense?

cryptonector

6 hours ago

If you haven't pushed then you want to eventually push clean history. That's what rebasing is about.

nextaccountic

17 hours ago

If it is unreleased it may make the life of reviewers easier. If they look at the commits at all, that is.

But if not even reviewers are looking at commits I question whether a PR should be chunked in commits at all - why not squash all commits into a single one, so that every PR is composed of exactly one commit? Could maybe separating the changes in commits convey some information?

toastal

14 hours ago

Normally creating a new feature involves independent actions like updating dependency, refactoring some module beforehand, & so on. If these things can be broken up in a way that make review & reading logs easier, why would you squash the whole history—especially if that history is already good. In as much as it would be ridiculous to go to the other extreme of committing ever character/line as a separate commit, just collapsing the whole history isn’t a good approach. There isn’t a hardened rule about where the line is but that is okay & up to developer discretion on how they want to ‘tell the story’ to a reviewer (or their future self) about how certain chunks should be read together.

baq

8 hours ago

the history you're merging to the main branch is much more useful clean than accurate.

eviks

16 hours ago

Because it's easier to understand later on when it's not spread out

IshKebab

15 hours ago

The point of history is so that other people (or your future self) can look back at the history of the code and understand how it changed and why changes were made.

There's zero value in retaining mistakes that were never merged into `master`. In fact there is negative value because it makes the history harder to follow.

For example would you rather see "review fixes" in Git blame, or the actual useful commit message?

You can argue that you can't be bothered to make your history nice; fine. But you can't say it's wrong to do that.

(Personally I think you should squash commits for a MR; if it's too big to review as one commit it's too big for one MR - except for branches that multiple people have worked on over an extended period.)

Terr_

13 hours ago

> There's zero value in retaining mistakes that were never merged into `master`.

Not to be too pedantic about this, it sounds like you mean flawed commits that were merged into master, but never had a chance to be the HEAD of master? (In other words, they always arrived along with a fix-commit too, so that nobody checking out code could've hit the bug.)

> In fact there is negative value because it makes the history harder to follow.

It also hampers your ability to use tools like git-bisect to detect when behavior changed, especially if some of the bad commits won't compile or pass unit tests.

IshKebab

3 hours ago

> flawed commits that were merged into master, but never had a chance to be the HEAD of master

Yeah exactly.

globular-toast

17 hours ago

Why store git history at all? It's useless if you don't take care of it. Have you ever used git history for anything? People use it to find the source of regressions (you can do it quite quickly using git bisect).

keybored

14 hours ago

Happens on all these threads.

- Thread: This helps you make a useful Git history

- Counter-point: Why are you using a version control system to make a useful history? Why does the history matter? I have been developing software for eighty years and neither I nor anyone I’ve met have cared.

jeltz

12 hours ago

I use the git history all the time. For example the git history of PostgreSQL is amazing but I have also worked with proprietary software with good git history.

And one key to good git history is people splitting up big PRs in multiple commits.

froh

16 hours ago

the git history of the linux kernel documents the Linux kernel detailed Design, the reasoning how things fit together.

I've used it several times to understand why things are done the way they are.

The discussion leading to this design is encoded in the revisions of the patch series, browsable on patchworks.

I agree to your point that the git history has to be taken care of to be useful.

globular-toast

15 hours ago

My question was, of course, rhetorical. This and hunting regressions are two great reasons to keep history.

"What did John do on Tuesday morning" is not a good reason to keep history. Neither is "This change was made as part of a sprawling "fix" commit that happened on Thursday afternoon".

The important point is if you are going to keep history at all it needs to be kept in a state that is actually useful, meaning useful for the first two things, not the last two. Otherwise it's just a waste of disk space. The git history is meticulously maintained which is what makes it actually useful.

tcoff91

20 hours ago

Just use magit and easily make fixup! commits with like 3 key presses. Even if you don't use emacs keeping it around just to use magit is worth it.

Edamagit for vscode users is not as good but it does this particular workflow great.

psanford

19 hours ago

I use magit. I still use git absorb. The issue for me is not the speed at which I can run the interactive rebase, it is the amount of looking I have to do to find the correct commit to fixup onto in a big stack of commits. git absorb figures that out for me.

accelbred

18 hours ago

magit supports git-absorb out of the box if its installed; see the magit-commit-absorb command. I find it quite useful.

bananapub

15 hours ago

magit is fantastic but this isn't at all the same thing.

fixup is "stage changes, users selects past commit to fixup, commit staged changes"

absorb is "stage changes, git-absorb figures out which past commit to fixup, commit staged changes"

gorjusborg

5 hours ago

'git rebase -i' meets this need and more. Everyone using git should learn to use it eventually. With it you can squash, fixup, reword, or delete commits interactively.

syncsynchalt

an hour ago

It sounds like this will automatically find and associate with each commit that last touched the modified lines, which is nice.

There've been plenty of times I haven't done a fixup because I couldn't be arsed to check exactly which of the three ancestor commits in my branch the tweak actually belongs with. I wouldn't even consider splitting into three fixups if there were three tweaks that belonged to three different ancestors. It sounds like this tool/workflow will do all that work with a single invocation.

pragma_x

5 hours ago

Glad I'm not the only one. Squashing branches pre-merge, or post-merge solves the problems of commit purity (main/develop/prod/etc always builds at every commit) while preserving the intent of any MR/PR. Squashing also makes rebases _A LOT_ easier; every bad day you've had executing a rebase is likely solved by squashing your feature/fix branch, first.

Going back in time to fix the past just to sort your changes into the "correct" bin or vicinity of changes just seems unnecessarily complex. I am also unaware of what use-cases this supports. Are teams out there forcing commits to have a stricter scope than just "anything in the codebase"? Perhaps it's a monorepo pattern? I have no idea.

Either way, you have to `git push --force` to modify your PR/MR. May as well use the stock workflow and learn how to clean up your branch.

I also endorse `git commit --amend` for anyone that identifies as a "micro-committer" but doesn't need to preserve intermediate changes.

Edit: re-reading comments - it sounds sort of like some teams prefer to bin changes like this in order to make a PR/MR browsable by commit, rather than inspect one big broad diff. Is that the case?

burntsushi

5 hours ago

git-absorb is a complementary tool to `git rebase -i`. git-absorb will create the fixup commits (from your staging area) for you and set them up for use with `git rebase -i --autosquash`.

gorjusborg

30 minutes ago

It solves a problem that can be solved better by good habits and a solid understanding of git.

parasti

16 hours ago

TIL about `git commit --fixup` and `git rebase --autosquash`.

Interactive git rebase is by far my favorite Git tool to use, it scratches a particular itch to create perfect logically atomic commits.

That said, sometimes this kind of history editing tends to backfire spectacularly because these crafted perfect commits have actually never been compiled and tested.

mst

10 hours ago

I tend to go back through and build+test each of my newly crafted commits locally (and for preference have CI set up such that it'll go "hey, haven't seen these commits" and do it again itself).

Some people find doing so this boring and annoying. For whatever reason my brain finds it zen and satisfying.

ssijak

12 hours ago

Do people actually check commit history in detail so often that they absolutely find so much value in ultra clean commit history? I never understood that obsession with 100% clean history.

borrow

42 minutes ago

It's self-reinforcing. If commit history is reasonably clean, the barrier to doing code archeology is lower, so you reach for it more often. And if you do code archeology often, you develop a better sense of what's a clean commit history and what makes commit messages useful.

The need for code archeology depends on a project. When you writing a lot of new code it's probably less important than in a legacy thing where most changes are convoluted tweaks made for non-obvious reasons.

mst

10 hours ago

When I'm hunting a mistake (even/especially my own) finding the commit that introduced the bad line via blame and then looking at that specific diff is extremely handy. Also being able to bounce back and forth between the first broken one and the one before running tests while I work out exactly what I messed up and how.

I wouldn't say I put the effort in to get to 100% clean but something like 95% clean makes future me significantly less enraged at past me's mistakes.

syncsynchalt

an hour ago

Yes, absolutely.

If I'm doing a `git blame` or digging through the annotate gutter in GitHub I prefer to find the appropriate commit context, not a commit with title "typo" or "oops" or "trying something" or "WIP".

In the latter case you can still do the `git log` archaeology to find the actual context but I'd rather not.

Some teams will squash all merges to avoid this problem, but why not have the best of both worlds by merging relatively clean branches?

asqueella

11 hours ago

You don’t even have to do archaeology to benefit from clean commits, it helps tremendously during review too. How do you review nontrivial changes without splitting them in digestible chunks?

freedomben

12 hours ago

Yes, absolutely. Writing out a change log when you have clean commit history is a trivial task. Doing so when it's not clean, can be arduous and take hours, and then still end up being inaccurate.

I also frequently try to find when some bug or change was introduced in the get history, so that I can understand why it was done through the context. Context. When it is just thrown in some random commit that doesn't even have much of a commit message, it is utterly useless. When it is part of a commit that is atomic and targeted at one thing, it is trivial to see why it was done. Even when the comment isn't super helpful (which is common despite intentions, because predicting the future and what will be useful in the future is very difficult) it's still valuable because you can see the code in its full context and it's often clear what it's doing.

So yes, I am a big believer in clean commit history with atomic, single commits that represent the task someone was doing. For example, If it was a bug fix, then the bug fix will be in its own commit with a commit message describing the bug that is being fixed.

playingalong

12 hours ago

Do people wash their hands 15 times a day so that they absolutely fine so much value in getting rid of the bugs?

The answer is: some do, some don't.

cryptonector

6 hours ago

Yes. When doing `git bisect`, for example.

baruch

12 hours ago

I do. To learn a code base it before tremendously, also to find what happened and why things were done it is off great help. I don't need it every day but I routinely do git archeology.

PhilipRoman

9 hours ago

Doesn't have to be 100% clean, but history has tremendous value to me. Often the most important question for me when fixing a problem is "was this intentional" (aka did that guy (who left 10 years before I arrived) know what he was doing?).

As far as I remember, I've had to check history for about 2/3 of bugs I've fixed.

In a perfect world each system would be documented and tested well enough that correctness could be derived from first principles, but we do not live in a perfect world :(

chippiewill

7 hours ago

Only in codebases that have good commit history

jcon321

11 hours ago

Yea teams that have poor tracking and no PR practices.

OJFord

10 hours ago

This makes no sense to me, why would a conflict-free modifiable commit within the last 10 be the one I want to fixup any more often than roughly 1 in 10?

I fixup^ frequently, often often with conflict to resolve in the process, and I have never ever thought 'if only something would automatically choose the target commit for me', even if it was advanced AI why would I trust it to be what I wanted?

^my alias is:

    !f(){ target="$(test -n "$1" && git rev-parse "$1" || git fzsha rev-parse)"; git commit --fixup="$target" ${@:2} && EDITOR=true git rebase -i --autostash --autosquash "$target^"; }; f
`git fzsha` being another alias of mine to choose the target commit with fzf if not given. I rarely use that though, because usually I know it's HEAD~5 or whatever from doing it a second ago, or I've already been looking at the log to work out what I want.

MBlume

a day ago

I've been using this workflow with hg and it's great, happy to see a git port

toastal

14 hours ago

Darcs documentation suggests amending & fixing up WIP commits too

optymizer

a day ago

One of my favorite Sapling commands at Meta

pdpi

a day ago

Yeah, it singlehandedly turned me on to Mercurial when I was there, and has massively shaped the way I use git ever since.

cocoto

8 hours ago

Small question to anyone with this workflow. Do you (re)run CI on every affected commit? If no, what is the point of small commits if you lose any guarantees? I much prefer the honest linear non-modified history.

enw

10 hours ago

In what situations does this solve a problem?

In our projects we only enable squash merge in GitHub and the PRs can have any commits you want. The squashed commit includes link to PR, and PR has detailed summary (which wouldn’t be practical in the commit message).

titusjohnson

4 hours ago

Agreed. It's been a decade or more since I and the teams I worked with did anything other than squash-merging a feature branch into main. The PR body becomes the commit body for the squash, and we're done.

Developers can be as f'n messy as they want on their branches, no one cares because it never hits main.

taberiand

21 hours ago

This sounds very useful, I frequently reset soft and recommit in batches of related changes before PR and this sounds like it streamlines integrating updates quite nicely

adastra22

21 hours ago

The thing that I never knew I needed, but I predict within a few days I won’t be able to live without. Thank you!

thecopy

12 hours ago

If i understand this will break "changes since my last review" and disconnect PR review comments in GitHub?

jeltz

12 hours ago

No, why would it?

saagarjha

19 hours ago

This seems neat in theory but I would be perpetually concerned that it is going to pull some change I don't mean to commit and put it into something for review. How well does it do at that, anecdotally?

mook

16 hours ago

It deals with things that have been staged, so if you don't want to commit then don't stage it. I believe a dirty working tree is fine (and gets ignored).

psanford

19 hours ago

I'm a huge fan of git absorb. I love tools that do-the-right-thing so I can think less. That is what git absorb is.

tripdout

15 hours ago

How does it figure out which commit to add each change to?

andsens

15 hours ago

7 lines into the README buddy: > The command essentially looks at the lines that were modified, finds a changeset modifying those lines, and amends that changeset to include your uncommitted changes.

a1o

a day ago

Uhm, I do a lot of git rebase -i HEAD~2 where I just squash the commit on the latest or sometimes I need to reorder and move the fix commits in specific commits in Pars that multiple commits, which I then need to push force. Is this for a similar use-case? I am not familiar with fixup or how it works.

cryptonector

6 hours ago

Yes. This saves you having to `git rebase -i` or `git rebase --autosquash` to get the fixup to be "absorbed" into the commit it is meant to fix up.

hoten

19 hours ago

fixup lets you mark a change to be rolled back into a previous commit, without changing the history (yet). When you do `git rebase --autosquash` it then automatically moves those commits as `fixup` in the right places. It helps to queue up many fixups at once and only have a single rebase to do when you're done (or for reviewers to see just the changes in a review tool, not have to reassess the entire commit because you already rebased)

btw you could be doing `git commit --amend` if you just want to add to the last commit.

mckn1ght

a day ago

fixup works with rebase if you add the -a flag for autosquash. try it and you’ll see the commits already reordered for you in the interactive menu.

also you can write HEAD~2 as @^^ if you want to save a couple keystrokes!

rom1v

18 hours ago

Nit: Or @~2 (@ is HEAD, so @^^ is HEAD^^).

AlexCoventry

a day ago

You can also just press `c`, then `F`, in magit.

juped

19 hours ago

If I understand the logic it uses correctly this will nearly always attach the fixup to the right commit. It's not for me because I do this manually too fluidly but it seems like a good tool.

renewiltord

a day ago

Pretty clever impl of a tool. I'll be using it, thanks.

lr4444lr

21 hours ago

you don't want to shove them all into an opaque commit that says fixes, because you believe in atomic commits.

Sure I do. The whole branch will be squashed anyway before it's merged in, and a single "fixes" commit while still on its own branch will be easier to track in a PR for addressing everything pointed out earlier.

I mean, don't let me stop anyone from using this or --fixup if this is your flow, but this solves a problem neither I nor anyone in my last 10 years of working with code has.

imiric

21 hours ago

Every team is free to choose what works best for them, but IMO always squashing PRs is not a good strategy. Sometimes you do want to preserve the change history, particularly if the PR does more than a single atomic change, which in practice is very common. There shouldn't be a static merge type preference at all, and this should be chosen on a case-by-case basis.

At the risk of sounding judgemental, I think this preference for always squashing PRs comes from a place of either not understanding atomic commits, not caring about the benefits of them, or just choosing to be lazy. In any case, the loss of history inevitably comes at a cost of making reverting and cherry-picking changes more difficult later, making `git bisect` pretty much worthless, and losing the context of why a change was made.

boolemancer

20 hours ago

> At the risk of sounding judgemental, I think this preference for always squashing PRs comes from a place of either not understanding atomic commits, not caring about the benefits of them, or just choosing to be lazy. In any case, the loss of history inevitably comes at a cost of making reverting and cherry-picking changes more difficult later, as well as losing the context of why a change was made.

1) Why are you ever reverting/cherry-picking at a more granular level than an entire PR anyway? The PR is the thing that gets signed-off on, and the thing that goes through the CI build/tests, so why wouldn't that be the thing kept as an atomic unit?

2) I don't think I've ever cared about the context for a specific commit within a PR once the PR has been merged. What kind of information do you expect to get out of it?

Edit: How does it remove the context for a change or make `git bisect` useless? How big are your PRs that you can't get enough context from finding the PR commit to know why a particular change was made?

keybored

4 hours ago

> Why are you ever reverting/cherry-picking at a more granular level than an entire PR anyway? The PR is the thing that gets signed-off on, and the thing that goes through the CI build/tests, so why wouldn't that be the thing kept as an atomic unit?

The PR could contain three refactoring commits and one whitespace cleanup commit. One main change. The only thing that breaks anything is the main change. Because those others were no-ops as far as the code is concerned.

imiric

15 hours ago

> The PR is the thing that gets signed-off on, and the thing that goes through the CI build/tests, so why wouldn't that be the thing kept as an atomic unit?

Because it often isn't. I don't know about your experience, but in all the teams I've worked in throughout my career the discipline to keep PRs atomic is almost never maintained, and sometimes just doesn't make sense. Sometimes you start working on a change, but spot an issue that is either too trivial to go through the PR/review process, or closely related to the work you started but worthy of a separate commit. Other times large PRs are unavoidable, especially for refactorings, where you want to propose a larger change but the history of the progress is valuable.

I find conventional commits helpful when deciding what makes an atomic change. By forcing a commit to be of a single type (feature, fix, refactor, etc.) it's easier to determine what belongs together and what not. But a PR can contain different commit types with related changes, and squashing them all when merging doesn't make the PR itself atomic.

> I don't think I've ever cared about the context for a specific commit within a PR once the PR has been merged. What kind of information do you expect to get out of it?

Oh, plenty. For one, when looking at `git blame` to determine why a change was made, I hope to find this information in the commit message. This is what commit messages are for anyway. If all commits have this information, following the history of a set of changes becomes much easier. This is helpful not just during code reviews, but after the merge as well, for any new members of the team trying to understand the codebase, or even the author themself in the future.

keybored

4 hours ago

> I find conventional commits helpful when deciding what makes an atomic change.

I already know if I’m doing a fix, a refactor, a “chore” etc. Conventional commits just happen to be the ugliest way you can express those “types” in what looks like English.

imiric

3 hours ago

Yeah, well, that's just, like... your opinion, man.

But I've worked with many, many developers who don't strictly separate commits by type this way. I myself am tempted to do a fix in the same commit as a refactor many times. Conventional commits simply suggest, well, a convention for how to make this separation cleaner and more explicit, so that the intent can be communicated better within a team. I've found this helpful as a guide for making atomic changes. Whether or not you write your commit messages in a certain way is beside the point. But let me know if you come up with a prettier way to communicate this in a team.

boolemancer

5 hours ago

> Because it often isn't. I don't know about your experience, but in all the teams I've worked in throughout my career the discipline to keep PRs atomic is almost never maintained, and sometimes just doesn't make sense. Sometimes you start working on a change, but spot an issue that is either too trivial to go through the PR/review process, or closely related to the work you started but worthy of a separate commit. Other times large PRs are unavoidable, especially for refactorings, where you want to propose a larger change but the history of the progress is valuable.

In my experience at least, PRs are atomic in that they always leave main in a "good state" (where good is pretty loosely defined as 'the tests had to pass once').

Sometimes you might make a few small changes in a PR, but they still go through a review. If they're too big, you might ask someone to split it out into two PRs.

Obviously special cases exist for things like large refactoring, but those should be rare and can be handled on a case by case basis.

But regardless, even if a PR has multiple small changes, I wouldn't revert or cherry-pick just part of it. Just do the whole thing or not at all.

> Oh, plenty. For one, when looking at `git blame` to determine why a change was made, I hope to find this information in the commit message. This is what commit messages are for anyway. If all commits have this information, following the history of a set of changes becomes much easier. This is helpful not just during code reviews, but after the merge as well, for any new members of the team trying to understand the codebase, or even the author themself in the future.

Yeah but the context for `git blame` is still there when doing a squash merge, and the commit message should still be relevant and useful.

My point isn't that history isn't useful, it's that the specific individual commits that make up a PR don't provide more useful context than the combined PR commit itself does.

I don't need to know that a typo was fixed in iteration 5 of feedback in the PR that was introduced in iteration 3. It's not relevant once the PR is merged.

imiric

3 hours ago

I don't have time to reply to all your points, but regarding this:

> I don't need to know that a typo was fixed in iteration 5 of feedback in the PR that was introduced in iteration 3. It's not relevant once the PR is merged.

I agree, these commits should never exist beyond the PR. But I go back to my point about atomic commits being misunderstood. It's not about keeping the history of _all_ changes made, but about keeping history of the most relevant ones in order to make future workflows easier. A few months from now you likely won't care about a typo fix, but you will care about _why_ some change was made, which is what good commits should answer in their commit message. Deciding what "atomic" truly means is often arbitrary, but I've found that making more granular commits is much more helpful than having fewer and larger commits. The same argument applies to the scope and size of PRs as well.

imp0cat

18 hours ago

ad 1) I'd guess it depends on the size of the PR. If they're massive it kinda makes sense.

lr4444lr

7 hours ago

If only we had a software methodology that championed incremental change...

lr4444lr

8 hours ago

Since we're taking the risk of being judgmental, I'd suggest the preference for not squashing branches to preserve the developer's local commit history when merged into the main line suggests a misunderstanding between branches and commits, but if it works for you more power to you.

seadan83

19 hours ago

What are your thoughts on the "ship, show, ask" workflow? [1]

In that workflow, small stuff is simply pushed, which allows PRs to be more single focused and more atomic. Perhaps your only objection is direct pushes to master? I am really curious if that workflow otherwise addresses all of the downsides you stated while still allowing for all PRs to be uniformly rebase-squash merged.

[1] https://martinfowler.com/articles/ship-show-ask.html

imiric

15 hours ago

I wasn't aware of this particular workflow, but I've heard of similar ones before. The late Pieter Hintjens, of ZeroMQ fame, advocated for a strategy called "optimistic merging"[1], which essentially abandons the standard code review process in favor of merging changes ASAP, and fixing any issues as they arise later.

I'm not a fan of this. It allows contributors to abandon any established coding standards, while placing more burden on maintainers to fix issues later. This in practice never happens, so the quality of the codebase degrades over time. Not to mention that it allows malicious actors to merge changes much more easily.

As for "ship, show, ask" specifically, I have similar reservations. I think all changes should go through a review process, even if someone just glances at the changes. It not only gives the opportunity to leave feedback, but also serves as communication so that everyone is aware of the proposed changes. Also, making the decision of whether to choose "ship", "show" or "ask" might only work for senior and well disciplined developers. I think that in most teams you have a mixture of experience, so you'd inevitably end up in situations where a change should've been "show" or "ask", but was "ship", and viceversa. I don't think teams will ever align on a single strategy to make a correct decision, since it's based on subjective opinion to begin with. Always following the same workflow gets rid of these uncertainties.

[1]: http://hintjens.com/blog:106

seadan83

6 hours ago

After having done a few thousand CRs, I've come to believe that CR is of mixed value. Not always good, nor all good. Removing the bad CRs and doing CR when it is useful IMO helps maximizes productivity.

First, I assume a few constraints on "ship, show, ask." First constraint is that all changes go for at least one, if not multiple self reviews. Second, the ability to ship is not outright granted. Until someone has demonstrated they can ship reasonable, working code- they are not given write permissions. They must "ask."

Next, ship is not a given. Things that are not interesting are shipped. This enhances the communication part of PRs. If a PR is sent, it is important or wants extra review. That removes the noise of everything being a PR, and removes the noise from PRs where minor changes are smuggled in for the sake of efficiency. This also removes a lot of nitpicking as well. When asked to CR, we often feel we need to CR, and if our only feedback are minor things- then that is the feedback. 'Perfect' becomes the enemy of good enough. Saving time, efficiency are huge. Do CR when it is useful, and only then.

Last, post merge reviews can still be done by the maintainers to ensure things are going smoothly. Hence, post merge review is an option for both ship and show. This means the coding lifecycle is blocked in CR only when the CR is valuable.

That CR is blocking I feel is a very important drawback. Typos linger because time is finite and only those that do not appreciate that will try to fix everything. The team then gets bogged down spending too much time fixing small things. Leaving the small things is not ideal either. If the overhead to fix small things is tiny, suddenly it takes 5 minutes to do what would otherwise require another person and possibly the next day.

For example, A PR preceded by renaming a few files is powerful. That is a game changer compared to a hard to review CR, staged PRs, or simply never doing the renaming because it runs the diff.

Blanket CR can also ossify a project. Most projects I've seen are written quickly and in large chunks. If they are reviewed at all, it is a lot of 2000 line "LGTM", zero comment reviews. Then when careful CR is done (often by the "scaling" team), suddenly things that are original become gospel. The CR debates the absolute best naming, original intent, spending magnitudes more time than was originally put into the code in the first place. Meanwhile the code was done quickly, slightly worse than good enough- and now good enough is not even good enough. The CR now demands a far higher quality bar compared to the original code when it was written. Fixing a 50k line code base at 500 lines per day is a way to _not_ fix that codebase. The original bad code is therefor fossilized into the project. Time and effort are not infinite.

That turned into a bit of a rant. I wanted to demonstrate that:

(A) there are guard rails on SSA so it is not just a free for all.

(B) selectively requesting CR when it is valuable amplifies the value of CR.

(C) removing the noise of everything is a PR makes PRs stronger as communication tools.

(D) post merge review of all code can still meet the goal of reviewing everything.

I'll tie this back to git absorb now. Git absorb rebases, which arguably invalidates any CR that was done. Ship-Show-Ask allows for there to be more explicit trust where someone does their git absorb, redoes the self review and then merges. Further, the workflow helps clean up PRs to not be as many commits.

Though, I do think the discussion of workflows and CR is really fascinating. Having been essentially just the one senior, or one of two on most all projects I've worked on in the last decade, - I'm certainly (overly) eager to discuss in detail the workflows.

imiric

3 hours ago

Thanks for your perspective. It's interesting to hear a different take on this.

I do think there's no right or wrong answer with these things. Everyone will have different opinions about their ideal workflows, especially senior engineers. The important thing is to find a good balance that everyone on the team is satisfied with, but there will always be concessions, as in any relationship.

What I have found, though, is that the more uncertainties a workflow has, the higher the chances of misunderstanding, or someone not following it in the way someone else thinks they should. So with the way "ship, show, ask" is described, and what you mention here, is that there are many constraints, and decisions to be made about what constitutes an "interesting" change, or what is worthy to be shipped without a CR and what isn't. If the team is mostly senior, and there's a lot of trust involved, this workflow might work, and it might reduce the friction of blocking CRs. But there are a lot of variables involved there.

OTOH, by always requiring CRs and approvals for shipping, there are far less decisions to be made. The workflow is simple, fixed and known to everyone. Sure, sometimes CRs step into nitpicky territory, but this is usually solved by having some conventions, such as making clear what is a nitpick, what is a minor/major suggestion, and what is a blocker. This way nitpicks and minor suggestions can be safely ignored (as long as they're ackgnowledged in some way), and they're not blockers for merging.

In practice I've found that while this workflow does take considerable time and effort, in the vast majority of cases it leads to higher quality code. Even addressing nitpicks and typo fixes is an improvement in most cases. These can be easily automated by suggestions in GitHub, which can be batched and committed directly from the web UI, so it doesn't take much effort at all.

> Last, post merge reviews can still be done by the maintainers to ensure things are going smoothly.

Sure, but is this diligently done? And how well does it scale if the project has many contributors, but few maintainers? Placing the burden on maintainers to ensure certain quality standards is unrealistic. Moreover, it assumes that maintainers have the final say in these matters, instead of their changes also being worthy of discussion, which happens during a CR. I think that the responsibility for ensuring things are running smoothly should be shared by every member of the team, including external contributors.

> Most projects I've seen are written quickly and in large chunks. If they are reviewed at all, it is a lot of 2000 line "LGTM", zero comment reviews.

Those are examples of not doing CRs correctly, not a testament that CRs are not worth the time and effort. You can just as well run into this issue with the SSA workflow. As with any issue with team dynamics, this should be resolved by communicating and aligning on core values and practices the team finds important.

> The CR debates the absolute best naming, original intent, spending magnitudes more time than was originally put into the code in the first place.

Yes, this is to be expected. It takes much more effort to read and understand code than it does to write it. It takes additional time and effort to leave a written comprehensible review, and communicate all of this asynchronously. This scales linearly for every member in the team, and there's no way around it. But the question is whether deciding to invest the time and effort in a well established CR process ultimately results in higher quality software being shipped. In my experience, it does, and it's worth it.

That said, CRs are not the only or, even failproof, method of quality assurance. As mentioned in the SSA article, pair programming is another method, but it has its own drawbacks and limitations. All of these things help, and are best used in combination. They can be uncomfortable, tedious and seem pointless at times, but deciding to not do them because of this is doing a disservice to the final product.

seadan83

2 hours ago

> what constitutes an "interesting" change, or what is worthy to be shipped without a CR

FWIW, I have not seen much trouble in the ambiguity. Things like "move method" refactor, typo fix, add additional test cases, comment clarification, are just shipped. If the update makes an important clarification on a comment that the team should know about, show is interesting. If the code fixes a bug, would be mentioned in a stand up- show is good. Ask is good for a large refactor where there is room for error, when something does not seem clean, merits any type of double check or second opinion, or and especially when it is a first foray into a different part of the codebase.

> Sure, but is (post merge reviews) diligently done?

The beauty I would say is that it does not have to be diligently done. Feature, not a bug.

If the lead maintainers are super busy, it is okay. Perhaps they will have time later and can do things in bulk, or even not at all. The process is no worse and does not stop when the majority of lead maintainers are on vacation.

Interesting changes are still notified via email (thru PRs). This highlights what a person should be looking at.

I think this attitude gives more trust to the team and contributors. It is not the responsibility of just the maintainers and seniors to (pre or post) review everything. No blame game of: "how did this bug get through, and more importantly, how did it get through review??"

> And how well does it scale if the project has many contributors, but few maintainers

Good question. Offhand I would say no worse. Contributors would all be required to ask, which is no different than 'review everything.' The difference would be for maintainers, they would not be required to always have other maintainers do CR in every case. A second potential difference, contributors might be promoted more quickly to have write permission. That in turn helps scale the efforts of the maintainers.

I think this shines on the other extreme when there are very few maintainers. OSS requiring week long turn around times will have trouble onboarding contributors to be regular maintainers. In industry, trusting developers relatively quickly I think is healthy.

The mentality to be looking to give people write permission sooner rather than later IMO is good.

> Those are examples of not doing CRs correctly, not a testament that CRs are not worth the time and effort.

My apologies for not conveying the example clearly. The examples I'm thinking of usually had no review. These are older code bases written long ago before CR. Or, these are startup quality code bases where a few founders jammed out tons of code. Or, code that was put out during a crunch, almost always successive crunches, where the constant pressure to ship was everything. Or, some Greenfield project that was jammed out and then handed over for maintenance and enhancement. Just large codebase that were never great. I'm also implying there is often a different standard for originally authored code compared to changing existing code.

My point is there is a quality bar that is suddenly higher, a lot higher, than the existing code.

> But the question is whether deciding to invest the time and effort in a well established CR process ultimately results in higher quality software being shipped. In my experience, it does, and it's worth it.

I agree, though not sure if actually always worth it. Projects are rarely labelled a failure, but many are. That can manifest as simply too much time going by without enough being shipped.

Though, good CR culture is indeed very valuable. I think it is more an exception than the rule.

I think that complements SSA at the same time. A strong CR culture with review everything might be hard to distinguish from SSA. In that sense, "ask" could be thought of as "second opinion." When a team gets crunched, there is more flexibility. Or, if the team is immature w.r.t to CR, or if there is a significant lack of reviewers- SSA removes barriers.

The idea is to capture 95% of the benefit of CR with 50% the cost.

> Sure, sometimes CRs step into nitpicky territory

Pointing out typos IMO is useful. Though, I've learned to personally delete any CR feedback I write that starts with "nitpick." I no longer believe it to actually be all that helpful or productive. It is a signal that a style discussion is due. It is important I think to respect that a CR is a blocking thing. Keeping someone an extra whatever amount of time for nitpicks is missing the bigger picture of what is important IMO

----

I very much appreciate the dialog and perspective! Thank you for your considered feedback here. I largely agree with your points, particularly the ones I did not respond to.

I suffer a bit too from the lesson of things taking too long is it's own failure mode. Success is not at all guaranteed, even if quality and solid code is shipped. At the same time, solid and quality code is required to scale, and is required for most success. That is to say that projects do fail very much in spite of quality engineering.

keybored

4 hours ago

Why is the preferred goal to “rebase-squash” (redundant) merge? The implied onus here seems to be for the other party to move towards that strategy. But why this practice should be the goal does not seem to be mentioned.

Step one isn’t to find some way to accept squash-merge. Step one is to find some reason for squash-merge.

hamandcheese

20 hours ago

git bisect is still perfectly useful. It will locate the squashed commit/pr which is more than enough for me to zero in on whatever the bug is.

imiric

20 hours ago

How helpful is it if it points you to a commit with 10 different changes? Will you go back to the PR to see the context of each change, and try to guess where the issue is based on the behavior? You're right back at manual debugging at that point.

This workflow really shines when it points you to the actual atomic commit that introduced the issue. Then you're simply a `git revert` away from undoing it, without risking breaking anything else, barring functional conflicts, of course. Try doing that with a squashed commit.

burntsushi

20 hours ago

I'm with you (see my other top level comment), but

> Then you're simply a `git revert` away from undoing it, without risking breaking anything else

This needs careful qualification. On GitHub at least, it is difficult to ensure every commit passes CI. This can result in skipping during bisect for a busted commit. It doesn't happen often enough in my experience too convince me to give up a cleaner history, but it's a downside we should acknowledge.

Ideally GitHub would make testing each individual commit easier.

keybored

16 hours ago

A revert two weeks after the fact will create a new and unique tree (untested) in any case. I don’t if you’re saying that the original commit or the revert might be untested.

In either case the brand new revert could break something. Who knows, it’s a new state.

> It doesn't happen often enough in my experience too convince me to give up a cleaner history, but it's a downside we should acknowledge.

There are tools for that.

https://github.com/mhagger/git-test

burntsushi

10 hours ago

All I'm trying to do is qualify things so that the trade offs can be more honestly assessed. The bisect for finding that commit might not work as well as you hope if you need to skip over commits that don't build or whose tests fail for other reasons. Those things can happen when you aren't testing each individual commit.

I understand there are tools for testing each individual commit. You'll notice that I didn't say it's impossible to ensure every commit is tested. Needing to use random tools to do it is exactly what makes it difficult. And the tool you linked says literally nothing about using it in CI. How good is its CI integration? There's more to it than just running it. On top of all of that, there are more fundamental problems, like increasing CI times/costs dramatically for PRs split into multiple fine grained commits.

Again, anyone here can go look at my projects on GitHub. I try hard to follow atomic commits. I think it's worth doing, even with the downsides. But we shouldn't try to make things look rosier than they actually are.

recursive

19 hours ago

I've not seen code where this revert thing would work. If it's in the PR something else now depends on it.

phist_mcgee

20 hours ago

If you go all in on atomic commits, you're probably going to be rebasing a lot. IME, that's a lot of time spent trying to make a "clean" history all for the benefit of looking back infrequently at what broke.

Can't you just inspect your merged branches and bisect those instead if using the squash strat?

recursive

19 hours ago

There can be another reason. I do care about the benefits, but I don't think there really are any.

rom1v

18 hours ago

> The whole branch will be squashed anyway before it's merged in

That looks like a very wrong process to me. Why would you even want to do that?

travisjungroth

17 hours ago

That whole paragraph is the conditions under which this tool would be useful. If they’re not true for you, this isn’t for you. This is like quoting “You have a feature branch” and saying “No I don’t.”

notlinus

17 hours ago

I wish people would stop saying "atomic commits" and start saying "main/master is stable", because that's what they actually mean. Every git commit is atomic, by definition... But people want every single possible revision to be green and buildable, which is different, and has nothing to do with git. I don't think it makes sense (tags are a lot more helpful for marking what is and isn't stable), but hey.

keybored

4 hours ago

People say what they mean.

“Atomic” sometimes means conceptually: does one thing. Sometimes it means “builds”. Or “works”. And yes: they do mean every single commit builds. Not just on main.

You don’t have to put words in people’s mouth.

usr1106

15 hours ago

Yes, atomic commits is not a very descriptive term. But main/master is stable is only a necessary condition, not a sufficient one.

If you squash (or mix from the beginning) several unrelated changes into a single commit, main/master would be stable.

AFAIK atomic commits means nothing can be taken away without breaking the change and nothing needs to be added to make the change work. How to express that without 2 clauses is indeed a good question.

assiniboine

15 hours ago

Atomic commits ensure each change works independently.

usr1106

14 hours ago

That cannot be achieved in general. Later commits often depend on some earlier one.

krick

21 hours ago

Good for you, but you (and, apparently, everyone in your last 10 years of working with) would have a problem if I was the one reviewing your commits. I mean, to be fair, I often did let it slide (for political/social/practical reasons) and use autosquash, but I always actively discouraged it, so if you are a junior or a new hire with uncertain usefulness status, I'd at least talk to you about that and ask you to fix it. This is not an ok practice, it's only acceptable because too many people don't have a habit of making their own bed and much bigger kinds of technical debt get ignored because in the end it's about shipping stuff, so you just sigh and say "oh, whatever".

hamandcheese

20 hours ago

It has always been enough for me to trace a change back to its original pull request. Extra granularity might be nice sometimes but never essential. And I do a fair bit of digging through git history. Not sure what makes it "not an ok practice". Squashing works fine and lets the 80% of people who don't care spend their efforts thinking about something other than a perfect git history.

notlinus

17 hours ago

For what it's worth, you're right.

dmead

19 hours ago

This sounds great,but kind of an anti pattern in git.

I definitely want to have a "fixes" commit on my feature branch. You should do whatever you want on a feature branch so long as your trunk has a clean history.

This sounds like someone wanted to lift a feature of changesets in mercurial into git. I don't think this is safe and probably breaks a lot of people's mental model of git changelogs being an immutable data structure.

keybored

4 hours ago

> I definitely want to have a "fixes" commit on my feature branch. You should do whatever you want on a feature branch so long as your trunk has a clean history.

This tool creates `fixup!` commits. These commits aren’t rewrite commits. They are commits which later can be used to rewrite the history.

You can use this tool, not rewrite your branch, and then rewrite the branch right before merge.

> This sounds like someone wanted to lift a feature of changesets in mercurial into git. I don't think this is safe and probably breaks a lot of people's mental model of git changelogs being an immutable data structure.

Rewriting is a huge part of the Git culture (and well-supported). Meanwhile Mercurial seems to have rewriting as a sort of add-on.

saagarjha

19 hours ago

At some point the changes are going to get merged in, no? And that that point I would really like the commits to be nice.

dmead

16 hours ago

That's why you squash and make the commit message readable.

saagarjha

15 hours ago

Squash all the commits I worked so hard to make atomic?

cryptonector

6 hours ago

No, you don't want to squash everything into one commit. You want logical, clean history.

cryptonector

6 hours ago

This is no more nor less safe than `git rebase`.

rglynn

21 hours ago

As a frequent user of fixups, this feels like a solution for already broken workflows.

> Instead of manually finding commit SHAs for git commit --fixup

Assuming you are using fixups, is this actually a problem?

I could see this being a possibility if you are: A. not practicing atomic commits or B. have so many commits in your branch that this is a chore.

A. seems unlikely if you are already using fixups and B. seems like a problem worth solving properly rather than going around.

To sum up, I'm not convinced by the elevator pitch. However, I am keenly aware that the workflows of developers differ vastly across industry, company size, technology etc. I'd be interested to understand what problems this or similar tools solve?

keybored

3 hours ago

> Assuming you are using fixups, is this actually a problem?

No. They made a whole tool around a not-actually problem.

Sure you can have a 30-commit branch. Find some OSS project where someone has been doing a large project on a branch for two months because the maintainer hasn’t accepted it yet.

seadan83

18 hours ago

What solutions have you seen for problem (B)?

The open source example is hard to fix AFAIK. Everything needs to be a PR, some changes to older code bases are simply going to be either large or multi-stepped (many commits, and sending them all as stacked PRs is often not efficient enough to be effective). In industry, I think there are more solutions available. Though, overall, I am very curious how you would go about solving B.

cryptonector

6 hours ago

Tools like this are useful for users who are not power users.

adastra22

21 hours ago

How would you manage long-lived branches that are periodically rebased against the master branch (when there is a release, for example)?

DanielHB

16 hours ago

Am I the only one who doesn't like atomic commits (or stacked PRs like graphite)? When I work on large PRs I often rewrite and move things around so much that trying to keep all commits in sync is a nightmare.

I do try to split the work if it is very clearly isolated, but that usually means less than 3 PRs. I have tried graphite `gt absorb` (which might use this project?) and it still creates a mess.

What I do that I wish more people did is that I heavily comment my own PRs with information that doesn't make sense in comments (for example on line X I add a comment: moved here from Y file).

> You have fixes for the bugs, but you don't want to shove them all into an opaque commit that says fixes

I actually like this, but split each fix in its own commit and during review I answer to comments with: "fixed in commit {commit-sha}". So _often_ bugs are introduced during PR review, if the fixes are isolated it is easier to see what changes between review rounds.

alex23478

16 hours ago

I think this really boils down how your team is using Git and which code review tool you're using. (I've never used Gerrit personally, but as far as I understand it, we wouldn't have this conversation, since it aims to refine a single change by re-submitting a commit over and over again?)

For GitHub/GitLab reviews, I'm totally with you - this makes it more convenient for the reviewer to check that/how you've responded to feedback.

But now if you merge this without squashing, your Git history on the main branch contains all your revisions, which makes operations like bisecting or blame more complicated.

For me personally, the sweet spot is currently a mix of stacked-commits and the PR workflow: Use a single commit as the unit of review, and polish that commit using a PR, and use the commit descriptions within that PR to tell the story of you responding to feedback. Then, squash merge that commit.

This provides both a good review experience and a tidy history. If you find a strange bug later and can't grasp how someone could have come up with that, the PR still has all the history.

Together with tools such as git-branchless, this also makes working on stacked-PRs a breeze.

DanielHB

14 hours ago

I have used standard github and graphite reviews. I tend to prefer what I mentioned in my original post than graphite stacked PR review (which are essentially atomic commits)

Yes I also advocate for squash-before-merge, so a lot of little commits is doesn't show up in the main history.

> For me personally, the sweet spot is currently a mix of stacked-commits and the PR workflow: Use a single commit as the unit of review, and polish that commit using a PR, and use the commit descriptions within that PR to tell the story of you responding to feedback. Then, squash merge that commit.

To me time spent on commit polishing (and dealing with conflicts) is time not spent on product. Author comments on PR review, squash-before-merge, and sit-together with reviewer for big PRs to me seems a better compromise. I don't think super polished git history is worth the extra effort for most types of product, as long as I can track a change down to a PR discussion that is enough to me. From there I can track PR review commit changes individually if needed.

Like it is so uncommon for me to go digging on git history that deeply, usually all I care is "code behaving weird && line changed < 1 month ago then probably a bug introduced then"

Of course if you are working on aviation software and the like maybe the priorities are different. But I have spent way too much time dealing with rebase conflicts when trying to chop up my PRs into smaller commits. Dealing with these conflicts often introduces bugs too.

usr1106

15 hours ago

Why are people talking about stacked PRs/MRs? Shouldn't they be called queued? A stack is LIFO and a queue is FIFO.

(Of course in some special case you might want to merge a later one earlier, but I don't think that's the normal case people are talking about.)

DanielHB

15 hours ago

Why is it a "Pull Request" instead of a "Push Request"?

Someone named it that way and it stuck.

usr1106

14 hours ago

You request others/the maintainer to pull. That was the only way before the forges. I guess gitlab's merge request is more descriptive.