jonhohle
15 hours ago
I feel like something was lost along the way.
git init --bare
will give you a git repo without a working tree (just the contents typically in the .git directory). This allows you to create things like `foo.git` instead of `foo/.git`.
“origin” is also just the default name for the cloned remote. It could be called anything, and you can have as many remotes as you’d like. You can even namespace what you push back to the same remotes by changing fetch and push paths. At one company it was common to push back to `$user/$feature` to avoid polluting the root namespace with personal branches. It was also common to have `backup/$user` for pushing a backup of an entire local repo.
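A minimal sketch of that kind of namespacing, with a hypothetical user `jon`:

    git push origin my-feature:jon/my-feature                 # personal feature branch
    git push origin 'refs/heads/*:refs/heads/backup/jon/*'    # back up all local branches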
I often add a hostname namespace when I’m working from multiple hosts, and then push directly between them instead of going back to a central server.
For a small static site repo that has documents and server config, I have a remote like:
[remote "my-server"]
    url = ssh+git://…/deploy/path.git
    fetch = +refs/heads/*:refs/remotes/my-server/*
    push = +refs/heads/*:refs/remotes/my-laptop/*
So I can push from my computer directly to that server, but those branches won’t overwrite the server’s branches. It acts like a reverse `git pull`, which can be useful for firewalls and other situations where my laptop wouldn’t be routable.
webstrand
14 hours ago
git clone --mirror <remote>
is another good one to know. It also makes a bare repository that is an exact clone (including all branches, tags, notes, etc.) of a remote repo, unlike a normal clone, which is set up with local tracking branches of the remote.
It doesn't include pull requests when cloning from GitHub, though.
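A quick sketch, with a hypothetical URL:

    git clone --mirror https://example.com/project.git project.git
    cd project.git
    git remote update    # later: refresh all refs from the remote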
Cheer2171
14 hours ago
> It doesn't include pull requests when cloning from GitHub, though.
Because GitHub pull requests are a proprietary, centralized, cloud-dependent reimplementation of `git request-pull`.
How the "free software" world slid head first into a proprietary cloud-based "open source" world still boils my blood. Congrats, Microsoft loves and owns it all; isn't that what we always wanted?
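For anyone who hasn't used it, `git request-pull` just generates the text of a pull-request email; something like this (refs and URL hypothetical):

    git request-pull origin/main https://example.com/my-fork.git my-feature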
velcrovan
13 hours ago
When this kind of “sliding” happens it’s usually because the base implementation was missing functionality. Turns out CLI interfaces by themselves are (from a usability perspective) incomplete for the kind of collaboration git was designed to facilitate.
Certhas
12 hours ago
In another post's discussion, someone suggested git as an alternative to Overleaf, a Google Docs for LaTeX... I guess there are plenty of people with a blind spot for the difference between things that are technically possible and usable by experts, and UI that actually empowers much broader classes of users to wield the feature.
pastel8739
11 hours ago
If you actually use the live collaboration features of Overleaf, sure, it’s not a replacement. But lots of people use Overleaf to write LaTeX by themselves. The experience is just so much worse than developing locally and tracking changes with git.
cozzyd
10 hours ago
Is the joke that overleaf has decent git integration?
derefr
8 hours ago
> Turns out CLI interfaces by themselves are (from a usability perspective) incomplete for the kind of collaboration git was designed to facilitate.
git was designed to facilitate the collaboration scheme of the Linux Kernel Mailing List, which is, as you might guess... a mailing list.
Rather than a pull-request (which tries to repurpose git's branching infrastructure to support collaboration), the intended unit of in-the-large contribution / collaboration in git is supposed to be the patch.
The patch contribution workflow is entirely CLI-based... if you use a CLI mail client (like Linus Torvalds did at the time git was designed.)
The core "technology" of this is, on the contributor side:
1. "trailer" fields on commits (for things like `Fixes`, `Link`, `Reported-By`, etc)
2. `git format-patch`, with flags like `--cover-letter` (this is where the thing you'd think of as the "PR description" goes), `--reroll-count`, etc.
3. a codebase-specific script like Linux's `./scripts/get_maintainer.pl`, to parse out (from source-file-embedded headers) the set of people to notify explicitly about the patch — this is analogous to a PR's concept of "Assignees" + "Reviewers"
4. `git send-email`, feeding in the patch-series generated in step 2, and targeting the recipients list from step 3. (This sends out a separate email for each patch in the series, but in such a way that the messages get threaded to appear as a single conversation thread in modern email clients.)
And on the maintainer side:
5. `s ~/patches/patch-foo.mbox` (i.e. a command in a CLI email client like mutt(1), in the context of the patch-series thread, to save the thread to an .mbox file)
6. `git am -3 --scissors ~/patches/patch-foo.mbox` to split the patch-series mbox file back into individual patches, convert them back into an annotated commit-series, and build that into a topic branch for testing and merging.
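Putting steps 2, 4, and 6 together, a minimal end-to-end sketch (addresses, counts, and filenames hypothetical; assumes send-email is configured):

    # contributor: last 3 commits as a v2 patch series with a cover letter
    git format-patch -3 --cover-letter --reroll-count=2 -o outgoing/
    git send-email --to=maintainer@example.com outgoing/*.patch

    # maintainer: rebuild the saved thread into commits on a topic branch
    git checkout -b topic/foo
    git am -3 --scissors ~/patches/patch-foo.mbox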
Subsystem maintainers, meanwhile, didn't use patches to get topic branches "upstream" [= in Linus's git repo]. Linus just had the subsystem maintainers as git-remotes, and then, when nudged, fetched their integration branches, reviewed them, and merged them, with any communication about this occurring informally out-of-band. In other words, the patch flow was for low-trust collaboration, while direct fetch was for high-trust collaboration.
Interestingly, in the LKML context, `git request-pull` is simply a formalization of the high-trust collaboration workflow (specifically, the out-of-band "hey, fetch my branches and review them" nudge email). It's not used for contribution, only integration; and it doesn't really do anything you can't do with an email — its only real advantages are in keeping the history of those requests within the repo itself, and for forcing requests to be specified in terms of exact git refs to prevent any confusion.
scuff3d
7 hours ago
I'm assuming a "patch" is a group of commits. So would a "patch series" be similar to GitLab's notion of dependent MRs?
kragen
5 hours ago
You normally have one patch per commit. The patch is the diff between that commit and its parent. (I forget how git format-patch handles the case where there are two parents.)
scuff3d
4 hours ago
If that's the case, I'm assuming the commit itself is quite large then? Or maybe it would be more accurate to say it can be large if all the changes logically go together?
I'm thinking in terms of what I often see from people I work with, where a PR is normally made up of lots of small commits.
kragen
4 hours ago
The idea is that you divide a large change into a series of small commits that each make sense in isolation, so that Linus or Greg Kroah-Hartman or whoever is looking at your proposed change can understand it as quickly as possible—hopefully in order to accept it, rather than to reject it.
scuff3d
4 hours ago
Gotcha, that makes sense. Thanks, I've always been curious about how the Linux kernel works.
kragen
3 hours ago
I may not be the best source for information, not having written anything worth contributing myself.
scuff3d
2 hours ago
Well, I appreciate it nonetheless.
I think the point I always get stuck on is how small is "small" when we're talking about commits/patches. Like, if you're adding a new feature (to anything, not necessarily the Linux kernel), should the entire feature be a single commit or several smaller commits? I go back and forth on this all the time, and if you research it you're gonna see a ton of different opinions. I've seen some people argue a commit should basically only be a couple lines of code changed, and others argue it should be the entire feature.
You commonly hear Linus talk about commits/patches having very detailed descriptions attached to them. I have trouble believing people would have time for that if each commit was only a few lines, and larger features were spread out over hundreds of commits.
kragen
2 hours ago
When I'm reviewing commits, I find it useful to see refactoring, which doesn't change behavior, separated from functional changes, and for each commit to leave the tree in a working, testable state. This is also helpful for git bisect.
Often, a change to a new working state is necessarily bigger than a couple of lines, or one of the lines has to get removed later.
I don't want to have to say, "Hmm, I wonder if this will work at the end of the file?" and spend a long time figuring out that it won't, then see that the problem is fixed later in the patch series.
Other people may have other preferences.
udev4096
10 hours ago
It still blows my mind how git has lost its original ideas of decentralized development because of GitHub, and how GitHub, a for-profit, centralized, closed-source forge, became the center for lots of important open source projects. We need Radicle, Forgejo, and Gitea to catch up even more!
viraptor
3 hours ago
It didn't really lose the original ideas. It just never learned that people don't want to use it the way kernel devs want to use it. Git never provided an easy GitHub-like experience, so GitHub took over. Turns out devs in general are not into the "set up completely independent public mailing lists for projects" idea.
seunosewa
9 hours ago
Once they killed Mercurial on Bitbucket, it was over.
lisbbb
3 hours ago
I once worked at a smaller company that didn't want to shell out for GitHub, and we just hosted repos on some VM and used the ssh method. It worked. I just found it to be kind of clunky, having come from a bigger place that was doing enterprise source control management with Perforce, of all things. GitHub as a product was fairly new back then, but everyone there was trying to switch over to git for resume reasons. So then I go to this smaller place using git in the classic manner.
afavour
3 hours ago
I don’t think it’s really that surprising. git didn’t become popular because it was decentralised, it just happened to be. So it stands to reason that part doesn’t get emphasised a ton.
int_19h
an hour ago
It did become popular because it was decentralized, but the specific features that this enabled were less about not depending on a central server, and more about being able to work with the same repo locally with ease without having to be online for most operations (as was the case with Subversion etc). Git lets me have a complete local copy of the source with all the history, branches etc in it, so anything that doesn't require looking at the issues can be done offline if you did a sync recently.
The other big point was local branches. Before DVCS, the concept of a "local branch" was generally not a thing. But now you could suddenly create a branch for each separate issue and easily switch between them while isolating unrelated changes.
seba_dos1
14 hours ago
They are available as refs on the remote to pull, though; they just aren't listed, so they don't end up mirrored either.
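For example (PR number hypothetical):

    git fetch origin pull/123/head:pr-123                           # one PR
    git fetch origin '+refs/pull/*/head:refs/remotes/origin/pr/*'   # all of them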
paulddraper
11 hours ago
Having a web interface was really appreciated by users, it would seem.
jasode
10 hours ago
>Having a web interface
It's not the interface, it's the web hosting. People want a free destination server that's up 24/7 to store their repository.
If it was only the web interface, people could locally install GitLab or Gitea to get a web browser UI. (Or use whatever modern IDE or code editor to have a GUI instead of a CLI for git commands.) But doing that still doesn't solve what GitHub solves: a public server to host the files, issue tracking, etc.
Before git & GitHub, people put source code for public access on SourceForge and CodeProject. The reason was the same: a zero-cost way to share code with everybody.
pmontra
5 hours ago
It was both.
A 24/7 repository and a 24/7 web URL for the code. Those two features together let devs inspect and download code, and open and discuss issues.
The URL also let automated tools download and install packages.
Familiar UI and network effects did the rest.
JoshTriplett
5 hours ago
Exactly. The UI needs to live wherever the canonical home for the project is, at least until we have a federated solution.
I'm really looking forward to federated forges.
chipsrafferty
7 hours ago
No, actually it's the interface. Many companies would totally host it themselves, but the interface is what gives GH value.
udev4096
10 hours ago
GH is essentially unlimited storage space. There are countless scripts which make it possible to even use it as unlimited mounted storage.
FpUser
6 hours ago
And then one day orange gets pissed off at yet another country and it (the repo) is gone.
yogishbaliga
10 hours ago
It is also other features, such as GitHub workflows, releases, integration with other tools, webhooks, etc., that make it useful.
paulddraper
7 hours ago
That’s true, but GitHub was dominant prior to having most of those features.
eru
43 minutes ago
Yes, I encourage my co-workers, when pushing to a common repo, to use `$user/$whatever` exactly to have their own namespace. The main selling point I'm making is that it makes cleanup of old branches easier, and less conflict-prone.
Tangentially related: when you have multiple local checkouts, often `git worktree` is more convenient than having completely independent local repository. See https://git-scm.com/docs/git-worktree
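For example (path and branch name hypothetical):

    git worktree add ../project-hotfix hotfix    # check out branch 'hotfix' in a sibling directory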
mzajc
11 hours ago
> “origin” is also just the default name for the cloned remote. It could be called anything, and you can have as many remotes as you’d like.
One remote can also hold more than one URL! This is arguably more obscure (Eclipse's EGit doesn't even support it), but works wonders for my workflow, since I want to push to multiple mirrors at the same time.
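For example (URLs hypothetical); note that once any pushurl is set, `git push origin` goes only to the pushurls:

    git remote set-url --add --push origin git@github.com:user/project.git
    git remote set-url --add --push origin git@codeberg.org:user/project.git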
mckn1ght
10 hours ago
Whenever I fork a repo I rename origin to “fork” and then add the parent repo as a remote named “upstream” so i can pull from that, rebase any of my own changes in to, and push to fork as needed.
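Roughly (upstream URL hypothetical):

    git remote rename origin fork
    git remote add upstream https://github.com/original/project.git
    git fetch upstream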
Multiple remotes is also how you can combine multiple repos into one monorepo by just fetching and pulling from each one, maybe into different subdirectories to avoid path collisions.
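One way to do the subdirectory part is the classic subtree-merge recipe (names and URL hypothetical):

    git remote add lib https://example.com/lib.git
    git fetch lib
    git merge -s ours --no-commit --allow-unrelated-histories lib/main
    git read-tree --prefix=lib/ -u lib/main
    git commit -m "Merge lib history under lib/"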
Sharlin
14 hours ago
Git was always explicitly a decentralized, "peer to peer" version control system, as opposed to centralized ones like SVN, with nothing in the protocol itself that makes a distinction between a "server" and a "client". Using it in a centralized fashion is just a workflow that you choose to use (or, realistically, one that somebody else chose for you). Any clone of a repository can be a remote to any other clone, and you can easily have a "git server" (ie. just another directory) in your local filesystem, which is a perfectly reasonable workflow in some cases.
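For example, with a hypothetical path:

    git init --bare ~/repos/project.git      # a "server" on the local filesystem
    git remote add local ~/repos/project.git
    git push local main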
JamesLeonis
13 hours ago
I have a use case just for this. Sometimes my internet goes down while I'm working on my desktop computer. I'll put my work in a branch and push it to my laptop, then go to a coffee shop to continue my work.
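Something like this works, assuming ssh access between the machines (host and paths hypothetical):

    git remote add laptop me@laptop.local:src/project
    git push laptop my-wip-branch    # pushing a branch that isn't checked out there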
kragen
10 hours ago
When I do this I usually push to a bare repo on a USB pendrive.
chipsrafferty
7 hours ago
I just copy files on a USB drive
jonhohle
14 hours ago
This is a better summary than mine.
There was a thread not too long ago where people were conflating git with GitHub. Git is an incredible tool (after coming from SVN/CVS/p4/SourceSafe) that stands on its own, apart from hosting providers.
Sharlin
14 hours ago
And GitHub naturally has done nothing to disabuse people of the notion that git = GitHub. Meanwhile, the actual raison d'être of git (the Linux kernel) of course doesn't use GitHub, or the "pull request" based workflow that GitHub invented, which is not intrinsic to git in any way.
webstrand
14 hours ago
It's a little more complex than that. Yes git can work in a peer-to-peer fashion, but the porcelain is definitely set up for a hub-and-spoke model, given how cloning a remote repo only gives you a partial copy of the remote history.
There's other stuff too, like git submodules can't be configured to reference another branch on the local repository and then be cloned correctly, only another remote.
jonhohle
14 hours ago
> given how cloning a remote repo only gives you a partial copy of the remote history
When you clone you get the full remote history and all remote branches (by default). That’s painfully true when you have a repo with large binary blobs (and the reason git-lfs and others exist).
webstrand
6 hours ago
You're right, I got that part wrong: git actually fetches all of the remote commits (but not all of the refs; many things are missing, for instance notes).
But a clone of your clone is not going to work the same way, since remote branches are not cloned by default, either. So it'll only have partial history. This is what I was thinking about.
Sophira
14 hours ago
> given how cloning a remote repo only gives you a partial copy of the remote history
You may be thinking of the optional --depth switch, which allows you to create shallow clones that don't have the full history. If you don't include that, you'll get the full history when cloning.
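For example (URL hypothetical):

    git clone --depth 1 https://example.com/project.git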
seba_dos1
13 hours ago
You only get it actually full with the "--mirror" switch, but for most use-cases what you get without it is already "full enough".
isaacremuant
14 hours ago
I'd say git submodules have such an awkward UX that they should probably not be used except in very rare and organized cases. I've done it before, but it has to be worth it.
But I get your larger point.
seba_dos1
13 hours ago
And they're often (not always) used where subtrees would fit better.
webstrand
6 hours ago
I can't get over my fear of subtrees after accidentally nuking one of my repos by doing a rebase across the subtree commit. I've found that using worktrees, with a script in the main branch to set up the worktrees, works pretty well to split history across multiple branches, like what you might want in a monorepo.
Sadly doing a monorepo this way with pnpm doesn't work, since pnpm doesn't enforce package version requirements inside of a pnpm workspace. And it doesn't record installed version information for linked packages either.
kawsper
14 hours ago
I always thought it would have been better, and less confusing for newcomers, if GitHub had named the default remote “github”, instead of origin, in the examples.
tobylane
14 hours ago
If I clone my fork, I always add the upstream remote straight away. Origin and upstream could each be GitHub, so "github" would be ambiguous.
mckn1ght
10 hours ago
Is this something the remote can control? I figured it was on the local cloner to decide.
Can’t test it now, but I wonder if changing this affects the remote name for fresh clones: https://git-scm.com/docs/git-config#Documentation/git-config...
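For reference, that setting is purely local to the cloner (newer versions of git; URL hypothetical):

    git config --global clone.defaultRemoteName github               # affects future clones
    git clone --origin github https://github.com/user/project.git    # per-clone override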
pwdisswordfishy
10 hours ago
GitHub could not name it so, because it's not up to GitHub to choose.
seba_dos1
10 hours ago
There are places where it does choose, but arguably it makes sense for it to be consistent with what you get when using "git clone".
masklinn
14 hours ago
How is it less confusing when your fork is also on github?
matrss
13 hours ago
Requiring a fork to open pull requests as an outsider to a project is in itself an idiosyncrasy of GitHub that could be done without. Gitea and Forgejo, for example, support AGit: https://forgejo.org/docs/latest/user/agit-support/.
Nevertheless, to avoid ambiguity I usually name my personal forks on GitHub gh-<username>.
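For reference, an AGit-style push that opens a pull request without a fork looks roughly like this (branch and topic names hypothetical):

    git push origin HEAD:refs/for/main -o topic=my-feature -o title="My change"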
kragen
10 hours ago
No, it's a normal feature of Git. If I want you to pull my changes, I need to host those changes somewhere that you can access. If you and I are both just using ssh access to our separate Apache servers, for example, I am going to have to push my changes to a fork on my server before you can pull them.
And of course in Git every clone is a fork.
AGit seems to be a new alternative where apparently you can push a new branch to someone else's repository that you don't normally have access to, but that's never guaranteed to be possible, and is certainly very idiosyncratic.
matrss
4 hours ago
Arguably the OG workflow to submit your code is `git send-email`, and that also doesn't require an additional third clone on the same hosting platform as the target repository.
All those workflows are just as valid as the others, I was just pointing out that the way github does it is not the only way it can be done.
kragen
4 hours ago
Yes, that's true. Or git format-patch.
masklinn
13 hours ago
> Requiring a fork to open pull requests as an outsider to a project is in itself an idiosyncrasy of GitHub that could be done without. Gitea and Forgejo, for example, support AGit: https://forgejo.org/docs/latest/user/agit-support/.
Ah yes, I'm sure the remote being called "origin" is what confuses people when they have to push to a refspec with push options. That's so much more straightforward than a button "create pull request".
ratmice
12 hours ago
As far as I'm concerned the problem isn't that one is easier than the other. It's that in the github case it completely routes around the git client. With AGit+gitea or forgejo you can either click your "create pull request" button, or make a pull request right from the git client. One is necessarily going to require more information than the other to reach the destination...
It's like arguing that instead of having salad or fries on the menu with your entree they should only serve fries.
grimgrin
13 hours ago
agreed, you'd need a second name anyway. and probably "origin" and "upstream" is nicer than "github" and "my-fork" because.. the convention seems like it should apply to all the other git hosts too: codeberg, sourcehut, tfs, etc
ompogUe
8 hours ago
I often init a bare repo on single-use servers I'm working on.
Then, have separate prod and staging clones parallel to that.
Have a post-receive hook on the bare repo that automatically pushes updates to the staging clone for testing.
When ready, then pull the updates into prod.
Might sound strange, but for certain clients hosting situations, I've found it allows for faster iterations. ymmv
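A minimal sketch of such a hook, with hypothetical paths and branch:

    #!/bin/sh
    # hooks/post-receive in the bare repo
    unset GIT_DIR                     # hooks run with GIT_DIR pointing at the bare repo
    cd /srv/site-staging || exit 1
    git pull --ff-only origin main    # fast-forward the staging clone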