From: Taylor R Campbell <git@campbell.mumble.net>
To: git@vger.kernel.org
Subject: Synchronous replication on push
Date: Sat, 2 Nov 2024 02:06:53 +0000
Message-ID: <20241102020653.766D1609AC@jupiter.mumble.net>

Suppose I have a front end repository:
    user@frontend.example.com:/repo.git
Whenever I push anything to it, I want the push -- that is, all the
objects, and all the ref updates -- to be synchronously replicated to
another remote repository, the back end:
    git@backend.example.com:/repo.git
If this replication fails -- whether because the back end is down, or
because the front end crashed and rolled back to an earlier state, or
because the back end has been updated independently and rejects a
force push, or whatever -- I want the push to fail. But, absent these
failures, I want frontend and backend to store the same set of objects
and refs.
(Actually, I want to replicate it to a quorum of multiple back ends
with a three-phase commit protocol -- but I'll start with the
single-replica case for simplicity.)
How can I do this with git?
One option, of course, is to use a replicated file system like
glusterfs, or a replicated block store like DRBD. But that
(a) likely requires a lot more round-trips than git push/send-pack,
(b) can't be used for replication to other git hosts like GitHub, and
(c) can't be used for other remote transports like git-cinnabar.
So I'd like to do this at the git level, not at the file system or
block store level.
Here are some approaches I've tried:
1. `git clone --mirror -o backend git@backend.example.com:/repo.git'
to create the front end repository, plus the following pre-receive
hook in the front end:
    #!/bin/sh
    exec git push backend
This doesn't work because the pre-receive hook runs in the
quarantine environment, and `git push' wants to update
`refs/heads/main', which is forbidden in the quarantine
environment.
(However, git push to frontend doesn't actually fail with nonzero
exit status -- it prints an error message, `ref updates forbidden
inside quarantine environment', but exits with status 0.)
But maybe the ref update is harmless in this environment.
2. Same as (1), but the pre-receive hook is:
    #!/bin/sh
    unset GIT_QUARANTINE_PATH
    exec git push backend
This doesn't work because `git push' in the pre-receive hook
doesn't find anything it needs to push -- the ref update hasn't
happened yet.
3. Same as (1), but the pre-receive hook assembles a command line of
    exec git push backend ${new0}:${ref0} ${new1}:${ref1} ...,
with all the ref updates passed on stdin (ignoring the old values).
This fails because `--mirror can't be combined with refspecs'.
4. Same as (3), but remote.backend.mirror is explicitly disabled after
`git clone --mirror' finishes.
On push to the primary, this prints an error message
    remote: error: update_ref failed for ref 'refs/heads/main': ref updates forbidden inside quarantine environment
but somehow the push succeeds in spite of this message, and the
primary and replica both get updated.
And if I inject an error on push to the replica, by making the
replica's pre-receive hook fail with nonzero exit status, neither
primary nor replica is updated and the push fails with an error
message (`pre-receive hook declined') _and_ nonzero exit status --
as desired.
So maybe this actually works, but the error message on _successful_
pushes is unsettling!
5. Same as (1), but the pre-receive hook assembles a command line of
    exec git send-pack git@backend.example.com:/repo.git \
        ${new0}:${ref0} ${new1}:${ref1} ...
with all the ref updates passed on stdin (ignoring the old values).
This seems to work, and it propagates errors injected on push to
the replica, but it is limited to local or ssh remotes, as far as I
can tell -- it does not appear that git-send-pack works with custom
remote transports.
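
For concreteness, the refspec assembly in (3) through (5) can be
sketched roughly as below. This is only a minimal sketch: the function
name build_refspecs is illustrative, and the deletion case (an all-zero
new value turned into an empty source side) is an assumption added for
completeness, not something exercised in the experiments above.

```shell
#!/bin/sh
# Sketch only: build_refspecs is a hypothetical helper name.

# Read "<old> <new> <ref>" lines on stdin (the pre-receive input
# format) and print one refspec per line, ignoring the old values.
build_refspecs() {
    while read -r old new ref; do
        case "$new" in
            *[!0]*)
                # Create or update: push the new object to the ref.
                printf '%s:%s\n' "$new" "$ref"
                ;;
            *)
                # All-zero new value means the ref is being deleted,
                # which is spelled as a refspec with an empty source.
                printf ':%s\n' "$ref"
                ;;
        esac
    done
}

# In a pre-receive hook one might then run, assuming the remote is
# named `backend' as in (1):
#
#   set -- $(build_refspecs)
#   exec git push backend "$@"
```

The unquoted command substitution relies on ref names containing
neither whitespace nor glob characters. The same argument list could
be handed to git send-pack as in (5).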
Perhaps using mirror clones is the wrong approach here, and perhaps I
should instead explicitly create tracking branches in the primary that
are only updated if the push succeeds -- but this will still require
getting around the quarantine restrictions on git push in the
pre-receive hook.
Is there a way to achieve this (ideally, with plausible extension to a
three-phase commit protocol) that doesn't trigger unsettling nonfatal
error messages and that works with custom remote transports?