All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH] object-file: fix race in object collision check
Date: Fri, 3 Jan 2025 08:16:48 +0100	[thread overview]
Message-ID: <Z3eO2zxDOzbbFFNk@pks.im> (raw)
In-Reply-To: <20250101181952.GA1391912@coredump.intra.peff.net>

On Wed, Jan 01, 2025 at 01:19:52PM -0500, Jeff King wrote:
> On Wed, Jan 01, 2025 at 08:50:51AM -0800, Junio C Hamano wrote:
> 
> > Jeff King <peff@peff.net> writes:
> > 
> > > There is one gotcha here, though. We call this collision check only if
> > > we got EEXIST trying to move the tempfile into place. If the destination
> > > file then goes away, we can't do the collision check. But is it right to
> > > quietly return success?
> > >
> > > If the contents of the two were the same, that's fine. We don't need the
> > > extra copy.
> > >
> > > But if the contents were not the same, we'd prefer either to actually
> > > copy the contents into place, or to return an error.
> > >
> > > Of course we can't know, because the destination file has gone away. In
> > > the common case they will be the same, but the whole point of this check
> > > is to allow loosening the cryptographic collision of the packfile
> > > contents. So the safest thing would be to retain the tempfile, copying
> > > it into the destination file. That errs on the side of keeping data when
> > > we cannot make a determination.
> > >
> > > IOW, if we see ENOENT on filename_b, should we then loop back in the
> > > caller to try the link() again?
> > 
> > Yuck, I think you're absolutely right.
> 
> So I think this part, if adjusted as I suggested, would fix the race in
> the tests without making anything worse (it's just more code).

Hm, okay, I think that makes sense. Instead of being optimistic like my
first version was we should be pessimistic and assume the worst. Which I
guess is a sensible stance to take in a collision check.

> And then this...
> 
> > > I think check_collision() is used _after_ the attempt to rename() into
> > > place. So there's a race when the tempfile goes away, but I think the
> > > outcome is made a bit worse by your patch.
> > >
> > > Consider a sequence like this:
> > >
> > >   a. Process A writes tmp_pack_foo.
> > >
> > >   b. Process A tries to link tmp_pack_foo to pack-<hash> but finds it
> > >      already exists.
> > >
> > >   c. Process A opens both tmp_pack_foo and pack-<hash>.
> > >
> > >   d. Process A compares the two byte-for-byte, and then returns
> > >      success/failure based on whether they were actually identical.
> > >
> > > Now imagine there is a process B that deletes the file (maybe an
> > > over-zealous "gc --prune=now" deletes the in-use temporary file):
> > >
> > >  - if process B deletes it between steps (a) and (b), process A returns
> > >    an error (there is nothing to link). The caller knows that the data
> > >    was not stored.
> > >
> > >  - if process B deletes it between (b) and (c), then before your patch
> > >    we see an error (because we can't compare the files). After your
> > >    patch, we continue on and return success. The caller knows the data
> > >    was stored (via the original file, not our new copy).
> > >
> > >  - if process B deletes it between (c) and (d), then process A has no
> > >    idea. But at this point it does not matter. If the files were
> > >    identical, we return success (and in fact, process A deletes the file
> > >    itself). And if not identical, then we return error, and the callers
> > >    knows the data was not stored.
> > >
> > > So even though the exact behavior may depend on where we hit the race, I
> > > think ignoring an ENOENT open() error on the tempfile meaningfully
> > > changes what happens in the middle case.
> > >
> > > In practice I don't really expect this to happen, and "gc --prune=now"
> > > is inherently risky in a live repository. But I think we're probably
> > > better off to continue treating it as an error if we can't open our own
> > > tempfile.
> > 
> > So we'd ignore the racy and flaky tests, as hiding the flake by
> > ignoring the error would only hurt the real world users.
> 
> ...is all about ignoring ENOENT on the tmpfile itself. And I think we
> can just drop that hunk entirely. The tests do not care here (they are
> running simultaneous gc, but _not_ a simultaneous "--prune=now" gc).

Yeah, we don't need this hunk then.

I'll send a follow-up change.

Patrick

      reply	other threads:[~2025-01-03  7:16 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-30 10:32 [PATCH] object-file: fix race in object collision check Patrick Steinhardt
2024-12-30 14:40 ` Junio C Hamano
2024-12-30 14:50   ` Patrick Steinhardt
2024-12-30 14:54     ` Junio C Hamano
2024-12-31  1:42     ` Jeff King
2025-01-01 16:50       ` Junio C Hamano
2025-01-01 18:19         ` Jeff King
2025-01-03  7:16           ` Patrick Steinhardt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z3eO2zxDOzbbFFNk@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.