All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org,  Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH 21/20] t5319: make corrupted large-offset test more robust
Date: Sat, 14 Oct 2023 12:42:01 -0700	[thread overview]
Message-ID: <xmqq4jitt2ie.fsf@gitster.g> (raw)
In-Reply-To: <20231014004348.GA43880@coredump.intra.peff.net> (Jeff King's message of "Fri, 13 Oct 2023 20:43:48 -0400")

Jeff King <peff@peff.net> writes:

>   4b. But sometimes we hit a different error. If another object points
>       to X as a delta base, then trying to find the type of that object
>       requires walking the delta chain to the base entry (since only the
>       base has the concrete type; deltas themselves are either OFS_DELTA
>       or REF_DELTA).
>
>       Normally this would not require separate offset lookups at all, as
>       deltas are usually stored as OFS_DELTA, specifying the relative
>       offset to the base. But the corrupt idx created in step 1 is done
>       directly with "git pack-objects" and does not pass the
>       --delta-base-offset option, meaning we have REF_DELTA entries!
>       Those do have to consult an index to find the location of the base
>       object, and they use the pack .idx to do this. The same pack .idx
>       that we know is corrupted from step 1!

Tricky.

> The set of objects created in the test is deterministic. But the delta
> selection seems not to be (which is not too surprising, as it is
> multi-threaded).

So, the offsets of the objects are also not deterministic?

> I have seen the failure in Windows CI but haven't
> reproduced it locally (not even with --stress). Re-running a failed
> Windows CI job tends to work. But when I download and examine the trash
> directory from a failed run, it shows a different set of deltas than I
> get locally. But the exact source of non-determinism isn't that
> important; our test should be robust against any order.

Yeah, thanks for digging this tricky situation through.

>   b. The "objects64" setup could use --delta-base-offset. This would fix
>      our problem, but earlier tests have many hard-coded offsets. Using
>      OFS_DELTA would change the locations of objects in the pack (this
>      might even be OK because I think most of the offsets are within the
>      .idx file, but it seems brittle and I'm afraid to touch it).

I am not sure I follow, as it does not sound a solution to anything
if the offsets are not deterministic (and "earlier tests" that have
hard coded offsets are broken no matter what, which is not a problem
this patch introduces).  Puzzled, but not curious enough to think
about it further, as you have already rejected this approach ;-)

>   d. We could ask directly about object X, rather than enumerating all
>      of them. But that requires further hard-coding of the oid (both
>      sha1 and sha256) of object X. I'd prefer not to introduce more
>      brittleness.

Right.

>   e. We can use a --batch-check format that looks at the pack data, but
>      doesn't have to chase deltas. The problem in this case is
>      %(objecttype), which has to walk to the base. But %(objectsize)
>      does not; we can get the value directly from the delta itself.
>      Another option would be %(deltabase), where we report the REF_DELTA
>      name but don't look at its data.
>
> I've gone with option (e) here. It's kind of subtle, but it's simple and
> has no side effects.

Nice.

> @@ -1129,8 +1129,10 @@ test_expect_success 'reader bounds-checks large offset table' '
>  		git multi-pack-index --object-dir=../objects64 write &&
>  		midx=../objects64/pack/multi-pack-index &&
>  		corrupt_chunk_file $midx LOFF clear &&
> -		test_must_fail git cat-file \
> -			--batch-check --batch-all-objects 2>err &&
> +		# using only %(objectsize) is important here; see the commit
> +		# message for more details
> +		test_must_fail git cat-file --batch-all-objects \
> +			--batch-check="%(objectsize)" 2>err &&

A rather unfriendly message to readers, as it is unclear which
commit you are talking about, and a fun thing is that you cannot
say which one.

Will queue.  Thanks.


  reply	other threads:[~2023-10-14 19:42 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-09 20:55 [PATCH 0/20] bounds-checks for chunk-based files Jeff King
2023-10-09 20:58 ` [PATCH 01/20] chunk-format: note that pair_chunk() is unsafe Jeff King
2023-10-10 23:45   ` Taylor Blau
2023-10-11 22:49     ` Jeff King
2023-10-09 20:58 ` [PATCH 02/20] t: add library for munging chunk-format files Jeff King
2023-10-10 23:47   ` Taylor Blau
2023-10-09 20:59 ` [PATCH 03/20] midx: stop ignoring malformed oid fanout chunk Jeff King
2023-10-10 23:50   ` Taylor Blau
2023-10-11 22:52     ` Jeff King
2023-10-09 20:59 ` [PATCH 04/20] commit-graph: check size of " Jeff King
2023-10-11  0:08   ` Taylor Blau
2023-10-11  1:24     ` Taylor Blau
2023-10-11 23:01     ` Jeff King
2023-10-09 21:02 ` [PATCH 05/20] midx: check size of oid lookup chunk Jeff King
2023-10-09 21:04 ` [PATCH 06/20] commit-graph: check consistency of fanout table Jeff King
2023-10-11 14:45   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 07/20] midx: check size of pack names chunk Jeff King
2023-10-11 14:52   ` Taylor Blau
2023-10-11 23:06     ` Jeff King
2023-10-09 21:05 ` [PATCH 08/20] midx: enforce chunk alignment on reading Jeff King
2023-10-11 14:56   ` Taylor Blau
2023-10-11 15:01   ` Taylor Blau
2023-10-11 23:09     ` Jeff King
2023-10-09 21:05 ` [PATCH 09/20] midx: check size of object offset chunk Jeff King
2023-10-11 18:31   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 10/20] midx: bounds-check large " Jeff King
2023-10-11 18:38   ` Taylor Blau
2023-10-11 23:18     ` Jeff King
2023-10-09 21:05 ` [PATCH 11/20] midx: check size of revindex chunk Jeff King
2023-10-11 18:41   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 12/20] commit-graph: check size of commit data chunk Jeff King
2023-10-11 18:46   ` Taylor Blau
2023-10-11 23:22     ` Jeff King
2023-10-09 21:05 ` [PATCH 13/20] commit-graph: detect out-of-bounds extra-edges pointers Jeff King
2023-10-11 19:02   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 14/20] commit-graph: bounds-check base graphs chunk Jeff King
2023-10-11 19:05   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 15/20] commit-graph: check size of generations chunk Jeff King
2023-10-09 21:05 ` [PATCH 16/20] commit-graph: bounds-check generation overflow chunk Jeff King
2023-10-09 21:05 ` [PATCH 17/20] commit-graph: check bounds when accessing BDAT chunk Jeff King
2023-10-11 19:11   ` Taylor Blau
2023-10-11 23:27     ` Jeff King
2023-10-09 21:05 ` [PATCH 18/20] commit-graph: check bounds when accessing BIDX chunk Jeff King
2023-10-11 19:15   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 19/20] commit-graph: detect out-of-order BIDX offsets Jeff King
2023-10-11 19:16   ` Taylor Blau
2023-10-09 21:06 ` [PATCH 20/20] chunk-format: drop pair_chunk_unsafe() Jeff King
2023-10-11 19:19 ` [PATCH 0/20] bounds-checks for chunk-based files Taylor Blau
2023-10-11 23:31   ` Jeff King
2023-10-13 19:25 ` [PATCH 0/8] chunk-format: introduce `pair_chunk_expect()` convenience API Taylor Blau
2023-10-13 19:25   ` [PATCH 1/8] chunk-format: introduce `pair_chunk_expect()` helper Taylor Blau
2023-10-13 19:25   ` [PATCH 2/8] commit-graph: read `OIDF` chunk with `pair_chunk_expect()` Taylor Blau
2023-10-13 19:25   ` [PATCH 3/8] commit-graph: read `CDAT` " Taylor Blau
2023-10-13 19:25   ` [PATCH 4/8] commit-graph: read `GDAT` " Taylor Blau
2023-10-13 19:25   ` [PATCH 5/8] commit-graph: read `BIDX` " Taylor Blau
2023-10-13 19:49     ` Taylor Blau
2023-10-14 16:10     ` Junio C Hamano
2023-10-20 10:31       ` Jeff King
2023-10-13 19:25   ` [PATCH 6/8] midx: read `OIDF` " Taylor Blau
2023-10-13 21:04     ` Junio C Hamano
2023-10-13 19:25   ` [PATCH 7/8] midx: read `OIDL` " Taylor Blau
2023-10-13 19:25   ` [PATCH 8/8] midx: read `OOFF` " Taylor Blau
2023-10-20 10:23   ` [PATCH 0/8] chunk-format: introduce `pair_chunk_expect()` convenience API Jeff King
2023-10-14  0:43 ` [PATCH 21/20] t5319: make corrupted large-offset test more robust Jeff King
2023-10-14 19:42   ` Junio C Hamano [this message]
2023-10-15  3:17     ` Jeff King
2023-10-15 17:04       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq4jitt2ie.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.