git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Derrick Stolee <derrickstolee@github.com>,
	Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org,
	Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Subject: Re: [PATCH 5/6] pack-bitmap.c: use `bitmap_index_seek()` where possible
Date: Fri, 24 Mar 2023 12:43:46 -0700	[thread overview]
Message-ID: <xmqqr0tedjf1.fsf@gitster.g> (raw)
In-Reply-To: <20230324183514.GB536252@coredump.intra.peff.net> (Jeff King's message of "Fri, 24 Mar 2023 14:35:14 -0400")

Jeff King <peff@peff.net> writes:

> On Fri, Mar 24, 2023 at 02:06:43PM -0400, Derrick Stolee wrote:
>
>> >> +	bitmap_index_seek(index, header_size, SEEK_CUR);
>> >>  	return 0;
>> >>  }
>> > 
>> > Likewise this function already has bounds checks at the top:
>> > 
>> > 	if (index->map_size < header_size + the_hash_algo->rawsz)
>> > 		return error(_("corrupted bitmap index (too small)"));
>> > 
>> > I'd be perfectly happy if we swapped that our for checking the bounds on
>> > individual reads, but the extra checking in the seek step here just
>> > seems redundant (and again, too late).
>> 
>> I think it would be nice to replace all of these custom bounds
>> checks with a check within bitmap_index_seek() and error conditions
>> done in response to an error code returned by that method. It keeps
>> the code more consistent in the potential future of changing the
>> amount to move the map_pos and the amount checked in these conditions.
>
> Yeah, that's what I was getting at. But doing it at seek time is too
> late. We'll have just read off the end of the array.

Yup.  You illustrated it nicely in your response for the previous
step of the series.  If the typical access pattern is to check, read
and then advance to the next position, and by the time you are ready
to advance to the next position, you'd better have done the checking.

> You really need an interface more like "make sure there are N bytes for
> me to read at offset X". Then you can read and advance past them.
>
> For individual items where you want to copy the bytes out anyway (like a
> be32) you can have a nice interface like:
>
>   if (read_be32(bitmap_git, &commit_idx_pos) < 0 ||
>       read_u8(bitmap_git, &xor_offset) < 0 ||
>       read_u8(bitmap_git, &flags) < 0)
> 	return error("truncated bitmap entry");

Yeah, that kind of flow reads really well.

> But given that there is only one spot that calls these, that kind of
> refactoring might not be worth it (right now it just uses the magic
> number "6" right before grabbing the data).

Yeah, it seems most of the callers with SEEK_SET are "I find the
next offset from a table and jump there in preparation for doing
something".  I suspect callers with SEEK_CUR would fit in the
read_X() pattern better?  From that angle, it smells that the two
kinds of seek functions may want to be split into two different
helpers.

Thanks.



  reply	other threads:[~2023-03-24 19:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-20 20:02 [PATCH 0/6] pack-bitmap: miscellaneous mmap read hardening Taylor Blau
2023-03-20 20:02 ` [PATCH 1/6] pack-bitmap.c: hide bitmap internals in `read_u8()` Taylor Blau
2023-03-21 17:35   ` Jeff King
2023-03-24 17:52     ` Derrick Stolee
2023-03-20 20:02 ` [PATCH 2/6] pack-bitmap.c: hide bitmap internals in `read_be32()` Taylor Blau
2023-03-20 20:02 ` [PATCH 3/6] pack-bitmap.c: drop unnecessary 'inline's Taylor Blau
2023-03-21 17:40   ` Jeff King
2023-03-20 20:02 ` [PATCH 4/6] pack-bitmap.c: factor out manual `map_pos` manipulation Taylor Blau
2023-03-21 17:56   ` Jeff King
2023-03-24 18:04     ` Derrick Stolee
2023-03-24 18:29       ` Jeff King
2023-03-24 23:23         ` Taylor Blau
2023-03-25  4:57           ` Jeff King
2023-03-24 23:13       ` Taylor Blau
2023-03-24 23:24         ` Taylor Blau
2023-03-24 23:08     ` Taylor Blau
2023-03-20 20:02 ` [PATCH 5/6] pack-bitmap.c: use `bitmap_index_seek()` where possible Taylor Blau
2023-03-21 18:05   ` Jeff King
2023-03-24 18:06     ` Derrick Stolee
2023-03-24 18:35       ` Jeff King
2023-03-24 19:43         ` Junio C Hamano [this message]
2023-03-24 20:37           ` Jeff King
2023-03-24 21:38             ` Junio C Hamano
2023-03-24 22:57               ` Taylor Blau
2023-03-20 20:02 ` [PATCH 6/6] pack-bitmap.c: factor out `bitmap_index_seek_commit()` Taylor Blau
2023-03-21 18:13   ` Jeff King
2023-03-21 18:16     ` Taylor Blau
2023-03-21 18:27       ` Jeff King
2023-03-24 18:09         ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqr0tedjf1.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=chakrabortyabhradeep79@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).