git.vger.kernel.org archive mirror
From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
	"Eric W. Biederman" <ebiederm@gmail.com>,
	Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v4 4/7] bulk-checkin: implement `SOURCE_INCORE` mode for `bulk_checkin_source`
Date: Mon, 23 Oct 2023 11:19:13 +0200
Message-ID: <ZTY6kTZT-ni16usH@tanuki>
In-Reply-To: <e427fe6ad383cc238c13f313dc9f11eab37a3840.1697736516.git.me@ttaylorr.com>

On Thu, Oct 19, 2023 at 01:28:51PM -0400, Taylor Blau wrote:
> Continue to prepare for streaming an object's contents directly from
> memory by teaching `bulk_checkin_source` how to perform reads and seeks
> based on an address in memory.
> 
> Unlike file descriptors, which manage their own offset internally, we
> have to keep track of how many bytes we've read out of the buffer, and
> make sure we don't read past the end of the buffer.
> 
> Suggested-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  bulk-checkin.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 28bc8d5ab4..60361b3e2e 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -141,11 +141,15 @@ static int already_written(struct bulk_checkin_packfile *state, struct object_id
>  }
>  
>  struct bulk_checkin_source {
> -	enum { SOURCE_FILE } type;
> +	enum { SOURCE_FILE, SOURCE_INCORE } type;
>  
>  	/* SOURCE_FILE fields */
>  	int fd;
>  
> +	/* SOURCE_INCORE fields */
> +	const void *buf;
> +	size_t read;
> +
>  	/* common fields */
>  	size_t size;
>  	const char *path;
> @@ -157,6 +161,11 @@ static off_t bulk_checkin_source_seek_to(struct bulk_checkin_source *source,
>  	switch (source->type) {
>  	case SOURCE_FILE:
>  		return lseek(source->fd, offset, SEEK_SET);
> +	case SOURCE_INCORE:
> +		if (!(0 <= offset && offset < source->size))
> +			return (off_t)-1;
> +		source->read = offset;
> +		return source->read;
>  	default:
>  		BUG("unknown bulk-checkin source: %d", source->type);
>  	}
> @@ -168,6 +177,13 @@ static ssize_t bulk_checkin_source_read(struct bulk_checkin_source *source,
>  	switch (source->type) {
>  	case SOURCE_FILE:
>  		return read_in_full(source->fd, buf, nr);
> +	case SOURCE_INCORE:
> +		assert(source->read <= source->size);

Is there any guideline around when to use `assert()` vs `BUG()`? I think
this assertion is quite critical: when it does not hold,
`source->size - source->read` underflows, `nr` is no longer clamped, and
we can end up performing out-of-bounds reads and writes. But since
`assert()` is compiled out when `NDEBUG` is defined, as is typical for
non-debug builds, this safeguard would not do anything for our end
users, right?
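
To make the concern concrete, here is a minimal sketch rather than code
from the patch. `MY_BUG()` is a hypothetical stand-in for Git's `BUG()`,
and both helper names are made up for illustration. The first helper
relies on `assert()`, the second on a check that stays in every build:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for Git's BUG(): an unconditional check that is
 * not affected by NDEBUG.
 */
#define MY_BUG(msg) do { \
	fprintf(stderr, "BUG: %s:%d: %s\n", __FILE__, __LINE__, (msg)); \
	abort(); \
} while (0)

/* Guarded only by assert(): the check disappears under -DNDEBUG. */
static size_t remaining_bytes_assert(size_t size, size_t read)
{
	assert(read <= size);
	/* If read > size and NDEBUG is set, this wraps to a huge value. */
	return size - read;
}

/* Guarded by an always-on check: aborts in every build. */
static size_t remaining_bytes_checked(size_t size, size_t read)
{
	if (read > size)
		MY_BUG("read offset is past the end of the buffer");
	return size - read;
}
```

With a valid offset both helpers return the same value. They only differ
once the invariant is broken in a build compiled with `-DNDEBUG`.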

Patrick

> +		if (nr > source->size - source->read)
> +			nr = source->size - source->read;
> +		memcpy(buf, (unsigned char *)source->buf + source->read, nr);
> +		source->read += nr;
> +		return nr;
>  	default:
>  		BUG("unknown bulk-checkin source: %d", source->type);
>  	}
> -- 
> 2.42.0.405.g86fe3250c2
> 
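
For reference, the read and seek logic quoted above can be tried in
isolation with a sketch like the following. `struct mem_source` and the
two function names are hypothetical and only mirror the `SOURCE_INCORE`
fields; the invariant check returns an error instead of using `assert()`
so that it also holds in non-debug builds, which is one possible choice
and not necessarily what the series should do:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical standalone mirror of the SOURCE_INCORE fields. */
struct mem_source {
	const void *buf;	/* start of the in-memory object */
	size_t size;		/* total number of bytes in buf */
	size_t read;		/* number of bytes consumed so far */
};

/* Reposition the cursor; offsets outside [0, size) are rejected. */
static off_t mem_source_seek_to(struct mem_source *src, off_t offset)
{
	if (offset < 0 || (uintmax_t)offset >= (uintmax_t)src->size)
		return (off_t)-1;
	src->read = (size_t)offset;
	return offset;
}

/* Copy up to nr bytes from the cursor, never past the end of buf. */
static ssize_t mem_source_read(struct mem_source *src, void *out, size_t nr)
{
	if (src->read > src->size)
		return -1;	/* invariant broken: refuse instead of underflowing */
	if (nr > src->size - src->read)
		nr = src->size - src->read;
	memcpy(out, (const unsigned char *)src->buf + src->read, nr);
	src->read += nr;
	return (ssize_t)nr;
}
```

A short read signals that the end of the buffer was reached, and a read
at the end returns 0, which is analogous to how `read_in_full()` behaves
on a file descriptor at end-of-file.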
