All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Justin Tobler <jltobler@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 3/6] object-file: remove flags from transaction packfile writes
Date: Tue, 31 Mar 2026 09:48:38 +0200	[thread overview]
Message-ID: <act8VlnYzyTGOY7Y@pks.im> (raw)
In-Reply-To: <20260331033835.2863514-4-jltobler@gmail.com>

On Mon, Mar 30, 2026 at 10:38:32PM -0500, Justin Tobler wrote:
> The `index_blob_packfile_transaction()` function handles streaming a
> blob from an fd to compute its object ID and conditionally writes the
> object directly to a packfile if the INDEX_WRITE_OBJECT flag is set. A
> subsequent commit will make these packfile object writes part of the
> transaction interface. Consequently, having the object write be
> conditional on this flag is a bit awkward.

One could argue that we can uplift the object into the transaction
interface. But I have to overall agree that this would be rather awkward
as we're now starting to mix concerns that really shouldn't be mixed.
Most importantly, it would be weird to have a transaction where you can
write some objects, but not all of them via such a flag.

> In preparation for this change, introduce a dedicated
> `hash_blob_stream()` helper that only computes the OID from the fd. This
> is invoked by `index_fd()` instead when the INDEX_WRITE_OBJECT is not
> set. The object write performed via `index_blob_packfile_transaction()`
> is made unconditional accordingly.

Makes sense.

> diff --git a/object-file.c b/object-file.c
> index bfbb632cf8..493173eaf4 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1405,6 +1404,34 @@ static void prepare_packfile_transaction(struct odb_transaction_files *transacti
>  		die_errno("unable to write pack header");
>  }
>  
> +static int hash_blob_stream(const struct git_hash_algo *hash_algo,
> +			    struct object_id *result_oid, int fd, size_t size)

We could of course change the interface while at it to also receive the
object type. I'll leave it to you though whether you want to go there,
we don't have any use case for it right now anyway.

> +{
> +	unsigned char buf[16384];
> +	struct git_hash_ctx ctx;
> +	unsigned header_len;
> +
> +	header_len = format_object_header((char *)buf, sizeof(buf),
> +					  OBJ_BLOB, size);
> +	hash_algo->init_fn(&ctx);
> +	git_hash_update(&ctx, buf, header_len);
> +
> +	while (size) {
> +		size_t rsize = size < sizeof(buf) ? size : sizeof(buf);
> +		ssize_t read_result = read_in_full(fd, buf, rsize);
> +
> +		if ((size_t)read_result != rsize)
> +			return -1;

It would be a bit cleaner to first check whether `read_result < 0`
before casting.

    if (read_result < 0 || (size_t) read_result != rsize)
        return -1;

Doesn't make a difference in practice though.

> +		git_hash_update(&ctx, buf, rsize);
> +		size -= read_result;
> +	}
> +
> +	git_hash_final_oid(result_oid, &ctx);
> +
> +	return 0;
> +}

Overall, this function really is simple enough to pull out, even if it
duplicates a tiny amount of logic. Also, it has the benefit that we can
easily skip deflating the data, which we used to do even if we didn't
ultimately write the data to disk, so it was just pointless busywork.

> @@ -1642,7 +1655,7 @@ int index_fd(struct index_state *istate, struct object_id *oid,
>  	     int fd, struct stat *st,
>  	     enum object_type type, const char *path, unsigned flags)
>  {
> -	int ret;
> +	int ret = 0;
>  
>  	/*
>  	 * Call xsize_t() only when needed to avoid potentially unnecessary

In practice this doesn't have to be zero-initialized

> @@ -1659,18 +1672,23 @@ int index_fd(struct index_state *istate, struct object_id *oid,
>  		ret = index_core(istate, oid, fd, xsize_t(st->st_size),
>  				 type, path, flags);
>  	} else {
> -		struct object_database *odb = the_repository->objects;
> -		struct odb_transaction_files *files_transaction;
> -		struct odb_transaction *transaction;
> -
> -		transaction = odb_transaction_begin(odb);
> -		files_transaction = container_of(odb->transaction,
> -						 struct odb_transaction_files,
> -						 base);
> -		ret = index_blob_packfile_transaction(files_transaction, oid, fd,
> -						      xsize_t(st->st_size),
> -						      path, flags);
> -		odb_transaction_commit(transaction);
> +		if (flags & INDEX_WRITE_OBJECT) {
> +			struct object_database *odb = the_repository->objects;
> +			struct odb_transaction_files *files_transaction;
> +			struct odb_transaction *transaction;
> +
> +			transaction = odb_transaction_begin(odb);
> +			files_transaction = container_of(odb->transaction,
> +							 struct odb_transaction_files,
> +							 base);
> +			ret = index_blob_packfile_transaction(files_transaction, oid, fd,
> +						      xsize_t(st->st_size), path);
> +			odb_transaction_commit(transaction);

Okay. It's a bit sad that we have to reach into the files backend here,
but we already did beforehand, and maybe a subsequent commit will fix
this? Reading on.

On another note, it's somewhat curious that the commit doesn't return an
error code. Probably something we should fix eventually.

Patrick

  reply	other threads:[~2026-03-31  7:48 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-31  3:38 [PATCH 0/6] odb: add write operation to ODB transaction interface Justin Tobler
2026-03-31  3:38 ` [PATCH 1/6] odb: split `struct odb_transaction` into separate header Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt
2026-03-31 13:56     ` Justin Tobler
2026-03-31 15:58       ` Junio C Hamano
2026-03-31 16:44         ` Justin Tobler
2026-03-31  3:38 ` [PATCH 2/6] odb/transaction: use pluggable `begin_transaction()` Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt
2026-03-31  3:38 ` [PATCH 3/6] object-file: remove flags from transaction packfile writes Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt [this message]
2026-03-31 14:10     ` Justin Tobler
2026-03-31  3:38 ` [PATCH 4/6] object-file: avoid fd seekback by checking object size upfront Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt
2026-03-31 14:14     ` Justin Tobler
2026-03-31  3:38 ` [PATCH 5/6] object-file: generalize packfile writes to use odb_write_stream Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt
2026-03-31 14:31     ` Justin Tobler
2026-03-31 22:59       ` Patrick Steinhardt
2026-03-31 23:21         ` Justin Tobler
2026-03-31 23:40           ` Patrick Steinhardt
2026-03-31  3:38 ` [PATCH 6/6] odb/transaction: make `write_object_stream()` pluggable Justin Tobler
2026-03-31  7:48   ` Patrick Steinhardt
2026-03-31 14:40     ` Justin Tobler
2026-04-01  3:03 ` [PATCH v2 0/7] odb: add write operation to ODB transaction interface Justin Tobler
2026-04-01  3:03   ` [PATCH v2 1/7] odb: split `struct odb_transaction` into separate header Justin Tobler
2026-04-01  3:03   ` [PATCH v2 2/7] odb/transaction: use pluggable `begin_transaction()` Justin Tobler
2026-04-01  3:03   ` [PATCH v2 3/7] odb: update `struct odb_write_stream` read() callback Justin Tobler
2026-04-01 11:23     ` Patrick Steinhardt
2026-04-01  3:03   ` [PATCH v2 4/7] object-file: remove flags from transaction packfile writes Justin Tobler
2026-04-01 11:23     ` Patrick Steinhardt
2026-04-01 14:02       ` Justin Tobler
2026-04-01  3:03   ` [PATCH v2 5/7] object-file: avoid fd seekback by checking object size upfront Justin Tobler
2026-04-01  3:03   ` [PATCH v2 6/7] object-file: generalize packfile writes to use odb_write_stream Justin Tobler
2026-04-01  3:03   ` [PATCH v2 7/7] odb/transaction: make `write_object_stream()` pluggable Justin Tobler
2026-04-01 11:24   ` [PATCH v2 0/7] odb: add write operation to ODB transaction interface Patrick Steinhardt
2026-04-02 21:32   ` [PATCH v3 " Justin Tobler
2026-04-02 21:32     ` [PATCH v3 1/7] odb: split `struct odb_transaction` into separate header Justin Tobler
2026-04-02 21:32     ` [PATCH v3 2/7] odb/transaction: use pluggable `begin_transaction()` Justin Tobler
2026-04-02 21:32     ` [PATCH v3 3/7] odb: update `struct odb_write_stream` read() callback Justin Tobler
2026-05-11 17:58       ` Jeff King
2026-05-12 15:19         ` Justin Tobler
2026-04-02 21:32     ` [PATCH v3 4/7] object-file: remove flags from transaction packfile writes Justin Tobler
2026-04-06 20:16       ` Jeff King
2026-04-06 20:19         ` Jeff King
2026-04-02 21:32     ` [PATCH v3 5/7] object-file: avoid fd seekback by checking object size upfront Justin Tobler
2026-04-02 21:32     ` [PATCH v3 6/7] object-file: generalize packfile writes to use odb_write_stream Justin Tobler
2026-04-02 21:32     ` [PATCH v3 7/7] odb/transaction: make `write_object_stream()` pluggable Justin Tobler
2026-04-08  7:25     ` [PATCH v3 0/7] odb: add write operation to ODB transaction interface Patrick Steinhardt
2026-05-14 18:37     ` [PATCH v4 " Justin Tobler
2026-05-14 18:37       ` [PATCH v4 1/7] odb: split `struct odb_transaction` into separate header Justin Tobler
2026-05-14 18:37       ` [PATCH v4 2/7] odb/transaction: use pluggable `begin_transaction()` Justin Tobler
2026-05-14 18:37       ` [PATCH v4 3/7] odb: update `struct odb_write_stream` read() callback Justin Tobler
2026-05-14 18:37       ` [PATCH v4 4/7] object-file: remove flags from transaction packfile writes Justin Tobler
2026-05-14 18:37       ` [PATCH v4 5/7] object-file: avoid fd seekback by checking object size upfront Justin Tobler
2026-05-14 18:37       ` [PATCH v4 6/7] object-file: generalize packfile writes to use odb_write_stream Justin Tobler
2026-05-14 18:37       ` [PATCH v4 7/7] odb/transaction: make `write_object_stream()` pluggable Justin Tobler
2026-05-15  3:56       ` [PATCH v4 0/7] odb: add write operation to ODB transaction interface Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=act8VlnYzyTGOY7Y@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=jltobler@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.