git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Justin Tobler <jltobler@gmail.com>
To: git@vger.kernel.org
Cc: ps@pks.im, gitster@pobox.com, Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v3 0/4] bulk-checkin: remove global transaction state
Date: Fri, 22 Aug 2025 16:34:56 -0500	[thread overview]
Message-ID: <20250822213500.1488064-1-jltobler@gmail.com> (raw)
In-Reply-To: <20250821232249.319427-1-jltobler@gmail.com>

Greetings,

The bulk-checkin subsystem provides an interface to write objects to the
object database in a bulk transaction. The state of an ongoing
transaction is stored across several global variables. This series aims
to remove this global transaction state in favor of storing state in in
`struct object_database`. This is done in preparation for a follow-up
change where the goal is to eventually move these transaction interfaces
into "odb.h".

Changes since V2:

- `index_blob_bulk_checkin()` is combined with
  `deflate_blob_bulk_checkin()` in patch 3 instead of 4.
- Continue to use `repo_get_object_directory()` instead of open coding.

Changes since V1:

- `index_blob_bulk_checkin()` now assumes that the caller always
  provides a setup `struct odb_transaction`. Callers are adjusted to
  ensure this.
- Functions in bulk-checkin.c now consistently access the repository
  through the provided `odb_transaction`.

Thanks,
-Justin

Justin Tobler (4):
  bulk-checkin: introduce object database transaction structure
  bulk-checkin: remove global transaction state
  bulk-checkin: require transaction for index_blob_bulk_checkin()
  bulk-checkin: use repository variable from transaction

 builtin/add.c            |   5 +-
 builtin/unpack-objects.c |   5 +-
 builtin/update-index.c   |   7 +-
 bulk-checkin.c           | 152 +++++++++++++++++++++------------------
 bulk-checkin.h           |  25 ++++---
 cache-tree.c             |   5 +-
 object-file.c            |  30 +++++---
 odb.h                    |   8 +++
 read-cache.c             |   5 +-
 9 files changed, 141 insertions(+), 101 deletions(-)

Range-diff against v2:
1:  5c9358e0b03 = 1:  5c9358e0b03 bulk-checkin: introduce object database transaction structure
2:  4a1b80a6baf = 2:  4a1b80a6baf bulk-checkin: remove global transaction state
3:  ce329932fdd ! 3:  ae5dbd0e1af bulk-checkin: require transaction for index_blob_bulk_checkin()
    @@ Commit message
     
         Update `index_blob_bulk_checkin()` to assume that a valid transaction is
         always provided. Callers are now expected to ensure a transaction is set
    -    up beforehand. The single call site in `object-file.c:index_fd()` is
    -    updated accordingly. Due to how `{begin,end}_odb_transaction()` handles
    -    nested transactions, a new transaction is only created and committed if
    -    there is not already an ongoing transaction.
    +    up beforehand. With this simplification, `deflate_blob_bulk_checkin()`
    +    is no longer needed as a standalone internal function and is combined
    +    with `index_blob_bulk_checkin()`. The single call site in
    +    `object-file.c:index_fd()` is updated accordingly. Due to how
    +    `{begin,end}_odb_transaction()` handles nested transactions, a new
    +    transaction is only created and committed if there is not already an
    +    ongoing transaction.
     
         Signed-off-by: Justin Tobler <jltobler@gmail.com>
     
      ## bulk-checkin.c ##
    -@@ bulk-checkin.c: int index_blob_bulk_checkin(struct odb_transaction *transaction,
    - 			    struct object_id *oid, int fd, size_t size,
    - 			    const char *path, unsigned flags)
    +@@ bulk-checkin.c: static void prepare_to_stream(struct bulk_checkin_packfile *state,
    + 		die_errno("unable to write pack header");
    + }
    + 
    +-static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
    +-				struct object_id *result_oid,
    +-				int fd, size_t size,
    +-				const char *path, unsigned flags)
    ++int index_blob_bulk_checkin(struct odb_transaction *transaction,
    ++			    struct object_id *result_oid, int fd, size_t size,
    ++			    const char *path, unsigned flags)
      {
    ++	struct bulk_checkin_packfile *state = &transaction->packfile;
    + 	off_t seekback, already_hashed_to;
    + 	struct git_hash_ctx ctx;
    + 	unsigned char obuf[16384];
    +@@ bulk-checkin.c: void fsync_loose_object_bulk_checkin(struct odb_transaction *transaction,
    + 	}
    + }
    + 
    +-int index_blob_bulk_checkin(struct odb_transaction *transaction,
    +-			    struct object_id *oid, int fd, size_t size,
    +-			    const char *path, unsigned flags)
    +-{
     -	int status;
     -
     -	if (transaction) {
    @@ bulk-checkin.c: int index_blob_bulk_checkin(struct odb_transaction *transaction,
     -	}
     -
     -	return status;
    -+	return deflate_blob_to_pack(&transaction->packfile, oid, fd, size, path,
    -+				    flags);
    - }
    - 
    +-}
    +-
      struct odb_transaction *begin_odb_transaction(struct object_database *odb)
    + {
    + 	if (!odb->transaction) {
     
      ## bulk-checkin.h ##
     @@ bulk-checkin.h: void fsync_loose_object_bulk_checkin(struct odb_transaction *transaction,
4:  08e26647915 ! 4:  a05af82fddb bulk-checkin: use repository variable from transaction
    @@ Commit message
         `pack_compression_level` and `pack_size_limit_cfg` globals are still
         used.
     
    +    Also adapt functions using packfile state to instead access it through
    +    the transaction. This makes some function parameters redundant and go
    +    away.
    +
         Signed-off-by: Justin Tobler <jltobler@gmail.com>
     
      ## bulk-checkin.c ##
    @@ bulk-checkin.c: static void flush_bulk_checkin_packfile(struct bulk_checkin_pack
     -			    state->written, state->nr_written,
     -			    &state->pack_idx_opts, hash);
     +	strbuf_addf(&packname, "%s/pack/pack-%s.",
    -+		    transaction->odb->sources->path,
    ++		    repo_get_object_directory(transaction->odb->repo),
     +		    hash_to_hex_algop(hash, repo->hash_algo));
     +
     +	finish_tmp_packfile(transaction, &packname, hash);
    @@ bulk-checkin.c: static void flush_batch_fsync(struct odb_transaction *transactio
      	 */
     -	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", repo_get_object_directory(the_repository));
     +	strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX",
    -+		    transaction->odb->sources->path);
    ++		    repo_get_object_directory(transaction->odb->repo));
      	temp = xmks_tempfile(temp_path.buf);
      	fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
      	delete_tempfile(&temp);
    @@ bulk-checkin.c: static int stream_blob_to_pack(struct bulk_checkin_packfile *sta
      	reset_pack_idx_option(&state->pack_idx_opts);
      
      	/* Pretend we are going to write only one object */
    -@@ bulk-checkin.c: static void prepare_to_stream(struct bulk_checkin_packfile *state,
    - 		die_errno("unable to write pack header");
    - }
    - 
    --static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
    --				struct object_id *result_oid,
    --				int fd, size_t size,
    --				const char *path, unsigned flags)
    -+int index_blob_bulk_checkin(struct odb_transaction *transaction,
    -+			    struct object_id *result_oid,
    -+			    int fd, size_t size,
    -+			    const char *path, unsigned flags)
    - {
    -+	struct bulk_checkin_packfile *state = &transaction->packfile;
    - 	off_t seekback, already_hashed_to;
    - 	struct git_hash_ctx ctx;
    - 	unsigned char obuf[16384];
    -@@ bulk-checkin.c: static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
    +@@ bulk-checkin.c: int index_blob_bulk_checkin(struct odb_transaction *transaction,
      
      	header_len = format_object_header((char *)obuf, sizeof(obuf),
      					  OBJ_BLOB, size);
    @@ bulk-checkin.c: static int deflate_blob_to_pack(struct bulk_checkin_packfile *st
      		if (idx) {
      			hashfile_checkpoint(state->f, &checkpoint);
      			idx->offset = state->offset;
    -@@ bulk-checkin.c: static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
    +@@ bulk-checkin.c: int index_blob_bulk_checkin(struct odb_transaction *transaction,
      			BUG("should not happen");
      		hashfile_truncate(state->f, &checkpoint);
      		state->offset = checkpoint.offset;
    @@ bulk-checkin.c: static int deflate_blob_to_pack(struct bulk_checkin_packfile *st
      		if (lseek(fd, seekback, SEEK_SET) == (off_t) -1)
      			return error("cannot seek back");
      	}
    -@@ bulk-checkin.c: static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
    +@@ bulk-checkin.c: int index_blob_bulk_checkin(struct odb_transaction *transaction,
      		return 0;
      
      	idx->crc32 = crc32_end(state->f);
    @@ bulk-checkin.c: void prepare_loose_object_bulk_checkin(struct odb_transaction *t
      	if (transaction->objdir)
      		tmp_objdir_replace_primary_odb(transaction->objdir, 0);
      }
    -@@ bulk-checkin.c: void fsync_loose_object_bulk_checkin(struct odb_transaction *transaction,
    - 	}
    - }
    - 
    --int index_blob_bulk_checkin(struct odb_transaction *transaction,
    --			    struct object_id *oid, int fd, size_t size,
    --			    const char *path, unsigned flags)
    --{
    --	return deflate_blob_to_pack(&transaction->packfile, oid, fd, size, path,
    --				    flags);
    --}
    --
    - struct odb_transaction *begin_odb_transaction(struct object_database *odb)
    - {
    - 	if (!odb->transaction) {
     @@ bulk-checkin.c: void flush_odb_transaction(struct odb_transaction *transaction)
      		return;
      

base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
-- 
2.51.0


  parent reply	other threads:[~2025-08-22 21:35 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-20 22:55 [PATCH 0/3] bulk-checkin: remove global transaction state Justin Tobler
2025-08-20 22:55 ` [PATCH 1/3] bulk-checkin: introduce object database transaction structure Justin Tobler
2025-08-20 22:55 ` [PATCH 2/3] bulk-checkin: remove global transaction state Justin Tobler
2025-08-20 22:55 ` [PATCH 3/3] bulk-checkin: wire repository variable Justin Tobler
2025-08-21  0:15   ` Junio C Hamano
2025-08-21 20:26     ` Justin Tobler
2025-08-21 20:32       ` Junio C Hamano
2025-08-21  0:00 ` [PATCH 0/3] bulk-checkin: remove global transaction state Junio C Hamano
2025-08-21 23:22 ` [PATCH v2 0/4] " Justin Tobler
2025-08-21 23:22   ` [PATCH v2 1/4] bulk-checkin: introduce object database transaction structure Justin Tobler
2025-08-21 23:22   ` [PATCH v2 2/4] bulk-checkin: remove global transaction state Justin Tobler
2025-08-22 16:37     ` Junio C Hamano
2025-08-22 18:07       ` Justin Tobler
2025-08-22 20:25         ` Junio C Hamano
2025-08-21 23:22   ` [PATCH v2 3/4] bulk-checkin: require transaction for index_blob_bulk_checkin() Justin Tobler
2025-08-22 16:49     ` Junio C Hamano
2025-08-22 19:13       ` Justin Tobler
2025-08-22 20:33         ` Junio C Hamano
2025-08-21 23:22   ` [PATCH v2 4/4] bulk-checkin: use repository variable from transaction Justin Tobler
2025-08-22 17:03     ` Junio C Hamano
2025-08-22 19:38       ` Justin Tobler
2025-08-22 21:34   ` Justin Tobler [this message]
2025-08-22 21:34     ` [PATCH v3 1/4] bulk-checkin: introduce object database transaction structure Justin Tobler
2025-08-22 21:34     ` [PATCH v3 2/4] bulk-checkin: remove global transaction state Justin Tobler
2025-08-22 21:34     ` [PATCH v3 3/4] bulk-checkin: require transaction for index_blob_bulk_checkin() Justin Tobler
2025-08-22 21:35     ` [PATCH v3 4/4] bulk-checkin: use repository variable from transaction Justin Tobler
2025-08-25 20:25     ` [PATCH v3 0/4] bulk-checkin: remove global transaction state Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250822213500.1488064-1-jltobler@gmail.com \
    --to=jltobler@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).