git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] refs: fix migration of reflog entries
@ 2025-07-22 11:20 Patrick Steinhardt
  2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                   ` (12 more replies)
  0 siblings, 13 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Thanks!

Patrick

---
Patrick Steinhardt (8):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  17 +++----
 builtin/reflog.c              | 103 ++++++++++++++++++++++++++++++++++--------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  58 +++++++++++++-----------
 refs.h                        |  24 +++++++++-
 refs/files-backend.c          |  25 ++++++++--
 refs/reftable-backend.c       |  26 +++++++----
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       |  81 +++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 ++++++---
 11 files changed, 283 insertions(+), 78 deletions(-)


---
base-commit: 3f2a94875d2f41fe4758a439f68d8b73cfb19d0f
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-22 22:04   ` Junio C Hamano
  2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fec..707a9b39edb 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 2/8] builtin/reflog: improve grouping of subcommands
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-23 18:14   ` Justin Tobler
  2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

The way subcommands of git-reflog(1) are layed out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |  8 ++++----
 builtin/reflog.c              | 38 +++++++++++++++++++-------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 707a9b39edb..6ae13e772b8 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c2..b00b3f9edc9 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 3/8] refs: export `ref_transaction_update_reflog()`
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-23 18:25   ` Justin Tobler
                     ` (2 more replies)
  2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                   ` (9 subsequent siblings)
  12 siblings, 3 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that functino is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index 73913b6627b..188989e4113 100644
--- a/refs.c
+++ b/refs.c
@@ -1362,27 +1362,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3010,8 +3004,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index efa182c6a14..0faf3bc0422 100644
--- a/refs.h
+++ b/refs.h
@@ -794,6 +794,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to`ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When not used, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (2 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-23 19:00   ` Justin Tobler
                     ` (2 more replies)
  2025-07-22 11:20 ` [PATCH 5/8] ident: fix type of string length parameter Patrick Steinhardt
                   ` (8 subsequent siblings)
  12 siblings, 3 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |  1 +
 builtin/reflog.c              | 65 ++++++++++++++++++++++++++++++++++
 t/meson.build                 |  1 +
 t/t1421-reflog-write.sh       | 81 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 148 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 6ae13e772b8..798dbc0a00a 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc9..d0374295620 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "odb.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_drop_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (check_refname_format(ref, REFNAME_ALLOW_ONELEVEL))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !odb_has_object(repo->objects, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !odb_has_object(repo->objects, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index 1af289425d4..d68f5e24dbe 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -219,6 +219,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 00000000000..e284f42178f
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,81 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent old object ID' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID first 2>err &&
+		test_grep "old object .* does not exist" err
+	)
+'
+
+test_expect_success 'nonexistent new object ID' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) first 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_done

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 5/8] ident: fix type of string length parameter
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (3 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 281e830573b..0b7aacecd7d 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 6/8] refs: fix identity for migrated reflogs
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (4 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 5/8] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-23 19:41   ` Justin Tobler
                     ` (2 more replies)
  2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
                   ` (6 subsequent siblings)
  12 siblings, 3 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/refs.c b/refs.c
index 188989e4113..64544300dc3 100644
--- a/refs.c
+++ b/refs.c
@@ -2945,7 +2945,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2984,7 +2984,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -2994,13 +2994,22 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
 	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3017,6 +3026,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3115,6 +3126,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3290,6 +3303,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (5 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-23 20:31   ` Justin Tobler
  2025-07-24 10:21   ` Karthik Nayak
  2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whteher
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal is useful to determine whether we have to resolve the refname
manually to figure out the current state, or whether we should just go
with what the caller has provided.

This actually causes real issues when migrating reflogs, as we don't
know to actually use the caller-provided old object ID when writing
those entries. Instead, reflog entries simply end up with the all-zero
object ID.

Stop unsetting the flag so that we can use it as this described signal,
which we'll do in a subsequent commit. Skip checking the old object ID
for log-only updates so that we don't expect it to match the current
on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  7 +------
 refs/files-backend.c    |  9 +++++----
 refs/reftable-backend.c | 12 +++---------
 3 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index 64544300dc3..c78d5be6e20 100644
--- a/refs.c
+++ b/refs.c
@@ -1384,11 +1384,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3310,7 +3305,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
-	return (update->flags & REF_HAVE_OLD) &&
+	return (update->flags & (REF_HAVE_OLD | REF_LOG_ONLY)) == REF_HAVE_OLD &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 89ae4517a97..d519bb615fa 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3061,7 +3061,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec1..44af58ac50b 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (6 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-07-22 11:20 ` Patrick Steinhardt
  2025-07-22 22:09   ` Junio C Hamano
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-22 11:20 UTC (permalink / raw)
  To: git; +Cc: Karthik Nayak

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 5 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/refs.c b/refs.c
index c78d5be6e20..3d07ead92cb 100644
--- a/refs.c
+++ b/refs.c
@@ -1376,7 +1376,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 0faf3bc0422..8e5416178ae 100644
--- a/refs.h
+++ b/refs.h
@@ -759,13 +759,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index d519bb615fa..3ebe0323d4e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2976,6 +2976,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -2993,7 +3007,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50b..99fafd75ebe 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7df..8191b08a79c 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	cat reflogs | while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.50.1.465.gcb3da1c9e6.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type
  2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-07-22 22:04   ` Junio C Hamano
  0 siblings, 0 replies; 114+ messages in thread
From: Junio C Hamano @ 2025-07-22 22:04 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
> have introduced a new synopsis type that simplifies the rules for
> typesetting a command's synopsis. Convert the git-reflog(1)
> documentation to use it.

Good.

>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-reflog.adoc | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> index 412f06b8fec..707a9b39edb 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -8,16 +8,16 @@ git-reflog - Manage reflog information
>  
>  SYNOPSIS
>  --------
> -[verse]
> -'git reflog' [show] [<log-options>] [<ref>]
> -'git reflog list'
> -'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
> +[synopsis]
> +git reflog [show] [<log-options>] [<ref>]
> +git reflog list
> +git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
>  	[--rewrite] [--updateref] [--stale-fix]
>  	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
> -'git reflog delete' [--rewrite] [--updateref]
> +git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
> -'git reflog drop' [--all [--single-worktree] | <refs>...]
> -'git reflog exists' <ref>
> +git reflog drop [--all [--single-worktree] | <refs>...]
> +git reflog exists <ref>
>  
>  DESCRIPTION
>  -----------

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs
  2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
@ 2025-07-22 22:09   ` Junio C Hamano
  2025-07-23  4:04     ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-07-22 22:09 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> +print_all_reflog_entries () {
> +	repo=$1 &&
> +	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
> +	cat reflogs | while read reflog
> +	do
> +		echo "REFLOG: $reflog" &&
> +		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
> +		return 1
> +	done

Let's not cat a single file into a pipe.  What is on the downstream
side of such a pipe is always prepared to read from its standard
input.  I.e.

	test-tool ... >reflogs &&
	while read reflog
	do
		...
	done <reflogs


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs
  2025-07-22 22:09   ` Junio C Hamano
@ 2025-07-23  4:04     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-23  4:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak

On Tue, Jul 22, 2025 at 03:09:17PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +print_all_reflog_entries () {
> > +	repo=$1 &&
> > +	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
> > +	cat reflogs | while read reflog
> > +	do
> > +		echo "REFLOG: $reflog" &&
> > +		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
> > +		return 1
> > +	done
> 
> Let's not cat a single file into a pipe.  What is on the downstream
> side of such a pipe is always prepared to read from its standard
> input.  I.e.
> 
> 	test-tool ... >reflogs &&
> 	while read reflog
> 	do
> 		...
> 	done <reflogs

Ah, makes sense. Will queue the change locally and send it out with the
next version. Thanks!

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 2/8] builtin/reflog: improve grouping of subcommands
  2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-07-23 18:14   ` Justin Tobler
  2025-07-24  7:42     ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Justin Tobler @ 2025-07-23 18:14 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> The way subcommands of git-reflog(1) are layed out does not make any

s/layed/laid/

> immediate sense. Reorder them such that read-only subcommands precede
> writing commands for a bit more structure.
> 
> Furthermore, move the "expire" subcommand last. This prepares for a
> subsequent change where we are about to introduce a new "write" command
> to append reflog entries. Like this, the writing subcommands are ordered
> such that those affecting a single reflog come before those spanning
> across all reflogs.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-reflog.adoc |  8 ++++----
>  builtin/reflog.c              | 38 +++++++++++++++++++-------------------
>  2 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> index 707a9b39edb..6ae13e772b8 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -11,13 +11,13 @@ SYNOPSIS
>  [synopsis]
>  git reflog [show] [<log-options>] [<ref>]
>  git reflog list
> -git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
> -	[--rewrite] [--updateref] [--stale-fix]
> -	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
> +git reflog exists <ref>
>  git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
>  git reflog drop [--all [--single-worktree] | <refs>...]
> -git reflog exists <ref>
> +git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
> +	[--rewrite] [--updateref] [--stale-fix]
> +	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
>  
>  DESCRIPTION
>  -----------
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 3acaf3e32c2..b00b3f9edc9 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -17,21 +17,21 @@
>  #define BUILTIN_REFLOG_LIST_USAGE \
>  	N_("git reflog list")
>  
> -#define BUILTIN_REFLOG_EXPIRE_USAGE \
> -	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
> -	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
> -	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
> +#define BUILTIN_REFLOG_EXISTS_USAGE \
> +	N_("git reflog exists <ref>")
>  
>  #define BUILTIN_REFLOG_DELETE_USAGE \
>  	N_("git reflog delete [--rewrite] [--updateref]\n" \
>  	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
>  
> -#define BUILTIN_REFLOG_EXISTS_USAGE \
> -	N_("git reflog exists <ref>")
> -
>  #define BUILTIN_REFLOG_DROP_USAGE \
>  	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
>  
> +#define BUILTIN_REFLOG_EXPIRE_USAGE \
> +	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
> +	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
> +	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
> +
>  static const char *const reflog_show_usage[] = {
>  	BUILTIN_REFLOG_SHOW_USAGE,
>  	NULL,
> @@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
>  	NULL,
>  };
>  
> -static const char *const reflog_expire_usage[] = {
> -	BUILTIN_REFLOG_EXPIRE_USAGE,
> -	NULL
> +static const char *const reflog_exists_usage[] = {
> +	BUILTIN_REFLOG_EXISTS_USAGE,
> +	NULL,
>  };
>  
>  static const char *const reflog_delete_usage[] = {
> @@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
>  	NULL
>  };
>  
> -static const char *const reflog_exists_usage[] = {
> -	BUILTIN_REFLOG_EXISTS_USAGE,
> -	NULL,
> -};
> -
>  static const char *const reflog_drop_usage[] = {
>  	BUILTIN_REFLOG_DROP_USAGE,
>  	NULL,
>  };
>  
> +static const char *const reflog_expire_usage[] = {
> +	BUILTIN_REFLOG_EXPIRE_USAGE,
> +	NULL
> +};
> +
>  static const char *const reflog_usage[] = {
>  	BUILTIN_REFLOG_SHOW_USAGE,
>  	BUILTIN_REFLOG_LIST_USAGE,
> -	BUILTIN_REFLOG_EXPIRE_USAGE,
> +	BUILTIN_REFLOG_EXISTS_USAGE,
>  	BUILTIN_REFLOG_DELETE_USAGE,
>  	BUILTIN_REFLOG_DROP_USAGE,
> -	BUILTIN_REFLOG_EXISTS_USAGE,
> +	BUILTIN_REFLOG_EXPIRE_USAGE,
>  	NULL
>  };
>  
> @@ -404,10 +404,10 @@ int cmd_reflog(int argc,
>  	struct option options[] = {
>  		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
>  		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
> -		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
> -		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
>  		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
> +		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
>  		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
> +		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
>  		OPT_END()
>  	};

Structing the subcommands order in such a manner seems sensible, but I'm
not sure the pattern will be recognized by others that may add
subcommands in the future. Maybe we could leave a comment that mentions
the order?

-Justin

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 3/8] refs: export `ref_transaction_update_reflog()`
  2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-07-23 18:25   ` Justin Tobler
  2025-07-24  8:36   ` Karthik Nayak
  2025-07-24 12:55   ` Toon Claes
  2 siblings, 0 replies; 114+ messages in thread
From: Justin Tobler @ 2025-07-23 18:25 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> In a subsequent commit we'll add another user that wants to write reflog
> entries. This requires them to call `ref_transaction_update_reflog()`,
> but that functino is local to "refs.c".

s/functino/function/

> Export the function to prepare for the change. While at it, drop the
> `flags` field, as all callers are for now expected to use the same flags
> anyway.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 29 +++++++++++------------------
>  refs.h | 15 +++++++++++++++
>  2 files changed, 26 insertions(+), 18 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 73913b6627b..188989e4113 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1362,27 +1362,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  	return 0;
>  }
>  
> -/*
> - * Similar to`ref_transaction_update`, but this function is only for adding
> - * a reflog update. Supports providing custom committer information. The index
> - * field can be utiltized to order updates as desired. When not used, the
> - * updates default to being ordered by refname.
> - */
> -static int ref_transaction_update_reflog(struct ref_transaction *transaction,
> -					 const char *refname,
> -					 const struct object_id *new_oid,
> -					 const struct object_id *old_oid,
> -					 const char *committer_info,
> -					 unsigned int flags,
> -					 const char *msg,
> -					 uint64_t index,
> -					 struct strbuf *err)
> +int ref_transaction_update_reflog(struct ref_transaction *transaction,
> +				  const char *refname,
> +				  const struct object_id *new_oid,
> +				  const struct object_id *old_oid,
> +				  const char *committer_info,
> +				  const char *msg,
> +				  uint64_t index,
> +				  struct strbuf *err)
>  {
>  	struct ref_update *update;
> +	unsigned int flags;
>  
>  	assert(err);
>  
> -	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
> +	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
>  
>  	if (!transaction_refname_valid(refname, new_oid, flags, err))
>  		return -1;
> @@ -3010,8 +3004,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
>  
>  	ret = ref_transaction_update_reflog(data->transaction, data->refname,
>  					    new_oid, old_oid, data->sb->buf,
> -					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
> -					    data->index++, data->errbuf);
> +					    msg, data->index++, data->errbuf);

Right now this is only the single caller for
`ref_transaction_update_reflog()`. Since it is intented for all callers
to use the same set of flags, removing the field makes sense.

>  	return ret;
>  }
>  
> diff --git a/refs.h b/refs.h
> index efa182c6a14..0faf3bc0422 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -794,6 +794,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  			   unsigned int flags, const char *msg,
>  			   struct strbuf *err);
>  
> +/*
> + * Similar to`ref_transaction_update`, but this function is only for adding
> + * a reflog update. Supports providing custom committer information. The index
> + * field can be utiltized to order updates as desired. When not used, the
> + * updates default to being ordered by refname.
> + */
> +int ref_transaction_update_reflog(struct ref_transaction *transaction,
> +				  const char *refname,
> +				  const struct object_id *new_oid,
> +				  const struct object_id *old_oid,
> +				  const char *committer_info,
> +				  const char *msg,
> +				  uint64_t index,
> +				  struct strbuf *err);
> +
>  /*
>   * Add a reference creation to transaction. new_oid is the value that
>   * the reference should have after the update; it must not be
> 
> -- 
> 2.50.1.465.gcb3da1c9e6.dirty
> 
> 

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-23 19:00   ` Justin Tobler
  2025-07-24  7:42     ` Patrick Steinhardt
  2025-07-24 12:54   ` Toon Claes
  2025-07-24 16:20   ` SZEDER Gábor
  2 siblings, 1 reply; 114+ messages in thread
From: Justin Tobler @ 2025-07-23 19:00 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> While we provide a couple of subcommands in git-reflog(1) to remove
> reflog entries, we don't provide any to write new entries. Obviously
> this is not an operation that really would be needed for many use cases
> out there, or otherwise people would have complained that such a command
> does not exist yet. But the introduction of the "reftable" backend
> changes the picture a bit, as it is now basically impossible to manually
> append a reflog entry if one wanted to do so due to the binary format.
> 
> Plug this gap by introducing a simple "write" subcommand. For now, all
> this command does is to append a single new reflog entry with the given
> object IDs and message to the reflog. More specifically, it is not yet
> possible to:
> 
>   - Write multiple reflog entries at once.
> 
>   - Insert reflog entries at arbitrary indices.
> 
>   - Specify the date of the reflog entry.
> 
>   - Insert reflog entries that refer to nonexistent objects.
> 
> If required, those features can be added at a future point in time. For
> now though, the new command aims to fulfill the most basic use cases
> while being as strict as possible when it comes to verifying parameters.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-reflog.adoc |  1 +
>  builtin/reflog.c              | 65 ++++++++++++++++++++++++++++++++++
>  t/meson.build                 |  1 +
>  t/t1421-reflog-write.sh       | 81 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 148 insertions(+)
> 
> diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> index 6ae13e772b8..798dbc0a00a 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -12,6 +12,7 @@ SYNOPSIS
>  git reflog [show] [<log-options>] [<ref>]
>  git reflog list
>  git reflog exists <ref>
> +git reflog write <ref> <old-oid> <new-oid> <message>

The other subcommands each have an entry in the description. Do we want
to also add something for the "write" subcommand?

Also, if we want to be consistent, I noticed the order of the
subcommands listed in the description was not changed either. 

>  git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
>  git reflog drop [--all [--single-worktree] | <refs>...]
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index b00b3f9edc9..d0374295620 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -3,6 +3,8 @@
>  #include "builtin.h"
>  #include "config.h"
>  #include "gettext.h"
> +#include "hex.h"
> +#include "odb.h"
>  #include "revision.h"
>  #include "reachable.h"
>  #include "wildmatch.h"
> @@ -20,6 +22,9 @@
>  #define BUILTIN_REFLOG_EXISTS_USAGE \
>  	N_("git reflog exists <ref>")
>  
> +#define BUILTIN_REFLOG_WRITE_USAGE \
> +	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
> +
>  #define BUILTIN_REFLOG_DELETE_USAGE \
>  	N_("git reflog delete [--rewrite] [--updateref]\n" \
>  	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
> @@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
>  	NULL,
>  };
>  
> +static const char *const reflog_write_usage[] = {
> +	BUILTIN_REFLOG_WRITE_USAGE,
> +	NULL,
> +};
> +
>  static const char *const reflog_delete_usage[] = {
>  	BUILTIN_REFLOG_DELETE_USAGE,
>  	NULL
> @@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
>  	BUILTIN_REFLOG_SHOW_USAGE,
>  	BUILTIN_REFLOG_LIST_USAGE,
>  	BUILTIN_REFLOG_EXISTS_USAGE,
> +	BUILTIN_REFLOG_WRITE_USAGE,
>  	BUILTIN_REFLOG_DELETE_USAGE,
>  	BUILTIN_REFLOG_DROP_USAGE,
>  	BUILTIN_REFLOG_EXPIRE_USAGE,
> @@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
>  	return ret;
>  }
>  
> +static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
> +			    struct repository *repo)
> +{
> +	const struct option options[] = {
> +		OPT_END()
> +	};
> +	struct object_id old_oid, new_oid;
> +	struct strbuf err = STRBUF_INIT;
> +	struct ref_transaction *tx;
> +	const char *ref, *message;
> +	int ret;
> +
> +	argc = parse_options(argc, argv, prefix, options, reflog_drop_usage, 0);
> +	if (argc != 4)
> +		usage_with_options(reflog_write_usage, options);
> +
> +	ref = argv[0];
> +	if (check_refname_format(ref, REFNAME_ALLOW_ONELEVEL))
> +		die(_("invalid reference name: %s"), ref);
> +
> +	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
> +	if (ret)
> +		die(_("invalid old object ID: '%s'"), argv[1]);
> +	if (!is_null_oid(&old_oid) && !odb_has_object(repo->objects, &old_oid, 0))
> +		die(_("old object '%s' does not exist"), argv[1]);
> +
> +	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
> +	if (ret)
> +		die(_("invalid new object ID: '%s'"), argv[2]);
> +	if (!is_null_oid(&new_oid) && !odb_has_object(repo->objects, &new_oid, 0))
> +		die(_("new object '%s' does not exist"), argv[2]);

Ok so we validate the reference name and the old/new obects names to
make sure they are sane.

> +
> +	message = argv[3];
> +
> +	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
> +	if (!tx)
> +		die(_("cannot start transaction: %s"), err.buf);
> +
> +	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
> +					    git_committer_info(0),
> +					    message, 0, &err);
> +	if (ret)
> +		die(_("cannot queue reflog update: %s"), err.buf);
> +
> +	ret = ref_transaction_commit(tx, &err);
> +	if (ret)
> +		die(_("cannot commit reflog update: %s"), err.buf);

And here we write the reflog entry. Looks good

> +
> +	ref_transaction_free(tx);
> +	strbuf_release(&err);
> +	return 0;
> +}
> +
>  /*
>   * main "reflog"
>   */
> @@ -405,6 +469,7 @@ int cmd_reflog(int argc,
>  		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
>  		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
>  		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
> +		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
>  		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
>  		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
>  		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
> diff --git a/t/meson.build b/t/meson.build
> index 1af289425d4..d68f5e24dbe 100644
> --- a/t/meson.build
> +++ b/t/meson.build
> @@ -219,6 +219,7 @@ integration_tests = [
>    't1418-reflog-exists.sh',
>    't1419-exclude-refs.sh',
>    't1420-lost-found.sh',
> +  't1421-reflog-write.sh',
>    't1430-bad-ref-name.sh',
>    't1450-fsck.sh',
>    't1451-fsck-buffer.sh',
> diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
> new file mode 100755
> index 00000000000..e284f42178f
> --- /dev/null
> +++ b/t/t1421-reflog-write.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='Manually write reflog entries'
> +
> +. ./test-lib.sh
> +
> +SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
> +
> +test_reflog_matches () {
> +	repo="$1" &&
> +	refname="$2" &&
> +	cat >actual &&
> +	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
> +	test_cmp expected actual
> +}
> +
> +test_expect_success 'invalid number of arguments' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
> +		do
> +			test_must_fail git reflog write $args 2>err &&
> +			test_grep "usage: git reflog write" err || return 1
> +		done
> +	)
> +'
> +
> +test_expect_success 'invalid refname' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
> +		test_grep "invalid reference name: " err
> +	)
> +'
> +
> +test_expect_success 'nonexistent old object ID' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID first 2>err &&
> +		test_grep "old object .* does not exist" err
> +	)
> +'
> +
> +test_expect_success 'nonexistent new object ID' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) first 2>err &&
> +		test_grep "new object .* does not exist" err
> +	)
> +'
> +
> +test_expect_success 'simple writes' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit initial &&
> +		COMMIT_OID=$(git rev-parse HEAD) &&
> +
> +		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
> +		test_reflog_matches . refs/heads/something <<-EOF &&
> +		$ZERO_OID $COMMIT_OID $SIGNATURE	first
> +		EOF
> +
> +		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
> +		test_reflog_matches . refs/heads/something <<-EOF
> +		$ZERO_OID $COMMIT_OID $SIGNATURE	first
> +		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
> +		EOF
> +	)
> +'
> +
> +test_done
> 
> -- 
> 2.50.1.465.gcb3da1c9e6.dirty
> 
> 

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 6/8] refs: fix identity for migrated reflogs
  2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-07-23 19:41   ` Justin Tobler
  2025-07-24  7:42     ` Patrick Steinhardt
  2025-07-24  9:41   ` Karthik Nayak
  2025-07-24 12:56   ` Toon Claes
  2 siblings, 1 reply; 114+ messages in thread
From: Justin Tobler @ 2025-07-23 19:41 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> When migrating reflog entries between different storage formats we must
> reconstruct the identity of reflog entries. This is done by passing the
> committer passed to the `migrate_one_reflog_entry()` callback function
> to `fmt_ident()`.
> 
> This results in an invalid identity though: `fmt_ident()` expects the
> caller to provide both name and mail of the author, but we pass the full
> identity as mail. This leads to an identity like:
> 
>     pks <Patrick Steinhardt ps@pks.im>
> 
> Fix the bug by splitting the identity line first. This allows us to
> extract both the name and mail so that we can pass them to `fmt_ident()`
> separately.

Ok so IIUC, the bug is the result of passing the full committer info to
the mail field in `fmt_ident()` and leaving the name field unset. To
properly address we need to first deconstruct the committer info into
separate name and mail components and pass them separately to
`fmt_ident()`.

> This commit does not yet add any tests as there is another bug in the
> reflog migration that will be fixed in a subsequent commit. Once that
> bug is fixed we'll make the reflog verification in t1450 stricter, and
> that will catch both this bug here and the other bug.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index 188989e4113..64544300dc3 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2945,7 +2945,7 @@ struct migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf sb;
> +	struct strbuf sb, name, mail;
>  };
>  
>  static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
> @@ -2984,7 +2984,7 @@ struct reflog_migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf *sb;
> +	struct strbuf *sb, *name, *mail;
>  };
>  
>  static int migrate_one_reflog_entry(struct object_id *old_oid,
> @@ -2994,13 +2994,22 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
>  				    const char *msg, void *cb_data)
>  {
>  	struct reflog_migration_data *data = cb_data;
> +	struct ident_split ident;
>  	const char *date;
>  	int ret;
>  
> +	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
> +		return -1;

Ok now we first deconstruct the committer info.

> +
> +	strbuf_reset(data->name);
> +	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
> +	strbuf_reset(data->mail);
> +	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);

The name and mail components get stored separately.

> +
>  	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
>  	strbuf_reset(data->sb);
>  	/* committer contains name and email */
> -	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
> +	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));

`fmt_ident()` now receives the expected information. Looks good

>  
>  	ret = ref_transaction_update_reflog(data->transaction, data->refname,
>  					    new_oid, old_oid, data->sb->buf,
> @@ -3017,6 +3026,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
>  		.transaction = migration_data->transaction,
>  		.errbuf = migration_data->errbuf,
>  		.sb = &migration_data->sb,
> +		.name = &migration_data->name,
> +		.mail = &migration_data->mail,

I was a bit confused at first why we cared to assign the name and mail
fields here as it didn't look like we actually use them, but it looks
like we do this to release the the underlying strbuf as we don't free it
from `reflog_migration_data`.

>  	};
>  
>  	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
> @@ -3115,6 +3126,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
>  	struct strbuf new_gitdir = STRBUF_INIT;
>  	struct migration_data data = {
>  		.sb = STRBUF_INIT,
> +		.name = STRBUF_INIT,
> +		.mail = STRBUF_INIT,
>  	};
>  	int did_migrate_refs = 0;
>  	int ret;
> @@ -3290,6 +3303,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
>  	ref_transaction_free(transaction);
>  	strbuf_release(&new_gitdir);
>  	strbuf_release(&data.sb);
> +	strbuf_release(&data.name);
> +	strbuf_release(&data.mail);
>  	return ret;
>  }
>  
> 
> -- 
> 2.50.1.465.gcb3da1c9e6.dirty
> 
> 

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-07-23 20:31   ` Justin Tobler
  2025-07-24  7:42     ` Patrick Steinhardt
  2025-07-24 10:21   ` Karthik Nayak
  1 sibling, 1 reply; 114+ messages in thread
From: Justin Tobler @ 2025-07-23 20:31 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Karthik Nayak

On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
> object ID set. If so, the value of that field is used to verify whteher

s/whteher/whether/

> the current state of the reference matches this expected state. It is
> thus an important part of mitigating races with a concurrent process
> that updates the same set of references.
> 
> When writing reflogs though we explicitly unset that flag. This is a
> sensible thing to do: the old state of reflog entry updates may not
> necessarily match the current on-disk state of its accompanying ref, but
> it's only intended to signal what old object ID we want to write into
> the new reflog entry. For example when migrating refs we end up writing
> many reflog entries for a single reference, and most likely those reflog
> entries will have many different old object IDs.
> 
> But unsetting this flag also removes a useful signal, namely that the
> caller _did_ provide an old object ID for a given reflog entry. This
> signal is useful to determine whether we have to resolve the refname
> manually to figure out the current state, or whether we should just go
> with what the caller has provided.
> 
> This actually causes real issues when migrating reflogs, as we don't
> know to actually use the caller-provided old object ID when writing
> those entries. Instead, reflog entries simply end up with the all-zero
> object ID.

Ok, if I'm understanding this correctly, the `REF_HAVE_OLD` flag is also
required to actually record a provided old OID in the reflog entry. If it
is not set, a NUL OID is recorded instead.

> Stop unsetting the flag so that we can use it as this described signal,
> which we'll do in a subsequent commit. Skip checking the old object ID
> for log-only updates so that we don't expect it to match the current
> on-disk state.

Just to clarify, when migrating reflogs, are these operations always
marked with `REF_LOG_ONLY`? The comment for that flag states:

  Used as a flag in ref_update::flags when we want to log a ref
  update but not actually perform it.  This is used when a symbolic ref
  update is split up.                                           

I might be misunderstanding this though.

-Justin

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 2/8] builtin/reflog: improve grouping of subcommands
  2025-07-23 18:14   ` Justin Tobler
@ 2025-07-24  7:42     ` Patrick Steinhardt
  2025-07-24 16:45       ` Junio C Hamano
  0 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-24  7:42 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak

On Wed, Jul 23, 2025 at 01:14:19PM -0500, Justin Tobler wrote:
> > @@ -404,10 +404,10 @@ int cmd_reflog(int argc,
> >  	struct option options[] = {
> >  		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
> >  		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
> > -		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
> > -		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
> >  		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
> > +		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
> >  		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
> > +		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
> >  		OPT_END()
> >  	};
> 
> Structing the subcommands order in such a manner seems sensible, but I'm
> not sure the pattern will be recognized by others that may add
> subcommands in the future. Maybe we could leave a comment that mentions
> the order?

Hm, dunno. I feel like it's subjective where to add a command anyway, so
I'm not sure that a comment would be allt hat helpful.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-23 19:00   ` Justin Tobler
@ 2025-07-24  7:42     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-24  7:42 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak

On Wed, Jul 23, 2025 at 02:00:10PM -0500, Justin Tobler wrote:
> On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> > diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> > index 6ae13e772b8..798dbc0a00a 100644
> > --- a/Documentation/git-reflog.adoc
> > +++ b/Documentation/git-reflog.adoc
> > @@ -12,6 +12,7 @@ SYNOPSIS
> >  git reflog [show] [<log-options>] [<ref>]
> >  git reflog list
> >  git reflog exists <ref>
> > +git reflog write <ref> <old-oid> <new-oid> <message>
> 
> The other subcommands each have an entry in the description. Do we want
> to also add something for the "write" subcommand?

Yeah, let's.

> Also, if we want to be consistent, I noticed the order of the
> subcommands listed in the description was not changed either. 

True, I'll fix that.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 6/8] refs: fix identity for migrated reflogs
  2025-07-23 19:41   ` Justin Tobler
@ 2025-07-24  7:42     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-24  7:42 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak

On Wed, Jul 23, 2025 at 02:41:27PM -0500, Justin Tobler wrote:
> On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> > diff --git a/refs.c b/refs.c
> > index 188989e4113..64544300dc3 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -3017,6 +3026,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
> >  		.transaction = migration_data->transaction,
> >  		.errbuf = migration_data->errbuf,
> >  		.sb = &migration_data->sb,
> > +		.name = &migration_data->name,
> > +		.mail = &migration_data->mail,
> 
> I was a bit confused at first why we cared to assign the name and mail
> fields here as it didn't look like we actually use them, but it looks
> like we do this to release the the underlying strbuf as we don't free it
> from `reflog_migration_data`.

This is an optimization: instead of reallocating a new buffer every time
we compute the name and mail we reuse a buffer. But because we're two
callbacks deep in the callchain we have to splice these buffers through
via multiple callback data structs.

I'll add a note to the commit message.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-23 20:31   ` Justin Tobler
@ 2025-07-24  7:42     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-24  7:42 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git, Karthik Nayak

On Wed, Jul 23, 2025 at 03:31:01PM -0500, Justin Tobler wrote:
> On 25/07/22 01:20PM, Patrick Steinhardt wrote:
> > The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
> > object ID set. If so, the value of that field is used to verify whteher
> 
> s/whteher/whether/
> 
> > the current state of the reference matches this expected state. It is
> > thus an important part of mitigating races with a concurrent process
> > that updates the same set of references.
> > 
> > When writing reflogs though we explicitly unset that flag. This is a
> > sensible thing to do: the old state of reflog entry updates may not
> > necessarily match the current on-disk state of its accompanying ref, but
> > it's only intended to signal what old object ID we want to write into
> > the new reflog entry. For example when migrating refs we end up writing
> > many reflog entries for a single reference, and most likely those reflog
> > entries will have many different old object IDs.
> > 
> > But unsetting this flag also removes a useful signal, namely that the
> > caller _did_ provide an old object ID for a given reflog entry. This
> > signal is useful to determine whether we have to resolve the refname
> > manually to figure out the current state, or whether we should just go
> > with what the caller has provided.
> > 
> > This actually causes real issues when migrating reflogs, as we don't
> > know to actually use the caller-provided old object ID when writing
> > those entries. Instead, reflog entries simply end up with the all-zero
> > object ID.
> 
> Ok, if I'm understanding this correctly, the `REF_HAVE_OLD` flag is also
> required to actually record a provided old OID in the reflog entry. If it
> is not set, a NUL OID is recorded instead.

This fix is not sufficient to record the old object ID, so the commit
message is indeed misleading as it comes from a previous iteration of
this patch seires.

I want to have this signal so that in the next commit we can assert that
if the new `REF_LOG_USE_PROVIDED_OIDS` is set, that the caller also sets
both `REF_HAVE_OLD` and `REF_HAVE_NEW`. So this commit here isn't really
required anymore, but I think it's the right thing to do anyway, and the
additional safety check isn't too bad to have in my opinion.

I'll rewrite the commit message a bit.

> > Stop unsetting the flag so that we can use it as this described signal,
> > which we'll do in a subsequent commit. Skip checking the old object ID
> > for log-only updates so that we don't expect it to match the current
> > on-disk state.
> 
> Just to clarify, when migrating reflogs, are these operations always
> marked with `REF_LOG_ONLY`? The comment for that flag states:
> 
>   Used as a flag in ref_update::flags when we want to log a ref
>   update but not actually perform it.  This is used when a symbolic ref
>   update is split up.                                           
> 
> I might be misunderstanding this though.

The comment is out of date indeed. When queueing reflog updates we use
`ref_transaction_update_reflog()`, which sets the following flags:

  - REF_HAVE_OLD and REF_HAVE_NEW to indicate the object IDs.

  - REF_LOG_ONLY to skip writing the reference itself.

  - REF_FORCE_CREATE_REFLOG to disable heuristics whether or not the
    reflog entry should be written in the first place.

  - REF_NO_DEREF to not write the reflog for the target of a potential
    symref, but for the symref itself.

  - REF_LOG_USE_PROVIDED_OIDS to actually use the provided object IDs
    and not try to resolve state.

It's an awful lot of flags, and I tried to work on a solution where we
don't have to introduce the new flag. But the logic here is so intricate
that it always caused unintended side effects.

Thanks for your review!

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 3/8] refs: export `ref_transaction_update_reflog()`
  2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
  2025-07-23 18:25   ` Justin Tobler
@ 2025-07-24  8:36   ` Karthik Nayak
  2025-07-24 12:55   ` Toon Claes
  2 siblings, 0 replies; 114+ messages in thread
From: Karthik Nayak @ 2025-07-24  8:36 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 1129 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

[snip]

> diff --git a/refs.h b/refs.h
> index efa182c6a14..0faf3bc0422 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -794,6 +794,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  			   unsigned int flags, const char *msg,
>  			   struct strbuf *err);
>
> +/*
> + * Similar to`ref_transaction_update`, but this function is only for adding
>

Nit: s/to`/to `/

> + * a reflog update. Supports providing custom committer information. The index
> + * field can be utiltized to order updates as desired. When not used, the
> + * updates default to being ordered by refname.
> + */
> +int ref_transaction_update_reflog(struct ref_transaction *transaction,
> +				  const char *refname,
> +				  const struct object_id *new_oid,
> +				  const struct object_id *old_oid,
> +				  const char *committer_info,
> +				  const char *msg,
> +				  uint64_t index,
> +				  struct strbuf *err);
> +
>  /*
>   * Add a reference creation to transaction. new_oid is the value that
>   * the reference should have after the update; it must not be
>
> --
> 2.50.1.465.gcb3da1c9e6.dirty

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 6/8] refs: fix identity for migrated reflogs
  2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
  2025-07-23 19:41   ` Justin Tobler
@ 2025-07-24  9:41   ` Karthik Nayak
  2025-07-24 12:56   ` Toon Claes
  2 siblings, 0 replies; 114+ messages in thread
From: Karthik Nayak @ 2025-07-24  9:41 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 4563 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> When migrating reflog entries between different storage formats we must
> reconstruct the identity of reflog entries. This is done by passing the
> committer passed to the `migrate_one_reflog_entry()` callback function
> to `fmt_ident()`.
>
> This results in an invalid identity though: `fmt_ident()` expects the
> caller to provide both name and mail of the author, but we pass the full
> identity as mail. This leads to an identity like:
>
>     pks <Patrick Steinhardt ps@pks.im>
>
> Fix the bug by splitting the identity line first. This allows us to
> extract both the name and mail so that we can pass them to `fmt_ident()`
> separately.
>

Well explained.

> This commit does not yet add any tests as there is another bug in the
> reflog migration that will be fixed in a subsequent commit. Once that
> bug is fixed we'll make the reflog verification in t1450 stricter, and
> that will catch both this bug here and the other bug.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 188989e4113..64544300dc3 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2945,7 +2945,7 @@ struct migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf sb;
> +	struct strbuf sb, name, mail;
>  };
>
>  static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
> @@ -2984,7 +2984,7 @@ struct reflog_migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf *sb;
> +	struct strbuf *sb, *name, *mail;
>  };
>
>  static int migrate_one_reflog_entry(struct object_id *old_oid,
> @@ -2994,13 +2994,22 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
>  				    const char *msg, void *cb_data)
>  {
>  	struct reflog_migration_data *data = cb_data;
> +	struct ident_split ident;
>  	const char *date;
>  	int ret;
>
> +	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
> +		return -1;
> +
> +	strbuf_reset(data->name);
> +	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
> +	strbuf_reset(data->mail);
> +	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
> +
>  	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
>  	strbuf_reset(data->sb);
>  	/* committer contains name and email */

Nit: This comment is now stale

> -	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
> +	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
>

I was a bit stuck on why we use `WANT_BLANK_IDENT`, since we explicitly
(since we do a split_ident() and that would error out if there is no
name/email) pass the 'name' and the 'email' here as non-null values. So
this seems to be the only option for the enum:

enum want_ident {
	WANT_BLANK_IDENT,
	WANT_AUTHOR_IDENT,
	WANT_COMMITTER_IDENT
};

Since we don't want to extract author or committer information. However,
in fmt_ident() we only use the 'want_ident' value, when either 'name' or
'email' is not set. I found this a bit confusing, perhaps a simple
change of name from 'whose_ident' to 'fallback_ident' would be much more
easier to read and understand. Anyways, this is not for your patch.

>  	ret = ref_transaction_update_reflog(data->transaction, data->refname,
>  					    new_oid, old_oid, data->sb->buf,
> @@ -3017,6 +3026,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
>  		.transaction = migration_data->transaction,
>  		.errbuf = migration_data->errbuf,
>  		.sb = &migration_data->sb,
> +		.name = &migration_data->name,
> +		.mail = &migration_data->mail,
>  	};
>
>  	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
> @@ -3115,6 +3126,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
>  	struct strbuf new_gitdir = STRBUF_INIT;
>  	struct migration_data data = {
>  		.sb = STRBUF_INIT,
> +		.name = STRBUF_INIT,
> +		.mail = STRBUF_INIT,
>  	};
>  	int did_migrate_refs = 0;
>  	int ret;
> @@ -3290,6 +3303,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
>  	ref_transaction_free(transaction);
>  	strbuf_release(&new_gitdir);
>  	strbuf_release(&data.sb);
> +	strbuf_release(&data.name);
> +	strbuf_release(&data.mail);
>  	return ret;
>  }
>
>
> --
> 2.50.1.465.gcb3da1c9e6.dirty

The patch look good. Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
  2025-07-23 20:31   ` Justin Tobler
@ 2025-07-24 10:21   ` Karthik Nayak
  2025-07-24 11:35     ` Patrick Steinhardt
  1 sibling, 1 reply; 114+ messages in thread
From: Karthik Nayak @ 2025-07-24 10:21 UTC (permalink / raw)
  To: Patrick Steinhardt, git

[-- Attachment #1: Type: text/plain, Size: 4699 bytes --]

Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/refs.c b/refs.c
> index 64544300dc3..c78d5be6e20 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1384,11 +1384,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
>  	update = ref_transaction_add_update(transaction, refname, flags,
>  					    new_oid, old_oid, NULL, NULL,
>  					    committer_info, msg);
> -	/*
> -	 * While we do set the old_oid value, we unset the flag to skip
> -	 * old_oid verification which only makes sense for refs.
> -	 */
> -	update->flags &= ~REF_HAVE_OLD;
>  	update->index = index;
>
>  	/*

So we no longer unset the flag, this will ensure that the provided
old_oid is propagated correctly.

> @@ -3310,7 +3305,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
>
>  int ref_update_expects_existing_old_ref(struct ref_update *update)

Nit: Wonder if we should update the comment for the function to reflect
how this works with reflog only entries.

>  {
> -	return (update->flags & REF_HAVE_OLD) &&
> +	return (update->flags & (REF_HAVE_OLD | REF_LOG_ONLY)) == REF_HAVE_OLD &&
>  		(!is_null_oid(&update->old_oid) || update->old_target);
>  }
>

Okay this is check now says, if this is a reflog only entry, we don't
expect the reference to exist.

Nit: I wonder if can make this a bit more readable, perhaps:

  int ref_update_expects_existing_old_ref(struct ref_update *update)
  {
      /* reflog only entries may not match on-disk status of a reference */
      if (update->flags & REF_LOG_ONLY)
          return 0;

      return (update->flags & REF_HAVE_OLD &&
          (!is_null_oid(&update->old_oid) || update->old_target);
  }

I'm okay with the current version too.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 89ae4517a97..d519bb615fa 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
>  	 * done when new_update is processed.
>  	 */
>  	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
> -	update->flags &= ~REF_HAVE_OLD;
>
>  	return 0;
>  }
> @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
>  						struct object_id *oid,
>  						struct strbuf *err)
>  {
> -	if (!(update->flags & REF_HAVE_OLD) ||
> -		   oideq(oid, &update->old_oid))
> +	if (update->flags & REF_LOG_ONLY ||
> +	    !(update->flags & REF_HAVE_OLD) ||
> +	    oideq(oid, &update->old_oid))
>  		return 0;
>
>  	if (is_null_oid(&update->old_oid)) {
> @@ -3061,7 +3061,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
>  	for (i = 0; i < transaction->nr; i++) {
>  		struct ref_update *update = transaction->updates[i];
>
> -		if ((update->flags & REF_HAVE_OLD) &&
> +		if (!(update->flags & REF_LOG_ONLY) &&
> +		    (update->flags & REF_HAVE_OLD) &&
>  		    !is_null_oid(&update->old_oid))
>  			BUG("initial ref transaction with old_sha1 set");
>
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index 4c3817f4ec1..44af58ac50b 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
>  	if (ret > 0) {
>  		/* The reference does not exist, but we expected it to. */
>  		strbuf_addf(err, _("cannot lock ref '%s': "
> -
> -

Huh. I'm the author of this misshap. Thanks for the cleanup :D

>  				   "unable to resolve reference '%s'"),
>  			    ref_update_original_update_refname(u), u->refname);
>  		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
> @@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
>
>  			new_update->parent_update = u;
>
> -			/*
> -			 * Change the symbolic ref update to log only. Also, it
> -			 * doesn't need to check its old OID value, as that will be
> -			 * done when new_update is processed.
> -			 */
> +			/* Change the symbolic ref update to log only. */
>  			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
> -			u->flags &= ~REF_HAVE_OLD;
>  		}
>  	}
>
> @@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
>  		ret = ref_update_check_old_target(referent->buf, u, err);
>  		if (ret)
>  			return ret;
> -	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
> +	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
> +		   !oideq(&current_oid, &u->old_oid)) {
>  		if (is_null_oid(&u->old_oid)) {
>  			strbuf_addf(err, _("cannot lock ref '%s': "
>  					   "reference already exists"),
>
> --
> 2.50.1.465.gcb3da1c9e6.dirty

Looks good overall. Thanks

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-24 10:21   ` Karthik Nayak
@ 2025-07-24 11:35     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-24 11:35 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git

On Thu, Jul 24, 2025 at 03:21:30AM -0700, Karthik Nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > diff --git a/refs.c b/refs.c
> > index 64544300dc3..c78d5be6e20 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -3310,7 +3305,7 @@ int repo_migrate_ref_storage_format(struct repository *repo,
> >
> >  int ref_update_expects_existing_old_ref(struct ref_update *update)
> 
> Nit: Wonder if we should update the comment for the function to reflect
> how this works with reflog only entries.

Yeah, let's.

> >  {
> > -	return (update->flags & REF_HAVE_OLD) &&
> > +	return (update->flags & (REF_HAVE_OLD | REF_LOG_ONLY)) == REF_HAVE_OLD &&
> >  		(!is_null_oid(&update->old_oid) || update->old_target);
> >  }
> >
> 
> Okay this is check now says, if this is a reflog only entry, we don't
> expect the reference to exist.
> 
> Nit: I wonder if can make this a bit more readable, perhaps:
> 
>   int ref_update_expects_existing_old_ref(struct ref_update *update)
>   {
>       /* reflog only entries may not match on-disk status of a reference */
>       if (update->flags & REF_LOG_ONLY)
>           return 0;
> 
>       return (update->flags & REF_HAVE_OLD &&
>           (!is_null_oid(&update->old_oid) || update->old_target);
>   }
> 
> I'm okay with the current version too.

I think yours reads more straight-forward though.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
  2025-07-23 19:00   ` Justin Tobler
@ 2025-07-24 12:54   ` Toon Claes
  2025-07-25  5:36     ` Patrick Steinhardt
  2025-07-24 16:20   ` SZEDER Gábor
  2 siblings, 1 reply; 114+ messages in thread
From: Toon Claes @ 2025-07-24 12:54 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> While we provide a couple of subcommands in git-reflog(1) to remove
> reflog entries, we don't provide any to write new entries. Obviously
> this is not an operation that really would be needed for many use cases
> out there, or otherwise people would have complained that such a command
> does not exist yet. But the introduction of the "reftable" backend
> changes the picture a bit, as it is now basically impossible to manually
> append a reflog entry if one wanted to do so due to the binary format.
>
> Plug this gap by introducing a simple "write" subcommand. For now, all
> this command does is to append a single new reflog entry with the given
> object IDs and message to the reflog. More specifically, it is not yet
> possible to:
>
>   - Write multiple reflog entries at once.
>
>   - Insert reflog entries at arbitrary indices.
>
>   - Specify the date of the reflog entry.
>
>   - Insert reflog entries that refer to nonexistent objects.
>
> If required, those features can be added at a future point in time. For
> now though, the new command aims to fulfill the most basic use cases
> while being as strict as possible when it comes to verifying parameters.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-reflog.adoc |  1 +
>  builtin/reflog.c              | 65 ++++++++++++++++++++++++++++++++++
>  t/meson.build                 |  1 +
>  t/t1421-reflog-write.sh       | 81 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 148 insertions(+)
>
> diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> index 6ae13e772b8..798dbc0a00a 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -12,6 +12,7 @@ SYNOPSIS
>  git reflog [show] [<log-options>] [<ref>]
>  git reflog list
>  git reflog exists <ref>
> +git reflog write <ref> <old-oid> <new-oid> <message>
>  git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
>  git reflog drop [--all [--single-worktree] | <refs>...]
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index b00b3f9edc9..d0374295620 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -3,6 +3,8 @@
>  #include "builtin.h"
>  #include "config.h"
>  #include "gettext.h"
> +#include "hex.h"
> +#include "odb.h"
>  #include "revision.h"
>  #include "reachable.h"
>  #include "wildmatch.h"
> @@ -20,6 +22,9 @@
>  #define BUILTIN_REFLOG_EXISTS_USAGE \
>  	N_("git reflog exists <ref>")
>  
> +#define BUILTIN_REFLOG_WRITE_USAGE \
> +	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
> +
>  #define BUILTIN_REFLOG_DELETE_USAGE \
>  	N_("git reflog delete [--rewrite] [--updateref]\n" \
>  	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
> @@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
>  	NULL,
>  };
>  
> +static const char *const reflog_write_usage[] = {
> +	BUILTIN_REFLOG_WRITE_USAGE,
> +	NULL,
> +};
> +
>  static const char *const reflog_delete_usage[] = {
>  	BUILTIN_REFLOG_DELETE_USAGE,
>  	NULL
> @@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
>  	BUILTIN_REFLOG_SHOW_USAGE,
>  	BUILTIN_REFLOG_LIST_USAGE,
>  	BUILTIN_REFLOG_EXISTS_USAGE,
> +	BUILTIN_REFLOG_WRITE_USAGE,
>  	BUILTIN_REFLOG_DELETE_USAGE,
>  	BUILTIN_REFLOG_DROP_USAGE,
>  	BUILTIN_REFLOG_EXPIRE_USAGE,
> @@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
>  	return ret;
>  }
>  
> +static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
> +			    struct repository *repo)
> +{
> +	const struct option options[] = {
> +		OPT_END()
> +	};
> +	struct object_id old_oid, new_oid;
> +	struct strbuf err = STRBUF_INIT;
> +	struct ref_transaction *tx;
> +	const char *ref, *message;
> +	int ret;
> +
> +	argc = parse_options(argc, argv, prefix, options, reflog_drop_usage, 0);

Wrong usage string here: s/reflog_drop_usage/reflog_write_usage/.

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 3/8] refs: export `ref_transaction_update_reflog()`
  2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
  2025-07-23 18:25   ` Justin Tobler
  2025-07-24  8:36   ` Karthik Nayak
@ 2025-07-24 12:55   ` Toon Claes
  2 siblings, 0 replies; 114+ messages in thread
From: Toon Claes @ 2025-07-24 12:55 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> In a subsequent commit we'll add another user that wants to write reflog
> entries. This requires them to call `ref_transaction_update_reflog()`,
> but that functino is local to "refs.c".
>
> Export the function to prepare for the change. While at it, drop the
> `flags` field, as all callers are for now expected to use the same flags
> anyway.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 29 +++++++++++------------------
>  refs.h | 15 +++++++++++++++
>  2 files changed, 26 insertions(+), 18 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 73913b6627b..188989e4113 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1362,27 +1362,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  	return 0;
>  }
>  
> -/*
> - * Similar to`ref_transaction_update`, but this function is only for adding
> - * a reflog update. Supports providing custom committer information. The index
> - * field can be utiltized to order updates as desired. When not used, the
> - * updates default to being ordered by refname.

"not used" is a little ambiguous for me. I had to dig a little and in
transaction_update_cmp() in refs/reftable-backend.c the index is only
considered when the value is non-zero. What do you think about replacing
"not used" with "zero"?

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 6/8] refs: fix identity for migrated reflogs
  2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
  2025-07-23 19:41   ` Justin Tobler
  2025-07-24  9:41   ` Karthik Nayak
@ 2025-07-24 12:56   ` Toon Claes
  2 siblings, 0 replies; 114+ messages in thread
From: Toon Claes @ 2025-07-24 12:56 UTC (permalink / raw)
  To: Patrick Steinhardt, git; +Cc: Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> When migrating reflog entries between different storage formats we must
> reconstruct the identity of reflog entries. This is done by passing the
> committer passed to the `migrate_one_reflog_entry()` callback function
> to `fmt_ident()`.
>
> This results in an invalid identity though: `fmt_ident()` expects the
> caller to provide both name and mail of the author, but we pass the full
> identity as mail. This leads to an identity like:
>
>     pks <Patrick Steinhardt ps@pks.im>
>
> Fix the bug by splitting the identity line first. This allows us to
> extract both the name and mail so that we can pass them to `fmt_ident()`
> separately.
>
> This commit does not yet add any tests as there is another bug in the
> reflog migration that will be fixed in a subsequent commit. Once that
> bug is fixed we'll make the reflog verification in t1450 stricter, and
> that will catch both this bug here and the other bug.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 188989e4113..64544300dc3 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2945,7 +2945,7 @@ struct migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf sb;
> +	struct strbuf sb, name, mail;
>  };
>  
>  static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
> @@ -2984,7 +2984,7 @@ struct reflog_migration_data {
>  	struct ref_store *old_refs;
>  	struct ref_transaction *transaction;
>  	struct strbuf *errbuf;
> -	struct strbuf *sb;
> +	struct strbuf *sb, *name, *mail;
>  };
>  
>  static int migrate_one_reflog_entry(struct object_id *old_oid,
> @@ -2994,13 +2994,22 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
>  				    const char *msg, void *cb_data)
>  {
>  	struct reflog_migration_data *data = cb_data;
> +	struct ident_split ident;
>  	const char *date;
>  	int ret;
>  
> +	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
> +		return -1;
> +
> +	strbuf_reset(data->name);
> +	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
> +	strbuf_reset(data->mail);
> +	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
> +
>  	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
>  	strbuf_reset(data->sb);
>  	/* committer contains name and email */

This comment doesn't make sense here no more. Better leave it out.

> -	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
> +	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));

I was checking whether we could use IDENT_NO_NAME instead of parsing the
ident. But fmt_ident() calls strbuf_addstr_without_crud() which strips
out the angle brackets from the email address, which builds an invalid
identity like: "C O Mitter committer@example.com 1112911993 -0700"

We could opt to add a flag IDENT_LEAVE_CRUD which calls strbuf_addstr()
instead, but I'm not sure we want to go there.

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
  2025-07-23 19:00   ` Justin Tobler
  2025-07-24 12:54   ` Toon Claes
@ 2025-07-24 16:20   ` SZEDER Gábor
  2025-07-24 21:10     ` Junio C Hamano
  2 siblings, 1 reply; 114+ messages in thread
From: SZEDER Gábor @ 2025-07-24 16:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Karthik Nayak, Patrick Steinhardt

On Tue, Jul 22, 2025 at 01:20:53PM +0200, Patrick Steinhardt wrote:
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index b00b3f9edc9..d0374295620 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -3,6 +3,8 @@
>  #include "builtin.h"
>  #include "config.h"
>  #include "gettext.h"
> +#include "hex.h"
> +#include "odb.h"

This series is queued on top of v2.50.0, which doesn't have 'odb.h'
yet.

      CC builtin/reflog.o
  builtin/reflog.c:7:10: fatal error: odb.h: No such file or directory
      7 | #include "odb.h"
        |          ^~~~~~~
  compilation terminated.
  make: *** [Makefile:2821: builtin/reflog.o] Error 1

>  #include "revision.h"
>  #include "reachable.h"
>  #include "wildmatch.h"

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 2/8] builtin/reflog: improve grouping of subcommands
  2025-07-24  7:42     ` Patrick Steinhardt
@ 2025-07-24 16:45       ` Junio C Hamano
  0 siblings, 0 replies; 114+ messages in thread
From: Junio C Hamano @ 2025-07-24 16:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Justin Tobler, git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Jul 23, 2025 at 01:14:19PM -0500, Justin Tobler wrote:
>> > @@ -404,10 +404,10 @@ int cmd_reflog(int argc,
>> >  	struct option options[] = {
>> >  		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
>> >  		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
>> > -		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
>> > -		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
>> >  		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
>> > +		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
>> >  		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
>> > +		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
>> >  		OPT_END()
>> >  	};
>> 
>> Structing the subcommands order in such a manner seems sensible, but I'm
>> not sure the pattern will be recognized by others that may add
>> subcommands in the future. Maybe we could leave a comment that mentions
>> the order?
>
> Hm, dunno. I feel like it's subjective where to add a command anyway, so
> I'm not sure that a comment would be allt hat helpful.

I'd agree on both counts.  The only pattern I can see myself is to
have read-only operations first and then read-write operations, but
even there, the choice of "is it read-only?" as an axis feels very
much arbitrary (another obvious one is to list from the everyday 
use to less often used to finally only administrative ones).

If the read-write operations are ordered by severity, I would place
expire (affects only stale entries) between delete (affects one
entry in a reflog) and drop (deletes the whole thing).  But that
again is fairly arbitrary.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-24 16:20   ` SZEDER Gábor
@ 2025-07-24 21:10     ` Junio C Hamano
  2025-07-25  5:36       ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-07-24 21:10 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Karthik Nayak, Patrick Steinhardt

SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Tue, Jul 22, 2025 at 01:20:53PM +0200, Patrick Steinhardt wrote:
>> diff --git a/builtin/reflog.c b/builtin/reflog.c
>> index b00b3f9edc9..d0374295620 100644
>> --- a/builtin/reflog.c
>> +++ b/builtin/reflog.c
>> @@ -3,6 +3,8 @@
>>  #include "builtin.h"
>>  #include "config.h"
>>  #include "gettext.h"
>> +#include "hex.h"
>> +#include "odb.h"
>
> This series is queued on top of v2.50.0, which doesn't have 'odb.h'
> yet.

Thanks for checking.

Yet this is a topic to fix breakages that happened even before 2.50;
"git refs migrate" started migrating reflogs in 2.48, which had one
fix on top in 2.49.  For a non-security bugfix we typically do not
address anything older than the latest release's maintenance track,
so a series that would fix on top of 2.50 would have been more
appropriate.


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-24 12:54   ` Toon Claes
@ 2025-07-25  5:36     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  5:36 UTC (permalink / raw)
  To: Toon Claes; +Cc: git, Karthik Nayak

On Thu, Jul 24, 2025 at 02:54:53PM +0200, Toon Claes wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> > index 6ae13e772b8..798dbc0a00a 100644
> > --- a/Documentation/git-reflog.adoc
> > +++ b/Documentation/git-reflog.adoc
> > @@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
> >  	return ret;
> >  }
> >  
> > +static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
> > +			    struct repository *repo)
> > +{
> > +	const struct option options[] = {
> > +		OPT_END()
> > +	};
> > +	struct object_id old_oid, new_oid;
> > +	struct strbuf err = STRBUF_INIT;
> > +	struct ref_transaction *tx;
> > +	const char *ref, *message;
> > +	int ret;
> > +
> > +	argc = parse_options(argc, argv, prefix, options, reflog_drop_usage, 0);
> 
> Wrong usage string here: s/reflog_drop_usage/reflog_write_usage/.

Good catch, fixed now.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-24 21:10     ` Junio C Hamano
@ 2025-07-25  5:36       ` Patrick Steinhardt
  2025-07-25 14:35         ` Junio C Hamano
  0 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  5:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git, Karthik Nayak

On Thu, Jul 24, 2025 at 02:10:31PM -0700, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
> > On Tue, Jul 22, 2025 at 01:20:53PM +0200, Patrick Steinhardt wrote:
> >> diff --git a/builtin/reflog.c b/builtin/reflog.c
> >> index b00b3f9edc9..d0374295620 100644
> >> --- a/builtin/reflog.c
> >> +++ b/builtin/reflog.c
> >> @@ -3,6 +3,8 @@
> >>  #include "builtin.h"
> >>  #include "config.h"
> >>  #include "gettext.h"
> >> +#include "hex.h"
> >> +#include "odb.h"
> >
> > This series is queued on top of v2.50.0, which doesn't have 'odb.h'
> > yet.
> 
> Thanks for checking.
> 
> Yet this is a topic to fix breakages that happened even before 2.50;
> "git refs migrate" started migrating reflogs in 2.48, which had one
> fix on top in 2.49.  For a non-security bugfix we typically do not
> address anything older than the latest release's maintenance track,
> so a series that would fix on top of 2.50 would have been more
> appropriate.

Sure, I can rebase this on top of v2.50.1. It would then of course
require some smallish fixes when merged to `seen`. The below patch is
what is required to make it work with the v2.50 track.

Patrick

diff --git a/builtin/reflog.c b/builtin/reflog.c
index bc7e7f5e442..d3f0009cb0e 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -4,7 +4,7 @@
 #include "config.h"
 #include "gettext.h"
 #include "hex.h"
-#include "odb.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -426,13 +426,13 @@ static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
 	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
 	if (ret)
 		die(_("invalid old object ID: '%s'"), argv[1]);
-	if (!is_null_oid(&old_oid) && !odb_has_object(repo->objects, &old_oid, 0))
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
 		die(_("old object '%s' does not exist"), argv[1]);
 
 	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
 	if (ret)
 		die(_("invalid new object ID: '%s'"), argv[2]);
-	if (!is_null_oid(&new_oid) && !odb_has_object(repo->objects, &new_oid, 0))
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
 		die(_("new object '%s' does not exist"), argv[2]);
 
 	message = argv[3];

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 0/8] refs: fix migration of reflog entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (7 preceding siblings ...)
  2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
@ 2025-07-25  6:58 ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                     ` (7 more replies)
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (3 subsequent siblings)
  12 siblings, 8 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (8):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  76 +++++++++++++++++--------------
 builtin/reflog.c              | 103 ++++++++++++++++++++++++++++++++++--------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++++-----------
 refs.h                        |  24 +++++++++-
 refs/files-backend.c          |  25 ++++++++--
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 +++++++----
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       |  99 ++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 ++++++---
 12 files changed, 338 insertions(+), 105 deletions(-)

Range-diff versus v1:

1:  a5f91e51802 = 1:  19a1df63ee6 Documentation/git-reflog: convert to use synopsis type
2:  3b39a73a942 ! 2:  807cc393337 builtin/reflog: improve grouping of subcommands
    @@ Metadata
      ## Commit message ##
         builtin/reflog: improve grouping of subcommands
     
    -    The way subcommands of git-reflog(1) are layed out does not make any
    +    The way subcommands of git-reflog(1) are laid out does not make any
         immediate sense. Reorder them such that read-only subcommands precede
         writing commands for a bit more structure.
     
    @@ Documentation/git-reflog.adoc: SYNOPSIS
      
      DESCRIPTION
      -----------
    +@@ Documentation/git-reflog.adoc: actions, and in addition the `HEAD` reflog records branch switching.
    + 
    + The "list" subcommand lists all refs which have a corresponding reflog.
    + 
    +-The "expire" subcommand prunes older reflog entries. Entries older
    +-than `expire` time, or entries older than `expire-unreachable` time
    +-and not reachable from the current tip, are removed from the reflog.
    +-This is typically not used directly by end users -- instead, see
    +-linkgit:git-gc[1].
    ++The "exists" subcommand checks whether a ref has a reflog.  It exits
    ++with zero status if the reflog exists, and non-zero status if it does
    ++not.
    + 
    + The "delete" subcommand deletes single entries from the reflog, but
    + not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
    +@@ Documentation/git-reflog.adoc: The "drop" subcommand completely removes the reflog for the specified
    + references. This is in contrast to "expire" and "delete", both of which
    + can be used to delete reflog entries, but not the reflog itself.
    + 
    +-The "exists" subcommand checks whether a ref has a reflog.  It exits
    +-with zero status if the reflog exists, and non-zero status if it does
    +-not.
    ++The "expire" subcommand prunes older reflog entries. Entries older
    ++than `expire` time, or entries older than `expire-unreachable` time
    ++and not reachable from the current tip, are removed from the reflog.
    ++This is typically not used directly by end users -- instead, see
    ++linkgit:git-gc[1].
    + 
    + OPTIONS
    + -------
    +@@ Documentation/git-reflog.adoc: Options for `show`
    + `git reflog show` accepts any of the options accepted by `git log`.
    + 
    + 
    ++Options for `delete`
    ++~~~~~~~~~~~~~~~~~~~~
    ++
    ++`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
    ++`--dry-run`, and `--verbose`, with the same meanings as when they are
    ++used with `expire`.
    ++
    ++Options for `drop`
    ++~~~~~~~~~~~~~~~~~~
    ++
    ++--all::
    ++	Drop the reflogs of all references from all worktrees.
    ++
    ++--single-worktree::
    ++	By default when `--all` is specified, reflogs from all working
    ++	trees are dropped. This option limits the processing to reflogs
    ++	from the current working tree only.
    ++
    ++
    + Options for `expire`
    + ~~~~~~~~~~~~~~~~~~~~
    + 
    +@@ Documentation/git-reflog.adoc: which didn't protect objects referred to by reflogs.
    + 	Print extra information on screen.
    + 
    + 
    +-Options for `delete`
    +-~~~~~~~~~~~~~~~~~~~~
    +-
    +-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
    +-`--dry-run`, and `--verbose`, with the same meanings as when they are
    +-used with `expire`.
    +-
    +-Options for `drop`
    +-~~~~~~~~~~~~~~~~~~
    +-
    +---all::
    +-	Drop the reflogs of all references from all worktrees.
    +-
    +---single-worktree::
    +-	By default when `--all` is specified, reflogs from all working
    +-	trees are dropped. This option limits the processing to reflogs
    +-	from the current working tree only.
    +-
    + GIT
    + ---
    + Part of the linkgit:git[1] suite
     
      ## builtin/reflog.c ##
     @@
3:  ce7a5409eec ! 3:  1fe1c93db2f refs: export `ref_transaction_update_reflog()`
    @@ Commit message
     
         In a subsequent commit we'll add another user that wants to write reflog
         entries. This requires them to call `ref_transaction_update_reflog()`,
    -    but that functino is local to "refs.c".
    +    but that function is local to "refs.c".
     
         Export the function to prepare for the change. While at it, drop the
         `flags` field, as all callers are for now expected to use the same flags
    @@ refs.h: int ref_transaction_update(struct ref_transaction *transaction,
      			   struct strbuf *err);
      
     +/*
    -+ * Similar to`ref_transaction_update`, but this function is only for adding
    ++ * Similar to `ref_transaction_update`, but this function is only for adding
     + * a reflog update. Supports providing custom committer information. The index
    -+ * field can be utiltized to order updates as desired. When not used, the
    ++ * field can be utiltized to order updates as desired. When set to zero, the
     + * updates default to being ordered by refname.
     + */
     +int ref_transaction_update_reflog(struct ref_transaction *transaction,
4:  c0a7e9031dd ! 4:  38835607a92 builtin/reflog: implement subcommand to write new entries
    @@ Documentation/git-reflog.adoc: SYNOPSIS
      git reflog delete [--rewrite] [--updateref]
      	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
      git reflog drop [--all [--single-worktree] | <refs>...]
    +@@ Documentation/git-reflog.adoc: The "exists" subcommand checks whether a ref has a reflog.  It exits
    + with zero status if the reflog exists, and non-zero status if it does
    + not.
    + 
    ++The "write" subcommand writes a single entry to the reflog of a given
    ++reference. This new entry is appended to the reflog and will thus become
    ++the most recent entry. Both the old and new object IDs must not be
    ++abbreviated and must point to existing objects. The reflog message gets
    ++normalized.
    ++
    + The "delete" subcommand deletes single entries from the reflog, but
    + not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
    + reflog delete master@{2}`"). This subcommand is also typically not used
     
      ## builtin/reflog.c ##
     @@
    @@ builtin/reflog.c
      #include "config.h"
      #include "gettext.h"
     +#include "hex.h"
    -+#include "odb.h"
    ++#include "object-store.h"
      #include "revision.h"
      #include "reachable.h"
      #include "wildmatch.h"
    @@ builtin/reflog.c: static int cmd_reflog_drop(int argc, const char **argv, const
     +	const char *ref, *message;
     +	int ret;
     +
    -+	argc = parse_options(argc, argv, prefix, options, reflog_drop_usage, 0);
    ++	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
     +	if (argc != 4)
     +		usage_with_options(reflog_write_usage, options);
     +
    @@ builtin/reflog.c: static int cmd_reflog_drop(int argc, const char **argv, const
     +	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
     +	if (ret)
     +		die(_("invalid old object ID: '%s'"), argv[1]);
    -+	if (!is_null_oid(&old_oid) && !odb_has_object(repo->objects, &old_oid, 0))
    ++	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
     +		die(_("old object '%s' does not exist"), argv[1]);
     +
     +	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
     +	if (ret)
     +		die(_("invalid new object ID: '%s'"), argv[2]);
    -+	if (!is_null_oid(&new_oid) && !odb_has_object(repo->objects, &new_oid, 0))
    ++	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
     +		die(_("new object '%s' does not exist"), argv[2]);
     +
     +	message = argv[3];
    @@ t/t1421-reflog-write.sh (new)
     +	)
     +'
     +
    -+test_expect_success 'nonexistent old object ID' '
    ++test_expect_success 'nonexistent object IDs' '
     +	test_when_finished "rm -rf repo" &&
     +	git init repo &&
     +	(
     +		cd repo &&
    -+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID first 2>err &&
    -+		test_grep "old object .* does not exist" err
    ++		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
    ++		test_grep "old object .* does not exist" err &&
    ++		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
    ++		test_grep "new object .* does not exist" err
     +	)
     +'
     +
    -+test_expect_success 'nonexistent new object ID' '
    ++test_expect_success 'abbreviated object IDs' '
     +	test_when_finished "rm -rf repo" &&
     +	git init repo &&
     +	(
     +		cd repo &&
    -+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) first 2>err &&
    -+		test_grep "new object .* does not exist" err
    ++		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
    ++		test_grep "invalid old object ID" err &&
    ++		test_must_fail git reflog write refs/heads/something $ZERO_OID 12345 new-object-id 2>err &&
    ++		test_grep "invalid new object ID" err
    ++	)
    ++'
    ++
    ++test_expect_success 'reflog message gets normalized' '
    ++	test_when_finished "rm -rf repo" &&
    ++	git init repo &&
    ++	(
    ++		cd repo &&
    ++		test_commit initial &&
    ++		COMMIT_OID=$(git rev-parse HEAD) &&
    ++		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
    ++		git reflog show -1 --format=%gs HEAD >actual &&
    ++		echo "message with newlines" >expected &&
    ++		test_cmp expected actual
     +	)
     +'
     +
    @@ t/t1421-reflog-write.sh (new)
     +		EOF
     +
     +		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
    ++		# Note: the old object ID of the second reflog entry is broken.
    ++		# This will be fixed in subsequent commits.
     +		test_reflog_matches . refs/heads/something <<-EOF
     +		$ZERO_OID $COMMIT_OID $SIGNATURE	first
    -+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
    ++		$ZERO_OID $COMMIT_OID $SIGNATURE	second
     +		EOF
     +	)
     +'
5:  15fddba8b21 = 5:  7b6c0d00668 ident: fix type of string length parameter
6:  588d7a397f9 ! 6:  8caa2899d25 refs: fix identity for migrated reflogs
    @@ Commit message
         bug is fixed we'll make the reflog verification in t1450 stricter, and
         that will catch both this bug here and the other bug.
     
    +    Note that we also add two new `name` and `mail` string buffers to the
    +    callback structures and splice them through to the callbacks. This is
    +    done so that we can avoid allocating a new buffer every time we compute
    +    the committer information.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## refs.c ##
    @@ refs.c: static int migrate_one_reflog_entry(struct object_id *old_oid,
     +
      	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
      	strbuf_reset(data->sb);
    - 	/* committer contains name and email */
    +-	/* committer contains name and email */
     -	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
     +	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
      
7:  fea43d4a3f9 ! 7:  e2ed6164a7c refs: stop unsetting REF_HAVE_OLD for log-only updates
    @@ Commit message
         refs: stop unsetting REF_HAVE_OLD for log-only updates
     
         The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
    -    object ID set. If so, the value of that field is used to verify whteher
    +    object ID set. If so, the value of that field is used to verify whether
         the current state of the reference matches this expected state. It is
         thus an important part of mitigating races with a concurrent process
         that updates the same set of references.
    @@ Commit message
     
         But unsetting this flag also removes a useful signal, namely that the
         caller _did_ provide an old object ID for a given reflog entry. This
    -    signal is useful to determine whether we have to resolve the refname
    -    manually to figure out the current state, or whether we should just go
    -    with what the caller has provided.
    +    signal will become useful in a subsequent commit, where we add a new
    +    flag that tells the transaction to use the provided old and new object
    +    IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
    +    signal to verify that the caller really did provide an old object ID.
     
    -    This actually causes real issues when migrating reflogs, as we don't
    -    know to actually use the caller-provided old object ID when writing
    -    those entries. Instead, reflog entries simply end up with the all-zero
    -    object ID.
    -
    -    Stop unsetting the flag so that we can use it as this described signal,
    -    which we'll do in a subsequent commit. Skip checking the old object ID
    -    for log-only updates so that we don't expect it to match the current
    -    on-disk state.
    +    Stop unsetting the flag so that we can use it as this described signal
    +    in a subsequent commit. Skip checking the old object ID for log-only
    +    updates so that we don't expect it to match the current on-disk state.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    @@ refs.c: int repo_migrate_ref_storage_format(struct repository *repo,
      
      int ref_update_expects_existing_old_ref(struct ref_update *update)
      {
    --	return (update->flags & REF_HAVE_OLD) &&
    -+	return (update->flags & (REF_HAVE_OLD | REF_LOG_ONLY)) == REF_HAVE_OLD &&
    ++	if (update->flags & REF_LOG_ONLY)
    ++		return 0;
    ++
    + 	return (update->flags & REF_HAVE_OLD) &&
      		(!is_null_oid(&update->old_oid) || update->old_target);
      }
    - 
     
      ## refs/files-backend.c ##
     @@ refs/files-backend.c: static enum ref_transaction_error split_symref_update(struct ref_update *update,
    @@ refs/files-backend.c: static int files_transaction_finish_initial(struct files_r
      			BUG("initial ref transaction with old_sha1 set");
      
     
    + ## refs/refs-internal.h ##
    +@@ refs/refs-internal.h: enum ref_transaction_error ref_update_check_old_target(const char *referent,
    + 
    + /*
    +  * Check if the ref must exist, this means that the old_oid or
    +- * old_target is non NULL.
    ++ * old_target is non NULL. Log-only updates never require the old state to
    ++ * match.
    +  */
    + int ref_update_expects_existing_old_ref(struct ref_update *update);
    + 
    +
      ## refs/reftable-backend.c ##
     @@ refs/reftable-backend.c: static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
      	if (ret > 0) {
8:  24b5261d882 ! 8:  2382633e2e3 refs: fix invalid old object IDs when migrating reflogs
    @@ refs/reftable-backend.c: static enum ref_transaction_error prepare_single_update
      	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
      	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
     
    + ## t/t1421-reflog-write.sh ##
    +@@ t/t1421-reflog-write.sh: test_expect_success 'simple writes' '
    + 		EOF
    + 
    + 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
    +-		# Note: the old object ID of the second reflog entry is broken.
    +-		# This will be fixed in subsequent commits.
    + 		test_reflog_matches . refs/heads/something <<-EOF
    + 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
    +-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
    ++		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
    + 		EOF
    + 	)
    + '
    +
      ## t/t1460-refs-migrate.sh ##
     @@ t/t1460-refs-migrate.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      
    @@ t/t1460-refs-migrate.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
     +print_all_reflog_entries () {
     +	repo=$1 &&
     +	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
    -+	cat reflogs | while read reflog
    ++	while read reflog
     +	do
     +		echo "REFLOG: $reflog" &&
     +		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
     +		return 1
    -+	done
    ++	done <reflogs
     +}
     +
      # Migrate the provided repository from one format to the other and

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fec..707a9b39edb 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 61 ++++++++++++++++++++++---------------------
 builtin/reflog.c              | 38 +++++++++++++--------------
 2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 707a9b39edb..c3801b82fb6 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
@@ -43,11 +43,9 @@ actions, and in addition the `HEAD` reflog records branch switching.
 
 The "list" subcommand lists all refs which have a corresponding reflog.
 
-The "expire" subcommand prunes older reflog entries. Entries older
-than `expire` time, or entries older than `expire-unreachable` time
-and not reachable from the current tip, are removed from the reflog.
-This is typically not used directly by end users -- instead, see
-linkgit:git-gc[1].
+The "exists" subcommand checks whether a ref has a reflog.  It exits
+with zero status if the reflog exists, and non-zero status if it does
+not.
 
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
@@ -58,9 +56,11 @@ The "drop" subcommand completely removes the reflog for the specified
 references. This is in contrast to "expire" and "delete", both of which
 can be used to delete reflog entries, but not the reflog itself.
 
-The "exists" subcommand checks whether a ref has a reflog.  It exits
-with zero status if the reflog exists, and non-zero status if it does
-not.
+The "expire" subcommand prunes older reflog entries. Entries older
+than `expire` time, or entries older than `expire-unreachable` time
+and not reachable from the current tip, are removed from the reflog.
+This is typically not used directly by end users -- instead, see
+linkgit:git-gc[1].
 
 OPTIONS
 -------
@@ -71,6 +71,25 @@ Options for `show`
 `git reflog show` accepts any of the options accepted by `git log`.
 
 
+Options for `delete`
+~~~~~~~~~~~~~~~~~~~~
+
+`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
+`--dry-run`, and `--verbose`, with the same meanings as when they are
+used with `expire`.
+
+Options for `drop`
+~~~~~~~~~~~~~~~~~~
+
+--all::
+	Drop the reflogs of all references from all worktrees.
+
+--single-worktree::
+	By default when `--all` is specified, reflogs from all working
+	trees are dropped. This option limits the processing to reflogs
+	from the current working tree only.
+
+
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -130,24 +149,6 @@ which didn't protect objects referred to by reflogs.
 	Print extra information on screen.
 
 
-Options for `delete`
-~~~~~~~~~~~~~~~~~~~~
-
-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
-`--dry-run`, and `--verbose`, with the same meanings as when they are
-used with `expire`.
-
-Options for `drop`
-~~~~~~~~~~~~~~~~~~
-
---all::
-	Drop the reflogs of all references from all worktrees.
-
---single-worktree::
-	By default when `--all` is specified, reflogs from all working
-	trees are dropped. This option limits the processing to reflogs
-	from the current working tree only.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c2..b00b3f9edc9 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()`
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2b..8aa9f7236a3 100644
--- a/refs.c
+++ b/refs.c
@@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3019,8 +3013,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index 46a6008e07f..253dd8f4d5d 100644
--- a/refs.h
+++ b/refs.h
@@ -795,6 +795,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to `ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When set to zero, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-07-25  6:58   ` [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-28 15:33     ` Kristoffer Haugsbakk
  2025-07-25  6:58   ` [PATCH v2 5/8] ident: fix type of string length parameter Patrick Steinhardt
                     ` (3 subsequent siblings)
  7 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |   7 +++
 builtin/reflog.c              |  65 +++++++++++++++++++++++++++
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 101 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index c3801b82fb6..c8389810273 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
@@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a reflog.  It exits
 with zero status if the reflog exists, and non-zero status if it does
 not.
 
+The "write" subcommand writes a single entry to the reflog of a given
+reference. This new entry is appended to the reflog and will thus become
+the most recent entry. Both the old and new object IDs must not be
+abbreviated and must point to existing objects. The reflog message gets
+normalized.
+
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
 reflog delete master@{2}`"). This subcommand is also typically not used
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc9..d3f0009cb0e 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (check_refname_format(ref, REFNAME_ALLOW_ONELEVEL))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index d052fc3e23d..adcdf09e740 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -220,6 +220,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 00000000000..0cd2c4f4f31
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,101 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
+		test_grep "old object .* does not exist" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'abbreviated object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
+		test_grep "invalid old object ID" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID 12345 new-object-id 2>err &&
+		test_grep "invalid new object ID" err
+	)
+'
+
+test_expect_success 'reflog message gets normalized' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
+		git reflog show -1 --format=%gs HEAD >actual &&
+		echo "message with newlines" >expected &&
+		test_cmp expected actual
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		# Note: the old object ID of the second reflog entry is broken.
+		# This will be fixed in subsequent commits.
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_done

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 5/8] ident: fix type of string length parameter
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 967895d8850..a7a2d132579 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 6/8] refs: fix identity for migrated reflogs
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-07-25  6:58   ` [PATCH v2 5/8] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
  2025-07-25  6:58   ` [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Note that we also add two new `name` and `mail` string buffers to the
callback structures and splice them through to the callbacks. This is
done so that we can avoid allocating a new buffer every time we compute
the committer information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 8aa9f7236a3..a5f9ffaa45d 100644
--- a/refs.c
+++ b/refs.c
@@ -2954,7 +2954,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2993,7 +2993,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -3003,13 +3003,21 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
-	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3026,6 +3034,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3124,6 +3134,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3299,6 +3311,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-07-25  6:58   ` [PATCH v2 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  2025-07-25 11:36     ` Jeff King
  2025-07-25  6:58   ` [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  7 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.

Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  8 +++-----
 refs/files-backend.c    |  9 +++++----
 refs/refs-internal.h    |  3 ++-
 refs/reftable-backend.c | 12 +++---------
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index a5f9ffaa45d..f88928de746 100644
--- a/refs.c
+++ b/refs.c
@@ -1393,11 +1393,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3318,6 +3313,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
+
 	return (update->flags & REF_HAVE_OLD) &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index bf6f89b1d19..8b42fe18901 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3059,7 +3059,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f8688708519..95a4dc3902f 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -802,7 +802,8 @@ enum ref_transaction_error ref_update_check_old_target(const char *referent,
 
 /*
  * Check if the ref must exist, this means that the old_oid or
- * old_target is non NULL.
+ * old_target is non NULL. Log-only updates never require the old state to
+ * match.
  */
 int ref_update_expects_existing_old_ref(struct ref_update *update);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec1..44af58ac50b 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-07-25  6:58   ` Patrick Steinhardt
  7 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-25  6:58 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1421-reflog-write.sh |  4 +---
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index f88928de746..946eb48941b 100644
--- a/refs.c
+++ b/refs.c
@@ -1385,7 +1385,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 253dd8f4d5d..090b4fdff4f 100644
--- a/refs.h
+++ b/refs.h
@@ -760,13 +760,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 8b42fe18901..b891a326a13 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2974,6 +2974,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -2991,7 +3005,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50b..99fafd75ebe 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
index 0cd2c4f4f31..a09b5cd44e9 100755
--- a/t/t1421-reflog-write.sh
+++ b/t/t1421-reflog-write.sh
@@ -89,11 +89,9 @@ test_expect_success 'simple writes' '
 		EOF
 
 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
-		# Note: the old object ID of the second reflog entry is broken.
-		# This will be fixed in subsequent commits.
 		test_reflog_matches . refs/heads/something <<-EOF
 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
 		EOF
 	)
 '
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7df..0e1116a319d 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done <reflogs
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.50.1.565.gc32cd1483b.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-07-25 11:36     ` Jeff King
  2025-07-28 14:43       ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Jeff King @ 2025-07-25 11:36 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes

On Fri, Jul 25, 2025 at 08:58:29AM +0200, Patrick Steinhardt wrote:

> The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
> object ID set. If so, the value of that field is used to verify whether
> the current state of the reference matches this expected state. It is
> thus an important part of mitigating races with a concurrent process
> that updates the same set of references.
> 
> When writing reflogs though we explicitly unset that flag. This is a
> sensible thing to do: the old state of reflog entry updates may not
> necessarily match the current on-disk state of its accompanying ref, but
> it's only intended to signal what old object ID we want to write into
> the new reflog entry. For example when migrating refs we end up writing
> many reflog entries for a single reference, and most likely those reflog
> entries will have many different old object IDs.
> 
> But unsetting this flag also removes a useful signal, namely that the
> caller _did_ provide an old object ID for a given reflog entry. This
> signal will become useful in a subsequent commit, where we add a new
> flag that tells the transaction to use the provided old and new object
> IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
> signal to verify that the caller really did provide an old object ID.
> 
> Stop unsetting the flag so that we can use it as this described signal
> in a subsequent commit. Skip checking the old object ID for log-only
> updates so that we don't expect it to match the current on-disk state.

I like this direction, but I happened to be working in this area
yesterday[1] and noticed something interesting. You're effectively
replacing this removal of the HAVE_OLD flag when split a symref update:

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index bf6f89b1d19..8b42fe18901 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
>  	 * done when new_update is processed.
>  	 */
>  	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
> -	update->flags &= ~REF_HAVE_OLD;
>  
>  	return 0;
>  }

and then later we get the same logic by checking for LOG_ONLY:

> @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
>  						struct object_id *oid,
>  						struct strbuf *err)
>  {
> -	if (!(update->flags & REF_HAVE_OLD) ||
> -		   oideq(oid, &update->old_oid))
> +	if (update->flags & REF_LOG_ONLY ||
> +	    !(update->flags & REF_HAVE_OLD) ||
> +	    oideq(oid, &update->old_oid))
>  		return 0;
>  
>  	if (is_null_oid(&update->old_oid)) {

Which make sense to me. But the weird thing I noticed is that when we do
something similar for split_head_update(), we don't strip REF_HAVE_OLD!

(For those not familiar with that function, it notices when we are
updating refs/heads/foo that is pointed-to by HEAD, and then adds an
extra HEAD reflog update to the transaction).

So as I understand it, right now we are doing an extra check_old_oid()
on that log-only HEAD update, and after your patch we would stop doing
so.

Which I _think_ is the right thing to do, but it made me wonder if the
transaction were ever non-atomic. That is, could we split off a log-only
update that succeeds, even though the old-oid check for the actual
ref fails?

Historically, I'd guess the answer is mostly "no", because the point of
ref transactions is to be all-or-nothing, and to do the locking and
old-oid checking before writing out any updates. But I also think I saw
some discussion of non-atomic transactions recently. I didn't really
follow it, but is this a potential problem?

-Peff

[1] If you are wondering what work: it is the fact that at least with
    the files backend, we will happily overwrite a dangling symref even
    when the caller asked us to make sure this is a creation event. That
    is easy to fix, but I was surprised that some HEAD updates failed
    after doing so. The problem is that the reflog update for HEAD did
    not clear the HAVE_OLD flag, and my solution was to do so (just like
    split_symref_update() does). But as your topic here shows, that will
    probably result in broken reflogs. And we should be checking for
    LOG_ONLY in check_old_oid, as you're doing here (which would also
    fix my problem).

    But that also makes me wonder: should ref_update_check_old_target()
    also be checking LOG_ONLY now in your patch? I guess not, as it does
    not use HAVE_OLD at all (that is just about the oid). We get the
    equivalent behavior in the split-off log-only transaction item
    because we just do not set "old_target" in the split-off item.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-25  5:36       ` Patrick Steinhardt
@ 2025-07-25 14:35         ` Junio C Hamano
  0 siblings, 0 replies; 114+ messages in thread
From: Junio C Hamano @ 2025-07-25 14:35 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: SZEDER Gábor, git, Karthik Nayak

Patrick Steinhardt <ps@pks.im> writes:

> Sure, I can rebase this on top of v2.50.1. It would then of course
> require some smallish fixes when merged to `seen`. The below patch is
> what is required to make it work with the v2.50 track.
>
> Patrick

Thanks.


> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index bc7e7f5e442..d3f0009cb0e 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -4,7 +4,7 @@
>  #include "config.h"
>  #include "gettext.h"
>  #include "hex.h"
> -#include "odb.h"
> +#include "object-store.h"
>  #include "revision.h"
>  #include "reachable.h"
>  #include "wildmatch.h"
> @@ -426,13 +426,13 @@ static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
>  	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
>  	if (ret)
>  		die(_("invalid old object ID: '%s'"), argv[1]);
> -	if (!is_null_oid(&old_oid) && !odb_has_object(repo->objects, &old_oid, 0))
> +	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
>  		die(_("old object '%s' does not exist"), argv[1]);
>  
>  	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
>  	if (ret)
>  		die(_("invalid new object ID: '%s'"), argv[2]);
> -	if (!is_null_oid(&new_oid) && !odb_has_object(repo->objects, &new_oid, 0))
> +	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
>  		die(_("new object '%s' does not exist"), argv[2]);
>  
>  	message = argv[3];

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-25 11:36     ` Jeff King
@ 2025-07-28 14:43       ` Patrick Steinhardt
  2025-07-29  7:14         ` Jeff King
  0 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-28 14:43 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes

On Fri, Jul 25, 2025 at 07:36:10AM -0400, Jeff King wrote:
> On Fri, Jul 25, 2025 at 08:58:29AM +0200, Patrick Steinhardt wrote:
> 
> > The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
> > object ID set. If so, the value of that field is used to verify whether
> > the current state of the reference matches this expected state. It is
> > thus an important part of mitigating races with a concurrent process
> > that updates the same set of references.
> > 
> > When writing reflogs though we explicitly unset that flag. This is a
> > sensible thing to do: the old state of reflog entry updates may not
> > necessarily match the current on-disk state of its accompanying ref, but
> > it's only intended to signal what old object ID we want to write into
> > the new reflog entry. For example when migrating refs we end up writing
> > many reflog entries for a single reference, and most likely those reflog
> > entries will have many different old object IDs.
> > 
> > But unsetting this flag also removes a useful signal, namely that the
> > caller _did_ provide an old object ID for a given reflog entry. This
> > signal will become useful in a subsequent commit, where we add a new
> > flag that tells the transaction to use the provided old and new object
> > IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
> > signal to verify that the caller really did provide an old object ID.
> > 
> > Stop unsetting the flag so that we can use it as this described signal
> > in a subsequent commit. Skip checking the old object ID for log-only
> > updates so that we don't expect it to match the current on-disk state.
> 
> I like this direction, but I happened to be working in this area
> yesterday[1] and noticed something interesting. You're effectively
> replacing this removal of the HAVE_OLD flag when split a symref update:
> 
> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index bf6f89b1d19..8b42fe18901 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -2493,7 +2493,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
> >  	 * done when new_update is processed.
> >  	 */
> >  	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
> > -	update->flags &= ~REF_HAVE_OLD;
> >  
> >  	return 0;
> >  }
> 
> and then later we get the same logic by checking for LOG_ONLY:
> 
> > @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
> >  						struct object_id *oid,
> >  						struct strbuf *err)
> >  {
> > -	if (!(update->flags & REF_HAVE_OLD) ||
> > -		   oideq(oid, &update->old_oid))
> > +	if (update->flags & REF_LOG_ONLY ||
> > +	    !(update->flags & REF_HAVE_OLD) ||
> > +	    oideq(oid, &update->old_oid))
> >  		return 0;
> >  
> >  	if (is_null_oid(&update->old_oid)) {
> 
> Which make sense to me. But the weird thing I noticed is that when we do
> something similar for split_head_update(), we don't strip REF_HAVE_OLD!

And we shouldn't do that, as in the next commit we actually build on
always having `REF_HAVE_OLD` set for reflog-only updates. So I'd argue
that the problem is actually the other way round: when splitting off the
HEAD update we must resolve the old object ID if `REF_HAVE_OLD` is not
set.

> (For those not familiar with that

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-28 15:33     ` Kristoffer Haugsbakk
  2025-07-28 18:49       ` Junio C Hamano
  2025-07-29  0:25       ` Ben Knoble
  0 siblings, 2 replies; 114+ messages in thread
From: Kristoffer Haugsbakk @ 2025-07-28 15:33 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes

On Fri, Jul 25, 2025, at 08:58, Patrick Steinhardt wrote:
> While we provide a couple of subcommands in git-reflog(1) to remove
> reflog entries, we don't provide any to write new entries. Obviously
> this is not an operation that really would be needed for many use cases
> out there, or otherwise people would have complained that such a command
> does not exist yet.

This command will allow you to write a simpler unique marker (without
having to make a marker-ref for a ref) that can be used to find back to
a specific point.

I’ve had some use for that.  I used git-update-ref(1) for that because
of `-m` (as well as plumbing-for-scripting).

> diff --git a/Documentation/git-reflog.adoc
> b/Documentation/git-reflog.adoc
> index c3801b82fb6..c8389810273 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -12,6 +12,7 @@ SYNOPSIS
>  git reflog [show] [<log-options>] [<ref>]
>  git reflog list
>  git reflog exists <ref>
> +git reflog write <ref> <old-oid> <new-oid> <message>
>  git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
>  git reflog drop [--all [--single-worktree] | <refs>...]
> @@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a
> reflog.  It exits
>  with zero status if the reflog exists, and non-zero status if it does
>  not.
>
> +The "write" subcommand writes a single entry to the reflog of a given
> +reference. This new entry is appended to the reflog and will thus become
> +the most recent entry. Both the old and new object IDs must not be
> +abbreviated and must point to existing objects. The reflog message gets
> +normalized.
> +

You have to give the full refname to this subcommand.  `git reflog write
... branch <msg>` will update the reflog for the one-level ref `branch`.
But I’m used to using git-reflog(1) with a name like `branch` and it
using `refs/heads/branch` if it exists.  At least that’s how the default
`git reflog show` behaves.

Which means that

    git reflog write ... refs/heads/branch <msg>
    git reflog branch

Will show that written reflog.

Whereas this

    git reflog write ... branch <msg>
    git reflog branch

Will show one entry since `branch` is the one-level ref `branch`, not
`refs/heads/branch`.  Now it looks like `write` truncated the reflog and
wrote a new reflog message (if you mistakenly think that `branch` is a
branch).

It isn’t clear to me how the current doc guides me in the correct
direction here.

I tried `git reflog drop`[1] and it can deal with a branch like
`branch`.  It doesn’t need to be told `refs/heads/branch`.

>  The "delete" subcommand deletes single entries from the reflog, but
>  not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
>  reflog delete master@{2}`"). This subcommand is also typically not used

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-28 15:33     ` Kristoffer Haugsbakk
@ 2025-07-28 18:49       ` Junio C Hamano
  2025-07-28 20:39         ` Karthik Nayak
  2025-07-29  0:25       ` Ben Knoble
  1 sibling, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-07-28 18:49 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: Patrick Steinhardt, git, Karthik Nayak, Justin Tobler,
	SZEDER Gábor, Toon Claes

"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:

> I tried `git reflog drop`[1] and it can deal with a branch like
> `branch`.  It doesn’t need to be told `refs/heads/branch`.

That sounds like a bug to me.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-28 18:49       ` Junio C Hamano
@ 2025-07-28 20:39         ` Karthik Nayak
  2025-07-28 20:59           ` Junio C Hamano
  0 siblings, 1 reply; 114+ messages in thread
From: Karthik Nayak @ 2025-07-28 20:39 UTC (permalink / raw)
  To: Junio C Hamano, Kristoffer Haugsbakk
  Cc: Patrick Steinhardt, git, Justin Tobler, SZEDER Gábor,
	Toon Claes

[-- Attachment #1: Type: text/plain, Size: 828 bytes --]

Junio C Hamano <gitster@pobox.com> writes:

> "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:
>
>> I tried `git reflog drop`[1] and it can deal with a branch like
>> `branch`.  It doesn’t need to be told `refs/heads/branch`.
>
> That sounds like a bug to me.

So `git reflog drop` `git reflog delete` and `git reflog expire` use
`repo_dwim_log()` to resolve the provided reference.

And `repo_dwim_log()` uses the following `ref_rev_parse_rules` to
resolve the reference.

  static const char *ref_rev_parse_rules[] = {
  	"%.*s",
  	"refs/%.*s",
  	"refs/tags/%.*s",
  	"refs/heads/%.*s",
  	"refs/remotes/%.*s",
  	"refs/remotes/%.*s/HEAD",
  	NULL
  };

Which means we do a best case resolution of a given reference, but the
function also checks for ambiguity and warns for it.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-28 20:39         ` Karthik Nayak
@ 2025-07-28 20:59           ` Junio C Hamano
  2025-07-30  7:55             ` Karthik Nayak
  0 siblings, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-07-28 20:59 UTC (permalink / raw)
  To: Karthik Nayak
  Cc: Kristoffer Haugsbakk, Patrick Steinhardt, git, Justin Tobler,
	SZEDER Gábor, Toon Claes

Karthik Nayak <karthik.188@gmail.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:
>>
>>> I tried `git reflog drop`[1] and it can deal with a branch like
>>> `branch`.  It doesn’t need to be told `refs/heads/branch`.
>>
>> That sounds like a bug to me.
>
> So `git reflog drop` `git reflog delete` and `git reflog expire` use
> `repo_dwim_log()` to resolve the provided reference.
>
> And `repo_dwim_log()` uses the following `ref_rev_parse_rules` to
> resolve the reference.
>
>   static const char *ref_rev_parse_rules[] = {
>   	"%.*s",
>   	"refs/%.*s",
>   	"refs/tags/%.*s",
>   	"refs/heads/%.*s",
>   	"refs/remotes/%.*s",
>   	"refs/remotes/%.*s/HEAD",
>   	NULL
>   };
>
> Which means we do a best case resolution of a given reference, but the
> function also checks for ambiguity and warns for it.

True.  But as I considered "git reflog" to be a lot closer to the
plumbing than to Porcelain, using the dwim thing smelled like a bug.

It also is OK to update the commands that do not use dwim-log to
also use it.  That way, the result would be consistent across
subcommands of "git reflog".  As long as the users are aware of the
fact that the command uses dwim-log, they can always spell their ref
in full like "refs/heads/branch" to avoid ambiguity check getting in
the way.

Thanks.



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-28 15:33     ` Kristoffer Haugsbakk
  2025-07-28 18:49       ` Junio C Hamano
@ 2025-07-29  0:25       ` Ben Knoble
  2025-07-29  6:14         ` Kristoffer Haugsbakk
  2025-07-29  6:51         ` Patrick Steinhardt
  1 sibling, 2 replies; 114+ messages in thread
From: Ben Knoble @ 2025-07-29  0:25 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: Patrick Steinhardt, git, Karthik Nayak, Justin Tobler,
	Junio C Hamano, SZEDER Gábor, Toon Claes


> Le 28 juil. 2025 à 11:37, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
> 
> On Fri, Jul 25, 2025, at 08:58, Patrick Steinhardt wrote:
>> While we provide a couple of subcommands in git-reflog(1) to remove
>> reflog entries, we don't provide any to write new entries. Obviously
>> this is not an operation that really would be needed for many use cases
>> out there, or otherwise people would have complained that such a command
>> does not exist yet.
> 
> This command will allow you to write a simpler unique marker (without
> having to make a marker-ref for a ref) that can be used to find back to
> a specific point.
> 
> I’ve had some use for that.  I used git-update-ref(1) for that because
> of `-m` (as well as plumbing-for-scripting).
> 
>> diff --git a/Documentation/git-reflog.adoc
>> b/Documentation/git-reflog.adoc
>> index c3801b82fb6..c8389810273 100644
>> --- a/Documentation/git-reflog.adoc
>> +++ b/Documentation/git-reflog.adoc
>> @@ -12,6 +12,7 @@ SYNOPSIS
>> git reflog [show] [<log-options>] [<ref>]
>> git reflog list
>> git reflog exists <ref>
>> +git reflog write <ref> <old-oid> <new-oid> <message>
>> git reflog delete [--rewrite] [--updateref]
>>   [--dry-run | -n] [--verbose] <ref>@{<specifier>}...
>> git reflog drop [--all [--single-worktree] | <refs>...]
>> @@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a
>> reflog.  It exits
>> with zero status if the reflog exists, and non-zero status if it does
>> not.
>> +The "write" subcommand writes a single entry to the reflog of a given
>> +reference. This new entry is appended to the reflog and will thus become
>> +the most recent entry. Both the old and new object IDs must not be
>> +abbreviated and must point to existing objects. The reflog message gets
>> +normalized.
>> +
> 
> You have to give the full refname to this subcommand.  `git reflog write
> ... branch <msg>` will update the reflog for the one-level ref `branch`.
> But I’m used to using git-reflog(1) with a name like `branch` and it
> using `refs/heads/branch` if it exists.  At least that’s how the default
> `git reflog show` behaves.
> 
> Which means that
> 
>   git reflog write ... refs/heads/branch <msg>
>   git reflog branch
> 
> Will show that written reflog.
> 
> Whereas this
> 
>   git reflog write ... branch <msg>
>   git reflog branch
> 
> Will show one entry since `branch` is the one-level ref `branch`, not
> `refs/heads/branch`.  Now it looks like `write` truncated the reflog and
> wrote a new reflog message (if you mistakenly think that `branch` is a
> branch).

This quirk of update-ref bit me the first few times I used it, too. I think it’s at least documented there though.

> 
> It isn’t clear to me how the current doc guides me in the correct
> direction here.
> 
> I tried `git reflog drop`[1] and it can deal with a branch like
> `branch`.  It doesn’t need to be told `refs/heads/branch`.

(Partly responding to comments about what to do with this) I think consistency would be best, and since “git reflog branch” is not abnormal we should continue to allow that. 

> 
>> The "delete" subcommand deletes single entries from the reflog, but
>> not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
>> reflog delete master@{2}`"). This subcommand is also typically not used

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-29  0:25       ` Ben Knoble
@ 2025-07-29  6:14         ` Kristoffer Haugsbakk
  2025-07-29  6:51         ` Patrick Steinhardt
  1 sibling, 0 replies; 114+ messages in thread
From: Kristoffer Haugsbakk @ 2025-07-29  6:14 UTC (permalink / raw)
  To: D. Ben Knoble
  Cc: Patrick Steinhardt, git, Karthik Nayak, Justin Tobler,
	Junio C Hamano, SZEDER Gábor, Toon Claes

On Tue, Jul 29, 2025, at 02:25, Ben Knoble wrote:
>> ...
>> Will show one entry since `branch` is the one-level ref `branch`, not
>> `refs/heads/branch`.  Now it looks like `write` truncated the reflog and
>> wrote a new reflog message (if you mistakenly think that `branch` is a
>> branch).
>
> This quirk of update-ref bit me the first few times I used it, too. I
> think it’s at least documented there though.

I do that with update-ref more than I care to commit^W admit.  But
that’s consistent with the command and well-documented.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-29  0:25       ` Ben Knoble
  2025-07-29  6:14         ` Kristoffer Haugsbakk
@ 2025-07-29  6:51         ` Patrick Steinhardt
  2025-07-29 15:00           ` Junio C Hamano
  1 sibling, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  6:51 UTC (permalink / raw)
  To: Ben Knoble
  Cc: Kristoffer Haugsbakk, git, Karthik Nayak, Justin Tobler,
	Junio C Hamano, SZEDER Gábor, Toon Claes

On Mon, Jul 28, 2025 at 08:25:43PM -0400, Ben Knoble wrote:
> > Le 28 juil. 2025 à 11:37, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
> > On Fri, Jul 25, 2025, at 08:58, Patrick Steinhardt wrote:
> >> diff --git a/Documentation/git-reflog.adoc
> >> b/Documentation/git-reflog.adoc
> >> index c3801b82fb6..c8389810273 100644
> >> --- a/Documentation/git-reflog.adoc
> >> +++ b/Documentation/git-reflog.adoc
> >> @@ -12,6 +12,7 @@ SYNOPSIS
> >> git reflog [show] [<log-options>] [<ref>]
> >> git reflog list
> >> git reflog exists <ref>
> >> +git reflog write <ref> <old-oid> <new-oid> <message>
> >> git reflog delete [--rewrite] [--updateref]
> >>   [--dry-run | -n] [--verbose] <ref>@{<specifier>}...
> >> git reflog drop [--all [--single-worktree] | <refs>...]
> >> @@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a
> >> reflog.  It exits
> >> with zero status if the reflog exists, and non-zero status if it does
> >> not.
> >> +The "write" subcommand writes a single entry to the reflog of a given
> >> +reference. This new entry is appended to the reflog and will thus become
> >> +the most recent entry. Both the old and new object IDs must not be
> >> +abbreviated and must point to existing objects. The reflog message gets
> >> +normalized.
> >> +
> > 
> > You have to give the full refname to this subcommand.  `git reflog write
> > ... branch <msg>` will update the reflog for the one-level ref `branch`.
> > But I’m used to using git-reflog(1) with a name like `branch` and it
> > using `refs/heads/branch` if it exists.  At least that’s how the default
> > `git reflog show` behaves.
> > 
> > Which means that
> > 
> >   git reflog write ... refs/heads/branch <msg>
> >   git reflog branch
> > 
> > Will show that written reflog.
> > 
> > Whereas this
> > 
> >   git reflog write ... branch <msg>
> >   git reflog branch
> > 
> > Will show one entry since `branch` is the one-level ref `branch`, not
> > `refs/heads/branch`.  Now it looks like `write` truncated the reflog and
> > wrote a new reflog message (if you mistakenly think that `branch` is a
> > branch).
> 
> This quirk of update-ref bit me the first few times I used it, too. I
> think it’s at least documented there though.
> 
> > 
> > It isn’t clear to me how the current doc guides me in the correct
> > direction here.
> > 
> > I tried `git reflog drop`[1] and it can deal with a branch like
> > `branch`.  It doesn’t need to be told `refs/heads/branch`.
> 
> (Partly responding to comments about what to do with this) I think
> consistency would be best, and since “git reflog branch” is not
> abnormal we should continue to allow that. 

There's a big difference though: `git reflog drop` won't ever do
anything for a reflog that doesn't exist. Consequently, we know that our
DWIM mechanism can kick in and resolve the reference properly if such a
reflog exists.

But for `git reflog write` that's not the case, as you can write a
reflog message for a yet-nonexistent reflog. The DWIM mechanism cannot
kick in here as there is no reflog. So what do we do in that case? We
could of course just pick the first DWIM rule, which would be that we
decide to write the reflog for "refs/heads/$REFNAME". But... I dunno,
that feels too magicky to m

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-28 14:43       ` Patrick Steinhardt
@ 2025-07-29  7:14         ` Jeff King
  2025-07-29  7:54           ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Jeff King @ 2025-07-29  7:14 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes

On Mon, Jul 28, 2025 at 04:43:12PM +0200, Patrick Steinhardt wrote:

> > > @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
> > >  						struct object_id *oid,
> > >  						struct strbuf *err)
> > >  {
> > > -	if (!(update->flags & REF_HAVE_OLD) ||
> > > -		   oideq(oid, &update->old_oid))
> > > +	if (update->flags & REF_LOG_ONLY ||
> > > +	    !(update->flags & REF_HAVE_OLD) ||
> > > +	    oideq(oid, &update->old_oid))
> > >  		return 0;
> > >  
> > >  	if (is_null_oid(&update->old_oid)) {
> > 
> > Which make sense to me. But the weird thing I noticed is that when we do
> > something similar for split_head_update(), we don't strip REF_HAVE_OLD!
> 
> And we shouldn't do that, as in the next commit we actually build on
> always having `REF_HAVE_OLD` set for reflog-only updates. So I'd argue
> that the problem is actually the other way round: when splitting off the
> HEAD update we must resolve the old object ID if `REF_HAVE_OLD` is not
> set.

Yeah, I agree that after your patches, split_head_update() should
definitely not be clearing that flag. What I more meant was: this patch
is introducing a behavior change for those split HEAD updates, which
used to do the extra old-oid check but now won't (whereas for other
symref log-only updates, you are preserving the behavior).

I _think_ that's a reasonable thing, but I wanted to make sure.

However...

> > (For those not familiar with that

...did you mean to write more? I know you've been running into weird
email truncation issues lately.

-Peff

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-29  7:14         ` Jeff King
@ 2025-07-29  7:54           ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  7:54 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes

On Tue, Jul 29, 2025 at 03:14:55AM -0400, Jeff King wrote:
> On Mon, Jul 28, 2025 at 04:43:12PM +0200, Patrick Steinhardt wrote:
> > > > @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
> > > >  						struct object_id *oid,
> > > >  						struct strbuf *err)
> > > >  {
> > > > -	if (!(update->flags & REF_HAVE_OLD) ||
> > > > -		   oideq(oid, &update->old_oid))
> > > > +	if (update->flags & REF_LOG_ONLY ||
> > > > +	    !(update->flags & REF_HAVE_OLD) ||
> > > > +	    oideq(oid, &update->old_oid))
> > > >  		return 0;
> > > >  
> > > >  	if (is_null_oid(&update->old_oid)) {
> > > 
> > > Which make sense to me. But the weird thing I noticed is that when we do
> > > something similar for split_head_update(), we don't strip REF_HAVE_OLD!
> > 
> > And we shouldn't do that, as in the next commit we actually build on
> > always having `REF_HAVE_OLD` set for reflog-only updates. So I'd argue
> > that the problem is actually the other way round: when splitting off the
> > HEAD update we must resolve the old object ID if `REF_HAVE_OLD` is not
> > set.
> 
> Yeah, I agree that after your patches, split_head_update() should
> definitely not be clearing that flag. What I more meant was: this patch
> is introducing a behavior change for those split HEAD updates, which
> used to do the extra old-oid check but now won't (whereas for other
> symref log-only updates, you are preserving the behavior).
> 
> I _think_ that's a reasonable thing, but I wanted to make sure.
> 
> However...
> 
> > > (For those not familiar with that
> 
> ...did you mean to write more? I know you've been running into weird
> email truncation issues lately.

Sigh. Yes. I really need to figure this out, but I have no clue
whatsoever where to look.

Anyway, here's the remainder of that mail:

On Fri, Jul 25, 2025 at 07:36:10AM -0400, Jeff King wrote:
> On Fri, Jul 25, 2025 at 08:58:29AM +0200, Patrick Steinhardt wrote:
> > @@ -2508,8 +2507,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
> (For those not familiar with that function, it notices when we are
> updating refs/heads/foo that is pointed-to by HEAD, and then adds an
> extra HEAD reflog update to the transaction).
> 
> So as I understand it, right now we are doing an extra check_old_oid()
> on that log-only HEAD update, and after your patch we would stop doing
> so.
> 
> Which I _think_ is the right thing to do, but it made me wonder if the
> transaction were ever non-atomic. That is, could we split off a log-only
> update that succeeds, even though the old-oid check for the actual
> ref fails?

I think this can only happen the other way round: the log update never
gets persisted unless the parent ref is, as we'd otherwise abort. But
what can happen is that we end up with a broken reflog entry. See below.

> Historically, I'd guess the answer is mostly "no", because the point of
> ref transactions is to be all-or-nothing, and to do the locking and
> old-oid checking before writing out any updates. But I also think I saw
> some discussion of non-atomic transactions recently. I didn't really
> follow it, but is this a potential problem?

I'd say that the whole logic has always been flawed: we resolve the
target that "HEAD" points to without locking the reference. Consequently
we have a race in case "HEAD" got updated to point somewhere else, as
we'd still write a reflog entry to "HEAD".

What used to save us a bit is that at least the old object ID would be
correct in such a case because we used to verify it even for log-only
updates. But whether or not we write such a reflog message in the first
place is subject to a race. And when `REF_HAVE_OLD` wasn't set we might
even end up writing a different old object ID than what we write to the
target reference.

Theoretically speaking we'd have to lock "HEAD" immediately after we
have resolved it to ensure that it doesn't change anymore. But that lock
would be quite restrictive.

I guess the next-best thing that we can do is to lock "HEAD" as soon as
we find the ref that it's pointing to. If so, we can re-evaluate whether
"HEAD" still points to the same ref -- and if so, we split off the
update for the reflog. If it doesn't anymore then all bets are off, as
it may be the case that "HEAD" has now been changed to point to a ref
that has already been processed by us.

I guess the safest bet would be to just abort the whole transaction in
that case? After all it is a racy update, but it feels heavy-handed to
reject the whole transaction only because we fail to write a reflog
entry.

But even that doesn't solve this race completely: "HEAD" might have been
unborn at the start of a transaction or refer to a target ref that isn't
updated as part of the ref. So we wouldn't ever get to re-resolving the
ref. We could double check at the end of the transaction whether "HEAD"
has changed, but that isn't really working either as its target ref may
have flip-flopped.

In any case, I think we can improve the situation at least a bit:

  - We lock the parent update before calling `split_head_update()`. This
    ensures the old object ID is resolved already and cannot change
    anymore.

  - In `split_head_update()` we take in the parent lock as a parameter.
    If `REF_HAVE_OLD` is unset we take the old object ID from the parent
    lock and set `REF_HAVE_OLD` for the reflog entry. This ensures that
    we at least use the same old ID for both ref and reflog updates.

But the other race, that "HEAD" may have changed concurrently... I don't
think that one can be plugged without a bigger effort.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v3 0/9] refs: fix migration of reflog entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (8 preceding siblings ...)
  2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-07-29  8:55 ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                     ` (8 more replies)
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (2 subsequent siblings)
  12 siblings, 9 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@pks.im

Changes in v3:
  - `git reflog write` now requires fully-qualified refnames.
  - A new commit that plugs one part of the race around splitting of
    reflogs for HEAD in the "files" backend.
  - Link to v2: https://lore.kernel.org/r/20250725-pks-reflog-append-v2-0-e4e7cbe3f578@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (9):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs/files: detect race when generating reflog entry for HEAD
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  76 ++++++++++++++------------
 builtin/reflog.c              | 103 ++++++++++++++++++++++++++++-------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++---------
 refs.h                        |  24 +++++++-
 refs/files-backend.c          |  65 +++++++++++++++++++---
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 ++++++---
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 124 ++++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 +++++---
 12 files changed, 401 insertions(+), 107 deletions(-)

Range-diff versus v2:

 1:  65f4647df02 =  1:  027ac6d12f3 Documentation/git-reflog: convert to use synopsis type
 2:  e53a402a88d =  2:  1570cac0cb9 builtin/reflog: improve grouping of subcommands
 3:  4d060861f50 =  3:  af43c907fa0 refs: export `ref_transaction_update_reflog()`
 4:  ddd471f9891 !  4:  4322f98fcdd builtin/reflog: implement subcommand to write new entries
    @@ Documentation/git-reflog.adoc: The "exists" subcommand checks whether a ref has
      
     +The "write" subcommand writes a single entry to the reflog of a given
     +reference. This new entry is appended to the reflog and will thus become
    -+the most recent entry. Both the old and new object IDs must not be
    -+abbreviated and must point to existing objects. The reflog message gets
    -+normalized.
    ++the most recent entry. The reference name must be fully qualified. Both the old
    ++and new object IDs must not be abbreviated and must point to existing objects.
    ++The reflog message gets normalized.
     +
      The "delete" subcommand deletes single entries from the reflog, but
      not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
    @@ builtin/reflog.c: static int cmd_reflog_drop(int argc, const char **argv, const
     +		usage_with_options(reflog_write_usage, options);
     +
     +	ref = argv[0];
    -+	if (check_refname_format(ref, REFNAME_ALLOW_ONELEVEL))
    ++	if (!is_root_ref(ref) && check_refname_format(ref, 0))
     +		die(_("invalid reference name: %s"), ref);
     +
     +	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
    @@ t/t1421-reflog-write.sh (new)
     +	)
     +'
     +
    ++test_expect_success 'unqualified refname is rejected' '
    ++	test_when_finished "rm -rf repo" &&
    ++	git init repo &&
    ++	(
    ++		cd repo &&
    ++		test_must_fail git reflog write unqualified $ZERO_OID $ZERO_OID first 2>err &&
    ++		test_grep "invalid reference name: " err
    ++	)
    ++'
    ++
     +test_expect_success 'nonexistent object IDs' '
     +	test_when_finished "rm -rf repo" &&
     +	git init repo &&
    @@ t/t1421-reflog-write.sh (new)
     +	)
     +'
     +
    ++test_expect_success 'can write to root ref' '
    ++	test_when_finished "rm -rf repo" &&
    ++	git init repo &&
    ++	(
    ++		cd repo &&
    ++		test_commit initial &&
    ++		COMMIT_OID=$(git rev-parse HEAD) &&
    ++
    ++		git reflog write ROOT_REF_HEAD $ZERO_OID $COMMIT_OID first &&
    ++		test_reflog_matches . ROOT_REF_HEAD <<-EOF
    ++		$ZERO_OID $COMMIT_OID $SIGNATURE	first
    ++		EOF
    ++	)
    ++'
    ++
     +test_done
 5:  67028ef4439 =  5:  66de5312e83 ident: fix type of string length parameter
 6:  a6bf88a4e89 =  6:  2b9fe08cf76 refs: fix identity for migrated reflogs
 -:  ----------- >  7:  7f87327e17c refs/files: detect race when generating reflog entry for HEAD
 7:  71b0f753dd3 =  8:  792a1d7ce61 refs: stop unsetting REF_HAVE_OLD for log-only updates
 8:  2d88a1e57b8 =  9:  121630b9d64 refs: fix invalid old object IDs when migrating reflogs

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fec..707a9b39edb 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 61 ++++++++++++++++++++++---------------------
 builtin/reflog.c              | 38 +++++++++++++--------------
 2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 707a9b39edb..c3801b82fb6 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
@@ -43,11 +43,9 @@ actions, and in addition the `HEAD` reflog records branch switching.
 
 The "list" subcommand lists all refs which have a corresponding reflog.
 
-The "expire" subcommand prunes older reflog entries. Entries older
-than `expire` time, or entries older than `expire-unreachable` time
-and not reachable from the current tip, are removed from the reflog.
-This is typically not used directly by end users -- instead, see
-linkgit:git-gc[1].
+The "exists" subcommand checks whether a ref has a reflog.  It exits
+with zero status if the reflog exists, and non-zero status if it does
+not.
 
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
@@ -58,9 +56,11 @@ The "drop" subcommand completely removes the reflog for the specified
 references. This is in contrast to "expire" and "delete", both of which
 can be used to delete reflog entries, but not the reflog itself.
 
-The "exists" subcommand checks whether a ref has a reflog.  It exits
-with zero status if the reflog exists, and non-zero status if it does
-not.
+The "expire" subcommand prunes older reflog entries. Entries older
+than `expire` time, or entries older than `expire-unreachable` time
+and not reachable from the current tip, are removed from the reflog.
+This is typically not used directly by end users -- instead, see
+linkgit:git-gc[1].
 
 OPTIONS
 -------
@@ -71,6 +71,25 @@ Options for `show`
 `git reflog show` accepts any of the options accepted by `git log`.
 
 
+Options for `delete`
+~~~~~~~~~~~~~~~~~~~~
+
+`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
+`--dry-run`, and `--verbose`, with the same meanings as when they are
+used with `expire`.
+
+Options for `drop`
+~~~~~~~~~~~~~~~~~~
+
+--all::
+	Drop the reflogs of all references from all worktrees.
+
+--single-worktree::
+	By default when `--all` is specified, reflogs from all working
+	trees are dropped. This option limits the processing to reflogs
+	from the current working tree only.
+
+
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -130,24 +149,6 @@ which didn't protect objects referred to by reflogs.
 	Print extra information on screen.
 
 
-Options for `delete`
-~~~~~~~~~~~~~~~~~~~~
-
-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
-`--dry-run`, and `--verbose`, with the same meanings as when they are
-used with `expire`.
-
-Options for `drop`
-~~~~~~~~~~~~~~~~~~
-
---all::
-	Drop the reflogs of all references from all worktrees.
-
---single-worktree::
-	By default when `--all` is specified, reflogs from all working
-	trees are dropped. This option limits the processing to reflogs
-	from the current working tree only.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c2..b00b3f9edc9 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()`
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-08-01 11:38     ` Toon Claes
  2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2b..8aa9f7236a3 100644
--- a/refs.c
+++ b/refs.c
@@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3019,8 +3013,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index 46a6008e07f..253dd8f4d5d 100644
--- a/refs.h
+++ b/refs.h
@@ -795,6 +795,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to `ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When set to zero, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29 16:07     ` Junio C Hamano
  2025-08-01 11:37     ` Toon Claes
  2025-07-29  8:55   ` [PATCH v3 5/9] ident: fix type of string length parameter Patrick Steinhardt
                     ` (4 subsequent siblings)
  8 siblings, 2 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |   7 +++
 builtin/reflog.c              |  65 ++++++++++++++++++++++
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 126 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 199 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index c3801b82fb6..34232a539a7 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
@@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a reflog.  It exits
 with zero status if the reflog exists, and non-zero status if it does
 not.
 
+The "write" subcommand writes a single entry to the reflog of a given
+reference. This new entry is appended to the reflog and will thus become
+the most recent entry. The reference name must be fully qualified. Both the old
+and new object IDs must not be abbreviated and must point to existing objects.
+The reflog message gets normalized.
+
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
 reflog delete master@{2}`"). This subcommand is also typically not used
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc9..a1b4e022041 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (!is_root_ref(ref) && check_refname_format(ref, 0))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index d052fc3e23d..adcdf09e740 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -220,6 +220,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 00000000000..6cad64f40ab
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,126 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'unqualified refname is rejected' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write unqualified $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
+		test_grep "old object .* does not exist" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'abbreviated object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
+		test_grep "invalid old object ID" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID 12345 new-object-id 2>err &&
+		test_grep "invalid new object ID" err
+	)
+'
+
+test_expect_success 'reflog message gets normalized' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
+		git reflog show -1 --format=%gs HEAD >actual &&
+		echo "message with newlines" >expected &&
+		test_cmp expected actual
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		# Note: the old object ID of the second reflog entry is broken.
+		# This will be fixed in subsequent commits.
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_expect_success 'can write to root ref' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write ROOT_REF_HEAD $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . ROOT_REF_HEAD <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+	)
+'
+
+test_done

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 5/9] ident: fix type of string length parameter
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 967895d8850..a7a2d132579 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 6/9] refs: fix identity for migrated reflogs
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 5/9] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Note that we also add two new `name` and `mail` string buffers to the
callback structures and splice them through to the callbacks. This is
done so that we can avoid allocating a new buffer every time we compute
the committer information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 8aa9f7236a3..a5f9ffaa45d 100644
--- a/refs.c
+++ b/refs.c
@@ -2954,7 +2954,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2993,7 +2993,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -3003,13 +3003,21 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
-	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3026,6 +3034,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3124,6 +3134,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3299,6 +3311,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29 16:16     ` Junio C Hamano
                       ` (2 more replies)
  2025-07-29  8:55   ` [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 3 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When updating a reference that is being pointed to HEAD we don't only
write a reflog message for that particular reference, but also generate
one for HEAD. This logic is handled by `split_head_update()`, where we:

  1. Verify that the condition actually triggered. This is done by
     reading HEAD at the start of the transaction so that we can then
     check whether a given reference update refers to its target.

  2. Queue a new log-only update for HEAD in case it did.

But the logic is unfortunately not free of races, as we do not lock the
HEAD reference after we have read its target. This can lead to the
following two scenarios:

  - HEAD gets concurrently updated to point to one of the references we
    have already processed. This causes us not writing a reflog message
    even though we should have done so.

  - HEAD gets concurrently updated to point to not point to a reference
    anymore that we have already processed. This causes us to write a
    reflog message even though we should _not_ have done so.

Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
is specific to the "files" backend. If set, we will double check that
the HEAD reference still points to the reference that we are creating
the reflog entry for after we have locked HEAD. Furthermore, instead of
manually resolving the old object ID of that entry, we now use the same
old state as for the parent update.

Unfortunately, this change only helps with the second race. We cannot
reliably plug the first race without locking the HEAD reference at the
start of the transaction. Locking HEAD unconditionally would effectively
serialize all writes though, and that doesn't seem like an option. Also,
double checking its value at the end of the transaction is not an option
either, as its target may have flip-flopped during the transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bf6f89b1d19..ba018b0984a 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -68,6 +68,12 @@
  */
 #define REF_DELETED_RMDIR (1 << 9)
 
+/*
+ * Used to indicate that the reflog-only update has been created via
+ * `split_head_update()`.
+ */
+#define REF_LOG_VIA_SPLIT (1 << 14)
+
 struct ref_lock {
 	char *ref_name;
 	struct lock_file lk;
@@ -2420,9 +2426,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 
 	new_update = ref_transaction_add_update(
 			transaction, "HEAD",
-			update->flags | REF_LOG_ONLY | REF_NO_DEREF,
+			update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT,
 			&update->new_oid, &update->old_oid,
 			NULL, NULL, update->committer_info, update->msg);
+	new_update->parent_update = update;
 
 	/*
 	 * Add "HEAD". This insertion is O(N) in the transaction
@@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 	update->backend_data = lock;
 
-	if (update->type & REF_ISSYMREF) {
+	if (update->flags & REF_LOG_VIA_SPLIT) {
+		struct ref_lock *parent_lock;
+
+		if (!update->parent_update)
+			BUG("split update without a parent");
+
+		parent_lock = update->parent_update->backend_data;
+
+		/*
+		 * Check that "HEAD" didn't racily change since we have looked
+		 * it up. If it did we must refuse to write the reflog entry.
+		 *
+		 * Note that this does not catch all races: if "HEAD" was
+		 * racily changed to point to one of the refs part of the
+		 * transaction then we would miss writing the split reflog
+		 * entry for "HEAD".
+		 */
+		if (!(update->type & REF_ISSYMREF) ||
+		    strcmp(update->parent_update->refname, referent.buf)) {
+			strbuf_addstr(err, "HEAD has been racily updated");
+			ret = REF_TRANSACTION_ERROR_GENERIC;
+			goto out;
+		}
+
+		if (update->flags & REF_HAVE_OLD) {
+			oidcpy(&lock->old_oid, &update->old_oid);
+		} else {
+			oidcpy(&lock->old_oid, &parent_lock->old_oid);
+		}
+	} else if (update->type & REF_ISSYMREF) {
 		if (update->flags & REF_NO_DEREF) {
 			/*
 			 * We won't be reading the referent as part of

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  2025-07-29  8:55   ` [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.

Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  8 +++-----
 refs/files-backend.c    |  9 +++++----
 refs/refs-internal.h    |  3 ++-
 refs/reftable-backend.c | 12 +++---------
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index a5f9ffaa45d..f88928de746 100644
--- a/refs.c
+++ b/refs.c
@@ -1393,11 +1393,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3318,6 +3313,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
+
 	return (update->flags & REF_HAVE_OLD) &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ba018b0984a..85ab2ef2b94 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2500,7 +2500,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2515,8 +2514,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3095,7 +3095,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f8688708519..95a4dc3902f 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -802,7 +802,8 @@ enum ref_transaction_error ref_update_check_old_target(const char *referent,
 
 /*
  * Check if the ref must exist, this means that the old_oid or
- * old_target is non NULL.
+ * old_target is non NULL. Log-only updates never require the old state to
+ * match.
  */
 int ref_update_expects_existing_old_ref(struct ref_update *update);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec1..44af58ac50b 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-07-29  8:55   ` [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-07-29  8:55   ` Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-29  8:55 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1421-reflog-write.sh |  4 +---
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index f88928de746..946eb48941b 100644
--- a/refs.c
+++ b/refs.c
@@ -1385,7 +1385,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 253dd8f4d5d..090b4fdff4f 100644
--- a/refs.h
+++ b/refs.h
@@ -760,13 +760,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 85ab2ef2b94..905555365b8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3010,6 +3010,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -3027,7 +3041,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50b..99fafd75ebe 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
index 6cad64f40ab..d4e41838f8f 100755
--- a/t/t1421-reflog-write.sh
+++ b/t/t1421-reflog-write.sh
@@ -99,11 +99,9 @@ test_expect_success 'simple writes' '
 		EOF
 
 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
-		# Note: the old object ID of the second reflog entry is broken.
-		# This will be fixed in subsequent commits.
 		test_reflog_matches . refs/heads/something <<-EOF
 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
 		EOF
 	)
 '
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7df..0e1116a319d 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done <reflogs
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.50.1.619.g074bbf1d35.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-29  6:51         ` Patrick Steinhardt
@ 2025-07-29 15:00           ` Junio C Hamano
  2025-07-30  5:33             ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-07-29 15:00 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: Ben Knoble, Kristoffer Haugsbakk, git, Karthik Nayak,
	Justin Tobler, SZEDER Gábor, Toon Claes

Patrick Steinhardt <ps@pks.im> writes:

> There's a big difference though: `git reflog drop` won't ever do
> anything for a reflog that doesn't exist. Consequently, we know that our
> DWIM mechanism can kick in and resolve the reference properly if such a
> reflog exists.
>
> But for `git reflog write` that's not the case, as you can write a
> reflog message for a yet-nonexistent reflog. The DWIM mechanism cannot
> kick in here as there is no reflog. So what do we do in that case? We
> could of course just pick the first DWIM rule, which would be that we
> decide to write the reflog for "refs/heads/$REFNAME". But... I dunno,
> that feels too magicky to m

I concur.  Like update-ref, a command that would work on a name that
does not yet exist, especially when it is a plumbing-ish low-level
command, would be too confusing if it dwimmed based on what names
exist already.

I wonder if it is feasible to correct the UI mistake of "git reflog"
using the dwim-ref logic, and compensate it by teaching "git branch"
and "git tag" options to drop reflog for the thing they act on.  At
that level, there is nothing to dwim---"branch" is about branches
that are either refs/heads/* or (when run with -r) refs/remotes/*.
It's sort of like "for-each-ref" requiring the refname from the top,
while "branch -l" and "branch -l -r" always limit themselves to the
relevant hierarchy.

But it is probably not a good idea and is way too late.

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries
  2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-07-29 16:07     ` Junio C Hamano
  2025-08-01 11:37     ` Toon Claes
  1 sibling, 0 replies; 114+ messages in thread
From: Junio C Hamano @ 2025-07-29 16:07 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, SZEDER Gábor, Toon Claes,
	Jeff King, Kristoffer Haugsbakk, Ben Knoble

Patrick Steinhardt <ps@pks.im> writes:

> +	ref = argv[0];
> +	if (!is_root_ref(ref) && check_refname_format(ref, 0))
> +		die(_("invalid reference name: %s"), ref);

The "root ref" check is new in this iteration, and it makes perfect
sense.

We are not passing REFNAME_ALLOW_ONELEVEL flag, so we explicitly
allow things like HEAD (but exclude things like FETCH_HEAD).


^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-07-29 16:16     ` Junio C Hamano
  2025-08-01 11:55     ` Toon Claes
  2025-08-02 11:11     ` Jeff King
  2 siblings, 0 replies; 114+ messages in thread
From: Junio C Hamano @ 2025-07-29 16:16 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, SZEDER Gábor, Toon Claes,
	Jeff King, Kristoffer Haugsbakk, Ben Knoble

Patrick Steinhardt <ps@pks.im> writes:

> When updating a reference that is being pointed to HEAD we don't only
> write a reflog message for that particular reference, but also generate
> one for HEAD. This logic is handled by `split_head_update()`, where we:
>
>   1. Verify that the condition actually triggered. This is done by
>      reading HEAD at the start of the transaction so that we can then
>      check whether a given reference update refers to its target.
>
>   2. Queue a new log-only update for HEAD in case it did.
>
> But the logic is unfortunately not free of races, as we do not lock the
> HEAD reference after we have read its target. This can lead to the
> following two scenarios:
>
>   - HEAD gets concurrently updated to point to one of the references we
>     have already processed. This causes us not writing a reflog message
>     even though we should have done so.
>
>   - HEAD gets concurrently updated to point to not point to a reference
>     anymore that we have already processed. This causes us to write a
>     reflog message even though we should _not_ have done so.
>
> Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
> is specific to the "files" backend. If set, we will double check that
> the HEAD reference still points to the reference that we are creating
> the reflog entry for after we have locked HEAD. Furthermore, instead of
> manually resolving the old object ID of that entry, we now use the same
> old state as for the parent update.
>
> Unfortunately, this change only helps with the second race. We cannot
> reliably plug the first race without locking the HEAD reference at the
> start of the transaction. Locking HEAD unconditionally would effectively
> serialize all writes though, and that doesn't seem like an option. Also,
> double checking its value at the end of the transaction is not an option
> either, as its target may have flip-flopped during the transaction.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs/files-backend.c | 40 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 38 insertions(+), 2 deletions(-)

This is a step new in this iteration.  I sometimes wonder if the
world were in a much better shape if I didn't record the updates to
underlying branch in the reflog of HEAD (and limited only to record
switching branches), as this is a fallout from the (mis)design.

Anyway, I agree that the change in this patch matches the above
description and takes us in a better place ;-)

Thanks.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index bf6f89b1d19..ba018b0984a 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -68,6 +68,12 @@
>   */
>  #define REF_DELETED_RMDIR (1 << 9)
>  
> +/*
> + * Used to indicate that the reflog-only update has been created via
> + * `split_head_update()`.
> + */
> +#define REF_LOG_VIA_SPLIT (1 << 14)
> +
>  struct ref_lock {
>  	char *ref_name;
>  	struct lock_file lk;
> @@ -2420,9 +2426,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
>  
>  	new_update = ref_transaction_add_update(
>  			transaction, "HEAD",
> -			update->flags | REF_LOG_ONLY | REF_NO_DEREF,
> +			update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT,
>  			&update->new_oid, &update->old_oid,
>  			NULL, NULL, update->committer_info, update->msg);
> +	new_update->parent_update = update;
>  
>  	/*
>  	 * Add "HEAD". This insertion is O(N) in the transaction
> @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
>  
>  	update->backend_data = lock;
>  
> -	if (update->type & REF_ISSYMREF) {
> +	if (update->flags & REF_LOG_VIA_SPLIT) {
> +		struct ref_lock *parent_lock;
> +
> +		if (!update->parent_update)
> +			BUG("split update without a parent");
> +
> +		parent_lock = update->parent_update->backend_data;
> +
> +		/*
> +		 * Check that "HEAD" didn't racily change since we have looked
> +		 * it up. If it did we must refuse to write the reflog entry.
> +		 *
> +		 * Note that this does not catch all races: if "HEAD" was
> +		 * racily changed to point to one of the refs part of the
> +		 * transaction then we would miss writing the split reflog
> +		 * entry for "HEAD".
> +		 */
> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			strbuf_addstr(err, "HEAD has been racily updated");
> +			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			goto out;
> +		}
> +
> +		if (update->flags & REF_HAVE_OLD) {
> +			oidcpy(&lock->old_oid, &update->old_oid);
> +		} else {
> +			oidcpy(&lock->old_oid, &parent_lock->old_oid);
> +		}
> +	} else if (update->type & REF_ISSYMREF) {
>  		if (update->flags & REF_NO_DEREF) {
>  			/*
>  			 * We won't be reading the referent as part of

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-29 15:00           ` Junio C Hamano
@ 2025-07-30  5:33             ` Patrick Steinhardt
  2025-07-30 10:33               ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-30  5:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Knoble, Kristoffer Haugsbakk, git, Karthik Nayak,
	Justin Tobler, SZEDER Gábor, Toon Claes

On Tue, Jul 29, 2025 at 08:00:01AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > There's a big difference though: `git reflog drop` won't ever do
> > anything for a reflog that doesn't exist. Consequently, we know that our
> > DWIM mechanism can kick in and resolve the reference properly if such a
> > reflog exists.
> >
> > But for `git reflog write` that's not the case, as you can write a
> > reflog message for a yet-nonexistent reflog. The DWIM mechanism cannot
> > kick in here as there is no reflog. So what do we do in that case? We
> > could of course just pick the first DWIM rule, which would be that we
> > decide to write the reflog for "refs/heads/$REFNAME". But... I dunno,
> > that feels too magicky to m

Gah, another cut-off mail. This is driving me crazy. I have an idea what
the root cause could be that I've implemented yesterday an hour after
this mail. So fingers crossed this that this will stop now.

Anyway, remainder of this mail:

> Patrick Steinhardt <ps@pks.im> writes:
> > But... I dunno, that feels too magicky to me given that we are in
> > plumbing land.
> > 
> > So I think the next-best thing for now is to make input verification
> > stricter by dropping the `REFNAME_ALLOW_ONELEVEL` flag.

On Tue, Jul 29, 2025 at 08:00:01AM -0700, Junio C Hamano wrote:
> I concur.  Like update-ref, a command that would work on a name that
> does not yet exist, especially when it is a plumbing-ish low-level
> command, would be too confusing if it dwimmed based on what names
> exist already.
> 
> I wonder if it is feasible to correct the UI mistake of "git reflog"
> using the dwim-ref logic, and compensate it by teaching "git branch"
> and "git tag" options to drop reflog for the thing they act on.  At
> that level, there is nothing to dwim---"branch" is about branches
> that are either refs/heads/* or (when run with -r) refs/remotes/*.
> It's sort of like "for-each-ref" requiring the refname from the top,
> while "branch -l" and "branch -l -r" always limit themselves to the
> relevant hierarchy.
> 
> But it is probably not a good idea and is way too late.

Yeah, I agree that it's probably too late now to change it. Whether we
need to teach git-branch(1) or git-tag(1) to do so I'm not sure. It
feels quite unlikely to me that anyone really does this operation
frequently enough to care.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-28 20:59           ` Junio C Hamano
@ 2025-07-30  7:55             ` Karthik Nayak
  0 siblings, 0 replies; 114+ messages in thread
From: Karthik Nayak @ 2025-07-30  7:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Kristoffer Haugsbakk, Patrick Steinhardt, git, Justin Tobler,
	SZEDER Gábor, Toon Claes

[-- Attachment #1: Type: text/plain, Size: 1772 bytes --]

Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>> "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:
>>>
>>>> I tried `git reflog drop`[1] and it can deal with a branch like
>>>> `branch`.  It doesn’t need to be told `refs/heads/branch`.
>>>
>>> That sounds like a bug to me.
>>
>> So `git reflog drop` `git reflog delete` and `git reflog expire` use
>> `repo_dwim_log()` to resolve the provided reference.
>>
>> And `repo_dwim_log()` uses the following `ref_rev_parse_rules` to
>> resolve the reference.
>>
>>   static const char *ref_rev_parse_rules[] = {
>>   	"%.*s",
>>   	"refs/%.*s",
>>   	"refs/tags/%.*s",
>>   	"refs/heads/%.*s",
>>   	"refs/remotes/%.*s",
>>   	"refs/remotes/%.*s/HEAD",
>>   	NULL
>>   };
>>
>> Which means we do a best case resolution of a given reference, but the
>> function also checks for ambiguity and warns for it.
>
> True.  But as I considered "git reflog" to be a lot closer to the
> plumbing than to Porcelain, using the dwim thing smelled like a bug.
>

I agree that 'git reflog' is more of a plumbing command. I'm trying to
see what subcommands of reflog act this way, so we can take a decision
on how to move forward.

> It also is OK to update the commands that do not use dwim-log to
> also use it.  That way, the result would be consistent across
> subcommands of "git reflog".  As long as the users are aware of the
> fact that the command uses dwim-log, they can always spell their ref
> in full like "refs/heads/branch" to avoid ambiguity check getting in
> the way.
>
> Thanks.

Yeah consistency and documentation around it would be great here.

- Karthik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries
  2025-07-30  5:33             ` Patrick Steinhardt
@ 2025-07-30 10:33               ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-07-30 10:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ben Knoble, Kristoffer Haugsbakk, git, Karthik Nayak,
	Justin Tobler, SZEDER Gábor, Toon Claes

On Wed, Jul 30, 2025 at 07:33:30AM +0200, Patrick Steinhardt wrote:
> On Tue, Jul 29, 2025 at 08:00:01AM -0700, Junio C Hamano wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
> > 
> > > There's a big difference though: `git reflog drop` won't ever do
> > > anything for a reflog that doesn't exist. Consequently, we know that our
> > > DWIM mechanism can kick in and resolve the reference properly if such a
> > > reflog exists.
> > >
> > > But for `git reflog write` that's not the case, as you can write a
> > > reflog message for a yet-nonexistent reflog. The DWIM mechanism cannot
> > > kick in here as there is no reflog. So what do we do in that case? We
> > > could of course just pick the first DWIM rule, which would be that we
> > > decide to write the reflog for "refs/heads/$REFNAME". But... I dunno,
> > > that feels too magicky to m
> 
> Gah, another cut-off mail. This is driving me crazy. I have an idea what
> the root cause could be that I've implemented yesterday an hour after
> this mail. So fingers crossed this that this will stop now.

Yup, I was finally able to reproduce the issue. I'm using msmtp to send
mail and have a `passwordeval` script that I use to yield the password.
Recently I've switched to a different password manager though, so I had
to adapt the script a bit. Basically, what the script does is to check
whether I'm already signed in -- if not, it spawns rofi to ask me for my
password.

But: rofi actually reads from stdin, and the `passwordeval` command in
msmtp explicitly must _not_ munge stdin, as stdin is where the mail gets
read from. So this is what caused the mail to get truncated.

So why wasn't I able to reproduce the issue? Well, because it only
happens in case the password store in locked and I need to input my
password. But when reproducing it I already had the password store
unlocked.

The fix is thus quite easy:

diff --git a/home-manager/profiles/graphical/workstation/gitlab/default.nix b/home-manager/profiles/graphical/workstation/gitlab/default.nix
index d47e95a83..3460a29fb 100644
--- a/home-manager/profiles/graphical/workstation/gitlab/default.nix
+++ b/home-manager/profiles/graphical/workstation/gitlab/default.nix
@@ -42,6 +42,8 @@ in
         pkgs.writeShellScript "op-mail-password" ''
           set -eo pipefail
 
+          exec 0<&-
+
           export OP_SESSION=$(systemctl --user show-environment | grep '^OP_SESSION=' | cut -d= -f2)
           if test -z "$OP_SESSION" || ! op vault list &>/dev/null
           then

Finally, another mystery solved. This really has been stressing me out
over the last two weeks.

Patrick

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries
  2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
  2025-07-29 16:07     ` Junio C Hamano
@ 2025-08-01 11:37     ` Toon Claes
  2025-08-04  7:38       ` Patrick Steinhardt
  1 sibling, 1 reply; 114+ messages in thread
From: Toon Claes @ 2025-08-01 11:37 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Jeff King, Kristoffer Haugsbakk, Ben Knoble

Patrick Steinhardt <ps@pks.im> writes:

> +test_expect_success 'abbreviated object IDs' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&

Is the object id rejected because it's short, or because there simply
doesn't exist an object that starts with `12345`? You're not really
testing the former, which you claim in the test name.

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()`
  2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-08-01 11:38     ` Toon Claes
  2025-08-04  7:37       ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Toon Claes @ 2025-08-01 11:38 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Jeff King, Kristoffer Haugsbakk, Ben Knoble

Patrick Steinhardt <ps@pks.im> writes:

> In a subsequent commit we'll add another user that wants to write reflog
> entries. This requires them to call `ref_transaction_update_reflog()`,
> but that function is local to "refs.c".
>
> Export the function to prepare for the change. While at it, drop the
> `flags` field, as all callers are for now expected to use the same flags
> anyway.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 29 +++++++++++------------------
>  refs.h | 15 +++++++++++++++
>  2 files changed, 26 insertions(+), 18 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index dce5c49ca2b..8aa9f7236a3 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
>  	return 0;
>  }
>  
> -/*
> - * Similar to`ref_transaction_update`, but this function is only for adding

Tiniest nit: for some reason the space after "to" fell away.

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
  2025-07-29 16:16     ` Junio C Hamano
@ 2025-08-01 11:55     ` Toon Claes
  2025-08-02 11:11     ` Jeff King
  2 siblings, 0 replies; 114+ messages in thread
From: Toon Claes @ 2025-08-01 11:55 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Jeff King, Kristoffer Haugsbakk, Ben Knoble

Patrick Steinhardt <ps@pks.im> writes:

> When updating a reference that is being pointed to HEAD we don't only
> write a reflog message for that particular reference, but also generate
> one for HEAD. This logic is handled by `split_head_update()`, where we:
>
>   1. Verify that the condition actually triggered. This is done by
>      reading HEAD at the start of the transaction so that we can then
>      check whether a given reference update refers to its target.
>
>   2. Queue a new log-only update for HEAD in case it did.
>
> But the logic is unfortunately not free of races, as we do not lock the
> HEAD reference after we have read its target. This can lead to the
> following two scenarios:
>
>   - HEAD gets concurrently updated to point to one of the references we
>     have already processed. This causes us not writing a reflog message
>     even though we should have done so.
>
>   - HEAD gets concurrently updated to point to not point to a reference

That's a little much of pointing, right? ;)

Maybe change it to something like:

    - HEAD gets concurrently updated to no longer point to a reference

-- 
Cheers,
Toon

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
  2025-07-29 16:16     ` Junio C Hamano
  2025-08-01 11:55     ` Toon Claes
@ 2025-08-02 11:11     ` Jeff King
  2025-08-04  7:38       ` Patrick Steinhardt
  2 siblings, 1 reply; 114+ messages in thread
From: Jeff King @ 2025-08-02 11:11 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Tue, Jul 29, 2025 at 10:55:25AM +0200, Patrick Steinhardt wrote:

> Unfortunately, this change only helps with the second race. We cannot
> reliably plug the first race without locking the HEAD reference at the
> start of the transaction. Locking HEAD unconditionally would effectively
> serialize all writes though, and that doesn't seem like an option. Also,
> double checking its value at the end of the transaction is not an option
> either, as its target may have flip-flopped during the transaction.

I agree we should not always take a lock on HEAD, since most refs would
not need it. But I wonder if we could do better by examining HEAD, then
taking a lock when we think we'll need it, and then re-checking the
value of HEAD. That is still racy, though (somebody could have pointed
HEAD at us between the two checks). Fundamentally the files backend is
not atomic across the whole namespace, and we are trying to update two
refs. So I think there will always be some race.

It does make me wonder if this race-fix is even worth it, then. We are
catching the case where somebody moves HEAD away from the ref we are
updating while we are updating it. But without atomicity, do we even
know which happened first? That is, would it be incorrect to update
HEAD anyway? I guess the outcome is observable because their movement of
HEAD generated a reflog entry, and thus the entries would be out of
order. So maybe that is worth it.

Anyway, I had two questions about the code:

> @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
>  
>  	update->backend_data = lock;
>  
> -	if (update->type & REF_ISSYMREF) {
> +	if (update->flags & REF_LOG_VIA_SPLIT) {
> +		struct ref_lock *parent_lock;
> +
> +		if (!update->parent_update)
> +			BUG("split update without a parent");
> +
> +		parent_lock = update->parent_update->backend_data;
> +
> +		/*
> +		 * Check that "HEAD" didn't racily change since we have looked
> +		 * it up. If it did we must refuse to write the reflog entry.
> +		 *
> +		 * Note that this does not catch all races: if "HEAD" was
> +		 * racily changed to point to one of the refs part of the
> +		 * transaction then we would miss writing the split reflog
> +		 * entry for "HEAD".
> +		 */
> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			strbuf_addstr(err, "HEAD has been racily updated");
> +			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			goto out;
> +		}

One, what happens with a multi-level ref (e.g., HEAD points to
refs/heads/foo which points to refs/heads/bar)?

We've resolved HEAD to get referent.buf. Do we get "foo" or "bar" here?
If "bar", then a write through "foo" will complain. But if we get "foo",
then theoretically a write through "bar" will complain.

I _think_ we are OK, though. Constructing it like this:

  git init
  git commit --allow-empty -m whatever

  git symbolic-ref refs/heads/foo refs/heads/bar
  git symbolic-ref HEAD refs/heads/foo
  git update-ref refs/heads/foo main

triggers the check and shows that our referent from lock_raw_ref() is
the first level (i.e., "foo"). Which is good.

If we swap out "foo" for "bar" in the update-ref call, then we'd get a
mismatch. But in that case we do not figure out that HEAD needs be
written at all! That is, we only do a single level of look-back to
decide whether to write HEAD at all. So as long as we keep doing so, we
are OK.

> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			strbuf_addstr(err, "HEAD has been racily updated");
> +			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			goto out;
> +		}

And two, is an error the right thing here? The user asked us to update
"foo", and we saw that HEAD pointed to it. So we decided to update
HEAD's reflog, too. And when it came time to do so under lock, we found
that HEAD did not point to "foo" any more.

Shouldn't we quietly drop the HEAD reflog update, rather than forcing
the whole transaction to fail? The user never asked us to update HEAD at
all. It was something we opportunistically decided to do, and now we
find out that it is not appropriate to do so.

-Peff

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()`
  2025-08-01 11:38     ` Toon Claes
@ 2025-08-04  7:37       ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  7:37 UTC (permalink / raw)
  To: Toon Claes
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Jeff King, Kristoffer Haugsbakk, Ben Knoble

On Fri, Aug 01, 2025 at 01:38:16PM +0200, Toon Claes wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > In a subsequent commit we'll add another user that wants to write reflog
> > entries. This requires them to call `ref_transaction_update_reflog()`,
> > but that function is local to "refs.c".
> >
> > Export the function to prepare for the change. While at it, drop the
> > `flags` field, as all callers are for now expected to use the same flags
> > anyway.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  refs.c | 29 +++++++++++------------------
> >  refs.h | 15 +++++++++++++++
> >  2 files changed, 26 insertions(+), 18 deletions(-)
> >
> > diff --git a/refs.c b/refs.c
> > index dce5c49ca2b..8aa9f7236a3 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
> >  	return 0;
> >  }
> >  
> > -/*
> > - * Similar to`ref_transaction_update`, but this function is only for adding
> 
> Tiniest nit: for some reason the space after "to" fell away.

This is the preimage though :) I've fixed it in the postimage already.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries
  2025-08-01 11:37     ` Toon Claes
@ 2025-08-04  7:38       ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  7:38 UTC (permalink / raw)
  To: Toon Claes
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Jeff King, Kristoffer Haugsbakk, Ben Knoble

On Fri, Aug 01, 2025 at 01:37:40PM +0200, Toon Claes wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > +test_expect_success 'abbreviated object IDs' '
> > +	test_when_finished "rm -rf repo" &&
> > +	git init repo &&
> > +	(
> > +		cd repo &&
> > +		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
> 
> Is the object id rejected because it's short, or because there simply
> doesn't exist an object that starts with `12345`? You're not really
> testing the former, which you claim in the test name.

Good point, let me use an existing-but-abbreviated object instead.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-02 11:11     ` Jeff King
@ 2025-08-04  7:38       ` Patrick Steinhardt
  2025-08-04 14:47         ` Jeff King
  0 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  7:38 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Sat, Aug 02, 2025 at 07:11:28AM -0400, Jeff King wrote:
> On Tue, Jul 29, 2025 at 10:55:25AM +0200, Patrick Steinhardt wrote:
> 
> > Unfortunately, this change only helps with the second race. We cannot
> > reliably plug the first race without locking the HEAD reference at the
> > start of the transaction. Locking HEAD unconditionally would effectively
> > serialize all writes though, and that doesn't seem like an option. Also,
> > double checking its value at the end of the transaction is not an option
> > either, as its target may have flip-flopped during the transaction.
> 
> I agree we should not always take a lock on HEAD, since most refs would
> not need it. But I wonder if we could do better by examining HEAD, then
> taking a lock when we think we'll need it, and then re-checking the
> value of HEAD. That is still racy, though (somebody could have pointed
> HEAD at us between the two checks). Fundamentally the files backend is
> not atomic across the whole namespace, and we are trying to update two
> refs. So I think there will always be some race.

We do that though. When queueing the log-only update for HEAD we don't
lock immediately, but we lock once we process that log-only update. And
that's where we now do the check whether HEAD has changed meanwhile,
which should be race-free given that it did change indeed.

> It does make me wonder if this race-fix is even worth it, then. We are
> catching the case where somebody moves HEAD away from the ref we are
> updating while we are updating it. But without atomicity, do we even
> know which happened first? That is, would it be incorrect to update
> HEAD anyway? I guess the outcome is observable because their movement of
> HEAD generated a reflog entry, and thus the entries would be out of
> order. So maybe that is worth it.

So with the above I think that we are race-free for the case where HEAD
has been modified after we have decided to write a reflog entry for it.
We aren't though in the case where we _haven't_ decided to write a
reflog entry, as HEAD might have been adjusted to point to one of the
updated refs meanwhile as you rightfully point out.

But I think that's an acceptable tradeoff. I'd rather write no reflog
entry than an incorrect one.

> Anyway, I had two questions about the code:
> 
> > @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
> >  
> >  	update->backend_data = lock;
> >  
> > -	if (update->type & REF_ISSYMREF) {
> > +	if (update->flags & REF_LOG_VIA_SPLIT) {
> > +		struct ref_lock *parent_lock;
> > +
> > +		if (!update->parent_update)
> > +			BUG("split update without a parent");
> > +
> > +		parent_lock = update->parent_update->backend_data;
> > +
> > +		/*
> > +		 * Check that "HEAD" didn't racily change since we have looked
> > +		 * it up. If it did we must refuse to write the reflog entry.
> > +		 *
> > +		 * Note that this does not catch all races: if "HEAD" was
> > +		 * racily changed to point to one of the refs part of the
> > +		 * transaction then we would miss writing the split reflog
> > +		 * entry for "HEAD".
> > +		 */
> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> One, what happens with a multi-level ref (e.g., HEAD points to
> refs/heads/foo which points to refs/heads/bar)?
> 
> We've resolved HEAD to get referent.buf. Do we get "foo" or "bar" here?
> If "bar", then a write through "foo" will complain. But if we get "foo",
> then theoretically a write through "bar" will complain.
> 
> I _think_ we are OK, though. Constructing it like this:
> 
>   git init
>   git commit --allow-empty -m whatever
> 
>   git symbolic-ref refs/heads/foo refs/heads/bar
>   git symbolic-ref HEAD refs/heads/foo
>   git update-ref refs/heads/foo main
> 
> triggers the check and shows that our referent from lock_raw_ref() is
> the first level (i.e., "foo"). Which is good.
> 
> If we swap out "foo" for "bar" in the update-ref call, then we'd get a
> mismatch. But in that case we do not figure out that HEAD needs be
> written at all! That is, we only do a single level of look-back to
> decide whether to write HEAD at all. So as long as we keep doing so, we
> are OK.

Yeah, the whole code is best-effort only and doesn't work for all kinds
of edge cases. The mere fact that we only handle this case for HEAD is
already a limitation -- in theory we should do it for all symbolic refs.

> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> And two, is an error the right thing here? The user asked us to update
> "foo", and we saw that HEAD pointed to it. So we decided to update
> HEAD's reflog, too. And when it came time to do so under lock, we found
> that HEAD did not point to "foo" any more.
> 
> Shouldn't we quietly drop the HEAD reflog update, rather than forcing
> the whole transaction to fail? The user never asked us to update HEAD at
> all. It was something we opportunistically decided to do, and now we
> find out that it is not appropriate to do so.

That's something I wasn't quite sure about, either. Honestly, the reason
I shied away from it is that it needs a bit more munging for an edge
case that is hard to test reliably. But I guess we can do something like
the below patch to skip writing the reflog message instea.

Patrick

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 905555365b..851b1b33f4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -643,14 +643,16 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
 	return 0;
 }
 
-static void unlock_ref(struct ref_lock *lock)
+static int unlock_ref(struct ref_lock *lock)
 {
 	lock->count--;
 	if (!lock->count) {
 		rollback_lock_file(&lock->lk);
 		free(lock->ref_name);
 		free(lock);
+		return 1;
 	}
+	return 0;
 }
 
 /*
@@ -2557,6 +2559,9 @@ struct files_transaction_backend_data {
  *   the referent to transaction.
  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
  *   update of HEAD.
+ *
+ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
+ * error code otherwise.
  */
 static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
 						      struct ref_update *update,
@@ -2617,7 +2622,8 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 		/*
 		 * Check that "HEAD" didn't racily change since we have looked
-		 * it up. If it did we must refuse to write the reflog entry.
+		 * it up. If it did we remove the reflog-only updateg from the
+		 * transaction again.
 		 *
 		 * Note that this does not catch all races: if "HEAD" was
 		 * racily changed to point to one of the refs part of the
@@ -2626,8 +2632,16 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 		 */
 		if (!(update->type & REF_ISSYMREF) ||
 		    strcmp(update->parent_update->refname, referent.buf)) {
-			strbuf_addstr(err, "HEAD has been racily updated");
-			ret = REF_TRANSACTION_ERROR_GENERIC;
+			if (unlock_ref(lock))
+				strmap_remove(&backend_data->ref_locks,
+					      update->refname, 0);
+
+			memmove(transaction->updates + update_idx,
+				transaction->updates + update_idx + 1,
+				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
+			transaction->nr--;
+
+			ret = 1;
 			goto out;
 		}
 
@@ -2896,6 +2910,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
+			if (ret > 0)
+				continue;
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 0/9] refs: fix migration of reflog entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (9 preceding siblings ...)
  2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-08-04  9:46 ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                     ` (8 more replies)
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
  12 siblings, 9 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@pks.im

Changes in v3:
  - `git reflog write` now requires fully-qualified refnames.
  - A new commit that plugs one part of the race around splitting of
    reflogs for HEAD in the "files" backend.
  - Link to v2: https://lore.kernel.org/r/20250725-pks-reflog-append-v2-0-e4e7cbe3f578@pks.im

Changes in v4:
  - Improve one of the tests to use an existing abbreviated object ID
    instead of a non-existing one to make sure that we indeed fail due
    to the abbreviation.
  - Don't abort the transaction when HEAD has been racily updated, but
    drop the log-only update instead.
  - Link to v3: https://lore.kernel.org/r/20250729-pks-reflog-append-v3-0-9614d310f073@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (9):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs/files: detect race when generating reflog entry for HEAD
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  76 +++++++++++++------------
 builtin/reflog.c              | 103 +++++++++++++++++++++++++++-------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++---------
 refs.h                        |  24 +++++++-
 refs/files-backend.c          |  83 +++++++++++++++++++++++++---
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 ++++++---
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 126 ++++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 +++++---
 12 files changed, 420 insertions(+), 108 deletions(-)

Range-diff versus v3:

 1:  5ae2971a55 =  1:  d030117041 Documentation/git-reflog: convert to use synopsis type
 2:  256589c289 =  2:  ad1dd8a226 builtin/reflog: improve grouping of subcommands
 3:  3b9e0a1206 =  3:  d6d6d99421 refs: export `ref_transaction_update_reflog()`
 4:  4fcf540ed6 !  4:  4e5433717e builtin/reflog: implement subcommand to write new entries
    @@ t/t1421-reflog-write.sh (new)
     +	git init repo &&
     +	(
     +		cd repo &&
    -+		test_must_fail git reflog write refs/heads/something 12345 $ZERO_OID old-object-id 2>err &&
    ++		test_commit initial &&
    ++		abbreviated_oid=$(git rev-parse HEAD | test_copy_bytes 8) &&
    ++		test_must_fail git reflog write refs/heads/something $abbreviated_oid $ZERO_OID old-object-id 2>err &&
     +		test_grep "invalid old object ID" err &&
    -+		test_must_fail git reflog write refs/heads/something $ZERO_OID 12345 new-object-id 2>err &&
    ++		test_must_fail git reflog write refs/heads/something $ZERO_OID $abbreviated_oid new-object-id 2>err &&
     +		test_grep "invalid new object ID" err
     +	)
     +'
 5:  18b2f61366 =  5:  92e45f582c ident: fix type of string length parameter
 6:  d140c53224 =  6:  e50c5aaae5 refs: fix identity for migrated reflogs
 7:  91c6a7cbcb !  7:  9380dbfdab refs/files: detect race when generating reflog entry for HEAD
    @@ Commit message
             have already processed. This causes us not writing a reflog message
             even though we should have done so.
     
    -      - HEAD gets concurrently updated to point to not point to a reference
    +      - HEAD gets concurrently updated to no longer point to a reference
             anymore that we have already processed. This causes us to write a
             reflog message even though we should _not_ have done so.
     
    @@ refs/files-backend.c
      struct ref_lock {
      	char *ref_name;
      	struct lock_file lk;
    +@@ refs/files-backend.c: int parse_loose_ref_contents(const struct git_hash_algo *algop,
    + 	return 0;
    + }
    + 
    +-static void unlock_ref(struct ref_lock *lock)
    ++static int unlock_ref(struct ref_lock *lock)
    + {
    + 	lock->count--;
    + 	if (!lock->count) {
    + 		rollback_lock_file(&lock->lk);
    + 		free(lock->ref_name);
    + 		free(lock);
    ++		return 1;
    + 	}
    ++	return 0;
    + }
    + 
    + /*
     @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct ref_update *update,
      
      	new_update = ref_transaction_add_update(
    @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct
      
      	/*
      	 * Add "HEAD". This insertion is O(N) in the transaction
    +@@ refs/files-backend.c: struct files_transaction_backend_data {
    +  *   the referent to transaction.
    +  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
    +  *   update of HEAD.
    ++ *
    ++ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
    ++ * error code otherwise.
    +  */
    + static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
    + 						      struct ref_update *update,
     @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
      
      	update->backend_data = lock;
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +
     +		/*
     +		 * Check that "HEAD" didn't racily change since we have looked
    -+		 * it up. If it did we must refuse to write the reflog entry.
    ++		 * it up. If it did we remove the reflog-only updateg from the
    ++		 * transaction again.
     +		 *
     +		 * Note that this does not catch all races: if "HEAD" was
     +		 * racily changed to point to one of the refs part of the
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +		 */
     +		if (!(update->type & REF_ISSYMREF) ||
     +		    strcmp(update->parent_update->refname, referent.buf)) {
    -+			strbuf_addstr(err, "HEAD has been racily updated");
    -+			ret = REF_TRANSACTION_ERROR_GENERIC;
    ++			if (unlock_ref(lock))
    ++				strmap_remove(&backend_data->ref_locks,
    ++					      update->refname, 0);
    ++
    ++			memmove(transaction->updates + update_idx,
    ++				transaction->updates + update_idx + 1,
    ++				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
    ++			transaction->nr--;
    ++
    ++			ret = 1;
     +			goto out;
     +		}
     +
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
      		if (update->flags & REF_NO_DEREF) {
      			/*
      			 * We won't be reading the referent as part of
    +@@ refs/files-backend.c: static int files_transaction_prepare(struct ref_store *ref_store,
    + 					  head_ref, &refnames_to_check,
    + 					  err);
    + 		if (ret) {
    ++			if (ret > 0)
    ++				continue;
    + 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    + 				strbuf_reset(err);
    + 				ret = 0;
 8:  8468947824 =  8:  3c6182c96d refs: stop unsetting REF_HAVE_OLD for log-only updates
 9:  78ca2d46f9 =  9:  eafd8f6d7d refs: fix invalid old object IDs when migrating reflogs

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fe..707a9b39ed 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 61 ++++++++++++++++++++++---------------------
 builtin/reflog.c              | 38 +++++++++++++--------------
 2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 707a9b39ed..c3801b82fb 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
@@ -43,11 +43,9 @@ actions, and in addition the `HEAD` reflog records branch switching.
 
 The "list" subcommand lists all refs which have a corresponding reflog.
 
-The "expire" subcommand prunes older reflog entries. Entries older
-than `expire` time, or entries older than `expire-unreachable` time
-and not reachable from the current tip, are removed from the reflog.
-This is typically not used directly by end users -- instead, see
-linkgit:git-gc[1].
+The "exists" subcommand checks whether a ref has a reflog.  It exits
+with zero status if the reflog exists, and non-zero status if it does
+not.
 
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
@@ -58,9 +56,11 @@ The "drop" subcommand completely removes the reflog for the specified
 references. This is in contrast to "expire" and "delete", both of which
 can be used to delete reflog entries, but not the reflog itself.
 
-The "exists" subcommand checks whether a ref has a reflog.  It exits
-with zero status if the reflog exists, and non-zero status if it does
-not.
+The "expire" subcommand prunes older reflog entries. Entries older
+than `expire` time, or entries older than `expire-unreachable` time
+and not reachable from the current tip, are removed from the reflog.
+This is typically not used directly by end users -- instead, see
+linkgit:git-gc[1].
 
 OPTIONS
 -------
@@ -71,6 +71,25 @@ Options for `show`
 `git reflog show` accepts any of the options accepted by `git log`.
 
 
+Options for `delete`
+~~~~~~~~~~~~~~~~~~~~
+
+`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
+`--dry-run`, and `--verbose`, with the same meanings as when they are
+used with `expire`.
+
+Options for `drop`
+~~~~~~~~~~~~~~~~~~
+
+--all::
+	Drop the reflogs of all references from all worktrees.
+
+--single-worktree::
+	By default when `--all` is specified, reflogs from all working
+	trees are dropped. This option limits the processing to reflogs
+	from the current working tree only.
+
+
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -130,24 +149,6 @@ which didn't protect objects referred to by reflogs.
 	Print extra information on screen.
 
 
-Options for `delete`
-~~~~~~~~~~~~~~~~~~~~
-
-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
-`--dry-run`, and `--verbose`, with the same meanings as when they are
-used with `expire`.
-
-Options for `drop`
-~~~~~~~~~~~~~~~~~~
-
---all::
-	Drop the reflogs of all references from all worktrees.
-
---single-worktree::
-	By default when `--all` is specified, reflogs from all working
-	trees are dropped. This option limits the processing to reflogs
-	from the current working tree only.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c..b00b3f9edc 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()`
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2..8aa9f7236a 100644
--- a/refs.c
+++ b/refs.c
@@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3019,8 +3013,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index 46a6008e07..253dd8f4d5 100644
--- a/refs.h
+++ b/refs.h
@@ -795,6 +795,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to `ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When set to zero, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 5/9] ident: fix type of string length parameter Patrick Steinhardt
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |   7 +++
 builtin/reflog.c              |  65 +++++++++++++++++++++
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 128 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index c3801b82fb..34232a539a 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
@@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a reflog.  It exits
 with zero status if the reflog exists, and non-zero status if it does
 not.
 
+The "write" subcommand writes a single entry to the reflog of a given
+reference. This new entry is appended to the reflog and will thus become
+the most recent entry. The reference name must be fully qualified. Both the old
+and new object IDs must not be abbreviated and must point to existing objects.
+The reflog message gets normalized.
+
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
 reflog delete master@{2}`"). This subcommand is also typically not used
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc..a1b4e02204 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (!is_root_ref(ref) && check_refname_format(ref, 0))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index d052fc3e23..adcdf09e74 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -220,6 +220,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 0000000000..dd7ffa5241
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,128 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'unqualified refname is rejected' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write unqualified $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
+		test_grep "old object .* does not exist" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'abbreviated object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		abbreviated_oid=$(git rev-parse HEAD | test_copy_bytes 8) &&
+		test_must_fail git reflog write refs/heads/something $abbreviated_oid $ZERO_OID old-object-id 2>err &&
+		test_grep "invalid old object ID" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $abbreviated_oid new-object-id 2>err &&
+		test_grep "invalid new object ID" err
+	)
+'
+
+test_expect_success 'reflog message gets normalized' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
+		git reflog show -1 --format=%gs HEAD >actual &&
+		echo "message with newlines" >expected &&
+		test_cmp expected actual
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		# Note: the old object ID of the second reflog entry is broken.
+		# This will be fixed in subsequent commits.
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_expect_success 'can write to root ref' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write ROOT_REF_HEAD $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . ROOT_REF_HEAD <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+	)
+'
+
+test_done

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 5/9] ident: fix type of string length parameter
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 967895d8850..a7a2d132579 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 6/9] refs: fix identity for migrated reflogs
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 5/9] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Note that we also add two new `name` and `mail` string buffers to the
callback structures and splice them through to the callbacks. This is
done so that we can avoid allocating a new buffer every time we compute
the committer information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 8aa9f7236a..a5f9ffaa45 100644
--- a/refs.c
+++ b/refs.c
@@ -2954,7 +2954,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2993,7 +2993,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -3003,13 +3003,21 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
-	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3026,6 +3034,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3124,6 +3134,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3299,6 +3311,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04 15:38     ` Jeff King
  2025-08-04  9:46   ` [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When updating a reference that is being pointed to HEAD we don't only
write a reflog message for that particular reference, but also generate
one for HEAD. This logic is handled by `split_head_update()`, where we:

  1. Verify that the condition actually triggered. This is done by
     reading HEAD at the start of the transaction so that we can then
     check whether a given reference update refers to its target.

  2. Queue a new log-only update for HEAD in case it did.

But the logic is unfortunately not free of races, as we do not lock the
HEAD reference after we have read its target. This can lead to the
following two scenarios:

  - HEAD gets concurrently updated to point to one of the references we
    have already processed. This causes us not writing a reflog message
    even though we should have done so.

  - HEAD gets concurrently updated to no longer point to a reference
    anymore that we have already processed. This causes us to write a
    reflog message even though we should _not_ have done so.

Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
is specific to the "files" backend. If set, we will double check that
the HEAD reference still points to the reference that we are creating
the reflog entry for after we have locked HEAD. Furthermore, instead of
manually resolving the old object ID of that entry, we now use the same
old state as for the parent update.

Unfortunately, this change only helps with the second race. We cannot
reliably plug the first race without locking the HEAD reference at the
start of the transaction. Locking HEAD unconditionally would effectively
serialize all writes though, and that doesn't seem like an option. Also,
double checking its value at the end of the transaction is not an option
either, as its target may have flip-flopped during the transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bf6f89b1d1..d0baa4e01c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -68,6 +68,12 @@
  */
 #define REF_DELETED_RMDIR (1 << 9)
 
+/*
+ * Used to indicate that the reflog-only update has been created via
+ * `split_head_update()`.
+ */
+#define REF_LOG_VIA_SPLIT (1 << 14)
+
 struct ref_lock {
 	char *ref_name;
 	struct lock_file lk;
@@ -637,14 +643,16 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
 	return 0;
 }
 
-static void unlock_ref(struct ref_lock *lock)
+static int unlock_ref(struct ref_lock *lock)
 {
 	lock->count--;
 	if (!lock->count) {
 		rollback_lock_file(&lock->lk);
 		free(lock->ref_name);
 		free(lock);
+		return 1;
 	}
+	return 0;
 }
 
 /*
@@ -2420,9 +2428,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 
 	new_update = ref_transaction_add_update(
 			transaction, "HEAD",
-			update->flags | REF_LOG_ONLY | REF_NO_DEREF,
+			update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT,
 			&update->new_oid, &update->old_oid,
 			NULL, NULL, update->committer_info, update->msg);
+	new_update->parent_update = update;
 
 	/*
 	 * Add "HEAD". This insertion is O(N) in the transaction
@@ -2550,6 +2559,9 @@ struct files_transaction_backend_data {
  *   the referent to transaction.
  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
  *   update of HEAD.
+ *
+ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
+ * error code otherwise.
  */
 static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
 						      struct ref_update *update,
@@ -2600,7 +2612,45 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 	update->backend_data = lock;
 
-	if (update->type & REF_ISSYMREF) {
+	if (update->flags & REF_LOG_VIA_SPLIT) {
+		struct ref_lock *parent_lock;
+
+		if (!update->parent_update)
+			BUG("split update without a parent");
+
+		parent_lock = update->parent_update->backend_data;
+
+		/*
+		 * Check that "HEAD" didn't racily change since we have looked
+		 * it up. If it did we remove the reflog-only updateg from the
+		 * transaction again.
+		 *
+		 * Note that this does not catch all races: if "HEAD" was
+		 * racily changed to point to one of the refs part of the
+		 * transaction then we would miss writing the split reflog
+		 * entry for "HEAD".
+		 */
+		if (!(update->type & REF_ISSYMREF) ||
+		    strcmp(update->parent_update->refname, referent.buf)) {
+			if (unlock_ref(lock))
+				strmap_remove(&backend_data->ref_locks,
+					      update->refname, 0);
+
+			memmove(transaction->updates + update_idx,
+				transaction->updates + update_idx + 1,
+				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
+			transaction->nr--;
+
+			ret = 1;
+			goto out;
+		}
+
+		if (update->flags & REF_HAVE_OLD) {
+			oidcpy(&lock->old_oid, &update->old_oid);
+		} else {
+			oidcpy(&lock->old_oid, &parent_lock->old_oid);
+		}
+	} else if (update->type & REF_ISSYMREF) {
 		if (update->flags & REF_NO_DEREF) {
 			/*
 			 * We won't be reading the referent as part of
@@ -2860,6 +2910,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
+			if (ret > 0)
+				continue;
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  2025-08-04  9:46   ` [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.

Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  8 +++-----
 refs/files-backend.c    |  9 +++++----
 refs/refs-internal.h    |  3 ++-
 refs/reftable-backend.c | 12 +++---------
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index a5f9ffaa45..f88928de74 100644
--- a/refs.c
+++ b/refs.c
@@ -1393,11 +1393,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3318,6 +3313,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
+
 	return (update->flags & REF_HAVE_OLD) &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index d0baa4e01c..c37fbfd138 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2502,7 +2502,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2517,8 +2516,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3111,7 +3111,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f868870851..95a4dc3902 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -802,7 +802,8 @@ enum ref_transaction_error ref_update_check_old_target(const char *referent,
 
 /*
  * Check if the ref must exist, this means that the old_oid or
- * old_target is non NULL.
+ * old_target is non NULL. Log-only updates never require the old state to
+ * match.
  */
 int ref_update_expects_existing_old_ref(struct ref_update *update);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec..44af58ac50 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-08-04  9:46   ` [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-08-04  9:46   ` Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  9:46 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1421-reflog-write.sh |  4 +---
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index f88928de74..946eb48941 100644
--- a/refs.c
+++ b/refs.c
@@ -1385,7 +1385,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 253dd8f4d5..090b4fdff4 100644
--- a/refs.h
+++ b/refs.h
@@ -760,13 +760,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index c37fbfd138..851b1b33f4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3026,6 +3026,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -3043,7 +3057,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50..99fafd75eb 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
index dd7ffa5241..46df64c176 100755
--- a/t/t1421-reflog-write.sh
+++ b/t/t1421-reflog-write.sh
@@ -101,11 +101,9 @@ test_expect_success 'simple writes' '
 		EOF
 
 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
-		# Note: the old object ID of the second reflog entry is broken.
-		# This will be fixed in subsequent commits.
 		test_reflog_matches . refs/heads/something <<-EOF
 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
 		EOF
 	)
 '
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7d..0e1116a319 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done <reflogs
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-04  7:38       ` Patrick Steinhardt
@ 2025-08-04 14:47         ` Jeff King
  0 siblings, 0 replies; 114+ messages in thread
From: Jeff King @ 2025-08-04 14:47 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Mon, Aug 04, 2025 at 09:38:29AM +0200, Patrick Steinhardt wrote:

> > Shouldn't we quietly drop the HEAD reflog update, rather than forcing
> > the whole transaction to fail? The user never asked us to update HEAD at
> > all. It was something we opportunistically decided to do, and now we
> > find out that it is not appropriate to do so.
> 
> That's something I wasn't quite sure about, either. Honestly, the reason
> I shied away from it is that it needs a bit more munging for an edge
> case that is hard to test reliably. But I guess we can do something like
> the below patch to skip writing the reflog message instea.

Yeah, I wondered what it would look like to drop a single update from a
transaction, since that's not something we currently allow. And indeed,
this is a bit scary:

> @@ -2626,8 +2632,16 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
>  		 */
>  		if (!(update->type & REF_ISSYMREF) ||
>  		    strcmp(update->parent_update->refname, referent.buf)) {
> -			strbuf_addstr(err, "HEAD has been racily updated");
> -			ret = REF_TRANSACTION_ERROR_GENERIC;
> +			if (unlock_ref(lock))
> +				strmap_remove(&backend_data->ref_locks,
> +					      update->refname, 0);
> +
> +			memmove(transaction->updates + update_idx,
> +				transaction->updates + update_idx + 1,
> +				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
> +			transaction->nr--;
> +
> +			ret = 1;
>  			goto out;
>  		}

Not because it's necessarily wrong, but it feels like a maintainability
problem, when the transaction code learns some new struct field similar
to "ref_locks", and we have to update it here, too. I dunno. Pulling it
out into a "transaction_drop_update()" helper would make that a bit more
obvious, but you're right that the fundamental issue is that we're not
going to be testing this very well.

Maybe erroring out, as your original patch did, is the least-bad thing,
then? I think that _might_ even be what happens in the current code as
an emergent behavior. We leave HAVE_OLD_OID set, so we'd expect HEAD to
resolve to the same thing it originally did. If you've pointed it
elsewhere, then it probably would fail to resolve to that same oid
(unless you pointed to a different branch with the same tip commit).

I really wish there was an easy way to test this. I guess something like
this:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 89ae4517a9..6aeec2e8e0 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2393,6 +2393,7 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 						    const char *head_ref,
 						    struct strbuf *err)
 {
+	static int force_split = -1;
 	struct ref_update *new_update;
 
 	if ((update->flags & REF_LOG_ONLY) ||
@@ -2401,7 +2402,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 	    (update->flags & REF_UPDATE_VIA_HEAD))
 		return 0;
 
-	if (strcmp(update->refname, head_ref))
+	if (force_split < 0)
+		force_split = git_env_bool("GIT_TEST_FORCE_SPLIT_HEAD_UPDATE", 0);
+
+	if (!force_split && strcmp(update->refname, head_ref))
 		return 0;
 
 	/*

along with:

  git commit --allow-empty one
  git commit --allow-empty two
  GIT_TEST_FORCE_SPLIT_HEAD_UPDATE=1 git branch foo HEAD^

creates roughly the situation (HEAD was never pointed at "foo", but
we'll create the reflog update for it anyway). It does fail with:

  fatal: cannot lock ref 'HEAD': reference already exists

even before your patch. And after, we get:

  fatal: HEAD has been racily updated

So it probably is just not something that happens very often, as I don't
recall ever seeing any discussion of it.

I dunno. Looks like you posted a new version of the series that loosens
this, so I'll take a peek at that (I also wondered whether what you
posted above leaks entries in the update struct, so maybe you've dealt
with that).

-Peff

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-08-04 15:38     ` Jeff King
  0 siblings, 0 replies; 114+ messages in thread
From: Jeff King @ 2025-08-04 15:38 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Mon, Aug 04, 2025 at 11:46:07AM +0200, Patrick Steinhardt wrote:

> +		/*
> +		 * Check that "HEAD" didn't racily change since we have looked
> +		 * it up. If it did we remove the reflog-only updateg from the
> +		 * transaction again.
> +		 *
> +		 * Note that this does not catch all races: if "HEAD" was
> +		 * racily changed to point to one of the refs part of the
> +		 * transaction then we would miss writing the split reflog
> +		 * entry for "HEAD".
> +		 */
> +		if (!(update->type & REF_ISSYMREF) ||
> +		    strcmp(update->parent_update->refname, referent.buf)) {
> +			if (unlock_ref(lock))
> +				strmap_remove(&backend_data->ref_locks,
> +					      update->refname, 0);
> +
> +			memmove(transaction->updates + update_idx,
> +				transaction->updates + update_idx + 1,
> +				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
> +			transaction->nr--;
> +
> +			ret = 1;
> +			goto out;
> +		}

OK, so this is basically the same as the patch you posted earlier. Let's
see how it fares with my hacky GIT_TEST_FORCE_SPLIT_HEAD_UPDATE patch:

  $ GIT_TEST_FORCE_SPLIT_HEAD_UPDATE=1 git branch foo HEAD^
  fatal:

Yikes. I'm not sure if there's a bug here, or if my hacky patch is
violating some other assumption. It looks like we get to the die() call
in branch.c:create_branch() because the transaction reports failure, but
with an empty err strbuf.

Ah, I think I see it. When we return from lock_ref_for_update(), we've
set "ret" to "1", indicating we are skipping the update. But then we do
this:

  if (ret > 0)
	continue;

I think there are two problems there:

  1. That "ret" is also used as our return from
     files_transaction_prepare(). So if this is the last update in the
     transaction, then we return "1", rather than "0" for success, and
     the caller thinks there was an error.

  2. If it's not the last transaction, then we go to the next element in
     the loop. But because it's a for-loop, we still increment "i",
     which is wrong (because we shrunk the transaction list). We need to
     check that "i" again.

So maybe:

@@ -2910,8 +2914,11 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
-			if (ret > 0)
+			if (ret > 0) {
+				ret = 0; /* not an error; we skipped it */
+				i--; /* we shrunk the list */
 				continue;
+			}
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;

I confirmed that fixes case (1). I guess I could test case (2) with a
bigger transaction involving multiple refs, but it's awkward because my
"force split update" patch would try to create multiple HEAD updates. :-/

I guess maybe it should be "pretend HEAD is this", like so:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 851b1b33f4..564b77d0da 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2895,6 +2895,14 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 		FREE_AND_NULL(head_ref);
 	}
 
+	{
+		const char *v = getenv("GIT_TEST_PRETEND_SPLIT_HEAD");
+		if (v) {
+			free(head_ref);
+			head_ref = xstrdup(v);
+		}
+	}
+
 	/*
 	 * Acquire all locks, verify old values if provided, check
 	 * that new values are valid, and write new values to the

We have to be a bit tricky here. The split head update is always added
at the end during the transaction preparation. So we need a situation
where another update is added _after_ that. I guess it would be another
symref split (but done by updating the symref).

So:

  git symbolic-ref refs/heads/SYMREF refs/heads/dest
  (
    echo "create refs/heads/foo HEAD"
    echo "create refs/heads/SYMREF HEAD"
  ) |
  GIT_TEST_PRETEND_SPLIT_HEAD=refs/heads/foo git update-ref --stdin

ends up with four updates:

  - the original create foo
  - the original create SYMREF
  - the reflog update of HEAD from split_head_update()
  - the update of refs/heads/dest from split_symref_update()

And indeed, running that through the debugger shows that we'd otherwise
skip the final update with your patch (but the extra "i--" fixes it).

I also tried this with SANITIZE=leak, and I think you'd need something
like this, as well:

diff --git a/refs.c b/refs.c
index 946eb48941..27c182e107 100644
--- a/refs.c
+++ b/refs.c
@@ -1184,6 +1184,15 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
 	return tr;
 }
 
+void ref_update_free(struct ref_update *u)
+{
+	free(u->msg);
+	free(u->committer_info);
+	free((char *)u->new_target);
+	free((char *)u->old_target);
+	free(u);
+}
+
 void ref_transaction_free(struct ref_transaction *transaction)
 {
 	size_t i;
@@ -1204,13 +1213,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
 		break;
 	}
 
-	for (i = 0; i < transaction->nr; i++) {
-		free(transaction->updates[i]->msg);
-		free(transaction->updates[i]->committer_info);
-		free((char *)transaction->updates[i]->new_target);
-		free((char *)transaction->updates[i]->old_target);
-		free(transaction->updates[i]);
-	}
+	for (i = 0; i < transaction->nr; i++)
+		ref_update_free(transaction->updates[i]);
 
 	if (transaction->rejections)
 		free(transaction->rejections->update_indices);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 851b1b33f4..0246715383 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2640,6 +2640,7 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 				transaction->updates + update_idx + 1,
 				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
 			transaction->nr--;
+			ref_update_free(update);
 
 			ret = 1;
 			goto out;
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 95a4dc3902..6b5895a3b3 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -144,6 +144,8 @@ struct ref_update {
 	const char refname[FLEX_ARRAY];
 };
 
+void ref_update_free(struct ref_update *);
+
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno);

It's a little hard to see that freeing update inside
lock_ref_for_update() is safe (but we "goto out" after and don't look at
it again). I think it would all be a bit more obvious if
lock_ref_for_update() just returned 1 for "skip this", and then the
caller actually shrunk the transaction list.

-Peff

^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 0/9] refs: fix migration of reflog entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (10 preceding siblings ...)
  2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-08-05 15:11 ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                     ` (9 more replies)
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
  12 siblings, 10 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@pks.im

Changes in v3:
  - `git reflog write` now requires fully-qualified refnames.
  - A new commit that plugs one part of the race around splitting of
    reflogs for HEAD in the "files" backend.
  - Link to v2: https://lore.kernel.org/r/20250725-pks-reflog-append-v2-0-e4e7cbe3f578@pks.im

Changes in v4:
  - Improve one of the tests to use an existing abbreviated object ID
    instead of a non-existing one to make sure that we indeed fail due
    to the abbreviation.
  - Don't abort the transaction when HEAD has been racily updated, but
    drop the log-only update instead.
  - Link to v3: https://lore.kernel.org/r/20250729-pks-reflog-append-v3-0-9614d310f073@pks.im

Changes in v5:
  - Revert back to the logic that aborts the transaction if we see a
    racy HEAD update. It's the pragmatic thing to do for an edge case
    that is very unlikely to ever happen.
  - Link to v4: https://lore.kernel.org/r/20250804-pks-reflog-append-v4-0-13213fef7200@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (9):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs/files: detect race when generating reflog entry for HEAD
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  76 +++++++++++++------------
 builtin/reflog.c              | 103 +++++++++++++++++++++++++++-------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++---------
 refs.h                        |  24 +++++++-
 refs/files-backend.c          |  65 +++++++++++++++++++---
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 ++++++---
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 126 ++++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 +++++---
 12 files changed, 403 insertions(+), 107 deletions(-)

Range-diff versus v4:

 1:  dd5a8065de =  1:  56319ac4ea Documentation/git-reflog: convert to use synopsis type
 2:  6cb5f94795 =  2:  d294ddc23f builtin/reflog: improve grouping of subcommands
 3:  6ba62d6b33 =  3:  9b28228df6 refs: export `ref_transaction_update_reflog()`
 4:  e5630532ba =  4:  936223d8c8 builtin/reflog: implement subcommand to write new entries
 5:  3a5f13aeb2 =  5:  78a3d896c7 ident: fix type of string length parameter
 6:  1d3ac825e1 =  6:  b164e5ef4a refs: fix identity for migrated reflogs
 7:  35cbf4a06f !  7:  e5817d65bf refs/files: detect race when generating reflog entry for HEAD
    @@ Commit message
         manually resolving the old object ID of that entry, we now use the same
         old state as for the parent update.
     
    +    If we detect such a racy update we abort the transaction. This is a bit
    +    heavy-handed: the user didn't even ask us to write a reflog update for
    +    "HEAD", so it might be surprising if we abort the transaction. That
    +    being said:
    +
    +      - Normal users wouldn't typically hit this case as we only hit the
    +        relevant code when committing to a branch that is being pointed to
    +        by "HEAD" directly. Commands like git-commit(1) typically commit to
    +        "HEAD" itself though.
    +
    +      - Scripted users that use git-update-ref(1) and related plumbing
    +        commands are unlikely to hit this case either, as they would have to
    +        update the pointed-to-branch at the same as "HEAD" is being updated,
    +        which is an exceedingly rare event.
    +
    +    The alternative would be to instead drop the log-only update completely,
    +    but that would require more logic that is hard to verify without adding
    +    infrastructure specific for such a test. So we rather do the pragmatic
    +    thing and don't worry too much about an edge case that is very unlikely
    +    to happen.
    +
         Unfortunately, this change only helps with the second race. We cannot
         reliably plug the first race without locking the HEAD reference at the
         start of the transaction. Locking HEAD unconditionally would effectively
    @@ refs/files-backend.c
      struct ref_lock {
      	char *ref_name;
      	struct lock_file lk;
    -@@ refs/files-backend.c: int parse_loose_ref_contents(const struct git_hash_algo *algop,
    - 	return 0;
    - }
    - 
    --static void unlock_ref(struct ref_lock *lock)
    -+static int unlock_ref(struct ref_lock *lock)
    - {
    - 	lock->count--;
    - 	if (!lock->count) {
    - 		rollback_lock_file(&lock->lk);
    - 		free(lock->ref_name);
    - 		free(lock);
    -+		return 1;
    - 	}
    -+	return 0;
    - }
    - 
    - /*
     @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct ref_update *update,
      
      	new_update = ref_transaction_add_update(
    @@ refs/files-backend.c: static enum ref_transaction_error split_head_update(struct
      
      	/*
      	 * Add "HEAD". This insertion is O(N) in the transaction
    -@@ refs/files-backend.c: struct files_transaction_backend_data {
    -  *   the referent to transaction.
    -  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
    -  *   update of HEAD.
    -+ *
    -+ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
    -+ * error code otherwise.
    -  */
    - static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
    - 						      struct ref_update *update,
     @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
      
      	update->backend_data = lock;
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +
     +		/*
     +		 * Check that "HEAD" didn't racily change since we have looked
    -+		 * it up. If it did we remove the reflog-only updateg from the
    -+		 * transaction again.
    ++		 * it up. If it did we must refuse to write the reflog entry.
     +		 *
     +		 * Note that this does not catch all races: if "HEAD" was
     +		 * racily changed to point to one of the refs part of the
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
     +		 */
     +		if (!(update->type & REF_ISSYMREF) ||
     +		    strcmp(update->parent_update->refname, referent.buf)) {
    -+			if (unlock_ref(lock))
    -+				strmap_remove(&backend_data->ref_locks,
    -+					      update->refname, 0);
    -+
    -+			memmove(transaction->updates + update_idx,
    -+				transaction->updates + update_idx + 1,
    -+				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
    -+			transaction->nr--;
    -+
    -+			ret = 1;
    ++			strbuf_addstr(err, "HEAD has been racily updated");
    ++			ret = REF_TRANSACTION_ERROR_GENERIC;
     +			goto out;
     +		}
     +
    @@ refs/files-backend.c: static enum ref_transaction_error lock_ref_for_update(stru
      		if (update->flags & REF_NO_DEREF) {
      			/*
      			 * We won't be reading the referent as part of
    -@@ refs/files-backend.c: static int files_transaction_prepare(struct ref_store *ref_store,
    - 					  head_ref, &refnames_to_check,
    - 					  err);
    - 		if (ret) {
    -+			if (ret > 0)
    -+				continue;
    - 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    - 				strbuf_reset(err);
    - 				ret = 0;
 8:  00caacba94 =  8:  06c98ff66f refs: stop unsetting REF_HAVE_OLD for log-only updates
 9:  2271b80a90 =  9:  a3f8e9bee3 refs: fix invalid old object IDs when migrating reflogs

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 17:04     ` Jean-Noël AVILA
  2025-08-05 15:11   ` [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fe..707a9b39ed 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 61 ++++++++++++++++++++++---------------------
 builtin/reflog.c              | 38 +++++++++++++--------------
 2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 707a9b39ed..c3801b82fb 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
@@ -43,11 +43,9 @@ actions, and in addition the `HEAD` reflog records branch switching.
 
 The "list" subcommand lists all refs which have a corresponding reflog.
 
-The "expire" subcommand prunes older reflog entries. Entries older
-than `expire` time, or entries older than `expire-unreachable` time
-and not reachable from the current tip, are removed from the reflog.
-This is typically not used directly by end users -- instead, see
-linkgit:git-gc[1].
+The "exists" subcommand checks whether a ref has a reflog.  It exits
+with zero status if the reflog exists, and non-zero status if it does
+not.
 
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
@@ -58,9 +56,11 @@ The "drop" subcommand completely removes the reflog for the specified
 references. This is in contrast to "expire" and "delete", both of which
 can be used to delete reflog entries, but not the reflog itself.
 
-The "exists" subcommand checks whether a ref has a reflog.  It exits
-with zero status if the reflog exists, and non-zero status if it does
-not.
+The "expire" subcommand prunes older reflog entries. Entries older
+than `expire` time, or entries older than `expire-unreachable` time
+and not reachable from the current tip, are removed from the reflog.
+This is typically not used directly by end users -- instead, see
+linkgit:git-gc[1].
 
 OPTIONS
 -------
@@ -71,6 +71,25 @@ Options for `show`
 `git reflog show` accepts any of the options accepted by `git log`.
 
 
+Options for `delete`
+~~~~~~~~~~~~~~~~~~~~
+
+`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
+`--dry-run`, and `--verbose`, with the same meanings as when they are
+used with `expire`.
+
+Options for `drop`
+~~~~~~~~~~~~~~~~~~
+
+--all::
+	Drop the reflogs of all references from all worktrees.
+
+--single-worktree::
+	By default when `--all` is specified, reflogs from all working
+	trees are dropped. This option limits the processing to reflogs
+	from the current working tree only.
+
+
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -130,24 +149,6 @@ which didn't protect objects referred to by reflogs.
 	Print extra information on screen.
 
 
-Options for `delete`
-~~~~~~~~~~~~~~~~~~~~
-
-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
-`--dry-run`, and `--verbose`, with the same meanings as when they are
-used with `expire`.
-
-Options for `drop`
-~~~~~~~~~~~~~~~~~~
-
---all::
-	Drop the reflogs of all references from all worktrees.
-
---single-worktree::
-	By default when `--all` is specified, reflogs from all working
-	trees are dropped. This option limits the processing to reflogs
-	from the current working tree only.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c..b00b3f9edc 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()`
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2..8aa9f7236a 100644
--- a/refs.c
+++ b/refs.c
@@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3019,8 +3013,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index 46a6008e07..253dd8f4d5 100644
--- a/refs.h
+++ b/refs.h
@@ -795,6 +795,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to `ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When set to zero, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 5/9] ident: fix type of string length parameter Patrick Steinhardt
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |   7 +++
 builtin/reflog.c              |  65 +++++++++++++++++++++
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 128 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index c3801b82fb..34232a539a 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
@@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a reflog.  It exits
 with zero status if the reflog exists, and non-zero status if it does
 not.
 
+The "write" subcommand writes a single entry to the reflog of a given
+reference. This new entry is appended to the reflog and will thus become
+the most recent entry. The reference name must be fully qualified. Both the old
+and new object IDs must not be abbreviated and must point to existing objects.
+The reflog message gets normalized.
+
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
 reflog delete master@{2}`"). This subcommand is also typically not used
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc..a1b4e02204 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (!is_root_ref(ref) && check_refname_format(ref, 0))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index d052fc3e23..adcdf09e74 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -220,6 +220,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 0000000000..dd7ffa5241
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,128 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'unqualified refname is rejected' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write unqualified $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
+		test_grep "old object .* does not exist" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'abbreviated object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		abbreviated_oid=$(git rev-parse HEAD | test_copy_bytes 8) &&
+		test_must_fail git reflog write refs/heads/something $abbreviated_oid $ZERO_OID old-object-id 2>err &&
+		test_grep "invalid old object ID" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $abbreviated_oid new-object-id 2>err &&
+		test_grep "invalid new object ID" err
+	)
+'
+
+test_expect_success 'reflog message gets normalized' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
+		git reflog show -1 --format=%gs HEAD >actual &&
+		echo "message with newlines" >expected &&
+		test_cmp expected actual
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		# Note: the old object ID of the second reflog entry is broken.
+		# This will be fixed in subsequent commits.
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_expect_success 'can write to root ref' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write ROOT_REF_HEAD $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . ROOT_REF_HEAD <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+	)
+'
+
+test_done

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 5/9] ident: fix type of string length parameter
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 967895d8850..a7a2d132579 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 6/9] refs: fix identity for migrated reflogs
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 5/9] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Note that we also add two new `name` and `mail` string buffers to the
callback structures and splice them through to the callbacks. This is
done so that we can avoid allocating a new buffer every time we compute
the committer information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 8aa9f7236a..a5f9ffaa45 100644
--- a/refs.c
+++ b/refs.c
@@ -2954,7 +2954,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2993,7 +2993,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -3003,13 +3003,21 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
-	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3026,6 +3034,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3124,6 +3134,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3299,6 +3311,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When updating a reference that is being pointed to HEAD we don't only
write a reflog message for that particular reference, but also generate
one for HEAD. This logic is handled by `split_head_update()`, where we:

  1. Verify that the condition actually triggered. This is done by
     reading HEAD at the start of the transaction so that we can then
     check whether a given reference update refers to its target.

  2. Queue a new log-only update for HEAD in case it did.

But the logic is unfortunately not free of races, as we do not lock the
HEAD reference after we have read its target. This can lead to the
following two scenarios:

  - HEAD gets concurrently updated to point to one of the references we
    have already processed. This causes us not writing a reflog message
    even though we should have done so.

  - HEAD gets concurrently updated to no longer point to a reference
    anymore that we have already processed. This causes us to write a
    reflog message even though we should _not_ have done so.

Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
is specific to the "files" backend. If set, we will double check that
the HEAD reference still points to the reference that we are creating
the reflog entry for after we have locked HEAD. Furthermore, instead of
manually resolving the old object ID of that entry, we now use the same
old state as for the parent update.

If we detect such a racy update we abort the transaction. This is a bit
heavy-handed: the user didn't even ask us to write a reflog update for
"HEAD", so it might be surprising if we abort the transaction. That
being said:

  - Normal users wouldn't typically hit this case as we only hit the
    relevant code when committing to a branch that is being pointed to
    by "HEAD" directly. Commands like git-commit(1) typically commit to
    "HEAD" itself though.

  - Scripted users that use git-update-ref(1) and related plumbing
    commands are unlikely to hit this case either, as they would have to
    update the pointed-to-branch at the same as "HEAD" is being updated,
    which is an exceedingly rare event.

The alternative would be to instead drop the log-only update completely,
but that would require more logic that is hard to verify without adding
infrastructure specific for such a test. So we rather do the pragmatic
thing and don't worry too much about an edge case that is very unlikely
to happen.

Unfortunately, this change only helps with the second race. We cannot
reliably plug the first race without locking the HEAD reference at the
start of the transaction. Locking HEAD unconditionally would effectively
serialize all writes though, and that doesn't seem like an option. Also,
double checking its value at the end of the transaction is not an option
either, as its target may have flip-flopped during the transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bf6f89b1d1..ba018b0984 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -68,6 +68,12 @@
  */
 #define REF_DELETED_RMDIR (1 << 9)
 
+/*
+ * Used to indicate that the reflog-only update has been created via
+ * `split_head_update()`.
+ */
+#define REF_LOG_VIA_SPLIT (1 << 14)
+
 struct ref_lock {
 	char *ref_name;
 	struct lock_file lk;
@@ -2420,9 +2426,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 
 	new_update = ref_transaction_add_update(
 			transaction, "HEAD",
-			update->flags | REF_LOG_ONLY | REF_NO_DEREF,
+			update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT,
 			&update->new_oid, &update->old_oid,
 			NULL, NULL, update->committer_info, update->msg);
+	new_update->parent_update = update;
 
 	/*
 	 * Add "HEAD". This insertion is O(N) in the transaction
@@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 	update->backend_data = lock;
 
-	if (update->type & REF_ISSYMREF) {
+	if (update->flags & REF_LOG_VIA_SPLIT) {
+		struct ref_lock *parent_lock;
+
+		if (!update->parent_update)
+			BUG("split update without a parent");
+
+		parent_lock = update->parent_update->backend_data;
+
+		/*
+		 * Check that "HEAD" didn't racily change since we have looked
+		 * it up. If it did we must refuse to write the reflog entry.
+		 *
+		 * Note that this does not catch all races: if "HEAD" was
+		 * racily changed to point to one of the refs part of the
+		 * transaction then we would miss writing the split reflog
+		 * entry for "HEAD".
+		 */
+		if (!(update->type & REF_ISSYMREF) ||
+		    strcmp(update->parent_update->refname, referent.buf)) {
+			strbuf_addstr(err, "HEAD has been racily updated");
+			ret = REF_TRANSACTION_ERROR_GENERIC;
+			goto out;
+		}
+
+		if (update->flags & REF_HAVE_OLD) {
+			oidcpy(&lock->old_oid, &update->old_oid);
+		} else {
+			oidcpy(&lock->old_oid, &parent_lock->old_oid);
+		}
+	} else if (update->type & REF_ISSYMREF) {
 		if (update->flags & REF_NO_DEREF) {
 			/*
 			 * We won't be reading the referent as part of

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 15:11   ` [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.

Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  8 +++-----
 refs/files-backend.c    |  9 +++++----
 refs/refs-internal.h    |  3 ++-
 refs/reftable-backend.c | 12 +++---------
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index a5f9ffaa45..f88928de74 100644
--- a/refs.c
+++ b/refs.c
@@ -1393,11 +1393,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3318,6 +3313,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
+
 	return (update->flags & REF_HAVE_OLD) &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ba018b0984..85ab2ef2b9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2500,7 +2500,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2515,8 +2514,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3095,7 +3095,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f868870851..95a4dc3902 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -802,7 +802,8 @@ enum ref_transaction_error ref_update_check_old_target(const char *referent,
 
 /*
  * Check if the ref must exist, this means that the old_oid or
- * old_target is non NULL.
+ * old_target is non NULL. Log-only updates never require the old state to
+ * match.
  */
 int ref_update_expects_existing_old_ref(struct ref_update *update);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec..44af58ac50 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-08-05 15:11   ` Patrick Steinhardt
  2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
  9 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-05 15:11 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1421-reflog-write.sh |  4 +---
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index f88928de74..946eb48941 100644
--- a/refs.c
+++ b/refs.c
@@ -1385,7 +1385,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 253dd8f4d5..090b4fdff4 100644
--- a/refs.h
+++ b/refs.h
@@ -760,13 +760,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 85ab2ef2b9..905555365b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3010,6 +3010,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -3027,7 +3041,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50..99fafd75eb 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
index dd7ffa5241..46df64c176 100755
--- a/t/t1421-reflog-write.sh
+++ b/t/t1421-reflog-write.sh
@@ -101,11 +101,9 @@ test_expect_success 'simple writes' '
 		EOF
 
 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
-		# Note: the old object ID of the second reflog entry is broken.
-		# This will be fixed in subsequent commits.
 		test_reflog_matches . refs/heads/something <<-EOF
 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
 		EOF
 	)
 '
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7d..0e1116a319 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done <reflogs
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.50.1.723.g3e08bea96f.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-08-05 17:04     ` Jean-Noël AVILA
  2025-08-05 21:47       ` Junio C Hamano
  0 siblings, 1 reply; 114+ messages in thread
From: Jean-Noël AVILA @ 2025-08-05 17:04 UTC (permalink / raw)
  To: git, Patrick Steinhardt
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble

On Tuesday, 5 August 2025 17:11:31 CEST Patrick Steinhardt wrote:
> With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
> have introduced a new synopsis type that simplifies the rules for
> typesetting a command's synopsis. Convert the git-reflog(1)
> documentation to use it.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-reflog.adoc | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
> index 412f06b8fe..707a9b39ed 100644
> --- a/Documentation/git-reflog.adoc
> +++ b/Documentation/git-reflog.adoc
> @@ -8,16 +8,16 @@ git-reflog - Manage reflog information
> 
>  SYNOPSIS
>  --------
> -[verse]
> -'git reflog' [show] [<log-options>] [<ref>]
> -'git reflog list'
> -'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
> +[synopsis]
> +git reflog [show] [<log-options>] [<ref>]
> +git reflog list
> +git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
>  	[--rewrite] [--updateref] [--stale-fix]
>  	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
> -'git reflog delete' [--rewrite] [--updateref]
> +git reflog delete [--rewrite] [--updateref]
>  	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
> -'git reflog drop' [--all [--single-worktree] | <refs>...]
> -'git reflog exists' <ref>
> +git reflog drop [--all [--single-worktree] | <refs>...]
> +git reflog exists <ref>
> 
>  DESCRIPTION
>  -----------


Hello,

Be careful that with the doc lint series I'm proposing, this change will raise 
a failure: one of the tests checks that switching the main synopsis to 
[synopsis] is linked to switching the definitions lists to inline synopsis, 
using `backticks`. This check may be too restrictive though.

Jean-Noël



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v5 0/9] refs: fix migration of reflog entries
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
                     ` (8 preceding siblings ...)
  2025-08-05 15:11   ` [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
@ 2025-08-05 18:47   ` Jeff King
  2025-08-06  5:53     ` Patrick Steinhardt
  9 siblings, 1 reply; 114+ messages in thread
From: Jeff King @ 2025-08-05 18:47 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Tue, Aug 05, 2025 at 05:11:30PM +0200, Patrick Steinhardt wrote:

> Changes in v5:
>   - Revert back to the logic that aborts the transaction if we see a
>     racy HEAD update. It's the pragmatic thing to do for an edge case
>     that is very unlikely to ever happen.
>   - Link to v4: https://lore.kernel.org/r/20250804-pks-reflog-append-v4-0-13213fef7200@pks.im

Thanks, this makes sense to me. I hadn't reviewed the whole series very
thoroughly, but I think others did. And certainly this version addresses
all of the discussion I did participate in.

-Peff

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-05 17:04     ` Jean-Noël AVILA
@ 2025-08-05 21:47       ` Junio C Hamano
  2025-08-06  5:53         ` Patrick Steinhardt
  0 siblings, 1 reply; 114+ messages in thread
From: Junio C Hamano @ 2025-08-05 21:47 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, Jean-Noël AVILA, Karthik Nayak, Justin Tobler,
	SZEDER Gábor, Toon Claes, Jeff King, Kristoffer Haugsbakk,
	Ben Knoble

Jean-Noël AVILA <jn.avila@free.fr> writes:

> Be careful that with the doc lint series I'm proposing, this change will raise 
> a failure: one of the tests checks that switching the main synopsis to 
> [synopsis] is linked to switching the definitions lists to inline synopsis, 
> using `backticks`. This check may be too restrictive though.

This is what I've queued on top of your topic to prepare for today's
integration.

--- >8 ---
Subject: [PATCH] fixup! Documentation/git-reflog: convert to use synopsis type

---
 Documentation/git-reflog.adoc | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 34232a539a..38af0c977a 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -88,10 +88,10 @@ used with `expire`.
 Options for `drop`
 ~~~~~~~~~~~~~~~~~~
 
---all::
+`--all`::
 	Drop the reflogs of all references from all worktrees.
 
---single-worktree::
+`--single-worktree`::
 	By default when `--all` is specified, reflogs from all working
 	trees are dropped. This option limits the processing to reflogs
 	from the current working tree only.
@@ -100,15 +100,15 @@ Options for `drop`
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
---all::
+`--all`::
 	Process the reflogs of all references.
 
---single-worktree::
+`--single-worktree`::
 	By default when `--all` is specified, reflogs from all working
 	trees are processed. This option limits the processing to reflogs
 	from the current working tree only.
 
---expire=<time>::
+`--expire=<time>`::
 	Prune entries older than the specified time. If this option is
 	not specified, the expiration time is taken from the
 	configuration setting `gc.reflogExpire`, which in turn
@@ -116,7 +116,7 @@ Options for `expire`
 	of their age; `--expire=never` turns off pruning of reachable
 	entries (but see `--expire-unreachable`).
 
---expire-unreachable=<time>::
+`--expire-unreachable=<time>`::
 	Prune entries older than `<time>` that are not reachable from
 	the current tip of the branch. If this option is not
 	specified, the expiration time is taken from the configuration
@@ -126,17 +126,17 @@ Options for `expire`
 	turns off early pruning of unreachable entries (but see
 	`--expire`).
 
---updateref::
+`--updateref`::
 	Update the reference to the value of the top reflog entry (i.e.
 	<ref>@\{0\}) if the previous top entry was pruned.  (This
 	option is ignored for symbolic references.)
 
---rewrite::
+`--rewrite`::
 	If a reflog entry's predecessor is pruned, adjust its "old"
 	SHA-1 to be equal to the "new" SHA-1 field of the entry that
 	now precedes it.
 
---stale-fix::
+`--stale-fix`::
 	Prune any reflog entries that point to "broken commits". A
 	broken commit is a commit that is not reachable from any of
 	the reference tips and that refers, directly or indirectly, to
@@ -147,12 +147,12 @@ has the same cost as 'git prune'.  It is primarily intended to fix
 corruption caused by garbage collecting using older versions of Git,
 which didn't protect objects referred to by reflogs.
 
--n::
---dry-run::
+`-n`::
+`--dry-run`::
 	Do not actually prune any entries; just show what would have
 	been pruned.
 
---verbose::
+`--verbose`::
 	Print extra information on screen.
 
 
-- 
2.51.0-rc0-162-g220549999b


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* Re: [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-05 21:47       ` Junio C Hamano
@ 2025-08-06  5:53         ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jean-Noël AVILA, Karthik Nayak, Justin Tobler,
	SZEDER Gábor, Toon Claes, Jeff King, Kristoffer Haugsbakk,
	Ben Knoble

On Tue, Aug 05, 2025 at 02:47:41PM -0700, Junio C Hamano wrote:
> Jean-Noël AVILA <jn.avila@free.fr> writes:
> 
> > Be careful that with the doc lint series I'm proposing, this change will raise 
> > a failure: one of the tests checks that switching the main synopsis to 
> > [synopsis] is linked to switching the definitions lists to inline synopsis, 
> > using `backticks`. This check may be too restrictive though.
> 
> This is what I've queued on top of your topic to prepare for today's
> integration.

Thanks. I see that the fixup commit cannot trivially be applied to the
first commit due to a merge conflict. So let me send a new version that
absorbs the fix.

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v5 0/9] refs: fix migration of reflog entries
  2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
@ 2025-08-06  5:53     ` Patrick Steinhardt
  0 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:53 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Karthik Nayak, Justin Tobler, Junio C Hamano,
	SZEDER Gábor, Toon Claes, Kristoffer Haugsbakk, Ben Knoble

On Tue, Aug 05, 2025 at 02:47:12PM -0400, Jeff King wrote:
> On Tue, Aug 05, 2025 at 05:11:30PM +0200, Patrick Steinhardt wrote:
> 
> > Changes in v5:
> >   - Revert back to the logic that aborts the transaction if we see a
> >     racy HEAD update. It's the pragmatic thing to do for an edge case
> >     that is very unlikely to ever happen.
> >   - Link to v4: https://lore.kernel.org/r/20250804-pks-reflog-append-v4-0-13213fef7200@pks.im
> 
> Thanks, this makes sense to me. I hadn't reviewed the whole series very
> thoroughly, but I think others did. And certainly this version addresses
> all of the discussion I did participate in.

Thanks for the discussion!

Patrick

^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v6 0/9] refs: fix migration of reflog entries
  2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
                   ` (11 preceding siblings ...)
  2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
@ 2025-08-06  5:54 ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
                     ` (8 more replies)
  12 siblings, 9 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

Hi,

after the announcement that "reftable" will become the default backend
in Git 3.0 I've revived the efforts to implement this backend in
libgit2. I'm happy to report that this implementation is almost done by
now: out of 3000 tests only four are failing now.

For two of these tests I have been completely puzzled why those are
failing, as everything really looked perfectly fine in libgit2. As it
turned out, the bug wasn't in libgit2 though, but in Git. Namely, the
way we migrate reflog entries between storage formats is broken in two
ways:

  - The identity we write into the reflog entries is wrong.

  - The old commit ID of reflog entries is always set to all-zeroes.
    This is what caused the libgit2 tests to fail, as I used `git refs
    migrate` to convert test repositories to use reftables.

This patch series fixes both of these issues. Furthermore, it also adds
a new `git reflog write` subcommand to write new reflog entries for a
specific reference. This command was helpful to reproduce some test
constellations in libgit2.

Changes in v2:
  - !!! The base of this topic has changed so that it sits on top of
    v2.50.1. This is done so that we can backport this change to older
    release tracks.
  - A couple of typo fixes and clarifications for commit messages.
  - Reorder sections in git-reflog(1) manpage according to the
    reordering we have in the synopsis.
  - Add a section for the new `write` command.
  - Improve test coverage for the `git reflog write` command.
  - Avoid `cat`ing a file into a Bash loop.
  - Remove a stale comment.
  - Make `ref_update_expects_existing_old_ref()` a bit more straight
    forward.
  - Link to v1: https://lore.kernel.org/r/20250722-pks-reflog-append-v1-0-183e5949de16@pks.im

Changes in v3:
  - `git reflog write` now requires fully-qualified refnames.
  - A new commit that plugs one part of the race around splitting of
    reflogs for HEAD in the "files" backend.
  - Link to v2: https://lore.kernel.org/r/20250725-pks-reflog-append-v2-0-e4e7cbe3f578@pks.im

Changes in v4:
  - Improve one of the tests to use an existing abbreviated object ID
    instead of a non-existing one to make sure that we indeed fail due
    to the abbreviation.
  - Don't abort the transaction when HEAD has been racily updated, but
    drop the log-only update instead.
  - Link to v3: https://lore.kernel.org/r/20250729-pks-reflog-append-v3-0-9614d310f073@pks.im

Changes in v5:
  - Revert back to the logic that aborts the transaction if we see a
    racy HEAD update. It's the pragmatic thing to do for an edge case
    that is very unlikely to ever happen.
  - Link to v4: https://lore.kernel.org/r/20250804-pks-reflog-append-v4-0-13213fef7200@pks.im

Changes in v6:
  - Convert options to use backticks in git-reflog(1) to appease the
    upcoming new manpage linter.
  - Link to v5: https://lore.kernel.org/r/20250805-pks-reflog-append-v5-0-050997db09d5@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (9):
      Documentation/git-reflog: convert to use synopsis type
      builtin/reflog: improve grouping of subcommands
      refs: export `ref_transaction_update_reflog()`
      builtin/reflog: implement subcommand to write new entries
      ident: fix type of string length parameter
      refs: fix identity for migrated reflogs
      refs/files: detect race when generating reflog entry for HEAD
      refs: stop unsetting REF_HAVE_OLD for log-only updates
      refs: fix invalid old object IDs when migrating reflogs

 Documentation/git-reflog.adoc |  96 +++++++++++++++++---------------
 builtin/reflog.c              | 103 +++++++++++++++++++++++++++-------
 ident.c                       |   2 +-
 ident.h                       |   2 +-
 refs.c                        |  60 +++++++++++---------
 refs.h                        |  24 +++++++-
 refs/files-backend.c          |  65 +++++++++++++++++++---
 refs/refs-internal.h          |   3 +-
 refs/reftable-backend.c       |  26 ++++++---
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 126 ++++++++++++++++++++++++++++++++++++++++++
 t/t1460-refs-migrate.sh       |  22 +++++---
 12 files changed, 413 insertions(+), 117 deletions(-)

Range-diff versus v5:

 1:  22b0ea5e69 <  -:  ---------- Documentation/git-reflog: convert to use synopsis type
 -:  ---------- >  1:  289dcbe595 Documentation/git-reflog: convert to use synopsis type
 2:  bf5d901269 !  2:  0363c102e6 builtin/reflog: improve grouping of subcommands
    @@ Documentation/git-reflog.adoc: Options for `show`
     +Options for `drop`
     +~~~~~~~~~~~~~~~~~~
     +
    -+--all::
    ++`--all`::
     +	Drop the reflogs of all references from all worktrees.
     +
    -+--single-worktree::
    ++`--single-worktree`::
     +	By default when `--all` is specified, reflogs from all working
     +	trees are dropped. This option limits the processing to reflogs
     +	from the current working tree only.
    @@ Documentation/git-reflog.adoc: which didn't protect objects referred to by reflo
     -Options for `drop`
     -~~~~~~~~~~~~~~~~~~
     -
    ----all::
    +-`--all`::
     -	Drop the reflogs of all references from all worktrees.
     -
    ----single-worktree::
    +-`--single-worktree`::
     -	By default when `--all` is specified, reflogs from all working
     -	trees are dropped. This option limits the processing to reflogs
     -	from the current working tree only.
 3:  709148c2a2 =  3:  ff885b29f4 refs: export `ref_transaction_update_reflog()`
 4:  3ad459bf6e =  4:  8ad1992cc9 builtin/reflog: implement subcommand to write new entries
 5:  207546401a =  5:  bb5dddf606 ident: fix type of string length parameter
 6:  0791e82645 =  6:  e15eba8029 refs: fix identity for migrated reflogs
 7:  e5c0280a36 =  7:  c83037d9a4 refs/files: detect race when generating reflog entry for HEAD
 8:  c31e03ebcb =  8:  8be3d5c126 refs: stop unsetting REF_HAVE_OLD for log-only updates
 9:  61acc7af7a =  9:  11a21b71f6 refs: fix invalid old object IDs when migrating reflogs

---
base-commit: d82adb61ba2fd11d8f2587fca1b6bd7925ce4044
change-id: 20250722-pks-reflog-append-634172d8ab2c


^ permalink raw reply	[flat|nested] 114+ messages in thread

* [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

With 974cdca345c (doc: introduce a synopsis typesetting, 2024-09-24) we
have introduced a new synopsis type that simplifies the rules for
typesetting a command's synopsis. Convert the git-reflog(1)
documentation to use it.

While at it, convert the list of options to use backticks. This is done
to appease an upcoming new linter that mandates the use of backticks
when using the synopsis type.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 412f06b8fe..0d6601fdea 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -8,16 +8,16 @@ git-reflog - Manage reflog information
 
 SYNOPSIS
 --------
-[verse]
-'git reflog' [show] [<log-options>] [<ref>]
-'git reflog list'
-'git reflog expire' [--expire=<time>] [--expire-unreachable=<time>]
+[synopsis]
+git reflog [show] [<log-options>] [<ref>]
+git reflog list
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
 	[--rewrite] [--updateref] [--stale-fix]
 	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
-'git reflog delete' [--rewrite] [--updateref]
+git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
-'git reflog drop' [--all [--single-worktree] | <refs>...]
-'git reflog exists' <ref>
+git reflog drop [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 
 DESCRIPTION
 -----------
@@ -74,15 +74,15 @@ Options for `show`
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
---all::
+`--all`::
 	Process the reflogs of all references.
 
---single-worktree::
+`--single-worktree`::
 	By default when `--all` is specified, reflogs from all working
 	trees are processed. This option limits the processing to reflogs
 	from the current working tree only.
 
---expire=<time>::
+`--expire=<time>`::
 	Prune entries older than the specified time. If this option is
 	not specified, the expiration time is taken from the
 	configuration setting `gc.reflogExpire`, which in turn
@@ -90,7 +90,7 @@ Options for `expire`
 	of their age; `--expire=never` turns off pruning of reachable
 	entries (but see `--expire-unreachable`).
 
---expire-unreachable=<time>::
+`--expire-unreachable=<time>`::
 	Prune entries older than `<time>` that are not reachable from
 	the current tip of the branch. If this option is not
 	specified, the expiration time is taken from the configuration
@@ -100,17 +100,17 @@ Options for `expire`
 	turns off early pruning of unreachable entries (but see
 	`--expire`).
 
---updateref::
+`--updateref`::
 	Update the reference to the value of the top reflog entry (i.e.
 	<ref>@\{0\}) if the previous top entry was pruned.  (This
 	option is ignored for symbolic references.)
 
---rewrite::
+`--rewrite`::
 	If a reflog entry's predecessor is pruned, adjust its "old"
 	SHA-1 to be equal to the "new" SHA-1 field of the entry that
 	now precedes it.
 
---stale-fix::
+`--stale-fix`::
 	Prune any reflog entries that point to "broken commits". A
 	broken commit is a commit that is not reachable from any of
 	the reference tips and that refers, directly or indirectly, to
@@ -121,12 +121,12 @@ has the same cost as 'git prune'.  It is primarily intended to fix
 corruption caused by garbage collecting using older versions of Git,
 which didn't protect objects referred to by reflogs.
 
--n::
---dry-run::
+`-n`::
+`--dry-run`::
 	Do not actually prune any entries; just show what would have
 	been pruned.
 
---verbose::
+`--verbose`::
 	Print extra information on screen.
 
 
@@ -140,10 +140,10 @@ used with `expire`.
 Options for `drop`
 ~~~~~~~~~~~~~~~~~~
 
---all::
+`--all`::
 	Drop the reflogs of all references from all worktrees.
 
---single-worktree::
+`--single-worktree`::
 	By default when `--all` is specified, reflogs from all working
 	trees are dropped. This option limits the processing to reflogs
 	from the current working tree only.

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

The way subcommands of git-reflog(1) are laid out does not make any
immediate sense. Reorder them such that read-only subcommands precede
writing commands for a bit more structure.

Furthermore, move the "expire" subcommand last. This prepares for a
subsequent change where we are about to introduce a new "write" command
to append reflog entries. Like this, the writing subcommands are ordered
such that those affecting a single reflog come before those spanning
across all reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc | 61 ++++++++++++++++++++++---------------------
 builtin/reflog.c              | 38 +++++++++++++--------------
 2 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 0d6601fdea..4eb6c25607 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -11,13 +11,13 @@ SYNOPSIS
 [synopsis]
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
-git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
-	[--rewrite] [--updateref] [--stale-fix]
-	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
+git reflog exists <ref>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
-git reflog exists <ref>
+git reflog expire [--expire=<time>] [--expire-unreachable=<time>]
+	[--rewrite] [--updateref] [--stale-fix]
+	[--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]
 
 DESCRIPTION
 -----------
@@ -43,11 +43,9 @@ actions, and in addition the `HEAD` reflog records branch switching.
 
 The "list" subcommand lists all refs which have a corresponding reflog.
 
-The "expire" subcommand prunes older reflog entries. Entries older
-than `expire` time, or entries older than `expire-unreachable` time
-and not reachable from the current tip, are removed from the reflog.
-This is typically not used directly by end users -- instead, see
-linkgit:git-gc[1].
+The "exists" subcommand checks whether a ref has a reflog.  It exits
+with zero status if the reflog exists, and non-zero status if it does
+not.
 
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
@@ -58,9 +56,11 @@ The "drop" subcommand completely removes the reflog for the specified
 references. This is in contrast to "expire" and "delete", both of which
 can be used to delete reflog entries, but not the reflog itself.
 
-The "exists" subcommand checks whether a ref has a reflog.  It exits
-with zero status if the reflog exists, and non-zero status if it does
-not.
+The "expire" subcommand prunes older reflog entries. Entries older
+than `expire` time, or entries older than `expire-unreachable` time
+and not reachable from the current tip, are removed from the reflog.
+This is typically not used directly by end users -- instead, see
+linkgit:git-gc[1].
 
 OPTIONS
 -------
@@ -71,6 +71,25 @@ Options for `show`
 `git reflog show` accepts any of the options accepted by `git log`.
 
 
+Options for `delete`
+~~~~~~~~~~~~~~~~~~~~
+
+`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
+`--dry-run`, and `--verbose`, with the same meanings as when they are
+used with `expire`.
+
+Options for `drop`
+~~~~~~~~~~~~~~~~~~
+
+`--all`::
+	Drop the reflogs of all references from all worktrees.
+
+`--single-worktree`::
+	By default when `--all` is specified, reflogs from all working
+	trees are dropped. This option limits the processing to reflogs
+	from the current working tree only.
+
+
 Options for `expire`
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -130,24 +149,6 @@ which didn't protect objects referred to by reflogs.
 	Print extra information on screen.
 
 
-Options for `delete`
-~~~~~~~~~~~~~~~~~~~~
-
-`git reflog delete` accepts options `--updateref`, `--rewrite`, `-n`,
-`--dry-run`, and `--verbose`, with the same meanings as when they are
-used with `expire`.
-
-Options for `drop`
-~~~~~~~~~~~~~~~~~~
-
-`--all`::
-	Drop the reflogs of all references from all worktrees.
-
-`--single-worktree`::
-	By default when `--all` is specified, reflogs from all working
-	trees are dropped. This option limits the processing to reflogs
-	from the current working tree only.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 3acaf3e32c..b00b3f9edc 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -17,21 +17,21 @@
 #define BUILTIN_REFLOG_LIST_USAGE \
 	N_("git reflog list")
 
-#define BUILTIN_REFLOG_EXPIRE_USAGE \
-	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
-	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
-	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+#define BUILTIN_REFLOG_EXISTS_USAGE \
+	N_("git reflog exists <ref>")
 
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
 
-#define BUILTIN_REFLOG_EXISTS_USAGE \
-	N_("git reflog exists <ref>")
-
 #define BUILTIN_REFLOG_DROP_USAGE \
 	N_("git reflog drop [--all [--single-worktree] | <refs>...]")
 
+#define BUILTIN_REFLOG_EXPIRE_USAGE \
+	N_("git reflog expire [--expire=<time>] [--expire-unreachable=<time>]\n" \
+	   "                  [--rewrite] [--updateref] [--stale-fix]\n" \
+	   "                  [--dry-run | -n] [--verbose] [--all [--single-worktree] | <refs>...]")
+
 static const char *const reflog_show_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	NULL,
@@ -42,9 +42,9 @@ static const char *const reflog_list_usage[] = {
 	NULL,
 };
 
-static const char *const reflog_expire_usage[] = {
-	BUILTIN_REFLOG_EXPIRE_USAGE,
-	NULL
+static const char *const reflog_exists_usage[] = {
+	BUILTIN_REFLOG_EXISTS_USAGE,
+	NULL,
 };
 
 static const char *const reflog_delete_usage[] = {
@@ -52,23 +52,23 @@ static const char *const reflog_delete_usage[] = {
 	NULL
 };
 
-static const char *const reflog_exists_usage[] = {
-	BUILTIN_REFLOG_EXISTS_USAGE,
-	NULL,
-};
-
 static const char *const reflog_drop_usage[] = {
 	BUILTIN_REFLOG_DROP_USAGE,
 	NULL,
 };
 
+static const char *const reflog_expire_usage[] = {
+	BUILTIN_REFLOG_EXPIRE_USAGE,
+	NULL
+};
+
 static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
-	BUILTIN_REFLOG_EXPIRE_USAGE,
+	BUILTIN_REFLOG_EXISTS_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
-	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_EXPIRE_USAGE,
 	NULL
 };
 
@@ -404,10 +404,10 @@ int cmd_reflog(int argc,
 	struct option options[] = {
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
-		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
-		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
+		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
 		OPT_END()
 	};
 

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()`
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

In a subsequent commit we'll add another user that wants to write reflog
entries. This requires them to call `ref_transaction_update_reflog()`,
but that function is local to "refs.c".

Export the function to prepare for the change. While at it, drop the
`flags` field, as all callers are for now expected to use the same flags
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 29 +++++++++++------------------
 refs.h | 15 +++++++++++++++
 2 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index dce5c49ca2..8aa9f7236a 100644
--- a/refs.c
+++ b/refs.c
@@ -1371,27 +1371,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 	return 0;
 }
 
-/*
- * Similar to`ref_transaction_update`, but this function is only for adding
- * a reflog update. Supports providing custom committer information. The index
- * field can be utiltized to order updates as desired. When not used, the
- * updates default to being ordered by refname.
- */
-static int ref_transaction_update_reflog(struct ref_transaction *transaction,
-					 const char *refname,
-					 const struct object_id *new_oid,
-					 const struct object_id *old_oid,
-					 const char *committer_info,
-					 unsigned int flags,
-					 const char *msg,
-					 uint64_t index,
-					 struct strbuf *err)
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err)
 {
 	struct ref_update *update;
+	unsigned int flags;
 
 	assert(err);
 
-	flags |= REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
@@ -3019,8 +3013,7 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
-					    REF_HAVE_NEW | REF_HAVE_OLD, msg,
-					    data->index++, data->errbuf);
+					    msg, data->index++, data->errbuf);
 	return ret;
 }
 
diff --git a/refs.h b/refs.h
index 46a6008e07..253dd8f4d5 100644
--- a/refs.h
+++ b/refs.h
@@ -795,6 +795,21 @@ int ref_transaction_update(struct ref_transaction *transaction,
 			   unsigned int flags, const char *msg,
 			   struct strbuf *err);
 
+/*
+ * Similar to `ref_transaction_update`, but this function is only for adding
+ * a reflog update. Supports providing custom committer information. The index
+ * field can be utiltized to order updates as desired. When set to zero, the
+ * updates default to being ordered by refname.
+ */
+int ref_transaction_update_reflog(struct ref_transaction *transaction,
+				  const char *refname,
+				  const struct object_id *new_oid,
+				  const struct object_id *old_oid,
+				  const char *committer_info,
+				  const char *msg,
+				  uint64_t index,
+				  struct strbuf *err);
+
 /*
  * Add a reference creation to transaction. new_oid is the value that
  * the reference should have after the update; it must not be

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (2 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 5/9] ident: fix type of string length parameter Patrick Steinhardt
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

While we provide a couple of subcommands in git-reflog(1) to remove
reflog entries, we don't provide any to write new entries. Obviously
this is not an operation that really would be needed for many use cases
out there, or otherwise people would have complained that such a command
does not exist yet. But the introduction of the "reftable" backend
changes the picture a bit, as it is now basically impossible to manually
append a reflog entry if one wanted to do so due to the binary format.

Plug this gap by introducing a simple "write" subcommand. For now, all
this command does is to append a single new reflog entry with the given
object IDs and message to the reflog. More specifically, it is not yet
possible to:

  - Write multiple reflog entries at once.

  - Insert reflog entries at arbitrary indices.

  - Specify the date of the reflog entry.

  - Insert reflog entries that refer to nonexistent objects.

If required, those features can be added at a future point in time. For
now though, the new command aims to fulfill the most basic use cases
while being as strict as possible when it comes to verifying parameters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-reflog.adoc |   7 +++
 builtin/reflog.c              |  65 +++++++++++++++++++++
 t/meson.build                 |   1 +
 t/t1421-reflog-write.sh       | 128 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)

diff --git a/Documentation/git-reflog.adoc b/Documentation/git-reflog.adoc
index 4eb6c25607..38af0c977a 100644
--- a/Documentation/git-reflog.adoc
+++ b/Documentation/git-reflog.adoc
@@ -12,6 +12,7 @@ SYNOPSIS
 git reflog [show] [<log-options>] [<ref>]
 git reflog list
 git reflog exists <ref>
+git reflog write <ref> <old-oid> <new-oid> <message>
 git reflog delete [--rewrite] [--updateref]
 	[--dry-run | -n] [--verbose] <ref>@{<specifier>}...
 git reflog drop [--all [--single-worktree] | <refs>...]
@@ -47,6 +48,12 @@ The "exists" subcommand checks whether a ref has a reflog.  It exits
 with zero status if the reflog exists, and non-zero status if it does
 not.
 
+The "write" subcommand writes a single entry to the reflog of a given
+reference. This new entry is appended to the reflog and will thus become
+the most recent entry. The reference name must be fully qualified. Both the old
+and new object IDs must not be abbreviated and must point to existing objects.
+The reflog message gets normalized.
+
 The "delete" subcommand deletes single entries from the reflog, but
 not the reflog itself. Its argument must be an _exact_ entry (e.g. "`git
 reflog delete master@{2}`"). This subcommand is also typically not used
diff --git a/builtin/reflog.c b/builtin/reflog.c
index b00b3f9edc..a1b4e02204 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -3,6 +3,8 @@
 #include "builtin.h"
 #include "config.h"
 #include "gettext.h"
+#include "hex.h"
+#include "object-store.h"
 #include "revision.h"
 #include "reachable.h"
 #include "wildmatch.h"
@@ -20,6 +22,9 @@
 #define BUILTIN_REFLOG_EXISTS_USAGE \
 	N_("git reflog exists <ref>")
 
+#define BUILTIN_REFLOG_WRITE_USAGE \
+	N_("git reflog write <ref> <old-oid> <new-oid> <message>")
+
 #define BUILTIN_REFLOG_DELETE_USAGE \
 	N_("git reflog delete [--rewrite] [--updateref]\n" \
 	   "                  [--dry-run | -n] [--verbose] <ref>@{<specifier>}...")
@@ -47,6 +52,11 @@ static const char *const reflog_exists_usage[] = {
 	NULL,
 };
 
+static const char *const reflog_write_usage[] = {
+	BUILTIN_REFLOG_WRITE_USAGE,
+	NULL,
+};
+
 static const char *const reflog_delete_usage[] = {
 	BUILTIN_REFLOG_DELETE_USAGE,
 	NULL
@@ -66,6 +76,7 @@ static const char *const reflog_usage[] = {
 	BUILTIN_REFLOG_SHOW_USAGE,
 	BUILTIN_REFLOG_LIST_USAGE,
 	BUILTIN_REFLOG_EXISTS_USAGE,
+	BUILTIN_REFLOG_WRITE_USAGE,
 	BUILTIN_REFLOG_DELETE_USAGE,
 	BUILTIN_REFLOG_DROP_USAGE,
 	BUILTIN_REFLOG_EXPIRE_USAGE,
@@ -392,6 +403,59 @@ static int cmd_reflog_drop(int argc, const char **argv, const char *prefix,
 	return ret;
 }
 
+static int cmd_reflog_write(int argc, const char **argv, const char *prefix,
+			    struct repository *repo)
+{
+	const struct option options[] = {
+		OPT_END()
+	};
+	struct object_id old_oid, new_oid;
+	struct strbuf err = STRBUF_INIT;
+	struct ref_transaction *tx;
+	const char *ref, *message;
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, reflog_write_usage, 0);
+	if (argc != 4)
+		usage_with_options(reflog_write_usage, options);
+
+	ref = argv[0];
+	if (!is_root_ref(ref) && check_refname_format(ref, 0))
+		die(_("invalid reference name: %s"), ref);
+
+	ret = get_oid_hex_algop(argv[1], &old_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid old object ID: '%s'"), argv[1]);
+	if (!is_null_oid(&old_oid) && !has_object(the_repository, &old_oid, 0))
+		die(_("old object '%s' does not exist"), argv[1]);
+
+	ret = get_oid_hex_algop(argv[2], &new_oid, repo->hash_algo);
+	if (ret)
+		die(_("invalid new object ID: '%s'"), argv[2]);
+	if (!is_null_oid(&new_oid) && !has_object(the_repository, &new_oid, 0))
+		die(_("new object '%s' does not exist"), argv[2]);
+
+	message = argv[3];
+
+	tx = ref_store_transaction_begin(get_main_ref_store(repo), 0, &err);
+	if (!tx)
+		die(_("cannot start transaction: %s"), err.buf);
+
+	ret = ref_transaction_update_reflog(tx, ref, &new_oid, &old_oid,
+					    git_committer_info(0),
+					    message, 0, &err);
+	if (ret)
+		die(_("cannot queue reflog update: %s"), err.buf);
+
+	ret = ref_transaction_commit(tx, &err);
+	if (ret)
+		die(_("cannot commit reflog update: %s"), err.buf);
+
+	ref_transaction_free(tx);
+	strbuf_release(&err);
+	return 0;
+}
+
 /*
  * main "reflog"
  */
@@ -405,6 +469,7 @@ int cmd_reflog(int argc,
 		OPT_SUBCOMMAND("show", &fn, cmd_reflog_show),
 		OPT_SUBCOMMAND("list", &fn, cmd_reflog_list),
 		OPT_SUBCOMMAND("exists", &fn, cmd_reflog_exists),
+		OPT_SUBCOMMAND("write", &fn, cmd_reflog_write),
 		OPT_SUBCOMMAND("delete", &fn, cmd_reflog_delete),
 		OPT_SUBCOMMAND("drop", &fn, cmd_reflog_drop),
 		OPT_SUBCOMMAND("expire", &fn, cmd_reflog_expire),
diff --git a/t/meson.build b/t/meson.build
index d052fc3e23..adcdf09e74 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -220,6 +220,7 @@ integration_tests = [
   't1418-reflog-exists.sh',
   't1419-exclude-refs.sh',
   't1420-lost-found.sh',
+  't1421-reflog-write.sh',
   't1430-bad-ref-name.sh',
   't1450-fsck.sh',
   't1451-fsck-buffer.sh',
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
new file mode 100755
index 0000000000..dd7ffa5241
--- /dev/null
+++ b/t/t1421-reflog-write.sh
@@ -0,0 +1,128 @@
+#!/bin/sh
+
+test_description='Manually write reflog entries'
+
+. ./test-lib.sh
+
+SIGNATURE="C O Mitter <committer@example.com> 1112911993 -0700"
+
+test_reflog_matches () {
+	repo="$1" &&
+	refname="$2" &&
+	cat >actual &&
+	test-tool -C "$repo" ref-store main for-each-reflog-ent "$refname" >expected &&
+	test_cmp expected actual
+}
+
+test_expect_success 'invalid number of arguments' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		for args in "" "1" "1 2" "1 2 3" "1 2 3 4 5"
+		do
+			test_must_fail git reflog write $args 2>err &&
+			test_grep "usage: git reflog write" err || return 1
+		done
+	)
+'
+
+test_expect_success 'invalid refname' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write "refs/heads/ invalid" $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'unqualified refname is rejected' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write unqualified $ZERO_OID $ZERO_OID first 2>err &&
+		test_grep "invalid reference name: " err
+	)
+'
+
+test_expect_success 'nonexistent object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_must_fail git reflog write refs/heads/something $(test_oid deadbeef) $ZERO_OID old-object-id 2>err &&
+		test_grep "old object .* does not exist" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $(test_oid deadbeef) new-object-id 2>err &&
+		test_grep "new object .* does not exist" err
+	)
+'
+
+test_expect_success 'abbreviated object IDs' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		abbreviated_oid=$(git rev-parse HEAD | test_copy_bytes 8) &&
+		test_must_fail git reflog write refs/heads/something $abbreviated_oid $ZERO_OID old-object-id 2>err &&
+		test_grep "invalid old object ID" err &&
+		test_must_fail git reflog write refs/heads/something $ZERO_OID $abbreviated_oid new-object-id 2>err &&
+		test_grep "invalid new object ID" err
+	)
+'
+
+test_expect_success 'reflog message gets normalized' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+		git reflog write HEAD $COMMIT_OID $COMMIT_OID "$(printf "message\nwith\nnewlines")" &&
+		git reflog show -1 --format=%gs HEAD >actual &&
+		echo "message with newlines" >expected &&
+		test_cmp expected actual
+	)
+'
+
+test_expect_success 'simple writes' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write refs/heads/something $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . refs/heads/something <<-EOF &&
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+
+		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
+		# Note: the old object ID of the second reflog entry is broken.
+		# This will be fixed in subsequent commits.
+		test_reflog_matches . refs/heads/something <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		EOF
+	)
+'
+
+test_expect_success 'can write to root ref' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		COMMIT_OID=$(git rev-parse HEAD) &&
+
+		git reflog write ROOT_REF_HEAD $ZERO_OID $COMMIT_OID first &&
+		test_reflog_matches . ROOT_REF_HEAD <<-EOF
+		$ZERO_OID $COMMIT_OID $SIGNATURE	first
+		EOF
+	)
+'
+
+test_done

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 5/9] ident: fix type of string length parameter
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (3 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

The last parameter in `split_ident_line()` is the length of the line
passed in by the caller. As such, most callers pass in either the result
of `strlen()`, `struct strbuf::len` or a pointer diff, all of which
are expected to be positive numbers. Regardless of that, the function
accepts a signed integer, which is somewhat confusing.

Fix the function signature to instead accept a `size_t`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ident.c | 2 +-
 ident.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ident.c b/ident.c
index 967895d8850..a7a2d132579 100644
--- a/ident.c
+++ b/ident.c
@@ -272,7 +272,7 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
  * can still be NULL if the input line only has the name/email part
  * (e.g. reading from a reflog entry).
  */
-int split_ident_line(struct ident_split *split, const char *line, int len)
+int split_ident_line(struct ident_split *split, const char *line, size_t len)
 {
 	const char *cp;
 	size_t span;
diff --git a/ident.h b/ident.h
index 6a79febba15..3c034038791 100644
--- a/ident.h
+++ b/ident.h
@@ -35,7 +35,7 @@ void reset_ident_date(void);
  * Signals an success with 0, but time part of the result may be NULL
  * if the input lacks timestamp and zone
  */
-int split_ident_line(struct ident_split *, const char *, int);
+int split_ident_line(struct ident_split *, const char *, size_t);
 
 /*
  * Given a commit or tag object buffer and the commit or tag headers, replaces

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 6/9] refs: fix identity for migrated reflogs
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (4 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 5/9] ident: fix type of string length parameter Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

When migrating reflog entries between different storage formats we must
reconstruct the identity of reflog entries. This is done by passing the
committer passed to the `migrate_one_reflog_entry()` callback function
to `fmt_ident()`.

This results in an invalid identity though: `fmt_ident()` expects the
caller to provide both name and mail of the author, but we pass the full
identity as mail. This leads to an identity like:

    pks <Patrick Steinhardt ps@pks.im>

Fix the bug by splitting the identity line first. This allows us to
extract both the name and mail so that we can pass them to `fmt_ident()`
separately.

This commit does not yet add any tests as there is another bug in the
reflog migration that will be fixed in a subsequent commit. Once that
bug is fixed we'll make the reflog verification in t1450 stricter, and
that will catch both this bug here and the other bug.

Note that we also add two new `name` and `mail` string buffers to the
callback structures and splice them through to the callbacks. This is
done so that we can avoid allocating a new buffer every time we compute
the committer information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 8aa9f7236a..a5f9ffaa45 100644
--- a/refs.c
+++ b/refs.c
@@ -2954,7 +2954,7 @@ struct migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf sb;
+	struct strbuf sb, name, mail;
 };
 
 static int migrate_one_ref(const char *refname, const char *referent UNUSED, const struct object_id *oid,
@@ -2993,7 +2993,7 @@ struct reflog_migration_data {
 	struct ref_store *old_refs;
 	struct ref_transaction *transaction;
 	struct strbuf *errbuf;
-	struct strbuf *sb;
+	struct strbuf *sb, *name, *mail;
 };
 
 static int migrate_one_reflog_entry(struct object_id *old_oid,
@@ -3003,13 +3003,21 @@ static int migrate_one_reflog_entry(struct object_id *old_oid,
 				    const char *msg, void *cb_data)
 {
 	struct reflog_migration_data *data = cb_data;
+	struct ident_split ident;
 	const char *date;
 	int ret;
 
+	if (split_ident_line(&ident, committer, strlen(committer)) < 0)
+		return -1;
+
+	strbuf_reset(data->name);
+	strbuf_add(data->name, ident.name_begin, ident.name_end - ident.name_begin);
+	strbuf_reset(data->mail);
+	strbuf_add(data->mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
+
 	date = show_date(timestamp, tz, DATE_MODE(NORMAL));
 	strbuf_reset(data->sb);
-	/* committer contains name and email */
-	strbuf_addstr(data->sb, fmt_ident("", committer, WANT_BLANK_IDENT, date, 0));
+	strbuf_addstr(data->sb, fmt_ident(data->name->buf, data->mail->buf, WANT_BLANK_IDENT, date, 0));
 
 	ret = ref_transaction_update_reflog(data->transaction, data->refname,
 					    new_oid, old_oid, data->sb->buf,
@@ -3026,6 +3034,8 @@ static int migrate_one_reflog(const char *refname, void *cb_data)
 		.transaction = migration_data->transaction,
 		.errbuf = migration_data->errbuf,
 		.sb = &migration_data->sb,
+		.name = &migration_data->name,
+		.mail = &migration_data->mail,
 	};
 
 	return refs_for_each_reflog_ent(migration_data->old_refs, refname,
@@ -3124,6 +3134,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	struct strbuf new_gitdir = STRBUF_INIT;
 	struct migration_data data = {
 		.sb = STRBUF_INIT,
+		.name = STRBUF_INIT,
+		.mail = STRBUF_INIT,
 	};
 	int did_migrate_refs = 0;
 	int ret;
@@ -3299,6 +3311,8 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 	ref_transaction_free(transaction);
 	strbuf_release(&new_gitdir);
 	strbuf_release(&data.sb);
+	strbuf_release(&data.name);
+	strbuf_release(&data.mail);
 	return ret;
 }
 

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (5 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

When updating a reference that is being pointed to HEAD we don't only
write a reflog message for that particular reference, but also generate
one for HEAD. This logic is handled by `split_head_update()`, where we:

  1. Verify that the condition actually triggered. This is done by
     reading HEAD at the start of the transaction so that we can then
     check whether a given reference update refers to its target.

  2. Queue a new log-only update for HEAD in case it did.

But the logic is unfortunately not free of races, as we do not lock the
HEAD reference after we have read its target. This can lead to the
following two scenarios:

  - HEAD gets concurrently updated to point to one of the references we
    have already processed. This causes us not writing a reflog message
    even though we should have done so.

  - HEAD gets concurrently updated to no longer point to a reference
    anymore that we have already processed. This causes us to write a
    reflog message even though we should _not_ have done so.

Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that
is specific to the "files" backend. If set, we will double check that
the HEAD reference still points to the reference that we are creating
the reflog entry for after we have locked HEAD. Furthermore, instead of
manually resolving the old object ID of that entry, we now use the same
old state as for the parent update.

If we detect such a racy update we abort the transaction. This is a bit
heavy-handed: the user didn't even ask us to write a reflog update for
"HEAD", so it might be surprising if we abort the transaction. That
being said:

  - Normal users wouldn't typically hit this case as we only hit the
    relevant code when committing to a branch that is being pointed to
    by "HEAD" directly. Commands like git-commit(1) typically commit to
    "HEAD" itself though.

  - Scripted users that use git-update-ref(1) and related plumbing
    commands are unlikely to hit this case either, as they would have to
    update the pointed-to-branch at the same as "HEAD" is being updated,
    which is an exceedingly rare event.

The alternative would be to instead drop the log-only update completely,
but that would require more logic that is hard to verify without adding
infrastructure specific for such a test. So we rather do the pragmatic
thing and don't worry too much about an edge case that is very unlikely
to happen.

Unfortunately, this change only helps with the second race. We cannot
reliably plug the first race without locking the HEAD reference at the
start of the transaction. Locking HEAD unconditionally would effectively
serialize all writes though, and that doesn't seem like an option. Also,
double checking its value at the end of the transaction is not an option
either, as its target may have flip-flopped during the transaction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/files-backend.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bf6f89b1d1..ba018b0984 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -68,6 +68,12 @@
  */
 #define REF_DELETED_RMDIR (1 << 9)
 
+/*
+ * Used to indicate that the reflog-only update has been created via
+ * `split_head_update()`.
+ */
+#define REF_LOG_VIA_SPLIT (1 << 14)
+
 struct ref_lock {
 	char *ref_name;
 	struct lock_file lk;
@@ -2420,9 +2426,10 @@ static enum ref_transaction_error split_head_update(struct ref_update *update,
 
 	new_update = ref_transaction_add_update(
 			transaction, "HEAD",
-			update->flags | REF_LOG_ONLY | REF_NO_DEREF,
+			update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT,
 			&update->new_oid, &update->old_oid,
 			NULL, NULL, update->committer_info, update->msg);
+	new_update->parent_update = update;
 
 	/*
 	 * Add "HEAD". This insertion is O(N) in the transaction
@@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 	update->backend_data = lock;
 
-	if (update->type & REF_ISSYMREF) {
+	if (update->flags & REF_LOG_VIA_SPLIT) {
+		struct ref_lock *parent_lock;
+
+		if (!update->parent_update)
+			BUG("split update without a parent");
+
+		parent_lock = update->parent_update->backend_data;
+
+		/*
+		 * Check that "HEAD" didn't racily change since we have looked
+		 * it up. If it did we must refuse to write the reflog entry.
+		 *
+		 * Note that this does not catch all races: if "HEAD" was
+		 * racily changed to point to one of the refs part of the
+		 * transaction then we would miss writing the split reflog
+		 * entry for "HEAD".
+		 */
+		if (!(update->type & REF_ISSYMREF) ||
+		    strcmp(update->parent_update->refname, referent.buf)) {
+			strbuf_addstr(err, "HEAD has been racily updated");
+			ret = REF_TRANSACTION_ERROR_GENERIC;
+			goto out;
+		}
+
+		if (update->flags & REF_HAVE_OLD) {
+			oidcpy(&lock->old_oid, &update->old_oid);
+		} else {
+			oidcpy(&lock->old_oid, &parent_lock->old_oid);
+		}
+	} else if (update->type & REF_ISSYMREF) {
 		if (update->flags & REF_NO_DEREF) {
 			/*
 			 * We won't be reading the referent as part of

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (6 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  2025-08-06  5:54   ` [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

The `REF_HAVE_OLD` flag indicates whether a given ref update has its old
object ID set. If so, the value of that field is used to verify whether
the current state of the reference matches this expected state. It is
thus an important part of mitigating races with a concurrent process
that updates the same set of references.

When writing reflogs though we explicitly unset that flag. This is a
sensible thing to do: the old state of reflog entry updates may not
necessarily match the current on-disk state of its accompanying ref, but
it's only intended to signal what old object ID we want to write into
the new reflog entry. For example when migrating refs we end up writing
many reflog entries for a single reference, and most likely those reflog
entries will have many different old object IDs.

But unsetting this flag also removes a useful signal, namely that the
caller _did_ provide an old object ID for a given reflog entry. This
signal will become useful in a subsequent commit, where we add a new
flag that tells the transaction to use the provided old and new object
IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a
signal to verify that the caller really did provide an old object ID.

Stop unsetting the flag so that we can use it as this described signal
in a subsequent commit. Skip checking the old object ID for log-only
updates so that we don't expect it to match the current on-disk state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  8 +++-----
 refs/files-backend.c    |  9 +++++----
 refs/refs-internal.h    |  3 ++-
 refs/reftable-backend.c | 12 +++---------
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/refs.c b/refs.c
index a5f9ffaa45..f88928de74 100644
--- a/refs.c
+++ b/refs.c
@@ -1393,11 +1393,6 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 	update = ref_transaction_add_update(transaction, refname, flags,
 					    new_oid, old_oid, NULL, NULL,
 					    committer_info, msg);
-	/*
-	 * While we do set the old_oid value, we unset the flag to skip
-	 * old_oid verification which only makes sense for refs.
-	 */
-	update->flags &= ~REF_HAVE_OLD;
 	update->index = index;
 
 	/*
@@ -3318,6 +3313,9 @@ int repo_migrate_ref_storage_format(struct repository *repo,
 
 int ref_update_expects_existing_old_ref(struct ref_update *update)
 {
+	if (update->flags & REF_LOG_ONLY)
+		return 0;
+
 	return (update->flags & REF_HAVE_OLD) &&
 		(!is_null_oid(&update->old_oid) || update->old_target);
 }
diff --git a/refs/files-backend.c b/refs/files-backend.c
index ba018b0984..85ab2ef2b9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2500,7 +2500,6 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update,
 	 * done when new_update is processed.
 	 */
 	update->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-	update->flags &= ~REF_HAVE_OLD;
 
 	return 0;
 }
@@ -2515,8 +2514,9 @@ static enum ref_transaction_error check_old_oid(struct ref_update *update,
 						struct object_id *oid,
 						struct strbuf *err)
 {
-	if (!(update->flags & REF_HAVE_OLD) ||
-		   oideq(oid, &update->old_oid))
+	if (update->flags & REF_LOG_ONLY ||
+	    !(update->flags & REF_HAVE_OLD) ||
+	    oideq(oid, &update->old_oid))
 		return 0;
 
 	if (is_null_oid(&update->old_oid)) {
@@ -3095,7 +3095,8 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
 	for (i = 0; i < transaction->nr; i++) {
 		struct ref_update *update = transaction->updates[i];
 
-		if ((update->flags & REF_HAVE_OLD) &&
+		if (!(update->flags & REF_LOG_ONLY) &&
+		    (update->flags & REF_HAVE_OLD) &&
 		    !is_null_oid(&update->old_oid))
 			BUG("initial ref transaction with old_sha1 set");
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f868870851..95a4dc3902 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -802,7 +802,8 @@ enum ref_transaction_error ref_update_check_old_target(const char *referent,
 
 /*
  * Check if the ref must exist, this means that the old_oid or
- * old_target is non NULL.
+ * old_target is non NULL. Log-only updates never require the old state to
+ * match.
  */
 int ref_update_expects_existing_old_ref(struct ref_update *update);
 
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 4c3817f4ec..44af58ac50 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1180,8 +1180,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret > 0) {
 		/* The reference does not exist, but we expected it to. */
 		strbuf_addf(err, _("cannot lock ref '%s': "
-
-
 				   "unable to resolve reference '%s'"),
 			    ref_update_original_update_refname(u), u->refname);
 		return REF_TRANSACTION_ERROR_NONEXISTENT_REF;
@@ -1235,13 +1233,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 
 			new_update->parent_update = u;
 
-			/*
-			 * Change the symbolic ref update to log only. Also, it
-			 * doesn't need to check its old OID value, as that will be
-			 * done when new_update is processed.
-			 */
+			/* Change the symbolic ref update to log only. */
 			u->flags |= REF_LOG_ONLY | REF_NO_DEREF;
-			u->flags &= ~REF_HAVE_OLD;
 		}
 	}
 
@@ -1265,7 +1258,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 		ret = ref_update_check_old_target(referent->buf, u, err);
 		if (ret)
 			return ret;
-	} else if ((u->flags & REF_HAVE_OLD) && !oideq(&current_oid, &u->old_oid)) {
+	} else if ((u->flags & (REF_LOG_ONLY | REF_HAVE_OLD)) == REF_HAVE_OLD &&
+		   !oideq(&current_oid, &u->old_oid)) {
 		if (is_null_oid(&u->old_oid)) {
 			strbuf_addf(err, _("cannot lock ref '%s': "
 					   "reference already exists"),

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

* [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs
  2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
                     ` (7 preceding siblings ...)
  2025-08-06  5:54   ` [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
@ 2025-08-06  5:54   ` Patrick Steinhardt
  8 siblings, 0 replies; 114+ messages in thread
From: Patrick Steinhardt @ 2025-08-06  5:54 UTC (permalink / raw)
  To: git
  Cc: Karthik Nayak, Justin Tobler, Junio C Hamano, SZEDER Gábor,
	Toon Claes, Jeff King, Kristoffer Haugsbakk, Ben Knoble,
	Jean-Noël AVILA

When migrating reflog entries between different storage formats we end
up with invalid old object IDs for the migrated entries: instead of
writing the old object ID of the to-be-migrated entry, we end up with
the all-zeroes object ID.

The root cause of this issue is that we don't know to use the old object
ID provided by the caller. Instead, we manually resolve the old object
ID by resolving the current value of its matching reference. But as that
reference does not yet exist in the target ref storage we always end up
resolving it to all-zeroes.

This issue got unnoticed as there is no user-facing command that would
even show the old object ID. While `git log -g` knows to show the new
object ID, we don't have any formatting directive to show the old object
ID.

Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If
set, backends are instructed to use the old and new object IDs provided
by the caller, without doing any manual resolving. Set this flag in
`ref_transaction_update_reflog()`.

Amend our tests in t1460-refs-migrate to use our test tool to read
reflog entries. This test tool prints out both old and new object ID of
each reflog entry, which fixes the test gap. Furthermore it also prints
the full identity used to write the reflog, which provides test coverage
for the previous commit in this patch series that fixed the identity for
migrated reflogs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                  |  3 ++-
 refs.h                  |  9 ++++++++-
 refs/files-backend.c    | 16 +++++++++++++++-
 refs/reftable-backend.c | 14 ++++++++++++++
 t/t1421-reflog-write.sh |  4 +---
 t/t1460-refs-migrate.sh | 22 +++++++++++++++-------
 6 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/refs.c b/refs.c
index f88928de74..946eb48941 100644
--- a/refs.c
+++ b/refs.c
@@ -1385,7 +1385,8 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction,
 
 	assert(err);
 
-	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF;
+	flags = REF_HAVE_OLD | REF_HAVE_NEW | REF_LOG_ONLY | REF_FORCE_CREATE_REFLOG | REF_NO_DEREF |
+		REF_LOG_USE_PROVIDED_OIDS;
 
 	if (!transaction_refname_valid(refname, new_oid, flags, err))
 		return -1;
diff --git a/refs.h b/refs.h
index 253dd8f4d5..090b4fdff4 100644
--- a/refs.h
+++ b/refs.h
@@ -760,13 +760,20 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
  */
 #define REF_SKIP_CREATE_REFLOG (1 << 12)
 
+/*
+ * When writing a REF_LOG_ONLY record, use the old and new object IDs provided
+ * in the update instead of resolving the old object ID. The caller must also
+ * set both REF_HAVE_OLD and REF_HAVE_NEW.
+ */
+#define REF_LOG_USE_PROVIDED_OIDS (1 << 13)
+
 /*
  * Bitmask of all of the flags that are allowed to be passed in to
  * ref_transaction_update() and friends:
  */
 #define REF_TRANSACTION_UPDATE_ALLOWED_FLAGS                                  \
 	(REF_NO_DEREF | REF_FORCE_CREATE_REFLOG | REF_SKIP_OID_VERIFICATION | \
-	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG)
+	 REF_SKIP_REFNAME_VERIFICATION | REF_SKIP_CREATE_REFLOG | REF_LOG_USE_PROVIDED_OIDS)
 
 /*
  * Add a reference update to transaction. `new_oid` is the value that
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 85ab2ef2b9..905555365b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3010,6 +3010,20 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 				  struct ref_lock *lock,
 				  struct strbuf *err)
 {
+	struct object_id *old_oid = &lock->old_oid;
+
+	if (update->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(update->flags & REF_HAVE_OLD) ||
+		    !(update->flags & REF_HAVE_NEW) ||
+		    !(update->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), update->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		old_oid = &update->old_oid;
+	}
+
 	if (update->new_target) {
 		/*
 		 * We want to get the resolved OID for the target, to ensure
@@ -3027,7 +3041,7 @@ static int parse_and_write_reflog(struct files_ref_store *refs,
 		}
 	}
 
-	if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid,
+	if (files_log_ref_write(refs, lock->ref_name, old_oid,
 				&update->new_oid, update->committer_info,
 				update->msg, update->flags, err)) {
 		char *old_msg = strbuf_detach(err, NULL);
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 44af58ac50..99fafd75eb 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -1096,6 +1096,20 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor
 	if (ret)
 		return REF_TRANSACTION_ERROR_GENERIC;
 
+	if (u->flags & REF_LOG_USE_PROVIDED_OIDS) {
+		if (!(u->flags & REF_HAVE_OLD) ||
+		    !(u->flags & REF_HAVE_NEW) ||
+		    !(u->flags & REF_LOG_ONLY)) {
+			strbuf_addf(err, _("trying to write reflog for '%s'"
+					   "with incomplete values"), u->refname);
+			return REF_TRANSACTION_ERROR_GENERIC;
+		}
+
+		if (queue_transaction_update(refs, tx_data, u, &u->old_oid, err))
+			return REF_TRANSACTION_ERROR_GENERIC;
+		return 0;
+	}
+
 	/* Verify that the new object ID is valid. */
 	if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) &&
 	    !(u->flags & REF_SKIP_OID_VERIFICATION) &&
diff --git a/t/t1421-reflog-write.sh b/t/t1421-reflog-write.sh
index dd7ffa5241..46df64c176 100755
--- a/t/t1421-reflog-write.sh
+++ b/t/t1421-reflog-write.sh
@@ -101,11 +101,9 @@ test_expect_success 'simple writes' '
 		EOF
 
 		git reflog write refs/heads/something $COMMIT_OID $COMMIT_OID second &&
-		# Note: the old object ID of the second reflog entry is broken.
-		# This will be fixed in subsequent commits.
 		test_reflog_matches . refs/heads/something <<-EOF
 		$ZERO_OID $COMMIT_OID $SIGNATURE	first
-		$ZERO_OID $COMMIT_OID $SIGNATURE	second
+		$COMMIT_OID $COMMIT_OID $SIGNATURE	second
 		EOF
 	)
 '
diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh
index 2ab97e1b7d..0e1116a319 100755
--- a/t/t1460-refs-migrate.sh
+++ b/t/t1460-refs-migrate.sh
@@ -7,6 +7,17 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
+print_all_reflog_entries () {
+	repo=$1 &&
+	test-tool -C "$repo" ref-store main for-each-reflog >reflogs &&
+	while read reflog
+	do
+		echo "REFLOG: $reflog" &&
+		test-tool -C "$repo" ref-store main for-each-reflog-ent "$reflog" ||
+		return 1
+	done <reflogs
+}
+
 # Migrate the provided repository from one format to the other and
 # verify that the references and logs are migrated over correctly.
 # Usage: test_migration <repo> <format> [<skip_reflog_verify> [<options...>]]
@@ -28,8 +39,7 @@ test_migration () {
 		--format='%(refname) %(objectname) %(symref)' >expect &&
 	if ! $skip_reflog_verify
 	then
-	   git -C "$repo" reflog --all >expect_logs &&
-	   git -C "$repo" reflog list >expect_log_list
+		print_all_reflog_entries "$repo" >expect_logs
 	fi &&
 
 	git -C "$repo" refs migrate --ref-format="$format" "$@" &&
@@ -39,10 +49,8 @@ test_migration () {
 	test_cmp expect actual &&
 	if ! $skip_reflog_verify
 	then
-		git -C "$repo" reflog --all >actual_logs &&
-		git -C "$repo" reflog list >actual_log_list &&
-		test_cmp expect_logs actual_logs &&
-		test_cmp expect_log_list actual_log_list
+		print_all_reflog_entries "$repo" >actual_logs &&
+		test_cmp expect_logs actual_logs
 	fi &&
 
 	git -C "$repo" rev-parse --show-ref-format >actual &&
@@ -273,7 +281,7 @@ test_expect_success 'multiple reftable blocks with multiple entries' '
 	test_commit -C repo second &&
 	printf "update refs/heads/ref-%d HEAD\n" $(test_seq 3000) >stdin &&
 	git -C repo update-ref --stdin <stdin &&
-	test_migration repo reftable
+	test_migration repo reftable true
 '
 
 test_expect_success 'migrating from files format deletes backend files' '

-- 
2.51.0.rc0.215.g125493bb4a.dirty


^ permalink raw reply related	[flat|nested] 114+ messages in thread

end of thread, other threads:[~2025-08-06  5:54 UTC | newest]

Thread overview: 114+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-22 22:04   ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-23 18:14   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 16:45       ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-23 18:25   ` Justin Tobler
2025-07-24  8:36   ` Karthik Nayak
2025-07-24 12:55   ` Toon Claes
2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-23 19:00   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 12:54   ` Toon Claes
2025-07-25  5:36     ` Patrick Steinhardt
2025-07-24 16:20   ` SZEDER Gábor
2025-07-24 21:10     ` Junio C Hamano
2025-07-25  5:36       ` Patrick Steinhardt
2025-07-25 14:35         ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-23 19:41   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24  9:41   ` Karthik Nayak
2025-07-24 12:56   ` Toon Claes
2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-23 20:31   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 10:21   ` Karthik Nayak
2025-07-24 11:35     ` Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-22 22:09   ` Junio C Hamano
2025-07-23  4:04     ` Patrick Steinhardt
2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-28 15:33     ` Kristoffer Haugsbakk
2025-07-28 18:49       ` Junio C Hamano
2025-07-28 20:39         ` Karthik Nayak
2025-07-28 20:59           ` Junio C Hamano
2025-07-30  7:55             ` Karthik Nayak
2025-07-29  0:25       ` Ben Knoble
2025-07-29  6:14         ` Kristoffer Haugsbakk
2025-07-29  6:51         ` Patrick Steinhardt
2025-07-29 15:00           ` Junio C Hamano
2025-07-30  5:33             ` Patrick Steinhardt
2025-07-30 10:33               ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-25 11:36     ` Jeff King
2025-07-28 14:43       ` Patrick Steinhardt
2025-07-29  7:14         ` Jeff King
2025-07-29  7:54           ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-01 11:38     ` Toon Claes
2025-08-04  7:37       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-29 16:07     ` Junio C Hamano
2025-08-01 11:37     ` Toon Claes
2025-08-04  7:38       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-07-29 16:16     ` Junio C Hamano
2025-08-01 11:55     ` Toon Claes
2025-08-02 11:11     ` Jeff King
2025-08-04  7:38       ` Patrick Steinhardt
2025-08-04 14:47         ` Jeff King
2025-07-29  8:55   ` [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-04 15:38     ` Jeff King
2025-08-04  9:46   ` [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-05 17:04     ` Jean-Noël AVILA
2025-08-05 21:47       ` Junio C Hamano
2025-08-06  5:53         ` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
2025-08-06  5:53     ` Patrick Steinhardt
2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).