Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH] column: disallow negative padding
From: Kristoffer Haugsbakk @ 2024-02-11 18:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Torek, git, Tiago Pascoal
In-Reply-To: <xmqqle7q997e.fsf@gitster.g>

On Sun, Feb 11, 2024, at 18:55, Junio C Hamano wrote:
> "Kristoffer Haugsbakk" <code@khaugsbakk.name> writes:
>> Yeah, thanks. You probably saved me a v3. :)
>>
>> Although I failed to notice that the string I stole was just a plain
>> string, not a translation string. And apparently there are no generic
>> “non-negative” translation strings. So I’ll just make a new one.
>
> The last time I took a look, I thought there were more than just the
> single entry point you patched that can feed negative padding into
> the machinery?  Don't you need to cover them as well?
>
> Thanks.

I’ve incorporated the `column.c` patch you posted in my
not-yet-published v2. Hopefully that was it.(? :) ) I’ll take another
look.

v2 is finished now so maybe I’ll send it out soon.

-- 
Kristoffer Haugsbakk

^ permalink raw reply

* [PATCH v4 0/5] for-each-ref: add '--include-root-refs' option
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak
In-Reply-To: <20240119142705.139374-1-karthik.188@gmail.com>

This is the forth version of my patch series to print root refs
in git-for-each-ref(1).

With the introduction of the reftable backend, it becomes ever
so important to provide the necessary tooling for printing all refs
associated with a worktree.

While regular refs stored within the "refs/" namespace are currently
supported by multiple commands like git-for-each-ref(1),
git-show-ref(1). Neither support printing root refs within the worktree.

This patch series is a follow up to the RFC/discussion we had earlier on
the list [1].

The first 4 commits add the required functionality to ensure we can print
all refs (regular, pseudo, HEAD). The 5th commit modifies the
git-for-each-ref(1) command to add the "--include-root-refs" command which
will include HEAD and pseudorefs with regular "refs/" refs.

[1]: https://lore.kernel.org/git/20231221170715.110565-1-karthik.188@gmail.com/#t

Changes from v3:
1. Move from using 'git for-each-ref ""' to print root refs to adding
the '--include-root-refs' option for git-for-each-ref(1). This provides better
UX for users.
2. Modify `is_pseudoref()` to use `refs_resolve_ref_unsafe`.
3. Includes reftable-backend changes and is now rebased on top of next (ed35d3359).  

Range-diff:

1:  2141a2a62b ! 1:  98130a7ad7 refs: introduce `is_pseudoref()` and `is_headref()`
    @@ Commit message
         related files like 'BISECT_START' to a new directory similar to the
         'rebase-merge' directory.
     
    +    Helped-by: Jeff King <peff@peff.net>
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
     
      ## refs.c ##
    @@ refs.c: static int is_pseudoref_syntax(const char *refname)
     +		return 0;
     +
     +	if (ends_with(refname, "_HEAD")) {
    -+		 read_ref_full(refname, RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
    -+		      &oid, NULL);
    -+		 return !is_null_oid(&oid);
    ++		refs_resolve_ref_unsafe(refs, refname,
    ++   					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
    ++   					&oid, NULL);
    ++   		return !is_null_oid(&oid);
     +	}
     +
     +	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
     +		if (!strcmp(refname, irregular_pseudorefs[i])) {
    -+			read_ref_full(refname, RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
    -+						  &oid, NULL);
    ++			refs_resolve_ref_unsafe(refs, refname,
    ++   						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
    ++   						&oid, NULL);
     +			return !is_null_oid(&oid);
     +		}
     +
2:  c96f0a9c83 = 2:  060ab08af5 refs: extract out `loose_fill_ref_dir_regular_file()`
3:  d165358b83 ! 3:  40d2375ad9 refs: introduce `refs_for_each_all_refs()`
    @@ Metadata
     Author: Karthik Nayak <karthik.188@gmail.com>
     
      ## Commit message ##
    -    refs: introduce `refs_for_each_all_refs()`
    +    refs: introduce `refs_for_each_include_root_refs()`
     
    -    Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ALL_REFS`, which
    -    will be used to iterate over all refs. In the files backend this is
    -    limited to regular refs, pseudorefs and HEAD. For other backends like
    -    the reftable this is the universal set of all refs stored in the
    -    backend.
    +    Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ROOT_REFS`,
    +    which will be used to iterate over regular refs plus pseudorefs and
    +    HEAD.
     
         Refs which fall outside the `refs/` and aren't either pseudorefs or HEAD
         are more of a grey area. This is because we don't block the users from
    -    creating such refs but they are not officially supported. In the files
    -    backend, we can isolate such files from other files.
    +    creating such refs but they are not officially supported.
     
    -    Introduce `refs_for_each_all_refs()` which calls `do_for_each_ref()`
    -    with this newly introduced flag.
    +    Introduce `refs_for_each_include_root_refs()` which calls
    +    `do_for_each_ref()` with this newly introduced flag.
     
         In `refs/files-backend.c`, introduce a new function
         `add_pseudoref_and_head_entries()` to add pseudorefs and HEAD to the
         `ref_dir`. We then finally call `add_pseudoref_and_head_entries()`
    -    whenever the `DO_FOR_EACH_INCLUDE_ALL_REFS` flag is set. Any new ref
    +    whenever the `DO_FOR_EACH_INCLUDE_ROOT_REFS` flag is set. Any new ref
         backend will also have to implement similar changes on its end.
     
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
    @@ refs.c: int for_each_rawref(each_ref_fn fn, void *cb_data)
      	return refs_for_each_rawref(get_main_ref_store(the_repository), fn, cb_data);
      }
      
    -+int refs_for_each_all_refs(struct ref_store *refs, each_ref_fn fn,
    -+			   void *cb_data)
    ++int refs_for_each_include_root_refs(struct ref_store *refs, each_ref_fn fn,
    ++				    void *cb_data)
     +{
     +	return do_for_each_ref(refs, "", NULL, fn, 0,
    -+			       DO_FOR_EACH_INCLUDE_ALL_REFS, cb_data);
    ++			       DO_FOR_EACH_INCLUDE_ROOT_REFS, cb_data);
     +}
     +
      static int qsort_strcmp(const void *va, const void *vb)
    @@ refs.h: int for_each_namespaced_ref(const char **exclude_patterns,
      int for_each_rawref(each_ref_fn fn, void *cb_data);
      
     +/*
    -+ * Iterates over all ref types, regular, pseudorefs and HEAD.
    ++ * Iterates over all refs including root refs, i.e. pseudorefs and HEAD.
     + */
    -+int refs_for_each_all_refs(struct ref_store *refs, each_ref_fn fn,
    -+			   void *cb_data);
    ++int refs_for_each_include_root_refs(struct ref_store *refs, each_ref_fn fn,
    ++				    void *cb_data);
     +
      /*
       * Normalizes partial refs to their fully qualified form.
    @@ refs/files-backend.c: static struct ref_cache *get_loose_ref_cache(struct files_
      
     +		dir = get_ref_dir(refs->loose->root);
     +
    -+		if (flags & DO_FOR_EACH_INCLUDE_ALL_REFS)
    ++		if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
     +			add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
    -+										   refs->loose->root->name);
    ++						       refs->loose->root->name);
     +
      		/*
      		 * Add an incomplete entry for "refs/" (to be filled
    @@ refs/refs-internal.h: enum do_for_each_ref_flags {
      	DO_FOR_EACH_OMIT_DANGLING_SYMREFS = (1 << 2),
     +
     +	/*
    -+	 * Include all refs in the $GIT_DIR in contrast to generally only listing
    -+	 * references having the "refs/" prefix. In the files-backend this is
    -+	 * limited to regular refs, pseudorefs and HEAD.
    ++	 * Include root refs i.e. HEAD and pseudorefs along with the regular
    ++	 * refs.
     +	 */
    -+	DO_FOR_EACH_INCLUDE_ALL_REFS = (1 << 3),
    ++	DO_FOR_EACH_INCLUDE_ROOT_REFS = (1 << 3),
      };
      
      /*
4:  a17983d0ba < -:  ---------- for-each-ref: avoid filtering on empty pattern
-:  ---------- > 4:  b4b9435505 ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
-:  ---------- > 5:  ee99ac41ae for-each-ref: add new option to include root refs

Karthik Nayak (5):
  refs: introduce `is_pseudoref()` and `is_headref()`
  refs: extract out `loose_fill_ref_dir_regular_file()`
  refs: introduce `refs_for_each_include_root_refs()`
  ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
  for-each-ref: add new option to include root refs

 Documentation/git-for-each-ref.txt |   5 +-
 builtin/for-each-ref.c             |  11 ++-
 ref-filter.c                       |  29 ++++++-
 ref-filter.h                       |   7 +-
 refs.c                             |  48 +++++++++++
 refs.h                             |   9 ++
 refs/files-backend.c               | 127 +++++++++++++++++++++--------
 refs/refs-internal.h               |   6 ++
 refs/reftable-backend.c            |  11 ++-
 t/t6302-for-each-ref-filter.sh     |  31 +++++++
 10 files changed, 238 insertions(+), 46 deletions(-)

-- 
2.43.GIT


^ permalink raw reply

* [PATCH v4 1/5] refs: introduce `is_pseudoref()` and `is_headref()`
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak, Jeff King
In-Reply-To: <20240211183923.131278-1-karthik.188@gmail.com>

Introduce two new functions `is_pseudoref()` and `is_headref()`. This
provides the necessary functionality for us to add pseudorefs and HEAD
to the loose ref cache in the files backend, allowing us to build
tooling to print these refs.

The `is_pseudoref()` function internally calls `is_pseudoref_syntax()`
but adds onto it by also checking to ensure that the pseudoref either
ends with a "_HEAD" suffix or matches a list of exceptions. After which
we also parse the contents of the pseudoref to ensure that it conforms
to the ref format.

We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
because the function is also used by `is_current_worktree_ref()` and
making it stricter to match only known pseudorefs might have unintended
consequences due to files like 'BISECT_START' which isn't a pseudoref
but sometimes contains object ID.

Keeping this in mind, we leave `is_pseudoref_syntax()` as is and create
`is_pseudoref()` which is stricter. Ideally we'd want to move the new
syntax checks to `is_pseudoref_syntax()` but a prerequisite for this
would be to actually remove the exception list by converting those
pseudorefs to also contain a '_HEAD' suffix and perhaps move bisect
related files like 'BISECT_START' to a new directory similar to the
'rebase-merge' directory.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 refs.c | 41 +++++++++++++++++++++++++++++++++++++++++
 refs.h |  3 +++
 2 files changed, 44 insertions(+)

diff --git a/refs.c b/refs.c
index fff343c256..d8e4cf9a11 100644
--- a/refs.c
+++ b/refs.c
@@ -860,6 +860,47 @@ static int is_pseudoref_syntax(const char *refname)
 	return 1;
 }

+int is_pseudoref(struct ref_store *refs, const char *refname)
+{
+	static const char *const irregular_pseudorefs[] = {
+		"AUTO_MERGE",
+		"BISECT_EXPECTED_REV",
+		"NOTES_MERGE_PARTIAL",
+		"NOTES_MERGE_REF",
+		"MERGE_AUTOSTASH",
+	};
+	struct object_id oid;
+	size_t i;
+
+	if (!is_pseudoref_syntax(refname))
+		return 0;
+
+	if (ends_with(refname, "_HEAD")) {
+		refs_resolve_ref_unsafe(refs, refname,
+   					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
+   					&oid, NULL);
+   		return !is_null_oid(&oid);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
+		if (!strcmp(refname, irregular_pseudorefs[i])) {
+			refs_resolve_ref_unsafe(refs, refname,
+   						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
+   						&oid, NULL);
+			return !is_null_oid(&oid);
+		}
+
+	return 0;
+}
+
+int is_headref(struct ref_store *refs, const char *refname)
+{
+	if (!strcmp(refname, "HEAD"))
+		return refs_ref_exists(refs, refname);
+
+	return 0;
+}
+
 static int is_current_worktree_ref(const char *ref) {
 	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
 }
diff --git a/refs.h b/refs.h
index 303c5fac4d..f66cdd731c 100644
--- a/refs.h
+++ b/refs.h
@@ -1023,4 +1023,7 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);

+int is_pseudoref(struct ref_store *refs, const char *refname);
+int is_headref(struct ref_store *refs, const char *refname);
+
 #endif /* REFS_H */
-- 
2.43.GIT

^ permalink raw reply related

* [PATCH v4 2/5] refs: extract out `loose_fill_ref_dir_regular_file()`
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak
In-Reply-To: <20240211183923.131278-1-karthik.188@gmail.com>

Extract out the code for adding a single file to the loose ref dir as
`loose_fill_ref_dir_regular_file()` from `loose_fill_ref_dir()` in
`refs/files-backend.c`.

This allows us to use this function independently in the following
commits where we add code to also add pseudorefs to the ref dir.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 refs/files-backend.c | 62 +++++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 29 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 75dcc21ecb..65128821a8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -229,6 +229,38 @@ static void add_per_worktree_entries_to_dir(struct ref_dir *dir, const char *dir
 	}
 }
 
+static void loose_fill_ref_dir_regular_file(struct files_ref_store *refs,
+					    const char *refname,
+					    struct ref_dir *dir)
+{
+	struct object_id oid;
+	int flag;
+
+	if (!refs_resolve_ref_unsafe(&refs->base, refname, RESOLVE_REF_READING,
+				     &oid, &flag)) {
+		oidclr(&oid);
+		flag |= REF_ISBROKEN;
+	} else if (is_null_oid(&oid)) {
+		/*
+		 * It is so astronomically unlikely
+		 * that null_oid is the OID of an
+		 * actual object that we consider its
+		 * appearance in a loose reference
+		 * file to be repo corruption
+		 * (probably due to a software bug).
+		 */
+		flag |= REF_ISBROKEN;
+	}
+
+	if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
+		if (!refname_is_safe(refname))
+			die("loose refname is dangerous: %s", refname);
+		oidclr(&oid);
+		flag |= REF_BAD_NAME | REF_ISBROKEN;
+	}
+	add_entry_to_dir(dir, create_ref_entry(refname, &oid, flag));
+}
+
 /*
  * Read the loose references from the namespace dirname into dir
  * (without recursing).  dirname must end with '/'.  dir must be the
@@ -257,8 +289,6 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	strbuf_add(&refname, dirname, dirnamelen);
 
 	while ((de = readdir(d)) != NULL) {
-		struct object_id oid;
-		int flag;
 		unsigned char dtype;
 
 		if (de->d_name[0] == '.')
@@ -274,33 +304,7 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 					 create_dir_entry(dir->cache, refname.buf,
 							  refname.len));
 		} else if (dtype == DT_REG) {
-			if (!refs_resolve_ref_unsafe(&refs->base,
-						     refname.buf,
-						     RESOLVE_REF_READING,
-						     &oid, &flag)) {
-				oidclr(&oid);
-				flag |= REF_ISBROKEN;
-			} else if (is_null_oid(&oid)) {
-				/*
-				 * It is so astronomically unlikely
-				 * that null_oid is the OID of an
-				 * actual object that we consider its
-				 * appearance in a loose reference
-				 * file to be repo corruption
-				 * (probably due to a software bug).
-				 */
-				flag |= REF_ISBROKEN;
-			}
-
-			if (check_refname_format(refname.buf,
-						 REFNAME_ALLOW_ONELEVEL)) {
-				if (!refname_is_safe(refname.buf))
-					die("loose refname is dangerous: %s", refname.buf);
-				oidclr(&oid);
-				flag |= REF_BAD_NAME | REF_ISBROKEN;
-			}
-			add_entry_to_dir(dir,
-					 create_ref_entry(refname.buf, &oid, flag));
+			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
 		}
 		strbuf_setlen(&refname, dirnamelen);
 	}
-- 
2.43.GIT


^ permalink raw reply related

* [PATCH v4 3/5] refs: introduce `refs_for_each_include_root_refs()`
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak
In-Reply-To: <20240211183923.131278-1-karthik.188@gmail.com>

Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ROOT_REFS`,
which will be used to iterate over regular refs plus pseudorefs and
HEAD.

Refs which fall outside the `refs/` and aren't either pseudorefs or HEAD
are more of a grey area. This is because we don't block the users from
creating such refs but they are not officially supported.

Introduce `refs_for_each_include_root_refs()` which calls
`do_for_each_ref()` with this newly introduced flag.

In `refs/files-backend.c`, introduce a new function
`add_pseudoref_and_head_entries()` to add pseudorefs and HEAD to the
`ref_dir`. We then finally call `add_pseudoref_and_head_entries()`
whenever the `DO_FOR_EACH_INCLUDE_ROOT_REFS` flag is set. Any new ref
backend will also have to implement similar changes on its end.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 refs.c               |  7 +++++
 refs.h               |  6 ++++
 refs/files-backend.c | 65 ++++++++++++++++++++++++++++++++++++++++----
 refs/refs-internal.h |  6 ++++
 4 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/refs.c b/refs.c
index d8e4cf9a11..77f4c1e4c2 100644
--- a/refs.c
+++ b/refs.c
@@ -1765,6 +1765,13 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
 	return refs_for_each_rawref(get_main_ref_store(the_repository), fn, cb_data);
 }
 
+int refs_for_each_include_root_refs(struct ref_store *refs, each_ref_fn fn,
+				    void *cb_data)
+{
+	return do_for_each_ref(refs, "", NULL, fn, 0,
+			       DO_FOR_EACH_INCLUDE_ROOT_REFS, cb_data);
+}
+
 static int qsort_strcmp(const void *va, const void *vb)
 {
 	const char *a = *(const char **)va;
diff --git a/refs.h b/refs.h
index f66cdd731c..5cfaee6229 100644
--- a/refs.h
+++ b/refs.h
@@ -398,6 +398,12 @@ int for_each_namespaced_ref(const char **exclude_patterns,
 int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data);
 int for_each_rawref(each_ref_fn fn, void *cb_data);
 
+/*
+ * Iterates over all refs including root refs, i.e. pseudorefs and HEAD.
+ */
+int refs_for_each_include_root_refs(struct ref_store *refs, each_ref_fn fn,
+				    void *cb_data);
+
 /*
  * Normalizes partial refs to their fully qualified form.
  * Will prepend <prefix> to the <pattern> if it doesn't start with 'refs/'.
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 65128821a8..9c1c42fe52 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -315,9 +315,59 @@ static void loose_fill_ref_dir(struct ref_store *ref_store,
 	add_per_worktree_entries_to_dir(dir, dirname);
 }
 
-static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs)
+/*
+ * Add pseudorefs to the ref dir by parsing the directory for any files
+ * which follow the pseudoref syntax.
+ */
+static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
+					 struct ref_dir *dir,
+					 const char *dirname)
+{
+	struct files_ref_store *refs =
+		files_downcast(ref_store, REF_STORE_READ, "fill_ref_dir");
+	struct strbuf path = STRBUF_INIT, refname = STRBUF_INIT;
+	struct dirent *de;
+	size_t dirnamelen;
+	DIR *d;
+
+	files_ref_path(refs, &path, dirname);
+
+	d = opendir(path.buf);
+	if (!d) {
+		strbuf_release(&path);
+		return;
+	}
+
+	strbuf_addstr(&refname, dirname);
+	dirnamelen = refname.len;
+
+	while ((de = readdir(d)) != NULL) {
+		unsigned char dtype;
+
+		if (de->d_name[0] == '.')
+			continue;
+		if (ends_with(de->d_name, ".lock"))
+			continue;
+		strbuf_addstr(&refname, de->d_name);
+
+		dtype = get_dtype(de, &path, 1);
+		if (dtype == DT_REG && (is_pseudoref(ref_store, de->d_name) ||
+								is_headref(ref_store, de->d_name)))
+			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
+
+		strbuf_setlen(&refname, dirnamelen);
+	}
+	strbuf_release(&refname);
+	strbuf_release(&path);
+	closedir(d);
+}
+
+static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs,
+					     unsigned int flags)
 {
 	if (!refs->loose) {
+		struct ref_dir *dir;
+
 		/*
 		 * Mark the top-level directory complete because we
 		 * are about to read the only subdirectory that can
@@ -328,12 +378,17 @@ static struct ref_cache *get_loose_ref_cache(struct files_ref_store *refs)
 		/* We're going to fill the top level ourselves: */
 		refs->loose->root->flag &= ~REF_INCOMPLETE;
 
+		dir = get_ref_dir(refs->loose->root);
+
+		if (flags & DO_FOR_EACH_INCLUDE_ROOT_REFS)
+			add_pseudoref_and_head_entries(dir->cache->ref_store, dir,
+						       refs->loose->root->name);
+
 		/*
 		 * Add an incomplete entry for "refs/" (to be filled
 		 * lazily):
 		 */
-		add_entry_to_dir(get_ref_dir(refs->loose->root),
-				 create_dir_entry(refs->loose, "refs/", 5));
+		add_entry_to_dir(dir, create_dir_entry(refs->loose, "refs/", 5));
 	}
 	return refs->loose;
 }
@@ -861,7 +916,7 @@ static struct ref_iterator *files_ref_iterator_begin(
 	 * disk, and re-reads it if not.
 	 */
 
-	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs),
+	loose_iter = cache_ref_iterator_begin(get_loose_ref_cache(refs, flags),
 					      prefix, ref_store->repo, 1);
 
 	/*
@@ -1222,7 +1277,7 @@ static int files_pack_refs(struct ref_store *ref_store,
 
 	packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
-	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL,
+	iter = cache_ref_iterator_begin(get_loose_ref_cache(refs, 0), NULL,
 					the_repository, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		/*
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 83e0f0bba3..73a8fa18ad 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -260,6 +260,12 @@ enum do_for_each_ref_flags {
 	 * INCLUDE_BROKEN, since they are otherwise not included at all.
 	 */
 	DO_FOR_EACH_OMIT_DANGLING_SYMREFS = (1 << 2),
+
+	/*
+	 * Include root refs i.e. HEAD and pseudorefs along with the regular
+	 * refs.
+	 */
+	DO_FOR_EACH_INCLUDE_ROOT_REFS = (1 << 3),
 };
 
 /*
-- 
2.43.GIT


^ permalink raw reply related

* [PATCH v4 4/5] ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak
In-Reply-To: <20240211183923.131278-1-karthik.188@gmail.com>

The flag 'FILTER_REFS_ALL' is a bit ambiguous, where ALL doesn't specify
if it means to contain refs from all worktrees or whether all types of
refs (regular, HEAD & pseudorefs) or all of the above.

Since here it is actually referring to all refs with the "refs/" prefix,
let's rename it to 'FILTER_REFS_REGULAR' to indicate that this is
specifically for regular refs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 builtin/for-each-ref.c | 2 +-
 ref-filter.c           | 2 +-
 ref-filter.h           | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 3885a9c28e..23d352e371 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -97,7 +97,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	}
 
 	filter.match_as_path = 1;
-	filter_and_format_refs(&filter, FILTER_REFS_ALL, sorting, &format);
+	filter_and_format_refs(&filter, FILTER_REFS_REGULAR, sorting, &format);
 
 	ref_filter_clear(&filter);
 	ref_sorting_release(sorting);
diff --git a/ref-filter.c b/ref-filter.c
index be14b56e32..acb960e35c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3047,7 +3047,7 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 			ret = for_each_fullref_in("refs/remotes/", fn, cb_data);
 		else if (filter->kind == FILTER_REFS_TAGS)
 			ret = for_each_fullref_in("refs/tags/", fn, cb_data);
-		else if (filter->kind & FILTER_REFS_ALL)
+		else if (filter->kind & FILTER_REFS_REGULAR)
 			ret = for_each_fullref_in_pattern(filter, fn, cb_data);
 		if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
diff --git a/ref-filter.h b/ref-filter.h
index 07cd6f6da3..5416936800 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -19,10 +19,10 @@
 #define FILTER_REFS_BRANCHES       0x0004
 #define FILTER_REFS_REMOTES        0x0008
 #define FILTER_REFS_OTHERS         0x0010
-#define FILTER_REFS_ALL            (FILTER_REFS_TAGS | FILTER_REFS_BRANCHES | \
+#define FILTER_REFS_REGULAR        (FILTER_REFS_TAGS | FILTER_REFS_BRANCHES | \
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
-#define FILTER_REFS_KIND_MASK      (FILTER_REFS_ALL | FILTER_REFS_DETACHED_HEAD)
+#define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD)
 
 struct atom_value;
 struct ref_sorting;
-- 
2.43.GIT


^ permalink raw reply related

* [PATCH v4 5/5] for-each-ref: add new option to include root refs
From: Karthik Nayak @ 2024-02-11 18:39 UTC (permalink / raw)
  To: git; +Cc: gitster, ps, phillip.wood123, Karthik Nayak
In-Reply-To: <20240211183923.131278-1-karthik.188@gmail.com>

The git-for-each-ref(1) command doesn't provide a way to print root refs
i.e pseudorefs and HEAD with the regular "refs/" prefixed refs.

This commit adds a new option "--include-root-refs" to
git-for-each-ref(1). When used this would also print pseudorefs and HEAD
for the current worktree.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 Documentation/git-for-each-ref.txt |  5 ++++-
 builtin/for-each-ref.c             | 11 ++++++++---
 ref-filter.c                       | 27 +++++++++++++++++++++++++-
 ref-filter.h                       |  5 ++++-
 refs/reftable-backend.c            | 11 +++++++----
 t/t6302-for-each-ref-filter.sh     | 31 ++++++++++++++++++++++++++++++
 6 files changed, 80 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 3a9ad91b7a..c1dd12b93c 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git for-each-ref' [--count=<count>] [--shell|--perl|--python|--tcl]
 		   [(--sort=<key>)...] [--format=<format>]
-		   [ --stdin | <pattern>... ]
+		   [--include-root-refs] [ --stdin | <pattern>... ]
 		   [--points-at=<object>]
 		   [--merged[=<object>]] [--no-merged[=<object>]]
 		   [--contains[=<object>]] [--no-contains[=<object>]]
@@ -105,6 +105,9 @@ TAB %(refname)`.
 	any excluded pattern(s) are shown. Matching is done using the
 	same rules as `<pattern>` above.
 
+--include-root-refs::
+	List root refs (HEAD and pseudorefs) apart from regular refs.
+
 FIELD NAMES
 -----------
 
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 23d352e371..9ed146dad3 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -20,10 +20,10 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 {
 	struct ref_sorting *sorting;
 	struct string_list sorting_options = STRING_LIST_INIT_DUP;
-	int icase = 0;
+	int icase = 0, include_root_refs = 0, from_stdin = 0;
 	struct ref_filter filter = REF_FILTER_INIT;
 	struct ref_format format = REF_FORMAT_INIT;
-	int from_stdin = 0;
+	unsigned int flags = FILTER_REFS_REGULAR;
 	struct strvec vec = STRVEC_INIT;
 
 	struct option opts[] = {
@@ -53,6 +53,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 		OPT_NO_CONTAINS(&filter.no_commit, N_("print only refs which don't contain the commit")),
 		OPT_BOOL(0, "ignore-case", &icase, N_("sorting and filtering are case insensitive")),
 		OPT_BOOL(0, "stdin", &from_stdin, N_("read reference patterns from stdin")),
+		OPT_BOOL(0, "include-root-refs", &include_root_refs, N_("also include HEAD ref and pseudorefs")),
 		OPT_END(),
 	};
 
@@ -96,8 +97,12 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 		filter.name_patterns = argv;
 	}
 
+	if (include_root_refs) {
+		flags |= FILTER_REFS_ROOT_REFS;
+	}
+
 	filter.match_as_path = 1;
-	filter_and_format_refs(&filter, FILTER_REFS_REGULAR, sorting, &format);
+	filter_and_format_refs(&filter, flags, sorting, &format);
 
 	ref_filter_clear(&filter);
 	ref_sorting_release(sorting);
diff --git a/ref-filter.c b/ref-filter.c
index acb960e35c..0e83e29390 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2628,6 +2628,12 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       each_ref_fn cb,
 				       void *cb_data)
 {
+	if (filter->kind == FILTER_REFS_KIND_MASK) {
+		/* in this case, we want to print all refs including root refs. */
+		return refs_for_each_include_root_refs(get_main_ref_store(the_repository),
+						       cb, cb_data);
+	}
+
 	if (!filter->match_as_path) {
 		/*
 		 * in this case, the patterns are applied after
@@ -2750,6 +2756,9 @@ static int ref_kind_from_refname(const char *refname)
 			return ref_kind[i].kind;
 	}
 
+	if (is_pseudoref(get_main_ref_store(the_repository), refname))
+		return FILTER_REFS_PSEUDOREFS;
+
 	return FILTER_REFS_OTHERS;
 }
 
@@ -2781,6 +2790,16 @@ static struct ref_array_item *apply_ref_filter(const char *refname, const struct
 
 	/* Obtain the current ref kind from filter_ref_kind() and ignore unwanted refs. */
 	kind = filter_ref_kind(filter, refname);
+
+	/*
+	 * When printing HEAD with all other refs, we want to apply the same formatting
+	 * rules as the other refs, so we simply ask it to be treated as a pseudoref.
+	 */
+	if (filter->kind == FILTER_REFS_KIND_MASK && kind == FILTER_REFS_DETACHED_HEAD)
+		kind = FILTER_REFS_PSEUDOREFS;
+	else if (!(kind & filter->kind))
+		return NULL;
+
 	if (!(kind & filter->kind))
 		return NULL;
 
@@ -3049,7 +3068,13 @@ static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 			ret = for_each_fullref_in("refs/tags/", fn, cb_data);
 		else if (filter->kind & FILTER_REFS_REGULAR)
 			ret = for_each_fullref_in_pattern(filter, fn, cb_data);
-		if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD))
+
+		/*
+		 * When printing all ref types, HEAD is already included,
+		 * so we don't want to print HEAD again.
+		 */
+		if (!ret && (filter->kind != FILTER_REFS_KIND_MASK) &&
+		    (filter->kind & FILTER_REFS_DETACHED_HEAD))
 			head_ref(fn, cb_data);
 	}
 
diff --git a/ref-filter.h b/ref-filter.h
index 5416936800..0ca28d2bba 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -22,7 +22,10 @@
 #define FILTER_REFS_REGULAR        (FILTER_REFS_TAGS | FILTER_REFS_BRANCHES | \
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
-#define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD)
+#define FILTER_REFS_PSEUDOREFS     0x0040
+#define FILTER_REFS_ROOT_REFS      (FILTER_REFS_DETACHED_HEAD | FILTER_REFS_PSEUDOREFS)
+#define FILTER_REFS_KIND_MASK      (FILTER_REFS_REGULAR | FILTER_REFS_DETACHED_HEAD | \
+				    FILTER_REFS_PSEUDOREFS)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index a14f2ad7f4..c23a516ac2 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -364,12 +364,15 @@ static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			break;
 
 		/*
-		 * The files backend only lists references contained in
-		 * "refs/". We emulate the same behaviour here and thus skip
-		 * all references that don't start with this prefix.
+		 * The files backend only lists references contained in "refs/" unless
+		 * the root refs are to be included. We emulate the same behaviour here.
 		 */
-		if (!starts_with(iter->ref.refname, "refs/"))
+		if (!starts_with(iter->ref.refname, "refs/") &&
+		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
+		     (is_pseudoref(&iter->refs->base, iter->ref.refname) ||
+		      is_headref(&iter->refs->base, iter->ref.refname)))) {
 			continue;
+		}
 
 		if (iter->prefix &&
 		    strncmp(iter->prefix, iter->ref.refname, strlen(iter->prefix))) {
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 82f3d1ea0f..948f1bb5f4 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -31,6 +31,37 @@ test_expect_success 'setup some history and refs' '
 	git update-ref refs/odd/spot main
 '
 
+test_expect_success '--include-root-refs pattern prints pseudorefs' '
+	cat >expect <<-\EOF &&
+	HEAD
+	ORIG_HEAD
+	refs/heads/main
+	refs/heads/side
+	refs/odd/spot
+	refs/tags/annotated-tag
+	refs/tags/doubly-annotated-tag
+	refs/tags/doubly-signed-tag
+	refs/tags/four
+	refs/tags/one
+	refs/tags/signed-tag
+	refs/tags/three
+	refs/tags/two
+	EOF
+	git update-ref ORIG_HEAD main &&
+	git for-each-ref --format="%(refname)" --include-root-refs >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--include-root-refs with other patterns' '
+	cat >expect <<-\EOF &&
+	HEAD
+	ORIG_HEAD
+	EOF
+	git update-ref ORIG_HEAD main &&
+	git for-each-ref --format="%(refname)" --include-root-refs "*HEAD" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'filtering with --points-at' '
 	cat >expect <<-\EOF &&
 	refs/heads/main
-- 
2.43.GIT


^ permalink raw reply related

* Re: git gc changes ownerships of files linux
From: brian m. carlson @ 2024-02-11 19:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, K_V, git
In-Reply-To: <xmqqcyt39cju.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 1032 bytes --]

On 2024-02-11 at 16:43:33, Junio C Hamano wrote:
> You definitely must set up your initial directory with g+s if you
> are usihng the group-writable shared directory model (which I would
> actually be surprised to see in 2020---is a shared machine with more
> than one user-account still a thing???); adjust_shared_perm() will
> not help you there.

I think it's relatively common to have shell hosts from which to log
into production machines, or to have shared hosts for students at
universities, and I do know that shared web hosting is still quite
popular (because it tends to be very economical and low maintenance for
the user).

I don't know that shared repositories are really that common anymore,
and I do usually recommend that people clone their own copies whenever
possible, but I have seen posts on StackOverflow where people are in
fact using a shared repository (possibly with multiple worktrees) on one
system for various reasons.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* [PATCH v2] column: disallow negative padding
From: Kristoffer Haugsbakk @ 2024-02-11 19:27 UTC (permalink / raw)
  To: git; +Cc: Kristoffer Haugsbakk, Tiago Pascoal, Chris Torek, Junio C Hamano
In-Reply-To: <76688ed2cc20031d70823d9f5d214f42b3bd1409.1707501064.git.code@khaugsbakk.name>

A negative padding does not make sense and can cause errors in the
memory allocator since it’s interpreted as an unsigned integer.

Disallow negative padding. Also guard against negative padding in
`column.c` where it is conditionally used.

Reported-by: Tiago Pascoal <tiago@pascoal.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    v2:
    • Incorporate Junio’s changes (guard against negative padding in
      `column.c`)
    • Tweak commit message based on Junio’s analysis
    • Use gettext for error message
      • However I noticed that the “translation string” from `fast-import`
        isn’t a translation string. So let’s invent a new one and use a
        parameter so that it can be used elsewhere.
    • Make a test

 builtin/column.c  |  2 ++
 column.c          |  4 ++--
 t/t9002-column.sh | 11 +++++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/builtin/column.c b/builtin/column.c
index e80218f81f9..10ff7e01668 100644
--- a/builtin/column.c
+++ b/builtin/column.c
@@ -45,6 +45,8 @@ int cmd_column(int argc, const char **argv, const char *prefix)
 	memset(&copts, 0, sizeof(copts));
 	copts.padding = 1;
 	argc = parse_options(argc, argv, prefix, options, builtin_column_usage, 0);
+	if (copts.padding < 0)
+		die(_("%s must be non-negative"), "--padding");
 	if (argc)
 		usage_with_options(builtin_column_usage, options);
 	if (real_command || command) {
diff --git a/column.c b/column.c
index ff2f0abf399..c723428bc70 100644
--- a/column.c
+++ b/column.c
@@ -189,7 +189,7 @@ void print_columns(const struct string_list *list, unsigned int colopts,
 	memset(&nopts, 0, sizeof(nopts));
 	nopts.indent = opts && opts->indent ? opts->indent : "";
 	nopts.nl = opts && opts->nl ? opts->nl : "\n";
-	nopts.padding = opts ? opts->padding : 1;
+	nopts.padding = (opts && 0 <= opts->padding) ? opts->padding : 1;
 	nopts.width = opts && opts->width ? opts->width : term_columns() - 1;
 	if (!column_active(colopts)) {
 		display_plain(list, "", "\n");
@@ -373,7 +373,7 @@ int run_column_filter(int colopts, const struct column_options *opts)
 		strvec_pushf(argv, "--width=%d", opts->width);
 	if (opts && opts->indent)
 		strvec_pushf(argv, "--indent=%s", opts->indent);
-	if (opts && opts->padding)
+	if (opts && 0 <= opts->padding)
 		strvec_pushf(argv, "--padding=%d", opts->padding);
 
 	fflush(stdout);
diff --git a/t/t9002-column.sh b/t/t9002-column.sh
index 348cc406582..d5b98e615bc 100755
--- a/t/t9002-column.sh
+++ b/t/t9002-column.sh
@@ -196,4 +196,15 @@ EOF
 	test_cmp expected actual
 '
 
+test_expect_success 'padding must be non-negative' '
+	cat >input <<\EOF &&
+1 2 3 4 5 6
+EOF
+	cat >expected <<\EOF &&
+fatal: --padding must be non-negative
+EOF
+	test_must_fail git column --mode=column --padding=-1 <input >actual 2>&1 &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.43.0


^ permalink raw reply related

* RE: git gc changes ownerships of files linux
From: rsbecker @ 2024-02-11 19:48 UTC (permalink / raw)
  To: 'brian m. carlson', 'Junio C Hamano'
  Cc: 'Torsten Bögershausen', 'K_V', git
In-Reply-To: <ZckbeJqdvIfY4YPu@tapette.crustytoothpaste.net>

On Sunday, February 11, 2024 2:10 PM, brian m. carlson wrote:
>On 2024-02-11 at 16:43:33, Junio C Hamano wrote:
>> You definitely must set up your initial directory with g+s if you are
>> usihng the group-writable shared directory model (which I would
>> actually be surprised to see in 2020---is a shared machine with more
>> than one user-account still a thing???); adjust_shared_perm() will not
>> help you there.
>
>I think it's relatively common to have shell hosts from which to log into production
>machines, or to have shared hosts for students at universities, and I do know that
>shared web hosting is still quite popular (because it tends to be very economical and
>low maintenance for the user).
>
>I don't know that shared repositories are really that common anymore, and I do
>usually recommend that people clone their own copies whenever possible, but I
>have seen posts on StackOverflow where people are in fact using a shared
>repository (possibly with multiple worktrees) on one system for various reasons.

In my community, shared repositories are particularly common on the operations side of the fence. It is a balance between the need for one user id (generally does not log on) running the scripts, and the individual operations staff specifying them. I have developed (commercial) solutions to this that remove the need shared repositories in this circumstance, but up to now, I have seen them used. This comes into play when multiple people are manipulating web server content without a separate deployment mechanism.

^ permalink raw reply

* Bug: git-subtree: splitting subtree fails when subtree was added, removed and added again
From: tionis @ 2024-02-11 20:16 UTC (permalink / raw)
  To: git

I discovered a problem in git-subtree that I could also reproduce in a 
minimal proof of concept over at
https://github.com/tionis/git-subtree-bug-poc

The same bug was discussed some time ago on stackoverflow: 
https://stackoverflow.com/questions/68761778/git-subtree-cache-exists

It seems to occur when a subtree was added, removed, alter added again 
and then modified in some commit.

This leads to some error with the cache leading to an error message like

fatal: cache for $commit-hash-here already exists!

I'm unsure how to work around this other than modify the history or 
creating a branch from before the first addition and fixing it there.

^ permalink raw reply

* [PATCH v5 0/2] add-patch: classify '@' as a synonym for 'HEAD'
From: Ghanshyam Thakkar @ 2024-02-11 20:20 UTC (permalink / raw)
  To: git; +Cc: gitster, phillip.wood123, sunshine, ps, Ghanshyam Thakkar
In-Reply-To: <20240206225122.1095766-3-shyamthakkar001@gmail.com>

Since v4, I have taken Junio's suggestion to change '@' to 'HEAD' as
early as reasonably possible. And also added new test to check the
no-op case of 'checkout HEAD/@'. 

And the second patch removes all the perl prerequisites from the tests
which use patch mode functionality, as pointed out by Phillip. And have
also taken into account Eric's suggestion to not remove perl
prerequisites while amending them in patch (1/2) and do it all
together in patch (2/2).

Ghanshyam Thakkar (2):
  add-patch: classify '@' as a synonym for 'HEAD'
  add -p tests: remove PERL prerequisites

 add-patch.c                    |  8 ------
 builtin/checkout.c             |  4 ++-
 builtin/reset.c                |  4 ++-
 t/t2016-checkout-patch.sh      | 46 +++++++++++++++++++---------------
 t/t2020-checkout-detach.sh     | 12 +++++++++
 t/t2024-checkout-dwim.sh       |  2 +-
 t/t2071-restore-patch.sh       | 46 ++++++++++++++++++----------------
 t/t3904-stash-patch.sh         |  6 -----
 t/t7105-reset-patch.sh         | 37 ++++++++++++++-------------
 t/t7106-reset-unborn-branch.sh |  2 +-
 t/t7514-commit-patch.sh        |  6 -----
 11 files changed, 91 insertions(+), 82 deletions(-)

-- 
2.43.0

^ permalink raw reply

* [PATCH v5 1/2] add-patch: classify '@' as a synonym for 'HEAD'
From: Ghanshyam Thakkar @ 2024-02-11 20:20 UTC (permalink / raw)
  To: git; +Cc: gitster, phillip.wood123, sunshine, ps, Ghanshyam Thakkar
In-Reply-To: <20240206225122.1095766-3-shyamthakkar001@gmail.com>

Currently, (restore, checkout, reset) commands correctly take '@' as a
synonym for 'HEAD'. However, in patch mode different prompts/messages
are given on command line due to patch mode machinery not considering
'@' to be a synonym for 'HEAD' due to literal string comparison with
the word 'HEAD', and therefore assigning patch_mode_($command)_nothead
and triggering reverse mode (-R in diff-index). The NEEDSWORK comment
suggested comparing commit objects to get around this. However, doing
so would also take a non-checked out branch pointing to the same commit
as HEAD, as HEAD. This would cause confusion to the user.

Therefore, after parsing '@', replace it with 'HEAD' as reasonably
early as possible. This also solves another problem of disparity
between 'git checkout HEAD' and 'git checkout @' (latter detaches at
the HEAD commit and the former does not).

Trade-offs:
- Some of the errors would show the revision argument as 'HEAD' when
  given '@'. This should be fine, as most users who probably use '@'
  would be aware that it is a shortcut for 'HEAD' and most probably
  used to use 'HEAD'. There is also relevant documentation in
  'gitrevisions' manpage about '@' being the shortcut for 'HEAD'. Also,
  the simplicity of the solution far outweighs this cost.

- Consider '@' as a shortcut for 'HEAD' even if 'refs/heads/@' exists
  at a different commit. Naming a branch '@' is an obvious foot-gun and
  many existing commands already take '@' for 'HEAD' even if
  'refs/heads/@' exists at a different commit or does not exist at all
  (e.g. 'git log @', 'git push origin @' etc.). Therefore this is an
  existing assumption and should not be a problem.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
 add-patch.c                |  8 -------
 builtin/checkout.c         |  4 +++-
 builtin/reset.c            |  4 +++-
 t/t2016-checkout-patch.sh  | 46 +++++++++++++++++++++-----------------
 t/t2020-checkout-detach.sh | 12 ++++++++++
 t/t2071-restore-patch.sh   | 18 +++++++++------
 t/t7105-reset-patch.sh     | 15 ++++++++-----
 7 files changed, 64 insertions(+), 43 deletions(-)

diff --git a/add-patch.c b/add-patch.c
index 79eda168eb..68f525b35c 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -1729,14 +1729,6 @@ int run_add_p(struct repository *r, enum add_p_mode mode,
 	if (mode == ADD_P_STASH)
 		s.mode = &patch_mode_stash;
 	else if (mode == ADD_P_RESET) {
-		/*
-		 * NEEDSWORK: Instead of comparing to the literal "HEAD",
-		 * compare the commit objects instead so that other ways of
-		 * saying the same thing (such as "@") are also handled
-		 * appropriately.
-		 *
-		 * This applies to the cases below too.
-		 */
 		if (!revision || !strcmp(revision, "HEAD"))
 			s.mode = &patch_mode_reset_head;
 		else
diff --git a/builtin/checkout.c b/builtin/checkout.c
index a6e30931b5..067c251933 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1224,7 +1224,9 @@ static void setup_new_branch_info_and_source_tree(
 	struct tree **source_tree = &opts->source_tree;
 	struct object_id branch_rev;
 
-	new_branch_info->name = xstrdup(arg);
+	/* treat '@' as a shortcut for 'HEAD' */
+	new_branch_info->name = !strcmp(arg, "@") ? xstrdup("HEAD") :
+						    xstrdup(arg);
 	setup_branch_path(new_branch_info);
 
 	if (!check_refname_format(new_branch_info->path, 0) &&
diff --git a/builtin/reset.c b/builtin/reset.c
index 8390bfe4c4..f0bf29a478 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -281,7 +281,9 @@ static void parse_args(struct pathspec *pathspec,
 			verify_filename(prefix, argv[0], 1);
 		}
 	}
-	*rev_ret = rev;
+
+	/* treat '@' as a shortcut for 'HEAD' */
+	*rev_ret = !strcmp("@", rev) ? "HEAD" : rev;
 
 	parse_pathspec(pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
diff --git a/t/t2016-checkout-patch.sh b/t/t2016-checkout-patch.sh
index 747eb5563e..c4f9bf09aa 100755
--- a/t/t2016-checkout-patch.sh
+++ b/t/t2016-checkout-patch.sh
@@ -38,26 +38,32 @@ test_expect_success 'git checkout -p with staged changes' '
 	verify_state dir/foo index index
 '
 
-test_expect_success 'git checkout -p HEAD with NO staged changes: abort' '
-	set_and_save_state dir/foo work head &&
-	test_write_lines n y n | git checkout -p HEAD &&
-	verify_saved_state bar &&
-	verify_saved_state dir/foo
-'
-
-test_expect_success 'git checkout -p HEAD with NO staged changes: apply' '
-	test_write_lines n y y | git checkout -p HEAD &&
-	verify_saved_state bar &&
-	verify_state dir/foo head head
-'
-
-test_expect_success 'git checkout -p HEAD with change already staged' '
-	set_state dir/foo index index &&
-	# the third n is to get out in case it mistakenly does not apply
-	test_write_lines n y n | git checkout -p HEAD &&
-	verify_saved_state bar &&
-	verify_state dir/foo head head
-'
+for opt in "HEAD" "@"
+do
+	test_expect_success "git checkout -p $opt with NO staged changes: abort" '
+		set_and_save_state dir/foo work head &&
+		test_write_lines n y n | git checkout -p $opt >output &&
+		verify_saved_state bar &&
+		verify_saved_state dir/foo &&
+		test_grep "Discard" output
+	'
+
+	test_expect_success "git checkout -p $opt with NO staged changes: apply" '
+		test_write_lines n y y | git checkout -p $opt >output &&
+		verify_saved_state bar &&
+		verify_state dir/foo head head &&
+		test_grep "Discard" output
+	'
+
+	test_expect_success "git checkout -p $opt with change already staged" '
+		set_state dir/foo index index &&
+		# the third n is to get out in case it mistakenly does not apply
+		test_write_lines n y n | git checkout -p $opt >output &&
+		verify_saved_state bar &&
+		verify_state dir/foo head head &&
+		test_grep "Discard" output
+	'
+done
 
 test_expect_success 'git checkout -p HEAD^...' '
 	# the third n is to get out in case it mistakenly does not apply
diff --git a/t/t2020-checkout-detach.sh b/t/t2020-checkout-detach.sh
index 8202ef8c74..bce284c297 100755
--- a/t/t2020-checkout-detach.sh
+++ b/t/t2020-checkout-detach.sh
@@ -45,6 +45,18 @@ test_expect_success 'checkout branch does not detach' '
 	check_not_detached
 '
 
+for opt in "HEAD" "@"
+do
+	test_expect_success "checkout $opt no-op/don't detach" '
+		reset &&
+		cat .git/HEAD >expect &&
+		git checkout $opt &&
+		cat .git/HEAD >actual &&
+		check_not_detached &&
+		test_cmp expect actual
+	'
+done
+
 test_expect_success 'checkout tag detaches' '
 	reset &&
 	git checkout tag &&
diff --git a/t/t2071-restore-patch.sh b/t/t2071-restore-patch.sh
index b5c5c0ff7e..3dc9184b4a 100755
--- a/t/t2071-restore-patch.sh
+++ b/t/t2071-restore-patch.sh
@@ -44,13 +44,17 @@ test_expect_success PERL 'git restore -p with staged changes' '
 	verify_state dir/foo index index
 '
 
-test_expect_success PERL 'git restore -p --source=HEAD' '
-	set_state dir/foo work index &&
-	# the third n is to get out in case it mistakenly does not apply
-	test_write_lines n y n | git restore -p --source=HEAD &&
-	verify_saved_state bar &&
-	verify_state dir/foo head index
-'
+for opt in "HEAD" "@"
+do
+	test_expect_success PERL "git restore -p --source=$opt" '
+		set_state dir/foo work index &&
+		# the third n is to get out in case it mistakenly does not apply
+		test_write_lines n y n | git restore -p --source=$opt >output &&
+		verify_saved_state bar &&
+		verify_state dir/foo head index &&
+		test_grep "Discard" output
+	'
+done
 
 test_expect_success PERL 'git restore -p --source=HEAD^' '
 	set_state dir/foo work index &&
diff --git a/t/t7105-reset-patch.sh b/t/t7105-reset-patch.sh
index 05079c7246..0f597416d8 100755
--- a/t/t7105-reset-patch.sh
+++ b/t/t7105-reset-patch.sh
@@ -26,12 +26,15 @@ test_expect_success PERL 'saying "n" does nothing' '
 	verify_saved_state bar
 '
 
-test_expect_success PERL 'git reset -p' '
-	test_write_lines n y | git reset -p >output &&
-	verify_state dir/foo work head &&
-	verify_saved_state bar &&
-	test_grep "Unstage" output
-'
+for opt in "HEAD" "@" ""
+do
+	test_expect_success PERL "git reset -p $opt" '
+		test_write_lines n y | git reset -p $opt >output &&
+		verify_state dir/foo work head &&
+		verify_saved_state bar &&
+		test_grep "Unstage" output
+	'
+done
 
 test_expect_success PERL 'git reset -p HEAD^' '
 	test_write_lines n y | git reset -p HEAD^ >output &&
-- 
2.43.0


^ permalink raw reply related

* [PATCH v5 2/2] add -p tests: remove PERL prerequisites
From: Ghanshyam Thakkar @ 2024-02-11 20:20 UTC (permalink / raw)
  To: git; +Cc: gitster, phillip.wood123, sunshine, ps, Ghanshyam Thakkar
In-Reply-To: <20240206225122.1095766-3-shyamthakkar001@gmail.com>

The Perl version of the add -i/-p commands has been removed since
20b813d (add: remove "add.interactive.useBuiltin" & Perl "git
add--interactive", 2023-02-07)

Therefore, Perl prerequisite in the test scripts which use the patch
mode functionality is not neccessary.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
 t/t2024-checkout-dwim.sh       |  2 +-
 t/t2071-restore-patch.sh       | 28 ++++++++++++++--------------
 t/t3904-stash-patch.sh         |  6 ------
 t/t7105-reset-patch.sh         | 22 +++++++++++-----------
 t/t7106-reset-unborn-branch.sh |  2 +-
 t/t7514-commit-patch.sh        |  6 ------
 6 files changed, 27 insertions(+), 39 deletions(-)

diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh
index a97416ce65..a3b1449ef1 100755
--- a/t/t2024-checkout-dwim.sh
+++ b/t/t2024-checkout-dwim.sh
@@ -113,7 +113,7 @@ test_expect_success 'checkout of branch from multiple remotes fails with advice'
 	test_grep ! "^hint: " stderr
 '
 
-test_expect_success PERL 'checkout -p with multiple remotes does not print advice' '
+test_expect_success 'checkout -p with multiple remotes does not print advice' '
 	git checkout -B main &&
 	test_might_fail git branch -D foo &&
 
diff --git a/t/t2071-restore-patch.sh b/t/t2071-restore-patch.sh
index 3dc9184b4a..27e85be40a 100755
--- a/t/t2071-restore-patch.sh
+++ b/t/t2071-restore-patch.sh
@@ -4,7 +4,7 @@ test_description='git restore --patch'
 
 . ./lib-patch-mode.sh
 
-test_expect_success PERL 'setup' '
+test_expect_success 'setup' '
 	mkdir dir &&
 	echo parent >dir/foo &&
 	echo dummy >bar &&
@@ -16,28 +16,28 @@ test_expect_success PERL 'setup' '
 	save_head
 '
 
-test_expect_success PERL 'restore -p without pathspec is fine' '
+test_expect_success 'restore -p without pathspec is fine' '
 	echo q >cmd &&
 	git restore -p <cmd
 '
 
 # note: bar sorts before dir/foo, so the first 'n' is always to skip 'bar'
 
-test_expect_success PERL 'saying "n" does nothing' '
+test_expect_success 'saying "n" does nothing' '
 	set_and_save_state dir/foo work head &&
 	test_write_lines n n | git restore -p &&
 	verify_saved_state bar &&
 	verify_saved_state dir/foo
 '
 
-test_expect_success PERL 'git restore -p' '
+test_expect_success 'git restore -p' '
 	set_and_save_state dir/foo work head &&
 	test_write_lines n y | git restore -p &&
 	verify_saved_state bar &&
 	verify_state dir/foo head head
 '
 
-test_expect_success PERL 'git restore -p with staged changes' '
+test_expect_success 'git restore -p with staged changes' '
 	set_state dir/foo work index &&
 	test_write_lines n y | git restore -p &&
 	verify_saved_state bar &&
@@ -46,7 +46,7 @@ test_expect_success PERL 'git restore -p with staged changes' '
 
 for opt in "HEAD" "@"
 do
-	test_expect_success PERL "git restore -p --source=$opt" '
+	test_expect_success "git restore -p --source=$opt" '
 		set_state dir/foo work index &&
 		# the third n is to get out in case it mistakenly does not apply
 		test_write_lines n y n | git restore -p --source=$opt >output &&
@@ -56,7 +56,7 @@ do
 	'
 done
 
-test_expect_success PERL 'git restore -p --source=HEAD^' '
+test_expect_success 'git restore -p --source=HEAD^' '
 	set_state dir/foo work index &&
 	# the third n is to get out in case it mistakenly does not apply
 	test_write_lines n y n | git restore -p --source=HEAD^ &&
@@ -64,7 +64,7 @@ test_expect_success PERL 'git restore -p --source=HEAD^' '
 	verify_state dir/foo parent index
 '
 
-test_expect_success PERL 'git restore -p --source=HEAD^...' '
+test_expect_success 'git restore -p --source=HEAD^...' '
 	set_state dir/foo work index &&
 	# the third n is to get out in case it mistakenly does not apply
 	test_write_lines n y n | git restore -p --source=HEAD^... &&
@@ -72,7 +72,7 @@ test_expect_success PERL 'git restore -p --source=HEAD^...' '
 	verify_state dir/foo parent index
 '
 
-test_expect_success PERL 'git restore -p handles deletion' '
+test_expect_success 'git restore -p handles deletion' '
 	set_state dir/foo work index &&
 	rm dir/foo &&
 	test_write_lines n y | git restore -p &&
@@ -85,21 +85,21 @@ test_expect_success PERL 'git restore -p handles deletion' '
 # dir/foo.  There's always an extra 'n' to reject edits to dir/foo in
 # the failure case (and thus get out of the loop).
 
-test_expect_success PERL 'path limiting works: dir' '
+test_expect_success 'path limiting works: dir' '
 	set_state dir/foo work head &&
 	test_write_lines y n | git restore -p dir &&
 	verify_saved_state bar &&
 	verify_state dir/foo head head
 '
 
-test_expect_success PERL 'path limiting works: -- dir' '
+test_expect_success 'path limiting works: -- dir' '
 	set_state dir/foo work head &&
 	test_write_lines y n | git restore -p -- dir &&
 	verify_saved_state bar &&
 	verify_state dir/foo head head
 '
 
-test_expect_success PERL 'path limiting works: HEAD^ -- dir' '
+test_expect_success 'path limiting works: HEAD^ -- dir' '
 	set_state dir/foo work head &&
 	# the third n is to get out in case it mistakenly does not apply
 	test_write_lines y n n | git restore -p --source=HEAD^ -- dir &&
@@ -107,7 +107,7 @@ test_expect_success PERL 'path limiting works: HEAD^ -- dir' '
 	verify_state dir/foo parent head
 '
 
-test_expect_success PERL 'path limiting works: foo inside dir' '
+test_expect_success 'path limiting works: foo inside dir' '
 	set_state dir/foo work head &&
 	# the third n is to get out in case it mistakenly does not apply
 	test_write_lines y n n | (cd dir && git restore -p foo) &&
@@ -115,7 +115,7 @@ test_expect_success PERL 'path limiting works: foo inside dir' '
 	verify_state dir/foo head head
 '
 
-test_expect_success PERL 'none of this moved HEAD' '
+test_expect_success 'none of this moved HEAD' '
 	verify_saved_head
 '
 
diff --git a/t/t3904-stash-patch.sh b/t/t3904-stash-patch.sh
index accfe3845c..368fc2a6cc 100755
--- a/t/t3904-stash-patch.sh
+++ b/t/t3904-stash-patch.sh
@@ -3,12 +3,6 @@
 test_description='stash -p'
 . ./lib-patch-mode.sh
 
-if ! test_have_prereq PERL
-then
-	skip_all='skipping stash -p tests, perl not available'
-	test_done
-fi
-
 test_expect_success 'setup' '
 	mkdir dir &&
 	echo parent > dir/foo &&
diff --git a/t/t7105-reset-patch.sh b/t/t7105-reset-patch.sh
index 0f597416d8..4e7ad2b0f3 100755
--- a/t/t7105-reset-patch.sh
+++ b/t/t7105-reset-patch.sh
@@ -5,7 +5,7 @@ test_description='git reset --patch'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./lib-patch-mode.sh
 
-test_expect_success PERL 'setup' '
+test_expect_success 'setup' '
 	mkdir dir &&
 	echo parent > dir/foo &&
 	echo dummy > bar &&
@@ -19,7 +19,7 @@ test_expect_success PERL 'setup' '
 
 # note: bar sorts before foo, so the first 'n' is always to skip 'bar'
 
-test_expect_success PERL 'saying "n" does nothing' '
+test_expect_success 'saying "n" does nothing' '
 	set_and_save_state dir/foo work work &&
 	test_write_lines n n | git reset -p &&
 	verify_saved_state dir/foo &&
@@ -28,7 +28,7 @@ test_expect_success PERL 'saying "n" does nothing' '
 
 for opt in "HEAD" "@" ""
 do
-	test_expect_success PERL "git reset -p $opt" '
+	test_expect_success "git reset -p $opt" '
 		test_write_lines n y | git reset -p $opt >output &&
 		verify_state dir/foo work head &&
 		verify_saved_state bar &&
@@ -36,28 +36,28 @@ do
 	'
 done
 
-test_expect_success PERL 'git reset -p HEAD^' '
+test_expect_success 'git reset -p HEAD^' '
 	test_write_lines n y | git reset -p HEAD^ >output &&
 	verify_state dir/foo work parent &&
 	verify_saved_state bar &&
 	test_grep "Apply" output
 '
 
-test_expect_success PERL 'git reset -p HEAD^^{tree}' '
+test_expect_success 'git reset -p HEAD^^{tree}' '
 	test_write_lines n y | git reset -p HEAD^^{tree} >output &&
 	verify_state dir/foo work parent &&
 	verify_saved_state bar &&
 	test_grep "Apply" output
 '
 
-test_expect_success PERL 'git reset -p HEAD^:dir/foo (blob fails)' '
+test_expect_success 'git reset -p HEAD^:dir/foo (blob fails)' '
 	set_and_save_state dir/foo work work &&
 	test_must_fail git reset -p HEAD^:dir/foo &&
 	verify_saved_state dir/foo &&
 	verify_saved_state bar
 '
 
-test_expect_success PERL 'git reset -p aaaaaaaa (unknown fails)' '
+test_expect_success 'git reset -p aaaaaaaa (unknown fails)' '
 	set_and_save_state dir/foo work work &&
 	test_must_fail git reset -p aaaaaaaa &&
 	verify_saved_state dir/foo &&
@@ -69,27 +69,27 @@ test_expect_success PERL 'git reset -p aaaaaaaa (unknown fails)' '
 # dir/foo.  There's always an extra 'n' to reject edits to dir/foo in
 # the failure case (and thus get out of the loop).
 
-test_expect_success PERL 'git reset -p dir' '
+test_expect_success 'git reset -p dir' '
 	set_state dir/foo work work &&
 	test_write_lines y n | git reset -p dir &&
 	verify_state dir/foo work head &&
 	verify_saved_state bar
 '
 
-test_expect_success PERL 'git reset -p -- foo (inside dir)' '
+test_expect_success 'git reset -p -- foo (inside dir)' '
 	set_state dir/foo work work &&
 	test_write_lines y n | (cd dir && git reset -p -- foo) &&
 	verify_state dir/foo work head &&
 	verify_saved_state bar
 '
 
-test_expect_success PERL 'git reset -p HEAD^ -- dir' '
+test_expect_success 'git reset -p HEAD^ -- dir' '
 	test_write_lines y n | git reset -p HEAD^ -- dir &&
 	verify_state dir/foo work parent &&
 	verify_saved_state bar
 '
 
-test_expect_success PERL 'none of this moved HEAD' '
+test_expect_success 'none of this moved HEAD' '
 	verify_saved_head
 '
 
diff --git a/t/t7106-reset-unborn-branch.sh b/t/t7106-reset-unborn-branch.sh
index d20e5709f9..88d1c8adf4 100755
--- a/t/t7106-reset-unborn-branch.sh
+++ b/t/t7106-reset-unborn-branch.sh
@@ -34,7 +34,7 @@ test_expect_success 'reset $file' '
 	test_cmp expect actual
 '
 
-test_expect_success PERL 'reset -p' '
+test_expect_success 'reset -p' '
 	rm .git/index &&
 	git add a &&
 	echo y >yes &&
diff --git a/t/t7514-commit-patch.sh b/t/t7514-commit-patch.sh
index 998a2103c7..b4de10a5dd 100755
--- a/t/t7514-commit-patch.sh
+++ b/t/t7514-commit-patch.sh
@@ -3,12 +3,6 @@
 test_description='hunk edit with "commit -p -m"'
 . ./test-lib.sh
 
-if ! test_have_prereq PERL
-then
-	skip_all="skipping '$test_description' tests, perl not available"
-	test_done
-fi
-
 test_expect_success 'setup (initial)' '
 	echo line1 >file &&
 	git add file &&
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH v2 8/8] cherry-pick: add `--empty` for more robust redundant commit handling
From: Jean-Noël AVILA @ 2024-02-11 20:50 UTC (permalink / raw)
  To: git, Brian Lyles
In-Reply-To: <20240210074859.552497-9-brianmlyles@gmail.com>

Hello,

There's a typo.

On Saturday, 10 February 2024 08:43:56 CET Brian Lyles wrote:
> +--
> ++
> +Note that this option species how to handle a commit that was not initially

this option *specifies*

> +empty, but rather became empty due to a previous commit. Commits that were
> +initially empty will cause the cherry-pick to fail. To force the inclusion of
> +those commits, use `--allow-empty`.
> ++
> +
>  --keep-redundant-commits::
> -	If a commit being cherry picked duplicates a commit already in the
> -	current history, it will become empty.  By default these
> -	redundant commits cause `cherry-pick` to stop so the user can
> -	examine the commit. This option overrides that behavior and
> -	creates an empty commit object. Note that use of this option only
> -	results in an empty commit when the commit was not initially empty,
> -	but rather became empty due to a previous commit. Commits that were
> -	initially empty will cause the cherry-pick to fail. To force the
> -	inclusion of those commits use `--allow-empty`.
> +	Deprecated synonym for `--empty=keep`.
> 
>  --strategy=<strategy>::
>  	Use the given merge strategy.  Should only be used once.

Otherwise, documentation looks good to me.

JN



^ permalink raw reply

* Re: [PATCH] precious-files.txt: new document proposing new precious file type
From: Sebastian Thiel @ 2024-02-11 22:08 UTC (permalink / raw)
  To: Junio C Hamano, Elijah Newren via GitGitGadget, git,
	Josh Triplett, Elijah Newren, Phillip Wood
In-Reply-To: <xmqq8r5gfc3j.fsf@gitster.g>

I didn't know where I would best reply to give an update on my work
on precious file support, but here I go.

On my journey to daring implementing precious files in Git, I decided
to implement it in Gitoxide first to ease myself into it.

After what felt like months of work on the Gitoxide-equivalent of
dir.c, it just took 2 days to cobble together a 'gix clean' with
precious files support.

You might say that something as destructive as a 'clean' subcommand
would better not be rushed, but it was surprisingly straightforward
to implement. It was so inviting even that I could spend the second
day, today, entirely on polishing, yielding a 'gix clean' which is
fun to use, with some extras I never knew I wanted until I had full
control over it and could play around easily.

What I found myself do immediately by the way is adjust `.gitignore`
files of the project to have precious declarations right after
their non-precious counterparts for backwards compatibility.

It works perfectly, from what I can tell, and it is truly wonderful
to be able to wipe a repo clean without fear of destroying anything
valuable. And I am aware that we all know that, but wanted to write
it to underline how psychologically valuable this feature is.

Without further ado, I invite you all to give it a go yourself
for first experiences with precious files maybe.

    git clone https://github.com/Byron/gitoxide
    cd gitoxide
    cargo build --release --bin gix --no-default-features --features max-pure
	target/release/gix clean

This should do the trick - from there the program should guide the
user.

If you want to see some more interesting features besides precious
files, you can run 'cargo test -p gix' and follow the 'gix clean -xd'
instructions along with the `--debug` flag.

A word about performance: It is slower.
It started out to be only about 1% slower even on the biggest repositories
and under optimal conditions (i.e. precomposeUnicode and ignoreCase off
and skipHash true). But as I improved correctness and added features,
that was lost and it's now about 15% slower on bigger repositories.

I appended a benchmark run on the Linux kernel at the end, and it shows
that Gitoxide definitely spends more time in userland. I can only
assume that some performance was lost when I started to deviate from
the 'only do the work you need' recipe that I learned from Git to
'always provide a consistent set of information about directory entries'.

On top of that, there is multiple major shortcomings in this realm:

- Gitoxide doesn't actually get faster when reading indices with multiple
  threads for some reason.
- the icase-hashtable is created only with a single thread.
- the precompose-unicode conversion is very slow and easily costs 25%
  performance.

But that's details, some of which you can see yourself when running
'gix --trace -v clean'.

Now I hope you will have fun trying 'gix clean' with precious files in your
repositories. Also, I am particularly interested in learning how it fares
in situations where you know 'git clean' might have difficulties.
I tried very hard to achieve correctness, and any problem you find
will be fixed ASAP.

With this experience, I think I am in a good position to get precious
files support for 'git clean' implemented, once I get to make the start.

Cheers,
Sebastian

----

Here is the benchmark result (and before I forget, Gitoxide also uses about 25% more memory
for some reason, so really has some catchup to do, eventually)

linux (ffc2532) +369 -819 [!] took 2s
❯ hyperfine -N -w1 -r4  'gix clean -xd --skip-hidden-repositories=non-bare' 'gix -c index.skipHash=1 -c core.ignoreCase=0 -c core.precomposeUnicode=0 clean -xd --skip-hidden-repositories=non-bare' 'git clean -nxd'
Benchmark 1: gix clean -xd --skip-hidden-repositories=non-bare
  Time (mean ± σ):     171.7 ms ±   3.0 ms    [User: 70.4 ms, System: 101.4 ms]
  Range (min … max):   167.4 ms … 174.2 ms    4 runs

Benchmark 2: gix -c index.skipHash=1 -c core.ignoreCase=0 -c core.precomposeUnicode=0 clean -xd --skip-hidden-repositories=non-bare
  Time (mean ± σ):     156.3 ms ±   3.1 ms    [User: 56.9 ms, System: 99.3 ms]
  Range (min … max):   154.1 ms … 160.8 ms    4 runs

Benchmark 3: git clean -nxd
  Time (mean ± σ):     138.4 ms ±   2.7 ms    [User: 40.5 ms, System: 103.7 ms]
  Range (min … max):   136.1 ms … 142.0 ms    4 runs

Summary
  git clean -nxd ran
    1.13 ± 0.03 times faster than gix -c index.skipHash=1 -c core.ignoreCase=0 -c core.precomposeUnicode=0 clean -xd --skip-hidden-repositories=non-bare
    1.24 ± 0.03 times faster than gix clean -xd --skip-hidden-repositories=non-bare

On 27 Dec 2023, at 6:28, Junio C Hamano wrote:

> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Elijah Newren <newren@gmail.com>
>>
>> We have traditionally considered all ignored files to be expendable, but
>> users occasionally want ignored files that are not considered
>> expendable.  Add a design document covering how to split ignored files
>> into two types: 'trashable' (what all ignored files are currently
>> considered) and 'precious' (the new type of ignored file).
>
> The proposed syntax is a bit different from what I personally prefer
> (which is Phillip's [P14] or something like it), but I consider that
> the more valuable parts of this document is about how various
> commands ought to interact with precious paths, which shouldn't
> change regardless of the syntax.
>
> Thanks for putting this together.

^ permalink raw reply

* Re: [PATCH v2] column: disallow negative padding
From: Rubén Justo @ 2024-02-11 22:47 UTC (permalink / raw)
  To: Kristoffer Haugsbakk, git; +Cc: Tiago Pascoal, Chris Torek, Junio C Hamano
In-Reply-To: <1c959378cf495d7a3d70d0c7bdf08cc501ed6e5d.1707679627.git.code@khaugsbakk.name>

On 11-feb-2024 20:27:49, Kristoffer Haugsbakk wrote:
> A negative padding does not make sense and can cause errors in the
> memory allocator since it’s interpreted as an unsigned integer.
> 
> Disallow negative padding. Also guard against negative padding in
> `column.c` where it is conditionally used.
> 
> Reported-by: Tiago Pascoal <tiago@pascoal.net>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
> ---
> 
> Notes (series):
>     v2:
>     • Incorporate Junio’s changes (guard against negative padding in
>       `column.c`)
>     • Tweak commit message based on Junio’s analysis
>     • Use gettext for error message
>       • However I noticed that the “translation string” from `fast-import`
>         isn’t a translation string. So let’s invent a new one and use a
>         parameter so that it can be used elsewhere.
>     • Make a test
> 
>  builtin/column.c  |  2 ++
>  column.c          |  4 ++--
>  t/t9002-column.sh | 11 +++++++++++
>  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/column.c b/builtin/column.c
> index e80218f81f9..10ff7e01668 100644
> --- a/builtin/column.c
> +++ b/builtin/column.c
> @@ -45,6 +45,8 @@ int cmd_column(int argc, const char **argv, const char *prefix)
>  	memset(&copts, 0, sizeof(copts));
>  	copts.padding = 1;
>  	argc = parse_options(argc, argv, prefix, options, builtin_column_usage, 0);
> +	if (copts.padding < 0)
> +		die(_("%s must be non-negative"), "--padding");

We clearly inform the user and die.  No more OOM errors, or worse.
Good.

And the message avoids translation problems.  Excellent.

>  	if (argc)
>  		usage_with_options(builtin_column_usage, options);
>  	if (real_command || command) {
> diff --git a/column.c b/column.c
> index ff2f0abf399..c723428bc70 100644
> --- a/column.c
> +++ b/column.c
> @@ -189,7 +189,7 @@ void print_columns(const struct string_list *list, unsigned int colopts,
>  	memset(&nopts, 0, sizeof(nopts));
>  	nopts.indent = opts && opts->indent ? opts->indent : "";
>  	nopts.nl = opts && opts->nl ? opts->nl : "\n";
> -	nopts.padding = opts ? opts->padding : 1;
> +	nopts.padding = (opts && 0 <= opts->padding) ? opts->padding : 1;

This changes what Junio proposed.  Is this on purpose?

While we're here, I wonder if silently ignoring a negative value in
.padding is the right thing to do.

There are several callers of print_columns():

builtin/branch.c:           print_columns(&output, colopts, NULL);
builtin/clean.c:    print_columns(&list, colopts, &copts);
builtin/clean.c:    print_columns(menu_list, local_colopts, &copts);
builtin/column.c:    print_columns(&list, colopts, &copts);
help.c:     print_columns(&list, colopts, &copts);
wt-status.c:       print_columns(&output, s->colopts, &copts);

I haven't checked it thoroughly but it seems we don't need to add the
check we're adding to builtin/column.c, to any of the other callers.
However, it is possible that these or other new callers may need it in
the future.  If so, we should consider doing something like:

diff --git a/column.c b/column.c
index c723428bc7..4f870c725f 100644
--- a/column.c
+++ b/column.c
@@ -186,6 +186,9 @@ void print_columns(const struct string_list *list, unsigned int colopts,
                return;
        assert((colopts & COL_ENABLE_MASK) != COL_AUTO);

+       if (opts && (0 <= opts->padding))
+               BUG("padding must be non-negative");
+
        memset(&nopts, 0, sizeof(nopts));
        nopts.indent = opts && opts->indent ? opts->indent : "";
        nopts.nl = opts && opts->nl ? opts->nl : "\n";

>  	nopts.width = opts && opts->width ? opts->width : term_columns() - 1;
>  	if (!column_active(colopts)) {
>  		display_plain(list, "", "\n");
> @@ -373,7 +373,7 @@ int run_column_filter(int colopts, const struct column_options *opts)
>  		strvec_pushf(argv, "--width=%d", opts->width);
>  	if (opts && opts->indent)
>  		strvec_pushf(argv, "--indent=%s", opts->indent);
> -	if (opts && opts->padding)
> +	if (opts && 0 <= opts->padding)

This also differs from Junio's changes.

>  		strvec_pushf(argv, "--padding=%d", opts->padding);
>  
>  	fflush(stdout);
> diff --git a/t/t9002-column.sh b/t/t9002-column.sh
> index 348cc406582..d5b98e615bc 100755
> --- a/t/t9002-column.sh
> +++ b/t/t9002-column.sh
> @@ -196,4 +196,15 @@ EOF
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'padding must be non-negative' '
> +	cat >input <<\EOF &&
> +1 2 3 4 5 6
> +EOF
> +	cat >expected <<\EOF &&
> +fatal: --padding must be non-negative
> +EOF
> +	test_must_fail git column --mode=column --padding=-1 <input >actual 2>&1 &&
> +	test_cmp expected actual
> +'
> +
>  test_done

OK

> -- 
> 2.43.0
> 

^ permalink raw reply related

* Re: [PATCH v2] column: disallow negative padding
From: Rubén Justo @ 2024-02-11 23:50 UTC (permalink / raw)
  To: Kristoffer Haugsbakk, git; +Cc: Tiago Pascoal, Chris Torek, Junio C Hamano
In-Reply-To: <89d32a5f-b5ab-4773-bd9f-d33b4e348e15@gmail.com>



On 11/2/24 23:47, Rubén Justo wrote:
> On 11-feb-2024 20:27:49, Kristoffer Haugsbakk wrote:
>> A negative padding does not make sense and can cause errors in the
>> memory allocator since it’s interpreted as an unsigned integer.
>>
>> Disallow negative padding. Also guard against negative padding in
>> `column.c` where it is conditionally used.
>>
>> Reported-by: Tiago Pascoal <tiago@pascoal.net>
>> Helped-by: Junio C Hamano <gitster@pobox.com>
>> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
>> ---
>>
>> Notes (series):
>>     v2:
>>     • Incorporate Junio’s changes (guard against negative padding in
>>       `column.c`)
>>     • Tweak commit message based on Junio’s analysis
>>     • Use gettext for error message
>>       • However I noticed that the “translation string” from `fast-import`
>>         isn’t a translation string. So let’s invent a new one and use a
>>         parameter so that it can be used elsewhere.
>>     • Make a test
>>
>>  builtin/column.c  |  2 ++
>>  column.c          |  4 ++--
>>  t/t9002-column.sh | 11 +++++++++++
>>  3 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/builtin/column.c b/builtin/column.c
>> index e80218f81f9..10ff7e01668 100644
>> --- a/builtin/column.c
>> +++ b/builtin/column.c
>> @@ -45,6 +45,8 @@ int cmd_column(int argc, const char **argv, const char *prefix)
>>  	memset(&copts, 0, sizeof(copts));
>>  	copts.padding = 1;
>>  	argc = parse_options(argc, argv, prefix, options, builtin_column_usage, 0);
>> +	if (copts.padding < 0)
>> +		die(_("%s must be non-negative"), "--padding");
> 
> We clearly inform the user and die.  No more OOM errors, or worse.
> Good.
> 
> And the message avoids translation problems.  Excellent.
> 
>>  	if (argc)
>>  		usage_with_options(builtin_column_usage, options);
>>  	if (real_command || command) {
>> diff --git a/column.c b/column.c
>> index ff2f0abf399..c723428bc70 100644
>> --- a/column.c
>> +++ b/column.c
>> @@ -189,7 +189,7 @@ void print_columns(const struct string_list *list, unsigned int colopts,
>>  	memset(&nopts, 0, sizeof(nopts));
>>  	nopts.indent = opts && opts->indent ? opts->indent : "";
>>  	nopts.nl = opts && opts->nl ? opts->nl : "\n";
>> -	nopts.padding = opts ? opts->padding : 1;
>> +	nopts.padding = (opts && 0 <= opts->padding) ? opts->padding : 1;
> 
> This changes what Junio proposed.  Is this on purpose?
> 
> While we're here, I wonder if silently ignoring a negative value in
> .padding is the right thing to do.
> 
> There are several callers of print_columns():
> 
> builtin/branch.c:           print_columns(&output, colopts, NULL);
> builtin/clean.c:    print_columns(&list, colopts, &copts);
> builtin/clean.c:    print_columns(menu_list, local_colopts, &copts);
> builtin/column.c:    print_columns(&list, colopts, &copts);
> help.c:     print_columns(&list, colopts, &copts);
> wt-status.c:       print_columns(&output, s->colopts, &copts);
> 
> I haven't checked it thoroughly but it seems we don't need to add the
> check we're adding to builtin/column.c, to any of the other callers.
> However, it is possible that these or other new callers may need it in
> the future.  If so, we should consider doing something like:
> 
> diff --git a/column.c b/column.c
> index c723428bc7..4f870c725f 100644
> --- a/column.c
> +++ b/column.c
> @@ -186,6 +186,9 @@ void print_columns(const struct string_list *list, unsigned int colopts,
>                 return;
>         assert((colopts & COL_ENABLE_MASK) != COL_AUTO);
> 
> +       if (opts && (0 <= opts->padding))

Oops.  Of course, I mean:
+       if (opts && (0 > opts->padding))

Sorry.

> +               BUG("padding must be non-negative");
> +
>         memset(&nopts, 0, sizeof(nopts));
>         nopts.indent = opts && opts->indent ? opts->indent : "";
>         nopts.nl = opts && opts->nl ? opts->nl : "\n";
> 
>>  	nopts.width = opts && opts->width ? opts->width : term_columns() - 1;
>>  	if (!column_active(colopts)) {
>>  		display_plain(list, "", "\n");
>> @@ -373,7 +373,7 @@ int run_column_filter(int colopts, const struct column_options *opts)
>>  		strvec_pushf(argv, "--width=%d", opts->width);
>>  	if (opts && opts->indent)
>>  		strvec_pushf(argv, "--indent=%s", opts->indent);
>> -	if (opts && opts->padding)
>> +	if (opts && 0 <= opts->padding)
> 
> This also differs from Junio's changes.
> 
>>  		strvec_pushf(argv, "--padding=%d", opts->padding);
>>  
>>  	fflush(stdout);
>> diff --git a/t/t9002-column.sh b/t/t9002-column.sh
>> index 348cc406582..d5b98e615bc 100755
>> --- a/t/t9002-column.sh
>> +++ b/t/t9002-column.sh
>> @@ -196,4 +196,15 @@ EOF
>>  	test_cmp expected actual
>>  '
>>  
>> +test_expect_success 'padding must be non-negative' '
>> +	cat >input <<\EOF &&
>> +1 2 3 4 5 6
>> +EOF
>> +	cat >expected <<\EOF &&
>> +fatal: --padding must be non-negative
>> +EOF
>> +	test_must_fail git column --mode=column --padding=-1 <input >actual 2>&1 &&
>> +	test_cmp expected actual
>> +'
>> +
>>  test_done
> 
> OK
> 
>> -- 
>> 2.43.0
>>

^ permalink raw reply

* Re: [PATCH v2 8/8] cherry-pick: add `--empty` for more robust redundant commit handling
From: Brian Lyles @ 2024-02-12  1:35 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git
In-Reply-To: <2345401.ElGaqSPkdT@cayenne>

On Sun, Feb 11, 2024 at 2:50 PM Jean-Noël AVILA <jn.avila@free.fr>
wrote:

> Hello,
> 
> There's a typo.
> 
> On Saturday, 10 February 2024 08:43:56 CET Brian Lyles wrote:
>> +--
>> ++
>> +Note that this option species how to handle a commit that was not initially
> 
> this option *specifies*

Thank you for catching this, I'll correct it in v3.

-- 
Thank you,
Brian Lyles

^ permalink raw reply

* Re: [PATCH v3 4/4] for-each-ref: avoid filtering on empty pattern
From: Patrick Steinhardt @ 2024-02-12  6:51 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Junio C Hamano, Phillip Wood, phillip.wood, git
In-Reply-To: <CAOLa=ZS3y=K6SCEoC7hZSi7vhAT1-W4fAzPb3rYaBbGcqO5Cyw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]

On Fri, Feb 09, 2024 at 06:27:39PM +0000, Karthik Nayak wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> > Patrick Steinhardt <ps@pks.im> writes:
> >> Depending on the answer, I think we can go one of two ways:
> >>
> >>   - Accept the diverging behaviour and add `--include-all-refs`. The
> >>     "files" backend does a best effort and only includes root refs, the
> >>     "reftable" backend lists all refs.
> >>
> >>   - Double down on the fact that refs must either be pseudo refs or
> >>     start with "refs/" and treat any ref that doesn't fit this bill as
> >>     corrupted. The consequence here would be that we instead go with
> >>     `--include-root-refs` that can be implemented the same for both
> >>     backends. In addition, we add checks to git-fsck(1) to surface and
> >>     flag refs with bogus names for the "reftable" backend.
> >>
> >> While I seem to have convinced you that `--include-all-refs` might not
> >> be a bad idea after all, you have convinced me by now that the second
> >> option would be preferable. I'd be okay with either of these options as
> >> both of them address the issue at hand.
> >
> > For now my tentative preference is the latter.  If ref/head/foo is
> > an end-user mistake with one backend, it is cleaner if it is
> > considered a mistake with other backends.
> >
> > Doesn't our ref enumeration/iteration API have "include broken ones
> > as well" bit?  I wonder if this issue becomes easier to solve by
> > (re|ab)using that bit.
> 
> I'll then go ahead with point 2 then.
> 
> I'll modify my patch series for now to fit in and will follow up "checks
> to git-fsck(1) to surface and flag refs with bogus names for the
> "reftable" backend" in a follow up series.

Thanks. Note that the fsck checks are also proposed as one of the GSoC
projects where you're listed as a mentor. Might be worth it to hold back
until we know whether any student wants to work on it.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2] column: disallow negative padding
From: Kristoffer Haugsbakk @ 2024-02-12  7:05 UTC (permalink / raw)
  To: Rubén Justo; +Cc: Tiago Pascoal, Chris Torek, Junio C Hamano, git
In-Reply-To: <89d32a5f-b5ab-4773-bd9f-d33b4e348e15@gmail.com>

Hey, thanks for the review

On Sun, Feb 11, 2024, at 23:47, Rubén Justo wrote:
>>  	if (argc)
>>  		usage_with_options(builtin_column_usage, options);
>>  	if (real_command || command) {
>> diff --git a/column.c b/column.c
>> index ff2f0abf399..c723428bc70 100644
>> --- a/column.c
>> +++ b/column.c
>> @@ -189,7 +189,7 @@ void print_columns(const struct string_list *list, unsigned int colopts,
>>  	memset(&nopts, 0, sizeof(nopts));
>>  	nopts.indent = opts && opts->indent ? opts->indent : "";
>>  	nopts.nl = opts && opts->nl ? opts->nl : "\n";
>> -	nopts.padding = opts ? opts->padding : 1;
>> +	nopts.padding = (opts && 0 <= opts->padding) ? opts->padding : 1;
>
> This changes what Junio proposed.  Is this on purpose?

Yes https://lore.kernel.org/git/3380df68-83fb-417b-a490-71614edc342f@app.fastmail.com/T/#m63ca728414def19b7a0c83ec76a8c1f2de68ffbb

-- 
Kristoffer Haugsbakk


^ permalink raw reply

* Re: Interactive rebase: using "pick" for merge commits
From: Patrick Steinhardt @ 2024-02-12  7:15 UTC (permalink / raw)
  To: Stefan Haller; +Cc: phillip.wood, git
In-Reply-To: <65c65f6b-5ec8-4fa0-a17c-0f2c0d32b390@haller-berlin.de>

[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]

On Sat, Feb 10, 2024 at 10:23:16AM +0100, Stefan Haller wrote:
> On 09.02.24 17:24, Phillip Wood wrote:
> > On 09/02/2024 15:52, Stefan Haller wrote:
> >> When I do an interactive rebase, and manually enter a "pick" with the
> >> commit hash of a merge commit, I get the following confusing error
> >> message:
> >>
> >> error: commit fa1afe1 is a merge but no -m option was given.
> >> 
> >> Is it crazy to want pick to work like this? Should it be supported?
> > 
> > It causes problems trying to maintain the topology. In the past there
> > was a "--preserve-merges" option that allowed one to "pick" merges but
> > it broke if the user edited the todo list. The "--rebase-merges" option
> > was introduced with the "label", "reset" and "merge" todo list
> > instructions to allow the user to control the topology.
> 
> Yes, I'm familiar with all this, but that's not what I mean. I don't
> want to maintain the topology here, and I'm also not suggesting that git
> itself generates such "pick" entries with -mX arguments (maybe I wasn't
> clear on that). What I want to do is to add such entries myself, as a
> user, resulting in the equivalent of doing a "break" at that point in
> the rebase and doing a "git cherry-pick -mX <hash-of-merge-commit>"
> manually.

It would be neat indeed if this could be specified in the instruction
sheet. We already support options for the "merge" instruction, so
extending "pick" to support options isn't that far-fetched. Then it
would become possible to say "pick -m1 fa1afe1".

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: Race condition in git-bundle(1) create when ref is updated while running
From: Patrick Steinhardt @ 2024-02-12  7:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Toon Claes, git
In-Reply-To: <xmqq5xyxh6zp.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

On Fri, Feb 09, 2024 at 09:39:38AM -0800, Junio C Hamano wrote:
> Toon Claes <toon@iotcl.com> writes:
> 
> > I discovered a bug in git-bundle(1) create. There is a race condition
> > happening when a ref gets updated while the bundle creation process is
> > running.
> 
> "--all" that tells "traverse from the tip of all the refs" to any
> rev-list family of commands (like log and bundle) eventually boils
> down to opendir("refs/...") followed by readdir(), and if somebody
> creates or deletes files while you are reading in such a loop,
> readdir() may appear to skip an entry, which is understandable.
> Even "git for-each-ref" would race with a ref update (which involves
> removing a file and then creating another file at the same path), I
> would think.  IOW, I do not think this is limited to "git bundle".

Yeah, it's unfortunately a general issue with the "files" backend and
nothing that can easily be solved there. The only answer I can provide
in this context is that the "reftable" backend will fix it because it
enables consistent reads -- sorry for this shameless plug.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 7/7] reftable/reader: add comments to `table_iter_next()`
From: Patrick Steinhardt @ 2024-02-12  8:24 UTC (permalink / raw)
  To: John Cai; +Cc: git
In-Reply-To: <5335108F-4A21-4429-9BDF-3923D96884C0@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2772 bytes --]

On Fri, Feb 09, 2024 at 11:01:13AM -0500, John Cai wrote:
> 
> Hi Patrick,
> 
> On 1 Feb 2024, at 5:25, Patrick Steinhardt wrote:
> 
> > While working on the optimizations in the preceding patches I stumbled
> > upon `table_iter_next()` multiple times. It is quite easy to miss the
> > fact that we don't call `table_iter_next_in_block()` twice, but that the
> > second call is in fact `table_iter_next_block()`.
> >
> > Add comments to explain what exactly is going on here to make things
> > more obvious. While at it, touch up the code to conform to our code
> > style better.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  reftable/reader.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/reftable/reader.c b/reftable/reader.c
> > index 64dc366fb1..add7d57f0b 100644
> > --- a/reftable/reader.c
> > +++ b/reftable/reader.c
> > @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
> >
> >  	while (1) {
> >  		struct table_iter next = TABLE_ITER_INIT;
> > -		int err = 0;
> > -		if (ti->is_finished) {
> > +		int err;
> > +
> > +		if (ti->is_finished)
> >  			return 1;
> > -		}
> >
> > +		/*
> > +		 * Check whether the current block still has more records. If
> > +		 * so, return it. If the iterator returns positive then the
> > +		 * current block has been exhausted.
> > +		 */
> >  		err = table_iter_next_in_block(ti, rec);
> > -		if (err <= 0) {
> > +		if (err <= 0)
> >  			return err;
> > -		}
> >
> > +		/*
> > +		 * Otherwise, we need to continue to the next block in the
> > +		 * table and retry. If there are no more blocks then the
> > +		 * iterator is drained.
> > +		 */
> >  		err = table_iter_next_block(&next, ti);
> > -		if (err != 0) {
> > -			ti->is_finished = 1;
> > -		}
> >  		table_iter_block_done(ti);
> > -		if (err != 0) {
> > +		if (err) {
> 
> what's the reason for moving the if statement that handles err down after
> table_iter_block_done?

Good question. Ultimately, it's a simplification because I just merge
the two blocks which checked for `err != 0` into a single block. There
is no need to mark the iterator as finished before calling
`table_iter_block_done()`.

So becaiuse `table_iter_block_done()` doesn't inspect `is_finished`,
these two implementations are in the end equivalent. Before:

```
if (err)
    ti->is_finished = 1;
table_iter_block_done(ti);
if (err)
    return err;
```

After:

```
table_iter_block_done(ti);
if (err) {
    ti->is_finished = 1;
    return err;
}
```

The latter is much easier to reason about I think. It's also more
efficient because there's one branch less.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2 0/7] reftable: improve ref iteration performance
From: Patrick Steinhardt @ 2024-02-12  8:32 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, John Cai
In-Reply-To: <cover.1706782841.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4243 bytes --]

Hi,

this is the second version of my patch series that improves ref
iteration performance. There are no code changes compared to v1, but a
few improvements to the commit messages.

Thanks!

Patrick

Patrick Steinhardt (7):
  reftable/record: introduce function to compare records by key
  reftable/merged: allocation-less dropping of shadowed records
  reftable/merged: skip comparison for records of the same subiter
  reftable/pq: allocation-less comparison of entry keys
  reftable/block: swap buffers instead of copying
  reftable/record: don't try to reallocate ref record name
  reftable/reader: add comments to `table_iter_next()`

 reftable/block.c  |  3 +--
 reftable/merged.c | 19 +++++++-------
 reftable/merged.h |  2 --
 reftable/pq.c     | 13 +--------
 reftable/reader.c | 26 +++++++++++-------
 reftable/record.c | 67 ++++++++++++++++++++++++++++++++++++++++++++---
 reftable/record.h |  7 +++++
 7 files changed, 100 insertions(+), 37 deletions(-)

Range-diff against v1:
1:  fadabec696 ! 1:  bcdb5a2bf0 reftable/record: introduce function to compare records by key
    @@ Commit message
         In some places we need to sort reftable records by their keys to
         determine their ordering. This is done by first formatting the keys into
         a `struct strbuf` and then using `strbuf_cmp()` to compare them. This
    -    logic is needlessly roundabout and can end up costing quite a bit fo CPU
    +    logic is needlessly roundabout and can end up costing quite a bit of CPU
         cycles, both due to the allocation and formatting logic.
     
    -    Introduce a new `reftable_record_cmp()` function that knows to compare
    -    two records with each other without requiring allocations.
    +    Introduce a new `reftable_record_cmp()` function that knows how to
    +    compare two records with each other without requiring allocations.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
2:  576a96f2e5 = 2:  2364a0fa33 reftable/merged: allocation-less dropping of shadowed records
3:  0ca86eba71 ! 3:  205e6b488b reftable/merged: skip comparison for records of the same subiter
    @@ Commit message
         to return.
     
         There is an edge case here where we can skip that comparison: when the
    -    record in the priority queue comes from the same subiterator than the
    +    record in the priority queue comes from the same subiterator as the
         record we are about to return then we know that its key must be larger
         than the key of the record we are about to return. This property is
         guaranteed by the sub-iterators, and if it didn't hold then the whole
4:  1c9c19a3b3 = 4:  fd09ba70fe reftable/pq: allocation-less comparison of entry keys
5:  ac3d43c001 = 5:  2317aa43b9 reftable/block: swap buffers instead of copying
6:  41dff8731c = 6:  8c67491504 reftable/record: don't try to reallocate ref record name
7:  2f1f1dd95e ! 7:  167f67fad8 reftable/reader: add comments to `table_iter_next()`
    @@ Commit message
         more obvious. While at it, touch up the code to conform to our code
         style better.
     
    +    Note that one of the refactorings merges two conditional blocks into
    +    one. Before, we had the following code:
    +
    +    ```
    +    err = table_iter_next_block(&next, ti
    +    if (err != 0) {
    +            ti->is_finished = 1;
    +    }
    +    table_iter_block_done(ti);
    +    if (err != 0) {
    +            return err;
    +    }
    +    ```
    +
    +    As `table_iter_block_done()` does not care about `is_finished`, the
    +    conditional blocks can be merged into one block:
    +
    +    ```
    +    err = table_iter_next_block(&next, ti
    +    table_iter_block_done(ti);
    +    if (err != 0) {
    +            ti->is_finished = 1;
    +            return err;
    +    }
    +    ```
    +
    +    This is both easier to reason about and more performant because we have
    +    one branch less.
    +
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## reftable/reader.c ##

base-commit: c875e0b8e036c12cfbf6531962108a063c7a821c
-- 
2.43.GIT


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox