Git development

Git development
 help / color / mirror / Atom feed

* Re: Race condition on `git checkout -c`
From: Chris Torek @ 2023-01-19 22:55 UTC (permalink / raw)
  To: Arthur Milchior; +Cc: git
In-Reply-To: <CAEcbrFdE6X6=ppBWmFZrm0Z2RqGqFatjNHdZbGb_RMteCk6P6g@mail.gmail.com>

(Top note: you mean `git checkout -b` or `git switch -c`, not `git
checkout -c`.)

On Thu, Jan 19, 2023 at 1:24 PM Arthur Milchior
<arthur.milchior@gmail.com> wrote:
>
> I expect either:
> * to see an error message stating that `b` already exists
> * to see `b` and `B` in the list of branch.

[snip]

> uname: Darwin 22.2.0 Darwin Kernel Version 22.2.0: Fri Nov 11 02:03:51

Darwin (macOS) is your problem here.  The same problem
occurs on Windows, but not on Linux, by default.

Technically the problem is in the file system itself, combined with
the ways (plural) that Git stores branch names.

As far as Git itself is concerned, branch names are *always* case
sensitive, so branches named `b` and `B` are different.  But Git
stores branch names in two different formats, at the moment:

 * Some are kept in a plain-text file `.git/packed-refs`.
 * Some are stored as directory-and-file-names in `.git/refs/`.

If the OS's file system is case sensitive, as most standard Linux
file systems are, there's no problem with the latter. But if the OS's
file system is case-INsensitive, as the default file systems on
Windows and MacOS are, this becomes a problem: the attempt
to create both `refs/heads/b` and a different file, `refs/head/B`,
fails, with one of the two names "winning" and the other attempt
to create a new name simply re-using the existing name.

If you create a case-sensitive disk volume on your Mac, which
can be a simple click-to-mount virtual drive within your regular
case-insensitive file system, you can happily use Git without this
complication. Note that the same complication applies to file
names: Linux users can create two different, separate files
named `README.TXT` and `ReadMe.txt` in a GIt project, and
if you use the default case-insensitive macOS file system, you
won't be able to check out both files.  Using your case sensitive
volume will allow you to work with the Linux group.

If and when a future version of Git starts using reftables instead
of the file system to store branch and tag names, this particular
issue will go away, but until then, I recommend keeping a case
sensitive volume handy on your mac, and more generally,
avoiding mixing upper and lower case branch and/or file names
(at all, ever) whenever possible.  This avoids a lot of problems,
though nothing can get you past the Windows `aux.h` problem. :-)

Chris

^ permalink raw reply

* [PATCH 7/6] fsck: do not assume NUL-termination of buffers
From: Jeff King @ 2023-01-19 23:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, René Scharfe,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <Y8ifa7hyqxSbL92U@coredump.intra.peff.net>

On Wed, Jan 18, 2023 at 08:39:55PM -0500, Jeff King wrote:

> On Wed, Jan 18, 2023 at 03:35:06PM -0500, Jeff King wrote:
> 
> > The other option is having the fsck code avoid looking past the size it
> > was given. I think the intent is that this should work, from commits
> > like 4d0d89755e (Make sure fsck_commit_buffer() does not run out of the
> > buffer, 2014-09-11). We do use skip_prefix() and parse_oid_hex(), which
> > won't respect the size, but I think[1] that's OK because we'll have
> > parsed up to the end-of-header beforehand (and those functions would
> > never match past there).
> > 
> > Which would mean that 9a1a3a4d4c (mktag: allow omitting the header/body
> > \n separator, 2021-01-05) and acf9de4c94 (mktag: use fsck instead of
> > custom verify_tag(), 2021-01-05) were buggy, and we can just fix them.
> > 
> > [1] But I said "I think" above because it can get pretty subtle. There's
> >     some more discussion in this thread:
> > 
> >       https://lore.kernel.org/git/20150625155128.C3E9738005C@gemini.denx.de/
> > 
> >     but I haven't yet convinced myself it's safe. This is exactly the
> >     kind of analysis I wish I had the power to nerd-snipe René into.
> 
> I poked at this a bit more, and it definitely isn't safe.

So here's the result of my digging on this. The good news is that this
one commit on top of the rest of the series should make everything safe.
I'm sorry the explanation is a bit long, but I wanted to capture a bit
of the history, the subtle assumptions, and how I approached analyzing
and fixing it.

There are a few paths forward here:

  - apply this on top of the earlier 6 patches. This is the simplest
    thing, and my preference. It does mean that t3800 temporarily has a
    read-one-char-past-buffer bug that is detected by ASan after patch 6
    but before this patch is applied.

  - put this fix first. Unfortunately the tests rely on having patch 6
    in order to be able to feed a non-NUL-terminated buffer to fsck.
    Options there are:

      - split this patch into two: code fix goes at the beginning of the
        series, and then the tests come at the end. The downside here is
        that it's very hard to run the tests on the pre-fixed code to
        verify that they are finding problems (you'd have to revert the
        fix, or re-order patches to get the broken state)

      - introduce a test-helper that lets you feed a buffer to
        fsck_buffer(). That can demonstrate the problem and fix
        independently of any hash-object changes. But it ends up being a
        fair bit of boilerplate, and ultimately we want to test
        hash-object anyway.

  - decide the whole "make fsck work with arbitrary buffers" thing is
    too subtle and error-prone. I don't think this, or else I wouldn't
    have made this patch. But I think it's an argument that can be made
    (and is roughly the approach we decided to take way back in the 2015
    thread linked above). The solution there is to make sure we
    NUL-terminate everything. As I said before, this is tricky because
    of mmap. But we could probably just skip using mmap in index_core()
    for non-blobs (which don't tend to be very big), and then assume
    fsck on individual blobs is safe (it is, because they won't have
    been marked as gitmodules, etc for more detailed scanning).

    I think it could work. I kind of prefer just making the fsck
    functions safe. Even though the way they do left-to-right scanning
    is error-prone, at least the ugliness is contained inside them,
    rather than this "sure, I take a ptr/len combo, but make sure you
    allocate an extra NUL byte!" assumption that currently exists.

Anyway, here's the patch. I'm happy to repost the whole 7-patch series,
too, but since the earlier ones didn't change in my preferred path
forward, this seemed easier for now. ;)

-- >8 --
Subject: [PATCH] fsck: do not assume NUL-termination of buffers

The fsck code operates on an object buffer represented as a pointer/len
combination. However, the parsing of commits and tags is a little bit
loose; we mostly scan left-to-right through the buffer, without checking
whether we've gone past the length we were given.

This has traditionally been OK because the buffers we feed to fsck
always have an extra NUL after the end of the object content, which ends
any left-to-right scan. That has always been true for objects we read
from the odb, and we made it true for incoming index-pack/unpack-objects
checks in a1e920a0a7 (index-pack: terminate object buffers with NUL,
2014-12-08).

However, we recently added an exception: hash-object asks index_fd() to
do fsck checks. That _may_ have an extra NUL (if we read from a pipe
into a strbuf), but it might not (if we read the contents from the
file). Nor can we just teach it to always add a NUL. We may mmap the
on-disk file, which will not have any extra bytes (if it's a multiple of
the page size). Not to mention that this is a rather subtle assumption
for the fsck code to make.

Instead, let's make sure that the fsck parsers don't ever look past the
size of the buffer they've been given. This _almost_ works already,
thanks to earlier work in 4d0d89755e (Make sure fsck_commit_buffer()
does not run out of the buffer, 2014-09-11). The theory there is that we
check up front whether we have the end of header double-newline
separator. And then any left-to-right scanning we do is OK as long as it
stops when it hits that boundary.

However, we later softened that in 84d18c0bcf (fsck: it is OK for a tag
and a commit to lack the body, 2015-06-28), which allows the
double-newline header to be missing, but does require that the header
ends in a newline. That was OK back then, because of the NUL-termination
guarantees (including the one from a1e920a0a7 mentioned above).

Because 84d18c0bcf guarantees that any header line does end in a
newline, we are still OK with most of the left-to-right scanning. We
only need to take care after completing a line, to check that there is
another line (and we didn't run out of buffer).

Most of these checks are just need to check "buffer < buffer_end" (where
buffer is advanced as we parse) before scanning for the next header
line. But here are a few notes:

  - we don't technically need to check for remaining buffer before
    parsing the very first line ("tree" for a commit, or "object" for a
    tag), because verify_headers() rejects a totally empty buffer. But
    we'll do so in the name of consistency and defensiveness.

  - there are some calls to strchr('\n'). These are actually OK by the
    "the final header line must end in a newline" guarantee from
    verify_headers(). They will always find that rather than run off the
    end of the buffer. Curiously, they do check for a NULL return and
    complain, but I believe that condition can never be reached.

    However, I converted them to use memchr() with a proper size and
    retained the NULL checks. Using memchr() is not much longer and
    makes it more obvious what is going on. Likewise, retaining the NULL
    checks serves as a defensive measure in case my analysis is wrong.

  - commit 9a1a3a4d4c (mktag: allow omitting the header/body \n
    separator, 2021-01-05), does check for the end-of-buffer condition,
    but does so with "!*buffer", relying explicitly on the NUL
    termination. We can accomplish the same thing with a pointer
    comparison. I also folded it into the follow-on conditional that
    checks the contents of the buffer, for consistency with the other
    checks.

  - fsck_ident() uses parse_timestamp(), which is based on strtoumax().
    That function will happily skip past leading whitespace, including
    newlines, which makes it a risk. We can fix this by scanning to the
    first digit ourselves, and then using parse_timestamp() to do the
    actual numeric conversion.

    Note that as a side effect this fixes the fact that we missed
    zero-padded timestamps like "<email>   0123" (whereas we would
    complain about "<email> 0123"). I doubt anybody cares, but I
    mention it here for completeness.

  - fsck_tree() does not need any modifications. It relies on
    decode_tree_entry() to do the actual parsing, and that function
    checks both that there are enough bytes in the buffer to represent
    an entry, and that there is a NUL at the appropriate spot (one
    hash-length from the end; this may not be the NUL for the entry we
    are parsing, but we know that in the worst case, everything from our
    current position to that NUL is a filename, so we won't run out of
    bytes).

In addition to fixing the code itself, we'd like to make sure our rather
subtle assumptions are not violated in the future. So this patch does
two more things:

  - add comments around verify_headers() documenting the link between
    what it checks and the memory safety of the callers. I don't expect
    this code to be modified frequently, but this may help somebody from
    accidentally breaking things.

  - add a thorough set of tests covering truncations at various key
    spots (e.g., for a "tree $oid" line, in the middle of the word
    "tree", right after it, after the space, in the middle of the $oid,
    and right at the end of the line. Most of these are fine already (it
    is only truncating right at the end of the line that is currently
    broken). And some of them are not even possible with the current
    code (we parse "tree " as a unit, so truncating before the space is
    equivalent). But I aimed here to consider the code a black box and
    look for any truncations that would be a problem for a left-to-right
    parser.

Signed-off-by: Jeff King <peff@peff.net>
---
 fsck.c                 |  67 ++++++++++++++++----
 t/t1451-fsck-buffer.sh | 140 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 194 insertions(+), 13 deletions(-)
 create mode 100755 t/t1451-fsck-buffer.sh

diff --git a/fsck.c b/fsck.c
index c2c8facd2d..2b18717ee8 100644
--- a/fsck.c
+++ b/fsck.c
@@ -748,6 +748,23 @@ static int fsck_tree(const struct object_id *tree_oid,
 	return retval;
 }
 
+/*
+ * Confirm that the headers of a commit or tag object end in a reasonable way,
+ * either with the usual "\n\n" separator, or at least with a trailing newline
+ * on the final header line.
+ *
+ * This property is important for the memory safety of our callers. It allows
+ * them to scan the buffer linewise without constantly checking the remaining
+ * size as long as:
+ *
+ *   - they check that there are bytes left in the buffer at the start of any
+ *     line (i.e., that the last newline they saw was not the final one we
+ *     found here)
+ *
+ *   - any intra-line scanning they do will stop at a newline, which will worst
+ *     case hit the newline we found here as the end-of-header. This makes it
+ *     OK for them to use helpers like parse_oid_hex(), or even skip_prefix().
+ */
 static int verify_headers(const void *data, unsigned long size,
 			  const struct object_id *oid, enum object_type type,
 			  struct fsck_options *options)
@@ -808,6 +825,20 @@ static int fsck_ident(const char **ident,
 	if (*p != ' ')
 		return report(options, oid, type, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
 	p++;
+	/*
+	 * Our timestamp parser is based on the C strto*() functions, which
+	 * will happily eat whitespace, including the newline that is supposed
+	 * to prevent us walking past the end of the buffer. So do our own
+	 * scan, skipping linear whitespace but not newlines, and then
+	 * confirming we found a digit. We _could_ be even more strict here,
+	 * as we really expect only a single space, but since we have
+	 * traditionally allowed extra whitespace, we'll continue to do so.
+	 */
+	while (*p == ' ' || *p == '\t')
+		p++;
+	if (!isdigit(*p))
+		return report(options, oid, type, FSCK_MSG_BAD_DATE,
+			      "invalid author/committer line - bad date");
 	if (*p == '0' && p[1] != ' ')
 		return report(options, oid, type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
 	if (date_overflows(parse_timestamp(p, &end, 10)))
@@ -834,20 +865,26 @@ static int fsck_commit(const struct object_id *oid,
 	unsigned author_count;
 	int err;
 	const char *buffer_begin = buffer;
+	const char *buffer_end = buffer + size;
 	const char *p;
 
+	/*
+	 * We _must_ stop parsing immediately if this reports failure, as the
+	 * memory safety of the rest of the function depends on it. See the
+	 * comment above the definition of verify_headers() for more details.
+	 */
 	if (verify_headers(buffer, size, oid, OBJ_COMMIT, options))
 		return -1;
 
-	if (!skip_prefix(buffer, "tree ", &buffer))
+	if (buffer >= buffer_end || !skip_prefix(buffer, "tree ", &buffer))
 		return report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
 	if (parse_oid_hex(buffer, &tree_oid, &p) || *p != '\n') {
 		err = report(options, oid, OBJ_COMMIT, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
 		if (err)
 			return err;
 	}
 	buffer = p + 1;
-	while (skip_prefix(buffer, "parent ", &buffer)) {
+	while (buffer < buffer_end && skip_prefix(buffer, "parent ", &buffer)) {
 		if (parse_oid_hex(buffer, &parent_oid, &p) || *p != '\n') {
 			err = report(options, oid, OBJ_COMMIT, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
 			if (err)
@@ -856,7 +893,7 @@ static int fsck_commit(const struct object_id *oid,
 		buffer = p + 1;
 	}
 	author_count = 0;
-	while (skip_prefix(buffer, "author ", &buffer)) {
+	while (buffer < buffer_end && skip_prefix(buffer, "author ", &buffer)) {
 		author_count++;
 		err = fsck_ident(&buffer, oid, OBJ_COMMIT, options);
 		if (err)
@@ -868,7 +905,7 @@ static int fsck_commit(const struct object_id *oid,
 		err = report(options, oid, OBJ_COMMIT, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
 	if (err)
 		return err;
-	if (!skip_prefix(buffer, "committer ", &buffer))
+	if (buffer >= buffer_end || !skip_prefix(buffer, "committer ", &buffer))
 		return report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
 	err = fsck_ident(&buffer, oid, OBJ_COMMIT, options);
 	if (err)
@@ -899,13 +936,19 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
 	int ret = 0;
 	char *eol;
 	struct strbuf sb = STRBUF_INIT;
+	const char *buffer_end = buffer + size;
 	const char *p;
 
+	/*
+	 * We _must_ stop parsing immediately if this reports failure, as the
+	 * memory safety of the rest of the function depends on it. See the
+	 * comment above the definition of verify_headers() for more details.
+	 */
 	ret = verify_headers(buffer, size, oid, OBJ_TAG, options);
 	if (ret)
 		goto done;
 
-	if (!skip_prefix(buffer, "object ", &buffer)) {
+	if (buffer >= buffer_end || !skip_prefix(buffer, "object ", &buffer)) {
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line");
 		goto done;
 	}
@@ -916,11 +959,11 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
 	}
 	buffer = p + 1;
 
-	if (!skip_prefix(buffer, "type ", &buffer)) {
+	if (buffer >= buffer_end || !skip_prefix(buffer, "type ", &buffer)) {
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line");
 		goto done;
 	}
-	eol = strchr(buffer, '\n');
+	eol = memchr(buffer, '\n', buffer_end - buffer);
 	if (!eol) {
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line");
 		goto done;
@@ -932,11 +975,11 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
 		goto done;
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tag ", &buffer)) {
+	if (buffer >= buffer_end || !skip_prefix(buffer, "tag ", &buffer)) {
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line");
 		goto done;
 	}
-	eol = strchr(buffer, '\n');
+	eol = memchr(buffer, '\n', buffer_end - buffer);
 	if (!eol) {
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line");
 		goto done;
@@ -952,18 +995,16 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
 	}
 	buffer = eol + 1;
 
-	if (!skip_prefix(buffer, "tagger ", &buffer)) {
+	if (buffer >= buffer_end || !skip_prefix(buffer, "tagger ", &buffer)) {
 		/* early tags do not contain 'tagger' lines; warn only */
 		ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
 		if (ret)
 			goto done;
 	}
 	else
 		ret = fsck_ident(&buffer, oid, OBJ_TAG, options);
-	if (!*buffer)
-		goto done;
 
-	if (!starts_with(buffer, "\n")) {
+	if (buffer < buffer_end && !starts_with(buffer, "\n")) {
 		/*
 		 * The verify_headers() check will allow
 		 * e.g. "[...]tagger <tagger>\nsome
diff --git a/t/t1451-fsck-buffer.sh b/t/t1451-fsck-buffer.sh
new file mode 100755
index 0000000000..9ac270abab
--- /dev/null
+++ b/t/t1451-fsck-buffer.sh
@@ -0,0 +1,140 @@
+#!/bin/sh
+
+test_description='fsck on buffers without NUL termination
+
+The goal here is to make sure that the various fsck parsers never look
+past the end of the buffer they are given, even when encountering broken
+or truncated objects.
+
+We have to use "hash-object" for this because most code paths that read objects
+append an extra NUL for safety after the buffer. But hash-object, since it is
+reading straight from a file (and possibly even mmap-ing it) cannot always do
+so.
+
+These tests _might_ catch such overruns in normal use, but should be run with
+ASan or valgrind for more confidence.
+'
+. ./test-lib.sh
+
+# the general idea for tags and commits is to build up the "base" file
+# progressively, and then test new truncations on top of it.
+reset () {
+	test_expect_success 'reset input to empty' '
+		>base
+	'
+}
+
+add () {
+	content="$1"
+	type=${content%% *}
+	test_expect_success "add $type line" '
+		echo "$content" >>base
+	'
+}
+
+check () {
+	type=$1
+	fsck=$2
+	content=$3
+	test_expect_success "truncated $type ($fsck, \"$content\")" '
+		# do not pipe into hash-object here; we want to increase
+		# the chance that it uses a fixed-size buffer or mmap,
+		# and a pipe would be read into a strbuf.
+		{
+			cat base &&
+			echo "$content"
+		} >input &&
+		test_must_fail git hash-object -t "$type" input 2>err &&
+		grep "$fsck" err
+	'
+}
+
+test_expect_success 'create valid objects' '
+	git commit --allow-empty -m foo &&
+	commit=$(git rev-parse --verify HEAD) &&
+	tree=$(git rev-parse --verify HEAD^{tree})
+'
+
+reset
+check commit missingTree ""
+check commit missingTree "tr"
+check commit missingTree "tree"
+check commit badTreeSha1 "tree "
+check commit badTreeSha1 "tree 1234"
+add "tree $tree"
+
+# these expect missingAuthor because "parent" is optional
+check commit missingAuthor ""
+check commit missingAuthor "par"
+check commit missingAuthor "parent"
+check commit badParentSha1 "parent "
+check commit badParentSha1 "parent 1234"
+add "parent $commit"
+
+check commit missingAuthor ""
+check commit missingAuthor "au"
+check commit missingAuthor "author"
+ident_checks () {
+	check $1 missingEmail "$2 "
+	check $1 missingEmail "$2 name"
+	check $1 badEmail "$2 name <"
+	check $1 badEmail "$2 name <email"
+	check $1 missingSpaceBeforeDate "$2 name <email>"
+	check $1 badDate "$2 name <email> "
+	check $1 badDate "$2 name <email> 1234"
+	check $1 badTimezone "$2 name <email> 1234 "
+	check $1 badTimezone "$2 name <email> 1234 +"
+}
+ident_checks commit author
+add "author name <email> 1234 +0000"
+
+check commit missingCommitter ""
+check commit missingCommitter "co"
+check commit missingCommitter "committer"
+ident_checks commit committer
+add "committer name <email> 1234 +0000"
+
+reset
+check tag missingObject ""
+check tag missingObject "obj"
+check tag missingObject "object"
+check tag badObjectSha1 "object "
+check tag badObjectSha1 "object 1234"
+add "object $commit"
+
+check tag missingType ""
+check tag missingType "ty"
+check tag missingType "type"
+check tag badType "type "
+check tag badType "type com"
+add "type commit"
+
+check tag missingTagEntry ""
+check tag missingTagEntry "ta"
+check tag missingTagEntry "tag"
+check tag badTagName "tag "
+add "tag foo"
+
+check tag missingTagger ""
+check tag missingTagger "ta"
+check tag missingTagger "tagger"
+ident_checks tag tagger
+
+# trees are a binary format and can't use our earlier helpers
+test_expect_success 'truncated tree (short hash)' '
+	printf "100644 foo\0\1\1\1\1" >input &&
+	test_must_fail git hash-object -t tree input 2>err &&
+	grep badTree err
+'
+
+test_expect_success 'truncated tree (missing nul)' '
+	# these two things are indistinguishable to the parser. The important
+	# thing about this is example is that there are enough bytes to
+	# make up a hash, and that there is no NUL (and we confirm that the
+	# parser does not walk past the end of the buffer).
+	printf "100644 a long filename, or a hash with missing nul?" >input &&
+	test_must_fail git hash-object -t tree input 2>err &&
+	grep badTree err
+'
+
+test_done
-- 
2.39.1.616.gd06fca9e99


^ permalink raw reply related

* Re: [PATCH] worktree: teach find_shared_symref to ignore current worktree
From: Rubén Justo @ 2023-01-19 23:18 UTC (permalink / raw)
  To: phillip.wood, Junio C Hamano; +Cc: Git List, Eric Sunshine
In-Reply-To: <cedc93ec-aa6f-65bf-65be-0dca3d4d0186@dunelm.org.uk>

On 19-ene-2023 10:48:27, Phillip Wood wrote:

> I'm afraid I'm still not totally clear from this message exactly what the
> problem is. Having looked at die_if_checked_out() I think it maybe the

I'm going to reroll, I hope it makes sense then.

> If that is the problem you're trying to solve I think it would be better to
> keep the signature of find_shared_symref() the same and add a helper
> function that is called by die_if_checked_out() and find_shared_symref()

OK.

> which can optionally ignore the current worktree. The commit message for
> such a patch should explain you're fixing a bug rather than trying to change
> the existing behavior.
> 

Yes, the message is not clear that the commit is fixing a bug.  I'll add
that.

Thank you. 

^ permalink raw reply

* Re: [PATCH v2 1/3] avoid unnecessary worktrees traversing
From: Rubén Justo @ 2023-01-19 23:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List
In-Reply-To: <xmqqilh21ale.fsf@gitster.g>

On 19-ene-2023 13:24:45, Junio C Hamano wrote:
> Rubén Justo <rjusto@gmail.com> writes:
> 
> > -static void reject_rebase_or_bisect_branch(const char *target)
> > +static int ishead_and_reject_rebase_or_bisect_branch(const char *target)
> 
> The original name was already horrible but it became much worse by
> acquiring a non-word "ishead" as part of it X-<.
> 
> At least let's replace "rebase or bisect" with something a bit more
> generic, extensible, and shorter phrase.  For example, isn't the
> point of having the function was to give us a mechansim to see if
> the branch with the given name is not to be modified because it is
> being worked on elsewhere?  "The branch is in use" would be a good
> phrase to express such a concept, so die_if_branch_is_in_use() or
> something along that line may be easier to grok.

I agree, the naming is ugly.

The idea is to return, if not die(), from the iteration we are doing in
that function, whether the branch is checked out in any worktree.  That
information allows us later, if we know in advance that no HEAD needs to
be adjusted, to avoid calling replace_each_worktree_head_symref(),
saving us a new and unnecessary traversal of the worktrees.

There is a second idea to, in next commits, return also if the branch
is an unborn branch.

die_if_branch_is_in_use() is a better name for reject_rebase_or_... but
don't know how it fits with these ideas.  I'm open to suggestions.

I'll reroll with a better approach.

Thank you.

^ permalink raw reply

* Re: [PATCH 3/3] branch: rename orphan branches in any worktree
From: Rubén Justo @ 2023-01-19 23:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List
In-Reply-To: <xmqqedrq1a7h.fsf@gitster.g>

On 19-ene-2023 13:33:06, Junio C Hamano wrote:
> Rubén Justo <rjusto@gmail.com> writes:
> 
> > +	if (!copy && !(ishead > 1) &&
> 
> Logically it might be necessary to be able to extend "is that branch
> what we have checked out, yes or no?" bool into something else that
> can be something other than 0 or 1, but as soon as you did so,
> "is_head" is no longer a Boolean "is it a HEAD, yes or no?".
> 
> Now what does that value really _mean_?  Please rename the variable
> and helper function appropriately to make it clear what is going on.

The idea is that an unborn branch needs to be a HEAD, so (head > 1)
codifies that information.

As I said in another reply in this thread, I'm going to reroll.  I hope
to make it clearer then.

Thank you.

^ permalink raw reply

* Re: [PATCH] fetch: fix duplicate remote parallel fetch bug
From: Calvin Wan @ 2023-01-19 23:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq8rhy172q.fsf@gitster.g>

> As it always is possible to edit .git/config manually, it is
> necessary to perform deduplication like this patch does on the
> consumer side of the list, but do you know if our tool create
> duplication, or is it entirely something the end-user does manually?

I checked git-remote and there is protection against duplication
there, but I'm unsure if there are other places where remotes are
being added/renamed. I discovered the bug initially by using
git-config.

^ permalink raw reply

* Re: [PATCH 7/6] fsck: do not assume NUL-termination of buffers
From: Junio C Hamano @ 2023-01-19 23:58 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Taylor Blau, René Scharfe,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <Y8nOmZHv7T843uBn@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> So here's the result of my digging on this. The good news is that this
> one commit on top of the rest of the series should make everything safe.
> I'm sorry the explanation is a bit long, but I wanted to capture a bit
> of the history, the subtle assumptions, and how I approached analyzing
> and fixing it.
>
> There are a few paths forward here:
>
>   - apply this on top of the earlier 6 patches. This is the simplest
>     thing, and my preference. It does mean that t3800 temporarily has a
>     read-one-char-past-buffer bug that is detected by ASan after patch 6
>     but before this patch is applied.

That sounds reasonable, even though purist among us may find it
slightly disturbing that it breaks "bisectability".

> Anyway, here's the patch. I'm happy to repost the whole 7-patch series,
> too, but since the earlier ones didn't change in my preferred path
> forward, this seemed easier for now. ;)

Thanks.  Will queue.


^ permalink raw reply

* Re: [PATCH 3/5] log: Push to/cc handling down into show_log()
From: Junio C Hamano @ 2023-01-20  0:33 UTC (permalink / raw)
  To: Zev Weiss
  Cc: git, brian m. carlson, Ævar Arnfjörð Bjarmason,
	René Scharfe, Denton Liu, Emma Brooks, Eric Sunshine,
	Patrick Hemmer
In-Reply-To: <20230119223858.29262-4-zev@bewilderbeest.net>

Zev Weiss <zev@bewilderbeest.net> writes:

> Subject: Re: [PATCH 3/5] log: Push to/cc handling down into show_log()

s/Push/push/
cf. Documentation/SubmittingPatches[[summary-section]]

This is common to all these patches.

> @@ -1326,6 +1326,7 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
>  	pp.rev = rev;
>  	pp.print_email_subject = 1;
>  	pp_user_info(&pp, NULL, &sb, committer, encoding);
> +	format_recipients(rev, &sb);

This is where the two new members in the rev structure is used.

> @@ -2028,9 +2029,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  		strbuf_addch(&buf, '\n');
>  	}
>  
> -	recipients_to_header_buf("To", &buf, &extra_to);
> -	recipients_to_header_buf("Cc", &buf, &extra_cc);
> -
> +	rev.to_recipients = &extra_to;
> +	rev.cc_recipients = &extra_cc;
>  	rev.extra_headers = to_free = strbuf_detach(&buf, NULL);

And these two members point at borrowed memory.  extra_to and
extra_cc is freed after everything is done, near the end of the
cmd_format_patch() function.  We don't leak any extra memory by this
change, which is good.

^ permalink raw reply

* [PATCH v9 1/2] send-email: refactor header generation functions
From: Michael Strawbridge @ 2023-01-20  1:24 UTC (permalink / raw)
  To: git
  Cc: michael.strawbridge, Luben Tuikov, Junio C Hamano,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <20230120012459.920932-1-michael.strawbridge@amd.com>

Split process_file and send_message into easier to use functions.
Making SMTP header information widely available.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
---
 git-send-email.perl | 49 ++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 5861e99a6e..42f135a266 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1495,16 +1495,7 @@ sub file_name_is_absolute {
 	return File::Spec::Functions::file_name_is_absolute($path);
 }
 
-# Prepares the email, then asks the user what to do.
-#
-# If the user chooses to send the email, it's sent and 1 is returned.
-# If the user chooses not to send the email, 0 is returned.
-# If the user decides they want to make further edits, -1 is returned and the
-# caller is expected to call send_message again after the edits are performed.
-#
-# If an error occurs sending the email, this just dies.
-
-sub send_message {
+sub gen_header {
 	my @recipients = unique_email_list(@to);
 	@cc = (grep { my $cc = extract_valid_address_or_die($_);
 		      not grep { $cc eq $_ || $_ =~ /<\Q${cc}\E>$/ } @recipients
@@ -1546,6 +1537,22 @@ sub send_message {
 	if (@xh) {
 		$header .= join("\n", @xh) . "\n";
 	}
+	my $recipients_ref = \@recipients;
+	return ($recipients_ref, $to, $date, $gitversion, $cc, $ccline, $header);
+}
+
+# Prepares the email, then asks the user what to do.
+#
+# If the user chooses to send the email, it's sent and 1 is returned.
+# If the user chooses not to send the email, 0 is returned.
+# If the user decides they want to make further edits, -1 is returned and the
+# caller is expected to call send_message again after the edits are performed.
+#
+# If an error occurs sending the email, this just dies.
+
+sub send_message {
+	my ($recipients_ref, $to, $date, $gitversion, $cc, $ccline, $header) = gen_header();
+	my @recipients = @$recipients_ref;
 
 	my @sendmail_parameters = ('-i', @recipients);
 	my $raw_from = $sender;
@@ -1735,11 +1742,8 @@ sub send_message {
 $references = $initial_in_reply_to || '';
 $message_num = 0;
 
-# Prepares the email, prompts the user, sends it out
-# Returns 0 if an edit was done and the function should be called again, or 1
-# otherwise.
-sub process_file {
-	my ($t) = @_;
+sub pre_process_file {
+	my ($t, $quiet) = @_;
 
 	open my $fh, "<", $t or die sprintf(__("can't open file %s"), $t);
 
@@ -1893,9 +1897,9 @@ sub process_file {
 	}
 	close $fh;
 
-	push @to, recipients_cmd("to-cmd", "to", $to_cmd, $t)
+	push @to, recipients_cmd("to-cmd", "to", $to_cmd, $t, $quiet)
 		if defined $to_cmd;
-	push @cc, recipients_cmd("cc-cmd", "cc", $cc_cmd, $t)
+	push @cc, recipients_cmd("cc-cmd", "cc", $cc_cmd, $t, $quiet)
 		if defined $cc_cmd && !$suppress_cc{'cccmd'};
 
 	if ($broken_encoding{$t} && !$has_content_type) {
@@ -1954,6 +1958,15 @@ sub process_file {
 			@initial_to = @to;
 		}
 	}
+}
+
+# Prepares the email, prompts the user, and sends it out
+# Returns 0 if an edit was done and the function should be called again, or 1
+# on the email being successfully sent out.
+sub process_file {
+	my ($t) = @_;
+
+        pre_process_file($t, $quiet);
 
 	my $message_was_sent = send_message();
 	if ($message_was_sent == -1) {
@@ -2002,7 +2015,7 @@ sub process_file {
 # Execute a command (e.g. $to_cmd) to get a list of email addresses
 # and return a results array
 sub recipients_cmd {
-	my ($prefix, $what, $cmd, $file) = @_;
+	my ($prefix, $what, $cmd, $file, $quiet) = @_;
 
 	my @addresses = ();
 	open my $fh, "-|", "$cmd \Q$file\E"
-- 
2.34.1


^ permalink raw reply related

* [PATCH v9 2/2] send-email: expose header information to git-send-email's sendemail-validate hook
From: Michael Strawbridge @ 2023-01-20  1:24 UTC (permalink / raw)
  To: git
  Cc: michael.strawbridge, Luben Tuikov, Junio C Hamano,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <20230120012459.920932-1-michael.strawbridge@amd.com>

To allow further flexibility in the Git hook, the SMTP header
information of the email which git-send-email intends to send, is now
passed as the 2nd argument to the sendemail-validate hook.

As an example, this can be useful for acting upon keywords in the
subject or specific email addresses.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
---
 Documentation/githooks.txt | 27 ++++++++++++++++++----
 git-send-email.perl        | 46 ++++++++++++++++++++++----------------
 t/t9001-send-email.sh      | 27 ++++++++++++++++++++--
 3 files changed, 75 insertions(+), 25 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index a16e62bc8c..0decbfc92d 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -583,10 +583,29 @@ processed by rebase.
 sendemail-validate
 ~~~~~~~~~~~~~~~~~~
 
-This hook is invoked by linkgit:git-send-email[1].  It takes a single parameter,
-the name of the file that holds the e-mail to be sent.  Exiting with a
-non-zero status causes `git send-email` to abort before sending any
-e-mails.
+This hook is invoked by linkgit:git-send-email[1].
+
+It takes these command line arguments. They are,
+1. the name of the file which holds the contents of the email to be sent.
+2. The name of the file which holds the SMTP headers of the email.
+
+The SMTP headers are passed in the exact same way as they are passed to the
+user's Mail Transport Agent (MTA). In effect, the email given to the user's
+MTA, is the contents of $2 followed by the contents of $1.
+
+Below is an example for a few common headers. Take notice of the
+capitalization and multi-line tab structure.
+
+  From: Example <from@example.com>
+  To: to@example.com
+  Cc: cc@example.com,
+	  A <author@example.com>,
+	  One <one@example.com>,
+	  two@example.com
+  Subject: PATCH-STRING
+
+Exiting with a non-zero status causes `git send-email` to abort
+before sending any e-mails.
 
 fsmonitor-watchman
 ~~~~~~~~~~~~~~~~~~
diff --git a/git-send-email.perl b/git-send-email.perl
index 42f135a266..0e595d6ac5 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -785,16 +785,31 @@ sub is_format_patch_arg {
 	push @files, $repo->command('format-patch', '-o', File::Temp::tempdir(CLEANUP => 1), @rev_list_opts);
 }
 
-@files = handle_backup_files(@files);
+if (defined $sender) {
+	$sender =~ s/^\s+|\s+$//g;
+	($sender) = expand_aliases($sender);
+} else {
+	$sender = $repoauthor->() || $repocommitter->() || '';
+}
+
+# $sender could be an already sanitized address
+# (e.g. sendemail.from could be manually sanitized by user).
+# But it's a no-op to run sanitize_address on an already sanitized address.
+$sender = sanitize_address($sender);
+
+$time = time - scalar $#files;
 
 if ($validate) {
 	foreach my $f (@files) {
 		unless (-p $f) {
+		        pre_process_file($f, 1);
 			validate_patch($f, $target_xfer_encoding);
 		}
 	}
 }
 
+@files = handle_backup_files(@files);
+
 if (@files) {
 	unless ($quiet) {
 		print $_,"\n" for (@files);
@@ -1043,18 +1058,6 @@ sub file_declares_8bit_cte {
 	}
 }
 
-if (defined $sender) {
-	$sender =~ s/^\s+|\s+$//g;
-	($sender) = expand_aliases($sender);
-} else {
-	$sender = $repoauthor->() || $repocommitter->() || '';
-}
-
-# $sender could be an already sanitized address
-# (e.g. sendemail.from could be manually sanitized by user).
-# But it's a no-op to run sanitize_address on an already sanitized address.
-$sender = sanitize_address($sender);
-
 my $to_whom = __("To whom should the emails be sent (if anyone)?");
 my $prompting = 0;
 if (!@initial_to && !defined $to_cmd) {
@@ -1214,10 +1217,6 @@ sub make_message_id {
 	#print "new message id = $message_id\n"; # Was useful for debugging
 }
 
-
-
-$time = time - scalar $#files;
-
 sub unquote_rfc2047 {
 	local ($_) = @_;
 	my $charset;
@@ -2101,11 +2100,20 @@ sub validate_patch {
 			chdir($repo->wc_path() or $repo->repo_path())
 				or die("chdir: $!");
 			local $ENV{"GIT_DIR"} = $repo->repo_path();
+
+			my ($recipients_ref, $to, $date, $gitversion, $cc, $ccline, $header) = gen_header();
+
+			require File::Temp;
+			my ($header_filehandle, $header_filename) = File::Temp::tempfile(
+                            ".gitsendemail.header.XXXXXX", DIR => $repo->repo_path());
+			print $header_filehandle $header;
+
 			my @cmd = ("git", "hook", "run", "--ignore-missing",
 				    $hook_name, "--");
-			my @cmd_msg = (@cmd, "<patch>");
-			my @cmd_run = (@cmd, $target);
+			my @cmd_msg = (@cmd, "<patch>", "<header>");
+			my @cmd_run = (@cmd, $target, $header_filename);
 			$hook_error = system_or_msg(\@cmd_run, undef, "@cmd_msg");
+			unlink($header_filehandle);
 			chdir($cwd_save) or die("chdir: $!");
 		}
 		if ($hook_error) {
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 1130ef21b3..8a5c111a24 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -540,7 +540,7 @@ test_expect_success $PREREQ "--validate respects relative core.hooksPath path" '
 	test_path_is_file my-hooks.ran &&
 	cat >expect <<-EOF &&
 	fatal: longline.patch: rejected by sendemail-validate hook
-	fatal: command '"'"'git hook run --ignore-missing sendemail-validate -- <patch>'"'"' died with exit code 1
+	fatal: command '"'"'git hook run --ignore-missing sendemail-validate -- <patch> <header>'"'"' died with exit code 1
 	warning: no patches were sent
 	EOF
 	test_cmp expect actual
@@ -559,12 +559,35 @@ test_expect_success $PREREQ "--validate respects absolute core.hooksPath path" '
 	test_path_is_file my-hooks.ran &&
 	cat >expect <<-EOF &&
 	fatal: longline.patch: rejected by sendemail-validate hook
-	fatal: command '"'"'git hook run --ignore-missing sendemail-validate -- <patch>'"'"' died with exit code 1
+	fatal: command '"'"'git hook run --ignore-missing sendemail-validate -- <patch> <header>'"'"' died with exit code 1
 	warning: no patches were sent
 	EOF
 	test_cmp expect actual
 '
 
+test_expect_success $PREREQ "--validate hook supports header argument" '
+	write_script my-hooks/sendemail-validate <<-\EOF &&
+        if test "$#" -ge 2
+	then
+		grep "X-test-header: v1.0" "$2"
+	else
+		echo "No header arg passed"
+		exit 1
+	fi
+	EOF
+	test_config core.hooksPath "my-hooks" &&
+	rm -fr outdir &&
+	git format-patch \
+		--add-header="X-test-header: v1.0" \
+		-n HEAD^1 -o outdir &&
+	git send-email \
+		--dry-run \
+		--to=nobody@example.com \
+		--smtp-server="$(pwd)/fake.sendmail" \
+		--validate \
+		outdir/000?-*.patch
+'
+
 for enc in 7bit 8bit quoted-printable base64
 do
 	test_expect_success $PREREQ "--transfer-encoding=$enc produces correct header" '
-- 
2.34.1


^ permalink raw reply related

* [PATCH v9 0/2] send-email: expose header information to git-send-email's sendemail-validate hook
From: Michael Strawbridge @ 2023-01-20  1:24 UTC (permalink / raw)
  To: git; +Cc: michael.strawbridge

Thanks to Ævar for an idea to simplify these patches further.

Michael Strawbridge (2):
  send-email: refactor header generation functions
  send-email: expose header information to git-send-email's
    sendemail-validate hook

 Documentation/githooks.txt | 27 +++++++++--
 git-send-email.perl        | 95 +++++++++++++++++++++++---------------
 t/t9001-send-email.sh      | 27 ++++++++++-
 3 files changed, 106 insertions(+), 43 deletions(-)

-- 
2.34.1


^ permalink raw reply

* Re: Redirect isn't working:
From: brian m. carlson @ 2023-01-20  1:46 UTC (permalink / raw)
  To: Jack Adrian Zappa; +Cc: git-mailing-list
In-Reply-To: <CAKepmaibtbRsKqmiZEtDNoLHWr=JyZ7Fhang4dnmw5ROGmBQTQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1649 bytes --]

On 2023-01-19 at 18:06:17, Jack Adrian Zappa wrote:
> Here is the filled out form for submitting a bug report:
> 
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
>   At the command line, I typed the following command:
> 
>   $ git diff --no-index --word-diff --color=always <(od -An -t x1
> -w10000000000 file1.txt) <(od -An -t x1 -w10000000000 file2.txt)
> 
> What did you expect to happen? (Expected behavior)
> 
>   I expected to see the diff of the two files.
> 
> What happened instead? (Actual behavior)
> 
>   Instead I saw this:
> 
>   error: Could not access '/proc/1961/fd/63'
> 
> What's different between what you expected and what actually happened?
> 
>   The difference is that I would get the difference between the two
> files as opposed to a critical failure message.

Thanks for the report.  This has come up before (and I attempted a
patch), but the ultimate problem here is that the /proc/PID/fd/* files
are symlinks on Linux, and instead of dereferencing symlinks, Git tries
to read them and print where they point to.  That doesn't work in this
case because the symlinks are symlinks to anonymous pipes, which can't
be resolved.

On other systems, the behaviour may be different because those files may
be represented differently.

My patch attempted to solve this by simply opening the contents instead
of stat'ing them as normal, but I think we found some edge cases and it
didn't get finished.  Note that this behaviour is what plain diff does
because it doesn't handle symlinks at all.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply

* Re: [PATCH] rebase: mark --update-refs as requiring the merge backend
From: Elijah Newren @ 2023-01-20  1:54 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git
In-Reply-To: <f480813c-7583-179f-0149-d970d3f2519f@github.com>

On Thu, Jan 19, 2023 at 1:47 PM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 1/19/2023 12:36 AM, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > --update-refs is built in terms of the sequencer, which requires the
> > merge backend.  It was already marked as incompatible with the apply
> > backend in the git-rebase manual, but the code didn't check for this
> > incompatibility and warn the user.  Check and warn now.
>
> Thank you for submitting this version.
>
> > @@ -1514,6 +1514,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> >               }
> >       }
> >
> > +     if (options.update_refs)
> > +             imply_merge(&options, "--update-refs");
> > +
>
> This solution is very elegant. The only downside is the lack of warning
> if --update-refs was implied by rebase.updateRefs=true, but I'm happy to
> delay implementing that warning in favor of your complete solution here.
>
> Thinking ahead to that option, are there other options that are implied
> by config that are required in imply_merge()? Is --autosquash in that
> camp? If so, then maybe it would make sense to expand imply_merge() to
> include a boolean config key and supply that warning within its
> implementation. (This consideration should not block this patch, as it
> is complete as-is.)

Ooh, that does sound like a useful future idea to improve things further.

> > While at it, fix a typo in t3422...and fix some misleading wording (all
> > useful options other than --whitespace=fix have long since been
> > implemented in the merge backend).
>
> >  #
> > -# Rebase has lots of useful options like --whitepsace=fix, which are
> > -# actually all built in terms of flags to git-am.  Since neither
> > -# --merge nor --interactive (nor any options that imply those two) use
> > -# git-am, using them together will result in flags like --whitespace=fix
> > -# being ignored.  Make sure rebase warns the user and aborts instead.
> > +# Rebase has a useful option, --whitespace=fix, which is actually
> > +# built in terms of flags to git-am.  Since neither --merge nor
> > +# --interactive (nor any options that imply those two) use git-am,
> > +# using them together will result in --whitespace=fix being ignored.
> > +# Make sure rebase warns the user and aborts instead.
> >  #
>
> Thanks for the update here. The -C option is also used in this test,
> so --whitespace=fix isn't the only option, right? At least, -C doesn't
> make sense to port over to the merge backend, so maybe that's what
> you mean by --whitespace=fix being the last one?

Sorry if it was confusing.  I was trying to figure out how to word it
to remove the old confusion, and actually spent a while thinking about
this point you brought up...but after a while gave up and just
submitted my patch because it was taking too long and I didn't think
the -C option was worth as much brain power as has been spent on it.
But since you noted the incongruity, let me explain my thought process
a little...

* I stated that --whitespace=... was the last *useful* option, not
that it was the last option.  Yes, that was an implicit assertion that
-C is not useful, though I avoided stating it directly for fear I'd
have to dig up proof and/or spend a bunch of time trying to reproduce
an old bug and debug it.

* The correct way to port the '-C' option to the merge backend would
probably be to just accept the flag and ignore it.  The merge backend
already functions with entire files or essentially infinite context.
I can't think of any case where ignoring this option would result in
negative surprises, other than possibly confusing people about how the
algorithm behaves and making them decide to tweak it to no avail.  And
even if users were to waste time twiddling that flag in combination
with the merge backend, even that might be a correct "port" of the
existing -C option because...

* I once tried to use the '-C' option with the apply backend to see if
it could workaround a bug inherent to the apply backend (where people
have a source file with several very similar code blocks, and rebase
or am applying changes to the wrong one due to fuzz factor and large
deletions/insertions having happened upstream elsewhere in the file).
It appeared to have absolutely no effect.  I suspected it was buggy,
but didn't feel the option was worth debugging (I thought switching
the default rebase backend was a much better use of time).

* We have *zero* tests of the functionality of rebase's -C option in
the testsuite.  The only testing we do for rebase's -C option is
solely around option parsing.

....

And you inspired me to do some digging.  rebase -C was introduced with
67dad687ad ("add -C[NUM] to git-am", 2007-02-08).  Based on feedback
on the patch, the author added the -C option not just to git-am but
also to git-rebase.  However, the way it was added to rebase was to
just pass it through to git-am, with no corresponding option to
format-patch.  Therefore, format-patch is going to continue generating
patches with 3 lines of context, while we tell git-am/git-apply to pay
attention to more than 3 lines of context.  The -C<num> option is
documented as using all available context if <num> exceeds the number
of lines of context provided in the input patches.  So, the -C option
to rebase has never done anything useful, and the port from the legacy
rebase script to the builtin C did not accidentally introduce any
modicum of utility to this option; rather, it faithfully ported this
pointless option precisely as-is.

So, I'll adjust this series to include a preliminary patch that rips
the rebase -C option out.

> The user could also explicitly request the apply backend with --apply,
> but this test script doesn't check it, strangely. That seems like an
> oversight, but not a drawback to your patch.
>
> >  test_rebase_am_only () {
> > @@ -60,6 +60,11 @@ test_rebase_am_only () {
> >               test_must_fail git rebase $opt --exec 'true' A
> >       "
> >
> > +     test_expect_success "$opt incompatible with --update-refs" "
> > +             git checkout B^0 &&
> > +             test_must_fail git rebase $opt --update-refs A
> > +     "
> > +
>
> Thanks for adding this test. I would delay the rebase.updateRefs
> version until there is extra behavior to check, such as the
> warning message.
>
> Thanks,
> -Stolee

^ permalink raw reply

* [CI]: Is t7527 known to be flakey?
From: Junio C Hamano @ 2023-01-20  2:52 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: git

The said test failed its linux-musl job in its first attempt, but
re-running the failed job passed.

    https://github.com/git/git/actions/runs/3963948890/jobs/6792356234
    (seen@e096683 attempt #1 linux-musl)

    https://github.com/git/git/actions/runs/3963948890/jobs/6792850313
    (seen@e096683 attempt #2 linux-musl)


^ permalink raw reply

* Re: [PATCH v3] checkout/switch: disallow checking out same branch in multiple worktrees
From: Junio C Hamano @ 2023-01-20  3:10 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón
  Cc: git, pclouds, Jinwook Jeong, Rubén Justo, Eric Sunshine
In-Reply-To: <20230119055325.1013-1-carenas@gmail.com>

Merging this on top of 'seen' that used to pass the CI tests

    https://github.com/git/git/actions/runs/3963948890

now fails many tests

    https://github.com/git/git/actions/runs/3964312300

Funny thing is that nothing fails locally for me on x86_64 Linux
with gcc-12.  Anything rings a bell?

Thanks.

^ permalink raw reply

* Re: [PATCH v3] checkout/switch: disallow checking out same branch in multiple worktrees
From: Carlo Arenas @ 2023-01-20  3:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, pclouds, Jinwook Jeong, Rubén Justo, Eric Sunshine
In-Reply-To: <xmqqa62dzyst.fsf@gitster.g>

I think it only fails its own test, and I can't reproduce locally
either with macOS and clang.

I was going to propose a newer version but it seems to be similarly affected:

  https://github.com/carenas/git/actions/runs/3964409976

It will be better to drop it for now; will produce one without the
heisenbug, somehow

Carlo

^ permalink raw reply

* Re: [PATCH v4] sparse-checkout.txt: new document with sparse-checkout directions
From: Elijah Newren @ 2023-01-20  4:30 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Elijah Newren via GitGitGadget, git, Victoria Dye, Derrick Stolee,
	Shaoxuan Yuan, Matheus Tavares, Glen Choo, Martin von Zweigbergk
In-Reply-To: <CAOLTT8SCtHD1KeLdRQ9cJNVm-uAae+=ii6L6r4DYDb4Lr4miSQ@mail.gmail.com>

On Sat, Jan 14, 2023 at 2:19 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> Elijah Newren <newren@gmail.com> 于2022年11月16日周三 13:49写道：
>
> > [...]
> > > > +    The fact that files can move between the 'tracked' and 'untracked'
> > > > +    categories means some commands will have to treat untracked files
> > > > +    differently.  But if we have to treat untracked files differently,
> > > > +    then additional commands may also need changes:
> > > > +
> > > > +      * status
> > > > +      * clean
> > > > +
> > >
> > > I'm a bit worried about git status, because it's used in many shells
> > > (e.g. zsh) i
> > > in the git prompt function. Its default behavior is restricted, otherwise users
> > > may get blocked when they use zsh to cd to that directory. I don't know how
> > > to reproduce this problem (since the scenario is built on checkout to a local
> > > unborn branch).
> >
> > Could you elaborate?  I'm not sure if you are talking about an
> > existing problem that you are worried about being exacerbated, or a
> > hypothetical problem that could occur with changes.  Further, your
> > wording is so vague about the problem, that I have no idea what its
> > nature is or whether any changes to status would even possibly have
> > any bearing on it.  But the suggested changes to git status are
> > simply:
> >
>
> I find this special case, it will fetch some blobs when "git status".
> First, we init a git repository, then set sparse specification to "*.js" with
> no-cone mode, then use blob:none filter to fetch all commits and trees,
> and finally checkout to the default branch.
>
> #!/bin/sh
>
> rm -rf sparse-checkout-example
> git init sparse-checkout-example
> git -C sparse-checkout-example remote add origin
> git@github.com:derrickstolee/sparse-checkout-example.git
> git -C sparse-checkout-example sparse-checkout set --no-cone *.js
> git -C sparse-checkout-example fetch origin --filter=blob:none main
> git -C sparse-checkout-example branch --track main origin/main
> git -C sparse-checkout-example checkout main
>
> Then let's do a git status, which some zsh git plugin
> will do when user "cd" the git repository.
>
> # git -C sparse-checkout-exmaple status
>
> remote: Enumerating objects: 1, done.
> remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 1
> Receiving objects: 100% (1/1), 416 bytes | 416.00 KiB/s, done.
> remote: Enumerating objects: 1, done.
> remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 1
> Receiving objects: 100% (1/1), 160 bytes | 160.00 KiB/s, done.
> remote: Enumerating objects: 1, done.
> remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 1
> Receiving objects: 100% (1/1), 2.01 KiB | 2.01 MiB/s, done.
> remote: Enumerating objects: 1, done.
> remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 1
> Receiving objects: 100% (1/1), 120 bytes | 120.00 KiB/s, done.
> On branch main
> Your branch is up to date with 'origin/main'.
>
> You are in a sparse checkout with 8% of tracked files present.
>
>
> It took 17.00 seconds to enumerate untracked files. 'status -uno'
> may speed it up, but you have to be careful not to forget to add
> new files yourself (see 'git help status').
> nothing to commit, working tree clean
>
> Yeah, here it fetches four blobs, and takes 17s!!!
>
> So what blobs are we fetching?
>
> GIT_TRACE_PACKET=1 git -C sparse-checkout-example status
> ...
> 18:02:32.989231 pkt-line.c:80           packet:        fetch> want
> dff85a65c0ef4b50a4c01bdd4a247b974bc45f90
> ...
> 18:02:37.059203 pkt-line.c:80           packet:        fetch> want
> f07ead02d13f62414589b1f1b891bb6a764ec91f
> ...
> 18:02:40.868899 pkt-line.c:80           packet:        fetch> want
> 3c4efe206bd0e7230ad0ae8396a3c883c8207906
> ...
> 18:02:44.961809 pkt-line.c:80           packet:        fetch> want
> 6590681af7e177dc71fe08648c4bbf4223b82866
>
>
> Then let's we look what's the blob:
> git log --find-object=dff85a65c0ef4b50a4c01bdd4a247b974bc45f90 --stat
> commit 8ec229339caad56eb849c67361a9699004564177
> Author: Derrick Stolee <dstolee@microsoft.com>
> Date:   Mon Dec 30 13:30:27 2019 -0500
>
>     Add twbs/bootstrap
>
>  web/browser/.gitignore | 47 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
>
> git log --find-object=f07ead02d13f62414589b1f1b891bb6a764ec91f --stat
> commit 9c5a31030de62355410a322923e33e90a00032f6
> Author: Derrick Stolee <dstolee@microsoft.com>
> Date:   Mon Dec 30 13:31:06 2019 -0500
>
>     Add artsy/artsy.github.io
>
>  web/editor/.gitignore | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> Yeah, it seems that git status fetch these .gitigore files.
> So what's the wrong here?

Nothing; this looks exactly as I'd expect.

`git status` is supposed to check whether untracked files are ignored
or not, and it needs .gitignore files to do so.  If an entire
directory is missing, then it can avoid loading a .gitignore under
that directory, but you set up your patterns such that no directory is
likely to be excluded.

Maybe there are other special cases that one could attempt to handle
(e.g. first check if there are any untracked files in a directory and
only then check for which are ignored, but that'd probably take some
big refactoring of some hairy dir.c code to accomplish something like
that, and you'd have to be really careful to not regress the
performance of non-sparse cases).  Personally, I don't think it's
worth it.  If you really don't like it, I'd suggest choosing any one
of the following ways to avoid it:

  * Include the .gitignore files in your sparse-checkout; might as
well since you're going to get them anyway.
  * Set status.showUntrackedFiles to `no` so that .gitignore files are
irrelevant
  * Use cone mode instead of non-cone mode (and yes, restructure your
repo if needed)
  * Remove the .gitignore files and commit their deletion.  Just do
without them, somehow.

^ permalink raw reply

* Re: [PATCH v3] checkout/switch: disallow checking out same branch in multiple worktrees
From: Carlo Arenas @ 2023-01-20  4:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, pclouds, Jinwook Jeong, Rubén Justo, Eric Sunshine
In-Reply-To: <CAPUEsph5phu8z=nYJt=zRh7q40-BwKuzhVQ-F=iLQppd9gR5hA@mail.gmail.com>

I should have known; the test is affected by the bug Rubén is trying to fix in:

  https://lore.kernel.org/git/eeb0c778-af0a-235c-f009-bca3abafdb15@gmail.com/

Where the order that the worktrees are found is not deterministic and
so the ignore_current_worktree flag to die_if_checked_out() might
trigger incorrectly.

Carlo

^ permalink raw reply

* [PATCH v2 0/2] rebase: mark --update-refs as requiring the merge backend
From: Elijah Newren via GitGitGadget @ 2023-01-20  4:56 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Elijah Newren
In-Reply-To: <pull.1466.git.1674106587550.gitgitgadget@gmail.com>

--update-refs is built in terms of the sequencer, which requires the merge
backend. It was already marked as incompatible with the apply backend in the
git-rebase manual, but the code didn't check for this incompatibility and
warn the user. Check and warn now.

While at it, fix a typo in t3422...and fix some misleading wording (all
useful options other than --whitespace=fix have long since been implemented
in the merge backend).

Changes since v1:

 * Add a patch nuking the -C option to rebase (fixes confusion around the
   comment in t3422 and acknowledges the fact that the option is totally and
   utterly useless and always has been. It literally never affects the
   results of a rebase.)

Signed-off-by: Elijah Newren newren@gmail.com

Elijah Newren (2):
  rebase: remove completely useless -C option
  rebase: mark --update-refs as requiring the merge backend

 Documentation/git-rebase.txt           |  9 ---------
 builtin/rebase.c                       | 12 ++++--------
 t/t3406-rebase-message.sh              |  7 -------
 t/t3422-rebase-incompatible-options.sh | 16 ++++++++++------
 4 files changed, 14 insertions(+), 30 deletions(-)


base-commit: 2b4f5a4e4bb102ac8d967cea653ed753b608193c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1466%2Fnewren%2Frebase-update-refs-imply-merge-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1466/newren/rebase-update-refs-imply-merge-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1466

Range-diff vs v1:

 -:  ----------- > 1:  a0f8f5fac1c rebase: remove completely useless -C option
 1:  f7459f0996b = 2:  2e44d0b7e57 rebase: mark --update-refs as requiring the merge backend

-- 
gitgitgadget

^ permalink raw reply

* [PATCH v2 1/2] rebase: remove completely useless -C option
From: Elijah Newren via GitGitGadget @ 2023-01-20  4:56 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Elijah Newren, Elijah Newren
In-Reply-To: <pull.1466.v2.git.1674190573.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

The `-C` option to rebase was introduced with 67dad687ad ("add -C[NUM]
to git-am", 2007-02-08).  Based on feedback on the patch, the author
added the -C option not just to git-am but also to git-rebase.  But did
it such that the option was just passed through to git-am (which passes
it along to git-apply), with no corresponding option to format-patch.

As per the git-apply documentation for the `-C` option:
    Ensure at least <n> lines of surrounding context match...When fewer
    lines of surrounding context exist they all must match.

The fact that format-patch was not passed a -U option to increase the
number of context lines meant that there would still only be 3 lines of
context to match on.  So, anyone attempting to use this option in
git-rebase would just be spinning their wheels and wasting time.  I was
one such user a number of years ago.

Since this option can at best waste users time and is nothing more than
a confusing no-op, and has never been anything other than a confusing
no-op, and no one has ever bothered to create a testcase for it that
goes beyond option parsing, simply excise the option.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/git-rebase.txt           | 9 ---------
 builtin/rebase.c                       | 9 +--------
 t/t3406-rebase-message.sh              | 7 -------
 t/t3422-rebase-incompatible-options.sh | 1 -
 4 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index f9675bd24e6..412887deda7 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -416,14 +416,6 @@ include::rerere-options.txt[]
 	Allows the pre-rebase hook to run, which is the default.  This option can
 	be used to override `--no-verify`.  See also linkgit:githooks[5].
 
--C<n>::
-	Ensure at least `<n>` lines of surrounding context match before
-	and after each change.  When fewer lines of surrounding
-	context exist they all must match.  By default no context is
-	ever ignored.  Implies `--apply`.
-+
-See also INCOMPATIBLE OPTIONS below.
-
 --no-ff::
 --force-rebase::
 -f::
@@ -631,7 +623,6 @@ The following options:
 
  * --apply
  * --whitespace
- * -C
 
 are incompatible with the following options:
 
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 1481c5b6a5b..ace8bd4a41c 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1067,8 +1067,6 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			 N_("ignore author date and use current date")),
 		OPT_HIDDEN_BOOL(0, "ignore-date", &options.ignore_date,
 				N_("synonym of --reset-author-date")),
-		OPT_PASSTHRU_ARGV('C', NULL, &options.git_am_opts, N_("n"),
-				  N_("passed to 'git apply'"), 0),
 		OPT_BOOL(0, "ignore-whitespace", &ignore_whitespace,
 			 N_("ignore changes in whitespace")),
 		OPT_PASSTHRU_ARGV(0, "whitespace", &options.git_am_opts,
@@ -1390,12 +1388,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		if (!strcmp(option, "--whitespace=fix") ||
 		    !strcmp(option, "--whitespace=strip"))
 			allow_preemptive_ff = 0;
-		else if (skip_prefix(option, "-C", &p)) {
-			while (*p)
-				if (!isdigit(*(p++)))
-					die(_("switch `C' expects a "
-					      "numerical value"));
-		} else if (skip_prefix(option, "--whitespace=", &p)) {
+		else if (skip_prefix(option, "--whitespace=", &p)) {
 			if (*p && strcmp(p, "warn") && strcmp(p, "nowarn") &&
 			    strcmp(p, "error") && strcmp(p, "error-all"))
 				die("Invalid whitespace option: '%s'", p);
diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
index ceca1600053..c8dca845dd7 100755
--- a/t/t3406-rebase-message.sh
+++ b/t/t3406-rebase-message.sh
@@ -78,13 +78,6 @@ test_expect_success 'rebase --onto outputs the invalid ref' '
 	test_i18ngrep "invalid-ref" err
 '
 
-test_expect_success 'error out early upon -C<n> or --whitespace=<bad>' '
-	test_must_fail git rebase -Cnot-a-number HEAD 2>err &&
-	test_i18ngrep "numerical value" err &&
-	test_must_fail git rebase --whitespace=bad HEAD 2>err &&
-	test_i18ngrep "Invalid whitespace option" err
-'
-
 write_reflog_expect () {
 	if test $mode = --apply
 	then
diff --git a/t/t3422-rebase-incompatible-options.sh b/t/t3422-rebase-incompatible-options.sh
index 6dabb05a2ad..ebcbd79ab8a 100755
--- a/t/t3422-rebase-incompatible-options.sh
+++ b/t/t3422-rebase-incompatible-options.sh
@@ -63,6 +63,5 @@ test_rebase_am_only () {
 }
 
 test_rebase_am_only --whitespace=fix
-test_rebase_am_only -C4
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v2 2/2] rebase: mark --update-refs as requiring the merge backend
From: Elijah Newren via GitGitGadget @ 2023-01-20  4:56 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Elijah Newren, Elijah Newren
In-Reply-To: <pull.1466.v2.git.1674190573.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

--update-refs is built in terms of the sequencer, which requires the
merge backend.  It was already marked as incompatible with the apply
backend in the git-rebase manual, but the code didn't check for this
incompatibility and warn the user.  Check and warn now.

While at it, fix a typo in t3422...and fix some misleading wording (all
useful options other than --whitespace=fix have long since been
implemented in the merge backend).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/rebase.c                       |  3 +++
 t/t3422-rebase-incompatible-options.sh | 15 ++++++++++-----
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index ace8bd4a41c..e8bcdbf9fcd 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1507,6 +1507,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	if (options.update_refs)
+		imply_merge(&options, "--update-refs");
+
 	if (options.type == REBASE_UNSPECIFIED) {
 		if (!strcmp(options.default_backend, "merge"))
 			imply_merge(&options, "--merge");
diff --git a/t/t3422-rebase-incompatible-options.sh b/t/t3422-rebase-incompatible-options.sh
index ebcbd79ab8a..d72c863b21b 100755
--- a/t/t3422-rebase-incompatible-options.sh
+++ b/t/t3422-rebase-incompatible-options.sh
@@ -25,11 +25,11 @@ test_expect_success 'setup' '
 '
 
 #
-# Rebase has lots of useful options like --whitepsace=fix, which are
-# actually all built in terms of flags to git-am.  Since neither
-# --merge nor --interactive (nor any options that imply those two) use
-# git-am, using them together will result in flags like --whitespace=fix
-# being ignored.  Make sure rebase warns the user and aborts instead.
+# Rebase has a useful option, --whitespace=fix, which is actually
+# built in terms of flags to git-am.  Since neither --merge nor
+# --interactive (nor any options that imply those two) use git-am,
+# using them together will result in --whitespace=fix being ignored.
+# Make sure rebase warns the user and aborts instead.
 #
 
 test_rebase_am_only () {
@@ -60,6 +60,11 @@ test_rebase_am_only () {
 		test_must_fail git rebase $opt --exec 'true' A
 	"
 
+	test_expect_success "$opt incompatible with --update-refs" "
+		git checkout B^0 &&
+		test_must_fail git rebase $opt --update-refs A
+	"
+
 }
 
 test_rebase_am_only --whitespace=fix
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2 1/2] docs: fix sparse-checkout docs link
From: Elijah Newren @ 2023-01-20  5:12 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Derrick Stolee, Victoria Dye,
	Nguyễn Thái Ngọc Duy, ZheNing Hu
In-Reply-To: <cde4827da13a1391501057ee42655836f0ccd337.1674149666.git.gitgitgadget@gmail.com>

On Thu, Jan 19, 2023 at 9:34 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> There is a linking error when other documents want to refer to
> technical/sparse-checkout.txt, Also, there is something wrong
> with the format of the paragraphs in sparse-checkout.txt, which
> makes acsiidoc compile errors.
>
> So fix the format of sparse-checkout.txt, and link it in the
> Makefile to correct that.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/Makefile                      |  1 +
>  Documentation/technical/sparse-checkout.txt | 43 ++++++++++++++-------
>  2 files changed, 30 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 9c67c3a1c50..7540da27b8c 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -121,6 +121,7 @@ TECH_DOCS += technical/reftable
>  TECH_DOCS += technical/scalar
>  TECH_DOCS += technical/send-pack-pipeline
>  TECH_DOCS += technical/shallow
> +TECH_DOCS += technical/sparse-checkout
>  TECH_DOCS += technical/trivial-merge
>  SP_ARTICLES += $(TECH_DOCS)
>  SP_ARTICLES += technical/api-index
> diff --git a/Documentation/technical/sparse-checkout.txt b/Documentation/technical/sparse-checkout.txt
> index fa0d01cbda4..52e86764a6c 100644
> --- a/Documentation/technical/sparse-checkout.txt
> +++ b/Documentation/technical/sparse-checkout.txt
> @@ -1,3 +1,6 @@
> +Sparse Checkout Design Notes
> +============================
> +
>  Table of contents:
>
>    * Terminology
> @@ -14,7 +17,8 @@ Table of contents:
>    * Reference Emails
>
>
> -=== Terminology ===
> +Terminology
> +-----------
>
>  cone mode: one of two modes for specifying the desired subset of files
>         in a sparse-checkout.  In cone-mode, the user specifies
> @@ -92,7 +96,8 @@ vivifying: When a command restores a tracked file to the working tree (and
>         file), this is referred to as "vivifying" the file.
>
>
> -=== Purpose of sparse-checkouts ===
> +Purpose of sparse-checkouts
> +---------------------------
>
>  sparse-checkouts exist to allow users to work with a subset of their
>  files.
> @@ -255,7 +260,8 @@ will perceive the checkout as dense, and commands should thus behave as if
>  all files are present.
>
>
> -=== Usecases of primary concern ===
> +Usecases of primary concern
> +---------------------------
>
>  Most of the rest of this document will focus on Behavior A and Behavior
>  B.  Some notes about the other two cases and why we are not focusing on
> @@ -300,7 +306,8 @@ Behavior C do not assume they are part of the Behavior B camp and propose
>  patches that break things for the real Behavior B folks.
>
>
> -=== Oversimplified mental models ===
> +Oversimplified mental models
> +----------------------------
>
>  An oversimplification of the differences in the above behaviors is:
>
> @@ -313,7 +320,8 @@ An oversimplification of the differences in the above behaviors is:
>               they can later lazily be populated instead.
>
>
> -=== Desired behavior ===
> +Desired behavior
> +----------------
>
>  As noted previously, despite the simple idea of just working with a subset
>  of files, there are a range of different behavioral changes that need to be
> @@ -544,7 +552,8 @@ understanding these differences can be beneficial.
>    * gitk?
>
>
> -=== Behavior classes ===
> +Behavior classes
> +----------------
>
>  From the above there are a few classes of behavior:
>
> @@ -611,7 +620,8 @@ From the above there are a few classes of behavior:
>      specification.
>
>
> -=== Subcommand-dependent defaults ===
> +Subcommand-dependent defaults
> +-----------------------------
>
>  Note that we have different defaults depending on the command for the
>  desired behavior :
> @@ -751,7 +761,8 @@ desired behavior :
>      implemented.
>
>
> -=== Sparse specification vs. sparsity patterns ===
> +Sparse specification vs. sparsity patterns
> +------------------------------------------
>
>  In a well-behaved situation, the sparse specification is given directly
>  by the $GIT_DIR/info/sparse-checkout file.  However, it can transiently
> @@ -823,7 +834,8 @@ under behavior B index operations are lumped with history and tend to
>  operate full-tree.
>
>
> -=== Implementation Questions ===
> +Implementation Questions
> +------------------------
>
>    * Do the options --scope={sparse,all} sound good to others?  Are there better
>      options?
> @@ -894,7 +906,8 @@ operate full-tree.
>      is seamless for them.
>
>
> -=== Implementation Goals/Plans ===
> +Implementation Goals/Plans
> +--------------------------
>
>   * Get buy-in on this document in general.
>
> @@ -922,13 +935,14 @@ operate full-tree.
>       commands.  IMPORTANT: make sure diff machinery changes don't mess with
>       format-patch, fast-export, etc.
>
> -=== Known bugs ===
> +Known bugs
> +----------

Everything so far makes sense.

>  This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've
>  been working on it.
>
> -0. Behavior A is not well supported in Git.  (Behavior B didn't used to
> -   be either, but was the easier of the two to implement.)
> +Behavior A is not well supported in Git.  (Behavior B didn't used to
> +be either, but was the easier of the two to implement.)

Why did you remove the numbering on this, though?  If asciidoc doesn't
like starting with 0 (the only guess I can think of for why you'd
change this), shouldn't the series be renumbered starting at 1 rather
than removing this from the list?

>  1. am and apply:
>
> @@ -1052,7 +1066,8 @@ been working on it.
>      https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/
>
>
> -=== Reference Emails ===
> +Reference Emails
> +----------------
>
>  Emails that detail various bugs we've had in sparse-checkout:
>
> --
> gitgitgadget

^ permalink raw reply

* Re: [PATCH v2 2/2] ls-files: add %(skipworktree) atom to format option
From: Elijah Newren @ 2023-01-20  5:30 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Derrick Stolee, Victoria Dye,
	Nguyễn Thái Ngọc Duy, ZheNing Hu
In-Reply-To: <9ebd6b77a69be414388a52a482912173f2a4e7d8.1674149666.git.gitgitgadget@gmail.com>

On Thu, Jan 19, 2023 at 9:34 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Because sometimes we want to check if the files in the
> index match the sparse specification, so introduce
> "%(skipworktree)" atom to git ls-files `--format` option.
> When we use this option, if the file match the sparse
> specification, it will output "1", otherwise, output
> empty string "".

Why is that output format useful?  It seems like it'll just lead to
bugs, or someone re-implementing the same field with a different name
to make it useful in the future.  In particular, if there are multiple
boolean fields and someone specifies e.g.
   git ls-files --format="%(path) %(skipworktree) %(intentToAdd)"
and both boolean fields are displayed the same way (either a "1" or a
blank string), and we see something like:
   foo.c 1
   bar.c 1
Then how do we know if foo.c and bar.c are SKIP_WORKTREE or
INTENT_TO_ADD?  The "1" could have come from either field.

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-ls-files.txt |  5 +++++
>  builtin/ls-files.c             |  3 +++
>  t/t3013-ls-files-format.sh     | 23 +++++++++++++++++++++++
>  3 files changed, 31 insertions(+)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 440043cdb8e..2540b404808 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -260,6 +260,11 @@ eolattr::
>         that applies to the path.
>  path::
>         The pathname of the file which is recorded in the index.
> +skipworktree::
> +       If the file in the index marked with SKIP_WORKTREE bit.
> +       It means the file do not match the sparse specification.
> +       See link:technical/sparse-checkout.txt[sparse-checkout]
> +       for more information.

minor nits: Missing an "is", and "do" should be "does".

I'm curious whether the second sentence is even necessary; we've
already got the link to the more technical docs.  Perhaps just:

skipworktree::
    Whether the file in the index has the SKIP_WORKTREE bit set.
    See link:technical/sparse-checkout.txt[sparse-checkout]
    for more information.

>  EXCLUDE PATTERNS
>  ----------------
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index a03b559ecaa..bbff868ae6b 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -280,6 +280,9 @@ static size_t expand_show_index(struct strbuf *sb, const char *start,
>                               data->pathname));
>         else if (skip_prefix(start, "(path)", &p))
>                 write_name_to_buf(sb, data->pathname);
> +       else if (skip_prefix(start, "(skipworktree)", &p))
> +               strbuf_addstr(sb, ce_skip_worktree(data->ce) ?
> +                             "1" : "");

As I mentioned in response to the commit message, I don't understand
why having an empty string would be desirable.

>         else
>                 die(_("bad ls-files format: %%%.*s"), (int)len, start);
>
> diff --git a/t/t3013-ls-files-format.sh b/t/t3013-ls-files-format.sh
> index efb7450bf1e..cd35dba5930 100755
> --- a/t/t3013-ls-files-format.sh
> +++ b/t/t3013-ls-files-format.sh
> @@ -92,4 +92,27 @@ test_expect_success 'git ls-files --format with --debug' '
>         test_cmp expect actual
>  '
>
> +test_expect_success 'git ls-files --format with skipworktree' '
> +       test_when_finished "git sparse-checkout disable" &&
> +       mkdir dir1 dir2 &&
> +       echo "file1" >dir1/file1.txt &&
> +       echo "file2" >dir2/file2.txt &&
> +       git add dir1 dir2 &&
> +       git commit -m skipworktree &&
> +       git sparse-checkout set dir1 &&
> +       git ls-files --format="%(path)%(skipworktree)" >actual &&
> +       cat >expect <<-\EOF &&
> +       dir1/file1.txt
> +       dir2/file2.txt1
> +       o1.txt
> +       o2.txt
> +       o3.txt
> +       o4.txt
> +       o5.txt
> +       o6.txt
> +       o7.txt
> +       EOF
> +       test_cmp expect actual
> +'

I find this test hard to read; it's just too easy to miss
"dir2/file2.txt1" vs "dir2/file2.txt".  I'd suggest at least adding a
space, and likely having the skipworktree attribute come first in the
format string.  It might also be useful to add "dir*" on the ls-files
command to limit which paths are shown, just because there's an awful
lot of files in the root directory and no variance between them, and
it's easier to notice the binary difference between two items than
having a full 9 and figuring out which are relevant.

^ permalink raw reply

* Re: [PATCH v2 1/2] rebase: remove completely useless -C option
From: Eric Sunshine @ 2023-01-20  5:40 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Derrick Stolee, Elijah Newren
In-Reply-To: <a0f8f5fac1c3f79cd46b943e95636728677dffef.1674190573.git.gitgitgadget@gmail.com>

On Fri, Jan 20, 2023 at 12:14 AM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> The `-C` option to rebase was introduced with 67dad687ad ("add -C[NUM]
> to git-am", 2007-02-08).  Based on feedback on the patch, the author
> added the -C option not just to git-am but also to git-rebase.  But did
> it such that the option was just passed through to git-am (which passes
> it along to git-apply), with no corresponding option to format-patch.
>
> As per the git-apply documentation for the `-C` option:
>     Ensure at least <n> lines of surrounding context match...When fewer
>     lines of surrounding context exist they all must match.
>
> The fact that format-patch was not passed a -U option to increase the
> number of context lines meant that there would still only be 3 lines of
> context to match on.  So, anyone attempting to use this option in
> git-rebase would just be spinning their wheels and wasting time.  I was
> one such user a number of years ago.
>
> Since this option can at best waste users time and is nothing more than
> a confusing no-op, and has never been anything other than a confusing
> no-op, and no one has ever bothered to create a testcase for it that
> goes beyond option parsing, simply excise the option.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

Is there a chance that this patch could break some existing tooling or
someone's workflow or alias? If so, perhaps it would be better to
continue recognizing it, but as a proper no-op? That is:

(1) continue accepting the option but mark it PARSE_OPT_HIDDEN or some such

(2) completely ignore the option

(3) in case someone runs across it in some existing script or example
and wants to know what it does, leave one mention of it in the
documentation, such as:

    -C<n>
        Does nothing. This option is accepted for backward compatibility
        only. Do not use.

^ permalink raw reply

* Re: [PATCH v2 1/2] rebase: remove completely useless -C option
From: Elijah Newren @ 2023-01-20  6:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Elijah Newren via GitGitGadget, git, Derrick Stolee
In-Reply-To: <CAPig+cQj4Gi+ikkPz3wffqGZVz5uALGzYV25nio3k4+cthP9Uw@mail.gmail.com>

On Thu, Jan 19, 2023 at 9:40 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Fri, Jan 20, 2023 at 12:14 AM Elijah Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > The `-C` option to rebase was introduced with 67dad687ad ("add -C[NUM]
> > to git-am", 2007-02-08).  Based on feedback on the patch, the author
> > added the -C option not just to git-am but also to git-rebase.  But did
> > it such that the option was just passed through to git-am (which passes
> > it along to git-apply), with no corresponding option to format-patch.
> >
> > As per the git-apply documentation for the `-C` option:
> >     Ensure at least <n> lines of surrounding context match...When fewer
> >     lines of surrounding context exist they all must match.
> >
> > The fact that format-patch was not passed a -U option to increase the
> > number of context lines meant that there would still only be 3 lines of
> > context to match on.  So, anyone attempting to use this option in
> > git-rebase would just be spinning their wheels and wasting time.  I was
> > one such user a number of years ago.
> >
> > Since this option can at best waste users time and is nothing more than
> > a confusing no-op, and has never been anything other than a confusing
> > no-op, and no one has ever bothered to create a testcase for it that
> > goes beyond option parsing, simply excise the option.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
>
> Is there a chance that this patch could break some existing tooling or
> someone's workflow or alias?  If so, perhaps it would be better to
> continue recognizing it, but as a proper no-op? That is:
>
> (1) continue accepting the option but mark it PARSE_OPT_HIDDEN or some such
>
> (2) completely ignore the option
>
> (3) in case someone runs across it in some existing script or example
> and wants to know what it does, leave one mention of it in the
> documentation, such as:
>
>     -C<n>
>         Does nothing. This option is accepted for backward compatibility
>         only. Do not use.

If an option becomes a no-op over time due to other changes, this
would likely be the route I'd lean towards.  I'm just not sure it's
merited in this case.

We already turned numerous no-op choices in rebase into errors with
commit 8295ed690b ("rebase: make the backend configurable via config
setting", 2020-02-15), where we started pointing out that apply-based
and merge-based options were incompatible (as opposed to silently
accepting competing options and ignoring ones not supported by
whichever backend happened to trump at the time based on the options.)
 So, there's recent precedent to jump directly to errors with no-ops
in rebase.  Those cases are slightly different in that options were
only conditionally no-ops, whereas in this case we have an option that
is unconditionally a no-op, suggesting we might be able to do
something stronger.

On top of that, I just can't find any evidence of anyone ever using
this option: (a) it was only put in as an afterthought by the original
author (who was only really interested in git-am -C), (b) there are
absolutely no testcases in the testsuite covering its behavior, (c) my
searches of the mailing list and google don't turn up any hits though
admittedly it's hard to figure out what to search on, and (d) while I
tried to use the option at times, it was only because I was doing work
to make backends consistent and switch the default one and trying to
understand all the inner workings, and even then I gave up on it and
punted it down the road.

And, of course, the option never worked as advertised anyway.

That combination makes me think nuking it would go completely
unnoticed outside this mailing list.  If others would rather see the
more cautious route despite all the above, I can implement it, but
likely alter your step (2) from ignoring into throwing an error
message (or at the very least a warning).

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox