All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Danial Alihosseini <danial.alihosseini@gmail.com>
Cc: Jeff King <peff@peff.net>, Derrick Stolee <dstolee@microsoft.com>,
	git@vger.kernel.org
Subject: Re: git 2.34.0: Behavior of `**` in gitignore is different from previous versions.
Date: Fri, 19 Nov 2021 09:51:12 -0500	[thread overview]
Message-ID: <72fffbff-16f7-fa17-b212-67aae9e1b034@gmail.com> (raw)
In-Reply-To: <CACLOEFbY3LwMa2uhc=9jmcGFf0mvWzEM=YityLyFcuGWXVmqbw@mail.gmail.com>

On 11/18/2021 11:01 PM, Danial Alihosseini wrote:
> Thanks for your follow-up.
> 
> I wanted to ignore all files in the "data" folder except ".txt" ones.
> 
> As mentioned in the gitignore doc, there should be a difference
> between "**" and "**/".
> 
>> A trailing "/**" matches everything inside. For example, "abc/**" matches all files inside directory "abc", relative to the location of the .gitignore file, with infinite depth.
> 
> and,
>>
>> A slash followed by two consecutive asterisks then a slash matches zero or more directories. For example, "a/**/b" matches "a/b", "a/x/b", "a/x/y/b" and so on.
> 
> and also,
>>
>> It is not possible to re-include a file if a parent directory of that file is excluded.

So this is the key point of why you can't just omit the !data/**/
line.

What is unclear to me is what exactly "match a directory" means.
If we ignore a directory, then we ignore everything inside it (until
another pattern says we should care about it), but the converse
should also hold: if we have a pattern like "!data/**/", then that
should mean "include everything inside data/<A>/ where <A> is any
directory name".

My inability to form a mental model where the existing behavior
matches the documented specification is an indicator that this was
changed erroneously. A revert patch is included at the end of this
message.

If anyone could help clarify my understanding here, then maybe
there is room for improving the documentation.

> So, I excluded all files by "data/**", re-included just directories
> (at any depth) by "!data/**/" and, re-included ".txt" files (at any
> depth) by "!data/**/*.txt".
> However, if I wanted to re-include a directory and all of its
> contents, I could use something like "!data/data1/**", without
> trailing slash.
> As I see, there is no separation between "**" and "**/" in the current version.
> 
> I think there is another point that the previous behavior test case
> should be like this:
> test_expect_success 'directories and ** matches' '
>     cat >.gitignore <<-\EOF &&
>     data/**
>     !data/**/
>     !data/**/*.txt
>     EOF
>     git check-ignore file \
>         data/file data/data1/file1 data/data1/file1.txt \
>         data/data2/file2 data/data2/file2.txt >actual &&
>     cat >expect <<-\EOF &&
>     data/file
>     data/data1/file1
>     data/data2/file2
>     EOF
>     test_cmp expect actual
> '

Thank you for seeing my poorly-edited test and understanding
what it _should_ have been.

Here is a revert including this new test. I further tested
with the Scalar functional tests [1] to see if I could find
what I was thinking when making this original change. Since
this is so similar to another revert we caught before the
2.34.0 release, I regret not checking adjacent commits for
a similar "reverting does not affect the test suite" case.

---- >8 ----

From f833967da83ecd51090ebb75a4b8173a775d16f1 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Fri, 19 Nov 2021 09:13:49 -0500
Subject: [PATCH] dir: revert "dir: select directories correctly"

This reverts commit f6526728f950cacfd5b5e42bcc65f2c47f3da654.

The change in f652672 (dir: select directories correctly, 2021-09-24)
caused a regression in directory-based matches with non-cone-mode
patterns, especially for .gitignore patterns. A test is included to
prevent this regression in the future.

The commit ed495847 (dir: fix pattern matching on dirs, 2021-09-24) was
reverted in 5ceb663 (dir: fix directory-matching bug, 2021-11-02) for
similar reasons. Neither commit changed tests, and tests added later in
the series continue to pass when these commits are reverted.

Reported-by: Danial Alihosseini <danial.alihosseini@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c              | 54 +++++-----------------------------------------
 t/t0008-ignores.sh | 17 +++++++++++++++
 2 files changed, 22 insertions(+), 49 deletions(-)

diff --git a/dir.c b/dir.c
index 9ea6cfe61c..86afa2eae0 100644
--- a/dir.c
+++ b/dir.c
@@ -1303,44 +1303,6 @@ int match_pathname(const char *pathname, int pathlen,
 				 WM_PATHNAME) == 0;
 }
 
-static int path_matches_dir_pattern(const char *pathname,
-				    int pathlen,
-				    struct strbuf **path_parent,
-				    int *dtype,
-				    struct path_pattern *pattern,
-				    struct index_state *istate)
-{
-	if (!*path_parent) {
-		char *slash;
-		CALLOC_ARRAY(*path_parent, 1);
-		strbuf_add(*path_parent, pathname, pathlen);
-		slash = find_last_dir_sep((*path_parent)->buf);
-
-		if (slash)
-			strbuf_setlen(*path_parent, slash - (*path_parent)->buf);
-		else
-			strbuf_setlen(*path_parent, 0);
-	}
-
-	/*
-	 * If the parent directory matches the pattern, then we do not
-	 * need to check for dtype.
-	 */
-	if ((*path_parent)->len &&
-	    match_pathname((*path_parent)->buf, (*path_parent)->len,
-			   pattern->base,
-			   pattern->baselen ? pattern->baselen - 1 : 0,
-			   pattern->pattern, pattern->nowildcardlen,
-			   pattern->patternlen, pattern->flags))
-		return 1;
-
-	*dtype = resolve_dtype(*dtype, istate, pathname, pathlen);
-	if (*dtype != DT_DIR)
-		return 0;
-
-	return 1;
-}
-
 /*
  * Scan the given exclude list in reverse to see whether pathname
  * should be ignored.  The first match (i.e. the last on the list), if
@@ -1356,7 +1318,6 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 {
 	struct path_pattern *res = NULL; /* undecided */
 	int i;
-	struct strbuf *path_parent = NULL;
 
 	if (!pl->nr)
 		return NULL;	/* undefined */
@@ -1366,10 +1327,11 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 		const char *exclude = pattern->pattern;
 		int prefix = pattern->nowildcardlen;
 
-		if (pattern->flags & PATTERN_FLAG_MUSTBEDIR &&
-		    !path_matches_dir_pattern(pathname, pathlen, &path_parent,
-					      dtype, pattern, istate))
-			continue;
+		if (pattern->flags & PATTERN_FLAG_MUSTBEDIR) {
+			*dtype = resolve_dtype(*dtype, istate, pathname, pathlen);
+			if (*dtype != DT_DIR)
+				continue;
+		}
 
 		if (pattern->flags & PATTERN_FLAG_NODIR) {
 			if (match_basename(basename,
@@ -1393,12 +1355,6 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
 			break;
 		}
 	}
-
-	if (path_parent) {
-		strbuf_release(path_parent);
-		free(path_parent);
-	}
-
 	return res;
 }
 
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 46a5038fed..fce7c7b408 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -828,6 +828,23 @@ test_expect_success 'exact prefix matching (without root)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'directories and ** matches' '
+	cat >.gitignore <<-\EOF &&
+	data/**
+	!data/**/
+	!data/**/*.txt
+	EOF
+	git check-ignore file \
+		data/file data/data1/file1 data/data1/file1.txt \
+		data/data2/file2 data/data2/file2.txt >actual &&
+	cat >expect <<-\EOF &&
+	data/file
+	data/data1/file1
+	data/data2/file2
+	EOF
+	test_cmp expect actual
+'
+
 ############################################################################
 #
 # test whitespace handling
-- 
2.34.0.dirty


  reply	other threads:[~2021-11-19 14:51 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-18 16:41 git 2.34.0: Behavior of `**` in gitignore is different from previous versions Danial Alihosseini
2021-11-18 17:04 ` Jeff King
2021-11-18 22:09   ` Derrick Stolee
2021-11-19  4:01     ` Danial Alihosseini
2021-11-19 14:51       ` Derrick Stolee [this message]
2021-11-19 17:06         ` Danial Alihosseini
2021-11-19 20:05         ` Johannes Sixt
2021-11-19 20:33           ` Derrick Stolee
2021-11-19 20:57             ` Ævar Arnfjörð Bjarmason
2021-11-20 22:41             ` Chris Torek
2021-11-21  0:46               ` Junio C Hamano
2021-11-23 12:21                 ` Philip Oakley
2021-11-23 21:13                   ` Danial Alihosseini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=72fffbff-16f7-fa17-b212-67aae9e1b034@gmail.com \
    --to=stolee@gmail.com \
    --cc=danial.alihosseini@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.