* [PATCH] Teach ignore machinery about pattern "/" @ 2012-05-25 12:47 Nguyễn Thái Ngọc Duy 2012-05-25 17:32 ` Junio C Hamano 0 siblings, 1 reply; 7+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-05-25 12:47 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy Pattern "/" is ambiguous because the slash can mean "anchor the pattern to this location" (e.g. /path), but it can also mean "match directories only" (e.g. path/). Currently git interprets it as the latter except that 'path' is an empty string, which makes this pattern totally useless. On the other hand, it's intuitive to read '/' as "match root directory", or equivalent to "/*". (The other way to see it is "match all directories", which leads to the same result). One may wonder why bother an "ignore all" pattern. The pattern is useful when you want to keep just a small portion of the working directory. But that's still a rare case. A more popular case would be sparse checkout, where ignore rules are used to _include_. The thus now "include all" pattern is used to say "make a sparse checkout that includes everything except this and this". Recognize this special pattern and turn it the working equivalence pattern "/*" Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- dir.c | 3 +++ t/t3001-ls-files-others-exclude.sh | 6 ++++++ 2 files changed, 9 insertions(+) diff --git a/dir.c b/dir.c index c6a98cc..b65d37c 100644 --- a/dir.c +++ b/dir.c @@ -308,6 +308,9 @@ void add_exclude(const char *string, const char *base, int to_exclude = 1; int flags = 0; + if (string[0] == '/' && string[1] == '\0') + string = "/*"; + if (*string == '!') { to_exclude = 0; string++; diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh index c8fe978..7cb790d 100755 --- a/t/t3001-ls-files-others-exclude.sh +++ b/t/t3001-ls-files-others-exclude.sh @@ -175,6 +175,12 @@ test_expect_success 'negated exclude matches can override previous ones' ' grep "^a.1" output ' +test_expect_success '"/" pattern is equivalent to "/*" (exclude all)' ' + git ls-files --others --exclude=/ >output && + : >expected && + test_cmp expected output +' + test_expect_success 'subdirectory ignore (setup)' ' mkdir -p top/l1/l2 && ( -- 1.7.10.2.549.g9354186 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Teach ignore machinery about pattern "/" 2012-05-25 12:47 [PATCH] Teach ignore machinery about pattern "/" Nguyễn Thái Ngọc Duy @ 2012-05-25 17:32 ` Junio C Hamano 2012-05-26 4:30 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2012-05-25 17:32 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > Pattern "/" is ambiguous because the slash can mean "anchor the > pattern to this location" (e.g. /path), but it can also mean > "match directories only" (e.g. path/). Currently git interprets it as > the latter except that 'path' is an empty string, which makes this > pattern totally useless. Did the old version interprete it as the former? How does the above observation relate to the end-user help request in the other thread? When talking about an ambiguous expression X that can be in multiple ways and each interpretation gives surprisingly different result, if there are ways A, B and C to unambiguously spell each and every of its possible and useful interpretations, and if one of the interpretations C is to ignore the expression altogether, it is preferrable to either (1) warn if it does not trigger anything useful to write a no-op; or (2) just ignore it if a no-op is a meaningful behaviour when you see X. > On the other hand, it's intuitive to read '/' as "match root > directory", or equivalent to "/*". (The other way to see it is "match > all directories", which leads to the same result). I am a bit confused about the above, especially "The other way" part. Does this alternative interpretation view "/" as "/foo" whose "foo" happens to be an empty string, or does it view "/" as "foo/" whose "foo" happens to be an empty string? The former would mean "ignore foo, and don't bother descending into foo if it is a directory, as everything in it is ignored", while the latter would mean "anywhere in this directory and its subdirectories, if we see foo that is a directory, don't bother descending into it as everything in it is ignored." What I am trying to get at is why you are making '/' the same as '/*' as opposed to '*/', '/*/', or even a plain '*'. > One may wonder why bother an "ignore all" pattern. The pattern is > useful when you want to keep just a small portion of the working > directory. But that's still a rare case. > > A more popular case would be sparse checkout, where ignore rules are > used to _include_. The thus now "include all" pattern is used to say > "make a sparse checkout that includes everything except this and > this". If the "sparse checkout" hack writes "/" for whatever reason, it should be corrected to write "*" (or perhaps "/*") instead. I do not see it as a valid factor to affect when we decide what should be done for a possibly ambiguous "/" entry in the exclude machinery. > Recognize this special pattern and turn it the working equivalence > pattern "/*" Regardless of the answer to the "is it *, /*, */ or /*/" question above, I do not think silent conversion is a right solution for the ambiguity. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Teach ignore machinery about pattern "/" 2012-05-25 17:32 ` Junio C Hamano @ 2012-05-26 4:30 ` Nguyen Thai Ngoc Duy 2012-05-27 6:15 ` Junio C Hamano 2012-06-03 12:36 ` Nguyen Thai Ngoc Duy 0 siblings, 2 replies; 7+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-05-26 4:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Sat, May 26, 2012 at 12:32 AM, Junio C Hamano <gitster@pobox.com> wrote: > Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > >> Pattern "/" is ambiguous because the slash can mean "anchor the >> pattern to this location" (e.g. /path), but it can also mean >> "match directories only" (e.g. path/). Currently git interprets it as >> the latter except that 'path' is an empty string, which makes this >> pattern totally useless. > > Did the old version interprete it as the former? That calls for bit of testing, which I'll do soon. > How does the above > observation relate to the end-user help request in the other thread? I assume you meant the recent sparse-checkout thread. "/" happens to work due to (probably) a mistake in sparse-checkout v1.7.2 even though "/" behavior is undefined. This sets a specific behavior to "/" and makes 1.7.10 treat "/" like 1.7.2. > When talking about an ambiguous expression X that can be in multiple ways > and each interpretation gives surprisingly different result, if there are > ways A, B and C to unambiguously spell each and every of its possible and > useful interpretations, and if one of the interpretations C is to ignore > the expression altogether, it is preferrable to either > > (1) warn if it does not trigger anything useful to write a no-op; or > (2) just ignore it if a no-op is a meaningful behaviour > > when you see X. except that we do the ignoring in (2) but I don't see this as a meaningful behavior. >> On the other hand, it's intuitive to read '/' as "match root >> directory", or equivalent to "/*". (The other way to see it is "match >> all directories", which leads to the same result). > > I am a bit confused about the above, especially "The other way" part. > > Does this alternative interpretation view "/" as "/foo" whose "foo" > happens to be an empty string, or does it view "/" as "foo/" whose "foo" > happens to be an empty string? The latter. Though there may be two way of interpreting the empty string as pattern. The computer would happily say "match nothing", but my human brain tends to thinkg "match everything" as "match nothing" does not really makes sense as a pattern. > The former would mean "ignore foo, and don't bother descending into foo if > it is a directory, as everything in it is ignored", while the latter would > mean "anywhere in this directory and its subdirectories, if we see foo > that is a directory, don't bother descending into it as everything in it > is ignored." > > What I am trying to get at is why you are making '/' the same as '/*' as > opposed to '*/', '/*/', or even a plain '*'. Because I see "/" as a special case of "/foo". It makes more sense than "/" as a special case of "foo/", at least not without more thoughts on it. >> One may wonder why bother an "ignore all" pattern. The pattern is >> useful when you want to keep just a small portion of the working >> directory. But that's still a rare case. >> >> A more popular case would be sparse checkout, where ignore rules are >> used to _include_. The thus now "include all" pattern is used to say >> "make a sparse checkout that includes everything except this and >> this". > > If the "sparse checkout" hack writes "/" for whatever reason, it should be > corrected to write "*" (or perhaps "/*") instead. I do not see it as a > valid factor to affect when we decide what should be done for a possibly > ambiguous "/" entry in the exclude machinery. That's one way of seeing it. What I propose in the patch is "this is useless no-op pattern (*), let's assign some meaning to it, what makes most sense" (*) still needs verification though. > >> Recognize this special pattern and turn it the working equivalence >> pattern "/*" > > Regardless of the answer to the "is it *, /*, */ or /*/" question above, > I do not think silent conversion is a right solution for the ambiguity. -- Duy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Teach ignore machinery about pattern "/" 2012-05-26 4:30 ` Nguyen Thai Ngoc Duy @ 2012-05-27 6:15 ` Junio C Hamano 2012-06-03 12:36 ` Nguyen Thai Ngoc Duy 1 sibling, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2012-05-27 6:15 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: >> When talking about an ambiguous expression X that can be in multiple ways >> and each interpretation gives surprisingly different result, if there are >> ways A, B and C to unambiguously spell each and every of its possible and >> useful interpretations, and if one of the interpretations C is to ignore >> the expression altogether, it is preferrable to either >> >> (1) warn if it does not trigger anything useful to write a no-op; or >> (2) just ignore it if a no-op is a meaningful behaviour >> >> when you see X. > > except that we do the ignoring in (2) but I don't see this as a > meaningful behavior. That would mean you would want to update the code to do (1), not silently trigger a different behaviour. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Teach ignore machinery about pattern "/" 2012-05-26 4:30 ` Nguyen Thai Ngoc Duy 2012-05-27 6:15 ` Junio C Hamano @ 2012-06-03 12:36 ` Nguyen Thai Ngoc Duy 2012-06-03 21:49 ` Junio C Hamano 1 sibling, 1 reply; 7+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-06-03 12:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Sat, May 26, 2012 at 11:30 AM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: > On Sat, May 26, 2012 at 12:32 AM, Junio C Hamano <gitster@pobox.com> wrote: >> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: >> >>> Pattern "/" is ambiguous because the slash can mean "anchor the >>> pattern to this location" (e.g. /path), but it can also mean >>> "match directories only" (e.g. path/). Currently git interprets it as >>> the latter except that 'path' is an empty string, which makes this >>> pattern totally useless. >> >> Did the old version interprete it as the former? > > That calls for bit of testing, which I'll do soon. "/" is no-op since 1.4.0 (tested with refs/tags/v*), tested with "git add ." with $GIT_DIR/info/exclude containing only "/". "git add" in pre-1.4 is shell script and needs to be installed to work properly so I only tested v1.3.0 and v1.0.0, and it worked too (i.e. "/" is still no-op). I think it's safe to make it a meaningful pattern (or warn about its uselessness). -- Duy ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Teach ignore machinery about pattern "/" 2012-06-03 12:36 ` Nguyen Thai Ngoc Duy @ 2012-06-03 21:49 ` Junio C Hamano 2012-06-03 22:50 ` Caleb Marchent 0 siblings, 1 reply; 7+ messages in thread From: Junio C Hamano @ 2012-06-03 21:49 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > On Sat, May 26, 2012 at 11:30 AM, Nguyen Thai Ngoc Duy > <pclouds@gmail.com> wrote: >> On Sat, May 26, 2012 at 12:32 AM, Junio C Hamano <gitster@pobox.com> wrote: >>> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: >>> >>>> Pattern "/" is ambiguous because the slash can mean "anchor the >>>> pattern to this location" (e.g. /path), but it can also mean >>>> "match directories only" (e.g. path/). Currently git interprets it as >>>> the latter except that 'path' is an empty string, which makes this >>>> pattern totally useless. >>> >>> Did the old version interprete it as the former? >> >> That calls for bit of testing, which I'll do soon. > > "/" is no-op since 1.4.0 (tested with refs/tags/v*),... So historically we had three possible interpretations of "/", i.e. "/path" that happens to have an empty path, "path/" that happens to have an empty path, and a no-op. That strongly suggests that we should at least warn when we see "/" that the user is playing with fire by feeding the exclude machinery something without clear semantics. Doing a no-op might be a safer option than adding it new special case semantics to it, but even then I do not think we should silently ignore it. Silently giving new semantics for this particular special case is out of the question, I would have to say. Thanks for digging. ^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH] Teach ignore machinery about pattern "/" 2012-06-03 21:49 ` Junio C Hamano @ 2012-06-03 22:50 ` Caleb Marchent 0 siblings, 0 replies; 7+ messages in thread From: Caleb Marchent @ 2012-06-03 22:50 UTC (permalink / raw) To: Junio C Hamano, Nguyen Thai Ngoc Duy; +Cc: git@vger.kernel.org Hi Guy's, Seeing the discussion about the interpretation of '/' - I would like to add some additional support for my case that it should represent the contents of the top level dir: If I do "ls /" on a unix system - I get a list of the contents of the root of the files system, I think we should maintain this in git, such that '/' represents the top of the tree rather than suppressing it. Best Regards, Caleb Caleb Marchent, Software Developer, Amino Communications Ltd, Buckingway Business Park, Anderson Road, Swavesey, Cambridge, CB24 4UQ, UK cmarchent@aminocom.com www.aminocom.com Tel: +44 1954 234 100 ________________________________________ From: git-owner@vger.kernel.org [git-owner@vger.kernel.org] on behalf of Junio C Hamano [gitster@pobox.com] Sent: 03 June 2012 22:49 To: Nguyen Thai Ngoc Duy Cc: git@vger.kernel.org Subject: Re: [PATCH] Teach ignore machinery about pattern "/" Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > On Sat, May 26, 2012 at 11:30 AM, Nguyen Thai Ngoc Duy > <pclouds@gmail.com> wrote: >> On Sat, May 26, 2012 at 12:32 AM, Junio C Hamano <gitster@pobox.com> wrote: >>> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: >>> >>>> Pattern "/" is ambiguous because the slash can mean "anchor the >>>> pattern to this location" (e.g. /path), but it can also mean >>>> "match directories only" (e.g. path/). Currently git interprets it as >>>> the latter except that 'path' is an empty string, which makes this >>>> pattern totally useless. >>> >>> Did the old version interprete it as the former? >> >> That calls for bit of testing, which I'll do soon. > > "/" is no-op since 1.4.0 (tested with refs/tags/v*),... So historically we had three possible interpretations of "/", i.e. "/path" that happens to have an empty path, "path/" that happens to have an empty path, and a no-op. That strongly suggests that we should at least warn when we see "/" that the user is playing with fire by feeding the exclude machinery something without clear semantics. Doing a no-op might be a safer option than adding it new special case semantics to it, but even then I do not think we should silently ignore it. Silently giving new semantics for this particular special case is out of the question, I would have to say. Thanks for digging. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-06-03 22:56 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-05-25 12:47 [PATCH] Teach ignore machinery about pattern "/" Nguyễn Thái Ngọc Duy 2012-05-25 17:32 ` Junio C Hamano 2012-05-26 4:30 ` Nguyen Thai Ngoc Duy 2012-05-27 6:15 ` Junio C Hamano 2012-06-03 12:36 ` Nguyen Thai Ngoc Duy 2012-06-03 21:49 ` Junio C Hamano 2012-06-03 22:50 ` Caleb Marchent
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).