* [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix @ 2016-08-14 8:56 Johannes Schindelin 2016-08-14 20:42 ` Junio C Hamano 0 siblings, 1 reply; 5+ messages in thread From: Johannes Schindelin @ 2016-08-14 8:56 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Jacob Keller The '>' character is not a legal part of filenames on Windows. So let's just not use it in Git's source code. This poses a challenge in the test script t4013 which distills command-lines into file names (so that the expected outcome can be stored in files with said names). We have to take particular care not to confound the existing conversion of unwanted characters to underscores with the new substitution of '>': the existing conversion chose to collapse runs of multiple unwanted characters into a single underscore. If we allowed '>' to be collapsed, too, the file name generated from the command "diff [...]=-- [...]" would be identical to the one generated from "diff [...]=--> [...]". Please squash this patch into 3c90ffd2f01e2d0d080c8e42df2ee89709b324de Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Published-As: https://github.com/dscho/git/releases/tag/mingw-t4013-v1 Fetch-It-Via: git fetch https://github.com/dscho/git mingw-t4013-v1 For the record: this prevented my beautiful CI jobs from even checking out the source code for `pu` in the last days. Junio, please let me know if you would prefer this as a separate patch. t/t4013-diff-various.sh | 2 +- ...aster^_side => diff.diff_--diff-line-prefix=--__master_master^_side} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename t/t4013/{diff.diff_--diff-line-prefix=-->_master_master^_side => diff.diff_--diff-line-prefix=--__master_master^_side} (100%) diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh index 5204645..84e2ee0 100755 --- a/t/t4013-diff-various.sh +++ b/t/t4013-diff-various.sh @@ -115,7 +115,7 @@ do case "$cmd" in '' | '#'*) continue ;; esac - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') + test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/') pfx=$(printf "%04d" $test_count) expect="$TEST_DIRECTORY/t4013/diff.$test" actual="$pfx-diff.$test" diff --git a/t/t4013/diff.diff_--diff-line-prefix=-->_master_master^_side b/t/t4013/diff.diff_--diff-line-prefix=--__master_master^_side similarity index 100% rename from t/t4013/diff.diff_--diff-line-prefix=-->_master_master^_side rename to t/t4013/diff.diff_--diff-line-prefix=--__master_master^_side -- 2.9.2.691.g78954f3 base-commit: 945e149951a152207b56d5e49ff5167d151a4c89 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix 2016-08-14 8:56 [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix Johannes Schindelin @ 2016-08-14 20:42 ` Junio C Hamano 2016-08-15 14:07 ` Johannes Schindelin 0 siblings, 1 reply; 5+ messages in thread From: Junio C Hamano @ 2016-08-14 20:42 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Jacob Keller Johannes Schindelin <johannes.schindelin@gmx.de> writes: > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > + test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/') The existing sed scriptlet says "we cannot have slash and do not want to have space in filename, so we squash runs of them to a single underscore". If you have more characters that you do not want, you should add that to the existing set instead. While you are at it, it may be sensible to add a colon to that set, too, no? Something like this, perhaps? > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > + test=$(echo "$cmd" | sed -e 's|[/ <:>][/ <:>]*|_|g') ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix 2016-08-14 20:42 ` Junio C Hamano @ 2016-08-15 14:07 ` Johannes Schindelin 2016-08-15 16:20 ` Junio C Hamano 0 siblings, 1 reply; 5+ messages in thread From: Johannes Schindelin @ 2016-08-15 14:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jacob Keller Hi Junio, On Sun, 14 Aug 2016, Junio C Hamano wrote: > Johannes Schindelin <johannes.schindelin@gmx.de> writes: > > > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > > + test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/') > > The existing sed scriptlet says "we cannot have slash and do not > want to have space in filename, so we squash runs of them to a > single underscore". If you have more characters that you do not > want, you should add that to the existing set instead. No, we cannot do that. I even mentioned it in my commit message why: We have to take particular care not to confound the existing conversion of unwanted characters to underscores with the new substitution of '>': the existing conversion chose to collapse runs of multiple unwanted characters into a single underscore. If we allowed '>' to be collapsed, too, the file name generated from the command "diff [...]=-- [...]" would be identical to the one generated from "diff [...]=--> [...]". And as there is exactly this case (two command-lines, differing only in the '>' character), doing what you suggested would *break* the test since it would now look at the wrong file. I know that this is so because my first iteration of the patch did exactly what you suggested. > While you are at it, it may be sensible to add a colon to that set, too, > no? > > Something like this, perhaps? > > > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') > > + test=$(echo "$cmd" | sed -e 's|[/ <:>][/ <:>]*|_|g') Maybe. But what other characters are missing? Those are not the only ones illegal on Windows: [/ \0-\x1f"*:<>?\\|] would be a more complete version of what you suggested. Except that it is not a valid basic regex. But is that really necessary? I think not, for the following reasons: - my patch was specifically designed to stop my CI from pestering me about a totally broken revision that cannot even be checked out (and causes subsequent problems because of it), - as such, my patch was meant not to be an all-encompassing solution to the problem of filenames that are illegal on Windows, but really a tiny patch that you could apply as quickly as possible so that my CI jobs would stop pestering me (which they did not, because `pu` is still broken), - I even meant this little patch to be so small that it can be easily squashed into Jacob's patch, - I do not want to complicate regular expressions unless *really* needed (and you have to admit that we do not need to address any more characters than the '>' *right now*), as - regular expressions are not exactly an easy meal, so it makes sense to keep them as simple as possible both for contributors' as well as for maintainers' sake, and - when trying to come up with a super-complete solution, it is really easy not only to spend waaaaay too much time on it, but also to introduce bugs that are not spotted for a loooong time because nothing actually exercises the newly introduced code, and - If It Ain't Broke, Don't Fix It. - the broader solution must come separately, and not as a mere add-on to one test case: we really need to ensure that such file names do not enter Git's source code. I will send my proposal to address the larger issue in a moment, in the meantime I *beg* you to add this here patch as a quick fix to my CI woes, either by squashing it into the indicated commit, or by appending it to the topic branch. I do not care which one, as long as `pu` gets fixed. Thanks, Dscho ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix 2016-08-15 14:07 ` Johannes Schindelin @ 2016-08-15 16:20 ` Junio C Hamano 2016-08-16 15:39 ` Johannes Schindelin 0 siblings, 1 reply; 5+ messages in thread From: Junio C Hamano @ 2016-08-15 16:20 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Jacob Keller Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > On Sun, 14 Aug 2016, Junio C Hamano wrote: > >> Johannes Schindelin <johannes.schindelin@gmx.de> writes: >> >> > - test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g') >> > + test=$(echo "$cmd" | sed -e 's|[/ ][/ ]*|_|g' -e 'y/>/_/') >> ... > I know that this is so because my first iteration of the patch did exactly > what you suggested. You probably should have stepped back and taken a deep breath before writing the second round. Doing so after writing it before sending would also have been OK. Among three characters that are special cased here, the problem "if we squish a run of problematic characters into one underscore, we risk making the result ambiguous" is NOT limited to '>'; it is not a new problem with '>', either. I can think of two other possible solutions offhand: (1) drop "squishing a run", i.e. [/ ]*, from the regexp, and rename existing test vectors that would be affected; (2) change the string used in the offending test so that squishing will not make the result ambiguous. before special casing "y/>/_/"; as there is nothing in ">" that inherently causes the ambiguity that won't be caused by "/" or " ", adding second clause to the sed expression that does things differently only for ">" is simply wrong. After all, you only wanted to affect the "prefix=-->" test and not the whole set of the tests in the script. Obviously (1) is a lot of impact with little gain, and as Jacob already offered to do, I think (2) is a lot more sensible solution and it also is more in line with your "If it isn't broken, do not fix it", I would say. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix 2016-08-15 16:20 ` Junio C Hamano @ 2016-08-16 15:39 ` Johannes Schindelin 0 siblings, 0 replies; 5+ messages in thread From: Johannes Schindelin @ 2016-08-16 15:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Jacob Keller Hi Junio, On Mon, 15 Aug 2016, Junio C Hamano wrote: > Obviously (1) is a lot of impact with little gain, and as Jacob > already offered to do, I think (2) is a lot more sensible solution > and it also is more in line with your "If it isn't broken, do not > fix it", I would say. Yep, that makes most sense to me, too. Ciao, Dscho ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-08-16 15:39 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-08-14 8:56 [PATCH] squash! diff: add --diff-line-prefix option for passing in a prefix Johannes Schindelin 2016-08-14 20:42 ` Junio C Hamano 2016-08-15 14:07 ` Johannes Schindelin 2016-08-15 16:20 ` Junio C Hamano 2016-08-16 15:39 ` Johannes Schindelin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).