* What's cooking extra
@ 2010-05-19 14:33 Junio C Hamano
2010-05-19 15:12 ` A Large Angry SCM
` (3 more replies)
0 siblings, 4 replies; 35+ messages in thread
From: Junio C Hamano @ 2010-05-19 14:33 UTC (permalink / raw)
To: git
First, let me thank everybody who participated in the list traffic for the
past few weeks, reporting problems, answering questions, raising issues,
discussing them, sending patches and giving feedback to them, while I was
away.
I took a new day-job (yesterday was my second day) and haven't quite
adjusted yet, but I finally managed to find some time and energy to browse
through the list traffic and even queued a handful of topics. I expect
I'll be more productive and back to speed in a week or two, but until then
I expect to still be slower than my usual self.
Here are the topics I've picked up so far (excluding the ones that were
trivially and obviously correct that went directly to 'maint' or
'master'):
* ec/diff-noprefix-config (2010-05-02) 1 commit
- diff: add configuration option for disabling diff prefixes.
* jk/diff-m-doc (2010-05-08) 1 commit
- docs: clarify meaning of -M for git-log
* mc/maint-zoneparse (2010-05-17) 1 commit
- Add "Z" as an alias for the timezone "UTC"
* mg/notes-dry-run (2010-05-14) 1 commit
- notes: dry-run and verbose options for prune
* mg/rev-parse-lrbranches-locals (2010-05-14) 1 commit
- revlist: Introduce --lrbranches and --locals revision specifiers
(this branch uses mg/rev-parse-option-sifter-deprecation.)
* mg/rev-parse-option-sifter-deprecation (2010-05-14) 3 commits
- t6018: make sure all tested symbolic names are different revs
- t6018: add tests for rev-list's --branches and --tags
- rev-parse: deprecate use as an option sifter
(this branch is used by mg/rev-parse-lrbranches-locals.)
I am aware of the following topics, that are probably all worthy of
inclusion at some point, but am unclear in what status their discussions
are. I'd appreciate it if people can help me come up with a list of
topics that are fully discussed, and if patch submitters of these topics
can re-send the final "to apply" copy.
* (Eyvind Bernhardsen and Linus) Fixing the behaviour of crlf attribute;
ignoring them when core.autocrlf is not in effect was a wrong design
decision.
I agree with what Linus said in the thread; I haven't yet looked at the
discussion in the past few days. Also I don't know where '[PATCH v2]
Add "core.eol" config variable' fits in the picture.
* (Chris Lamb, Jeff King, Thomas Rast) "rebase -i" mishandles a patch
with backslash in the title
* (Rene) grep on binary files
* (Linus) "git show ':/this is now a regex'"
* (Gary V. Vaughan) Portability patches
* (Ævar Arnfjörð Bjarmason) cvsserver updates
* (Bo Yang, Thomas Rast) "log --graph" improvements
* (Pavan Kumar Sunkara) instaweb and web--browse
* (Yann Droneaud, Linus) matching utf8 locale in t9129
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 14:33 What's cooking extra Junio C Hamano
@ 2010-05-19 15:12 ` A Large Angry SCM
2010-05-19 17:06 ` Finn Arne Gangstad
` (2 subsequent siblings)
3 siblings, 0 replies; 35+ messages in thread
From: A Large Angry SCM @ 2010-05-19 15:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
[...]
>
> I took a new day-job (yesterday was my second day) and haven't quite
> adjusted yet, but I finally managed to find some time and energy to browse
> through the list traffic and even queued a handful of topics. I expect
> I'll be more productive and back to speed in a week or two, but until then
> I expect to still be slower than my usual self.
>
This is interesting news. Congratulations on the new job! It must be a
good one to get you to leave southern California. Is your new employer
subsidizing your git work similar to your previous employer?
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 14:33 What's cooking extra Junio C Hamano
2010-05-19 15:12 ` A Large Angry SCM
@ 2010-05-19 17:06 ` Finn Arne Gangstad
2010-05-19 20:09 ` Eyvind Bernhardsen
2010-05-22 13:09 ` Clemens Buchacher
2010-05-21 16:16 ` Ævar Arnfjörð Bjarmason
2010-05-22 21:24 ` René Scharfe
3 siblings, 2 replies; 35+ messages in thread
From: Finn Arne Gangstad @ 2010-05-19 17:06 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Wed, May 19, 2010 at 07:33:32AM -0700, Junio C Hamano wrote:
> * (Eyvind Bernhardsen and Linus) Fixing the behaviour of crlf attribute;
> ignoring them when core.autocrlf is not in effect was a wrong design
> decision.
>
> I agree with what Linus said in the thread; I haven't yet looked at the
> discussion in the past few days. Also I don't know where '[PATCH v2]
> Add "core.eol" config variable' fits in the picture.
I think this one is pretty much discussed by now, with the latest
changes this should do pretty much what Linus wanted. Together with
the autocrlf patch, this should make CRLF handling on Windows and
mixed Linux/Windows setups a lot better!
- Finn Arne
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 17:06 ` Finn Arne Gangstad
@ 2010-05-19 20:09 ` Eyvind Bernhardsen
2010-05-22 13:09 ` Clemens Buchacher
1 sibling, 0 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-19 20:09 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git@vger.kernel.org List, Finn Arne Gangstad
On 19. mai 2010, at 19.06, Finn Arne Gangstad wrote:
> On Wed, May 19, 2010 at 07:33:32AM -0700, Junio C Hamano wrote:
>
>> * (Eyvind Bernhardsen and Linus) Fixing the behaviour of crlf attribute;
>> ignoring them when core.autocrlf is not in effect was a wrong design
>> decision.
>>
>> I agree with what Linus said in the thread; I haven't yet looked at the
>> discussion in the past few days. Also I don't know where '[PATCH v2]
>> Add "core.eol" config variable' fits in the picture.
>
> I think this one is pretty much discussed by now, with the latest
> changes this should do pretty much what Linus wanted. Together with
> the autocrlf patch, this should make CRLF handling on Windows and
> mixed Linux/Windows setups a lot better!
Yes, it's done. I'll resend the series with the core.eol patch included.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 14:33 What's cooking extra Junio C Hamano
2010-05-19 15:12 ` A Large Angry SCM
2010-05-19 17:06 ` Finn Arne Gangstad
@ 2010-05-21 16:16 ` Ævar Arnfjörð Bjarmason
2010-05-22 21:24 ` René Scharfe
3 siblings, 0 replies; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2010-05-21 16:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Wed, May 19, 2010 at 14:33, Junio C Hamano <gitster@pobox.com> wrote:
> I am aware of the following topics, that are probably all worthy of
> inclusion at some point, but am unclear in what status their discussions
> are. I'd appreciate it if people can help me come up with a list of
> topics that are fully discussed, and if patch submitters of these topics
> can re-send the final "to apply" copy.
> ...
> * (Ævar Arnfjörð Bjarmason) cvsserver updates
When I originally submitted this in 2008 you seemed to be willing to
let it go in as-is. But arguably it's better if this were to be
squashed into one patch. It's more readable, but some history will be
destroyed along the way.
I don't care either way, but I can submit it in squashed form on
request.
Thanks.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 17:06 ` Finn Arne Gangstad
2010-05-19 20:09 ` Eyvind Bernhardsen
@ 2010-05-22 13:09 ` Clemens Buchacher
2010-05-22 19:42 ` Eyvind Bernhardsen
1 sibling, 1 reply; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-22 13:09 UTC (permalink / raw)
To: Finn Arne Gangstad; +Cc: Junio C Hamano, git, Eyvind Bernhardsen
On Wed, May 19, 2010 at 07:06:56PM +0200, Finn Arne Gangstad wrote:
> On Wed, May 19, 2010 at 07:33:32AM -0700, Junio C Hamano wrote:
>
> > * (Eyvind Bernhardsen and Linus) Fixing the behaviour of crlf attribute;
> > ignoring them when core.autocrlf is not in effect was a wrong design
> > decision.
> >
> > I agree with what Linus said in the thread; I haven't yet looked at the
> > discussion in the past few days. Also I don't know where '[PATCH v2]
> > Add "core.eol" config variable' fits in the picture.
>
> I think this one is pretty much discussed by now, with the latest
> changes this should do pretty much what Linus wanted.
That is not the impression I got. Linus was objecting to the idea of new
attribute and configuration variables, which essentially do the same thing
but with slightly different semantics.
As soon as the existing crlf attribute is given priority over core.autocrlf,
all the problems discussed originally go away. So what exactly are the new
attributes supposed to do?
Also, could you post a truth table for all the parameters involved (eol,
crlf, core.autocrlf, core.eol). The documentation in the patches is too
confusing for me to understand even that.
And, renaming the crlf attribute to text? Where did Linus suggest that? If
we do that, we don't even have to talk about backwards compatibility any
more.
Clemens
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-22 13:09 ` Clemens Buchacher
@ 2010-05-22 19:42 ` Eyvind Bernhardsen
2010-05-22 22:27 ` Clemens Buchacher
0 siblings, 1 reply; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-22 19:42 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On 22. mai 2010, at 15.09, Clemens Buchacher wrote:
> On Wed, May 19, 2010 at 07:06:56PM +0200, Finn Arne Gangstad wrote:
>> On Wed, May 19, 2010 at 07:33:32AM -0700, Junio C Hamano wrote:
>>
>>> * (Eyvind Bernhardsen and Linus) Fixing the behaviour of crlf attribute;
>>> ignoring them when core.autocrlf is not in effect was a wrong design
>>> decision.
>>>
>>> I agree with what Linus said in the thread; I haven't yet looked at the
>>> discussion in the past few days. Also I don't know where '[PATCH v2]
>>> Add "core.eol" config variable' fits in the picture.
>>
>> I think this one is pretty much discussed by now, with the latest
>> changes this should do pretty much what Linus wanted.
>
> That is not the impression I got. Linus was objecting to the idea of new
> attribute and configuration variables, which essentially do the same thing
> but with slightly different semantics.
>
> As soon as the existing crlf attribute is given priority over core.autocrlf,
> all the problems discussed originally go away. So what exactly are the new
> attributes supposed to do?
There is one new attribute, "eol", that is used for files which need a specific line ending. Being able to "force" LF or CRLF line endings has been requested several times on the list, and is already sort of provided for by "crlf=input".
Linus had lots of strong objections, but I also made lots of changes during the course of the discussion.
> Also, could you post a truth table for all the parameters involved (eol,
> crlf, core.autocrlf, core.eol). The documentation in the patches is too
> confusing for me to understand even that.
I'll do my best. Basically there are two things to keep track of: "should this file be normalized as text?" and "which line endings should this file have in the working directory?". There is an attribute for each, and two configuration variables that set the default values of the attributes.
Unfortunately my mail client mangles ascii art, so I can't do a table. This will have to suffice:
Any file with the "text" attribute set will have its line endings normalized to LF in the repository. If "text" is set to the special value "auto", git will only convert the file if it looks like a text file.
The "eol" attribute is used for files that need a specific line ending. Setting it also sets "text".
core.eol controls which line endings to use for normalized files that don't have the "eol" attribute set, and defaults to the platform native line ending.
When core.autocrlf is set, the default value of the "text" attribute is set to "auto" but with an extra safety feature: if a file contains CRs in the index, it won't be normalized. The extra feature comes from Finn Arne's "safe autocrlf" patch.
There is a backwards compatibility wrinkle in that core.autocrlf will override core.eol if the latter isn't explicitly set, so that "core.autocrlf=true" still results in CRLFs in the working directory on Linux.
> And, renaming the crlf attribute to text? Where did Linus suggest that? If
> we do that, we don't even have to talk about backwards compatibility any
> more.
In <alpine.LFD.2.00.1005121824260.3711@i5.linux-foundation.org>:
> So if you rename these things, keep them separate. Make the "am I a
> text-file" boolean be a boolean (plus "auto"), and just call it "text".
> And make the "what end of line to use" be just "eol" then.
So he didn't suggest renaming it, but he did suggest a better name and UI than the one I came up with. I expected objections to renaming the attribute, which is why that is a separate commit, but people seemed supportive overall.
The "crlf" attribute will be used if it is present so backwards compatibility is preserved to a degree. Scripts that test for the "crlf" attribute explicitly (such as git-cvsserver, which I fixed) will break. I don't know how big a problem that is going to be in practice, but nobody raised it as an issue during the discussion.
Compatibility with older clients is a valid concern, but older clients will ignore "crlf" as well as "text" unless core.autocrlf is set, so they will cause problems no matter what.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-19 14:33 What's cooking extra Junio C Hamano
` (2 preceding siblings ...)
2010-05-21 16:16 ` Ævar Arnfjörð Bjarmason
@ 2010-05-22 21:24 ` René Scharfe
2010-05-22 21:26 ` [PATCH 1/8] grep: add test script for binary file handling René Scharfe
` (7 more replies)
3 siblings, 8 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:24 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Phil Lawrence
Am 19.05.2010 16:33, schrieb Junio C Hamano:
> I am aware of the following topics, that are probably all worthy of
> inclusion at some point, but am unclear in what status their discussions
> are. I'd appreciate it if people can help me come up with a list of
> topics that are fully discussed, and if patch submitters of these topics
> can re-send the final "to apply" copy.
> * (Rene) grep on binary files
There was one helpful comment from Dmitry, which I addressed in a follow-up
patch. No reply from Phil, the one who started the topic, though.
I'll send an updated round as replies to this message:
[PATCH 1/8] grep: add test script for binary file handling
Adds a simple test script documenting what git grep can do with
binary files. New: tests for -L and -q.
[PATCH 2/8] grep: grep: refactor handling of binary mode options
Cleanup patch; unchanged.
[PATCH 3/8] grep: --count over binary
[PATCH 4/8] grep: --name-only over binary
Correctness patches for handling of the options --count and
--name-only in connection with binary files. The first one was
reimplemented and the second one is new.
[PATCH 5/8] grep: use memmem() for fixed string search
[PATCH 6/8] grep: continue case insensitive fixed string search after NUL chars
These two patches make git grep -F work on binary files. They
have been rebased against the preceding changed patches but are
unchanged otherwise.
[PATCH 7/8] grep: use REG_STARTEND for all matching if available
This make git grep work on binary files if the platform's
regexec() supports the flag REG_STARTEND. Our own version in
compat/ doesn't, unfortunately. In the first round it consisted
of two patches, which have been squashed and rebased.
[PATCH 8/8] grep: support NUL chars in search strings for -F
New patch, adds support for NUL in patterns, but only for git
grep -F (not -Fi). It's main value is the addition of tests to
show the current limitations regarding searching for NULs.
builtin/grep.c | 8 +++-
grep.c | 98 +++++++++++++++++++++++++++------------------
grep.h | 2 +
t/t7008-grep-binary.sh | 102 ++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 169 insertions(+), 41 deletions(-)
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/8] grep: add test script for binary file handling
2010-05-22 21:24 ` René Scharfe
@ 2010-05-22 21:26 ` René Scharfe
2010-05-22 21:28 ` [PATCH 2/8] grep: grep: refactor handling of binary mode options René Scharfe
` (6 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
t/t7008-grep-binary.sh | 42 ++++++++++++++++++++++++++++++++++++++++++
1 files changed, 42 insertions(+), 0 deletions(-)
create mode 100755 t/t7008-grep-binary.sh
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
new file mode 100755
index 0000000..2320e74
--- /dev/null
+++ b/t/t7008-grep-binary.sh
@@ -0,0 +1,42 @@
+#!/bin/sh
+
+test_description='git grep in binary files'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' "
+ printf 'binary\000file\n' >a &&
+ git add a &&
+ git commit -m.
+"
+
+test_expect_success 'git grep ina a' '
+ echo Binary file a matches >expect &&
+ git grep ina a >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'git grep -ah ina a' '
+ git grep -ah ina a >actual &&
+ test_cmp a actual
+'
+
+test_expect_success 'git grep -I ina a' '
+ : >expect &&
+ test_must_fail git grep -I ina a >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'git grep -L bar a' '
+ echo a >expect &&
+ git grep -L bar a >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'git grep -q ina a' '
+ : >expect &&
+ git grep -q ina a >actual &&
+ test_cmp expect actual
+'
+
+test_done
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 2/8] grep: grep: refactor handling of binary mode options
2010-05-22 21:24 ` René Scharfe
2010-05-22 21:26 ` [PATCH 1/8] grep: add test script for binary file handling René Scharfe
@ 2010-05-22 21:28 ` René Scharfe
2010-05-22 21:29 ` [PATCH 3/8] grep: --count over binary René Scharfe
` (5 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:28 UTC (permalink / raw)
Cc: Junio C Hamano, git
Turn the switch inside-out and add labels for each possible value
of ->binary. This makes the code easier to read and avoids calling
buffer_is_binary() if the option -a was given.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 20 +++++++++++---------
1 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/grep.c b/grep.c
index 543b1d5..2a8e879 100644
--- a/grep.c
+++ b/grep.c
@@ -800,17 +800,19 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
opt->show_hunk_mark = 1;
opt->last_shown = 0;
- if (buffer_is_binary(buf, size)) {
- switch (opt->binary) {
- case GREP_BINARY_DEFAULT:
+ switch (opt->binary) {
+ case GREP_BINARY_DEFAULT:
+ if (buffer_is_binary(buf, size))
binary_match_only = 1;
- break;
- case GREP_BINARY_NOMATCH:
+ break;
+ case GREP_BINARY_NOMATCH:
+ if (buffer_is_binary(buf, size))
return 0; /* Assume unmatch */
- break;
- default:
- break;
- }
+ break;
+ case GREP_BINARY_TEXT:
+ break;
+ default:
+ die("bug: unknown binary handling mode");
}
memset(&xecfg, 0, sizeof(xecfg));
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 3/8] grep: --count over binary
2010-05-22 21:24 ` René Scharfe
2010-05-22 21:26 ` [PATCH 1/8] grep: add test script for binary file handling René Scharfe
2010-05-22 21:28 ` [PATCH 2/8] grep: grep: refactor handling of binary mode options René Scharfe
@ 2010-05-22 21:29 ` René Scharfe
2010-05-22 21:30 ` [PATCH 4/8] grep: --name-only " René Scharfe
` (4 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
The intent of showing the message "Binary file xyz matches" for
binary files is to avoid annoying users by potentially messing up
their terminals by printing control characters. In --count mode,
this precaution isn't necessary.
Display counts of matches if -c/--count was specified, even if -a
was not given. GNU grep does the same.
Moving the check for ->count before the code for handling binary
file also avoids printing context lines if --count and -[ABC] were
used together, so we can remove the part of the comment that
mentions this behaviour. Again, GNU grep does the same.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 9 ++++-----
t/t7008-grep-binary.sh | 6 ++++++
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/grep.c b/grep.c
index 2a8e879..35c18b7 100644
--- a/grep.c
+++ b/grep.c
@@ -873,6 +873,8 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
count++;
if (opt->status_only)
return 1;
+ if (opt->count)
+ goto next_line;
if (binary_match_only) {
opt->output(opt, "Binary file ", 12);
output_color(opt, name, strlen(name),
@@ -886,16 +888,12 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
}
/* Hit at this line. If we haven't shown the
* pre-context lines, we would need to show them.
- * When asked to do "count", this still show
- * the context which is nonsense, but the user
- * deserves to get that ;-).
*/
if (opt->pre_context)
show_pre_context(opt, name, buf, bol, lno);
else if (opt->funcname)
show_funcname_line(opt, name, buf, bol, lno);
- if (!opt->count)
- show_line(opt, bol, eol, name, lno, ':');
+ show_line(opt, bol, eol, name, lno, ':');
last_hit = lno;
}
else if (last_hit &&
@@ -939,6 +937,7 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
output_sep(opt, ':');
snprintf(buf, sizeof(buf), "%u\n", count);
opt->output(opt, buf, strlen(buf));
+ return 1;
}
return !!last_hit;
}
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 2320e74..91970ea 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -27,6 +27,12 @@ test_expect_success 'git grep -I ina a' '
test_cmp expect actual
'
+test_expect_success 'git grep -c ina a' '
+ echo a:1 >expect &&
+ git grep -c ina a >actual &&
+ test_cmp expect actual
+'
+
test_expect_success 'git grep -L bar a' '
echo a >expect &&
git grep -L bar a >actual &&
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 4/8] grep: --name-only over binary
2010-05-22 21:24 ` René Scharfe
` (2 preceding siblings ...)
2010-05-22 21:29 ` [PATCH 3/8] grep: --count over binary René Scharfe
@ 2010-05-22 21:30 ` René Scharfe
2010-05-22 21:32 ` [PATCH 5/8] grep: use memmem() for fixed string search René Scharfe
` (3 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:30 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
As with the option -c/--count, git grep with the option -l/--name-only
should work the same with binary files as with text files because
there is no danger of messing up the terminal with control characters
from the contents of matching files. GNU grep does the same.
Move the check for ->name_only before the one for binary_match_only,
thus making the latter irrelevant for git grep -l.
Reported-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 8 ++++----
t/t7008-grep-binary.sh | 6 ++++++
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/grep.c b/grep.c
index 35c18b7..22639cd 100644
--- a/grep.c
+++ b/grep.c
@@ -873,6 +873,10 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
count++;
if (opt->status_only)
return 1;
+ if (opt->name_only) {
+ show_name(opt, name);
+ return 1;
+ }
if (opt->count)
goto next_line;
if (binary_match_only) {
@@ -882,10 +886,6 @@ static int grep_buffer_1(struct grep_opt *opt, const char *name,
opt->output(opt, " matches\n", 9);
return 1;
}
- if (opt->name_only) {
- show_name(opt, name);
- return 1;
- }
/* Hit at this line. If we haven't shown the
* pre-context lines, we would need to show them.
*/
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 91970ea..4a12d97 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -33,6 +33,12 @@ test_expect_success 'git grep -c ina a' '
test_cmp expect actual
'
+test_expect_success 'git grep -l ina a' '
+ echo a >expect &&
+ git grep -l ina a >actual &&
+ test_cmp expect actual
+'
+
test_expect_success 'git grep -L bar a' '
echo a >expect &&
git grep -L bar a >actual &&
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 5/8] grep: use memmem() for fixed string search
2010-05-22 21:24 ` René Scharfe
` (3 preceding siblings ...)
2010-05-22 21:30 ` [PATCH 4/8] grep: --name-only " René Scharfe
@ 2010-05-22 21:32 ` René Scharfe
2010-05-22 21:34 ` [PATCH 6/8] grep: continue case insensitive fixed string search after NUL chars René Scharfe
` (2 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Allow searching beyond NUL characters by using memmem() instead of
strstr().
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 16 +++++++++-------
t/t7008-grep-binary.sh | 4 ++++
2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/grep.c b/grep.c
index 22639cd..c3affb6 100644
--- a/grep.c
+++ b/grep.c
@@ -329,14 +329,15 @@ static void show_name(struct grep_opt *opt, const char *name)
opt->output(opt, opt->null_following_name ? "\0" : "\n", 1);
}
-
-static int fixmatch(const char *pattern, char *line, int ignore_case, regmatch_t *match)
+static int fixmatch(const char *pattern, char *line, char *eol,
+ int ignore_case, regmatch_t *match)
{
char *hit;
+
if (ignore_case)
hit = strcasestr(line, pattern);
else
- hit = strstr(line, pattern);
+ hit = memmem(line, eol - line, pattern, strlen(pattern));
if (!hit) {
match->rm_so = match->rm_eo = -1;
@@ -399,7 +400,7 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
again:
if (p->fixed)
- hit = !fixmatch(p->pattern, bol, p->ignore_case, pmatch);
+ hit = !fixmatch(p->pattern, bol, eol, p->ignore_case, pmatch);
else
hit = !regexec(&p->regexp, bol, 1, pmatch, eflags);
@@ -725,9 +726,10 @@ static int look_ahead(struct grep_opt *opt,
int hit;
regmatch_t m;
- if (p->fixed)
- hit = !fixmatch(p->pattern, bol, p->ignore_case, &m);
- else {
+ if (p->fixed) {
+ hit = !fixmatch(p->pattern, bol, bol + *left_p,
+ p->ignore_case, &m);
+ } else {
#ifdef REG_STARTEND
m.rm_so = 0;
m.rm_eo = *left_p;
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 4a12d97..9adc9ed 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -51,4 +51,8 @@ test_expect_success 'git grep -q ina a' '
test_cmp expect actual
'
+test_expect_success 'git grep -F ile a' '
+ git grep -F ile a
+'
+
test_done
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 6/8] grep: continue case insensitive fixed string search after NUL chars
2010-05-22 21:24 ` René Scharfe
` (4 preceding siblings ...)
2010-05-22 21:32 ` [PATCH 5/8] grep: use memmem() for fixed string search René Scharfe
@ 2010-05-22 21:34 ` René Scharfe
2010-05-22 21:35 ` [PATCH 7/8] grep: use REG_STARTEND for all matching if available René Scharfe
2010-05-22 21:43 ` [PATCH 8/8] grep: support NUL chars in search strings for -F René Scharfe
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Functions for C strings, like strcasestr(), can't see beyond NUL
characters. Check if there is such an obstacle on the line and try
again behind it.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 12 +++++++++---
t/t7008-grep-binary.sh | 4 ++++
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/grep.c b/grep.c
index c3affb6..b95803b 100644
--- a/grep.c
+++ b/grep.c
@@ -334,9 +334,15 @@ static int fixmatch(const char *pattern, char *line, char *eol,
{
char *hit;
- if (ignore_case)
- hit = strcasestr(line, pattern);
- else
+ if (ignore_case) {
+ char *s = line;
+ do {
+ hit = strcasestr(s, pattern);
+ if (hit)
+ break;
+ s += strlen(s) + 1;
+ } while (s < eol);
+ } else
hit = memmem(line, eol - line, pattern, strlen(pattern));
if (!hit) {
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 9adc9ed..9660842 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -55,4 +55,8 @@ test_expect_success 'git grep -F ile a' '
git grep -F ile a
'
+test_expect_success 'git grep -Fi iLE a' '
+ git grep -Fi iLE a
+'
+
test_done
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 7/8] grep: use REG_STARTEND for all matching if available
2010-05-22 21:24 ` René Scharfe
` (5 preceding siblings ...)
2010-05-22 21:34 ` [PATCH 6/8] grep: continue case insensitive fixed string search after NUL chars René Scharfe
@ 2010-05-22 21:35 ` René Scharfe
2010-05-22 21:43 ` [PATCH 8/8] grep: support NUL chars in search strings for -F René Scharfe
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Refactor REG_STARTEND handling inlook_ahead() into a new helper,
regmatch(), and use it for line matching, too. This allows regex
matching beyond NUL characters if regexec() supports the flag. NUL
characters themselves are not matched in any way, though.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
grep.c | 24 ++++++++++++++----------
t/t7008-grep-binary.sh | 10 ++++++++++
2 files changed, 24 insertions(+), 10 deletions(-)
diff --git a/grep.c b/grep.c
index b95803b..70a776f 100644
--- a/grep.c
+++ b/grep.c
@@ -356,6 +356,17 @@ static int fixmatch(const char *pattern, char *line, char *eol,
}
}
+static int regmatch(const regex_t *preg, char *line, char *eol,
+ regmatch_t *match, int eflags)
+{
+#ifdef REG_STARTEND
+ match->rm_so = 0;
+ match->rm_eo = eol - line;
+ eflags |= REG_STARTEND;
+#endif
+ return regexec(preg, line, 1, match, eflags);
+}
+
static int strip_timestamp(char *bol, char **eol_p)
{
char *eol = *eol_p;
@@ -408,7 +419,7 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
if (p->fixed)
hit = !fixmatch(p->pattern, bol, eol, p->ignore_case, pmatch);
else
- hit = !regexec(&p->regexp, bol, 1, pmatch, eflags);
+ hit = !regmatch(&p->regexp, bol, eol, pmatch, eflags);
if (hit && p->word_regexp) {
if ((pmatch[0].rm_so < 0) ||
@@ -735,15 +746,8 @@ static int look_ahead(struct grep_opt *opt,
if (p->fixed) {
hit = !fixmatch(p->pattern, bol, bol + *left_p,
p->ignore_case, &m);
- } else {
-#ifdef REG_STARTEND
- m.rm_so = 0;
- m.rm_eo = *left_p;
- hit = !regexec(&p->regexp, bol, 1, &m, REG_STARTEND);
-#else
- hit = !regexec(&p->regexp, bol, 1, &m, 0);
-#endif
- }
+ } else
+ hit = !regmatch(&p->regexp, bol, bol + *left_p, &m, 0);
if (!hit || m.rm_so < 0 || m.rm_eo < 0)
continue;
if (earliest < 0 || m.rm_so < earliest)
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 9660842..4f5e74f 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -59,4 +59,14 @@ test_expect_success 'git grep -Fi iLE a' '
git grep -Fi iLE a
'
+# This test actually passes on platforms where regexec() supports the
+# flag REG_STARTEND.
+test_expect_failure 'git grep ile a' '
+ git grep ile a
+'
+
+test_expect_failure 'git grep .fi a' '
+ git grep .fi a
+'
+
test_done
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 8/8] grep: support NUL chars in search strings for -F
2010-05-22 21:24 ` René Scharfe
` (6 preceding siblings ...)
2010-05-22 21:35 ` [PATCH 7/8] grep: use REG_STARTEND for all matching if available René Scharfe
@ 2010-05-22 21:43 ` René Scharfe
7 siblings, 0 replies; 35+ messages in thread
From: René Scharfe @ 2010-05-22 21:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Search patterns in a file specified with -f can contain NUL characters.
The current code ignores all characters on a line after a NUL.
Pass the actual length of the line all the way from the pattern file to
fixmatch() and use it for case-sensitive fixed string matching.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
Support for -F was easy, but in order to be able to search for NULs
with -Fi, -G and -E, we'd need a different case-insensitive fixed
string search function (memcasemem?) and a different regex library, or
at least use a different (non-POSIX) entry point.
How badly do we need this feature? If the new regex lib is faster or
improves multi-platform support then NUL support would be a nice side
effect, I think, but this feature alone doesn't justify a switch in my
eyes.
builtin/grep.c | 8 ++++++--
grep.c | 33 ++++++++++++++++++++-------------
grep.h | 2 ++
t/t7008-grep-binary.sh | 30 ++++++++++++++++++++++++++++++
4 files changed, 58 insertions(+), 15 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index b194ea3..d0a73da 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -724,11 +724,15 @@ static int file_callback(const struct option *opt, const char *arg, int unset)
if (!patterns)
die_errno("cannot open '%s'", arg);
while (strbuf_getline(&sb, patterns, '\n') == 0) {
+ char *s;
+ size_t len;
+
/* ignore empty line like grep does */
if (sb.len == 0)
continue;
- append_grep_pattern(grep_opt, strbuf_detach(&sb, NULL), arg,
- ++lno, GREP_PATTERN);
+
+ s = strbuf_detach(&sb, &len);
+ append_grep_pat(grep_opt, s, len, arg, ++lno, GREP_PATTERN);
}
fclose(patterns);
strbuf_release(&sb);
diff --git a/grep.c b/grep.c
index 70a776f..82fb349 100644
--- a/grep.c
+++ b/grep.c
@@ -7,6 +7,7 @@ void append_header_grep_pattern(struct grep_opt *opt, enum grep_header_field fie
{
struct grep_pat *p = xcalloc(1, sizeof(*p));
p->pattern = pat;
+ p->patternlen = strlen(pat);
p->origin = "header";
p->no = 0;
p->token = GREP_PATTERN_HEAD;
@@ -19,8 +20,15 @@ void append_header_grep_pattern(struct grep_opt *opt, enum grep_header_field fie
void append_grep_pattern(struct grep_opt *opt, const char *pat,
const char *origin, int no, enum grep_pat_token t)
{
+ append_grep_pat(opt, pat, strlen(pat), origin, no, t);
+}
+
+void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen,
+ const char *origin, int no, enum grep_pat_token t)
+{
struct grep_pat *p = xcalloc(1, sizeof(*p));
p->pattern = pat;
+ p->patternlen = patlen;
p->origin = origin;
p->no = no;
p->token = t;
@@ -44,8 +52,8 @@ struct grep_opt *grep_opt_dup(const struct grep_opt *opt)
append_header_grep_pattern(ret, pat->field,
pat->pattern);
else
- append_grep_pattern(ret, pat->pattern, pat->origin,
- pat->no, pat->token);
+ append_grep_pat(ret, pat->pattern, pat->patternlen,
+ pat->origin, pat->no, pat->token);
}
return ret;
@@ -329,21 +337,21 @@ static void show_name(struct grep_opt *opt, const char *name)
opt->output(opt, opt->null_following_name ? "\0" : "\n", 1);
}
-static int fixmatch(const char *pattern, char *line, char *eol,
- int ignore_case, regmatch_t *match)
+static int fixmatch(struct grep_pat *p, char *line, char *eol,
+ regmatch_t *match)
{
char *hit;
- if (ignore_case) {
+ if (p->ignore_case) {
char *s = line;
do {
- hit = strcasestr(s, pattern);
+ hit = strcasestr(s, p->pattern);
if (hit)
break;
s += strlen(s) + 1;
} while (s < eol);
} else
- hit = memmem(line, eol - line, pattern, strlen(pattern));
+ hit = memmem(line, eol - line, p->pattern, p->patternlen);
if (!hit) {
match->rm_so = match->rm_eo = -1;
@@ -351,7 +359,7 @@ static int fixmatch(const char *pattern, char *line, char *eol,
}
else {
match->rm_so = hit - line;
- match->rm_eo = match->rm_so + strlen(pattern);
+ match->rm_eo = match->rm_so + p->patternlen;
return 0;
}
}
@@ -417,7 +425,7 @@ static int match_one_pattern(struct grep_pat *p, char *bol, char *eol,
again:
if (p->fixed)
- hit = !fixmatch(p->pattern, bol, eol, p->ignore_case, pmatch);
+ hit = !fixmatch(p, bol, eol, pmatch);
else
hit = !regmatch(&p->regexp, bol, eol, pmatch, eflags);
@@ -743,10 +751,9 @@ static int look_ahead(struct grep_opt *opt,
int hit;
regmatch_t m;
- if (p->fixed) {
- hit = !fixmatch(p->pattern, bol, bol + *left_p,
- p->ignore_case, &m);
- } else
+ if (p->fixed)
+ hit = !fixmatch(p, bol, bol + *left_p, &m);
+ else
hit = !regmatch(&p->regexp, bol, bol + *left_p, &m, 0);
if (!hit || m.rm_so < 0 || m.rm_eo < 0)
continue;
diff --git a/grep.h b/grep.h
index 89342e5..0aebebd 100644
--- a/grep.h
+++ b/grep.h
@@ -29,6 +29,7 @@ struct grep_pat {
int no;
enum grep_pat_token token;
const char *pattern;
+ size_t patternlen;
enum grep_header_field field;
regex_t regexp;
unsigned fixed:1;
@@ -104,6 +105,7 @@ struct grep_opt {
void *output_priv;
};
+extern void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
extern void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
extern void append_header_grep_pattern(struct grep_opt *, enum grep_header_field, const char *);
extern void compile_grep_patterns(struct grep_opt *opt);
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 4f5e74f..eb8ca88 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -69,4 +69,34 @@ test_expect_failure 'git grep .fi a' '
git grep .fi a
'
+test_expect_success 'git grep -F y<NUL>f a' "
+ printf 'y\000f' >f &&
+ git grep -f f -F a
+"
+
+test_expect_success 'git grep -F y<NUL>x a' "
+ printf 'y\000x' >f &&
+ test_must_fail git grep -f f -F a
+"
+
+test_expect_success 'git grep -Fi Y<NUL>f a' "
+ printf 'Y\000f' >f &&
+ git grep -f f -Fi a
+"
+
+test_expect_failure 'git grep -Fi Y<NUL>x a' "
+ printf 'Y\000x' >f &&
+ test_must_fail git grep -f f -Fi a
+"
+
+test_expect_success 'git grep y<NUL>f a' "
+ printf 'y\000f' >f &&
+ git grep -f f a
+"
+
+test_expect_failure 'git grep y<NUL>x a' "
+ printf 'y\000x' >f &&
+ test_must_fail git grep -f f a
+"
+
test_done
--
1.7.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-22 19:42 ` Eyvind Bernhardsen
@ 2010-05-22 22:27 ` Clemens Buchacher
2010-05-23 10:36 ` Eyvind Bernhardsen
0 siblings, 1 reply; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-22 22:27 UTC (permalink / raw)
To: Eyvind Bernhardsen; +Cc: Finn Arne Gangstad, Junio C Hamano, git
Hi Eyvind,
Thanks for the extended summary. I still have several doubts, as
detailed below. But I understand that this has been heavily
discussed and if the discussion has indeed come to a conclusion,
then I will not complain about it now.
On Sat, May 22, 2010 at 09:42:14PM +0200, Eyvind Bernhardsen wrote:
> On 22. mai 2010, at 15.09, Clemens Buchacher wrote:
>
> > As soon as the existing crlf attribute is given priority over
> > core.autocrlf, all the problems discussed originally go away.
> > So what exactly are the new attributes supposed to do?
For all my comments below I am assuming that the behavior of
autocrlf will be changed to respect the crlf/text attribute by
default.
> There is one new attribute, "eol", that is used for files which
> need a specific line ending. Being able to "force" LF or CRLF
> line endings has been requested several times on the list, and is
> already sort of provided for by "crlf=input".
[...]
> Any file with the "text" attribute set will have its line endings
> normalized to LF in the repository. If "text" is set to the
> special value "auto", git will only convert the file if it looks
> like a text file.
>
> The "eol" attribute is used for files that need a specific line
> ending. Setting it also sets "text".
If a file needs specific line endings, why enable conversion for
this file at all? Just make sure the repository contains the
correct version and unset the crlf attribute.
> core.eol controls which line endings to use for normalized files
> that don't have the "eol" attribute set, and defaults to the
> platform native line ending.
That makes sense to me, except for the part where I need a per-file
attribute.
> When core.autocrlf is set, the default value of the "text"
> attribute is set to "auto" but with an extra safety feature: if a
> file contains CRs in the index, it won't be normalized. The
> extra feature comes from Finn Arne's "safe autocrlf" patch.
>
> There is a backwards compatibility wrinkle in that core.autocrlf
> will override core.eol if the latter isn't explicitly set, so
> that "core.autocrlf=true" still results in CRLFs in the working
> directory on Linux.
This also makes sense. I just fear that making this frequently
misunderstood feature even more complex will only confuse users
further.
I do see the value of a global core.eol option, however, since it
allows me to convert to LF instead of CRLF, which AFAIK is not
currently possible.
On the other hand, this will cause users to stop caring whether or
not a file in the repository has LF or CRLF line endings. I am not
sure if that is a good thing, but I suppose it is better than what
we have now.
> > And, renaming the crlf attribute to text? Where did Linus suggest that? If
> > we do that, we don't even have to talk about backwards compatibility any
> > more.
>
> In <alpine.LFD.2.00.1005121824260.3711@i5.linux-foundation.org>:
> > So if you rename these things, keep them separate. Make the "am I a
> > text-file" boolean be a boolean (plus "auto"), and just call it "text".
> > And make the "what end of line to use" be just "eol" then.
I see. Well, if we rename the "crlf" attribute then we will have a
macro attribute "binary" and an attribute "text", which are not the
opposite of each other. That is a bit strange.
> The "crlf" attribute will be used if it is present so backwards
> compatibility is preserved to a degree.
Ah, ok. That is fine then.
> Scripts that test for
> the "crlf" attribute explicitly (such as git-cvsserver, which I
> fixed) will break. I don't know how big a problem that is going
> to be in practice, but nobody raised it as an issue during the
> discussion.
I agree that should not be a big issue.
Regards,
Clemens
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-22 22:27 ` Clemens Buchacher
@ 2010-05-23 10:36 ` Eyvind Bernhardsen
2010-05-23 11:51 ` Clemens Buchacher
0 siblings, 1 reply; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-23 10:36 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On 23. mai 2010, at 00.27, Clemens Buchacher wrote:
> Hi Eyvind,
>
> Thanks for the extended summary. I still have several doubts, as
> detailed below. But I understand that this has been heavily
> discussed and if the discussion has indeed come to a conclusion,
> then I will not complain about it now.
I appreciate your comments. The more people who take a critical look at the series, the better :) I think most of your concerns were covered during the discussion, but it's always good to have more sanity checks.
[...]
> For all my comments below I am assuming that the behavior of
> autocrlf will be changed to respect the crlf/text attribute by
> default.
That's right, the attribute is used regardless of what autocrlf is set to.
>> The "eol" attribute is used for files that need a specific line
>> ending. Setting it also sets "text".
>
> If a file needs specific line endings, why enable conversion for
> this file at all? Just make sure the repository contains the
> correct version and unset the crlf attribute.
Yeah, that's what I initially thought too, but it makes sense to be able to use normalization to prevent line ending breakages in your repository. If a file needs CRLFs for some tool to work, you don't want anyone to inadvertently convert it to LF, and "eol=crlf" makes git enforce that.
[...]
>> There is a backwards compatibility wrinkle in that core.autocrlf
>> will override core.eol if the latter isn't explicitly set, so
>> that "core.autocrlf=true" still results in CRLFs in the working
>> directory on Linux.
>
> This also makes sense. I just fear that making this frequently
> misunderstood feature even more complex will only confuse users
> further.
I agree, but I really wanted to avoid breaking any settings. I'm thinking of submitting a patch to remove the backwards compatibility for 1.8, leaving core.autocrlf as a simple boolean meaning "* text=auto".
> I do see the value of a global core.eol option, however, since it
> allows me to convert to LF instead of CRLF, which AFAIK is not
> currently possible.
Actually, since git normalizes to LF, "eol=lf" simply means "convert on input but not on output", which is what "core.autocrlf=input" currently does. The fact that you didn't know this reflects the poor usability of core.autocrlf, which is one of the things this series is trying to rectify :)
It also allows for the possibility of supporting other line endings such as the classic Mac "CR only". That might not be a good thing.
> On the other hand, this will cause users to stop caring whether or
> not a file in the repository has LF or CRLF line endings. I am not
> sure if that is a good thing, but I suppose it is better than what
> we have now.
The problem is that users don't care anyway, and they've been trained by other VCSes that they shouldn't need to care. This series allows me to put "* text=auto" in my repository and git will automatically normalize text files regardless of the user's configuration, which wasn't possible before.
[...]
> I see. Well, if we rename the "crlf" attribute then we will have a
> macro attribute "binary" and an attribute "text", which are not the
> opposite of each other. That is a bit strange.
Yes, but I don't think it should cause any problems in practice. Since the "binary" attribute just means "-text -diff" it will override a "* text=auto", which I is the only relevant interaction I can see.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-23 10:36 ` Eyvind Bernhardsen
@ 2010-05-23 11:51 ` Clemens Buchacher
2010-05-23 12:53 ` Eyvind Bernhardsen
2010-05-24 12:12 ` Dmitry Potapov
0 siblings, 2 replies; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-23 11:51 UTC (permalink / raw)
To: Eyvind Bernhardsen; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On Sun, May 23, 2010 at 12:36:51PM +0200, Eyvind Bernhardsen wrote:
> On 23. mai 2010, at 00.27, Clemens Buchacher wrote:
>
> >> The "eol" attribute is used for files that need a specific line
> >> ending. Setting it also sets "text".
> >
> > If a file needs specific line endings, why enable conversion for
> > this file at all? Just make sure the repository contains the
> > correct version and unset the crlf attribute.
>
> Yeah, that's what I initially thought too, but it makes sense to
> be able to use normalization to prevent line ending breakages in
> your repository. If a file needs CRLFs for some tool to work,
> you don't want anyone to inadvertently convert it to LF, and
> "eol=crlf" makes git enforce that.
Unsetting crlf/text already disables converting it to LF. The user
would have to change the line endings in his work tree and commit
the file with wrong line endings. I do not see how this can happen
inadvertently.
> > I do see the value of a global core.eol option, however, since it
> > allows me to convert to LF instead of CRLF, which AFAIK is not
> > currently possible.
>
> Actually, since git normalizes to LF, "eol=lf" simply means
> "convert on input but not on output", which is what
> "core.autocrlf=input" currently does. The fact that you didn't
> know this reflects the poor usability of core.autocrlf, which is
> one of the things this series is trying to rectify :)
No, I am aware of autocrlf=input, but apparently I did not
understand the meaning of eol=lf correctly. So if a file has CRLF
endings in the repository, and eol=lf, it will _not_ be converted
to LF in the work tree? Conversely, if it has LF endings in the
repository, and eol=crlf, it _will_ be converted to CRLF in the
work tree?
I was expecting eol=lf and eol=crlf to be symmetric, which is also
the reason for my reply to Finn's safe crlf patch.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-23 11:51 ` Clemens Buchacher
@ 2010-05-23 12:53 ` Eyvind Bernhardsen
2010-05-23 13:26 ` Ævar Arnfjörð Bjarmason
2010-05-24 9:49 ` Clemens Buchacher
2010-05-24 12:12 ` Dmitry Potapov
1 sibling, 2 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-23 12:53 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On 23. mai 2010, at 13.51, Clemens Buchacher wrote:
> On Sun, May 23, 2010 at 12:36:51PM +0200, Eyvind Bernhardsen wrote:
>> Yeah, that's what I initially thought too, but it makes sense to
>> be able to use normalization to prevent line ending breakages in
>> your repository. If a file needs CRLFs for some tool to work,
>> you don't want anyone to inadvertently convert it to LF, and
>> "eol=crlf" makes git enforce that.
>
> Unsetting crlf/text already disables converting it to LF. The user
> would have to change the line endings in his work tree and commit
> the file with wrong line endings. I do not see how this can happen
> inadvertently.
That's because you don't use (or work with people who use) editors that mangle line endings without asking. Modern versions of vi and Emacs don't do this, but the problem still exists in popular Windows editors. I don't have strong feelings about the need for this feature, but it has been requested so it's probably useful.
[...]
> No, I am aware of autocrlf=input, but apparently I did not
> understand the meaning of eol=lf correctly. So if a file has CRLF
> endings in the repository, and eol=lf, it will _not_ be converted
> to LF in the work tree? Conversely, if it has LF endings in the
> repository, and eol=crlf, it _will_ be converted to CRLF in the
> work tree?
That is correct, but "eol=lf" means that the file _should_ be LF-only in the repository. If it isn't, the repository is "corrupted". Such a file is marked as dirty when it is checked out and will be normalized to LF-only line endings when it is committed, at which point the repository will be fixed.
This should only be a problem if you set the "text" or "eol" attributes in an existing repository, or if someone adds CRLFs to a normalized file using an older version of git.
I'm sorry if I was unclear about Finn Arne's patch: "safe autocrlf" only applies when files are normalized due to core.autocrlf. If "text" or "eol" is set on a file, the file is normalized to LF-only regardless of whether it contains CRs in the repository.
> I was expecting eol=lf and eol=crlf to be symmetric, which is also
> the reason for my reply to Finn's safe crlf patch.
I think they are symmetric in practice. While it's possible to get the repository into a state where you will get CRLFs in a text file that should be LF-only, it is both unlikely and easily fixed.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-23 12:53 ` Eyvind Bernhardsen
@ 2010-05-23 13:26 ` Ævar Arnfjörð Bjarmason
2010-05-24 9:49 ` Clemens Buchacher
1 sibling, 0 replies; 35+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2010-05-23 13:26 UTC (permalink / raw)
To: Eyvind Bernhardsen
Cc: Clemens Buchacher, Finn Arne Gangstad, Junio C Hamano, git
On Sun, May 23, 2010 at 12:53, Eyvind Bernhardsen
<eyvind.bernhardsen@gmail.com> wrote:
> I don't have strong feelings about the need for this feature, but it
> has been requested so it's probably useful.
It is, sanity checks for broken user tools like these are
unfortunately important for Windows adoption of Git.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-23 12:53 ` Eyvind Bernhardsen
2010-05-23 13:26 ` Ævar Arnfjörð Bjarmason
@ 2010-05-24 9:49 ` Clemens Buchacher
2010-05-24 12:47 ` Dmitry Potapov
2010-05-24 21:11 ` Eyvind Bernhardsen
1 sibling, 2 replies; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-24 9:49 UTC (permalink / raw)
To: Eyvind Bernhardsen; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On Sun, May 23, 2010 at 02:53:18PM +0200, Eyvind Bernhardsen wrote:
> On 23. mai 2010, at 13.51, Clemens Buchacher wrote:
>
> > On Sun, May 23, 2010 at 12:36:51PM +0200, Eyvind Bernhardsen wrote:
> >> Yeah, that's what I initially thought too, but it makes sense to
> >> be able to use normalization to prevent line ending breakages in
> >> your repository. If a file needs CRLFs for some tool to work,
> >> you don't want anyone to inadvertently convert it to LF, and
> >> "eol=crlf" makes git enforce that.
> >
> > Unsetting crlf/text already disables converting it to LF. The user
> > would have to change the line endings in his work tree and commit
> > the file with wrong line endings. I do not see how this can happen
> > inadvertently.
>
> That's because you don't use (or work with people who use)
> editors that mangle line endings without asking.
That I can imagine. But due to their change, the file will not work
any more, not even on their own platform. What did they even make
the change for then?
> I don't have strong feelings about the need for this feature, but
> it has been requested so it's probably useful.
Just because a feature is requested doesn't mean it's useful, or
even harmless. This has nothing to do with version control in the
first place, so I do not see why we should suffer the additional
complication.
> > No, I am aware of autocrlf=input, but apparently I did not
> > understand the meaning of eol=lf correctly. So if a file has CRLF
> > endings in the repository, and eol=lf, it will _not_ be converted
> > to LF in the work tree? Conversely, if it has LF endings in the
> > repository, and eol=crlf, it _will_ be converted to CRLF in the
> > work tree?
>
> That is correct, but "eol=lf" means that the file _should_ be
> LF-only in the repository. If it isn't, the repository is
> "corrupted". Such a file is marked as dirty when it is checked
> out and will be normalized to LF-only line endings when it is
> committed, at which point the repository will be fixed.
With CRLF file in the repository, core.autocrlf=true and
core.eol=lf, I tested the patches currently in pu (0ed6711a) in the
following three scenarios:
1. no attributes are set
2. text attribute is set or auto
3. eol attribute is set to lf
In the first scenario, the behavior is completely asymmetric. LF
files will be converted to CRLF, if core.eol=crlf, but CRLF files
will _not_ be converted to LF, even if core.eol=lf. And it will not
be marked as dirty either. Yet this is the default behavior in
terms of attributes.
The only justification I can think of for this behavior is the fact
that on platforms with LF line endings, most tools can deal with
CRLF line endings. Not very convincing.
In the second scenario, the file is marked as dirty. Neither reset
--hard nor checkout HEAD . fix the problem. The file has to be
added and committed, after which the line endings are _still_ CRLF.
This appears to be the old autocrlf=true behavior. Is this
intentional?
The third scenario is similar to the second scenario, only it warns
me about CRLF conversion during add and commit. The file still ends
up having CRLF line endings in the work tree. git reset --hard does
not fix the line endings. touch'ing the file finally makes git diff
notice that the file is dirty, but git status still does not list
the file.
So to me, end of line conversion is still as confusing as it gets.
> This should only be a problem if you set the "text" or "eol"
> attributes in an existing repository, or if someone adds CRLFs to
> a normalized file using an older version of git.
In other words, this will be a problem all the time, since by
default people will not even know about text or eol.
Clemens
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-23 11:51 ` Clemens Buchacher
2010-05-23 12:53 ` Eyvind Bernhardsen
@ 2010-05-24 12:12 ` Dmitry Potapov
2010-05-24 12:22 ` Erik Faye-Lund
1 sibling, 1 reply; 35+ messages in thread
From: Dmitry Potapov @ 2010-05-24 12:12 UTC (permalink / raw)
To: Clemens Buchacher
Cc: Eyvind Bernhardsen, Finn Arne Gangstad, Junio C Hamano, git
On Sun, May 23, 2010 at 01:51:27PM +0200, Clemens Buchacher wrote:
> On Sun, May 23, 2010 at 12:36:51PM +0200, Eyvind Bernhardsen wrote:
> >
> > Actually, since git normalizes to LF, "eol=lf" simply means
> > "convert on input but not on output", which is what
> > "core.autocrlf=input" currently does. The fact that you didn't
> > know this reflects the poor usability of core.autocrlf, which is
> > one of the things this series is trying to rectify :)
>
> No, I am aware of autocrlf=input, but apparently I did not
> understand the meaning of eol=lf correctly. So if a file has CRLF
> endings in the repository, and eol=lf, it will _not_ be converted
> to LF in the work tree? Conversely, if it has LF endings in the
> repository, and eol=crlf, it _will_ be converted to CRLF in the
> work tree?
All text files should LF in the repository, if some file does not, it
means the repository is corrupted in respect of handling text files.
So, the situation is not symmetric at all! I don't know how better to
handle this "corruption". I guess, it should be a warning about some
files having different ending that it should have had based on their
attributes.
>
> I was expecting eol=lf and eol=crlf to be symmetric, which is also
> the reason for my reply to Finn's safe crlf patch.
Finn's patch about _automatic_ text detection when there is no explicit
policy about what ending this file should have.
Dmitry
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 12:12 ` Dmitry Potapov
@ 2010-05-24 12:22 ` Erik Faye-Lund
2010-05-24 12:42 ` Dmitry Potapov
0 siblings, 1 reply; 35+ messages in thread
From: Erik Faye-Lund @ 2010-05-24 12:22 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Clemens Buchacher, Eyvind Bernhardsen, Finn Arne Gangstad,
Junio C Hamano, git
On Mon, May 24, 2010 at 2:12 PM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> On Sun, May 23, 2010 at 01:51:27PM +0200, Clemens Buchacher wrote:
>> On Sun, May 23, 2010 at 12:36:51PM +0200, Eyvind Bernhardsen wrote:
>> >
>> > Actually, since git normalizes to LF, "eol=lf" simply means
>> > "convert on input but not on output", which is what
>> > "core.autocrlf=input" currently does. The fact that you didn't
>> > know this reflects the poor usability of core.autocrlf, which is
>> > one of the things this series is trying to rectify :)
>>
>> No, I am aware of autocrlf=input, but apparently I did not
>> understand the meaning of eol=lf correctly. So if a file has CRLF
>> endings in the repository, and eol=lf, it will _not_ be converted
>> to LF in the work tree? Conversely, if it has LF endings in the
>> repository, and eol=crlf, it _will_ be converted to CRLF in the
>> work tree?
>
> All text files should LF in the repository, if some file does not, it
> means the repository is corrupted in respect of handling text files.
> So, the situation is not symmetric at all! I don't know how better to
> handle this "corruption". I guess, it should be a warning about some
> files having different ending that it should have had based on their
> attributes.
>
I thought the original motivation behind this change was to make repos
with CRLF-textfiles work without reporting diffs on all lines when
autocrlf was enabled. Because checking in CRLF-files DOES happen, and
for some of us the reality is that we have to deal with such repos.
Perhaps I'm confusing this discussion with some other, though.
--
Erik "kusma" Faye-Lund
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 12:22 ` Erik Faye-Lund
@ 2010-05-24 12:42 ` Dmitry Potapov
0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Potapov @ 2010-05-24 12:42 UTC (permalink / raw)
To: kusmabite
Cc: Clemens Buchacher, Eyvind Bernhardsen, Finn Arne Gangstad,
Junio C Hamano, git
On Mon, May 24, 2010 at 02:22:13PM +0200, Erik Faye-Lund wrote:
>
> I thought the original motivation behind this change was to make repos
> with CRLF-textfiles work without reporting diffs on all lines when
> autocrlf was enabled. Because checking in CRLF-files DOES happen, and
> for some of us the reality is that we have to deal with such repos.
Sure, but then CRLF files are treated as "binary" as far as autocrlf is
concerned. There is no conversion for such files even though automatic
text detection would detect them as. Thus, you do not have to worry that
enabling autocrlf may be incompatible with some repositories.
The situation is different when a file explicitly marked by attributes
to have some particular ending. It is a policy of that repository, and
if it is not followed, it means it is "corrupted".
It is similar to what you would have with SVN with eol-style=native for
some file while it being stored with CRLF inside of the SVN repository
(obviously, any standard SVN client should not allow this to happen).
Dmitry
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 9:49 ` Clemens Buchacher
@ 2010-05-24 12:47 ` Dmitry Potapov
2010-05-24 20:45 ` Eyvind Bernhardsen
2010-05-24 20:56 ` Clemens Buchacher
2010-05-24 21:11 ` Eyvind Bernhardsen
1 sibling, 2 replies; 35+ messages in thread
From: Dmitry Potapov @ 2010-05-24 12:47 UTC (permalink / raw)
To: Clemens Buchacher
Cc: Eyvind Bernhardsen, Finn Arne Gangstad, Junio C Hamano, git
On Mon, May 24, 2010 at 11:49:05AM +0200, Clemens Buchacher wrote:
>
> Just because a feature is requested doesn't mean it's useful, or
> even harmless. This has nothing to do with version control in the
> first place, so I do not see why we should suffer the additional
> complication.
IMHO, ability to enforce a specific for particular types of files is
a good thing, and if you can say that these files must have CRLF, it
should be possible to say that those files must have LF. At least, it
would be a logic thing to do. I don't see how it can be harmful.
>
> With CRLF file in the repository, core.autocrlf=true and
> core.eol=lf,
I wonder if this combination should be allowed. core.autocrlf=true
always implied that the native EOL is CRLF. So I do not think any
reasonable behavior can be deduced for this combination. Can you
imagine _anyone_ who would want to have such settings? Otherwise,
it is better to error out if this combination is encountered.
Dmitry
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 12:47 ` Dmitry Potapov
@ 2010-05-24 20:45 ` Eyvind Bernhardsen
2010-05-24 20:56 ` Clemens Buchacher
1 sibling, 0 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-24 20:45 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Clemens Buchacher, Finn Arne Gangstad, Junio C Hamano, git
On 24. mai 2010, at 14.47, Dmitry Potapov wrote:
> On Mon, May 24, 2010 at 11:49:05AM +0200, Clemens Buchacher wrote:
>> With CRLF file in the repository, core.autocrlf=true and
>> core.eol=lf,
>
> I wonder if this combination should be allowed. core.autocrlf=true
> always implied that the native EOL is CRLF. So I do not think any
> reasonable behavior can be deduced for this combination. Can you
> imagine _anyone_ who would want to have such settings? Otherwise,
> it is better to error out if this combination is encountered.
It errors out when core.autocrlf=input conflicts with core.eol, but allows an explicitly set core.eol to override (with no warning) when core.autocrlf=true. That way, the meaning of core.autocrlf can later be changed to simply enable normalization without touching the output format--unfortunately removing any sense to the name.
Leaving core.autocrlf to mean what its name implies would require a new config variable (core.autotext?) for people who want automatic normalization but LFs in their working directory.
I'll be semi-offline for the next week or so, so any update of the series will have to wait until I get back.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 12:47 ` Dmitry Potapov
2010-05-24 20:45 ` Eyvind Bernhardsen
@ 2010-05-24 20:56 ` Clemens Buchacher
2010-05-24 21:09 ` Eyvind Bernhardsen
1 sibling, 1 reply; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-24 20:56 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Eyvind Bernhardsen, Finn Arne Gangstad, Junio C Hamano, git
On Mon, May 24, 2010 at 04:47:34PM +0400, Dmitry Potapov wrote:
> On Mon, May 24, 2010 at 11:49:05AM +0200, Clemens Buchacher wrote:
>
> > With CRLF file in the repository, core.autocrlf=true and
> > core.eol=lf,
>
> I wonder if this combination should be allowed. core.autocrlf=true
> always implied that the native EOL is CRLF.
I just did what the commit message suggested:
If core.eol is set explicitly (including setting it to "native"), it
will override core.autocrlf so that
[core]
autocrlf = true
eol = lf
normalizes all files that look like text, but does not put CRLFs in the
working directory.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 20:56 ` Clemens Buchacher
@ 2010-05-24 21:09 ` Eyvind Bernhardsen
0 siblings, 0 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-24 21:09 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Dmitry Potapov, Finn Arne Gangstad, Junio C Hamano, git
On 24. mai 2010, at 22.56, Clemens Buchacher wrote:
> On Mon, May 24, 2010 at 04:47:34PM +0400, Dmitry Potapov wrote:
>
>> On Mon, May 24, 2010 at 11:49:05AM +0200, Clemens Buchacher wrote:
>>
>>> With CRLF file in the repository, core.autocrlf=true and
>>> core.eol=lf,
>>
>> I wonder if this combination should be allowed. core.autocrlf=true
>> always implied that the native EOL is CRLF.
>
> I just did what the commit message suggested:
>
> If core.eol is set explicitly (including setting it to "native"), it
> will override core.autocrlf so that
>
> [core]
> autocrlf = true
> eol = lf
>
> normalizes all files that look like text, but does not put CRLFs in the
> working directory.
Right. While technically true, this is misleading in that it implies that you will get LF line endings on all your files.
I will update the documentation to reflect that core.autocrlf is only useful if you want to work with CRLF line endings in a repository that is not normalized. I won't be able to do that for a week or so, unfortunately.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 9:49 ` Clemens Buchacher
2010-05-24 12:47 ` Dmitry Potapov
@ 2010-05-24 21:11 ` Eyvind Bernhardsen
2010-05-24 22:11 ` Clemens Buchacher
1 sibling, 1 reply; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-24 21:11 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On 24. mai 2010, at 11.49, Clemens Buchacher wrote:
> With CRLF file in the repository, core.autocrlf=true and
> core.eol=lf, I tested the patches currently in pu (0ed6711a) in the
> following three scenarios:
>
> 1. no attributes are set
> 2. text attribute is set or auto
> 3. eol attribute is set to lf
>
> In the first scenario, the behavior is completely asymmetric. LF
> files will be converted to CRLF, if core.eol=crlf, but CRLF files
> will _not_ be converted to LF, even if core.eol=lf. And it will not
> be marked as dirty either. Yet this is the default behavior in
> terms of attributes.
Default in terms of attributes maybe, but not in terms of configuration (you need to explicitly set core.eol to lf to see this). But I see your point, core.autocrlf just doesn't work very well unless you want CRLFs.
> The only justification I can think of for this behavior is the fact
> that on platforms with LF line endings, most tools can deal with
> CRLF line endings. Not very convincing.
Well, without "safe autocrlf", this would have worked as you expected, marking files as dirty and converting on checkin. Unfortunately, without "safe autocrlf", core.autocrlf just doesn't work in practice, PRECISELY because the person with core.autocrlf set has to clean up after everybody else.
> In the second scenario, the file is marked as dirty. Neither reset
> --hard nor checkout HEAD . fix the problem. The file has to be
> added and committed, after which the line endings are _still_ CRLF.
> This appears to be the old autocrlf=true behavior. Is this
> intentional?
Yes, since it uses the old autocrlf implementation. The documentation states that you need to convert your repository when enabling normalization, but is this going to be big problem in practice?
Perhaps the working directory file should be converted to LF, but when? When it is added? When it's committed? If you checkout -f after you commit the change, the file should have the correct line ending, because then it is normalized in the repository and git knows what to do with it.
> The third scenario is similar to the second scenario, only it warns
> me about CRLF conversion during add and commit. The file still ends
> up having CRLF line endings in the work tree. git reset --hard does
> not fix the line endings. touch'ing the file finally makes git diff
> notice that the file is dirty, but git status still does not list
> the file.
>
> So to me, end of line conversion is still as confusing as it gets.
All of your confusion seems to stem from starting with a repository that is not normalized and adding normalization to it. Yes, that is a pain, but there is no way to avoid that, and there is a recipe for fixing it in the documentation.
>> This should only be a problem if you set the "text" or "eol"
>> attributes in an existing repository, or if someone adds CRLFs to
>> a normalized file using an older version of git.
>
> In other words, this will be a problem all the time, since by
> default people will not even know about text or eol.
By default people will not enable core.autocrlf either. I just don't see the huge problem here. If your repository does not need normalization, _good_! Nothing will ever be normalized or converted. If somebody wants to work on that repository but prefers CRLFs, they can enable core.autocrlf, and it will work sanely (thanks to "safe autocrlf"), converting text files that can safely be normalized back but leaving all other files alone.
If you later discover that you want normalized text files in your repository, you _will_ have to convert all your files. That's a bit tricky, but it's not the huge problem you're making it out to be, and it only has to be done once.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 21:11 ` Eyvind Bernhardsen
@ 2010-05-24 22:11 ` Clemens Buchacher
2010-05-25 6:41 ` Eyvind Bernhardsen
0 siblings, 1 reply; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-24 22:11 UTC (permalink / raw)
To: Eyvind Bernhardsen; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On Mon, May 24, 2010 at 11:11:40PM +0200, Eyvind Bernhardsen wrote:
> If you later discover that you want normalized text files in your
> repository, you _will_ have to convert all your files. That's a
> bit tricky, but it's not the huge problem you're making it out to
> be, and it only has to be done once.
I am not just making this stuff up. These things have bitten me in
the past, and there have been complaints about it in #git. And even
after finding the solution I always felt like crlf handling in git
was really broken. I was hoping that after enabling the new eol
handling, these weird effects would go away, but obviously they
don't.
Maybe for you it does not seem like such a big deal, because you
are now so familiar with the inner workings of this algorithm. But
to a naive user like me it is very counter-intuitive.
I am not saying that the new features are all wrong. Some of them
really are a major improvement. My main point is that it is still
confusing and that in itself will cause issues for many people.
And I don't see why we cannot do better. In the first scenario of
my previous post (no attributes set), since I already enable
core.eol = lf, couldn't we handle that as if text=auto were set on
every file? Isn't that what core.autocrlf = true means in this
case?
And once we normalized the file to LF, why don't we also checkout
that version, or at least mark it as dirty in the index, so a reset
--hard will fix it up?
Regards,
Clemens
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-24 22:11 ` Clemens Buchacher
@ 2010-05-25 6:41 ` Eyvind Bernhardsen
2010-05-25 8:27 ` Anthony Youngman
2010-05-25 8:33 ` Clemens Buchacher
0 siblings, 2 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-25 6:41 UTC (permalink / raw)
To: Clemens Buchacher; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On 25. mai 2010, at 00.11, Clemens Buchacher wrote:
> I am not just making this stuff up. These things have bitten me in
> the past, and there have been complaints about it in #git. And even
> after finding the solution I always felt like crlf handling in git
> was really broken. I was hoping that after enabling the new eol
> handling, these weird effects would go away, but obviously they
> don't.
Sorry, I was in a hurry and shouldn't have been so dismissive. I do think that the new features will help more than they cause trouble, though.
[...]
> And I don't see why we cannot do better. In the first scenario of
> my previous post (no attributes set), since I already enable
> core.eol = lf, couldn't we handle that as if text=auto were set on
> every file? Isn't that what core.autocrlf = true means in this
> case?
Isn't that just the current core.autocrlf=input behaviour? The reason autocrlf was changed is because it breaks badly on repositories that have text files (ie, files that the autocrlf mechanism wants to normalize) that contain CRLFs in the repository.
If you see someone who has problems with autocrlf in #git, it's almost certainly because autocrlf is on by default, they cloned a repository, and now a seemingly random selection of files are dirty and checking them out doesn't help.
The "safe autocrlf" patch fixes this by not trying to normalize any files that are not already normalized in the index. This is what you noticed: the files do not show up as dirty and will not have their line endings converted. The tradeoff is that setting "core.autocrlf" no longer normalizes all text files, only new ones and ones that are already normalized.
You (rightly) expected line endings to be normalized to LF when core.eol=lf, and I do need to fix that in the documentation. Safe autocrlf _only_ works if you want CRLF line endings in your working directory.
A similar feature that converts text files that have CRLF in the repository to LF would need more development. I don't know how many people would use such a feature, and I would solve that problem by setting "* text=auto" and normalizing the repository instead.
> And once we normalized the file to LF, why don't we also checkout
> that version, or at least mark it as dirty in the index, so a reset
> --hard will fix it up?
I dunno. Won't it be even more confusing that the file is still dirty after you add it? The problem with converting it in the working directory when you add is that it loses information: if you didnt' want that file to be converted, there's no way to revert (this is very bad if it's a file that contained a mix of CRLF and LF).
Maybe rewording the "[CR]LF will be replaced by [CR]LF in <file>" warning to tell the user how to get the working directory version normalized would be the best solution.
As I said, though, this is a one-time problem: once your repository is normalized and has text attributes set, it will stay normalized.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-25 6:41 ` Eyvind Bernhardsen
@ 2010-05-25 8:27 ` Anthony Youngman
2010-06-07 19:55 ` Eyvind Bernhardsen
2010-05-25 8:33 ` Clemens Buchacher
1 sibling, 1 reply; 35+ messages in thread
From: Anthony Youngman @ 2010-05-25 8:27 UTC (permalink / raw)
To: Eyvind Bernhardsen
Cc: Clemens Buchacher, Finn Arne Gangstad, Junio C Hamano, git
On 05/25/10 07:41, Eyvind Bernhardsen wrote:
> The "safe autocrlf" patch fixes this by not trying to normalize any files that are not already normalized in the index. This is what you noticed: the files do not show up as dirty and will not have their line endings converted. The tradeoff is that setting "core.autocrlf" no longer normalizes all text files, only new ones and ones that are already normalized.
>
> You (rightly) expected line endings to be normalized to LF when core.eol=lf, and I do need to fix that in the documentation. Safe autocrlf _only_ works if you want CRLF line endings in your working directory.
>
Just a suggestion ...
For core.autocrlf (or somewhere else more appropriate) could we add to
false and true the option "force"?
Bearing in mind "force" is always considered "a bit dangerous", that
merely means "I don't care if it has crlf in the repository, change all
commits to lf" (and checkouts to crlf if appropriate).
Yep, things are likely to break, but I'm thinking this is the sort of
situation where a lead dev could say to themselves "I know what I'm
doing, we need to clean up, and if I set that as my options, then I know
I can fix any resulting mess".
Cheers,
Wol
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-25 6:41 ` Eyvind Bernhardsen
2010-05-25 8:27 ` Anthony Youngman
@ 2010-05-25 8:33 ` Clemens Buchacher
1 sibling, 0 replies; 35+ messages in thread
From: Clemens Buchacher @ 2010-05-25 8:33 UTC (permalink / raw)
To: Eyvind Bernhardsen; +Cc: Finn Arne Gangstad, Junio C Hamano, git
On Tue, May 25, 2010 at 08:41:14AM +0200, Eyvind Bernhardsen wrote:
> On 25. mai 2010, at 00.11, Clemens Buchacher wrote:
>
> > And once we normalized the file to LF, why don't we also checkout
> > that version, or at least mark it as dirty in the index, so a reset
> > --hard will fix it up?
>
> I dunno. Won't it be even more confusing that the file is still
> dirty after you add it? The problem with converting it in the
> working directory when you add is that it loses information: if
> you didnt' want that file to be converted, there's no way to
> revert (this is very bad if it's a file that contained a mix of
> CRLF and LF).
I was trying to illustrate this with a test script, but after
playing with this some more, I noticed that this is not a problem
with your changes, since I can reproduce with git v1.7.0.5:
#!/bin/sh
cd $(mktemp -d)
git init
git config core.autocrlf input
echo -n 'hi\r\nbye\r\n' > file
git add file
git commit -m initial
# no output
git diff --name-status
sleep 1
touch file
# file dirty
git diff --name-status
# at this point, reset --hard would convert 'file' to LF in the
# work tree
git update-index file
# no output
git diff --name-status
The same behavior can be observed with the 'eol=lf' attribute,
before and after normalizing the file.
Regards,
Clemens
---
$ sh -x testcrlf2.sh
+ set -e
+ mktemp -d
+ cd /tmp/tmp.Smnx0U7IT1
+ git init
Initialized empty Git repository in /tmp/tmp.Smnx0U7IT1/.git/
+ git config core.autocrlf input
+ echo -n hi\r\nbye\r\n
+ git add file
warning: CRLF will be replaced by LF in file.
+ git commit -m initial
[master (root-commit) d3be96f] initial
warning: CRLF will be replaced by LF in file.
1 files changed, 2 insertions(+), 0 deletions(-)
create mode 100644 file
+ git diff --name-status
+ sleep 1
+ touch file
+ git diff --name-status
M file
+ git update-index file
warning: CRLF will be replaced by LF in file.
+ git diff --name-status
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: What's cooking extra
2010-05-25 8:27 ` Anthony Youngman
@ 2010-06-07 19:55 ` Eyvind Bernhardsen
0 siblings, 0 replies; 35+ messages in thread
From: Eyvind Bernhardsen @ 2010-06-07 19:55 UTC (permalink / raw)
To: Anthony Youngman
Cc: Clemens Buchacher, Finn Arne Gangstad, Junio C Hamano, git
Hi,
I'm sorry I've taken so long to respond.
On 25. mai 2010, at 10.27, Anthony Youngman wrote:
> Just a suggestion ...
> For core.autocrlf (or somewhere else more appropriate) could we add to
> false and true the option "force"?
>
> Bearing in mind "force" is always considered "a bit dangerous", that
> merely means "I don't care if it has crlf in the repository, change all
> commits to lf" (and checkouts to crlf if appropriate).
>
> Yep, things are likely to break, but I'm thinking this is the sort of
> situation where a lead dev could say to themselves "I know what I'm
> doing, we need to clean up, and if I set that as my options, then I know
> I can fix any resulting mess".
If I understand you correctly, putting "* text=auto" in .gitattributes would do what you want, with the added benefit that you document the decision to normalize inside the repository. If you don't think that's a benefit, you can put "* text=auto" in .git/info/attributes for the same normalizing effect.
--
Eyvind
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2010-06-07 19:55 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-19 14:33 What's cooking extra Junio C Hamano
2010-05-19 15:12 ` A Large Angry SCM
2010-05-19 17:06 ` Finn Arne Gangstad
2010-05-19 20:09 ` Eyvind Bernhardsen
2010-05-22 13:09 ` Clemens Buchacher
2010-05-22 19:42 ` Eyvind Bernhardsen
2010-05-22 22:27 ` Clemens Buchacher
2010-05-23 10:36 ` Eyvind Bernhardsen
2010-05-23 11:51 ` Clemens Buchacher
2010-05-23 12:53 ` Eyvind Bernhardsen
2010-05-23 13:26 ` Ævar Arnfjörð Bjarmason
2010-05-24 9:49 ` Clemens Buchacher
2010-05-24 12:47 ` Dmitry Potapov
2010-05-24 20:45 ` Eyvind Bernhardsen
2010-05-24 20:56 ` Clemens Buchacher
2010-05-24 21:09 ` Eyvind Bernhardsen
2010-05-24 21:11 ` Eyvind Bernhardsen
2010-05-24 22:11 ` Clemens Buchacher
2010-05-25 6:41 ` Eyvind Bernhardsen
2010-05-25 8:27 ` Anthony Youngman
2010-06-07 19:55 ` Eyvind Bernhardsen
2010-05-25 8:33 ` Clemens Buchacher
2010-05-24 12:12 ` Dmitry Potapov
2010-05-24 12:22 ` Erik Faye-Lund
2010-05-24 12:42 ` Dmitry Potapov
2010-05-21 16:16 ` Ævar Arnfjörð Bjarmason
2010-05-22 21:24 ` René Scharfe
2010-05-22 21:26 ` [PATCH 1/8] grep: add test script for binary file handling René Scharfe
2010-05-22 21:28 ` [PATCH 2/8] grep: grep: refactor handling of binary mode options René Scharfe
2010-05-22 21:29 ` [PATCH 3/8] grep: --count over binary René Scharfe
2010-05-22 21:30 ` [PATCH 4/8] grep: --name-only " René Scharfe
2010-05-22 21:32 ` [PATCH 5/8] grep: use memmem() for fixed string search René Scharfe
2010-05-22 21:34 ` [PATCH 6/8] grep: continue case insensitive fixed string search after NUL chars René Scharfe
2010-05-22 21:35 ` [PATCH 7/8] grep: use REG_STARTEND for all matching if available René Scharfe
2010-05-22 21:43 ` [PATCH 8/8] grep: support NUL chars in search strings for -F René Scharfe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).