Git development
 help / color / mirror / Atom feed
* Re: [BUG] attribute "eol" with "crlf"
From: Junio C Hamano @ 2011-12-16 21:21 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Ralf Thielow, git
In-Reply-To: <vpqr504wf70.fsf@bauges.imag.fr>

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> Ralf Thielow <ralf.thielow@googlemail.com> writes:
>
>> There's a bug in git-1.7.8 if you use the attribute "eol" with "crlf".
>>
>> Steps to reproduce:
>> - add and commit a text file which uses 0d0a for line breaks
>> 7465 7374 0d0a 0d0a 7465 7374 0d0a       test....test..
>> - add ".gitattributes" with "*.txt eol=crlf"
>> - change a line in the file
>> - execute "git checkout [file]"
>>
>> The result is:
>> 7465 7374 0d0d 0a0d 0d0a 7465 7374 0d0d  test......test..
>
> It seems to me to be the expected behavior. You committed a file whose
> line endings are not normalized to LF in the repository, and asked for a
> conversion LF -> CRLF on checkout, which Git did.
>
> Git can't know exactly the moment when you edit .gitattributes, so it
> can't do the conversion at the time you add the eol=crlf attribute. It
> does it on checkout.
>
>> 0d0a was replaced by 0d0d0a.
>
> I'd say 0a (LF) was replaced by 0d0a (CRLF).
>
> What behavior would you have expected?

The sequence adds "test\r\n" file without .gitattributes to have the
repository record that exact byte sequence for the file. But then later
goes around and says "This file wants to express the end of line with CRLF
on the filesystem, so please replace LF in the repository representation
to CRLF when checking out, and replace CRLF in the working tree to LF when
checking in".

So it is not surprising that "\r\n" coming from the repository is replaced
to "\r\r\n" when checked out. As far as the repository data is concerned,
that line has a funny byte with value "\r" at the end, immediately before
the line terminator "\n".

What you said is _technically_ correct in that sense.

However, I think the CRLF filter used to have a hack to strip "\r" if the
repository data records "\r" at the end of line. This was intended to help
people who checked in such a broken text file (if it is a text file, then
raw ascii CR does not have a place in it in the repository representation)
and it was a useful hack to help people recover from such mistakes to
start the project from DOS-only world (with CRLF in the repository data)
and migrate to cross platform world (with LF in the repository data, CRLF
in the DOS working tree).  I suspect that the streaming filter conversion
may not have the same hack in it.

^ permalink raw reply

* Re: How to automatically correct an almost correct auto-merge?
From: Neal Kreitzinger @ 2011-12-16 21:32 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git
In-Reply-To: <20111216203215.GG1868@goldbirke>

On 12/16/2011 2:32 PM, SZEDER Gábor wrote:
> Hi,
>
>
> Briefly:
>
> Neighboring areas of a file are modified in two branches.  Git
> merges the two branches without conflicts, but the result is not
> semantically correct.  How can I teach git to produce the correct
> merge result when performing the same merge later on?
>
>
> Longer:
>
> The following commands create a file and two branches, both of them
> modifying the file by adding lines in the same area:
>
> git init cat>file<<\EOF 1
>
> 2 EOF git add file git commit -m file git apply<<\EOF diff --git
> a/file b/file index 1c3e7efc..121366a2 100644 --- a/file +++ b/file
> @@ -1,3 +1,5 @@ 1
>
> +a + 2 EOF git commit -a -m a git checkout -b branch HEAD^ git
> apply<<\EOF diff --git a/file b/file index 1c3e7efc..f2e91d4f 100644
> --- a/file +++ b/file @@ -1,3 +1,6 @@ 1 +b + +c
>
> 2 EOF git commit -a -m 'b c' git checkout master
>
>
> At this point I merge 'branch' and git produces the following
> result:
>
> $ git merge branch Auto-merging file Merge made by the 'recursive'
> strategy. file |    3 +++ 1 files changed, 3 insertions(+), 0
> deletions(-) $ cat file 1 b
>
> c
>
> a
>
> 2
>
>
> Now, these changes and the merge above are the minimal receipe which
> corresponds to a real merge I'm having trouble with at dayjob.  Just
> imagine that '1' and '2' are the beginning and end of a function,
> 'b' is the declaration of a new local variable, and 'a' and 'c' are
> new code blocks.  As it happens, the semantically correct result
> would be the following:
>
> 1 b
>
> a
>
> c
>
> 2
>
> i.e. 'a' must be executed before 'c'.
>
> I corrected the merge result manually, but these two branches are
> merged a couple of times a day into an integration branch, and they
> will likely cook for a few weeks, which means a lot of merges, and a
> lot of manual corrections.  So I'm looking for a way to teach git to
> produce the semantically correct merge result.  Something like
> 'rerere' would be great, but of course I can't use 'rerere' in this
> case, because there are no merge conflicts at all...
>
> Any ideas?  Did someone deal with similar issues before?
>
You can produce conflicts by implementing keyword expansion on "line 1"
(or whatever the first commentable line is in your language) of your 
source changes during your pre-commit hook.  We do a keyword expansion 
on "user" (whomai) and "date" (date) keywords.  This will cause "line 1" 
to conflict on same file edits in a merge.  As for trusting rerere, we 
don't.  We do it manually with kdiff3 as the mergetool.  If rerere works 
for you reliably in this scenario then I'd like to know about it.

v/r,
neal

^ permalink raw reply

* Re: Infinite loop in cascade_filter_fn()
From: Junio C Hamano @ 2011-12-16 22:01 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: Henrik Grubbström, Git Mailing list
In-Reply-To: <20111128104812.GA2386@beez.lab.cmartin.tk>

Carlos Martín Nieto <cmn@elego.de> writes:

> Subject: [PATCHv2] convert: track state in LF-to-CRLF filter
>
> There may not be enough space to store CRLF in the output. If we don't
> fill the buffer, then the filter will keep getting called with the same
> short buffer and will loop forever.
>
> Instead, always store the CR and record whether there's a missing LF
> if so we store it in the output buffer the next time the function gets
> called.
>
> Reported-by: Henrik Grubbström <grubba@roxen.com>
> Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
> ---
>  convert.c |   50 +++++++++++++++++++++++++++++++++++++-------------
>  1 files changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index 86e9c29..1c91409 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -876,24 +876,39 @@ int is_null_stream_filter(struct stream_filter *filter)
>  /*
>   * LF-to-CRLF filter
>   */
> +
> +struct lf_to_crlf_filter {
> +	struct stream_filter filter;
> +	int want_lf;
> +};
> +
>  static int lf_to_crlf_filter_fn(struct stream_filter *filter,
>  				const char *input, size_t *isize_p,
>  				char *output, size_t *osize_p)
>  {
> -	size_t count;
> +	size_t count, o = 0;
> +	struct lf_to_crlf_filter *lfcrlf = (struct lf_to_crlf_filter *) filter;
> +
> +	/* Output a pending LF if we need to */
> +	if (lfcrlf->want_lf) {
> +		output[o++] = '\n';
> +		lfcrlf->want_lf = 0;
> +	}
>  
>  	if (!input)
> -		return 0; /* we do not keep any states */
> +		return 0; /* We've already dealt with the state */
> +

Shouldn't we be decrementing *osize_p by 'o' to signal that we used that
many bytes in the output buffer here before returning to the caller?

>  	count = *isize_p;
>  	if (count) {
> -		size_t i, o;
> -		for (i = o = 0; o < *osize_p && i < count; i++) {
> +		size_t i;
> +		for (i = 0; o < *osize_p && i < count; i++) {
>  			char ch = input[i];
>  			if (ch == '\n') {
> -				if (o + 1 < *osize_p)
> -					output[o++] = '\r';
> -				else
> -					break;
> +				output[o++] = '\r';
> +				if (o >= *osize_p) {
> +					lfcrlf->want_lf = 1;
> +					continue; /* We need to increase i */
> +				}
>  			}
>  			output[o++] = ch;
>  		}

^ permalink raw reply

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Johannes Sixt @ 2011-12-16 22:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <20111216192104.GA19924@sigill.intra.peff.net>

Am 16.12.2011 20:21, schrieb Jeff King:
> I'm not clear from what you wrote on whether you were saying it is
> simply sub-optimal, or whether on balance it is way worse than the
> default funcname matching.

I'm saying the latter. Okay, we're talking "only" about hunk headers.
But when you are reviewing patches, they are *extremely* useful and a
time-saver; when they are wrong or not present, they are exactly the
opposite.

> So, I'm confused. If you are using this, surely you have "*.c diff=xcpp"
> in your attributes file, and my patch has no effect for you,

Sure I have. What I didn't say (sorry for that!), but wanted to hint at
is that this is to experiment with a pattern in order to ultimately
improve the built-in pattern. The topic came up just the other day, and
I took Thomas Rast's suggestion to experiment with a simplified pattern:

http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439

But as is, the built-in pattern misses way too many anchor points in C++
code.

-- Hannes

^ permalink raw reply

* Re: [BUG] attribute "eol" with "crlf"
From: Ralf Thielow @ 2011-12-16 22:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthieu Moy, git
In-Reply-To: <7vmxasgqlm.fsf@alter.siamese.dyndns.org>

So i have to commit ".gitattributes" and everything is fine for me after!?

> The sequence adds "test\r\n" file without .gitattributes to have the
> repository record that exact byte sequence for the file. But then later
> goes around and says "This file wants to express the end of line with CRLF
> on the filesystem, so please replace LF in the repository representation
> to CRLF when checking out, and replace CRLF in the working tree to LF when
> checking in".
>
> So it is not surprising that "\r\n" coming from the repository is replaced
> to "\r\r\n" when checked out. As far as the repository data is concerned,
> that line has a funny byte with value "\r" at the end, immediately before
> the line terminator "\n".
>
> What you said is _technically_ correct in that sense.
>
> However, I think the CRLF filter used to have a hack to strip "\r" if the
> repository data records "\r" at the end of line. This was intended to help
> people who checked in such a broken text file (if it is a text file, then
> raw ascii CR does not have a place in it in the repository representation)
> and it was a useful hack to help people recover from such mistakes to
> start the project from DOS-only world (with CRLF in the repository data)
> and migrate to cross platform world (with LF in the repository data, CRLF
> in the DOS working tree).  I suspect that the streaming filter conversion
> may not have the same hack in it.

^ permalink raw reply

* Re: [BUG] attribute "eol" with "crlf"
From: Junio C Hamano @ 2011-12-16 22:17 UTC (permalink / raw)
  To: Ralf Thielow; +Cc: Matthieu Moy, git
In-Reply-To: <CAN0XMOL674Hw_LctTC+8NNqA84Of6dMjdKT0SU+DWMG7EYShYQ@mail.gmail.com>

Ralf Thielow <ralf.thielow@googlemail.com> writes:

> So i have to commit ".gitattributes" and everything is fine for me after!?

No.  Sorry if I was unclear, but I do not see which part was unclear in
what I wrote, so...

>> The sequence adds "test\r\n" file without .gitattributes to have the
>> repository record that exact byte sequence for the file. But then later
>> goes around and says "This file wants to express the end of line with CRLF
>> on the filesystem, so please replace LF in the repository representation
>> to CRLF when checking out, and replace CRLF in the working tree to LF when
>> checking in".
>>
>> So it is not surprising that "\r\n" coming from the repository is replaced
>> to "\r\r\n" when checked out. As far as the repository data is concerned,
>> that line has a funny byte with value "\r" at the end, immediately before
>> the line terminator "\n".
>>
>> What you said is _technically_ correct in that sense.
>>
>> However, I think the CRLF filter used to have a hack to strip "\r" if the
>> repository data records "\r" at the end of line. This was intended to help
>> people who checked in such a broken text file (if it is a text file, then
>> raw ascii CR does not have a place in it in the repository representation)
>> and it was a useful hack to help people recover from such mistakes to
>> start the project from DOS-only world (with CRLF in the repository data)
>> and migrate to cross platform world (with LF in the repository data, CRLF
>> in the DOS working tree).  I suspect that the streaming filter conversion
>> may not have the same hack in it.

^ permalink raw reply

* Strange behaviour with git-add, .gitignore and a tracked file
From: David ‘Bombe’ Roden @ 2011-12-16 21:57 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: Text/Plain, Size: 875 bytes --]

Hi.

I stumbled across some strange behaviour today:

---
git init git-test
cd git-test
echo '*' > .gitignore
git add -f .gitignore
git commit -m "Add gitignore."
mkdir dir
echo a > dir/file
git add -f dir/file
git commit -m "Add dir/file."
echo b >> dir/file
git add dir/file
---

The last git-add results in the following error message:

---
The following paths are ignored by one of your .gitignore files:
dir
Use -f if you really want to add them.
fatal: no files added
---

While it is true that the file specification matches a pattern in .gitignore, 
the file is already tracked and should not be ignored.

This behaviour was reproduced with 1.7.7.4 (on OS X), 1.7.5.4, 1.7.1, 
1.7.8-247-g10f4eb6 (latest master as of a couple of hours ago) (all Linux).


Greetings,

	David
-- 
David ‘Bombe’ Roden <bombe@pterodactylus.net>

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply

* Re: [msysGit] [PATCH] gitk: fix the display of files when filtered by path
From: Jakub Narebski @ 2011-12-16 22:27 UTC (permalink / raw)
  To: Pat Thoyts; +Cc: Junio C Hamano, Martin von Zweigbergk, Paul Mackerras, git
In-Reply-To: <8739clzdzw.fsf@fox.patthoyts.tk>

Pat Thoyts <patthoyts@users.sourceforge.net> writes:
> Junio C Hamano <gitster@pobox.com> writes:

>> I have this slight suspicion that Paul would appreciate if somebody who
>> cares about gitk who is capable and willing steps forward and takes over
>> the maintainership of gitk, as he is busy in his other projects.
> 
> I can do this one along with git-gui if this is the case.

I wonder if having common maintainer for both gitk and git-gui would
lead to first, splitting gitk into smaller files like git-gui was, and
second sharing common Tcl/Tk bindings / wrappers between gitk and
git-gui...

-- 
Jakub Narębski

^ permalink raw reply

* Re: [BUG] attribute "eol" with "crlf"
From: Ralf Thielow @ 2011-12-16 22:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthieu Moy, git
In-Reply-To: <7vehw4go0v.fsf@alter.siamese.dyndns.org>

Basicly I want to force the line endings of the files on my
project. :/

2011/12/16 Junio C Hamano <gitster@pobox.com>:
> Ralf Thielow <ralf.thielow@googlemail.com> writes:
>
>> So i have to commit ".gitattributes" and everything is fine for me after!?
>
> No.  Sorry if I was unclear, but I do not see which part was unclear in
> what I wrote, so...
>
>>> The sequence adds "test\r\n" file without .gitattributes to have the
>>> repository record that exact byte sequence for the file. But then later
>>> goes around and says "This file wants to express the end of line with CRLF
>>> on the filesystem, so please replace LF in the repository representation
>>> to CRLF when checking out, and replace CRLF in the working tree to LF when
>>> checking in".
>>>
>>> So it is not surprising that "\r\n" coming from the repository is replaced
>>> to "\r\r\n" when checked out. As far as the repository data is concerned,
>>> that line has a funny byte with value "\r" at the end, immediately before
>>> the line terminator "\n".
>>>
>>> What you said is _technically_ correct in that sense.
>>>
>>> However, I think the CRLF filter used to have a hack to strip "\r" if the
>>> repository data records "\r" at the end of line. This was intended to help
>>> people who checked in such a broken text file (if it is a text file, then
>>> raw ascii CR does not have a place in it in the repository representation)
>>> and it was a useful hack to help people recover from such mistakes to
>>> start the project from DOS-only world (with CRLF in the repository data)
>>> and migrate to cross platform world (with LF in the repository data, CRLF
>>> in the DOS working tree).  I suspect that the streaming filter conversion
>>> may not have the same hack in it.

^ permalink raw reply

* [PATCH] lf_to_crlf_filter(): tell the caller we added "\n" when draining
From: Junio C Hamano @ 2011-12-16 22:43 UTC (permalink / raw)
  To: Git Mailing list; +Cc: Carlos Martín Nieto, Henrik Grubbström
In-Reply-To: <7viplggoq9.fsf@alter.siamese.dyndns.org>

This can only happen when the input size is multiple of the
buffer size of the cascade filter (16k) and ends with an LF,
but in such a case, the code forgot to tell the caller that
it added the "\n" it could not add during the last round.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 convert.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/convert.c b/convert.c
index c2c2c11..c028275 100644
--- a/convert.c
+++ b/convert.c
@@ -879,7 +879,7 @@ int is_null_stream_filter(struct stream_filter *filter)
 
 struct lf_to_crlf_filter {
 	struct stream_filter filter;
-	int want_lf;
+	unsigned want_lf:1;
 };
 
 static int lf_to_crlf_filter_fn(struct stream_filter *filter,
@@ -895,8 +895,11 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 		lf_to_crlf->want_lf = 0;
 	}
 
-	if (!input)
-		return 0; /* We've already dealt with the state */
+	/* We are told to drain */
+	if (!input) {
+		*osize_p -= o;
+		return 0;
+	}
 
 	count = *isize_p;
 	if (count) {
@@ -931,10 +934,9 @@ static struct stream_filter_vtbl lf_to_crlf_vtbl = {
 
 static struct stream_filter *lf_to_crlf_filter(void)
 {
-	struct lf_to_crlf_filter *lf_to_crlf = xmalloc(sizeof(*lf_to_crlf));
+	struct lf_to_crlf_filter *lf_to_crlf = xcalloc(1, sizeof(*lf_to_crlf));
 
 	lf_to_crlf->filter.vtbl = &lf_to_crlf_vtbl;
-	lf_to_crlf->want_lf = 0;
 	return (struct stream_filter *)lf_to_crlf;
 }
 
-- 
1.7.8.351.g9111b

^ permalink raw reply related

* Re: [bug?] checkout -m doesn't work without a base version
From: Ramsay Jones @ 2011-12-16 22:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andreas Schwab, Michael Schubert, Pete Harlan, git
In-Reply-To: <7v4nx1pwjg.fsf@alter.siamese.dyndns.org>

Junio C Hamano wrote:
> You could do the usual "unnecessary initialization" trick, though.

Yes, I wrote exactly this patch some days ago (when it first appeared
in pu), but haven't found the time to send it ... :-D

ATB,
Ramsay Jones

^ permalink raw reply

* [PATCH] Fix an "variable might be used uninitialized" gcc warning
From: Ramsay Jones @ 2011-12-16 22:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: GIT Mailing-list


In particular, gcc issues the following warning:

        CC builtin/checkout.o
    builtin/checkout.c: In function `cmd_checkout':
    builtin/checkout.c:160: warning: 'mode' might be used uninitialized \
        in this function

However, the analysis performed by gcc is too conservative, in this
case, since the mode variable will not be used uninitialised. Note that,
if the mode variable is not set in the loop, then "threeway[1]" will
also still be set to the null SHA1. This will then result in control
leaving the function, almost directly after the loop, well before the
potential use in the call to make_cache_entry().

In order to suppress the warning, we initialise the mode variable to
zero in it's declaration.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
---

Just in case you haven't found the time to apply your own patch!

[Note that only 2 out of the 3 versions of gcc I use issues this
warning]

ATB,
Ramsay Jones

 builtin/checkout.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 787d468..f1984d9 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -157,7 +157,7 @@ static int checkout_merged(int pos, struct checkout *state)
 	unsigned char sha1[20];
 	mmbuffer_t result_buf;
 	unsigned char threeway[3][20];
-	unsigned mode;
+	unsigned mode = 0;
 
 	memset(threeway, 0, sizeof(threeway));
 	while (pos < active_nr) {
-- 
1.7.8

^ permalink raw reply related

* [PATCH] grep.h: Fix compilation error on mingw
From: Ramsay Jones @ 2011-12-16 22:56 UTC (permalink / raw)
  To: trast; +Cc: Junio C Hamano, GIT Mailing-list


In particular, gcc complains as follows:

        CC bisect.o
    In file included from revision.h:5,
                     from bisect.c:4:
    grep.h:138: error: expected '=', ',', ';', 'asm' or \
        '__attribute__' before 'grep_attr_mutex'
    make: *** [bisect.o] Error 1

In order to fix the error, we include the 'thread-utils.h' header
file in grep.h, since it provides a definition of pthread_mutex_t
(indirectly via compat/win32/pthread.h).

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
---

Hi Thomas,

If you need to re-roll your grep-threading series, could you please
squash this patch into your commit 53f4b6f7 (grep: enable threading
with -p and -W using lazy attribute lookup, 12-12-2011).

[Note: you could also remove the '#include "thread-utils.h"' in both
grep.c and builtin/grep.c, since it is now included from grep.h.]

Thanks!

ATB,
Ramsay Jones

 grep.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/grep.h b/grep.h
index 15d227c..b6f06cb 100644
--- a/grep.h
+++ b/grep.h
@@ -8,6 +8,7 @@ typedef int pcre;
 typedef int pcre_extra;
 #endif
 #include "kwset.h"
+#include "thread-utils.h"
 
 enum grep_pat_token {
 	GREP_PATTERN,
-- 
1.7.8

^ permalink raw reply related

* Re: migrating from svn: How to clean up history?
From: Thomas Ferris Nicolaisen @ 2011-12-16 23:11 UTC (permalink / raw)
  To: Hartmut Goebel; +Cc: git
In-Reply-To: <4EE766D5.5040702@goebel-consult.de>

> I already did some experiments with `git filter-branch` without success. (I manage renaming the tags, though.)
>
> Any hints how to to this clean-up?

There are some examples of how to use filter-branch to remove empty
commits, and some other handy tips on this page:

http://thomasrast.ch/git/git-svn-conversion.html

^ permalink raw reply

* Re: [BUG] attribute "eol" with "crlf"
From: Junio C Hamano @ 2011-12-16 23:24 UTC (permalink / raw)
  To: Ralf Thielow; +Cc: Junio C Hamano, Matthieu Moy, git
In-Reply-To: <CAN0XMOKts7UR6eSYWA9-xj-YCpprvhbqwfdbq4U6Hfrn0nUONQ@mail.gmail.com>

Ralf Thielow <ralf.thielow@googlemail.com> writes:

> Basicly I want to force the line endings of the files on my
> project. :/

You need to fix the data you have recorded with CRLF in the repository if
you are using eol=crlf, which means "This file wants to ..." (see below; I
do not want to type the same thing again).

> 2011/12/16 Junio C Hamano <gitster@pobox.com>:
>> Ralf Thielow <ralf.thielow@googlemail.com> writes:
>>
>>> So i have to commit ".gitattributes" and everything is fine for me after!?
>>
>> No.  Sorry if I was unclear, but I do not see which part was unclear in
>> what I wrote, so...
>>
>>>> The sequence adds "test\r\n" file without .gitattributes to have the
>>>> repository record that exact byte sequence for the file. But then later
>>>> goes around and says "This file wants to express the end of line with CRLF
>>>> on the filesystem, so please replace LF in the repository representation
>>>> to CRLF when checking out, and replace CRLF in the working tree to LF when
>>>> checking in".
>>>>
>>>> So it is not surprising that "\r\n" coming from the repository is replaced
>>>> to "\r\r\n" when checked out. As far as the repository data is concerned,
>>>> that line has a funny byte with value "\r" at the end, immediately before
>>>> the line terminator "\n".
>>>>
>>>> What you said is _technically_ correct in that sense.
>>>> ...

^ permalink raw reply

* Re: [BUG] attribute "eol" with "crlf"
From: Junio C Hamano @ 2011-12-16 23:34 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Carlos Martín Nieto, Ralf Thielow, git
In-Reply-To: <7vmxasgqlm.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> ...
> What you said is _technically_ correct in that sense.
>
> However, I think the CRLF filter used to have a hack to strip "\r" if the
> repository data records "\r" at the end of line. This was intended to help
> people who checked in such a broken text file (if it is a text file, then
> raw ascii CR does not have a place in it in the repository representation)
> and it was a useful hack to help people recover from such mistakes to
> start the project from DOS-only world (with CRLF in the repository data)
> and migrate to cross platform world (with LF in the repository data, CRLF
> in the DOS working tree).  I suspect that the streaming filter conversion
> may not have the same hack in it.

Perhaps something like this, but I do not use CRLF myself, so it probably
needs to be checked by extra sets of eyes.

Thanks.

-- >8 --
Subject: lf_to_crlf_filter(): resurrect CRLF->CRLF hack

The non-streaming version of the filter counts CRLF and LF in the whole
buffer, and returns without doing anything when they match (i.e. what is
recorded in the object store already uses CRLF). This was done to help
people who added files from the DOS world before realizing they want to go
cross platform and adding .gitattributes to tell Git that they only want
CRLF in their working tree.

The streaming version of the filter does not want to read the whole thing
before starting to work, as that defeats the whole point of streaming. So
we instead check what byte follows CR whenever we see one, and add CR
before LF only when the LF does not immediately follow CR already to keep
CRLF as is.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 convert.c |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/convert.c b/convert.c
index c028275..8daf4e4 100644
--- a/convert.c
+++ b/convert.c
@@ -879,7 +879,8 @@ int is_null_stream_filter(struct stream_filter *filter)
 
 struct lf_to_crlf_filter {
 	struct stream_filter filter;
-	unsigned want_lf:1;
+	unsigned has_held:1;
+	char held;
 };
 
 static int lf_to_crlf_filter_fn(struct stream_filter *filter,
@@ -889,10 +890,14 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 	size_t count, o = 0;
 	struct lf_to_crlf_filter *lf_to_crlf = (struct lf_to_crlf_filter *)filter;
 
-	/* Output a pending LF if we need to */
-	if (lf_to_crlf->want_lf) {
-		output[o++] = '\n';
-		lf_to_crlf->want_lf = 0;
+	/*
+	 * We may be holding onto the CR to see if it is followed by a
+	 * LF, in which case we would need to go to the main loop.
+	 * Otherwise, just emit it to the output stream.
+	 */
+	if (lf_to_crlf->has_held && (lf_to_crlf->held != '\r' || !input)) {
+		output[o++] = lf_to_crlf->held;
+		lf_to_crlf->has_held = 0;
 	}
 
 	/* We are told to drain */
@@ -902,22 +907,57 @@ static int lf_to_crlf_filter_fn(struct stream_filter *filter,
 	}
 
 	count = *isize_p;
-	if (count) {
+	if (count || lf_to_crlf->has_held) {
 		size_t i;
+		int was_cr = 0;
+
+		if (lf_to_crlf->has_held) {
+			was_cr = 1;
+			lf_to_crlf->has_held = 0;
+		}
+
 		for (i = 0; o < *osize_p && i < count; i++) {
 			char ch = input[i];
+
 			if (ch == '\n') {
 				output[o++] = '\r';
-				if (o >= *osize_p) {
-					lf_to_crlf->want_lf = 1;
-					continue; /* We need to increase i */
-				}
+			} else if (was_cr) {
+				/*
+				 * Previous round saw CR and it is not followed
+				 * by a LF; emit the CR before processing the
+				 * current character.
+				 */
+				output[o++] = '\r';
 			}
+
+			/*
+			 * We may have consumed the last output slot,
+			 * in which case we need to break out of this
+			 * loop; hold the current character before
+			 * returning.
+			 */
+			if (*osize_p <= o) {
+				lf_to_crlf->has_held = 1;
+				lf_to_crlf->held = ch;
+				continue; /* break but increment i */
+			}
+
+			if (ch == '\r') {
+				was_cr = 1;
+				continue;
+			}
+
+			was_cr = 0;
 			output[o++] = ch;
 		}
 
 		*osize_p -= o;
 		*isize_p -= i;
+
+		if (!lf_to_crlf->has_held && was_cr) {
+			lf_to_crlf->has_held = 1;
+			lf_to_crlf->held = '\r';
+		}
 	}
 	return 0;
 }

^ permalink raw reply related

* Re: Strange behaviour with git-add, .gitignore and a tracked file
From: Junio C Hamano @ 2011-12-16 23:42 UTC (permalink / raw)
  To: David ‘Bombe’ Roden; +Cc: git
In-Reply-To: <201112162258.04661.bombe@pterodactylus.net>

David ‘Bombe’ Roden  <bombe@pterodactylus.net> writes:

> This behaviour was reproduced with 1.7.7.4 (on OS X), 1.7.5.4, 1.7.1, 
> 1.7.8-247-g10f4eb6 (latest master as of a couple of hours ago) (all Linux).

Doesn't that suggest perhaps it could be a designed behaviour?

Perhaps to help users to realize that their .gitignore pattern may want to
be improved?

^ permalink raw reply

* Re: [PATCH] Fix an "variable might be used uninitialized" gcc warning
From: Jonathan Nieder @ 2011-12-16 23:59 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Junio C Hamano, GIT Mailing-list
In-Reply-To: <4EEBC9D6.6010204@ramsay1.demon.co.uk>

Ramsay Jones wrote:

>         CC builtin/checkout.o
>     builtin/checkout.c: In function `cmd_checkout':
>     builtin/checkout.c:160: warning: 'mode' might be used uninitialized \
>         in this function
[...]
> [Note that only 2 out of the 3 versions of gcc I use issues this
> warning]

Which version of gcc is that?  Is gcc getting more sane, so we won't
have to worry about this after a while, or is the false positive a
new regression that should be reported to them?

^ permalink raw reply

* Re: What's cooking in git.git (Dec 2011, #05; Thu, 15)
From: Junio C Hamano @ 2011-12-17  0:39 UTC (permalink / raw)
  To: git
In-Reply-To: <7vd3bpl6i8.fsf@alter.siamese.dyndns.org>

I don't want to do issue #06 yet, so only differences from #05 are listed
here for today's updates.  Extra sets of eyeballs on commits in 'pu' are
very much appreciated.

* tr/grep-threading (2011-12-16) 3 commits
 - grep: disable threading in non-worktree case
 - grep: enable threading with -p and -W using lazy attribute lookup
 - grep: load funcname patterns for -W

The second one was updated to include thread-utils.h in grep.h

* cn/maint-lf-to-crlf-filter (2011-12-16) 1 commit
 - lf_to_crlf_filter(): tell the caller we added "\n" when draining
 (this branch is used by jc/maint-lf-to-crlf-keep-crlf.)

Fixes up an earlier fix already in 'master'.

* jc/maint-lf-to-crlf-keep-crlf (2011-12-16) 1 commit
 - lf_to_crlf_filter(): resurrect CRLF->CRLF hack
 (this branch uses cn/maint-lf-to-crlf-filter.)

Builds on top of it.

* jc/request-pull-show-head-4 (2011-12-16) 1 commit
  (merged to 'next' on 2011-12-16 at bea51ac)
 + request-pull: update the "pull" command generation logic

Fixes up an earlier update already in 'master'.

* jk/doc-fsck (2011-12-16) 1 commit
 - docs: brush up obsolete bits of git-fsck manpage

* jk/follow-rename-score (2011-12-16) 1 commit
 - use custom rename score during --follow

* jk/pretty-reglog-ent (2011-12-16) 1 commit
 - pretty: give placeholders to reflog identity

Three fairly straightforward patches from Peff; will merge to 'next'
soonish.

^ permalink raw reply

* Re: git-p4.skipSubmitEdit
From: Michael Horowitz @ 2011-12-17  0:46 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Pete Wyckoff, L. A. Linden Levy, git
In-Reply-To: <4EEBA106.9010001@diamand.org>

Actually, it is the opposite.  Bailout works fine, it is when I ":wq"
in Vi for example, that it fails to submit and reverts all my changes.

        if os.stat(template_file).st_mtime <= mtime:
            while True:
                response = raw_input("Submit template unchanged.
Submit anyway? [y]es, [n]o (skip this patch) ")
                if response == 'y':
                    return True
                if response == 'n':
                    return False
        # I think this else needs to be added here, so when the file
has been modified since it was opened in the editor, it will properly
submit the change.
        else:
            return True


Mike



On Fri, Dec 16, 2011 at 2:50 PM, Luke Diamand <luke@diamand.org> wrote:
>
> On 16/12/11 15:38, Michael Horowitz wrote:
>>
>> All,
>>
>> It appears that this change has introduced a bug that causes submit to
>> fail every time if you do not skip the submit edit.
>>
>>  From what I can tell, this is because the new edit_template method
>> does not return True at the end.
>
>
> Could you say exactly what you're doing?
>
> I've just tried it myself and it seems to work fine:
>
> git-p4 clone
> git commit -m 'a change'
> git-p4 submit
> <quit from editor, with/without modifying it>
>
> And I couldn't see any paths in edit_template that returned without a value of True, except the one where the user decides to bail out.
>
> This is with Pete's skipSubmitEdit change.
>
> Thanks!
> Luke

^ permalink raw reply

* Re: Revisiting metadata storage
From: Jonathan Nieder @ 2011-12-17  0:48 UTC (permalink / raw)
  To: Hilco Wijbenga
  Cc: Richard Hartmann, Ronan Keryell, Git List, Martin von Zweigbergk
In-Reply-To: <CAE1pOi2GW=3o7QgTEcUYbjif3WokpVdgL6UdKXu9x0yKH-vrGw@mail.gmail.com>

(+cc: Martin, who has been doing excellent work on "git rebase", just
 in case he's curious)
Hi again,

Hilco Wijbenga wrote:

> Yes, I guess the problem is that it uses the worktree as its workspace.

That's a comfort.  Thanks for explaining.

> (I know others disagree but to me it's a bug that Git touches files
> that it doesn't actually change.)

No, I somewhat agree.  If a command is touching more files than it needs
to, then that is likely to be a bug, or at least an opportunity for
improvement.

Where we might disagree is in how many files "git rebase" needs to
touch.  So let's consider your use case.

[...]
> I usually rebase in the morning to get an up-to-date tree. Is that
> considered too often? Perhaps it's my Subversion background but I'm
> not comfortable diverging too much. Is that too paranoid? :-)
>
> So IIUC, I can do "git rebase master" even after multiple "git merge master"s?

The second question is easy --- the answer is "yes".

Your first question is more a matter of opinion.  I will just say a
little about "git rebase", to help you decide for yourself.

The original and still-primary purpose of "git rebase" is to refresh and
clean up a short patch series that is going to be submitted by email
to some project, by making the series apply to a newer basis version.
You can imitate what it does fairly simply by hand:

	# on master
	git checkout -b master-rebased new-upstream
	git cherry-pick HEAD..master; # [*]
	git branch -M master

That is, we check out the new basis version and apply any "local"
changes on top of it one at a time, using human help to resolve
conflicts as necessary.

This procedure has the nice property that it is dead simple.  It also
is easily tweaked to produce an "interactive" variant that reorders
the patches or runs other commands in between applying patches (for
example, you can ask git to run the test suite after each commit when
rebasing by adding "exec make test" after each "pick" line in the
editor shown by "git rebase -i").  And in the end you have a nice
patch series that applies without fuzz and doesn't require people
reading your patches to think about the older code base they were
originally written against.

However, rebasing has a few downsides.

The most important one is that each time you rebase, you are making
new, untested commits.  When you rebase the 300-patch series that
you have been debugging in collaboration with other people, you
can no longer say "these changes have been in use and being tested
for a few months now; chances are we have already ironed out most
of the obvious bugs".  The usual heuristic that patches towards the
beginning of the series most likely work better and are less likely to
have introduced that new crash that makes your program not work at all
no longer applies, since when you rebase, you can easily introduce
a mistake in conflict resolution.  All "local" code is new code.

In the history

  A --- o --- o --- B [upstream]
   \
    P --- Q --- R [master]

after rebasing

  A --- o --- o --- B --- P' --- Q' --- R' [master]

it is even tempting for people to not test the intermediate commits P'
and Q' before publishing their work, resulting in a history where
intermediate commits involved in telling the story do not even build.
So building and testing old versions to track down a change in
behavior (e.g., with "git bisect") becomes hard.  The history is not
actual history.

That is easy to mitigate by only rebasing your small, _private_ patch
series that is not part of meaningful history.  When asking others to
incorporate the changes into permanent history, the contributor
hopefully carefully checks over them for sanity and checks each
intermediate version before they can be applied.  And history on a
large, public scale is still stable.

For similar reasons, rebasing can make life difficult for people
trying to write patches based on your patches.  The section RECOVERING
FROM UPSTREAM REBASE in the git-rebase(1) manual page has more on
that.

If you want to incorporate changes into your branch and preserve the
history of well-tested commits (for example, if you are the upstream
maintainer, pulling in changes from other people), a command to do
this is git-merge(1).  It does not have to rewind or rewrite anything;
it just uses a 3-way merge algorithm to apply the new changes and
writes a commit indicating it has done so and with pointers to the two
parent commits so history consumers can see the full story.

Another consideration.

When using Subversion and working against the trunk, I find myself
using "svn update" every day and right before commiting.  Otherwise, I
may be forced to deal with a painful conflict resolution, or worse,
commit a change to one file that uses an API that has been removed in
another file.

However, when using git, I do not find myself needing to do that.
Instead, most work pertaining to a particular goal happens on a branch
specific to that topic, I pick one version to develop against, and I
mostly stick to it.  This way, I am not distracted by irrelevant
breakage or other changes introduced in areas orthogonal to my topic.

"But how do you make sure your changes work with the current
codebase?" you might ask.  Here:

	# on branch "topic"

	# switch to working on an anonymous branch, or rather no
	# branch at all
	git checkout --detach
	# grab latest changes to test against
	git merge origin/master
	# test!
	make test

The "git merge" step does not present me with the same conflicts it
did yesterday thanks to the "git rerere" facility (since I have
[rerere] enabled set to true).  If my topic needs some nontrivial
reconciliation with the wider changes in the project (if there is an
API change, say), I might use "git merge" when on branch topic (i.e.,
_not_ detached) to record the resolution and use the commit message to
describe what happened.  Or I might just rebase.

Because of the nature of patches applied by mail, before sending the
patches out, I either rebase one last time or very loudly mention
which old version of the codebase the patch series applies to.
Usually the former.  But this step would not be necessary if asking
people to pull from me using a protocol that transfers the actual
objects.

Hope that helps,
Jonathan

[*] Actually, "git rebase" is a little smarter than that, in that it
notices and skips patches that have already been applied in
new-upstream.  A better imitation would be to use

	git cherry-pick --cherry-pick --right-only HEAD...master

Even better is to look at git-rebase.sh to see what it actually does.
:)

^ permalink raw reply

* Re: git-p4.skipSubmitEdit
From: Michael Horowitz @ 2011-12-17  0:49 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Pete Wyckoff, L. A. Linden Levy, git
In-Reply-To: <CAFLRbor3Gnqhudmg8G_U37Nbo7ENoCEy0iFVGRP4i_AmatJHxw@mail.gmail.com>

Oh, and in the case where you do edit the template, it doesn't give
you an error or anything, it looks like it succeeded, but you'll
notice the change never got submitted to Perforce.  If you look
carefully though, you can see it reverting each of your edited files
in the P4 tree.

Mike



On Fri, Dec 16, 2011 at 7:46 PM, Michael Horowitz
<michael.horowitz@ieee.org> wrote:
> Actually, it is the opposite.  Bailout works fine, it is when I ":wq"
> in Vi for example, that it fails to submit and reverts all my changes.
>
>        if os.stat(template_file).st_mtime <= mtime:
>            while True:
>                response = raw_input("Submit template unchanged.
> Submit anyway? [y]es, [n]o (skip this patch) ")
>                if response == 'y':
>                    return True
>                if response == 'n':
>                    return False
>        # I think this else needs to be added here, so when the file
> has been modified since it was opened in the editor, it will properly
> submit the change.
>        else:
>            return True
>
>
> Mike
>
>
>
> On Fri, Dec 16, 2011 at 2:50 PM, Luke Diamand <luke@diamand.org> wrote:
>>
>> On 16/12/11 15:38, Michael Horowitz wrote:
>>>
>>> All,
>>>
>>> It appears that this change has introduced a bug that causes submit to
>>> fail every time if you do not skip the submit edit.
>>>
>>>  From what I can tell, this is because the new edit_template method
>>> does not return True at the end.
>>
>>
>> Could you say exactly what you're doing?
>>
>> I've just tried it myself and it seems to work fine:
>>
>> git-p4 clone
>> git commit -m 'a change'
>> git-p4 submit
>> <quit from editor, with/without modifying it>
>>
>> And I couldn't see any paths in edit_template that returned without a value of True, except the one where the user decides to bail out.
>>
>> This is with Pete's skipSubmitEdit change.
>>
>> Thanks!
>> Luke

^ permalink raw reply

* Re: git-p4 using notes
From: Michael Horowitz @ 2011-12-17  0:57 UTC (permalink / raw)
  To: Luke Diamand; +Cc: git
In-Reply-To: <4EEBA24F.8030103@diamand.org>

That would be great.  If you need me to help test, let me know.
Unfortunately I don't know Python and I know very little about the
internals of git, so I can't help much more than that...

Mike



On Fri, Dec 16, 2011 at 2:55 PM, Luke Diamand <luke@diamand.org> wrote:
> On 16/12/11 16:07, Michael Horowitz wrote:
>>
>> For those of you using git-p4 because of a company requirement to use
>> Perforce, but really wish you could use git only, the most frustrating
>> part is the fact that when changes are submitted, the commit message
>> is rewritten to include a reference to the P4 change number which is
>> used by the sync.  When syncing back changes, this causes the commit
>> hash to be different, and so blows away your old commit and any parent
>> commit references and such.
>>
>> I read someplace, I can't remember where at this point, that if git-p4
>> used notes to write the P4 change information, that would not impact
>> the commit hash, so when merging back, things would not be
>> overwritten, and you can maintain branches and commit history properly
>> in git.
>>
>> I just ran into this project, where it seems that someone has
>> re-written git-p4 to use notes: https://github.com/ermshiperete/git-p4
>>
>> I was wondering if any of the maintainers of git-p4 has considered
>> this, and might want to leverage this work to merge into the main git
>> repo, possibly with an option to choose between the two behaviors.
>
>
> I'm not sure I qualify for such a grand title as maintainer, but I was going
> to give this a go in the new year as it would be quite useful, unless
> someone beat me to it. I want to fix some problems with labels first though.
>
> Regards!
> Luke

^ permalink raw reply

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-17  1:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, Brandon Casey
In-Reply-To: <7vehw4ia5x.fsf@alter.siamese.dyndns.org>

On Fri, Dec 16, 2011 at 11:33:30AM -0800, Junio C Hamano wrote:

> I think we recently saw that the optional built-in one for C did not even
> understand a function that returns a pointer, and nobody complained about
> it for a long time,

Yeah. That implies to me that either people don't really care about
this feature, or that they are not actually using it because it requires
special configuration (we are not even using it in git.git, for
example).

> > And if it is bad on balance, is the right solution to avoid exposing
> > people to it, or is it to make our patterns better?
> 
> Can't we do both, by avoid exposing normal users to broken one while
> people who want to improve the pattern based one work on unbreak it?

Sure, we can do both. But if nobody is eating the dogfood, it will never
grow to taste better. Maybe we should start by using diff=c in git
itself?

> > I.e., is it fixable,
> > or is it simply too hard a problem to get right in the general case, and
> > we shouldn't turn it on by default?
> 
> I do not think that is the "either-or" question. My impression has been
> that even if it is fixable, it is too broken and produces worse result
> than the simple default in its current form.

What I meant by the either-or was: if it is fixable, then we should fix
it and consider turning it on as a default. If it's too hard to get
right, then we probably never want it on by default, and people who do
like it can turn it on (presumably because it works on their code
style).

-Peff

^ permalink raw reply

* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-17  1:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <4EEBC0A7.3030303@kdbg.org>

On Fri, Dec 16, 2011 at 11:05:27PM +0100, Johannes Sixt wrote:

> Am 16.12.2011 20:21, schrieb Jeff King:
> > I'm not clear from what you wrote on whether you were saying it is
> > simply sub-optimal, or whether on balance it is way worse than the
> > default funcname matching.
> 
> I'm saying the latter. Okay, we're talking "only" about hunk headers.
> But when you are reviewing patches, they are *extremely* useful and a
> time-saver; when they are wrong or not present, they are exactly the
> opposite.

Right. I don't think it is worth arguing "well, it's only funcname
headers". Because that same argument applies to both the advantages
(i.e., hopefully with the patch we are generating better funcname
headers) and disadvantage (i.e., it seems that we might be generating
worse funcname headers).

So it is really a question of "how good" or "how bad" for each style.

> Sure I have. What I didn't say (sorry for that!), but wanted to hint at
> is that this is to experiment with a pattern in order to ultimately
> improve the built-in pattern. The topic came up just the other day, and
> I took Thomas Rast's suggestion to experiment with a simplified pattern:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/186355/focus=186439
> 
> But as is, the built-in pattern misses way too many anchor points in C++
> code.

Yeah, I can certainly agree that the patterns could be better.

Maybe we should just table the extension mapping for now, then, and see
if the patterns improve? Or maybe just drop the C ones (and probably the
objc one based on other parts of the thread) and do the rest?

-Peff

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox