Git development
* Re: [PATCH] Teach remote machinery about remotes.default config variable
From: Johannes Schindelin @ 2008-01-12 20:24 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Junio C Hamano, git
In-Reply-To: <47891658.3090604@gmail.com>

Hi,

On Sat, 12 Jan 2008, Mark Levedahl wrote:

> Junio C Hamano wrote:
> > Ahh.
> > 
> > Does that suggest the new configuration thing is only about the 
> > "submodule update" command, not "remotes.default" that affects how the 
> > non-submodule merge and fetch works?
>
> Yes - this patch set was inspired by the single question of "how do I 
> avoid needing to define origin as opposed to a server-specific nickname 
> now that I am using sub-modules?"

Why is your patch then not about git-submodule?

And I still fail to see -- even for submodules -- how you begin to tackle 
that lookup problem.

Ciao,
Dscho


* Re: [PATCH] Teach remote machinery about remotes.default config variable
From: Johannes Schindelin @ 2008-01-12 20:22 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: Junio C Hamano, git
In-Reply-To: <4788F907.1050306@gmail.com>

Hi,

On Sat, 12 Jan 2008, Mark Levedahl wrote:

> Johannes Schindelin wrote:
> > 
> > No, that was not _at all_ my argument.
> > 
> > I said that hiding it under a different name _that you have to look 
> > up, too_ does _not_ make things easier.
> > 
> >   
> Granted, *IF* we had to look it up, but we don't. In fact, we use the 
> convention
>    servername.foo.bar
> has nickname
>    servername
> 
> So, we need to know the server name we are using, and that server name 
> is the nickname. So, no confusion and no extra lookup step. (Our server 
> names are unique without the domain suffixes, so this works well for 
> us).

How do you know _which_ default remote name your current repository uses? 
Exactly: you have to look it up.  So your whole *IF* argument is bogus.

And if you already have to look something up, and the user fiddled with 
her setup, you can no longer be sure that nickname servername points to 
servername.foo.bar, and you are in even more trouble.

That is why I maintain that your solution does not make things better.

Hth,
Dscho


* Re: git-commit fatal: Out of memory? mmap failed: Bad file descriptor
From: Alex Riesen @ 2008-01-12 20:16 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Git Mailing List, drafnel
In-Reply-To: <4787F1F5.2010905@nrlssc.navy.mil>

Brandon Casey, Fri, Jan 11, 2008 23:47:17 +0100:
> 
> It's reproducible for me by amending the commit.
> 
> Any suggestions?

strace -o log -f git commit -C HEAD --amend

and post the "log" here (assuming it failed)


* Re: [PATCH] Teach remote machinery about remotes.default config variable
From: Mark Levedahl @ 2008-01-12 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vwsqeubj8.fsf@gitster.siamese.dyndns.org>

Junio C Hamano wrote:
> Ahh.
>
> Does that suggest the new configuration thing is only about the
> "submodule update" command, not "remotes.default" that affects
> how the non-submodule merge and fetch works?
>
>   
Yes - this patch set was inspired by the single question of "how do I 
avoid needing to define origin as opposed to a server-specific nickname 
now that I am using sub-modules?"

Mark


* Re: [PATCH] git-svn: handle leading/trailing whitespace from svnsync revprops
From: Junio C Hamano @ 2008-01-12 19:31 UTC (permalink / raw)
  To: Eric Wong; +Cc: Dennis Schridde, git
In-Reply-To: <7vprw6ub1f.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Git.pm is even worse.  It uses the line-noise prototype...

Sorry, I misremembered.  Git.pm is not a prototype offender.


* Re: Re-casing directories on case-insensitive systems
From: Dmitry Potapov @ 2008-01-12 19:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kevin Ballard, Johannes Schindelin, git
In-Reply-To: <alpine.LFD.1.00.0801121040010.2806@woody.linux-foundation.org>

On Sat, Jan 12, 2008 at 10:47:10AM -0800, Linus Torvalds wrote:
> 
> And that isn't going to change. It's the only sane way to do 
> locale-independent names: people can *choose* to see the filenames as some 
> UTF-8 sequence, or a series of Latin1, or anything, but that's not 
> something git itself will care about.

Unfortunately, agreeing on a single encoding for different systems is
even more difficult than agreeing on a single end-of-line encoding.
OTOH, it is not a real issue as long as everyone uses ASCII names only.

> 
> Trying to involve locale in name comparison simply isn't possible.

Agreed. However, the proper solution would be that all filenames are
stored in UTF-8, so conversion is done when a file is added to the
index. But that requires a lot of work, and as I said before, I doubt
that many people really want to store files with non-ASCII names; after
all, Git is a developer tool. So, as far as I am concerned, it is not
worth the effort.

Dmitry


* Re: [PATCH 2/5] git-submodule: New subcommand 'summary' (2) - hard work
From: Junio C Hamano @ 2008-01-12 19:25 UTC (permalink / raw)
  To: Ping Yin; +Cc: git, gitster
In-Reply-To: <46dff0320801120312i7b22f13vb9fe2394b1f687a9@mail.gmail.com>

"Ping Yin" <pkufranky@gmail.com> writes:

>> +                       echo "* $name $sha1_src...$sha1_dst:"
>
> If it's a type change (head submodule but index blob, or the
> reverse), $sha1_dst or $sha1_src will be the sha1 of the blob. It's
> inappropriate to show it as if it were a commit in the submodule. Maybe
> 00000000 should be shown instead of the blob sha1?

I do not think that adds much value.  When A or B is a
non-commit, you know that A...B notation does not apply, and
because it is probably a rare situation you would want to make
it even clearer to the reader by using a different
notation.  Like

    echo "* $name has changed from submodule $sha1_src to blob $sha1_dst!!"
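
As a rough illustration of the two notations, here is a minimal C sketch. The function name format_summary_header and its flag parameters are hypothetical and not part of git-submodule; it only shows the A...B form being reserved for the case where both sides are commits.

```c
#include <stdio.h>

/*
 * Format the header line for one submodule entry (hypothetical helper).
 * The A...B range notation is used only when both sides are commits;
 * a type change gets a visibly different wording instead.
 * Returns the number of characters snprintf() wanted to write.
 */
int format_summary_header(char *out, size_t outsz, const char *name,
			  const char *src, int src_is_commit,
			  const char *dst, int dst_is_commit)
{
	if (src_is_commit && dst_is_commit)
		return snprintf(out, outsz, "* %s %s...%s:", name, src, dst);

	return snprintf(out, outsz, "* %s changed from %s %s to %s %s",
			name,
			src_is_commit ? "submodule" : "blob", src,
			dst_is_commit ? "submodule" : "blob", dst);
}
```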

perhaps in red bold letter in larger font ;-)


* Re: [PATCH 2/5] git-submodule: New subcommand 'summary' (2) - hard work
From: Junio C Hamano @ 2008-01-12 19:21 UTC (permalink / raw)
  To: Ping Yin; +Cc: Junio C Hamano, git
In-Reply-To: <46dff0320801120424v1b780a97x8a4ecfcfe8e52f7@mail.gmail.com>

"Ping Yin" <pkufranky@gmail.com> writes:

>> >
>> > I think you would want to read full 40-char sha1_src and
>> > sha1_dst with "while read", and keep that full 40-char in these
>> > variables, and use them when calling rev-parse here.
>>
>> Hmm, precision is really a problem. However, "git diff --raw" will not
>> always give full 40-char sha1, instead it will give sha1 with enough
>> length. So maybe I can use the sha1 from "git diff --raw" ?
>>
> Oh, I'm wrong. It seems 'git diff --raw' will always give full 40-char
> sha1 for submodule entry and abbreviated sha1 for blob entry.

It is not recommended to use "git diff" in scripts when you can
use one of the "git diff-*" plumbing commands.  In this case I think
you would want "git-diff-index".  Also see the --abbrev option.

You can never determine how many hexdigits are "enough" from the
containing project, as it does not have to have access to the
submodule object store.  That's the reason I suggested to read
full object name from diff-index and use it for error reporting
and object retrieval, and shorten it in the UI for normal status
noise.
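
A minimal C sketch of that split between storage and display, assuming a hypothetical helper abbrev_for_display and an arbitrary display length of 7: the full 40-digit name stays the value used for lookups and error messages, and only a copy is shortened for normal output.

```c
#include <stddef.h>
#include <string.h>

#define HEX_LEN 40	/* full SHA-1 object name in hex */
#define SHORT_LEN 7	/* display length; an arbitrary choice for this sketch */

/*
 * Copy a display-only abbreviation of a full object name into out.
 * The caller keeps using the full name for object retrieval.
 * Returns 0 on success, -1 if full is not exactly 40 lowercase hex
 * digits or if out is too small.
 */
int abbrev_for_display(const char *full, char *out, size_t outsz)
{
	if (strlen(full) != HEX_LEN ||
	    strspn(full, "0123456789abcdef") != HEX_LEN)
		return -1;
	if (outsz < SHORT_LEN + 1)
		return -1;
	memcpy(out, full, SHORT_LEN);
	out[SHORT_LEN] = '\0';
	return 0;
}
```

The sketch deliberately refuses anything shorter than a full name, since the containing project cannot tell whether a shorter prefix is unique in the submodule.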


* Re: [PATCH 3/5] git-submodule: New subcommand 'summary' (3) - limit summary size
From: Junio C Hamano @ 2008-01-12 19:17 UTC (permalink / raw)
  To: Ping Yin; +Cc: Junio C Hamano, git
In-Reply-To: <46dff0320801120151s7959edddp1e1f8b506da79e4e@mail.gmail.com>

"Ping Yin" <pkufranky@gmail.com> writes:

> On Jan 12, 2008 4:36 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Ping Yin <pkufranky@gmail.com> writes:
>>
>> > @@ -265,6 +267,10 @@ set_name_rev () {
>> >  #
>> >  modules_summary()
>> >  {
>> > +     summary_limit=${summary_limit:-1000000}
>>
>> Why a million?
> Because I think a million is big enough. I'd better define a constant
> for an unlimited number.

I think that is a wrong approach to begin with.  You are
assuming that you will always limit, and using an improbably
large limit to pretend it is unlimited.  Why not make the
summary list generator truly capable of producing an unlimited
list?

I also think using 100 or so as a sane default, allowing the
user to override to say "I do not want any limitation", is a
much better choice.
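
One way to express "really unlimited" rather than "very large" is to reserve a sentinel for it. The following is only a sketch in C (the script under review is shell); summary_count and SUMMARY_DEFAULT_LIMIT are hypothetical names, and using a negative value as the sentinel is an assumption of this example.

```c
#include <stddef.h>

#define SUMMARY_DEFAULT_LIMIT 100	/* the "100 or so" default suggested above */

/*
 * Return how many of `total` entries should be shown.
 * A negative limit means "no limit at all"; it is not emulated
 * with an improbably large number.
 */
size_t summary_count(size_t total, long limit)
{
	if (limit < 0)
		return total;
	return total < (size_t)limit ? total : (size_t)limit;
}
```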


* Re: [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions
From: Dmitry Potapov @ 2008-01-12 19:14 UTC (permalink / raw)
  To: Steffen Prohaska; +Cc: torvalds, git
In-Reply-To: <12001604531066-git-send-email-prohaska@zib.de>

On Sat, Jan 12, 2008 at 06:54:13PM +0100, Steffen Prohaska wrote:
> diff --git a/convert.c b/convert.c
> index 4df7559..598cf0b 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -132,6 +132,27 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
>  				*dst++ = c;
>  		} while (--len);
>  	}
> +	if (safe_crlf) {
> +		if ((action == CRLF_INPUT) || auto_crlf <= 0) {
> +			/* autocrlf=input: check if we removed CRLFs */
> +			if (buf->len != dst - buf->buf) {
> +				if (safe_crlf == SAFE_CRLF_WARN)
> +					warning("Stripped CRLF from %s.", path);
> +				else
> +					die("Refusing to strip CRLF from %s.", path);
> +			}

This check is okay, however

> +		} else {
> +			/* autocrlf=true: check if we had LFs (without CR) */
> +			if (stats.lf != stats.crlf) {
> +				if (safe_crlf == SAFE_CRLF_WARN)
> +					warning(
> +					  "Checkout will replace LFs with CRLF in %s", path);
> +				else
> +					die("Checkout would replace LFs with CRLF in %s", path);
> +			}
> +		}

this is not, because if you really want to be sure that the file will not be
mangled by checkout, you should not allow a text file with naked LFs when
autocrlf=true.  The following lines after gather_stats() let such a file slip
through, because they return before your check is reached:

		/* No CR? Nothing to convert, regardless. */
		if (!stats.cr)
			return 0;

So, I propose a slightly different patch for convert.c:

diff --git a/convert.c b/convert.c
index 4df7559..9fd88d9 100644
--- a/convert.c
+++ b/convert.c
@@ -90,9 +90,6 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 		return 0;
 
 	gather_stats(src, len, &stats);
-	/* No CR? Nothing to convert, regardless. */
-	if (!stats.cr)
-		return 0;
 
 	if (action == CRLF_GUESS) {
 		/*
@@ -108,8 +105,23 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 		 */
 		if (is_binary(len, &stats))
 			return 0;
+
+		if (safe_crlf) {
+			/* check if we have "naked" LFs */
+			if (stats.lf != stats.crlf) {
+				if (safe_crlf == SAFE_CRLF_WARN)
+					warning(
+					  "Checkout will replace LFs with CRLF in %s", path);
+				else
+					die("Checkout would replace LFs with CRLF in %s", path);
+			}
+		}
 	}
 
+	/* No CR? Nothing to convert, regardless. */
+	if (!stats.cr)
+		return 0;
+
 	/* only grow if not in place */
 	if (strbuf_avail(buf) + buf->len < len)
 		strbuf_grow(buf, len - buf->len);
@@ -131,6 +143,16 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 			if (! (c == '\r' && (1 < len && *src == '\n')))
 				*dst++ = c;
 		} while (--len);
+
+		if (safe_crlf && (action == CRLF_INPUT || auto_crlf <= 0)) {
+			/* autocrlf=input: check if we removed CRLFs */
+			if (buf->len != dst - buf->buf) {
+				if (safe_crlf == SAFE_CRLF_WARN)
+					warning("Stripped CRLF from %s.", path);
+				else
+					die("Refusing to strip CRLF from %s.", path);
+			}
+		}
 	}
 	strbuf_setlen(buf, dst - buf->buf);
 	return 1;


Dmitry


* Re: [ANNOUNCE] GIT 1.5.4-rc3
From: Junio C Hamano @ 2008-01-12 19:13 UTC (permalink / raw)
  To: Roger C. Soares; +Cc: git
In-Reply-To: <4788CDAC.5030409@intelinet.com.br>

"Roger C. Soares" <rogersoares@intelinet.com.br> writes:

> To start, I already had git installed from EPEL.
> Downloaded perl-Error from
> http://dag.wieers.com/rpm/packages/perl-Error/ to satisfy
> dependencies. There was another dependency for git-arch I think, but as
> I don't need it I just deleted this one.
> When trying to install perl-Error it conflicted with perl-Git from
> EPEL. I think they included perl-Error files inside their perl-Git rpm.
> So, after uninstalling all git rpms from EPEL, installing perl-Error
> from dag.wieers, the rc3 git rpms installed successfully.

Thanks for your report.

I do not know what EPEL is (sorry, I do not live in RPM land),
but I think a package perl-Git that includes perl-Error is
misbuilt.  We are not the official source of where the users
should get perl-Error from.


* Re: [ANNOUNCE] GIT 1.5.4-rc3
From: Junio C Hamano @ 2008-01-12 19:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Ismail Dönmez, git
In-Reply-To: <20080112090432.GA6134@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> It might be more readable to actually set a variable pathspec_size and
> use that.

Ahh.

Yes, the "seen" thing as René suggests is moderately painful to
get right and in the longer run I think we need an API clean-up
around pathspec handling.  In any case, the fix looks correct.

Thanks for catching this.


* Re: [PATCH] git-svn: handle leading/trailing whitespace from svnsync revprops
From: Junio C Hamano @ 2008-01-12 18:57 UTC (permalink / raw)
  To: Eric Wong; +Cc: Dennis Schridde, git
In-Reply-To: <20080112091242.GA27109@soma>

Eric Wong <normalperson@yhbt.net> writes:

> The statements are not equivalent, however.  I'd have to add
>
> 	$var = $1;
>
> too, because I needed to extract what was inside the ( ) since the '$'
> doesn't catch the trailing newline, either.

Ahh, _stupid me_.

Yes, you said '$', not '\Z', but somehow I mistook m|^(.*)$| as
a no-op "whole thing".  Sorry.

> Good points, I've been mindlessly taking "interesting" things from other
> Perl code I've seen over the years and using it in my own without
> thinking about it too hard :x
>
> I'll avoid them in the future.  Unfortunately, Git.pm also suffers from
> this as well.

Git.pm is even worse.  It uses the line-noise prototype which is
a very good and cute hack to allow people to (1) emulate Perl's
built-ins and (2) come up with syntax sugars, but has a similar
issue that defeats old-school intuition as wantarray-return
subroutines do.

The caller needs to be careful about receiving return values
with wantarray-return subroutines.  The caller needs to be
careful about how to send in the parameters with line-noise
prototyped subs.

In any case, this kind of clean-up is not within the scope of
changes during rc cycle.  I'll take your bugfix as is.

Thanks.


* Re: Re-casing directories on case-insensitive systems
From: Linus Torvalds @ 2008-01-12 18:47 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: Kevin Ballard, Johannes Schindelin, git
In-Reply-To: <20080112144629.GE2963@dpotapov.dyndns.org>



On Sat, 12 Jan 2008, Dmitry Potapov wrote:
> 
> After cursory look at the source code, I wonder if converting name1
> and name2 to upper case before memcmp in cache_name_compare() can
> help case-insensitive systems. This change will change the order of
> file names in the index, but I suppose that it should not be a problem,
> because the index is host specific. Though, this fix is too simple, so
> I guess, I missed something.

No, the index isn't host-specific, and we also have a deep knowledge of 
the fact that the index order is the same as the unpacked tree order.

So no, we absolutely cannot just sort the index differently. We literally 
need to have a separate key for an "upper case lookup".

(That separate key can be just a hash table - it doesn't need to be 
something you can iterate over, so it can be pretty simple).
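
A minimal sketch of such a side table in C, under stated assumptions: all names here (fold_table, fold_table_add) are hypothetical, only ASCII letters are folded, and the table has a fixed size to keep the example short. The index itself would stay sorted by plain byte comparison; this table exists only to answer "is there already a name that differs just in case?".

```c
#include <stddef.h>

#define TABLE_SIZE 64	/* power of two; fixed size keeps the sketch short */

struct fold_table {
	const char *slot[TABLE_SIZE];
};

/* Fold only ASCII letters; every other byte is left untouched. */
static unsigned char fold(unsigned char c)
{
	return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

static unsigned int fold_hash(const char *name)
{
	unsigned int h = 5381;

	for (; *name; name++)
		h = h * 33 + fold((unsigned char)*name);
	return h;
}

static int fold_equal(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (fold((unsigned char)*a) != fold((unsigned char)*b))
			return 0;
	return *a == *b;	/* equal only if both strings ended together */
}

/*
 * Add name to the table.  Returns the previously added name that
 * matches case-insensitively, or NULL if name was inserted.
 * A full table also returns NULL; a real implementation would grow.
 */
const char *fold_table_add(struct fold_table *t, const char *name)
{
	unsigned int i = fold_hash(name) & (TABLE_SIZE - 1);
	unsigned int probes;

	for (probes = 0; probes < TABLE_SIZE; probes++) {
		if (!t->slot[i]) {
			t->slot[i] = name;
			return NULL;
		}
		if (fold_equal(t->slot[i], name))
			return t->slot[i];
		i = (i + 1) & (TABLE_SIZE - 1);
	}
	return NULL;
}
```

Because the table is never iterated, its internal order does not matter and cannot leak into the index or tree order.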

> > (And that's totally ignoring the fact that case-insensitivity then also 
> > has tons of i18n issues and can get *really* messy 
> 
> The proper support of i18n is not simple even without case-insensitivity.
> For instance, there are four different encodings widely used for Russian
> letters.

.. and git is very clear about this: filenames are *not* "characters" in 
the i18n sense, they are series of bytes. There is absolutely no room for 
ambiguity, and there is no locale for those things.

And that isn't going to change. It's the only sane way to do 
locale-independent names: people can *choose* to see the filenames as some 
UTF-8 sequence, or a series of Latin1, or anything, but that's not 
something git itself will care about.

Trying to involve locale in name comparison simply isn't possible. Two 
different repositories on two different filesystems would get two 
different answers. And that is simply unacceptable in a distributed 
system.

What we can do is to make the simple cases (ie the locale-*independent* 
ones) warn about problems with case insensitivity.

			Linus


* Re: [PATCH] Teach remote machinery about remotes.default config variable
From: Junio C Hamano @ 2008-01-12 18:46 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: git
In-Reply-To: <4788BFA8.2030508@gmail.com>

Mark Levedahl <mlevedahl@gmail.com> writes:

> Junio C Hamano wrote:
>> Sorry, I may be missing something.
>>
>> Even if you have a submodule, you can go there and that will be
>> a valid freestanding repository.  You can always be explicit,
>> bypassing any behaviour that defaults to 'origin' to avoid
>> ambiguity.
>>
> "git-submodule update" *requires* that origin is defined in all
> sub-modules. There is no way to avoid this behavior.

Ahh.

Does that suggest the new configuration thing is only about the
"submodule update" command, not "remotes.default" that affects
how the non-submodule merge and fetch works?


* Re: [PATCH decompress BUG] Fix decompress_next_from() wrong argument value
From: Junio C Hamano @ 2008-01-12 18:44 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <e5bfff550801112342w4faee040nad294f3962160180@mail.gmail.com>

"Marco Costalba" <mcostalba@gmail.com> writes:

> Do you prefer patches differently organized or I can keep the same
> patch contents (of course with squashing the bug fixes in) ?

My impression was that the organization was good (addition of
the helpers, and then conversion to existing code to use the
helper piece-by-piece), even though I admit that I did not look
at them very deeply.


* Re: valgrind test script integration
From: Jeff King @ 2008-01-12 18:12 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.LSU.1.00.0801121808150.8333@wbgn129.biozentrum.uni-wuerzburg.de>

On Sat, Jan 12, 2008 at 06:10:30PM +0100, Johannes Schindelin wrote:

> Nevertheless, I think that would be better.
> 
> BTW does your first patch cope with scripts properly? (I.e. also valgrind 
> the git programs called by the script)

No. To do that, you would need to set up an alternate directory at the
head of the PATH with 'git' in it (and git-*, for everything we want to
intercept, which I think would be all builtins, but probably not scripts
(unless you want to run valgrind on perl or bash, which is probably not
useful to us)).

I started down that route, but it was a little ugly. How do we make that
directory? Where is it stored? Is it generated each time the test script
is run, or part of the Makefile?

-Peff


* Re: [ANNOUNCE] GIT 1.5.4-rc3
From: Jeff King @ 2008-01-12 18:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Ismail Dönmez, Junio C Hamano, git
In-Reply-To: <alpine.LSU.1.00.0801121756400.8333@wbgn129.biozentrum.uni-wuerzburg.de>

On Sat, Jan 12, 2008 at 05:57:49PM +0100, Johannes Schindelin wrote:

> >  		if (pathspec) {
> > -			memset(seen, 0, argc);
> > +			memset(seen, 0, argc > 0 ? argc : 1);
> >  			matches = match_pathspec(pathspec, ent->name, ent->len,
> >  						 baselen, seen);
> >  		} else {
> 
> Would it not be better to guard the memset by an "if (argc)", and set 
> "seen" to NULL by default?

I am not sure what you mean by "guard with if (argc)"; it needs to be
memset in either case (either with N slots, or with 1 if no argc). Seen
could not be NULL previously, but Rene's patch makes that possible.
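
To make the "N slots, or 1 if no argc" rule explicit, the size could be computed in one place. This is a sketch only; seen_size, alloc_seen and reset_seen are hypothetical helpers, not functions from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* One byte per pathspec, but never fewer than one byte. */
size_t seen_size(int argc)
{
	return argc > 0 ? (size_t)argc : 1;
}

/* Allocate a zeroed "seen" buffer; the caller frees it. */
char *alloc_seen(int argc)
{
	return calloc(seen_size(argc), 1);
}

/* Clear the buffer before matching the next entry. */
void reset_seen(char *seen, int argc)
{
	memset(seen, 0, seen_size(argc));
}
```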

-Peff


* Re: Project Hosting with git ?
From: Stephen Sinclair @ 2008-01-12 18:08 UTC (permalink / raw)
  To: Grégoire Barbier; +Cc: git
In-Reply-To: <47890078.3050809@gbarbier.org>

> I don't think so. A working http-push over webdav would be a dumb
> protocol (passive filesystem upload).

I see.  Not having read your whole thread, I assumed you intended
http-push to be active, more like the svn counterpart.  My bad.
That said, I think having an active http-push would be pretty useful.
A passive http-push would be useful mainly for getting past http proxy
servers I guess.  Definitely could be nice for that.

Steve


* Re: Project Hosting with git ?
From: Grégoire Barbier @ 2008-01-12 18:01 UTC (permalink / raw)
  To: Stephen Sinclair; +Cc: git
In-Reply-To: <9b3e2dc20801120954k24f7ccb6vf019f30843ff1b84@mail.gmail.com>

Stephen Sinclair a écrit :
>> I disagree: git does not work "fine" over http, it only works fine for
>> fetch/pull.
> 
> You're taking me out of context.  I meant it works fine for public
> hosting so that users can easily clone and create patches.  This is
> the main motivation for publishing your repo, so in that sense it
> "works fine".  (for me)

I agree.

> Since SF supports ssh, there's no reason to need http-push.  I wish
> they would just provide some recent git binaries on the sourceforge
> server, then we could git-push properly over ssh instead of having to
> use scp or rsync.  I think that would be a good start. *shrug*
> 
> In any case, http-push over webdav would still require git binaries to
> be installed somewhere on SF, so it's essentially the same problem.

I don't think so. A working http-push over webdav would be a dumb 
protocol (passive filesystem upload).
However, as you said before, this would not be taken into account by SF 
statistics and menus.

-- 
Grégoire Barbier - gb à gbarbier.org - +33 6 21 35 73 49


* [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions
From: Steffen Prohaska @ 2008-01-12 17:54 UTC (permalink / raw)
  To: torvalds, dpotapov, git; +Cc: Steffen Prohaska
In-Reply-To: <alpine.LFD.1.00.0801111103420.3148@woody.linux-foundation.org>

I promised to think about the CRLF discussion and here is what
I believe we could do:
 - Leave the current core.autocrlf mechanism as is.
 - Add a mechanism to warn the user if an irreversible conversion happens
 - After we have the mechanisms for configuring the conversion and for
   configuring the safety level, we can decide which defaults to use on
   the different platforms, namely Windows and Unix.

I propose to set the following defaults:
 - Unix: core.autocrlf=input, core.safecrlf=warn
 - Windows: core.autocrlf=true, core.safecrlf=warn

This patch is declared as WIP because tests and documentation are missing.
I'm also not sure if calling warning() and die() is the right thing to do at
this place.  Interestingly, in some (all?) cases, crlf_to_git() is called two
times for a path during git add, resulting in the warning being printed two
times.  I didn't yet analyze why this happens.  Maybe the warnings and errors
printed should be more verbose?

[ Linus, Dmitry was right about stats.lf. ]

    Steffen

---- snip snap ---

CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout.  A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git.  For text
files this does not really matter because we do not care about
the line endings anyway; but for binary files that are
accidentally classified as text the conversion can result in
corrupted data.

If you recognize such corruption during commit you can easily fix
it by setting the conversion type explicitly in .gitattributes.
Right after committing you still have the original file in your
work tree and this file is not yet corrupted.

However, in mixed Windows/Unix environments text files quite
easily can end up containing a mixture of CRLF and LF line
endings and git should handle such situations gracefully.  For
example a user could copy a CRLF file from Windows to Unix and
mix it with an existing LF file there.  The result would contain
both types of line endings.

Unfortunately, the desired effect of cleaning up text files
with mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished.  In both cases CRLFs are removed
in an irreversible way.  For text files this is the right thing
to do, while for binary files it corrupts data.

In a sane environment committing and checking out the same file
should not modify the original file in the work tree.  For
autocrlf=input the original file must not contain CRLF.  For
autocrlf=true the original file must not contain LF without
preceding CR.  Otherwise the conversion is irreversible.  Note,
git might be able to recreate the original file with different
autocrlf settings, but in the current environment checking out
will yield a file that differs from the file before the commit.
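
The two conditions can be restated as a small stand-alone check. This is a sketch under assumptions, not the code of the patch: the names eol_stats, count_eols and roundtrip_is_safe are hypothetical, lone CRs are ignored, and no text/binary detection is done. As with stats.lf and stats.crlf in the patch, lf counts every LF and crlf counts CR LF pairs.

```c
#include <stddef.h>

struct eol_stats {
	size_t lf;	/* every '\n', whether or not preceded by '\r' */
	size_t crlf;	/* "\r\n" pairs */
};

enum autocrlf_mode { AUTOCRLF_FALSE, AUTOCRLF_INPUT, AUTOCRLF_TRUE };

void count_eols(const char *buf, size_t len, struct eol_stats *st)
{
	size_t i;

	st->lf = st->crlf = 0;
	for (i = 0; i < len; i++) {
		if (buf[i] != '\n')
			continue;
		st->lf++;
		if (i > 0 && buf[i - 1] == '\r')
			st->crlf++;
	}
}

/*
 * Return 1 if committing and then checking out the buffer would
 * reproduce its line endings, 0 if the conversion is irreversible.
 */
int roundtrip_is_safe(const char *buf, size_t len, enum autocrlf_mode mode)
{
	struct eol_stats st;

	count_eols(buf, len, &st);
	switch (mode) {
	case AUTOCRLF_INPUT:
		return st.crlf == 0;		/* no CRLF may be stripped */
	case AUTOCRLF_TRUE:
		return st.lf == st.crlf;	/* no LF without preceding CR */
	default:
		return 1;			/* no conversion takes place */
	}
}
```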

This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert.  The
mechanism is controlled by the variable core.safecrlf, with the
following values
 - false: disable safecrlf mechanism
 - warn: warn about irreversible conversions
 - true: refuse irreversible conversions

The default is to warn.

A concept of a safety check was originally proposed in a similar
way by Linus Torvalds.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 cache.h       |    8 ++++++++
 config.c      |    9 +++++++++
 convert.c     |   21 +++++++++++++++++++++
 environment.c |    1 +
 4 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/cache.h b/cache.h
index 39331c2..4e03e3d 100644
--- a/cache.h
+++ b/cache.h
@@ -330,6 +330,14 @@ extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern int auto_crlf;
 
+enum safe_crlf {
+	SAFE_CRLF_FALSE = 0,
+	SAFE_CRLF_FAIL = 1,
+	SAFE_CRLF_WARN = 2,
+};
+
+extern enum safe_crlf safe_crlf;
+
 #define GIT_REPO_VERSION 0
 extern int repository_format_version;
 extern int check_repository_format(void);
diff --git a/config.c b/config.c
index 857deb6..0a46046 100644
--- a/config.c
+++ b/config.c
@@ -407,6 +407,15 @@ int git_default_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.safecrlf")) {
+		if (value && !strcasecmp(value, "warn")) {
+			safe_crlf = SAFE_CRLF_WARN;
+			return 0;
+		}
+		safe_crlf = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "user.name")) {
 		strlcpy(git_default_name, value, sizeof(git_default_name));
 		return 0;
diff --git a/convert.c b/convert.c
index 4df7559..598cf0b 100644
--- a/convert.c
+++ b/convert.c
@@ -132,6 +132,27 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 				*dst++ = c;
 		} while (--len);
 	}
+	if (safe_crlf) {
+		if ((action == CRLF_INPUT) || auto_crlf <= 0) {
+			/* autocrlf=input: check if we removed CRLFs */
+			if (buf->len != dst - buf->buf) {
+				if (safe_crlf == SAFE_CRLF_WARN)
+					warning("Stripped CRLF from %s.", path);
+				else
+					die("Refusing to strip CRLF from %s.", path);
+			}
+		} else {
+			/* autocrlf=true: check if we had LFs (without CR) */
+			if (stats.lf != stats.crlf) {
+				if (safe_crlf == SAFE_CRLF_WARN)
+					warning(
+					  "Checkout will replace LFs with CRLF in %s", path);
+				else
+					die("Checkout would replace LFs with CRLF in %s", path);
+			}
+		}
+	}
+
 	strbuf_setlen(buf, dst - buf->buf);
 	return 1;
 }
diff --git a/environment.c b/environment.c
index 18a1c4e..e351e99 100644
--- a/environment.c
+++ b/environment.c
@@ -35,6 +35,7 @@ int pager_use_color = 1;
 char *editor_program;
 char *excludes_file;
 int auto_crlf = 0;	/* 1: both ways, -1: only when adding git objects */
+enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
 
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
-- 
1.5.4.rc2.60.g46ee


* Re: Project Hosting with git ?
From: Stephen Sinclair @ 2008-01-12 17:54 UTC (permalink / raw)
  To: Grégoire Barbier; +Cc: git
In-Reply-To: <4788FBDE.6090903@gbarbier.org>

> I disagree: git does not work "fine" over http, it only works fine for
> fetch/pull.

You're taking me out of context.  I meant it works fine for public
hosting so that users can easily clone and create patches.  This is
the main motivation for publishing your repo, so in that sense it
"works fine".  (for me)
Since SF supports ssh, there's no reason to need http-push.  I wish
they would just provide some recent git binaries on the sourceforge
server, then we could git-push properly over ssh instead of having to
use scp or rsync.  I think that would be a good start. *shrug*

In any case, http-push over webdav would still require git binaries to
be installed somewhere on SF, so it's essentially the same problem.


Steve


* Re: Project Hosting with git ?
From: Jakub Narebski @ 2008-01-12 17:46 UTC (permalink / raw)
  To: Stephen Sinclair; +Cc: Neshama Parhoti, git
In-Reply-To: <9b3e2dc20801120845n15d59fe6q178ba257c12a28e0@mail.gmail.com>

"Stephen Sinclair" <radarsat1@gmail.com> writes:
> Neshama Parhoti wrote:

> > I mean, if I open a SourceForge project, I have to use cvs/subversion right ?
> >
> > Is there any way to use git ?
> 
> There is currently an open feature request on sourceforge for git support.
> Please feel free to add a comment to the thread, hopefully if enough
> people do so they'll do something about it.
> (Though I wouldn't be surprised if they're working on it.)
> 
> https://sourceforge.net/tracker/?func=detail&atid=350001&aid=1828327&group_id=1

Savannah, which is a FOSS hosting site using a SourceForge-derived engine
(named Savane), has Git support.  What is a bit strange is that its fork
Gna!, which also uses Savane, does not have Git support.  This is because
Savannah runs the "cleanup" branch of Savane while Gna! runs plain (?) Savane;
see http://git.or.cz/gitwiki/InterfacesFrontendsAndToolsWishlist#Savane

Alioth (hosting for Debian-related projects only), which uses a modified
GForge engine (itself a fork of the SourceForge engine), also has
git support.

Not that it helps much in adding Git support to SourceForge, as both
Savane and GForge are GPL, and IIRC the SF.net engine is now closed-source.

> However, git works fine over http.  I have a project on SF which I was
> using with subversion, but I recently switched the project over to
> git.
> 
> I simply posted a bare git repo on the project website, and bang it's
> "hosted" on sourceforge.  In order to automate things a bit, I set up
> a local repo which, when I push to it, runs git-update-server-info and

You need only to enable (chmod a+x) default update hook for that.

> then uses rsync to upload the repo changes to the SF web server.
> 
> It seems to work fine.  I do occasionally git-clone the http-hosted
> repo just to make sure things are still working, and so far no
> problems.

Nice solution; it can be used with any software hosting (BerliOS or
Sarovar for example) which provides rsync or another way of syncing.

> The downside is that SF will not collect statistics on the git repo.
> However, I've been using ohloh.net to track it instead, which works
> wonderfully.

You can use GitStat in the same way as you provide the Git repository,
  http://tree.celinuxforum.org/gitstat/
  http://sourceforge.net/projects/gitstat/
  http://www.ohloh.net/projects/8207?p=gitstat
but Ohloh has quite nice features.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: Project Hosting with git ?
From: Grégoire Barbier @ 2008-01-12 17:41 UTC (permalink / raw)
  To: Stephen Sinclair; +Cc: Neshama Parhoti, git
In-Reply-To: <9b3e2dc20801120845n15d59fe6q178ba257c12a28e0@mail.gmail.com>

Stephen Sinclair a écrit :
> However, git works fine over http.  I have a project on SF which I was
> using with subversion, but I recently switched the project over to
> git.

I disagree: git does not work "fine" over http, it only works fine for 
fetch/pull.
At least with recent versions, push over http/webdav does not work (and by 
the way corrupts the remote repository, by changing the HEAD sha without 
uploading related objects).
I initiated a thread about that on the list a few weeks ago, with 
subject "git over webdav: what can I do for improving http-push ?". 
However I did not (yet?) post a patch.

> I simply posted a bare git repo on the project website, and bang it's
> "hosted" on sourceforge.  In order to automate things a bit, I set up
> a local repo which, when I push to it, runs git-update-server-info and
> then uses rsync to upload the repo changes to the SF web server.

I agree with this: if you push locally and then upload, it works fine. 
With rsync, ftp, sftp or whatever you want.

-- 
Grégoire Barbier - gb à gbarbier.org - +33 6 21 35 73 49


* Re: [PATCH] Teach remote machinery about remotes.default config variable
From: Mark Levedahl @ 2008-01-12 17:29 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git
In-Reply-To: <alpine.LSU.1.00.0801121748290.8333@wbgn129.biozentrum.uni-wuerzburg.de>

Johannes Schindelin wrote:
>
> No, that was not _at all_ my argument.
>
> I said that hiding it under a different name _that you have to look up, 
> too_ does _not_ make things easier.
>
>   
Granted, *IF* we had to look it up, but we don't. In fact, we use the 
convention
    servername.foo.bar
has nickname
    servername

So, we need to know the server name we are using, and that server name 
is the nickname. So, no confusion and no extra lookup step. (Our server 
names are unique without the domain suffixes, so this works well for us).
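
For illustration only, the naming convention amounts to cutting the host name at its first dot. A small C sketch, with remote_nickname as a hypothetical helper that is not part of the patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * Write the part of host before the first '.' into out, e.g.
 * "servername.foo.bar" becomes "servername".
 * Returns 0 on success, -1 if the nickname would be empty or
 * does not fit into out.
 */
int remote_nickname(const char *host, char *out, size_t outsz)
{
	size_t n = strcspn(host, ".");

	if (n == 0 || n >= outsz)
		return -1;
	memcpy(out, host, n);
	out[n] = '\0';
	return 0;
}
```

This only works when server names are unique without their domain suffixes, as stated above.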

Mark


