Git development
 help / color / mirror / Atom feed
* Re: Error : git svn fetch
From: Eric Wong @ 2009-01-08  2:55 UTC (permalink / raw)
  To: chongyc; +Cc: git
In-Reply-To: <E48CF49FF0FE4F96BE206B2689165AF9@VMware>

chongyc <chongyc27@gmail.com> wrote:
> Hi
> 
> I found that 'git svn fetch' failed in cloning the hudson svn reposotory.
> 
> I want to git-clone the svn repository
> 
> svn repository URL : https://svn.dev.java.net/svn/hudson/
> username : guest
> password :
> 
> 
> So I run followings to git-clone
> 
> [root@localhost hudson]# git --version
> git version 1.6.0.6
> [root@localhost hudson]# git svn init -T trunk -t tags -b branches 
> https://svn.dev.java.net/svn/hudson/
> [root@localhost hudson]# git svn fetch
> Found possible branch point: 
> https://svn.dev.java.net/svn/hudson/tags/hudson-1_230 => 
> https://svn.dev.java.net/svn/hudson/branches/buildnav-1636, 10490
> Initializing parent: buildnav-1636@10490
> Found possible branch point: 
> https://svn.dev.java.net/svn/hudson/trunk/hudson/main => 
> https://svn.dev.java.net/svn/hudson/tags/hudson-1_230, 10450
> Initializing parent: buildnav-1636@10450
> Found branch parent: (buildnav-1636@10490) a1c395e5db063ca1ffbbe008e309c5
> 11d56219e0
> Following parent with do_switch
> remoting/pom.xml was not found in commit 
> a1c395e5db063ca1ffbbe008e309c511d56219e0 (r10447)
> [root@localhost hudson]#
> 
> What shall I do to git-clone it ?
> 
> Please help me

Hi, sorry for the late reply, I've been very distracted.

Looking at the hudson repository, the layout is non-standard and very
complex, with subdirectories being branched and tagged all over.  The
standard globbing that git-svn uses for most repositories does won't
work.  You'll have to map things manually:

[svn-remote "svn"]
        url = https://svn.dev.java.net/svn/hudson
        fetch = trunk/hudson:refs/remotes/trunk
        fetch = branches/tom:refs/remotes/tom
	...

Alternately, you could just clone the root and have all the branches all
over the place in one tree (your eventually working copy will be huge).

  git svn clone https://svn.dev.java.net/svn/hudson


Basically this is the equivalent of:

  svn co https://svn.dev.java.net/svn/hudson

Except you'll have the full history.

-- 
Eric Wong

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Shawn O. Pearce @ 2009-01-08  3:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: R. Tyler Ballance, Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <alpine.LFD.2.00.0901071836290.3283@localhost.localdomain>

Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
> >
> >         tyler@grapefruit:~/source/git/linux-2.6> git pull
> >         error: failed to read object be1b87c70af69acfadb8a27a7a76dfb61de92643 at offset 1850923
> >         from .git/objects/pack/pack-dbe154052997a05499eb6b4fd90b924da68e799a.pack
> >         fatal: object be1b87c70af69acfadb8a27a7a76dfb61de92643 is corrupted
> 
> Btw, this is an interesting error message, mostly because of what is 
> _not_ there.
> 
> In particular, it doesn't report any reason _why_ it failed to read the 
> object, which as far as I can tell can happen for only one reason: 
> unpack_compressed_entry() returns NULL, and that path is the only thing 
> that can do so without a message.
> 
> And it only does it if zlib fails.

Ok, well, in this case I've been able to reproduce a zlib inflate
failure on the base object in a 2 deep delta chain.  We got back:

  #define Z_STREAM_ERROR (-2)

this causes the buffer to be freed and NULL to come back out of
unpack_compressed_entry(), and then everything is corrupt...

-- 
Shawn.

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  2:52 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <1231374514.8870.621.camel@starfruit>



On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
>
>         tyler@grapefruit:~/source/git/linux-2.6> git pull
>         error: failed to read object be1b87c70af69acfadb8a27a7a76dfb61de92643 at offset 1850923
>         from .git/objects/pack/pack-dbe154052997a05499eb6b4fd90b924da68e799a.pack
>         fatal: object be1b87c70af69acfadb8a27a7a76dfb61de92643 is corrupted

Btw, this is an interesting error message, mostly because of what is 
_not_ there.

In particular, it doesn't report any reason _why_ it failed to read the 
object, which as far as I can tell can happen for only one reason: 
unpack_compressed_entry() returns NULL, and that path is the only thing 
that can do so without a message.

And it only does it if zlib fails.

Now, zlib can fail because the unpacking fails, but it can fail for other 
cases too.

And the thing is, we don't check/report those kinds of failure cases very 
well. Which really bit us here, because if we had checked the return value 
of inflateInit(), we'd almost certainly would have gotten a nice big "you 
ran out of memory" thing, and we wouldn't have been chasing this down as a 
corruption issue.

We probably should wrap all the "inflateInit()" calls, and do something 
like

	static void xinflateInit(z_streamp strm)
	{
		const char *err;

		switch (inflateInit(strm)) {
		case Z_OK:
			return;
		case Z_MEM_ERROR:
			err = "out of memory";
			break;
		case Z_VERSION_ERROR:
			err = "wrong version";
			break;
		default:
			err = "error";
		}
		die("inflateInit: %s (%s)", err,
			strm->msg ? strm->msg : "no message");
	}

or similar. That way we'd get good error reports when we run out of 
memory, rather than consider it to be a corruption issue.

We could also try to make a few of these wrappers actually release some of 
the memory (the way xmmap() does), but there are likely diminishing 
returns. And the much more important issue is the proper error reporting: 
if we had reported "out of memory", we'd not have spent so much time 
chasing disk corruption etc.

			Linus

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Boyd Stephen Smith Jr. @ 2009-01-08  2:52 UTC (permalink / raw)
  To: James Pickens
  Cc: Git ML, R. Tyler Ballance, Nicolas Pitre, Jan Krüger, kb,
	Linus Torvalds
In-Reply-To: <885649360901071821t2ea481b5k83ab800f6aeb897@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 945 bytes --]

On Wednesday 2009 January 07 20:21:18 James Pickens wrote:
>On Wed, Jan 7, 2009, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> Can you cnfirm that your "reproducible" case starts working with that
>> addition to your ~/.gitconfig? If so, the solution is pretty simple: we
>> should just lower the default pack windowsize.
>
>Umm... isn't that more of a workaround than a solution?  I.e., if you lower
>the default pack windowsize, couldn't the corruption still happen under the
>right conditions?

IMHO:
I agree, somewhat.  I'm fine with a "die()" message when there's not enough 
memory, but either corruption or just a spurious, but scary "<SHA> is 
corrupt" messages should be fixed.
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@iguanasuicide.net                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Shawn O. Pearce @ 2009-01-08  2:43 UTC (permalink / raw)
  To: James Pickens
  Cc: Git ML, R. Tyler Ballance, Nicolas Pitre, Jan Krüger, kb,
	Linus Torvalds
In-Reply-To: <885649360901071821t2ea481b5k83ab800f6aeb897@mail.gmail.com>

James Pickens <jepicken@gmail.com> wrote:
> On Wed, Jan 7, 2009, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > Can you cnfirm that your "reproducible" case starts working with that
> > addition to your ~/.gitconfig? If so, the solution is pretty simple: we
> > should just lower the default pack windowsize.
> 
> Umm... isn't that more of a workaround than a solution?  I.e., if you lower
> the default pack windowsize, couldn't the corruption still happen under the
> right conditions?

Uhm, yea.  So I managed to reproduce it on a Linux system here.
Different object ids than R. Tyler's case, but I'm going to try
to debug it and see why we are getting these.

For those following along at home, Linus' 2.6 tree:

$ ulimit -v `echo '150 * 1024'|bc -l`
$ git co 56d18e9932ebf4e8eca42d2ce509450e6c9c1666
HEAD is now at 56d18e9... Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
$ git merge 9e42d0cf5020aaf217433cad1a224745241d212a
Updating 56d18e9..9e42d0c
error: failed to read delta base object ef135b90084f3c54fccea4e273aeff029db2d873 at offset 48342508 from .git/objects/pack/pack-edb47354be787909e05c15bd1d9eb8b4684d2e4d.pack
error: failed to read delta base object c4e828b71d96622bb258938d69aab9cec53d5cae at offset 128427683 from .git/objects/pack/pack-edb47354be787909e05c15bd1d9eb8b4684d2e4d.pack
error: failed to read object 3cd5a6463cfd9306095bf6312a9b7ab09d4f2f5d at offset 128427777 from .git/objects/pack/pack-edb47354be787909e05c15bd1d9eb8b4684d2e4d.pack
fatal: object 3cd5a6463cfd9306095bf6312a9b7ab09d4f2f5d is corrupted

No, the repository is not corrupt.  We f'd up our memory management
somewhere.

-- 
Shawn.

^ permalink raw reply

* Re: collapsing commits with rebase
From: Boyd Stephen Smith Jr. @ 2009-01-08  2:39 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: geoffrey.russell, git
In-Reply-To: <20090108023224.GU21154@genesis.frugalware.org>

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]

On Wednesday 2009 January 07 20:32:24 Miklos Vajna wrote:
>On Wed, Jan 07, 2009 at 08:11:32PM -0600, "Boyd Stephen Smith Jr." 
<bss@iguanasuicide.net> wrote:
>> git merge -s sha(D)
>
>You probably mean --squash here, -s stands for --strategy - and I *hope*
>you don't have git-sha(D) in your PATH, as a custom merge strategy. ;-)

Oops.  Yes.  My mistake.
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@iguanasuicide.net                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: collapsing commits with rebase
From: Miklos Vajna @ 2009-01-08  2:32 UTC (permalink / raw)
  To: Boyd Stephen Smith Jr.; +Cc: geoffrey.russell, git
In-Reply-To: <200901072011.37299.bss@iguanasuicide.net>

[-- Attachment #1: Type: text/plain, Size: 266 bytes --]

On Wed, Jan 07, 2009 at 08:11:32PM -0600, "Boyd Stephen Smith Jr." <bss@iguanasuicide.net> wrote:
> git merge -s sha(D)

You probably mean --squash here, -s stands for --strategy - and I *hope*
you don't have git-sha(D) in your PATH, as a custom merge strategy. ;-)

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Sam Vilain @ 2009-01-08  2:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Pierre Habouzit, davidel, Francis Galiegue,
	Git ML
In-Reply-To: <alpine.LFD.2.00.0901071222300.3057@localhost.localdomain>

Linus Torvalds wrote:
> On Thu, 8 Jan 2009, Sam Vilain wrote:
>   
>> Whatever happens, the current deterministic diff algorithm needs to stay
>> for generating patch-id's... those really can't be allowed to change.
>>     
>
> Sure they can.
>
> We never cache patch-id's over a long time. And we _have_ changed xdiff to 
> modify the output of the patches before, quite regardless of any patience 
> issues: see commit 9b28d55401a529ff08c709f42f66e765c93b0a20, which 
> admittedly doesn't affect any _normal_ diffs, but can generate subtly 
> different results for some cases.
>   

There's at least one person who thinks that they should be deterministic 
enough to be able to be placed in commit messages;

http://article.gmane.org/gmane.comp.version-control.git/95671

Now of course the git cherry-pick feature to add the old patch ID to the 
commit message isn't written yet; but unless there was a fall-back mode 
to produce a "stable" patch ID, these breadcrumbs would become (even 
more) worthless.

Sam

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: James Pickens @ 2009-01-08  2:21 UTC (permalink / raw)
  To: Git ML
  Cc: R. Tyler Ballance, Nicolas Pitre, Jan Krüger, kb,
	Linus Torvalds
In-Reply-To: <alpine.LFD.2.00.0901071644330.3283@localhost.localdomain>

On Wed, Jan 7, 2009, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Can you cnfirm that your "reproducible" case starts working with that
> addition to your ~/.gitconfig? If so, the solution is pretty simple: we
> should just lower the default pack windowsize.

Umm... isn't that more of a workaround than a solution?  I.e., if you lower
the default pack windowsize, couldn't the corruption still happen under the
right conditions?

James

^ permalink raw reply

* Re: collapsing commits with rebase
From: Boyd Stephen Smith Jr. @ 2009-01-08  2:11 UTC (permalink / raw)
  To: geoffrey.russell; +Cc: git
In-Reply-To: <93c3eada0901071608r190a723bma502b68c4ab81a08@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

On Wednesday 2009 January 07 18:08:44 Geoff Russell wrote:
>I have a series of commits:
>
>    A---B---C---D---E---F

Assuming you also have a ref (e.g. Foo) that points to F:
git checkout sha(B)
git merge -s sha(D)
git rebase --onto $(cat .git/HEAD) sha(E) Foo

should do what you want.

After the checkout:
A -> B [HEAD] -> C -> D -> E -> F [Foo]
(detached HEAD)

After the merge:
A -> B -> C -> D -> E -> F [Foo]
     |
     +--> [HEAD]
(detached HEAD)

After the rebase:
A -> B -> E' -> F' [Foo, HEAD]
     |
     +--> C -> D -> E -> F
(C, D, E, and F will be pruned at some point)
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@iguanasuicide.net                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: collapsing commits with rebase
From: Geoff Russell @ 2009-01-08  1:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0901080144270.30769@pacific.mpi-cbg.de>

On Thu, Jan 8, 2009 at 11:15 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 8 Jan 2009, Geoff Russell wrote:
>
>> Dear gits,
>>
>> I have a series of commits:
>>
>>     A---B---C---D---E---F
>>
>> I want to collapse B---C---D into one single commit. git rebase -i B
>> will allow me to do this, but I'm looking for a non-interactive
>> incantation.
>
> You set GIT_EDITOR to a script ;-)

This is plan B.

>
> Alternatively, something like this should work for you:
>
>        $ git checkout A
>        $ git read-tree -u -m D
>        $ git commit -m "My message"
>        $ git cherry-pick E
>        $ git cherry-pick F

Plan B is looking good, because I'd generally like the commit message to be the
concatenation of the messages for B,C and D.

Many thanks.

Geoff.

^ permalink raw reply

* Re: [PATCH v3] parse-opt: migrate builtin-ls-files.
From: Miklos Vajna @ 2009-01-08  1:54 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Pierre Habouzit, git
In-Reply-To: <alpine.DEB.1.00.0901080207520.30769@pacific.mpi-cbg.de>

[-- Attachment #1: Type: text/plain, Size: 407 bytes --]

On Thu, Jan 08, 2009 at 02:10:15AM +0100, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > Interdiff: git diff b2a38d9..ee34fcc in my repo.
> 
> ... or on repo.or.cz:
> 
> http://repo.or.cz/w/git/vmiklos.git?a=treediff;hpb=b2a38d9;hb=ee34fcc

That *is* what I meant by 'my repo'. ;-)

BTW thanks for the link, I didn't know about it - AFAIK stock gitweb
does not have a treediff view.

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Shawn O. Pearce @ 2009-01-08  1:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: R. Tyler Ballance, Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <alpine.LFD.2.00.0901071726020.3283@localhost.localdomain>

Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, 7 Jan 2009, Linus Torvalds wrote:
> > 
> > So there's a few rules to packedgitwindowsize:
> > 
> >  - we need to be able to have at least two windows open at a time, in 
> >    addition to all the "normal" memory git needs just for objects, of 
> >    course. And quite frankly, you'd be better off with a few more windows, 
> >    even if that obviously implies smaller windows.
> 
> Btw, I'm not 100% certain of this. Somebody should double-check me. Maybe 
> there are cases where we want more than two windows alive. And maybe there 
> aren't even that, and we can always make do with just one.
> 
> So I will _not_ guarantee that "at least two pack windows" is necessarily 
> the right answer. The windowing code was mostly other people doing it. I 
> think Shawn and Nico.

I was fairly certain we needed at least two windows open at once,
but reviewing the code in sha1_file.c I don't see a reason for that
restriction anymore.

I think it used to have to do with the delta reconstruction; to
unpack a delta we would read the delta header from one window,
but we may need base data from another.  The delta unpack code
kept the delta window pinned in use, so we couldn't replace it
to access base data from elsewhere, hence we needed two windows.
Thinking about it now I don't recall how we handled the recusion
on a delta chain longer than 2.  ;-)

But looking at the code we have long since refactored it so this
isn't an issue anymore.  We release the window between reading
the delta header and reading the base, so the delta window can be
replaced if necessary.  I think the "2 window minimum" is just a
performance suggestion, not a requirement.

-- 
Shawn.

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  1:29 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <alpine.LFD.2.00.0901071702190.3283@localhost.localdomain>



On Wed, 7 Jan 2009, Linus Torvalds wrote:
> 
> So there's a few rules to packedgitwindowsize:
> 
>  - we need to be able to have at least two windows open at a time, in 
>    addition to all the "normal" memory git needs just for objects, of 
>    course. And quite frankly, you'd be better off with a few more windows, 
>    even if that obviously implies smaller windows.

Btw, I'm not 100% certain of this. Somebody should double-check me. Maybe 
there are cases where we want more than two windows alive. And maybe there 
aren't even that, and we can always make do with just one.

So I will _not_ guarantee that "at least two pack windows" is necessarily 
the right answer. The windowing code was mostly other people doing it. I 
think Shawn and Nico.

			Linus

^ permalink raw reply

* Re: [PATCH v3] parse-opt: migrate builtin-ls-files.
From: Johannes Schindelin @ 2009-01-08  1:10 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Junio C Hamano, Pierre Habouzit, git
In-Reply-To: <1231376145-32331-1-git-send-email-vmiklos@frugalware.org>

Hi,

On Thu, 8 Jan 2009, Miklos Vajna wrote:

> Interdiff: git diff b2a38d9..ee34fcc in my repo.

... or on repo.or.cz:

http://repo.or.cz/w/git/vmiklos.git?a=treediff;hpb=b2a38d9;hb=ee34fcc

Ciao,
Dscho

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  1:08 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <1231376259.8870.633.camel@starfruit>



On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
> > 
> > Can you cnfirm that your "reproducible" case starts working with that 
> > addition to your ~/.gitconfig? If so, the solution is pretty simple: we 
> > should just lower the default pack windowsize.
> 
> This certainly corrected the issue, is there some magic
> packedgitwindowsize that i should be looking at my own repository (our
> internal one) in order to prevent the issue from occurring? 
> 
> Looking into .git/objects/pack, I think the two biggest pack files are
> 3.5G and 177MBG respectively :-!

So there's a few rules to packedgitwindowsize:

 - we need to be able to have at least two windows open at a time, in 
   addition to all the "normal" memory git needs just for objects, of 
   course. And quite frankly, you'd be better off with a few more windows, 
  even if that obviously implies smaller windows.

 - the window size really wants to be a round power-of-two number, and at 
   _least_ it wants to be a nice multiple of the 2*page size.

So if you have a virtual memory limit of 1.5GB, I'd hesitate to make the 
pack window size less than 512M, and 256M is probably better. That way, 
I'd expect you to be able to always have at least four windows open 
(assuming a reasonably generous half a gigabyte for "other stuff").

And quite frankly, there's not a huge downside to making them smaller. At 
"just" 32MB, you'll still fit plenty of data in one pack window, and while 
it will cost you a few mmap/unmap's to switch windows around, most 
operations simply will not likely ever notice. At least not under Linux, 
where mmap/munmap is pretty cheap.

		Linus

^ permalink raw reply

* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-08  1:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <alpine.LFD.2.00.0901071652490.3283@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 2255 bytes --]

On Wed, 2009-01-07 at 17:01 -0800, Linus Torvalds wrote:
> 
> On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
> > 
> > I was only mentioning OS X with regards to the Samba/NFS red herring,
> > the rest of our operations are on 64-bit Linux machines.
> 
> Ahh, ok. Good. 
> 
> > > I could easily see that if you have a virtual memory size limit of 1.5GB, 
> > > and the pack window size is 1GB, we might have trouble. Because we could 
> > > only keep one such pack window in memory at a time.
> > 
> > The DEFAULT_PACKED_GIT_WINDOW_SIZE in our local builds is 256M, FWIW
> 
> Interesting. So you already had to lower it. However, now that you mention 
> it, and now that I search for your emails about it on the mailing list (I 
> don't normally read the mailing list except very occasionally), I see your 
> patch that does
> 
> 	#define DYNAMIC_WINDOW_SIZE_PERCENTAGE 0.85
> 	...
> 	packed_git_window_size = (unsigned int)(as->rlim_cur * DYNAMIC_WINDOW_SIZE_PERCENTAGE);
> 
> which is actually very bad.
> 
> It's bad for several reasons:
> 
>  - 85% of the virtual address space is actually pessimal.
> 
>    You need space for AT LEAST two full-sized windows, so you need less 
>    than 50%.
> 
>  - the way that variable is used, it _has_ to be a multiple of the page 
>    size. In fact, it needs to be a multiple of _twice_ the page size. So 
>    just doing a random fraction of the rlimit is not correct.

This patch never made it into any of our Git builds because my flight
landed and it wasn't stable enough (and as you pointed out, it sucks ;))



> 
> Setting it in the .gitconfig does it right, though.
> 
> > > If so, then adding a
> > > 
> > > 	[core]
> > > 		packedgitwindowsize = 64M
> > > 
> > > might make a difference. It would certainly be very interesting to hear if 
> > > there's any impact.
> > 
> > I can try this still if you'd like, but it doesn't seem like that'd be
> > the issue since we're already lowering the window size system-wide
> 
> Please do try, at least if your local git changes still match that patch I 
> found, because that patch generates problems.

See my prior reply in "Public repo case!" sent at 4:57PST

-- 
-R. Tyler Ballance
Slide, Inc.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  1:01 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <1231375780.8870.629.camel@starfruit>



On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
> 
> I was only mentioning OS X with regards to the Samba/NFS red herring,
> the rest of our operations are on 64-bit Linux machines.

Ahh, ok. Good. 

> > I could easily see that if you have a virtual memory size limit of 1.5GB, 
> > and the pack window size is 1GB, we might have trouble. Because we could 
> > only keep one such pack window in memory at a time.
> 
> The DEFAULT_PACKED_GIT_WINDOW_SIZE in our local builds is 256M, FWIW

Interesting. So you already had to lower it. However, now that you mention 
it, and now that I search for your emails about it on the mailing list (I 
don't normally read the mailing list except very occasionally), I see your 
patch that does

	#define DYNAMIC_WINDOW_SIZE_PERCENTAGE 0.85
	...
	packed_git_window_size = (unsigned int)(as->rlim_cur * DYNAMIC_WINDOW_SIZE_PERCENTAGE);

which is actually very bad.

It's bad for several reasons:

 - 85% of the virtual address space is actually pessimal.

   You need space for AT LEAST two full-sized windows, so you need less 
   than 50%.

 - the way that variable is used, it _has_ to be a multiple of the page 
   size. In fact, it needs to be a multiple of _twice_ the page size. So 
   just doing a random fraction of the rlimit is not correct.

Setting it in the .gitconfig does it right, though.

> > If so, then adding a
> > 
> > 	[core]
> > 		packedgitwindowsize = 64M
> > 
> > might make a difference. It would certainly be very interesting to hear if 
> > there's any impact.
> 
> I can try this still if you'd like, but it doesn't seem like that'd be
> the issue since we're already lowering the window size system-wide

Please do try, at least if your local git changes still match that patch I 
found, because that patch generates problems.

		Linus

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-08  0:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <alpine.LFD.2.00.0901071644330.3283@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]

On Wed, 2009-01-07 at 16:48 -0800, Linus Torvalds wrote:
> 
> On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
> >
> > My most esteemed colleague (Ken aka kb) who pointed out the memory issue
> > was on the right path (I think), and I have a reproduction case you can
> > try with your very own Linux kernel tree!
> > 
> > WOO!
> > 
> > I set ulimit -v really low (150M), and the operations I made got an
> > mmap(2) fatal error, but there is a sweet spot that I found, see the
> > transcript below.
> 
> This is indeed the packfile mapping. The sweet spot you found depends on 
> how big the biggest two pack-files are, I do believe.
> 
> And if you do that
> 
> 	[core]
> 		packedgitwindowsize = 64M
> 
> I think you'll find that it works. Of course, with a _really_ low ulimit, 
> you'd need to make it even smaller, but at some point you start hitting 
> other problems than the pack-file limits, ie just the simple fact that git 
> wants and expects you to have a certain amount of memory available ;)
> 
> Can you cnfirm that your "reproducible" case starts working with that 
> addition to your ~/.gitconfig? If so, the solution is pretty simple: we 
> should just lower the default pack windowsize.

This certainly corrected the issue, is there some magic
packedgitwindowsize that i should be looking at my own repository (our
internal one) in order to prevent the issue from occurring? 

Looking into .git/objects/pack, I think the two biggest pack files are
3.5G and 177MBG respectively :-!


Cheers
-- 
-R. Tyler Ballance
Slide, Inc.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* [PATCH v3] parse-opt: migrate builtin-ls-files.
From: Miklos Vajna @ 2009-01-08  0:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Pierre Habouzit, git
In-Reply-To: <20090107144640.GD831@artemis.corp>

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
---

On Wed, Jan 07, 2009 at 03:46:40PM +0100, Pierre Habouzit <madcoder@debian.org> wrote:
> > +   if (unset)
> > +           dir->show_ignored = 0;
> > +   else
> > +           dir->show_ignored = 1;
>
> dir->show_ignored = !unset ?

True, cleaned up all 3 occurrences.

Interdiff: git diff b2a38d9..ee34fcc in my repo.

 builtin-ls-files.c |  294 ++++++++++++++++++++++++++++------------------------
 1 files changed, 159 insertions(+), 135 deletions(-)

diff --git a/builtin-ls-files.c b/builtin-ls-files.c
index f72eb85..ffa8210 100644
--- a/builtin-ls-files.c
+++ b/builtin-ls-files.c
@@ -10,6 +10,7 @@
 #include "dir.h"
 #include "builtin.h"
 #include "tree.h"
+#include "parse-options.h"
 
 static int abbrev;
 static int show_deleted;
@@ -28,6 +29,7 @@ static const char **pathspec;
 static int error_unmatch;
 static char *ps_matched;
 static const char *with_tree;
+static int exc_given;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -395,156 +397,178 @@ int report_path_error(const char *ps_matched, const char **pathspec, int prefix_
 	return errors;
 }
 
-static const char ls_files_usage[] =
-	"git ls-files [-z] [-t] [-v] (--[cached|deleted|others|stage|unmerged|killed|modified])* "
-	"[ --ignored ] [--exclude=<pattern>] [--exclude-from=<file>] "
-	"[ --exclude-per-directory=<filename> ] [--exclude-standard] "
-	"[--full-name] [--abbrev] [--] [<file>]*";
+static const char * const ls_files_usage[] = {
+	"git ls-files [options] [<file>]*",
+	NULL
+};
+
+static int option_parse_z(const struct option *opt,
+			  const char *arg, int unset)
+{
+	if (unset)
+		line_terminator = '\n';
+	else
+		line_terminator = 0;
+	return 0;
+}
+
+static int option_parse_exclude(const struct option *opt,
+				const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	exc_given = 1;
+	add_exclude(arg, "", 0, &dir->exclude_list[EXC_CMDL]);
+
+	return 0;
+}
+
+static int option_parse_exclude_from(const struct option *opt,
+				     const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	exc_given = 1;
+	add_excludes_from_file(dir, arg);
+
+	return 0;
+}
+
+static int option_parse_exclude_standard(const struct option *opt,
+					 const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	exc_given = 1;
+	setup_standard_excludes(dir);
+
+	return 0;
+}
+
+static int option_parse_ignored(const struct option *opt,
+				const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	dir->show_ignored = !unset;
+
+	return 0;
+}
+
+static int option_parse_directory(const struct option *opt,
+				  const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	dir->show_other_directories = !unset;
+
+	return 0;
+}
+
+static int option_parse_empty(const struct option *opt,
+				 const char *arg, int unset)
+{
+	struct dir_struct *dir = opt->value;
+
+	dir->hide_empty_directories = unset;
+
+	return 0;
+}
+
+static int option_parse_full_name(const struct option *opt,
+				  const char *arg, int unset)
+{
+	prefix_offset = 0;
+
+	return 0;
+}
 
 int cmd_ls_files(int argc, const char **argv, const char *prefix)
 {
-	int i;
-	int exc_given = 0, require_work_tree = 0;
+	int require_work_tree = 0, show_tag = 0;
 	struct dir_struct dir;
+	struct option builtin_ls_files_options[] = {
+		{ OPTION_CALLBACK, 'z', NULL, NULL, NULL,
+			"paths are separated with NUL character",
+			PARSE_OPT_NOARG, option_parse_z },
+		OPT_BOOLEAN('t', NULL, &show_tag,
+			"identify the file status with tags"),
+		OPT_BOOLEAN('v', NULL, &show_valid_bit,
+			"use lowercase letters for 'assume unchanged' files"),
+		OPT_BOOLEAN('c', "cached", &show_cached,
+				"show cached files in the output (default)"),
+		OPT_BOOLEAN('d', "deleted", &show_deleted,
+				"show deleted files in the output"),
+		OPT_BOOLEAN('m', "modified", &show_modified,
+				"show modified files in the output"),
+		OPT_BOOLEAN('o', "others", &show_others,
+				"show other files in the output"),
+		{ OPTION_CALLBACK, 'i', "ignored", &dir, NULL,
+			"show ignored files in the output",
+			PARSE_OPT_NOARG, option_parse_ignored },
+		OPT_BOOLEAN('s', "stage", &show_stage,
+			"show staged contents' object name in the output"),
+		OPT_BOOLEAN('k', "killed", &show_killed,
+			"show files on the filesystem that need to be removed"),
+		{ OPTION_CALLBACK, 0, "directory", &dir, NULL,
+			"show 'other' directories' name only",
+			PARSE_OPT_NOARG, option_parse_directory },
+		{ OPTION_CALLBACK, 0, "empty-directory", &dir, NULL,
+			"list empty directories",
+			PARSE_OPT_NOARG, option_parse_empty },
+		OPT_BOOLEAN('u', "unmerged", &show_unmerged,
+			"show unmerged files in the output"),
+		{ OPTION_CALLBACK, 'x', "exclude", &dir, "pattern",
+			"skip files matching pattern",
+			0, option_parse_exclude },
+		{ OPTION_CALLBACK, 'X', "exclude-from", &dir, "file",
+			"exclude patterns are read from <file>",
+			0, option_parse_exclude_from },
+		OPT_STRING(0, "exclude-per-directory", &dir.exclude_per_dir, "file",
+			"read additional per-directory exclude patterns in <file>"),
+		{ OPTION_CALLBACK, 0, "exclude-standard", &dir, NULL,
+			"add the standard git exclusions",
+			PARSE_OPT_NOARG, option_parse_exclude_standard },
+		{ OPTION_CALLBACK, 0, "full-name", NULL, NULL,
+			"make the output relative to the project top directory",
+			PARSE_OPT_NOARG, option_parse_full_name },
+		OPT_BOOLEAN(0, "error-unmatch", &error_unmatch,
+			"if any <file> is not in the index, treat this as an error"),
+		OPT_STRING(0, "with-tree", &with_tree, "tree-ish",
+			"pretend that paths removed since <tree-ish> are still present"),
+		OPT__ABBREV(&abbrev),
+		OPT_END()
+	};
 
 	memset(&dir, 0, sizeof(dir));
 	if (prefix)
 		prefix_offset = strlen(prefix);
 	git_config(git_default_config, NULL);
 
-	for (i = 1; i < argc; i++) {
-		const char *arg = argv[i];
-
-		if (!strcmp(arg, "--")) {
-			i++;
-			break;
-		}
-		if (!strcmp(arg, "-z")) {
-			line_terminator = 0;
-			continue;
-		}
-		if (!strcmp(arg, "-t") || !strcmp(arg, "-v")) {
-			tag_cached = "H ";
-			tag_unmerged = "M ";
-			tag_removed = "R ";
-			tag_modified = "C ";
-			tag_other = "? ";
-			tag_killed = "K ";
-			if (arg[1] == 'v')
-				show_valid_bit = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-c") || !strcmp(arg, "--cached")) {
-			show_cached = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-d") || !strcmp(arg, "--deleted")) {
-			show_deleted = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-m") || !strcmp(arg, "--modified")) {
-			show_modified = 1;
-			require_work_tree = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-o") || !strcmp(arg, "--others")) {
-			show_others = 1;
-			require_work_tree = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-i") || !strcmp(arg, "--ignored")) {
-			dir.show_ignored = 1;
-			require_work_tree = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-s") || !strcmp(arg, "--stage")) {
-			show_stage = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-k") || !strcmp(arg, "--killed")) {
-			show_killed = 1;
-			require_work_tree = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--directory")) {
-			dir.show_other_directories = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--no-empty-directory")) {
-			dir.hide_empty_directories = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
-			/* There's no point in showing unmerged unless
-			 * you also show the stage information.
-			 */
-			show_stage = 1;
-			show_unmerged = 1;
-			continue;
-		}
-		if (!strcmp(arg, "-x") && i+1 < argc) {
-			exc_given = 1;
-			add_exclude(argv[++i], "", 0, &dir.exclude_list[EXC_CMDL]);
-			continue;
-		}
-		if (!prefixcmp(arg, "--exclude=")) {
-			exc_given = 1;
-			add_exclude(arg+10, "", 0, &dir.exclude_list[EXC_CMDL]);
-			continue;
-		}
-		if (!strcmp(arg, "-X") && i+1 < argc) {
-			exc_given = 1;
-			add_excludes_from_file(&dir, argv[++i]);
-			continue;
-		}
-		if (!prefixcmp(arg, "--exclude-from=")) {
-			exc_given = 1;
-			add_excludes_from_file(&dir, arg+15);
-			continue;
-		}
-		if (!prefixcmp(arg, "--exclude-per-directory=")) {
-			exc_given = 1;
-			dir.exclude_per_dir = arg + 24;
-			continue;
-		}
-		if (!strcmp(arg, "--exclude-standard")) {
-			exc_given = 1;
-			setup_standard_excludes(&dir);
-			continue;
-		}
-		if (!strcmp(arg, "--full-name")) {
-			prefix_offset = 0;
-			continue;
-		}
-		if (!strcmp(arg, "--error-unmatch")) {
-			error_unmatch = 1;
-			continue;
-		}
-		if (!prefixcmp(arg, "--with-tree=")) {
-			with_tree = arg + 12;
-			continue;
-		}
-		if (!prefixcmp(arg, "--abbrev=")) {
-			abbrev = strtoul(arg+9, NULL, 10);
-			if (abbrev && abbrev < MINIMUM_ABBREV)
-				abbrev = MINIMUM_ABBREV;
-			else if (abbrev > 40)
-				abbrev = 40;
-			continue;
-		}
-		if (!strcmp(arg, "--abbrev")) {
-			abbrev = DEFAULT_ABBREV;
-			continue;
-		}
-		if (*arg == '-')
-			usage(ls_files_usage);
-		break;
+	argc = parse_options(argc, argv, builtin_ls_files_options,
+			ls_files_usage, 0);
+	if (show_tag || show_valid_bit) {
+		tag_cached = "H ";
+		tag_unmerged = "M ";
+		tag_removed = "R ";
+		tag_modified = "C ";
+		tag_other = "? ";
+		tag_killed = "K ";
 	}
+	if (show_modified || show_others || dir.show_ignored || show_killed)
+		require_work_tree = 1;
+	if (show_unmerged)
+		/* There's no point in showing unmerged unless
+		 * you also show the stage information.
+		 */
+		show_stage = 1;
+	if (dir.exclude_per_dir)
+		exc_given = 1;
 
 	if (require_work_tree && !is_inside_work_tree())
 		setup_work_tree();
 
-	pathspec = get_pathspec(prefix, argv + i);
+	pathspec = get_pathspec(prefix, argv);
 
 	/* Verify that the pathspec matches the prefix */
 	if (pathspec)
-- 
1.6.1

^ permalink raw reply related

* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-08  0:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <alpine.LFD.2.00.0901071621340.3283@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 2976 bytes --]

On Wed, 2009-01-07 at 16:37 -0800, Linus Torvalds wrote:
> 
> On Wed, 7 Jan 2009, Linus Torvalds wrote:
> >
> > >         > limit ~1.5GB -> corrupt file
> > >         > limit ~3GB -> magically no longer corrupt.
> > 
> > That is interesting, although I also worry that there might be other 
> > issues going on (ie since you've reported thigns magically fixing 
> > themselves, maybe the ulimit tests just _happened_ to show that, even if 
> > it wasn't the core reason).
> > 
> > BUT! This is definitely worth looking at.
> > 
> > For example, we do have some cases where we try to do "mmap()", and if it 
> > fails, we try to free some memory and try again. In particular, in 
> > xmmap(), if an mmap() fails - which may be due to running out of virtual 
> > address space - we'll actually try to release some pack-file memory and 
> > try again. Maybe there's a bug there - and it would be one that seldom 
> > triggers for others.
> 
> Ho humm. We really do have some interesting things there. 

Always enjoyable when these mail threads get this deep ;)

> 
> Is this a 64-bit machine? I didn't think OS X did that, but if there is 
> some limited 64-bit support there, maybe "sizeof(void *)" is 8, then we 
> default the default git pack-window to a pretty healthy 1GB.

I was only mentioning OS X with regards to the Samba/NFS red herring,
the rest of our operations are on 64-bit Linux machines.

The machine I reproduced this on ("Public repo case!") is the following:
        tyler@grapefruit:~> uname -a
        Linux grapefruit.corp.slide.com 2.6.27.7-9-default #1 SMP
        2008-12-04 18:10:04 +0100 x86_64 x86_64 x86_64 GNU/Linux
        tyler@grapefruit:~> cat /etc/issue
        Welcome to openSUSE 11.1   - Kernel \r (\l).
        
The machines we're experiencing this issue on "in the wild" are:
        xdev3 (master)% uname -a 
        Linux xdev3 2.6.24-22-server #1 SMP Mon Nov 24 20:06:28 UTC 2008
        x86_64 GNU/Linux
        xdev3 (master)% cat /etc/issue
        Ubuntu 8.04.1 \n \l
> 
> I could easily see that if you have a virtual memory size limit of 1.5GB, 
> and the pack window size is 1GB, we might have trouble. Because we could 
> only keep one such pack window in memory at a time.

The DEFAULT_PACKED_GIT_WINDOW_SIZE in our local builds is 256M, FWIW

> 
> I have _not_ looked at the code, though. I'd have expected a SIGSEGV if we 
> really had issues with the window handling.
> 
> Anyway, _if_ your system has 64-bit pointers, then _maybe_ something the 
> default 1GB pack window causes problem.
> 
> If so, then adding a
> 
> 	[core]
> 		packedgitwindowsize = 64M
> 
> might make a difference. It would certainly be very interesting to hear if 
> there's any impact.

I can try this still if you'd like, but it doesn't seem like that'd be
the issue since we're already lowering the window size system-wide



Cheers
-- 
-R. Tyler Ballance
Slide, Inc.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  0:48 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <1231374514.8870.621.camel@starfruit>



On Wed, 7 Jan 2009, R. Tyler Ballance wrote:
>
> My most esteemed colleague (Ken aka kb) who pointed out the memory issue
> was on the right path (I think), and I have a reproduction case you can
> try with your very own Linux kernel tree!
> 
> WOO!
> 
> I set ulimit -v really low (150M), and the operations I made got an
> mmap(2) fatal error, but there is a sweet spot that I found, see the
> transcript below.

This is indeed the packfile mapping. The sweet spot you found depends on 
how big the biggest two pack-files are, I do believe.

And if you do that

	[core]
		packedgitwindowsize = 64M

I think you'll find that it works. Of course, with a _really_ low ulimit, 
you'd need to make it even smaller, but at some point you start hitting 
other problems than the pack-file limits, ie just the simple fact that git 
wants and expects you to have a certain amount of memory available ;)

Can you cnfirm that your "reproducible" case starts working with that 
addition to your ~/.gitconfig? If so, the solution is pretty simple: we 
should just lower the default pack windowsize.

		Linus

^ permalink raw reply

* Re: collapsing commits with rebase
From: Johannes Schindelin @ 2009-01-08  0:45 UTC (permalink / raw)
  To: Geoff Russell; +Cc: git
In-Reply-To: <93c3eada0901071608r190a723bma502b68c4ab81a08@mail.gmail.com>

Hi,

On Thu, 8 Jan 2009, Geoff Russell wrote:

> Dear gits,
> 
> I have a series of commits:
> 
>     A---B---C---D---E---F
> 
> I want to collapse B---C---D into one single commit. git rebase -i B 
> will allow me to do this, but I'm looking for a non-interactive 
> incantation.

You set GIT_EDITOR to a script ;-)

Alternatively, something like this should work for you:

	$ git checkout A
	$ git read-tree -u -m D
	$ git commit -m "My message"
	$ git cherry-pick E
	$ git cherry-pick F

Hth,
Dscho

^ permalink raw reply

* Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: Linus Torvalds @ 2009-01-08  0:37 UTC (permalink / raw)
  To: R. Tyler Ballance; +Cc: Nicolas Pitre, Jan Krüger, Git ML
In-Reply-To: <alpine.LFD.2.00.0901071520330.3057@localhost.localdomain>



On Wed, 7 Jan 2009, Linus Torvalds wrote:
>
> >         > limit ~1.5GB -> corrupt file
> >         > limit ~3GB -> magically no longer corrupt.
> 
> That is interesting, although I also worry that there might be other 
> issues going on (ie since you've reported thigns magically fixing 
> themselves, maybe the ulimit tests just _happened_ to show that, even if 
> it wasn't the core reason).
> 
> BUT! This is definitely worth looking at.
> 
> For example, we do have some cases where we try to do "mmap()", and if it 
> fails, we try to free some memory and try again. In particular, in 
> xmmap(), if an mmap() fails - which may be due to running out of virtual 
> address space - we'll actually try to release some pack-file memory and 
> try again. Maybe there's a bug there - and it would be one that seldom 
> triggers for others.

Ho humm. We really do have some interesting things there. 

Is this a 64-bit machine? I didn't think OS X did that, but if there is 
some limited 64-bit support there, maybe "sizeof(void *)" is 8, then we 
default the default git pack-window to a pretty healthy 1GB.

I could easily see that if you have a virtual memory size limit of 1.5GB, 
and the pack window size is 1GB, we might have trouble. Because we could 
only keep one such pack window in memory at a time.

I have _not_ looked at the code, though. I'd have expected a SIGSEGV if we 
really had issues with the window handling.

Anyway, _if_ your system has 64-bit pointers, then _maybe_ something the 
default 1GB pack window causes problem.

If so, then adding a

	[core]
		packedgitwindowsize = 64M

might make a difference. It would certainly be very interesting to hear if 
there's any impact.

		Linus

^ permalink raw reply

* Public repro case! Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
From: R. Tyler Ballance @ 2009-01-08  0:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Jan Krüger, Git ML, kb
In-Reply-To: <alpine.LFD.2.00.0901071520330.3057@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 2370 bytes --]

On Wed, 2009-01-07 at 15:29 -0800, Linus Torvalds wrote:
> It is certainly possible. It's too bad that it's private, because it makes 
> it _much_ harder to try to pinpoint this.

My most esteemed colleague (Ken aka kb) who pointed out the memory issue
was on the right path (I think), and I have a reproduction case you can
try with your very own Linux kernel tree!

WOO!

I set ulimit -v really low (150M), and the operations I made got an
mmap(2) fatal error, but there is a sweet spot that I found, see the
transcript below. I basically chose an arbitrary revision from a couple
of weeks ago, and rolled the repository back to that point, then I tried
with iterations of ulimit -v 150, 250, 450, and then back down to 350.

        tyler@grapefruit:~/source/git/linux-2.6> limit
        cputime         unlimited
        filesize        unlimited
        datasize        unlimited
        stacksize       8MB
        coredumpsize    0kB
        memoryuse       2561MB
        maxproc         24564
        descriptors     1024
        memorylocked    64kB
        addressspace    unlimited
        maxfilelocks    unlimited
        sigpending      24564
        msgqueue        819200
        nice            0
        rt_priority     0
        tyler@grapefruit:~/source/git/linux-2.6> export
        START=56d18e9932ebf4e8eca42d2ce509450e6c9c1666
        tyler@grapefruit:~/source/git/linux-2.6> git reset --hard $START
        HEAD is now at 56d18e9 Merge branch 'upstream' of
        git://ftp.linux-mips.org/pub/scm/upstream-linus
        tyler@grapefruit:~/source/git/linux-2.6> ulimit -v `echo "350 *
        1024" | bc -l`
        tyler@grapefruit:~/source/git/linux-2.6> git pull
        error: failed to read object
        be1b87c70af69acfadb8a27a7a76dfb61de92643 at offset 1850923
        from .git/objects/pack/pack-dbe154052997a05499eb6b4fd90b924da68e799a.pack
        fatal: object be1b87c70af69acfadb8a27a7a76dfb61de92643 is
        corrupted
        tyler@grapefruit:~/source/git/linux-2.6>
        
I've tried this a couple of times, and it does seem to be reproducible,
let me know if you have any issues reproducing it locally and I'll try
to dig into it more with valgrind or something a bit more pin-pointing
than "ulimit -v && try, try again"


Cheers
-- 
-R. Tyler Ballance
Slide, Inc.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox