Git development
* Re: [PATCH] Rename ".dotest/" to ".git/rebase" and ".dotest-merge" to "rebase-merge"
From: Olivier Marin @ 2008-07-23 14:54 UTC (permalink / raw)
  To: Stephan Beyer
  Cc: Junio C Hamano, Theodore Tso, Nanako Shiraishi,
	Johannes Schindelin, René Scharfe, Joe Fiorini, git,
	Jari Aalto
In-Reply-To: <20080723011341.GE5904@leksak.fem-net>

Stephan Beyer wrote:
> 
>>> Perhaps I am confused, but ...

I can understand. ;-)

>>> Why is there "HEAD" and "ORIG_HEAD" and not only "ORIG_HEAD"?
>> Just being a bit defensive -- in this case I think it might be Ok to say
>> "read-tree --reset -u ORIG_HEAD", but I haven't checked in a conflicted
>> case.

"git read-tree --reset -u ORIG_HEAD" discards local changes, which is not good.

> Well, the test suite fails:
> * FAIL 4: am --abort goes back after failed am
> 
>                         git-am --abort &&
>                         git rev-parse HEAD >actual &&
>                         git rev-parse initial >expect &&
>                         test_cmp expect actual &&
>   here>                 test_cmp file-2-expect file-2 &&

Local changes have been lost.

> The reason of my question was that I *blindly* incorporated the change into
> sequencer to make it able to work on a dirty working tree and thus to be
> able to migrate am onto it without losing the ability to apply patches
> on a dirty working tree....

Are you talking about your seq-proto-dev3 branch?

> All am tests applied afterwards, but the sequencer and the rebase-i
> test suite failed in a place where I didn't expect it. I *then* had
> a deeper look at the read-tree line and I was wondering what the "HEAD"
> should achieve.
> I removed it and all tests passed. (I didn't have t4151 in my branch
> at that point.)
> 
> Now, because t4151 does not pass, I am wondering what's the best thing
> I could do...

I looked at your code. You use reset_almost_hard() instead of "reset --hard",
which is fine, but you did not update require_clean_work_tree() to be less
restrictive and let the sequencer work with local modifications. I think
these two lines must be removed:
  git update-index --ignore-submodules --refresh &&
  git diff-files --quiet --ignore-submodules &&

Try that with the original read-tree line and t4151 should pass.

Ah, you should also change "Applying 6" to "Applying \"6\"" in
t4151-am-abort.sh.

Olivier.


* Re: q: faster way to integrate/merge lots of topic branches?
From: Jay Soffian @ 2008-07-23 14:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723134926.GA12888@elte.hu>

On Wed, Jul 23, 2008 at 9:49 AM, Ingo Molnar <mingo@elte.hu> wrote:
> #!/bin/bash
>
> usage () {
>  echo 'usage: git-fastmerge <refspec>..'
>  exit -1
> }
>
> [ $# = 0 ] && usage
>
> BRANCH=$1
>
> MERGECACHE=.git/mergecache
>
> [ ! -d $MERGECACHE ] && { mkdir $MERGECACHE || usage; }
>
> HEAD_SHA1=$(git-log -1 --pretty=format:"%H")
> BRANCH_SHA1=$(git-log -1 --pretty=format:"%H" $BRANCH)
>
> CACHE=$MERGECACHE/$HEAD_SHA1/$BRANCH_SHA1
>
> [ -f "$CACHE" -a "$CACHE" -nt .git/refs/heads/$BRANCH_SHA1 ] && {

Shouldn't this be:

[ -f "$CACHE" -a "$CACHE" -nt .git/refs/heads/$BRANCH ] && {

?

j.


* Re: [PATCH] index-pack: never prune base_cache.
From: Johannes Schindelin @ 2008-07-23 14:41 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: Pierre Habouzit, spearce, Git ML, Junio C Hamano
In-Reply-To: <20080723134448.GB11679@atjola.homenet>

Hi,

On Wed, 23 Jul 2008, Björn Steinbrink wrote:

> diff --git a/index-pack.c b/index-pack.c
> index ac20a46..33ba8ef 100644
> --- a/index-pack.c
> +++ b/index-pack.c
> @@ -699,6 +699,9 @@ static struct object_entry *append_obj_to_pack(
>  	write_or_die(output_fd, header, n);
>  	obj[0].idx.crc32 = crc32(0, Z_NULL, 0);
>  	obj[0].idx.crc32 = crc32(obj[0].idx.crc32, header, n);
> +	obj[0].hdr_size = n;
> +	obj[0].type = type;
> +	obj[0].size = size;
>  	obj[1].idx.offset = obj[0].idx.offset + n;
>  	obj[1].idx.offset += write_compressed(output_fd, buf, size, &obj[0].idx.crc32);
>  	hashcpy(obj->idx.sha1, sha1);

I confirm that the issues I saw went away with this patch, and it
obviously looks like the correct approach.

The only things valgrind is still complaining about (apart from libz, 
which I will not bother commenting about) are uninitialized parts of the 
data being written to disk, and a crc over them.

Judging from the addresses, those are probably the bytes that are padded 
for 4- or 8-byte alignment, so they are probably fine.

Thanks,
Dscho


* Re: [RFC] Git User's Survey 2008
From: Dmitry Potapov @ 2008-07-23 14:38 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Stephan Beyer
In-Reply-To: <200807230325.04184.jnareb@gmail.com>

On Wed, Jul 23, 2008 at 03:25:03AM +0200, Jakub Narebski wrote:
>    02. What is your preferred non-programming language?
>   (or) What is the language you want computer communicate with you?

IMHO, the latter wording of the question is much better.

>    05. How did you hear about Git?
>        (single choice?, in 2007 it was free-form)
>      - Linux kernel news (LKML, LWN, KernelTrap, KernelTraffic,...),
>        news site or magazine, blog entry, some project uses it,
>        presentation or seminar (real life, not on-line), SCM research,
>        IRC, mailing list, other Internet, other off-line, other(*)

I think "friend" would be a reasonable choice here too.

>    09. When did you start using git? From which version?
>      - pre 1.0, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5
>      + might be important when checking "what did you find hardest" etc.
>      + perhaps we should ask in addition to this question, or in place
>        of this question (replacing it) what git version one uses; it
>        should be multiple choice, and allow 'master', 'next', 'pu',
>        'dirty (with own modifications)' versions in addition.

I think "What version do you use now?" and "How long have you been using
git?" may be more useful here. Asking "from which version" may give rather
confusing results, because someone may have "started" with 1.4 a week ago
just because that is the version included in Debian Etch and, after realizing
that version 1.4 has serious usability issues, upgraded git to 1.5. Besides,
1.5 has been around for a long time now (almost as long as all previous
versions), so 1.5 can mean either one month of usage or 18 months...


Dmitry


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 14:14 UTC (permalink / raw)
  To: Sergey Vlasov; +Cc: git
In-Reply-To: <20080723140959.GB9537@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> Even assuming that the filesystem is sane, is my merge-cache 
> implementation semantically equivalent to a git-merge? One detail is 
> that i suspect it is not equivalent in the git-merge --no-ff case. 
> (but that is a not too interesting non-default case anyway)

actually, since --no-ff creates a merge commit and thus propagates the 
head sha1, this should work fine as well.

(besides the small detail that my script has $1 hardcoded so parameters 
are not properly passed on.)

	Ingo


* Re: [RFC] Git User's Survey 2008
From: Stephan Beyer @ 2008-07-23 14:12 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <200807230325.04184.jnareb@gmail.com>

Hi,

Jakub Narebski wrote:
> I'd rather avoid free-form questions, even if they are more interesting,
> as they are PITA to analyse and summarize, especially to create some 
> kind of histogram from free-form replies data (some of 2007 free-form
> responses are not fully summarized even now).

Then we should use a web-based survey, because e-mail-based will always
be used to write free-form answers, I think.

So for multiple choice questions an "Other" item is often useful, but
-- if the survey service allows it -- I'd prefer if the "Other" item
enables a text input field (free-form) to let the user be more concrete.
Those values are informational only, but could be added to Git User's
Survey 2009, if they occur more than once. :)

> Third, where to send survey to / where to publish information about the 
> survey?  Last year the announcement was send to git mailing list, to
> LKML (Linux kernel mailing list), and mailing list for git projects 
> found on GitProjects page on GIT wiki.  Now that the number of projects 
> using Git as version control system has grown, I don't think it would 
> be good idea to "spam" all those mailing list; and if we don't send 
> notice to all other projects I'm not sure if we should include LKML.

Hmm, perhaps we could spam some news sites[1] on the web and keep the lists
clean.  Of course, this is advertising for git, too ;-)

[1] I could write something for German-speaking pro-linux.de and symlink.ch
    though I don't know if they take it as news.

> Last year survey announcement was put on Git Homepage (thanks Pasky), 
> and on front page of Git Wiki; info about survey was also put on two 
> git hosting sites: kernel.org and repo.or.cz.

Nice. That should be done again ;-)

> Last year the survey was meant to take three weeks, but was up longer.

Perhaps this is much too much, but my first thought was: 8 weeks.
Hmm, perhaps 5 weeks?

>    04. Which programming languages you are proficient with?

This is a really nasty multiple choice question.

>        (The choices include programming languages used by git)
>        (zero or more: multiple choice)
>      - C, shell, Perl, Python, Tcl/Tk
>      + (should we include other languages, like C++, Java, PHP,
>         Ruby,...?)

Perhaps yes.
The programming language list
	https://www.ohloh.net/tools
could be a start %)

>      - Linux kernel news (LKML, LWN, KernelTrap, KernelTraffic,...),
>        news site or magazine, blog entry, some project uses it,
>        presentation or seminar (real life, not on-line), SCM research,
>        IRC, mailing list, other Internet, other off-line, other(*)

- other off-line: told by friend, must be used at job, ...

>      + the problem is with having not very long list (not too many
>        choices), but spanning all possibilities.

Hmmm.
Is this a limitation by the free web-based services or is this due to
survey usability issues?

>    09. When did you start using git? From which version?
>      - pre 1.0, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5
>      + might be important when checking "what did you find hardest" etc.
>      + perhaps we should ask in addition to this question, or in place
>        of this question (replacing it) what git version one uses; it
>        should be multiple choice, and allow 'master', 'next', 'pu',
>        'dirty (with own modifications)' versions in addition.

Hmm, the master/next/pu/dirty question will be a mystery to most git users
that have never cared about git source code.

> How you use Git

>    16. Which porcelains / interfaces / implementations do you use?
>        (zero or more: multiple choice)
>      - core-git, Cogito (deprecated), StGIT, Guilt, pg (deprecated),
>        Pyrite, Easy Git, IsiSetup, jgit, my own scripts, other

I wonder if this could be extended to get an idea how many people use
plumbing directly or, even better, add a question like:

	Which of the following git commands or extra git tools do you use regularly?

	[list of all plumbing, porcelain and tools like stgit, guilt, etc]

or "... have you never used?", or "...have you ever used?"...
Just to get an idea of what commands are often used by the users.

Perhaps it is even useful to extend that list by some behavior-changing
options, like:

	[ ] git add
	[ ] git add -i / -p
	...
	[ ] git am
	[ ] git am -i
	...
	[ ] git merge
	[ ] git merge with strategy
	...
	[ ] git rebase
	[ ] git rebase -i

...though this is also handled by question 27 (see below).

Yes, this will be a long list :-)
And don't forget the [ ] other :)

Of course this question could be split into
 - extra tools
 - guis (like question 17.)
 - helpers
 - porcelain
 - plumbings

Question 27. should be in this section, too:

>    27. Which of the following features do you use?
>        (zero or more: multiple choice)
>      - git-gui or other commit tool, gitk or other history viewer, patch
>        management interface (e.g. StGIT), bundle, eol conversion,
>        gitattributes, submodules, separate worktree, reflog, stash,
>        shallow clone, detaching HEAD, mergetool, interactive rebase,
>        add --interactive or other partial commit helper, commit
>        templates, bisect, other (not mentioned here)
>      + should probably be sorted in some resemblance of order
>      + are there any new features which should be listed here?

Hmm, I'd remove "git-gui or other commit tool, gitk or other history
viewer, patch management interface (e.g. StGIT)".
And depending on the question I just proposed, "interactive rebase",
"add --interactive [...]", "bisect" could be removed, too.

>    18. Which (main) git web interface do you use for your projects?
>        (zero or more: multiple choice)
>      - gitweb, cgit, wit (Ruby), git-php, viewgit (PHP), other
>      + should there be a question about web server (Apache, IIS, ...)
>        used to host git web interface?

No, why should we care about the web server? :)

>    22. How does Git compare to other SCM tools you have used?
>      - worse/equal (or comparable)/better
>    23. What would you most like to see improved about Git?
>        (features, bugs, plug-ins, documentation, ...)
>    24. If you want to see Git more widely used, what do you
>        think we could do to make this happen?
>      + Is this question necessary/useful?  Do we need wider adoption?

Hmmm,
"Do you miss features in git that you know from other SCMs?"
"If yes, what features are these?"

>    26. How do you compare current version with version from year ago?
>      - current version is: better/worse/no changes

Since this is single-choice, a "don't know"/"cannot say" option should 
be added.

>    28. If you use some important Git features not mentioned above,
>        what are it?

"what are it" sounds somehow funny. Is it correct?
"what are them?" or "what are those?"
Or
"If you use some important Git feature not mentioned above, what is it?"

> Documentation
> 
>    29. Do you use the Git wiki?
>     -  yes/no
>    30. Do you find Git wiki useful?
>     -  yes/no/somewhat
>    31. Do you contribute to Git wiki?
>     -  yes/no/only corrections or spam removal
>    32. Do you find Git's on-line help (homepage, documentation) useful?
>     -  yes/no/somewhat
>    33. Do you find help distributed with Git useful
>        (manpages, manual, tutorial, HOWTO, release notes)?
>     -  yes/no/somewhat
>    34. What could be improved on the Git homepage?
>        (free form)
>    35. What could be improved in Git documentation?
>        (free form)

36. Do you think there is too little documentation on the web?
37. Do you think there is too much documentation on the web?

No ;-) Perhaps:

36. Do you think it is easy to find out how to do a specific task with
    git?

> Open forum
> 
>    46. What other comments or suggestions do you have that are not
>        covered by the questions above?
>        (free form)

About the survey

47. Do you have any comments about the survey?
48. Should such a survey be repeated next year?
(Or: Would you take part in such a survey next year again?)
    [ ] Yes
    [ ] No, but 2010 again.
    [ ] No, never again.

Just a poor idea to get "feedback" if people like to take part in this
survey or not.


Regards,
  Stephan

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 14:09 UTC (permalink / raw)
  To: Sergey Vlasov; +Cc: git
In-Reply-To: <20080723174140.b749191a.vsu@altlinux.ru>

* Sergey Vlasov <vsu@altlinux.ru> wrote:

> On Wed, 23 Jul 2008 15:05:18 +0200 Ingo Molnar wrote:
> 
> > Anyone can simulate it by switching to the linus/master branch of the
> > current Linux kernel tree, and doing:
> >
> >    time for ((i=0; i<140; i++)); do git-merge v2.6.26; done
> >
> >    real    1m26.397s
> >    user    1m10.048s
> >    sys     0m13.944s
> 
> Timing results here (E6750 @ 2.66GHz):
> 41.61s user 3.71s system 99% cpu 45.530 total
> 
> However, testing whether there is something new to merge could be
> performed significantly faster:
> 
> $ time sh -c 'for ((i=0; i<140; i++)); do [ -n "$(git rev-list --max-count=1 v2.6.26 ^HEAD)" ]; done'
> sh -c   5.49s user 0.26s system 99% cpu 5.786 total
> 
> The same loop with "git merge-base v2.6.26 HEAD" takes about 40 
> seconds here - apparently finding the merge base is the expensive 
> part, and it makes sense to avoid it if you expect that most of your 
> branches do not contain anything new to merge.

using git-fastmerge i get 2.4 seconds:

  $ time for ((i=0; i<140; i++)); do git-fastmerge v2.6.26; done
  [...]
  real    0m2.388s
  user    0m1.211s
  sys     0m1.131s

for something that 'progresses' in a forward manner (which merges do 
fundamentally) nothing beats the performance of a timestamped cache i 
think.

at least for my usecase.

Even assuming that the filesystem is sane, is my merge-cache 
implementation semantically equivalent to a git-merge? One detail is 
that i suspect it is not equivalent in the git-merge --no-ff case. (but 
that is a not too interesting non-default case anyway)

	Ingo


* Re: q: faster way to integrate/merge lots of topic branches?
From: Santi Béjar @ 2008-07-23 14:06 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723130518.GA17462@elte.hu>

On Wed, Jul 23, 2008 at 15:05, Ingo Molnar <mingo@elte.hu> wrote:
>
> I've got the following, possibly stupid question: is there a way to
> merge a healthy number of topic branches into the master branch in a
> quicker way, when most of the branches are already merged up?

You could filter upfront the branches that are already merged up with:

git show-branch --independent <commits>

but it has a limit of 25 refs.

Santi


* Re: q: faster way to integrate/merge lots of topic branches?
From: Björn Steinbrink @ 2008-07-23 14:06 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723130518.GA17462@elte.hu>

On 2008.07.23 15:05:18 +0200, Ingo Molnar wrote:
> 
> I've got the following, possibly stupid question: is there a way to 
> merge a healthy number of topic branches into the master branch in a 
> quicker way, when most of the branches are already merged up?
> 
> Right now i've got something like this scripted up:
> 
>   for B in $(git-branch | cut -c3- ); do git-merge $B; done 

Not yet in any release (AFAICT), but with git.git master, you could use:

for B in $(git branch --no-merged); do git-merge $B; done


Or with earlier versions, this should work, but it's a lot slower:

for B in $(git branch | cut -c3- ); do
	[[ -n "$(git rev-list -1 HEAD..$B)" ]] && git merge $B;
done
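
The loop above can be sketched as a pair of shell functions. This is only an illustration: has_new_commits and merge_unmerged_branches are hypothetical names, and the use of "git for-each-ref" to list branches and of "git merge --no-edit" (so that no editor is started) are substitutions that assume a reasonably recent git, not something taken from the mails in this thread.

```shell
#!/bin/bash
# has_new_commits (hypothetical helper): succeeds when <branch> contains
# at least one commit that is not yet reachable from HEAD.
has_new_commits () {
  [ -n "$(git rev-list --max-count=1 "HEAD..$1")" ]
}

# merge_unmerged_branches (hypothetical helper): merge every local branch
# that still has something new, and stop at the first failing merge.
merge_unmerged_branches () {
  local b
  # Ref names cannot contain whitespace or glob characters, so plain
  # word splitting of the for-each-ref output is safe here.
  for b in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
    if has_new_commits "$b"; then
      git merge --no-edit "$b" || return 1
    fi
  done
}
```

Branches that are already merged cost only one rev-list call each, and the current branch is skipped automatically because it never has commits that HEAD lacks.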

Björn


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 14:04 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git
In-Reply-To: <20080723135621.GJ22606@neumann>


* SZEDER Gábor <szeder@ira.uka.de> wrote:

> Hi,
> 
> On Wed, Jul 23, 2008 at 03:05:18PM +0200, Ingo Molnar wrote:
> > I've got the following, possibly stupid question: is there a way to 
> > merge a healthy number of topic branches into the master branch in a 
> > quicker way, when most of the branches are already merged up?
> > 
> > Right now i've got something like this scripted up:
> > 
> >   for B in $(git-branch | cut -c3- ); do git-merge $B; done 
> you could use 'git branch --no-merged' to list only those branches
> that have not been merged into your current HEAD.

hm, it's very slow:

  $ time git branch --no-merged
  [...]

  real    0m9.177s
  user    0m9.027s
  sys     0m0.129s

when running it on tip/master:

  http://people.redhat.com/mingo/tip.git/README

	Ingo


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 14:02 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git
In-Reply-To: <488734D9.9070703@op5.se>


* Andreas Ericsson <ae@op5.se> wrote:

> Ingo Molnar wrote:
>> I've got the following, possibly stupid question: is there a way to  
>> merge a healthy number of topic branches into the master branch in a  
>> quicker way, when most of the branches are already merged up?
>>
>> Right now i've got something like this scripted up:
>>
>>   for B in $(git-branch | cut -c3- ); do git-merge $B; done 
>>
>> It takes a lot of time to run on even a 3.45GHz box:
>>
>>   real    0m53.228s
>>   user    0m41.134s
>>   sys     0m11.405s
>>
>> I just had a workflow incident where i forgot that this script was  
>> running in one window (53 seconds are a _long_ time to start doing some 
>> other stuff :-), i switched branches and the script merrily chugged 
>> away merging branches into a topic branch i did not intend.
>>
>> It iterates over 140 branches - but all of them are already merged up.
>>
>
> With the builtin merge (which is in next), this should be doable with 
> an octopus merge, which will eliminate the branches that are already 
> fully merged, resulting in a less-than-140-way merge (thank gods...). 
> It also doesn't have the 24-way cap that the scripted version suffers 
> from.
>
> If it does a good job at your rather extreme use-case, I'd say it's 
> good enough for 'master' pretty soon :-)

hm, while i do love octopus merges [*] for release and bisection-quality 
purposes, for throw-away (delta-)integration runs it's more manageable 
to do a predictable series of one-on-one merges.

It results in better git-rerere behavior, has easier (to the human) 
conflict resolutions and the octopus merge also falls apart quite easily 
when it runs into conflicts. Furthermore, i've often seen octopus merges 
fail while a series of 1:1 merges succeeded.

What i could try is to do a speculative octopus merge, in the hope of it 
just going fine - and then fall back to the serial merge if it fails?

The git-fastmerge approach is probably still faster though - and 
certainly simpler from a workflow POV.

	Ingo

[*] take a look at these in the Linux kernel -git repo:

      gitk 3c1ca43fafea41e38cb2d0c1684119af4c1de547
      gitk 6924d1ab8b7bbe5ab416713f5701b3316b2df85b


* Re: q: faster way to integrate/merge lots of topic branches?
From: Sergey Vlasov @ 2008-07-23 13:41 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723130518.GA17462@elte.hu>

On Wed, 23 Jul 2008 15:05:18 +0200 Ingo Molnar wrote:

> Anyone can simulate it by switching to the linus/master branch of the
> current Linux kernel tree, and doing:
>
>    time for ((i=0; i<140; i++)); do git-merge v2.6.26; done
>
>    real    1m26.397s
>    user    1m10.048s
>    sys     0m13.944s

Timing results here (E6750 @ 2.66GHz):
41.61s user 3.71s system 99% cpu 45.530 total

However, testing whether there is something new to merge could be
performed significantly faster:

$ time sh -c 'for ((i=0; i<140; i++)); do [ -n "$(git rev-list --max-count=1 v2.6.26 ^HEAD)" ]; done'
sh -c   5.49s user 0.26s system 99% cpu 5.786 total

The same loop with "git merge-base v2.6.26 HEAD" takes about 40
seconds here - apparently finding the merge base is the expensive
part, and it makes sense to avoid it if you expect that most of your
branches do not contain anything new to merge.
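
On current git versions the same "is there anything left to merge?" question can also be phrased as an ancestry test. A minimal sketch, assuming a git that provides "git merge-base --is-ancestor"; already_merged is a hypothetical helper name, and no timing claim is made relative to the numbers above:

```shell
#!/bin/bash
# already_merged (hypothetical helper): succeeds when <commit> is an
# ancestor of HEAD, i.e. "git rev-list <commit> ^HEAD" would print nothing.
already_merged () {
  git merge-base --is-ancestor "$1" HEAD
}

# Possible use: skip the merge when nothing new would come in, e.g.
#   already_merged v2.6.26 || git merge v2.6.26
```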


* Re: q: faster way to integrate/merge lots of topic branches?
From: SZEDER Gábor @ 2008-07-23 13:56 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723130518.GA17462@elte.hu>

Hi,

On Wed, Jul 23, 2008 at 03:05:18PM +0200, Ingo Molnar wrote:
> I've got the following, possibly stupid question: is there a way to 
> merge a healthy number of topic branches into the master branch in a 
> quicker way, when most of the branches are already merged up?
> 
> Right now i've got something like this scripted up:
> 
>   for B in $(git-branch | cut -c3- ); do git-merge $B; done 
you could use 'git branch --no-merged' to list only those branches
that have not been merged into your current HEAD.


Regards,
Gábor


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 13:49 UTC (permalink / raw)
  To: git
In-Reply-To: <20080723131736.GA9100@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> So i guess it's better to just create a separate 
> .git/refs/merge-cache/ hierarchy with timestamps of last merged 
> branches and their head sha1 ... but maybe i'm banging on open doors?

here's the git-fastmerge script i've whipped up in 10 minutes. It does 
the trick nicely for me:

first run:

  real    0m53.228s
  user    0m41.134s
  sys     0m11.405s

second run:

  real    0m2.751s
  user    0m1.280s
  sys     0m1.491s

or a 20x speedup. Yummie! :-)

It properly notices when i commit to a topic branch, and it maintains a 
proper matrix of <A> <- <B> merge timestamps. It even embeds the sha1's 
in the timestamp path so it should be quite complete. It should work 
fine across resets, re-merges, etc. too i think. It should work well 
with renamed branches as well i think. (although i don't do that all that 
often)

In fact even if i delete the whole .git/mergecache/ hierarchy and run a 
'cold' merge, it's much faster:

  real    0m32.129s
  user    0m24.456s
  sys     0m7.603s

Because many of the branches have the same sha1 so it's already 
half-optimized even on the first run.

Much of the remaining 2.7 seconds overhead comes from the git-log runs 
to retrieve the sha1s, so i guess it could all be made even faster.

Now this scheme assumes that there's a sane underlying filesystem that 
can take these long pathnames and which has good timestamps (which i 
have, so it's not a worry for me).

Hm?

	Ingo

-----------------{ git-fastmerge }--------------------->
#!/bin/bash

usage () {
  echo 'usage: git-fastmerge <refspec>..'
  exit -1
}

[ $# = 0 ] && usage

BRANCH=$1

MERGECACHE=.git/mergecache

[ ! -d $MERGECACHE ] && { mkdir $MERGECACHE || usage; }

HEAD_SHA1=$(git-log -1 --pretty=format:"%H")
BRANCH_SHA1=$(git-log -1 --pretty=format:"%H" $BRANCH)

CACHE=$MERGECACHE/$HEAD_SHA1/$BRANCH_SHA1

[ -f "$CACHE" -a "$CACHE" -nt .git/refs/heads/$BRANCH_SHA1 ] && {
  echo "merge-cache hit on HEAD <= $1"
  exit 0
}

git-merge $1 && {
  mkdir -p $(dirname $CACHE)
  touch $CACHE
}
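
A variant of the same idea, written as a shell function, might look like the sketch below. It is not the script above: fastmerge is a hypothetical name, the commit ids are resolved with "git rev-parse" rather than git-log, the marker file is checked for existence only (both sha1s are already part of its path), and the marker is recorded under the HEAD that exists after the merge, so that a later reset to the old HEAD does not produce a stale hit. "--no-edit" assumes a git recent enough to have it.

```shell
#!/bin/bash
# fastmerge (hypothetical helper): merge <branch> into HEAD, but skip the
# merge when this exact (HEAD, branch) pair is known to need no merging.
fastmerge () {
  local branch=$1 gitdir head_sha1 branch_sha1 cache
  [ -n "$branch" ] || { echo 'usage: fastmerge <branch>' >&2; return 1; }

  gitdir=$(git rev-parse --git-dir) || return 1
  head_sha1=$(git rev-parse --verify HEAD) || return 1
  branch_sha1=$(git rev-parse --verify "$branch^{commit}") || return 1

  # Commit ids never change, so the marker's existence is enough;
  # no timestamp comparison is made.
  cache=$gitdir/mergecache/$head_sha1/$branch_sha1
  if [ -f "$cache" ]; then
    echo "merge-cache hit on HEAD <= $branch"
    return 0
  fi

  git merge --no-edit "$branch" || return 1

  # Record the pair under the post-merge HEAD, which now contains
  # everything reachable from the branch.
  head_sha1=$(git rev-parse --verify HEAD) || return 1
  cache=$gitdir/mergecache/$head_sha1/$branch_sha1
  mkdir -p "$(dirname "$cache")" && touch "$cache"
}
```

As an aside on the timestamp test in the script above: in bash, "-nt" is also true when the first file exists and the second does not, so comparing against a path that does not exist still yields a cache hit once the marker file is present.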


* Re: [PATCH] index-pack: never prune base_cache.
From: Johannes Schindelin @ 2008-07-23 13:46 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Björn Steinbrink, spearce, Git ML, Junio C Hamano
In-Reply-To: <20080723132031.GC20614@artemis.madism.org>

Hi,

On Wed, 23 Jul 2008, Pierre Habouzit wrote:

> On Wed, Jul 23, 2008 at 01:09:40PM +0000, Johannes Schindelin wrote:
> > On Wed, 23 Jul 2008, Björn Steinbrink wrote:
> > > The patch itself should be fine.
> > 
> > No, since it opens the whole issue of memory explosion again, the same 
> > issue Shawn's original patch tried to fix.
> 
>   No it won't. Indeed the issue is with fix_unresolved_deltas that
> sometimes put at the root of the chain (in base_cache) something that
> comes from our store, not the pack we are writing. Then starts a delta
> chain resolution.

If it comes from our store, we should have _no_ problem reconstructing the 
object.

>   It won't explode in memory at all, we just keep the first data of a
> delta chain in memory, that's all. It indeed consumes more memory, but
> we talk about *one* single object per delta chain because we're too lazy
> to memorize where it comes from. It's probably not much of an explosion.

"Probably".

>   We also waste that object even when it's from our own pack. Well, I'd
> say "too bad".

And I say it's dirty, and since the pack code traditionally is one of the 
cleanest parts of Git, coming from Nico, let's not change that, okay?

Ciao,
Dscho


* Re: [PATCH] index-pack: never prune base_cache.
From: Björn Steinbrink @ 2008-07-23 13:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Pierre Habouzit, spearce, Git ML, Junio C Hamano
In-Reply-To: <alpine.DEB.1.00.0807231407040.8986@racer>

On 2008.07.23 14:09:40 +0100, Johannes Schindelin wrote:
> Hi,
> 
> On Wed, 23 Jul 2008, Björn Steinbrink wrote:
> 
> > On 2008.07.23 14:11:18 +0200, Pierre Habouzit wrote:
> > > It may belong to something (stdin) that is consumed.
> > 
> > Probably thanks to me, babbling about stdin without having a clue what 
> > I'm talking about, that rationale is wrong.
> > 
> > We may not prune base_cache since that object might come from a
> > different pack than the one that we are processing. In such a case, we
> > would try to restore the data for that object from the pack we're
> > processing and fail miserably.
> 
> Then the proper fix would be to load the object from that pack again.

Actually, my analysis was total bullshit. Right after reading the object
from the foreign pack, we also call append_obj_to_pack, so we are
actually able to reread that object just fine. The real issue seems to
be that we just forget to initialize some fields.

This patch fixes the issue for me, but I guess it's not quite the right
way to do it, pure guesswork.

Björn

---

diff --git a/index-pack.c b/index-pack.c
index ac20a46..33ba8ef 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -699,6 +699,9 @@ static struct object_entry *append_obj_to_pack(
 	write_or_die(output_fd, header, n);
 	obj[0].idx.crc32 = crc32(0, Z_NULL, 0);
 	obj[0].idx.crc32 = crc32(obj[0].idx.crc32, header, n);
+	obj[0].hdr_size = n;
+	obj[0].type = type;
+	obj[0].size = size;
 	obj[1].idx.offset = obj[0].idx.offset + n;
 	obj[1].idx.offset += write_compressed(output_fd, buf, size, &obj[0].idx.crc32);
 	hashcpy(obj->idx.sha1, sha1);


* Re: q: faster way to integrate/merge lots of topic branches?
From: Andreas Ericsson @ 2008-07-23 13:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git
In-Reply-To: <20080723130518.GA17462@elte.hu>

Ingo Molnar wrote:
> I've got the following, possibly stupid question: is there a way to 
> merge a healthy number of topic branches into the master branch in a 
> quicker way, when most of the branches are already merged up?
> 
> Right now i've got something like this scripted up:
> 
>   for B in $(git-branch | cut -c3- ); do git-merge $B; done 
> 
> It takes a lot of time to run on even a 3.45GHz box:
> 
>   real    0m53.228s
>   user    0m41.134s
>   sys     0m11.405s
> 
> I just had a workflow incident where i forgot that this script was 
> running in one window (53 seconds are a _long_ time to start doing some 
> other stuff :-), i switched branches and the script merrily chugged away 
> merging branches into a topic branch i did not intend.
> 
> It iterates over 140 branches - but all of them are already merged up.
> 

With the builtin merge (which is in next), this should be doable with
an octopus merge, which will eliminate the branches that are already
fully merged, resulting in a less-than-140-way merge (thank gods...).
It also doesn't have the 24-way cap that the scripted version suffers
from.

If it does a good job at your rather extreme use-case, I'd say it's
good enough for 'master' pretty soon :-)

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: [PATCH] index-pack: never prune base_cache.
From: Pierre Habouzit @ 2008-07-23 13:20 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Björn Steinbrink, spearce, Git ML, Junio C Hamano
In-Reply-To: <alpine.DEB.1.00.0807231407040.8986@racer>


On Wed, Jul 23, 2008 at 01:09:40PM +0000, Johannes Schindelin wrote:
> On Wed, 23 Jul 2008, Björn Steinbrink wrote:
> > The patch itself should be fine.
> 
> No, since it opens the whole issue of memory explosion again, the same 
> issue Shawn's original patch tried to fix.

  No it won't. The issue is with fix_unresolved_deltas, which sometimes
puts at the root of the chain (in base_cache) something that comes from
our store, not the pack we are writing. A delta chain resolution then
starts from it.

  It won't explode in memory at all: we just keep the first data of a
delta chain in memory, that's all. It does consume more memory, but we
are talking about *one* single object per delta chain, kept because we're
too lazy to memorize where it comes from. It's probably not much of an
explosion.

  We also waste that object even when it's from our own pack. Well, I'd
say "too bad".

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: [RFC] Git User's Survey 2008
From: Johannes Schindelin @ 2008-07-23 13:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <200807231508.42334.jnareb@gmail.com>

Hi,

On Wed, 23 Jul 2008, Jakub Narebski wrote:

> On Wed, 23 Jul 2008, Johannes Schindelin wrote:
> > On Wed, 23 Jul 2008, Jakub Narebski wrote:
> > 
> > Some people prefer to stay anonymous, so I think email is out.
> > 
> > >    04. Which programming languages are you proficient with?
> > >        (The choices include programming languages used by git)
> > >        (zero or more: multiple choice)
> > >      - C, shell, Perl, Python, Tcl/Tk
> > >      + (should we include other languages, like C++, Java, PHP,
> > >         Ruby,...?)
> > 
> > Yes, I think this should be a long list.
> 
> I'd rather not have a "laundry list" of languages.  I have put C++
> because QGit uses it, Java because of egit/jgit, PHP for web
> interfaces, Ruby because of GitHub and because of Ruby community
> choosing Git.  I should perhaps add Emacs Lisp, HTML+CSS and
> JavaScript here.  What other languages should be considered?

C# at least, since we had one (pretty unsuccessful) attempt at 
reimplementing Git in it.


> > >    07. What helped you most in learning to use it?
> > >        (free form question)
> > 
> > Is it possible to have multiple choice, with "other" (free-form)?
>
> But perhaps multiple choice with free-form "other" choice would be the 
> best?

Uhm, yes, you are right.  Why not have multiple choice, with "other" 
(free-form)?

> > >    42. Do you find traffic levels on Git mailing list OK?
> > >     -  yes/no? (optional)
> > 
> > /too low?  *ducksandrunsforcover*
> 
> ???

Well, was worth a try ;-)

Ciao,
Dscho


* Re: q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 13:17 UTC (permalink / raw)
  To: git
In-Reply-To: <20080723130518.GA17462@elte.hu>


* Ingo Molnar <mingo@elte.hu> wrote:

> I have thought of using the last CommitDate of the topic branch and 
> comparing it with the last CommitDate of the master branch [and i can 
> trust those values] - that would be a lot faster - but maybe i'm 
> missing something trivial that makes that approach unworkable. It 
> would also be nice to have a builtin shortcut for that instead of 
> having to go via "git-log --pretty=fuller" to dump the CommitDate 
> field.

hm, this method would be fragile if done purely within my integration 
script, as the timestamp of the head would have to be updated 
atomically, while always merging all the topic branches in one such 
transaction. (so that the timestamps do not get out of sync and a topic 
branch is not skipped by accident)

So i guess it's better to just create a separate .git/refs/merge-cache/ 
hierarchy with timestamps of last merged branches and their head sha1 
... but maybe i'm banging on open doors?
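A minimal sketch of such a cache, assuming one ref per (target, topic) pair under `refs/merge-cache/` that records the topic head last merged. The function name `cached_merge` and the ref layout are hypothetical, and the cache would go stale if the target branch were ever rewound:

```shell
# Hypothetical helper: merge branch $2 into the checked-out branch $1,
# unless the cache ref already records $2's current head as merged.
cached_merge() {
    target=$1 topic=$2
    cache_ref="refs/merge-cache/$target/$topic"
    # Resolve the topic head; fail if the branch does not exist.
    head=$(git rev-parse --verify -q "refs/heads/$topic") || return 1
    # A missing cache ref simply means "never merged before".
    cached=$(git rev-parse --verify -q "$cache_ref") || cached=
    if [ "$head" = "$cached" ]; then
        echo "skip $topic"
        return 0
    fi
    git merge "$topic" >/dev/null || return 1
    # Record the merged head only after the merge succeeded.
    git update-ref "$cache_ref" "$head"
    echo "merged $topic"
}
```

Comparing sha1 values instead of timestamps avoids the atomicity problem described above: each topic gets its own cache entry, written only after its merge went through.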

	Ingo


* Re: [PATCH] index-pack: never prune base_cache.
From: Johannes Schindelin @ 2008-07-23 13:09 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: Pierre Habouzit, spearce, Git ML, Junio C Hamano
In-Reply-To: <20080723125226.GA11679@atjola.homenet>


Hi,

On Wed, 23 Jul 2008, Björn Steinbrink wrote:

> On 2008.07.23 14:11:18 +0200, Pierre Habouzit wrote:
> > It may belong to something (stdin) that is consumed.
> 
> Probably thanks to me, babbling about stdin without having a clue what 
> I'm talking about, that rationale is wrong.
> 
> We may not prune base_cache since that object might come from a
> different pack than the one that we are processing. In such a case, we
> would try to restore the data for that object from the pack we're
> processing and fail miserably.

Then the proper fix would be to load the object from that pack again.

> The patch itself should be fine.

No, since it opens the whole issue of memory explosion again, the same 
issue Shawn's original patch tried to fix.

Ciao,
Dscho

P.S.: Could you please, please, please cull the part you are not 
responding to?  This mailing list is read by more than 50 people.  If you 
sum up the time it takes them to realize that that quoted part was 
irrelevant, I am sure you will end up with a larger number of minutes than 
it would take you to just delete it.

Thanks.


* Re: [RFC] Git User's Survey 2008
From: Jakub Narebski @ 2008-07-23 13:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0807231128090.2830@eeepc-johanness>

On Wed, 23 Jul 2008, Johannes Schindelin wrote:
> On Wed, 23 Jul 2008, Jakub Narebski wrote:
> 
> Some people prefer to stay anonymous, so I think email is out.
> 
> >    04. Which programming languages are you proficient with?
> >        (The choices include programming languages used by git)
> >        (zero or more: multiple choice)
> >      - C, shell, Perl, Python, Tcl/Tk
> >      + (should we include other languages, like C++, Java, PHP,
> >         Ruby,...?)
> 
> Yes, I think this should be a long list.

I'd rather not have a "laundry list" of languages.  I have put C++
because QGit uses it, Java because of egit/jgit, PHP for web
interfaces, Ruby because of GitHub and because of Ruby community
choosing Git.  I should perhaps add Emacs Lisp, HTML+CSS and
JavaScript here.  What other languages should be considered?
 
> >    07. What helped you most in learning to use it?
> >        (free form question)
> 
> Is it possible to have multiple choice, with "other" (free-form)?  Then 
> I'd suggest:
> 
> 	Colleague/Instructor, User Manual, Manpages, Tutorials, Tutorials 
> 	(elsewhere; not in git.git), Mailing list, IRC, Git Wiki, Other.

By "Tutorials (elsewhere; not in git.git)", do you mean the various
"git guide" pages, like "Git for Computer Scientists", "Git Magic",
etc.?

I'm not sure about having multiple choice vs. free-form question here.
Multiple choice is easier to analyze, especially if one wants a
histogram of replies... but free form is richer.  But perhaps
multiple choice with free-form "other" choice would be the best?

Besides, the proposed choices limit the person filling in the survey to
a single understanding of "what helped you in learning to use Git", which
can also be understood as asking for a list of features that help with
learning Git, not only a list of documentation and such.

> >    08. What did you find hardest in learning Git?
> >        What did you find harderst in using Git?
> 
> s/harderst/hardest.
> 
> >        (free form question)
> 
> Again, I'd suggest a multiple choice + Other:
> 
> 	The amount of commands, the amount of options, the index (AKA 
> 	staging), branching, user interface, bugs, Other.

Here it can be hard to come up with a good list of choices.  For example,
among the responses in the 2007 survey there were 'inconsistent commands',
'obtuse command messages', 'insufficient/hard to use documentation',
and many more.

I'm not sure that the trouble of coming up with an extensive but not too
large list of options for this question is worth it; I think that we need
only the list of responses, not the number of responses (perhaps
mentioning which ones occur [much] more frequently).

> > Other SCMs (shortened compared with 2007 survey)
> > 
> >    10. What other SCM did or do you use?
> >        (zero or more: multiple choice)
> >      - CVS, Subversion, GNU Arch or arch clone (ArX, tla, ...),
> >        Bazaar-NG, Darcs, Mercurial, Monotone, SVK, AccuRev, Perforce,
> >        BitKeeper, ClearCase, MS Visual Source Safe, MS Visual Studio
> >        Team System, custom, other(*)
> 
> PVCS seems to be pretty popular, too.

O.K., I'll add it.  I think I'd better add RCS too.

> >    11. Why did you choose Git? (if you use Git)
> >        What do you like about using Git?
> >        (free form, not to be tabulated)
> 
> Again, to avoid hassles with free-form:
> 
> 	Mandatory: work, mandatory: open source project I am participating 
> 	in, speed, scalability, It's What Linus Uses, Other.

Free form has some hassles.  Because here histogram of responses might
be interesting, perhaps it would be good to use multiple choice here.

I would add "features" and/or "unique features" to the list, and also
perhaps "being popular/hype".

> >    12. Why did you choose other SCMs? (if you use other SCMs)
> >        What do you like about using other SCMs?
> >        Note: please write name of SCMs you are talking about.
> >        (free form, not to be tabulated).
> 
> Again:
> 
> 	ease of use, simplicity, existing project uses it, I Do Not Like 
> 	Linus, Other

Again: free form has some hassles, but so does coming up with a good
choice of fixed answers for a multiple choice question.  I'll add
"easy to install on MS Windows" (or something like that) if we decide
to make this question multiple choice.

> >    15. What operating system do you use Git on?
> >        (one or more: multiple choice, as one can use more than one OS)
> >      - Linux, *BSD (FreeBSD, OpenBSD, etc.), MS Windows/Cygwin,
> >        MS Windows/msysGit, MacOS X, other UNIX, other
> 
> You should include "Dunno", which gets automatically mapped to "MS 
> Windows/msysGit" ;-)
> 
> >    19. How do you publish/propagate your changes?
> >        (zero or more: multiple choice)
> >      - push, pull request, format-patch + email, bundle, other
> 
> git svn
> 
> You might laugh, but it is a sad fact that some guy promotes "Using Git 
> with Google Code" by using git-svn to drive their crappy Subversion.

O.K.  I'll add "git-svn (or other bridge to a foreign SCM)".

> >    22. How does Git compare to other SCM tools you have used?
> >      - worse/equal (or comparable)/better
> >    23. What would you most like to see improved about Git?
> >        (features, bugs, plug-ins, documentation, ...)
> 
> Maybe here should be another question "What are the most useful features 
> of Git?" but maybe that is covered by earlier questions.

I think it is.  I'd rather try to reduce number of questions...

> >    24. If you want to see Git more widely used, what do you
> >        think we could do to make this happen?
> >      + Is this question necessary/useful?  Do we need wider adoption?
> 
> I agree with Junio: this is not so interesting for us; we are no company, 
> and we have no sales department who could wank off on these answers.

I'll remove it, then.

> >    27. Which of the following features do you use?
> >        (zero or more: multiple choice)
> >      - git-gui or other commit tool, gitk or other history viewer, patch
> >        management interface (e.g. StGIT), bundle, eol conversion,
> 
> For our Windows friends, we should add " (crlf)" to the last item.

Right.  Thanks.

> >    42. Do you find traffic levels on Git mailing list OK?
> >     -  yes/no? (optional)
> 
> /too low?  *ducksandrunsforcover*

???

> >    44. If yes, do you find IRC channel useful?
> >     -  yes/no (optional)
> 
> /somewhat.  Even if I would be the only one choosing that option.

I'm sorry about that: I forgot that this and all similar questions
had a triple choice (yes/no/somewhat) in the final version of the 2007
survey.  I'll correct it.

> >    45. Did you have problems getting GIT help on mailing list or
> >        on IRC channel? What were it? What could be improved?
> >        (free form)
> 
> Yeah, I know who will answer to that, and what... "yaddayadda very 
> unfriendly yaddayadda especially that Johannes guy yaddayadda" (you know 
> who you are)... *lol*

:-)

-- 
Jakub Narebski
Poland


* q: faster way to integrate/merge lots of topic branches?
From: Ingo Molnar @ 2008-07-23 13:05 UTC (permalink / raw)
  To: git


I've got the following, possibly stupid question: is there a way to 
merge a healthy number of topic branches into the master branch in a 
quicker way, when most of the branches are already merged up?

Right now i've got something like this scripted up:

  for B in $(git-branch | cut -c3- ); do git-merge $B; done 

It takes a lot of time to run on even a 3.45GHz box:

  real    0m53.228s
  user    0m41.134s
  sys     0m11.405s

I just had a workflow incident where i forgot that this script was 
running in one window (53 seconds are a _long_ time to start doing some 
other stuff :-), i switched branches and the script merrily chugged away 
merging branches into a topic branch i did not intend.

It iterates over 140 branches - but all of them are already merged up.

Anyone can simulate it by switching to the linus/master branch of the 
current Linux kernel tree, and doing:

   time for ((i=0; i<140; i++)); do git-merge v2.6.26; done

   real    1m26.397s
   user    1m10.048s
   sys     0m13.944s

One could argue that determining whether it's all merged up already is a 
complex task, but even this seemingly trivial merge of HEAD into 
HEAD is quite slow:

   time for ((i=0; i<140; i++)); do git-merge HEAD; done

   real    0m17.871s
   user    0m8.977s
   sys     0m8.396s

I'm wondering whether there are tricks to speed this up. The real script 
i'm using is much longer and obscured with boring details like errors, 
conflicts, etc. - but the above is the gist of it. (and that is what 
makes it slow primarily)
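One way the loop could be made both cheaper and safer, as a sketch rather than a drop-in replacement for the real script: iterate only over branches that `git branch --no-merged` reports, and re-check before each merge that HEAD is still on the intended branch. The function name `integrate_serially` is hypothetical, and error and conflict handling is reduced to "stop at the first failure":

```shell
# Hypothetical helper: merge every not-yet-merged local branch into the
# branch named by $1, one at a time, refusing to continue if HEAD has been
# switched to another branch while the loop is running.
integrate_serially() {
    target=$1
    for b in $(git branch --no-merged "$target" | cut -c3-); do
        # git symbolic-ref prints e.g. refs/heads/master for the
        # checked-out branch, and fails on a detached HEAD.
        current=$(git symbolic-ref -q HEAD) || current=
        if [ "$current" != "refs/heads/$target" ]; then
            echo "HEAD is no longer on $target; stopping" >&2
            return 1
        fi
        git merge "$b" || return 1
    done
}
```

With all 140 branches merged up, the loop body never runs, and a branch switch in another window makes the script stop instead of merging into the wrong branch.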

Using a speculative Octopus might be one approach, but that runs into 
the octopus merge limitation at 24 branches, and it is quite slow as 
well. (and is not equivalent to the serial merge of 140 branches)

I have thought of using the last CommitDate of the topic branch and 
comparing it with the last CommitDate of the master branch [and i can 
trust those values] - that would be a lot faster - but maybe i'm missing 
something trivial that makes that approach unworkable. It would also be 
nice to have a builtin shortcut for that instead of having to go via 
"git-log --pretty=fuller" to dump the CommitDate field.

builtin-integrate.c perhaps? ;-)

	Ingo


* Re: git-svn does not seems to work with crlf convertion enabled.
From: Johannes Schindelin @ 2008-07-23 12:57 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: git
In-Reply-To: <200807231852.10206.litvinov2004@gmail.com>

Hi,

On Wed, 23 Jul 2008, Alexander Litvinov wrote:

> > On Wed, 23 Jul 2008, Alexander Litvinov wrote:
> > > In short: I can't clone svn repo into git when crlf conversion is 
> > > activated.
> >
> > This is a known issue, but since nobody with that itch seems to care 
> > enough to fix it, I doubt it will ever be fixed.
> 
> That is bad news for me. Anyway, I will spend some time during the 
> holidays digging into this bug.

Note that you will have to do your digging using msysGit (i.e. the 
developer's pack, not the installer for plain Git), since git-svn will be 
removed from the next official "Windows Git" release, due to lack of 
fixers.

Ciao,
Dscho


* Re: [PATCH] index-pack: never prune base_cache.
From: Björn Steinbrink @ 2008-07-23 12:52 UTC (permalink / raw)
  To: Pierre Habouzit, Johannes Schindelin, spearce, Git ML,
	Junio C Hamano
In-Reply-To: <20080723121118.GA20614@artemis.madism.org>

On 2008.07.23 14:11:18 +0200, Pierre Habouzit wrote:
> It may belong to something (stdin) that is consumed.

Probably thanks to me, babbling about stdin without having a clue what
I'm talking about, that rationale is wrong.

We may not prune base_cache since that object might come from a
different pack than the one that we are processing. In such a case, we
would try to restore the data for that object from the pack we're
processing and fail miserably.


The patch itself should be fine.

> 
> Signed-off-by: Pierre Habouzit <madcoder@debian.org>
> ---
> 
>     On Wed, Jul 23, 2008 at 12:00:45PM +0000, Björn Steinbrink wrote:
>     > On 2008.07.23 12:37:00 +0100, Johannes Schindelin wrote:
>     > > Hi,
>     > > 
>     > > Well, I cannot.  However, I get some pread issue on i686.  To be nice to 
>     > > kernel.org, I downloaded the pack in question:
>     > > 
>     > > 	http://pacific.mpi-cbg.de/git/thin-pack.pack
>     > > 
>     > > You should be able to reproduce the behavior by piping this into
>     > > 
>     > > git-index-pack --stdin -v --fix-thin --keep=fetch-pack --pack_header=2,263
>     > 
>     > OK, that gave me a seemingly sane backtrace. What seems to happen (AFA
>     > my limited knowledge tells me):
>     > 
>     > In fix_unresolved_deltas, we read base_obj from an existing pack, other
>     > than the one we're reading. We then link that object to the base cache. 
>     > 
>     > Then in resolve_delta, we create the "result" base_data object and link
>     > that one, too. Now this triggers the pruning, and because the cache is
>     > so small, we prune the object that we read from the existing pack! Fast
>     > forward a few function calls, we end up in get_base_data trying to
>     > re-read the data for that object, but this time from the pack that we
>     > got on stdin. And boom it goes.
>     > 
>     > Does that make any sense to you?
> 
>       Yes, that's obvious, the pack that we read from stdin is consumed, we
>     should *NEVER* prune base_cache. And indeed that little patch works for
>     me.
> 
>  index-pack.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/index-pack.c b/index-pack.c
> index ac20a46..eb81ed4 100644
> --- a/index-pack.c
> +++ b/index-pack.c
> @@ -227,7 +227,7 @@ static void prune_base_data(struct base_data *retain)
>  	for (b = base_cache;
>  	     base_cache_used > delta_base_cache_limit && b;
>  	     b = b->child) {
> -		if (b->data && b != retain) {
> +		if (b != base_cache && b->data && b != retain) {
>  			free(b->data);
>  			b->data = NULL;
>  			base_cache_used -= b->size;


