* ***DONTUSE*** Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Junio C Hamano @ 2006-02-12 13:46 UTC (permalink / raw)
To: git
In-Reply-To: <7vlkwgdbk6.fsf_-_@assigned-by-dhcp.cox.net>
I've pushed things out to "master" and "next" branch.
Quite a lot of things.
One thing that I expected to be there is not. It is the
hashtable patch. It is in "pu".
I once had it in my private "next", but dropped it before
pushing things out.
The problem does not seem to trigger with casual use, but I
found that with a clone from my primary repository with '-l -s'
(that is, a clone that uses alternates mechanism to borrow from
my primary repository), fsck-objects built with the patch seems
to report bogus things "missing". I have not traced it fully;
instead I ended up spending most of the night (I noticed it at
around 01:30 and now it is 05:30 so that's about four hours)
recovering some of my refs and double checking if my primary
repository is not corrupt X-<. At least, the primary repository
looks sane now.
With luck, I would muster enough energy to figure it out, but I
need some sleep first.
The problem seems to be very elusive. I took a snapshot of the
two repositories involved, so that I can use them as an isolated
test case (the one is my primary repository and the other one is
the "-l -s" clone). The problem is repeatable, but the SHA1 of
the file the broken fsck-objects reports to be missing is
different from the one I observed in the first experiment with
the real repositories. It appears it has something to do with
the directory listing order of fsck-objects, which in turn means
the reproduction of the problem is related to memory allocation
patterns, so maybe valgrind would help. On the other hand, even
if I published a tarball of these two repositories somewhere,
other people (or myself) who extract the tarball would probably
not see the same SHA1 reported as missing X-<.
Anyway, I've pushed them out before crashing, after I double
checked that versions built from my "master" and "next" do not
seem to show the problem, while the one built from "pu" (where the
first patch after merging "next" is the said patch) exhibits
the problem.
^ permalink raw reply
* Configuration file musings
From: Mark Wooding @ 2006-02-12 13:45 UTC (permalink / raw)
To: git
Having thought about things a bit, I've reached the conclusion that the
configuration file $GIT_DIR/config is trying to hold (at least) three
entirely different kinds of configuration.
* User configuration: basically, how I like GIT to work for me. I
think that the way it represents my name in commit messages is user
configuration, as would be the behaviour of `git-commit PATH'.
Environment variables almost work for this, but they're a nuisance
to change. This stuff ought to be somewhere in my home directory,
probably; though it would be useful to override temporarily, or on a
per-repository basis.
* Project configuration: how GIT should be supporting a particular
project. The merge.summary flag is like this, I think: whether to
have summaries in merge messages is a policy decision to be taken
for a whole project, rather than something to be left to the whims
of individual developers. Such settings probably ought to be
propagated through git-clone, git-fetch and so on.
* True repository configuration: how this particular repository ought
to behave. I can't think of many examples off the top of my head,
but core.repositoryformatversion and core.filemode are the sorts of
things I'm thinking of.
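One way to picture the three kinds of configuration is as layers that are searched in order. The following is only a sketch: the struct and function names are made up for illustration, and the precedence order (which layer wins) is left to the caller because the message above does not settle it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One key/value pair, e.g. { "merge.summary", "true" }. */
struct config_entry {
	const char *key;
	const char *value;
};

/* One kind of configuration: "repository", "user" or "project". */
struct config_layer {
	const char *name;
	const struct config_entry *entries;
	size_t nr;
};

/*
 * Return the value of key from the first layer that defines it, or
 * NULL if no layer does.  layers[] is ordered from highest to lowest
 * precedence; choosing that order is a policy decision (assumption).
 */
static const char *layered_get(const struct config_layer *layers,
			       size_t nr_layers, const char *key)
{
	size_t i, j;

	assert(key != NULL);
	for (i = 0; i < nr_layers; i++)
		for (j = 0; j < layers[i].nr; j++)
			if (!strcmp(layers[i].entries[j].key, key))
				return layers[i].entries[j].value;
	return NULL;
}
```

With the layers ordered repository, user, project, a per-repository setting would shadow the same key in the user's home directory, which matches the "override on a per-repository basis" wish; a project that wants to enforce policy would need a different order.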
I'm not entirely sure where I'm going with this at the moment, and I
don't like some of the complexity which seems inherent in doing anything
about it, but I thought I'd stick my oar in anyway.
-- [mdw]
^ permalink raw reply
* Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Junio C Hamano @ 2006-02-12 12:11 UTC (permalink / raw)
To: Alexandre Julliard; +Cc: git, Johannes Schindelin, Linus Torvalds
In-Reply-To: <87oe1dez7k.fsf@wine.dyndns.org>
Alexandre Julliard <julliard@winehq.org> writes:
> Junio C Hamano <junkio@cox.net> writes:
>
>> Alexandre, if you have a chance, could you try Johannes' patch
>> on your workload to see if it works OK for you?
>
> It works great for me, CPU time is down to 15 sec instead of 20 sec
> with my patch.
Thanks. Now we have three independent numbers to back up that
Johannes is the winner....
Grrrrrrr. Please, DO NOT USE THIS ONE YET.
At least, not with your production repository.
I am trying to nail it down but it appears at least fsck-objects
using this version gives bogus results. I am first trying to
see if my primary working repository is sane.
Oh, and thanks again for your initial patch, which was what
started this drastic improvement.
^ permalink raw reply
* ***DONTUSE*** Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Junio C Hamano @ 2006-02-12 12:08 UTC (permalink / raw)
To: Florian Weimer; +Cc: git
In-Reply-To: <87accwlt8k.fsf@mid.deneb.enyo.de>
Florian Weimer <fw@deneb.enyo.de> writes:
> (GCC should do the rest.)
>...
> AFAICS, obj_allocs is a power of two.
Yes, I already have something like these in my tree (the latter
did not help much as far as I could tell, though).
****HOWEVER****
Do not use this (not just my patch but the whole hashtable
version) in your production repository yet.
I've got a mysterious corruption and bogus output from
fsck-objects, and have been tracking it (see the timestamp of
this message).
1.2.0 will most likely be *delayed*. I have to first make
sure my private repository is sane. Grrrrrrrr.
^ permalink raw reply
* [PATCH] Add howto about separating topics.
From: kent @ 2006-02-12 12:00 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vmzgxn1dz.fsf@assigned-by-dhcp.cox.net>
This howto consists of a footnote from an email by JC to the git
mailing list (<7vfyms0x4p.fsf@assigned-by-dhcp.cox.net>).
Signed-off-by: Kent Engstrom <kent@lysator.liu.se>
---
Documentation/howto/separating-topic-branches.txt | 91 +++++++++++++++++++++
1 files changed, 91 insertions(+), 0 deletions(-)
create mode 100644 Documentation/howto/separating-topic-branches.txt
39f152ae224f45a3d977aa8966a477dbc1df676d
diff --git a/Documentation/howto/separating-topic-branches.txt b/Documentation/howto/separating-topic-branches.txt
new file mode 100644
index 0000000..090e2c9
--- /dev/null
+++ b/Documentation/howto/separating-topic-branches.txt
@@ -0,0 +1,91 @@
+From: Junio C Hamano <junkio@cox.net>
+Subject: Separating topic branches
+Abstract: In this article, JC describes how to separate topic branches.
+
+This text was originally a footnote to a discussion about the
+behaviour of the git diff commands.
+
+Often I find myself doing that [running diff against something other
+than HEAD] while rewriting messy development history. For example, I
+start doing some work without knowing exactly where it leads, and end
+up with a history like this:
+
+ "master"
+ o---o
+ \ "topic"
+ o---o---o---o---o---o
+
+At this point, "topic" contains something I know I want, but it
+contains two concepts that turned out to be completely independent.
+And often, one topic component is larger than the other. It may
+contain more than two topics.
+
+In order to rewrite this mess to be more manageable, I would first do
+"diff master..topic", to extract the changes into a single patch, start
+picking pieces from it to get logically self-contained units, and
+start building on top of "master":
+
+ $ git diff master..topic >P.diff
+ $ git checkout -b topicA master
+ ... pick and apply pieces from P.diff to build
+ ... commits on topicA branch.
+
+ o---o---o
+ / "topicA"
+ o---o"master"
+ \ "topic"
+ o---o---o---o---o---o
+
+Before doing each commit on "topicA" HEAD, I run "diff HEAD"
+before update-index the affected paths, or "diff --cached HEAD"
+after. Also I would run "diff --cached master" to make sure
+that the changes are only the ones related to "topicA". Usually
+I do this for smaller topics first.
+
+After that, I'd do the remainder of the original "topic", but
+for that, I do not start from the patchfile I extracted by
+comparing "master" and "topic" I used initially. Still on
+"topicA", I extract "diff topic", and use it to rebuild the
+other topic:
+
+ $ git diff -R topic >P.diff ;# --cached also would work fine
+ $ git checkout -b topicB master
+ ... pick and apply pieces from P.diff to build
+ ... commits on topicB branch.
+
+ "topicB"
+ o---o---o---o---o
+ /
+ /o---o---o
+ |/ "topicA"
+ o---o"master"
+ \ "topic"
+ o---o---o---o---o---o
+
+After I am done, I'd try a pretend-merge between "topicA" and
+"topicB" in order to make sure I have not missed anything:
+
+ $ git pull . topicA ;# merge it into current "topicB"
+ $ git diff topic
+ "topicB"
+ o---o---o---o---o---* (pretend merge)
+ / /
+ /o---o---o----------'
+ |/ "topicA"
+ o---o"master"
+ \ "topic"
+ o---o---o---o---o---o
+
+The last diff better not to show anything other than cleanups
+for crufts. Then I can finally clean things up:
+
+ $ git branch -D topic
+ $ git reset --hard HEAD^ ;# nuke pretend merge
+
+ "topicB"
+ o---o---o---o---o
+ /
+ /o---o---o
+ |/ "topicA"
+ o---o"master"
+
--
1.1.6.g29e5
^ permalink raw reply related
* Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Florian Weimer @ 2006-02-12 11:19 UTC (permalink / raw)
To: git
In-Reply-To: <7virrli9am.fsf@assigned-by-dhcp.cox.net>
* Junio C. Hamano:
> static int hashtable_index(const unsigned char *sha1)
> {
> - unsigned int i = *(unsigned int *)sha1;
> - return (int)(i % obj_allocs);
> + int cnt;
> + unsigned int ix = *sha1++;
> +
> + for (cnt = 1; cnt < sizeof(unsigned int); cnt++) {
> + ix <<= 8;
> + ix |= *sha1++;
> + }
memcpy(&ix, sha1, sizeof(ix));
(GCC should do the rest.)
> + return (int)(ix % obj_allocs);
> }
return (int)(ix & (obj_allocs - 1));
AFAICS, obj_allocs is a power of two.
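
Put together, the two suggestions might look like the sketch below. It assumes the table size is always a non-zero power of two; the size is passed in as a parameter called table_size (a name not in the patch) so that the function stands alone.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch only: map the leading bytes of a SHA-1 to a slot index.
 * table_size stands in for obj_allocs and must be a non-zero
 * power of two, otherwise the mask below is not a valid modulo.
 */
static unsigned int hashtable_index(const unsigned char *sha1,
				    unsigned int table_size)
{
	unsigned int ix;

	assert(table_size != 0 && (table_size & (table_size - 1)) == 0);

	/* memcpy sidesteps the unaligned read of *(unsigned int *)sha1 */
	memcpy(&ix, sha1, sizeof(ix));

	/* for a power-of-two size this equals ix % table_size */
	return ix & (table_size - 1);
}
```

Unlike the byte-by-byte loop, memcpy yields a host-endian value, so the same SHA-1 may land in a different slot on a different machine. That should not matter for a table that only lives in memory.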
^ permalink raw reply
* Re: Make "git clone" less of a deathly quiet experience
From: Andreas Ericsson @ 2006-02-12 11:02 UTC (permalink / raw)
To: Keith Packard
Cc: Linus Torvalds, Junio C Hamano, Git Mailing List, Petr Baudis
In-Reply-To: <1139717510.4183.34.camel@evo.keithp.com>
Keith Packard wrote:
> On Sun, 2006-02-12 at 04:43 +0100, Andreas Ericsson wrote:
>
>
>>A weird oddity; Cloning is faster over rsync, day-to-day pulling is not.
>
>
> Precisely. If the protocol could deliver existing packs instead of
> unpacking and repacking them, then git would be as fast as rsync and I
> wouldn't have to worry about supporting two protocols.
>
Caching features have been discussed, but that means the daemon needs to
have write-access to some directory within the repository. It would also
work poorly for projects that see very rapid development unless the
cached pack-files can be amended. A sort of "create packs on demand".
It shouldn't be too difficult, really.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply
* Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Alexandre Julliard @ 2006-02-12 8:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Johannes Schindelin, git
In-Reply-To: <7virrli9am.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> writes:
> I am also interested to find out how much the rehashing you do
> when you update obj_allocs to a larger value is costing.
>
> Alexandre, if you have a chance, could you try Johannes' patch
> on your workload to see if it works OK for you?
It works great for me, CPU time is down to 15 sec instead of 20 sec
with my patch.
--
Alexandre Julliard
julliard@winehq.org
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Linus Torvalds @ 2006-02-12 7:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602112117560.3691@g5.osdl.org>
On Sat, 11 Feb 2006, Linus Torvalds wrote:
>
> Before:
> real 0m41.322s user 0m40.612s sys 0m0.492s
> After:
> real 0m22.542s user 0m22.080s sys 0m0.448s
Johannes:
real 0m13.814s user 0m13.492s sys 0m0.296s
> And just so you wouldn't think that all my machines are slow..
>
> Before:
> real 0m28.645s user 0m28.366s sys 0m0.280s
> After:
> real 0m16.566s user 0m16.373s sys 0m0.196s
Johannes:
real 0m10.239s user 0m10.029s sys 0m0.208s
So the hashing thing is indeed the clear winner.
Make it so.
Linus
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Linus Torvalds @ 2006-02-12 6:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Alexandre Julliard, Johannes Schindelin
In-Reply-To: <7vaccxdsaf.fsf@assigned-by-dhcp.cox.net>
On Sat, 11 Feb 2006, Junio C Hamano wrote:
>
> It turns out that Johannes (with my patch to fix possible
> unsigned int alignment issue and the initial call to
> find_object()) is the clear winner.
Having looked at it, I will have to agree. Johannes' approach looks
pretty clean, and has the same memory overhead mine has (two pointers per
object in the hash - one used, one empty), but has a lot fewer memcmp()
calls and pointer chasing.
So I'll growl softly but concur. Johannes' code isn't even very complex.
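
Johannes' patch is not quoted in this part of the thread, so the following is not his code. It is a rough sketch of what a "circular hash" could look like, assuming open addressing with linear probing that wraps around at the end of the table, and assuming the table is grown so that at least half of the slots stay empty (the "one used, one empty" overhead). All names are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct obj {
	unsigned char sha1[20];
};

static struct obj **table;      /* slots; NULL means empty */
static unsigned int table_size; /* zero or a power of two */
static unsigned int nr_used;

static unsigned int slot_for(const unsigned char *sha1)
{
	unsigned int ix;

	memcpy(&ix, sha1, sizeof(ix));
	return ix & (table_size - 1);
}

/* Return the slot holding sha1, or the empty slot where it belongs. */
static struct obj **find_slot(const unsigned char *sha1)
{
	unsigned int i = slot_for(sha1);

	while (table[i] && memcmp(table[i]->sha1, sha1, 20))
		i = (i + 1) & (table_size - 1); /* wrap around */
	return &table[i];
}

/* Double the table and rehash every existing entry into it. */
static void grow_table(void)
{
	unsigned int old_size = table_size, i;
	struct obj **old = table;

	table_size = old_size ? old_size * 2 : 32;
	table = calloc(table_size, sizeof(*table));
	if (!table)
		abort();
	for (i = 0; i < old_size; i++)
		if (old[i])
			*find_slot(old[i]->sha1) = old[i];
	free(old);
}

static struct obj *lookup_obj(const unsigned char *sha1)
{
	return table_size ? *find_slot(sha1) : NULL;
}

static void insert_obj(struct obj *o)
{
	struct obj **slot;

	/* keep at least half of the slots empty */
	if (2 * (nr_used + 1) > table_size)
		grow_table();
	slot = find_slot(o->sha1);
	assert(*slot == NULL); /* caller must not insert a SHA-1 twice */
	*slot = o;
	nr_used++;
}
```

A lookup that hits on the first probe costs one memcmp() and no pointer chasing beyond the slot itself, which is consistent with the observation above about fewer memcmp() calls than the tree.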
Linus
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Junio C Hamano @ 2006-02-12 6:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git, Alexandre Julliard, Johannes Schindelin
In-Reply-To: <7v1wy9f7q4.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> writes:
> Linus Torvalds <torvalds@osdl.org> writes:
>
>> On Sat, 11 Feb 2006, Linus Torvalds wrote:
>>>
>>> If somebody shows that the other approaches are faster, then I guess I'll
>>> just have to sulk in a corner and grown quietly at people.
>>
>> growl. growL. With an 'L'!
>
> I do not get it.
> ...
I first suspected you just meant the typo (s/grown/growl/) but
it probably is that you really meant GROWL (and sulk).
It turns out that Johannes (with my patch to fix possible
unsigned int alignment issue and the initial call to
find_object()) is the clear winner.
base - tip of "master"
lt-obj - the binary tree without balancing
aj-obj - Alexandre's 256-way buckets
js-obj - Johannes' circular hash
Although I have _not_ double checked the correctness of them, I
did not see any major flaw in any of them.
base/git-rev-list --objects v2.6.14..linus
real 2m32.088s user 2m2.830s sys 0m0.890s
real 2m6.614s user 2m1.860s sys 0m0.660s
real 2m13.776s user 2m2.450s sys 0m0.590s
real 2m6.062s user 2m2.420s sys 0m0.690s
real 2m15.567s user 2m3.170s sys 0m0.900s
lt-obj/git-rev-list --objects v2.6.14..linus
real 0m42.889s user 0m40.170s sys 0m0.570s
real 0m44.247s user 0m40.320s sys 0m0.530s
real 0m40.891s user 0m40.110s sys 0m0.500s
real 0m41.874s user 0m40.090s sys 0m0.530s
real 0m41.596s user 0m40.050s sys 0m0.600s
aj-obj/git-rev-list --objects v2.6.14..linus
real 0m36.842s user 0m36.200s sys 0m0.490s
real 0m37.178s user 0m36.740s sys 0m0.390s
real 0m37.222s user 0m36.540s sys 0m0.610s
real 0m36.924s user 0m36.410s sys 0m0.360s
real 0m37.341s user 0m36.150s sys 0m0.620s
js-obj/git-rev-list --objects v2.6.14..linus
real 0m24.689s user 0m24.120s sys 0m0.390s
real 0m24.753s user 0m24.020s sys 0m0.360s
real 0m27.650s user 0m24.470s sys 0m0.440s
real 0m33.480s user 0m24.030s sys 0m0.460s
real 0m25.329s user 0m24.490s sys 0m0.390s
base/git-name-rev --all
real 0m4.193s user 0m4.060s sys 0m0.130s
real 0m4.179s user 0m4.100s sys 0m0.080s
real 0m4.210s user 0m4.040s sys 0m0.150s
real 0m4.162s user 0m4.100s sys 0m0.060s
real 0m4.697s user 0m4.100s sys 0m0.120s
lt-obj/git-name-rev --all
real 0m2.199s user 0m2.120s sys 0m0.080s
real 0m2.186s user 0m2.110s sys 0m0.080s
real 0m2.187s user 0m2.150s sys 0m0.040s
real 0m2.817s user 0m2.150s sys 0m0.070s
real 0m2.323s user 0m2.170s sys 0m0.050s
aj-obj/git-name-rev --all
real 0m2.136s user 0m2.050s sys 0m0.080s
real 0m2.164s user 0m2.080s sys 0m0.060s
real 0m2.143s user 0m2.070s sys 0m0.070s
real 0m2.141s user 0m2.080s sys 0m0.060s
real 0m2.154s user 0m2.070s sys 0m0.090s
js-obj/git-name-rev --all
real 0m2.047s user 0m2.010s sys 0m0.040s
real 0m2.040s user 0m1.970s sys 0m0.070s
real 0m2.025s user 0m1.970s sys 0m0.060s
real 0m2.170s user 0m2.020s sys 0m0.030s
real 0m2.046s user 0m2.010s sys 0m0.030s
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Junio C Hamano @ 2006-02-12 5:48 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602112122400.3691@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> On Sat, 11 Feb 2006, Linus Torvalds wrote:
>>
>> If somebody shows that the other approaches are faster, then I guess I'll
>> just have to sulk in a corner and grown quietly at people.
>
> growl. growL. With an 'L'!
I do not get it.
But my impression was that the circular hash with trivial fixes was
the fastest. I am benching them now.
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Linus Torvalds @ 2006-02-12 5:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602112117560.3691@g5.osdl.org>
On Sat, 11 Feb 2006, Linus Torvalds wrote:
>
> If somebody shows that the other approaches are faster, then I guess I'll
> just have to sulk in a corner and grown quietly at people.
growl. growL. With an 'L'!
Linus
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Linus Torvalds @ 2006-02-12 5:22 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602112045340.3691@g5.osdl.org>
On Sat, 11 Feb 2006, Linus Torvalds wrote:
>
> Before:
> real 0m41.322s user 0m40.612s sys 0m0.492s
> real 0m40.797s user 0m40.140s sys 0m0.468s
> real 0m40.433s user 0m40.016s sys 0m0.412s
>
> After:
> real 0m22.542s user 0m22.080s sys 0m0.448s
> real 0m22.660s user 0m22.336s sys 0m0.312s
> real 0m22.671s user 0m22.236s sys 0m0.292s
And just so you wouldn't think that all my machines are slow..
Before:
real 0m28.645s user 0m28.366s sys 0m0.280s
real 0m28.700s user 0m28.486s sys 0m0.212s
After:
real 0m16.566s user 0m16.373s sys 0m0.196s
real 0m16.512s user 0m16.277s sys 0m0.236s
so there (that's all with current kernel HEAD, mostly packed).
Now, I haven't compared it to the other suggested fixes (hashing, and the
256-way bucket-sorting), but I obviously prefer the tree approach because
it's my idea (and my ideas are _always_ superior) and because it's so dang
simple.
If somebody shows that the other approaches are faster, then I guess I'll
just have to sulk in a corner and grown quietly at people.
Linus
^ permalink raw reply
* Re: [PATCH] binary-tree-based objects.
From: Linus Torvalds @ 2006-02-12 5:06 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhd75fc6y.fsf_-_@assigned-by-dhcp.cox.net>
On Sat, 11 Feb 2006, Junio C Hamano wrote:
>
> * I haven't benched this seriously yet. One datapoint:
>
> time git-rev-list --objects v2.6.15..linus | wc -l
>
> are 53sec vs 22sec improvement with the same output.
Another datapoint: doing
time git-rev-list --objects HEAD > /dev/null
three times in a row (to verify that the numbers are stable - they very
clearly are).
Before:
real 0m41.322s user 0m40.612s sys 0m0.492s
real 0m40.797s user 0m40.140s sys 0m0.468s
real 0m40.433s user 0m40.016s sys 0m0.412s
After:
real 0m22.542s user 0m22.080s sys 0m0.448s
real 0m22.660s user 0m22.336s sys 0m0.312s
real 0m22.671s user 0m22.236s sys 0m0.292s
and doing some trivial oprofile runs shows that the object lookup is no
longer dominant (my libc's don't have symbol information, so I don't get
good profile data, but it shows that libc and libz are the biggest issues,
with memcmp and malloc/free apparently being much bigger issues than the
object lookup).
Linus
^ permalink raw reply
* Re: Make "git clone" less of a deathly quiet experience
From: Keith Packard @ 2006-02-12 4:11 UTC (permalink / raw)
To: Andreas Ericsson
Cc: keithp, Linus Torvalds, Junio C Hamano, Git Mailing List,
Petr Baudis
In-Reply-To: <43EEAEF3.7040202@op5.se>
On Sun, 2006-02-12 at 04:43 +0100, Andreas Ericsson wrote:
> A weird oddity; Cloning is faster over rsync, day-to-day pulling is not.
Precisely. If the protocol could deliver existing packs instead of
unpacking and repacking them, then git would be as fast as rsync and I
wouldn't have to worry about supporting two protocols.
--
keith.packard@intel.com
^ permalink raw reply
* [PATCH] binary-tree-based objects.
From: Junio C Hamano @ 2006-02-12 4:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <7vslqpi9mg.fsf@assigned-by-dhcp.cox.net>
This implements Linus' idea to keep objects in a binary tree,
instead of using the linear array as we currently do.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
* I haven't benched this seriously yet. One datapoint:
time git-rev-list --objects v2.6.15..linus | wc -l
are 53sec vs 22sec improvement with the same output.
fsck-objects.c | 36 +++++++++++++++++----------------
name-rev.c | 17 +++++++++++-----
object.c | 61 +++++++++++++++++++++-----------------------------------
object.h | 3 ++-
4 files changed, 55 insertions(+), 62 deletions(-)
3c160f4d94cf16db5dc9c603e98ebacbe9ac4ca7
diff --git a/fsck-objects.c b/fsck-objects.c
index 9950be2..28a7c1b 100644
--- a/fsck-objects.c
+++ b/fsck-objects.c
@@ -56,23 +56,21 @@ static int objwarning(struct object *obj
}
-static void check_connectivity(void)
+static void check_connectivity(struct object *obj)
{
- int i;
-
/* Look up all the requirements, warn about missing objects.. */
- for (i = 0; i < nr_objs; i++) {
- struct object *obj = objs[i];
-
- if (!obj->parsed) {
- if (!standalone && has_sha1_file(obj->sha1))
- ; /* it is in pack */
- else
- printf("missing %s %s\n",
- obj->type, sha1_to_hex(obj->sha1));
- continue;
- }
+ again:
+ if (!obj)
+ return;
+ if (!obj->parsed) {
+ if (!standalone && has_sha1_file(obj->sha1))
+ ; /* it is in pack */
+ else
+ printf("missing %s %s\n",
+ obj->type, sha1_to_hex(obj->sha1));
+ }
+ else {
if (obj->refs) {
const struct object_refs *refs = obj->refs;
unsigned j;
@@ -91,14 +89,16 @@ static void check_connectivity(void)
if (show_unreachable && !(obj->flags & REACHABLE)) {
printf("unreachable %s %s\n",
obj->type, sha1_to_hex(obj->sha1));
- continue;
}
-
- if (!obj->used) {
+ else if (!obj->used) {
printf("dangling %s %s\n", obj->type,
sha1_to_hex(obj->sha1));
}
}
+ if (obj->left && obj->right)
+ check_connectivity(obj->left);
+ obj = obj->right ? obj->right : obj->left;
+ goto again;
}
/*
@@ -556,6 +556,6 @@ int main(int argc, char **argv)
}
}
- check_connectivity();
+ check_connectivity(objs_root);
return 0;
}
diff --git a/name-rev.c b/name-rev.c
index bbadb91..a4fecfb 100644
--- a/name-rev.c
+++ b/name-rev.c
@@ -120,6 +120,17 @@ static const char* get_rev_name(struct o
return buffer;
}
+void show_all_names(struct object *obj)
+{
+ while (obj) {
+ printf("%s %s\n", sha1_to_hex(obj->sha1), get_rev_name(obj));
+ if (obj->left && obj->right)
+ show_all_names(obj->left);
+ obj = obj->right ? obj->right : obj->left;
+ }
+}
+
+
int main(int argc, char **argv)
{
struct object_list *revs = NULL;
@@ -230,11 +241,7 @@ int main(int argc, char **argv)
fwrite(p_start, p - p_start, 1, stdout);
}
} else if (all) {
- int i;
-
- for (i = 0; i < nr_objs; i++)
- printf("%s %s\n", sha1_to_hex(objs[i]->sha1),
- get_rev_name(objs[i]));
+ show_all_names(objs_root);
} else
for ( ; revs; revs = revs->next)
printf("%s %s\n", revs->name, get_rev_name(revs->item));
diff --git a/object.c b/object.c
index 1577f74..a1b0729 100644
--- a/object.c
+++ b/object.c
@@ -5,65 +5,50 @@
#include "commit.h"
#include "tag.h"
-struct object **objs;
+struct object *objs_root;
int nr_objs;
-static int obj_allocs;
int track_object_refs = 1;
-static int find_object(const unsigned char *sha1)
+static struct object **lookup_object_position(const unsigned char *sha1)
{
- int first = 0, last = nr_objs;
+ struct object **p = &objs_root;
- while (first < last) {
- int next = (first + last) / 2;
- struct object *obj = objs[next];
- int cmp;
-
- cmp = memcmp(sha1, obj->sha1, 20);
- if (!cmp)
- return next;
- if (cmp < 0) {
- last = next;
- continue;
- }
- first = next+1;
- }
- return -first-1;
+ for (;;) {
+ struct object *object = *p;
+ int sign;
+
+ if (!object)
+ break;
+ sign = memcmp(sha1, object->sha1, 20);
+ if (!sign)
+ break;
+ p = &object->left;
+ if (sign < 0)
+ continue;
+ p = &object->right;
+ }
+ return p;
}
struct object *lookup_object(const unsigned char *sha1)
{
- int pos = find_object(sha1);
- if (pos >= 0)
- return objs[pos];
- return NULL;
+ return *lookup_object_position(sha1);
}
void created_object(const unsigned char *sha1, struct object *obj)
{
- int pos = find_object(sha1);
+ struct object **op = lookup_object_position(sha1);
obj->parsed = 0;
memcpy(obj->sha1, sha1, 20);
obj->type = NULL;
obj->refs = NULL;
obj->used = 0;
-
- if (pos >= 0)
+ obj->left = obj->right = NULL;
+ if (*op)
die("Inserting %s twice\n", sha1_to_hex(sha1));
- pos = -pos-1;
-
- if (obj_allocs == nr_objs) {
- obj_allocs = alloc_nr(obj_allocs);
- objs = xrealloc(objs, obj_allocs * sizeof(struct object *));
- }
-
- /* Insert it into the right place */
- memmove(objs + pos + 1, objs + pos, (nr_objs - pos) *
- sizeof(struct object *));
-
- objs[pos] = obj;
+ *op = obj;
nr_objs++;
}
diff --git a/object.h b/object.h
index 0e76182..32b276d 100644
--- a/object.h
+++ b/object.h
@@ -19,12 +19,13 @@ struct object {
unsigned char sha1[20];
const char *type;
struct object_refs *refs;
+ struct object *left, *right;
void *util;
};
extern int track_object_refs;
extern int nr_objs;
-extern struct object **objs;
+extern struct object *objs_root;
/** Internal only **/
struct object *lookup_object(const unsigned char *sha1);
--
1.1.6.g69c5
^ permalink raw reply related
* Re: Two crazy proposals for changing git's diff commands
From: Junio C Hamano @ 2006-02-12 3:48 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: git
In-Reply-To: <20060212031527.GA31228@fieldses.org>
"J. Bruce Fields" <bfields@fieldses.org> writes:
> On Wed, Feb 08, 2006 at 05:21:12PM -0800, Junio C Hamano wrote:
>> Of course, learning various flags to give "git diff" is part of
>> understanding the index
>
> Well, there's understanding the index, and then there's memorizing the
> flags...
> ...
> But maybe that's just me. (And maybe the namespace in question is
> already too crowded to allow for INDEX and WORK.)
I do not think it is just you. The real problem, honestly
speaking, is that "git diff" wrapper cheats and avoids doing its
own set of flags.
The low-level is just a mechanism the UI is built upon, and as a
mechanism (except that --cached might now be better spelled as
--index) it has a set of options and semantics that are
consistent with its world model (the index-centric way of thinking).
Because "git diff" wrapper cheats, it ends up exposing the
low-level flags and arguments to the end user, and to use that
effectively, obviously you need to understand the world model
the low-level is built upon.
It was OK (it could be argued that it was even better than sugar
coating to make it *inconsistent* with the underlying world
model) so far, as long as people who use it are aware of the
index centric world model, but that "consistency with the
underlying world model" makes it harder to approach and causes
confusion.
That is why these days I often mention "welding training
wheels". Doing half-baked sugarcoating of the UI layer would
break the mental model of people who understand the world model
the low-level is built upon and try to make effective use of the
low-level through the UI.
^ permalink raw reply
* Re: Make "git clone" less of a deathly quiet experience
From: Andreas Ericsson @ 2006-02-12 3:43 UTC (permalink / raw)
To: Keith Packard
Cc: Linus Torvalds, Junio C Hamano, Git Mailing List, Petr Baudis
In-Reply-To: <1139685031.4183.31.camel@evo.keithp.com>
Keith Packard wrote:
> On Sat, 2006-02-11 at 09:45 -0800, Linus Torvalds wrote:
>
>
>>More importantly, it really wouldn't have helped that much in this
>>situation. At least for me, the network is 90% of the problem, the
>>pack-file generation is at most 10%. So cached packfiles really only
>>matter for server-side problems (high CPU load, or lack of memory, or
>>heavy disk activity).
>
>
> I'd like to see git use less CPU than CVS does on my distribution host;
> some mechanism for re-using either existing or cached packs would help a
> whole lot with that. The alternative is to see people switch to rsync
> instead, which seems like a far worse idea.
>
A weird oddity; Cloning is faster over rsync, day-to-day pulling is not.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply
* Re: Two crazy proposals for changing git's diff commands
From: J. Bruce Fields @ 2006-02-12 3:15 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Carl Worth, git
In-Reply-To: <7vfymtl43b.fsf@assigned-by-dhcp.cox.net>
On Wed, Feb 08, 2006 at 05:21:12PM -0800, Junio C Hamano wrote:
> Of course, learning various flags to give "git diff" is part of
> understanding the index
Well, there's understanding the index, and then there's memorizing the
flags. I would've thought it'd be a lot easier to remember something
like
git diff HEAD INDEX
git diff INDEX WORK
git diff HEAD WORK
than, respectively,
git diff --cached
git diff
git diff HEAD
But maybe that's just me. (And maybe the namespace in question is
already too crowded to allow for INDEX and WORK.)
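
The correspondence between the two spellings is small enough to write down as a table. The function below is purely illustrative: INDEX and WORK are the proposed names, not anything "git diff" accepts, and diff_args_for() is a made-up helper that returns the existing arguments for each proposed pair.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the existing "git diff" arguments equivalent to the
 * proposed "git diff <from> <to>", or NULL for a pair that the
 * message above does not list.  "" means "no extra arguments".
 */
static const char *diff_args_for(const char *from, const char *to)
{
	assert(from != NULL && to != NULL);

	if (!strcmp(from, "HEAD") && !strcmp(to, "INDEX"))
		return "--cached";
	if (!strcmp(from, "INDEX") && !strcmp(to, "WORK"))
		return "";
	if (!strcmp(from, "HEAD") && !strcmp(to, "WORK"))
		return "HEAD";
	return NULL;
}
```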
--b.
^ permalink raw reply
* [PATCH] Add support for explicit type specifiers when calling git-repo-config
From: Petr Baudis @ 2006-02-12 3:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: A Large Angry SCM, git
In-Reply-To: <7vwtg2pkt2.fsf@assigned-by-dhcp.cox.net>
Dear diary, on Sat, Feb 11, 2006 at 05:43:21AM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> (3) neither of these commands know list of all the possible
> configuration items, nor types of them, so core.filemode
> can be spelled as "1" or "true" to mean the same thing to
> our C code, but repo-config faithfully returns how the
> value is literally spelled in the configuration file. The
> following two means the same thing to the C layer, so the
> calling script needs to further interpret the output from
> git-repo-config:
>
> $ git repo-config core.filemode ;# [core] filemode=1
> 1
> $ git repo-config core.filemode ;# [core] filemode=true
> true
>
> (4) worse, boolean 'true' can be specified by just having the
> configuration item in the file, but repo-config dumps core
> on that:
>
> $ git repo-config core.filemode ;# [core] filemode
> segmentation fault
This patch provides a partial solution - if you query only for variables
of the same type (or just a single variable), this adds type-checking
and transformation to the given type.
It is basically what Cogito would like to see - a centralized variable
database in GIT won't help us, but we would like to have custom but
still typed variables in the config file.
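
To make "transformation to the given type" concrete for the boolean case, here is a minimal sketch. It is not the code from the patch: canonical_bool() is a hypothetical helper, and the set of accepted spellings ("true", "1", "false", "0", or no value at all) is an assumption based on the cases quoted above.

```c
#include <assert.h>
#include <string.h>

/*
 * Return "true" or "false" for a raw config value, or NULL if the
 * value is not a recognised boolean.  A NULL value stands for an
 * option that is present without "= value", which means true.
 */
static const char *canonical_bool(const char *value)
{
	if (!value)
		return "true"; /* "[core] filemode" with no value */
	if (!strcmp(value, "true") || !strcmp(value, "1"))
		return "true";
	if (!strcmp(value, "false") || !strcmp(value, "0"))
		return "false";
	return NULL; /* caller reports a type error */
}
```

Handling the NULL case explicitly, instead of passing it straight on to a string function, is the kind of check that avoids the segfault in case (4).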
---
[PATCH] Add support for explicit type specifiers when calling git-repo-config
Currently, git-repo-config will just return the raw value of option
as specified in the config file; this makes things difficult for scripts
calling it, especially if the value is supposed to be boolean.
This patch makes it possible to ask git-repo-config to check if the option
is of the given type (int or bool) and write out the value in its
canonical form. If you do not pass --int or --bool, the behaviour stays
unchanged and the raw value is emitted.
This also incidentally fixes the segfault when an option with no value is
encountered.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
commit 8dcc626cd144b2c6eae2a299242bbbe905cb0059
tree 0d4dcc3a44eb318ef52c3d64dda11768745f7583
parent 29e55cd5ad9e17d2ff8a1a37b7ee45d18d1e59d6
author Petr Baudis <pasky@suse.cz> Sun, 12 Feb 2006 04:09:01 +0100
committer Petr Baudis <xpasky@machine.or.cz> Sun, 12 Feb 2006 04:09:01 +0100
Documentation/git-repo-config.txt | 18 ++++++--
repo-config.c | 80 +++++++++++++++++++++++--------------
2 files changed, 62 insertions(+), 36 deletions(-)
diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index 3069464..33fcde4 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -8,12 +8,12 @@ git-repo-config - Get and set options in
SYNOPSIS
--------
-'git-repo-config' name [value [value_regex]]
-'git-repo-config' --replace-all name [value [value_regex]]
-'git-repo-config' --get name [value_regex]
-'git-repo-config' --get-all name [value_regex]
-'git-repo-config' --unset name [value_regex]
-'git-repo-config' --unset-all name [value_regex]
+'git-repo-config' [type] name [value [value_regex]]
+'git-repo-config' [type] --replace-all name [value [value_regex]]
+'git-repo-config' [type] --get name [value_regex]
+'git-repo-config' [type] --get-all name [value_regex]
+'git-repo-config' [type] --unset name [value_regex]
+'git-repo-config' [type] --unset-all name [value_regex]
DESCRIPTION
-----------
@@ -26,6 +26,12 @@ should provide a POSIX regex for the val
*not* matching the regex, just prepend a single exclamation mark in front
(see EXAMPLES).
+The type specifier can be either '--int' or '--bool', which will make
+'git-repo-config' ensure that the variable(s) are of the given type and
+convert the value to the canonical form (simple decimal number for int,
+a "true" or "false" string for bool). If no type specifier is passed,
+no checks or transformations are performed on the value.
+
This command will fail if
. .git/config is invalid,
diff --git a/repo-config.c b/repo-config.c
index c31e441..ccdee3c 100644
--- a/repo-config.c
+++ b/repo-config.c
@@ -2,7 +2,7 @@
#include <regex.h>
static const char git_config_set_usage[] =
-"git-repo-config [--get | --get-all | --replace-all | --unset | --unset-all] name [value [value_regex]]";
+"git-repo-config [ --bool | --int ] [--get | --get-all | --replace-all | --unset | --unset-all] name [value [value_regex]]";
static char* key = NULL;
static char* value = NULL;
@@ -10,6 +10,7 @@ static regex_t* regexp = NULL;
static int do_all = 0;
static int do_not_match = 0;
static int seen = 0;
+static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
static int show_config(const char* key_, const char* value_)
{
@@ -25,7 +26,17 @@ static int show_config(const char* key_,
fprintf(stderr, "More than one value: %s\n", value);
free(value);
}
- value = strdup(value_);
+
+ if (type == T_INT) {
+ value = malloc(256);
+ sprintf(value, "%d", git_config_int(key_, value_));
+ } else if (type == T_BOOL) {
+ value = malloc(256);
+ sprintf(value, "%s", git_config_bool(key_, value_)
+ ? "true" : "false");
+ } else {
+ value = strdup(value_ ? : "");
+ }
seen++;
}
return 0;
@@ -72,43 +83,52 @@ static int get_value(const char* key_, c
int main(int argc, const char **argv)
{
+ int i;
setup_git_directory();
- switch (argc) {
+ for (i = 1; i < argc; i++) {
+ if (!strcmp(argv[i], "--int"))
+ type = T_INT;
+ else if (!strcmp(argv[i], "--bool"))
+ type = T_BOOL;
+ else
+ break;
+ }
+ switch (argc-i) {
+ case 1:
+ return get_value(argv[i], NULL);
case 2:
- return get_value(argv[1], NULL);
- case 3:
- if (!strcmp(argv[1], "--unset"))
- return git_config_set(argv[2], NULL);
- else if (!strcmp(argv[1], "--unset-all"))
- return git_config_set_multivar(argv[2], NULL, NULL, 1);
- else if (!strcmp(argv[1], "--get"))
- return get_value(argv[2], NULL);
- else if (!strcmp(argv[1], "--get-all")) {
+ if (!strcmp(argv[i], "--unset"))
+ return git_config_set(argv[i+1], NULL);
+ else if (!strcmp(argv[i], "--unset-all"))
+ return git_config_set_multivar(argv[i+1], NULL, NULL, 1);
+ else if (!strcmp(argv[i], "--get"))
+ return get_value(argv[i+1], NULL);
+ else if (!strcmp(argv[i], "--get-all")) {
do_all = 1;
- return get_value(argv[2], NULL);
+ return get_value(argv[i+1], NULL);
} else
- return git_config_set(argv[1], argv[2]);
- case 4:
- if (!strcmp(argv[1], "--unset"))
- return git_config_set_multivar(argv[2], NULL, argv[3], 0);
- else if (!strcmp(argv[1], "--unset-all"))
- return git_config_set_multivar(argv[2], NULL, argv[3], 1);
- else if (!strcmp(argv[1], "--get"))
- return get_value(argv[2], argv[3]);
- else if (!strcmp(argv[1], "--get-all")) {
+ return git_config_set(argv[i], argv[i+1]);
+ case 3:
+ if (!strcmp(argv[i], "--unset"))
+ return git_config_set_multivar(argv[i+1], NULL, argv[i+2], 0);
+ else if (!strcmp(argv[i], "--unset-all"))
+ return git_config_set_multivar(argv[i+1], NULL, argv[i+2], 1);
+ else if (!strcmp(argv[i], "--get"))
+ return get_value(argv[i+1], argv[i+2]);
+ else if (!strcmp(argv[i], "--get-all")) {
do_all = 1;
- return get_value(argv[2], argv[3]);
- } else if (!strcmp(argv[1], "--replace-all"))
+ return get_value(argv[i+1], argv[i+2]);
+ } else if (!strcmp(argv[i], "--replace-all"))
- return git_config_set_multivar(argv[2], argv[3], NULL, 1);
+ return git_config_set_multivar(argv[i+1], argv[i+2], NULL, 1);
else
- return git_config_set_multivar(argv[1], argv[2], argv[3], 0);
- case 5:
- if (!strcmp(argv[1], "--replace-all"))
- return git_config_set_multivar(argv[2], argv[3], argv[4], 1);
- case 1:
+ return git_config_set_multivar(argv[i], argv[i+1], argv[i+2], 0);
+ case 4:
+ if (!strcmp(argv[i], "--replace-all"))
+ return git_config_set_multivar(argv[i+1], argv[i+2], argv[i+3], 1);
+ case 0:
default:
usage(git_config_set_usage);
}
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe. -- Douglas Adams
^ permalink raw reply related
* Re: [PATCH] Teach repo-config the -l and --get-regexp options
From: Junio C Hamano @ 2006-02-12 3:05 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0602111306450.25997@wbgn013.biozentrum.uni-wuerzburg.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Happier?
Not really.
It still dumps core with:
[core]
boolvarsaretrueiftheirnamesarelisted
The patch does not address any of the more important issues I
listed with git-var and git-repo-config in that message.
^ permalink raw reply
* Re: [PATCH] fetch-clone progress: finishing touches.
From: Linus Torvalds @ 2006-02-12 3:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vslqpjq2q.fsf@assigned-by-dhcp.cox.net>
On Sat, 11 Feb 2006, Junio C Hamano wrote:
>
> BTW, don't you mean 512 down there???
>
> - msecs += (int)(tv.tv_usec - prev_tv.tv_usec) >> 10;
> + msecs += usec_to_binarymsec(tv.tv_usec - prev_tv.tv_usec);
> +
> if (msecs > 500) {
> prev_tv = tv;
Well, it's just a random number, but if you like 512 better than 500, go
wild ;)
Linus
^ permalink raw reply
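
The 500-versus-512 exchange above comes down to the unit that a shift by 10 produces. A small sketch, assuming that usec_to_binarymsec() does what the removed line did (the quoted diff names the helper but does not show its definition, so the body below is a guess) and that its argument is non-negative:

```c
/*
 * Assumed body for the helper named in the quoted diff: approximate
 * microseconds -> milliseconds by dividing by 1024 rather than 1000.
 * The sketch only handles non-negative input.
 */
int usec_to_binarymsec(int usec)
{
	return usec >> 10;
}
```

Under that assumption, half a second (500000 usec) comes out as 488 "binary milliseconds", and a threshold of 512 corresponds to 512 * 1024 = 524288 usec, a little over half a second. Either threshold works for a progress display; the difference is only in how the number reads.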
* Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Junio C Hamano @ 2006-02-12 2:46 UTC (permalink / raw)
To: Johannes Schindelin, Alexandre Julliard; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0602120254260.10235@wbgn013.biozentrum.uni-wuerzburg.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> In a simple test, this brings down the CPU time from 47 sec to 22 sec.
I was planning to take Alexandre's patch, but the approach your
patch takes feels more correct -- it scales with the number of
objects you need to handle, instead of having fixed 256
hashbuckets.
BTW, your version dumped core in hashtable_index immediately
after I started "git-rev-list --objects HEAD". How did you get
_any_ CPU time?
I am not sure it is OK to expect, as your patch does, that object
name pointers are always (unsigned int *) aligned. We may want
to have something like the attached patch on top of yours.
I am also interested to find out how much the rehashing you do
when you update obj_allocs to a larger value is costing.
Alexandre, if you have a chance, could you try Johannes' patch
on your workload to see if it works OK for you?
-- >8 --
[PATCH] do not assume object name pointers are uint aligned.
Also fix an obvious bug that caused it to dump core at my first
attempt. There might be others but I did not actively look for
them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/object.c b/object.c
index 3259862..59e5e36 100644
--- a/object.c
+++ b/object.c
@@ -13,17 +13,24 @@ int track_object_refs = 1;
static int hashtable_index(const unsigned char *sha1)
{
- unsigned int i = *(unsigned int *)sha1;
- return (int)(i % obj_allocs);
+ int cnt;
+ unsigned int ix = *sha1++;
+
+ for (cnt = 1; cnt < sizeof(unsigned int); cnt++) {
+ ix <<= 8;
+ ix |= *sha1++;
+ }
+ return (int)(ix % obj_allocs);
}
static int find_object(const unsigned char *sha1)
{
- int i = hashtable_index(sha1);
+ int i;
if (!objs)
return -1;
+ i = hashtable_index(sha1);
while (objs[i]) {
if (memcmp(sha1, objs[i]->sha1, 20) == 0)
return i;
^ permalink raw reply related
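
The alignment concern in the patch above can be shown in isolation. This is a standalone sketch, not the code in object.c: sketch_hash_prefix() and sketch_hashtable_index() are hypothetical names, and the bucket count is passed as a parameter (nr_buckets, which must be non-zero) instead of being read from the obj_allocs global.

```c
#include <stddef.h>

/*
 * Fold the first sizeof(unsigned int) bytes of an object name into an
 * integer one byte at a time, so the pointer does not need to be
 * aligned for unsigned int access.
 */
unsigned int sketch_hash_prefix(const unsigned char *sha1)
{
	unsigned int ix = 0;
	size_t n;

	for (n = 0; n < sizeof(unsigned int); n++)
		ix = (ix << 8) | sha1[n];
	return ix;
}

/* Reduce the folded prefix to a bucket index; nr_buckets must be > 0. */
int sketch_hashtable_index(const unsigned char *sha1, unsigned int nr_buckets)
{
	return (int)(sketch_hash_prefix(sha1) % nr_buckets);
}
```

Because the bytes are combined explicitly, the result is the same whatever the address of the name and whatever the host byte order, at the cost of a few shifts per lookup compared with a single word load.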