* Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Johannes Schindelin @ 2006-02-12 14:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Alexandre Julliard, git, Linus Torvalds
In-Reply-To: <7vaccwdbfs.fsf@assigned-by-dhcp.cox.net>
Hi,
On Sun, 12 Feb 2006, Junio C Hamano wrote:
> Alexandre Julliard <julliard@winehq.org> writes:
>
> > Junio C Hamano <junkio@cox.net> writes:
> >
> >> Alexandre, if you have a chance, could you try Johannes' patch
> >> on your workload to see if it works OK for you?
> >
> > It works great for me, CPU time is down to 15 sec instead of 20 sec
> > with my patch.
>
> Thanks. Now we have three independent numbers to back up that
> Johannes is the winner....
>
> Grrrrrrr. Please, DO NOT USE THIS ONE YET.
>
> At least, not with your production repository.
>
> I am trying to nail it down but it appears at least fsck-objects
> using this version gives bogus results. I am first trying to
> see if my primary working repository is sane.
>
> Oh, and thanks again for your initial patch, which was what
> started this drastic improvement.
I am sorry! I tested fsck, but only *once*, since I did not think such a
creepy bug was in there. And then, I had to run to sing Beethoven's Missa
Solemnis, and missed all the action about this patch.
Just a few remarks around the comments in this thread:
- the doubling of obj_allocs is arbitrary. Originally, I planned to do the
growing much faster, which would have been helped by the fact. But it
turned out my thinking was defective. So, you can grow the hashtable by
whatever you like (doubling is quite effective, though).
- hashtable has expected O(1) insertion, and that is what boosts the
performance. Since growing the table is linear in the number of objects
(both in size and in computing time), and all operations afterwards are
linear on the table, *and the hash is already computed*, the hashtable is
preferable to other data structures (a sorted list has O(n) insertion
time, and a tree still O(log n)).
- the bug Junio fixed was not triggered here, since I did all the testing
on my venerable iBook. The PowerPC architecture evidently aligns
all pointers to 32-bit, so I could reinterpret the pointer as a pointer to
an unsigned int. Note that there is a small overhead in Junio's version, but
it is probably not worth the hassle to make that a compile time option.
But I agree with Florian that memcpy would be more efficient.
- Arithmetic and Boolean operations on 32-bit integers typically are
handled very efficiently in modern 32-bit CPUs, so there should be no
reason to use "&" instead of "%" (especially since understanding the code
wouldn't be helped by that).
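To illustrate the alignment remark above, here is a minimal sketch (not the code from the patch) of deriving a slot index from the leading bytes of an object name with memcpy() instead of casting the pointer. The names slot_index and table_size are invented for this example, and the 4-byte width of unsigned int is an assumption.

```c
#include <string.h>

/*
 * Hypothetical helper: derive a slot index from the leading bytes of
 * an object name.  memcpy() works even when "sha1" is not aligned for
 * unsigned int, whereas dereferencing (unsigned int *)sha1 may trap
 * on architectures that require aligned loads.
 */
unsigned int slot_index(const unsigned char *sha1, unsigned int table_size)
{
	unsigned int h;

	memcpy(&h, sha1, sizeof(h));
	return h % table_size;
}
```

The result depends on the byte order of the machine, which is harmless as long as the table is never shared between machines.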
Ciao,
Dscho
^ permalink raw reply
* [PATCH] Properly git-bisect reset after bisecting from non-master head
From: Petr Baudis @ 2006-02-12 16:06 UTC (permalink / raw)
To: junkio; +Cc: git
git-bisect reset without an argument would return to master even
if the bisecting started at a non-master branch. This patch makes
it save the original branch name to .git/head-name and restore it
afterwards.
This is also compatible with Cogito and cg-seek, so cg-status will
show that we are seeked on the bisect branch and cg-reset will
properly restore the original branch.
git-bisect start will refuse to work if it is not on a bisect but
.git/head-name exists; this is to protect against conflicts with
other seeking tools.
Signed-off-by: Petr Baudis <pasky@suse.cz>
---
commit 143fc0c9a04ca38a70fbd882e38620f566415b6c
tree c93f93a984c00cfa08bbb9cb46bd1c6ba2c82de0
parent 8dcc626cd144b2c6eae2a299242bbbe905cb0059
author Petr Baudis <pasky@suse.cz> Sun, 12 Feb 2006 16:57:39 +0100
committer Petr Baudis <xpasky@machine.or.cz> Sun, 12 Feb 2006 16:57:39 +0100
git-bisect.sh | 17 ++++++++++++++---
1 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/git-bisect.sh b/git-bisect.sh
index 51e1e44..3c024aa 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -49,9 +49,16 @@ bisect_start() {
die "Bad HEAD - I need a symbolic ref"
case "$head" in
refs/heads/bisect*)
- git checkout master || exit
+ if [ -s "$GIT_DIR/head-name" ]; then
+ branch=`cat "$GIT_DIR/head-name"`
+ else
+ branch=master
+ fi
+ git checkout $branch || exit
;;
refs/heads/*)
+ [ -s "$GIT_DIR/head-name" ] && die "won't bisect on seeked tree"
+ echo "$head" | sed 's#^refs/heads/##' >"$GIT_DIR/head-name"
;;
*)
die "Bad HEAD - strange symbolic ref"
@@ -159,7 +166,11 @@ bisect_visualize() {
bisect_reset() {
case "$#" in
- 0) branch=master ;;
+ 0) if [ -s "$GIT_DIR/head-name" ]; then
+ branch=`cat "$GIT_DIR/head-name"`
+ else
+ branch=master
+ fi ;;
1) test -f "$GIT_DIR/refs/heads/$1" || {
echo >&2 "$1 does not seem to be a valid branch"
exit 1
@@ -170,7 +181,7 @@ bisect_reset() {
esac
git checkout "$branch" &&
rm -fr "$GIT_DIR/refs/bisect"
- rm -f "$GIT_DIR/refs/heads/bisect"
+ rm -f "$GIT_DIR/refs/heads/bisect" "$GIT_DIR/head-name"
rm -f "$GIT_DIR/BISECT_LOG"
}
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe. -- Douglas Adams
^ permalink raw reply related
* [ANNOUNCE] Cogito-0.17rc1
From: Petr Baudis @ 2006-02-12 17:11 UTC (permalink / raw)
To: git
Hello,
I'm announcing the release of Cogito version 0.17rc1, the human-friendly
version control UI for Linus' GIT tool. Share and enjoy at:
http://www.kernel.org/pub/software/scm/cogito/
This isn't as heavy on bugfixes as 0.16rc1 was, since most of the
bugfixes already went to 0.16 minor releases. This worked out pretty
well and 0.16.4 is really quite stable; so I will do the same thing for
0.17 as well.
Still, there is a huge amount of new features and cool stuff. The
highlight is cg-switch for switching between local branches and massive
cg-patch improvements, but there is plenty of other stuff as well. Read
on for more details.
Note that there is a lot of new stuff inside and some of it went in
quite recently, so it didn't have a lot of time to get tested by the
bleeding edge users. Therefore take care, bugs might be lurking around.
The notable new stuff includes:
* cg-switch - Cogito finally gives you the full convenience of
multiple local branches in a single repository ;)
* cg-patch -c, -C, -d - Cogito now supports cherrypicking, easy commit
reverts and automatic committing of applied patches
* Resumable cg-clone - if cg-clone fails in the middle of the initial
fetch, the directory is not deleted and you do not have to start all
over again - just cd inside and run cg-fetch and it will DTRT
* Support for tracking rebasing branches; as long as you use cg-update
(NOT cg-fetch + cg-merge) and won't commit local changes, Cogito
will correctly update the branch even if it got rebased in the
meanwhile
* Quoting fixes - this means that Cogito should be now theoretically
100% resilient to whitespaces and metacharacters in filenames etc.
Note that filenames containing newlines still aren't supported and
aren't likely to ever be. You are a loonie. Go away.
* Radically improved cg-fetch progressbar; it still doesn't quite work
with rsync (use cg-fetch -v -v), but I don't think that can be
helped. The main advantage is that it will show HTTP fetch progress
even when fetching large files (especially packs).
* Significant merges speedup (but still quite some potential for
improvement)
* cg-* --help now by default shows only short help; use --long-help
to see the full manual
* cg-commit --signoff
* cg-commit --review to review and even modify the patch you are
committing
* bash commandline autocompletion files in contrib/
* cg-fetch -v, cg-fetch -v -v, cg-merge -v, cg-update -v
* cg-push -r to push a different branch (or even a specific commit)
instead of your current branch
* cg-rm -r for recursive directories removal
* cg-mv trivial wrapper for git-mv
* cg-push over HTTP
* cg-patch -u for applying non-git patches while autoadding/removing
files, cg-patch -pN with obvious meaning
* cg-object-id -d for short human-readable commit string id
(just wraps git-describe)
* Too many minor new features to list here
* Incompatible change - the post-commit hook won't be run for all the
merged commits anymore when you commit a merge; you can reenable
that in .git/config, see the cg-commit documentation for details
P.S.: See us at #git @ FreeNode!
Happy hacking,
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
^ permalink raw reply
* Re: ***DONTUSE*** Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Johannes Schindelin @ 2006-02-12 17:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7virrkbsgp.fsf@assigned-by-dhcp.cox.net>
Hi,
On Sun, 12 Feb 2006, Junio C Hamano wrote:
> The problem does not seem to trigger with casual use, but I
> found that with a clone from my primary repository with '-l -s'
> (that is, a clone that uses alternates mechanism to borrow from
> my primary repository), fsck-objects built with the patch seems
> to report bogus things "missing". I have not traced it fully;
> instead I ended up spending most of the night (I noticed it at
> around 01:30 and now it is 05:30 so that's about four hours)
> recovering some of my refs and double checking if my primary
> repository is not corrupt X-<. At least, the primary repository
> looks sane now.
Could it be the shallow thing?
+
+void for_each_object(void (*fn)(struct object *))
+{
+ int i;
+ for (i = 0; i < nr_objs; i++)
+ fn(objs[i]);
+}
This will not work with the hashtable. It has to be something like
	void for_each_object(void (*fn)(struct object *))
	{
		int i;
		for (i = 0; i < obj_allocs; i++)
			if (objs[i])
				fn(objs[i]);
	}
This error did not trigger here, since I don't use shallow clones.
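The point about iterating a hash table can be shown in isolation. The following is a minimal sketch under stated assumptions: struct object is reduced to a stand-in, the table and its size are passed as parameters rather than taken from the globals objs and obj_allocs, and the names for_each_slot, count_object and visited are invented for the example.

```c
#include <stddef.h>

struct object {
	unsigned int id;	/* stand-in for the real object fields */
};

/* Sample callback state: how many objects have been visited. */
int visited;

void count_object(struct object *obj)
{
	(void)obj;
	visited++;
}

/*
 * Visit every object stored in a hash table with "allocs" slots.
 * Occupied slots are scattered, so the loop has to cover all slots
 * and skip the NULL ones; looping up to the number of stored objects
 * would miss entries and could hand NULL to the callback.
 */
void for_each_slot(struct object **table, int allocs,
		   void (*fn)(struct object *))
{
	int i;

	for (i = 0; i < allocs; i++)
		if (table[i])
			fn(table[i]);
}
```

With two objects stored in slots 3 and 6 of an 8-slot table, for_each_slot(table, 8, count_object) calls the callback exactly twice.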
Again, sorry for the inconvenience, Junio.
Hth,
Dscho
^ permalink raw reply
* [PATCH] fix "test: 2: unexpected operator" on bsd
From: Alex Riesen @ 2006-02-12 18:03 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
---
t/t0000-basic.sh | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
4ce9a3f9fe30e091411811a905cccbffcb45ee58
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index bc3e711..c339a36 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -33,7 +33,7 @@ then
fi
merge >/dev/null 2>/dev/null
-if test $? == 127
+if test $? = 127
then
echo >&2 'You do not seem to have "merge" installed.
Please check INSTALL document.'
--
1.1.6.gd46b
^ permalink raw reply related
* Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 18:04 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List, Johannes Schindelin
The hashed object lookup had a subtle bug in re-hashing: it did
	for (i = 0; i < count; i++)
		if (objs[i]) {
			.. rehash ..
where "count" was the old hash count. On the face of it, that is obvious, since
it clearly re-hashes all the old objects.
However, it's wrong.
If the last old hash entry before re-hashing was in use (or became in use
by the re-hashing), then re-hashing could have inserted an object
into the hash entries with idx >= count due to overflow. When we then
rehash the last old entry, that old entry might become empty, which means
that the overflow entries should be re-hashed again.
In other words, the loop has to be fixed to traverse the whole
array, rather than just the old count.
(There's room for a slight optimization: instead of counting all the way
up, we can break when we see the first empty slot that is above the old
"count". At that point we know we don't have any collisions that we might
have to fix up any more. This patch only does the trivial fix)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
I actually didn't see any of this trigger in real life, so maybe my
analysis is wrong. Junio? Johannes?
diff --git a/object.c b/object.c
index 59e5e36..aeda228 100644
--- a/object.c
+++ b/object.c
@@ -65,7 +65,7 @@ void created_object(const unsigned char
objs = xrealloc(objs, obj_allocs * sizeof(struct object *));
memset(objs + count, 0, (obj_allocs - count)
* sizeof(struct object *));
- for (i = 0; i < count; i++)
+ for (i = 0; obj_allocs ; i++)
if (objs[i]) {
int j = find_object(objs[i]->sha1);
if (j != i) {
^ permalink raw reply related
* [PATCH] avoid echo -e, there are systems where it does not work
From: Alex Riesen @ 2006-02-12 18:05 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
FreeBSD 4.11 being one example: the built-in echo doesn't have -e,
and the installed /bin/echo does not do "-e" either.
"printf" works, lacking just "\e" and "\xAB".
---
git-tag.sh | 3 ++-
t/t3001-ls-files-others-exclude.sh | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
b155e96dcb988b2533e8546666d0f045a36f594d
diff --git a/git-tag.sh b/git-tag.sh
index 6d0c973..c74e1b4 100755
--- a/git-tag.sh
+++ b/git-tag.sh
@@ -85,7 +85,8 @@ if [ "$annotate" ]; then
exit 1
}
- ( echo -e "object $object\ntype $type\ntag $name\ntagger $tagger\n";
+ ( printf 'object %s\ntype %s\ntag %s\ntagger %s\n\n' \
+ "$object" "$type" "$name" "$tagger";
cat "$GIT_DIR"/TAG_FINALMSG ) >"$GIT_DIR"/TAG_TMP
rm -f "$GIT_DIR"/TAG_TMP.asc "$GIT_DIR"/TAG_FINALMSG
if [ "$signed" ]; then
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index fde2bb2..6979b7c 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -68,7 +68,7 @@ test_expect_success \
diff -u expect output'
# Test \r\n (MSDOS-like systems)
-echo -ne '*.1\r\n/*.3\r\n!*.6\r\n' >.gitignore
+printf '*.1\r\n/*.3\r\n!*.6\r\n' >.gitignore
test_expect_success \
'git-ls-files --others with \r\n line endings.' \
--
1.1.6.gd46b
^ permalink raw reply related
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 18:10 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List, Johannes Schindelin
In-Reply-To: <Pine.LNX.4.64.0602120956130.3691@g5.osdl.org>
On Sun, 12 Feb 2006, Linus Torvalds wrote:
>
> I actually didn't see any of this trigger in real life, so maybe my
> analysis is wrong. Junio? Johannes?
Btw, if it does trigger, the behaviour would be that a subsequent object
lookup will fail, because the last old slot would be NULL, and a few
entries following it (likely just a couple - never mind that the event
triggering in the first place is probably fairly rare) wouldn't have
gotten re-hashed down.
As a result, we'd allocate a new object, and have _two_ "struct object"s
that describe the same real object. I don't know what would get upset, but
git-fsck-index certainly would be (one of them would likely be marked
unreachable, because lookup wouldn't find it, but you might have other
issues too).
Linus
^ permalink raw reply
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 18:16 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List, Johannes Schindelin
In-Reply-To: <Pine.LNX.4.64.0602120956130.3691@g5.osdl.org>
On Sun, 12 Feb 2006, Linus Torvalds wrote:
> - for (i = 0; i < count; i++)
> + for (i = 0; obj_allocs ; i++)
GAAH.
That should obviously be "i < obj_allocs".
That's what I get for editing the patch in-place to remove the optimized
version that I felt wasn't worth worrying about due to being subtle. So
instead I sent out a patch that was not-so-subtly obvious crap!
Sorry about that.
Linus
^ permalink raw reply
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 18:18 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List, Johannes Schindelin
In-Reply-To: <Pine.LNX.4.64.0602121015020.3691@g5.osdl.org>
On Sun, 12 Feb 2006, Linus Torvalds wrote:
>
> That's what I get for editing the patch in-place to remove the optimized
> version that I felt wasn't worth worrying about due to being subtle. So
> instead I sent out a patch that was not-so-subtly obvious crap!
Btw: the reason I edited out the optimization is that it doesn't actually
matter. Re-hashing the whole thing is a trivial thing, and has basically
zero overhead in my testing. The costs are all elsewhere now.
Linus
^ permalink raw reply
* Re: Fix object re-hashing
From: Junio C Hamano @ 2006-02-12 18:32 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602121006360.3691@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> On Sun, 12 Feb 2006, Linus Torvalds wrote:
>>
>> I actually didn't see any of this trigger in real life, so maybe my
>> analysis is wrong. Junio? Johannes?
>
> Btw, if it does trigger, the behaviour would be that a subsequent object
> lookup will fail, because the last old slot would be NULL, and a few
> entries following it (likely just a couple - never mind that the event
> triggering in the first place is probably fairly rare) wouldn't have
> gotten re-hashed down.
>
> As a result, we'd allocate a new object, and have _two_ "struct object"s
> that describe the same real object. I don't know what would get upset, but
> git-fsck-index certainly would be (one of them would likely be marked
> unreachable, because lookup wouldn't find it, but you might have other
> issues too).
This "fix" makes the symptom that made me fire two (maybe three)
Grrrrr messages earlier this morning disappear. I haven't had
my caffeine nor nicotine yet after my short sleep, so I need to
take some time understanding your explanation first, but I am
reasonably sure this must be it (not that I do not trust you,
not at all -- it is that I do not trust *me* applying a patch
without understanding when I have a bug reproducible).
Thanks.
^ permalink raw reply
* Re: ***DONTUSE*** Re: [PATCH] Use a hashtable for objects instead of a sorted list
From: Junio C Hamano @ 2006-02-12 18:34 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0602121822060.21959@wbgn013.biozentrum.uni-wuerzburg.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Could it be the shallow thing?
No, I do not think so. In the current "master", "next", "pu"
picture, "pu" merges js/obj with the "8856cc69" commit and then
shallow and bind comes later. That merged version is what I am
testing, and also its second parent, which is "master" plus
hashtable. In either case, shallow and bind are not part of the
picture.
> Again, sorry for the inconvenience, Junio.
That is not your fault, and I did not mean to sound like I am unhappy
about *you*. I am indeed unhappy because I did not see anything
obviously or subtly wrong with your patch after I moved the call to
hashtable_index() in find_object().
But I think Linus nailed it. I'll see if I can understand his
explanation.
^ permalink raw reply
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 18:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vaccwbf6v.fsf@assigned-by-dhcp.cox.net>
On Sun, 12 Feb 2006, Junio C Hamano wrote:
>
> This "fix" makes the symptom that made me fire two (maybe three)
> Grrrrr messages earlier this morning disappear.
Goodie. I assume that was the fixed fix, not my original "edit out the
useless optimization and then break it totally" fix ;)
> I haven't had my caffeine nor nicotine yet after my short sleep, so I
> need to take some time understanding your explanation first, but I am
> reasonably sure this must be it (not that I do not trust you, not at all
> -- it is that I do not trust *me* applying a patch without understanding
> when I have a bug reproducible).
The basic notion is that this hashing algorithm uses a normal "linear
probing" overflow approach, which basically means that overflows in
a hash bucket always just probe the next few buckets to find an empty one.
That's a really simple (and fairly cache-friendly) approach, and it makes
tons of sense, especially since we always re-size the hash to guarantee
that we'll have empty slots. It's a bit more subtle - especially when
re-hashing - than the probably more common "collision chain" approach,
though.
Now, when we re-hash, the important rule is:
- the re-hashing has to walk in the same direction as the overflow.
This is important, because when we move a hashed entry, that automatically
means that even otherwise _already_correctly_ hashed entries may need to
be moved down (ie even if their "inherent hash" does not change, their
_effective_ hash address changes because their overflow position needs to
be fixed up).
There are two interesting cases:
- the "overflow of the overflow": when the linear probing itself
overflows the size of the hash queue, it will "change direction" by
overflowing back to index zero.
Happily, the re-hashing does not need to care about this case, because
the new hash is bigger: the rule we have when doing the re-hashing is
that as we re-hash, the "i" entries we have already re-hashed are all
valid in the new hash, so even if overflow occurs, it will occur the
right way (and if it overflows all the way past the current "i", we'll
re-hash the already re-hashed entry anyway).
- the old/new border case. In particular, the trivial logic says that we
only need to re-hash entries that were hashed with the old hash. That's
what the broken code did: it only traversed "0..oldcount-1", because
any entries that had an index bigger than or equal to "oldcount" were
obviously _already_ re-hashed.
That logic sounds obvious, but it falls down on exactly the fact that
we may indeed have to re-hash even entries that already were re-hashed
with the new algorithm, exactly because of the overflow changes.
So the boundary for old/new is really: "you need to rehash all entries
that were old, but then you _also_ need to rehash the list of entries that
you rehashed that might need to be moved down to an empty spot vacated by
an old hash".
So the stop condition really ends up being: "stop when you have seen all
old hash entries _and_ at least one empty entry after that", since an
empty entry means that there was no overflow from earlier positions past
that position. But it's just simpler to walk the whole damn new thing and
not worry about it.
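The stop condition described above could be sketched as follows. This is an illustration under simplifying assumptions, not the code from object.c: the table holds plain non-negative integers, -1 marks an empty slot, the table always keeps at least one empty slot, and the names find_slot and rehash_until_gap are invented. The "-1 - slot" return convention is likewise an assumption of this sketch.

```c
#define EMPTY (-1)

/*
 * Linear probing: return the slot holding "key", or, if it is absent,
 * the first empty slot on its probe path encoded as -1 - slot.
 */
int find_slot(const int *table, int size, int key)
{
	int i = key % size;

	while (table[i] != EMPTY) {
		if (table[i] == key)
			return i;
		if (++i == size)
			i = 0;
	}
	return -1 - i;
}

/*
 * Rehash in place after the table grew from old_size to new_size
 * slots.  Walk upwards, the same direction as the overflow, and stop
 * at the first empty slot at or above old_size: nothing can have
 * overflowed past such a gap.  Returns the index where the walk ended.
 */
int rehash_until_gap(int *table, int old_size, int new_size)
{
	int i, j;

	for (i = 0; i < new_size; i++) {
		if (table[i] == EMPTY) {
			if (i >= old_size)
				break;
			continue;
		}
		j = find_slot(table, new_size, table[i]);
		if (j != i) {
			/* move the entry to the free slot on its probe path */
			table[-1 - j] = table[i];
			table[i] = EMPTY;
		}
	}
	return i;
}
```

For a 4-slot table holding 3 in slot 0 and 7 in slot 3, grown to 8 slots, the walk ends at slot 5 with 3 in slot 3 and 7 in slot 7.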
Linus
^ permalink raw reply
* Re: Configuration file musings
From: Junio C Hamano @ 2006-02-12 18:56 UTC (permalink / raw)
To: Mark Wooding; +Cc: git
In-Reply-To: <slrnduuf08.518.mdw@metalzone.distorted.org.uk>
Mark Wooding <mdw@distorted.org.uk> writes:
> Having thought about things a bit, I've reached the conclusion that the
> configuration file $GIT_DIR/config is trying to hold (at least) three
> entirely different kinds of configuration.
Yes. I've been somewhat bothered by that, but I did not ramble
on it too much because I could not say what exactly is _wrong_
to have these things in the same file.
Except perhaps that pure user configuration could be further
split out from per-repository user configuration to .gitconfig
under $HOME or something like that. The former is things like
"I am Junio C Hamano no matter which project I work on" the
latter is for example "The e-mail address I use for this project
is <junkio@cox>". But once we start dealing with more than one
configuration file, you need one file taking precedence over the
other and it appeared there is no single _right_ order. The
above identity case is an example that it is better to make the
config in repository to override $HOME one, but if you think
long enough I am sure you can come up with an example that you
would want to override repository one from $HOME one.
^ permalink raw reply
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 19:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602121037460.3691@g5.osdl.org>
On Sun, 12 Feb 2006, Linus Torvalds wrote:
>
> - the "overflow of the overflow": when the linear probing itself
> overflows the size of the hash queue, it will "change direction" by
> overflowing back to index zero.
>
> Happily, the re-hashing does not need to care about this case, because
> the new hash is bigger: the rule we have when doing the re-hashing is
> that as we re-hash, the "i" entries we have already re-hashed are all
> valid in the new hash, so even if overflow occurs, it will occur the
> right way (and if it overflows all the way past the current "i", we'll
> re-hash the already re-hashed entry anyway).
Btw, this is only always true if the new hash is at least twice the size
of the old hash, I think. Otherwise a re-hash can fill up the new entries
and overflow entirely before we've actually even re-hashed all the old
entries, and then we'd need to re-hash even the overflowed entries (which
are now below "i").
If the new size is at least twice the old size, the "upper area" cannot
overflow completely (there has to be empty room), and we cannot be in the
situation that we need to move even the overflowed entries when we remove
an old hash entry.
Anyway, if all this makes you nervous, the conceptually much simpler way
to do the re-sizing is to not do the in-place re-hashing. Instead of doing
the xrealloc(), just do a "xmalloc()" of the new area, do the re-hashing
(which now _must_ re-hash in just the "0..oldcount-1" old area) into the
new area, and then free the old area after rehashing.
That would make things more obviously correct, and perhaps simpler.
Johannes, do you want to try that?
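A minimal sketch of that alternative, assuming a table of plain non-negative integers with -1 as the empty marker, and using malloc()/free() in place of git's xmalloc(); the names insert_key and grow_table are invented for the example.

```c
#include <stdlib.h>

#define EMPTY (-1)

/* Insert "key" into a table that is known to have a free slot. */
void insert_key(int *table, int size, int key)
{
	int i = key % size;

	while (table[i] != EMPTY)
		if (++i == size)
			i = 0;
	table[i] = key;
}

/*
 * Grow by building a fresh table instead of rehashing in place.
 * Only slots 0..old_size-1 of the old table are read, and entries
 * never have to be moved twice.  Returns the new table, or NULL if
 * the allocation failed (the old table is then left untouched).
 */
int *grow_table(int *old, int old_size, int new_size)
{
	int i;
	int *fresh = malloc((size_t)new_size * sizeof(*fresh));

	if (!fresh)
		return NULL;
	for (i = 0; i < new_size; i++)
		fresh[i] = EMPTY;
	for (i = 0; i < old_size; i++)
		if (old[i] != EMPTY)
			insert_key(fresh, new_size, old[i]);
	free(old);
	return fresh;
}
```

The cost is that both tables exist at the same time during the resize, which the in-place version avoids.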
Btw, as it currently stands, I worry a tiny tiny bit about the
	obj_allocs = (obj_allocs < 32 ? 32 : 2 * obj_allocs)
thing, because I think that second "32" needs to be a "64" to be really
safe (ie guarantee that the new obj_allocs value is always at least twice
the old one).
Anyway, I'm pretty sure people smarter than me have already codified
exactly what needs to be done for an in-place rehash of a linear probe hash
overflow algorithm. This must all be in some "hashing 101" book. I had to
think it through from first principles rather than "knowing" what the
right answer was (which probably means that I slept through some
fundamental algorithms class in University ;)
Linus
^ permalink raw reply
* Re: Fix object re-hashing
From: Junio C Hamano @ 2006-02-12 19:13 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602121037460.3691@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> On Sun, 12 Feb 2006, Junio C Hamano wrote:
>>
>> This "fix" makes the symptom that made me fire two (maybe three)
>> Grrrrr messages earlier this morning disappear.
>
> Goodie. I assume that was the fixed fix, not my original "edit out the
> useless optimization and then break it totally" fix ;)
>
>> I haven't had my caffeine nor nicotine yet after my short sleep, so I
>> need to take some time understanding your explanation first, but I am
>> reasonably sure this must be it (not that I do not trust you, not at all
>> -- it is that I do not trust *me* applying a patch without understanding
>> when I have a bug reproducible).
Your explanation finally made sense to me, without caffeine nor
nicotine yet, but only when I tried to draw an illustration.
If the initial obj_allocs were 4 instead of 32, we may have
something like this before rehashing.
    slot  value
    0     3
    1     -
    2     -
    3     7
Rehash to double the hash goes like this:
          step1    step2    step3    fixup rehash
          enlarge  rehash   rehash   missing from
          array    "3%8"    "7%8"    the original
    0     3        -        -        -
    1     -        -        -        -
    2     -        -        -        -
    3     7        7        -        3
    4     -        3        3        -
    5     -        -        -        -
    6     -        -        -        -
    7     -        -        7        7
We cannot find "3%8" without the fix.
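The illustration can be replayed in a few lines of C. This is a minimal sketch rather than the code from object.c: it stores plain non-negative integers, uses -1 for an empty slot, and the names find_slot, rehash and lookup_3_after_rehash are invented for the example. The "-1 - slot" return value for a missing key is an assumption of the sketch.

```c
#define EMPTY (-1)

/*
 * Linear probing: return the slot holding "key", or, if it is absent,
 * the first empty slot on its probe path encoded as -1 - slot.
 * The table must always contain at least one empty slot.
 */
int find_slot(const int *table, int size, int key)
{
	int i = key % size;

	while (table[i] != EMPTY) {
		if (table[i] == key)
			return i;
		if (++i == size)
			i = 0;
	}
	return -1 - i;
}

/*
 * Rehash in place after the table grew to "size" slots, visiting
 * slots 0..limit-1.  limit == old size is the broken loop,
 * limit == size is the fixed one.
 */
void rehash(int *table, int size, int limit)
{
	int i, j;

	for (i = 0; i < limit; i++) {
		if (table[i] == EMPTY)
			continue;
		j = find_slot(table, size, table[i]);
		if (j != i) {
			table[-1 - j] = table[i];
			table[i] = EMPTY;
		}
	}
}

/*
 * The example above: 3 and 7 in a 4-slot table, enlarged to 8 slots.
 * Returns the slot where 3 is found afterwards, or a negative value
 * if the lookup fails.
 */
int lookup_3_after_rehash(int limit)
{
	int table[8] = { 3, EMPTY, EMPTY, 7, EMPTY, EMPTY, EMPTY, EMPTY };

	rehash(table, 8, limit);
	return find_slot(table, 8, 3);
}
```

With limit set to the old size (4), 3 is left stranded in slot 4 behind an empty slot 3 and the lookup fails; with limit set to the new size (8), 3 is moved back down and found in slot 3, matching the last column of the table.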
Thanks for the fix. Will do an updated "master" soon.
^ permalink raw reply
* Re: Fix object re-hashing
From: Junio C Hamano @ 2006-02-12 19:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602121101040.3691@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> Anyway, if all this makes you nervous,...
I did draw an illustration like the one I sent in my previous
message when I received the first patch from Johannes, and it was
reasonably obvious to me that it was meant to redistribute about
half of the existing entries to the upper area, always going
upwards, so modulo that wraparound corner case you fixed, I
think doubling is fine.
> Btw, as it currently stands, I worry a tiny tiny bit about the
>
> obj_allocs = (obj_allocs < 32 ? 32 : 2 * obj_allocs)
>
> thing, because I think that second "32" needs to be a "64" to be really
> safe (ie guarantee that the new obj_allocs value is always at least twice
> the old one).
obj_allocs starts out as 0 so the first value it gets is 32 when
you need to insert the first element.
^ permalink raw reply
* Re: [PATCH] Properly git-bisect reset after bisecting from non-master head
From: Junio C Hamano @ 2006-02-12 19:33 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
In-Reply-To: <20060212160614.GV31278@pasky.or.cz>
Petr Baudis <pasky@suse.cz> writes:
> diff --git a/git-bisect.sh b/git-bisect.sh
> index 51e1e44..3c024aa 100755
> --- a/git-bisect.sh
> +++ b/git-bisect.sh
> @@ -49,9 +49,16 @@ bisect_start() {
> die "Bad HEAD - I need a symbolic ref"
> case "$head" in
> refs/heads/bisect*)
> - git checkout master || exit
> + if [ -s "$GIT_DIR/head-name" ]; then
> + branch=`cat "$GIT_DIR/head-name"`
> + else
> + branch=master
> + fi
> + git checkout $branch || exit
> ;;
> refs/heads/*)
> + [ -s "$GIT_DIR/head-name" ] && die "won't bisect on seeked tree"
> + echo "$head" | sed 's#^refs/heads/##' >"$GIT_DIR/head-name"
> ;;
Hmph. It seems that $GIT_DIR/head-name might want to be a
symbolic ref?
But other than that the patch looks sane, being able to go back
to the original branch, and preventing starting to bisect while
bisecting are useful and safe changes. Thanks.
^ permalink raw reply
* Re: Fix object re-hashing
From: Linus Torvalds @ 2006-02-12 19:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vlkwg9ye0.fsf@assigned-by-dhcp.cox.net>
On Sun, 12 Feb 2006, Junio C Hamano wrote:
>
> > Btw, as it currently stands, I worry a tiny tiny bit about the
> >
> > obj_allocs = (obj_allocs < 32 ? 32 : 2 * obj_allocs)
> >
> > thing, because I think that second "32" needs to be a "64" to be really
> > safe (ie guarantee that the new obj_allocs value is always at least twice
> > the old one).
>
> obj_allocs starts out as 0 so the first value it gets is 32 when
> you need to insert the first element.
Yes. The point being that the code is "conceptually wrong", not that it
doesn't work in practice. If we somehow could get into the situation that
we had a hash size of 31, resizing it to 32 would be incorrect.
Of course, if we just make it a rule that the hash size must always be a
power-of-two (add a comment, and enforce the rule by changing the modulus
into a bitwise "and"), then that issue too goes away.
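A sketch of what such a rule might look like, with invented names (next_table_size, slot_of) and the assumption that table sizes stay far below UINT_MAX / 2:

```c
/*
 * Pick the next table size: always a power of two, at least 32, and
 * at least twice "current", so a grown table is never less than
 * double the old one.
 */
unsigned int next_table_size(unsigned int current)
{
	unsigned int size = 32;

	while (size < 2 * current)
		size *= 2;
	return size;
}

/* With a power-of-two size, masking selects the same slot as "%". */
unsigned int slot_of(unsigned int hash, unsigned int size)
{
	return hash & (size - 1);
}
```

Under this rule a hypothetical table of 31 slots would grow to 64 rather than 32, which is the case the original expression gets "conceptually wrong".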
Linus
^ permalink raw reply
* Re: [PATCH] Properly git-bisect reset after bisecting from non-master head
From: Petr Baudis @ 2006-02-12 19:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhd749xtl.fsf@assigned-by-dhcp.cox.net>
Dear diary, on Sun, Feb 12, 2006 at 08:33:26PM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> Petr Baudis <pasky@suse.cz> writes:
>
> > diff --git a/git-bisect.sh b/git-bisect.sh
> > index 51e1e44..3c024aa 100755
> > --- a/git-bisect.sh
> > +++ b/git-bisect.sh
> > @@ -49,9 +49,16 @@ bisect_start() {
> > die "Bad HEAD - I need a symbolic ref"
> > case "$head" in
> > refs/heads/bisect*)
> > - git checkout master || exit
> > + if [ -s "$GIT_DIR/head-name" ]; then
> > + branch=`cat "$GIT_DIR/head-name"`
> > + else
> > + branch=master
> > + fi
> > + git checkout $branch || exit
> > ;;
> > refs/heads/*)
> > + [ -s "$GIT_DIR/head-name" ] && die "won't bisect on seeked tree"
> > + echo "$head" | sed 's#^refs/heads/##' >"$GIT_DIR/head-name"
> > ;;
>
> Hmph. It seems that $GIT_DIR/head-name might want to be a
> symbolic ref?
That probably isn't a bad idea per se, but I can't think of anything
which that would improve either, and this has the plus of being
compatible with Cogito.
Anyway, if you want a symref, you should probably give it a different
name.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe. -- Douglas Adams
^ permalink raw reply
* Re: [PATCH] Properly git-bisect reset after bisecting from non-master head
From: Junio C Hamano @ 2006-02-12 20:35 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
In-Reply-To: <20060212194146.GX31278@pasky.or.cz>
Petr Baudis <pasky@suse.cz> writes:
>> Hmph. It seems that $GIT_DIR/head-name might want to be a
>> symbolic ref?
>
> That probably isn't a bad idea per se, but I can't think of anything
> which that would improve either, and this has the plus of being
> compatible with Cogito.
Ah, I missed that part, that "head-name" is being used by Cogito
and you used that format.
^ permalink raw reply
* Re: [PATCH] fetch-clone progress: finishing touches.
From: Linus Torvalds @ 2006-02-12 20:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vslqpjq2q.fsf@assigned-by-dhcp.cox.net>
On Sat, 11 Feb 2006, Junio C Hamano wrote:
>
> * While we are doing eye-candy, this makes the silence after
> "Generating pack..." part a bit more bearable.
>
> Likes, dislikes, too-much's?
Too little, actually.
Your change makes
git clone ssh://...
be silent again, until the download actually starts. The "isatty(2)" thing
in git-pack-objects won't trigger, because it's actually a socket, not a
tty ;/
ssh will only set up a pty pair if it starts an interactive shell, not if
you use the "ssh host cmd" form.
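The effect is easy to reproduce without ssh, because isatty() reports "not a terminal" for any pipe or socket. The following POSIX-only sketch uses a pipe as a stand-in for the connection; the function names progress_wanted and progress_wanted_on_pipe are invented, and the "show progress only on a tty" policy is the one under discussion, not necessarily what pack-objects ends up doing.

```c
#include <unistd.h>

/*
 * Return 1 if progress output should be shown on "fd" under a
 * "only when it is a terminal" policy, else 0.
 */
int progress_wanted(int fd)
{
	return isatty(fd) ? 1 : 0;
}

/*
 * Apply the policy to the write end of a fresh pipe, which is what a
 * remote command's stderr resembles when no pty was allocated.
 * Returns the policy's answer, or -1 if the pipe could not be created.
 */
int progress_wanted_on_pipe(void)
{
	int fds[2], wanted;

	if (pipe(fds) < 0)
		return -1;
	wanted = progress_wanted(fds[1]);
	close(fds[0]);
	close(fds[1]);
	return wanted;
}
```

The second function returns 0, which is why a check of this kind keeps the remote side silent.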
Linus
^ permalink raw reply
* Re: [PATCH] fetch-clone progress: finishing touches.
From: Junio C Hamano @ 2006-02-12 21:50 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0602121230370.3691@g5.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> ssh will only set up a pty pair if it starts an interactive shell, not if
> you use the "ssh host cmd" form.
True. Or we _could_ use "ssh -t", but I've decided to make
progress the default. If some script wants quiet behaviour they
can say 'pack-objects -q'.
^ permalink raw reply
* Re: [ANNOUNCE] GIT 1.2.0
From: H. Peter Anvin @ 2006-02-12 21:51 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, linux-kernel
In-Reply-To: <7vzmkw8d5b.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
> The latest feature release GIT 1.2.0 is available at the
> usual places:
>
> http://www.kernel.org/pub/software/scm/git/
>
> git-1.2.0.tar.{gz,bz2} (tarball)
> RPMS/$arch/git-*-1.2.0-1.$arch.rpm (RPM)
>
> Right now binary RPM is available only for x86_64, because I do
> not have an access to RPM capable i386 box.
>
You can build the i386 binary rpms on hera as such:
rpmbuild --rebuild --target i386 git-1.2.0-1.src.rpm
I had to install openssl-devel from the i386 distribution, which for
some reason wasn't part of the x86-64 distribution, but that's now taken
care of.
-hpa
^ permalink raw reply