Git development
 help / color / mirror / Atom feed
* Re: [PATCH] fix diff-delta bad memory access
From: Linus Torvalds @ 2006-05-10 19:01 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git, Randal L. Schwartz, Alex Riesen
In-Reply-To: <Pine.LNX.4.64.0605101311020.24505@localhost.localdomain>



Btw, Nico, that rabin hash code is _extremely_ confusing.

The hash entry pointers point to "data + RABIN_WINDOW", and then to make 
things even _more_ confusing, the hash calculation code is actually offset 
by one, so it will have computed the hash with

	val = ((val << 8) | data[i]) ^ T[val >> RABIN_SHIFT];

where "i" goes from _1_ to RABIN_WINDOW instead of 0..WINDOW-1.

So, if I read that correctly, the "entry->ptr" actually points not to the 
beginning of the data that was hashed, or even the end, but literally to 
the last byte of the data that was hashed in that window.

Isn't that just _really_ confusing?

Or is there some sense to this?

			Linus

^ permalink raw reply

* Re: [PATCH] fix diff-delta bad memory access
From: Nicolas Pitre @ 2006-05-10 19:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, Randal L. Schwartz, Alex Riesen
In-Reply-To: <Pine.LNX.4.64.0605101159160.3718@g5.osdl.org>

On Wed, 10 May 2006, Linus Torvalds wrote:

> 
> 
> Btw, Nico, that rabin hash code is _extremely_ confusing.

Possible.

> The hash entry pointers point to "data + RABIN_WINDOW", and then to make 
> things even _more_ confusing, the hash calculation code is actually offset 
> by one, so it will have computed the hash with
> 
> 	val = ((val << 8) | data[i]) ^ T[val >> RABIN_SHIFT];
> 
> where "i" goes from _1_ to RABIN_WINDOW instead of 0..WINDOW-1.
> 
> So, if I read that correctly, the "entry->ptr" actually points not to the 
> beginning of the data that was hashed, or even the end, but literally to 
> the last byte of the data that was hashed in that window.
> 
> Isn't that just _really_ confusing?
> 
> Or is there some sense to this?

Yes, ptr points to the last byte of the window for given hash value.

This is so because in the matching loop the window is scrolled byte by 
byte and the new hash value is always made of ROBIN_WINDOW-1 bytes which 
already have been processed (most probably added as literal bytes in the 
delta buffer) plus one new byte.  So it makes sense to start forward 
byte matching only from that last byte to find the longest source area 
to match, especially since all the other bytes in the window are likely 
to be identical already and comparing them repeatedly for entries with 
the same hash would be wasteful in most cases.

Further down, once the best offset with the longest match in the source 
buffer has been found, backward matching is performed to remove as much 
literal bytes that were added to the delta output as possible, which is 
most likely to catch the whole Robin window that hasn't been compared 
previously, but in that case the window content is compared only once.

And the reason why the reference hash is computed with an offset of 1 to 
RABIN_WINDOW inclusive in create_delta_index() is to allow for quick 
initialization of the Rabin window _outside_ of the main loop in 
create_delta().  There is a comment to that effect near the top of 
create_delta_index but probably a small reminder should be added in the 
index loop as well.


Nicolas

^ permalink raw reply

* Re: [PATCH] fix diff-delta bad memory access
From: Nicolas Pitre @ 2006-05-10 19:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, Randal L. Schwartz, Alex Riesen
In-Reply-To: <Pine.LNX.4.64.0605101515420.24505@localhost.localdomain>


And of course s/robin/rabin/ (I can't type RABIN without having my 
fingers decide on ROBIN by themselves).

On Wed, 10 May 2006, Nicolas Pitre wrote:

> On Wed, 10 May 2006, Linus Torvalds wrote:
> 
> > 
> > 
> > Btw, Nico, that rabin hash code is _extremely_ confusing.
> 
> Possible.
> 
> > The hash entry pointers point to "data + RABIN_WINDOW", and then to make 
> > things even _more_ confusing, the hash calculation code is actually offset 
> > by one, so it will have computed the hash with
> > 
> > 	val = ((val << 8) | data[i]) ^ T[val >> RABIN_SHIFT];
> > 
> > where "i" goes from _1_ to RABIN_WINDOW instead of 0..WINDOW-1.
> > 
> > So, if I read that correctly, the "entry->ptr" actually points not to the 
> > beginning of the data that was hashed, or even the end, but literally to 
> > the last byte of the data that was hashed in that window.
> > 
> > Isn't that just _really_ confusing?
> > 
> > Or is there some sense to this?
> 
> Yes, ptr points to the last byte of the window for given hash value.
> 
> This is so because in the matching loop the window is scrolled byte by 
> byte and the new hash value is always made of ROBIN_WINDOW-1 bytes which 
> already have been processed (most probably added as literal bytes in the 
> delta buffer) plus one new byte.  So it makes sense to start forward 
> byte matching only from that last byte to find the longest source area 
> to match, especially since all the other bytes in the window are likely 
> to be identical already and comparing them repeatedly for entries with 
> the same hash would be wasteful in most cases.
> 
> Further down, once the best offset with the longest match in the source 
> buffer has been found, backward matching is performed to remove as much 
> literal bytes that were added to the delta output as possible, which is 
> most likely to catch the whole Robin window that hasn't been compared 
> previously, but in that case the window content is compared only once.
> 
> And the reason why the reference hash is computed with an offset of 1 to 
> RABIN_WINDOW inclusive in create_delta_index() is to allow for quick 
> initialization of the Rabin window _outside_ of the main loop in 
> create_delta().  There is a comment to that effect near the top of 
> create_delta_index but probably a small reminder should be added in the 
> index loop as well.
> 
> 
> Nicolas
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Nicolas

^ permalink raw reply

* common URL for repository and gitweb
From: Martin Waitz @ 2006-05-10 23:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

hoi :)

I hacked a little bit in gitweb so that it can get the project path
form the URI without using a ?p= parameter.  That is, you can now
use "http://.../cgi-bin/gitweb.cgi/path/to/project/" to show
the summary of your project.

Together with a apache configuration like below, you can give your
gitweb pages the same URL as your repositories:

	<VirtualHost www:80>
		ServerName git.hostname.org
		DocumentRoot /pub/git
		RewriteEngine on
		RewriteRule ^/(.*\.git/(.*\.html)?)?$ /usr/lib/cgi-bin/gitweb.cgi%{REQUEST_URI}  [L]
	</VirtualHost>

This will rewrite all URLs that go to gitweb to use the CGI, while
leaving URLs for the repository intact.

You can see an example at http://git.admingilde.org/.
The gitweb version used for that is available here, too.

As an added bonus, gitweb can now serve the "html" branch of a
repository directly using "text/html", so you can show your
documentation without needing to update a checked out version
of this branch.
For example have a look at the GIT manpages at
http://git.admingilde.org/tali/git.git/git.html

-- 
Martin Waitz

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Martin Langhoff @ 2006-05-10 23:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: sean, junkio, Johannes.Schindelin, git
In-Reply-To: <Pine.LNX.4.64.0605100823350.3718@g5.osdl.org>

On 5/11/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Sure. It clearly _is_ a bike shed, which is why I posted patches: I think
> the way to resolve bike sheds is to "just do it". In the kernel, the

there's no disputing that!

> So don't knock the bike sheds. It's a BSD term, and I think there's a
> _reason_ why it's a BSD term. Those people seem to sometimes like to

Apologies -- I didn't want to know it, but I do wonder what the gain
behind the change is. It seems to me that it would be a bad idea to
store refs in the config file and, in my mind at least, branch info is
closer to refs.

> > As an end-user, I have personally stayed away from the increasingly
> > complex scheme for remotes waiting for it to settle, and stuck with
> > the old-styled .git/branches stuff which is slam-dunk simple and it
> > just works.
>
> It does work, and I think it's nice in many ways. It was certainly a good
> way to prototype things.
>
> At the same time, especially with moving things more to C (or almost any
> other language, for that matter - shell is really pretty special in making
> it _easier_ to have things in individual files), it's in many ways nicer
> to have a more structured representation that has a nice fixed interface
> and a library and known access methods behind it.

But it is a bit of a loss for perl/shell porcelains, and for users
that abuse the contents of .git directly on a regular basis...

there's nothing to stop us from having a structured representation of
what's in .git/branches/

> And I'm personally actually pretty fed up with the .git/branches/ and
> .git/remotes/ thing, partly because I can never remember which file is
> which. I had to look at the code of git-parse-remote.sh to remind me

Agreed, it's a mess, but I suspect it's still there to support cogito.
In keeping with the 'talk code' ethos, I'll work on adding support for
.git/remotes in cogito.

> And if we truly have separate files, we should go all the way, and have
> the good old "one file, one value" rule.

What about one semantics, one file? So far we have had 3 semantics:
git config, remotes, refs. And git config has been mostly
project-indepentent. In fact, I have been copying it across my
checkouts of different, unrelated projects. You just don't do that
with remotes or refs.

cheers,


martin

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Linus Torvalds @ 2006-05-10 23:55 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: sean, junkio, Johannes.Schindelin, git
In-Reply-To: <46a038f90605101617x1aa9bd2du959ead77ebf61795@mail.gmail.com>



On Thu, 11 May 2006, Martin Langhoff wrote:
> 
> Apologies -- I didn't want to know it, but I do wonder what the gain
> behind the change is.

I think we can do better in a few pretty important regards:

 - having the information in one place. I agree that the multi-file 
   approach works fine for shell scripts (although I disagree that the new 
   one would be harder - you just use git-repo-config instead), but I 
   think it's quite confusing from a new user perspective.

   I bet that even without any tools, new users can be told to just open 
   ".git/config", and guess how hard a time they would have to add a new 
   branch, if they already had one that said

	[branch "origin"]
		remote = git://git.kernel.org/pub/scm/git/git.git
		branch master

   which would tell you that the local branch "origin" is really branch
   "master" at that remote git repository.

   Yeah, I'm not sure what the actual config rules would be, but think it 
   would be a hell of a lot more intuitive than what we have now. 

   What we have now _works_. It works really well. No question about that. 
   It's just pretty hard to explain. The above syntax wouldn't even need 
   any explanation. You could just tell people to look into their config 
   files.

 - I think we'll have a much easier time (from a purely technical angle) 
   to add special attributes to the local branches. Add a "read-only" 
   specifier? It's _obvious_:

	[branch "origin"] 
		remote = git://git.kernel.org/pub/scm/git/git.git
		branch master
		readonly

   and it's absolutely trivial to parse. And part of the important thing 
   is that this all makes 100% sense EVEN IF IT'S NOT A REMOTE REPO!

   So imagine that it's a purely local branch, but you want to protect it. 
   Solution?

	[branch "July Snapshot"]
		readonly

   and you're done. In contrast, even if you ended up just extending the 
   file format for the .git/remotes/July\ Snapshot file, and just added a 
   "readonly" line to it, it wouldn't make _sense_. Whaa? "remotes"? In 
   contrast, in the .git/config file, it makes a ton of sense, and in fact 
   it's totally obvious.

   (Actually, we should probably have the .git/config file syntax separate 
   local branches like "master" from remote branches like "origin", so it 
   might be more like

	[remote "origin"]
		url = git://git.kernel.org/pub/scm/git/git.git

    which just tells that the _word_ "origin" corresponds to a 
    shorthand for a particular remote repository

	[branch "origin"]
		remote = origin
		branch = master

   or something to show that your _local_ branch named "origin" 
   corresponds to a particular remote (which could be a shorthand like the 
   above, or just spelled out), and a particular branch _at_ that remote 
   repository)

   Anyway, the point is, I think our current .git/remotes/xyzzy files 
   actually mix two different concepts, and they also end up doing it 
   pretty badly. They _work_, but because of the mix-ups, they aren't all 
   that they could be, and it's fundamentally impossible to make them so, 
   because the mixup really is that "origin" means TWO DIFFERENT THINGS 
   (the local branch, and the remote that it corresponds to)

 - Finally, I think it opens the possibility for some other things. For 
   example, once you accept that different branches might want attributes 
   like "readonly", you realize that some other attributes also make 
   sense. Like adding the default pull source per local branch, etc.

Again, I'm not saying that we can't work with the .git/remotes/ files. But 
I think it gets increasingly ugly, and the confusion gets increasingly 
worse.

> But it is a bit of a loss for perl/shell porcelains, and for users
> that abuse the contents of .git directly on a regular basis...

I really disagree. 

The .git/config file is _easier_ to edit by hand than the remotes. It's 
easier to copy-paste within one file than it is to work with two different 
files (and let's face it, copy-paste is usually what at least I would do 
for something like this). And it's _easier_ to just always open one file, 
and search within that one, than try to remember what file it was.

Now, C programs can very easily use the config library, and shell programs 
can equally easily query the variables with "git repo-config". I really 
doubt it's very hard for perl either, but I'm not a perl person, so maybe 
I don't see why this is hard.

		Linus

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Linus Torvalds @ 2006-05-11  0:11 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: sean, junkio, Johannes.Schindelin, git
In-Reply-To: <Pine.LNX.4.64.0605101629230.3718@g5.osdl.org>



On Wed, 10 May 2006, Linus Torvalds wrote:
>
>  - having the information in one place. I agree that the multi-file 
>    approach works fine for shell scripts (although I disagree that the new 
>    one would be harder - you just use git-repo-config instead), but I 
>    think it's quite confusing from a new user perspective.

Btw, I seriously believe that git has come to the point where we've licked 
the real technical issues. Stability hasn't been a concern for the last 
year - and even something as seriously as a repacking bug causing a 
SIGSEGV (yesterday) was actually basically designed to not be able to 
cause problems. The repack failed, and nothing happened to the old data. 
It was scary, but it wasn't "bad".

The last performance problem was a stupid one-liner, where one of the 
shell scripts didn't use the "--aggressive" flag for doing the trivial 
three-way merge, so it ended up forking and executing the "merge-one-file" 
shell script for 4500+ files for one unfortunate project that had a 
strange workflow. Adding the "--aggressive" flag took a 5-minute (where 
all of the time was spent in a shell script basically doing nothing) thing 
down to under a second.

So git should kick butt in performance, scale very well, and seems to take 
less disk space than just about anybody else. 

So what do people actually _complain_ about? 

I don't think we've seen a serious complaint lately that hasn't been about 
nice user interface and/or documentation. Anybody? 

So as far as I can tell, the #1 issue is that "new user" experience. You 
can pretty much forget about anything else. Working with git in a 
distributed manner is really easy and efficient, but from the comments 
I've seen, it's not always easy and obvious how to get to that point. 

Creating a remote repository and filling it. And being able to understand 
what the local vs remote branches actually _mean_. And I think our current 
.git/remotes/ thing is a part of that. It's not exactly user _hostile_, 
but it's very much "implementation friendly, and doesn't care about the 
user". So I think .git/config can help us there.

I also think we could do with a few scripts to just do setup of a remote 
repo:

	git remote clone <remoteaddress>
	git remote branch <remoteaddress> [-D]
	git remote fsck <remoteaddress>
	git remote repack <remoteaddress> -a -d

which would all basically boil down to "ssh to the remote address, cd into 
that directory, and do the named git command there" (well, not clone: 
doing a remote clone involves doing a mkdir/git-init-db/git-receive-pack 
remotely, and doing a git-send-pack locally, so some of them would be 
about doing things _both_ locally and remotely).

And documentation.

Now, I don't do documentation, and I really think somebody else could do 
the whole "git remote <cmd>" thing too. It _should_ really be pretty 
trivial. My real point is that almost none of this is about technology, 
and it's much more about trying to put a whole lot of lipstick on this 
pig. We have _got_ the technology already, and I think most people will 
agree git is doing pretty damn well there.

Because I really think the pig is quite charming, just sometimes you see 
some of its boorish sides right now..

(Or should that be "boarish", when we talk about pigs? ;)

		Linus

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Martin Langhoff @ 2006-05-11  0:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: sean, junkio, Johannes.Schindelin, git
In-Reply-To: <Pine.LNX.4.64.0605101629230.3718@g5.osdl.org>

On 5/11/06, Linus Torvalds <torvalds@osdl.org> wrote:
>         [branch "origin"]
>                 remote = git://git.kernel.org/pub/scm/git/git.git
>                 branch master

This is confusing at first read -- is it branch origin or branch master?

>    Anyway, the point is, I think our current .git/remotes/xyzzy files
>    actually mix two different concepts, and they also end up doing it
>    pretty badly. They _work_, but because of the mix-ups, they aren't all
>    that they could be, and it's fundamentally impossible to make them so,
>    because the mixup really is that "origin" means TWO DIFFERENT THINGS
>    (the local branch, and the remote that it corresponds to)

As you say, this needs to be explained/exposed better to the user.
Now, how about having one .git/config and one .git/branches file?
Different semantics, etc.

> The .git/config file is _easier_ to edit by hand than the remotes. It's
> easier to copy-paste within one file than it is to work with two

Agreed, but I suspect repo config and branches config travel at
different speeds. Maybe what this means is that if this happens, we'll
start seeing a need for ~/.git/config and /etc/git/config to set
defaults (merge.summary=1 for all my repos, core.sharedrepository=1
for all the repos on this server) where I now I just mostly copy
.git/config around.

Does that make sense?

> I don't see why this is hard.

Must be me... it's not the Perl part... I just do a lot of grep |
xargs | sed stuff in my daily git usage ;-)

cheers,


martin

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Nicolas Pitre @ 2006-05-11  1:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, sean, junkio, Johannes.Schindelin, git
In-Reply-To: <Pine.LNX.4.64.0605101629230.3718@g5.osdl.org>

On Wed, 10 May 2006, Linus Torvalds wrote:

> 	[branch "origin"]
> 		remote = git://git.kernel.org/pub/scm/git/git.git
> 		branch master

I totally agree with the principle, but I find the above really 
confusing.  Which "branch" means what?  At least if it was "remote_url" 
and "remote_branch" then there wouldn't be any possibility for 
confusion.


Nicolas

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Jakub Narebski @ 2006-05-11  6:02 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0605102148310.24505@localhost.localdomain>

Nicolas Pitre wrote:

> On Wed, 10 May 2006, Linus Torvalds wrote:
> 
>> [branch "origin"]
>> remote = git://git.kernel.org/pub/scm/git/git.git
>> branch master
> 
> I totally agree with the principle, but I find the above really
> confusing.  Which "branch" means what?  At least if it was "remote_url"
> and "remote_branch" then there wouldn't be any possibility for
> confusion.

I'm not sure if remotes shortcuts and configuration (which branches should
be pulled, and to which branches) should be in branches configuration. It
is somewhat confusing. Branches configuration might be used for default
pulling, e.g.


[remote "kernel.org"]
        url = git://git.kernel.org/pub/scm/git/git.git
        pull = master:origin
        ...
        pull = +pu:pu

[branch "origin"]
        pull = kernel.org
        readonly

[branch "pu"]
        pull = kernel.org
        readonly
        fast-forward = no

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Jeff King @ 2006-05-11  9:51 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0605101656110.3718@g5.osdl.org>

On Wed, May 10, 2006 at 05:11:17PM -0700, Linus Torvalds wrote:

> I also think we could do with a few scripts to just do setup of a remote 
> repo:
> 
> 	git remote clone <remoteaddress>
> 	git remote branch <remoteaddress> [-D]
> 	git remote fsck <remoteaddress>
> 	git remote repack <remoteaddress> -a -d

Here's a 'git remote' that handles the easy commands. It makes things
like 'git remote origin repack -a -d' do what you expect. The biggest
problems are:
  - it only works for ssh remotes
  - it assumes your remote path is a git dir (do we have a usual way of
    deciding between $path and $path/.git?)
  - ssh'ing will mangle your shell quoting in the command arguments
  - the url parsing is somewhat ad-hoc (do we have a usual way of
    parsing urls for shell scripts?)

---
Add braindead git-remote script.

This script is a convenience wrapper for performing remote commands on a
repository using ssh.

---

 Makefile      |    2 +-
 git-remote.sh |   19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 1 deletions(-)
 create mode 100644 git-remote.sh

8810ae2524d3339b8a8341b34b2d3f14ddb9c899
diff --git a/Makefile b/Makefile
index 37fbe78..58eddd8 100644
--- a/Makefile
+++ b/Makefile
@@ -125,7 +125,7 @@ SCRIPT_SH = \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
 	git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
-	git-lost-found.sh
+	git-lost-found.sh git-remote.sh
 
 SCRIPT_PERL = \
 	git-archimport.perl git-cvsimport.perl git-relink.perl \
diff --git a/git-remote.sh b/git-remote.sh
new file mode 100644
index 0000000..04b1ce9
--- /dev/null
+++ b/git-remote.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+USAGE='<remote> <command> [options]'
+. git-sh-setup
+. git-parse-remote
+
+case "$#" in
+  0|1) usage ;;
+esac
+
+remote=`get_remote_url "$1"` shift;
+case "$remote" in
+  ssh://*|git+ssh://*|ssh+git://*)
+    host=`echo "$remote" | sed 's!^[^/]*://\([^/]*\).*!\1!'`
+    path=`echo "$remote" | sed 's!^[^/]*://[^/]*\(.*\)!\1!'`
+    exec ssh -n $host "GIT_DIR=$path git $@"
+    ;;
+  *) die "unhandled protocol: $remote" ;;
+esac
-- 
1.3.1

^ permalink raw reply related

* Re: Implementing branch attributes in git config
From: Johannes Schindelin @ 2006-05-11 10:30 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90605101713n506207aem1326cd2bf5a1f861@mail.gmail.com>

Hi,

On Thu, 11 May 2006, Martin Langhoff wrote:

> [...] where I now I just mostly copy .git/config around.

Why not just edit the template?

Ciao,
Dscho

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Jeff King @ 2006-05-11 11:39 UTC (permalink / raw)
  To: git
In-Reply-To: <20060511095158.GA23620@coredump.intra.peff.net>

On Thu, May 11, 2006 at 05:51:58AM -0400, Jeff King wrote:

> +remote=`get_remote_url "$1"` shift;

Oops, this should be:
  remote=`get_remote_url "$1"`; shift
but it actually works fine under dash (a bug in dash?)

-Peff

^ permalink raw reply

* Re: Implementing branch attributes in git config
From: Junio C Hamano @ 2006-05-11 17:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Johannes Schindelin, sean
In-Reply-To: <Pine.LNX.4.64.0605091215340.3718@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 9 May 2006, Junio C Hamano wrote:
>> 
>> If we are shooting for "let's not do this again", I do not think
>> (4) without some quoting convention is good enough.  Today, we
>> are talking about branch names so we could give them artificial
>> limits, which could be weaker than what we already have on the
>> branch names, but we would later regret that, when we start
>> wanting to have other names in the configuration (e.g. people's
>> names).
>
> Here's my suggestion as a patch.
>
> NOTE! This patch could be applied right now, and to all the branches (to
> make 1.x, 1.2.x and 1.3.x all support the _syntax_). Even if nothing 
> actually uses it.

Linus,

I've adjusted this patch, your follow-up patch, and Sean's
"extended section part is case sensitive" patch, along with the
test tweak to "maint" branch.  I also prepared them to be
mergeable to the "master" branch.  Tentatively it is in "next"
for testing.

I'm ready to push out "maint" (not tagged as 1.3.3 yet) and
"next", but have not done so.

I understand the plan is to have 1.3.3 out from this "maint",
and also merge this in "master" about the same time, but I am
expecting I will be offline for the rest of the week most of the
time, so it would happen over the weekend at the earliest.

Does that sound good to you?

I do not think we would need v1.1.7 nor v1.2.7 for this.  People
who have stayed at v1.1.6 or v1.2.6 would need to update if they
are going to use newer git in their repo anyway, and I do not
think of a reason not to update to 1.3.3 but update to 1.1.7 or
1.2.7 in order to stay at the feature level of 1.1.X or 1.2.X
series.  We haven't made incompatible changes as far as I
remember.

Here is what the (adjusted) test case in "next" looks like.
Corresponding one in "maint" lack --list so does not have the
last hunk.  The "maint" and "next" branches I have locally both
passes the test.

-- >8 --

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 7090ea9..8260d57 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -229,7 +229,7 @@ test_expect_failure 'invalid key' 'git-r
 test_expect_success 'correct key' 'git-repo-config 123456.a123 987'
 
 test_expect_success 'hierarchical section' \
-	'git-repo-config 1.2.3.alpha beta'
+	'git-repo-config Version.1.2.3eX.Alpha beta'
 
 cat > expect << EOF
 [beta] ; silly comment # another comment
@@ -241,8 +241,8 @@ # empty line
 	NoNewLine = wow2 for me
 [123456]
 	a123 = 987
-[1.2.3]
-	alpha = beta
+[Version "1.2.3eX"]
+	Alpha = beta
 EOF
 
 test_expect_success 'hierarchical section value' 'cmp .git/config expect'
@@ -251,7 +251,7 @@ cat > expect << EOF
 beta.noindent=sillyValue
 nextsection.nonewline=wow2 for me
 123456.a123=987
-1.2.3.alpha=beta
+version.1.2.3eX.alpha=beta
 EOF
 
 test_expect_success 'working --list' \

^ permalink raw reply related

* [PATCH] Fix compilation on newer NetBSD systems
From: Dennis Stosberg @ 2006-05-11 17:35 UTC (permalink / raw)
  To: git

NetBSD >=2.0 has iconv() in libc.  A libiconv is not required and
does not exist.

See: http://netbsd.gw.com/cgi-bin/man-cgi?iconv+3+NetBSD-2.0

Signed-off-by: Dennis Stosberg <dennis@stosberg.net>

---

 Makefile |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

2a73fcf43bfd1f73ec5e1e50396d54b83abae5e1
diff --git a/Makefile b/Makefile
index 37fbe78..26fa4e6 100644
--- a/Makefile
+++ b/Makefile
@@ -285,7 +285,9 @@ ifeq ($(uname_S),OpenBSD)
 	ALL_LDFLAGS += -L/usr/local/lib
 endif
 ifeq ($(uname_S),NetBSD)
-	NEEDS_LIBICONV = YesPlease
+	ifeq ($(shell test `uname -r | sed -e 's/^\([0-9]\).*/\1/'` -lt 2 && echo y),y)
+		NEEDS_LIBICONV = YesPlease
+	endif
 	ALL_CFLAGS += -I/usr/pkg/include
 	ALL_LDFLAGS += -L/usr/pkg/lib -Wl,-rpath,/usr/pkg/lib
 endif
-- 
1.3.2.gbe65

^ permalink raw reply related

* [PATCH] Fix git-pack-objects for 64-bit platforms
From: Dennis Stosberg @ 2006-05-11 17:36 UTC (permalink / raw)
  To: git

The offset of an object in the pack is recorded as a 4-byte integer
in the index file.  When reading the offset from the mmap'ed index
in prepare_pack_revindex(), the address is dereferenced as a long*.
This works fine as long as the long type is four bytes wide.  On
NetBSD/sparc64, however, a long is 8 bytes wide and so dereferencing
the offset produces garbage.

Signed-off-by: Dennis Stosberg <dennis@stosberg.net>

---

I am not sure whether an int cast or an int32_t cast is more
appropriate here.  An int is not guaranteed to be four bytes wide,
but I don't know of any modern platform where that's not the case.
On the other hand int32_t is not necessarily available before C99.

Any opinions?  I wonder why no one has hit this on x86_64...


 pack-objects.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

026b1b2cdd5332f59e15cd8611a49ead3094d08c
diff --git a/pack-objects.c b/pack-objects.c
index 523a1c7..29bda43 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -156,7 +156,7 @@ static void prepare_pack_revindex(struct
 
 	rix->revindex = xmalloc(sizeof(unsigned long) * (num_ent + 1));
 	for (i = 0; i < num_ent; i++) {
-		long hl = *((long *)(index + 24 * i));
+		long hl = *((int *)(index + 24 * i));
 		rix->revindex[i] = ntohl(hl);
 	}
 	/* This knows the pack format -- the 20-byte trailer
-- 
1.3.2.gbe65

^ permalink raw reply related

* Re: [PATCH] Fix git-pack-objects for 64-bit platforms
From: Linus Torvalds @ 2006-05-11 17:58 UTC (permalink / raw)
  To: Dennis Stosberg; +Cc: git
In-Reply-To: <20060511173632.G60c08b4@leonov.stosberg.net>



On Thu, 11 May 2006, Dennis Stosberg wrote:
> 
> I am not sure whether an int cast or an int32_t cast is more
> appropriate here.  An int is not guaranteed to be four bytes wide,
> but I don't know of any modern platform where that's not the case.
> On the other hand int32_t is not necessarily available before C99.
> 
> Any opinions?  I wonder why no one has hit this on x86_64...

I think the "ntohl()" hides it. It loads a 64-bit value, but since x86-64 
is little-endian, the low 32 bits are correct. The htonl() will then strip 
the high bits and make it all be big-endian.

And while I actually run a 64-bit big-endian machine myself (G5 ppc64), my 
user space is all 32-bit by default, so it never showed up on linux-ppc64 
either.

Anyway, the correct type to use is "uint32_t" in this case. That's what 
htonl() takes.

		Linus

^ permalink raw reply

* Re: [PATCH] Fix git-pack-objects for 64-bit platforms
From: Junio C Hamano @ 2006-05-11 18:52 UTC (permalink / raw)
  To: Dennis Stosberg; +Cc: git, Linus Torvalds
In-Reply-To: <Pine.LNX.4.64.0605111054290.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 11 May 2006, Dennis Stosberg wrote:
>> 
>> I am not sure whether an int cast or an int32_t cast is more
>> appropriate here.  An int is not guaranteed to be four bytes wide,
>> but I don't know of any modern platform where that's not the case.
>> On the other hand int32_t is not necessarily available before C99.
>> 
>> Any opinions?  I wonder why no one has hit this on x86_64...
>
> I think the "ntohl()" hides it. It loads a 64-bit value, but since x86-64 
> is little-endian, the low 32 bits are correct. The htonl() will then strip 
> the high bits and make it all be big-endian.

That sounds sensible.

Since I saw a patch that touches only one place, I thought I'd
better point this out...

There are a few more places that knows about this
((char*)base_pointer + (entry_count * 24)) magic in our code.

$ git grep -n -e '24  *\*' -e '\*  *24' master -- '*.c'

master:pack-objects.c:159:		long hl = *((long *)(index + 24 * i));

	This is yours.

master:sha1_file.c:447:	if (idx_size != 4*256 + nr * 24 + 20 + 20)
master:sha1_file.c:1148:	memcpy(sha1, (index + 24 * n + 4), 20);
master:sha1_file.c:1162:		int cmp = memcmp(index + 24 * mi + 4, sha1, 20);

	These three should be OK, I think.

master:sha1_file.c:1164:			e->offset = ntohl(*((int*)(index + 24 * mi)));

	This you might want to look at; I suspect it is what
	Linus suggests.

Also we _might_ have uglier magic that assumes the base_pointer to
be a pointer to a 4-byte integer and uses offset of multiple of
6 instead of 24, although I do not think it is likely.

I have to leave the keyboard in a few minutes so I cannot verify
nor fix them myself for the next 8 hours or so.  Sorry.

> And while I actually run a 64-bit big-endian machine myself (G5 ppc64), my 
> user space is all 32-bit by default, so it never showed up on linux-ppc64 
> either.
>
> Anyway, the correct type to use is "uint32_t" in this case. That's what 
> htonl() takes.

Is uint32_t guaranteed to be exactly 32-bit, or merely enough to
hold 32-bit?

^ permalink raw reply

* Re: [PATCH] Fix git-pack-objects for 64-bit platforms
From: Linus Torvalds @ 2006-05-11 19:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dennis Stosberg, git
In-Reply-To: <7v7j4swg0r.fsf@assigned-by-dhcp.cox.net>



On Thu, 11 May 2006, Junio C Hamano wrote:
> 
> Is uint32_t guaranteed to be exactly 32-bit, or merely enough to
> hold 32-bit?

I think it's guaranteed to be 32-bit, but regardless, the current git 
headers already assume that "unsigned int" is 32-bit.

Which is a pretty safe assumption for at least the next ten years or so, 
possibly much longer. So I don't think we need to worry _too_ much about 
this. I think it's more important to try to get git working on Windows, 
than on 16-bit DOS or on a PDP-9, or one of the odd cray machines.

		Linus

^ permalink raw reply

* Re: [PATCH] Fix git-pack-objects for 64-bit platforms
From: Linus Torvalds @ 2006-05-11 19:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dennis Stosberg, git
In-Reply-To: <7v7j4swg0r.fsf@assigned-by-dhcp.cox.net>



On Thu, 11 May 2006, Junio C Hamano wrote:
> 
> Since I saw a patch that touches only one place, I thought I'd
> better point this out...
> 
> There are a few more places that knows about this
> ((char*)base_pointer + (entry_count * 24)) magic in our code.

Since I _do_ have a 64-bit big-endian architecture, I decided to install 
some of the 64-bit development libraries that I didn't already have, and I 
added "-m64" to the compiler flags.

All the tests seem to pass with the simple fix (and without it, we do 
indeed fail at least t5700-clone-reference.sh).

Of course, there might well be something else that doesn't get coverage, 
but passing all tests is at least a good sign. And since x86-64 has been 
getting pretty extensive coverage, I don't think we have a lot of 64-bit 
bugs per se, this one just happened to hide.

		Linus

---
diff --git a/pack-objects.c b/pack-objects.c
index 523a1c7..1b9e7a1 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -156,7 +156,7 @@ static void prepare_pack_revindex(struct
 
 	rix->revindex = xmalloc(sizeof(unsigned long) * (num_ent + 1));
 	for (i = 0; i < num_ent; i++) {
-		long hl = *((long *)(index + 24 * i));
+		uint32_t hl = *((uint32_t *)(index + 24 * i));
 		rix->revindex[i] = ntohl(hl);
 	}
 	/* This knows the pack format -- the 20-byte trailer

^ permalink raw reply related

* Re: [PATCH 5/6] gitopt: convert setup_revisions(), and diff_opt_parse()
From: Eric Wong @ 2006-05-11 20:19 UTC (permalink / raw)
  To: git
In-Reply-To: <11471512123005-git-send-email-normalperson@yhbt.net>

In other news, this patch is broken.  Bundled args won't work if some
in the bundle are handled setup_revisions() and some are handled by
diff_opt_parse().

I'll work on fixing this, as well (may take a while working mostly one-handed).

-- 
Eric Wong

^ permalink raw reply

* git-bisect failed me again
From: Andrew Morton @ 2006-05-12  7:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


Trying to find a recently-merged box-killer in Len's tree:

bix:/usr/src/git26> cat .git/branches/git-acpi 
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test

git-checkout git-acpi
git-bisect reset
git-bisect start
git-bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9

And it led me to

bix:/usr/src/git26> git-bisect good
9011bff4bdc0fef1f9a782d7415c306ee61826c9 is first bad commit
diff-tree 9011bff4bdc0fef1f9a782d7415c306ee61826c9 (from 7e1f19e50371e1d148226b64c8edc77fec47fa5b)
Author: Len Brown <len.brown@intel.com>
Date:   Thu May 11 00:28:12 2006 -0400

    ACPI: delete newly added debugging macros in processor_perflib.c
    

which isn't a buggy patch.

bix:/usr/src/git26> cat .git/BISECT_LOG
git-bisect start
# good: [ff2fc3e9e3edb918b6c6b288485c6cb267bc865e] ACPI: EC acpi-ecdt-uid-hack
git-bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
# bad: [9011bff4bdc0fef1f9a782d7415c306ee61826c9] ACPI: delete newly added debugging macros in processor_perflib.c
git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
# good: [c52851b60cc0aaaf974ff0e49989fb698220447d] P-state software coordination for speedstep-centrino
git-bisect good c52851b60cc0aaaf974ff0e49989fb698220447d
# good: [7e1f19e50371e1d148226b64c8edc77fec47fa5b] ACPI: UP build fix for bugzilla-5737
git-bisect good 7e1f19e50371e1d148226b64c8edc77fec47fa5b


A had a second go - fed it the very first and last commit IDs in that tree.
 Same result.

bix:/usr/src/git26> git-bisect good
9011bff4bdc0fef1f9a782d7415c306ee61826c9 is first bad commit
diff-tree 9011bff4bdc0fef1f9a782d7415c306ee61826c9 (from 7e1f19e50371e1d148226b64c8edc77fec47fa5b)
Author: Len Brown <len.brown@intel.com>
Date:   Thu May 11 00:28:12 2006 -0400

    ACPI: delete newly added debugging macros in processor_perflib.c
    
    Signed-off-by: Len Brown <len.brown@intel.com>

:040000 040000 f7a3b4cfdb128fb6a777b2e30a83c63edd70f46a 2ca1c42aaae65df9052d60d274aaec3116e30c2d M      drivers
bix:/usr/src/git26> cat .git/BISECT_LOG       
git-bisect start
# good: [74951d613e758f9709d6f2173107be68f18f77f4] ACPI: Remove debugging macros from drivers/acpi/thermal.c
git-bisect good 74951d613e758f9709d6f2173107be68f18f77f4
# bad: [9011bff4bdc0fef1f9a782d7415c306ee61826c9] ACPI: delete newly added debugging macros in processor_perflib.c
git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
# good: [c52851b60cc0aaaf974ff0e49989fb698220447d] P-state software coordination for speedstep-centrino
git-bisect good c52851b60cc0aaaf974ff0e49989fb698220447d
# good: [7e1f19e50371e1d148226b64c8edc77fec47fa5b] ACPI: UP build fix for bugzilla-5737
git-bisect good 7e1f19e50371e1d148226b64c8edc77fec47fa5b


What did I do wrong this time?

Thanks.

^ permalink raw reply

* Re: git-bisect failed me again
From: Linus Torvalds @ 2006-05-12 14:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Junio C Hamano, git
In-Reply-To: <20060512000249.71933599.akpm@osdl.org>



On Fri, 12 May 2006, Andrew Morton wrote:
> 
> Trying to find a recently-merged box-killer in Len's tree:
> 
> bix:/usr/src/git26> cat .git/branches/git-acpi 
> git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
> 
> git-checkout git-acpi

Just for others: if you already have a Linux repo, this is the perfect 
time to do

	git clone --reference <old-linux-repo>
		git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

to get that linux-acpi-2.6 repository.

And for Junio: good job on that "--reference" flag. I used to do a local 
clone and then force an update, this was much better.

> git-bisect reset
> git-bisect start
> git-bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
> git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
> 
> And it led me to
> 
> bix:/usr/src/git26> git-bisect good
> 9011bff4bdc0fef1f9a782d7415c306ee61826c9 is first bad commit
> 
> which isn't a buggy patch.

Well, to see what's up, do "git bisect visualize". That tends to not only 
help bisect things (for when you want to pick a different bisection 
point), it's also a wonderfully visual tool to what the f*%& happens when 
something goes wrong.

Anyway, when I replay your log:

> bix:/usr/src/git26> cat .git/BISECT_LOG
> git-bisect start
> # good: [ff2fc3e9e3edb918b6c6b288485c6cb267bc865e] ACPI: EC acpi-ecdt-uid-hack
> git-bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
> # bad: [9011bff4bdc0fef1f9a782d7415c306ee61826c9] ACPI: delete newly added debugging macros in processor_perflib.c
> git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
> # good: [c52851b60cc0aaaf974ff0e49989fb698220447d] P-state software coordination for speedstep-centrino
> git-bisect good c52851b60cc0aaaf974ff0e49989fb698220447d
> # good: [7e1f19e50371e1d148226b64c8edc77fec47fa5b] ACPI: UP build fix for bugzilla-5737
> git-bisect good 7e1f19e50371e1d148226b64c8edc77fec47fa5b

I definitely get the same "9011bff4bdc0fef1f9a782d7415c306ee61826c9 is 
first bad commit" result, and going along visually at every point makes it 
very very obvious that git-bisect is right.

(In fact, the _most_ visually obvious way to do it is to do this:

	git bisect reset
	git bisect start
	git bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
	git bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
	git bisect visualize &
	git bisect good c52851b60cc0aaaf974ff0e49989fb698220447d
	.. go into the "file" menu, and select "re-read references" ..
	git-bisect good 7e1f19e50371e1d148226b64c8edc77fec47fa5b
	.. do "re-read references" again ..

and you'll see exactly what "git bisect" is doing).

You claimed that the previous commit (7e1f19..) was good, and that 
9011bff.. itself was bad). So if that was true, then it really _was_ that 
9011bff.. commit that caused it.

> What did I do wrong this time?

You did nothing wrong, unless your _testing_ was wrong, and one of your 
"git bisect good" entries should have been bad, or the other way around 
(you booted into the wrong kernel, so you thought something was ok when it 
wasn't).

Why are you so sure that git bisect gave the wrong answer? This is ACPI, 
after all. For all we know, subtle cache-effects could break it.

"git bisect" sadly won't help with bugs that show up due to some other 
subtle interaction..

Anyway, my first guess would be that you might have marked something good 
or bad that wasn't. How sure are you about that initial "git bisect bad" 
you did?

		Linus

^ permalink raw reply

* Re: git-bisect failed me again
From: Andrew Morton @ 2006-05-12 15:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: junkio, git
In-Reply-To: <Pine.LNX.4.64.0605120738190.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:
>
> (In fact, the _most_ visually obvious way to do it is to do this:
> 
>  	git bisect reset
>  	git bisect start
>  	git bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
>  	git bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
>  	git bisect visualize &
>  	git bisect good c52851b60cc0aaaf974ff0e49989fb698220447d
>  	.. go into the "file" menu, and select "re-read references" ..
>  	git-bisect good 7e1f19e50371e1d148226b64c8edc77fec47fa5b
>  	.. do "re-read references" again ..
> 
>  and you'll see exactly what "git bisect" is doing).
> 
>  You claimed that the previous commit (7e1f19..) was good, and that 
>  9011bff.. itself was bad). So if that was true, then it really _was_ that 
>  9011bff.. commit that caused it.
> 
>  > What did I do wrong this time?
> 
>  You did nothing wrong, unless your _testing_ was wrong, and one of your 
>  "git bisect good" entries should have been bad, or the other way around 
>  (you booted into the wrong kernel, so you thought something was ok when it 
>  wasn't).
> 
>  Why are you so sure that git bisect gave the wrong answer? This is ACPI, 
>  after all. For all we know, subtle cache-effects could break it.
> 
>  "git bisect" sadly won't help with bugs that show up due to some other 
>  subtle interaction..
> 
>  Anyway, my first guess would be that you might have marked something good 
>  or bad that wasn't. How sure are you about that initial "git bisect bad" 
>  you did?

Am pretty confident.


bix:/usr/src/25> grep '^commit ' patches/git-acpi.patch 
commit 9011bff4bdc0fef1f9a782d7415c306ee61826c9		<- This was my `bad'
commit 5d882e684aafea30c508d86d235327d94e1d38ae
commit 14394600cdfe0c952ce662a32a68c5c5524d32ac
commit da95181baf3cf6a2bd81c0c8af1d4c6790703e4f
commit b128440ed11d108c375772b7fe9ad46d2ac07084		<- This was the bug
commit 61ce94e1f8b16b1694475adba9bf2e07fac02020
commit a48142ea89e02ed0aba0a481ead1e9302e1a4160
commit d5c11d3ba31d6ead24f27de648dc2dcfde5092e3
commit f6a08bf2cb06ee3d5be749cf20685b677619bc8e
commit 2cb7f1704275905b7548eee299c554bcdc5cf357
commit 2ce2b16467f0d43d0f8933eb4821b2369b31888c
commit 8ec0cbd9386a40a3afffad78334f4403b256dc4b
commit ba8acc597cff47fcbbd7b9f0d73a59e784852d8b
commit 7e9e8344848d80c9b6e1b9eaf32dd498b48ca5bb
commit d2606159ffdf8e435f6a7714f8e8910672b944d5
commit 8fb1d47b74e2bad912f74783048b433a1e313799
commit f7c0fce6da5cb68b8b0e203df4ff8ef9b3265105
commit 61e295946a248e43cf244cb24097e284d1d00e35
commit a32283362a7a8e7cff608fe25299a59925daea4d
commit 4cd5611ca16348b3805ddcf89b97fe670e76faaa
commit 529758bad4b0f9a8eec56fcc5cad342e9680ea36
commit 91afb9e683426ff238aab159e60f6d6e792e7488
commit 9f102deee398ea4dfcee3b2108dc00bc59ea877b
commit e85eb9a47f19a26b636b58106e309f8db6b2415d
commit 4597ac50598b85a09417df531849b80ce2e8e44b
commit 74951d613e758f9709d6f2173107be68f18f77f4
commit e6f1f3c54974a30c65ea0b699809d12f0aa04272
commit c12ea918ee175ceb3a258cd81f1c43e897d0c0bc
commit eefa27a93a0490902f33837ac86dbcf344b3aa29
commit ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
commit df42baa0d8e54df18dd9366dd7c93d6be7d5d063
commit 200739c179c63d21804e9e8e2ced265243831579
commit 5e15b92d07fb11490c886c5dd7567f523ea43e2d
commit 9224a867c497053842dc595e594ca6d32112221f
commit 459c7266d7a5c1730169258217e25fdd1b7ca854
commit 1a36561607abf1405b56a41aac2fd163429cd1f8
commit e4513a57ef719d3d6d1cee0ca4d9f4016aa452bb
commit 578b333bfe8eb1360207a08a53c321822a8f40f3
commit 9d9f749b316ac21cb59ad3e595cbce469b409e1a
commit cd090eedd85256829f762677d0752a846c1b88b9
commit 81507ea9cfa64e9851b53e0fefebfa776eda9ecb
commit 1c6e7d0aeecac38e66b1bb63e3eff07b2a1c2f2c
commit b5f2490b6e3317059e87ba40d4f659d1c30afc1f
commit 1acfb7f2b0d460ee86bdb25ad0679070ec8a5f0d
commit 7e1f19e50371e1d148226b64c8edc77fec47fa5b
commit 1300124f69cafc54331bc06e968a8dd67863f989
commit ec7381d6bfd3e7b8d2880dd5e9d03b131b0603f6
commit 8313524a0d466f451a62709aaedf988d8257b21c
commit ea936b78f46cbe089a4ac363e1682dee7d427096
commit 52fc0b026e99b5d5d585095148d997d5634bbc25
commit 46358614ed5b031797522f1020e989c959a8d8a6
commit 6665bda76461308868bd1e52caf627f4cb29ed32
commit fdc136ccd3332938e989439c025c363f8479f3e6
commit a1f9e65e2085e0a87f28a4d5a8ae43b32c087f24
commit 1fee94034917aa711fcbd4ebf4c36f7ebd9fa7d6
commit 0eacee585a89ce5827b572a73a024931506bef48
commit 9cfda2c94df61c9f859b474abe774c65a4464d0a
commit d52bb94d56676acd9bdac8e097257a87b4b1b2e1
commit c52851b60cc0aaaf974ff0e49989fb698220447d
commit 09b4d1ee881c8593bfad2a42f838d85070365c3e
commit 3b2d99429e3386b6e2ac949fc72486509c8bbe36
commit ffd642e748c867a7339b57225b8bf8b9a0dcd9c5      <- This was my `good'
commit f9ea7fd8be9827791f407ca1191ff70ec25eb2d9
commit b60e49b2383db0334bef1f0d9cdad9bec2336050
commit 1ca218d3bd6acca0922a349cb76e3244d27ebfba

and git-bisect claimed that 9011bff4bdc0fef1f9a782d7415c306ee61826c9
introduced the bug.  

^ permalink raw reply

* Re: git-bisect failed me again
From: Linus Torvalds @ 2006-05-12 15:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: junkio, git
In-Reply-To: <20060512081207.6cd701f9.akpm@osdl.org>



On Fri, 12 May 2006, Andrew Morton wrote:
>
> Linus Torvalds <torvalds@osdl.org> wrote:
> > 
> >  Anyway, my first guess would be that you might have marked something good 
> >  or bad that wasn't. How sure are you about that initial "git bisect bad" 
> >  you did?
> 
> Am pretty confident.

And I'm pretty damn sure it ain't.

Andrew, git is _not_ linear. You can't just list the commits, take the 
last and the first, and say "the last must be bad, the first must be 
good". Which seems to be what you did.

> 
> 
> bix:/usr/src/25> grep '^commit ' patches/git-acpi.patch 
> commit 9011bff4bdc0fef1f9a782d7415c306ee61826c9		<- This was my `bad'
> commit 5d882e684aafea30c508d86d235327d94e1d38ae
> commit 14394600cdfe0c952ce662a32a68c5c5524d32ac
> commit da95181baf3cf6a2bd81c0c8af1d4c6790703e4f
> commit b128440ed11d108c375772b7fe9ad46d2ac07084		<- This was the bug

That "b128440e.." commit wasn't even among the collection of commits that 
you tested with "git bisect" in the first place. 

You've apparently created a "list of commits" that doesn't include any 
merges, and then you decided that the "most recent of those commits was 
obviously bad".

WHICH IS NOT TRUE.

You never actually even TESTED that 9011bff commit, did you? In fact, I'm 
100% sure you didn't. You just said "it's bad", without any confidence 
what-so-ever except that it happened to be first on your list.

Right?

The fact is, it seems that the way you generated the list of commits was 
basically:

 - pick every commit that is not a merge and doesn't exist in linus tree.

   (ie you basically did the equivalent of "git-rev-list --no-merges 
   linus..acpi", although it's possible that you used "git whatchanged" or 
   something else that will not show merges because they don't generate 
   diffs)

And then you believed that you had a linear series of commits, and that 
the most recent commit must thus be the buggy one.

But git isn't linear. Never has been. The fact that commits get (roughtly) 
sorted by date (modified by their ancestry relationships either subtly or 
grossly depending on whether --topo-sort is off or on) does not make 
anything linear.

The commit you mark as "this was the bug" is on a totally different 
development branch from the one you marked "bad". That development branch 
was merged with the branch that your "bad" commit came from with commit 
7378614.. (which is not on your list at all):

	"Pull bugzilla-5737 into test branch"

but there are actually a few other merges there too (and ACPI-only commits 
that aren't reachable from your "top" bad commit).

> and git-bisect claimed that 9011bff4bdc0fef1f9a782d7415c306ee61826c9
> introduced the bug.  

Hell no. Git bisect did no such thing at all. YOU DID.

Go back and look at what your sequence of instructions was (from your 
original email):

 ->> Trying to find a recently-merged box-killer in Len's tree:
 ->> 
 ->> bix:/usr/src/git26> cat .git/branches/git-acpi 
 ->> git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
 ->> 
 ->> git-checkout git-acpi
 ->> git-bisect reset
 ->> git-bisect start
 ->> git-bisect good ff2fc3e9e3edb918b6c6b288485c6cb267bc865e
 ->> git-bisect bad 9011bff4bdc0fef1f9a782d7415c306ee61826c9
 ->> 
 ->> And it led me to

and notice how YOU claimed that "9011bff4.." was bad at the very
beginning of the "git bisect" run.  Not "git bisect". YOU. You started off 
by claiming that 9011bff4 was bad, apparently without having ever even 
tested it.

The way "git bisect" works is that if you give it garbage information, it 
_will_ give you a garbage result. That's pretty much guaranteed. But if 
you actually give it tested and correct information, it will very 
efficiently zero in on what the problem really was.

And the whole _point_ about "git bisect" is that the git history isn't 
linear. If it was linear, you wouldn't need a tool to bisect it at all: 
you'd just pick the middle entry from the history list, and use it. It 
would be so trivial to bisect by hand, that using a tool is just 
unnecessary.

So really, take a look at "git bisect visualize". In this case, you should 
have noticed that you had a list of 50+ commits, but when you did

	git bisect good ff2fc3e
	git bisect bad 9011bff
	git bisect visualize

you had cut your list of commits down to just six (none of which was the 
bug).

This is why I integrated "gitk" immediately when it became available. It's 
really important to see the non-linear history, because if you don't 
visualize it (either mentally or with a tool like "gitk"), you'll never 
understand what is going on. 

		Linus

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox