Handling renames.

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Handling renames.
@ 2005-04-14 17:54 David Woodhouse
  2005-04-14 18:11 ` Linus Torvalds
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: David Woodhouse @ 2005-04-14 17:54 UTC (permalink / raw)
  To: git; +Cc: James Bottomley

I've been looking at tracking file revisions. One proposed solution was
to have a separate revision history for individual files, with a new
kind of 'filecommit' object which parallels the existing 'commit',
referencing a blob instead of a tree. Then trees would reference such
objects instead of referencing blobs directly.

I think that introduces a lot of redundancy though, because 99% of the
time, the revision history of the individual file is entirely
reproducible from the revision history of the tree. It's only when files
are renamed that we fall over -- and I think we can handle renames
fairly well if we just log them in the commit object. 

My 'gitfilelog.sh' script is already capable of tracking a given file
back through multiple tree commits, listing those commits where the file
in question was actually changed. It uses my patched version of diff-
tree which supports 'diff-tree <TREE_A> <TREE_B> <filename>' in order to
do this.

By storing rename information in the commit object, the script (or a
reimplementation of a similar algorithm) could know when to change the
filename it's looking for, as it goes back through the tree. That ought
to be perfectly sufficient.

So a commit involving a rename would look something like this...

	tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
	parent bb95843a5a0f397270819462812735ee29796fb4
	rename foo.c bar.c
	author David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
	committer David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
	Rename foo.c to bar.c and s/foo_/bar_/g

Opinions? Dissent? We'd probably need to escape the filenames in some
way -- handwave over that for now.

-- 
dwmw2

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 17:54 David Woodhouse
@ 2005-04-14 18:11 ` Linus Torvalds
  2005-04-14 19:09   ` David Woodhouse
  2005-04-14 18:12 ` Ingo Molnar
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Linus Torvalds @ 2005-04-14 18:11 UTC (permalink / raw)
  To: David Woodhouse; +Cc: git, James Bottomley

On Thu, 14 Apr 2005, David Woodhouse wrote:
>
> I've been looking at tracking file revisions. One proposed solution was
> to have a separate revision history for individual files, with a new
> kind of 'filecommit' object which parallels the existing 'commit',
> referencing a blob instead of a tree. Then trees would reference such
> objects instead of referencing blobs directly.

Please don't.  It's fundamentally the git notion of "content determines
objects".

It also has no relevance. A "rename" really doesn't exist in the git 
model. The git model really is about tracking data, not about tracking 
what happened to _create_ that data.

The one exception is the commit log. That's where you put the explanations 
of _why_ the data changed. And git itself doesn't care what the format is, 
apart from the git header.

So, you really need to think of git as a filesystem. You can then 
implement an SCM _on_top_of_it_, which means that your second suggestion 
is not only acceptable, it really is the _only_ way to handle this in git:

> So a commit involving a rename would look something like this...
> 
> 	tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
> 	parent bb95843a5a0f397270819462812735ee29796fb4
> 	rename foo.c bar.c
> 	author David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
> 	committer David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
> 	Rename foo.c to bar.c and s/foo_/bar_/g

Except I want that empty line in there, and I want it in the "free-form"  
section. The "rename" part really isn't part of the git header. It's not 
what git tracks, it was tracked by an SCM system on top of git.

So the git header is an "inode" in the git filesystem, and like an inode 
it has a ctime and an mtime, and pointers to the data. So as far as git is 
concerned, this part:

	tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
	parent bb95843a5a0f397270819462812735ee29796fb4
	author David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
	committer David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100

really is the filesystem "inode". The rest is whatever the filesystem user
puts into it, and git won't care.

> Opinions? Dissent? We'd probably need to escape the filenames in some
> way -- handwave over that for now.

The fact that git handles arbitrary filenames (stuff starting with "." 
excepted) doesn't mean that the SCM above it needs to. Quite frankly, I 
think an SCM that handles newlines in filenames is being silly. But a 
_filesystem_ needs to not care.

There are too many messy SCM's out there that do not hav ea "philosophy". 
Dammit, I'm not interested in creating another one. This thing has a 
mental model, and we keep to that model.

The reason UNIX is beautiful is that it has a mental model of processes 
and files. Git has a mental model of objects and certain very very limited 
relationships. The relationships git cares about are encoded in the C 
files, the "extra crap" (like rename info) is just that - stuff that 
random scripts wrote, and that is just informational and not central to 
the model.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 17:54 David Woodhouse
  2005-04-14 18:11 ` Linus Torvalds
@ 2005-04-14 18:12 ` Ingo Molnar
  2005-04-14 18:32   ` Linus Torvalds
  2005-04-14 18:21 ` H. Peter Anvin
  2005-04-14 22:23 ` Daniel Barkalow
  3 siblings, 1 reply; 37+ messages in thread
From: Ingo Molnar @ 2005-04-14 18:12 UTC (permalink / raw)
  To: David Woodhouse; +Cc: git, James Bottomley

* David Woodhouse <dwmw2@infradead.org> wrote:

> I've been looking at tracking file revisions. One proposed solution 
> was to have a separate revision history for individual files, with a 
> new kind of 'filecommit' object which parallels the existing 'commit', 
> referencing a blob instead of a tree. Then trees would reference such 
> objects instead of referencing blobs directly.
> 
> I think that introduces a lot of redundancy though, because 99% of the 
> time, the revision history of the individual file is entirely 
> reproducible from the revision history of the tree. It's only when 
> files are renamed that we fall over -- and I think we can handle 
> renames fairly well if we just log them in the commit object.

how about the following structure:

    - tree_new --->
    - tree_old ---> rename_commit -> blob

the rename_commit object just contains a pointer to the file content 
blob. If a rename happens then the old tree references the rename_commit 
object (instead of the blob), and the new tree references it too. This 
way there's no need to list the rename via namespace means: if a tree 
entry points to a rename_commit object then a rename happened and the 
rename_commit object is looked up in the old tree to get the old name.

there's no redundancy caused by this method: only renames (which are 
rare) go through the rename_commit redirection. (to speed up the lookup 
the rename_commit object could cache the offset of the two names within 
their tree objects.)

	Ingo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 17:54 David Woodhouse
  2005-04-14 18:11 ` Linus Torvalds
  2005-04-14 18:12 ` Ingo Molnar
@ 2005-04-14 18:21 ` H. Peter Anvin
  2005-04-14 18:48   ` Linus Torvalds
  2005-04-14 22:23 ` Daniel Barkalow
  3 siblings, 1 reply; 37+ messages in thread
From: H. Peter Anvin @ 2005-04-14 18:21 UTC (permalink / raw)
  To: David Woodhouse; +Cc: git, James Bottomley

David Woodhouse wrote:
> 
> Opinions? Dissent? We'd probably need to escape the filenames in some
> way -- handwave over that for now.
> 

For readability and simplicity I'd suggest using either URL-style %XX 
escapes or octal \xxx escapes for anything bytes < 33, minus the escape 
character.

Although Linus is correct in that an SCM doesn't *have* to handle this, 
it really feels like shooting for mediocracy to me.  We might as well 
design it right from the beginning.

	-hpa

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:12 ` Ingo Molnar
@ 2005-04-14 18:32   ` Linus Torvalds
  2005-04-14 18:58     ` Ingo Molnar
  2005-04-14 19:21     ` David Mansfield
  0 siblings, 2 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-04-14 18:32 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: David Woodhouse, git, James Bottomley

On Thu, 14 Apr 2005, Ingo Molnar wrote:
> 
> there's no redundancy caused by this method: only renames (which are 
> rare) go through the rename_commit redirection. (to speed up the lookup 
> the rename_commit object could cache the offset of the two names within 
> their tree objects.)

Bzzt. Wrong.

ANYTHING you do with games like this will cause the "same directory 
creates different object" crap.

Git doesn't do that. There fundamentally is no history in objects, 
_except_ for the commit object. Two objects with the same name are 
identical, and that means that they are easy to share. 

Any time you break that model, you break the whole point of git. Don't do 
it. You'll be very very sorry if you ever do, because it breaks the clean 
separation of "time" and "space". I guarantee you that your merges will 
become _harder_ rather than easier.

What you can do at an SCM level, is that if you want to track renames, you
track them as a separate commit altogether. Ie if you notice a rename, you
first commit the rename (and you can _see_ it's a rename, since the object
didn't change, and the sha1 stayed the same, which in git-speak means that
it is the same object, ie that _is_ a rename as far as git is concerned),
and then you create the "this is the data that changed" as a _second_
commit.

But don't make it a new kind of commit. It's just a regular commit, 
dammit. No new abstractions. 

Trust me, it's worth it to follow the rules. You don't start making up new 
concepts for every new thing you track. Next you'll want "tag objects". 
That's a totally idiotic idea. What you do is you tag things at a higher 
level than git ever is, and git will _never_ have to know about tag 
objects. 

Some "higher level" thing can add its own rules _on_top_ of git rules. The
same way we have normal applications having their _own_ rules on top of
the kernel. You do abstraction in layers, but for this to work, the base 
you build on top of had better be damn solid, and not have any ugly 
special cases.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:21 ` H. Peter Anvin
@ 2005-04-14 18:48   ` Linus Torvalds
  2005-04-14 18:49     ` H. Peter Anvin
  2005-04-14 19:22     ` Zach Welch
  0 siblings, 2 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-04-14 18:48 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: David Woodhouse, git, James Bottomley

On Thu, 14 Apr 2005, H. Peter Anvin wrote:
> 
> Although Linus is correct in that an SCM doesn't *have* to handle this, 
> it really feels like shooting for mediocracy to me.  We might as well 
> design it right from the beginning.

No. git is not an SCM. it's a filesystem designed to _host_ an SCM, and 
that _is_ doing it right from the beginning.

Keep the abstractions clean. Do _not_ get confused into thinking that git 
is an SCM. If you think of it that way, you'll end up with crap you can't 
think about.

And at a filesystem layer, "rename" already exists. It's moving an object 
to a new name in a tree. git already does that very well, thank you very 
much.

But a filesystem rename is _not_ the same thing as an SCM rename.  An SCM 
rename is built on top of a filesystem rename, but it has its own issues 
that may or may not make sense for the filesystem.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:48   ` Linus Torvalds
@ 2005-04-14 18:49     ` H. Peter Anvin
  2005-04-14 19:22     ` Zach Welch
  1 sibling, 0 replies; 37+ messages in thread
From: H. Peter Anvin @ 2005-04-14 18:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Woodhouse, git, James Bottomley

Linus Torvalds wrote:
> 
> On Thu, 14 Apr 2005, H. Peter Anvin wrote:
> 
>>Although Linus is correct in that an SCM doesn't *have* to handle this, 
>>it really feels like shooting for mediocracy to me.  We might as well 
>>design it right from the beginning.
> 
> No. git is not an SCM. it's a filesystem designed to _host_ an SCM, and 
> that _is_ doing it right from the beginning.
> 
> Keep the abstractions clean. Do _not_ get confused into thinking that git 
> is an SCM. If you think of it that way, you'll end up with crap you can't 
> think about.
> 
> And at a filesystem layer, "rename" already exists. It's moving an object 
> to a new name in a tree. git already does that very well, thank you very 
> much.
> 
> But a filesystem rename is _not_ the same thing as an SCM rename.  An SCM 
> rename is built on top of a filesystem rename, but it has its own issues 
> that may or may not make sense for the filesystem.
> 

I wasn't referring to git per se, I was referring to the hosted SCM.

	-hpa

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:32   ` Linus Torvalds
@ 2005-04-14 18:58     ` Ingo Molnar
  2005-04-14 19:20       ` David Woodhouse
  2005-04-14 19:21     ` David Mansfield
  1 sibling, 1 reply; 37+ messages in thread
From: Ingo Molnar @ 2005-04-14 18:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Woodhouse, git, James Bottomley

* Linus Torvalds <torvalds@osdl.org> wrote:

> [...] Ie if you notice a rename, you first commit the rename (and you 
> can _see_ it's a rename, since the object didn't change, and the sha1 
> stayed the same, which in git-speak means that it is the same object, 
> ie that _is_ a rename as far as git is concerned), and then you create 
> the "this is the data that changed" as a _second_ commit.

ok, i accept your point of not putting this into such a low level as the 
object abstraction. Was a bad idea.

but i dont think the above would be enough: there can be renames of 
objects that have the same sha1 hash as other objects in the same tree, 
and developers want to track individual objects, regardless of whether 
other files share the same content. So some formal operation would be 
needed to signal renames - e.g. to embedd it in the commit object, per 
David's suggestion.

The thing i tried to avoid was to list long filenames in the commit 
(because of the tree hierarchy we'd need to do tree-absolute pathnames 
or something like that, and escape things, and do lookups - duplicating 
a VFS which is quite bad) - it would be better to identify the rename 
source and target via its tree object hash and its offset within that 
tree. Such information could be embedded in the commit object just fine.  
Something like:

me bb95843a5a0f397270819462812735ee29796fb4
tree 1756b578489f93999ded68ae347bef7d6063101c
parent 9f02d4d233223462d3f6217b5837b786e6286ba4
author
committer
rename 39021759c903a943a33a28cfbd5070d36d851581 15234 9f02d4d233223462d3f6217b5837b786e6286ba4 16163

?

	Ingo

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:11 ` Linus Torvalds
@ 2005-04-14 19:09   ` David Woodhouse
  0 siblings, 0 replies; 37+ messages in thread
From: David Woodhouse @ 2005-04-14 19:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, James Bottomley

On Thu, 2005-04-14 at 11:11 -0700, Linus Torvalds wrote:
> So, you really need to think of git as a filesystem. You can then 
> implement an SCM _on_top_of_it_, which means that your second suggestion 
> is not only acceptable, it really is the _only_ way to handle this in git:
> 
> > So a commit involving a rename would look something like this...
> > 
> > 	tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
> > 	parent bb95843a5a0f397270819462812735ee29796fb4
> > 	rename foo.c bar.c
> > 	author David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
> > 	committer David Woodhouse <dwmw2@hades.cambridge.redhat.com> 1113499881 +0100
> > 	Rename foo.c to bar.c and s/foo_/bar_/g
> 
> Except I want that empty line in there, and I want it in the "free-form"  
> section. The "rename" part really isn't part of the git header. It's not 
> what git tracks, it was tracked by an SCM system on top of git.

Note that not only may you have a _set_ of renames, but you'll also have
a _different_ set of renames for each parent. Consider the
representation of a merge where a file was called 'foo' in one parent,
'bar' in the other, and we called it 'foobar' in the resulting tree.

That's the main reason I wanted the renames in with the parent
information -- so it's <parent><rename><rename><...><parent><rename>...

I see your point though and I can't be bothered to argue for the sake of
the slight efficiency benefit we might gain from doing it that way. The
implementation details really aren't that interesting right now.

Let us assume, however, that we have this information somehow stored in
each commit object. It's perfectly sufficient from the POV of the 
'git revtool' which I've been poking at; is it good enough for merges?

Consider a simple case: A branch is taken, file foo.c is renamed to
bar.c, and now we're trying to merge that branch back into the head,
which has moved on. 

We can't just take 'bar.c' as a new file -- we have to track it all the
way back to its inception, and notice that it actually shares a common
ancestor with 'foo.c' in the other parent of the merge.

How feasible, and how computationally expensive, is that task going to
be? Especially given that there may be _many_ new files that we need to
attempt to tie up with their partners, across many potential renames. 

One option for optimising this, if we really need to, might be to track
the file back to its _first_ ancestor and use that as an identification.
The SCM could store that identifier in the blob itself, or we could
consider it an 'inode number' and store it in git's tree objects.

If we can avoid that, however, it would be nice. How feasible is the
merge going to be without it?

-- 
dwmw2

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:58     ` Ingo Molnar
@ 2005-04-14 19:20       ` David Woodhouse
  0 siblings, 0 replies; 37+ messages in thread
From: David Woodhouse @ 2005-04-14 19:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, git, James Bottomley

On Thu, 2005-04-14 at 20:58 +0200, Ingo Molnar wrote:
> The thing i tried to avoid was to list long filenames in the commit 
> (because of the tree hierarchy we'd need to do tree-absolute pathnames 
> or something like that, and escape things, and do lookups - duplicating 
> a VFS which is quite bad) - it would be better to identify the rename 
> source and target via its tree object hash and its offset within that 
> tree. Such information could be embedded in the commit object just fine.  
> Something like:

Actually I'm not sure that's true. Let's consider the two main users of
this information.

Firstly, because it's what I've been playing with: to list a given
file's revision history, I currently work with its filename -- walk the
commit objects, inspecting the tree and selecting those commits where
the file has changed. If my filename is 'fs/jffs2/inode.c' then I can
immediately skip over a commit where the 'fs' entry in the top-level
tree is identical to that in the parent, or I can skip a commit where
the 'jffs2' entry in the 'fs' subtree is identical to the parent... it's
all done on filename, and the {parent, entry} tuple wouldn't help much
here; I'd probably have to convert back to a filename anyway.

Secondly, there's merges. I've paid less attention to these (see mail 5
minutes ago) but I think they'd end up operating on the rename
information in a very similar way. To find a common ancestor for a given
file,, we want to track its name as it changed during history; at that
point it's all string compares.

-- 
dwmw2

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:32   ` Linus Torvalds
  2005-04-14 18:58     ` Ingo Molnar
@ 2005-04-14 19:21     ` David Mansfield
  1 sibling, 0 replies; 37+ messages in thread
From: David Mansfield @ 2005-04-14 19:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Ingo Molnar, David Woodhouse, git, James Bottomley

Linus Torvalds wrote:
> 
> On Thu, 14 Apr 2005, Ingo Molnar wrote:
> 
>>there's no redundancy caused by this method: only renames (which are 
>>rare) go through the rename_commit redirection. (to speed up the lookup 
>>the rename_commit object could cache the offset of the two names within 
>>their tree objects.)
> 
> 

> 
> Some "higher level" thing can add its own rules _on_top_ of git rules. The
> same way we have normal applications having their _own_ rules on top of
> the kernel. You do abstraction in layers, but for this to work, the base 
> you build on top of had better be damn solid, and not have any ugly 
> special cases.
> 

Maybe you (or the group) should standardize on a way to 'extend' the 
commit 'object' in terms of:

the layer1 (git) header for commit object is defined as such-and-such
the layer2 (scm or other) header for commit object is defined as 
such-and-such

Much the way network protocols stack on top of each other.  If a 
standard way of stacking is defined, then it could be much cleaner for 
future implementors to understand a 'new' stacking protocol, and it will 
make the scm-level extensions easier to discuss it terms of their own 
'layer'.

David

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 18:48   ` Linus Torvalds
  2005-04-14 18:49     ` H. Peter Anvin
@ 2005-04-14 19:22     ` Zach Welch
  2005-04-14 19:40       ` Andrew Timberlake-Newell
  1 sibling, 1 reply; 37+ messages in thread
From: Zach Welch @ 2005-04-14 19:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: H. Peter Anvin, David Woodhouse, git, James Bottomley

Linus Torvalds wrote:
> 
> On Thu, 14 Apr 2005, H. Peter Anvin wrote:
> 
>> Although Linus is correct in that an SCM doesn't *have* to handle 
>> this, it really feels like shooting for mediocracy to me.  We might
>>  as well design it right from the beginning.
> 
> 
> No. git is not an SCM. it's a filesystem designed to _host_ an SCM, 
> and that _is_ doing it right from the beginning.

I imagine quite a few folks expect something not entirely unlike an SCM
to emerge from these current efforts. Moreover, Petr's 'git' scripts
wrap your "filesystem" plumbing to that very end.

To avoid confusion, I think it would be better to distinguish the two
layers, perhaps by calling the low-level plumbing... 'gitfs', of course.

Cheers,

Zach Welch
Superlucidity Services

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: Handling renames.
  2005-04-14 19:22     ` Zach Welch
@ 2005-04-14 19:40       ` Andrew Timberlake-Newell
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Timberlake-Newell @ 2005-04-14 19:40 UTC (permalink / raw)
  To: git; +Cc: 'Zach Welch', 'Linus Torvalds'

Zach Welch pontificated:
> I imagine quite a few folks expect something not entirely unlike an SCM
> to emerge from these current efforts. Moreover, Petr's 'git' scripts
> wrap your "filesystem" plumbing to that very end.
> 
> To avoid confusion, I think it would be better to distinguish the two
> layers, perhaps by calling the low-level plumbing... 'gitfs', of course.

Or perhaps to come up with a name (or at least nickname) for the SCM.

GitMaster?


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 17:54 David Woodhouse
                   ` (2 preceding siblings ...)
  2005-04-14 18:21 ` H. Peter Anvin
@ 2005-04-14 22:23 ` Daniel Barkalow
  2005-04-14 22:46   ` David Woodhouse
  3 siblings, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2005-04-14 22:23 UTC (permalink / raw)
  To: David Woodhouse; +Cc: git, James Bottomley

On Thu, 14 Apr 2005, David Woodhouse wrote:

> Opinions? Dissent? We'd probably need to escape the filenames in some
> way -- handwave over that for now.

I personally think renames are a minor thing that doesn't happen
much. What actually happens, in my opinion, is that some chunk of a file
is moved to a different, possibly new, file. If this is supported (as
something that the SCM notices), then a rename is just a special case
where the moved chunk is a whole file.

I think that it should be possible to identify and tag "big
enough" deletions and insertions, and compare them to find moves, where a
further change may be applied in the middle if two chunks are "very
similar" but not the same.

On the other hand, I think that the SCM will need to cache its
understanding of what a commit did in order to give reasonable
performance for operations like "annotate", and it may be advantegous to
distribute things from this cache, since the committer might want to tell
the system something that it didn't guess.

At some point, I'm going to argue for core support for "back pointers",
where a file can be created which is "about" some other file(s), and
someone looking for files "about" a particular file can find them without
searching the entire database. I think this will turn out to be important
for a variety of cases where some later participant wants to say something
about an existing file without changing the content of the file.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-14 22:23 ` Daniel Barkalow
@ 2005-04-14 22:46   ` David Woodhouse
  0 siblings, 0 replies; 37+ messages in thread
From: David Woodhouse @ 2005-04-14 22:46 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, James Bottomley

On Thu, 2005-04-14 at 18:23 -0400, Daniel Barkalow wrote:
> I personally think renames are a minor thing that doesn't happen
> much. What actually happens, in my opinion, is that some chunk of a
> file is moved to a different, possibly new, file. If this is supported
> (as something that the SCM notices), then a rename is just a special
> case where the moved chunk is a whole file.

Certainly we'd discussed the possibility that the 'rename' field may
contain more than one destination, or more than one source filename.
This could happen when a file is split into two, or when two files are
merged into one, for example.

-- 
dwmw2



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
@ 2005-04-15 13:37 linux
  2005-04-15 13:53 ` David Woodhouse
  0 siblings, 1 reply; 37+ messages in thread
From: linux @ 2005-04-15 13:37 UTC (permalink / raw)
  To: dwmw2; +Cc: git

> One option for optimising this, if we really need to, might be to track
> the file back to its _first_ ancestor and use that as an identification.
> The SCM could store that identifier in the blob itself, or we could
> consider it an 'inode number' and store it in git's tree objects.

This suggestion (and this whole discussion about renames) has issues
with file copies, which form a branch in the revision history.  If I
copy foo.c to foo2.c (or fs/ext2/ to fs/ext3/), then the oldest ancestor
isn't a "unique inode number".

I've written a lot of programs by debugging hello.c.

Thinking about this can give you all sorts of exciting merge possibilities.

If branch1 renames a.c to b.c, and branch2 patches a.c, it seems obvious
that the patch should be merged into b.c.  But what if branch1 copies a.c
to b.c?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-04-15 13:37 Handling renames linux
@ 2005-04-15 13:53 ` David Woodhouse
  0 siblings, 0 replies; 37+ messages in thread
From: David Woodhouse @ 2005-04-15 13:53 UTC (permalink / raw)
  To: linux; +Cc: git

On Fri, 2005-04-15 at 13:37 +0000, linux@horizon.com wrote:
> > One option for optimising this, if we really need to, might be to track
> > the file back to its _first_ ancestor and use that as an identification.
> > The SCM could store that identifier in the blob itself, or we could
> > consider it an 'inode number' and store it in git's tree objects.
> 
> This suggestion (and this whole discussion about renames) has issues
> with file copies, which form a branch in the revision history.  If I
> copy foo.c to foo2.c (or fs/ext2/ to fs/ext3/), then the oldest ancestor
> isn't a "unique inode number".

That's why I prefer the option of simply annotating the moves. They
don't need to be just renames -- it can cover the cases where files are
split up or merged into one, to indicate where the history of the given
_data_ is coming from.

-- 
dwmw2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* git-rev-list: add "--dense" flag
@ 2005-10-21 23:40 Linus Torvalds
  2005-10-21 23:47 ` Linus Torvalds
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-21 23:40 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


This is what the recent git-rev-list changes have all been gearing up for.

When we use a path filter to git-rev-list, the new "--dense" flag asks
git-rev-list to compress the history so that it _only_ contains commits
that change files in the path filter.  It also rewrites the parent
information so that tools like "gitk" will see the result as a dense
history tree.

For example, on the current kernel archive:

	[torvalds@g5 linux]$ git-rev-list HEAD | wc -l
	9904
	[torvalds@g5 linux]$ git-rev-list HEAD -- kernel | wc -l
	5442
	[torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel | wc -l
	356

which shows that while we have almost ten thousand commits, we can prune
down the work to slightly more than half by only following the merges
that are interesting. But further, we can then compress the history to
just 356 entries that actually make changes to the kernel subdirectory.

To see this in action, try something like

	gitk --dense -- gitk

to see just the history that affects gitk.  Or, to show that true
parallell development still remains parallell, do

	gitk --dense -- daemon.c

which shows some parallell commits in the current git tree.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

I'm really happy with how this turned out. It's a bit expensive to run on 
big archives, but I _really_ think it's quite spectacular. And likely very 
useful indeed.

For example, say you love gitk, but only care about networking changes. 
Easy enough, just do

	gitk --dense -- net/ include/net/

and off you go. It's not free (we do a _lot_ of tree comparisons), but 
dammit, it's fast enough that it's very very useful.  The tree comparisons 
are done very efficiently.

This is _way_ more powerful than annotate. Interested in just a single 
file? Just do

	gitk --dense -- kernel/exit.c

and it will show the 17 or so commits that change kernel/exit.c with the 
right history (it turns out that there is no parallell development at all 
in that file, so in this case it will linearize history entirely).

Damn, I'm good. 

diff --git a/rev-list.c b/rev-list.c
index b5dbb9f..5f125fd 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -11,6 +11,7 @@
 #define INTERESTING	(1u << 1)
 #define COUNTED		(1u << 2)
 #define SHOWN		(1u << 3)
+#define TREECHANGE	(1u << 4)
 
 static const char rev_list_usage[] =
 	"git-rev-list [OPTION] commit-id <commit-id>\n"
@@ -27,6 +28,7 @@ static const char rev_list_usage[] =
 		      "  --merge-order [ --show-breaks ]\n"
 		      "  --topo-order";
 
+static int dense = 0;
 static int unpacked = 0;
 static int bisect_list = 0;
 static int tag_objects = 0;
@@ -79,6 +81,26 @@ static void show_commit(struct commit *c
 	fflush(stdout);
 }
 
+static void rewrite_one(struct commit **pp)
+{
+	for (;;) {
+		struct commit *p = *pp;
+		if (p->object.flags & (TREECHANGE | UNINTERESTING))
+			return;
+		/* Only single-parent commits don't have TREECHANGE */
+		*pp = p->parents->item;
+	}
+}
+
+static void rewrite_parents(struct commit *commit)
+{
+	struct commit_list *parent = commit->parents;
+	while (parent) {
+		rewrite_one(&parent->item);
+		parent = parent->next;
+	}
+}
+
 static int filter_commit(struct commit * commit)
 {
 	if (stop_traversal && (commit->object.flags & BOUNDARY))
@@ -95,6 +117,11 @@ static int filter_commit(struct commit *
 		return STOP;
 	if (no_merges && (commit->parents && commit->parents->next))
 		return CONTINUE;
+	if (paths && dense) {
+		if (!(commit->object.flags & TREECHANGE))
+			return CONTINUE;
+		rewrite_parents(commit);
+	}
 	return DO;
 }
 
@@ -404,6 +431,14 @@ static struct diff_options diff_opt = {
 	.change = file_change,
 };
 
+static int same_tree(struct tree *t1, struct tree *t2)
+{
+	is_different = 0;
+	if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "", &diff_opt) < 0)
+		return 0;
+	return !is_different;
+}
+
 static struct commit *try_to_simplify_merge(struct commit *commit, struct commit_list *parent)
 {
 	if (!commit->tree)
@@ -415,11 +450,7 @@ static struct commit *try_to_simplify_me
 		parse_commit(p);
 		if (!p->tree)
 			continue;
-		is_different = 0;
-		if (diff_tree_sha1(commit->tree->object.sha1,
-				   p->tree->object.sha1, "", &diff_opt) < 0)
-			continue;
-		if (!is_different)
+		if (same_tree(commit->tree, p->tree))
 			return p;
 	}
 	return NULL;
@@ -485,6 +516,27 @@ static void add_parents_to_list(struct c
 	}
 }
 
+static void compress_list(struct commit_list *list)
+{
+	while (list) {
+		struct commit *commit = list->item;
+		struct commit_list *parent = commit->parents;
+		list = list->next;
+
+		/*
+		 * Exactly one parent? Check if it leaves the tree
+		 * unchanged
+		 */
+		if (parent && !parent->next) {
+			struct tree *t1 = commit->tree;
+			struct tree *t2 = parent->item->tree;
+			if (!t1 || !t2 || same_tree(t1, t2))
+				continue;
+		}
+		commit->object.flags |= TREECHANGE;
+	}
+}
+
 static struct commit_list *limit_list(struct commit_list *list)
 {
 	struct commit_list *newlist = NULL;
@@ -514,6 +566,8 @@ static struct commit_list *limit_list(st
 	}
 	if (tree_objects)
 		mark_edges_uninteresting(newlist);
+	if (paths && dense)
+		compress_list(newlist);
 	if (bisect_list)
 		newlist = find_bisection(newlist);
 	return newlist;
@@ -700,6 +754,10 @@ int main(int argc, const char **argv)
 		        limited = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--dense")) {
+			dense = 1;
+			continue;
+		}
 		if (!strcmp(arg, "--")) {
 			paths = get_pathspec(prefix, argv + i + 1);
 			if (paths) {

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-21 23:40 git-rev-list: add "--dense" flag Linus Torvalds
@ 2005-10-21 23:47 ` Linus Torvalds
  2005-10-21 23:55 ` Linus Torvalds
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-21 23:47 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List

On Fri, 21 Oct 2005, Linus Torvalds wrote:
>
> parallell

Oh, please also fix my spelling. That's one 'l' too many at the end.

Damn, I think I'm a reasonably good speller most of the time, but this 
time it shows that Swedish is my first language (where it really _is_ 
written that way, with four 'l's total).

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-21 23:40 git-rev-list: add "--dense" flag Linus Torvalds
  2005-10-21 23:47 ` Linus Torvalds
@ 2005-10-21 23:55 ` Linus Torvalds
  2005-10-22  0:37 ` Petr Baudis
  2005-10-25 18:07 ` Jonas Fonseca
  3 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-21 23:55 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List

On Fri, 21 Oct 2005, Linus Torvalds wrote:
>
> 	[torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel | wc -l
> 	356

Btw, one additional point.

This took 0.91 seconds to complete on the current kernel history on my 
machine with a pretty much fully packed archive, and the memory footprint 
was a total of about 12MB.

And it scales pretty well too. On the historical linux archive, which is 
three years of history, the same thing takes me just over 12 seconds and 
52MB, and that's for the _whole_ history. And it's not just following one 
file: it's following that subdirectory.

So it really is pretty damn cool. 

Of course, I might have a bug somewhere, but it all _seems_ to work very 
well indeed.

			Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-21 23:40 git-rev-list: add "--dense" flag Linus Torvalds
  2005-10-21 23:47 ` Linus Torvalds
  2005-10-21 23:55 ` Linus Torvalds
@ 2005-10-22  0:37 ` Petr Baudis
  2005-10-22  0:47   ` Handling renames Petr Baudis
  2005-10-22  1:26   ` git-rev-list: add "--dense" flag Linus Torvalds
  2005-10-25 18:07 ` Jonas Fonseca
  3 siblings, 2 replies; 37+ messages in thread
From: Petr Baudis @ 2005-10-22  0:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Dear diary, on Sat, Oct 22, 2005 at 01:40:54AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> When we use a path filter to git-rev-list, the new "--dense" flag asks
> git-rev-list to compress the history so that it _only_ contains commits
> that change files in the path filter.  It also rewrites the parent
> information so that tools like "gitk" will see the result as a dense
> history tree.

There is no documentation for the --dense flag and it is even missing
from the usage string.

But my main concern is - will it be possible to do the rename detection
here as well? Using --dense instead of explicit diff-tree calls in
cg-log would be nice optimization, but I was about to add support for
optional following of renames for cg-log <filename>. That's really
pretty useful to have, every time I hit the Junio's big scripts rename
I keep repeating that to myself. ;-) Now when core GIT got the comfort
of per-file history, I only hope that it will start to annoy you as well.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Handling renames.
  2005-10-22  0:37 ` Petr Baudis
@ 2005-10-22  0:47   ` Petr Baudis
  2005-10-22  1:28     ` Linus Torvalds
  2005-10-22  1:26   ` git-rev-list: add "--dense" flag Linus Torvalds
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Baudis @ 2005-10-22  0:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

  (Having Apr 14 flashbacks? Good memory!)

Dear diary, on Sat, Oct 22, 2005 at 02:37:33AM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> told me that...
> But my main concern is - will it be possible to do the rename detection
> here as well? Using --dense instead of explicit diff-tree calls in
> cg-log would be nice optimization, but I was about to add support for
> optional following of renames for cg-log <filename>. That's really
> pretty useful to have, every time I hit the Junio's big scripts rename
> I keep repeating that to myself. ;-) Now when core GIT got the comfort
> of per-file history, I only hope that it will start to annoy you as well.

  After all, this might be as good time to bring this up as any. Heavy
post-1.0 material follows:

  How to track renames? I believe the situation has changed in the last
half a year.  GIT really is a full-fledged SCM by now (at least its
major part code-wise), and I think it's hopefully becoming obvious that
we need to track renames. I actually decided to skip the whole
discussion why so, because we already _do_ concern ourselves with
renames - that's what the cool diff -M gadget does. And people start to
want to use it for all kind of history-digging stuff (not just for nice
diffs between trees).

  So the problem is whether we should make this explicit. diff -M is
only a heuristic and it can go wrong, while it was empirically found
out in other SCMs that people actually don't mind telling their SCM
about renames explicitly - no more than telling it about adds and
removals explicitly. So the user is willing to tell us what precisely
happened and it would be foolish to throw that away and insist on
guessing.  Besides, guessing (and even doing that everytime we go
through the history) is fundamentally slow, orders of magnitude more
than just a tree diff.

  If I convince you that it is worth tracking the renames explicitly,
"how" is already a minor question. One idea of mine was to add an "edge"
object describing the edge between two trees (that's optimized for
flexible use - some people use GIT for really weird things and perhaps
do not use commits at all; edge between two commits would be optimized
for flexibility in case we will later think of some other cool stuff to
track at the edge):

	trees c53f757133bb84a2d87e901c49207e9b7c48e1a6 6bc7aa4f652d0ef49108d9e30a7ea7fbf8e44639
	rename git-pull-script\0git-pull.sh\0
	copy somefile1\0somefile2\0
	rewrite anotherfile1\0anotherfile2\0

  (The "rewrite" line might be controversial. And you might want to
merge this with the delta objects during packing, or do something
similarly clever.)

  Then you could e.g. pass the edge object ID as the second ID on the
parent line:

	parent 99977bd5fdeabbd0608a70e9411c243007ec4ea2 edgeobjectid

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-22  0:37 ` Petr Baudis
  2005-10-22  0:47   ` Handling renames Petr Baudis
@ 2005-10-22  1:26   ` Linus Torvalds
  2005-10-22  2:56     ` Petr Baudis
  2005-10-22  3:55     ` Junio C Hamano
  1 sibling, 2 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-22  1:26 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, Git Mailing List



On Sat, 22 Oct 2005, Petr Baudis wrote:
> 
> There is no documentation for the --dense flag and it is even missing
> from the usage string.

I'm not much for docs ;)

The whole path argument also isn't even there (and without paths, --dense 
doesn't matter)

> But my main concern is - will it be possible to do the rename detection
> here as well?

Yes. Note that git-rev-list doesn't actually _do_ the diff, it only checks 
whether the tree is changed. You still just get a list of commits out of 
it, and it is up to you to decide what to do with it.

If you're just going to feed them to diff-tree _anyway_, then you might as 
well not even do the dense thing, because quite frankly, you're just doing 
extra work.

But let's say that you want to follow a certain filename, what you can do 
is basically (fake shell syntax with "goto restart")

	rev=HEAD
   restart:

	git-rev-list $rev --dense --parents -- "$filename" |
		while read commit parent1 restofparents
		do
			if [ "$restofparents" ]; then
				.. it's a merge, do whatever it is you do
				   with merges ..
			else
				shaold=$(git-ls-tree $parent -- "$filename")
				shanew=$(git-ls-tree $commit -- "$filename")

				# Did it disappear?
				# Maybe it got renamed from something
				if [ -z "$shaold" ]; then
					old=$(git-diff-tree -M -r $commit |
						grep " R .* $filename")
					echo "rename? $old"
					filename=figure-it-out-from-$old
					rev=$parent
					goto restart
				fi
				git-diff-tree -p $commit -- "$filename"
			fi
		while

or something similar. 

In other words: you'll basically have to figure out the renames on your 
own, and follow the renaming, but something like the above should do it.

			Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-10-22  0:47   ` Handling renames Petr Baudis
@ 2005-10-22  1:28     ` Linus Torvalds
  2005-10-22  1:51       ` Petr Baudis
  0 siblings, 1 reply; 37+ messages in thread
From: Linus Torvalds @ 2005-10-22  1:28 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, Git Mailing List

On Sat, 22 Oct 2005, Petr Baudis wrote:
> 
>   How to track renames? I believe the situation has changed in the last
> half a year.

I disagree.

Every single thing that said that renames were a bad idea to track when 
git started is still equally true.

>   If I convince you that it is worth tracking the renames explicitly,
> "how" is already a minor question.

Never. I'm 100% convinced that tracking renames is WRONG WRONG WRONG.

You can follow renames _afterwards_. 

Git tracks contents. And I think we've proven that figuring out renames 
after-the-fact from those contents is not only doable, but very well 
supported already.

I'm convinced that git handles renames better than any other SCM ever. 
Exactly because we figure it out when it matters.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-10-22  1:28     ` Linus Torvalds
@ 2005-10-22  1:51       ` Petr Baudis
  2005-10-22  2:10         ` Junio C Hamano
  2005-10-22  3:23         ` Linus Torvalds
  0 siblings, 2 replies; 37+ messages in thread
From: Petr Baudis @ 2005-10-22  1:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

> Every single thing that said that renames were a bad idea to track
> when git started is still equally true.

It'd be good to clarify whether we discuss whether the idea of tracking
renames is good or bad, or whether having the user explicitly specify
renames is better than figuring that out automagically. Your first
comment would indicate the former, but the rest of your reply the
latter.

> You can follow renames _afterwards_. 

I can - crudely, but what's the point, if the user is dying to give me
the information.

> Git tracks contents. And I think we've proven that figuring out renames 
> after-the-fact from those contents is not only doable, but very well 
> supported already.

It's unreliable and it's slow (well, perhaps I should get some numbers
to back that out, but given how it is done I take it for granted). Does
not sound too "very well" to me.

> I'm convinced that git handles renames better than any other SCM ever. 
> Exactly because we figure it out when it matters.

It matters at least every time you show per-file history and every time
you merge cross the rename. I think that can be both pretty common if
you ever do the rename. That means you can do an expensive guess
every time you hit that, and the guess can get it wrong, in which case
there is no way around that and you lose.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-10-22  1:51       ` Petr Baudis
@ 2005-10-22  2:10         ` Junio C Hamano
  2005-10-22  2:49           ` Petr Baudis
  2005-10-22  3:23         ` Linus Torvalds
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2005-10-22  2:10 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

>> I'm convinced that git handles renames better than any other SCM ever. 
>> Exactly because we figure it out when it matters.
>
> It matters at least every time you show per-file history and every time
> you merge cross the rename. I think that can be both pretty common if
> you ever do the rename. That means you can do an expensive guess
> every time you hit that, and the guess can get it wrong, in which case
> there is no way around that and you lose.

I think it is OK for the higher level layer (like your
single-file-history follower) to use what you outlined with
"edges", as either a hint/request from the user and/or a cache
of what the expensive and unreliable thing figured out.

I however do not necessarily think adding the "edges"
information as an optional second item on "parent " line in a
commit is a good idea.  Rather, treating the set of "edges" just
like we treat the grafts feel more appropriate to me.  IOW,
create .git/info/edges and keep your edge information there.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-10-22  2:10         ` Junio C Hamano
@ 2005-10-22  2:49           ` Petr Baudis
  0 siblings, 0 replies; 37+ messages in thread
From: Petr Baudis @ 2005-10-22  2:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Dear diary, on Sat, Oct 22, 2005 at 04:10:59AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> I think it is OK for the higher level layer (like your
> single-file-history follower) to use what you outlined with
> "edges", as either a hint/request from the user and/or a cache
> of what the expensive and unreliable thing figured out.
> 
> I however do not necessarily think adding the "edges"
> information as an optional second item on "parent " line in a
> commit is a good idea.

Well, it's obviously unflying idea when it doesn't become core part of
GIT. ;-) That won't stop me with Cogito, though...

> Rather, treating the set of "edges" just
> like we treat the grafts feel more appropriate to me.  IOW,
> create .git/info/edges and keep your edge information there.

The thing is, you aren't supposed to have a lot of grafts, they are
expected to be rare. OTOH, renames aren't really all that rare, and as
the history goes, you'll accumulate a lot of them. Besides, it's not
just a cache - in case the user manually recorded the rename, it's an
important historical information, as important as e.g. the commit
message. It should be also distributed with the rest of the history
data, not be a per-repository thing. (Actually, this does not mix with
the cache at all - that should be a separate thing, quite possibly in
info/.)

But if GIT won't do this, I will have to do this in some Cogito-specific
way. My preliminary plan is appending something like this to the commit
message:

	---
	!%$rename foo\x20bar baz\nquux

!%$ being some arbitrary magic literal, and filter this out from cg-log
and such. Also, all Cogito commits adding/removing files but not
renaming any would have a '!%$norenames' line in the commit message, so
that I know that I don't have to do the expensive heuristics because no
rename/copy happened.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-22  1:26   ` git-rev-list: add "--dense" flag Linus Torvalds
@ 2005-10-22  2:56     ` Petr Baudis
  2005-10-22  3:30       ` Linus Torvalds
  2005-10-22  3:55     ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Baudis @ 2005-10-22  2:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Dear diary, on Sat, Oct 22, 2005 at 03:26:27AM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> On Sat, 22 Oct 2005, Petr Baudis wrote:
> > There is no documentation for the --dense flag and it is even missing
> > from the usage string.
> 
> I'm not much for docs ;)
> 
> The whole path argument also isn't even there (and without paths, --dense 
> doesn't matter)

Then it should be added to the docs as well. ;-)

Oh well, I guess I can always send a patch when yours will get applied. :-)

> > But my main concern is - will it be possible to do the rename detection
> > here as well?
> 
> Yes. Note that git-rev-list doesn't actually _do_ the diff, it only checks 
> whether the tree is changed. You still just get a list of commits out of 
> it, and it is up to you to decide what to do with it.

Ok, fair enough.

> If you're just going to feed them to diff-tree _anyway_, then you might as 
> well not even do the dense thing, because quite frankly, you're just doing 
> extra work.

fork() is awfully expensive for me. fork()ing something (supposedly)
trivial (it was stat) slowed my (non-trivial) loop about 4 times. So I
think it will still pay off hugely if I will be able do the diff-tree
only on the interesting commits.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: Handling renames.
  2005-10-22  1:51       ` Petr Baudis
  2005-10-22  2:10         ` Junio C Hamano
@ 2005-10-22  3:23         ` Linus Torvalds
  1 sibling, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-22  3:23 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, Git Mailing List

On Sat, 22 Oct 2005, Petr Baudis wrote:
> 
> > You can follow renames _afterwards_. 
> 
> I can - crudely, but what's the point, if the user is dying to give me
> the information.

No the user is NOT.

The fact is, users have not a frigging clue when a rename happens. 

I told you before, I'll tell you again: if you depend on users telling you 
about renames, you'll get it wrong. You'll get it wrong quite often, in 
fact.

This is not something I'm going to discuss again. Go back to all the same 
arguments from 6 months ago. I was right then, I'm right now.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-22  2:56     ` Petr Baudis
@ 2005-10-22  3:30       ` Linus Torvalds
  0 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-22  3:30 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, Git Mailing List

On Sat, 22 Oct 2005, Petr Baudis wrote:
> 
> > If you're just going to feed them to diff-tree _anyway_, then you might as 
> > well not even do the dense thing, because quite frankly, you're just doing 
> > extra work.
> 
> fork() is awfully expensive for me. fork()ing something (supposedly)
> trivial (it was stat) slowed my (non-trivial) loop about 4 times.

Not fork. More like

	git-rev-list .. | git-diff-tree --stdin

If you do it that way (which is quite common - git-whatchanged, git-log 
etc), there's no reason to ask for a dense output, because git-diff-tree 
can handle the non-dense one as fast as git-rev-list can.

But the --dense option is wonderful for "gitk", and for "slow stuff". 
Where "slow" can indeed be a fork, or just an interpreter loop like a 
shell/perl script, which would fork/exec git-diff-tree for each commit.

>From a commit-list standpoint

	git-rev-list --dense ... -- paths

is the same as

	git-rev-list -- paths | git-diff-tree -s -r --stdin -- paths

but it's somewhat more efficient, and more importantly, it rewrites the 
parents. So quite often, you'd want to use "--dense" together with 
"--parent" so that you get the "rewritten" history. That's how come gitk 
didn't need any changes, and it "just worked". It's also how my example 
script did the rename detection without having to traverse unnecessary 
commits.

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-22  1:26   ` git-rev-list: add "--dense" flag Linus Torvalds
  2005-10-22  2:56     ` Petr Baudis
@ 2005-10-22  3:55     ` Junio C Hamano
  2005-10-22 20:37       ` Linus Torvalds
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2005-10-22  3:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Petr Baudis, Git Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> Yes. Note that git-rev-list doesn't actually _do_ the diff, it only checks 
> whether the tree is changed. You still just get a list of commits out of 
> it, and it is up to you to decide what to do with it.
>
> If you're just going to feed them to diff-tree _anyway_, then you might as 
> well not even do the dense thing, because quite frankly, you're just doing 
> extra work.
>
> But let's say that you want to follow a certain filename, what you can do 
> is basically (fake shell syntax with "goto restart")
>...
> or something similar. 

Isn't that only true because you are not doing more than "have
these paths change" in the new rev-list that already has part of
diff-tree?

If rev-list can optionally be told to detect renames internally
(it has necessary bits after all), it could adjust the set of
paths to follow when it sees something got renamed, either by
replacing the original path given from the command line with its
previous name, or adding its previous name to the set of path
limitters (to cover the copy case as well).

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-22  3:55     ` Junio C Hamano
@ 2005-10-22 20:37       ` Linus Torvalds
  0 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-22 20:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, Git Mailing List

On Fri, 21 Oct 2005, Junio C Hamano wrote:
> 
> If rev-list can optionally be told to detect renames internally
> (it has necessary bits after all), it could adjust the set of
> paths to follow when it sees something got renamed, either by
> replacing the original path given from the command line with its
> previous name, or adding its previous name to the set of path
> limitters (to cover the copy case as well).

The problem with renames isn't the straight-line case. 

The problem with renames is the merge case. And quite frankly, I don't 
know how to handle that sanely.

If everything was straight-line (CVS), renames would be trivial. But 
git-rev-list very fundamentally works on something that isn't. So let's 
look at the _fundamental_ problem:

 - git-rev-list traverses one commit at a time, and it doesn't even _know_ 
   whether it has seem all the parents of that commit during the first 
   phase.

   The first phase is the "limit_list()" thing, which is when we decide 
   which commits are reachable, uninteresting, and which merges to follow.

   Now, think about this: since we don't even know that we've seen all 
   parents, that pretty much by definition means that we can't know what 
   has been renamed if we were to track it.

 - in the second phase (which is where I do the densification), we do 
   actually have the full tree and that makes a lot of things a lot easier 
   to do. When it comes to dense, for example, it means that I know all 
   the UNINTERESTING logic has already been done, and that all merges have 
   been simplified.

   But in the second phase, we couldn't do rename detection either, since 
   by then, we've already fixed the list of names as far as merges are 
   concerned.

And note that this fundamental issue is true _whether_ we have some 
explicit rename information in a commit or not. 

Git-rev-list has a few additional issues that make it even more 
interesting:

 - git-rev-list fundamentally is designed for multiple heads. There is no 
   one special "origin" head. We might not have _one_ end-point, we might 
   have twenty-five different heads we're looking at at the same time. Try

	gitk --all --dense -d -- git-fetch.c

   on the current git archive, and be impressed by how well it works (and 
   yes, you'll see a funny artifact: "--dense" never removes a line of 
   development entirely, so you'll see a single dot for the two 
   "unrelated" lines: "gitk" and "to-do" do not share a root with the rest 
   of them. You'll get a single dot for that root, even if it 
   obviously doesn't actually change "git-fetch.c").

   You'll also very clearly see the "Big tool rename" which created that 
   name: it will be the first commit after the root that is visible that 
   way.

 - path limiters are fundamentally designed to take a list of paths and 
   directories. Personally, I find that to be a lot more important than a 
   single file. That's very efficient the way we do it, but if you were to 
   _change_ the list of paths, you'd basically have to track those changes 
   over history.

   (This actually happens even with a single path - imagine a merge, and a 
   rename down one line of the merge. You'll have to keep track of 
   different files for the different lines of development, and then when 
   they join together, and the rename only happened in one of them, you'll 
   have to make an executive decision, or just continue to track _both_)

So what do I propose? I propose that you realize that "git-rev-list" is 
just the _building_ block. It does one thing, and one thing only: it 
creates revision list.

A user basically _never_ uses git-rev-list directly: it just doesn't make 
sense. It's not what git-rev-list is there for. git-rev-list is meant to 
be used as a base for doing the real work. And that's how you can use it 
for renaming too.

If you think of "git-rev-list --dense -- name" as a fast way to get a set 
of commits that affect "name", suddenly it all makes sense. You suddenly 
realize that that's a nice building block for figuring out renames. It's 
not _all_ of it, but it's a big portion.

To go back to "gitk", let's see what the path limitation shows us. Right 
now, doing a

	gitk --all --dense -d -- git-fetch.sh

only shows that particular name, and that's by design. Maybe that's what 
the user wants? You have to realize that especially if you remember an 
_old_ name that may not even exist any more, that's REALLY what you want. 
Something that works more like "annotate" is useless, because something 
that works like annotate would just say "I don't have that file, I can't 
follow renames" and exit.

So the first lesson to learn is that following just pure path-names is 
actually meaningful ON ITS OWN! Sometimes you do NOT want to follow 
renames.

For example, let's say that you used to work on git 4 months ago, but gave 
up, since it was too rough for you. But you played around with it, and now 
you're back, and you have an old patch that you want to re-apply, but it 
doesn't talk about "git-fetch.sh", it talks about "git-fetch-script". So 
you do

	gitk --dense -- git-fetch-script

and voila, it does the right thing, and top of the thing is "Big tool 
rename", which tells you _exactly_ what happened to that PATHNAME. 

See? Static pathnames are important! 

Now, this does show that when you _do_ care about renames, "gitk" right 
now doesn't help you very much (no offense to gitk - how could it know 
that git-rev-list would give it pathname detection one day?). Let's 
go back to the original example, and see what we could do to make gitk 
more useful..

	gitk --all --dense -d -- git-fetch.sh

Go and select that "Big tool rename" thing, and start looking for the 
rename..

You won't find it. Why? You'll see all-new files, no renames. It turns out 
that "gitk" follows the "parent" thing a bit _too_ slavishly, which is 
correct on one level, but in this case with "--dense" it turns out that 
what you want to do is see only what _that_ commit does, not what that 
commit did relative to its parent (which was the initial revision).

So while "--dense" made gitk work even without any changes, it's clear 
that the new capability means that gitk might want to have a new button: 
"show diff against 'fake parent'" and "show diff against 'real parent'".

If you want the global view, the default gitk behaviour is correct (it 
will show a "valid" diff - you'll see everything that changed between the 
points it shows). But in a rename-centric world, you want the _local_ 
change to that commit, and right now gitk can't show you that.

So for trackign renames, we probably want that as a helper to gitk.

Also, gitk has a "re-read references" button, but if you track renames, 
you probably want to do more than re-read them: you want to re-DO them 
with the "Big rename" as the new head (forget the old references 
entirely), and with the name list changed. New functionality (possibly 
you'd like to havea "New wiew" button, which actually starts a new gitk 
so that you can see both of them). Right now you'd have to do it by hand:

	gitk --dense 215a7ad1ef790467a4cd3f0dcffbd6e5f04c38f7 -- git-fetch-script 

(where 215a.. is the thing you get when you select the "Big tool rename")

You'd also probably like to have some way to limit the names shown for the 
git-diff-tree in gitk. 

In short, the new "gitk --dense -- filename" doesn't help you nearly as 
much as it _could_. But when you squint a bit, I think you'll see how it's 
quite possible to do...

		Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-21 23:40 git-rev-list: add "--dense" flag Linus Torvalds
                   ` (2 preceding siblings ...)
  2005-10-22  0:37 ` Petr Baudis
@ 2005-10-25 18:07 ` Jonas Fonseca
  2005-10-25 18:23   ` Linus Torvalds
  3 siblings, 1 reply; 37+ messages in thread
From: Jonas Fonseca @ 2005-10-25 18:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Linus Torvalds <torvalds@osdl.org> wrote Fri, Oct 21, 2005:
> To see this in action, try something like
> 
> 	gitk --dense -- gitk
> 
> to see just the history that affects gitk.

Is the initial commit supposed to be listed when the file has been added
later? I would expect it to only list until (and including) the commit
where the file was introduced.

	prompt > git-rev-list --dense HEAD -- compat/mmap.c
	f48000fcbe1009c18f1cc46e56cde2cb632071fa
	730d48a2ef88a7fb7aa4409d40b1e6964a93267f
	e83c5163316f89bfbde7d9ab23ca2e25604af290
	prompt > git-cat-file commit e83c
	tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
	author Linus Torvalds <torvalds@ppc970.osdl.org> 1112911993 -0700
	committer Linus Torvalds <torvalds@ppc970.osdl.org> 1112911993 -0700

	Initial revision of "git", the information manager from hell

Interestingly, for the special gitk, the correct commit is the last rev
listed.

-- 
Jonas Fonseca

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-25 18:07 ` Jonas Fonseca
@ 2005-10-25 18:23   ` Linus Torvalds
  2005-10-25 18:40     ` Jonas Fonseca
  2005-10-25 18:50     ` Linus Torvalds
  0 siblings, 2 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-25 18:23 UTC (permalink / raw)
  To: Jonas Fonseca; +Cc: Junio C Hamano, Git Mailing List



On Tue, 25 Oct 2005, Jonas Fonseca wrote:
> 
> Is the initial commit supposed to be listed when the file has been added
> later?

Right now --dense will _always_ show the root commit. I didn't do the 
logic that does the diff against an empty tree. I was lazy.

This patch does that, and may or may not work.

Does this match what you expected?

		Linus

----
diff --git a/rev-list.c b/rev-list.c
index 5f125fd..038dae2 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -439,6 +439,30 @@ static int same_tree(struct tree *t1, st
 	return !is_different;
 }
 
+static int same_tree_as_empty(struct tree *t1)
+{
+	int retval;
+	void *tree;
+	struct tree_desc empty, real;
+
+	if (!t1)
+		return 0;
+
+	tree = read_object_with_reference(t1->object.sha1, "tree", &real.size, NULL);
+	if (!tree)
+		return 0;
+	real.buf = tree;
+
+	empty.buf = "";
+	empty.size = 0;
+
+	is_different = 0;
+	retval = diff_tree(&empty, &real, "", &diff_opt);
+	free(tree);
+
+	return !retval && !is_different;
+}
+
 static struct commit *try_to_simplify_merge(struct commit *commit, struct commit_list *parent)
 {
 	if (!commit->tree)
@@ -523,11 +547,17 @@ static void compress_list(struct commit_
 		struct commit_list *parent = commit->parents;
 		list = list->next;
 
+		if (!parent) {
+			if (!same_tree_as_empty(commit->tree))
+				commit->object.flags |= TREECHANGE;
+			continue;
+		}
+
 		/*
 		 * Exactly one parent? Check if it leaves the tree
 		 * unchanged
 		 */
-		if (parent && !parent->next) {
+		if (!parent->next) {
 			struct tree *t1 = commit->tree;
 			struct tree *t2 = parent->item->tree;
 			if (!t1 || !t2 || same_tree(t1, t2))

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-25 18:23   ` Linus Torvalds
@ 2005-10-25 18:40     ` Jonas Fonseca
  2005-10-25 18:52       ` Linus Torvalds
  2005-10-25 18:50     ` Linus Torvalds
  1 sibling, 1 reply; 37+ messages in thread
From: Jonas Fonseca @ 2005-10-25 18:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Linus Torvalds <torvalds@osdl.org> wrote Tue, Oct 25, 2005:
> On Tue, 25 Oct 2005, Jonas Fonseca wrote:
> > 
> > Is the initial commit supposed to be listed when the file has been added
> > later?
> 
> Right now --dense will _always_ show the root commit. I didn't do the 
> logic that does the diff against an empty tree. I was lazy.
> 
> This patch does that, and may or may not work.

Without the workaround below it segfaults.

> Does this match what you expected?

Yes, thanks.

diff --git a/rev-list.c b/rev-list.c
index 5f125fd..82ec656 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -87,6 +87,7 @@ static void rewrite_one(struct commit **
 		struct commit *p = *pp;
 		if (p->object.flags & (TREECHANGE | UNINTERESTING))
 			return;
+		if (!p->parents) return;
 		/* Only single-parent commits don't have TREECHANGE */
 		*pp = p->parents->item;
 	}


-- 
Jonas Fonseca

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-25 18:23   ` Linus Torvalds
  2005-10-25 18:40     ` Jonas Fonseca
@ 2005-10-25 18:50     ` Linus Torvalds
  1 sibling, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-25 18:50 UTC (permalink / raw)
  To: Jonas Fonseca; +Cc: Junio C Hamano, Git Mailing List



On Tue, 25 Oct 2005, Linus Torvalds wrote:
> 
> Right now --dense will _always_ show the root commit. I didn't do the 
> logic that does the diff against an empty tree. I was lazy.
> 
> This patch does that, and may or may not work.

Never mind. It's incorrect. 

It's -close- to being correct, but it will SIGSEGV because now the root 
won't necessarily have the TREECHANGED flag, so now it can follow the root 
down to its parents (which is a NULL pointer - that's the definition of a 
root commit, of course).

It needs to remove the parent pointers that become NULL.

This patch is even slightly tested, and might do a better job.

[ Sorry for sending untested crap out, but I have a fairly good track 
  record in general. Testing is for wimps ]

		Linus

---
diff --git a/rev-list.c b/rev-list.c
index 5f125fd..edf3b37 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -81,23 +81,28 @@ static void show_commit(struct commit *c
 	fflush(stdout);
 }
 
-static void rewrite_one(struct commit **pp)
+static int rewrite_one(struct commit **pp)
 {
 	for (;;) {
 		struct commit *p = *pp;
 		if (p->object.flags & (TREECHANGE | UNINTERESTING))
-			return;
-		/* Only single-parent commits don't have TREECHANGE */
+			return 0;
+		if (!p->parents)
+			return -1;
 		*pp = p->parents->item;
 	}
 }
 
 static void rewrite_parents(struct commit *commit)
 {
-	struct commit_list *parent = commit->parents;
-	while (parent) {
-		rewrite_one(&parent->item);
-		parent = parent->next;
+	struct commit_list **pp = &commit->parents;
+	while (*pp) {
+		struct commit_list *parent = *pp;
+		if (rewrite_one(&parent->item) < 0) {
+			*pp = parent->next;
+			continue;
+		}
+		pp = &parent->next;
 	}
 }
 
@@ -439,6 +444,30 @@ static int same_tree(struct tree *t1, st
 	return !is_different;
 }
 
+static int same_tree_as_empty(struct tree *t1)
+{
+	int retval;
+	void *tree;
+	struct tree_desc empty, real;
+
+	if (!t1)
+		return 0;
+
+	tree = read_object_with_reference(t1->object.sha1, "tree", &real.size, NULL);
+	if (!tree)
+		return 0;
+	real.buf = tree;
+
+	empty.buf = "";
+	empty.size = 0;
+
+	is_different = 0;
+	retval = diff_tree(&empty, &real, "", &diff_opt);
+	free(tree);
+
+	return retval >= 0 && !is_different;
+}
+
 static struct commit *try_to_simplify_merge(struct commit *commit, struct commit_list *parent)
 {
 	if (!commit->tree)
@@ -523,11 +552,17 @@ static void compress_list(struct commit_
 		struct commit_list *parent = commit->parents;
 		list = list->next;
 
+		if (!parent) {
+			if (!same_tree_as_empty(commit->tree))
+				commit->object.flags |= TREECHANGE;
+			continue;
+		}
+
 		/*
 		 * Exactly one parent? Check if it leaves the tree
 		 * unchanged
 		 */
-		if (parent && !parent->next) {
+		if (!parent->next) {
 			struct tree *t1 = commit->tree;
 			struct tree *t2 = parent->item->tree;
 			if (!t1 || !t2 || same_tree(t1, t2))

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: git-rev-list: add "--dense" flag
  2005-10-25 18:40     ` Jonas Fonseca
@ 2005-10-25 18:52       ` Linus Torvalds
  0 siblings, 0 replies; 37+ messages in thread
From: Linus Torvalds @ 2005-10-25 18:52 UTC (permalink / raw)
  To: Jonas Fonseca; +Cc: Junio C Hamano, Git Mailing List



On Tue, 25 Oct 2005, Jonas Fonseca wrote:
> 
> Without the workaround below it segfaults.

Yes, but your patch only partly helps. It fixes the SIGSEGV, but it still 
needs to remove the "parents" pointer when the parent ends up NULL.

The patch I just sent out should be better. I think.

			Linus

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2005-10-25 18:52 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-10-21 23:40 git-rev-list: add "--dense" flag Linus Torvalds
2005-10-21 23:47 ` Linus Torvalds
2005-10-21 23:55 ` Linus Torvalds
2005-10-22  0:37 ` Petr Baudis
2005-10-22  0:47   ` Handling renames Petr Baudis
2005-10-22  1:28     ` Linus Torvalds
2005-10-22  1:51       ` Petr Baudis
2005-10-22  2:10         ` Junio C Hamano
2005-10-22  2:49           ` Petr Baudis
2005-10-22  3:23         ` Linus Torvalds
2005-10-22  1:26   ` git-rev-list: add "--dense" flag Linus Torvalds
2005-10-22  2:56     ` Petr Baudis
2005-10-22  3:30       ` Linus Torvalds
2005-10-22  3:55     ` Junio C Hamano
2005-10-22 20:37       ` Linus Torvalds
2005-10-25 18:07 ` Jonas Fonseca
2005-10-25 18:23   ` Linus Torvalds
2005-10-25 18:40     ` Jonas Fonseca
2005-10-25 18:52       ` Linus Torvalds
2005-10-25 18:50     ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2005-04-15 13:37 Handling renames linux
2005-04-15 13:53 ` David Woodhouse
2005-04-14 17:54 David Woodhouse
2005-04-14 18:11 ` Linus Torvalds
2005-04-14 19:09   ` David Woodhouse
2005-04-14 18:12 ` Ingo Molnar
2005-04-14 18:32   ` Linus Torvalds
2005-04-14 18:58     ` Ingo Molnar
2005-04-14 19:20       ` David Woodhouse
2005-04-14 19:21     ` David Mansfield
2005-04-14 18:21 ` H. Peter Anvin
2005-04-14 18:48   ` Linus Torvalds
2005-04-14 18:49     ` H. Peter Anvin
2005-04-14 19:22     ` Zach Welch
2005-04-14 19:40       ` Andrew Timberlake-Newell
2005-04-14 22:23 ` Daniel Barkalow
2005-04-14 22:46   ` David Woodhouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).