Git development
* Re: diff against a tag ?
From: Linus Torvalds @ 2005-04-28 22:22 UTC (permalink / raw)
  To: Dave Jones; +Cc: Junio C Hamano, git
In-Reply-To: <20050428220626.GC15706@redhat.com>



On Thu, 28 Apr 2005, Dave Jones wrote:
>
> Groovy. Is it 'THE LAW' that the first one it reports will always be
> the most recent tag ?

Nope. They are reported in the order they are found, which is not 
meaningful at all (it depends on the directory ordering, with the 
highest-level bits being obviously ordered by the sha1 number thanks to 
the subdirectory stuff).

So you can only see the name of the tag - there's no ordering. The 
tag-name may of course imply an ordering in itself..

> Hmm, in a fresh rsync from your kernel tree, I get this..
> tagged commit a2755a80f40e5794ddc20e00f781af9d6320fafb (v2.6.12-rc3) in 0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
> tagged commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (v2.6.12-rc2) in 9e734775f7c22d2f89943ad6c745571f1930105f
> expect dangling commits - potential heads - due to lack of head information
> dangling tag 0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
> dangling commit 9acf6597c533f3d5c991f730c6a1be296679018e
> dangling tag 9e734775f7c22d2f89943ad6c745571f1930105f
> 
> Is that last part to be expected ?

It even says so: "expect dangling commits".

The tags will always be dangling, since nothing refers to them. Once you 
have them listed in your tag database (ie you've created files that 
mention them in .git/refs/tags or something), you can tell fsck about 
them, and fsck won't complain. 

Something like

	fsck-cache --unreachable $(cat .git/refs/*/*)

would do it (depending on exactly how cogito ends up recording them).
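
As a rough illustration of what the `$(cat .git/refs/*/*)` expansion above hands to fsck, here is a minimal Python sketch. The function name `ref_targets` and the throwaway directory layout are assumptions made for this example; the two tag object names are the ones from the fsck output quoted earlier in this message.

```python
import tempfile
from pathlib import Path


def ref_targets(git_dir):
    """Return the object names stored in <git_dir>/refs/*/*, one per ref file."""
    ref_files = sorted(Path(git_dir, "refs").glob("*/*"))
    return [f.read_text().strip() for f in ref_files if f.is_file()]


# Demo on a throwaway directory laid out like a tiny .git/refs tree
# (hypothetical layout, mirroring the .git/refs/tags idea in the text).
with tempfile.TemporaryDirectory() as tmp:
    tags = Path(tmp, "refs", "tags")
    tags.mkdir(parents=True)
    (tags / "v2.6.12-rc2").write_text("9e734775f7c22d2f89943ad6c745571f1930105f\n")
    (tags / "v2.6.12-rc3").write_text("0397236d43e48e821cce5bbe6a80a1a56bb7cc3a\n")
    demo_targets = ref_targets(tmp)
```

The resulting list corresponds to the extra arguments in the command above: objects named there count as referenced, so they are no longer reported as dangling.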

		Linus


* Re: How to get bash to shut up about SIGPIPE?
From: Linus Torvalds @ 2005-04-28 22:16 UTC (permalink / raw)
  To: Edgar Toernig; +Cc: Git Mailing List, Petr Baudis
In-Reply-To: <20050428233104.3e606ba9.froese@gmx.de>



On Thu, 28 Apr 2005, Edgar Toernig wrote:
> 
> Try this:

Nope. No difference what-so-ever.

Side note: it's timing-dependent, so apparently especially on SMP, 
sometimes it doesn't complain, and then I think something worked. Then I 
try again, and it's always back.

I've made a bug-report against bash.

		Linus


* Re: kernel.org now has gitweb installed
From: David Woodhouse @ 2005-04-28 22:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List
In-Reply-To: <42715B30.6010705@zytor.com>

On Thu, 2005-04-28 at 14:52 -0700, H. Peter Anvin wrote:
> I thought about this for a few seconds (I really should do that more 
> often...) and realized what it is you want: you want a primary search 
> criterion which is "when did event X become visible to me", where "me" 
> in this case is the web tool.  That is not repository information, but 
> it is perfectly possible for the webtool to be aware of what it has 
> previously seen and when.
> 
> And yes, this ordering is clearly different for each observer.

The mailing list does it with tags -- it remembers the 'last seen
commit' and then effectively does 'rev-tree HEAD ^LASTSEEN', except that
I make a primitive attempt to get the ordering a little better than what
I get from rev-tree. But since the mailing list runs are hourly, I
really can get away with a _primitive_ attempt. That's why I hadn't
noticed the local/remote ordering problem that Linus pointed out.

It's not clear how you'd attempt to track local history for the general
case though -- the whole concept of a 'local' branch being special is
anathema to git. You'd have to hack it into some auxiliary storage, as I
do with tags -- but to get a fully correct ordering it'd have to track
at least every locally-performed merge, and you really don't want to be
doing that kind of thing.

You might perhaps attempt to find a path through the graph which takes
in as many commits as possible where committer == `logname`@`hostname`
-- but as Linus and I already said, that's expensive.

I'm not entirely sure what the answer is; but it isn't parent ordering
and it isn't dates.

Using dates might be a nice quick approximation, but that really isn't
good enough. 

I wonder if we could try to enforce some meaning for dates though....
Currently, 'rev-tree AAAA ^BBBB' has to build the _entire_ tree for BBBB
back to the beginning, so it knows where to stop when following AAAA. 

However, if we _do_ take Junio's suggestion of enforcing monotonicity,
then we'll always know that the parents of a given commit will have a
timestamp which is older than its own timestamp. 

So given the task "list commits between 2.6.12-rc3 and 2.6.12-rc4" we
could look at the timestamp of rc3, and immediately follow the rc4
parents until we start seeing commits which are older than rc3. Then
each time we hit a commit in the parents of rc4 which is older than rc3
is, we continue doing a breadth-first search from rc3 until all the
parents we're looking at are older than the parent of rc4 which we're
currently considering. Etc. 

That means that the common case of "in A but not in B" can at least be
handled relatively efficiently without having to wait while it tracks
the history all the way back to the beginning. I still don't like it
much though...
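
A minimal Python sketch of the idea described above, not taken from any git source. It assumes a toy history held in a dict and, crucially, that every parent is strictly older than its child (the monotonicity being discussed). The names `in_a_not_b`, `history`, and the commit ids are hypothetical.

```python
import heapq


def in_a_not_b(commits, a, b):
    """List commits reachable from a but not from b, newest first.

    commits maps id -> (timestamp, [parent ids]).
    Assumption: every parent is strictly older than its child.
    """
    flags = {b: "skip"}            # everything reachable from b is unwanted
    flags.setdefault(a, "want")    # a itself is wanted unless a == b
    heap = [(-commits[c][0], c) for c in flags]
    heapq.heapify(heap)            # max-heap on timestamp via negation
    wanted = sum(1 for f in flags.values() if f == "want")
    result = []
    # Stop as soon as no wanted commit is left in the queue.
    while heap and wanted:
        _, c = heapq.heappop(heap)
        if flags[c] == "want":
            wanted -= 1
            result.append(c)
        for p in commits[c][1]:
            if p not in flags:
                flags[p] = flags[c]
                heapq.heappush(heap, (-commits[p][0], p))
                if flags[p] == "want":
                    wanted += 1
            elif flags[c] == "skip" and flags[p] == "want":
                # p is older than c, so it is still queued; retract it.
                flags[p] = "skip"
                wanted -= 1
    return result


history = {
    "root": (1, []),
    "old": (2, ["root"]),
    "x": (3, ["old"]),
    "rc3": (4, ["x"]),
    "y": (5, ["x"]),
    "rc4": (6, ["rc3", "y"]),
}
between = in_a_not_b(history, "rc4", "rc3")
```

In this toy history the walk stops before it ever looks up "old" or "root", which is the saving being described. If the timestamp assumption does not hold, the early stop can give a wrong answer.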

-- 
dwmw2



* Re: kernel.org now has gitweb installed
From: Linus Torvalds @ 2005-04-28 22:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: David Woodhouse, Petr Baudis, Git Mailing List
In-Reply-To: <42715B30.6010705@zytor.com>



On Thu, 28 Apr 2005, H. Peter Anvin wrote:
>
> I thought about this for a few seconds (I really should do that more 
> often...) and realized what it is you want: you want a primary search 
> criterion which is "when did event X become visible to me", where "me" 
> in this case is the web tool.  That is not repository information, but 
> it is perfectly possible for the webtool to be aware of what it has 
> previously seen and when.

This is exactly what rev-tree does, and how things like the commit emails 
happen.

The problem is that since it's observer-dependent, it's not generally very 
useful for something like a web interface. You really don't want to keep 
track of what everybody has seen ;)

What you _can_ try to keep track of is what some "special observer" has
seen. That's really quite complicated too, but if you do a web interface,
the "special observer" is yourself. Then every time you mirror the
thing, you need to remember what your "last view" was, and you base your
"new view" on the fact that you know what you saw last time, so you know
which things are new to _you_.

But it really means that each web interface ends up showing quite
_different_ information, and the particular information you show ends up
being dependent on when you started looking at the tree (and how often you
re-generate new views).
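
To make the "special observer" idea concrete, here is a small Python sketch under stated assumptions: the history is a toy dict mapping each commit id to its parents, and the `Observer` class and its `update` method are hypothetical names invented for this example.

```python
def reachable(commits, tip):
    """Return the set of commits reachable from tip (tip included)."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(commits[c])
    return seen


class Observer:
    """Remembers the last head it saw and reports what is new to it."""

    def __init__(self):
        self.last_seen = None

    def update(self, commits, head):
        old = reachable(commits, self.last_seen) if self.last_seen else set()
        new = reachable(commits, head) - old
        self.last_seen = head
        return new


history = {"c1": [], "c2": ["c1"], "c3": ["c2"]}

hourly, daily = Observer(), Observer()
hourly_first = hourly.update(history, "c2")    # looked early
hourly_second = hourly.update(history, "c3")   # looked again later
daily_first = daily.update(history, "c3")      # only looked once, later
```

The two observers end up with different notions of what arrived when, even though they looked at the same history, which is the observer dependence described above.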

This really is why "time" is interesting. Because it's simple, and 
observers can agree about it (not because the time was the same, but 
because each observer just agrees that time is "whatever was reported as 
the local time at the point the action happened").

		Linus


* Re: [PATCH] Built-in diff driver shows Index: line.
From: Junio C Hamano @ 2005-04-28 22:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0504281236090.18901@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Actually, I do dislike the Index: line, and think this is a pretty
LT> intrusive work-around for a problem with diffstat.

Alright.  Please consider the patch retracted.

LT> Oh, actually maybe the better pattern to use is the one that GNU diff 
LT> itself ends up matching:
LT> 	"*** %[^\t ]%[\t ]%d%c%d%c%d %d:%d:%d"
LT> where the "%c" has to be either '-' or '/' (ie it ends up matching as 
LT> "numeric date" + "numeric time").

LT> You can put the "mode" thing at the end, and diffstat won't care about it.

Hmph.  Timestamps do not mean anything in most of the intended
uses of the diff-* family, since they are meant to operate on trees,
except:

 - comparing against the working tree --- show-diff's <new> and
   diff-cache's <new>; we can take the timestamp from the
   filesystem.

 - comparing against a tree that comes from a known commit ---
   we can take the timestamp of the commit that contains the
   file.

If we want to show the timestamp of the latter, diff-tree and
diff-cache need to be taught to notice when their
tree-or-commit parameter is actually a commit and, if so, to
pass the timestamp in the committer field down to the
diff driver.  There is no way for diff-tree-helper to do this
because the origin information is already stripped out when it
sees a valid SHA1.

So I'd say we'd punt this one for now, unless somebody else has
a better idea.



* Re: diff against a tag ?
From: Dave Jones @ 2005-04-28 22:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.58.0504281358060.18901@ppc970.osdl.org>

On Thu, Apr 28, 2005 at 02:01:40PM -0700, Linus Torvalds wrote:
 > 
 > 
 > On Thu, 28 Apr 2005, Junio C Hamano wrote:
 > > 
 > > Depends on your definition of today, but with the patch below I
 > > sent today, you can say "diff-tree -p $tag $(cat .git/HEAD)".
 > 
 > I think Dave was wondering how to _find_ the tag in the first place, which 
 > is a different issue.

Indeed. Sorry if I was unclear.

 > Right now fsck is the only thing that reports tags that aren't referenced 
 > some other way. Once you know the tag, things are easy - even without 
 > Junio's patch you can just do
 > 
 > 	object=$(cat-file tag $tag | sed 's/object //;q')
 > 
 > and then you can just do
 > 
 > 	diff-tree $object $(cat .git/HEAD)
 > 
 > or whatever you want to do.
 > 
 > Dave: do a "fsck --tags" in your tree, and it will talk about the tags it
 > finds. Then you can create files like .git/refs/tags/v2.6.12-rc2 that
 > contain pointers to those tags..

Groovy. Is it 'THE LAW' that the first one it reports will always be
the most recent tag ?

Hmm, in a fresh rsync from your kernel tree, I get this..
tagged commit a2755a80f40e5794ddc20e00f781af9d6320fafb (v2.6.12-rc3) in 0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
tagged commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (v2.6.12-rc2) in 9e734775f7c22d2f89943ad6c745571f1930105f
expect dangling commits - potential heads - due to lack of head information
dangling tag 0397236d43e48e821cce5bbe6a80a1a56bb7cc3a
dangling commit 9acf6597c533f3d5c991f730c6a1be296679018e
dangling tag 9e734775f7c22d2f89943ad6c745571f1930105f

Is that last part to be expected ?

		Dave



* Re: kernel.org now has gitweb installed
From: Linus Torvalds @ 2005-04-28 22:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <7voeby60fp.fsf@assigned-by-dhcp.cox.net>



On Thu, 28 Apr 2005, Junio C Hamano wrote:
> 
> If that is really the case, shouldn't we do one of the
> following:

No. It's not the case that time-stamps are meaningless. 

The thing about distributed stuff is that time gets "fuzzy". It doesn't go 
away. It's still very valid to say "this was done yesterday".

But what gets fuzzy is "before" and "after". For two reasons:

 - time isn't synchronized, and clocks can be off. Usually by just a 
   little bit, but sometimes you'll find just plain badly maintained 
   machines, and time can be a year or two off.

   Ergo: time is a _hint_. It's usually a pretty good hint, but it's a 
   hint.

 - the "parent" relationship is the only "hard" before/after thing that 
   git knows about, but it ignores a lot of real-world interaction, so
   thinking that it is the _only_ before/after measure is ignoring all the
   other communication in a system.

   So parenthood guarantees that something happened "before", but _not_ 
   being directly related doesn't mean that they were totally independent. 
   There's no fixed "speed of light" that defines some absolute "cone of 
   reachability".

So time is relevant, but it's more of a hint than anything absolute. 
Anything that -depends- on time is a bug waiting to happen, but something 
that uses time to visualize things makes sense.

The big advantage with time is that it's cheap. If you want to do a full 
reachability analysis, you have to look at the whole revision tree. That's 
quite possible RIGHT NOW, but it simply will not be practical in a year, 
when we have 15,000 commits.

So "time" ends up being an approximation for "doing it right".

As an example: it's quite expensive to ask "was this commit part of 
2.6.12-rc3?" because that involves knowing the whole set of commits 
involved in 2.6.12-rc3. Which in turn involves walking the whole revision 
tree starting at 2.6.12-rc3 downwards. 

That's exactly what "rev-tree" does, though. "rev-tree" will do the whole
reachability thing, and as a result you can see whether something was in
2.6.12-rc3 or not. But just for fun - time how long it takes for
"rev-tree" to output its first entry, and how long it takes for "rev-list"
to print its first line. 

Hint: do it with a cold-cache "sparse" tree. "rev-list" will start
outputting data immediately, and work it out as it goes along. "rev-tree"  
will think for some time, and then blast the data out.

In other words: rev-list is what you want for something like "git log", 
because you care about _latency_ of the result.

And that's why it uses time. It's an approximation, but it is an 
approximation that has meaning in real life.
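
The latency point can be sketched in a few lines of Python. This is an illustration of a time-ordered incremental walk, assuming a toy history dict; it is not git's actual rev-list code, and the names `rev_list` and `history` are placeholders.

```python
import heapq


def rev_list(commits, head):
    """Yield commits reachable from head, most recent timestamp first.

    commits maps id -> (timestamp, [parent ids]). Each commit is yielded
    as soon as it is popped, so the first result needs only the head.
    """
    seen = {head}
    heap = [(-commits[head][0], head)]
    while heap:
        _, c = heapq.heappop(heap)
        yield c
        for p in commits[c][1]:
            if p not in seen:
                seen.add(p)
                heapq.heappush(heap, (-commits[p][0], p))


history = {
    "base": (10, []),
    "left": (40, ["base"]),
    "right": (30, ["base"]),
    "merge": (50, ["left", "right"]),
}
first = next(rev_list(history, "merge"))   # available immediately
order = list(rev_list(history, "merge"))
```

Because the ordering relies on timestamps, a commit made on a machine with a badly skewed clock can be printed after its own parent when that parent is also reachable through another path. That is the sense in which time is a hint here.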

		Linus


* Re: kernel.org now has gitweb installed
From: H. Peter Anvin @ 2005-04-28 21:52 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List
In-Reply-To: <1114723214.2734.9.camel@localhost.localdomain>

David Woodhouse wrote:
> 
> Hmm, that's true; albeit unfortunate. 
> 
> Still, using the date isn't any better. It'll give results which are
> about as random as just sorting by the sha1 of each parent.
> 
> Yes, the ordering of the parents in a merge is probably meaningless in
> the general case, but so is the date.
> 
> The best we could probably do, from a theoretical standpoint, is to look
> at the paths via each parent to a common ancestor, and look at how many
> of the commits on each path were done by the same committer. Even that
> isn't ideal, and it's probably fairly expensive -- but it's pointless to
> pretend we can infer anything from _either_ the dates or the ordering of
> the parents in a merge.
> 

I thought about this for a few seconds (I really should do that more 
often...) and realized what it is you want: you want a primary search 
criterion which is "when did event X become visible to me", where "me" 
in this case is the web tool.  That is not repository information, but 
it is perfectly possible for the webtool to be aware of what it has 
previously seen and when.

And yes, this ordering is clearly different for each observer.

	-hpa


* Re: kernel.org now has gitweb installed
From: H. Peter Anvin @ 2005-04-28 21:50 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List
In-Reply-To: <1114723214.2734.9.camel@localhost.localdomain>

David Woodhouse wrote:
> 
> Hmm, that's true; albeit unfortunate. 
> 
> Still, using the date isn't any better. It'll give results which are
> about as random as just sorting by the sha1 of each parent.
> 
> Yes, the ordering of the parents in a merge is probably meaningless in
> the general case, but so is the date.
> 
> The best we could probably do, from a theoretical standpoint, is to look
> at the paths via each parent to a common ancestor, and look at how many
> of the commits on each path were done by the same committer. Even that
> isn't ideal, and it's probably fairly expensive -- but it's pointless to
> pretend we can infer anything from _either_ the dates or the ordering of
> the parents in a merge.
> 

Perhaps the right thing to do is to draw a graph instead?

	-hpa


* Re: kernel.org now has gitweb installed
From: David Woodhouse @ 2005-04-28 21:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504281432490.18901@ppc970.osdl.org>

On Thu, 2005-04-28 at 14:40 -0700, Linus Torvalds wrote:
> Wrong. The date _does_ have meaning. It shows which of the parents was 
> more recent, which indirectly is a hint about which side had more activity 
> going on. 
> 
> In other words, it _is_ meaningful. Maybe it's a _statistical_ meaning 
> ("that side is probably the active one, because it has the last commit"), 
> but it's a meaning.

It's not entirely clear what 'active' is supposed to be useful for in
this instance. You could just as well count the commits between the
merge and the common ancestor, if you want to see which side was most
_active_ -- but that isn't helpful for deciding the order in which
'cg-log' should show commits.

What you really want there is 'local' vs. 'remote', because people want
to see the order in which changesets arrived in the _local_ repository
-- if the last thing you did was pull from me, people want all my
changesets to be at the top; regardless of who last committed to their
tree before the merge -- i.e. regardless of whether I did a last-minute
commit before you pulled, or whether you'd done another commit to your
tree immediately before pulling.

As you rightly point out, the local/remote information isn't really
available in an easy form -- certainly not from the ordering of the
parents in a merge commit. But let's not fool ourselves that we can
piece it together from the date either.

OK, the date _is_ meaningful in a way, but only in the same way that the
author's name and IRC address information is meaningful. Of course we
didn't include it for _nothing_, but it's outside the scope of git
itself; it isn't part of the useful information which git should care
about.

-- 
dwmw2



* Re: kernel.org now has gitweb installed
From: Junio C Hamano @ 2005-04-28 21:49 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <1114724307.2734.17.camel@localhost.localdomain>

>>>>> "DW" == David Woodhouse <dwmw2@infradead.org> writes:

DW> On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote:
>> "I want to view commit between these ones---give me a linearized list
>> of commits."  When following the ancestor chain from the current top,
>> we can immediately stop upon seeing a commit made before the timestamp
>> of the named bottom one.

DW> This absolutely must not be timestamp based. If I ask for a list of
DW> commits between 2.6.12-rc3 and 2.6.12-rc4 I _really_ want to see those
DW> commits which happened before 2.6.12-rc3 but in a remote tree which was
DW> only later pulled. That's what 'rev-tree AAAAAA ^BBBBBB' already gives
DW> you.

How true.  I stand corrected.



* [PATCH] add short options to show-files
From: Nicolas Pitre @ 2005-04-28 21:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git


The show-files long options are cumbersome to type.  This patch adds 
equivalent short options.

Also add missing "unmerged" to usage string.

Finally, cut the number of lines for argument parsing in half.

Signed-off-by: Nicolas Pitre <nico@cam.org>

--- k/show-files.c
+++ l/show-files.c
@@ -206,6 +206,10 @@ static void show_files(void)
 	}
 }
 
+static const char *show_files_usage =
+	"show-files [-z] (--[cached|deleted|others|stage|unmerged])* "
+	"[ --ignored [--exclude=<pattern>] [--exclude-from=<file>] ]";
+
 int main(int argc, char **argv)
 {
 	int i;
@@ -215,55 +219,30 @@ int main(int argc, char **argv)
 
 		if (!strcmp(arg, "-z")) {
 			line_terminator = 0;
-			continue;
-		}
-
-		if (!strcmp(arg, "--cached")) {
+		} else if (!strcmp(arg, "-c") || !strcmp(arg, "--cached")) {
 			show_cached = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--deleted")) {
+		} else if (!strcmp(arg, "-d") || !strcmp(arg, "--deleted")) {
 			show_deleted = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--others")) {
+		} else if (!strcmp(arg, "-o") || !strcmp(arg, "--others")) {
 			show_others = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--ignored")) {
+		} else if (!strcmp(arg, "-i") || !strcmp(arg, "--ignored")) {
 			show_ignored = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--stage")) {
+		} else if (!strcmp(arg, "-s") || !strcmp(arg, "--stage")) {
 			show_stage = 1;
-			continue;
-		}
-		if (!strcmp(arg, "--unmerged")) {
+		} else if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
 			// There's no point in showing unmerged unless you also show the stage information
 			show_stage = 1;
 			show_unmerged = 1;
-			continue;
-		}
-
-		if (!strcmp(arg, "-x") && i+1 < argc) {
+		} else if (!strcmp(arg, "-x") && i+1 < argc) {
 			add_exclude(argv[++i]);
-			continue;
-		}
-		if (!strncmp(arg, "--exclude=", 10)) {
+		} else if (!strncmp(arg, "--exclude=", 10)) {
 			add_exclude(arg+10);
-			continue;
-		}
-		if (!strcmp(arg, "-X") && i+1 < argc) {
+		} else if (!strcmp(arg, "-X") && i+1 < argc) {
 			add_excludes_from_file(argv[++i]);
-			continue;
-		}
-		if (!strncmp(arg, "--exclude-from=", 15)) {
+		} else if (!strncmp(arg, "--exclude-from=", 15)) {
 			add_excludes_from_file(arg+15);
-			continue;
-		}
-
-		usage("show-files [-z] (--[cached|deleted|others|stage])* "
-		      "[ --ignored [--exclude=<pattern>] [--exclude-from=<file>) ]");
+		} else
+			usage(show_files_usage);
 	}
 
 	if (show_ignored && !nr_excludes) {


* Re: Finding file revisions
From: Linus Torvalds @ 2005-04-28 21:50 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Chris Mason, git
In-Reply-To: <1114723987.4212.51.camel@localhost.localdomain>



On Thu, 28 Apr 2005, Kay Sievers wrote:
> 
> Sure. But file-changes lists the commit:
>   c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
> 
> when asked for:
>   "drivers/usb/core/usb.c"
> 
> and that file isn't touched there. Actually it lists merge-commits which
> are not related to the file.

It really _is_ touched by that commit. Look closer.

It has two parents: one that had already merged with Greg's USB tree, and 
one that had _not_ done so.

So whether it "modifies" the USB files or not really depends on which 
parent you go back. 

In general, you tend to want to ignore merge-nodes for looking at 
differences, but the differences are definitely there, and they are often 
vital (ie it's often _very_ important to know which side of a merge didn't 
change something).
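
A toy Python sketch of why the answer depends on the parent chosen. The one-entry "trees" below are an assumption made for illustration (real trees hold the whole kernel); the path and the two blob names are the ones quoted in this thread, and `changed_paths` is a hypothetical helper.

```python
def changed_paths(parent_tree, child_tree):
    """Paths whose blob differs, or exists on one side only, between two trees."""
    paths = parent_tree.keys() | child_tree.keys()
    return sorted(p for p in paths if parent_tree.get(p) != child_tree.get(p))


USB = "drivers/usb/core/usb.c"
OLD_BLOB = "f0534ee064901d0108eb7b2b1fcb59a98bb53c2b"
NEW_BLOB = "c231b4bef314284a168fedb6c5f6c47aec5084fc"

# Toy trees: path -> blob name.
merge_tree = {USB: NEW_BLOB}
already_merged_parent = {USB: NEW_BLOB}   # had already merged the USB tree
not_yet_merged_parent = {USB: OLD_BLOB}   # had not

per_parent = {
    "already merged": changed_paths(already_merged_parent, merge_tree),
    "not yet merged": changed_paths(not_yet_merged_parent, merge_tree),
}
```

Against one parent the merge changes nothing in that file; against the other it does, so both answers are correct for the parent they were computed against.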

		Linus


* Re: kernel.org now has gitweb installed
From: Junio C Hamano @ 2005-04-28 21:44 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <1114723402.2734.11.camel@localhost.localdomain>

>>>>> "DW" == David Woodhouse <dwmw2@infradead.org> writes:

DW> On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote:
>> 2. Assuming that we do want to enforce that parent fields of a
>> commit object name valid commit objects, is it OK to also
>> require that the commit timestamp of a child object is not in
>> the future relative to any and all of its parent commit
>> objects

DW> No. Time is utterly meaningless -- it's perfectly normal for clocks to
DW> be out of sync. We really don't want to fall into the trap of assigning
DW> any meaning to the timestamp.

If that is really the case, shouldn't we do one of the
following:

 (1) Timestamp is meaningless.  Stop recording it in the commit
     objects.

 (2) Keep recording meaningless timestamp in the commit objects,
     because otherwise it would break backward compatibility.
     However, stop looking at timestamp in commit.c; especially
     pop_most_recent_commit() is meaningless, and hence so is what
     rev-list does.

 (3) Require the proper ordering in the timestamp as I
     suggested.  Users should take note and make corrective
     action if their clocks are _way_ out of sync.

I do not think we want to do either (1) or (2).



* Re: kernel.org now has gitweb installed
From: Linus Torvalds @ 2005-04-28 21:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <7v1x8u7g26.fsf@assigned-by-dhcp.cox.net>



On Thu, 28 Apr 2005, Junio C Hamano wrote:
> 
> 1. Currently, commit-tree does not seem to verify that all its
>    parent SHA1's actually name valid commit objects.  Is this
>    intentional?

No. Me lazy. I think we should check as many _cheap_ things as possible,
and checking whether a parent at least superficially looks like a real
commit object is certainly cheap.

> 2. Assuming that we do want to enforce that parent fields of a
>    commit object name valid commit objects, is it OK to also
>    require that the commit timestamp of a child object is not in
>    the future relative to any and all of its parent commit
>    objects (I'm talking about the timestamp of committer field
>    not author field, although your e-mail patch acceptance
>    procedure seems to be giving it the same timestamp right
>    now)?

No, this is not ok. Clock skew is real, and somebody may have a 
misconfigured machine. Being careful about integrity is good, but trying 
to enforce time flow in a distributed environment is just being anal.

Maybe a warning.
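
One way such a warning could look, sketched in Python under stated assumptions: commits live in a toy dict keyed by id, timestamps are plain integers, and `clock_skew_warnings` is a hypothetical name. It only reports; it never rejects anything.

```python
def clock_skew_warnings(commits):
    """Return (child, parent) pairs where the child is older than its parent.

    commits maps id -> (committer timestamp, [parent ids]).
    """
    warnings = []
    for child, (when, parents) in commits.items():
        for parent in parents:
            if when < commits[parent][0]:
                warnings.append((child, parent))
    return warnings


history = {
    "base": (100, []),
    "ok": (110, ["base"]),
    "skewed": (90, ["ok"]),   # committed on a machine whose clock runs behind
}
suspicious = clock_skew_warnings(history)
```

The commit with the odd timestamp is still a perfectly valid commit; the list is only something a tool might print.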

		Linus


* Re: kernel.org now has gitweb installed
From: David Woodhouse @ 2005-04-28 21:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <7v1x8u7g26.fsf@assigned-by-dhcp.cox.net>

On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote:
> "I want to view commit between these ones---give me a linearized list
> of commits."  When following the ancestor chain from the current top,
> we can immediately stop upon seeing a commit made before the timestamp
> of the named bottom one.

This absolutely must not be timestamp based. If I ask for a list of
commits between 2.6.12-rc3 and 2.6.12-rc4 I _really_ want to see those
commits which happened before 2.6.12-rc3 but in a remote tree which was
only later pulled. That's what 'rev-tree AAAAAA ^BBBBBB' already gives
you.

-- 
dwmw2



* Re: kernel.org now has gitweb installed
From: Linus Torvalds @ 2005-04-28 21:40 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <1114723214.2734.9.camel@localhost.localdomain>



On Thu, 28 Apr 2005, David Woodhouse wrote:
> 
> Still, using the date isn't any better. It'll give results which are
> about as random as just sorting by the sha1 of each parent.

Well, it does use real information, and it is repeatable. And I don't see 
why you say that the date is meaningless, when it clearly isn't. The date 
absolutely does have meaning. 

Not having a global clock doesn't mean that clocks go away. It just means 
that they don't generate a total sort. They still generate a _partial_ 
sort, though, and it's a very valid partial sort.

The fact is, this is how the world works in real life too. Relativity 
doesn't make time "pointless". You still have "before" and "after" for 
almost all relevant events. The fact that not _all_ events can be sorted 
by "before" and "after", and different observers can disagree about some 
of the ordering does not mean that causality has gone away and that time 
is meaningless.

The same is true in a distributed system. Time still exists, and is still 
meaningful even outside the direct "causality" links implied by the 
parents. People probably discussed things, and there are methods of 
communication other than just direct parent links, and while you're not 
_guaranteed_ that "before" and "after" always makes sense, they definitely 
still exist 99% of the time.

> The best we could probably do, from a theoretical standpoint, is to look
> at the paths via each parent to a common ancestor, and look at how many
> of the commits on each path were done by the same committer.

That's quite expensive. 

> Even that isn't ideal, and it's probably fairly expensive -- but it's
> pointless to pretend we can infer anything from _either_ the dates or
> the ordering of the parents in a merge.

Wrong. The date _does_ have meaning. It shows which of the parents was 
more recent, which indirectly is a hint about which side had more activity 
going on. 

In other words, it _is_ meaningful. Maybe it's a _statistical_ meaning 
("that side is probably the active one, because it has the last commit"), 
but it's a meaning.

		Linus


* Re: Finding file revisions
From: Kay Sievers @ 2005-04-28 21:33 UTC (permalink / raw)
  To: Chris Mason; +Cc: Linus Torvalds, git
In-Reply-To: <200504281658.39300.mason@suse.com>

On Thu, 2005-04-28 at 16:58 -0400, Chris Mason wrote:
> On Thursday 28 April 2005 15:11, Kay Sievers wrote:
> >
> > Can you confirm this with the kernel tree?
> >   file-changes -c 9acf6597c533f3d5c991f730c6a1be296679018e drivers/usb/core/usb.c
> >
> > lists the commit:
> >   diff-tree -r 1d66c64c3cee10a465cd3f8bd9191bbeb718f650 c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
> > f0534ee064901d0108eb7b2b1fcb59a98bb53c2b->c231b4bef314284a168fedb6c5f6c47aec5084fc drivers/usb/core/usb.c
> >
> >  cat-file commit c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
> >
> > which seems not to have changed the file asked for.
> 
> Hmmm, that does work here:
> 
> coffee:/src/git # diff-tree -r 1d66c64c3cee10a465cd3f8bd9191bbeb718f650 c79bea07ec4d3ef087962699fe8b2f6dc5ca7754 | grep usb.core.usb.c
> *100644->100644 blob    f0534ee064901d0108eb7b2b1fcb59a98bb53c2b->c231b4bef314284a168fedb6c5f6c47aec5084fc      drivers/usb/core/usb.c
> 
> -chris

Sure. But file-changes lists the commit:
  c79bea07ec4d3ef087962699fe8b2f6dc5ca7754

when asked for:
  "drivers/usb/core/usb.c"

and that file isn't touched there. Actually it lists merge-commits which
are not related to the file.

Thanks,
Kay




* Re: How to get bash to shut up about SIGPIPE?
From: Edgar Toernig @ 2005-04-28 21:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Petr Baudis
In-Reply-To: <Pine.LNX.4.58.0504281121430.18901@ppc970.osdl.org>

Linus Torvalds wrote:
>
> Damn bash. What's the magic incantation that says SHUT UP!?

Try this:

+	{ set -e; trap 'exit 1' SIGPIPE
	$revls | $revsort | while read time commit parents; do
		[ "$revfmt" = "rev-list" ] && commit="$time"
		...
-	done | ${PAGER:-less} ${PAGER_FLAGS:--R}
+	done; } | ${PAGER:-less} ${PAGER_FLAGS:--R}

Ciao, ET.


* Re: Finding file revisions
From: Linus Torvalds @ 2005-04-28 21:32 UTC (permalink / raw)
  To: Chris Mason; +Cc: Kay Sievers, git
In-Reply-To: <200504281658.39300.mason@suse.com>



On Thu, 28 Apr 2005, Chris Mason wrote:

> On Thursday 28 April 2005 15:11, Kay Sievers wrote:
> >
> > Can you confirm this with the kernel tree?
> >   file-changes -c 9acf6597c533f3d5c991f730c6a1be296679018e drivers/usb/core/usb.c
> >
> > lists the commit:
> >   diff-tree -r 1d66c64c3cee10a465cd3f8bd9191bbeb718f650 c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
> > f0534ee064901d0108eb7b2b1fcb59a98bb53c2b->c231b4bef314284a168fedb6c5f6c47aec5084fc drivers/usb/core/usb.c
> >
> >  cat-file commit c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
> >
> > which seems not to have changed the file asked for.
> 
> Hmmm, that does work here:
> 
> coffee:/src/git # diff-tree -r 1d66c64c3cee10a465cd3f8bd9191bbeb718f650 c79bea07ec4d3ef087962699fe8b2f6dc5ca7754 | grep usb.core.usb.c
> *100644->100644 blob    f0534ee064901d0108eb7b2b1fcb59a98bb53c2b->c231b4bef314284a168fedb6c5f6c47aec5084fc      drivers/usb/core/usb.c

I think Kay is confused by the fact that the commit is a -merge- commit, 
and the first parent has _not_ changed that file - it got changed through 
the merge.

Ie:

	cat-file commit c79bea07ec4d3ef087962699fe8b2f6dc5ca7754

gives

	tree 3fbdc4745cfde60df7d05815b343e4a253020530
	parent a9e4820c4c170b3df0d2185f7b4130b0b2daed2c
	parent 1d66c64c3cee10a465cd3f8bd9191bbeb718f650
	author Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113921100 -0700
	committer Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113921100 -0700

	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6.git/

and if you do a diff against the _first_ parent you don't see anything 
changing in USB..
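
A small sketch of the point above, assuming nothing beyond the commit text already shown: pull every `parent` line out of the commit header and print one `diff-tree` invocation per parent, so that a merge is compared against each side rather than only the first. The commands are printed rather than run, since no object database is assumed here.

```sh
#!/bin/sh
# Commit text as shown in the message above (leading tabs removed).
commit=c79bea07ec4d3ef087962699fe8b2f6dc5ca7754
commit_text='tree 3fbdc4745cfde60df7d05815b343e4a253020530
parent a9e4820c4c170b3df0d2185f7b4130b0b2daed2c
parent 1d66c64c3cee10a465cd3f8bd9191bbeb718f650
author Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113921100 -0700
committer Linus Torvalds <torvalds@ppc970.osdl.org.(none)> 1113921100 -0700

Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6.git/'

# Collect the parent ids from the header, stopping at the first blank line.
parents=$(printf '%s\n' "$commit_text" | sed -n -e '/^$/q' -e 's/^parent //p')

# Print one diff-tree invocation per parent instead of running it.
for p in $parents; do
    echo "diff-tree -r $p $commit"
done
```

In a real tree the text would come from `cat-file commit` and the printed commands would be run; the second one is the diff that shows the change to drivers/usb/core/usb.c.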

		Linus

^ permalink raw reply

* Re: diff against a tag ?
From: Junio C Hamano @ 2005-04-28 21:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dave Jones, git
In-Reply-To: <Pine.LNX.4.58.0504281358060.18901@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Right now fsck is the only thing that reports tags that aren't referenced 
LT> some other way. Once you know the tag, things are easy - even without 
LT> Junio's patch you can just do
LT> 	object=$(cat-file tag $tag | sed 's/object //;q')
LT> and then you can just do
LT> 	diff-tree $object $(cat .git/HEAD)
LT> or whatever you want to do.

Of course you are right.  The patch is about skipping the first
step, so that you do not need to know whether what you have is
a tag or a commit.
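
For reference, the first step quoted above can be tried without a repository. This is a sketch on illustrative tag text: the object id is the v2.6.12-rc2 commit reported by fsck-cache elsewhere in this thread, and the layout of the lines after the first is an assumption.

```sh
#!/bin/sh
# Illustrative tag object text; only the leading "object" line matters here.
# The layout of the remaining lines is an assumption.
tag_text='object 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
type commit
tag v2.6.12-rc2'

# Strip the "object " prefix from the first line, then quit.
object=$(printf '%s\n' "$tag_text" | sed 's/object //;q')
echo "$object"
```

With a real object database the input would come from `cat-file tag $tag`, as in the quoted command, and `$object` would then be handed to diff-tree.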

LT> Dave: do a "fsck --tags" in your tree, and it will talk about the tags it
LT> finds. Then you can create files like .git/refs/tags/v2.6.12-rc2 that
LT> contain pointers to those tags..

Arrrrrrrrrgh!  I just did "fsck --tags" blindly X-<.  Good thing
I was not root.

Of course you meant "fsck-cache --tags" ;-).


^ permalink raw reply

* Re: kernel.org now has gitweb installed
From: David Woodhouse @ 2005-04-28 21:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <7v1x8u7g26.fsf@assigned-by-dhcp.cox.net>

On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote:
> 2. Assuming that we do want to enforce that parent fields of a
>    commit object name valid commit objects, is it OK to also
>    require that the commit timestamp of a child object is not in
>    the future relative to any and all of its parent commit
>    objects

No. Time is utterly meaningless -- it's perfectly normal for clocks to
be out of sync. We really don't want to fall into the trap of assigning
any meaning to the timestamp.

-- 
dwmw2


^ permalink raw reply

* Re: kernel.org now has gitweb installed
From: Junio C Hamano @ 2005-04-28 21:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504281149330.18901@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> So what you can do is:
LT>  - if there is one parent, just always walk straight down
LT>  - if it's a merge, add the parents _in_date_order_ to the list of things 
LT>    to do, and then pop the most recent one.
LT> Really. You say that dates don't matter, but they _do_ actually matter a
LT> lot more than "remote/local" does. At least they have meaning.

On a related topic, I have two questions on commit objects.

1. Currently, commit-tree does not seem to verify that all its
   parent SHA1's actually name valid commit objects.  Is this
   intentional?

I cannot see a good practical reason to commit a new version
that claims to be a descendant of some SHA1 you know exists in
somebody else's tree, without actually having that object also
in your SHA1_FILE_DIRECTORY.  Otherwise how did you merge with
it in the first place?  For that reason, I expect the answer to
this question to be "no, it was just being lazy.  Go ahead if you
really care."

2. Assuming that we do want to enforce that parent fields of a
   commit object name valid commit objects, is it OK to also
   require that the commit timestamp of a child object is not in
   the future relative to any and all of its parent commit
   objects (I'm talking about the timestamp of committer field
   not author field, although your e-mail patch acceptance
   procedure seems to be giving it the same timestamp right
   now)?

I have been wondering if imposing these two requirements has some
negative effects, but I do not offhand see any.  And these
requirements may make implementation of a git log viewer simpler
when the user specifies "I want to view the commits between these
ones---give me a linearized list of commits."  When following
the ancestor chain from the current top, we can immediately stop
upon seeing a commit made before the timestamp of the named
bottom one.
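
A rough sketch of how the date-ordered walk quoted at the top of this message and the early stop could fit together. Everything here is hypothetical: the commit table, its ids and timestamps, and the helper names `lookup` and `walk`. It only holds if a child is never older than its parents, which is exactly the requirement being asked about.

```sh
#!/bin/sh
# Hypothetical history, one commit per line: "<id> <committer-time> <parents...>"
# e is a merge of c and d; the ids and times are made up for illustration.
commits='e 50 c d
d 40 b
c 30 b
b 20 a
a 10'

# Print the table row for one commit id.
lookup() {
    printf '%s\n' "$commits" | awk -v id="$1" '$1 == id'
}

# Walk from commit $1 back to (and including) commit $2, newest first.
# Relies on the proposed rule that a child is never older than its parents.
walk() {
    bottom_time=$(lookup "$2" | awk '{ print $2 }')
    todo=$(lookup "$1")
    seen=' '
    order=''
    while [ -n "$todo" ]; do
        # Pop the most recent pending commit.
        todo=$(printf '%s\n' "$todo" | sort -k2,2nr)
        set -- $(printf '%s\n' "$todo" | head -n 1)
        todo=$(printf '%s\n' "$todo" | tail -n +2)
        id=$1
        when=$2
        shift 2
        # Everything still pending is no newer than this one, so stop here.
        [ "$when" -lt "$bottom_time" ] && break
        # Skip commits already reached through another parent.
        case "$seen" in *" $id "*) continue ;; esac
        seen="$seen$id "
        order="$order $id"
        # Queue the parents; the sort above puts them in date order.
        for p in "$@"; do
            todo=$(printf '%s\n%s\n' "$todo" "$(lookup "$p")" | sed '/^$/d')
        done
    done
    echo $order
}

result=$(walk e b)
echo "$result"
```

The walk never looks at the order in which parents are listed, only at the timestamps, so it is only as trustworthy as the clocks that produced them.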


^ permalink raw reply

* Re: kernel.org now has gitweb installed
From: David Woodhouse @ 2005-04-28 21:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504281149330.18901@ppc970.osdl.org>

On Thu, 2005-04-28 at 11:55 -0700, Linus Torvalds wrote:
> Anyway, the reason remote and local don't matter is that if somebody else
> merges with me, and I just pull the result without having any changes in 
> my tree, we just "fast-forward" to that other side, because otherwise you 
> can never "converge" on anything (people merging each others trees would 
> always create a new commit, for no good reason).
> 
> What does that mean? It means that my local tree now became the _remote_ 
> parent, even though it was always local to my tree.

Hmm, that's true; albeit unfortunate. 

Still, using the date isn't any better. It'll give results which are
about as random as just sorting by the sha1 of each parent.

Yes, the ordering of the parents in a merge is probably meaningless in
the general case, but so is the date.

The best we could probably do, from a theoretical standpoint, is to look
at the paths via each parent to a common ancestor, and look at how many
of the commits on each path were done by the same committer. Even that
isn't ideal, and it's probably fairly expensive -- but it's pointless to
pretend we can infer anything from _either_ the dates or the ordering of
the parents in a merge.

-- 
dwmw2


^ permalink raw reply

* Re: [PATCH] add a diff-files command (revised and cleaned up)
From: Junio C Hamano @ 2005-04-28 21:04 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, git
In-Reply-To: <Pine.LNX.4.62.0504281327520.14033@localhost.localdomain>

>>>>> "NP" == Nicolas Pitre <nico@cam.org> writes:

NP> What about this patch then?
NP> =====
NP> Give show-files the ability to process exclusion pattern.

I just tried this, and I like it.  Thanks!


^ permalink raw reply

