* kernel.org now has gitweb installed @ 2005-04-28 1:38 H. Peter Anvin 2005-04-28 4:17 ` Daniel Jacobowitz 2005-04-28 7:35 ` David Woodhouse 0 siblings, 2 replies; 23+ messages in thread From: H. Peter Anvin @ 2005-04-28 1:38 UTC (permalink / raw) To: Git Mailing List http://www.kernel.org/git/ -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 1:38 kernel.org now has gitweb installed H. Peter Anvin @ 2005-04-28 4:17 ` Daniel Jacobowitz 2005-04-28 7:35 ` David Woodhouse 1 sibling, 0 replies; 23+ messages in thread From: Daniel Jacobowitz @ 2005-04-28 4:17 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Git Mailing List On Wed, Apr 27, 2005 at 06:38:01PM -0700, H. Peter Anvin wrote: > http://www.kernel.org/git/ Thanks! Now all I crave is a version which can browse the file tree and file history; but I think we're almost ready for that... -- Daniel Jacobowitz CodeSourcery, LLC ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 1:38 kernel.org now has gitweb installed H. Peter Anvin 2005-04-28 4:17 ` Daniel Jacobowitz @ 2005-04-28 7:35 ` David Woodhouse 2005-04-28 8:10 ` Petr Baudis 1 sibling, 1 reply; 23+ messages in thread From: David Woodhouse @ 2005-04-28 7:35 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Git Mailing List On Wed, 2005-04-27 at 18:38 -0700, H. Peter Anvin wrote: > http://www.kernel.org/git/ Looks like the ordering is wrong. A chronological sort means that commits which were made three weeks ago, but which Linus only pulled yesterday, do not show up at the top of the tree. -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 7:35 ` David Woodhouse @ 2005-04-28 8:10 ` Petr Baudis 2005-04-28 8:29 ` David Woodhouse 0 siblings, 1 reply; 23+ messages in thread From: Petr Baudis @ 2005-04-28 8:10 UTC (permalink / raw) To: David Woodhouse; +Cc: H. Peter Anvin, Git Mailing List Dear diary, on Thu, Apr 28, 2005 at 09:35:23AM CEST, I got a letter where David Woodhouse <dwmw2@infradead.org> told me that... > On Wed, 2005-04-27 at 18:38 -0700, H. Peter Anvin wrote: > > http://www.kernel.org/git/ > > Looks like the ordering is wrong. A chronological sort means that > commits which were made three weeks ago, but which Linus only pulled > yesterday, do not show up at the top of the tree. Linus ASM (Anonymous Subsystem Maintainer) |------------------------. A| |B | | | \-------------\ | : | \------------------------\ |E C| |D | | /-------------/ | |F /------------------------/ How would you show that? F E D C B A? F D C A E B? -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 8:10 ` Petr Baudis @ 2005-04-28 8:29 ` David Woodhouse 2005-04-28 9:23 ` David Woodhouse 0 siblings, 1 reply; 23+ messages in thread From: David Woodhouse @ 2005-04-28 8:29 UTC (permalink / raw) To: Petr Baudis; +Cc: H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 10:10 +0200, Petr Baudis wrote: > Linus ASM (Anonymous Subsystem Maintainer) > > |------------------------. > A| |B > | | > | \-------------\ > | : | > \------------------------\ |E > C| |D | > | /-------------/ > | |F > /------------------------/ > > How would you show that? F E D C B A? F D C A E B? Let us assume that C and A were already in Linus' tree (and on our web page) yesterday. Thus, they should be last. The newly-pulled stuff should be first -- FEDBCA. I'd say "depth-first, remote parent first" but that would actually show 'A' (as a parent of D) long before it shows C. Walking of remote parents should stop as soon as we hit a commit which was accessible through a more local parent, rather than as soon as we hit a commit which we've already printed. Maybe it should be something like depth-first, local parent first, but _reversed_? The latter is what the mailing list feeder does, but that has the advantage of being able to use 'rev-tree $today ^$yesterday' so we _know_ we're excluding the ones people have already seen. Hence I haven't really paid that much attention to getting the order strictly correct. (Yes, I know that strictly speaking, git has no concept of 'remote' or 'local' parents. But the ordering of the two parents in a Cogito merge or pull hasn't changed, has it?) -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 8:29 ` David Woodhouse @ 2005-04-28 9:23 ` David Woodhouse 2005-04-28 18:55 ` Linus Torvalds 0 siblings, 1 reply; 23+ messages in thread From: David Woodhouse @ 2005-04-28 9:23 UTC (permalink / raw) To: Petr Baudis; +Cc: H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 09:29 +0100, David Woodhouse wrote: > Let us assume that C and A were already in Linus' tree (and on our web > page) yesterday. Thus, they should be last. The newly-pulled stuff > should be first -- FEDBCA. > > I'd say "depth-first, remote parent first" but that would actually > show 'A' (as a parent of D) long before it shows C. Walking of remote > parents should stop as soon as we hit a commit which was accessible > through a more local parent, rather than as soon as we hit a commit > which we've already printed. Walk the tree once. For each commit, count the number of _children_. That's not hard -- each new commit you find below HEAD has one child to start with, then you increment that figure by one each time you find another path to the same commit. When printing, you walk the tree depth-first, remote-parent-first. If you hit a commit with multiple children, decrement its count by one. If the count is still non-zero, ignore that commit (and its parents) and continue. If the count _is_ zero, then this is the "most local" path to the commit in question, so print it and continue to process its parents... (Actually I'd probably do it by adding real pointers to the children instead of using a counter. Operations like convert-cache would be far better off working that way round, and 'cg comments' is going to need to do something very similar to convert-cache.) -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
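The child-counting walk described in the message above can be sketched roughly as follows. This is only an illustration in Python, not code from git or Cogito. The PARENTS table, the commit names, and the convention that the local parent is listed first and the remote parent second are all assumptions made for the example (and, as the next reply points out, git itself does not guarantee that convention).

```python
# Hypothetical history: M merges local C with remote D; A is the common root.
# Each commit maps to its parents, assumed order: [local, remote].
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "M": ["C", "D"],
}


def count_children(head, parents):
    """First pass: count how many reachable children each commit has."""
    counts = {head: 0}
    seen = {head}
    stack = [head]
    while stack:
        commit = stack.pop()
        for parent in parents.get(commit, []):
            counts[parent] = counts.get(parent, 0) + 1
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return counts


def walk_remote_first(head, parents):
    """Second pass: depth-first, remote parent first.

    A commit is printed only when its last remaining child reaches it,
    i.e. when its counter drops to zero.
    """
    remaining = count_children(head, parents)
    order = []

    def visit(commit):
        order.append(commit)
        # reversed() puts the (assumed) remote parent first.
        for parent in reversed(parents.get(commit, [])):
            remaining[parent] -= 1
            if remaining[parent] == 0:
                visit(parent)

    # Recursion keeps the sketch short; a real history would need an
    # explicit stack to avoid hitting the recursion limit.
    visit(head)
    return order


print(walk_remote_first("M", PARENTS))  # ['M', 'D', 'B', 'C', 'A']
```

In this toy history the newly pulled commits D and B come out before C and A, and the shared root A is held back until the walk reaches it through C.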
* Re: kernel.org now has gitweb installed 2005-04-28 9:23 ` David Woodhouse @ 2005-04-28 18:55 ` Linus Torvalds 2005-04-28 21:20 ` David Woodhouse 2005-04-28 21:21 ` Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Linus Torvalds @ 2005-04-28 18:55 UTC (permalink / raw) To: David Woodhouse; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 28 Apr 2005, David Woodhouse wrote: > > Walk the tree once. For each commit, count the number of _children_. > That's not hard -- each new commit you find below HEAD has one child to > start with, then you increment that figure by one each time you find > another path to the same commit. > > When printing, you walk the tree depth-first, remote-parent-first. No, that really sucks. Realize that "remote" and "local" parents don't really exist. They have no meaning. I've considered sorting the parents by the sha1 name, but I've left that for now. Anyway, the reason remote and local don't matter is that if somebody else merges with me, and I just pull the result without having any changes in my tree, we just "fast-forward" to that other side, because otherwise you can never "converge" on anything (people merging each others trees would always create a new commit, for no good reason). What does that mean? It means that my local tree now became the _remote_ parent, even though it was always local to my tree. So if you look at remote vs local, you're _guaranteed_ to mess up. It has no meaning. So what you can do is: - if there is one parent, just always walk straight down - if it's a merge, add the parents _in_date_order_ to the list of things to do, and then pop the most recent one. Really. You say that dates don't matter, but they _do_ actually matter a lot more than "remote/local" does. At least they have meaning. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
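A date-ordered walk in the spirit of the suggestion above might look like the sketch below. It is a Python illustration and not git's implementation. The PARENTS and DATES tables are made up, and the sketch routes single-parent commits through the same queue as merges, which is a simplifying assumption rather than something the message specifies.

```python
import heapq

# Hypothetical history and commit timestamps (larger = more recent).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "M": ["C", "D"],
}
DATES = {"A": 1, "B": 2, "C": 3, "D": 4, "M": 5}


def walk_by_date(head, parents, dates):
    """Emit commits by repeatedly popping the most recent pending one."""
    # heapq is a min-heap, so dates are negated to pop the newest first.
    pending = [(-dates[head], head)]
    seen = {head}
    order = []
    while pending:
        _, commit = heapq.heappop(pending)
        order.append(commit)
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(pending, (-dates[parent], parent))
    return order


print(walk_by_date("M", PARENTS, DATES))  # ['M', 'D', 'C', 'B', 'A']
```

The walk never needs the whole history up front: it can start printing as soon as it has the head, which is the latency property discussed later in the thread. The ordering is only as trustworthy as the timestamps it reads.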
* Re: kernel.org now has gitweb installed 2005-04-28 18:55 ` Linus Torvalds @ 2005-04-28 21:20 ` David Woodhouse 2005-04-28 21:40 ` Linus Torvalds ` (2 more replies) 2005-04-28 21:21 ` Junio C Hamano 1 sibling, 3 replies; 23+ messages in thread From: David Woodhouse @ 2005-04-28 21:20 UTC (permalink / raw) To: Linus Torvalds; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 11:55 -0700, Linus Torvalds wrote: > Anyway, the reason remote and local don't matter is that if somebody else > merges with me, and I just pull the result without having any changes in > my tree, we just "fast-forward" to that other side, because otherwise you > can never "converge" on anything (people merging each others trees would > always create a new commit, for no good reason). > > What does that mean? It means that my local tree now became the _remote_ > parent, even though it was always local to my tree. Hmm, that's true; albeit unfortunate. Still, using the date isn't any better. It'll give results which are about as random as just sorting by the sha1 of each parent. Yes, the ordering of the parents in a merge is probably meaningless in the general case, but so is the date. The best we could probably do, from a theoretical standpoint, is to look at the paths via each parent to a common ancestor, and look at how many of the commits on each path were done by the same committer. Even that isn't ideal, and it's probably fairly expensive -- but it's pointless to pretend we can infer anything from _either_ the dates or the ordering of the parents in a merge. -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:20 ` David Woodhouse @ 2005-04-28 21:40 ` Linus Torvalds 2005-04-28 21:47 ` David Woodhouse 2005-04-28 21:50 ` H. Peter Anvin 2005-04-28 21:52 ` H. Peter Anvin 2 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2005-04-28 21:40 UTC (permalink / raw) To: David Woodhouse; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 28 Apr 2005, David Woodhouse wrote: > > Still, using the date isn't any better. It'll give results which are > about as random as just sorting by the sha1 of each parent. Well, it does use real information, and it is repeatable. And I don't see why you say that the date is meaningless, when it clearly isn't. The date absolutely does have meaning. Not having a global clock doesn't mean that clocks go away. It just means that they don't generate a total sort. They still generate a _partial_ sort, though, and it's a very valid partial sort. The fact is, this is how the world works in real life too. Relativity doesn't make time "pointless". You still have "before" and "after" for almost all relevant events. The fact that not _all_ events can be sorted by "before" and "after", and different observers can disagree about some of the ordering does not mean that causality has gone away and that time is meaningless. The same is true in a distributed system. Time still exists, and is still meaningful even outside the direct "causality" links implied by the parents. People probably discussed things, and there are methods of communication other than just direct parent links, and while you're not _guaranteed_ that "before" and "after" always makes sense, they definitely still exist 99% of the time. > The best we could probably do, from a theoretical standpoint, is to look > at the paths via each parent to a common ancestor, and look at how many > of the commits on each path were done by the same committer. That's quite expensive. 
> Even that isn't ideal, and it's probably fairly expensive -- but it's > pointless to pretend we can infer anything from _either_ the dates or > the ordering of the parents in a merge. Wrong. The date _does_ have meaning. It shows which of the parents was more recent, which indirectly is a hint about which side had more activity going on. In other words, it _is_ meaningful. Maybe it's a _statistical_ meaning ("that side is probably the active one, because it has the last commit"), but it's a meaning. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:40 ` Linus Torvalds @ 2005-04-28 21:47 ` David Woodhouse 0 siblings, 0 replies; 23+ messages in thread From: David Woodhouse @ 2005-04-28 21:47 UTC (permalink / raw) To: Linus Torvalds; +Cc: Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 14:40 -0700, Linus Torvalds wrote: > Wrong. The date _does_ have meaning. It shows which of the parents was > more recent, which indirectly is a hint about which side had more activity > going on. > > In other words, it _is_ meaningful. Maybe it's a _statistical_ meaning > ("that side is probably the active one, because it has the last commit"), > but it's a meaning. It's not entirely clear what 'active' is supposed to be useful for in this instance. You could just as well count the commits between the merge and the common ancestor, if you want to see which side was most _active_ -- but that isn't helpful for deciding the order in which 'cg-log' should show commits. What you really want there is 'local' vs. 'remote', because people want to see the order in which changesets arrived in the _local_ repository -- if the last thing you did was pull from me, people want all my changesets to be at the top; regardless of who last committed to their tree before the merge -- i.e. regardless of whether I did a last-minute commit before you pulled, or whether you'd done another commit to your tree immediately before pulling. As you rightly point out, the local/remote information isn't really available in an easy form -- certainly not from the ordering of the parents in a merge commit. But let's not fool ourselves that we can piece it together from the date either. OK, the date _is_ meaningful in a way, but only in the same way that the author's name and IRC address information is meaningful. Of course we didn't include it for _nothing_, but it's outside the scope of git itself; it isn't part of the useful information which git should care about. 
-- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:20 ` David Woodhouse 2005-04-28 21:40 ` Linus Torvalds @ 2005-04-28 21:50 ` H. Peter Anvin 2005-04-28 21:52 ` H. Peter Anvin 2 siblings, 0 replies; 23+ messages in thread From: H. Peter Anvin @ 2005-04-28 21:50 UTC (permalink / raw) To: David Woodhouse; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List David Woodhouse wrote: > > Hmm, that's true; albeit unfortunate. > > Still, using the date isn't any better. It'll give results which are > about as random as just sorting by the sha1 of each parent. > > Yes, the ordering of the parents in a merge is probably meaningless in > the general case, but so is the date. > > The best we could probably do, from a theoretical standpoint, is to look > at the paths via each parent to a common ancestor, and look at how many > of the commits on each path were done by the same committer. Even that > isn't ideal, and it's probably fairly expensive -- but it's pointless to > pretend we can infer anything from _either_ the dates or the ordering of > the parents in a merge. > Perhaps the right thing to do is to draw a graph instead? -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:20 ` David Woodhouse 2005-04-28 21:40 ` Linus Torvalds 2005-04-28 21:50 ` H. Peter Anvin @ 2005-04-28 21:52 ` H. Peter Anvin 2005-04-28 22:12 ` Linus Torvalds 2005-04-28 22:12 ` David Woodhouse 2 siblings, 2 replies; 23+ messages in thread From: H. Peter Anvin @ 2005-04-28 21:52 UTC (permalink / raw) To: David Woodhouse; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List David Woodhouse wrote: > > Hmm, that's true; albeit unfortunate. > > Still, using the date isn't any better. It'll give results which are > about as random as just sorting by the sha1 of each parent. > > Yes, the ordering of the parents in a merge is probably meaningless in > the general case, but so is the date. > > The best we could probably do, from a theoretical standpoint, is to look > at the paths via each parent to a common ancestor, and look at how many > of the commits on each path were done by the same committer. Even that > isn't ideal, and it's probably fairly expensive -- but it's pointless to > pretend we can infer anything from _either_ the dates or the ordering of > the parents in a merge. > I thought about this for a few seconds (I really should do that more often...) and realized what it is you want: you want a primary search criterion which is "when did event X become visible to me", where "me" in this case is the web tool. That is not repository information, but it is perfectly possible for the webtool to be aware of what it has previously seen and when. And yes, this ordering is clearly different for each observer. -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:52 ` H. Peter Anvin @ 2005-04-28 22:12 ` Linus Torvalds 2005-04-28 22:12 ` David Woodhouse 1 sibling, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2005-04-28 22:12 UTC (permalink / raw) To: H. Peter Anvin; +Cc: David Woodhouse, Petr Baudis, Git Mailing List On Thu, 28 Apr 2005, H. Peter Anvin wrote: > > I thought about this for a few seconds (I really should do that more > often...) and realized what it is you want: you want a primary search > criterion which is "when did event X become visible to me", where "me" > in this case is the web tool. That is not repository information, but > it is perfectly possible for the webtool to be aware of what it has > previously seen and when. This is exactly what rev-tree does, and how things like the commit emails happen. The problem is that since it's observer-dependent, it's not generally very useful for something like a web interface. You really don't want to keep track of what everybody has seen ;) What you _can_ try to keep track of is what some "special observer" has seen. That's really quite complicated too, but if you do a web interface, the "special observer" is yourself. Then at every time you mirror the thing, you need to remember what your "last view" was, and you base your "new view" on the fact that you know what you saw last time, so you know which things are new to _you_. But it really means that each web interface ends up showing quite _different_ information, and the particular information you show ends up being dependent on when you started looking at the tree (and how often you re-generate new views). This really is why "time" is interesting. Because it's simple, and observers can agree about it (not because the time was the same, but because each observer just agrees that time is "whatever was reported as the local time at the point the action happened"). Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
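The "special observer" idea above can be illustrated with a small sketch: one viewer remembers what it saw at its last refresh and treats everything else as new. This is a Python illustration only. The Observer class, its method names, and the two snapshot tables are hypothetical, and the sketch assumes history is only ever appended to, so that an already-seen commit implies already-seen ancestors.

```python
class Observer:
    """Remembers which commits one viewer (e.g. one web mirror) has seen."""

    def __init__(self):
        self.first_seen = {}  # commit -> number of the refresh that first saw it
        self.refreshes = 0

    def refresh(self, head, parents):
        """Record every commit newly reachable from head and return them."""
        self.refreshes += 1
        new = []
        stack = [head]
        while stack:
            commit = stack.pop()
            if commit in self.first_seen:
                # Seen before, so (by assumption) its ancestors were too.
                continue
            self.first_seen[commit] = self.refreshes
            new.append(commit)
            stack.extend(parents.get(commit, []))
        return new


# Hypothetical snapshots of the same repository on two consecutive days.
YESTERDAY = {"A": [], "C": ["A"]}
TODAY = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"], "M": ["C", "D"]}

mirror = Observer()
print(mirror.refresh("C", YESTERDAY))  # ['C', 'A']
print(mirror.refresh("M", TODAY))      # ['M', 'D', 'B']

# A second observer that only starts today sees all five commits as new,
# so its "arrival order" differs from the first observer's.
latecomer = Observer()
print(len(latecomer.refresh("M", TODAY)))  # 5
```

The two observers end up with different first-seen records for the same history, which is the observer dependence the message describes.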
* Re: kernel.org now has gitweb installed 2005-04-28 21:52 ` H. Peter Anvin 2005-04-28 22:12 ` Linus Torvalds @ 2005-04-28 22:12 ` David Woodhouse 2005-04-29 2:46 ` Jan Harkes 1 sibling, 1 reply; 23+ messages in thread From: David Woodhouse @ 2005-04-28 22:12 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List On Thu, 2005-04-28 at 14:52 -0700, H. Peter Anvin wrote: > I thought about this for a few seconds (I really should do that more > often...) and realized what it is you want: you want a primary search > criterion which is "when did event X become visible to me", where "me" > in this case is the web tool. That is not repository information, but > it is perfectly possible for the webtool to be aware of what it has > previously seen and when. > > And yes, this ordering is clearly different for each observer. The mailing list does it with tags -- it remembers the 'last seen commit' and then effectively does 'rev-tree HEAD ^LASTSEEN', except that I make a primitive attempt to get the ordering a little better than what I get from rev-tree. But since the mailing list runs are hourly, I really can get away with a _primitive_ attempt. That's why I hadn't noticed the local/remote ordering problem that Linus pointed out. It's not clear how you'd attempt to track local history for the general case though -- the whole concept of a 'local' branch being special is anathema to git. You'd have to hack it into some auxiliary storage, as I do with tags -- but to get a fully correct ordering it'd have to track at least every locally-performed merge, and you really don't want to be doing that kind of thing. You might perhaps attempt to find a path through the graph which takes in as many commits as possible where committer == `logname`@`hostname` -- but as Linus and I already said, that's expensive. I'm not entirely sure what the answer is; but it isn't parent ordering and it isn't dates. 
Using dates might be a nice quick approximation, but that really isn't good enough. I wonder if we could try to enforce some meaning for dates though.... Currently, 'rev-tree AAAA ^BBBB' has to build the _entire_ tree for BBBB back to the beginning, so it knows where to stop when following AAAA. However, if we _do_ take Junio's suggestion of enforcing monotonicity, then we'll always know that the parents of a given commit will have a timestamp which is older than its own timestamp. So given the task 'list commits between 2.6.12-rc3 and 2.6.12-rc4' we could look at the timestamp of rc3, and immediately follow the rc4 parents until we start seeing commits which are older than rc3. Then each time we hit a commit in the parents of rc4 which is older than rc3 is, we continue doing a breadth-first search from rc3 until all the parents we're looking at are older than the parent of rc4 which we're currently considering. Etc. That means that the common case of "in A but not in B" can at least be handled relatively efficiently without having to wait while it tracks the history all the way back to the beginning. I still don't like it much though... -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 22:12 ` David Woodhouse @ 2005-04-29 2:46 ` Jan Harkes 0 siblings, 0 replies; 23+ messages in thread From: Jan Harkes @ 2005-04-29 2:46 UTC (permalink / raw) To: Git Mailing List On Thu, Apr 28, 2005 at 11:12:52PM +0100, David Woodhouse wrote: > You might perhaps attempt to find a path through the graph which takes > in as many commits as possible where committer == `logname`@`hostname` > -- but as Linus and I already said, that's expensive. > > I'm not entirely sure what the answer is; but it isn't parent ordering > and it isn't dates. Perhaps a lamport clock? Jan ^ permalink raw reply [flat|nested] 23+ messages in thread
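The message above only names the idea, so the following is one possible reading of it rather than anything proposed in the thread: treat each commit as an event and give it a logical clock one greater than the largest clock among its parents. The function name and the PARENTS table are hypothetical, and the sketch is in Python purely for illustration.

```python
# Hypothetical history: M merges C and D; A is the common root.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "M": ["C", "D"],
}


def lamport_clocks(head, parents):
    """Assign each reachable commit the clock 1 + max(clocks of its parents).

    Root commits get clock 1. Wall-clock time is never consulted, so a
    parent always has a strictly smaller clock than any of its children.
    """
    clocks = {}
    stack = [head]
    while stack:
        commit = stack[-1]
        missing = [p for p in parents.get(commit, []) if p not in clocks]
        if missing:
            # Resolve the parents first, then come back to this commit.
            stack.extend(missing)
            continue
        stack.pop()
        clocks[commit] = 1 + max(
            (clocks[p] for p in parents.get(commit, [])), default=0
        )
    return clocks


print(lamport_clocks("M", PARENTS))
```

For this graph the clocks come out as A=1, B=2, C=2, D=3, M=4. Unrelated commits such as B and C can share a value, so like wall-clock time this gives only a partial order, but it is one that cannot contradict the parent links.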
* Re: kernel.org now has gitweb installed 2005-04-28 18:55 ` Linus Torvalds 2005-04-28 21:20 ` David Woodhouse @ 2005-04-28 21:21 ` Junio C Hamano 2005-04-28 21:23 ` David Woodhouse ` (2 more replies) 1 sibling, 3 replies; 23+ messages in thread From: Junio C Hamano @ 2005-04-28 21:21 UTC (permalink / raw) To: Linus Torvalds Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes: LT> So what you can do is: LT> - if there is one parent, just always walk straight down LT> - if it's a merge, add the parents _in_date_order_ to the list of things LT> to do, and then pop the most recent one. LT> Really. You say that dates don't matter, but they _do_ actually matter a LT> lot more than "remote/local" does. At least they have meaning. On a related topic, I have two questions on commit objects. 1. Currently, commit-tree does not seem to verify that all its parent SHA1's actually name valid commit objects. Is this intentional? I cannot see a good practical reason to commit a new version that claims to be a descendant of some SHA1 you know exists in somebody else's tree, without actually having that object also in your SHA1_FILE_DIRECTORY. Otherwise how did you merge with it in the first place? For that reason, I expect the answer to this question to be "no it was just being lazy. Go ahead if you really care." 2. Assuming that we do want to enforce that parent fields of a commit object name valid commit objects, is it OK to also require that the commit timestamp of a child object is not in the future relative to any and all of its parent commit objects (I'm talking about the timestamp of committer field not author field, although your e-mail patch acceptance procedure seems to be giving it the same timestamp right now)? I have been wondering if imposing these two requirements has some negative effects, but I do not offhand see any. 
And these requirements may make the implementation of a git log viewer simpler when the user specifies "I want to view commits between these ones---give me a linearized list of commits." When following the ancestor chain from the current top, we can immediately stop upon seeing a commit made before the timestamp of the named bottom one. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:21 ` Junio C Hamano @ 2005-04-28 21:23 ` David Woodhouse 2005-04-28 21:44 ` Junio C Hamano 2005-04-28 22:59 ` Gerhard Schrenk 2005-04-28 21:38 ` David Woodhouse 2005-04-28 21:44 ` Linus Torvalds 2 siblings, 2 replies; 23+ messages in thread From: David Woodhouse @ 2005-04-28 21:23 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote: > 2. Assuming that we do want to enforce that parent fields of a > commit object name valid commit objects, is it OK to also > require that the commit timestamp of a child object is not in > the future relative to any and all of its parent commit > objects No. Time is utterly meaningless -- it's perfectly normal for clocks to be out of sync. We really don't want to fall into the trap of assigning any meaning to the timestamp. -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:23 ` David Woodhouse @ 2005-04-28 21:44 ` Junio C Hamano 2005-04-28 22:04 ` Linus Torvalds 2005-04-28 22:59 ` Gerhard Schrenk 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2005-04-28 21:44 UTC (permalink / raw) To: David Woodhouse Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List >>>>> "DW" == David Woodhouse <dwmw2@infradead.org> writes: DW> On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote: >> 2. Assuming that we do want to enforce that parent fields of a >> commit object name valid commit objects, is it OK to also >> require that the commit timestamp of a child object is not in >> the future relative to any and all of its parent commit >> objects DW> No. Time is utterly meaningless -- it's perfectly normal for clocks to DW> be out of sync. We really don't want to fall into the trap of assigning DW> any meaning to the timestamp. If that is really the case, shouldn't we do one of the following: (1) Timestamp is meaningless. Stop recording it in the commit objects. (2) Keep recording meaningless timestamp in the commit objects, because otherwise it would break backward compatibility. However, stop looking at timestamp in commit.c; especially pop_most_recent_commit() is meaningless hence what rev-list does. (3) Require the proper ordering in the timestamp as I suggested. Users should take note and take corrective action if their clocks are _way_ out of sync. I do not think we want to do either (1) or (2). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:44 ` Junio C Hamano @ 2005-04-28 22:04 ` Linus Torvalds 0 siblings, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2005-04-28 22:04 UTC (permalink / raw) To: Junio C Hamano Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 28 Apr 2005, Junio C Hamano wrote: > > If that is really the case, shouldn't we do one of the > following: No. It's not the case that time-stamps are meaningless. The thing about distributed stuff is that time gets "fuzzy". It doesn't go away. It's still very valid to say "this was done yesterday". But what gets fuzzy is "before" and "after". For two reasons: - time isn't synchronized, and clocks can be off. Usually by just a little bit, but sometimes you'll find just plain badly maintained machines, and time can be a year or two off. Ergo: time is a _hint_. It's usually a pretty good hint, but it's a hint. - the "parent" relationship is the only "hard" before/after thing that git knows about, but it ignores a lot of real-world interaction, so thinking that it is the _only_ before/after measure is ignoring all the other communication in a system. So parenthood guarantees that something happened "before", but _not_ being directly related doesn't mean that they were totally independent. There's no fixed "speed of light" that defines some absolute "cone of reachability". So time is relevant, but it's more of a hint than anything absolute. Anything that -depends- on time is a bug waiting to happen, but something that uses time to visualize things makes sense. The big advantage with time is that it's cheap. If you want to do a full reachability analysis, you have to look at the whole revision tree. That's quite possible RIGHT NOW, but it simply will not be practical in a year, when we have 15,000 commits. So "time" ends up being an approximation for "doing it right". As an example: it's quite expensive to ask "was this commit part of 2.6.12-rc3?" 
because that involves knowing the whole set of commits involved in 2.6.12-rc3. Which in turn involves walking the whole revision tree starting at 2.6.12-rc3 downwards. That's exactly what "rev-tree" does, though. "rev-tree" will do the whole reachability thing, and as a result you can see whether something was in 2.6.12-rc3 or not. But just for fun - time how long it takes for "rev-tree" to output its first entry, and how long it takes for "rev-list" to print its first line. Hint: do it with a cold-cache "sparse" tree. "rev-list" will start outputting data immediately, and work it out as it goes along. "rev-tree" will think for some time, and then blast the data out. In other words: rev-list is what you want for something like "git log", because you care about _latency_ of the result. And that's why it uses time. It's an approximation, but it is an approximation that has meaning in real life. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:23 ` David Woodhouse 2005-04-28 21:44 ` Junio C Hamano @ 2005-04-28 22:59 ` Gerhard Schrenk 1 sibling, 0 replies; 23+ messages in thread From: Gerhard Schrenk @ 2005-04-28 22:59 UTC (permalink / raw) To: Git Mailing List * David Woodhouse <dwmw2@infradead.org> [2005-04-28 23:23]: > No. Time is utterly meaningless -- This is fundamentally wrong. Space-time and causality have a *very* important meaning. If you don't use this information (directly or indirectly) in your data model or history graph you do something very stupid. You simply won't optimize for the common case because you won't scale with the fundamental physical laws of information exchange and synchronisation, you just kind of break space-time symmetry. Ever compared Feynman diagrams to merge diagrams? > it's perfectly normal for clocks to be out of sync. Yes even special relativity just boils down to "there is no absolute simultaneity". So what? I'll predict if you break causality your kernel will suddenly destabilize and explode like a nuclear bomb ;-) Gerhard ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:21 ` Junio C Hamano 2005-04-28 21:23 ` David Woodhouse @ 2005-04-28 21:38 ` David Woodhouse 2005-04-28 21:49 ` Junio C Hamano 2005-04-28 21:44 ` Linus Torvalds 2 siblings, 1 reply; 23+ messages in thread From: David Woodhouse @ 2005-04-28 21:38 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote: > "I want to view commits between these ones---give me a linearized list > of commits." When following the ancestor chain from the current top, > we can immediately stop upon seeing a commit made before the timestamp > of the named bottom one. This absolutely must not be timestamp based. If I ask for a list of commits between 2.6.12-rc3 and 2.6.12-rc4, I _really_ want to see those commits which happened before 2.6.12-rc3 but in a remote tree which was only later pulled. That's what 'rev-tree AAAAAA ^BBBBBB' already gives you. -- dwmw2 ^ permalink raw reply [flat|nested] 23+ messages in thread
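The objection above can be made concrete with a small Python sketch. It assumes that 'rev-tree AAAAAA ^BBBBBB' amounts to "reachable from the top but not from the bottom"; the toy graph and the function names `by_reachability` and `by_timestamp` are hypothetical and not taken from git. The commit `side` was made before `rc3` in a remote tree and merged afterwards, so a timestamp cutoff drops it while the reachability difference keeps it.

```python
# Hypothetical toy history: commit id -> (committer timestamp, parent ids).
GRAPH = {
    "rc4": (50, ["merge"]),
    "merge": (40, ["rc3", "side"]),
    "rc3": (30, ["base"]),
    "side": (20, ["base"]),  # older than rc3, but only pulled in after it
    "base": (10, []),
}

def reachable(graph, tip):
    """Return the set of all commits reachable from tip, tip included."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit][1])
    return seen

def by_reachability(graph, top, bottom):
    """Commits reachable from top that are not reachable from bottom."""
    return reachable(graph, top) - reachable(graph, bottom)

def by_timestamp(graph, top, bottom):
    """Commits reachable from top that are strictly newer than bottom."""
    cutoff = graph[bottom][0]
    return {c for c in reachable(graph, top) if graph[c][0] > cutoff}
```

Here `by_reachability(GRAPH, "rc4", "rc3")` includes `side`, while `by_timestamp(GRAPH, "rc4", "rc3")` does not, because `side` carries an earlier timestamp than `rc3`.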
* Re: kernel.org now has gitweb installed 2005-04-28 21:38 ` David Woodhouse @ 2005-04-28 21:49 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2005-04-28 21:49 UTC (permalink / raw) To: David Woodhouse Cc: Linus Torvalds, Petr Baudis, H. Peter Anvin, Git Mailing List >>>>> "DW" == David Woodhouse <dwmw2@infradead.org> writes: DW> On Thu, 2005-04-28 at 14:21 -0700, Junio C Hamano wrote: >> "I want to view commits between these ones---give me a linearized list >> of commits." When following the ancestor chain from the current top, >> we can immediately stop upon seeing a commit made before the timestamp >> of the named bottom one. DW> This absolutely must not be timestamp based. If I ask for a list of DW> commits between 2.6.12-rc3 and 2.6.12-rc4, I _really_ want to see those DW> commits which happened before 2.6.12-rc3 but in a remote tree which was DW> only later pulled. That's what 'rev-tree AAAAAA ^BBBBBB' already gives DW> you. How true. I stand corrected. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel.org now has gitweb installed 2005-04-28 21:21 ` Junio C Hamano 2005-04-28 21:23 ` David Woodhouse 2005-04-28 21:38 ` David Woodhouse @ 2005-04-28 21:44 ` Linus Torvalds 2 siblings, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2005-04-28 21:44 UTC (permalink / raw) To: Junio C Hamano Cc: David Woodhouse, Petr Baudis, H. Peter Anvin, Git Mailing List On Thu, 28 Apr 2005, Junio C Hamano wrote: > > 1. Currently, commit-tree does not seem to verify that all its > parent SHA1's actually name valid commit objects. Is this > intentional? No. Me lazy. I think we should check as many _cheap_ things as possible, and checking whether a parent at least superficially looks like a real commit object is certainly cheap. > 2. Assuming that we do want to enforce that parent fields of a > commit object name valid commit objects, is it OK to also > require that the commit timestamp of a child object is not in > the future relative to any and all of its parent commit > objects (I'm talking about the timestamp of committer field > not author field, although your e-mail patch acceptance > procedure seems to be giving it the same timestamp right > now)? No, this is not ok. Clock skew is real, and somebody may have a misconfigured machine. Being careful about integrity is good, but trying to enforce time flow in a distributed environment is just being anal. Maybe a warning. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
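The policy in this reply (reject parents that are not commits, but only warn about timestamps) might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the in-memory `store`, its field names, and `check_parents` are hypothetical and do not reflect how commit-tree reads objects.

```python
def check_parents(store, parent_ids, commit_time):
    """Cheap sanity check for the parents of a new commit.

    Raises ValueError when a parent is missing or is not a commit object.
    Returns a list of warning strings for parents whose committer timestamp
    is later than commit_time; clock skew is tolerated, never rejected.
    """
    warnings_found = []
    for pid in parent_ids:
        obj = store.get(pid)
        if obj is None or obj["type"] != "commit":
            raise ValueError(f"parent {pid} is not a commit object")
        if obj["committer_time"] > commit_time:
            warnings_found.append(
                f"warning: parent {pid} is newer than the new commit (clock skew?)"
            )
    return warnings_found

# Hypothetical object store: object id -> minimal metadata.
STORE = {
    "p1": {"type": "commit", "committer_time": 100},
    "p2": {"type": "commit", "committer_time": 300},
    "t1": {"type": "tree"},
}
```

With a new commit stamped at time 200, `check_parents(STORE, ["p1"], 200)` passes silently, `check_parents(STORE, ["p2"], 200)` returns one warning, and naming `t1` as a parent raises an error.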