From: linux@horizon.com
To: junkio@cox.net, mcostalba@gmail.com
Cc: git@vger.kernel.org, jonsmirl@gmail.com, linux@horizon.com,
paulus@samba.org, torvalds@osdl.org
Subject: Re: Change set based shallow clone
Date: 10 Sep 2006 11:21:22 -0400 [thread overview]
Message-ID: <20060910152122.31694.qmail@science.horizon.com> (raw)
In-Reply-To: <e5bfff550609092214t4f8e195eib28e302f4d284aa@mail.gmail.com>
This conversation is going in so many directions at once that it's
getting annoying.
The first level decision requires input from the point of view of gitk
and qgit internals as to what's easy to implement.
I'm trying to figure out the gitk code, but I'm not fluent in tcl, and
it has 39 non-boilerplate comment lines in 229 functions and 6308 lines
of source, so it requires fairly intensive grokking.
Still, while it obviously doesn't render bitmaps until the data
appears in the window, it appears as though at least part of the
layout (the "layoutmore" function) code is performed eagerly as
soon as new data arrives via the getcommitlines callback.
(Indeed, it appears that Tk does not inform it of window scroll
events, so it can't wait any longer to decide on the layout.)
Case 1: The visualizer is NOT CAPABLE of accepting out-of-order
input. Without very sophisticated cacheing in git-rev-list,
it must process all of the data before outputting any in
order to make an absolute guarantee of no out-of-order data
despite possibly messed-up timestamps.
It is possible to notice that the date-first traversal only
has one active chain and flush the queued data at that point,
but that situation is unlikely to arise in repositories of
non-trivial size, which are exactly the ones for which
the batch-sorting delay is annoying.
Case 2: The visualizer IS CAPABLE of accepting out-of-order input.
Regardless of whether the layout is done eagerly or lazily,
this requires the visualizer to potentially undo part of
its layout and re-do it, so has a UI implementation cost.
The re-doing does not need to be highly efficient; any number
of heuristics and exception caches can reduce the occurrence of
this in git-rev-list output to very low levels. It's just
absolutely excluding it, without losing incremental output,
that is difficult.
Heuristic: I suspect the most common wrong-timestamp case is
a time zone misconfiguration, so holding back output until the
tree traversal has advanced 24 hours will eliminate most of the
problems. Atypically large time jumps (like more than a year)
could also trigger special "gross clock error" handling.
Cache: whenever a child timestamped older than an ancestor is
encountered, this can be entered in a persistent cache that can
be used to give the child a different sorting priority next time.
The simplest implementation would propagate this information up a
chain of bad timestamps by one commit per git-rev-list invocation,
but even that's probably okay.
(A study of timestamp ordering problems in existing repositories
would be helpful for tuning these.)
In case 2, I utterly fail to see how delaying emitting the out-of-order
commit is of the slightest help to the UI. The simplest way to merge
out-of-order data is with an insertion sort (a.k.a. roll back and
reprocess forward), and the cost of that is minimized if the distance
to back up is minimized.
Some "oops!" annotation on the git-rev-list output may be helpful to
tell the UI that it needs to search back, but it already has an internal
index of commits, so perhaps even that isn't worth bothering with.
Fancier output formats are also more work to process.
With sufficient cacheing of exceptions in git-rev-list, it may be
practical to just have a single "oops, I screwed up; let's start again"
output line, which very rarely triggers.
But can we stop designing git-rev-list output formats until we've figured
out if and how to implement it in the visualizer? Or, more to the point,
visualizers plural. That's the hard part. Then we can see what sort
of git-rev-list output would be most convenient.
For example, is fixing a small number of out-of-place commits practical,
or is it better to purge and restart? The former avoids deleting
already-existing objects, while the latter avoids moving them.
The original problem is that the long delay between starting a git history
browser and being able to browse them is annoying. The visualizer UIs
already support browsing while history is flowing in from git-rev-list,
but git-rev-list is reading and sorting the entire history before
outputting the first line.
next prev parent reply other threads:[~2006-09-10 15:21 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-07 19:52 Change set based shallow clone Jon Smirl
2006-09-07 20:21 ` Jakub Narebski
2006-09-07 20:41 ` Jon Smirl
2006-09-07 21:33 ` Jeff King
2006-09-07 21:51 ` Jakub Narebski
2006-09-07 21:37 ` Jakub Narebski
2006-09-07 22:14 ` Junio C Hamano
2006-09-07 23:09 ` Jon Smirl
2006-09-10 23:20 ` Anand Kumria
2006-09-08 8:48 ` Andreas Ericsson
2006-09-07 22:07 ` Junio C Hamano
2006-09-07 22:40 ` Jakub Narebski
2006-09-08 3:54 ` Martin Langhoff
2006-09-08 5:30 ` Junio C Hamano
2006-09-08 7:15 ` Martin Langhoff
2006-09-08 8:33 ` Junio C Hamano
2006-09-08 17:18 ` A Large Angry SCM
2006-09-08 14:20 ` Jon Smirl
2006-09-08 15:50 ` Jakub Narebski
2006-09-09 3:13 ` Petr Baudis
2006-09-09 8:39 ` Jakub Narebski
2006-09-08 5:05 ` Aneesh Kumar K.V
2006-09-08 1:01 ` linux
2006-09-08 2:23 ` Jon Smirl
2006-09-08 8:36 ` Jakub Narebski
2006-09-08 8:39 ` Junio C Hamano
2006-09-08 18:42 ` linux
2006-09-08 21:13 ` Jon Smirl
2006-09-08 22:27 ` Jakub Narebski
2006-09-08 23:09 ` Linus Torvalds
2006-09-08 23:28 ` Jon Smirl
2006-09-08 23:45 ` Paul Mackerras
2006-09-09 1:45 ` Jon Smirl
2006-09-10 12:41 ` Paul Mackerras
2006-09-10 14:56 ` Jon Smirl
2006-09-10 16:10 ` linux
2006-09-10 18:00 ` Jon Smirl
2006-09-10 19:03 ` linux
2006-09-10 20:00 ` Linus Torvalds
2006-09-10 21:00 ` Jon Smirl
2006-09-11 2:49 ` Linus Torvalds
2006-09-10 22:41 ` Paul Mackerras
2006-09-11 2:55 ` Linus Torvalds
2006-09-11 3:18 ` Linus Torvalds
2006-09-11 6:35 ` Junio C Hamano
2006-09-11 18:54 ` Junio C Hamano
2006-09-11 8:36 ` Paul Mackerras
2006-09-11 14:26 ` linux
2006-09-11 15:01 ` Jon Smirl
2006-09-11 16:47 ` Junio C Hamano
2006-09-11 21:52 ` Paul Mackerras
2006-09-11 23:47 ` Junio C Hamano
2006-09-12 0:06 ` Jakub Narebski
2006-09-12 0:18 ` Junio C Hamano
2006-09-12 0:25 ` Jakub Narebski
2006-09-11 9:04 ` Jakub Narebski
2006-09-10 18:51 ` Junio C Hamano
2006-09-11 0:04 ` Shawn Pearce
2006-09-11 0:42 ` Junio C Hamano
2006-09-11 0:03 ` Shawn Pearce
2006-09-11 0:41 ` Junio C Hamano
2006-09-11 1:04 ` Jakub Narebski
2006-09-11 2:44 ` Shawn Pearce
2006-09-11 5:27 ` Junio C Hamano
2006-09-11 6:08 ` Shawn Pearce
2006-09-11 7:11 ` Junio C Hamano
2006-09-11 17:52 ` Shawn Pearce
2006-09-11 2:11 ` Jon Smirl
2006-09-09 1:05 ` Paul Mackerras
2006-09-09 2:56 ` Linus Torvalds
2006-09-09 3:23 ` Junio C Hamano
2006-09-09 3:31 ` Paul Mackerras
2006-09-09 4:04 ` Linus Torvalds
2006-09-09 8:47 ` Marco Costalba
2006-09-09 17:33 ` Linus Torvalds
2006-09-09 18:04 ` Marco Costalba
2006-09-09 18:44 ` linux
2006-09-09 19:17 ` Marco Costalba
2006-09-09 20:05 ` Linus Torvalds
2006-09-09 20:43 ` Jeff King
2006-09-09 21:11 ` Junio C Hamano
2006-09-09 21:14 ` Jeff King
2006-09-09 21:40 ` Linus Torvalds
2006-09-09 22:54 ` Jon Smirl
2006-09-10 0:18 ` Linus Torvalds
2006-09-10 1:22 ` Junio C Hamano
2006-09-10 3:49 ` Marco Costalba
2006-09-10 4:13 ` Junio C Hamano
2006-09-10 4:23 ` Marco Costalba
2006-09-10 4:46 ` Marco Costalba
2006-09-10 4:54 ` Junio C Hamano
2006-09-10 5:14 ` Marco Costalba
2006-09-10 5:46 ` Junio C Hamano
2006-09-10 15:21 ` linux [this message]
2006-09-10 18:32 ` Marco Costalba
2006-09-11 9:56 ` Paul Mackerras
2006-09-11 12:39 ` linux
2006-09-10 9:49 ` Jakub Narebski
2006-09-10 10:28 ` Josef Weidendorfer
-- strict thread matches above, loose matches on Subject: below --
2006-09-09 10:31 linux
2006-09-09 13:00 ` Marco Costalba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060910152122.31694.qmail@science.horizon.com \
--to=linux@horizon.com \
--cc=git@vger.kernel.org \
--cc=jonsmirl@gmail.com \
--cc=junkio@cox.net \
--cc=mcostalba@gmail.com \
--cc=paulus@samba.org \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).