* What's cooking in git.git (Jul 2010, #05; Wed, 28)
From: Junio C Hamano @ 2010-07-29 4:00 UTC
To: git

Here are the topics that have been cooking. Commits prefixed with '-' are
only in 'pu' while commits prefixed with '+' are in 'next'. The ones
marked with '.' do not appear in any of the integration branches, but I am
still holding onto them.

Now that the latest feature release 1.7.2 is out, we should rewind and
rebuild 'next' and start cooking new topics.

--------------------------------------------------
[New Topics]

* ab/test-coverage (2010-07-26) 8 commits
 - Makefile: make gcov invocation configurable
 - t/README: Add a note about the dangers of coverage chasing
 - t/README: A new section about test coverage
 - Makefile: Add cover_db_html target
 - Makefile: Add cover_db target
 - Makefile: Split out the untested functions target
 - Makefile: Include subdirectories in "make cover" reports
 - gitignore: Ignore files generated by "make coverage"

* ab/test-no-skip (2010-07-28) 5 commits
 - t/README: Update "Skipping tests" to align with best practices
 - t/t7800-difftool.sh: Skip with prereq on no PERL
 - t/t5800-remote-helpers.sh: Skip with prereq on python <2.4
 - t/t4004-diff-rename-symlink.sh: use three-arg <prereq>
 - tests: implicitly skip SYMLINKS tests using <prereq>

* bc/use-more-hardlinks-in-install (2010-07-23) 2 commits
 - Makefile: make hard/symbolic links for non-builtins too
 - Makefile: link builtins residing in bin directory to main git binary too

* cc/find-commit-subject (2010-07-22) 6 commits
 - blame: use find_commit_subject() instead of custom code
 - merge-recursive: use find_commit_subject() instead of custom code
 - bisect: use find_commit_subject() instead of custom code
 - revert: rename variables related to subject in get_message()
 - revert: refactor code to find commit subject in find_commit_subject()
 - revert: fix off by one read when searching the end of a commit subject

* gb/shell-ext (2010-07-28) 3 commits
 - Add sample commands for git-shell
 - Add interactive mode to git-shell for user-friendliness
 - Allow creation of arbitrary git-shell commands

* jc/log-grep (2010-07-19) 1 commit
 - git log: add -G<regexp> that greps in the patch text

* jh/clean-exclude (2010-07-20) 2 commits
 - Add test for git clean -e.
 - Add -e/--exclude to git-clean.

* jh/use-test-must-fail (2010-07-20) 1 commit
 - Convert "! git" to "test_must_fail git"

* jn/apply-filename-with-sp (2010-07-23) 4 commits
 - apply: Handle traditional patches with space in filename
 - t4135 (apply): use expand instead of pr for portability
 - tests: Test how well "git apply" copes with weird filenames
 - apply: Split quoted filename handling into new function

* jn/fix-abbrev (2010-07-27) 3 commits
 - examples/commit: use --abbrev for commit summary
 - checkout, commit: remove confusing assignments to rev.abbrev
 - archive: abbreviate substituted commit ids again

* jn/maint-setup-fix (2010-07-24) 10 commits
 - setup: split off a function to handle ordinary .git directories
 - Revert "rehabilitate 'git index-pack' inside the object store"
 - setup: do not forget working dir from subdir of gitdir
 - setup: split off get_device_or_die helper
 - setup: split off a function to handle hitting ceiling in repo search
 - setup: split off code to handle stumbling upon a repository
 - setup: split off a function to checks working dir for .git file
 - setup: split off $GIT_DIR-set case from setup_git_directory_gently
 - tests: try git apply from subdir of toplevel
 - t1501 (rev-parse): clarify

* jn/rebase-rename-am (2008-11-10) 5 commits
 - rebase: protect against diff.renames configuration
 - t3400 (rebase): whitespace cleanup
 - Teach "apply --index-info" to handle rename patches
 - t4150 (am): futureproof against failing tests
 - t4150 (am): style fix

* ml/rebase-x-strategy (2010-07-29) 1 commit
 - rebase: support -X to pass through strategy options

* mm/shortopt-detached (2010-07-28) 4 commits
 - Allow detached form for --glob, --branches, --tags and --remote.
 - Allow detached form (e.g. "git log --grep foo") in log options.
 - Allow detached form for git diff --stat-name-width and --stat-width.
 - Allow detached form (e.g. "-S foo" instead of "-Sfoo") for diff options

* nd/fix-sparse-checkout (2010-07-26) 3 commits
 - Mark new entries skip-worktree appropriately
 - unpack-trees.c: Do not check ce_stage in will_have_skip_worktree()
 - Fix sparse checkout not removing files from index

* tr/ab-i18n-fix (2010-07-25) 1 commit
 - tests: locate i18n lib&data correctly under --valgrind
 (this branch uses ab/i18n.)

* tr/maint-no-unquote-plus (2010-07-24) 1 commit
 - Do not unquote + into ' ' in URLs

* tr/xsize-bits (2010-07-28) 1 commit
 - xsize_t: check whether we lose bits

* vs/doc-spell (2010-07-20) 1 commit
 - Documentation: spelling fixes

--------------------------------------------------
[Stalled -- would discard unless there are some movements soon]

* by/log-range-diff (2010-07-12) 18 commits
 . Minimum fix to make by/log-range-diff topic at least compile
 . add test cases for '--graph' of line level log
 . line.c output the '--graph' padding before each line
 . add parent rewrite feature to line level log
 . make rewrite_parents an external function
 . some document update
 . add two test cases
 . add --always-print option
 . map/print ranges along traversing the history topologically
 . print the line log
 . map/take range to parent
 . add range clone functions
 . export three functions from diff.c
 . parse the -L options
 . refactor parse_loc
 . add the basic data structure for line level history
 . parse-options: add two helper functions
 . parse-options: stop when encounter a non-option

Perhaps a re-roll is coming? I suspect that we would have some overlaps
to mm/shortopt-detached topic.

* ps/gitweb-soc (2010-06-02) 2 commits
  (merged to 'next' on 2010-06-13 at 92245ae)
 + git-instaweb: Add option to reuse previous config file
 + Makefile: Use $(sharedir)/gitweb for target 'install-gitweb'

If we are going to have a configuration variable to control this, I
strongly suspect that --reuse-config should be renamed so that the
variable can be named more sanely and in line with whatever option that
replaces it.

No responses; I think we will eventually want to have a configuration to
always enable the new option, so the renaming of the command line option
is inevitable. I plan to kick this out of 'next' once the upcoming
release is out, and expect a re-roll with configuration variable.

* js/rebase-origin-x (2010-02-05) 1 commit
 - [RFC w/o test and incomplete] rebase: add -x option to record original commit name

I retract my objection against the idea of -x; needs polishing before
moving forward.

No responses; I plan to drop this entirely after the upcoming release
without prejudice.

* zl/mailinfo-recode-patch (2010-06-14) 2 commits
 - add --recode-patch option to git-am
 - add --recode-patch option to git-mailinfo

I recall there was another round of re-roll planned for this one.

* jk/tag-contains (2010-07-05) 4 commits
 - Why is "git tag --contains" so slow?
 - default core.clockskew variable to one day
 - limit "contains" traversals based on commit timestamp
 - tag: speed up --contains calculation

--------------------------------------------------
[Cooking]

* ab/report-corrupt-object-with-type (2010-06-10) 1 commit
 - sha1_file: Show the the type and path to corrupt objects

* cc/revert (2010-07-21) 5 commits
 - t3508: add check_head_differs_from() helper function and use it
 - revert: improve success message by adding abbreviated commit sha1
 - revert: don't print "Finished one cherry-pick." if commit failed
 - revert: refactor commit code into a new run_git_commit() function
 - revert: report success when using option --strategy

* en/fast-export-fix (2010-07-17) 2 commits
 - fast-export: Add a --full-tree option
 - fast-export: Fix dropping of files with --import-marks and path limiting

* jn/parse-date-basic (2010-07-15) 1 commit
 - Export parse_date_basic() to convert a date string to timestamp
 (this branch is used by rr/svn-export.)

* kf/post-receive-sample-hook (2010-07-16) 1 commit
 - post-receive-email: optional message line count limit

* tr/rfc-reset-doc (2010-07-18) 5 commits
 - Documentation/reset: move "undo permanently" example behind "make topic"
 - Documentation/reset: reorder examples to match description
 - Documentation/reset: promote 'examples' one section up
 - Documentation/reset: separate options by mode
 - Documentation/git-reset: reorder modes for soft-mixed-hard progression

* rr/svn-export (2010-07-15) 8 commits
 - Add SVN dump parser
 - Add infrastructure to write revisions in fast-export format
 - Add stream helper library
 - Add string-specific memory pool
 - vcs-svn: treap_search should return NULL for missing items
 - Add treap implementation
 - Add memory pool library
 - Introduce vcs-svn lib
 (this branch uses jn/parse-date-basic.)

* hv/autosquash-config (2010-07-14) 1 commit
 - add configuration variable for --autosquash option of interactive rebase

* jh/graph-next-line (2010-07-13) 2 commits
 - Enable custom schemes for column colors in the graph API
 - Make graph_next_line() available in the graph.h API

* ar/string-list-foreach (2010-07-03) 2 commits
 - Convert the users of for_each_string_list to for_each_string_list_item macro
 - Add a for_each_string_list_item macro
 (this branch is used by tf/string-list-init.)

* il/rfc-remote-fd-ext (2010-07-19) 3 commits
 - gitignore: Ignore the new /git-remote-{ext,fd} helpers
 - New remote helper: git-remote-ext
 - New remote helper git-remote-fd

* gp/pack-refs-remove-empty-dirs (2010-07-06) 1 commit
  (merged to 'next' on 2010-07-14 at 7d25131)
 + pack-refs: remove newly empty directories

* hv/submodule-find-ff-merge (2010-07-07) 3 commits
 - Implement automatic fast-forward merge for submodules
 - setup_revisions(): Allow walking history in a submodule
 - Teach ref iteration module about submodules

* jn/fast-import-subtree (2010-06-30) 1 commit
 - Teach fast-import to import subtrees named by tree id

* sg/rerere-gc-old-still-used (2010-07-13) 2 commits
 - rerere: fix overeager gc
 - mingw_utime(): handle NULL times parameter

* tf/string-list-init (2010-07-04) 1 commit
 - string_list: Add STRING_LIST_INIT macro and make use of it.
 (this branch uses ar/string-list-foreach.)

* en/d-f-conflict-fix (2010-07-27) 7 commits
  (merged to 'next' on 2010-07-28 at 75e8ac1)
 + t/t6035-merge-dir-to-symlink.sh: Remove TODO on passing test
  (merged to 'next' on 2010-07-14 at 2b2a810)
 + fast-import: Improve robustness when D->F changes provided in wrong order
 + fast-export: Fix output order of D/F changes
 + merge_recursive: Fix renames across paths below D/F conflicts
 + merge-recursive: Fix D/F conflicts
 + Add a rename + D/F conflict testcase
 + Add additional testcases for D/F conflicts

* ab/i18n (2010-07-19) 2 commits
 - tests: rename test to work around GNU gettext bug
 - Add infrastructure for translating Git with gettext
 (this branch is used by tr/ab-i18n-fix.)

* tc/checkout-B (2010-06-24) 3 commits
 - builtin/checkout: learn -B
 - builtin/checkout: reword hint for -b
 - add tests for checkout -b

* eb/double-convert-before-merge (2010-07-02) 3 commits
 - Don't expand CRLFs when normalizing text during merge
 - Try normalizing files to avoid delete/modify conflicts when merging
 - Avoid conflicts when merging branches with mixed normalization

* Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)
From: Jeff King @ 2010-07-30 18:37 UTC
To: Junio C Hamano; +Cc: git

On Wed, Jul 28, 2010 at 09:00:16PM -0700, Junio C Hamano wrote:

> --------------------------------------------------
> [Stalled -- would discard unless there are some movements soon]
> [...]
> * jk/tag-contains (2010-07-05) 4 commits
>  - Why is "git tag --contains" so slow?
>  - default core.clockskew variable to one day
>  - limit "contains" traversals based on commit timestamp
>  - tag: speed up --contains calculation

What do we want to do with this?

The first patch by itself produces a pretty big speedup for Ted's case,
and does not impact correctness. However, it does do a mindless
depth-first search, so there are cases where it can be slower than the
current code (basically, if you never have to go to the roots for your
tagset, then my code will be slower, as it will almost certainly go to
the roots, but it will do so only one time for the whole set, instead of
potentially once per tag).

The second patch by itself is harmless, as the user has to turn it on
explicitly. And the amount of code is quite small, so even if most
people don't use it, I don't think it is a problem.

The third one is where we start defaulting things to "assume no more
than 1 day of clock skew by default", which can cause incorrect answers
in the face of skew.

The fourth is just an illustrative patch for per-repo skew detection.

So if the tradeoff for patch 1 is acceptable, we can merge the first
two. If the tradeoff in patch 3 is acceptable, then we can merge up to
patch 3. The fourth one should be thrown out either way. I can work up a
"detect clock skew on clone and gc" patch based on it if we want to go
that way.

-Peff
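
[Illustration, not part of the original message.] The tradeoffs described
above can be sketched on a toy history. This is a minimal sketch and not
the code from the jk/tag-contains patches: the `History` layout, the
`contains` helper and its `slop` parameter are all assumptions made up
for the example. The shared `cache` stands in for "go to the roots only
once for the whole tag set", and `slop` stands in for the optional
timestamp cutoff.

```python
from typing import Dict, List, Optional, Tuple

# Toy history, invented for illustration: name -> (commit timestamp, parents).
History = Dict[str, Tuple[int, List[str]]]


def contains(history: History, tip: str, target: str,
             cache: Dict[str, bool], slop: Optional[int] = None) -> bool:
    """Return True if `target` is reachable from `tip` by following parents.

    `cache` remembers the answer for every commit already visited, so
    checking many tags against the same target walks each commit once.
    With `slop` set, a commit more than `slop` seconds older than the
    target is assumed not to contain it, and the walk stops there.
    (Recursive for brevity; a real history would need an explicit stack.)
    """
    if tip == target:
        return True
    if tip in cache:
        return cache[tip]
    timestamp, parents = history[tip]
    if slop is not None and timestamp < history[target][0] - slop:
        found = False  # cutoff: wrong whenever the real skew exceeds `slop`
    else:
        found = any(contains(history, p, target, cache, slop) for p in parents)
    cache[tip] = found
    return found


history: History = {
    "root": (100, []),
    "a": (200, ["root"]),
    "b": (300, ["a"]),
    "side": (250, ["root"]),
    "skewed": (50, ["a"]),  # child of "a" committed with a clock far behind
}
tags = {"v1": "b", "v2": "side", "v3": "skewed"}

shared_cache: Dict[str, bool] = {}
for tag, commit in tags.items():
    print(tag, contains(history, commit, "a", shared_cache))
```

Calling `contains(history, "skewed", "a", {}, slop=100)` on the same data
gives the wrong answer, because the commit trails its parent by 150
seconds; that is the kind of incorrect result the third patch risks when
the default is smaller than the skew actually present.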

* jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Jonathan Nieder @ 2010-07-31 6:07 UTC
To: Jeff King; +Cc: Junio C Hamano, git

Jeff King wrote:

> What do we want to do with this?

Probably I have already said too much about this topic, but here I go:

> The third one is where we start defaulting things to "assume no more
> than 1 day of clock skew by default", which can cause incorrect answers
> in the face of skew.

I think the default should be something that (just barely) works
correctly for linux-2.6.git.

> The fourth is just an illustrative patch for per-repo skew detection.

I have been hoping for a chance to look these over, time hasn’t come my
way yet.

> I can work up a
> "detect clock skew on clone and gc" patch based on it if we want to go
> that way.

That sounds very sane.

Additional things to do (this is mostly a note to myself):

 - refuse to commit with a timestamp long before any parent
 - refuse to make a commit that would make the total slop too high?
 - check slop and warn about it in fsck (maybe your patch does this
   already)
 - document the maximum-total-slop and maximum-single-commit-slop
   rules!

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Jeff King @ 2010-07-31 12:33 UTC
To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Sat, Jul 31, 2010 at 01:07:03AM -0500, Jonathan Nieder wrote:

> > The third one is where we start defaulting things to "assume no more
> > than 1 day of clock skew by default", which can cause incorrect answers
> > in the face of skew.
>
> I think the default should be something that (just barely) works
> correctly for linux-2.6.git.

I am tempted by that (and it is why I made the fourth patch to actually
calculate the worst skew). But my concern is that there are projects
with even worse skew. Maybe that is unfounded.

> > The fourth is just an illustrative patch for per-repo skew detection.
>
> I have been hoping for a chance to look these over, time hasn’t come my
> way yet.

It is just a git-skew program to calculate the skew, but doesn't do
anything fancy like detect-on-gc. However, it would be nice to have
somebody sanity check the algorithm. Looking at it again, I think it
might actually miss some skew if the skewed commit can be reached in
multiple ways.

> Additional things to do (this is mostly a note to myself):
>
>  - refuse to commit with a timestamp long before any parent

Agreed.

>  - refuse to make a commit that would make the total slop too high?

That would be expensive to commit, and if we bound each individual
commit to parent relationship as you mention above, I don't think it
should be necessary.

>  - check slop and warn about it in fsck (maybe your patch does this
>    already)

No, it doesn't, but it is something we should probably do.

>  - document the maximum-total-slop and maximum-single-commit-slop
>    rules!

Definitely.

-Peff
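
[Illustration, not part of the original message.] Bounding each
commit-to-parent relationship only needs the parents' timestamps, which
is why it is cheap compared with a total-slop check. A minimal sketch of
such a check follows; `check_commit_date` and `max_slop` are hypothetical
names invented here, and nothing like this is claimed to exist in git.
The function only reports the problem, leaving it to the caller whether
to refuse or merely warn.

```python
from typing import List, Optional


def check_commit_date(commit_ts: int, parent_ts: List[int],
                      max_slop: int = 86400) -> Optional[str]:
    """Return a message if the new commit is dated more than `max_slop`
    seconds before its newest parent, otherwise None.

    `max_slop` is a made-up threshold for this sketch, not a git setting.
    """
    if not parent_ts:
        return None  # a root commit has nothing to compare against
    newest_parent = max(parent_ts)
    behind = newest_parent - commit_ts
    if behind > max_slop:
        return (f"committer date {commit_ts} precedes parent date "
                f"{newest_parent} by {behind} seconds")
    return None


# A commit stamped 100 seconds before its parent, with a 50-second limit.
print(check_commit_date(900, [1000], max_slop=50))
```

Comparing against the newest parent covers merges as well: a commit that
is too old relative to any one parent is caught.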

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Junio C Hamano @ 2010-08-02 4:04 UTC
To: Jeff King; +Cc: Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> On Sat, Jul 31, 2010 at 01:07:03AM -0500, Jonathan Nieder wrote:
>
>> > The third one is where we start defaulting things to "assume no more
>> > than 1 day of clock skew by default", which can cause incorrect answers
>> > in the face of skew.
>>
>> I think the default should be something that (just barely) works
>> correctly for linux-2.6.git.
>
> I am tempted by that (and it is why I made the fourth patch to actually
> calculate the worst skew). But my concern is that there are projects
> with even worse skew. Maybe that is unfounded.
>
>> > The fourth is just an illustrative patch for per-repo skew detection.
>>
>> I have been hoping for a chance to look these over, time hasn’t come my
>> way yet.

Sorry, but I am right in the middle of physically moving, so my weekend
and evening git time has been nil recently.

> It is just a git-skew program to calculate the skew, but doesn't do
> anything fancy like detect-on-gc. However, it would be nice to have
> somebody sanity check the algorithm. Looking at it again, I think it
> might actually miss some skew if the skewed commit can be reached in
> multiple ways.
>
>> Additional things to do (this is mostly a note to myself):
>>
>>  - refuse to commit with a timestamp long before any parent
>
> Agreed.

You need to be careful here, though. What if you pulled from somebody
whose clock is set grossly in the future?

>>  - check slop and warn about it in fsck (maybe your patch does this
>>    already)
>
> No, it doesn't, but it is something we should probably do.

I wonder if we can make fsck notice a commit with a wrong timestamp
(i.e. older than some of its parents) and make a note of it (hopefully
they are a minuscule minority)---then during the revision traversal when
we hit such a commit, we perhaps ignore its timestamp (pretending as if
its timestamp is one of its children or parent---I haven't thought about
the details, but the note fsck leaves can record what adjusted timestamp
should be used) to fix the issue?
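
[Illustration, not part of the original message.] The detection half of
this idea, finding commits that are older than some of their parents, can
be sketched as a single pass over the commits. The toy `History` layout
and the `find_skewed_commits` name are assumptions for the example; this
is not fsck code.

```python
from typing import Dict, List, Tuple

# Toy history, invented for illustration: name -> (commit timestamp, parents).
History = Dict[str, Tuple[int, List[str]]]


def find_skewed_commits(history: History) -> Dict[str, int]:
    """Map each commit that is older than one of its parents to the
    number of seconds by which it trails its newest parent."""
    skewed: Dict[str, int] = {}
    for name, (timestamp, parents) in history.items():
        if not parents:
            continue  # root commits cannot be skewed relative to a parent
        behind = max(history[p][0] for p in parents) - timestamp
        if behind > 0:
            skewed[name] = behind
    return skewed


history: History = {
    "base": (1000, []),
    "left": (1100, ["base"]),
    "right": (1200, ["base"]),
    "merge": (1150, ["left", "right"]),  # newer than "left", older than "right"
}
print(find_skewed_commits(history))
```

Only "merge" is reported here, since it trails "right" by 50 seconds.
What adjusted timestamp to record for such a commit is the part left open
above, so the sketch stops at detection.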

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Jonathan Nieder @ 2010-08-02 20:02 UTC
To: Junio C Hamano; +Cc: Jeff King, git

Junio C Hamano wrote:

> You need to be careful here, though. What if you pulled from somebody
> whose clock is set grossly in the future?

We could check for that and give relevant advice:

    fatal: committer date <date> precedes parent date <date>
    hint: It looks like you are trying to commit on top of a commit
    hint: from 5 years into the future.
    hint: Use "git rebase -f" to rewrite the commit with a more
    hint: sensible date, and please, fix your clocks!

> I wonder if we can make fsck notice a commit with a wrong timestamp
> (i.e. older than some of its parents) and make a note of it (hopefully
> they are a minuscule minority)---then during the revision traversal when
> we hit such a commit, we perhaps ignore its timestamp (pretending as if
> its timestamp is one of its children or parent---I haven't thought about
> the details, but the note fsck leaves can record what adjusted timestamp
> should be used) to fix the issue?

Thanks --- at first glance, this idea would seem to allow much faster
revision limiting.

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Matthieu Moy @ 2010-08-02 20:08 UTC
To: Jonathan Nieder; +Cc: Junio C Hamano, Jeff King, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Junio C Hamano wrote:
>
>> You need to be careful here, though. What if you pulled from somebody
>> whose clock is set grossly in the future?
>
> We could check for that and give relevant advice:
>
>     fatal: committer date <date> precedes parent date <date>
>     hint: It looks like you are trying to commit on top of a commit
>     hint: from 5 years into the future.
>     hint: Use "git rebase -f" to rewrite the commit with a more
>     hint: sensible date, and please, fix your clocks!

If the problem is the commit you've just pulled, I'd advise against
re-writing it: it's published, it's too late.

Be careful also: Git can hardly guess whether your clock is late, or
whether your co-worker's clock is in the future.

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/

* Re: jk/tag-contains
From: Jonathan Nieder @ 2010-08-02 20:19 UTC
To: Matthieu Moy; +Cc: Junio C Hamano, Jeff King, git

Matthieu Moy wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>>     fatal: committer date <date> precedes parent date <date>
>>     hint: It looks like you are trying to commit on top of a commit
>>     hint: from 5 years into the future.
>>     hint: Use "git rebase -f" to rewrite the commit with a more
>>     hint: sensible date, and please, fix your clocks!
>
> If the problem is the commit you've just pulled, I'd advise against
> re-writing it: it's published, it's too late.

I guess that is the fundamental question. What do you do when a
completely bogus commit has been published?

(For example, fsck permits extra headers after the "encoding" header,
but a commit object using random such headers would be malformed and
noticeable as such as soon as fsck learns what header is supposed to
come after "encoding".)

I would like it to still be possible to publicly acknowledge a
mistake, make people rewrite their history to remove it, and move on.

But another viable solution here would be to just warn about the
problem and maintain a list of bogus commits as Junio suggested.

* Re: jk/tag-contains
From: Junio C Hamano @ 2010-08-02 22:38 UTC
To: Jonathan Nieder; +Cc: Matthieu Moy, Jeff King, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> I would like it to still be possible to publicly acknowledge a
> mistake, make people rewrite their history to remove it, and move on.

While I wish the world were that simple, I do not think that is viable.
You may not have any control over your upstream (not to mention the
possibility that the upstream might even be a foreign SCM).

So I'd prefer to see us prepared to be lenient with what we accept from
outside world.

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Jeff King @ 2010-08-05 17:56 UTC
To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Sun, Aug 01, 2010 at 09:04:23PM -0700, Junio C Hamano wrote:

> Sorry, but I am right in the middle of physically moving, so my weekend
> and evening git time has been nil recently.

Didn't you just do that? ;P

> >> Additional things to do (this is mostly a note to myself):
> >>
> >>  - refuse to commit with a timestamp long before any parent
> >
> > Agreed.
>
> You need to be careful here, though. What if you pulled from somebody
> whose clock is set grossly in the future?

Reading the rest of this thread and thinking about it more, I think
warning is the best thing we can do. Because only the user is in a
position to know whether it is their clock or the previous commit that
is in error. And if it is the previous commit, then only the user knows
what the next logical step is: redo the commit, complain to somebody
else, or just ignore and continue.

> I wonder if we can make fsck notice a commit with a wrong timestamp
> (i.e. older than some of its parents) and make a note of it (hopefully
> they are a minuscule minority)---then during the revision traversal when
> we hit such a commit, we perhaps ignore its timestamp (pretending as if
> its timestamp is one of its children or parent---I haven't thought about
> the details, but the note fsck leaves can record what adjusted timestamp
> should be used) to fix the issue?

That's basically a finer-grained version of what I implemented. Mine
finds the _worst_ skew for the whole graph, and never lets you optimize
a traversal cutoff more than that skew. So it is nicely bounded
space-wise, as it is always a single integer, but you waste effort on
the entire traversal because a couple of commits are skewed. Yours
optimizes perfectly, but needs O(skewed commits) storage. Which is
probably a better tradeoff when the number of skewed commits is tiny
(which is what we expect).

I think your technique would work, but with one note. You probably want
to pull the timestamp from the parent (pulling from the child makes no
sense to me, as there can be an infinite number of children), but you
need to process the parent first and pull from its _corrected_
timestamp. Because at least in the linux-2.6 case, there is a run of
skewed commits. So if you have something like:

  A -- B -- C -- D

  A, timestamp = 1000
  B, timestamp = 900
  C, timestamp = 950
  D, timestamp = 1100

where obviously the timestamps are shortened to be readable, but are
meant to be seconds-since-epoch. You'd probably want to end up with:

  A, timestamp = 1000
  B, timestamp = 1001
  C, timestamp = 1002
  D, timestamp = 1100

which means recursing all the way to the root, and fixing timestamps as
you back out.

This seems like just a straight sha1->int mapping, which presumably one
could do using "git notes". Though I worry it could slow down traversal
for all of the lookup misses for non-skewed commits.

-Peff
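
[Illustration, not part of the original message.] The "process the parent
first, then pull from its corrected timestamp" rule can be sketched on
the A -- B -- C -- D example. This is only a sketch under assumptions:
the `History` layout and `corrected_timestamps` are invented names, the
"parent + 1" rule is read off the numbers in the example, and an explicit
stack replaces the recursion so that a long history would not overflow.

```python
from typing import Dict, List, Tuple

# Toy history, invented for illustration: name -> (commit timestamp, parents).
History = Dict[str, Tuple[int, List[str]]]


def corrected_timestamps(history: History) -> Dict[str, int]:
    """Return a timestamp for every commit such that no commit is older
    than any parent, bumping skewed commits to newest-parent + 1."""
    corrected: Dict[str, int] = {}
    for start in history:
        stack = [start]
        while stack:
            name = stack[-1]
            if name in corrected:
                stack.pop()
                continue
            timestamp, parents = history[name]
            pending = [p for p in parents if p not in corrected]
            if pending:
                stack.extend(pending)  # fix the parents before this commit
                continue
            newest = max((corrected[p] for p in parents), default=None)
            if newest is not None and timestamp < newest:
                timestamp = newest + 1  # pull from the corrected parent
            corrected[name] = timestamp
            stack.pop()
    return corrected


history: History = {
    "D": (1100, ["C"]),
    "C": (950, ["B"]),
    "B": (900, ["A"]),
    "A": (1000, []),
}
fixed = corrected_timestamps(history)
# Only the commits whose timestamp changed would need to be stored.
adjustments = {n: t for n, t in fixed.items() if t != history[n][0]}
print(adjustments)
```

Note that C is not older than its own parent B in the raw data (950 vs.
900); it only needs fixing once B has been corrected to 1001, which is
why the parents have to be handled first.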

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Junio C Hamano @ 2010-08-05 18:22 UTC
To: Jeff King; +Cc: Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> That's basically a finer-grained version of what I implemented. Mine
> finds the _worst_ skew for the whole graph, and never lets you optimize
> a traversal cutoff more than that skew. So it is nicely bounded
> space-wise, as it is always a single integer, but you waste effort on
> the entire traversal because a couple of commits are skewed. Yours
> optimizes perfectly, but needs O(skewed commits) storage. Which is
> probably a better tradeoff when the number of skewed commits is tiny
> (which is what we expect).

One thing missing from the above equation is that O(skewed commits)
approach will need O(number of commits) look-ups in the skew database (be
it a notes tree or whatever), only to make sure that most of the look-ups
say "no timestamp tweak required". So I think the global single integer
approach you took would probably be better in the overall picture.
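
[Illustration, not part of the original message.] For comparison, the
"global single integer" can be sketched as the largest amount by which
any commit trails the newest of its ancestors. This is an assumption
about what a safe bound looks like, not a description of the git-skew
patch; `History` and `worst_skew` are invented names. Tracking the newest
ancestor, rather than only the direct parents, matters when there is a
run of skewed commits.

```python
from typing import Dict, List, Tuple

# Toy history, invented for illustration: name -> (commit timestamp, parents).
History = Dict[str, Tuple[int, List[str]]]


def worst_skew(history: History) -> int:
    """Return the largest number of seconds by which any commit is older
    than the newest of its ancestors (0 if no commit is skewed)."""
    newest_seen: Dict[str, int] = {}  # newest timestamp at or above a commit
    worst = 0
    for start in history:
        stack = [start]
        while stack:
            name = stack[-1]
            if name in newest_seen:
                stack.pop()
                continue
            timestamp, parents = history[name]
            pending = [p for p in parents if p not in newest_seen]
            if pending:
                stack.extend(pending)  # ancestors first
                continue
            newest_ancestor = max((newest_seen[p] for p in parents),
                                  default=timestamp)
            worst = max(worst, newest_ancestor - timestamp)
            newest_seen[name] = max(timestamp, newest_ancestor)
            stack.pop()
    return worst


# A run of skewed commits: each edge is off by at most 100 seconds, yet
# C trails its grandparent A by 150.
run: History = {"A": (1000, []), "B": (900, ["A"]), "C": (850, ["B"])}
print(worst_skew(run))
```

The result is one integer however many commits are skewed, which is the
space bound discussed above; the price is that every traversal cutoff has
to be relaxed by that amount, even in parts of the history with no skew.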

* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
From: Jeff King @ 2010-08-05 19:35 UTC
To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Thu, Aug 05, 2010 at 11:22:37AM -0700, Junio C Hamano wrote:

> > That's basically a finer-grained version of what I implemented. Mine
> > finds the _worst_ skew for the whole graph, and never lets you optimize
> > a traversal cutoff more than that skew. So it is nicely bounded
> > space-wise, as it is always a single integer, but you waste effort on
> > the entire traversal because a couple of commits are skewed. Yours
> > optimizes perfectly, but needs O(skewed commits) storage. Which is
> > probably a better tradeoff when the number of skewed commits is tiny
> > (which is what we expect).
>
> One thing missing from the above equation is that O(skewed commits)
> approach will need O(number of commits) look-ups in the skew database (be
> it a notes tree or whatever), only to make sure that most of the look-ups
> say "no timestamp tweak required". So I think the global single integer
> approach you took would probably be better in the overall picture.

I'm not sure it is that bad. Shouldn't it have the same number of
lookups as this scenario:

  # pretend we have some fake timestamps
  for i in 20 40 60; do
    git notes add -m "fake timestamp" HEAD~$i
  done

  # now time it without notes
  time git log --pretty=raw --no-notes >/dev/null

  # and with notes
  time git log --pretty=raw --show-notes >/dev/null

For me, the timing differences are lost in the noise. So perhaps the
lookup isn't all that expensive compared to the actual traversal.

-Peff