* What's cooking in git.git (Jul 2010, #05; Wed, 28)
@ 2010-07-29  4:00 Junio C Hamano
  2010-07-30 18:37 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-07-29  4:00 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking.  Commits prefixed with '-' are
only in 'pu' while commits prefixed with '+' are in 'next'.  The ones
marked with '.' do not appear in any of the integration branches, but I am
still holding onto them.

Now that the latest feature release 1.7.2 is out, we should rewind and
rebuild 'next' and start cooking new topics.

--------------------------------------------------
[New Topics]

* ab/test-coverage (2010-07-26) 8 commits
 - Makefile: make gcov invocation configurable
 - t/README: Add a note about the dangers of coverage chasing
 - t/README: A new section about test coverage
 - Makefile: Add cover_db_html target
 - Makefile: Add cover_db target
 - Makefile: Split out the untested functions target
 - Makefile: Include subdirectories in "make cover" reports
 - gitignore: Ignore files generated by "make coverage"

* ab/test-no-skip (2010-07-28) 5 commits
 - t/README: Update "Skipping tests" to align with best practices
 - t/t7800-difftool.sh: Skip with prereq on no PERL
 - t/t5800-remote-helpers.sh: Skip with prereq on python <2.4
 - t/t4004-diff-rename-symlink.sh: use three-arg <prereq>
 - tests: implicitly skip SYMLINKS tests using <prereq>

* bc/use-more-hardlinks-in-install (2010-07-23) 2 commits
 - Makefile: make hard/symbolic links for non-builtins too
 - Makefile: link builtins residing in bin directory to main git binary too

* cc/find-commit-subject (2010-07-22) 6 commits
 - blame: use find_commit_subject() instead of custom code
 - merge-recursive: use find_commit_subject() instead of custom code
 - bisect: use find_commit_subject() instead of custom code
 - revert: rename variables related to subject in get_message()
 - revert: refactor code to find commit subject in find_commit_subject()
 - revert: fix off by one read when searching the end of a commit subject

* gb/shell-ext (2010-07-28) 3 commits
 - Add sample commands for git-shell
 - Add interactive mode to git-shell for user-friendliness
 - Allow creation of arbitrary git-shell commands

* jc/log-grep (2010-07-19) 1 commit
 - git log: add -G<regexp> that greps in the patch text

* jh/clean-exclude (2010-07-20) 2 commits
 - Add test for git clean -e.
 - Add -e/--exclude to git-clean.

* jh/use-test-must-fail (2010-07-20) 1 commit
 - Convert "! git" to "test_must_fail git"

* jn/apply-filename-with-sp (2010-07-23) 4 commits
 - apply: Handle traditional patches with space in filename
 - t4135 (apply): use expand instead of pr for portability
 - tests: Test how well "git apply" copes with weird filenames
 - apply: Split quoted filename handling into new function

* jn/fix-abbrev (2010-07-27) 3 commits
 - examples/commit: use --abbrev for commit summary
 - checkout, commit: remove confusing assignments to rev.abbrev
 - archive: abbreviate substituted commit ids again

* jn/maint-setup-fix (2010-07-24) 10 commits
 - setup: split off a function to handle ordinary .git directories
 - Revert "rehabilitate 'git index-pack' inside the object store"
 - setup: do not forget working dir from subdir of gitdir
 - setup: split off get_device_or_die helper
 - setup: split off a function to handle hitting ceiling in repo search
 - setup: split off code to handle stumbling upon a repository
 - setup: split off a function to checks working dir for .git file
 - setup: split off $GIT_DIR-set case from setup_git_directory_gently
 - tests: try git apply from subdir of toplevel
 - t1501 (rev-parse): clarify

* jn/rebase-rename-am (2008-11-10) 5 commits
 - rebase: protect against diff.renames configuration
 - t3400 (rebase): whitespace cleanup
 - Teach "apply --index-info" to handle rename patches
 - t4150 (am): futureproof against failing tests
 - t4150 (am): style fix

* ml/rebase-x-strategy (2010-07-29) 1 commit
 - rebase: support -X to pass through strategy options

* mm/shortopt-detached (2010-07-28) 4 commits
 - Allow detached form for --glob, --branches, --tags and --remote.
 - Allow detached form (e.g. "git log --grep foo") in log options.
 - Allow detached form for git diff --stat-name-width and --stat-width.
 - Allow detached form (e.g. "-S foo" instead of "-Sfoo") for diff options

* nd/fix-sparse-checkout (2010-07-26) 3 commits
 - Mark new entries skip-worktree appropriately
 - unpack-trees.c: Do not check ce_stage in will_have_skip_worktree()
 - Fix sparse checkout not removing files from index

* tr/ab-i18n-fix (2010-07-25) 1 commit
 - tests: locate i18n lib&data correctly under --valgrind
 (this branch uses ab/i18n.)

* tr/maint-no-unquote-plus (2010-07-24) 1 commit
 - Do not unquote + into ' ' in URLs

* tr/xsize-bits (2010-07-28) 1 commit
 - xsize_t: check whether we lose bits

* vs/doc-spell (2010-07-20) 1 commit
 - Documentation: spelling fixes

--------------------------------------------------
[Stalled -- would discard unless there are some movements soon]

* by/log-range-diff (2010-07-12) 18 commits
 . Minimum fix to make by/log-range-diff topic at least compile
 . add test cases for '--graph' of line level log
 . line.c output the '--graph' padding before each line
 . add parent rewrite feature to line level log
 . make rewrite_parents an external function
 . some document update
 . add two test cases
 . add --always-print option
 . map/print ranges along traversing the history topologically
 . print the line log
 . map/take range to parent
 . add range clone functions
 . export three functions from diff.c
 . parse the -L options
 . refactor parse_loc
 . add the basic data structure for line level history
 . parse-options: add two helper functions
 . parse-options: stop when encounter a non-option

Perhaps a re-roll is coming?  I suspect that we would have some overlaps
with the mm/shortopt-detached topic.

* ps/gitweb-soc (2010-06-02) 2 commits
  (merged to 'next' on 2010-06-13 at 92245ae)
 + git-instaweb: Add option to reuse previous config file
 + Makefile: Use $(sharedir)/gitweb for target 'install-gitweb'

If we are going to have a configuration variable to control this, I
strongly suspect that --reuse-config should be renamed so that the
variable can be named more sanely and in line with whatever option
that replaces it.

No responses; I think we will eventually want to have a configuration to
always enable the new option, so the renaming of the command line option
is inevitable.  I plan to kick this out of 'next' once the upcoming
release is out, and expect a re-roll with configuration variable.

* js/rebase-origin-x (2010-02-05) 1 commit
 - [RFC w/o test and incomplete] rebase: add -x option to record original commit name

I retract my objection against the idea of -x; needs polishing before
moving forward.

No responses; I plan to drop this entirely after the upcoming release
without prejudice.

* zl/mailinfo-recode-patch (2010-06-14) 2 commits
 - add --recode-patch option to git-am
 - add --recode-patch option to git-mailinfo

I recall there was another round of re-roll planned for this one.

* jk/tag-contains (2010-07-05) 4 commits
 - Why is "git tag --contains" so slow?
 - default core.clockskew variable to one day
 - limit "contains" traversals based on commit timestamp
 - tag: speed up --contains calculation

--------------------------------------------------
[Cooking]

* ab/report-corrupt-object-with-type (2010-06-10) 1 commit
 - sha1_file: Show the the type and path to corrupt objects

* cc/revert (2010-07-21) 5 commits
 - t3508: add check_head_differs_from() helper function and use it
 - revert: improve success message by adding abbreviated commit sha1
 - revert: don't print "Finished one cherry-pick." if commit failed
 - revert: refactor commit code into a new run_git_commit() function
 - revert: report success when using option --strategy

* en/fast-export-fix (2010-07-17) 2 commits
 - fast-export: Add a --full-tree option
 - fast-export: Fix dropping of files with --import-marks and path limiting

* jn/parse-date-basic (2010-07-15) 1 commit
 - Export parse_date_basic() to convert a date string to timestamp
 (this branch is used by rr/svn-export.)

* kf/post-receive-sample-hook (2010-07-16) 1 commit
 - post-receive-email: optional message line count limit

* tr/rfc-reset-doc (2010-07-18) 5 commits
 - Documentation/reset: move "undo permanently" example behind "make topic"
 - Documentation/reset: reorder examples to match description
 - Documentation/reset: promote 'examples' one section up
 - Documentation/reset: separate options by mode
 - Documentation/git-reset: reorder modes for soft-mixed-hard progression

* rr/svn-export (2010-07-15) 8 commits
 - Add SVN dump parser
 - Add infrastructure to write revisions in fast-export format
 - Add stream helper library
 - Add string-specific memory pool
 - vcs-svn: treap_search should return NULL for missing items
 - Add treap implementation
 - Add memory pool library
 - Introduce vcs-svn lib
 (this branch uses jn/parse-date-basic.)

* hv/autosquash-config (2010-07-14) 1 commit
 - add configuration variable for --autosquash option of interactive rebase

* jh/graph-next-line (2010-07-13) 2 commits
 - Enable custom schemes for column colors in the graph API
 - Make graph_next_line() available in the graph.h API

* ar/string-list-foreach (2010-07-03) 2 commits
 - Convert the users of for_each_string_list to for_each_string_list_item macro
 - Add a for_each_string_list_item macro
 (this branch is used by tf/string-list-init.)

* il/rfc-remote-fd-ext (2010-07-19) 3 commits
 - gitignore: Ignore the new /git-remote-{ext,fd} helpers
 - New remote helper: git-remote-ext
 - New remote helper git-remote-fd

* gp/pack-refs-remove-empty-dirs (2010-07-06) 1 commit
  (merged to 'next' on 2010-07-14 at 7d25131)
 + pack-refs: remove newly empty directories

* hv/submodule-find-ff-merge (2010-07-07) 3 commits
 - Implement automatic fast-forward merge for submodules
 - setup_revisions(): Allow walking history in a submodule
 - Teach ref iteration module about submodules

* jn/fast-import-subtree (2010-06-30) 1 commit
 - Teach fast-import to import subtrees named by tree id

* sg/rerere-gc-old-still-used (2010-07-13) 2 commits
 - rerere: fix overeager gc
 - mingw_utime(): handle NULL times parameter

* tf/string-list-init (2010-07-04) 1 commit
 - string_list: Add STRING_LIST_INIT macro and make use of it.
 (this branch uses ar/string-list-foreach.)

* en/d-f-conflict-fix (2010-07-27) 7 commits
  (merged to 'next' on 2010-07-28 at 75e8ac1)
 + t/t6035-merge-dir-to-symlink.sh: Remove TODO on passing test
  (merged to 'next' on 2010-07-14 at 2b2a810)
 + fast-import: Improve robustness when D->F changes provided in wrong order
 + fast-export: Fix output order of D/F changes
 + merge_recursive: Fix renames across paths below D/F conflicts
 + merge-recursive: Fix D/F conflicts
 + Add a rename + D/F conflict testcase
 + Add additional testcases for D/F conflicts

* ab/i18n (2010-07-19) 2 commits
 - tests: rename test to work around GNU gettext bug
 - Add infrastructure for translating Git with gettext
 (this branch is used by tr/ab-i18n-fix.)

* tc/checkout-B (2010-06-24) 3 commits
 - builtin/checkout: learn -B
 - builtin/checkout: reword hint for -b
 - add tests for checkout -b

* eb/double-convert-before-merge (2010-07-02) 3 commits
 - Don't expand CRLFs when normalizing text during merge
 - Try normalizing files to avoid delete/modify conflicts when merging
 - Avoid conflicts when merging branches with mixed normalization


* Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)
  2010-07-29  4:00 What's cooking in git.git (Jul 2010, #05; Wed, 28) Junio C Hamano
@ 2010-07-30 18:37 ` Jeff King
  2010-07-31  6:07   ` jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)) Jonathan Nieder
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-07-30 18:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jul 28, 2010 at 09:00:16PM -0700, Junio C Hamano wrote:

> --------------------------------------------------
> [Stalled -- would discard unless there are some movements soon]
> [...]
> * jk/tag-contains (2010-07-05) 4 commits
>  - Why is "git tag --contains" so slow?
>  - default core.clockskew variable to one day
>  - limit "contains" traversals based on commit timestamp
>  - tag: speed up --contains calculation

What do we want to do with this?

The first patch by itself produces a pretty big speedup for Ted's case,
and does not impact correctness.  However, it does do a mindless
depth-first search, so there are cases where it can be slower than the
current code (basically, if you never have to go to the roots for your
tagset, then my code will be slower, as it will almost certainly go to
the roots, but it will do so only one time for the whole set, instead of
potentially once per tag).
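
To make the tradeoff concrete, here is a minimal sketch in Python (not the
C code from the patch). It assumes a hypothetical commit graph given as a
dict mapping each commit name to its list of parents; the function name and
data layout are illustrative only. The walk does not stop early, so it may
go all the way to the roots, but the cache means each commit is examined
only once for the whole set of tags.

```python
def tags_containing(parents, tags, target):
    """Return the names of tags whose history includes `target`.

    parents: dict mapping commit -> list of parent commits (hypothetical layout)
    tags:    dict mapping tag name -> commit the tag points at
    """
    cache = {}  # commit -> bool, shared across all tags

    def contains(start):
        # Iterative depth-first search, so deep histories do not hit
        # Python's recursion limit.
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in cache:
                stack.pop()
                continue
            if commit == target:
                cache[commit] = True
                stack.pop()
                continue
            pending = [p for p in parents.get(commit, []) if p not in cache]
            if pending:
                # Resolve the parents first, then come back to this commit.
                stack.extend(pending)
                continue
            cache[commit] = any(cache[p] for p in parents.get(commit, []))
            stack.pop()
        return cache[start]

    return sorted(name for name, tip in tags.items() if contains(tip))
```

With a history where E merges C and D, both of which descend from B and A,
asking which of the tags v0 (at A), v1 (at B) and v2 (at E) contain C
returns only v2, and B is visited once even though two paths lead to it.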

The second patch by itself is harmless, as the user has to turn it
on explicitly. And the amount of code is quite small, so even if most
people don't use it, I don't think it is a problem.

The third one is where we start defaulting things to "assume no more
than 1 day of clock skew by default", which can cause incorrect answers
in the face of skew.
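
A rough illustration of how a timestamp cutoff can go wrong, written as a
Python sketch rather than the actual patch. The graph layout, the function
name and the `allowed_skew` parameter are assumptions made for this
example; timestamps are seconds since the epoch, so one day is 86400.

```python
def contains_with_cutoff(parents, timestamps, start, target, allowed_skew):
    """Search from `start` for `target`, pruning commits that look too old."""
    cutoff = timestamps[target] - allowed_skew
    seen = set()
    stack = [start]
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        # A commit older than the cutoff is assumed to have no newer
        # ancestors, so its parents are not explored.
        if timestamps[commit] < cutoff:
            continue
        stack.extend(parents.get(commit, []))
    return False
```

Take a chain T, X, Y where T has timestamp 1000, X was committed with a
clock that was behind and has 800, and Y has 1200. Searching from Y for T
with an allowed skew of 86400 finds T. With an allowed skew of 100 the
cutoff is 900, the walk is pruned at X, and the answer is wrongly "no".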

The fourth is just an illustrative patch for per-repo skew detection.

So if the tradeoff for patch 1 is acceptable, we can merge the first
two. If the tradeoff in patch 3 is acceptable, then we can merge up to
patch 3. The fourth one should be thrown out either way. I can work up a
"detect clock skew on clone and gc" patch based on it if we want to go
that way.

-Peff


* jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-07-30 18:37 ` Jeff King
@ 2010-07-31  6:07   ` Jonathan Nieder
  2010-07-31 12:33     ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Nieder @ 2010-07-31  6:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

Jeff King wrote:

> What do we want to do with this?

Probably I have already said too much about this topic, but here I go:

> The third one is where we start defaulting things to "assume no more
> than 1 day of clock skew by default", which can cause incorrect answers
> in the face of skew.

I think the default should be something that (just barely) works
correctly for linux-2.6.git.

> The fourth is just an illustrative patch for per-repo skew detection.

I have been hoping for a chance to look these over, but time hasn’t come
my way yet.

>                                                          I can work up a
> "detect clock skew on clone and gc" patch based on it if we want to go
> that way.

That sounds very sane.

Additional things to do (this is mostly a note to myself):

 - refuse to commit with a timestamp long before any parent

 - refuse to make a commit that would make the total slop too high?

 - check slop and warn about it in fsck (maybe your patch does this
   already)

 - document the maximum-total-slop and maximum-single-commit-slop
   rules!


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-07-31  6:07   ` jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)) Jonathan Nieder
@ 2010-07-31 12:33     ` Jeff King
  2010-08-02  4:04       ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-07-31 12:33 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Sat, Jul 31, 2010 at 01:07:03AM -0500, Jonathan Nieder wrote:

> > The third one is where we start defaulting things to "assume no more
> > than 1 day of clock skew by default", which can cause incorrect answers
> > in the face of skew.
> 
> I think the default should be something that (just barely) works
> correctly for linux-2.6.git.

I am tempted by that (and it is why I made the fourth patch to actually
calculate the worst skew). But my concern is that there are projects
with even worse skew. Maybe that is unfounded.

> > The fourth is just an illustrative patch for per-repo skew detection.
> 
> I have been hoping for a chance to look these over, but time hasn’t come
> my way yet.

It is just a git-skew program to calculate the skew, but doesn't do
anything fancy like detect-on-gc. However, it would be nice to have
somebody sanity check the algorithm. Looking at it again, I think it
might actually miss some skew if the skewed commit can be reached in
multiple ways.
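
One possible definition of "worst skew", stated here as an assumption for
illustration and not as a description of what the git-skew patch does: for
every commit, take the newest timestamp found among all of its ancestors,
and record how far that exceeds the commit's own timestamp. The Python
sketch below uses a hypothetical dict-based graph. Because the newest
ancestor timestamp is stored per commit rather than per path, the result
does not depend on how many ways a commit can be reached.

```python
def worst_skew(parents, timestamps):
    """Largest amount by which any ancestor is newer than a descendant."""
    newest = {}  # commit -> newest timestamp among the commit and its ancestors

    def resolve(start):
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in newest:
                stack.pop()
                continue
            pending = [p for p in parents.get(commit, []) if p not in newest]
            if pending:
                stack.extend(pending)
                continue
            newest[commit] = max(
                [timestamps[commit]] + [newest[p] for p in parents.get(commit, [])]
            )
            stack.pop()

    worst = 0
    for commit in timestamps:
        resolve(commit)
        for p in parents.get(commit, []):
            # newest[p] covers p and everything behind it.
            worst = max(worst, newest[p] - timestamps[commit])
    return worst
```

For a chain with timestamps 1000, 950, 900 this yields 100, even though
each individual parent-to-child step is off by only 50.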

> Additional things to do (this is mostly a note to myself):
> 
>  - refuse to commit with a timestamp long before any parent

Agreed.

>  - refuse to make a commit that would make the total slop too high?

That would be expensive at commit time, and if we bound each individual
commit-to-parent relationship as you mention above, I don't think it
should be necessary.

>  - check slop and warn about it in fsck (maybe your patch does this
>    already)

No, it doesn't, but it is something we should probably do.

>  - document the maximum-total-slop and maximum-single-commit-slop
>    rules!

Definitely.

-Peff


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-07-31 12:33     ` Jeff King
@ 2010-08-02  4:04       ` Junio C Hamano
  2010-08-02 20:02         ` Jonathan Nieder
  2010-08-05 17:56         ` jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)) Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-08-02  4:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> On Sat, Jul 31, 2010 at 01:07:03AM -0500, Jonathan Nieder wrote:
>
>> > The third one is where we start defaulting things to "assume no more
>> > than 1 day of clock skew by default", which can cause incorrect answers
>> > in the face of skew.
>> 
>> I think the default should be something that (just barely) works
>> correctly for linux-2.6.git.
>
> I am tempted by that (and it is why I made the fourth patch to actually
> calculate the worst skew). But my concern is that there are projects
> with even worse skew. Maybe that is unfounded.
>
>> > The fourth is just an illustrative patch for per-repo skew detection.
>> 
>> I have been hoping for a chance to look these over, but time hasn’t come
>> my way yet.

Sorry, but I am right in the middle of physically moving, so my weekend
and evening git time has been nil recently.

> It is just a git-skew program to calculate the skew, but doesn't do
> anything fancy like detect-on-gc. However, it would be nice to have
> somebody sanity check the algorithm. Looking at it again, I think it
> might actually miss some skew if the skewed commit can be reached in
> multiple ways.
>
>> Additional things to do (this is mostly a note to myself):
>> 
>>  - refuse to commit with a timestamp long before any parent
>
> Agreed.

You need to be careful here, though.  What if you pulled from somebody
whose clock is set grossly in the future?

>>  - check slop and warn about it in fsck (maybe your patch does this
>>    already)
>
> No, it doesn't, but it is something we should probably do.

I wonder if we can make fsck notice a commit with a wrong timestamp
(i.e. older than some of its parents) and make a note of it (hopefully
they are a minuscule minority)---then during the revision traversal when we
hit such a commit, we perhaps ignore its timestamp (pretending as if its
timestamp is one of its children or parent---I haven't thought about the
details, but the note fsck leaves can record what adjusted timestamp
should be used) to fix the issue?
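
As a rough sketch of the detection half of this idea, in Python and with a
hypothetical dict-based commit graph (nothing here reflects fsck's actual
code), the commits to make a note of are those older than at least one of
their parents:

```python
def skewed_commits(parents, timestamps):
    """Return commits whose timestamp is older than at least one parent's."""
    return sorted(
        commit
        for commit, ts in timestamps.items()
        if any(timestamps[p] > ts for p in parents.get(commit, []))
    )
```

Note that this checks direct parents only. In a chain with timestamps
1000, 900, 950 it reports just the middle commit, although the last one is
also older than its grandparent, so the adjusted timestamp recorded for a
noted commit has to be taken into account for its descendants as well.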


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-08-02  4:04       ` Junio C Hamano
@ 2010-08-02 20:02         ` Jonathan Nieder
  2010-08-02 20:08           ` Matthieu Moy
  2010-08-05 17:56         ` jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)) Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Jonathan Nieder @ 2010-08-02 20:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

Junio C Hamano wrote:

> You need to be careful here, though.  What if you pulled from somebody
> whose clock is set grossly in the future?

We could check for that and give relevant advice:

 fatal: committer date <date> precedes parent date <date>
 hint: It looks like you are trying to commit on top of a commit
 hint: from 5 years into the future.
 hint: Use "git rebase -f" to rewrite the commit with a more
 hint: sensible date, and please, fix your clocks!

> I wonder if we can make fsck notice a commit with a wrong timestamp
> (i.e. older than some of its parents) and make a note of it (hopefully
> they are a minuscule minority)---then during the revision traversal when we
> hit such a commit, we perhaps ignore its timestamp (pretending as if its
> timestamp is one of its children or parent---I haven't thought about the
> details, but the note fsck leaves can record what adjusted timestamp
> should be used) to fix the issue?

Thanks --- at first glance, this idea would seem to allow much faster
revision limiting.


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-08-02 20:02         ` Jonathan Nieder
@ 2010-08-02 20:08           ` Matthieu Moy
  2010-08-02 20:19             ` jk/tag-contains Jonathan Nieder
  0 siblings, 1 reply; 12+ messages in thread
From: Matthieu Moy @ 2010-08-02 20:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Jeff King, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Junio C Hamano wrote:
>
>> You need to be careful here, though.  What if you pulled from somebody
>> whose clock is set grossly in the future?
>
> We could check for that and give relevant advice:
>
>  fatal: committer date <date> precedes parent date <date>
>  hint: It looks like you are trying to commit on top of a commit
>  hint: from 5 years into the future.
>  hint: Use "git rebase -f" to rewrite the commit with a more
>  hint: sensible date, and please, fix your clocks!

If the problem is the commit you've just pulled, I'd advise against
re-writing it: it's published, it's too late.

Be careful also: Git can hardly guess whether your clock is late, or
whether your co-worker's clock is in the future.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: jk/tag-contains
  2010-08-02 20:08           ` Matthieu Moy
@ 2010-08-02 20:19             ` Jonathan Nieder
  2010-08-02 22:38               ` jk/tag-contains Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jonathan Nieder @ 2010-08-02 20:19 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Junio C Hamano, Jeff King, git

Matthieu Moy wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>>  fatal: committer date <date> precedes parent date <date>
>>  hint: It looks like you are trying to commit on top of a commit
>>  hint: from 5 years into the future.
>>  hint: Use "git rebase -f" to rewrite the commit with a more
>>  hint: sensible date, and please, fix your clocks!
>
> If the problem is the commit you've just pulled, I'd advise against
> re-writing it: it's published, it's too late.

I guess that is the fundamental question.  What do you do when a
completely bogus commit has been published?

(For example, fsck permits extra headers after the "encoding" header,
but a commit object using random such headers would be malformed and
noticeable as such as soon as fsck learns what header is supposed to
come after "encoding".)

I would like it to still be possible to publicly acknowledge a
mistake, make people rewrite their history to remove it, and move on.
But another viable solution here would be to just warn about the
problem and maintain a list of bogus commits as Junio suggested.


* Re: jk/tag-contains
  2010-08-02 20:19             ` jk/tag-contains Jonathan Nieder
@ 2010-08-02 22:38               ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-08-02 22:38 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Matthieu Moy, Jeff King, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> I would like it to still be possible to publicly acknowledge a
> mistake, make people rewrite their history to remove it, and move on.

While I wish the world were that simple, I do not think that is viable.
You may not have any control over your upstream (not to mention the
possibility that the upstream might even be a foreign SCM).

So I'd prefer to see us prepared to be lenient with what we accept from
outside world.


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-08-02  4:04       ` Junio C Hamano
  2010-08-02 20:02         ` Jonathan Nieder
@ 2010-08-05 17:56         ` Jeff King
  2010-08-05 18:22           ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-08-05 17:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Sun, Aug 01, 2010 at 09:04:23PM -0700, Junio C Hamano wrote:

> Sorry, but I am right in the middle of physically moving, so my weekend
> and evening git time has been nil recently.

Didn't you just do that? ;P

> >> Additional things to do (this is mostly a note to myself):
> >> 
> >>  - refuse to commit with a timestamp long before any parent
> >
> > Agreed.
> 
> You need to be careful here, though.  What if you pulled from somebody
> whose clock is set grossly in the future?

Reading the rest of this thread and thinking about it more, I think
warning is the best thing we can do. Because only the user is in a
position to know whether it is their clock or the previous commit that
is in error. And if it is the previous commit, then only the user knows
what the next logical step is: redo the commit, complain to somebody
else, or just ignore and continue.

> I wonder if we can make fsck notice a commit with a wrong timestamp
> (i.e. older than some of its parents) and make a note of it (hopefully
> they are a minuscule minority)---then during the revision traversal when we
> hit such a commit, we perhaps ignore its timestamp (pretending as if its
> timestamp is one of its children or parent---I haven't thought about the
> details, but the note fsck leaves can record what adjusted timestamp
> should be used) to fix the issue?

That's basically a finer-grained version of what I implemented. Mine
finds the _worst_ skew for the whole graph, and never lets you optimize
a traversal cutoff more than that skew. So it is nicely bounded
space-wise, as it is always a single integer, but you waste effort on
the entire traversal because a couple of commits are skewed. Yours
optimizes perfectly, but needs O(skewed commits) storage. Which is
probably a better tradeoff when the number of skewed commits is tiny
(which is what we expect).

I think your technique would work, but with one note. You probably want
to pull the timestamp from the parent (pulling from the child makes no
sense to me, as there can be an infinite number of children), but you
need to process the parent first and pull from its _corrected_
timestamp. Because at least in the linux-2.6 case, there is a run of
skewed commits. So if you have something like:

  A -- B -- C -- D

A, timestamp = 1000
B, timestamp = 900
C, timestamp = 950
D, timestamp = 1100

where obviously the timestamps are shortened to be readable, but are
meant to be seconds-since-epoch.  You'd probably want to end up with:

A, timestamp = 1000
B, timestamp = 1001
C, timestamp = 1002
D, timestamp = 1100

which means recursing all the way to the root, and fixing timestamps as
you back out.
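
The correction can be sketched as follows. This is Python for
illustration only, with a hypothetical dict-based graph and function name;
it assumes the rule implied by the example above, namely that a corrected
timestamp is the larger of the commit's own timestamp and one more than
the newest corrected timestamp of its parents.

```python
def corrected_timestamps(parents, timestamps):
    """Give every commit a timestamp strictly newer than all of its parents."""
    fixed = {}  # commit -> corrected timestamp

    def resolve(start):
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in fixed:
                stack.pop()
                continue
            pending = [p for p in parents.get(commit, []) if p not in fixed]
            if pending:
                # Parents must be corrected before the commit itself.
                stack.extend(pending)
                continue
            floor = max(
                (fixed[p] + 1 for p in parents.get(commit, [])),
                default=timestamps[commit],
            )
            fixed[commit] = max(timestamps[commit], floor)
            stack.pop()

    for commit in timestamps:
        resolve(commit)
    return fixed
```

For the chain above this gives A = 1000, B = 1001, C = 1002, D = 1100.
Only B and C differ from their recorded timestamps, so only those two
would need an entry in whatever store holds the adjustments.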

This seems like just a straight sha1->int mapping, which presumably one
could do using "git notes". Though I worry it could slow down traversal
for all of the lookup misses for non-skewed commits.

-Peff


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-08-05 17:56         ` jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28)) Jeff King
@ 2010-08-05 18:22           ` Junio C Hamano
  2010-08-05 19:35             ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-08-05 18:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, git

Jeff King <peff@peff.net> writes:

> That's basically a finer-grained version of what I implemented. Mine
> finds the _worst_ skew for the whole graph, and never lets you optimize
> a traversal cutoff more than that skew. So it is nicely bounded
> space-wise, as it is always a single integer, but you waste effort on
> the entire traversal because a couple of commits are skewed. Yours
> optimizes perfectly, but needs O(skewed commits) storage. Which is
> probably a better tradeoff when the number of skewed commits is tiny
> (which is what we expect).

One thing missing from the above equation is that the O(skewed commits)
approach will need O(number of commits) look-ups in the skew database (be
it a notes tree or whatever), only to make sure that most of the look-ups
say "no timestamp tweak required".  So I think the global single integer
approach you took would probably be better in the overall picture.


* Re: jk/tag-contains (Re: What's cooking in git.git (Jul 2010, #05; Wed, 28))
  2010-08-05 18:22           ` Junio C Hamano
@ 2010-08-05 19:35             ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2010-08-05 19:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Thu, Aug 05, 2010 at 11:22:37AM -0700, Junio C Hamano wrote:

> > That's basically a finer-grained version of what I implemented. Mine
> > finds the _worst_ skew for the whole graph, and never lets you optimize
> > a traversal cutoff more than that skew. So it is nicely bounded
> > space-wise, as it is always a single integer, but you waste effort on
> > the entire traversal because a couple of commits are skewed. Yours
> > optimizes perfectly, but needs O(skewed commits) storage. Which is
> > probably a better tradeoff when the number of skewed commits is tiny
> > (which is what we expect).
> 
> One thing missing from the above equation is that the O(skewed commits)
> approach will need O(number of commits) look-ups in the skew database (be
> it a notes tree or whatever), only to make sure that most of the look-ups
> say "no timestamp tweak required".  So I think the global single integer
> approach you took would probably be better in the overall picture.

I'm not sure it is that bad. Shouldn't it have the same number of
lookups as this scenario:

  # pretend we have some fake timestamps
  for i in 20 40 60; do
    git notes add -m "fake timestamp" HEAD~$i
  done

  # now time it without notes
  time git log --pretty=raw --no-notes >/dev/null

  # and with notes
  time git log --pretty=raw --show-notes >/dev/null

For me, the timing differences are lost in the noise. So perhaps the
lookup isn't all that expensive compared to the actual traversal.

-Peff


