From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Frans Pop <elendil@planet.nl>,
"Shawn O. Pearce" <spearce@spearce.org>,
git@vger.kernel.org
Subject: Re: Extremely slow progress during 'git reflog expire --all'
Date: Mon, 5 Apr 2010 02:26:21 -0400 [thread overview]
Message-ID: <20100405062621.GA30934@coredump.intra.peff.net> (raw)
In-Reply-To: <7vy6h36pt1.fsf@alter.siamese.dyndns.org>
On Sun, Apr 04, 2010 at 11:22:18AM -0700, Junio C Hamano wrote:
> Thanks for the analysis, but expire_reflog() that is run for each ref
> already does that, I think. It first runs mark_reachable(), then walks
> each reflog entry for the ref to call expire_reflog_ent(), which in turn
> calls unreachable() that first checks if mark_reachable() has marked the
> commit, and if so we don't run in_merge_bases().
Hmm. It looks like mark_reachable() stops traversing when it hits a
commit older than expire_total. I imagine that's to avoid going all the
way to the roots. But if we hit any unreachable entry, in_merge_bases()
is going to have to go all the way to the roots, anyway.
If we just marked everything, couldn't we then trust the REACHABLE
value and not have to do the in_merge_bases double-check? It has worse
best-case cost (since we always go to the root), but better worst-case
(since we can potentially go to the roots for each unreachable reflog
entry).
For example:
> + /*
> + * Unless there was a clock skew, younger ones that are
> + * reachable should have been marked by mark_reachable().
> + */
> + if (cb->cmd->expire_total < commit->date)
> + return 1;
> +
> if (in_merge_bases(commit, &cb->ref_commit, 1))
> return 0;
If we haven't done a "reflog expire" in a while, won't we have a bunch
of old commits that will still need double-checked, and produce the same
slow behavior? Or is that what your "clock skew" is meant to mean? That
we would have removed those old ones already assuming the commit date
and the reflog entry for them match up? That's not necessarily true if
you move to an older commit, which freshens its reflog entry compared to
the commit date.
I wonder if, in addition to your patch, we should remove the
double-check in_merge_bases and simply report those old ones as
reachable. We may be wrong, but we are erring on the side of keeping
entries, and they will eventually expire in the regular cycle (i.e., 90
days instead of 30).
All of that being said, your patch does drop Frans' case down to about
1s of CPU time, so perhaps it is not worth worrying about beyond that.
-Peff
next prev parent reply other threads:[~2010-04-05 6:26 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-02 19:54 Extremely slow progress during 'git reflog expire --all' Frans Pop
2010-04-02 21:28 ` Jeff King
2010-04-02 21:50 ` Frans Pop
2010-04-02 22:41 ` Jeff King
2010-04-03 14:29 ` Frans Pop
2010-04-03 20:33 ` Jeff King
2010-04-03 20:35 ` Jeff King
2010-04-04 18:22 ` Junio C Hamano
2010-04-05 6:26 ` Jeff King [this message]
2010-04-05 18:54 ` Junio C Hamano
2010-04-06 6:02 ` Jeff King
2010-04-07 18:39 ` Re*: " Junio C Hamano
2010-04-07 18:43 ` Junio C Hamano
2010-04-08 7:00 ` Jeff King
2010-04-08 6:52 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100405062621.GA30934@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=elendil@planet.nl \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).