From: Junio C Hamano <gitster@pobox.com>
To: David Turner <dturner@twosigma.com>
Cc: git@vger.kernel.org, peff@peff.net, pclouds@gmail.com
Subject: Re: [PATCH v3] gc: ignore old gc.log files
Date: Fri, 10 Feb 2017 11:47:19 -0800 [thread overview]
Message-ID: <xmqqy3xdkeuw.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170210192019.13927-1-dturner@twosigma.com> (David Turner's message of "Fri, 10 Feb 2017 14:20:19 -0500")
David Turner <dturner@twosigma.com> writes:
> It would be good if automatic gc were useful to server operators.
> A server can end up in a state whre there are
> lots of unreferenced loose objects (say, because many users are doing
> a bunch of rebasing and pushing their rebased branches). Before this
> patch, this state would cause a gc.log file to be created, preventing
> future auto gcs. Then pack files could pile up. Since many git
> operations are O(n) in the number of pack files, this would lead to
> poor performance. Now, the pack files will get cleaned up, if
> necessary, at least once per day. And operators who find a need for
> more-frequent gcs can adjust gc.logExpiry to meet their needs.
>
> Git should never get itself into a state where it refuses to do any
> maintenance, just because at some point some piece of the maintenance
> didn't make progress.
>
> In this commit, git learns to ignore gc.log files which are older than
> (by default) one day old. It also learns about a config, gc.logExpiry
> to manage this. There is also some cleanup: a successful manual gc,
> or a warning-free auto gc with an old log file, will remove any old
> gc.log files.
>
> It might still happen that manual intervention is required
> (e.g. because the repo is corrupt), but at the very least it won't be
> because Git is too dumb to try again.
Thanks, nicely explained.
> + if (fstat(get_lock_file_fd(&log_lock), &st)) {
> + if (errno == ENOENT) {
> + /*
> + * The user has probably intentionally deleted
> + * gc.log.lock (perhaps because they're blowing
> + * away the whole repo), so thre's no need to
> + * report anything here. But we also won't
> + * delete gc.log, because we don't know what
> + * the user's intentions are.
> + */
OK.
> + } else {
> + FILE *fp;
> + int fd;
> + int saved_errno = errno;
> + /*
> + * Perhaps there was an i/o error or another
> + * unlikely situation. Try to make a note of
> + * this in gc.log. If this fails again,
> + * give up and leave gc.log as it was.
> + */
> + rollback_lock_file(&log_lock);
> + fd = hold_lock_file_for_update(&log_lock,
> + git_path("gc.log"),
> + LOCK_DIE_ON_ERROR);
> +
> + fp = fdopen(fd, "w");
> + fprintf(fp, _("Failed to fstat %s: %s"),
> + get_tempfile_path(&log_lock.tempfile),
> + strerror(errno));
I think you meant to use saved_errno in this message and then
> + fclose(fp);
> + commit_lock_file(&log_lock);
possibly assign it back to errno around here?
> + }
> +
> + } else if (st.st_size) {
> + /* There was some error recorded in the lock file */
> commit_lock_file(&log_lock);
> - else
> + } else {
> + /* No error, clean up any old gc.log */
> + unlink(git_path("gc.log"));
> rollback_lock_file(&log_lock);
> + }
> }
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index 1762dfa6a..84ad07eb2 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -67,5 +67,18 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
> test_line_count = 2 new # There is one new pack and its .idx
> '
>
> +test_expect_success 'background auto gc does not run if gc.log is present and recent but does if it is old' '
> + keep=$(ls .git/objects/pack/*.pack|head -1|sed -e "s/pack$/keep/") &&
You could save one process with:
ls .git/objects/pack/*.pack | sed -e "s/pack$/keep/" -e q
but do you even need $keep? I do not see it used below.
> + test_commit foo &&
> + test_commit bar &&
> + git repack &&
> + test_config gc.autopacklimit 1 &&
> + test_config gc.autodetach true &&
> + echo fleem >.git/gc.log &&
> + test_must_fail git gc --auto 2>err &&
> + test_i18ngrep "^error:" err &&
> + test-chmtime =-86401 .git/gc.log &&
> + git gc --auto
> +'
>
> test_done
Other than that I didn't spot anything suspicious. I'll wait to see
what others may want to add.
Thanks.
next prev parent reply other threads:[~2017-02-10 19:47 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-10 19:20 [PATCH v3] gc: ignore old gc.log files David Turner
2017-02-10 19:47 ` Junio C Hamano [this message]
2017-02-10 19:56 ` Junio C Hamano
2017-02-10 20:08 ` Jeff King
2017-02-10 20:44 ` David Turner
2017-02-10 20:51 ` Jeff King
2017-02-10 21:04 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqy3xdkeuw.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox.com \
--cc=dturner@twosigma.com \
--cc=git@vger.kernel.org \
--cc=pclouds@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox