From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: SURA via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org, SURA <sura907@hotmail.com>
Subject: Re: [PATCH] builtin/fetch.c: clean tmp pack after receive signal
Date: Wed, 17 Mar 2021 11:15:07 -0700 [thread overview]
Message-ID: <xmqqsg4twst0.fsf@gitster.g> (raw)
In-Reply-To: <YFEpGGLBgLSdR40V@coredump.intra.peff.net> (Jeff King's message of "Tue, 16 Mar 2021 17:54:32 -0400")
Jeff King <peff@peff.net> writes:
> On Tue, Mar 16, 2021 at 02:53:36AM +0000, SURA via GitGitGadget wrote:
>
>> In Gitee.com, I often use scripts to start a time-limited
>
> Not related to your patch, but I think this name falls afoul of Git's
> trademark policy. See:
>
> https://git-scm.com/trademark
>
> There's also some discussion in this thread:
>
> https://lore.kernel.org/git/20170202022655.2jwvudhvo4hmueaw@sigill.intra.peff.net/
Thanks. On somewhat related to this patch, we also ask contributors
to use their real names so that we do not render the Signed-off-by:
procedure meaningless.
> This isn't quite true. "git gc" will clean up the temporary files, but
> only if the mtime is sufficiently old. The purpose here is to give a
> grace period to avoid deleting a file that is actively being written to.
> However, we use the same grace period that we use for deleting
> unreachable objects, which is absurdly long for this purpose: 2 weeks.
> Probably something like an hour would be more appropriate (since the
> mtime is updated on each write, this would imply a process not making
> forward progress).
I agree that for temporaries the two-week default is way too long,
and I am OK if we decide to shorten the expiration for them
separately from the known-to-be-good-but-unreferenced objects.
> Likewise, we have a tempfile cleanup system already.
>
> I think this hunk:
>
>> @@ -336,6 +339,7 @@ static const char *open_pack_file(const char *pack_name)
>> output_fd = odb_mkstemp(&tmp_file,
>> "pack/tmp_pack_XXXXXX");
>> pack_name = strbuf_detach(&tmp_file, NULL);
>> + tmp_pack_name = pack_name;
>
> ...can just call register_tempfile(). It should also record the result
> so that we don't try to unlink() it after we've already moved it away
> from its temporary name (though it's fairly unlikely for somebody else
> to have used the name in the interim).
>
> I think you'd want to do the same for the tmp_idx_* files, too. Likewise
> for ".rev" files we create starting in v2.31.
>
> I think it would also make sense in create_tmp_packfile(), which is used
> during repacking (a different problem space, but really the same thing:
> if repacking fails for some reason, we probably shouldn't leave a
> useless gigantic half-finished packfile on disk).
>
> We should possibly also do so for tmp_obj_* files. Those can be written
> for a fetch or push via unpack-objects (as well as normal local
> commands). They're not usually as big as a pack, obviously, but I think
> the same principle applies.
>
>> [...]
>
> It would be nice to see some tests covering this functionality, too.
> Reproducing it with signals is likely to be racy and not worth it. But I
> think that right now index-pack reading a bogus pack (say, one that
> fails fsck checks) will leave the tmp_pack_* on disk. And it would not
> if we cleanup tempfiles (again, this would be on any exit, not just
> signal death, but I think that is what we'd want, and also what
> register_tempfile() will do).
Sounds like a good medium difficulty leftover bit.
Thanks.
prev parent reply other threads:[~2021-03-17 18:16 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-16 2:53 [PATCH] builtin/fetch.c: clean tmp pack after receive signal SURA via GitGitGadget
2021-03-16 21:54 ` Jeff King
2021-03-17 18:15 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqsg4twst0.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=peff@peff.net \
--cc=sura907@hotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.