All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Elijah Newren <newren@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 0/2] commit-graph/server-info: use tempfile.h in more places
Date: Fri, 14 Jun 2024 15:35:06 -0400	[thread overview]
Message-ID: <ZmybaokdVIXOX+6D@nand.local> (raw)
In-Reply-To: <20240608104821.GF2659849@coredump.intra.peff.net>

On Sat, Jun 08, 2024 at 06:48:21AM -0400, Jeff King wrote:
> On Thu, Jun 06, 2024 at 06:19:21PM -0400, Taylor Blau wrote:
>
> > Looking at the remaining uses of mkstemp(), the remaining class of
> > callers that don't use the tempfile.h API are for creating temporary
> > .idx, .rev files, and similar. My personal feeling is that we should
> > apply similar treatment there, since these files are generated based on
> > .pack data, and thus keeping around temporary copies is unnecessary when
> > they can be regenerated.
>
> And actual loose object and pack files themselves, I think. I think it
> was a deliberate choice long ago to keep those files around, since in
> the worst case they could be used to recover actual repo content (e.g.,
> a failed fetch will leave its tmp_pack_* file around, and you could
> probably extract _some_ objects from it).

Heh, yes. Those were intentionally omitted from this list ;-).

I agree that having the content stick around in failed packfile and
loose object writes is useful as a last-resort recovery mechanism. I
suspect that it is often difficult or impossible to recover the contents
of an object/pack from a broken write, but I think it's better than the
alternative of just throwing it away up front.

> And the philosophy is that we'd leave them sitting around until gc ran
> and found tmp_* in objects/, check their mtimes, and remove them then.
>
> In practice, I don't think I have really seen anybody recover anything
> from a temporary file. You're better off looking at whatever was feeding
> the temporary file (if it was "git add", then look at the filesystem,
> and if it was index-pack, look at the fetch/push source, etc).

Yup.

> So I'd argue that we should just treat object/pack tempfiles like the
> rest, and delete them if they don't make it all the way to the rename
> step. If we really want to support debugging, we could perhaps provide
> a run-time knob to leave them in place (and maybe even have it apply to
> _all_ tempfiles).

I dunno... I don't disagree with what you're saying, but it does seem a
little scary to delete files containing data that we might have been
able to recover.

I think the current series ends at a reasonable stopping point... I'd
honestly have to think more about whether I agree with what you're
saying here or not ;-).

Thanks,
Taylor

      parent reply	other threads:[~2024-06-14 19:35 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-06 22:19 [PATCH 0/2] commit-graph/server-info: use tempfile.h in more places Taylor Blau
2024-06-06 22:19 ` [PATCH 1/2] commit-graph.c: remove temporary graph layers on exit Taylor Blau
2024-06-07 16:33   ` Junio C Hamano
2024-06-07 21:41     ` Taylor Blau
2024-06-07 21:47       ` Junio C Hamano
2024-06-08  9:42         ` Jeff King
2024-06-07 21:49       ` Junio C Hamano
2024-06-08  9:53   ` Jeff King
2024-06-06 22:19 ` [PATCH 2/2] server-info.c: remove temporary info files " Taylor Blau
2024-06-07 16:02   ` Junio C Hamano
2024-06-07 21:44     ` Taylor Blau
2024-06-07 21:49       ` Junio C Hamano
2024-06-08 10:25   ` Jeff King
2024-06-07 15:40 ` [PATCH 0/2] commit-graph/server-info: use tempfile.h in more places Junio C Hamano
2024-06-08 10:48 ` Jeff King
2024-06-14 17:41   ` Elijah Newren
2024-06-14 19:35   ` Taylor Blau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZmybaokdVIXOX+6D@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.