git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: Junio C Hamano <gitster@pobox.com>,
	git discussion list <git@vger.kernel.org>
Subject: Re: Tag peeling peculiarities
Date: Fri, 22 Mar 2013 13:42:35 -0400	[thread overview]
Message-ID: <20130322174235.GA29011@sigill.intra.peff.net> (raw)
In-Reply-To: <514475C4.7020901@alum.mit.edu>

[this email is from last week, and I think most of was responded to in
 other parts of the thread, but there were a few loose ends]

On Sat, Mar 16, 2013 at 02:38:12PM +0100, Michael Haggerty wrote:

> >> * Change pack-refs to use the peeled information from ref_entries if it
> >>   is available, rather than looking up the references again.
> > 
> > I don't know that it matters from a performance standpoint (we don't
> > really care how long pack-refs takes, as long as it is within reason,
> > because it is run as part of a gc).  But it would mean that any errors
> > in the file are propagated from one run of pack-refs to the next. I
> > don't know if we want to spend the extra time to be more robust.
> 
> I thought about this argument too.  Either way is OK with me.  BTW would
> it make sense to build a consistency check of peeled references into fsck?

Yeah, I do think an fsck check makes sense. It should not be expensive,
and if there is a realistic corruption/health check for a repo, it makes
sense to me for it to be part of fsck. I do not recall many incidents of
packed-refs corruption in the history of the list, but this fairly
recent one comes to mind:

  http://thread.gmane.org/gmane.comp.version-control.git/217065

On the other hand, if it is just as cheap to rewrite the file as it is
to do the health checks, I wonder if the advice should simply be "run
pack-refs again" (and doing so should always start from scratch, not
trusting the existing version).

> > Yuck. I really wonder if repack_without_ref should literally just be
> > writing out the exact same file, minus the lines for the deleted ref.
> > Is there a reason we need to do any processing at all? I guess the
> > answer is that you want to avoid re-parsing the current file, and just
> > write it out from memory. But don't we need to refresh the memory cache
> > from disk under the lock anyway? That was the pack-refs race that I
> > fixed recently.
> 
> It would certainly be thinkable to rewrite the file textually without
> parsing it again.  But I did it this way for two reasons:
> 
> 1. It would be better to have one packed-refs file parser and writer
> rather than two.  (I'm working towards unifying the two writers, and
> will continue once I understand their differences.)

I see your point, though I also feel that with the right refactoring,
they could share the parser. And the second writer would be mostly a
pass-through. But much more compelling is reason 2...

> 2. Given how peeled references were added, it seems dangerous to read,
> modify, and write data that might be in a future format that we don't
> entirely understand.  For example, suppose a new feature is added to
> mark Junio's favorite tags:
> 
>     # pack-refs with: peeled fully-peeled junios-favorites \n
>     ce432cac30f98b291be609a0fc974042a2156f55 refs/heads/master
>     83b3dd7151e7eb0e5ac62ee03aca7e6bccaa8d07 refs/tags/evil-dogs
>     ^e1d04f8aad59397cbd55e72bf8a1bd75606f5350
>     7ed863a85a6ce2c4ac4476848310b8f917ab41f9 refs/tags/lolcats
>     ^990f856a62a24bfd56bac1f5e4581381369e4ede
>     ^^^junios-favorite
>     b0517166ae2ad92f3b17638cbdee0f04b8170d99 refs/tags/nonsense
>     ^4baff50551545e2b6825973ec37bcaf03edb95fe

Hmm. Good point. It is tempting to make a rule like "extensions that
are specific to a ref must come after the ref but before the next ref".
And then the writer could simply drop any lines between the to-delete
ref and the one that follows it.

I think that would work OK in practice, but I am worried that it would
paint us into a corner later on (after all, we do not know what
extensions will be added in the future). The only thing I can think of
that would break is something where a ref or its extension depends on
previous entries.  E.g., we start prefix-compressing the ref names to
save space. Of course that would break backwards compatibility horribly
anyway, so it's a no-go. But maybe some extension would want to refer to
a prior ref. I dunno.

-Peff

      parent reply	other threads:[~2013-03-22 17:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-13 14:59 Tag peeling peculiarities Michael Haggerty
2013-03-13 15:34 ` Michael Haggerty
2013-03-13 17:29 ` Junio C Hamano
2013-03-13 21:58   ` Jeff King
2013-03-14  4:41     ` Michael Haggerty
2013-03-14  5:24       ` Jeff King
2013-03-14  5:32         ` Jeff King
2013-03-14 15:14           ` Junio C Hamano
2013-03-14 11:28         ` Michael Haggerty
2013-03-14 13:40           ` Jeff King
2013-03-15  5:12             ` Michael Haggerty
2013-03-15 16:28               ` Junio C Hamano
2013-03-16  8:48             ` Michael Haggerty
2013-03-16  9:34               ` Jeff King
2013-03-16 13:38                 ` Michael Haggerty
2013-03-18  3:17                   ` Michael Haggerty
2013-03-22 17:42                   ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130322174235.GA29011@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).