All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Junio C Hamano <junkio@cox.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] Flag and skip over packfiles known to be invalid.
Date: Fri, 2 Feb 2007 03:51:08 -0500	[thread overview]
Message-ID: <20070202085108.GC20832@spearce.org> (raw)
In-Reply-To: <7virelhsae.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> I haven't looked your patch very closely, but two successive
> "git-repack -a -d" would produce the packfile under the same
> name, but with different parameters (--window or --depth), the
> old .idx and new .pack may not match.
> 
> Can this be possible (time flows from top to bottom):
> 
> 	git-repack -a -d		somebody else
> 
> 	produces a new .pack/.idx
> 
>         mv new .pack to objects/pack
> 
> 					mmap .idx (old)
> 					finds .pack does not match
> 
> 	mv new .idx to objects/pack

Gah.  That's sick.  And *so* totally possible.
 	
> Is this something we care about?
 
Yes and no.

I've pushed the idea that `git repack -a -d` is totally safe to
run in a live repository.  Its clearly not.  *Especially* if you
repack with the same set of objects (as you do above) or if you
repack in rapid succession faster than readers can make progress.
I think both of these are just outside of the realm of "good ideas
to do to live things".

Its like taking your cat to the zoo play with a lion.  Maybe not
the best thing for the cat...


In this particular case that you highlight above this patch would
mark the pack as invalid and never permit it to be accessed again
by the 'somebody else'.

But Git has always been broken for this case.  prepare_packed_git_one
would refuse to open the new index, as it already has a packed_git.
Prior to this patch we'd try to access the packfile and die(),
as it did not match the index that we have.  Now we'd at least
try to locate some other source, or die() over a missing object,
but only after listing the idx/pack mismatch error too.

The only fix for the case above is to unlink the invalid packed_git
from packed_git, so that if we fail to find the object and wind up
calling reprepare_packed_git() we can reload the new index.  I did
consider doing that over the invalid flag, but flipped a coin and
added invalid.  FWIW, maybe invalid is the wrong way to go.


We still have a problem though, as the above case could still
happen to us during the fallback search in read_sha1_file().
In other words two searches ain't enough.  As Dscho pointed out in
the original thread, we probably need to loop until no new packs
are identified before giving up.

I almost submitted a patch to do that tonight, but I couldn't decide
on behavior: should we scan known packs, then try for loose, then
scan packs again until no object or no new pack is found?  Probably.

-- 
Shawn.

  reply	other threads:[~2007-02-02  8:51 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <dda240a4adf0511b3e1ab1eb74abdd28821358b0.1170403175.git.spearce@spearce.org>
2007-02-02  8:00 ` [PATCH 2/2] Flag and skip over packfiles known to be invalid Shawn O. Pearce
2007-02-02  8:34   ` Junio C Hamano
2007-02-02  8:51     ` Shawn O. Pearce [this message]
2007-02-03  5:44       ` Junio C Hamano
2007-02-04  5:02         ` Shawn O. Pearce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070202085108.GC20832@spearce.org \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.