git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Junio C Hamano <junkio@cox.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] Make sure quickfetch is not fooled with a previous, incomplete fetch.
Date: Tue, 17 Apr 2007 02:58:53 -0400	[thread overview]
Message-ID: <20070417065853.GR2229@spearce.org> (raw)
In-Reply-To: <11767920013264-git-send-email-junkio@cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> in the kernel repository, with three different implementations
> of the "check-blob".
> 
>  - Checking with has_sha1_file() has negligible (unmeasurable)
>    performance penalty, which is what this patch has.
> 
>  - Checking with sha1_object_info() makes it somewhat slower,
>    perhaps by 5%.
> 
>  - Checking with read_sha1_file() to cause a fully re-validation
>    is prohibitively expensive (about 4 times as much runtime).
> 
> I am tempted to make the version with has_sha1_file() the
> default, getting rid of this new --check-blob option.

Misses in has_sha1_file suck, as we rescan for packfiles after
failing to find it in the current known packfiles and running a
failed stat().  Per object.

Though if --objects aborts with an error on the first failure, that's
perfectly reasonable performance, and actually quite sane behavior.
So I'm all for having --objects always imply has_sha1_file().

If the object exists enough to satisfy has_sha1_file() I think its
sane enough to assume its safe for a fast-fetch path.  If we didn't
have fast-fetch implemented and we fetched <100 objects we'd use
unpack-objects, which would only call has_sha1_file() anyway, and
find that possibly corrupted object in the source repository...,
oh, and that cannot happen as the source repository had to have a
good copy of the damn object to send it to us in the first place.
So has_sha1_file() is sufficient.

In my opinion, I think the --check-blob option of rev-list should
imply doing the sha1_object_info() type test at least, if not doing
a full-up SHA-1 validation of the blob content.  Which yea, that's
basically most of an fsck...


So I'm in favor of making --objects imply has_sha1_file(), aborting
on the first false return from that, and either make --check-blob
do the stricter checking work, or don't define it at this time.

-- 
Shawn.

      reply	other threads:[~2007-04-17  6:59 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-04-17  6:40 [PATCH] Make sure quickfetch is not fooled with a previous, incomplete fetch Junio C Hamano
2007-04-17  6:58 ` Shawn O. Pearce [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070417065853.GR2229@spearce.org \
    --to=spearce@spearce.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).