git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Geoffrey Irving" <irving@naml.us>
To: "Junio C Hamano" <gitster@pobox.com>
Cc: "Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH] cherry: cache patch-ids to avoid repeating work
Date: Fri, 11 Jul 2008 07:58:20 -0700	[thread overview]
Message-ID: <7f9d599f0807110758y6c4ea7bepd726daf4fe5f074c@mail.gmail.com> (raw)
In-Reply-To: <7v3amglxmb.fsf@gitster.siamese.dyndns.org>

On Thu, Jul 10, 2008 at 11:54 PM, Junio C Hamano <gitster@pobox.com> wrote:
> "Geoffrey Irving" <irving@naml.us> writes:
>
>>>> Oops: avoiding the infinite loop only requires reading expected O(1)
>>>> entries on load, so I can fix that if you like.  It would only be all of
>>>> them if it actually did detect the infinite loop.
>>>
>>> I have to admit that you lost me there.  AFAIR the patch-id cache is a
>>> simple commit->patch_id store, right?  Then there should be no way to get
>>> an infinite loop.
>>
>> If every entry is nonnull, find_helper loops forever.
>
> Isn't it sufficient to make this part check the condition as well?
>
> +       if (cache->count >= cache->size)
> +       {
> +               warning("%s is corrupt: count %"PRIu32" >= size %"PRIu32,
> +                       filename, cache->count, cache->size);
> +               goto empty;
> +       }
>
> At runtime you keep the invariants that hashtable always has at most 3/4
> full and whoever wrote the file you are reading must have honored that as
> well, or there is something fishy going on.

Good point.  There's no reason not to check the 3/4 condition.  It
isn't sufficient to avoid the infinite loop, though, since we don't
verify that count is accurate.

Another route would to eliminate the count field entirely, and replace
the count >= size/4*3 check with a statistical one based on the
entries seen so far.  The main advantage of that would be to make
incremental writes simpler by avoiding the need to update the header.
The disadvantage is that there would be a small chance that the map
would grow in size despite being almost empty.  Thoughts on whether we
should do that?

>>> Besides, this is a purely local cache, no?  Never to be transmitted...  So
>>> not much chance of a malicious attack, except if you allow write access to
>>> your local repository, in which case you are endangered no matter what.
>>
>> Yep, that's why it's only a hole in quotes, and why I didn't fix it.
>
> You might want to protect yourself against file corruption, though.
> Checksumming the whole file and checking it at opening defeats the point
> of mmap'ed access, but at least the header may want to be checksummed?

Okay.  I imagine I should use sha1 as the sum?

Geoffrey

  reply	other threads:[~2008-07-11 14:59 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-09  3:53 [PATCH 1/3] cherry: cache patch-ids to avoid repeating work Geoffrey Irving
2008-07-09  5:14 ` Junio C Hamano
2008-07-09  5:26   ` Geoffrey Irving
2008-07-09  6:24     ` Junio C Hamano
2008-07-09 12:18       ` Johannes Schindelin
2008-07-10  3:34       ` [PATCH] " Geoffrey Irving
2008-07-10 14:09         ` Geoffrey Irving
2008-07-10 14:28           ` Johannes Schindelin
2008-07-10 14:33             ` Geoffrey Irving
2008-07-10 15:56               ` Johannes Schindelin
2008-07-11  6:54               ` Junio C Hamano
2008-07-11 14:58                 ` Geoffrey Irving [this message]
2008-07-11 15:36                   ` Johannes Schindelin
2008-07-11 15:41                     ` Geoffrey Irving
2008-07-11 15:48                       ` Johannes Schindelin
     [not found]                         ` <7vej60jln6.fsf@gitster.siamese.dyndns.org>
2008-07-13  3:14                           ` Geoffrey Irving
2008-07-15 16:57                             ` Geoffrey Irving
2008-07-15 21:52                               ` Johannes Schindelin
2008-07-15 22:14                                 ` Junio C Hamano
2008-07-16  6:57                                   ` Karl Hasselström
2008-07-16  7:22                                   ` Johan Herland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7f9d599f0807110758y6c4ea7bepd726daf4fe5f074c@mail.gmail.com \
    --to=irving@naml.us \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).