From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 2/2] reftable/block: fix binary search over restart counter
Date: Mon, 11 Mar 2024 06:18:06 +0100
Message-ID: <Ze6UDkipgNsDVCwf@tanuki>
In-Reply-To: <xmqq34t1k12p.fsf@gitster.g>

On Thu, Mar 07, 2024 at 04:40:46PM -0800, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Patrick Steinhardt <ps@pks.im> writes:
> >
> >> The consequence is that `binsearch()` essentially always returns 0,
> >> indicating to us that we must start searching right at the beginning of
> >> the block. This works by chance because we now always do a linear scan
> >> from the start of the block, and thus we would still end up finding the
> >> desired record. But needless to say, this makes the optimization quite
> >> useless.
> >>
> >> Fix this bug by returning whether the current key is smaller than the
> >> searched key. As the current behaviour was correct it is not possible to
> >> write a test. Furthermore it is also not really possible to demonstrate
> >> in a benchmark that this fix speeds up seeking records.
> >
> > This is an amusing bug.  
> 
> Having said all that.
> 
> I have to wonder if it is the custom implementation of binsearch()
> the reftable/basic.c file has, not this particular comparison
> callback.  It makes an unusual expectation on the comparison
> function, unlike bsearch(3) whose compar(a,b) is expected to return
> an answer with the same sign as "a - b".
> 
> I just checked the binary search loops we have in the core part of
> the system, like the one in hash-lookup.c (which takes advantage of
> the random and uniform nature of hashed values to converge faster
> than log2) and ones in builtin/pack-objects.c (both of which are
> absolute bog-standard).  Luckily, we do not use such an unusual
> convention (well, we avoid overhead of compar callbacks to begin
> with, so it is a bit of apples-to-oranges comparison).

Very true, this behaviour caught me by surprise as well, and I do think
it's quite easy to get wrong. Now I would've understood if `binsearch()`
were able to handle and forward errors to the caller by passing -1. And
I almost thought that was the case because `restart_key_less()` can
indeed fail, and it would return a negative value if so. But that error
return code is then not taken as an indicator of failure, but instead
will cause us to treat the current value as smaller than the comparison
key.
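
To make the pitfall concrete, here is a minimal sketch of a
predicate-style search next to the two kinds of callbacks. It is not the
code in reftable/basic.c: `find_first_true()`, `struct search_args`,
`three_way()` and `wanted_is_less()` are made-up names, and the keys are
plain C strings for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate-style search, not reftable's binsearch().
 * Returns the smallest index in [0, sz) for which f() is non-zero, or
 * sz if there is none. Assumes f() is zero for a (possibly empty)
 * prefix of the indices and non-zero for all remaining ones.
 */
static size_t find_first_true(size_t sz, int (*f)(size_t idx, void *args),
			      void *args)
{
	size_t lo = 0, hi = sz;

	assert(f);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (f(mid, args))
			hi = mid;	/* answer is at mid or before it */
		else
			lo = mid + 1;	/* answer is after mid */
	}
	return lo;
}

struct search_args {
	const char **keys;	/* sorted keys, stand-in for restart points */
	const char *wanted;
};

/*
 * bsearch(3)-style three-way result. Wrong for a predicate search:
 * both "smaller" and "greater" are non-zero and thus read as "true".
 */
static int three_way(size_t idx, void *p)
{
	struct search_args *a = p;
	return strcmp(a->wanted, a->keys[idx]);
}

/* Boolean predicate: non-zero iff the wanted key sorts before keys[idx]. */
static int wanted_is_less(size_t idx, void *p)
{
	struct search_args *a = p;
	return strcmp(a->wanted, a->keys[idx]) < 0;
}
```

With the sorted keys "a", "c", "e", "g" and the wanted key "f", the
three-way callback makes the search collapse to index 0, while the
boolean one yields 3, the first key greater than "f". A -1 error return
is just another non-zero value to such a search.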

But we do know to bubble the error up via the passed-in args by setting
`args->error = -1`. Funny thing though: I just now noticed that we check
for `args.error` _before_ we call `binsearch()`. Oops.
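
The ordering problem looks roughly like this. This is a sketch under
assumptions and not the reftable code: `struct lookup_args`,
`first_true()`, `key_less()` and `lookup()` are invented for
illustration, and `fail_at` merely simulates a record that fails to
decode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct lookup_args {
	const int *keys;	/* sorted keys */
	int wanted;
	size_t fail_at;		/* hypothetical: index that "fails", SIZE_MAX for none */
	int error;		/* set by the callback, read by the caller */
};

/* Hypothetical predicate search: smallest index where f() is non-zero. */
static size_t first_true(size_t sz, int (*f)(size_t idx, void *args),
			 void *args)
{
	size_t lo = 0, hi = sz;

	assert(f);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (f(mid, args))
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;
}

static int key_less(size_t idx, void *p)
{
	struct lookup_args *a = p;

	if (idx == a->fail_at) {
		a->error = -1;
		return -1;	/* non-zero: the search reads this as "true" */
	}
	return a->wanted < a->keys[idx];
}

/* Returns 0 and sets *out on success, or a negative error code. */
static int lookup(struct lookup_args *a, size_t n, size_t *out)
{
	size_t i;

	a->error = 0;
	/* Checking a->error here, before the search, can never fire. */
	i = first_true(n, key_less, a);
	if (a->error)		/* the callback may have failed mid-search */
		return a->error;
	*out = i;
	return 0;
}
```

The point is only the position of the `a->error` check: it has to come
after the search has invoked the callback, otherwise the failure is
silently folded into the search result.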

I will send a follow-up patch that addresses these issues.

Patrick