Re: [Qemu-devel] [RFC v5 1/2] util: add memmem replacement function

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Eric Blake <eblake@redhat.com>
To: Claudio Fontana <claudio.fontana@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	hw.claudio@gmail.com, Luiz Capitulino <lcapitulino@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Gonglei <arei.gonglei@huawei.com>,
	qemu-devel@nongnu.org, karl@freefriends.org
Subject: Re: [Qemu-devel] [RFC v5 1/2] util: add memmem replacement function
Date: Fri, 15 May 2015 10:01:39 -0600	[thread overview]
Message-ID: <55561863.5040600@redhat.com> (raw)
In-Reply-To: <55560D26.3080202@huawei.com>

[-- Attachment #1: Type: text/plain, Size: 5128 bytes --]

On 05/15/2015 09:13 AM, Claudio Fontana wrote:

>> Or back to the original question - why are we worrying about the O(n)
>> memmem implementation when the naive O(n^2) is MUCH shorter and easier
>> to understand, and where the scaling penalty is only apparent on really
>> long corner case strings?  Is our use of memmem going to hit the corner
>> cases where scaling matters?
> 
> For the specific use case here (memory search functionality),
> the haystack is normally the factor which really can scale up a lot.

Remember, when searching, the worst-case penalty is determined by a
combination of the size of the haystack (H) and the size of the needle
(I'll choose M to represent that, to avoid confusion with the generic n
in Big-O notation).  The most brain-dead loop (for each byte in
haystack, see if the needle matches) is H*M operations; this is
noticeable as quadratic scaling (and even then, only for certain
matching patterns, such as determining if a needle of all 'a' except for
a final 'b' occurs in a haystack of all 'a') only if both needle and
haystack can grow arbitrarily large: O(H*M) == O(n^2).  Where my code
shines is that it breaks the operation into two parts - a factorization
of the needle O(M) == O(n), then a linear search using that factored
needle O(H) == O(n), for an overall cost O(H+M) == O(n). But with the
naive algorithm, if the needle is capped to a fixed size, then you have
O(constant*H) == O(H) == O(n).

Meanwhile, remember that Big-O notation is only about scaling effects,
but that there are still constant factors to be considered - an O(n)
algorithm that requires 3 comparisons per byte of the haystack will
always be slower than an O(n) algorithm that only needs 2 comparisons
per byte of the haystack, even though both algorithms scale linearly.

> 
> I would expect the needle to be usually smallish (though for string search sometimes the needle could probably arrive at the threshold of 32).
> 
> When scanning a very large memory range, considering a binary string with roughly length around 32, do you think that the non-trivial implementation could bring some noticeable benefits?
> 
> If it's below 32, do we get any benefits compared to the trivial implementation?

In fact, one of the glibc optimizations was realizing that my two-way
algorithm code is LOUSY at short needles.  It has a high relative
overhead for finding the proper factorization of a short needle, before
it can even start searching.  A hybrid approach (brute force a short
needle, and only bother with factorizing a long needle, or even after a
heuristic of a minimum number of comparisons is first met) is ideal,
because of the observation above that the poor scaling of quadratic
behavior is only a problem for large needles, while the constant
overhead of factorization is measurably larger in proportion to the rest
of the work done on short needles.

If you are letting the user search for arbitrarily long needles, then an
O(n) scale may be worth the speed penalty on shorter needles, or the
complexity of a hybrid approach (again, I think glibc may have better
hybrid approach than gnulib at the moment).  But as this is an HMP
interface, the user has to type the search string, and is unlikely to be
typing long needles, so it is justifiable to just ditch the complexity
of a hybrid approach and use ONLY the naive O(n^2) algorithm, because as
I argued above, it is still observable only as O(n) for a bounded-size
needle.

> 
>>
>> [Personal note: I'm still tickled pink every time I see yet another
>> project adopt my O(n) code - I didn't come up with the algorithm, but
>> merely scraped it from public research, and was shocked to note how many
>> libc still had O(n^2) code at the time.  But it is still gratifying to
>> see that my implementation is getting such wide adoption]
>>
> 
> Well your code and its derived versions have been tried and tested a lot, so picking it up just makes sense to me, to avoid getting into the same corner cases and issues that your implementation already had to go through. And I am a lazy person of course!

You do have a point here - if we are going to reuse code, then reusing a
debugged implementation rather than writing from scratch is the way to
go.  But my point remains that the complexity of a reused Two-Way
algorithm may not be necessary, and the 3-line naive algorithm is
trivial to review.  What's more, it is also arguable that we only use
the naive algorithm on less-than-stellar platforms that lack memmem
natively (well, we may also use less-than-stellar performance if the
native memmem is not O(n)) - but who is going to be using the HMP
command on such a platform?  And maybe those platforms will finally be
convinced to add a decent memmem if they realize that we are slow on
their platform but not on other platforms, precisely because we are
working around their lack of memmem by intentionally using a slow but
trivial replacement.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

next prev parent reply	other threads:[~2015-05-15 16:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-15 12:25 [Qemu-devel] [RFC v5 0/2] monitor: add memory search commands s, sp hw.claudio
2015-05-15 12:25 ` [Qemu-devel] [RFC v5 1/2] util: add memmem replacement function hw.claudio
2015-05-15 13:57   ` Claudio Fontana
2015-05-15 14:12     ` Paolo Bonzini
2015-05-15 14:56       ` Eric Blake
2015-05-15 15:13         ` Claudio Fontana
2015-05-15 16:01           ` Eric Blake [this message]
2015-05-18  8:36             ` Claudio Fontana
2015-05-15 12:25 ` [Qemu-devel] [RFC v5 2/2] monitor: add memory search commands s, sp hw.claudio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55561863.5040600@redhat.com \
    --to=eblake@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=claudio.fontana@huawei.com \
    --cc=hw.claudio@gmail.com \
    --cc=karl@freefriends.org \
    --cc=lcapitulino@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).