qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Claudio Fontana <claudio.fontana@huawei.com>
To: Eric Blake <eblake@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	hw.claudio@gmail.com, Luiz Capitulino <lcapitulino@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Gonglei <arei.gonglei@huawei.com>,
	qemu-devel@nongnu.org, karl@freefriends.org
Subject: Re: [Qemu-devel] [RFC v5 1/2] util: add memmem replacement function
Date: Fri, 15 May 2015 17:13:42 +0200	[thread overview]
Message-ID: <55560D26.3080202@huawei.com> (raw)
In-Reply-To: <5556090E.7070408@redhat.com>

Hello Eric,

On 15.05.2015 16:56, Eric Blake wrote:
> On 05/15/2015 08:12 AM, Paolo Bonzini wrote:
>>
>>
>> On 15/05/2015 15:57, Claudio Fontana wrote:
>>> The header here mentions GPLv2+, but the module data for memmem in gnulib mentions LGPLv2+.
>>>
>>> Very confusing.
>>>
>>> The COPYING file in gnulib mentions:
>>>
>>> "The files in here are mostly copyright (C) Free Software Foundation, and
>>> are under assorted licenses.  Mostly, but not entirely, GPL.
>>>
>>> Many modules are provided dual-license, either GPL or LGPL at your
>>> option.  The headers of files in the lib directory (e.g., lib/error.c)
>>> state GPL for convenience, since the bulk of current gnulib users are
>>> GPL'd programs.  But the files in the modules directory (e.g.,
>>> modules/error) state the true license of each file, [...]
>>> "
>>>
>>> So if one would want to include the gnulib code without using the gnulib esoteric modules mechanisms, what should one do?
>>
>> Change it manually, I guess.
> 
> It's still possible to use gnulib-tool for a one-time export of the
> modules in question into a temporary project with correct licensing
> headers written by gnulib-tool, then copy from that temporary project
> into qemu.  It's probably just as fast to do it by hand, but using
> gnulib-tool in the procedure (and documenting in the commit message what
> that procedure was, to make it easier for the next guy to copy) is
> probably friendlier.  I'm not sure off-hand what the command line would
> be, but could help if that's the approach we want to take.
> 
> Or, since gnulib's memmem is (mostly) sync'd with glibc's memmem (or at
> least was in sync at one point), and glibc is under LGPLv2+, why not
> copy straight from glibc instead of from gnulib, and then you don't have
> to modify any headers?  (I haven't looked lately, but glibc may have
> optimizations that should be backported into gnulib)

I will check the status of glibc's version, seems a good idea to get it from there.

> 
> Historical note: gnulib had this particular implementation of memmem
> first, as written by me under a dual license; then I got glibc to
> incorporate it under just LGPLv2+; then since the initial
> implementation, glibc has been better optimized by others. If you really
> want to, you could observe that since I released the same original
> memmem implementation under multiple licenses at the time I first wrote
> it, you could therefore find and copy an older version under an even
> looser license in the newlib project (but without glibc's enhancements,
> as I can't dual-license someone else's follow-up work):
> https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=newlib/libc/string/memmem.c;h=25704e46;hb=HEAD
> 
> Or back to the original question - why are we worrying about the O(n)
> memmem implementation when the naive O(n^2) is MUCH shorter and easier
> to understand, and where the scaling penalty is only apparent on really
> long corner case strings?  Is our use of memmem going to hit the corner
> cases where scaling matters?

For the specific use case here (memory search functionality),
the haystack is normally the factor which really can scale up a lot.

I would expect the needle to be usually smallish (though for string search sometimes the needle could probably arrive at the threshold of 32).

When scanning a very large memory range, considering a binary string with roughly length around 32, do you think that the non-trivial implementation could bring some noticeable benefits?

If it's below 32, do we get any benefits compared to the trivial implementation?

> 
> [Personal note: I'm still tickled pink every time I see yet another
> project adopt my O(n) code - I didn't come up with the algorithm, but
> merely scraped it from public research, and was shocked to note how many
> libc still had O(n^2) code at the time.  But it is still gratifying to
> see that my implementation is getting such wide adoption]
> 

Well your code and its derived versions have been tried and tested a lot, so picking it up just makes sense to me, to avoid getting into the same corner cases and issues that your implementation already had to go through. And I am a lazy person of course!

Ciao,

Claudio

  reply	other threads:[~2015-05-15 15:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-15 12:25 [Qemu-devel] [RFC v5 0/2] monitor: add memory search commands s, sp hw.claudio
2015-05-15 12:25 ` [Qemu-devel] [RFC v5 1/2] util: add memmem replacement function hw.claudio
2015-05-15 13:57   ` Claudio Fontana
2015-05-15 14:12     ` Paolo Bonzini
2015-05-15 14:56       ` Eric Blake
2015-05-15 15:13         ` Claudio Fontana [this message]
2015-05-15 16:01           ` Eric Blake
2015-05-18  8:36             ` Claudio Fontana
2015-05-15 12:25 ` [Qemu-devel] [RFC v5 2/2] monitor: add memory search commands s, sp hw.claudio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55560D26.3080202@huawei.com \
    --to=claudio.fontana@huawei.com \
    --cc=arei.gonglei@huawei.com \
    --cc=eblake@redhat.com \
    --cc=hw.claudio@gmail.com \
    --cc=karl@freefriends.org \
    --cc=lcapitulino@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).