Date: Tue, 19 Apr 2016 19:03:46 -0400
From: "Emilio G. Cota"
Subject: Re: [Qemu-devel] [PATCH 08/10] qht: QEMU's fast, resizable and scalable Hash Table
To: Alex Bennée
Cc: QEMU Developers, MTTCG Devel, Paolo Bonzini, Peter Crosthwaite,
 Richard Henderson, Peter Maydell, Sergey Fedorov
Message-ID: <20160419230346.GA2621@flamenco>
In-Reply-To: <87h9fcbdko.fsf@linaro.org>
References: <1459834253-8291-1-git-send-email-cota@braap.org>
 <1459834253-8291-9-git-send-email-cota@braap.org>
 <87h9fcbdko.fsf@linaro.org>

Hi Alex,

I'm sending a v3 in a few minutes. I've addressed all your comments
there, so I won't duplicate them here; please find inline my replies
to some questions you raised.

On Fri, Apr 08, 2016 at 11:27:19 +0100, Alex Bennée wrote:
> Emilio G. Cota writes:
(snip)
> > +/* call only when there are no readers left */
> > +void qht_destroy(struct qht *ht);
> > +
>
> What's the rationale for making the caller responsible for locking here?

Instead of embedding a mutex in struct qht, I decided to let callers
handle that, since most likely write accesses will be part of other
updates that will require their own lock.

> Are we going to be protected by tb_lock in the MTTCG case rather than a
> finer grained locking inside the qht?

Yes, this is an example of the external-lock-that-other-updates-require
that I refer to above.
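To make this concrete, callers would end up with something along these
lines. This is just a sketch: qht_insert() and the comparator names are
illustrative, since the exact insert entry point isn't quoted above.

#include "qemu/osdep.h"
#include "qemu/thread.h"
#include "qemu/qht.h"   /* or wherever the header ends up */

/*
 * Caller-owned state: the qht plus the lock that already guards the
 * wider update (tb_lock in the MTTCG case). The qht itself takes no
 * lock for writes.
 */
struct my_table {
    struct qht ht;
    QemuMutex lock;
};

/* writer: serialized by the caller's lock, not by the qht */
static void my_table_insert(struct my_table *t, void *obj, uint32_t hash)
{
    qemu_mutex_lock(&t->lock);
    qht_insert(&t->ht, obj, hash);   /* illustrative entry point */
    qemu_mutex_unlock(&t->lock);
}

/* reader: lock-free; the per-bucket seqlock plus the smp_rmb() after
 * reading ht->map keep lookups consistent with concurrent writers */
static void *my_table_lookup(struct my_table *t, qht_lookup_func_t cmp,
                             const void *userp, uint32_t hash)
{
    return qht_lookup(&t->ht, cmp, userp, hash);
}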
(snip)
> > +static inline
> > +void *__qht_lookup(struct qht_bucket *b, struct qht_bucket **far_bucket,
> > +                   int *pos, qht_lookup_func_t func, const void *userp,
> > +                   uint32_t hash)
>
> Aside the inline is there any reason to keep such an implementation
> detail in the header. I certainly couldn't measure a difference after I
> moved them into qht.c.

I thought I measured a small perf. increase in the past with the inline.
But I couldn't replicate it, so I've moved this to qht.c in v3.

(snip)
> > +static inline void *qht_lookup(struct qht *ht, qht_lookup_func_t func,
> > +                               const void *userp, uint32_t hash)
> > +{
> > +    struct qht_bucket *far_bucket = NULL;
> > +    struct qht_bucket *b;
> > +    struct qht_map *map;
> > +    uint32_t version;
> > +    int pos = 0;
> > +    void *ret;
> > +
> > +    map = atomic_read(&ht->map);
> > +    /* paired with smp_wmb() before setting ht->map */
> > +    smp_rmb();
> > +    b = qht_map_to_bucket(map, hash);
> > +
> > +    do {
> > +        version = seqlock_read_begin(&b->sequence);
> > +        ret = __qht_lookup(b, &far_bucket, &pos, func, userp, hash);
> > +    } while (seqlock_read_retry(&b->sequence, version));
>
> OK I was slightly confused by this until I got further along. Maybe some
> comments in the qht_bucket structure to point out the seqlock is
> important when a bucket is at the head of the list.

There is a comment at the top of qht.c that says exactly that. I'd
rather not duplicate it elsewhere.

(snip)
> > +    for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
> > +        if (i == QHT_BUCKET_ENTRIES - 1) {
> > +            dest_hash = &orig->hashes[pos];
> > +            dest_p = &orig->pointers[pos];
> > +        } else {
> > +            dest_hash = &head->hashes[i + 1];
> > +            dest_p = &head->pointers[i + 1];
> > +        }
> > +        hash = *dest_hash;
> > +        p = *dest_p;
> > +
> > +        atomic_set(dest_hash, head->hashes[i]);
> > +        atomic_set(dest_p, head->pointers[i]);
> > +
> > +        atomic_set(&head->hashes[i], hash);
> > +        atomic_set(&head->pointers[i], p);
> > +    }
>
> So the bucket that has just swapped the MRU entry for an evicted entry
> gets placed after head. Is there a chance we end up just bouncing the
> head->next bucket each hit?

Given a certain sequence of hits, it could happen that head->next gets
updated every time, yes. This is of course unlikely, which is why I
think it's fine.

An alternative would be to simply swap head[last] with orig[pos], and
leave it there (see the sketch in the P.S. below). I can't measure much
of a difference between the two options.

(snip until end of patch)

> This looks good and I can see this being a useful utility function
> across QEMU. My main problem was following exactly what happens when
> entries are moved around for the MRU case. As users don't have to have a
> deep knowledge of the implementation details this isn't a major issue
> although a few more expansive comments or ASCII diagrams may help if
> you are feeling up to it ;-)

I added comments in v3 to address this.

Thanks a lot for your review!

		Emilio
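P.S. The "swap head[last] with orig[pos]" alternative I mention above
would look roughly like this, reusing the names from the loop quoted
above. Untested sketch, meant to replace the whole rotation loop:

int last = QHT_BUCKET_ENTRIES - 1;
uint32_t hash = orig->hashes[pos];
void *p = orig->pointers[pos];

/* evict the entry at the tail of the head bucket into orig[pos]... */
atomic_set(&orig->hashes[pos], head->hashes[last]);
atomic_set(&orig->pointers[pos], head->pointers[last]);
/* ...and move the just-hit entry into the head bucket */
atomic_set(&head->hashes[last], hash);
atomic_set(&head->pointers[last], p);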