From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Guangrong
Subject: Re: [PATCH 09/12] ring: introduce lockless ring buffer
Date: Thu, 28 Jun 2018 22:00:03 +0800
Message-ID:
References: <20180604095520.8563-1-xiaoguangrong@tencent.com>
 <20180604095520.8563-10-xiaoguangrong@tencent.com>
 <20180620055535.GF18985@xz-mi>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
 Xiao Guangrong, dgilbert@redhat.com, qemu-devel@nongnu.org,
 wei.w.wang@intel.com, jiang.biao2@zte.com.cn, pbonzini@redhat.com
To: Peter Xu
In-Reply-To: <20180620055535.GF18985@xz-mi>
Content-Language: en-US
Errors-To: qemu-devel-bounces+gceq-qemu-devel2=m.gmane.org@nongnu.org
Sender: "Qemu-devel"
List-Id: kvm.vger.kernel.org

On 06/20/2018 01:55 PM, Peter Xu wrote:
> On Mon, Jun 04, 2018 at 05:55:17PM +0800, guangrong.xiao@gmail.com wrote:
>
> [...]
>
> (Some more comments/questions for the MP implementation...)
>
>> +static inline int ring_mp_put(Ring *ring, void *data)
>> +{
>> +    unsigned int index, in, in_next, out;
>> +
>> +    do {
>> +        in = atomic_read(&ring->in);
>> +        out = atomic_read(&ring->out);
>
> [0]
>
> Do we need to fetch "out" with load_acquire()?  Otherwise what's the
> pairing of the store_release() below at [1]?

The barrier paired with [1] is the full barrier implied in
atomic_cmpxchg().

> This barrier exists in the SP-SC case, which makes sense to me; I
> assume it is also needed for the MP-SC case, am I right?

We needn't put a memory barrier here, as we do not need to order these
two indexes (in and out) against each other; instead, the memory
barrier (in the SP-SC case as well) orders the update of ring->out
against the updates of ring->data[], as we explained in the previous
mail.
>
>> +
>> +        if (__ring_is_full(ring, in, out)) {
>> +            if (atomic_read(&ring->in) == in &&
>> +                atomic_read(&ring->out) == out) {
>
> Why read again?  After all, the ring API seems to be designed as
> non-blocking.  E.g., I see the poll at [2] below makes more sense,
> since when it reaches [2] there must be a producer that is _doing_
> the queuing, so polling is very likely to complete fast.  However,
> here it seems to be a pure busy poll without any hint.  Then I'm not
> sure whether we should just let the caller decide whether it wants
> to call ring_put() again.

Without it we can easily observe a "strange" behavior where a thread
fails to put its result into the global ring even though we allocated
enough room for it (its capacity >= total requests). That is because
these two indexes can be updated at any time: multiple get and put
operations may complete between reading ring->in and ring->out, so
ring->in can easily pass the value we read from ring->out.

With this check, the failure case only happens when the two 32-bit
indexes overflow to the same value, which helps us catch potential
bugs in the code.

>> +                return -ENOBUFS;
>> +            }
>> +
>> +            /* an entry has been fetched out, retry. */
>> +            continue;
>> +        }
>> +
>> +        in_next = in + 1;
>> +    } while (atomic_cmpxchg(&ring->in, in, in_next) != in);
>> +
>> +    index = ring_index(ring, in);
>> +
>> +    /*
>> +     * smp_rmb() paired with the memory barrier of (A) in ring_mp_get()
>> +     * is implied in atomic_cmpxchg() as we should read ring->out first
>> +     * before fetching the entry, otherwise this assert will fail.
>
> Thanks for all these comments!  These are really helpful for
> reviewers.
>
> However, I'm not sure whether I understand it correctly here on the MB
> of (A) for ring_mp_get() - AFAIU that should correspond to a smp_rmb()
> at [0] above when reading the "out" variable rather than this
> assertion, and that's why I thought at [0] we should have something
> like a load_acquire() there (which contains a rmb()).

Memory barrier (A) in ring_mp_get() enforces the ordering of:

    ring->data[index] = NULL;
    smp_wmb();
    ring->out = out + 1;

And the memory barrier at [0] enforces the ordering of:

    out = ring->out;
    /* smp_rmb() */
    cmpxchg();
    value = ring->data[index];
    assert(value);

[ Note: the assertion and the read of ring->out are on opposite sides
  of the cmpxchg(). ]

Did I understand your question correctly?

>
> Content-wise, I think the code here is correct, since atomic_cmpxchg()
> should have one implicit smp_mb() after all, so we don't need any
> further barriers here.

Yes, it is.