Date: Wed, 6 Dec 2017 14:46:54 +0200
From: "Michael S. Tsirkin"
To: George Cherian
Cc: linux-kernel@vger.kernel.org, George Cherian, Jason Wang,
    davem@davemloft.net, edumazet@google.com, netdev@vger.kernel.org,
    virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] ptr_ring: add barriers
Message-ID: <20171206144633-mutt-send-email-mst@kernel.org>
References: <1512501990-30029-1-git-send-email-mst@redhat.com>
    <7d1ce1b5-edba-b017-3131-37405f1b0c24@caviumnetworks.com>
In-Reply-To: <7d1ce1b5-edba-b017-3131-37405f1b0c24@caviumnetworks.com>

On Wed, Dec 06, 2017 at 02:51:41PM +0530, George Cherian wrote:
> Hi Michael,
>
>
> On 12/06/2017 12:59 AM, Michael S. Tsirkin wrote:
> > Users of ptr_ring expect that it's safe to give the
> > data structure a pointer and have it be available
> > to consumers, but that actually requires an smp_wmb
> > or a stronger barrier.
> This is not the exact situation we are seeing.

Could you test the patch pls?

> Let me try to explain the situation
>
> Affected on ARM64 platform.
> 1) tun_net_xmit calls skb_array_produce, which pushes the skb to the
> ptr_ring, this push is protected by a producer_lock.
>
> 2) Prior to this call the tun_net_xmit calls skb_orphan which calls the
> skb->destructor and sets skb->destructor and skb->sk as NULL.
>
> 2.a) These 2 writes are getting reordered
>
> 3) At the same time in the receive side (tun_ring_recv), which gets executed
> in another core calls skb_array_consume which pulls the skb from ptr ring,
> this pull is protected by a consumer lock.
>
> 4) eventually calling the skb->destructor (sock_wfree) with stale values.
>
> Also note that this issue is reproducible in a long run and doesn't happen
> immediately after the launch of multiple VM's (in fact the particular test
> case launches 56 VM's which do iperf back and forth)
>
> >
> > In absence of such barriers and on architectures that reorder writes,
> > consumer might read an uninitialized value from an skb pointer stored
> > in the skb array. This was observed causing crashes.
> >
> > To fix, add memory barriers. The barrier we use is a wmb, the
> > assumption being that producers do not need to read the value so we do
> > not need to order these reads.
> It is not the case that producer is reading the value, but the consumer
> reading stale value. So we need to have a strict rmb in place.
>
> >
> > Reported-by: George Cherian
> > Suggested-by: Jason Wang
> > Signed-off-by: Michael S. Tsirkin
> > ---
> >
> > George, could you pls report whether this patch fixes
> > the issue for you?
> >
> > This seems to be needed in stable as well.
> >
> >
> >
> >
> >  include/linux/ptr_ring.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> > index 37b4bb2..6866df4 100644
> > --- a/include/linux/ptr_ring.h
> > +++ b/include/linux/ptr_ring.h
> > @@ -101,12 +101,18 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
> >   /* Note: callers invoking this in a loop must use a compiler barrier,
> >    * for example cpu_relax(). Callers must hold producer_lock.
> > + * Callers are responsible for making sure pointer that is being queued
> > + * points to a valid data.
> >    */
> >   static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
> >   {
> >       if (unlikely(!r->size) || r->queue[r->producer])
> >           return -ENOSPC;
> > +     /* Make sure the pointer we are storing points to a valid data. */
> > +     /* Pairs with smp_read_barrier_depends in __ptr_ring_consume. */
> > +     smp_wmb();
> > +
> >       r->queue[r->producer++] = ptr;
> >       if (unlikely(r->producer >= r->size))
> >           r->producer = 0;
> > @@ -275,6 +281,9 @@ static inline void *__ptr_ring_consume(struct ptr_ring *r)
> >       if (ptr)
> >           __ptr_ring_discard_one(r);
> > +     /* Make sure anyone accessing data through the pointer is up to date. */
> > +     /* Pairs with smp_wmb in __ptr_ring_produce. */
> > +     smp_read_barrier_depends();
> >       return ptr;
> >   }
> >
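
For anyone following the ordering argument outside the kernel, the publish
pattern the patch relies on can be sketched in portable C11. This is only an
illustration under stated assumptions: struct item, struct ring, RING_SIZE,
ring_init, ring_produce and ring_consume are hypothetical names, not the
ptr_ring API. Release/acquire atomics stand in for the
smp_wmb()/smp_read_barrier_depends() pair (acquire is stronger than a
dependency barrier), and a single producer and single consumer are assumed
in place of producer_lock/consumer_lock.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4                    /* hypothetical capacity */

struct item {                          /* stand-in for an sk_buff */
    int payload;
};

struct ring {
    _Atomic(struct item *) queue[RING_SIZE];
    size_t producer;                   /* touched only by the producer */
    size_t consumer;                   /* touched only by the consumer */
};

static void ring_init(struct ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        atomic_init(&r->queue[i], NULL);
    r->producer = 0;
    r->consumer = 0;
}

/* Returns 0 on success, -1 if the next slot is still occupied (ring full). */
static int ring_produce(struct ring *r, struct item *it)
{
    if (atomic_load_explicit(&r->queue[r->producer], memory_order_acquire))
        return -1;

    /*
     * Release store: every write the producer made to *it beforehand
     * is ordered before the pointer becomes visible in the slot.
     * This plays the role of the smp_wmb() added by the patch.
     */
    atomic_store_explicit(&r->queue[r->producer], it, memory_order_release);

    if (++r->producer >= RING_SIZE)
        r->producer = 0;
    return 0;
}

/* Returns the oldest queued item, or NULL if the ring is empty. */
static struct item *ring_consume(struct ring *r)
{
    /*
     * Acquire load: pairs with the release store above, so reads
     * through the returned pointer see the producer's earlier writes.
     */
    struct item *it = atomic_load_explicit(&r->queue[r->consumer],
                                           memory_order_acquire);
    if (!it)
        return NULL;

    /* Hand the slot back to the producer. */
    atomic_store_explicit(&r->queue[r->consumer], NULL, memory_order_release);

    if (++r->consumer >= RING_SIZE)
        r->consumer = 0;
    return it;
}
```

In this sketch a producer thread would fill in an item, call ring_produce(),
and a consumer on another core would call ring_consume() and only then
dereference the result. Dropping the release/acquire ordering to relaxed
would permit the kind of stale read described in steps 2.a) to 4) above.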