From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCH] ptr_ring: add barriers
Date: Wed, 6 Dec 2017 14:46:54 +0200
Message-ID: <20171206144633-mutt-send-email-mst@kernel.org>
References: <1512501990-30029-1-git-send-email-mst@redhat.com> <7d1ce1b5-edba-b017-3131-37405f1b0c24@caviumnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: George Cherian , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, edumazet@google.com, davem@davemloft.net
To: George Cherian
Content-Disposition: inline
In-Reply-To: <7d1ce1b5-edba-b017-3131-37405f1b0c24@caviumnetworks.com>
Sender: virtualization-bounces@lists.linux-foundation.org
Errors-To: virtualization-bounces@lists.linux-foundation.org
List-Id: netdev.vger.kernel.org

On Wed, Dec 06, 2017 at 02:51:41PM +0530, George Cherian wrote:
> Hi Michael,
>
>
> On 12/06/2017 12:59 AM, Michael S. Tsirkin wrote:
> > Users of ptr_ring expect that it's safe to give the
> > data structure a pointer and have it be available
> > to consumers, but that actually requires an smp_wmb
> > or a stronger barrier.
> This is not the exact situation we are seeing.

Could you test the patch pls?

> Let me try to explain the situation
>
> Affected on ARM64 platform.
> 1) tun_net_xmit calls skb_array_produce, which pushes the skb to the
> ptr_ring, this push is protected by a producer_lock.
>
> 2) Prior to this call the tun_net_xmit calls skb_orphan which calls the
> skb->destructor and sets skb->destructor and skb->sk as NULL.
>
> 2.a) These 2 writes are getting reordered
>
> 3) At the same time in the receive side (tun_ring_recv), which gets executed
> in another core calls skb_array_consume which pulls the skb from ptr ring,
> this pull is protected by a consumer lock.
>
> 4) eventually calling the skb->destructor (sock_wfree) with stale values.
>
> Also note that this issue is reproducible in a long run and doesn't happen
> immediately after the launch of multiple VMs (in fact the particular test
> case launches 56 VMs which do iperf back and forth)
>
> >
> > In absence of such barriers and on architectures that reorder writes,
> > consumer might read an uninitialized value from an skb pointer stored
> > in the skb array. This was observed causing crashes.
> >
> > To fix, add memory barriers. The barrier we use is a wmb, the
> > assumption being that producers do not need to read the value so we do
> > not need to order these reads.
> It is not the case that producer is reading the value, but the consumer
> reading stale value. So we need to have a strict rmb in place.
>
> >
> > Reported-by: George Cherian
> > Suggested-by: Jason Wang
> > Signed-off-by: Michael S. Tsirkin
> > ---
> >
> > George, could you pls report whether this patch fixes
> > the issue for you?
> >
> > This seems to be needed in stable as well.
> >
> >
> >
> >
> >  include/linux/ptr_ring.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> > index 37b4bb2..6866df4 100644
> > --- a/include/linux/ptr_ring.h
> > +++ b/include/linux/ptr_ring.h
> > @@ -101,12 +101,18 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
> >  /* Note: callers invoking this in a loop must use a compiler barrier,
> >   * for example cpu_relax(). Callers must hold producer_lock.
> > + * Callers are responsible for making sure pointer that is being queued
> > + * points to a valid data.
> >   */
> >  static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
> >  {
> >  	if (unlikely(!r->size) || r->queue[r->producer])
> >  		return -ENOSPC;
> > +	/* Make sure the pointer we are storing points to a valid data. */
> > +	/* Pairs with smp_read_barrier_depends in __ptr_ring_consume. */
> > +	smp_wmb();
> > +
> >  	r->queue[r->producer++] = ptr;
> >  	if (unlikely(r->producer >= r->size))
> >  		r->producer = 0;
> > @@ -275,6 +281,9 @@ static inline void *__ptr_ring_consume(struct ptr_ring *r)
> >  	if (ptr)
> >  		__ptr_ring_discard_one(r);
> > +	/* Make sure anyone accessing data through the pointer is up to date. */
> > +	/* Pairs with smp_wmb in __ptr_ring_produce. */
> > +	smp_read_barrier_depends();
> >  	return ptr;
> >  }
> >
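
For readers following along outside the kernel tree, the barrier pairing in the
quoted patch can be approximated in userspace with C11 atomics. This is only a
sketch under stated assumptions, not the ptr_ring code: the names demo_ring,
demo_ring_produce, demo_ring_consume and RING_SIZE are hypothetical, the
release store stands in for smp_wmb() followed by the plain store, and the
acquire load is a deliberately stronger stand-in for the dependency ordering
that smp_read_barrier_depends() provides. It also assumes one producer and one
consumer, which is what the producer_lock and consumer lock give the real code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical size, for illustration only */

/* Hypothetical userspace analogue of a pointer ring; NULL marks an empty slot. */
struct demo_ring {
	_Atomic(void *) queue[RING_SIZE];
	size_t producer; /* index touched only by the producer side */
	size_t consumer; /* index touched only by the consumer side */
};

/* Queue ptr, or return -ENOSPC if the next slot is still occupied. */
static int demo_ring_produce(struct demo_ring *r, void *ptr)
{
	assert(ptr != NULL); /* NULL is reserved as the "empty" marker */

	if (atomic_load_explicit(&r->queue[r->producer], memory_order_relaxed))
		return -ENOSPC;

	/*
	 * Release store: every write made to *ptr before this call is
	 * ordered before the pointer becomes visible in the slot. This
	 * plays the role of smp_wmb() plus the store in the patch.
	 */
	atomic_store_explicit(&r->queue[r->producer], ptr, memory_order_release);
	if (++r->producer >= RING_SIZE)
		r->producer = 0;
	return 0;
}

/* Dequeue the oldest pointer, or return NULL if the ring is empty. */
static void *demo_ring_consume(struct demo_ring *r)
{
	/*
	 * Acquire load: pairs with the release store above, so reads
	 * through the returned pointer see the producer's earlier writes.
	 */
	void *ptr = atomic_load_explicit(&r->queue[r->consumer],
					 memory_order_acquire);

	if (ptr) {
		/* Mark the slot empty so the producer can reuse it. */
		atomic_store_explicit(&r->queue[r->consumer], NULL,
				      memory_order_relaxed);
		if (++r->consumer >= RING_SIZE)
			r->consumer = 0;
	}
	return ptr;
}
```

A caller would zero-initialize a struct demo_ring, finish all writes to an
object (in the tun case, the stores done by skb_orphan), and only then pass
its address to demo_ring_produce; the consumer dereferences only what
demo_ring_consume returns.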