From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ilya Maximets
Subject: Re: [PATCH] vhost: fix segfault on bad descriptor address.
Date: Fri, 03 Jun 2016 09:01:31 +0300
Message-ID: <57511D3B.1000300@samsung.com>
References: <1463748604-27251-1-git-send-email-i.maximets@samsung.com> <57500E86.3070104@samsung.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org, Huawei Xie, Yuanhan Liu, Dyasly Sergey, Heetae Ahn, Jianfeng Tan
To: Rich Lane
Received: from mailout3.w1.samsung.com (mailout3.w1.samsung.com [210.118.77.13]) by dpdk.org (Postfix) with ESMTP id 2C4A75946 for ; Fri, 3 Jun 2016 08:01:35 +0200 (CEST)
Received: from eucpsbgm2.samsung.com (unknown [203.254.199.245]) by mailout3.w1.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014)) with ESMTP id <0O8600B1XM2LCV80@mailout3.w1.samsung.com> for dev@dpdk.org; Fri, 03 Jun 2016 07:01:33 +0100 (BST)
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 02.06.2016 19:22, Rich Lane wrote:
> On Thu, Jun 2, 2016 at 3:46 AM, Ilya Maximets wrote:
>
>     Hi, Rich.
>     Thank you for testing and analysing.
>
>     On 01.06.2016 01:06, Rich Lane wrote:
>     > On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets wrote:
>     >
>     >     In current implementation guest application can reinitialize vrings
>     >     by executing start after stop. In the same time host application
>     >     can still poll virtqueue while device stopped in guest and it will
>     >     crash with segmentation fault while vring reinitialization because
>     >     of dereferencing of bad descriptor addresses.
>     >
>     >
>     > I see a performance regression with this patch at large packet sizes (> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles. Strangely, there's actually a ~1% performance improvement at small packet sizes.
>     >
>     > The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
>     >
>     > AFAICT this is just the compiler generating bad code. One difference is that it's storing the offset on the stack instead of in a register. A workaround is to move the !desc_addr check outside the unlikely macros.
>     >
>     >     --- a/lib/librte_vhost/vhost_rxtx.c
>     >     +++ b/lib/librte_vhost/vhost_rxtx.c
>     >     @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>     >      	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>     >
>     >      	desc = &vq->desc[desc_idx];
>     >     -	if (unlikely(desc->len < vq->vhost_hlen))
>     >     +	desc_addr = gpa_to_vva(dev, desc->addr);
>     >     +	if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
>     >
>     >
>     > Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) || !desc_addr)".
>     >
>     >      		return -1;
>     >
>     >
>     >     -	desc_addr = gpa_to_vva(dev, desc->addr);
>     >      	rte_prefetch0((void *)(uintptr_t)desc_addr);
>     >
>     >      	virtio_enqueue_offload(m, &virtio_hdr.hdr);
>     >     @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
>     >
>     >      			desc = &vq->desc[desc->next];
>     >      			desc_addr = gpa_to_vva(dev, desc->addr);
>     >     +			if (unlikely(!desc_addr))
>     >
>     >
>     > Workaround: change to "if (!desc_addr)".
>     >
>     >
>     >     +				return -1;
>     >     +
>     >      			desc_offset = 0;
>     >      			desc_avail = desc->len;
>     >      		}
>     >
>
>     What about other places? Is there same issues or it's only inside copy_mbuf_to_desc() ?
>
>
> Only copy_mbuf_to_desc has the issue.

Ok.

Actually, I can't reproduce this performance issue using gcc 4.8.5 from RHEL 7.2.
I'm not sure whether I should post a v2 with the above fixes. Maybe they could be
applied while pushing the patch to the repository?

Best regards, Ilya Maximets.
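
For reference, the two forms of the check discussed above can be compared in isolation. This is only a minimal sketch, assuming GCC or Clang (for __builtin_expect); the unlikely() definition follows the usual DPDK-style macro, while check_hinted(), check_workaround(), assert_forms_agree() and the plain integer arguments are hypothetical stand-ins for the real fields used in copy_mbuf_to_desc(). Both functions return the same result for every input; only the branch-prediction hint differs, which is the part that may change what the compiler generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hint in the style of DPDK's unlikely(); needs GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in: form used in the patch, both conditions hinted. */
int check_hinted(uint32_t desc_len, uint32_t hlen, uint64_t desc_addr)
{
	if (unlikely(desc_len < hlen || !desc_addr))
		return -1;
	return 0;
}

/* Hypothetical stand-in: workaround form, address check left unhinted. */
int check_workaround(uint32_t desc_len, uint32_t hlen, uint64_t desc_addr)
{
	if (unlikely(desc_len < hlen) || !desc_addr)
		return -1;
	return 0;
}

/* Run both forms on a few inputs and assert that they always agree. */
void assert_forms_agree(void)
{
	static const struct {
		uint32_t len;
		uint32_t hlen;
		uint64_t addr;
	} cases[] = {
		{ 64, 12, 0x1000 },	/* valid descriptor */
		{  8, 12, 0x1000 },	/* too short for the header */
		{ 64, 12, 0 },		/* address translation failed */
		{  8, 12, 0 },		/* both conditions at once */
	};
	size_t i;

	for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
		assert(check_hinted(cases[i].len, cases[i].hlen, cases[i].addr) ==
		       check_workaround(cases[i].len, cases[i].hlen, cases[i].addr));
}
```

Calling assert_forms_agree() from a test program confirms the two forms are logically equivalent. Whether they compile to different machine code depends on the compiler version and flags, so comparing the output of "gcc -O3 -S" for both functions is the way to check a particular toolchain.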