From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH net] xen-netback: bookkeep number of queues in our own module
Date: Wed, 18 Jun 2014 10:30:37 -0400
Message-ID: <53A1A28D.6060203@oracle.com>
References: <1403100558-12866-1-git-send-email-wei.liu2@citrix.com>
 <53A19FCA.1090205@oracle.com>
 <20140618142152.GC20819@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: xen-devel@lists.xen.org, netdev@vger.kernel.org, Ian Campbell, David Vrabel
To: Wei Liu
Return-path:
Received: from aserp1040.oracle.com ([141.146.126.69]:47841 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1752191AbaFRO3F (ORCPT );
 Wed, 18 Jun 2014 10:29:05 -0400
In-Reply-To: <20140618142152.GC20819@zion.uk.xensource.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 06/18/2014 10:21 AM, Wei Liu wrote:
> On Wed, Jun 18, 2014 at 10:18:50AM -0400, Boris Ostrovsky wrote:
>> On 06/18/2014 10:09 AM, Wei Liu wrote:
>>> The original code uses netdev->real_num_tx_queues to bookkeep number of
>>> queues and invokes netif_set_real_num_tx_queues to set the number of
>>> queues. However, netif_set_real_num_tx_queues doesn't allow
>>> real_num_tx_queues to be smaller than 1, which means setting the number
>>> to 0 will not work and real_num_tx_queues is untouched.
>>>
>>> This is bogus when xenvif_free is invoked before any number of queues is
>>> allocated. That function needs to iterate through all queues to free
>>> resources. Using the wrong number of queues results in NULL pointer
>>> dereference.
>>>
>>> So we bookkeep the number of queues in xen-netback to solve this
>>> problem. The usage of real_num_tx_queues in core driver is to cap queue
>>> index to a valid value. In start_xmit we've already guarded against out
>>> of range queue index so we should be fine.
>>>
>>> This fixes a regression introduced by multiqueue patchset in 3.16-rc1.
>>
>> David sent a couple of patches earlier today that I have been testing and
>> they appear to fix both netfront and netback. (I am waiting for 32-bit to
>> finish)
>>
>> http://lists.xenproject.org/archives/html/xen-devel/2014-06/msg02308.html
>>
> I saw that, but they don't fix this backend bug. Try crashing the guest
> before it connects to backend. As I said in commit message:

Apparently it doesn't indeed since 32-bit just crashed on me in
xenvif_free() (the moment I hit Send on my response to you).

But 64-bit run completed without failures. And even 32-bit test ran fine
for a while.

-boris

>
>>> This is bogus when xenvif_free is invoked before any number of queues is
>>> allocated. That function needs to iterate through all queues to free
> netif_set_real_num_tx_queues will need to be removed anyway.
>
> Wei.
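
For readers following the thread, the failure mode in the quoted commit message can be sketched as a small userspace model. This is not the xen-netback code or the actual patch: every `model_*` name below is hypothetical, and the sketch assumes that a freshly allocated device starts with its real TX queue count equal to the allocated maximum. It only illustrates why a rejected "set to 0" leaves a stale loop bound for teardown, and why a driver-private counter avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of a network device. */
struct model_netdev {
    unsigned int num_tx_queues;      /* allocated maximum */
    unsigned int real_num_tx_queues; /* count the core considers in use */
};

/* Hypothetical stand-in for the backend's per-interface state. */
struct model_vif {
    struct model_netdev dev;
    int *queues;             /* NULL until queues are allocated */
    unsigned int num_queues; /* the driver's own bookkeeping */
};

/*
 * Models the range check described in the commit message: a value
 * below 1 is rejected and the stored count is left untouched.
 */
static int model_set_real_num_tx_queues(struct model_netdev *dev,
                                        unsigned int txq)
{
    if (txq < 1 || txq > dev->num_tx_queues)
        return -EINVAL;
    dev->real_num_tx_queues = txq;
    return 0;
}

/*
 * Sets up an interface with no queues allocated yet and tries to
 * record "0 queues" through the device field. Returns the result of
 * that attempt. Assumption: the real count starts at the maximum.
 */
static int model_vif_init(struct model_vif *vif, unsigned int max_queues)
{
    assert(max_queues >= 1);
    vif->dev.num_tx_queues = max_queues;
    vif->dev.real_num_tx_queues = max_queues;
    vif->queues = NULL;
    vif->num_queues = 0;
    return model_set_real_num_tx_queues(&vif->dev, 0);
}

/*
 * Counts how many iterations of a teardown loop with the given bound
 * would touch vif->queues[i] while vif->queues is still NULL. Any
 * nonzero result corresponds to a NULL pointer dereference.
 */
static unsigned int model_unsafe_accesses(const struct model_vif *vif,
                                          unsigned int bound)
{
    unsigned int i, unsafe = 0;

    for (i = 0; i < bound; i++)
        if (vif->queues == NULL)
            unsafe++;
    return unsafe;
}
```

In this model, calling `model_vif_init(&vif, 4)` returns `-EINVAL` and leaves `vif.dev.real_num_tx_queues` at 4, so a teardown loop bounded by that field would index a NULL `queues` array, while a loop bounded by `vif.num_queues` runs zero times.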