From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: netfront/netback multiqueue exhausting grants
Date: Wed, 20 Jan 2016 10:10:25 -0500
Message-ID: <569FA361.60102@oracle.com>
References: <1453292623.26343.95.camel@citrix.com> <569F9C6A.9070008@oracle.com> <1453301571.26343.127.camel@citrix.com> <569FA195.1020104@citrix.com>
In-Reply-To: <569FA195.1020104@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel , Ian Campbell , xen-devel
Cc: Wei Liu
List-Id: xen-devel@lists.xenproject.org

On 01/20/2016 10:02 AM, David Vrabel wrote:
> On 20/01/16 14:52, Ian Campbell wrote:
>> On Wed, 2016-01-20 at 09:40 -0500, Boris Ostrovsky wrote:
>>> On 01/20/2016 07:23 AM, Ian Campbell wrote:
>>>> There have been a few reports recently[0] which relate to a failure of
>>>> netfront to allocate sufficient grant refs for all the queues:
>>>>
>>>> [ 0.533589] xen_netfront: can't alloc rx grant refs
>>>> [ 0.533612] net eth0: only created 31 queues
>>>>
>>>> Which can be worked around by increasing the number of grants on the
>>>> hypervisor command line or by limiting the number of queues permitted
>>>> by either back or front using a module param (which was broken but is
>>>> now fixed on both sides, but I'm not sure it has been backported
>>>> everywhere such that it is a reliable thing to always tell users as a
>>>> workaround).
>>>>
>>>> Is there any plan to do anything about the default/out of the box
>>>> experience? Either limiting the number of queues or making both ends
>>>> cope more gracefully with failure to create some queues (or both)
>>>> might be sufficient?
>>>>
>>>> I think the crash after the above in the first link at [0] is fixed? I
>>>> think that was the purpose of ca88ea1247df "xen-netfront: update
>>>> num_queues to real created" which was in 4.3.
>>> I think ca88ea1247df is the solution --- it will limit the number of
>>> queues.
>> That's in 4.4, which the first link at [0] claimed to have tested. I can
>> see this fixing the crash, but does it really fix the "actually works with
>> less queues than it tried to get" issue?

That's what I thought it does too. I didn't notice that 4.4 was tested
as well, so maybe not.

-boris

>>
>> In any case having exhausted the grant entries creating queues there aren't
>> any left to shuffle actual data around, is there? (or are those
>> preallocated too?)
> All grants refs for Tx and Rx are preallocated (this is the allocation
> that is failing above).
>
> David