From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: netfront/netback multiqueue exhausting grants
Date: Wed, 20 Jan 2016 15:02:45 +0000
Message-ID: <569FA195.1020104@citrix.com>
References: <1453292623.26343.95.camel@citrix.com> <569F9C6A.9070008@oracle.com> <1453301571.26343.127.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <1453301571.26343.127.camel@citrix.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell , Boris Ostrovsky , xen-devel
Cc: Wei Liu , David Vrabel
List-Id: xen-devel@lists.xenproject.org

On 20/01/16 14:52, Ian Campbell wrote:
> On Wed, 2016-01-20 at 09:40 -0500, Boris Ostrovsky wrote:
>> On 01/20/2016 07:23 AM, Ian Campbell wrote:
>>> There have been a few reports recently[0] which relate to a failure of
>>> netfront to allocate sufficient grant refs for all the queues:
>>>
>>> [ 0.533589] xen_netfront: can't alloc rx grant refs
>>> [ 0.533612] net eth0: only created 31 queues
>>>
>>> Which can be worked around by increasing the number of grants on the
>>> hypervisor command line or by limiting the number of queues permitted by
>>> either back or front using a module param (which was broken but is now
>>> fixed on both sides, but I'm not sure it has been backported everywhere
>>> such that it is a reliable thing to always tell users as a workaround).
>>>
>>> Is there any plan to do anything about the default/out of the box
>>> experience? Either limiting the number of queues or making both ends cope
>>> more gracefully with failure to create some queues (or both) might be
>>> sufficient?
>>>
>>> I think the crash after the above in the first link at [0] is fixed? I
>>> think that was the purpose of ca88ea1247df "xen-netfront: update num_queues
>>> to real created" which was in 4.3.
>>
>> I think ca88ea1247df is the solution --- it will limit the number of
>> queues.
>
> That's in 4.4, which the first link at [0] claimed to have tested. I can
> see this fixing the crash, but does it really fix the "actually works with
> less queues than it tried to get" issue?
>
> In any case having exhausted the grant entries creating queues there aren't
> any left to shuffle actual data around, is there? (or are those
> preallocated too?)

All grant refs for Tx and Rx are preallocated (this is the allocation that
is failing above).

David
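
To put rough numbers on the preallocation, here is a back-of-the-envelope sketch. None of the constants come from this thread: they are assumptions (4 KiB pages, 8-byte v1 grant entries, a 32-frame grant table limit, 256-slot Tx and Rx rings, one grant per shared ring page), and the function names are hypothetical. Under those assumptions the budget runs out at about the queue count in the log quoted above.

```python
# Rough grant budget for a multiqueue netfront device.
# Every constant below is an ASSUMPTION for illustration, not taken
# from the thread or read from any running system.

PAGE_SIZE = 4096            # bytes per grant table frame (assumed)
GRANT_V1_ENTRY_SIZE = 8     # bytes per v1 grant table entry (assumed)
GNTTAB_MAX_FRAMES = 32      # assumed grant table frame limit
TX_RING_SIZE = 256          # assumed preallocated Tx grant refs per queue
RX_RING_SIZE = 256          # assumed preallocated Rx grant refs per queue
RING_PAGE_GRANTS = 2        # assumed: one grant each for the Tx and Rx ring pages


def total_grants(frames=GNTTAB_MAX_FRAMES):
    """Grant entries available to the guest for a given frame limit."""
    return frames * (PAGE_SIZE // GRANT_V1_ENTRY_SIZE)


def grants_per_queue():
    """Grant refs one queue holds once its Tx and Rx refs are preallocated."""
    return TX_RING_SIZE + RX_RING_SIZE + RING_PAGE_GRANTS


def max_queues(frames=GNTTAB_MAX_FRAMES, reserved=0):
    """Queues that fit after `reserved` grants go to other users
    (other frontends, console, xenstore and so on)."""
    return (total_grants(frames) - reserved) // grants_per_queue()


if __name__ == "__main__":
    print("grants available:", total_grants())
    print("grants per queue:", grants_per_queue())
    print("queues that fit: ", max_queues())
```

With these assumed values 16384 entries are available and each queue takes 514, so 31 queues fit and the 32nd does not. Doubling the frame limit in this model gives 63 queues, which is the arithmetic behind the "increase the number of grants" workaround; any grants held by other devices lower the result further.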