From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752051AbdBBPyF convert rfc822-to-8bit (ORCPT );
	Thu, 2 Feb 2017 10:54:05 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:19271 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751863AbdBBPyD (ORCPT );
	Thu, 2 Feb 2017 10:54:03 -0500
Subject: Re: [PATCH] xen-netfront: Improve error handling during initialization
To: Ross Lagerwall
References: <1485964222-1501-1-git-send-email-ross.lagerwall@citrix.com>
	<78bd8aab-69db-30cb-45ac-d12e17870447@oracle.com>
Cc: xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
	Juergen Gross, linux-kernel@vger.kernel.org, wei.liu2@citrix.com
From: Boris Ostrovsky
Message-ID:
Date: Thu, 2 Feb 2017 10:54:01 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8BIT
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/02/2017 09:54 AM, Ross Lagerwall wrote:
> On 02/01/2017 06:54 PM, Boris Ostrovsky wrote:
>> On 02/01/2017 10:50 AM, Ross Lagerwall wrote:
>>> Improve error handling during initialization. This fixes a crash when
>>> running out of grant refs when creating many queues across many
>>> netdevs.
>>>
>>> * Delay timer creation so that if initializing a queue fails, the timer
>>> has not been set up yet.
>>> * If creating queues fails (i.e. there are no grant refs available),
>>> call xenbus_dev_fatal() to ensure that the xenbus device is set to the
>>> closed state.
>>> * If no queues are created, don't call xennet_disconnect_backend as
>>> netdev->real_num_tx_queues will not have been set correctly.
>>> * If setup_netfront() fails, ensure that all the queues created are
>>> cleaned up, not just those that have been set up.
>>> * If any queues were set up and an error occurs, call
>>> xennet_destroy_queues() to stop the timer and clean up the napi
>>> context.
>>
>> We need to stop the timer in xennet_disconnect_backend(). I sent a patch
>> a couple of days ago
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg03269.html
>>
>>
>> but was about to resend it with del_timer_sync() moved after
>> napi_synchronize().
>>
>
> OK, but the patch is still relevant since I believe we still need to
> clean up the napi context in this case (plus the patch fixes a lot of
> other issues).

I was only commenting on that specific bullet in the commit message; I
am not arguing against the patch.

>
> But I will respin it on top of your patch(es) and re-test it before
> resending.
>

You can re-test with the patch in the link above; I will not be
re-sending a new version.

Thanks.
-boris
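
For anyone following the thread, the cleanup rules listed in the quoted
commit message can be modelled in plain userspace C. This is only a rough
sketch under stated assumptions, not the xen-netfront code: every sketch_*
name and struct field below is a hypothetical stand-in, the grant-ref pool
is reduced to a counter, and the timer and napi context are reduced to
flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one queue; not the real struct netfront_queue. */
struct sketch_queue {
    bool napi_added;   /* models the napi context having been added */
    bool timer_armed;  /* models the per-queue timer having been set up */
    bool has_grants;   /* models grant refs held by this queue */
};

/* Hypothetical stand-in for the device state. */
struct sketch_dev {
    struct sketch_queue *queues;
    size_t num_queues;  /* queues created, whether or not fully set up */
    int grants_left;    /* models the finite pool of grant refs */
    bool closed;        /* models the effect of xenbus_dev_fatal() */
};

/* Create all queues; only the napi context exists at this point. */
int sketch_create_queues(struct sketch_dev *dev, size_t n)
{
    if (n == 0)
        return -1;
    dev->queues = calloc(n, sizeof(*dev->queues));
    if (!dev->queues)
        return -1;
    for (size_t i = 0; i < n; i++)
        dev->queues[i].napi_added = true;
    dev->num_queues = n;
    return 0;
}

/* Set up one queue. The timer is armed last, so a failure before that
 * point leaves no timer behind. */
int sketch_setup_queue(struct sketch_dev *dev, struct sketch_queue *q)
{
    if (dev->grants_left <= 0)
        return -1;
    dev->grants_left--;
    q->has_grants = true;
    q->timer_armed = true;
    return 0;
}

/* Tear down every created queue, not just those that were set up. */
void sketch_destroy_queues(struct sketch_dev *dev)
{
    assert(dev->queues != NULL || dev->num_queues == 0);
    for (size_t i = 0; i < dev->num_queues; i++) {
        struct sketch_queue *q = &dev->queues[i];

        q->timer_armed = false;  /* stop the timer */
        q->napi_added = false;   /* clean up the napi context */
        if (q->has_grants) {
            q->has_grants = false;
            dev->grants_left++;  /* return the grant ref to the pool */
        }
    }
    free(dev->queues);
    dev->queues = NULL;
    dev->num_queues = 0;
}

/* Bring the device up, or leave it closed with nothing allocated. */
int sketch_connect(struct sketch_dev *dev, size_t n, int grants)
{
    dev->grants_left = grants;
    dev->closed = false;

    if (sketch_create_queues(dev, n) != 0) {
        dev->closed = true;  /* no queues exist, so skip queue teardown */
        return -1;
    }
    for (size_t i = 0; i < dev->num_queues; i++) {
        if (sketch_setup_queue(dev, &dev->queues[i]) != 0) {
            sketch_destroy_queues(dev);
            dev->closed = true;
            return -1;
        }
    }
    return 0;
}
```

Calling sketch_connect() with more queues than available grants should
leave the device marked closed, with no queues allocated and every grant
returned to the pool, which is the outcome the bullets in the commit
message aim for.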