From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7A7A42AAB for ; Thu, 2 May 2024 10:07:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714644444; cv=none; b=j/BiYWy7G/zwdjPTw7QKPnjXYlOuNmGVQCj5kVg+spgR16puQeWG7we23ChLdipMsGyAdFZqLuaID6VmYrrejP6Nx0WZaje/mtpy9JSCP/3yLbiAPLo5BS5DL/sHZnQwS9va/b0pRo8Y/BIbasic6JgLn0RbWclxDjvtAL1uW58= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714644444; c=relaxed/simple; bh=L/AENLsmoC5DefF05/SUjZQjoJJG9u0ysZASg/GiNbU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=r1Xp1xkd2v68S9FdXmwPocPRruXCyfYQ4XKjPKQEcSyxCNN5/wEtykugBD2lfYV3jC8uY1eec6SPR48H8m2qanPacKNPqcs93T62SGY82faZyhenx2DPAldzBIz0kEjoW+27P1uSlm9u/dIb7xan0YuiFRgNTbPULHLCB6EZM2k= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=FhIHIPbr; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="FhIHIPbr" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 41B0DC113CC; Thu, 2 May 2024 10:07:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1714644444; bh=L/AENLsmoC5DefF05/SUjZQjoJJG9u0ysZASg/GiNbU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=FhIHIPbrkn5of7ojkv+WpNvtr26Xz5XrI1HxTn+niRPTklj8mVPPyx4TzCZ0M3/He 9yuC6c2i98crENMzPMcxMwLwG8i1x7WNd3dEkmUoiopYIqN3DeNYV8JO4DmiKG5hUG vf4NT5hBmSDQI5je3hWIA7kJVtxOgrKnfUDZ8v6UyQfzUCloRApHiWp36Y8AIlwK+q TLjP8Wu/FYNJuLSxw7zyBMoa808XQjSHpaL720U1d2RGOj+S7c4LFV6xW+NWMx/E5k fDANfRqTuwF+OXZHwUW1EBadBVxQkOXjLMqCCxlxF5wWTNpbE9TyQC+DalivdgV/Ig wIgcyqlRw9Lzw== Date: Thu, 2 May 2024 11:07:19 +0100 From: Simon Horman To: Michael Chan Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, andrew.gospodarek@broadcom.com, Kalesh AP , Selvin Thyparampil Xavier , Vikas Gupta , Pavan Chebbi Subject: Re: [PATCH net-next v2 5/6] bnxt_en: Optimize recovery path ULP locking in the driver Message-ID: <20240502100719.GI2821784@kernel.org> References: <20240501003056.100607-1-michael.chan@broadcom.com> <20240501003056.100607-6-michael.chan@broadcom.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240501003056.100607-6-michael.chan@broadcom.com> On Tue, Apr 30, 2024 at 05:30:55PM -0700, Michael Chan wrote: > From: Kalesh AP > > In the error recovery path (AER, firmware recovery, etc), the > driver notifies the RoCE driver via ULP_STOP before the reset > and via ULP_START after the reset, all under RTNL_LOCK. The > RoCE driver can take a long time if there are a lot of QPs to > destroy, so it is not ideal to hold the global RTNL lock. > > Rely on the new en_dev_lock mutex instead for ULP_STOP and > ULP_START. For the most part, we move the ULP_STOP call before > we take the RTNL lock and move the ULP_START after RTNL unlock. > Note that SRIOV re-enablement must be done after ULP_START > or RoCE on the VFs will not resume properly after reset. > > The one scenario in bnxt_hwrm_if_change() where the RTNL lock > is already taken in the .ndo_open() context requires the ULP > restart to be deferred to the bnxt_sp_task() workqueue. > > Reviewed-by: Selvin Thyparampil Xavier > Reviewed-by: Vikas Gupta > Reviewed-by: Pavan Chebbi > Signed-off-by: Kalesh AP > Signed-off-by: Michael Chan Reviewed-by: Simon Horman > diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c > index d9ea6fa23923..4cb0fabf977e 100644 > --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c > +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c > @@ -437,18 +437,20 @@ static int bnxt_dl_reload_down(struct devlink *dl, bool netns_change, > > switch (action) { > case DEVLINK_RELOAD_ACTION_DRIVER_REINIT: { > + bnxt_ulp_stop(bp); > rtnl_lock(); > if (bnxt_sriov_cfg(bp)) { > NL_SET_ERR_MSG_MOD(extack, > "reload is unsupported while VFs are allocated or being configured"); > rtnl_unlock(); > + bnxt_ulp_start(bp, 0); > return -EOPNOTSUPP; > } > if (bp->dev->reg_state == NETREG_UNREGISTERED) { > rtnl_unlock(); > + bnxt_ulp_start(bp, 0); > return -ENODEV; Hi Selvin, Michael, all, FWIIW, I would have used a goto to unwind this and the previous error. No need to need to respin because of this. > } > - bnxt_ulp_stop(bp); > if (netif_running(bp->dev)) > bnxt_close_nic(bp, true, true); > bnxt_vf_reps_free(bp); > @@ -516,7 +518,6 @@ static int bnxt_dl_reload_up(struct devlink *dl, enum devlink_reload_action acti > bnxt_vf_reps_alloc(bp); > if (netif_running(bp->dev)) > rc = bnxt_open_nic(bp, true, true); > - bnxt_ulp_start(bp, rc); > if (!rc) { > bnxt_reenable_sriov(bp); > bnxt_ptp_reapply_pps(bp); > @@ -570,6 +571,8 @@ static int bnxt_dl_reload_up(struct devlink *dl, enum devlink_reload_action acti > dev_close(bp->dev); > } > rtnl_unlock(); > + if (action == DEVLINK_RELOAD_ACTION_DRIVER_REINIT) > + bnxt_ulp_start(bp, rc); > return rc; > }