From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Guilherme G. Piccoli" Subject: Re: [PATCH net] bnx2x: don't reset chip on cleanup if PCI function is offline Date: Tue, 9 Aug 2016 10:27:07 -0300 Message-ID: <57A9DA2B.7080908@linux.vnet.ibm.com> References: <1470687227-21024-1-git-send-email-gpiccoli@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: netdev To: Yuval Mintz , Ariel Elior Return-path: Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:30211 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932389AbcHIN1Q (ORCPT ); Tue, 9 Aug 2016 09:27:16 -0400 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.11/8.16.0.11) with SMTP id u79DOUgq024089 for ; Tue, 9 Aug 2016 09:27:15 -0400 Received: from e24smtp01.br.ibm.com (e24smtp01.br.ibm.com [32.104.18.85]) by mx0b-001b2d01.pphosted.com with ESMTP id 24qe85nup4-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Tue, 09 Aug 2016 09:27:15 -0400 Received: from localhost by e24smtp01.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 9 Aug 2016 10:27:13 -0300 Received: from d24relay02.br.ibm.com (d24relay02.br.ibm.com [9.13.184.26]) by d24dlp01.br.ibm.com (Postfix) with ESMTP id 42027352006C for ; Tue, 9 Aug 2016 09:26:49 -0400 (EDT) Received: from d24av05.br.ibm.com (d24av05.br.ibm.com [9.18.232.44]) by d24relay02.br.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u79DR93H61866006 for ; Tue, 9 Aug 2016 10:27:09 -0300 Received: from d24av05.br.ibm.com (localhost [127.0.0.1]) by d24av05.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u79DR9SR014714 for ; Tue, 9 Aug 2016 10:27:09 -0300 In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On 08/09/2016 09:14 AM, Yuval Mintz wrote: >> When PCI error is detected, in some architectures (like PowerPC) a slot reset is >> performed - the driver's error handlers are in charge of "disable" >> device before the reset, and re-enable it after a successful slot reset. >> >> There are two cases though that another path is taken on the code: if the slot >> reset is not successful or if too many errors already happened in the specific >> adapter (meaning that possibly the device is experiencing a HW failure that slot >> reset is not able to solve), the core PCI error mechanism (called EEH in PowerPC) >> will remove the adapter from the system, since it will consider this as a >> permanent failure on device. > Why would the published resume() from pci_error_handlers be called in this scenario? It isn't. That's why I specifically commented on commit message: "There are two cases though that another path is taken on the code". The code path reach bnx2x_chip_cleanup() on device removal from the system, as seen in the below call trace: bnx2x_chip_cleanup+0x3c0/0x910 [bnx2x] bnx2x_nic_unload+0x268/0xaf0 [bnx2x] bnx2x_close+0x34/0x50 [bnx2x] __dev_close_many+0xd4/0x150 dev_close_many+0xa8/0x160 rollback_registered_many+0x174/0x3f0 rollback_registered+0x40/0x70 unregister_netdevice_queue+0x98/0x110 unregister_netdev+0x34/0x50 __bnx2x_remove+0xa8/0x3a0 [bnx2x] pci_device_remove+0x70/0x110 > >> Also, we avoid the MCP information dump in case of non-recoverable PCI error >> (when adapter is about to be removed), since it will certainly fail. > > We should probably avoid several things here; Why specifically only this? For example, we shouldn't execute bnx2x_timer() in this scenario. But I thought it'd be too much to check every call of a timer function against PCI channel state just to avoid it's execution on this scenario, so I just let it execute, since it seems harmless. On the other hand, MCP dump is, as you said below, a slowpath and surely will fail in the following conditional (since addresses are all like 0xFFF...): /* sanity */ if (trace_shmem_base < MCPR_SCRATCH_BASE(bp) + MCPR_TRACE_BUFFER_SIZE || trace_shmem_base >= MCPR_SCRATCH_BASE(bp) + SCRATCH_BUFFER_SIZE(bp)) So, I added the check to at least be more informative on logs on why it failed. If you desire, we can avoid the timer execution (maybe a del_timer() instead of checking for PCI state?) and bnx2x_period_task() run in case of PCI permanent failure. I'm open to suggestions! >> + if (unlikely(pci_channel_offline(bp->pdev))) { >> + BNX2X_ERR("Cannot dump MCP info while in PCI error\n"); >> + return; >> + } >> + > Nitpicky, but I don't think there's reason to add 'unlikely' to a purely slowpath > Configuration. OK, your call. I can remove it on v2, no problem. > >> val = REG_RD(bp, MCP_REG_MCPR_CPU_PROGRAM_COUNTER); >> if (val == REG_RD(bp, MCP_REG_MCPR_CPU_PROGRAM_COUNTER)) >> BNX2X_ERR("%s" "MCP PC at 0x%x\n", lvl, val); @@ -9415,10 >> +9420,16 @@ unload_error: > > >> - /* Reset the chip */ >> - rc = bnx2x_reset_hw(bp, reset_code); >> - if (rc) >> - BNX2X_ERR("HW_RESET failed\n"); >> + /* Reset the chip, unless PCI function is offline. If we reach this >> + * point following a PCI error handling, it means device is really >> + * in a bad state and we're about to remove it, so reset the chip >> + * is not a good idea. >> + */ >> + if (!pci_channel_offline(bp->pdev)) { >> + rc = bnx2x_reset_hw(bp, reset_code); >> + if (rc) >> + BNX2X_ERR("HW_RESET failed\n"); >> + } > > Why not simply check this at the beginning of the function? Because I wasn't sure if I could drop the entire execution of chip_cleanup(). I tried to keep the most of this function aiming to shutdown the module in a gentle way, like cleaning MAC, stopping queues...but again, I'm open to suggestions and gladly will change this in v2 if you think it's for the best. Thanks for your review, Guilherme