From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: [PATCH net] bnx2x: Improve reliability in case of nested PCI errors
Date: Wed, 27 Dec 2017 12:13:49 -0500 (EST)
Message-ID: <20171227.121349.689207304027894541.davem@davemloft.net>
References: <20171222150139.10244-1-gpiccoli@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: ariel.elior@cavium.com, everest-linux-l2@cavium.com, netdev@vger.kernel.org, gpiccoli@protonmail.ch
To: gpiccoli@linux.vnet.ibm.com
Return-path:
Received: from shards.monkeyblade.net ([184.105.139.130]:38972 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751028AbdL0RNw (ORCPT ); Wed, 27 Dec 2017 12:13:52 -0500
In-Reply-To: <20171222150139.10244-1-gpiccoli@linux.vnet.ibm.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: "Guilherme G. Piccoli"
Date: Fri, 22 Dec 2017 13:01:39 -0200

> While in recovery process of PCI error (called EEH on PowerPC arch),
> another PCI transaction could be corrupted causing a situation of
> nested PCI errors. Also, this scenario could be reproduced with
> error injection mechanisms (for debug purposes).
>
> We observe that in case of nested PCI errors, bnx2x might attempt to
> initialize its shmem and cause a kernel crash due to bad addresses
> read from MCP. Multiple different stack traces were observed depending
> on the point the second PCI error happens.
>
> This patch avoids the crashes by:
>
> * failing PCI recovery in case of nested errors (since multiple
> PCI errors in a row are not expected to lead to a functional
> adapter anyway), and by,
>
> * preventing access to adapter FW when MCP is failed (we mark it as
> failed when shmem cannot get initialized properly).
>
> Reported-by: Abdul Haleem
> Signed-off-by: Guilherme G. Piccoli

Applied, thank you.
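
For illustration only, the two guards described in the quoted commit message could be modeled roughly as below. This is a minimal user-space sketch and not the code from the applied patch: `struct adapter`, `on_pci_error`, `init_shmem`, `fw_command` and the "bad address" check are all hypothetical names and assumptions, and none of them are taken from the bnx2x driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the bnx2x driver's real types. */
enum recovery_result { RECOVERY_OK, RECOVERY_DISCONNECT };

struct adapter {
	bool in_recovery; /* a PCI error is already being handled */
	bool mcp_failed;  /* shmem could not be initialized from the MCP */
};

/* Guard 1: a second PCI error during recovery fails the recovery outright. */
static enum recovery_result on_pci_error(struct adapter *a)
{
	if (a->in_recovery)
		return RECOVERY_DISCONNECT; /* nested error: give up */
	a->in_recovery = true;
	return RECOVERY_OK;
}

/*
 * Mark the MCP as failed when the shmem base looks bogus. Treating zero
 * and all-ones as invalid is an assumption made for this sketch.
 */
static bool init_shmem(struct adapter *a, unsigned int shmem_base)
{
	if (shmem_base == 0u || shmem_base == 0xFFFFFFFFu) {
		a->mcp_failed = true;
		return false;
	}
	a->mcp_failed = false;
	return true;
}

/* Guard 2: refuse to talk to the firmware once the MCP is marked failed. */
static int fw_command(const struct adapter *a, int cmd)
{
	if (a->mcp_failed)
		return -1; /* no firmware access is attempted */
	return cmd;        /* placeholder for a real firmware exchange */
}
```

In this model, a first call to `on_pci_error()` starts recovery, while a second call made before recovery finishes reports a disconnect. After `init_shmem()` rejects an address, every `fw_command()` call returns an error instead of touching the device.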