From mboxrd@z Thu Jan 1 00:00:00 1970
From: keith.busch@intel.com (Keith Busch)
Date: Tue, 6 Feb 2018 09:33:47 -0700
Subject: [PATCH]nvme-pci: Fixes EEH failure on ppc
In-Reply-To: <1517867380-18790-1-git-send-email-wenxiong@vmlinux.vnet.ibm.com>
References: <1517867380-18790-1-git-send-email-wenxiong@vmlinux.vnet.ibm.com>
Message-ID: <20180206163347.GG31110@localhost.localdomain>

On Mon, Feb 05, 2018 at 03:49:40PM -0600, wenxiong@vmlinux.vnet.ibm.com wrote:
> @@ -1189,6 +1183,12 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  	struct nvme_command cmd;
>  	u32 csts = readl(dev->bar + NVME_REG_CSTS);
>
> +	/* If PCI error recovery process is happening, we cannot reset or
> +	 * the recovery mechanism will surely fail.
> +	 */
> +	if (pci_channel_offline(to_pci_dev(dev->dev)))
> +		return BLK_EH_HANDLED;
> +

This patch will tell the block layer to complete the request and consider
it a success, but it doesn't look like the command actually completed at
all. You're going to get data corruption this way, right? Is returning
BLK_EH_HANDLED immediately really the right thing to do here?
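
To make the concern concrete, here is a minimal userspace sketch. It is not
kernel code: every toy_* name is a hypothetical stand-in, and it assumes that
a "handled" return makes the block layer complete the request with whatever
status the request already carries (0, meaning success, if nobody set one).

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only; these are not kernel types. */
enum toy_eh_return { TOY_EH_HANDLED, TOY_EH_RESET_TIMER };

struct toy_request {
	int status;       /* 0 means success; nonzero would be an error code */
	bool device_done; /* did the device actually finish the command? */
	bool completed;   /* has the (toy) block layer completed the request? */
};

/*
 * Timeout handler modeled on the quoted hunk: return "handled" right away
 * while the channel is offline, without touching the request status.
 */
static enum toy_eh_return toy_timeout(struct toy_request *req,
				      bool channel_offline)
{
	(void)req; /* status deliberately left as-is, i.e. still 0 */
	if (channel_offline)
		return TOY_EH_HANDLED;
	return TOY_EH_RESET_TIMER;
}

/*
 * Toy block layer (assumption): "handled" means the request is completed
 * with its current status.
 */
static void toy_handle_timeout(struct toy_request *req, bool channel_offline)
{
	if (toy_timeout(req, channel_offline) == TOY_EH_HANDLED)
		req->completed = true;
}

/* True when the caller would see success for a command that never finished. */
static bool toy_false_success(const struct toy_request *req)
{
	return req->completed && req->status == 0 && !req->device_done;
}
```

With a zero-initialized toy_request and channel_offline set to true,
toy_handle_timeout() leaves the request completed with status 0 even though
device_done is still false, which is the case being questioned above. In this
toy model, setting a nonzero status before returning "handled" would at least
let the caller see a failure.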