From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga03.intel.com ([134.134.136.65]:7985 "EHLO mga03.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751371AbeECUNy (ORCPT ); Thu, 3 May 2018 16:13:54 -0400
Date: Thu, 3 May 2018 14:15:08 -0600
From: Keith Busch
To: Mikulas Patocka
Cc: Sagi Grimberg , Ming Lei , linux-nvme , Keith Busch ,
	linux-pci@vger.kernel.org, Bjorn Helgaas , Christoph Hellwig
Subject: Re: [PATCH] nvme/pci: Use async_schedule for initial reset work
Message-ID: <20180503201507.GO5938@localhost.localdomain>
References: <20180427211708.5604-1-keith.busch@intel.com>
	<20180430194533.GC5938@localhost.localdomain>
	<20180502152953.GH5938@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To:
Sender: linux-pci-owner@vger.kernel.org
List-ID:

On Thu, May 03, 2018 at 10:55:09AM -0400, Mikulas Patocka wrote:
> I think there is still one more bug:
>
> If nvme_probe is called, it schedules the asynchronous work using
> async_schedule - now suppose that the pci system calls the "remove",
> "shutdown" or "suspend" method - this method will race with
> nvme_async_probe running in the async domain - that will cause
> misbehavior.
>
> Or - does the PCI subsystem flush the async queues before calling these
> methods? I'm not sure, but it doesn't seem so.
>
> I think, you need to save the cookie returned by async_schedule and wait
> for this cookie with async_synchronize_cookie in the other methods.

I think we're fine as-is without syncing the cookie.

The remove path should be fine since we already sync with the necessary
work queues. The shutdown, suspend and reset paths will just cause the
initial reset work to end early, the same result as what previously
would happen.
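
To illustrate the "end early" argument, here is a rough user-space model, not driver code. Every name in it (model_probe, model_shutdown, model_reset_work, the CTRL_* states, the three-step loop) is hypothetical and only stands in for the idea that the reset work rechecks the controller state and bails out once another path has changed it. The real driver's states and checks may differ.

```c
/* User-space model only; NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ctrl_state { CTRL_NEW, CTRL_RESETTING, CTRL_LIVE, CTRL_DELETING };

struct ctrl {
	enum ctrl_state state;
	int init_steps_done;	/* how far the initial reset work got */
};

/* Guarded transition: succeeds only if the controller is still in 'from'. */
static bool change_state(struct ctrl *c, enum ctrl_state from,
			 enum ctrl_state to)
{
	if (c->state != from)
		return false;
	c->state = to;
	return true;
}

/* Models probe: mark the controller as resetting before handing off. */
static bool model_probe(struct ctrl *c)
{
	assert(c != NULL);
	c->state = CTRL_NEW;
	c->init_steps_done = 0;
	return change_state(c, CTRL_NEW, CTRL_RESETTING);
}

/* Models shutdown/suspend: takes the controller out of RESETTING. */
static void model_shutdown(struct ctrl *c)
{
	c->state = CTRL_DELETING;
}

/*
 * Models the initial reset work. 'interrupt_after' simulates another
 * path (shutdown, suspend) running just before that step; pass -1 for
 * no interruption. Returns true only if the controller went live.
 */
static bool model_reset_work(struct ctrl *c, int interrupt_after)
{
	for (int step = 0; step < 3; step++) {
		if (step == interrupt_after)
			model_shutdown(c);
		if (c->state != CTRL_RESETTING)
			return false;	/* someone else took over: end early */
		c->init_steps_done++;
	}
	return change_state(c, CTRL_RESETTING, CTRL_LIVE);
}
```

In this model an uninterrupted run ends in CTRL_LIVE, while a shutdown arriving mid-way leaves the controller in the state the shutdown path chose and the reset work simply returns, with no cookie wait needed for that outcome.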