From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Return-Path: Date: Thu, 12 Apr 2018 08:39:54 -0600 From: Keith Busch To: Sinan Kaya Cc: Bjorn Helgaas , Oza Pawandeep , Bjorn Helgaas , Philippe Ombredanne , Thomas Gleixner , Greg Kroah-Hartman , Kate Stewart , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu , Wei Zhang , Timur Tabi , Alex Williamson Subject: Re: [PATCH v13 6/6] PCI/DPC: Do not do recovery for hotplug enabled system Message-ID: <20180412143954.GB4810@localhost.localdomain> References: <1523284914-2037-1-git-send-email-poza@codeaurora.org> <1523284914-2037-7-git-send-email-poza@codeaurora.org> <20180410210349.GG54986@bhelgaas-glaptop.roam.corp.google.com> <13efe2e8-74c8-acb4-ec58-f79b14a1f182@codeaurora.org> <20180412140648.GD145698@bhelgaas-glaptop.roam.corp.google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: List-ID: On Thu, Apr 12, 2018 at 10:34:37AM -0400, Sinan Kaya wrote: > On 4/12/2018 10:06 AM, Bjorn Helgaas wrote: > > > > I think the scenario you are describing is two systems that are > > identical except that in the first, the endpoint is below a hotplug > > bridge, while in the second, it's below a non-hotplug bridge. There's > > no physical hotplug (no drive removed or inserted), and DPC is > > triggered in both systems. > > > > I suggest that DPC should be handled identically in both systems: > > > > - The PCI core should have the same view of the endpoint: it should > > be removed and re-added in both cases (or in neither case). > > > > - The endpoint itself should not be able to tell the difference: it > > should see a link down event, followed by a link retrain, followed > > by the same sequence of config accesses, etc. > > > > - The endpoint driver should not be able to tell the difference, > > i.e., we should be calling the same pci_error_handlers callbacks > > in both cases. > > > > It's true that in the non-hotplug system, pciehp probably won't start > > re-enumeration, so we might need an alternate path to trigger that. > > > > But that's not what we're doing in this patch. In this patch we're > > adding a much bigger difference: for hotplug bridges, we stop and > > remove the hierarchy below the bridge; for non-hotplug bridges, we do > > the AER-style flow of calling pci_error_handlers callbacks. > > Our approach on V12 was to go to AER style recovery for all DPC events > regardless of hotplug support or not. > > Keith was not comfortable with this approach. That's why, we special cased > hotplug. > > If we drop 6/6 on this patch on v13, we achieve this. We still have to > take care of Keith's inputs on individual patches. > > we have been struggling with the direction for a while. > > Keith, what do you think? My only concern was for existing production environments that use DPC for handling surprise removal, and I don't wish to break the existing uses.