MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Date: Mon, 16 Apr 2018 19:42:44 +0530
From: poza@codeaurora.org
To: Bjorn Helgaas
Cc: Sinan Kaya, Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner,
 Greg Kroah-Hartman, Kate Stewart, linux-pci@vger.kernel.org,
 linux-kernel@vger.kernel.org, Dongdong Liu, Keith Busch, Wei Zhang,
 Timur Tabi
Subject: Re: [PATCH v13 0/6] Address error and recovery for AER and DPC
In-Reply-To: <20180416132753.GA28657@bhelgaas-glaptop.roam.corp.google.com>
References: <1523284914-2037-1-git-send-email-poza@codeaurora.org>
 <20180416031600.GB80087@bhelgaas-glaptop.roam.corp.google.com>
 <5b4e667f-bead-a007-78dd-e42d3194f232@codeaurora.org>
 <9301606a70a213c180d9e6764b002cf9@codeaurora.org>
 <20180416132753.GA28657@bhelgaas-glaptop.roam.corp.google.com>
Message-ID: <0b94f5a75fbfec9063e2c07006be3fbb@codeaurora.org>

On 2018-04-16 18:57, Bjorn Helgaas wrote:
> On Mon, Apr 16, 2018 at 11:33:13AM +0530, poza@codeaurora.org wrote:
>> On 2018-04-16 09:23, Sinan Kaya wrote:
>> > On 4/15/2018 11:16 PM, Bjorn Helgaas wrote:
>> > > On Mon, Apr 09, 2018 at 10:41:48AM -0400, Oza Pawandeep wrote:
>> > > > This patch set brings in error handling support for DPC
>> > > >
>> > > > The current implementation of AER and error message
>> > > > broadcasting to the EP driver is tightly coupled and limited
>> > > > to AER service driver.
>> > > > It is important to factor out broadcasting and other link
>> > > > handling callbacks. So that not only when AER gets triggered,
>> > > > but also when DPC get triggered (for e.g. ERR_FATAL),
>> > > > callbacks are handled appropriately.
>> > > >
>> > > > DPC should behave identical to AER as far as error handling
>> > > > is concerned.
>> > > > DPC should remove the devices and not to do recovery for
>> > > > hotplug enabled system.
>> > >
>> > > Is there a specific bug that's fixed by these patches? I didn't
>> > > see one mentioned in the changelogs.
>> > >
>> >
>> > There is no actual bug.
>> >
>> > We realized that DPC and hotplug is heavily integrated today. We
>> > have use cases for systems without hotplug support but still
>> > support DPC. That's the problem we are trying to solve with this
>> > patchset.
>
> Apparently there's a problem with systems that have DPC but not
> hotplug. It will be extremely helpful if you can articulate what that
> problem is and include it in the appropriate changelog.
>
>> Adding to what Sinan said;
>>
>> DPC should handle the error handling and recovery similar to AER,
>> because finally both are attempting recovery in some or the other
>> way, and for that error handling and recovery framework has to be
>> loosely coupled. It achieves uniformity and transparency to the
>> error handling agents such as AER, DPC, with respect to recovery and
>> error handling.
>>
>> So, this patch-set tries to unify lot of things between error agents
>> and make them behave in a well defined way. (be it error (FATAL,
>> NON_FATAL) handling or recovery).
>
> I totally support this objective.

Thanks, Bjorn.
I will include this objective in the changelog along with Sinan's text.

I am not clear on one last thing, Bjorn: do we need the last patch,
patch 6, which handles the hotplug case?

Also, I think we could take this patch set as the basic attempt to
unify the code, which is what it does.
In follow-up patches we can improve on things such as whether to take
different actions for FATAL and NON_FATAL cases.
Then I can make the needed changes to AER and DPC.

Please let me know how this sounds.

>
> Bjorn