Date: Thu, 11 Jun 2020 18:03:23 -0500
From: Bjorn Helgaas
To: Prabhakar Kushwaha
Cc: Robin Murphy, linux-arm-kernel, kexec mailing list,
    linux-pci@vger.kernel.org, Marc Zyngier, Will Deacon,
    Ganapatrao Prabhakerrao Kulkarni, Bhupesh Sharma, Prabhakar Kushwaha,
    Kuppuswamy Sathyanarayanan, Vijay Mohan Pandarathil, Myron Stowe
Subject: Re: [PATCH][v2] iommu: arm-smmu-v3: Copy SMMU table for kdump kernel
Message-ID: <20200611230323.GA1616315@bjorn-Precision-5520>
X-Mailing-List: linux-pci@vger.kernel.org

On Sun, Jun 07, 2020 at 02:00:35PM +0530, Prabhakar Kushwaha wrote:
> On Thu, Jun 4, 2020 at 5:32 AM Bjorn Helgaas wrote:
> > On Wed, Jun 03, 2020 at 11:12:48PM +0530, Prabhakar Kushwaha wrote:
> > > On Sat, May 30, 2020 at 1:03 AM Bjorn Helgaas wrote:
> > > > On Fri, May 29, 2020 at 07:48:10PM +0530, Prabhakar Kushwaha wrote:
> > > > > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > > > > index 117c0a2b2ba4..26b908f55aef 100644
> > > > > --- a/drivers/pci/pcie/err.c
> > > > > +++ b/drivers/pci/pcie/err.c
> > > > > @@ -66,6 +66,20 @@ static int report_error_detected(struct pci_dev *dev,
> > > > >  		if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
> > > > >  			vote = PCI_ERS_RESULT_NO_AER_DRIVER;
> > > > >  			pci_info(dev, "can't recover (no
> > > > > error_detected callback)\n");
> > > > > +
> > > > > +			pci_save_state(dev);
> > > > > +			pci_cfg_access_lock(dev);
> > > > > +
> > > > > +			/* Quiesce the device completely */
> > > > > +			pci_write_config_word(dev, PCI_COMMAND,
> > > > > +				PCI_COMMAND_INTX_DISABLE);
> > > > > +			if (!__pci_reset_function_locked(dev)) {
> > > > > +				vote = PCI_ERS_RESULT_RECOVERED;
> > > > > +				pci_info(dev, "recovered via pci level
> > > > > reset\n");
> > > > > +			}
> > > >
> > So I guess we *do* need to save the state before the reset and restore
> > it (either that or enumerate the device from scratch just like we
> > would if it had been hot-added).  I'm not really thrilled with trying
> > to save the state after the device has already reported an error.  I'd
> > rather do it earlier, maybe during enumeration, like in
> > pci_init_capabilities().  But I don't understand all the subtleties of
> > dev->state_saved, so that requires some legwork.
>
> I tried moving pci_save_state earlier. All observations are the same
> as mentioned in earlier discussions.

By "legwork", I didn't mean just trying things to see whether they
seem to work.  I meant researching the history to find out *why* it's
designed the way it is so that when we change it, we don't break
things.

For example, these commits are obviously important to understand:

  aa8c6c93747f ("PCI PM: Restore standard config registers of all devices early")
  c82f63e411f1 ("PCI: check saved state before restore")
  4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored")

I think we need to step back and separate this AER issue from the
whole SMMU table copying thing.  Then do the research and start a new
thread with a patch to fix just the AER issue.  The ARM guys would
probably be grateful to be dropped from the AER thread because it
really has nothing to do with ARM.

Bjorn