From: Bjorn Helgaas <helgaas@kernel.org>
To: sathyanarayanan kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Patel, Mayurkumar" <mayurkumar.patel@intel.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"mika.westerberg@linux.intel.com"
<mika.westerberg@linux.intel.com>,
'Andy Shevchenko' <andriy.shevchenko@linux.intel.com>,
Keith Busch <keith.busch@intel.com>,
Oza Pawandeep <poza@codeaurora.org>
Subject: Re: [PATCH] PCI/AER: save/restore AER registers during suspend/resume
Date: Wed, 24 Jul 2019 13:45:48 -0500
Message-ID: <20190724184548.GC203187@google.com>
In-Reply-To: <cd3cb1af-bc80-f92d-b9e4-7b7c2a9bd2fb@linux.intel.com>
[+cc Keith, Oza, who did quite a bit of recent AER work]
Please see
https://lkml.kernel.org/r/20171026223701.GA25649@bhelgaas-glaptop.roam.corp.google.com
for a few incidental subject line/commit log hints.
On Wed, Jul 10, 2019 at 10:36:16AM -0700, sathyanarayanan kuppuswamy wrote:
> On 7/10/19 10:22 AM, Patel, Mayurkumar wrote:
> > > On 7/9/19 1:00 AM, Patel, Mayurkumar wrote:
> > > > > After a system suspend/resume cycle, AER register settings are
> > > > > lost. If the Root Error Command Register bits were set before
> > > > > suspend and are not restored, AER interrupts stay disabled after
> > > > > resume. The AER mask and severity registers must also be restored
> > > > > to their pre-suspend settings.
> > > >
> > > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
> > > > ---
> > > > drivers/pci/pci.c | 2 ++
> > > > drivers/pci/pcie/aer.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > > include/linux/aer.h | 4 ++++
> > > > 3 files changed, 55 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 8abc843..40d5507 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -1340,6 +1340,7 @@ int pci_save_state(struct pci_dev *dev)
> > > >
> > > > pci_save_ltr_state(dev);
> > > > pci_save_dpc_state(dev);
> > > > + pci_save_aer_state(dev);
> > > > return pci_save_vc_state(dev);
> > > > }
> > > > EXPORT_SYMBOL(pci_save_state);
> > > > @@ -1453,6 +1454,7 @@ void pci_restore_state(struct pci_dev *dev)
> > > > pci_restore_dpc_state(dev);
> > > >
> > > > pci_cleanup_aer_error_status_regs(dev);
> > > > + pci_restore_aer_state(dev);
> > > >
> > > > pci_restore_config_space(dev);
> > > >
> > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > > index b45bc47..1acc641 100644
> > > > --- a/drivers/pci/pcie/aer.c
> > > > +++ b/drivers/pci/pcie/aer.c
> > > > @@ -448,6 +448,54 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> > > > return 0;
> > > > }
> > > >
> > > > > +void pci_save_aer_state(struct pci_dev *dev)
> > > > > +{
> > > > > +	int pos = 0;
> > > > > +	struct pci_cap_saved_state *save_state;
> > > > > +	u32 *cap;
> > > > > +
> > > > > +	if (!pci_is_pcie(dev))
> > > > > +		return;
> > > > > +
> > > > > +	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_ERR);
> > > > > +	if (!save_state)
> > > > > +		return;
> > > > > +
> > > > > +	pos = dev->aer_cap;
> > > > > +	if (!pos)
> > > > > +		return;
> > > > > +
> > > > > +	cap = &save_state->cap.data[0];
> > > > > +	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, cap++);
> > > > > +	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, cap++);
> > > > > +	pci_read_config_dword(dev, pos + PCI_ERR_COR_MASK, cap++);
> > > > > +	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, cap++);
> > > > I don't see the AER driver modifying the UNCOR_MASK/SEVER/COR_MASK
> > > > registers. If all they hold is the default value, why do you want
> > > > to preserve them?
> > >
> > Thanks for the reply.
> > You are right that the AER driver leaves UNCOR_MASK/SEVER/COR_MASK
> > untouched. But IMHO users can, for example, use "setpci" to
> > unmask specific errors and set their severity on a Root Port and/or
> > Endpoint for debugging. If a PCIe Endpoint then fails during resume
> > with an error that is masked by default, it can't be caught or
> > debugged. Moreover, an Endpoint driver may unmask specific errors
> > during "driver probe", and those settings need to be restored on
> > resume.
>
> Isn't the configuration of these registers usually done by firmware? I
> think user applications rarely touch them. Also, IMO, caching and
> restoring registers on the assumption that a user might have
> modified them is not a convincing argument. But I will let Bjorn
> and others decide whether it's all right to cache these registers.
I think the ideal user experience would be "suspend/resume has no
effect on any configuration". That would argue for saving/restoring
registers even if we think it's unlikely the user would change them.
> > @Bjorn, or anybody else: do you have an opinion on caching/restoring
> > the UNCOR_MASK/SEVER/COR_MASK registers? Please comment.
> >
> > > Also, any reason for not preserving ECRC settings ?
> > No specific reason. I can incorporate that in v2 of this patch,
> > but I don’t have HW on which I can validate it.
I think we should preserve ECRC settings as well for the same reason.
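
For illustration only, here is a rough userspace sketch (not kernel
code, and not a proposed v2) of what saving one extra dword for
PCI_ERR_CAP, which holds the ECRC generation/check enable bits, might
look like. The register offsets are the ones in
include/uapi/linux/pci_regs.h. Everything else is an assumption made
up for the example: struct mock_dev, cfg_read32(), cfg_write32(), and
the mock_*() functions are hypothetical stand-ins for struct pci_dev,
the config accessors, and the save buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* AER register offsets relative to the capability (see pci_regs.h) */
#define PCI_ERR_UNCOR_MASK	0x08
#define PCI_ERR_UNCOR_SEVER	0x0c
#define PCI_ERR_COR_MASK	0x14
#define PCI_ERR_CAP		0x18	/* contains the ECRC enable bits */
#define PCI_ERR_ROOT_COMMAND	0x2c

#define NUM_AER_SAVED		5	/* four registers from the patch + PCI_ERR_CAP */

/* Hypothetical stand-in for a device: flat config space plus a save buffer */
struct mock_dev {
	uint8_t  cfg[4096];
	int      aer_cap;		/* offset of the AER capability, 0 if absent */
	uint32_t saved[NUM_AER_SAVED];
	int      saved_valid;
};

/* Registers to save, in the order they are stored in the buffer */
static const int aer_saved_regs[NUM_AER_SAVED] = {
	PCI_ERR_UNCOR_MASK,
	PCI_ERR_UNCOR_SEVER,
	PCI_ERR_COR_MASK,
	PCI_ERR_CAP,
	PCI_ERR_ROOT_COMMAND,
};

uint32_t cfg_read32(const struct mock_dev *d, int off)
{
	uint32_t v;

	assert(off >= 0 && (size_t)off + sizeof(v) <= sizeof(d->cfg));
	memcpy(&v, &d->cfg[off], sizeof(v));
	return v;
}

void cfg_write32(struct mock_dev *d, int off, uint32_t v)
{
	assert(off >= 0 && (size_t)off + sizeof(v) <= sizeof(d->cfg));
	memcpy(&d->cfg[off], &v, sizeof(v));
}

/* Copy each listed AER register into the save buffer */
void mock_save_aer_state(struct mock_dev *d)
{
	int i;

	if (!d->aer_cap)
		return;

	for (i = 0; i < NUM_AER_SAVED; i++)
		d->saved[i] = cfg_read32(d, d->aer_cap + aer_saved_regs[i]);
	d->saved_valid = 1;
}

/* Write the saved values back; do nothing if nothing was saved */
void mock_restore_aer_state(struct mock_dev *d)
{
	int i;

	if (!d->aer_cap || !d->saved_valid)
		return;

	for (i = 0; i < NUM_AER_SAVED; i++)
		cfg_write32(d, d->aer_cap + aer_saved_regs[i], d->saved[i]);
}
```

In the real patch this would presumably mean one more
pci_read_config_dword()/pci_write_config_dword() pair and a save
buffer of sizeof(u32) * 5 instead of sizeof(u32) * 4.
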
> > > > +}
> > > > +
> > > > > +void pci_restore_aer_state(struct pci_dev *dev)
> > > > > +{
> > > > > +	int pos = 0;
> > > > > +	struct pci_cap_saved_state *save_state;
> > > > > +	u32 *cap;
> > > > > +
> > > > > +	if (!pci_is_pcie(dev))
> > > > > +		return;
> > > > > +
> > > > > +	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_ERR);
> > > > > +	if (!save_state)
> > > > > +		return;
> > > > > +
> > > > > +	pos = dev->aer_cap;
> > > > > +	if (!pos)
> > > > > +		return;
> > > > > +
> > > > > +	cap = &save_state->cap.data[0];
> > > > > +	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, *cap++);
> > > > > +	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, *cap++);
> > > > > +	pci_write_config_dword(dev, pos + PCI_ERR_COR_MASK, *cap++);
> > > > > +	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, *cap++);
> > > > > +}
> > > > +
> > > > void pci_aer_init(struct pci_dev *dev)
> > > > {
> > > > dev->aer_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
> > > > @@ -1396,6 +1444,7 @@ static int aer_probe(struct pcie_device *dev)
> > > > return status;
> > > > }
> > > >
> > > > + pci_add_ext_cap_save_buffer(port, PCI_EXT_CAP_ID_ERR, sizeof(u32) * 4);
> > > > aer_enable_rootport(rpc);
> > > > pci_info(port, "enabled with IRQ %d\n", dev->irq);
> > > > return 0;
> > > > diff --git a/include/linux/aer.h b/include/linux/aer.h
> > > > index 514bffa..fa19e01 100644
> > > > --- a/include/linux/aer.h
> > > > +++ b/include/linux/aer.h
> > > > @@ -46,6 +46,8 @@ int pci_enable_pcie_error_reporting(struct pci_dev *dev);
> > > > int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> > > > int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
> > > > int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> > > > +void pci_save_aer_state(struct pci_dev *dev);
> > > > +void pci_restore_aer_state(struct pci_dev *dev);
> > > > #else
> > > > static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
> > > > {
> > > > @@ -63,6 +65,8 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> > > > {
> > > > return -EINVAL;
> > > > }
> > > > +static inline void pci_save_aer_state(struct pci_dev *dev) {}
> > > > +static inline void pci_restore_aer_state(struct pci_dev *dev) {}
> > > > #endif
> > > >
> > > > void cper_print_aer(struct pci_dev *dev, int aer_severity,
> > > --
> > > Sathyanarayanan Kuppuswamy
> > > Linux kernel developer
>
> --
> Sathyanarayanan Kuppuswamy
> Linux kernel developer
>