All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: sathyanarayanan kuppuswamy  <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Patel, Mayurkumar" <mayurkumar.patel@intel.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"mika.westerberg@linux.intel.com"
	<mika.westerberg@linux.intel.com>,
	'Andy Shevchenko' <andriy.shevchenko@linux.intel.com>,
	Keith Busch <keith.busch@intel.com>,
	Oza Pawandeep <poza@codeaurora.org>
Subject: Re: [PATCH] PCI/AER: save/restore AER registers during suspend/resume
Date: Wed, 24 Jul 2019 13:45:48 -0500	[thread overview]
Message-ID: <20190724184548.GC203187@google.com> (raw)
In-Reply-To: <cd3cb1af-bc80-f92d-b9e4-7b7c2a9bd2fb@linux.intel.com>

[+cc Keith, Oza, who did quite a bit of recent AER work]

Please see
https://lkml.kernel.org/r/20171026223701.GA25649@bhelgaas-glaptop.roam.corp.google.com
for a few incidental subject line/commit log hints.

On Wed, Jul 10, 2019 at 10:36:16AM -0700, sathyanarayanan kuppuswamy wrote:
> On 7/10/19 10:22 AM, Patel, Mayurkumar wrote:
> > > On 7/9/19 1:00 AM, Patel, Mayurkumar wrote:
> > > > After system suspend/resume cycle AER registers settings are
> > > > lost. Not restoring Root Error Command Register bits if it were
> > > > set, keeps AER interrupts disabled after system resume.
> > > > Moreover, AER mask and severity registers are also required
> > > > to be restored back to AER settings prior to system suspend.
> > > > 
> > > > Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
> > > > ---
> > > >    drivers/pci/pci.c      |  2 ++
> > > >    drivers/pci/pcie/aer.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > >    include/linux/aer.h    |  4 ++++
> > > >    3 files changed, 55 insertions(+)
> > > > 
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 8abc843..40d5507 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -1340,6 +1340,7 @@ int pci_save_state(struct pci_dev *dev)
> > > > 
> > > >    	pci_save_ltr_state(dev);
> > > >    	pci_save_dpc_state(dev);
> > > > +	pci_save_aer_state(dev);
> > > >    	return pci_save_vc_state(dev);
> > > >    }
> > > >    EXPORT_SYMBOL(pci_save_state);
> > > > @@ -1453,6 +1454,7 @@ void pci_restore_state(struct pci_dev *dev)
> > > >    	pci_restore_dpc_state(dev);
> > > > 
> > > >    	pci_cleanup_aer_error_status_regs(dev);
> > > > +	pci_restore_aer_state(dev);
> > > > 
> > > >    	pci_restore_config_space(dev);
> > > > 
> > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > > index b45bc47..1acc641 100644
> > > > --- a/drivers/pci/pcie/aer.c
> > > > +++ b/drivers/pci/pcie/aer.c
> > > > @@ -448,6 +448,54 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> > > >    	return 0;
> > > >    }
> > > > 
> > > > +void pci_save_aer_state(struct pci_dev *dev)
> > > > +{
> > > > +	int pos = 0;
> > > > +	struct pci_cap_saved_state *save_state;
> > > > +	u32 *cap;
> > > > +
> > > > +	if (!pci_is_pcie(dev))
> > > > +		return;
> > > > +
> > > > +	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_ERR);
> > > > +	if (!save_state)
> > > > +		return;
> > > > +
> > > > +	pos = dev->aer_cap;
> > > > +	if (!pos)
> > > > +		return;
> > > > +
> > > > +	cap = &save_state->cap.data[0];
> > > > +	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, cap++);
> > > > +	pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, cap++);
> > > > +	pci_read_config_dword(dev, pos + PCI_ERR_COR_MASK, cap++);
> > > > +	pci_read_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, cap++);

> > > I don't see AER driver modifying UNCOR_MASK/SEVER/COR_MASK register. If
> > > all it has is default value then why do you want to preserve it ?
> > > 
> > Thanks for reply.
> > You are right about UNCOR_MASK/SEVER/COR_MASK mask AER driver
> > leaves untouched, But IMHO users for example can use "setpci" to
> > unmask specific errors and its severity based on their debugging
> > requirement on Root port and/or Endpoint. So during resume, if
> > PCIe endpoint fails due to any error which by default stays masked
> > then it can't be catched and/or debugged.  Moreover, Endpoint
> > driver may also unmask during "driver probe" certain specific
> > errors for endpoint, which needs to be restored while resume.
> 
> Isn't these registers configuration usually done by firmware ? I
> think user application rarely touch them. Also, IMO,
> Caching/Restoring registers under the assumption that it might be
> useful for user if they modified it is not a convincing argument.
> But I will let Bjorn and others decide whether its alright to cache
> these registers.

I think the ideal user experience would be "suspend/resume has no
effect on any configuration".  That would argue for saving/restoring
registers even if we think it's unlikely the user would change them.

> > @Bjorn/Anybody else has any opinion to cache/restore
> > UNCOR_MASK/SEVER/COR_MASK registers?  Please help to comment.
> > 
> > > Also, any reason for not preserving ECRC settings ?
> > No specific reason. I can incorporte that with v2 of this patchset
> > but I don’t have HW on which I can validate that.

I think we should preserve ECRC settings as well for the same reason.

> > > > +}
> > > > +
> > > > +void pci_restore_aer_state(struct pci_dev *dev)
> > > > +{
> > > > +	int pos = 0;
> > > > +	struct pci_cap_saved_state *save_state;
> > > > +	u32 *cap;
> > > > +
> > > > +	if (!pci_is_pcie(dev))
> > > > +		return;
> > > > +
> > > > +	save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_ERR);
> > > > +	if (!save_state)
> > > > +		return;
> > > > +
> > > > +	pos = dev->aer_cap;
> > > > +	if (!pos)
> > > > +		return;
> > > > +
> > > > +	cap = &save_state->cap.data[0];
> > > > +	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_MASK, *cap++);
> > > > +	pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, *cap++);
> > > > +	pci_write_config_dword(dev, pos + PCI_ERR_COR_MASK, *cap++);
> > > > +	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, *cap++);
> > > > +}
> > > > +
> > > >    void pci_aer_init(struct pci_dev *dev)
> > > >    {
> > > >    	dev->aer_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
> > > > @@ -1396,6 +1444,7 @@ static int aer_probe(struct pcie_device *dev)
> > > >    		return status;
> > > >    	}
> > > > 
> > > > +	pci_add_ext_cap_save_buffer(port, PCI_EXT_CAP_ID_ERR, sizeof(u32) * 4);
> > > >    	aer_enable_rootport(rpc);
> > > >    	pci_info(port, "enabled with IRQ %d\n", dev->irq);
> > > >    	return 0;
> > > > diff --git a/include/linux/aer.h b/include/linux/aer.h
> > > > index 514bffa..fa19e01 100644
> > > > --- a/include/linux/aer.h
> > > > +++ b/include/linux/aer.h
> > > > @@ -46,6 +46,8 @@ int pci_enable_pcie_error_reporting(struct pci_dev *dev);
> > > >    int pci_disable_pcie_error_reporting(struct pci_dev *dev);
> > > >    int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
> > > >    int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> > > > +void pci_save_aer_state(struct pci_dev *dev);
> > > > +void pci_restore_aer_state(struct pci_dev *dev);
> > > >    #else
> > > >    static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
> > > >    {
> > > > @@ -63,6 +65,8 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> > > >    {
> > > >    	return -EINVAL;
> > > >    }
> > > > +static inline void pci_save_aer_state(struct pci_dev *dev) {}
> > > > +static inline void pci_restore_aer_state(struct pci_dev *dev) {}
> > > >    #endif
> > > > 
> > > >    void cper_print_aer(struct pci_dev *dev, int aer_severity,
> > > --
> > > Sathyanarayanan Kuppuswamy
> > > Linux kernel developer
> > Intel Deutschland GmbH
> > Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
> > Tel: +49 89 99 8853-0, www.intel.de
> > Managing Directors: Christin Eisenschmid, Gary Kershaw
> > Chairperson of the Supervisory Board: Nicole Lau
> > Registered Office: Munich
> > Commercial Register: Amtsgericht Muenchen HRB 186928
> 
> -- 
> Sathyanarayanan Kuppuswamy
> Linux kernel developer
> 

  reply	other threads:[~2019-07-24 18:45 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-09  8:00 [PATCH] PCI/AER: save/restore AER registers during suspend/resume Patel, Mayurkumar
2019-07-09 18:31 ` sathyanarayanan kuppuswamy
2019-07-10 17:22   ` Patel, Mayurkumar
2019-07-10 17:36     ` sathyanarayanan kuppuswamy
2019-07-24 18:45       ` Bjorn Helgaas [this message]
2019-07-25 16:08         ` Keith Busch
2019-07-26  9:00           ` Patel, Mayurkumar
2019-07-26 16:15             ` Keith Busch
2019-07-30 13:57               ` Bjorn Helgaas
2019-08-01 13:09                 ` Patel, Mayurkumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190724184548.GC203187@google.com \
    --to=helgaas@kernel.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=keith.busch@intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=mayurkumar.patel@intel.com \
    --cc=mika.westerberg@linux.intel.com \
    --cc=poza@codeaurora.org \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.