From: poza@codeaurora.org
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>,
Thomas Gleixner <tglx@linutronix.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Kate Stewart <kstewart@linuxfoundation.org>,
Dongdong Liu <liudongdong3@huawei.com>,
Keith Busch <keith.busch@intel.com>, Wei Zhang <wzhang@fb.com>,
Sinan Kaya <okaya@kernel.org>, Timur Tabi <timur@codeaurora.org>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
Date: Thu, 19 Jul 2018 21:26:45 +0530
Message-ID: <272934d875d2f3a1546567f8c26e5946@codeaurora.org>
In-Reply-To: <153194245964.191586.14782253252654776509.stgit@bhelgaas-glaptop.roam.corp.google.com>
On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
> git.
>
> v3 changes:
>   - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
>     from pcie_do_fatal_recovery().  Moved to first in series to avoid a
>     window where ERR_FATAL recovery only clears ERR_NONFATAL bits.  Visible
>     only inside the PCI core.
>   - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
>     things based on dev->error_state, use this only for ERR_NONFATAL bits.
>     I didn't change the name because it's used by many drivers.
>   - Rename pci_cleanup_aer_error_device_status() to
>     pci_aer_clear_device_status(), make it void, and make it visible only
>     inside the PCI core.
>   - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
>     it a stub function.  Possibly pcie_portdrv_err_handler could be removed
>     completely?
>
> [1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer
>
> ---
>
> Bjorn Helgaas (1):
>       PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
>
> Oza Pawandeep (6):
>       PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
>       PCI/AER: Factor out ERR_NONFATAL status bit clearing
>       PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
>       PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
>       PCI/AER: Clear device status bits during ERR_COR handling
>       PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
>
>
>  drivers/pci/pci.h              |    5 ++++
>  drivers/pci/pcie/aer.c         |   47 +++++++++++++++++++++++++++-------------
>  drivers/pci/pcie/err.c         |   15 +++++--------
>  drivers/pci/pcie/portdrv_pci.c |   25 ---------------------
>  4 files changed, 43 insertions(+), 49 deletions(-)

Hi Bjorn,

I am planning some follow-up work after this series.

You wrote:
"
1) I don't think the driver slot_reset callbacks should be responsible
for clearing these AER status bits. Can we clear them somewhere in
the pcie_do_nonfatal_recovery() path and remove these calls from the
drivers?
"
Oza: We can do the following. In broadcast_error_message(), the branch

  if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {

should do

  pci_walk_bus(dev->subordinate,
               pci_cleanup_aer_uncorrect_error_status, NULL);

and then we update all the drivers to remove their calls to
pci_cleanup_aer_uncorrect_error_status().
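
A rough userspace sketch of that idea follows. It is not kernel code: every
mock_* name is hypothetical, the status register is modelled as a plain
write-1-to-clear field, and the clearing rule (status bits whose severity
bit is not set) is an assumption based on this series using the helper only
for ERR_NONFATAL bits. As far as I can tell, a pci_walk_bus() callback takes
(dev, void *), so the real helper would probably need a small wrapper like
the one here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct pci_dev and struct pci_bus. */
struct mock_bus;

struct mock_dev {
	uint32_t uncor_status;        /* models PCI_ERR_UNCOR_STATUS (write 1 to clear) */
	uint32_t uncor_sever;         /* models PCI_ERR_UNCOR_SEVER (bit set = fatal) */
	struct mock_bus *subordinate; /* non-NULL only for a bridge */
	struct mock_dev *next;        /* next device on the same bus */
};

struct mock_bus {
	struct mock_dev *devices;
};

/* Models a config write to a write-1-to-clear register. */
static void mock_write_uncor_status(struct mock_dev *dev, uint32_t val)
{
	dev->uncor_status &= ~val;
}

/* Callback-shaped wrapper: clear only the bits not marked fatal. */
static int mock_clear_nonfatal_status(struct mock_dev *dev, void *unused)
{
	uint32_t nonfatal = dev->uncor_status & ~dev->uncor_sever;

	(void)unused;
	if (nonfatal)
		mock_write_uncor_status(dev, nonfatal);
	return 0;
}

/* Depth-first walk of a bus and everything below it; stops on nonzero. */
static int mock_walk_bus(struct mock_bus *bus,
			 int (*cb)(struct mock_dev *, void *), void *data)
{
	struct mock_dev *dev;
	int ret;

	assert(bus != NULL);
	for (dev = bus->devices; dev; dev = dev->next) {
		ret = cb(dev, data);
		if (ret)
			return ret;
		if (dev->subordinate) {
			ret = mock_walk_bus(dev->subordinate, cb, data);
			if (ret)
				return ret;
		}
	}
	return 0;
}

/* What the core would do for a bridge instead of each driver doing it. */
static void mock_core_clear_below(struct mock_dev *bridge)
{
	if (bridge->subordinate)
		mock_walk_bus(bridge->subordinate,
			      mock_clear_nonfatal_status, NULL);
}
```

With this shape the drivers' slot_reset callbacks would no longer need to
touch the AER status themselves. Note that the bridge's own status is left
alone in this sketch; only devices below it are visited.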

2) In principle, we should only read PCI_ERR_UNCOR_STATUS *once* per
   device when handling an error.  We currently read it three times:

  aer_isr
    aer_isr_one_error
      find_source_device
        find_device_iter
          is_error_source
            read PCI_ERR_UNCOR_STATUS              # 1

Oza: This is the first legitimate read.

      aer_process_err_devices
        get_device_error_info(e_info->dev[i])
          read PCI_ERR_UNCOR_STATUS                # 2

Oza: As I see it, this read is used to check whether the link is healthy,
so its purpose looks different to me.

        handle_error_source
          pcie_do_nonfatal_recovery
            ...
              report_slot_reset
                driver->err_handler->slot_reset
                  pci_cleanup_aer_uncorrect_error_status
                    read PCI_ERR_UNCOR_STATUS      # 3

Oza: pci_cleanup_aer_uncorrect_error_status() is generic and is able to
clear the status. For example, if we do what I suggested in point 4, i.e.

  pci_walk_bus(dev->subordinate, pci_cleanup_aer_uncorrect_error_status,
               NULL);

then we still have to read the status registers.
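
For the device that reported the error, one way to get down to a single
read would be to snapshot the status once and hand that snapshot to the
clearing step. The sketch below is a userspace model only; all mock_* names
are hypothetical and the register is again modelled as a write-1-to-clear
field. It does not cover the pci_walk_bus() case, where each device below
the bridge still needs its own read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev with a read counter. */
struct mock_dev {
	uint32_t uncor_status;     /* models PCI_ERR_UNCOR_STATUS */
	uint32_t uncor_sever;      /* models PCI_ERR_UNCOR_SEVER (bit set = fatal) */
	unsigned int status_reads; /* how often the status was read */
};

/* Snapshot taken once per error; loosely plays the role of an error-info struct. */
struct mock_err_info {
	uint32_t status;
	uint32_t sever;
};

static uint32_t mock_read_uncor_status(struct mock_dev *dev)
{
	dev->status_reads++;
	return dev->uncor_status;
}

/* Models a config write to a write-1-to-clear register. */
static void mock_write_uncor_status(struct mock_dev *dev, uint32_t val)
{
	dev->uncor_status &= ~val;
}

/* The only place that reads the status register. */
static void mock_get_error_info(struct mock_dev *dev,
				struct mock_err_info *info)
{
	info->status = mock_read_uncor_status(dev);
	info->sever = dev->uncor_sever;
}

/* Clear the non-fatal bits recorded in the snapshot, without re-reading. */
static void mock_clear_nonfatal_from_info(struct mock_dev *dev,
					  const struct mock_err_info *info)
{
	uint32_t nonfatal = info->status & ~info->sever;

	if (nonfatal)
		mock_write_uncor_status(dev, nonfatal);
}

static void mock_handle_error(struct mock_dev *dev)
{
	struct mock_err_info info;

	assert(dev != NULL);
	mock_get_error_info(dev, &info);
	/* ... recovery callbacks would run here ... */
	mock_clear_nonfatal_from_info(dev, &info);
}
```

One consequence of this design is that a bit which becomes set after the
snapshot is not cleared, so a new error arriving during recovery would stay
visible rather than being wiped along with the old one.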

3) We need to get rid of pci_channel_io_frozen permanently.

Regards,
Oza.