From: poza@codeaurora.org
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>,
Thomas Gleixner <tglx@linutronix.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Kate Stewart <kstewart@linuxfoundation.org>,
Dongdong Liu <liudongdong3@huawei.com>,
Keith Busch <keith.busch@intel.com>, Wei Zhang <wzhang@fb.com>,
Sinan Kaya <okaya@kernel.org>, Timur Tabi <timur@codeaurora.org>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/7] Fix issues and cleanup for ERR_FATAL and ERR_NONFATAL
Date: Thu, 19 Jul 2018 21:26:45 +0530
Message-ID: <272934d875d2f3a1546567f8c26e5946@codeaurora.org>
In-Reply-To: <153194245964.191586.14782253252654776509.stgit@bhelgaas-glaptop.roam.corp.google.com>
On 2018-07-19 01:14, Bjorn Helgaas wrote:
> This is a v3 of Oza's patches [1]. It's available at [2] if you prefer
> git.
>
> v3 changes:
>   - Add pci_aer_clear_fatal_status() to clear ERR_FATAL bits, only called
>     from pcie_do_fatal_recovery(). Moved to first in series to avoid a
>     window where ERR_FATAL recovery only clears ERR_NONFATAL bits. Visible
>     only inside the PCI core.
>   - Instead of having pci_cleanup_aer_uncorrect_error_status() do different
>     things based on dev->error_state, use this only for ERR_NONFATAL bits.
>     I didn't change the name because it's used by many drivers.
>   - Rename pci_cleanup_aer_error_device_status() to
>     pci_aer_clear_device_status(), make it void, and make it visible only
>     inside the PCI core.
>   - Remove pcie_portdrv_err_handler.slot_reset altogether instead of making
>     it a stub function. Possibly pcie_portdrv_err_handler could be removed
>     completely?
>
> [1] https://lkml.kernel.org/r/1529661494-20936-1-git-send-email-poza@codeaurora.org
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/?h=pci/06-22-oza-aer
>
> ---
>
> Bjorn Helgaas (1):
> PCI/AER: Clear only ERR_FATAL status bits during fatal recovery
>
> Oza Pawandeep (6):
> PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery
> PCI/AER: Factor out ERR_NONFATAL status bit clearing
> PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path
> PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL
> PCI/AER: Clear device status bits during ERR_COR handling
> PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset
>
>
> drivers/pci/pci.h | 5 ++++
> drivers/pci/pcie/aer.c | 47 +++++++++++++++++++++++++++-------------
> drivers/pci/pcie/err.c | 15 +++++--------
> drivers/pci/pcie/portdrv_pci.c | 25 ---------------------
> 4 files changed, 43 insertions(+), 49 deletions(-)

Hi Bjorn,

I am planning some follow-up work after this series, based on your
earlier comments.

Your text:
"
1) I don't think the driver slot_reset callbacks should be responsible
for clearing these AER status bits. Can we clear them somewhere in
the pcie_do_nonfatal_recovery() path and remove these calls from the
drivers?
"

Oza: We can do the following. In broadcast_error_message(), the branch

  if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {

should do

  pci_walk_bus(dev->subordinate,
               pci_cleanup_aer_uncorrect_error_status, NULL);

and then we update all the drivers to remove their calls to
pci_cleanup_aer_uncorrect_error_status().
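
To make the idea in point 1 concrete, here is a minimal userspace sketch of walking everything below a bridge and clearing only the non-fatal uncorrectable status bits. It is not kernel code: struct mock_dev, mock_write_rw1c(), mock_clear_nonfatal_status() and mock_walk_bus() are hypothetical stand-ins for struct pci_dev, a config write to PCI_ERR_UNCOR_STATUS, pci_cleanup_aer_uncorrect_error_status() and pci_walk_bus(). Note that the real pci_walk_bus() callback takes a second void * argument, so passing pci_cleanup_aer_uncorrect_error_status() directly, as in the pseudo-code above, would presumably need a small wrapper like the one modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a device; not the kernel's struct pci_dev. */
struct mock_dev {
	uint32_t uncor_status;    /* models PCI_ERR_UNCOR_STATUS (write 1 to clear) */
	uint32_t uncor_severity;  /* models PCI_ERR_UNCOR_SEVER: bit set = fatal */
	struct mock_dev *child;   /* first device on the subordinate bus, if any */
	struct mock_dev *sibling; /* next device on the same bus */
};

/* Write-1-to-clear: every 1 bit written clears that bit in the register. */
static void mock_write_rw1c(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

/*
 * Callback with the two-argument shape a bus walk expects.
 * Clears only the bits that are NOT marked fatal in the severity register.
 */
static int mock_clear_nonfatal_status(struct mock_dev *dev, void *unused)
{
	uint32_t status;

	(void)unused;
	status = dev->uncor_status & ~dev->uncor_severity;
	if (status)
		mock_write_rw1c(&dev->uncor_status, status);
	return 0; /* 0 = keep walking */
}

/* Depth-first walk of a bus and everything below it; stops on non-zero. */
static int mock_walk_bus(struct mock_dev *first,
			 int (*cb)(struct mock_dev *, void *), void *data)
{
	struct mock_dev *d;

	for (d = first; d; d = d->sibling) {
		if (cb(d, data))
			return 1;
		if (mock_walk_bus(d->child, cb, data))
			return 1;
	}
	return 0;
}
```

With this shape, the drivers' slot_reset callbacks would no longer need to clear the status themselves; one call such as mock_walk_bus(bridge.child, mock_clear_nonfatal_status, NULL) from the recovery path covers every device below the bridge and leaves the fatal bits untouched.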

2) In principle, we should only read PCI_ERR_UNCOR_STATUS *once* per
device when handling an error. We currently read it three times:

  aer_isr
    aer_isr_one_error
      find_source_device
        find_device_iter
          is_error_source
            read PCI_ERR_UNCOR_STATUS              # 1

Oza: This is the first legitimate read.

      aer_process_err_devices
        get_device_error_info(e_info->dev[i])
          read PCI_ERR_UNCOR_STATUS                # 2

Oza: As far as I can see, this read is used to check whether the link is
healthy, so its purpose looks different to me.

        handle_error_source
          pcie_do_nonfatal_recovery
            ...
              report_slot_reset
                driver->err_handler->slot_reset
                  pci_cleanup_aer_uncorrect_error_status
                    read PCI_ERR_UNCOR_STATUS      # 3

Oza: pci_cleanup_aer_uncorrect_error_status() is generic and is able to
clear the status on its own. For example, if we do
pci_walk_bus(dev->subordinate, pci_cleanup_aer_uncorrect_error_status,
NULL) as I suggested in point 4, then it has to read the status of each
device it visits.
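
For the "read once per device" goal in point 2, one possible direction is to cache the first read in a per-error info structure and let the later steps reuse it. The sketch below is a userspace model only, under the assumption that the status does not need to be re-sampled between steps (which may not hold for read #2 if it is really a link health check). struct mock_aer_dev, struct mock_err_info and the mock_* functions are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device model that counts how often the register is read. */
struct mock_aer_dev {
	uint32_t uncor_status;   /* models PCI_ERR_UNCOR_STATUS (write 1 to clear) */
	uint32_t uncor_severity; /* models PCI_ERR_UNCOR_SEVER: bit set = fatal */
	unsigned int reads;      /* number of status reads performed */
};

/* Hypothetical per-error cache, filled on the first read. */
struct mock_err_info {
	uint32_t status;
	int valid;
};

/* The only place that touches the (modeled) hardware register for reads. */
static uint32_t mock_read_uncor_status(struct mock_aer_dev *dev)
{
	dev->reads++;
	return dev->uncor_status;
}

/* Return the cached status, reading the register only if nothing is cached. */
static uint32_t mock_get_status(struct mock_aer_dev *dev,
				struct mock_err_info *info)
{
	if (!info->valid) {
		info->status = mock_read_uncor_status(dev);
		info->valid = 1;
	}
	return info->status;
}

/* Step 1: decide whether this device reported an uncorrectable error. */
static int mock_is_error_source(struct mock_aer_dev *dev,
				struct mock_err_info *info)
{
	return mock_get_status(dev, info) != 0;
}

/* Step 3: clear the non-fatal bits using the cached value, no extra read. */
static void mock_clear_nonfatal_cached(struct mock_aer_dev *dev,
				       struct mock_err_info *info)
{
	uint32_t nonfatal = mock_get_status(dev, info) & ~dev->uncor_severity;

	dev->uncor_status &= ~nonfatal; /* models the write-1-to-clear write */
}
```

The trade-off is the one raised above: a generic helper that is also called from a bus walk has no cached value for the devices it visits, so it would still have to read the register once for each of them.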

3) We need to get rid of pci_channel_io_frozen permanently.

Regards,
Oza.