From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Farhan Ali <alifm@linux.ibm.com>,
linux-s390@vger.kernel.org, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Cc: alex.williamson@redhat.com, helgaas@kernel.org, mjrosato@linux.ibm.com
Subject: Re: [PATCH v3 07/10] s390/pci: Store PCI error information for passthrough devices
Date: Tue, 16 Sep 2025 12:54:30 +0200
Message-ID: <6703760a502d146909482f3aeb4333bf33cb431b.camel@linux.ibm.com>
In-Reply-To: <98a3bc6f-9b75-48cd-b09f-343831f5dcbf@linux.ibm.com>
On Mon, 2025-09-15 at 11:12 -0700, Farhan Ali wrote:
> On 9/15/2025 4:42 AM, Niklas Schnelle wrote:
> > On Thu, 2025-09-11 at 11:33 -0700, Farhan Ali wrote:
> > > For a passthrough device we need co-operation from user space to recover
> > > the device. This requires bubbling up any error information to user
> > > space. Let's store this error information for passthrough devices, so it
> > > can be retrieved later.
> > >
> > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> > > ---
> > >
--- snip ---
> > > +	mutex_unlock(&zdev->pending_errs_lock);
> > > +}
> > > +
> > > +void zpci_cleanup_pending_errors(struct zpci_dev *zdev)
> > > +{
> > > +	struct pci_dev *pdev = NULL;
> > > +
> > > +	mutex_lock(&zdev->pending_errs_lock);
> > > +	pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn);
> > > +	if (zdev->pending_errs.count)
> > > +		pr_err("%s: Unhandled PCI error events count=%zu",
> > > +		       pci_name(pdev), zdev->pending_errs.count);
> > I think this could be a zpci_dbg(). That way you also don't need the
> > pci_get_slot() which is also buggy as it misses a pci_dev_put(). The
> > message also doesn't seem useful for the user. As I understand it this
> > would happen if a vfio-pci user dies without handling all the error
> > events but then vfio-pci will also reset the slot on closing of the
> > fds, no? So the device will get reset anyway.
>
> Right, the device will reset anyway. But I wanted to at least give an
> indication to the user that some events were not handled correctly.
> Maybe pr_err is a little extreme, so I can convert it to a warn? This should
> be rare, as well-behaved applications shouldn't do this. I am fine with
> zpci_dbg as well, it's just that the kernel needs to be in debug mode for us
> to get this info.

No, zpci_dbg() logs to /sys/kernel/debug/s390dbf/pci_msg/sprintf
without need for debug mode. I'm also ok with a pr_warn() or maybe even
pr_info(). I can see your argument that this may be useful to have in
dmesg e.g. when debugging a user-space driver without having to know
about s390 specific debug aids.
>
> >
> > > +	memset(&zdev->pending_errs, 0, sizeof(struct zpci_ccdf_pending));
> > If this goes wrong and we subsequently crash or take a live memory dump
> > I'd prefer to have bread crumbs such as the errors that weren't cleaned
> > up. Wouldn't it be enough to just set the count to zero? For debugging,
> > the original count will be in s390dbf.
>
> I think setting count to zero should be enough, but I am wary about
> keeping stale state around. How about just logging the count that was
> not handled, in s390dbf? I think we already dump the ccdf in s390dbf if
> we get any error event. So it should be enough for us to trace back the
> unhandled error events?
>
> > Also maybe it would make sense
> > to pull the zdev->mediated_recovery clearing in here?
>
> I would like to keep the mediated_recovery flag separate from just
> cleaning up the errors. The flag gets initialized when we open the vfio
> device and so having the flag cleared on close makes it easier to track
> this IMHO.

Ok yeah I can see the symmetry argument.
>
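To make the points above concrete, here is a rough userspace model of the cleanup variant being discussed: no pci_get_slot() lookup, a warning that reports only the unhandled count, and a reset of the count alone so that the stale entries remain visible in a dump. This is only a sketch and not the code from the patch. The model_* names, the struct layout, and MODEL_MAX_PENDING are assumptions made up for illustration, fprintf() stands in for pr_warn() or zpci_dbg(), and the pending_errs_lock handling is left out because the model is single-threaded.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical capacity; the real limit is defined by the patch. */
#define MODEL_MAX_PENDING 4

/* Stand-in for the stored error events (assumed layout). */
struct model_pending_errs {
	size_t count;
	unsigned int pec[MODEL_MAX_PENDING];
};

/* Stand-in for struct zpci_dev, reduced to what the sketch needs. */
struct model_zdev {
	struct model_pending_errs pending_errs;
};

/*
 * Report and drop any error events user space never picked up.
 * Returns the number of events that were still pending.
 * Locking is omitted in this single-threaded model.
 */
size_t model_cleanup_pending_errors(struct model_zdev *zdev)
{
	size_t unhandled = zdev->pending_errs.count;

	/* Stand-in for pr_warn()/zpci_dbg(); no pci_dev lookup is needed. */
	if (unhandled)
		fprintf(stderr, "Unhandled PCI error events count=%zu\n",
			unhandled);

	/* Reset only the count; old entries stay behind as bread crumbs. */
	zdev->pending_errs.count = 0;

	return unhandled;
}
```

Whether leaving the stale entries in place is acceptable is exactly the open question above; swapping the count reset for a memset() of the whole structure would give the other behaviour.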