Linux PCI subsystem development
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Karolina Stolarek <karolina.stolarek@oracle.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	"Shen, Yijun" <Yijun.Shen@dell.com>,
	Bjorn Helgaas <bhelgaas@google.com>, <linux-pci@vger.kernel.org>,
	Jon Pan-Doh <pandoh@google.com>,
	Terry Bowman <terry.bowman@amd.com>, Len Brown <lenb@kernel.org>,
	James Morse <james.morse@arm.com>,
	Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
	Ben Cheatham <Benjamin.Cheatham@amd.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>,
	Liu Xinpeng <liuxp11@chinatelecom.cn>,
	"Darren Hart" <darren@os.amperecomputing.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] PCI/AER: Consolidate CXL, ACPI GHES and native AER reporting paths
Date: Tue, 6 May 2025 18:03:00 +0100
Message-ID: <20250506180300.00006527@huawei.com>
In-Reply-To: <1fb6b57b-4317-404d-8361-19e1c3bd499c@oracle.com>

On Mon, 5 May 2025 11:58:25 +0200
Karolina Stolarek <karolina.stolarek@oracle.com> wrote:

> On 29/04/2025 17:54, Jonathan Cameron wrote:
> > On Fri, 25 Apr 2025 16:12:26 +0200
> > Karolina Stolarek <karolina.stolarek@oracle.com> wrote:  
> >>
> >> OK, that means even if we manage to inject a PCIe error, AER wouldn't be
> >> able to look up the Source ID and other values it needs to report an
> >> error, which is not quite the solution I was looking for.  
> > 
> > Isn't the source ID in the CPER record? (Device ID field) or do
> > you mean something else?  
> 
> Ah, sorry, I got confused on the way. I meant that even if we have the 
> Device ID in CPER set, the specific device has no data in aer_regs if we 
> inject an error using the GHES error injection script. We probably would 
> end up with !info->status in aer_print_error(), thus printing only a 
> line about "Inaccessible" agent and return early.
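
For reference, the early return described above can be modelled roughly as
follows. This is a user-space sketch for illustration only, not the kernel's
aer_print_error(): the struct, the function name and the output strings are
hypothetical, and the only behaviour taken from the discussion is "no status
bits recorded, so print a single 'Inaccessible' line and stop".

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for the kernel's error info struct. */
struct aer_info_sketch {
	uint16_t id;      /* Source ID: bus[15:8], device[7:3], function[2:0] */
	uint32_t status;  /* AER status bits found for the device; 0 if none */
};

/*
 * Format a one-line report into buf.
 * Returns 1 if status bits were available and reported,
 * 0 if only the "Inaccessible" line could be produced.
 */
static int aer_report_sketch(const struct aer_info_sketch *info,
			     char *buf, size_t len)
{
	unsigned int bus = info->id >> 8;
	unsigned int dev = (info->id >> 3) & 0x1f;
	unsigned int fn = info->id & 0x7;

	/* Nothing was recorded for this device: report that and stop. */
	if (!info->status) {
		snprintf(buf, len, "%02x:%02x.%u: type=Inaccessible",
			 bus, dev, fn);
		return 0;
	}

	/* Status bits are present, so there is something to report. */
	snprintf(buf, len, "%02x:%02x.%u: status=0x%08x",
		 bus, dev, fn, (unsigned int)info->status);
	return 1;
}
```

With a Source ID set but status left at 0, which is the situation expected
after a GHES-only injection, the function takes the first branch; it reaches
the second branch only once something has filled in the status bits.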

If you were feeling creative with scripts you might be able to make this
work today...  QEMU does allow native AER injection via pcie_aer_inject_error,
which will fill in the error state in the device and 'try' to trigger an
interrupt. That last bit will fail (I think) if we are doing FW-first handling.
You might need to just prevent the interrupt generation, in a similar fashion
to what this code did:

https://gitlab.com/jic23/qemu/-/commit/ce801e4d5b5cc5417cc7c7e5ecdaaa2ca5d6efe3#8eeec1fb38fa7149cc37b7a56dc193d69281ee96_704_708

At that point, if you were to inject a GHES error using Mauro's stuff, it
should work and find that pre-injected hardware info.

If not, we need a refresh of that patch to hook up record generation with
Mauro's new handling. That's what I plan to get to, but it will be a while yet.

J



> 
> >>> The aim is specifically to allow exercising FW first error handling
> >>> paths because it's a pain to get real systems that have firmware to inject
> >>> the full range of what the kernel etc need to handle.  
> >>
> >> Does this include PCIe errors? If so, that probably doesn't make sense
> >> to try to test my patch on an actual system?  
> > 
> > Ideally test it on a real system as well, but indeed the intent is to
> > allow testing of PCI errors on emulation.  
> 
> I understand. Do you have pointers on how to inject it on a real system? 
> All info I could find about FW error injection pointed to the qemu 
> scripts I mentioned.

Sorry, no.  It may be system specific and disabled on production BIOS.

> 
> >>> x86 support for emulated injection is a work in progress (more of a mess wrt
> >>> to the different ways the event signaling is handled than it is on arm64).
> >>>
> >>> I did have an earlier version of that work wired up to the same
> >>> hooks as the native CXL error injection but I dropped it from my QEMU
> >>> CXL staging tree for now as it was a pain to rebase whilst Mauro was rapidly
> >>> revising the infrastructure.  I'll bring it back when I get time.  
> >>
> >> I understand, I saw some of your series while looking for ways to test
> >> my patch. Thank you very much for your work. As you can see, there are
> >> people actually looking forward to it :)  
> > 
> > Great!  I'll try and get back to wiring it all up again sometime soon.  
> 
> Awesome, thanks.
> 
> Bjorn, is this patch blocking the ratelimiting series? Would it be 
> acceptable to use public logs in the commit message? I'm asking because 
> it looks like there's no easy way to trigger the GHES path, or it would 
> take some time, further delaying the ratelimiting work.
> 
> All the best,
> Karolina
> 
> > 
> > Jonathan
> >   
> 

