Linux PCI subsystem development
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Karolina Stolarek <karolina.stolarek@oracle.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	"Shen, Yijun" <Yijun.Shen@dell.com>,
	Bjorn Helgaas <bhelgaas@google.com>, <linux-pci@vger.kernel.org>,
	Jon Pan-Doh <pandoh@google.com>,
	Terry Bowman <terry.bowman@amd.com>, Len Brown <lenb@kernel.org>,
	James Morse <james.morse@arm.com>,
	Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
	Ben Cheatham <Benjamin.Cheatham@amd.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>,
	Liu Xinpeng <liuxp11@chinatelecom.cn>,
	"Darren Hart" <darren@os.amperecomputing.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] PCI/AER: Consolidate CXL, ACPI GHES and native AER reporting paths
Date: Tue, 29 Apr 2025 16:54:10 +0100
Message-ID: <20250429165410.00002c86@huawei.com>
In-Reply-To: <0f4944a4-fd05-4365-9416-378a7385547b@oracle.com>

On Fri, 25 Apr 2025 16:12:26 +0200
Karolina Stolarek <karolina.stolarek@oracle.com> wrote:

> On 25/04/2025 15:14, Jonathan Cameron wrote:
> > On Fri, 25 Apr 2025 12:32:10 +0200
> > Karolina Stolarek <karolina.stolarek@oracle.com> wrote:  
> >> 
> >> It's possible that some of the nuances of this escaped me. I decided to
> >> pick up the series, as I saw "PCI Express bus error injection via GHES"
> >> script and thought it might be useful.  
> > 
> > With Mauro's series you can inject (on ARM64 virt) any CPER record you
> > like.  That doesn't synchronize the wider state of the system though
> > so may not exercise everything (PCI registers etc not updated as it
> > is only injecting the record).  Mostly it just works, as remarkably
> > few error handlers actually take the state of the components on which
> > the error is reported into account.  
> 
> OK, that means even if we manage to inject a PCIe error, AER wouldn't be 
> able to look up the Source ID and other values it needs to report an 
> error, which is not quite the solution I was looking for.

Isn't the source ID in the CPER record? (Device ID field) or do
you mean something else?
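
To make the question concrete, here is a rough sketch of how the location
fields in a CPER PCIe Device ID (segment, bus, device, function) combine
into a 16-bit source ID and a printable address. The struct and the
example_* helpers are hypothetical names made up for illustration; the
struct is a trimmed stand-in, not the kernel's definition, and it ignores
the other Device ID members (vendor, class code, slot and so on).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical, trimmed-down stand-in for the location part of the
 * Device ID field in a CPER PCIe error section. Illustration only.
 */
struct example_cper_device_id {
	uint16_t segment;
	uint8_t bus;
	uint8_t device;
	uint8_t function;
};

/* Pack device (5 bits) and function (3 bits) into one devfn byte. */
static uint8_t example_devfn(uint8_t device, uint8_t function)
{
	return (uint8_t)(((device & 0x1f) << 3) | (function & 0x07));
}

/* Bus in the upper byte, devfn in the lower byte. */
static uint16_t example_source_id(const struct example_cper_device_id *id)
{
	return (uint16_t)(((uint16_t)id->bus << 8) |
			  example_devfn(id->device, id->function));
}

/* Format as the usual domain:bus:device.function string. */
static void example_format_bdf(const struct example_cper_device_id *id,
			       char *buf, size_t len)
{
	snprintf(buf, len, "%04x:%02x:%02x.%x", (unsigned int)id->segment,
		 (unsigned int)id->bus, (unsigned int)(id->device & 0x1f),
		 (unsigned int)(id->function & 0x07));
}
```

So, assuming the record carries valid values in those fields, segment plus
bus plus devfn should be enough to identify the device. Whether the rest of
the device state matches the injected record is a separate question.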

> 
> > The aim is specifically to allow exercising FW first error handling
> > paths because it's a pain to get real systems that have firmware to inject
> > the full range of what the kernel etc need to handle.  
> 
> Does this include PCIe errors? If so, it probably doesn't make sense 
> to try to test my patch on an actual system?

Ideally test it on a real system as well, but indeed the intent is to
allow testing of PCI errors on emulation.

> 
> > x86 support for emulated injection is a work in progress (more of a mess wrt
> > the different ways the event signaling is handled than it is on arm64).
> > 
> > I did have an earlier version of that work wired up to the same
> > hooks as the native CXL error injection but I dropped it from my QEMU
> > CXL staging tree for now as it was a pain to rebase whilst Mauro was rapidly
> > revising the infrastructure.  I'll bring it back when I get time.  
> 
> I understand, I saw some of your series while looking for ways to test 
> my patch. Thank you very much for your work. As you can see, there are 
> people actually looking forward to it :)

Great!  I'll try and get back to wiring it all up again sometime soon.

Jonathan

> 
> 
> All the best,
> Karolina
> 
> > 
> > Jonathan
> >   
> >>  
> >>> Unfortunately there are some typos in the spec (FIRMWARE_FIRST,
> >>> FIRMWAREFIRST in 18.4), so it's a little hard to find all the
> >>> references.  
> >>
> >> Thanks for the pointers, I'll take a look.
> >>  
> >>> It's a long shot, but I added Yijun as a Dell contact who might
> >>> have a pointer to someone who could possibly test GHES logging on a
> >>> Dell box with and without your patch so we could have a concrete
> >>> comparison of the dmesg log differences.  
> >>
> >> Thank you very much. Let's see, maybe we'll get lucky :)
> >>
> >> All the best,
> >> Karolina
> >>  
> >>>      
> >>>>> If you can't produce actual logs for comparison, I think we can take
> >>>>> info from a sample log somebody has posted and synthesize what the
> >>>>> changes would be after this patch.  
> >>>>
> >>>> I also found some logs at some point, mostly from 2021 and 2023, but I felt
> >>>> bad about mocking up the messages and tried to produce actual logs. If I
> >>>> can't find a way to get this working in two weeks, I'll revisit this idea.
> >>>>
> >>>> All the best,
> >>>> Karolina
> >>>>
> >>>> -------------------------------------------------------------
> >>>> [1] - https://lore.kernel.org/lkml/76824dfc6bb5dd23a9f04607a907ac4ccf7cb147.1740653898.git.mchehab+huawei@kernel.org/  
> >>
> >>  
> >   
> 
> 


