From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Karolina Stolarek <karolina.stolarek@oracle.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	"Shen, Yijun" <Yijun.Shen@dell.com>,
	Bjorn Helgaas <bhelgaas@google.com>, <linux-pci@vger.kernel.org>,
	Jon Pan-Doh <pandoh@google.com>,
	Terry Bowman <terry.bowman@amd.com>, Len Brown <lenb@kernel.org>,
	James Morse <james.morse@arm.com>,
	Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
	Ben Cheatham <Benjamin.Cheatham@amd.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Shuai Xue <xueshuai@linux.alibaba.com>,
	Liu Xinpeng <liuxp11@chinatelecom.cn>,
	"Darren Hart" <darren@os.amperecomputing.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] PCI/AER: Consolidate CXL, ACPI GHES and native AER reporting paths
Date: Fri, 25 Apr 2025 14:14:01 +0100
Message-ID: <20250425141401.0000067b@huawei.com>
In-Reply-To: <61d3f860-9411-4c86-b9c4-a4524ec8ea6d@oracle.com>

On Fri, 25 Apr 2025 12:32:10 +0200
Karolina Stolarek <karolina.stolarek@oracle.com> wrote:

> On 24/04/2025 19:28, Bjorn Helgaas wrote:
> > [+to Yijun @Dell in case there's some testing opportunity, thread at
> > https://lore.kernel.org/r/81c040d54209627de2d8b150822636b415834c7f.1742900213.git.karolina.stolarek@oracle.com]
> > 
> > On Thu, Apr 24, 2025 at 11:01:11AM +0200, Karolina Stolarek wrote:  
> >>
> >> The only way to inject GHES errors I'm aware of is Mauro's patch for
> >> qemu[1], so I went down the virtualization path. As for working with the
> >> actual hardware, I'd need to ask around and learn more about the platform.  
> > 
> > I'd be surprised if the qemu firmware supports firmware-first
> > handling, so I wouldn't expect to be able to exercise this path that
> > way.  I think there are some bits in HEST and similar tables that tell
> > us about this, e.g., ACPI r6.5, sec 18.3.2.4.  
> 
> It's possible that some of the nuances of this escaped me. I decided to 
> pick up the series, as I saw "PCI Express bus error injection via GHES" 
> script and thought it might be useful.

With Mauro's series you can inject (on ARM64 virt) any CPER record you
like.  That doesn't synchronize the wider state of the system though,
so it may not exercise everything (PCI registers etc. are not updated,
as it only injects the record).  Mostly it just works, as remarkably
few error handlers actually take the state of the components on which
the error is reported into account.
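
To make "only injecting the record" concrete, here is a minimal sketch of
what such a record body amounts to: a packed blob in the layout of the UEFI
CPER PCI Express Error Section. It is not taken from Mauro's series; the
function name and the example IDs are hypothetical, and the field order is my
reading of the spec, so check it before relying on it.

```python
import struct

# Assumed layout of the UEFI CPER "PCI Express Error Section" (208 bytes).
PCIE_SECTION = struct.Struct(
    "<"
    "Q"            # validation bits
    "I"            # port type
    "BB2x"         # version (minor, major) + 2 reserved bytes
    "HH"           # command, status
    "4x"           # reserved
    "HH3sBBHBBHx"  # vendor, device, class code, function, device,
                   # segment, bus, secondary bus, slot, reserved
    "II"           # device serial number (lower, upper)
    "HH"           # bridge secondary status, bridge control
    "60s"          # PCIe capability structure
    "96s"          # AER info
)

VALID_PORT_TYPE = 1 << 0   # validation bit 0: port type field is valid
VALID_DEVICE_ID = 1 << 3   # validation bit 3: device ID block is valid
PORT_TYPE_ROOT_PORT = 4


def build_pcie_cper_section(vendor, device, segment, bus, dev, fn):
    """Hypothetical helper: pack a section that names a device, nothing more.

    Only the bytes of the record are produced. No config space or AER
    register of the named device is touched, which is why handlers that
    go and read device state may see nothing wrong.
    """
    return PCIE_SECTION.pack(
        VALID_PORT_TYPE | VALID_DEVICE_ID,
        PORT_TYPE_ROOT_PORT,
        0, 0,                      # version left unset (not marked valid)
        0, 0,                      # command, status
        vendor, device, b"\x00\x00\x00", fn, dev, segment, bus, 0, 0,
        0, 0,                      # serial number
        0, 0,                      # bridge status, control
        bytes(60), bytes(96),      # capability and AER info left zeroed
    )


section = build_pcie_cper_section(0x1234, 0x5678, 0, 0, 1, 0)
```

The AER info area is left zeroed here on purpose; a real test record would
fill it in, but the device's own registers would still be unchanged.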

The aim is specifically to allow exercising firmware-first error handling
paths, because it's a pain to find real systems whose firmware can inject
the full range of errors the kernel etc. needs to handle.

x86 support for emulated injection is a work in progress (more of a mess wrt
the different ways the event signaling is handled than it is on arm64).

I did have an earlier version of that work wired up to the same
hooks as the native CXL error injection but I dropped it from my QEMU
CXL staging tree for now as it was a pain to rebase whilst Mauro was rapidly
revising the infrastructure.  I'll bring it back when I get time.

Jonathan

> 
> > Unfortunately there are some typos in the spec (FIRMWARE_FIRST,
> > FIRMWAREFIRST in 18.4), so it's a little hard to find all the
> > references.  
> 
> Thanks for the pointers, I'll take a look.
> 
> > It's a long shot, but I added Yijun as a Dell contact who might
> > have a pointer to someone who could possibly test GHES logging on a
> > Dell box with and without your patch so we could have a concrete
> > comparison of the dmesg log differences.  
> 
> Thank you very much. Let's see, maybe we'll get lucky :)
> 
> All the best,
> Karolina
> 
> >   
> >>> If you can't produce actual logs for comparison, I think we can take
> >>> info from a sample log somebody has posted and synthesize what the
> >>> changes would be after this patch.  
> >>
> >> I also found some logs at some point, mostly from 2021 and 2023, but I felt
> >> bad about mocking up the messages and tried to produce actual logs. If I
> >> can't find a way to get this working in two weeks, I'll revisit this idea.
> >>
> >> All the best,
> >> Karolina
> >>
> >> -------------------------------------------------------------
> >> [1] - https://lore.kernel.org/lkml/76824dfc6bb5dd23a9f04607a907ac4ccf7cb147.1740653898.git.mchehab+huawei@kernel.org/  
> 
> 

