linux-s390.vger.kernel.org archive mirror
From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Huacai Chen <chenhuacai@kernel.org>,
	Tianrui Zhao <zhaotianrui@loongson.cn>,
	Bibo Mao <maobibo@loongson.cn>,
	Bjorn Helgaas <bhelgaas@google.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	loongarch@lists.linux.dev, Farhan Ali <alifm@linux.ibm.com>,
	Matthew Rosato	 <mjrosato@linux.ibm.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Gerd Bayer	 <gbayer@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [PATCH v5 1/2] PCI: Fix isolated PCI function probing with ARI and SR-IOV
Date: Fri, 28 Nov 2025 14:30:40 +0100
Message-ID: <298aaf6b2815e59d1a94efffdd0e3b002c000cea.camel@linux.ibm.com>
In-Reply-To: <7385677843a7790e01158644f63ae4dbb3353bfe.camel@linux.ibm.com>

On Mon, 2025-11-10 at 14:08 +0100, Niklas Schnelle wrote:
> On Fri, 2025-11-07 at 15:19 +0800, Huacai Chen wrote:
> > On Wed, Nov 5, 2025 at 5:46 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > > 
> > > On Wed, 2025-11-05 at 09:01 +0800, Huacai Chen wrote:
> > > > On Mon, Nov 3, 2025 at 7:23 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > > > > 
> > > > > On Mon, 2025-11-03 at 17:50 +0800, Huacai Chen wrote:
> > > > > > Hi, Niklas,
> > > > > > 
> > > > > > On Wed, Oct 29, 2025 at 5:42 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
--- snip ---
> > > > > > > 
> > > > > > > Still especially the first issue prevents correct detection of ARI and
> > > > > > > the second might be a problem for other users of isolated function
> > > > > > > probing. Fix both issues by keeping things as simple as possible. If
> > > > > > > isolated function probing is enabled simply scan every possible devfn.
> > > > > > I'm very sorry, but applying this patch on top of commit a02fd05661d7
> > > > > > ("PCI: Extend isolated function probing to LoongArch") we fail to
> > > > > > boot.
> > > > > > 
> > > > > > Boot log:
> > > > > > 
--- snip ---
> > > > > 
> > > > > 
> > > > > This looks like a warning telling us that AHCI enable failed / timed
> > > > > out. Do you have Panic on Warn on that this directly causes a boot
> > > > > failure? The only relation I can see with my patch is that maybe this
> > > > > AHCI device wasn't probed before and somehow isn't working?
> > > > The rootfs is on the AHCI controller, so AHCI failure causes the boot
> > > > failure, without this patch no boot problems.
> > > > 
> > > > Huacai
> > > > 
> > > 
> > > Ok, I'm going to need more details to make sense of this. Can you tell
> > > me if ARI is enabled for that bus? Did you test with both patches or
> > > just this one? Could you provide lspci -vv from a good boot and can you
> > > tell which AHCI device the root device is on? Also could you clarify
> > > why you set hypervisor_isolated_pci_functions() in particular this
> > > seems like a bare metal boot, right? When running in KVM do you pass-
> > > through individual PCI functions with the guest seeing a devfn other
> > > than 0 alone, i.e. a missing devfn 0? Or do you need this for bare
> > > metal for some reason? If you don't need it for bare metal, does the
> > > boot work if you return 0 from hypervisor_isolated_pci_functions() with
> > > this patch?
> > 1. ARI isn't enabled.
> > 2. Only test the first patch.
> > 3. This is a bare metal boot.
> > 4. If hypervisor_isolated_pci_functions() return 0 then boot is OK
> > 5. PCI information please see the attachment.
> > 
> > Huacai
> 
> Thanks for the input. As far as I can see the lspci from a good boot
> shows no holes in your devfn space so this particular system doesn't
> seem to need the isolated function probing at all. But even then using
> it should only try out all devfns and thus never skip one that is found
> without isolated function probing.
> 
> To sanity check this, I just booted my personal AMD Ryzen 3900X system
> with this series plus a two-liner to unconditionally enable isolated
> function probing also on x86_64 and it came up fine including AMD
> graphics and my Intel NIC with enabled SR-IOV. 
> 
> So I'm really perplexed and coming back to the thought that a device on
> your system is misbehaving when probing is attempted and maybe due to a
> similar issue as what I saw with SR-IOV it wasn't probed so far but
> really should be probed if isolated function probing is enabled. I also
> still don't understand your use-case. If it is for VMs then maybe you
> could limit it to those? Otherwise it feels like this is just a hack to
> probe an odd topology and I wonder if you should rather set
> PCI_SCAN_ALL_PCIE_DEVS to find those?
> 
> Thanks,
> Niklas

Hi LoongArch Maintainers, Hi Bjorn,

Sorry for the ping, but I'd really like to get this unstuck and I
haven't heard back on my previous questions. From my testing on s390,
this patch fixes a real logic error that prevents the scanning of some
devfns which I believe should be scanned if isolated functions are
possible. In all my testing, including on x86 as stated in the
previous mail, the code does exactly what I think it is supposed to do.
So to me it really looks like something goes wrong with your use of
hypervisor_isolated_pci_functions() on your specific hardware.
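
For illustration only, the expectation stated above (scanning every
devfn can only find more functions, never fewer) can be modeled in a
small userspace sketch. This is not the kernel's scan code: struct
toy_bus, scan_classic() and scan_isolated() are hypothetical names, and
the "classic" rule is a simplification that looks at functions 1..7
only when function 0 is present and marked multifunction.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DEVFN 256
#define DEVFN(slot, fn) ((((slot) & 0x1f) << 3) | ((fn) & 0x07))

/* Hypothetical toy model of one bus, not a kernel structure. */
struct toy_bus {
	bool present[NUM_DEVFN];   /* does this devfn respond at all? */
	bool multifunction[32];    /* per slot, as reported by function 0 */
};

/* Simplified classic scan: function 0 gates the rest of the slot. */
size_t scan_classic(const struct toy_bus *bus, uint8_t *found)
{
	size_t n = 0;

	for (unsigned int slot = 0; slot < 32; slot++) {
		if (!bus->present[DEVFN(slot, 0)])
			continue;	/* no function 0: skip the whole slot */
		found[n++] = DEVFN(slot, 0);
		if (!bus->multifunction[slot])
			continue;
		for (unsigned int fn = 1; fn < 8; fn++)
			if (bus->present[DEVFN(slot, fn)])
				found[n++] = DEVFN(slot, fn);
	}
	return n;
}

/* Isolated function probing as described: try every possible devfn. */
size_t scan_isolated(const struct toy_bus *bus, uint8_t *found)
{
	size_t n = 0;

	for (unsigned int devfn = 0; devfn < NUM_DEVFN; devfn++)
		if (bus->present[devfn])
			found[n++] = (uint8_t)devfn;
	return n;
}
```

In this model a function at slot 3, function 2 with no function 0 in
that slot is found only by scan_isolated(), while everything
scan_classic() finds is also found by scan_isolated().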

One idea I had: maybe you need to exclude known empty slots in your
config space accessors?
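
A rough sketch of what such filtering could look like, written as plain
userspace C rather than against the real LoongArch accessors. All names
here (toy_config_read(), toy_slot_known_empty(), the empty_slot_mask
bitmap, the TOY_* return codes) are assumptions made up for the
example; how a platform would actually know which slots are empty is
left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical return codes for this sketch only. */
#define TOY_SUCCESSFUL       0x00
#define TOY_DEVICE_NOT_FOUND 0x86

/* Hypothetical backend that performs the real config space read. */
typedef uint32_t (*toy_hw_read_fn)(unsigned int devfn, int where);

/* Bit N of empty_slot_mask set means slot N is known to be empty. */
bool toy_slot_known_empty(uint32_t empty_slot_mask, unsigned int devfn)
{
	unsigned int slot = (devfn >> 3) & 0x1f;

	return (empty_slot_mask >> slot) & 1u;
}

/*
 * Config read that never touches hardware for known empty slots and
 * instead reports "no device" with an all-ones value.
 */
int toy_config_read(uint32_t empty_slot_mask, toy_hw_read_fn hw_read,
		    unsigned int devfn, int where, uint32_t *val)
{
	if (toy_slot_known_empty(empty_slot_mask, devfn)) {
		*val = 0xffffffffu;
		return TOY_DEVICE_NOT_FOUND;
	}
	*val = hw_read(devfn, where);
	return TOY_SUCCESSFUL;
}

/* Stand-in backend for trying the sketch out; returns a fixed value. */
uint32_t toy_fake_hw_read(unsigned int devfn, int where)
{
	(void)devfn;
	(void)where;
	return 0x12345678u;
}
```

With a filter like this, scanning every devfn would not issue config
cycles to slots the platform already knows are empty.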

More generally, I'd really like to better understand your use-case
for the isolated PCI functions. And speaking of that, I'm sorry for
having been so blunt in my last mail in saying that it felt like a hack.
I'm just worried that we've run into incompatible interpretations or
uses of this feature that now prevent us from fixing actual bugs.

Thanks in advance,
Niklas Schnelle

Thread overview: 12+ messages
2025-10-29  9:41 [PATCH v5 0/2] PCI: Fix isolated function probing and enable ARI for s390 Niklas Schnelle
2025-10-29  9:41 ` [PATCH v5 1/2] PCI: Fix isolated PCI function probing with ARI and SR-IOV Niklas Schnelle
2025-11-03  9:50   ` Huacai Chen
2025-11-03 11:23     ` Niklas Schnelle
2025-11-05  1:01       ` Huacai Chen
2025-11-05  9:46         ` Niklas Schnelle
2025-11-07  7:19           ` Huacai Chen
2025-11-10 13:08             ` Niklas Schnelle
2025-11-28 13:30               ` Niklas Schnelle [this message]
2025-12-01 14:45                 ` Huacai Chen
2025-12-03 21:45                   ` Niklas Schnelle
2025-10-29  9:41 ` [PATCH v5 2/2] PCI: s390: Handle ARI on bus without associated struct pci_dev Niklas Schnelle
