From: Niklas Schnelle <schnelle@linux.ibm.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>,
linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] s390/pci: fix CPU address in MSI for directed IRQ
Date: Mon, 30 Nov 2020 10:50:02 +0100 [thread overview]
Message-ID: <bc52205b-9ae8-98c6-0eb0-9fff551c7868@linux.ibm.com> (raw)
In-Reply-To: <20201130095549.27da927f.pasic@linux.ibm.com>
On 11/30/20 9:55 AM, Halil Pasic wrote:
> On Mon, 30 Nov 2020 09:30:33 +0100
> Niklas Schnelle <schnelle@linux.ibm.com> wrote:
>
>> I'm not really familiar with it, but I think this is closely related
>> to what I asked Bernd Nerz. I fear that if CPUs go away we might already
>> be in trouble at the firmware/hardware/platform level because the CPU Address is
>> "programmed into the device" so to speak. Thus a directed interrupt from
>> a device may race with anything reordering/removing CPUs even if
>> CPU addresses of dead CPUs are not reused and the mapping is stable.
>
> From your answer, I read that CPU hot-unplug is supported for LPAR.
I'm not sure about hot-unplug and firmware
telling us about removed CPUs, but at the very least there is:
echo 0 > /sys/devices/system/cpu/cpu6/online
>>
>> Furthermore our floating fallback path will try to send a SIGP
>> to the target CPU which clearly doesn't work when that is permanently
>> gone. Either way I think these issues are out of scope for this fix
>> so I will go ahead and merge this.
>
> I agree, it makes no sense to delay this fix.
>
> But if CPU hot-unplug is supported, I believe we should react when
> a CPU is unplugged, that is a target of directed interrupts. My guess
> is, that in this scenario transient hiccups are unavoidable, and thus
> should be accepted, but we should make sure that we recover.
I agree. I just tested the above command on a firmware test system and
deactivated 4 of 8 CPUs.
This is in /proc/interrupts after that:
...
3: 9392 0 0 0 PCI-MSI mlx5_async@pci:0001:00:00.0
4: 282741 0 0 0 PCI-MSI mlx5_comp0@pci:0001:00:00.0
5: 0 2 0 0 PCI-MSI mlx5_comp1@pci:0001:00:00.0
6: 0 0 104 0 PCI-MSI mlx5_comp2@pci:0001:00:00.0
7: 0 0 0 2 PCI-MSI mlx5_comp3@pci:0001:00:00.0
8: 0 0 0 0 PCI-MSI mlx5_comp4@pci:0001:00:00.0
9: 0 0 0 0 PCI-MSI mlx5_comp5@pci:0001:00:00.0
10: 0 0 0 0 PCI-MSI mlx5_comp6@pci:0001:00:00.0
11: 0 0 0 0 PCI-MSI mlx5_comp7@pci:0001:00:00.0
...
So it looks like we are left with registered interrupts
for CPUs which are offline. However I'm not sure how to
trigger a problem with that. I think the drivers would
usually only do a directed interrupt to a CPU that
is currently running the process that triggered the
I/O (I tested this assumption with "taskset -c 2 ping ...").
Now with the CPU offline there cannot be such a
process, so I think for the most part the queue would
just remain unused. Still, if we do get a directed
interrupt for such a queue, it's my understanding
that we currently lose it.
I think this could be fixed with something I tried in prototype
code a while back: in zpci_handle_fallback_irq() I handled the
IRQ locally.
Back then it looked like directed IRQs would make it into z15 GA 1.5, and
this was done to help Bernd debug a Millicode issue (Jup 905371).
I also had a version of that code, meant as a possible performance
improvement, that would check whether the target CPU is
available, send the SIGP only in that case, and otherwise handle
the interrupt locally.
>
> Regards,
> Halil
>
Thread overview: 8+ messages
2020-11-26 17:00 [PATCH v3] s390/pci: fix CPU address in MSI for directed IRQ Alexander Gordeev
2020-11-27 8:56 ` Halil Pasic
2020-11-27 10:08 ` Niklas Schnelle
2020-11-27 15:39 ` Halil Pasic
2020-11-30 8:30 ` Niklas Schnelle
2020-11-30 8:55 ` Halil Pasic
2020-11-30 9:50 ` Niklas Schnelle [this message]
2020-12-09 15:08 ` Naresh Kamboju