From: John Meneghini <jmeneghi@redhat.com>
To: John Garry <john.g.garry@oracle.com>,
jejb@linux.vnet.ibm.com, martin.petersen@oracle.com,
sagar.biradar@microchip.com
Cc: linux-scsi@vger.kernel.org, hare@suse.com, ming.lei@redhat.com
Subject: Re: [PATCH] scsi: aacraid: Stop using PCI_IRQ_AFFINITY
Date: Tue, 15 Jul 2025 15:18:42 -0400
Message-ID: <beaeca23-a475-405e-b0ea-af42373dd3bd@redhat.com>
In-Reply-To: <20250715111535.499853-1-john.g.garry@oracle.com>
OK John. Thanks for the patch.
We will test this out on our test bed here at Red Hat and let you know if this solves the problem.
/John
On 7/15/25 7:15 AM, John Garry wrote:
> When PCI_IRQ_AFFINITY is set for calling pci_alloc_irq_vectors(), it
> means interrupts are spread around the available CPUs. It also means that
> the interrupts become managed, which means that an interrupt is shut down
> when all the CPUs in the interrupt affinity mask go offline.
>
> Using managed interrupts in this way means that we must ensure that
> completions do not occur on HW queues whose associated interrupt
> is shut down. This is typically achieved by ensuring only CPUs which are
> online can generate IO completion traffic to the HW queue which they are
> mapped to (so that they can also serve completion interrupts for that
> HW queue).
>
> The problem in the driver is that a CPU can generate completions to
> a HW queue whose interrupt may be shut down, as the CPUs in the HW queue
> interrupt affinity mask may be offline. This can cause IOs to never
> complete and hang the system. The driver maintains its own CPU <-> HW
> queue mapping for submissions, see aac_fib_vector_assign(), but this
> does not reflect the CPU <-> HW queue interrupt affinity mapping.
>
> Commit 9dc704dcc09e ("scsi: aacraid: Reply queue mapping to CPUs based on
> IRQ affinity") tried to remedy this issue by mapping CPUs properly to
> HW queue interrupts. However, this was later reverted in commit c5becf57dd56
> ("Revert "scsi: aacraid: Reply queue mapping to CPUs based on IRQ
> affinity"") - it seems that there were other reports of hangs. I guess that
> this was due to some implementation issue in the original commit or
> maybe a HW issue.
>
> Fix the original hang by not setting PCI_IRQ_AFFINITY, so that the
> interrupts are no longer managed. In this way, all CPUs will be in each
> HW queue affinity mask, so completions should not be lost if any
> CPUs go offline.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> build tested only
>
> diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
> index 28cf18955a08..726c8531b7d3 100644
> --- a/drivers/scsi/aacraid/comminit.c
> +++ b/drivers/scsi/aacraid/comminit.c
> @@ -481,8 +481,7 @@ void aac_define_int_mode(struct aac_dev *dev)
>  	    pci_find_capability(dev->pdev, PCI_CAP_ID_MSIX)) {
>  		min_msix = 2;
>  		i = pci_alloc_irq_vectors(dev->pdev,
> -					  min_msix, msi_count,
> -					  PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
> +					  min_msix, msi_count, PCI_IRQ_MSIX);
>  		if (i > 0) {
>  			dev->msi_enabled = 1;
>  			msi_count = i;
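
For readers following the thread, the effect described in the commit message can be modelled in userspace C. This is a toy sketch, not kernel code and not part of the patch: NR_CPUS, NR_VECTORS, managed_mask(), unmanaged_mask() and vector_alive() are hypothetical names, and the even split of CPUs across vectors is an assumed simplification of how the kernel actually spreads managed interrupts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS    8	/* illustrative CPU count, not taken from the patch */
#define NR_VECTORS 4	/* illustrative MSI-X vector count, also assumed */

/*
 * Managed model (PCI_IRQ_AFFINITY set): CPUs are split evenly, so each
 * CPU appears in the affinity mask of exactly one vector.
 */
static uint32_t managed_mask(unsigned int vec)
{
	unsigned int per_vec = NR_CPUS / NR_VECTORS;

	assert(vec < NR_VECTORS);
	return ((1u << per_vec) - 1) << (vec * per_vec);
}

/*
 * Unmanaged model (PCI_IRQ_AFFINITY not set): per the commit message,
 * all CPUs are in each HW queue affinity mask.
 */
static uint32_t unmanaged_mask(unsigned int vec)
{
	assert(vec < NR_VECTORS);
	return (1u << NR_CPUS) - 1;
}

/*
 * A vector can still deliver completion interrupts only while at least
 * one CPU in its affinity mask is online.
 */
static bool vector_alive(uint32_t affinity, uint32_t online)
{
	return (affinity & online) != 0;
}
```

With CPUs 0 and 1 offline (online mask 0xfc), vector_alive(managed_mask(0), 0xfc) is false in this model, while vector_alive(unmanaged_mask(0), 0xfc) is true. That is the hang scenario from the commit message: a completion steered to vector 0 is never serviced in the managed case.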
Thread overview: 7+ messages
2025-07-15 11:15 [PATCH] scsi: aacraid: Stop using PCI_IRQ_AFFINITY John Garry
2025-07-15 19:18 ` John Meneghini [this message]
2025-07-24 17:23 ` John Meneghini
2025-07-25 8:24 ` John Garry
2025-07-24 17:29 ` John Meneghini
2025-07-25 1:18 ` Martin K. Petersen
2025-07-31 4:44 ` Martin K. Petersen