public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: John Garry <john.g.garry@oracle.com>
To: Sagar.Biradar@microchip.com, Don.Brace@microchip.com,
	Gilbert.Wu@microchip.com, linux-scsi@vger.kernel.org,
	martin.petersen@oracle.com, jejb@linux.ibm.com,
	brking@linux.vnet.ibm.com, stable@vger.kernel.org,
	Tom.White@microchip.com
Subject: Re: [PATCH] aacraid: reply queue mapping to CPUs based of IRQ affinity
Date: Wed, 12 Apr 2023 10:38:21 +0100	[thread overview]
Message-ID: <d7abc010-4d02-c04d-64d5-5fa857b0e690@oracle.com> (raw)
In-Reply-To: <BYAPR11MB3606BFE903D1BBE56C89D31BFA959@BYAPR11MB3606.namprd11.prod.outlook.com>

On 10/04/2023 22:17, Sagar.Biradar@microchip.com wrote:
> On 28/03/2023 22:41, Sagar Biradar wrote:
>> Fix the IO hang that arises because of MSIx vector not having a mapped
>> online CPU upon receiving completion.
> What about if the CPU targeted goes offline while the IO is in-flight?
> 
>> This patch sets up a reply queue mapping to CPUs based on the IRQ
>> affinity retrieved using pci_irq_get_affinity() API.
>>
> blk-mq already does what you want here, including handling for the case I mention above. It maintains a CPU -> HW queue mapping, and using a reply map in the LLD is the old way of doing this.
> 
> Could you instead follow the example in commit 664f0dce2058 ("scsi:
> mpt3sas: Add support for shared host tagset for CPU hotplug"), and expose the HW queues to the upper layer? You can alternatively check the example of any SCSI driver which sets shost->host_tagset for this.
> 
> Thanks,
> John
> [Sagar Biradar]
> 
> ***What about if the CPU targeted goes offline while the IO is in-flight?
> We ran multiple random cases with the IO's running in parallel and disabling load-bearing CPU's. We saw that the load was transferred to the other online CPUs successfully every time.
> The same was tested at vendor and their customer site - they did not see any issues too.

You need to ensure that all CPUs associated with the HW queue are 
offline and stay offline until any IO may timeout, which would be 30 
seconds according to SCSI sd default timeout. I am not sure if you were 
doing that exactly.

> 
> 
> ***blk-mq already does what you want here, including handling for the case I mention above. It maintains a CPU -> HW queue mapping, and using a reply map in the LLD is the old way of doing this.
> We also tried implementing the blk-mq mechanism in the driver and we saw command timeouts.
> The firmware has limitation of fixed number of queues per vector and the blk-mq changes would saturate that limit.
> That answers the possible command timeout.
> 
> Also this is EOL product and there will be no firmware code changes. Given this, we have decided to stick to the reply_map mechanism.
> (https://urldefense.com/v3/__https://storage.microsemi.com/en-us/support/series8/index.php__;!!ACWV5N9M2RV99hQ!PLrbfoEBvEGxw2CvahCL0AP5c4f5cQ8gT0ahXVgB0mSbyqxWJ8pdtYY0JwRL8xZ59k0NHJhXCBbMtVWlq5pYMeOEHmw7ww$  )
> 
> Thank you for your review comments and we hope you will reconsider the original patch.

I've been checking the driver a bit more and this drivers uses some 
"reserved" commands, right? That would be internal commands which the 
driver sends to the adapter which does not have a scsi_cmnd associated. 
If so, it gets a bit more tricky to use blk-mq support for HW queues, as 
we need to manually find a HW queue for those "reserved commands", like
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/hisi_sas/hisi_sas_main.c?h=v6.3-rc6#n532

Anyway, it's not up to me ...

Thanks,
John



  reply	other threads:[~2023-04-12  9:39 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28 21:41 [PATCH] aacraid: reply queue mapping to CPUs based of IRQ affinity Sagar Biradar
2023-03-29  7:08 ` John Garry
2023-04-10 21:17   ` Sagar.Biradar
2023-04-12  9:38     ` John Garry [this message]
2023-04-12 19:29       ` Sagar.Biradar
2023-04-13 21:52         ` Sagar.Biradar
2023-04-14  7:20           ` John Garry
2023-04-14 20:07       ` Sagar.Biradar
2023-04-18 16:36         ` Sagar.Biradar
2023-04-18 17:12     ` James Bottomley
2023-04-18 23:55       ` Sagar.Biradar
2023-04-19  7:42         ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d7abc010-4d02-c04d-64d5-5fa857b0e690@oracle.com \
    --to=john.g.garry@oracle.com \
    --cc=Don.Brace@microchip.com \
    --cc=Gilbert.Wu@microchip.com \
    --cc=Sagar.Biradar@microchip.com \
    --cc=Tom.White@microchip.com \
    --cc=brking@linux.vnet.ibm.com \
    --cc=jejb@linux.ibm.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox