* [PATCH] scsi: Update max_hw_sectors on rescan
@ 2024-01-17 21:36 Brian King
2024-01-18 15:44 ` John Garry
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Brian King @ 2024-01-17 21:36 UTC (permalink / raw)
To: linux-scsi; +Cc: brking, jejb, martin.petersen, Brian King
This addresses an issue discovered on ibmvfc LUNs. For this driver,
max_sectors is negotiated with the VIOS. This gets done at initialization
time, then LUNs get scanned and things generally work fine. However,
this attribute can be changed on the VIOS, either due to a sysadmin
change or potentially a VIOS code level change. If this decreases
to a smaller value, due to one of these reasons, the next time the
ibmvfc driver performs an NPIV login, it will only be able to use
the smaller value. In the case of a VIOS reboot, when the VIOS goes
down, all paths through that VIOS will go to devloss state. When
the VIOS comes back up, ibmvfc negotiates max_sectors and will only
be able to get the smaller value and it will update shost->max_sectors.
However, when LUNs are scanned, the devloss paths will be found
and brought back online, still using the old max_hw_sectors. This
change ensures that max_hw_sectors gets updated.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
drivers/scsi/scsi_scan.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 44680f65ea14..01f2b38daab3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
blist_flags_t bflags;
int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
+ struct request_queue *q;
/*
* The rescan flag is used as an optimization, the first scan of a
@@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
*bflagsp = scsi_get_device_flags(sdev,
sdev->vendor,
sdev->model);
+ q = sdev->request_queue;
+ if (queue_max_hw_sectors(q) > shost->max_sectors)
+ blk_queue_max_hw_sectors(q, shost->max_sectors);
+
return SCSI_SCAN_LUN_PRESENT;
}
scsi_device_put(sdev);
@@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
}
spin_unlock_irqrestore(shost->host_lock, flags);
}
-
--
2.39.3
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-17 21:36 [PATCH] scsi: Update max_hw_sectors on rescan Brian King
@ 2024-01-18 15:44 ` John Garry
2024-01-18 17:22 ` Brian King
2024-01-23 22:40 ` Mike Christie
2024-01-24 9:24 ` Christoph Hellwig
2 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2024-01-18 15:44 UTC (permalink / raw)
To: Brian King, linux-scsi; +Cc: brking, jejb, martin.petersen
On 17/01/2024 21:36, Brian King wrote:
> This addresses an issue discovered on ibmvfc LUNs. For this driver,
> max_sectors is negotiated with the VIOS. This gets done at initialization
> time, then LUNs get scanned and things generally work fine. However,
> this attribute can be changed on the VIOS, either due to a sysadmin
> change or potentially a VIOS code level change. If this decreases
> to a smaller value, due to one of these reasons, the next time the
> ibmvfc driver performs an NPIV login, it will only be able to use
> the smaller value. In the case of a VIOS reboot, when the VIOS goes
> down, all paths through that VIOS will go to devloss state. When
> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
> be able to get the smaller value and it will update shost->max_sectors.
Are you saying that the driver will manually update shost->max_sectors
after adding the scsi host? I didn't think that was permitted.
Thanks,
John
> However, when LUNs are scanned, the devloss paths will be found
> and brought back online, still using the old max_hw_sectors. This
> change ensures that max_hw_sectors gets updated.
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
> drivers/scsi/scsi_scan.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 44680f65ea14..01f2b38daab3 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
> blist_flags_t bflags;
> int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
> struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> + struct request_queue *q;
>
> /*
> * The rescan flag is used as an optimization, the first scan of a
> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
> *bflagsp = scsi_get_device_flags(sdev,
> sdev->vendor,
> sdev->model);
> + q = sdev->request_queue;
> + if (queue_max_hw_sectors(q) > shost->max_sectors)
> + blk_queue_max_hw_sectors(q, shost->max_sectors);
> +
> return SCSI_SCAN_LUN_PRESENT;
> }
> scsi_device_put(sdev);
> @@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
> }
> spin_unlock_irqrestore(shost->host_lock, flags);
> }
> -
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-18 15:44 ` John Garry
@ 2024-01-18 17:22 ` Brian King
2024-01-19 9:02 ` John Garry
0 siblings, 1 reply; 8+ messages in thread
From: Brian King @ 2024-01-18 17:22 UTC (permalink / raw)
To: John Garry, linux-scsi; +Cc: brking, jejb, martin.petersen
On 1/18/24 9:44 AM, John Garry wrote:
> On 17/01/2024 21:36, Brian King wrote:
>> This addresses an issue discovered on ibmvfc LUNs. For this driver,
>> max_sectors is negotiated with the VIOS. This gets done at initialization
>> time, then LUNs get scanned and things generally work fine. However,
>> this attribute can be changed on the VIOS, either due to a sysadmin
>> change or potentially a VIOS code level change. If this decreases
>> to a smaller value, due to one of these reasons, the next time the
>> ibmvfc driver performs an NPIV login, it will only be able to use
>> the smaller value. In the case of a VIOS reboot, when the VIOS goes
>> down, all paths through that VIOS will go to devloss state. When
>> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
>> be able to get the smaller value and it will update shost->max_sectors.
>
> Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.
That is what happens. The characteristics of the underlying hardware can change across
a virtual adapter reset.
Thanks,
Brian
>
> Thanks,
> John
>
>> However, when LUNs are scanned, the devloss paths will be found
>> and brought back online, still using the old max_hw_sectors. This
>> change ensures that max_hw_sectors gets updated.
>>
>> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
>> ---
>> drivers/scsi/scsi_scan.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index 44680f65ea14..01f2b38daab3 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>> blist_flags_t bflags;
>> int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
>> struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>> + struct request_queue *q;
>> /*
>> * The rescan flag is used as an optimization, the first scan of a
>> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>> *bflagsp = scsi_get_device_flags(sdev,
>> sdev->vendor,
>> sdev->model);
>> + q = sdev->request_queue;
>> + if (queue_max_hw_sectors(q) > shost->max_sectors)
>> + blk_queue_max_hw_sectors(q, shost->max_sectors);
>> +
>> return SCSI_SCAN_LUN_PRESENT;
>> }
>> scsi_device_put(sdev);
>> @@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
>> }
>> spin_unlock_irqrestore(shost->host_lock, flags);
>> }
>> -
>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-18 17:22 ` Brian King
@ 2024-01-19 9:02 ` John Garry
2024-01-23 13:59 ` Brian King
0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2024-01-19 9:02 UTC (permalink / raw)
To: Brian King, linux-scsi; +Cc: brking, jejb, martin.petersen
On 18/01/2024 17:22, Brian King wrote:
> On 1/18/24 9:44 AM, John Garry wrote:
>> On 17/01/2024 21:36, Brian King wrote:
>>> This addresses an issue discovered on ibmvfc LUNs. For this driver,
>>> max_sectors is negotiated with the VIOS. This gets done at initialization
>>> time, then LUNs get scanned and things generally work fine. However,
>>> this attribute can be changed on the VIOS, either due to a sysadmin
>>> change or potentially a VIOS code level change. If this decreases
>>> to a smaller value, due to one of these reasons, the next time the
>>> ibmvfc driver performs an NPIV login, it will only be able to use
>>> the smaller value. In the case of a VIOS reboot, when the VIOS goes
>>> down, all paths through that VIOS will go to devloss state. When
>>> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
>>> be able to get the smaller value and it will update shost->max_sectors.
>>
>> Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.
>
> That is what happens. The characteristics of the underlying hardware can change across
> a virtual adapter reset.
That's unfortunate.
I don't think that it's a good idea to change shost->max_sectors after
adding the scsi host or to add core code to condone doing it. Indeed,
there is code there to limit shost->max_sectors from DMA mapping
constraints in scsi_add_host() path, which should not be ignored.
Would it be possible to initially set shost->max_sectors for this
adapter at the lowest anticipated value for that adapter and not
change it thereafter?
Thanks,
John
>
> Thanks,
>
> Brian
>
>>
>> Thanks,
>> John
>>
>>> However, when LUNs are scanned, the devloss paths will be found
>>> and brought back online, still using the old max_hw_sectors. This
>>> change ensures that max_hw_sectors gets updated.
>>>
>>> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
>>> ---
>>> drivers/scsi/scsi_scan.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>>> index 44680f65ea14..01f2b38daab3 100644
>>> --- a/drivers/scsi/scsi_scan.c
>>> +++ b/drivers/scsi/scsi_scan.c
>>> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>>> blist_flags_t bflags;
>>> int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
>>> struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>>> + struct request_queue *q;
>>> /*
>>> * The rescan flag is used as an optimization, the first scan of a
>>> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>>> *bflagsp = scsi_get_device_flags(sdev,
>>> sdev->vendor,
>>> sdev->model);
>>> + q = sdev->request_queue;
>>> + if (queue_max_hw_sectors(q) > shost->max_sectors)
>>> + blk_queue_max_hw_sectors(q, shost->max_sectors);
>>> +
>>> return SCSI_SCAN_LUN_PRESENT;
>>> }
>>> scsi_device_put(sdev);
>>> @@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
>>> }
>>> spin_unlock_irqrestore(shost->host_lock, flags);
>>> }
>>> -
>>
>
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-19 9:02 ` John Garry
@ 2024-01-23 13:59 ` Brian King
0 siblings, 0 replies; 8+ messages in thread
From: Brian King @ 2024-01-23 13:59 UTC (permalink / raw)
To: John Garry, linux-scsi; +Cc: brking, jejb, martin.petersen
On 1/19/24 3:02 AM, John Garry wrote:
> On 18/01/2024 17:22, Brian King wrote:
>> On 1/18/24 9:44 AM, John Garry wrote:
>>> On 17/01/2024 21:36, Brian King wrote:
>>>> This addresses an issue discovered on ibmvfc LUNs. For this driver,
>>>> max_sectors is negotiated with the VIOS. This gets done at initialization
>>>> time, then LUNs get scanned and things generally work fine. However,
>>>> this attribute can be changed on the VIOS, either due to a sysadmin
>>>> change or potentially a VIOS code level change. If this decreases
>>>> to a smaller value, due to one of these reasons, the next time the
>>>> ibmvfc driver performs an NPIV login, it will only be able to use
>>>> the smaller value. In the case of a VIOS reboot, when the VIOS goes
>>>> down, all paths through that VIOS will go to devloss state. When
>>>> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
>>>> be able to get the smaller value and it will update shost->max_sectors.
>>>
>>> Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.
>>
>> That is what happens. The characteristics of the underlying hardware can change across
>> a virtual adapter reset.
>
> That's unfortunate.
>
> I don't think that it's a good idea to change shost->max_sectors after adding the scsi host or to add core code to condone doing it. Indeed, there is code there to limit shost->max_sectors from DMA mapping constraints in scsi_add_host() path, which should not be ignored.
Good point. However, this patch only lowers max_hw_sectors if shost->max_sectors has since been decreased.
>
> Would it be possible to initially set shost->max_sectors for this adapter at the lowest anticipated value for that adapter and not change it thereafter?
Different physical backing devices support different ranges of values and the physical backing
device can change dynamically. There is currently no defined way for the client to determine
what the lowest possible value is. The downside to adding such an attribute would be that
we'd then always be limited to an arbitrarily small value, which would limit performance.
Thanks,
Brian
>
> Thanks,
> John
>
>>
>> Thanks,
>>
>> Brian
>>
>>>
>>> Thanks,
>>> John
>>>
>>>> However, when LUNs are scanned, the devloss paths will be found
>>>> and brought back online, still using the old max_hw_sectors. This
>>>> change ensures that max_hw_sectors gets updated.
>>>>
>>>> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
>>>> ---
>>>> drivers/scsi/scsi_scan.c | 6 +++++-
>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>>>> index 44680f65ea14..01f2b38daab3 100644
>>>> --- a/drivers/scsi/scsi_scan.c
>>>> +++ b/drivers/scsi/scsi_scan.c
>>>> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>>>> blist_flags_t bflags;
>>>> int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
>>>> struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>>>> + struct request_queue *q;
>>>> /*
>>>> * The rescan flag is used as an optimization, the first scan of a
>>>> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>>>> *bflagsp = scsi_get_device_flags(sdev,
>>>> sdev->vendor,
>>>> sdev->model);
>>>> + q = sdev->request_queue;
>>>> + if (queue_max_hw_sectors(q) > shost->max_sectors)
>>>> + blk_queue_max_hw_sectors(q, shost->max_sectors);
>>>> +
>>>> return SCSI_SCAN_LUN_PRESENT;
>>>> }
>>>> scsi_device_put(sdev);
>>>> @@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
>>>> }
>>>> spin_unlock_irqrestore(shost->host_lock, flags);
>>>> }
>>>> -
>>>
>>
>
>
--
Brian King
Power Linux I/O
IBM Linux Technology Center
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-17 21:36 [PATCH] scsi: Update max_hw_sectors on rescan Brian King
2024-01-18 15:44 ` John Garry
@ 2024-01-23 22:40 ` Mike Christie
2024-01-24 9:24 ` Christoph Hellwig
2 siblings, 0 replies; 8+ messages in thread
From: Mike Christie @ 2024-01-23 22:40 UTC (permalink / raw)
To: Brian King, linux-scsi; +Cc: brking, jejb, martin.petersen
On 1/17/24 3:36 PM, Brian King wrote:
> This addresses an issue discovered on ibmvfc LUNs. For this driver,
> max_sectors is negotiated with the VIOS. This gets done at initialization
> time, then LUNs get scanned and things generally work fine. However,
> this attribute can be changed on the VIOS, either due to a sysadmin
> change or potentially a VIOS code level change. If this decreases
> to a smaller value, due to one of these reasons, the next time the
> ibmvfc driver performs an NPIV login, it will only be able to use
> the smaller value. In the case of a VIOS reboot, when the VIOS goes
> down, all paths through that VIOS will go to devloss state. When
> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
> be able to get the smaller value and it will update shost->max_sectors.
> However, when LUNs are scanned, the devloss paths will be found
> and brought back online, still using the old max_hw_sectors. This
> change ensures that max_hw_sectors gets updated.
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
> drivers/scsi/scsi_scan.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 44680f65ea14..01f2b38daab3 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
> blist_flags_t bflags;
> int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
> struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> + struct request_queue *q;
>
> /*
> * The rescan flag is used as an optimization, the first scan of a
> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
> *bflagsp = scsi_get_device_flags(sdev,
> sdev->vendor,
> sdev->model);
> + q = sdev->request_queue;
> + if (queue_max_hw_sectors(q) > shost->max_sectors)
> + blk_queue_max_hw_sectors(q, shost->max_sectors);
> +
What happens if commands that are larger than the new shost->max_sectors get
sent to the driver/device?
For example, if we called fc_remote_port_add and scsi_target_unblock puts the
existing devices into SDEV_RUNNING, then we do the scsi_scan_target call and
hit the code above, could we have commands in the request_queue already (we
relogin before fast_io_fail even fires so the commands never get failed)?
It looks like commands have already passed checks like bio_may_exceed_limit
and will be sent to the driver. Will the driver/device spit out an error?
Is this ok, or do you need some sort of flush and limit re-check/re-split?
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-17 21:36 [PATCH] scsi: Update max_hw_sectors on rescan Brian King
2024-01-18 15:44 ` John Garry
2024-01-23 22:40 ` Mike Christie
@ 2024-01-24 9:24 ` Christoph Hellwig
2024-01-24 22:46 ` Brian King
2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2024-01-24 9:24 UTC (permalink / raw)
To: Brian King; +Cc: linux-scsi, brking, jejb, martin.petersen
We can't change the host-wide limit here (it wouldn't apply to all
LUs anyway). If your limit is per-LU, you can call
blk_queue_max_hw_sectors from ->slave_configure.
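For illustration, a minimal sketch of that suggestion for a hypothetical
driver (the `example_` function and field names are placeholders, not
actual ibmvfc code):

```c
/* Hypothetical ->slave_configure callback: re-apply a per-LU limit
 * that the driver caches in its host private data. Placeholder names;
 * a sketch of the suggested approach, not a real implementation.
 */
static int example_slave_configure(struct scsi_device *sdev)
{
	struct example_host *vhost = shost_priv(sdev->host);

	/* Clamp this LU's queue to the currently negotiated limit */
	blk_queue_max_hw_sectors(sdev->request_queue,
				 vhost->negotiated_max_sectors);
	return 0;
}
```

This runs from the initial sdev setup path, so it covers newly
configured LUNs.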
* Re: [PATCH] scsi: Update max_hw_sectors on rescan
2024-01-24 9:24 ` Christoph Hellwig
@ 2024-01-24 22:46 ` Brian King
0 siblings, 0 replies; 8+ messages in thread
From: Brian King @ 2024-01-24 22:46 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-scsi, brking, jejb, martin.petersen, michael.christie
On 1/24/24 3:24 AM, Christoph Hellwig wrote:
> We can't change the host-wide limit here (it wouldn't apply to all
> LUs anyway). If your limit is per-LU, you can call
> blk_queue_max_hw_sectors from ->slave_configure.
Unfortunately, it doesn't look like slave_configure gets called in the
scenario in question. In this case we already have a scsi_device created but
it's in devloss state and the FC transport layer is bringing it back online.
There is also the point that Mike brought up in that if fast fail tmo
has not yet fired, there could be I/O still in the queue that is now
too large.
To answer your earlier question, Mike, if the VIOS receives a request that
is too large it closes the CRQ, forcing an entire reinit / discovery,
so it's definitely not something we want to encounter. I'm trying to get this
behavior improved so that only the one command fails, but that's not what
happens today.
I suppose I could iterate through all the LUNs and call blk_queue_max_hw_sectors
on them, but I'm not sure if that solves the problem. It would close the window
that Mike highlighted, but if there are commands outstanding when this occurs
that are larger than the new max_hw_sectors and they get requeued, will they
get split in the block layer when they get resent to the LLD or will they
just get resent as-is? If it's the latter, I'd get a request larger than
what I can support.
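Such an iteration might look roughly like this sketch (placeholder
name; assumes it runs from the login/renegotiation completion path):

```c
/* Sketch: clamp every LU's queue after shost->max_sectors has been
 * renegotiated downward. shost_for_each_device() handles the sdev
 * reference counting. This does not address commands already built
 * against the old, larger limit.
 */
static void example_clamp_all_luns(struct Scsi_Host *shost)
{
	struct scsi_device *sdev;

	shost_for_each_device(sdev, shost) {
		struct request_queue *q = sdev->request_queue;

		if (queue_max_hw_sectors(q) > shost->max_sectors)
			blk_queue_max_hw_sectors(q, shost->max_sectors);
	}
}
```

Whether requeued commands would be re-split against the new limit is
the open question.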
-Brian
--
Brian King
Power Linux I/O
IBM Linux Technology Center