* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-09-29 11:47 [PATCH] megaraid_sas: Enable shared tag map Hannes Reinecke
@ 2014-09-29 16:52 ` Webb Scales
2014-09-29 17:08 ` James Bottomley
2014-09-30 7:43 ` Christoph Hellwig
2014-10-01 18:51 ` Webb Scales
2 siblings, 1 reply; 16+ messages in thread
From: Webb Scales @ 2014-09-29 16:52 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: James Bottomley, Christoph Hellwig, linux-scsi
[Hannes, James, Christoph: sorry for the extra copies -- my mail client
didn't automatically convert to plain-text for the list, so it bounced.]
Could someone enlighten me on when and where a driver might want to or
be required to set the "tagged_supported" flag?
A quick grep yields a number of places where the flag value is set or
cleared.
This looks like the mechanism which is supposed to set the flag (e.g.,
if the device/LUN characteristics suggest that it's appropriate), and so
I'm not sure when/why the driver would want to do it explicitly:
./drivers/scsi/scsi_scan.c: sdev->tagged_supported = 1; [in scsi_add_lun()]
These look like attempts to override the SCSI mid-layer:
./drivers/scsi/fnic/fnic_main.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
./drivers/scsi/stex.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
./drivers/scsi/stex.c: sdev->tagged_supported = 1; [in '_slave_configure()]
./drivers/scsi/qla4xxx/ql4_os.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
./drivers/scsi/qla4xxx/ql4_os.c: sdev->tagged_supported = 1; [in '_slave_configure()]
./drivers/scsi/ufs/ufshcd.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
./drivers/scsi/scsi_debug.c: sdp->tagged_supported = 1; [in '_slave_configure()]
However, it strikes me that it's unnecessary to set the flag in /both/
'_slave_alloc() and '_slave_configure() -- I think that either one
should work, and '_slave_configure() seems like the appropriate choice.
(But, I'm left wondering why these drivers need to override the SCSI
ML's choice....)
These are interesting, but not pertinent to the discussion at hand (they
seem to be fall-backs triggered during error recovery):
./drivers/scsi/atari_NCR5380.c: cmd->device->tagged_supported = 0; [error recovery?]
./drivers/scsi/53c700.c: SCp->device->tagged_supported = 0; [error recovery?]
./drivers/scsi/sun3_NCR5380.c: cmd->device->tagged_supported = 0; [error recovery?]
And, this one looks spurious (that is, I believe it's clearing the flag
when it's already clear):
./drivers/scsi/libsas/sas_scsi_host.c: scsi_dev->tagged_supported = 0; [looks spurious]
So, taking the patch below, for example, why is tagged_supported being
set in /both/ the '_slave_alloc() and the '_slave_configure() routines,
and, in fact, why is it being set at all -- why doesn't the SCSI
mid-layer set it correctly in this case?
Thanks,
Webb
On 9/29/14 7:47 AM, Hannes Reinecke wrote:
> Megaraid_sas uses a shared pool of commands per HBA, so we
> should be enabling a shared tag map.
> This will allow the I/O scheduler to make better scheduling
> decisions and will avoid BUSY states in the driver.
>
> Signed-off-by: Hannes Reinecke<hare@suse.de>
> ---
> drivers/scsi/megaraid/megaraid_sas_base.c | 34 ++++++++++++++++++++++++++++++-
> 1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index f6a69a3..996fa9a 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -1659,7 +1659,7 @@ static int megasas_slave_configure(struct scsi_device *sdev)
> */
> blk_queue_rq_timeout(sdev->request_queue,
> MEGASAS_DEFAULT_CMD_TIMEOUT * HZ);
> -
> + sdev->tagged_supported = 1;
> return 0;
> }
>
> @@ -1667,6 +1667,10 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
> {
> u16 pd_index = 0;
> struct megasas_instance *instance ;
> +
> + /* We have a shared tag map, so TCQ is always supported */
> + sdev->tagged_supported = 1;
> +
> instance = megasas_lookup_instance(sdev->host->host_no);
> if (sdev->channel < MEGASAS_MAX_PD_CHANNELS) {
> /*
> @@ -1677,13 +1681,20 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
> sdev->id;
> if (instance->pd_list[pd_index].driveState ==
> MR_PD_STATE_SYSTEM) {
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
> }
> return -ENXIO;
> }
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
> }
>
> +static void megasas_slave_destroy(struct scsi_device *sdev)
> +{
> + scsi_deactivate_tcq(sdev, 1);
> +}
> +
> void megaraid_sas_kill_hba(struct megasas_instance *instance)
> {
> if ((instance->pdev->device == PCI_DEVICE_ID_LSI_SAS0073SKINNY) ||
> @@ -2746,6 +2757,21 @@ struct device_attribute *megaraid_host_attrs[] = {
> NULL,
> };
>
> +static int megasas_change_queue_type(struct scsi_device *sdev, int tag_type)
> +{
> + if (sdev->host->bqt && sdev->tagged_supported) {
> + scsi_set_tag_type(sdev, tag_type);
> +
> + if (tag_type)
> + scsi_activate_tcq(sdev, sdev->queue_depth);
> + else
> + scsi_deactivate_tcq(sdev, sdev->queue_depth);
> + } else
> + tag_type = 0;
> +
> + return tag_type;
> +}
> +
> /*
> * Scsi host template for megaraid_sas driver
> */
> @@ -2756,6 +2782,7 @@ static struct scsi_host_template megasas_template = {
> .proc_name = "megaraid_sas",
> .slave_configure = megasas_slave_configure,
> .slave_alloc = megasas_slave_alloc,
> + .slave_destroy = megasas_slave_destroy,
> .queuecommand = megasas_queue_command,
> .eh_device_reset_handler = megasas_reset_device,
> .eh_bus_reset_handler = megasas_reset_bus_host,
> @@ -2765,6 +2792,7 @@ static struct scsi_host_template megasas_template = {
> .bios_param = megasas_bios_param,
> .use_clustering = ENABLE_CLUSTERING,
> .change_queue_depth = megasas_change_queue_depth,
> + .change_queue_type = megasas_change_queue_type,
> .no_write_same = 1,
> };
>
> @@ -4909,6 +4937,10 @@ static int megasas_io_attach(struct megasas_instance *instance)
> host->this_id = instance->init_id;
> host->sg_tablesize = instance->max_num_sge;
>
> + if (!scsi_init_shared_tag_map(host, host->can_queue))
> + printk(KERN_WARNING "megasas: enable TCQ support, depth %d",
> + host->can_queue);
> +
> if (instance->fw_support_ieee)
> instance->max_sectors_per_req = MEGASAS_MAX_SECTORS_IEEE;
>
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-09-29 16:52 ` Webb Scales
@ 2014-09-29 17:08 ` James Bottomley
0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2014-09-29 17:08 UTC (permalink / raw)
To: webbnh@hp.com; +Cc: linux-scsi@vger.kernel.org, hch@infradead.org, hare@suse.de
On Mon, 2014-09-29 at 12:52 -0400, Webb Scales wrote:
> [Hannes, James, Christoph: sorry for the extra copies -- my mail client
> didn't automatically convert to plain-text for the list, so it bounced.]
>
> Could someone enlighten me on when and where a driver might want to or
> be required to set the "tagged_supported" flag?
>
> A quick grep yields a number of places where the flag value is set or
> cleared.
>
> This looks like the mechanism which is supposed to set the flag (e.g.,
> if the device/LUN characteristics suggest that it's appropriate), and so
> I'm not sure when/why the driver would want to do it explicitly:
>
> ./drivers/scsi/scsi_scan.c: sdev->tagged_supported = 1; [in scsi_add_lun()]
That's set based on device capabilities.
> These look like attempts to override the SCSI mid-layer:
>
> ./drivers/scsi/fnic/fnic_main.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
> ./drivers/scsi/stex.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
> ./drivers/scsi/stex.c: sdev->tagged_supported = 1; [in '_slave_configure()]
> ./drivers/scsi/qla4xxx/ql4_os.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
> ./drivers/scsi/qla4xxx/ql4_os.c: sdev->tagged_supported = 1; [in '_slave_configure()]
> ./drivers/scsi/ufs/ufshcd.c: sdev->tagged_supported = 1; [in '_slave_alloc()]
> ./drivers/scsi/scsi_debug.c: sdp->tagged_supported = 1; [in '_slave_configure()]
Slave configure is allowed to override this, so they all have reasons I
would suspect (qla is because tape is untagged, but qla still needs a
tag for the internal queue, for instance).
>
> However, it strikes me that it's unnecessary to set the flag in /both/
> '_slave_alloc() and '_slave_configure() -- I think that either one
> should work, and '_slave_configure() seems like the appropriate choice.
> (But, I'm left wondering why these drivers need to override the SCSI
> ML's choice....)
Stex is the only one that does this; it's probably just a make sure
thing. You might want to set in slave alloc if you need a tagged
Inquiry command.
> These are interesting, but not pertinent to the discussion at hand (they
> seem to be fall-backs triggered during error recovery):
>
> ./drivers/scsi/atari_NCR5380.c: cmd->device->tagged_supported = 0; [error recovery?]
> ./drivers/scsi/53c700.c: SCp->device->tagged_supported = 0; [error recovery?]
> ./drivers/scsi/sun3_NCR5380.c: cmd->device->tagged_supported = 0; [error recovery?]
SPI allows the device to reject the tag enabling messages ... that's
what this lot are all doing.
> And, this one looks spurious (that is, I believe it's clearing the flag
> when it's already clear):
>
> ./drivers/scsi/libsas/sas_scsi_host.c: scsi_dev->tagged_supported = 0; [looks spurious]
Pointless but harmless.
> So, taking the patch below, for example, why is tagged_supported being
> set in /both/ the '_slave_alloc() and the '_slave_configure() routines,
> and, in fact, why is it being set at all -- why doesn't the SCSI
> mid-layer set it correctly in this case?
Because things like the qla have to have tagged inquiry because it's a
shared host tag map used to label the issue queue. Setting it twice is
overkill because we don't actually turn it off, but it could be
construed as prudent in case we did (we could make tagged_supported
exactly reflect the state of the CmdQue bit for instance).
James
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-09-29 11:47 [PATCH] megaraid_sas: Enable shared tag map Hannes Reinecke
2014-09-29 16:52 ` Webb Scales
@ 2014-09-30 7:43 ` Christoph Hellwig
2014-09-30 9:08 ` Hannes Reinecke
2014-10-01 18:51 ` Webb Scales
2 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2014-09-30 7:43 UTC (permalink / raw)
To: Hannes Reinecke
Cc: James Bottomley, Christoph Hellwig, linux-scsi, Sumit Saxena
On Mon, Sep 29, 2014 at 01:47:52PM +0200, Hannes Reinecke wrote:
> Megaraid_sas uses a shared pool of commands per HBA, so we
> should be enabling a shared tag map.
> This will allow the I/O scheduler to make better scheduling
> decisions and will avoid BUSY states in the driver.
What exact problem did you see? Do you have a link to a bugzilla entry
or similar? Was this observed on real hardware or your qemu emulation?
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/scsi/megaraid/megaraid_sas_base.c | 34 ++++++++++++++++++++++++++++++-
> 1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index f6a69a3..996fa9a 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -1659,7 +1659,7 @@ static int megasas_slave_configure(struct scsi_device *sdev)
> */
> blk_queue_rq_timeout(sdev->request_queue,
> MEGASAS_DEFAULT_CMD_TIMEOUT * HZ);
> -
> + sdev->tagged_supported = 1;
> + /* We have a shared tag map, so TCQ is always supported */
> + sdev->tagged_supported = 1;
> +
Why doesn't the device return the proper data in the INQUIRY response?
And more importantly why do you want to do this twice?
> instance = megasas_lookup_instance(sdev->host->host_no);
> if (sdev->channel < MEGASAS_MAX_PD_CHANNELS) {
> /*
> @@ -1677,13 +1681,20 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
> sdev->id;
> if (instance->pd_list[pd_index].driveState ==
> MR_PD_STATE_SYSTEM) {
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
> }
> return -ENXIO;
> }
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
Please refactor the code so that the first case falls through to the
second, something like:
if (instance->pd_list[pd_index].driveState !=
MR_PD_STATE_SYSTEM)
return -ENXIO;
}
scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
return 0;
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-09-30 7:43 ` Christoph Hellwig
@ 2014-09-30 9:08 ` Hannes Reinecke
0 siblings, 0 replies; 16+ messages in thread
From: Hannes Reinecke @ 2014-09-30 9:08 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: James Bottomley, linux-scsi, Sumit Saxena
On 09/30/2014 09:43 AM, Christoph Hellwig wrote:
> On Mon, Sep 29, 2014 at 01:47:52PM +0200, Hannes Reinecke wrote:
>> Megaraid_sas uses a shared pool of commands per HBA, so we
>> should be enabling a shared tag map.
>> This will allow the I/O scheduler to make better scheduling
>> decisions and will avoid BUSY states in the driver.
>
> What exact problem did you see? Do you have a link to a bugzilla entry
> or similar? Was this observed on real hardware or your qemu emulation?
>
Well, _actually_ I just did it so that I could get the tag number
for my printk patchset :-)
But there is an underlying reason:
megaraid_sas uses an internal frame pool of a fixed size.
Once that's exhausted it cannot queue any more commands, and need to
return busy.
But as this is a per-host command pool all LUNs have to shared the
same frame pool, and the driver uses heuristics (ie 1/4 of the pool
size) to set cmd_per_lun. So if you have more than 4 LUNs you'll run
into issues as the pool can become exhausted.
Hence I thought it would be good to expose this to the block layer
so that we can avoid BUSY states in the driver.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>> drivers/scsi/megaraid/megaraid_sas_base.c | 34 ++++++++++++++++++++++++++++++-
>> 1 file changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index f6a69a3..996fa9a 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -1659,7 +1659,7 @@ static int megasas_slave_configure(struct scsi_device *sdev)
>> */
>> blk_queue_rq_timeout(sdev->request_queue,
>> MEGASAS_DEFAULT_CMD_TIMEOUT * HZ);
>> -
>> + sdev->tagged_supported = 1;
>
>> + /* We have a shared tag map, so TCQ is always supported */
>> + sdev->tagged_supported = 1;
>> +
>
> Why doesn't the device return the proper data in the INQUIRY response?
>
> And more importantly why do you want to do this twice?
>
Bah, Typo.
>> instance = megasas_lookup_instance(sdev->host->host_no);
>> if (sdev->channel < MEGASAS_MAX_PD_CHANNELS) {
>> /*
>> @@ -1677,13 +1681,20 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
>> sdev->id;
>> if (instance->pd_list[pd_index].driveState ==
>> MR_PD_STATE_SYSTEM) {
>> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
>> return 0;
>> }
>> return -ENXIO;
>> }
>> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
>> return 0;
>
> Please refactor the code so that the first case falls through to the
> second, something like:
>
>
> if (instance->pd_list[pd_index].driveState !=
> MR_PD_STATE_SYSTEM)
> return -ENXIO;
> }
>
> scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
>
Okay, will be doing it.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-09-29 11:47 [PATCH] megaraid_sas: Enable shared tag map Hannes Reinecke
2014-09-29 16:52 ` Webb Scales
2014-09-30 7:43 ` Christoph Hellwig
@ 2014-10-01 18:51 ` Webb Scales
2014-10-01 21:10 ` Christoph Hellwig
2 siblings, 1 reply; 16+ messages in thread
From: Webb Scales @ 2014-10-01 18:51 UTC (permalink / raw)
To: Hannes Reinecke, James Bottomley; +Cc: Christoph Hellwig, linux-scsi
Hannes,
In megasas_change_queue_type(), is it possible for sdev->queue_depth to
be greater than 256?
Unless I'm misunderstanding the SCSI code, we can request a
queue-depth/tag-map-size which is greater than 256, but, since the
scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
overflow the field when high-numbered tags are used...do I have that right?
Thanks,
Webb
On 9/29/14 7:47 AM, Hannes Reinecke wrote:
> Megaraid_sas uses a shared pool of commands per HBA, so we
> should be enabling a shared tag map.
> This will allow the I/O scheduler to make better scheduling
> decisions and will avoid BUSY states in the driver.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/scsi/megaraid/megaraid_sas_base.c | 34 ++++++++++++++++++++++++++++++-
> 1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index f6a69a3..996fa9a 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -1659,7 +1659,7 @@ static int megasas_slave_configure(struct scsi_device *sdev)
> */
> blk_queue_rq_timeout(sdev->request_queue,
> MEGASAS_DEFAULT_CMD_TIMEOUT * HZ);
> -
> + sdev->tagged_supported = 1;
> return 0;
> }
>
> @@ -1667,6 +1667,10 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
> {
> u16 pd_index = 0;
> struct megasas_instance *instance ;
> +
> + /* We have a shared tag map, so TCQ is always supported */
> + sdev->tagged_supported = 1;
> +
> instance = megasas_lookup_instance(sdev->host->host_no);
> if (sdev->channel < MEGASAS_MAX_PD_CHANNELS) {
> /*
> @@ -1677,13 +1681,20 @@ static int megasas_slave_alloc(struct scsi_device *sdev)
> sdev->id;
> if (instance->pd_list[pd_index].driveState ==
> MR_PD_STATE_SYSTEM) {
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
> }
> return -ENXIO;
> }
> + scsi_activate_tcq(sdev, MEGASAS_DEFAULT_CMD_PER_LUN);
> return 0;
> }
>
> +static void megasas_slave_destroy(struct scsi_device *sdev)
> +{
> + scsi_deactivate_tcq(sdev, 1);
> +}
> +
> void megaraid_sas_kill_hba(struct megasas_instance *instance)
> {
> if ((instance->pdev->device == PCI_DEVICE_ID_LSI_SAS0073SKINNY) ||
> @@ -2746,6 +2757,21 @@ struct device_attribute *megaraid_host_attrs[] = {
> NULL,
> };
>
> +static int megasas_change_queue_type(struct scsi_device *sdev, int tag_type)
> +{
> + if (sdev->host->bqt && sdev->tagged_supported) {
> + scsi_set_tag_type(sdev, tag_type);
> +
> + if (tag_type)
> + scsi_activate_tcq(sdev, sdev->queue_depth);
> + else
> + scsi_deactivate_tcq(sdev, sdev->queue_depth);
> + } else
> + tag_type = 0;
> +
> + return tag_type;
> +}
> +
> /*
> * Scsi host template for megaraid_sas driver
> */
> @@ -2756,6 +2782,7 @@ static struct scsi_host_template megasas_template = {
> .proc_name = "megaraid_sas",
> .slave_configure = megasas_slave_configure,
> .slave_alloc = megasas_slave_alloc,
> + .slave_destroy = megasas_slave_destroy,
> .queuecommand = megasas_queue_command,
> .eh_device_reset_handler = megasas_reset_device,
> .eh_bus_reset_handler = megasas_reset_bus_host,
> @@ -2765,6 +2792,7 @@ static struct scsi_host_template megasas_template = {
> .bios_param = megasas_bios_param,
> .use_clustering = ENABLE_CLUSTERING,
> .change_queue_depth = megasas_change_queue_depth,
> + .change_queue_type = megasas_change_queue_type,
> .no_write_same = 1,
> };
>
> @@ -4909,6 +4937,10 @@ static int megasas_io_attach(struct megasas_instance *instance)
> host->this_id = instance->init_id;
> host->sg_tablesize = instance->max_num_sge;
>
> + if (!scsi_init_shared_tag_map(host, host->can_queue))
> + printk(KERN_WARNING "megasas: enable TCQ support, depth %d",
> + host->can_queue);
> +
> if (instance->fw_support_ieee)
> instance->max_sectors_per_req = MEGASAS_MAX_SECTORS_IEEE;
>
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-01 18:51 ` Webb Scales
@ 2014-10-01 21:10 ` Christoph Hellwig
2014-10-02 6:51 ` Hannes Reinecke
0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2014-10-01 21:10 UTC (permalink / raw)
To: Webb Scales
Cc: Hannes Reinecke, James Bottomley, Christoph Hellwig, linux-scsi
On Wed, Oct 01, 2014 at 02:51:31PM -0400, Webb Scales wrote:
> Hannes,
>
> In megasas_change_queue_type(), is it possible for sdev->queue_depth to be
> greater than 256?
>
> Unless I'm misunderstanding the SCSI code, we can request a
> queue-depth/tag-map-size which is greater than 256, but, since the
> scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
> overflow the field when high-numbered tags are used...do I have that right?
Yes, we really need to increase the size of the tag field. SAM allows
a transport specific limit of up to 64 _bytes_ for it, although I don't
know implementation that large. Given that the block layer can generate
up to 32-bit tags both for the old blk-tag.c code and the new
blk-mq-tag.c version it would be good to use a u32 there. Can you send
me a patch?
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-01 21:10 ` Christoph Hellwig
@ 2014-10-02 6:51 ` Hannes Reinecke
2014-10-02 9:19 ` Christoph Hellwig
2014-10-02 12:05 ` James Bottomley
0 siblings, 2 replies; 16+ messages in thread
From: Hannes Reinecke @ 2014-10-02 6:51 UTC (permalink / raw)
To: Christoph Hellwig, Webb Scales; +Cc: James Bottomley, linux-scsi
On 10/01/2014 11:10 PM, Christoph Hellwig wrote:
> On Wed, Oct 01, 2014 at 02:51:31PM -0400, Webb Scales wrote:
>> Hannes,
>>
>> In megasas_change_queue_type(), is it possible for sdev->queue_depth to be
>> greater than 256?
>>
>> Unless I'm misunderstanding the SCSI code, we can request a
>> queue-depth/tag-map-size which is greater than 256, but, since the
>> scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
>> overflow the field when high-numbered tags are used...do I have that right?
>
> Yes, we really need to increase the size of the tag field. SAM allows
> a transport specific limit of up to 64 _bytes_ for it, although I don't
> know implementation that large. Given that the block layer can generate
> up to 32-bit tags both for the old blk-tag.c code and the new
> blk-mq-tag.c version it would be good to use a u32 there. Can you send
> me a patch?
>
Weeelll ...
I'm afraid it's not _that_ easy.
SCSI-II tagged queueing has some specific tag values:
#define SIMPLE_QUEUE_TAG 0x20
#define HEAD_OF_QUEUE_TAG 0x21
#define ORDERED_QUEUE_TAG 0x22
drivers/scsi/vmw_pvscsi.c:
e->tag = SIMPLE_QUEUE_TAG;
if (sdev->tagged_supported &&
(cmd->tag == HEAD_OF_QUEUE_TAG ||
cmd->tag == ORDERED_QUEUE_TAG))
e->tag = cmd->tag;
So we cannot translate between them easily.
Unless we move the SCSI-II values to negative and use the positive
values for 'real' tag numbers.
The recommendation here is to use 'scmd->request->tag' whenever
you want to get to the tag number, and 'scmd->tag' if you have to
play around with SCSI-II TCQ.
But if not I would strongly advise to leave 'scmd->tag' alone.
Speaking of which:
We have
drivers/scsi/scsi_lib.c:scsi_get_cmd_from_request()
/* pull a tag out of the request if we have one */
cmd->tag = req->tag;
cmd->request = req;
What would happen if the block layer decides to use tag number 0x22?
Wouldn't that completely bugger up the TCQ logic?
Why do we need to pull the request tag into the scmd tag, anyway?
I'd rather leave that alone, so that the SCSI midlayer can insert
whatever magic it'll decide to do here ...
James?
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 6:51 ` Hannes Reinecke
@ 2014-10-02 9:19 ` Christoph Hellwig
2014-10-02 9:50 ` Hannes Reinecke
2014-10-02 12:05 ` James Bottomley
1 sibling, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2014-10-02 9:19 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Webb Scales, James Bottomley, linux-scsi
On Thu, Oct 02, 2014 at 08:51:20AM +0200, Hannes Reinecke wrote:
>
> I'm afraid it's not _that_ easy.
> SCSI-II tagged queueing has some specific tag values:
>
> #define SIMPLE_QUEUE_TAG 0x20
> #define HEAD_OF_QUEUE_TAG 0x21
> #define ORDERED_QUEUE_TAG 0x22
These are not tag values. These are message codes set in the first byte
of the Queue tag message, the second byte is the actual tag.
> The recommendation here is to use 'scmd->request->tag' whenever
> you want to get to the tag number, and 'scmd->tag' if you have to
> play around with SCSI-II TCQ.
> But if not I would strongly advise to leave 'scmd->tag' alone.
Or kill off scmd->tag.. Let's see how feasible that is.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 9:19 ` Christoph Hellwig
@ 2014-10-02 9:50 ` Hannes Reinecke
2014-10-02 11:36 ` Christoph Hellwig
0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2014-10-02 9:50 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Webb Scales, James Bottomley, linux-scsi
On 10/02/2014 11:19 AM, Christoph Hellwig wrote:
> On Thu, Oct 02, 2014 at 08:51:20AM +0200, Hannes Reinecke wrote:
>>
>> I'm afraid it's not _that_ easy.
>> SCSI-II tagged queueing has some specific tag values:
>>
>> #define SIMPLE_QUEUE_TAG 0x20
>> #define HEAD_OF_QUEUE_TAG 0x21
>> #define ORDERED_QUEUE_TAG 0x22
>
> These are not tag values. These are message codes set in the first byte
> of the Queue tag message, the second byte is the actual tag.
>
>> The recommendation here is to use 'scmd->request->tag' whenever
>> you want to get to the tag number, and 'scmd->tag' if you have to
>> play around with SCSI-II TCQ.
>> But if not I would strongly advise to leave 'scmd->tag' alone.
>
> Or kill off scmd->tag.. Let's see how feasible that is.
>
I'm about to.
(See my last two patches).
There are now two instances left:
NCR5380 (and derived LLDDs) and fnic.
For some weird reasons fnic decided to duplicate blk-tag
functionality. Looking into it.
And NCR5380 tag support can be safely ignored, it never worked anyway.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 9:50 ` Hannes Reinecke
@ 2014-10-02 11:36 ` Christoph Hellwig
2014-10-02 12:00 ` Hannes Reinecke
0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2014-10-02 11:36 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Webb Scales, James Bottomley, linux-scsi
On Thu, Oct 02, 2014 at 11:50:42AM +0200, Hannes Reinecke wrote:
> NCR5380 (and derived LLDDs) and fnic.
fnic needs a tag for a device reset, which we don't provide if it
is issue by ioctl. This is also showed up during the blk-mq work.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 11:36 ` Christoph Hellwig
@ 2014-10-02 12:00 ` Hannes Reinecke
2014-10-02 14:32 ` Christoph Hellwig
0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2014-10-02 12:00 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Webb Scales, James Bottomley, linux-scsi
On 10/02/2014 01:36 PM, Christoph Hellwig wrote:
> On Thu, Oct 02, 2014 at 11:50:42AM +0200, Hannes Reinecke wrote:
>> NCR5380 (and derived LLDDs) and fnic.
>
> fnic needs a tag for a device reset, which we don't provide if it
> is issue by ioctl. This is also showed up during the blk-mq work.
>
Actually, I've hit a similar issue for megaraid_sas (and I guess
others which implement a host-wide tag map might suffer here, too):
tags need to be used even for internal commands.
But tag allocation is done exclusively in the block layer, so one
cannot readily influence this.
(Or, if you do, you'll end up with code duplication like the fnic
driver).
Thing is, for internal commands we typically do not _need_ a fully
formed request, we just need to tag number.
So we could try to implement a function like 'blk_tag_reserve'
to return the next free tag.
But even with that we'd be hitting a tag starvation issue; when
all command tags are in use we cannot send internal commands.
And if the command abort happens to be modelled with an internal
command, we cannot even abort commands anymore.
Which is probably _not_ what we want.
(And I think fnic will suffer from this, too ...)
So I guess we need to set aside a reserved tag pool from which
the internal commands will be allocated.
(Bit like the emergency network skb pool).
With that the tag starvation issue will be resolved, and
we should be able to remove the custom code in fnic.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 12:00 ` Hannes Reinecke
@ 2014-10-02 14:32 ` Christoph Hellwig
0 siblings, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2014-10-02 14:32 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Webb Scales, James Bottomley, linux-scsi
On Thu, Oct 02, 2014 at 02:00:32PM +0200, Hannes Reinecke wrote:
> Actually, I've hit a similar issue for megaraid_sas (and I guess
> others which implement a host-wide tag map might suffer here, too):
> tags need to be used even for internal commands.
> But tag allocation is done exclusively in the block layer, so one
> cannot readily influence this.
> (Or, if you do, you'll end up with code duplication like the fnic
> driver).
> Thing is, for internal commands we typically do not _need_ a fully
> formed request, we just need to tag number.
blk-mq allows a driver to reserve tags, and allocate them using
blk_mq_alloc_request with the reserved flag. Take a look at the
mtip32xx driver, which is using that feature.
The lockless hpsa series has a patch to expose the nr of reserved
tags to scsi, but there is no equivalent in the old blk code yet.
Another issue is that we only setup the tag allocator at scsi_add_host
time, and to fully support drivers that needs tags during initialization
we need move it to scsi_host_alloc time, which doesn't sound like a
major problem.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 6:51 ` Hannes Reinecke
2014-10-02 9:19 ` Christoph Hellwig
@ 2014-10-02 12:05 ` James Bottomley
2014-10-02 12:14 ` Hannes Reinecke
1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2014-10-02 12:05 UTC (permalink / raw)
To: hare@suse.de; +Cc: linux-scsi@vger.kernel.org, hch@infradead.org, webbnh@hp.com
On Thu, 2014-10-02 at 08:51 +0200, Hannes Reinecke wrote:
> On 10/01/2014 11:10 PM, Christoph Hellwig wrote:
> > On Wed, Oct 01, 2014 at 02:51:31PM -0400, Webb Scales wrote:
> >> Hannes,
> >>
> >> In megasas_change_queue_type(), is it possible for sdev->queue_depth to be
> >> greater than 256?
> >>
> >> Unless I'm misunderstanding the SCSI code, we can request a
> >> queue-depth/tag-map-size which is greater than 256, but, since the
> >> scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
> >> overflow the field when high-numbered tags are used...do I have that right?
> >
> > Yes, we really need to increase the size of the tag field. SAM allows
> > a transport specific limit of up to 64 _bytes_ for it, although I don't
> > know implementation that large. Given that the block layer can generate
> > up to 32-bit tags both for the old blk-tag.c code and the new
> > blk-mq-tag.c version it would be good to use a u32 there. Can you send
> > me a patch?
> >
> Weeelll ...
>
> I'm afraid it's not _that_ easy.
> SCSI-II tagged queueing has some specific tag values:
>
> #define SIMPLE_QUEUE_TAG 0x20
> #define HEAD_OF_QUEUE_TAG 0x21
> #define ORDERED_QUEUE_TAG 0x22
>
> drivers/scsi/vmw_pvscsi.c:
> e->tag = SIMPLE_QUEUE_TAG;
> if (sdev->tagged_supported &&
> (cmd->tag == HEAD_OF_QUEUE_TAG ||
> cmd->tag == ORDERED_QUEUE_TAG))
> e->tag = cmd->tag;
A SCSI-2 tag is a SPI two byte message. The first byte is the message
type. The values you have above identify the message type for simple,
ordered and head of queue tags. The *second* byte is the tag value.
See page 55 and 56 of the SCSI-2 standard. There's no connection (or
shouldn't be) between the message type and the tag value, so it looks
like a bug in the pvscsi driver.
James
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 12:05 ` James Bottomley
@ 2014-10-02 12:14 ` Hannes Reinecke
2014-10-02 13:06 ` James Bottomley
0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2014-10-02 12:14 UTC (permalink / raw)
To: James Bottomley
Cc: linux-scsi@vger.kernel.org, hch@infradead.org, webbnh@hp.com
On 10/02/2014 02:05 PM, James Bottomley wrote:
> On Thu, 2014-10-02 at 08:51 +0200, Hannes Reinecke wrote:
>> On 10/01/2014 11:10 PM, Christoph Hellwig wrote:
>>> On Wed, Oct 01, 2014 at 02:51:31PM -0400, Webb Scales wrote:
>>>> Hannes,
>>>>
>>>> In megasas_change_queue_type(), is it possible for sdev->queue_depth to be
>>>> greater than 256?
>>>>
>>>> Unless I'm misunderstanding the SCSI code, we can request a
>>>> queue-depth/tag-map-size which is greater than 256, but, since the
>>>> scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
>>>> overflow the field when high-numbered tags are used...do I have that right?
>>>
>>> Yes, we really need to increase the size of the tag field. SAM allows
>>> a transport specific limit of up to 64 _bytes_ for it, although I don't
>>> know implementation that large. Given that the block layer can generate
>>> up to 32-bit tags both for the old blk-tag.c code and the new
>>> blk-mq-tag.c version it would be good to use a u32 there. Can you send
>>> me a patch?
>>>
>> Weeelll ...
>>
>> I'm afraid it's not _that_ easy.
>> SCSI-II tagged queueing has some specific tag values:
>>
>> #define SIMPLE_QUEUE_TAG 0x20
>> #define HEAD_OF_QUEUE_TAG 0x21
>> #define ORDERED_QUEUE_TAG 0x22
>>
>> drivers/scsi/vmw_pvscsi.c:
>> e->tag = SIMPLE_QUEUE_TAG;
>> if (sdev->tagged_supported &&
>> (cmd->tag == HEAD_OF_QUEUE_TAG ||
>> cmd->tag == ORDERED_QUEUE_TAG))
>> e->tag = cmd->tag;
>
> A SCSI-2 tag is a SPI two byte message. The first byte is the message
> type. The values you have above identify the message type for simple,
> ordered and head of queue tags. The *second* byte is the tag value.
>
> See page 55 and 56 of the SCSI-2 standard. There's no connection (or
> shouldn't be) between the message type and the tag value, so it looks
> like a bug in the pvscsi driver.
>
It is. I've already sent a patch.
But that just proves the scmd->tag is essentially a duplicate
and we should be removing it.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] megaraid_sas: Enable shared tag map
2014-10-02 12:14 ` Hannes Reinecke
@ 2014-10-02 13:06 ` James Bottomley
0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2014-10-02 13:06 UTC (permalink / raw)
To: hare@suse.de; +Cc: linux-scsi@vger.kernel.org, hch@infradead.org, webbnh@hp.com
On Thu, 2014-10-02 at 14:14 +0200, Hannes Reinecke wrote:
> On 10/02/2014 02:05 PM, James Bottomley wrote:
> > On Thu, 2014-10-02 at 08:51 +0200, Hannes Reinecke wrote:
> >> On 10/01/2014 11:10 PM, Christoph Hellwig wrote:
> >>> On Wed, Oct 01, 2014 at 02:51:31PM -0400, Webb Scales wrote:
> >>>> Hannes,
> >>>>
> >>>> In megasas_change_queue_type(), is it possible for sdev->queue_depth to be
> >>>> greater than 256?
> >>>>
> >>>> Unless I'm misunderstanding the SCSI code, we can request a
> >>>> queue-depth/tag-map-size which is greater than 256, but, since the
> >>>> scsi_cmnd::tag field is an unsigned char, depths greater than 256 may
> >>>> overflow the field when high-numbered tags are used...do I have that right?
> >>>
> >>> Yes, we really need to increase the size of the tag field. SAM allows
> >>> a transport specific limit of up to 64 _bytes_ for it, although I don't
> >>> know implementation that large. Given that the block layer can generate
> >>> up to 32-bit tags both for the old blk-tag.c code and the new
> >>> blk-mq-tag.c version it would be good to use a u32 there. Can you send
> >>> me a patch?
> >>>
> >> Weeelll ...
> >>
> >> I'm afraid it's not _that_ easy.
> >> SCSI-II tagged queueing has some specific tag values:
> >>
> >> #define SIMPLE_QUEUE_TAG 0x20
> >> #define HEAD_OF_QUEUE_TAG 0x21
> >> #define ORDERED_QUEUE_TAG 0x22
> >>
> >> drivers/scsi/vmw_pvscsi.c:
> >> e->tag = SIMPLE_QUEUE_TAG;
> >> if (sdev->tagged_supported &&
> >> (cmd->tag == HEAD_OF_QUEUE_TAG ||
> >> cmd->tag == ORDERED_QUEUE_TAG))
> >> e->tag = cmd->tag;
> >
> > A SCSI-2 tag is a SPI two byte message. The first byte is the message
> > type. The values you have above identify the message type for simple,
> > ordered and head of queue tags. The *second* byte is the tag value.
> >
> > See page 55 and 56 of the SCSI-2 standard. There's no connection (or
> > shouldn't be) between the message type and the tag value, so it looks
> > like a bug in the pvscsi driver.
> >
> It is. I've already sent a patch.
>
> But that just proves the scmd->tag is essentially a duplicate
> and we should be removing it.
Yes, it was the original tag field, which then got replaced by the block
one and the accessor commands. scsi_populate_tag_msg (which is really
an SPI construct) meant that drivers fully adopting the model didn't
even need to look at cmd->tag any more. I'm fine with dropping it in
favour of req->tag.
James
^ permalink raw reply [flat|nested] 16+ messages in thread