From: Benjamin Block <bblock@linux.ibm.com>
To: Daniel Wagner <dwagner@suse.de>
Cc: linux-scsi@vger.kernel.org,
GR-QLogic-Storage-Upstream@marvell.com,
linux-nvme@lists.infradead.org, Hannes Reinecke <hare@suse.de>,
Nilesh Javali <njavali@marvell.com>,
Arun Easi <aeasi@marvell.com>
Subject: Re: [RFC] qla2xxx: Add dev_loss_tmo kernel module options
Date: Tue, 20 Apr 2021 19:27:00 +0200
Message-ID: <YH8O5AaapQRg6Msq@t480-pf1aa2c2.linux.ibm.com>
In-Reply-To: <20210419100014.47144-1-dwagner@suse.de>

On Mon, Apr 19, 2021 at 12:00:14PM +0200, Daniel Wagner wrote:
> Allow setting the default dev_loss_tmo value as a kernel module option.
>
> Cc: Nilesh Javali <njavali@marvell.com>
> Cc: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> Hi,
>
> During array upgrade tests with NVMe/FC on systems equipped with QLogic
> HBAs we ran into a problem with the default setting of dev_loss_tmo.
>
> When the default timeout hit after 60 seconds the file system went
> into read only mode. The fix was to set the dev_loss_tmo to infinity
> (note this patch can't handle this).
>
> For lpfc devices we could use the sysfs interface under
> fc_remote_ports which exposed the dev_loss_tmo for SCSI and NVMe
> rports.
>
> The QLogic driver only exposes the rports via fc_remote_ports if SCSI
> is used. There is the debugfs interface to set the dev_loss_tmo, but
> it has two issues. First, it's not watched by udevd, hence no rules
> work. This could be worked around by setting it statically, but that
> is really only an option for testing. Second, even if the debugfs
> interface is used, there is a bug in the code: in
> qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo and
> the NVMe core will use its default value of 60 (this code path is
> exercised if the rport drops twice).
>
> Anyway, this patch is just to get the discussion going. Maybe the
> driver could implement the fc_remote_port interface? Hannes was
> pointing out it might make sense to think about a controller sysfs
> API as there is already a host and the NVMe protocol is all about host
> and controller.
>
> Thanks,
> Daniel
>
> drivers/scsi/qla2xxx/qla_attr.c | 4 ++--
> drivers/scsi/qla2xxx/qla_gbl.h | 1 +
> drivers/scsi/qla2xxx/qla_nvme.c | 2 +-
> drivers/scsi/qla2xxx/qla_os.c | 5 +++++
> 4 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
> index 3aa9869f6fae..0d2386ba65c0 100644
> --- a/drivers/scsi/qla2xxx/qla_attr.c
> +++ b/drivers/scsi/qla2xxx/qla_attr.c
> @@ -3036,7 +3036,7 @@ qla24xx_vport_create(struct fc_vport *fc_vport, bool disable)
> }
>
> /* initialize attributes */
> - fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
> + fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
> fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
> fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
> fc_host_supported_classes(vha->host) =
> @@ -3260,7 +3260,7 @@ qla2x00_init_host_attr(scsi_qla_host_t *vha)
> struct qla_hw_data *ha = vha->hw;
> u32 speeds = FC_PORTSPEED_UNKNOWN;
>
> - fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
> + fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
> fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
> fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
> fc_host_supported_classes(vha->host) = ha->base_qpair->enable_class_2 ?
> diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
> index fae5cae6f0a8..0b9c24475711 100644
> --- a/drivers/scsi/qla2xxx/qla_gbl.h
> +++ b/drivers/scsi/qla2xxx/qla_gbl.h
> @@ -178,6 +178,7 @@ extern int ql2xdifbundlinginternalbuffers;
> extern int ql2xfulldump_on_mpifail;
> extern int ql2xenforce_iocb_limit;
> extern int ql2xabts_wait_nvme;
> +extern int ql2xdev_loss_tmo;
>
> extern int qla2x00_loop_reset(scsi_qla_host_t *);
> extern void qla2x00_abort_all_cmds(scsi_qla_host_t *, int);
> diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
> index 0cacb667a88b..cdc5b5075407 100644
> --- a/drivers/scsi/qla2xxx/qla_nvme.c
> +++ b/drivers/scsi/qla2xxx/qla_nvme.c
> @@ -41,7 +41,7 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
> req.port_name = wwn_to_u64(fcport->port_name);
> req.node_name = wwn_to_u64(fcport->node_name);
> req.port_role = 0;
> - req.dev_loss_tmo = 0;
> + req.dev_loss_tmo = ql2xdev_loss_tmo;
>
> if (fcport->nvme_prli_service_param & NVME_PRLI_SP_INITIATOR)
> req.port_role = FC_PORT_ROLE_NVME_INITIATOR;
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d74c32f84ef5..c686522ff64e 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
> static int qla2xxx_map_queues(struct Scsi_Host *shost);
> static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
>
> +int ql2xdev_loss_tmo = 60;
> +module_param(ql2xdev_loss_tmo, int, 0444);
> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> + "Time to wait for device to recover before reporting\n"
> + "an error. Default is 60 seconds\n");
Wouldn't it be really confusing to set essentially the same thing with
two different knobs for one FC HBA? We already have a `dev_loss_tmo`
kernel parameter - granted, only for scsi_transport_fc; but doesn't qla
implement that as well?

I don't really have a horse in this race, but that sounds strange.
--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
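
The quoted cover letter says that passing 0 as dev_loss_tmo in
qla_nvme_register_remote() makes the NVMe core fall back to its default
of 60 seconds. A minimal userspace sketch of that "zero means use the
default" behaviour follows. It is a model only, not the kernel code:
`struct port_info`, `effective_dev_loss_tmo()`, `self_check()` and the
`DEFAULT_DEV_LOSS_TMO` macro are hypothetical names chosen for
illustration, and the value 60 is taken from the description above.

```c
#include <assert.h>

/* Assumed fallback; 60 seconds is the default mentioned in the thread. */
#define DEFAULT_DEV_LOSS_TMO 60u

/* Hypothetical stand-in for the registration request a driver fills in. */
struct port_info {
	unsigned int dev_loss_tmo;	/* seconds; 0 means "not specified" */
};

/* Model of the transport-side choice: 0 selects the built-in default. */
static unsigned int effective_dev_loss_tmo(const struct port_info *req)
{
	return req->dev_loss_tmo ? req->dev_loss_tmo : DEFAULT_DEV_LOSS_TMO;
}

/* Small self-check showing both cases of the model. */
static void self_check(void)
{
	struct port_info unset = { .dev_loss_tmo = 0 };
	struct port_info tuned = { .dev_loss_tmo = 300 };

	/* A request of 0 silently becomes the default. */
	assert(effective_dev_loss_tmo(&unset) == DEFAULT_DEV_LOSS_TMO);
	/* A non-zero request is used as given. */
	assert(effective_dev_loss_tmo(&tuned) == 300u);
}
```

Under this model, a driver that always registers with 0 can never carry
a user-chosen timeout through to the NVMe side, whichever knob the value
came from. That is the situation the patch tries to change by passing
ql2xdev_loss_tmo instead.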