LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v5 16/21] ibmvfc: register Sub-CRQ handles with VIOS during channel setup
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

If the ibmvfc client adapter requests channels it must submit a number
of Sub-CRQ handles matching the number of channels being requested. The
VIOS in its response will overwrite the actual number of channel
resources allocated which may be less than what was requested. The
client then must store the VIOS Sub-CRQ handle for each queue. This VIOS
handle is needed as a parameter with  h_send_sub_crq().

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 865b87881d86..578e27180f10 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -4627,15 +4627,35 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
 static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
 {
 	struct ibmvfc_host *vhost = evt->vhost;
+	struct ibmvfc_channel_setup *setup = vhost->channel_setup_buf;
+	struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
 	u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
 	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+	int flags, active_queues, i;
 
 	ibmvfc_free_event(evt);
 
 	switch (mad_status) {
 	case IBMVFC_MAD_SUCCESS:
 		ibmvfc_dbg(vhost, "Channel Setup succeded\n");
+		flags = be32_to_cpu(setup->flags);
 		vhost->do_enquiry = 0;
+		active_queues = be32_to_cpu(setup->num_scsi_subq_channels);
+		scrqs->active_queues = active_queues;
+
+		if (flags & IBMVFC_CHANNELS_CANCELED) {
+			ibmvfc_dbg(vhost, "Channels Canceled\n");
+			vhost->using_channels = 0;
+		} else {
+			if (active_queues)
+				vhost->using_channels = 1;
+			for (i = 0; i < active_queues; i++)
+				scrqs->scrqs[i].vios_cookie =
+					be64_to_cpu(setup->channel_handles[i]);
+
+			ibmvfc_dbg(vhost, "Using %u channels\n",
+				   vhost->scsi_scrqs.active_queues);
+		}
 		break;
 	case IBMVFC_MAD_FAILED:
 		level += ibmvfc_retry_host_init(vhost);
@@ -4659,9 +4679,19 @@ static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
 	struct ibmvfc_channel_setup_mad *mad;
 	struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
 	struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq);
+	struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs;
+	unsigned int num_channels =
+		min(vhost->client_scsi_channels, vhost->max_vios_scsi_channels);
+	int i;
 
 	memset(setup_buf, 0, sizeof(*setup_buf));
-	setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+	if (num_channels == 0)
+		setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+	else {
+		setup_buf->num_scsi_subq_channels = cpu_to_be32(num_channels);
+		for (i = 0; i < num_channels; i++)
+			setup_buf->channel_handles[i] = cpu_to_be64(scrqs->scrqs[i].cookie);
+	}
 
 	ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
 	mad = &evt->iu.channel_setup;
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 12/21] ibmvfc: implement channel enquiry and setup commands
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

New NPIV_ENQUIRY_CHANNEL and NPIV_SETUP_CHANNEL management datagrams
(MADs) were defined in a previous patchset. If the client advertises a
desire to use channels and the partner VIOS is channel capable then the
client must proceed with channel enquiry to determine the maximum number
of channels the VIOS is capable of providing, and registering SubCRQs
via channel setup with the VIOS immediately following NPIV Login. This
handshaking should not be performed for subsequent NPIV Logins unless
the CRQ connection has been reset.

Implement these two new MADs and issue them following a successful NPIV
login where the VIOS has set the SUPPORT_CHANNELS capability bit in the
NPIV Login response.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 135 ++++++++++++++++++++++++++++++++-
 drivers/scsi/ibmvscsi/ibmvfc.h |   3 +
 2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index d3d7c6b53d4f..a00f38558613 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -909,6 +909,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
 	spin_lock(vhost->crq.q_lock);
 	vhost->state = IBMVFC_NO_CRQ;
 	vhost->logged_in = 0;
+	vhost->do_enquiry = 1;
+	vhost->using_channels = 0;
 
 	/* Clean out the queue */
 	memset(crq->msgs.crq, 0, PAGE_SIZE);
@@ -4586,6 +4588,118 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost)
 		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
 }
 
+static void ibmvfc_channel_setup_done(struct ibmvfc_event *evt)
+{
+	struct ibmvfc_host *vhost = evt->vhost;
+	u32 mad_status = be16_to_cpu(evt->xfer_iu->channel_setup.common.status);
+	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+	ibmvfc_free_event(evt);
+
+	switch (mad_status) {
+	case IBMVFC_MAD_SUCCESS:
+		ibmvfc_dbg(vhost, "Channel Setup succeded\n");
+		vhost->do_enquiry = 0;
+		break;
+	case IBMVFC_MAD_FAILED:
+		level += ibmvfc_retry_host_init(vhost);
+		ibmvfc_log(vhost, level, "Channel Setup failed\n");
+		fallthrough;
+	case IBMVFC_MAD_DRIVER_FAILED:
+		return;
+	default:
+		dev_err(vhost->dev, "Invalid Channel Setup response: 0x%x\n",
+			mad_status);
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+		return;
+	}
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+	wake_up(&vhost->work_wait_q);
+}
+
+static void ibmvfc_channel_setup(struct ibmvfc_host *vhost)
+{
+	struct ibmvfc_channel_setup_mad *mad;
+	struct ibmvfc_channel_setup *setup_buf = vhost->channel_setup_buf;
+	struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq);
+
+	memset(setup_buf, 0, sizeof(*setup_buf));
+	setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS);
+
+	ibmvfc_init_event(evt, ibmvfc_channel_setup_done, IBMVFC_MAD_FORMAT);
+	mad = &evt->iu.channel_setup;
+	memset(mad, 0, sizeof(*mad));
+	mad->common.version = cpu_to_be32(1);
+	mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_SETUP);
+	mad->common.length = cpu_to_be16(sizeof(*mad));
+	mad->buffer.va = cpu_to_be64(vhost->channel_setup_dma);
+	mad->buffer.len = cpu_to_be32(sizeof(*vhost->channel_setup_buf));
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+	if (!ibmvfc_send_event(evt, vhost, default_timeout))
+		ibmvfc_dbg(vhost, "Sent channel setup\n");
+	else
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+}
+
+static void ibmvfc_channel_enquiry_done(struct ibmvfc_event *evt)
+{
+	struct ibmvfc_host *vhost = evt->vhost;
+	struct ibmvfc_channel_enquiry *rsp = &evt->xfer_iu->channel_enquiry;
+	u32 mad_status = be16_to_cpu(rsp->common.status);
+	int level = IBMVFC_DEFAULT_LOG_LEVEL;
+
+	switch (mad_status) {
+	case IBMVFC_MAD_SUCCESS:
+		ibmvfc_dbg(vhost, "Channel Enquiry succeeded\n");
+		vhost->max_vios_scsi_channels = be32_to_cpu(rsp->num_scsi_subq_channels);
+		ibmvfc_free_event(evt);
+		break;
+	case IBMVFC_MAD_FAILED:
+		level += ibmvfc_retry_host_init(vhost);
+		ibmvfc_log(vhost, level, "Channel Enquiry failed\n");
+		fallthrough;
+	case IBMVFC_MAD_DRIVER_FAILED:
+		ibmvfc_free_event(evt);
+		return;
+	default:
+		dev_err(vhost->dev, "Invalid Channel Enquiry response: 0x%x\n",
+			mad_status);
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+		ibmvfc_free_event(evt);
+		return;
+	}
+
+	ibmvfc_channel_setup(vhost);
+}
+
+static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost)
+{
+	struct ibmvfc_channel_enquiry *mad;
+	struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq);
+
+	ibmvfc_init_event(evt, ibmvfc_channel_enquiry_done, IBMVFC_MAD_FORMAT);
+	mad = &evt->iu.channel_enquiry;
+	memset(mad, 0, sizeof(*mad));
+	mad->common.version = cpu_to_be32(1);
+	mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_ENQUIRY);
+	mad->common.length = cpu_to_be16(sizeof(*mad));
+
+	if (IBMVFC_MIG_NO_SUB_TO_CRQ)
+		mad->flags |= cpu_to_be32(IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT);
+	if (IBMVFC_MIG_NO_N_TO_M)
+		mad->flags |= cpu_to_be32(IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT);
+
+	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
+
+	if (!ibmvfc_send_event(evt, vhost, default_timeout))
+		ibmvfc_dbg(vhost, "Send channel enquiry\n");
+	else
+		ibmvfc_link_down(vhost, IBMVFC_LINK_DEAD);
+}
+
 /**
  * ibmvfc_npiv_login_done - Completion handler for NPIV Login
  * @evt:	ibmvfc event struct
@@ -4667,8 +4781,14 @@ static void ibmvfc_npiv_login_done(struct ibmvfc_event *evt)
 
 	vhost->host->can_queue = be32_to_cpu(rsp->max_cmds) - IBMVFC_NUM_INTERNAL_REQ;
 	vhost->host->max_sectors = npiv_max_sectors;
-	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
-	wake_up(&vhost->work_wait_q);
+
+	if (ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPORT_CHANNELS) && vhost->do_enquiry) {
+		ibmvfc_channel_enquiry(vhost);
+	} else {
+		vhost->do_enquiry = 0;
+		ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_QUERY);
+		wake_up(&vhost->work_wait_q);
+	}
 }
 
 /**
@@ -5477,9 +5597,20 @@ static int ibmvfc_alloc_mem(struct ibmvfc_host *vhost)
 		goto free_trace;
 	}
 
+	vhost->channel_setup_buf = dma_alloc_coherent(dev, sizeof(*vhost->channel_setup_buf),
+						      &vhost->channel_setup_dma,
+						      GFP_KERNEL);
+
+	if (!vhost->channel_setup_buf) {
+		dev_err(dev, "Couldn't allocate Channel Setup buffer\n");
+		goto free_tgt_pool;
+	}
+
 	LEAVE;
 	return 0;
 
+free_tgt_pool:
+	mempool_destroy(vhost->tgt_pool);
 free_trace:
 	kfree(vhost->trace);
 free_disc_buffer:
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index bdafe9956649..3d76cd3c1fd9 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -854,10 +854,13 @@ struct ibmvfc_host {
 	struct ibmvfc_npiv_login login_info;
 	union ibmvfc_npiv_login_data *login_buf;
 	dma_addr_t login_buf_dma;
+	struct ibmvfc_channel_setup *channel_setup_buf;
+	dma_addr_t channel_setup_dma;
 	int disc_buf_sz;
 	int log_level;
 	struct ibmvfc_discover_targets_entry *disc_buf;
 	struct mutex passthru_mutex;
+	int max_vios_scsi_channels;
 	int task_set;
 	int init_retries;
 	int discovery_threads;
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 13/21] ibmvfc: advertise client support for using hardware channels
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Previous patches have plumbed the necessary Sub-CRQ interface and
channel negotiation MADs to fully channelize via hardware backed queues.

Advertise client support via NPIV Login capability
IBMVFC_CAN_USE_CHANNELS when the client bits have MQ enabled via
vhost->mq_enabled, or when channels were already in use during a
subsequent NPIV Login. The later is required because channel support is
only renegotiated after a CRQ pair is broken. Simple NPIV Logout/Logins
require the client to continue to advertise the channel capability until
the CRQ pair between the client is broken.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index a00f38558613..0653d52d4ea0 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1410,6 +1410,10 @@ static void ibmvfc_set_login_info(struct ibmvfc_host *vhost)

 	login_info->max_cmds = cpu_to_be32(max_requests + IBMVFC_NUM_INTERNAL_REQ);
 	login_info->capabilities = cpu_to_be64(IBMVFC_CAN_MIGRATE | IBMVFC_CAN_SEND_VF_WWPN);
+
+	if (vhost->mq_enabled || vhost->using_channels)
+		login_info->capabilities |= cpu_to_be64(IBMVFC_CAN_USE_CHANNELS);
+
 	login_info->async.va = cpu_to_be64(vhost->async_crq.msg_token);
 	login_info->async.len = cpu_to_be32(async_crq->size *
 					    sizeof(*async_crq->msgs.async));
-- 
2.27.0

^ permalink raw reply related

* [PATCH v5 15/21] ibmvfc: send commands down HW Sub-CRQ when channelized
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

When the client has negotiated the use of channels all vfcFrames are
required to go down a Sub-CRQ channel or it is a protocoal violation. If
the adapter state is channelized submit vfcFrames to the appropriate
Sub-CRQ via the h_send_sub_crq() helper.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 39 ++++++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 3f3cc37a263f..865b87881d86 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -704,6 +704,15 @@ static int ibmvfc_send_crq(struct ibmvfc_host *vhost, u64 word1, u64 word2)
 	return plpar_hcall_norets(H_SEND_CRQ, vdev->unit_address, word1, word2);
 }
 
+static int ibmvfc_send_sub_crq(struct ibmvfc_host *vhost, u64 cookie, u64 word1,
+			       u64 word2, u64 word3, u64 word4)
+{
+	struct vio_dev *vdev = to_vio_dev(vhost->dev);
+
+	return plpar_hcall_norets(H_SEND_SUB_CRQ, vdev->unit_address, cookie,
+				  word1, word2, word3, word4);
+}
+
 /**
  * ibmvfc_send_crq_init - Send a CRQ init message
  * @vhost:	ibmvfc host struct
@@ -1623,8 +1632,17 @@ static int ibmvfc_send_event(struct ibmvfc_event *evt,
 
 	mb();
 
-	if ((rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
-				  be64_to_cpu(crq_as_u64[1])))) {
+	if (evt->queue->fmt == IBMVFC_SUB_CRQ_FMT)
+		rc = ibmvfc_send_sub_crq(vhost,
+					 evt->queue->vios_cookie,
+					 be64_to_cpu(crq_as_u64[0]),
+					 be64_to_cpu(crq_as_u64[1]),
+					 0, 0);
+	else
+		rc = ibmvfc_send_crq(vhost, be64_to_cpu(crq_as_u64[0]),
+				     be64_to_cpu(crq_as_u64[1]));
+
+	if (rc) {
 		list_del(&evt->queue_list);
 		spin_unlock_irqrestore(&evt->queue->l_lock, flags);
 		del_timer(&evt->timer);
@@ -1842,6 +1860,7 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	struct ibmvfc_event *evt;
 	u32 tag_and_hwq = blk_mq_unique_tag(cmnd->request);
 	u16 hwq = blk_mq_unique_tag_to_hwq(tag_and_hwq);
+	u16 scsi_channel;
 	int rc;
 
 	if (unlikely((rc = fc_remote_port_chkready(rport))) ||
@@ -1852,7 +1871,13 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	}
 
 	cmnd->result = (DID_OK << 16);
-	evt = ibmvfc_get_event(&vhost->crq);
+	if (vhost->using_channels) {
+		scsi_channel = hwq % vhost->scsi_scrqs.active_queues;
+		evt = ibmvfc_get_event(&vhost->scsi_scrqs.scrqs[scsi_channel]);
+		evt->hwq = hwq % vhost->scsi_scrqs.active_queues;
+	} else
+		evt = ibmvfc_get_event(&vhost->crq);
+
 	ibmvfc_init_event(evt, ibmvfc_scsi_done, IBMVFC_CMD_FORMAT);
 	evt->cmnd = cmnd;
 
@@ -1868,8 +1893,6 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	}
 
 	vfc_cmd->correlation = cpu_to_be64((u64)evt);
-	if (vhost->using_channels)
-		evt->hwq = hwq % vhost->scsi_scrqs.active_queues;
 
 	if (likely(!(rc = ibmvfc_map_sg_data(cmnd, evt, vfc_cmd, vhost->dev))))
 		return ibmvfc_send_event(evt, vhost, 0);
@@ -2200,7 +2223,11 @@ static int ibmvfc_reset_device(struct scsi_device *sdev, int type, char *desc)
 
 	spin_lock_irqsave(vhost->host->host_lock, flags);
 	if (vhost->state == IBMVFC_ACTIVE) {
-		evt = ibmvfc_get_event(&vhost->crq);
+		if (vhost->using_channels)
+			evt = ibmvfc_get_event(&vhost->scsi_scrqs.scrqs[0]);
+		else
+			evt = ibmvfc_get_event(&vhost->crq);
+
 		ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_CMD_FORMAT);
 		tmf = ibmvfc_init_vfc_cmd(evt, sdev);
 		iu = ibmvfc_get_fcp_iu(vhost, tmf);
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 21/21] ibmvfc: provide modules parameters for MQ settings
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time as well as a device attribute for
changing the client scsi channel request amount.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 75 ++++++++++++++++++++++++++++++----
 drivers/scsi/ibmvscsi/ibmvfc.h |  1 +
 2 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index d314dffaafc4..7097028d4cb6 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -40,6 +40,12 @@ static unsigned int disc_threads = IBMVFC_MAX_DISC_THREADS;
 static unsigned int ibmvfc_debug = IBMVFC_DEBUG;
 static unsigned int log_level = IBMVFC_DEFAULT_LOG_LEVEL;
 static unsigned int cls3_error = IBMVFC_CLS3_ERROR;
+static unsigned int mq_enabled = IBMVFC_MQ;
+static unsigned int nr_scsi_hw_queues = IBMVFC_SCSI_HW_QUEUES;
+static unsigned int nr_scsi_channels = IBMVFC_SCSI_CHANNELS;
+static unsigned int mig_channels_only = IBMVFC_MIG_NO_SUB_TO_CRQ;
+static unsigned int mig_no_less_channels = IBMVFC_MIG_NO_N_TO_M;
+
 static LIST_HEAD(ibmvfc_head);
 static DEFINE_SPINLOCK(ibmvfc_driver_lock);
 static struct scsi_transport_template *ibmvfc_transport_template;
@@ -49,6 +55,22 @@ MODULE_AUTHOR("Brian King <brking@linux.vnet.ibm.com>");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(IBMVFC_DRIVER_VERSION);
 
+module_param_named(mq, mq_enabled, uint, S_IRUGO);
+MODULE_PARM_DESC(mq, "Enable multiqueue support. "
+		 "[Default=" __stringify(IBMVFC_MQ) "]");
+module_param_named(scsi_host_queues, nr_scsi_hw_queues, uint, S_IRUGO);
+MODULE_PARM_DESC(scsi_host_queues, "Number of SCSI Host submission queues. "
+		 "[Default=" __stringify(IBMVFC_SCSI_HW_QUEUES) "]");
+module_param_named(scsi_hw_channels, nr_scsi_channels, uint, S_IRUGO);
+MODULE_PARM_DESC(scsi_hw_channels, "Number of hw scsi channels to request. "
+		 "[Default=" __stringify(IBMVFC_SCSI_CHANNELS) "]");
+module_param_named(mig_channels_only, mig_channels_only, uint, S_IRUGO);
+MODULE_PARM_DESC(mig_channels_only, "Prevent migration to non-channelized system. "
+		 "[Default=" __stringify(IBMVFC_MIG_NO_SUB_TO_CRQ) "]");
+module_param_named(mig_no_less_channels, mig_no_less_channels, uint, S_IRUGO);
+MODULE_PARM_DESC(mig_no_less_channels, "Prevent migration to system with less channels. "
+		 "[Default=" __stringify(IBMVFC_MIG_NO_N_TO_M) "]");
+
 module_param_named(init_timeout, init_timeout, uint, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(init_timeout, "Initialization timeout in seconds. "
 		 "[Default=" __stringify(IBMVFC_INIT_TIMEOUT) "]");
@@ -926,7 +948,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
 	crq->cur = 0;
 
 	if (vhost->scsi_scrqs.scrqs) {
-		for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+		for (i = 0; i < nr_scsi_hw_queues; i++) {
 			scrq = &vhost->scsi_scrqs.scrqs[i];
 			spin_lock(scrq->q_lock);
 			memset(scrq->msgs.scrq, 0, PAGE_SIZE);
@@ -3397,6 +3419,37 @@ static ssize_t ibmvfc_store_log_level(struct device *dev,
 	return strlen(buf);
 }
 
+static ssize_t ibmvfc_show_scsi_channels(struct device *dev,
+					 struct device_attribute *attr, char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct ibmvfc_host *vhost = shost_priv(shost);
+	unsigned long flags = 0;
+	int len;
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	len = snprintf(buf, PAGE_SIZE, "%d\n", vhost->client_scsi_channels);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+	return len;
+}
+
+static ssize_t ibmvfc_store_scsi_channels(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t count)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct ibmvfc_host *vhost = shost_priv(shost);
+	unsigned long flags = 0;
+	unsigned int channels;
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	channels = simple_strtoul(buf, NULL, 10);
+	vhost->client_scsi_channels = min(channels, nr_scsi_hw_queues);
+	ibmvfc_hard_reset_host(vhost);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+	return strlen(buf);
+}
+
 static DEVICE_ATTR(partition_name, S_IRUGO, ibmvfc_show_host_partition_name, NULL);
 static DEVICE_ATTR(device_name, S_IRUGO, ibmvfc_show_host_device_name, NULL);
 static DEVICE_ATTR(port_loc_code, S_IRUGO, ibmvfc_show_host_loc_code, NULL);
@@ -3405,6 +3458,8 @@ static DEVICE_ATTR(npiv_version, S_IRUGO, ibmvfc_show_host_npiv_version, NULL);
 static DEVICE_ATTR(capabilities, S_IRUGO, ibmvfc_show_host_capabilities, NULL);
 static DEVICE_ATTR(log_level, S_IRUGO | S_IWUSR,
 		   ibmvfc_show_log_level, ibmvfc_store_log_level);
+static DEVICE_ATTR(nr_scsi_channels, S_IRUGO | S_IWUSR,
+		   ibmvfc_show_scsi_channels, ibmvfc_store_scsi_channels);
 
 #ifdef CONFIG_SCSI_IBMVFC_TRACE
 /**
@@ -3461,6 +3516,7 @@ static struct device_attribute *ibmvfc_attrs[] = {
 	&dev_attr_npiv_version,
 	&dev_attr_capabilities,
 	&dev_attr_log_level,
+	&dev_attr_nr_scsi_channels,
 	NULL
 };
 
@@ -4866,9 +4922,9 @@ static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost)
 	mad->common.opcode = cpu_to_be32(IBMVFC_CHANNEL_ENQUIRY);
 	mad->common.length = cpu_to_be16(sizeof(*mad));
 
-	if (IBMVFC_MIG_NO_SUB_TO_CRQ)
+	if (mig_channels_only)
 		mad->flags |= cpu_to_be32(IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT);
-	if (IBMVFC_MIG_NO_N_TO_M)
+	if (mig_no_less_channels)
 		mad->flags |= cpu_to_be32(IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT);
 
 	ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT_WAIT);
@@ -5655,13 +5711,13 @@ static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
 
 	ENTER;
 
-	vhost->scsi_scrqs.scrqs = kcalloc(IBMVFC_SCSI_HW_QUEUES,
+	vhost->scsi_scrqs.scrqs = kcalloc(nr_scsi_hw_queues,
 					  sizeof(*vhost->scsi_scrqs.scrqs),
 					  GFP_KERNEL);
 	if (!vhost->scsi_scrqs.scrqs)
 		return -1;
 
-	for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++) {
+	for (i = 0; i < nr_scsi_hw_queues; i++) {
 		if (ibmvfc_register_scsi_channel(vhost, i)) {
 			for (j = i; j > 0; j--)
 				ibmvfc_deregister_scsi_channel(vhost, j - 1);
@@ -5685,7 +5741,7 @@ static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
 	if (!vhost->scsi_scrqs.scrqs)
 		return;
 
-	for (i = 0; i < IBMVFC_SCSI_HW_QUEUES; i++)
+	for (i = 0; i < nr_scsi_hw_queues; i++)
 		ibmvfc_deregister_scsi_channel(vhost, i);
 
 	kfree(vhost->scsi_scrqs.scrqs);
@@ -5873,6 +5929,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	struct Scsi_Host *shost;
 	struct device *dev = &vdev->dev;
 	int rc = -ENOMEM;
+	unsigned int max_scsi_queues = IBMVFC_MAX_SCSI_QUEUES;
 
 	ENTER;
 	shost = scsi_host_alloc(&driver_template, sizeof(*vhost));
@@ -5888,7 +5945,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	shost->max_sectors = IBMVFC_MAX_SECTORS;
 	shost->max_cmd_len = IBMVFC_MAX_CDB_LEN;
 	shost->unique_id = shost->host_no;
-	shost->nr_hw_queues = IBMVFC_MQ ? IBMVFC_SCSI_HW_QUEUES : 1;
+	shost->nr_hw_queues = mq_enabled ? min(max_scsi_queues, nr_scsi_hw_queues) : 1;
 
 	vhost = shost_priv(shost);
 	INIT_LIST_HEAD(&vhost->targets);
@@ -5900,8 +5957,8 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
 	vhost->log_level = log_level;
 	vhost->task_set = 1;
 
-	vhost->mq_enabled = IBMVFC_MQ;
-	vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS;
+	vhost->mq_enabled = mq_enabled;
+	vhost->client_scsi_channels = min(shost->nr_hw_queues, nr_scsi_channels);
 	vhost->using_channels = 0;
 	vhost->do_enquiry = 1;
 
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 0391cb746d0b..19dcec3ae9ba 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -43,6 +43,7 @@
 #define IBMVFC_CLS3_ERROR		0
 #define IBMVFC_MQ			1
 #define IBMVFC_SCSI_CHANNELS		8
+#define IBMVFC_MAX_SCSI_QUEUES		16
 #define IBMVFC_SCSI_HW_QUEUES		8
 #define IBMVFC_MIG_NO_SUB_TO_CRQ	0
 #define IBMVFC_MIG_NO_N_TO_M		0
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 19/21] ibmvfc: purge scsi channels after transport loss/reset
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Grab the queue and list lock for each Sub-CRQ and add any uncompleted
events to the host purge list.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 5ca8fcafd1d5..d314dffaafc4 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1056,7 +1056,13 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, int error_code)
 static void ibmvfc_purge_requests(struct ibmvfc_host *vhost, int error_code)
 {
 	struct ibmvfc_event *evt, *pos;
+	struct ibmvfc_queue *queues = vhost->scsi_scrqs.scrqs;
 	unsigned long flags;
+	int hwqs = 0;
+	int i;
+
+	if (vhost->using_channels)
+		hwqs = vhost->scsi_scrqs.active_queues;
 
 	ibmvfc_dbg(vhost, "Purging all requests\n");
 	spin_lock_irqsave(&vhost->crq.l_lock, flags);
@@ -1064,6 +1070,16 @@ static void ibmvfc_purge_requests(struct ibmvfc_host *vhost, int error_code)
 		ibmvfc_fail_request(evt, error_code);
 	list_splice_init(&vhost->crq.sent, &vhost->purge);
 	spin_unlock_irqrestore(&vhost->crq.l_lock, flags);
+
+	for (i = 0; i < hwqs; i++) {
+		spin_lock_irqsave(queues[i].q_lock, flags);
+		spin_lock(&queues[i].l_lock);
+		list_for_each_entry_safe(evt, pos, &queues[i].sent, queue_list)
+			ibmvfc_fail_request(evt, error_code);
+		list_splice_init(&queues[i].sent, &vhost->purge);
+		spin_unlock(&queues[i].l_lock);
+		spin_unlock_irqrestore(queues[i].q_lock, flags);
+	}
 }
 
 /**
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 20/21] ibmvfc: enable MQ and set reasonable defaults
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Turn on MQ by default and set sane values for the upper limit on hw
queues for the scsi host, and number of hw scsi channels to request from
the partner VIOS.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index c3bb83c9d8a6..0391cb746d0b 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -41,9 +41,9 @@
 #define IBMVFC_DEFAULT_LOG_LEVEL	2
 #define IBMVFC_MAX_CDB_LEN		16
 #define IBMVFC_CLS3_ERROR		0
-#define IBMVFC_MQ			0
-#define IBMVFC_SCSI_CHANNELS		0
-#define IBMVFC_SCSI_HW_QUEUES		1
+#define IBMVFC_MQ			1
+#define IBMVFC_SCSI_CHANNELS		8
+#define IBMVFC_SCSI_HW_QUEUES		8
 #define IBMVFC_MIG_NO_SUB_TO_CRQ	0
 #define IBMVFC_MIG_NO_N_TO_M		0
 
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 14/21] ibmvfc: set and track hw queue in ibmvfc_event struct
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Extract the hwq id from a SCSI command and store it in the ibmvfc_event
structure to identify which Sub-CRQ to send the command down when
channels are being utilized.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 5 +++++
 drivers/scsi/ibmvscsi/ibmvfc.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 0653d52d4ea0..3f3cc37a263f 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -1483,6 +1483,7 @@ static void ibmvfc_init_event(struct ibmvfc_event *evt,
 		evt->_done = done;
 		evt->done = ibmvfc_locked_done;
 	}
+	evt->hwq = 0;
 }
 
 /**
@@ -1839,6 +1840,8 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	struct ibmvfc_cmd *vfc_cmd;
 	struct ibmvfc_fcp_cmd_iu *iu;
 	struct ibmvfc_event *evt;
+	u32 tag_and_hwq = blk_mq_unique_tag(cmnd->request);
+	u16 hwq = blk_mq_unique_tag_to_hwq(tag_and_hwq);
 	int rc;
 
 	if (unlikely((rc = fc_remote_port_chkready(rport))) ||
@@ -1865,6 +1868,8 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
 	}
 
 	vfc_cmd->correlation = cpu_to_be64((u64)evt);
+	if (vhost->using_channels)
+		evt->hwq = hwq % vhost->scsi_scrqs.active_queues;
 
 	if (likely(!(rc = ibmvfc_map_sg_data(cmnd, evt, vfc_cmd, vhost->dev))))
 		return ibmvfc_send_event(evt, vhost, 0);
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 3d76cd3c1fd9..2dbce7071409 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -753,6 +753,7 @@ struct ibmvfc_event {
 	struct completion comp;
 	struct completion *eh_comp;
 	struct timer_list timer;
+	u16 hwq;
 };
 
 /* a pool of event structs for use */
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 18/21] ibmvfc: send Cancel MAD down each hw scsi channel
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
	linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) intended to cancel or
abort. The client assigns cancel keys per LUN and thus must send a
Cancel down each channel commands were submitted for that LUN. Further,
the client then must wait for those cancel completions prior to
submitting a LUN RESET or ABORT TASK SET.

Add a cancel rsp iu syncronization field to the ibmvfc_queue struct such
that the cancel routine can sync the cancel response to each queue that
requires a cancel command. Build a list of each cancel event sent and
wait for the completion of each submitted cancel.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 109 +++++++++++++++++++++++++++++----
 drivers/scsi/ibmvscsi/ibmvfc.h |   3 +
 2 files changed, 100 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index b0b0212344f3..5ca8fcafd1d5 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2418,18 +2418,82 @@ static struct ibmvfc_event *ibmvfc_init_tmf(struct ibmvfc_queue *queue,
 	return evt;
 }
 
-/**
- * ibmvfc_cancel_all - Cancel all outstanding commands to the device
- * @sdev:	scsi device to cancel commands
- * @type:	type of error recovery being performed
- *
- * This sends a cancel to the VIOS for the specified device. This does
- * NOT send any abort to the actual device. That must be done separately.
- *
- * Returns:
- *	0 on success / other on failure
- **/
-static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
+static int ibmvfc_cancel_all_mq(struct scsi_device *sdev, int type)
+{
+	struct ibmvfc_host *vhost = shost_priv(sdev->host);
+	struct ibmvfc_event *evt, *found_evt, *temp;
+	struct ibmvfc_queue *queues = vhost->scsi_scrqs.scrqs;
+	unsigned long flags;
+	int num_hwq, i;
+	int fail = 0;
+	LIST_HEAD(cancelq);
+	u16 status;
+
+	ENTER;
+	spin_lock_irqsave(vhost->host->host_lock, flags);
+	num_hwq = vhost->scsi_scrqs.active_queues;
+	for (i = 0; i < num_hwq; i++) {
+		spin_lock(queues[i].q_lock);
+		spin_lock(&queues[i].l_lock);
+		found_evt = NULL;
+		list_for_each_entry(evt, &queues[i].sent, queue_list) {
+			if (evt->cmnd && evt->cmnd->device == sdev) {
+				found_evt = evt;
+				break;
+			}
+		}
+		spin_unlock(&queues[i].l_lock);
+
+		if (found_evt && vhost->logged_in) {
+			evt = ibmvfc_init_tmf(&queues[i], sdev, type);
+			evt->sync_iu = &queues[i].cancel_rsp;
+			ibmvfc_send_event(evt, vhost, default_timeout);
+			list_add_tail(&evt->cancel, &cancelq);
+		}
+
+		spin_unlock(queues[i].q_lock);
+	}
+	spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+	if (list_empty(&cancelq)) {
+		if (vhost->log_level > IBMVFC_DEFAULT_LOG_LEVEL)
+			sdev_printk(KERN_INFO, sdev, "No events found to cancel\n");
+		return 0;
+	}
+
+	sdev_printk(KERN_INFO, sdev, "Cancelling outstanding commands.\n");
+
+	list_for_each_entry_safe(evt, temp, &cancelq, cancel) {
+		wait_for_completion(&evt->comp);
+		status = be16_to_cpu(evt->queue->cancel_rsp.mad_common.status);
+		list_del(&evt->cancel);
+		ibmvfc_free_event(evt);
+
+		if (status != IBMVFC_MAD_SUCCESS) {
+			sdev_printk(KERN_WARNING, sdev, "Cancel failed with rc=%x\n", status);
+			switch (status) {
+			case IBMVFC_MAD_DRIVER_FAILED:
+			case IBMVFC_MAD_CRQ_ERROR:
+			/* Host adapter most likely going through reset, return success to
+			 * the caller will wait for the command being cancelled to get returned
+			 */
+				break;
+			default:
+				fail = 1;
+				break;
+			}
+		}
+	}
+
+	if (fail)
+		return -EIO;
+
+	sdev_printk(KERN_INFO, sdev, "Successfully cancelled outstanding commands\n");
+	LEAVE;
+	return 0;
+}
+
+static int ibmvfc_cancel_all_sq(struct scsi_device *sdev, int type)
 {
 	struct ibmvfc_host *vhost = shost_priv(sdev->host);
 	struct ibmvfc_event *evt, *found_evt;
@@ -2498,6 +2562,27 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
 	return 0;
 }
 
+/**
+ * ibmvfc_cancel_all - Cancel all outstanding commands to the device
+ * @sdev:	scsi device to cancel commands
+ * @type:	type of error recovery being performed
+ *
+ * This sends a cancel to the VIOS for the specified device. This does
+ * NOT send any abort to the actual device. That must be done separately.
+ *
+ * Returns:
+ *	0 on success / other on failure
+ **/
+static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
+{
+	struct ibmvfc_host *vhost = shost_priv(sdev->host);
+
+	if (vhost->mq_enabled && vhost->using_channels)
+		return ibmvfc_cancel_all_mq(sdev, type);
+	else
+		return ibmvfc_cancel_all_sq(sdev, type);
+}
+
 /**
  * ibmvfc_match_key - Match function for specified cancel key
  * @evt:	ibmvfc event struct
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 2dbce7071409..c3bb83c9d8a6 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -737,6 +737,7 @@ struct ibmvfc_target {
 /* a unit of work for the hosting partition */
 struct ibmvfc_event {
 	struct list_head queue_list;
+	struct list_head cancel;
 	struct ibmvfc_host *vhost;
 	struct ibmvfc_queue *queue;
 	struct ibmvfc_target *tgt;
@@ -790,6 +791,8 @@ struct ibmvfc_queue {
 	struct list_head free;
 	spinlock_t l_lock;
 
+	union ibmvfc_iu cancel_rsp;
+
 	/* Sub-CRQ fields */
 	struct ibmvfc_host *vhost;
 	unsigned long cookie;
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 10/21] ibmvfc: define Sub-CRQ interrupt handler routine
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Simple handler that calls Sub-CRQ drain routine directly.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f3cd092478ee..51bcafad9490 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -3571,6 +3571,16 @@ static void ibmvfc_drain_sub_crq(struct ibmvfc_queue *scrq)
 	}
 }
 
+static irqreturn_t ibmvfc_interrupt_scsi(int irq, void *scrq_instance)
+{
+	struct ibmvfc_queue *scrq = (struct ibmvfc_queue *)scrq_instance;
+
+	ibmvfc_toggle_scrq_irq(scrq, 0);
+	ibmvfc_drain_sub_crq(scrq);
+
+	return IRQ_HANDLED;
+}
+
 /**
  * ibmvfc_init_tgt - Set the next init job step for the target
  * @tgt:		ibmvfc target struct
-- 
2.27.0


^ permalink raw reply related

* [PATCH v5 17/21] ibmvfc: add cancel mad initialization helper
From: Tyrel Datwyler @ 2021-01-14 20:31 UTC (permalink / raw)
  To: james.bottomley
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	Brian King, brking, linuxppc-dev
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Add a helper routine for initializing a Cancel MAD. This will be useful
for a channelized client that needs to send Cancel commands down every
channel commands were sent for a particular LUN.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 68 ++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 578e27180f10..b0b0212344f3 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -2379,6 +2379,45 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
 	return SUCCESS;
 }
 
+static struct ibmvfc_event *ibmvfc_init_tmf(struct ibmvfc_queue *queue,
+					    struct scsi_device *sdev,
+					    int type)
+{
+	struct ibmvfc_host *vhost = shost_priv(sdev->host);
+	struct scsi_target *starget = scsi_target(sdev);
+	struct fc_rport *rport = starget_to_rport(starget);
+	struct ibmvfc_event *evt;
+	struct ibmvfc_tmf *tmf;
+
+	evt = ibmvfc_get_event(queue);
+	ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
+
+	tmf = &evt->iu.tmf;
+	memset(tmf, 0, sizeof(*tmf));
+	if (ibmvfc_check_caps(vhost, IBMVFC_HANDLE_VF_WWPN)) {
+		tmf->common.version = cpu_to_be32(2);
+		tmf->target_wwpn = cpu_to_be64(rport->port_name);
+	} else {
+		tmf->common.version = cpu_to_be32(1);
+	}
+	tmf->common.opcode = cpu_to_be32(IBMVFC_TMF_MAD);
+	tmf->common.length = cpu_to_be16(sizeof(*tmf));
+	tmf->scsi_id = cpu_to_be64(rport->port_id);
+	int_to_scsilun(sdev->lun, &tmf->lun);
+	if (!ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPRESS_ABTS))
+		type &= ~IBMVFC_TMF_SUPPRESS_ABTS;
+	if (vhost->state == IBMVFC_ACTIVE)
+		tmf->flags = cpu_to_be32((type | IBMVFC_TMF_LUA_VALID));
+	else
+		tmf->flags = cpu_to_be32(((type & IBMVFC_TMF_SUPPRESS_ABTS) | IBMVFC_TMF_LUA_VALID));
+	tmf->cancel_key = cpu_to_be32((unsigned long)sdev->hostdata);
+	tmf->my_cancel_key = cpu_to_be32((unsigned long)starget->hostdata);
+
+	init_completion(&evt->comp);
+
+	return evt;
+}
+
 /**
  * ibmvfc_cancel_all - Cancel all outstanding commands to the device
  * @sdev:	scsi device to cancel commands
@@ -2393,9 +2432,6 @@ static int ibmvfc_wait_for_ops(struct ibmvfc_host *vhost, void *device,
 static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
 {
 	struct ibmvfc_host *vhost = shost_priv(sdev->host);
-	struct scsi_target *starget = scsi_target(sdev);
-	struct fc_rport *rport = starget_to_rport(starget);
-	struct ibmvfc_tmf *tmf;
 	struct ibmvfc_event *evt, *found_evt;
 	union ibmvfc_iu rsp;
 	int rsp_rc = -EBUSY;
@@ -2422,32 +2458,8 @@ static int ibmvfc_cancel_all(struct scsi_device *sdev, int type)
 	}
 
 	if (vhost->logged_in) {
-		evt = ibmvfc_get_event(&vhost->crq);
-		ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
-
-		tmf = &evt->iu.tmf;
-		memset(tmf, 0, sizeof(*tmf));
-		if (ibmvfc_check_caps(vhost, IBMVFC_HANDLE_VF_WWPN)) {
-			tmf->common.version = cpu_to_be32(2);
-			tmf->target_wwpn = cpu_to_be64(rport->port_name);
-		} else {
-			tmf->common.version = cpu_to_be32(1);
-		}
-		tmf->common.opcode = cpu_to_be32(IBMVFC_TMF_MAD);
-		tmf->common.length = cpu_to_be16(sizeof(*tmf));
-		tmf->scsi_id = cpu_to_be64(rport->port_id);
-		int_to_scsilun(sdev->lun, &tmf->lun);
-		if (!ibmvfc_check_caps(vhost, IBMVFC_CAN_SUPPRESS_ABTS))
-			type &= ~IBMVFC_TMF_SUPPRESS_ABTS;
-		if (vhost->state == IBMVFC_ACTIVE)
-			tmf->flags = cpu_to_be32((type | IBMVFC_TMF_LUA_VALID));
-		else
-			tmf->flags = cpu_to_be32(((type & IBMVFC_TMF_SUPPRESS_ABTS) | IBMVFC_TMF_LUA_VALID));
-		tmf->cancel_key = cpu_to_be32((unsigned long)sdev->hostdata);
-		tmf->my_cancel_key = cpu_to_be32((unsigned long)starget->hostdata);
-
+		evt = ibmvfc_init_tmf(&vhost->crq, sdev, type);
 		evt->sync_iu = &rsp;
-		init_completion(&evt->comp);
 		rsp_rc = ibmvfc_send_event(evt, vhost, default_timeout);
 	}
 
-- 
2.27.0


^ permalink raw reply related

* [PATCH 1/3] tty: hvcs: Drop unnecessary if block
From: Uwe Kleine-König @ 2021-01-14 17:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby
  Cc: sparclinux, linuxppc-dev, David S . Miller, linux-kernel
In-Reply-To: <20210114175718.137483-1-u.kleine-koenig@pengutronix.de>

If hvcs_probe() succeeded dev_set_drvdata() is called with a non-NULL
value, and if hvcs_probe() failed hvcs_remove() isn't called.

So there is no way dev_get_drvdata() can return NULL in hvcs_remove() and
the check can just go away.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 drivers/tty/hvc/hvcs.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
index 509d1042825a..3e0461285c34 100644
--- a/drivers/tty/hvc/hvcs.c
+++ b/drivers/tty/hvc/hvcs.c
@@ -825,9 +825,6 @@ static int hvcs_remove(struct vio_dev *dev)
 	unsigned long flags;
 	struct tty_struct *tty;

-	if (!hvcsd)
-		return -ENODEV;
-
 	/* By this time the vty-server won't be getting any more interrupts */

 	spin_lock_irqsave(&hvcsd->lock, flags);
-- 
2.29.2

^ permalink raw reply related

* Re: [PATCH v2 0/7] Rid W=1 warnings in Ethernet
From: Jakub Kicinski @ 2021-01-14 17:14 UTC (permalink / raw)
  To: Lee Jones
  Cc: Paul Durrant, Kurt Kanzenbach, Alexei Starovoitov,
	Gustavo A. R. Silva, Peter Cammaert, Paul Mackerras,
	Sukadev Bhattiprolu, Wei Liu, Daniel Borkmann, John Fastabend,
	Santiago Leon, xen-devel, Grygorii Strashko, Thomas Falcon,
	Jesper Dangaard Brouer, Jens Osterkamp, Rusty Russell,
	Daris A Nevil, Lijun Pan, Ivan Khoronzhuk, Nicolas Pitre,
	Geoff Levand, netdev, linux-kernel, Erik Stahlman, John Allen,
	Utz Bacher, Dany Madden, bpf, linuxppc-dev, David S. Miller,
	Russell King
In-Reply-To: <20210114083349.GI3975472@dell>

On Thu, 14 Jan 2021 08:33:49 +0000 Lee Jones wrote:
> On Wed, 13 Jan 2021, Jakub Kicinski wrote:
> 
> > On Wed, 13 Jan 2021 16:41:16 +0000 Lee Jones wrote:  
> > > Resending the stragglers again.                                                                                  
> > > 
> > > This set is part of a larger effort attempting to clean-up W=1                                                   
> > > kernel builds, which are currently overwhelmingly riddled with                                                   
> > > niggly little warnings.                                                                                          
> > >                                                                                                                  
> > > v2:                                                                                                              
> > >  - Squashed IBM patches                                                                                      
> > >  - Fixed real issue in SMSC
> > >  - Added Andrew's Reviewed-by tags on remainder  
> > 
> > Does not apply, please rebase on net-next/master.  
> 
> These are based on Tuesday's next/master.

What's next/master?

This is net-next:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/

> I just rebased them now with no issue.
> 
> What conflict are you seeing?

Applying: net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
error: patch failed: drivers/net/ethernet/smsc/smc91x.c:2192
error: drivers/net/ethernet/smsc/smc91x.c: patch does not apply
Patch failed at 0001 net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

^ permalink raw reply

* [PATCH 2/6] powerpc/rtas-proc: remove unused RMO_READ_BUF_MAX
From: Nathan Lynch @ 2021-01-14 22:00 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

This constant is unused.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas-proc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
index e0f8329966d6..d2b0d99824a4 100644
--- a/arch/powerpc/kernel/rtas-proc.c
+++ b/arch/powerpc/kernel/rtas-proc.c
@@ -755,8 +755,6 @@ static int ppc_rtas_tone_volume_show(struct seq_file *m, void *v)
 	return 0;
 }
 
-#define RMO_READ_BUF_MAX 30
-
 /**
  * ppc_rtas_rmo_buf_show() - Describe RTAS-addressable region for user space.
  *
-- 
2.29.2


^ permalink raw reply related

* [PATCH 1/6] powerpc/rtas: improve ppc_rtas_rmo_buf_show documentation
From: Nathan Lynch @ 2021-01-14 21:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

Add kerneldoc for ppc_rtas_rmo_buf_show(), the callback for
/proc/powerpc/rtas/rmo_buffer, explaining its expected use.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas-proc.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
index 2d33f342a293..e0f8329966d6 100644
--- a/arch/powerpc/kernel/rtas-proc.c
+++ b/arch/powerpc/kernel/rtas-proc.c
@@ -757,7 +757,16 @@ static int ppc_rtas_tone_volume_show(struct seq_file *m, void *v)
 
 #define RMO_READ_BUF_MAX 30
 
-/* RTAS Userspace access */
+/**
+ * ppc_rtas_rmo_buf_show() - Describe RTAS-addressable region for user space.
+ *
+ * Base + size description of a range of RTAS-addressable memory set
+ * aside for user space to use as work area(s) for certain RTAS
+ * functions. User space accesses this region via /dev/mem. Apart from
+ * security policies, the kernel does not arbitrate or serialize
+ * access to this region, and user space must ensure that concurrent
+ * users do not interfere with each other.
+ */
 static int ppc_rtas_rmo_buf_show(struct seq_file *m, void *v)
 {
 	seq_printf(m, "%016lx %x\n", rtas_rmo_buf, RTAS_RMOBUF_MAX);
-- 
2.29.2


^ permalink raw reply related

* [PATCH 3/6] powerpc/rtas: remove ibm_suspend_me_token
From: Nathan Lynch @ 2021-01-14 22:00 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

There's not a compelling reason to cache the value of the token for
the ibm,suspend-me function. Just look it up when needed in the RTAS
syscall's special case for it.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index d126d71ea5bd..60fcf7f7b0b8 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -828,7 +828,6 @@ void rtas_activate_firmware(void)
 		pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
 }
 
-static int ibm_suspend_me_token = RTAS_UNKNOWN_SERVICE;
 #ifdef CONFIG_PPC_PSERIES
 /**
  * rtas_call_reentrant() - Used for reentrant rtas calls
@@ -1103,7 +1102,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 		return -EINVAL;
 
 	/* Need to handle ibm,suspend_me call specially */
-	if (token == ibm_suspend_me_token) {
+	if (token == rtas_token("ibm,suspend-me")) {
 
 		/*
 		 * rtas_ibm_suspend_me assumes the streamid handle is in cpu
@@ -1191,10 +1190,8 @@ void __init rtas_initialize(void)
 	 * the stop-self token if any
 	 */
 #ifdef CONFIG_PPC64
-	if (firmware_has_feature(FW_FEATURE_LPAR)) {
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		rtas_region = min(ppc64_rma_size, RTAS_INSTANTIATE_MAX);
-		ibm_suspend_me_token = rtas_token("ibm,suspend-me");
-	}
 #endif
 	rtas_rmo_buf = memblock_phys_alloc_range(RTAS_RMOBUF_MAX, PAGE_SIZE,
 						 0, rtas_region);
-- 
2.29.2


^ permalink raw reply related

* [PATCH 4/6] powerpc/rtas: move syscall filter setup into separate function
From: Nathan Lynch @ 2021-01-14 22:00 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

Reduce conditionally compiled sections within rtas_initialize() by
moving the filter table initialization into its own function already
guarded by CONFIG_PPC_RTAS_FILTER. No behavior change intended.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 60fcf7f7b0b8..55f6aa170e57 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1051,6 +1051,16 @@ static bool block_rtas_call(int token, int nargs,
 	return true;
 }
 
+static void __init rtas_syscall_filter_init(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) {
+		rtas_filters[i].token = rtas_token(rtas_filters[i].name);
+	}
+
+}
+
 #else
 
 static bool block_rtas_call(int token, int nargs,
@@ -1059,6 +1069,10 @@ static bool block_rtas_call(int token, int nargs,
 	return false;
 }
 
+static void __init rtas_syscall_filter_init(void)
+{
+}
+
 #endif /* CONFIG_PPC_RTAS_FILTER */
 
 /* We assume to be passed big endian arguments */
@@ -1162,9 +1176,6 @@ void __init rtas_initialize(void)
 	unsigned long rtas_region = RTAS_INSTANTIATE_MAX;
 	u32 base, size, entry;
 	int no_base, no_size, no_entry;
-#ifdef CONFIG_PPC_RTAS_FILTER
-	int i;
-#endif
 
 	/* Get RTAS dev node and fill up our "rtas" structure with infos
 	 * about it.
@@ -1203,11 +1214,7 @@ void __init rtas_initialize(void)
 	rtas_last_error_token = rtas_token("rtas-last-error");
 #endif
 
-#ifdef CONFIG_PPC_RTAS_FILTER
-	for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) {
-		rtas_filters[i].token = rtas_token(rtas_filters[i].name);
-	}
-#endif
+	rtas_syscall_filter_init();
 }
 
 int __init early_init_dt_scan_rtas(unsigned long node,
-- 
2.29.2


^ permalink raw reply related

* [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Nathan Lynch @ 2021-01-14 22:00 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

Memory locations passed as arguments from the OS to RTAS usually need
to be addressable in 32-bit mode and must reside in the Real Mode
Area. On PAPR guests, the RMA starts at logical address 0 and is the
first logical memory block reported in the LPAR’s device tree.

On powerpc targets with RTAS, Linux makes available to user space a
region of memory suitable for arguments to be passed to RTAS via
sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
API during boot in order to ensure that it satisfies the requirements
described above.

With radix MMU, the upper limit supplied to the memblock allocation
can exceed the bounds of the first logical memory block, since
ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
a common size of the first memory block according to a small sample of
LPARs I have checked.) This leads to failures when user space invokes
an RTAS function that uses a work area, such as
ibm,configure-connector.

Alter the determination of the upper limit for rtas_rmo_buf's
allocation to consult the device tree directly, ensuring placement
within the RMA regardless of the MMU in use.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/rtas.c | 80 +++++++++++++++++++++++++++++++-------
 1 file changed, 65 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index da65faadbbb2..98dfb112f4df 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1166,6 +1166,70 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 	return 0;
 }
 
+/*
+ * Memory locations passed to RTAS must be in the RMA as described by
+ * the range in /memory@0.
+ */
+static phys_addr_t rtas_arg_addr_limit(void)
+{
+	unsigned int addr_cells;
+	unsigned int size_cells;
+	struct device_node *np;
+	const __be32 *prop;
+	u64 limit;
+	u64 base;
+
+	/* RTAS is instantiated in 32-bit mode. */
+	limit = 1ULL << 32;
+
+	/* Account for mem=. */
+	if (memory_limit != 0)
+		limit = min(limit, memory_limit);
+
+	np = of_find_node_by_path("/memory@0");
+	if (!np)
+		goto out;
+
+	prop = of_get_property(np, "reg", NULL);
+	if (!prop)
+		goto put;
+
+	addr_cells = of_n_addr_cells(np);
+	base = of_read_number(prop, addr_cells);
+	prop += addr_cells;
+	size_cells = of_n_size_cells(np);
+	limit = min(limit, of_read_number(prop, size_cells));
+put:
+	of_node_put(np);
+out:
+	pr_debug("%s: base = %#llx limit = %#llx", __func__, base, limit);
+
+	return limit;
+}
+
+static void __init rtas_user_region_setup(void)
+{
+	phys_addr_t limit, align, size;
+
+	limit = rtas_arg_addr_limit();
+	size = RTAS_USER_REGION_SIZE;
+
+	/*
+	 * Although work areas need only 4KB alignment, user space
+	 * accesses this region via mmap so it must be placed on a
+	 * page boundary.
+	 */
+	align = PAGE_SIZE;
+
+	rtas_rmo_buf = memblock_phys_alloc_range(size, align, 0, limit);
+	if (rtas_rmo_buf == 0) {
+		panic("Failed to allocate %llu bytes for user region below %pa\n",
+		      size, &limit);
+	}
+
+	pr_debug("RTAS user region allocated at %pa\n", &rtas_rmo_buf);
+}
+
 /*
  * Call early during boot, before mem init, to retrieve the RTAS
  * information from the device-tree and allocate the RMO buffer for userland
@@ -1173,7 +1237,6 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
  */
 void __init rtas_initialize(void)
 {
-	unsigned long rtas_region = RTAS_INSTANTIATE_MAX;
 	u32 base, size, entry;
 	int no_base, no_size, no_entry;
 
@@ -1197,23 +1260,10 @@ void __init rtas_initialize(void)
 	no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry", &entry);
 	rtas.entry = no_entry ? rtas.base : entry;
 
-	/* If RTAS was found, allocate the RMO buffer for it and look for
-	 * the stop-self token if any
-	 */
-#ifdef CONFIG_PPC64
-	if (firmware_has_feature(FW_FEATURE_LPAR))
-		rtas_region = min(ppc64_rma_size, RTAS_INSTANTIATE_MAX);
-#endif
-	rtas_rmo_buf = memblock_phys_alloc_range(RTAS_USER_REGION_SIZE, PAGE_SIZE,
-						 0, rtas_region);
-	if (!rtas_rmo_buf)
-		panic("ERROR: RTAS: Failed to allocate %lx bytes below %pa\n",
-		      PAGE_SIZE, &rtas_region);
-
 #ifdef CONFIG_RTAS_ERROR_LOGGING
 	rtas_last_error_token = rtas_token("rtas-last-error");
 #endif
-
+	rtas_user_region_setup();
 	rtas_syscall_filter_init();
 }
 
-- 
2.29.2


^ permalink raw reply related

* [PATCH 0/6] powerpc/rtas: miscellaneous cleanups, user region allocation
From: Nathan Lynch @ 2021-01-14 21:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking

The region exposed to user space for use as work areas passed to
sys_rtas() can be incorrectly allocated on radix, leading to failures
in users of librtas. Correct this and clean up some of the code
visited along the way.

I think the cleanups should be unobjectionable and I've placed them
first in the series. Please check my work on the rtas_rmo_buf
allocation changes; they are only lightly tested so far (slot add on
Power9 PowerVM, and comparison of /memory@0/reg with the contents of
/proc/powerpc/rtas/rmo_buf on qemu Power9 w/radix).

I suspect the per-cpu RTAS argument structures for reentrant calls
need similar measures, but I can add that to the series once there is
consensus on the approach.

Nathan Lynch (6):
  powerpc/rtas: improve ppc_rtas_rmo_buf_show documentation
  powerpc/rtas-proc: remove unused RMO_READ_BUF_MAX
  powerpc/rtas: remove ibm_suspend_me_token
  powerpc/rtas: move syscall filter setup into separate function
  powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
  powerpc/rtas: constrain user region allocation to RMA

 arch/powerpc/include/asm/rtas.h |   9 ++-
 arch/powerpc/kernel/rtas-proc.c |  15 +++--
 arch/powerpc/kernel/rtas.c      | 108 ++++++++++++++++++++++++--------
 3 files changed, 98 insertions(+), 34 deletions(-)

-- 
2.29.2

^ permalink raw reply

* [PATCH 5/6] powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
From: Nathan Lynch @ 2021-01-14 22:00 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-1-nathanl@linux.ibm.com>

RTAS_RMOBUF_MAX doesn't actually describe a "maximum" value in any
sense. It represents the size of an area of memory set aside for user
space to use as work areas for certain RTAS calls.

Rename it to RTAS_USER_REGION, and express the value in terms of the
number of work areas allocated.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>

squash! powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
---
 arch/powerpc/include/asm/rtas.h | 9 ++++++---
 arch/powerpc/kernel/rtas-proc.c | 2 +-
 arch/powerpc/kernel/rtas.c      | 2 +-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 332e1000ca0f..1aa7ab1cbc84 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -19,8 +19,11 @@
 #define RTAS_UNKNOWN_SERVICE (-1)
 #define RTAS_INSTANTIATE_MAX (1ULL<<30) /* Don't instantiate rtas at/above this value */
 
-/* Buffer size for ppc_rtas system call. */
-#define RTAS_RMOBUF_MAX (64 * 1024)
+/* Work areas shared with RTAS must be 4K, naturally aligned. */
+#define RTAS_WORK_AREA_SIZE   4096
+
+/* Work areas allocated for user space access. */
+#define RTAS_USER_REGION_SIZE (RTAS_WORK_AREA_SIZE * 16)
 
 /* RTAS return status codes */
 #define RTAS_BUSY		-2    /* RTAS Busy */
@@ -357,7 +360,7 @@ extern void rtas_take_timebase(void);
 static inline int page_is_rtas_user_buf(unsigned long pfn)
 {
 	unsigned long paddr = (pfn << PAGE_SHIFT);
-	if (paddr >= rtas_rmo_buf && paddr < (rtas_rmo_buf + RTAS_RMOBUF_MAX))
+	if (paddr >= rtas_rmo_buf && paddr < (rtas_rmo_buf + RTAS_USER_REGION_SIZE))
 		return 1;
 	return 0;
 }
diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
index d2b0d99824a4..6857a5b0a1c3 100644
--- a/arch/powerpc/kernel/rtas-proc.c
+++ b/arch/powerpc/kernel/rtas-proc.c
@@ -767,6 +767,6 @@ static int ppc_rtas_tone_volume_show(struct seq_file *m, void *v)
  */
 static int ppc_rtas_rmo_buf_show(struct seq_file *m, void *v)
 {
-	seq_printf(m, "%016lx %x\n", rtas_rmo_buf, RTAS_RMOBUF_MAX);
+	seq_printf(m, "%016lx %x\n", rtas_rmo_buf, RTAS_USER_REGION_SIZE);
 	return 0;
 }
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 55f6aa170e57..da65faadbbb2 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1204,7 +1204,7 @@ void __init rtas_initialize(void)
 	if (firmware_has_feature(FW_FEATURE_LPAR))
 		rtas_region = min(ppc64_rma_size, RTAS_INSTANTIATE_MAX);
 #endif
-	rtas_rmo_buf = memblock_phys_alloc_range(RTAS_RMOBUF_MAX, PAGE_SIZE,
+	rtas_rmo_buf = memblock_phys_alloc_range(RTAS_USER_REGION_SIZE, PAGE_SIZE,
 						 0, rtas_region);
 	if (!rtas_rmo_buf)
 		panic("ERROR: RTAS: Failed to allocate %lx bytes below %pa\n",
-- 
2.29.2


^ permalink raw reply related

* Re: [PATCH v5 18/21] ibmvfc: send Cancel MAD down each hw scsi channel
From: Brian King @ 2021-01-14 22:42 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210114203148.246656-19-tyreld@linux.ibm.com>

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v5 21/21] ibmvfc: provide modules parameters for MQ settings
From: Brian King @ 2021-01-14 22:44 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210114203148.246656-22-tyreld@linux.ibm.com>

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v5 00/21] ibmvfc: initial MQ development/enablement
From: Brian King @ 2021-01-14 22:47 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210114203148.246656-1-tyreld@linux.ibm.com>

Tyrel,

I think this patch series is looking pretty good. I don't think we need
to wait for resolution of the can_queue issue being discussed, since
that is an issue that exists prior to this patch series and this patch
series doesn't make the issue any worse. Let's work that separately.

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH v5 02/21] powerpc/64s: move the last of the page fault handling logic to C
From: Nicholas Piggin @ 2021-01-15  0:25 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev
In-Reply-To: <afa9875c-e8ed-de3e-d395-435be86bd958@csgroup.eu>

Excerpts from Christophe Leroy's message of January 14, 2021 11:28 pm:
> 
> 
> Le 14/01/2021 à 14:17, Nicholas Piggin a écrit :
>> Excerpts from Christophe Leroy's message of January 14, 2021 10:25 pm:
>>>
>>>
>>> Le 14/01/2021 à 13:09, Nicholas Piggin a écrit :
>>>> Excerpts from Nicholas Piggin's message of January 14, 2021 1:24 pm:
>>>>> Excerpts from Christophe Leroy's message of January 14, 2021 12:12 am:
>>>>>>
>>>>>>
>>>>>> Le 13/01/2021 à 08:31, Nicholas Piggin a écrit :
>>>>>>> The page fault handling still has some complex logic particularly around
>>>>>>> hash table handling, in asm. Implement this in C instead.
>>>>>>>
>>>>>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>>>>>> ---
>>>>>>>     arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>>>>>>>     arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>>>>>>>     arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>>>>>>>     arch/powerpc/mm/fault.c                       |  46 ++++--
>>>>>>>     4 files changed, 107 insertions(+), 148 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>>> index 066b1d34c7bc..60a669379aa0 100644
>>>>>>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>>>>>> @@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>>>>>>>     #define HPTE_NOHPTE_UPDATE	0x2
>>>>>>>     #define HPTE_USE_KERNEL_KEY	0x4
>>>>>>>     
>>>>>>> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>>>>>>>     extern int __hash_page_4K(unsigned long ea, unsigned long access,
>>>>>>>     			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>>>>>>>     			  unsigned long flags, int ssize, int subpage_prot);
>>>>>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>>>>>> index 6e53f7638737..bcb5e81d2088 100644
>>>>>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>>>>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>>>>>> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>>>>>      *
>>>>>>>      * Handling:
>>>>>>>      * - Hash MMU
>>>>>>> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
>>>>>>> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>>>> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
>>>>>>> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>>>>>>>      *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>>>>>>>      *   "non-bolted" regions, e.g., vmalloc space. However these should always be
>>>>>>> - *   backed by Linux page tables.
>>>>>>> + *   backed by Linux page table entries.
>>>>>>>      *
>>>>>>> - *   If none is found, do a Linux page fault. Linux page faults can happen in
>>>>>>> - *   kernel mode due to user copy operations of course.
>>>>>>> + *   If no entry is found the Linux page fault handler is invoked (by
>>>>>>> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
>>>>>>> + *   copy operations of course.
>>>>>>>      *
>>>>>>>      *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>>>>>>>      *   MMU context, which may cause a DSI in the host, which must go to the
>>>>>>> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>>>>>>>     	GEN_COMMON data_access
>>>>>>>     	ld	r4,_DAR(r1)
>>>>>>>     	ld	r5,_DSISR(r1)
>>>>>>
>>>>>> We have DSISR here. I think the dispatch between page fault or do_break() should be done here:
>>>>>> - It would be more similar to other arches
>>>>>
>>>>> Other sub-archs?
>>>>>
>>>>>> - Would avoid doing it also in instruction fault
>>>>>
>>>>> True but it's hidden under an unlikely branch so won't really help
>>>>> instruction fault.
>>>>>
>>>>>> - Would avoid that -1 return which looks more like a hack.
>>>>>
>>>>> I don't really see it as a hack, we return a code to asm caller to
>>>>> direct whether to restore registers or not, we alrady have this
>>>>> pattern.
>>>>>
>>>>> (I'm hoping all that might be go away one day by conrolling NV
>>>>> regs from C if we can get good code generation but even if not we
>>>>> still have it in the interrupt returns).
>>>>>
>>>>> That said I will give it a try here. At very least it might be a
>>>>> better intermediate step.
>>>>
>>>> Ah yes, this way doesn't work well for later patches because you end
>>>> e.g., with the do_break call having to call the interrupt handler
>>>> wrappers again when they actually expect to be in the asm entry state
>>>> (e.g., irq soft-mask state) when called, and return via interrupt_return
>>>> after the exit wrapper runs (which 64s uses to implement better context
>>>> tracking for example).
>>>>
>>>> That could possibly be hacked up to deal with multiple interrupt
>>>> wrappers per interrupt, but I'd rather not go backwards.
>>>>
>>>> That does leave the other sub archs as having this issue, but they don't
>>>> do so much in their handlers. 32 doesn't have soft-mask or context
>>>> tracking to deal with for example. We will need to fix this up though
>>>> and unify things more.
>>>>
>>>
>>> Not sure I understand what you mean exactly.
>>>
>>> On the 8xx, do_break() is called by totally different exceptions:
>>> - Exception 0x1c00 Data breakpoint ==> do_break()
>>> - Exception 0x1300 Instruction TLB error ==> handle_page_fault()
>>> - Exception 0x1400 Data TLB error ==> handle_page_fault()
>>>
>>> On book3s/32, we now (after my patch ie patch 1 in your series ) have either do_break() or
>>> handle_page_fault() being called from very early in ASM.
>>>
>>> If you do the same in book3s/64, then there is no issue with interrupt wrappers being called twice,
>>> is it ?
>> 
>> bad_page_fault is the problem, it has to go afterwards.
>> 
>> Once we have the changed 64s behaviour of do_page_fault, I don't know if
>> there is any point leaving do_break in asm is there? I guess it is neat
>> to treat it quite separately, I might need to count fast path branches...
>> I have done the split anyway already, so I will post your way first.
>> 
> 
> As far as I understand, not taken unlikely branches are costless (at least on book3s/32), so you 
> would only suffer the cost of the logical 'and.' on the value of DSISR that you already have in a 
> register. Should be in the noise.
> 
> bad_page_fault() is not in the fast path anymore since we now handle the exception fixup at the end 
> of do_page_fault(). So I think it shouldn't be a concern to call the wrapper again for bad_page_fault()

It's not performance but correctness. For example we can have interrupts 
enabled again at this time, which the interrupt wrapper does not expect.
Or the context tracking code in the entry wrapper would break if it's
called again before interrupt_return.

I think it's not too ugly to put bad_page_fault in C and having do_break 
in asm still avoids a lot of your complaints.

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v4 01/21] ibmvfc: add vhost fields and defaults for MQ enablement
From: Ming Lei @ 2021-01-15  1:47 UTC (permalink / raw)
  To: Brian King
  Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel,
	james.smart, james.bottomley, brking, linuxppc-dev
In-Reply-To: <9c5f7786-cd13-6a49-2d71-d0c438318bcb@linux.vnet.ibm.com>

On Thu, Jan 14, 2021 at 11:24:35AM -0600, Brian King wrote:
> On 1/13/21 7:27 PM, Ming Lei wrote:
> > On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote:
> >> On 1/12/21 6:33 PM, Tyrel Datwyler wrote:
> >>> On 1/12/21 2:54 PM, Brian King wrote:
> >>>> On 1/11/21 5:12 PM, Tyrel Datwyler wrote:
> >>>>> Introduce several new vhost fields for managing MQ state of the adapter
> >>>>> as well as initial defaults for MQ enablement.
> >>>>>
> >>>>> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
> >>>>> ---
> >>>>>  drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++
> >>>>>  drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++
> >>>>>  2 files changed, 17 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> >>>>> index ba95438a8912..9200fe49c57e 100644
> >>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> >>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> >>>>> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = {
> >>>>>  	.max_sectors = IBMVFC_MAX_SECTORS,
> >>>>>  	.shost_attrs = ibmvfc_attrs,
> >>>>>  	.track_queue_depth = 1,
> >>>>> +	.host_tagset = 1,
> >>>>
> >>>> This doesn't seem right. You are setting host_tagset, which means you want a
> >>>> shared, host wide, tag set for commands. It also means that the total
> >>>> queue depth for the host is can_queue. However, it looks like you are allocating
> >>>> max_requests events for each sub crq, which means you are over allocating memory.
> >>>
> >>> With the shared tagset yes the queue depth for the host is can_queue, but this
> >>> also implies that the max queue depth for each hw queue is also can_queue. So,
> >>> in the worst case that all commands are queued down the same hw queue we need an
> >>> event pool with can_queue commands.
> >>>
> >>>>
> >>>> Looking at this closer, we might have bigger problems. There is a host wide
> >>>> max number of commands that the VFC host supports, which gets returned on
> >>>> NPIV Login. This value can change across a live migration event.
> >>>
> >>> From what I understand the max commands can only become less.
> >>>
> >>>>
> >>>> The ibmvfc driver, which does the same thing the lpfc driver does, modifies
> >>>> can_queue on the scsi_host *after* the tag set has been allocated. This looks
> >>>> to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like
> >>>> we look at can_queue once the tag set is setup, and I'm not seeing a good way
> >>>> to dynamically change the host queue depth once the tag set is setup. 
> >>>>
> >>>> Unless I'm missing something, our best options appear to either be to implement
> >>>> our own host wide busy reference counting, which doesn't sound very good, or
> >>>> we need to add some API to block / scsi that allows us to dynamically change
> >>>> can_queue.
> >>>
> >>> Changing can_queue won't do use any good with the shared tagset becasue each
> >>> queue still needs to be able to queue can_queue number of commands in the worst
> >>> case.
> >>
> >> The issue I'm trying to highlight here is the following scenario:
> >>
> >> 1. We set shost->can_queue, then call scsi_add_host, which allocates the tag set.
> >>
> >> 2. On our NPIV login response from the VIOS, we might get a lower value than we
> >> initially set in shost->can_queue, so we update it, but nobody ever looks at it
> >> again, and we don't have any protection against sending too many commands to the host.
> >>
> >>
> >> Basically, we no longer have any code that ensures we don't send more
> >> commands to the VIOS than we are told it supports. According to the architecture,
> >> if we actually do this, the VIOS will do an h_free_crq, which would be a bit
> >> of a bug on our part.
> >>
> >> I don't think it was ever clearly defined in the API that a driver can
> >> change shost->can_queue after calling scsi_add_host, but up until
> >> commit 6eb045e092efefafc6687409a6fa6d1dabf0fb69, this worked and now
> >> it doesn't. 
> > 
> > Actually it isn't related with commit 6eb045e092ef, because blk_mq_alloc_tag_set()
> > uses .can_queue to create driver tag sbitmap and request pool.
> > 
> > So even thought without 6eb045e092ef, the updated .can_queue can't work
> > as expected because the max driver tag depth has been fixed by blk-mq already.
> 
> There are two scenarios here. In the scenario of someone increasing can_queue
> after the tag set is allocated, I agree, blk-mq will never take advantage
> of this. However, in the scenario of someone *decreasing* can_queue after the
> tag set is allocated, prior to 6eb045e092ef, the shost->host_busy code provided
> this protection.

When .can_queue is decreased, blk-mq still may allocate driver tag which is >
.can_queue, this way might break driver/device too, but it depends on how driver
uses req->tag.

> 
> > 
> > What 6eb045e092ef does is just to remove the double check on max
> > host-wide allowed commands because that has been respected by blk-mq
> > driver tag allocation already.
> > 
> >>
> >> I started looking through drivers that do this, and so far, it looks like the
> >> following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely others...
> >>
> >> We probably need an API that lets us change shost->can_queue dynamically.
> > 
> > I'd suggest to confirm changing .can_queue is one real usecase.
> 
> For ibmvfc, the total number of commands that the scsi host supports is very
> much a dynamic value. It can increase and it can decrease. Live migrating
> a logical partition from one system to another is the usual cause of
> such a capability change. For ibmvfc, at least, this only ever happens
> when we've self blocked the host and have sent back all outstanding I/O.

This one looks a good use case, and the new API may have to freeze request
queues of all LUNs, and the operation is very expensive and slow. 

> 
> However, looking at other drivers that modify can_queue dynamically, this
> doesn't always hold true. Looking at libfc, it looks to dynamically ramp
> up and ramp down can_queue based on its ability to handle requests.

This one looks hard to use the new API which isn't supposed to be called
in fast path. And changing host wide resource is really not good in fast
path, IMO.

> 
> There are certainly a number of other drivers that change can_queue
> after the tag set has been allocated. Some of these drivers could
> likely be changed to avoid doing this, but changing them all will likely
> be difficult.

It is still better to understand why these drivers have to update
.can_queue dynamically.


Thanks,
Ming


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox