[PATCH 13/42] lpfc: Fix oops when fewer hdwqs than cpus

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: James Smart <jsmart2021@gmail.com>
To: linux-scsi@vger.kernel.org
Cc: James Smart <jsmart2021@gmail.com>,
	Dick Kennedy <dick.kennedy@broadcom.com>
Subject: [PATCH 13/42] lpfc: Fix oops when fewer hdwqs than cpus
Date: Wed, 14 Aug 2019 16:56:43 -0700	[thread overview]
Message-ID: <20190814235712.4487-14-jsmart2021@gmail.com> (raw)
In-Reply-To: <20190814235712.4487-1-jsmart2021@gmail.com>

When tearing down the adapter for a reset, online/offline, or driver
unload, the queue free routine would hit a GPF oops.  This only
occurs on conditions where the number of hardware queues created is
fewer than the number of cpus in the system. In this condition cpus
share a hardware queue. And of course, it's the 2nd cpu that shares
a hardware that attempted to free it a second time and hit the oops.

Fix by reworking the cpu to hardware queue mapping such that:
Assignment of hardware queues to cpus occur in two passes:
first pass: is first time assignment of a hardware queue to a cpu.
  This will set the LPFC_CPU_FIRST_IRQ flag for the cpu.
second pass: for cpus that did not get a hardware queue they will
  be assigned one from a primary cpu (one set in first pass).

Deletion of hardware queues is driven by cpu itteration, and queues
will only be deleted if the LPFC_CPU_FIRST_IRQ flag is set.

Also contains a few small cleanup fixes and a little better logging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 144 ++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 55 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 72353c9c0fa9..fcc1c45f2d35 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -8878,7 +8878,7 @@ lpfc_sli4_queue_create(struct lpfc_hba *phba)
 		}
 		qdesc->qe_valid = 1;
 		qdesc->hdwq = cpup->hdwq;
-		qdesc->chann = cpu; /* First CPU this EQ is affinitised to */
+		qdesc->chann = cpu; /* First CPU this EQ is affinitized to */
 		qdesc->last_cpu = qdesc->chann;
 
 		/* Save the allocated EQ in the Hardware Queue */
@@ -10725,7 +10725,7 @@ lpfc_find_hyper(struct lpfc_hba *phba, int cpu,
 static void
 lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 {
-	int i, cpu, idx, new_cpu, start_cpu, first_cpu;
+	int i, cpu, idx, next_idx, new_cpu, start_cpu, first_cpu;
 	int max_phys_id, min_phys_id;
 	int max_core_id, min_core_id;
 	struct lpfc_vector_map_info *cpup;
@@ -10767,8 +10767,8 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 #endif
 
 		lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
-				"3328 CPU physid %d coreid %d\n",
-				cpup->phys_id, cpup->core_id);
+				"3328 CPU %d physid %d coreid %d flag x%x\n",
+				cpu, cpup->phys_id, cpup->core_id, cpup->flag);
 
 		if (cpup->phys_id > max_phys_id)
 			max_phys_id = cpup->phys_id;
@@ -10826,17 +10826,17 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 			cpup->eq = idx;
 			cpup->irq = pci_irq_vector(phba->pcidev, idx);
 
-			lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
-					"3336 Set Affinity: CPU %d "
-					"irq %d eq %d\n",
-					cpu, cpup->irq, cpup->eq);
-
 			/* If this is the first CPU thats assigned to this
 			 * vector, set LPFC_CPU_FIRST_IRQ.
 			 */
 			if (!i)
 				cpup->flag |= LPFC_CPU_FIRST_IRQ;
 			i++;
+
+			lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
+					"3336 Set Affinity: CPU %d "
+					"irq %d eq %d flag x%x\n",
+					cpu, cpup->irq, cpup->eq, cpup->flag);
 		}
 	}
 
@@ -10950,69 +10950,103 @@ lpfc_cpu_affinity_check(struct lpfc_hba *phba, int vectors)
 		}
 	}
 
+	/* Assign hdwq indices that are unique across all cpus in the map
+	 * that are also FIRST_CPUs.
+	 */
+	idx = 0;
+	for_each_present_cpu(cpu) {
+		cpup = &phba->sli4_hba.cpu_map[cpu];
+
+		/* Only FIRST IRQs get a hdwq index assignment. */
+		if (!(cpup->flag & LPFC_CPU_FIRST_IRQ))
+			continue;
+
+		/* 1 to 1, the first LPFC_CPU_FIRST_IRQ cpus to a unique hdwq */
+		cpup->hdwq = idx;
+		idx++;
+		lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
+				"3333 Set Affinity: CPU %d (phys %d core %d): "
+				"hdwq %d eq %d irq %d flg x%x\n",
+				cpu, cpup->phys_id, cpup->core_id,
+				cpup->hdwq, cpup->eq, cpup->irq, cpup->flag);
+	}
 	/* Finally we need to associate a hdwq with each cpu_map entry
 	 * This will be 1 to 1 - hdwq to cpu, unless there are less
 	 * hardware queues then CPUs. For that case we will just round-robin
 	 * the available hardware queues as they get assigned to CPUs.
+	 * The next_idx is the idx from the FIRST_CPU loop above to account
+	 * for irq_chann < hdwq.  The idx is used for round-robin assignments
+	 * and needs to start at 0.
 	 */
-	idx = 0;
+	next_idx = idx;
 	start_cpu = 0;
+	idx = 0;
 	for_each_present_cpu(cpu) {
 		cpup = &phba->sli4_hba.cpu_map[cpu];
-		if (idx >=  phba->cfg_hdw_queue) {
-			/* We need to reuse a Hardware Queue for another CPU,
-			 * so be smart about it and pick one that has its
-			 * IRQ/EQ mapped to the same phys_id (CPU package).
-			 * and core_id.
-			 */
-			new_cpu = start_cpu;
-			for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
-				new_cpup = &phba->sli4_hba.cpu_map[new_cpu];
-				if ((new_cpup->hdwq != LPFC_VECTOR_MAP_EMPTY) &&
-				    (new_cpup->phys_id == cpup->phys_id) &&
-				    (new_cpup->core_id == cpup->core_id))
-					goto found_hdwq;
-				new_cpu = cpumask_next(
-					new_cpu, cpu_present_mask);
-				if (new_cpu == nr_cpumask_bits)
-					new_cpu = first_cpu;
-			}
 
-			/* If we can't match both phys_id and core_id,
-			 * settle for just a phys_id match.
-			 */
-			new_cpu = start_cpu;
-			for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
-				new_cpup = &phba->sli4_hba.cpu_map[new_cpu];
-				if ((new_cpup->hdwq != LPFC_VECTOR_MAP_EMPTY) &&
-				    (new_cpup->phys_id == cpup->phys_id))
-					goto found_hdwq;
-				new_cpu = cpumask_next(
-					new_cpu, cpu_present_mask);
-				if (new_cpu == nr_cpumask_bits)
-					new_cpu = first_cpu;
+		/* FIRST cpus are already mapped. */
+		if (cpup->flag & LPFC_CPU_FIRST_IRQ)
+			continue;
+
+		/* If the cfg_irq_chann < cfg_hdw_queue, set the hdwq
+		 * of the unassigned cpus to the next idx so that all
+		 * hdw queues are fully utilized.
+		 */
+		if (next_idx < phba->cfg_hdw_queue) {
+			cpup->hdwq = next_idx;
+			next_idx++;
+			continue;
+		}
+
+		/* Not a First CPU and all hdw_queues are used.  Reuse a
+		 * Hardware Queue for another CPU, so be smart about it
+		 * and pick one that has its IRQ/EQ mapped to the same phys_id
+		 * (CPU package) and core_id.
+		 */
+		new_cpu = start_cpu;
+		for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
+			new_cpup = &phba->sli4_hba.cpu_map[new_cpu];
+			if (new_cpup->hdwq != LPFC_VECTOR_MAP_EMPTY &&
+			    new_cpup->phys_id == cpup->phys_id &&
+			    new_cpup->core_id == cpup->core_id) {
+				goto found_hdwq;
 			}
+			new_cpu = cpumask_next(new_cpu, cpu_present_mask);
+			if (new_cpu == nr_cpumask_bits)
+				new_cpu = first_cpu;
+		}
 
-			/* Otherwise just round robin on cfg_hdw_queue */
-			cpup->hdwq = idx % phba->cfg_hdw_queue;
-			goto logit;
-found_hdwq:
-			/* We found an available entry, copy the IRQ info */
-			start_cpu = cpumask_next(new_cpu, cpu_present_mask);
-			if (start_cpu == nr_cpumask_bits)
-				start_cpu = first_cpu;
-			cpup->hdwq = new_cpup->hdwq;
-		} else {
-			/* 1 to 1, CPU to hdwq */
-			cpup->hdwq = idx;
+		/* If we can't match both phys_id and core_id,
+		 * settle for just a phys_id match.
+		 */
+		new_cpu = start_cpu;
+		for (i = 0; i < phba->sli4_hba.num_present_cpu; i++) {
+			new_cpup = &phba->sli4_hba.cpu_map[new_cpu];
+			if (new_cpup->hdwq != LPFC_VECTOR_MAP_EMPTY &&
+			    new_cpup->phys_id == cpup->phys_id)
+				goto found_hdwq;
+
+			new_cpu = cpumask_next(new_cpu, cpu_present_mask);
+			if (new_cpu == nr_cpumask_bits)
+				new_cpu = first_cpu;
 		}
-logit:
+
+		/* Otherwise just round robin on cfg_hdw_queue */
+		cpup->hdwq = idx % phba->cfg_hdw_queue;
+		idx++;
+		goto logit;
+ found_hdwq:
+		/* We found an available entry, copy the IRQ info */
+		start_cpu = cpumask_next(new_cpu, cpu_present_mask);
+		if (start_cpu == nr_cpumask_bits)
+			start_cpu = first_cpu;
+		cpup->hdwq = new_cpup->hdwq;
+ logit:
 		lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
 				"3335 Set Affinity: CPU %d (phys %d core %d): "
 				"hdwq %d eq %d irq %d flg x%x\n",
 				cpu, cpup->phys_id, cpup->core_id,
 				cpup->hdwq, cpup->eq, cpup->irq, cpup->flag);
-		idx++;
 	}
 
 	/* The cpu_map array will be used later during initialization
-- 
2.13.7

next prev parent reply	other threads:[~2019-08-14 23:57 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-14 23:56 [PATCH 00/42] lpfc: Update lpfc to revision 12.4.0.0 James Smart
2019-08-14 23:56 ` [PATCH 01/42] lpfc: Limit xri count for kdump environment James Smart
2019-08-14 23:56 ` [PATCH 02/42] lpfc: Fix PLOGI failure with high remoteport count James Smart
2019-08-14 23:56 ` [PATCH 03/42] lpfc: Fix ELS field alignments James Smart
2019-08-14 23:56 ` [PATCH 04/42] lpfc: Fix crash on driver unload in wq free James Smart
2019-08-14 23:56 ` [PATCH 05/42] lpfc: Fix failure to clear non-zero eq_delay after io rate reduction James Smart
2019-08-14 23:56 ` [PATCH 06/42] lpfc: Fix leak of ELS completions on adapter reset James Smart
2019-08-14 23:56 ` [PATCH 07/42] lpfc: Fix port relogin failure due to GID_FT interaction James Smart
2019-08-14 23:56 ` [PATCH 08/42] lpfc: Fix discovery when target has no GID_FT information James Smart
2019-08-14 23:56 ` [PATCH 09/42] lpfc: Fix ADISC reception terminating login state if a NVME target James Smart
2019-08-14 23:56 ` [PATCH 10/42] lpfc: Fix issuing init_vpi mbox on SLI-3 card James Smart
2019-08-14 23:56 ` [PATCH 11/42] lpfc: Fix Oops in nvme_register with target logout/login James Smart
2019-08-14 23:56 ` [PATCH 12/42] lpfc: Fix irq raising in lpfc_sli_hba_down James Smart
2019-08-14 23:56 ` James Smart [this message]
2019-08-14 23:56 ` [PATCH 14/42] lpfc: Fix FLOGI handling across multiple link up/down conditions James Smart
2019-08-14 23:56 ` [PATCH 15/42] lpfc: Fix null ptr oops updating lpfc_devloss_tmo via sysfs attribute James Smart
2019-08-14 23:56 ` [PATCH 16/42] lpfc: Fix devices that don't return after devloss followed by rediscovery James Smart
2019-08-14 23:56 ` [PATCH 17/42] lpfc: Fix loss of remote port after devloss due to lack of RPIs James Smart
2019-08-14 23:56 ` [PATCH 18/42] lpfc: Fix propagation of devloss_tmo setting to nvme transport James Smart
2019-08-14 23:56 ` [PATCH 19/42] lpfc: Fix sg_seg_cnt for HBAs that don't support NVME James Smart
2019-08-14 23:56 ` [PATCH 20/42] lpfc: Fix driver nvme rescan logging James Smart
2019-08-14 23:56 ` [PATCH 21/42] lpfc: Fix error in remote port address change James Smart
2019-08-14 23:56 ` [PATCH 22/42] lpfc: Fix deadlock on host_lock during cable pulls James Smart
2019-08-14 23:56 ` [PATCH 23/42] lpfc: Fix crash due to port reset racing vs adapter error handling James Smart
2019-08-14 23:56 ` [PATCH 24/42] lpfc: Fix too many sg segments spamming in kernel log James Smart
2019-08-14 23:56 ` [PATCH 25/42] lpfc: Fix hang when downloading fw on port enabled for nvme James Smart
2019-08-14 23:56 ` [PATCH 26/42] lpfc: Fix nvme target mode ABTSing a received ABTS James Smart
2019-08-14 23:56 ` [PATCH 27/42] lpfc: Fix nvme sg_seg_cnt display if HBA does not support NVME James Smart
2019-08-14 23:56 ` [PATCH 28/42] lpfc: Fix sli4 adapter initialization with MSI James Smart
2019-08-14 23:56 ` [PATCH 29/42] lpfc: Fix upcall to bsg done in non-success cases James Smart
2019-08-14 23:57 ` [PATCH 30/42] lpfc: Fix Max Frame Size value shown in fdmishow output James Smart
2019-08-14 23:57 ` [PATCH 31/42] lpfc: Fix reported physical link speed on a disabled trunked link James Smart
2019-08-14 23:57 ` [PATCH 32/42] lpfc: Fix BlockGuard enablement on FCoE adapters James Smart
2019-08-14 23:57 ` [PATCH 33/42] lpfc: Fix nvme first burst module parameter description James Smart
2019-08-14 23:57 ` [PATCH 34/42] lpfc: Fix coverity warnings James Smart
2019-08-14 23:57 ` [PATCH 35/42] lpfc: Add simple unlikely optimizations to reduce NVME latency James Smart
2019-08-14 23:57 ` [PATCH 36/42] lpfc: Migrate to %px and %pf in kernel print calls James Smart
2019-08-14 23:57 ` [PATCH 37/42] lpfc: Add first and second level hardware revisions to sysfs reporting James Smart
2019-08-14 23:57 ` [PATCH 38/42] lpfc: Add MDS driver loopback diagnostics support James Smart
2019-08-14 23:57 ` [PATCH 39/42] lpfc: Support dynamic unbounded SGL lists on G7 hardware James Smart
2019-08-14 23:57 ` [PATCH 40/42] lpfc: Add NVMe sequence level error recovery support James Smart
2019-08-14 23:57 ` [PATCH 41/42] lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair James Smart
2019-08-14 23:57 ` [PATCH 42/42] lpfc: Update lpfc version to 12.4.0.0 James Smart
2019-08-20  3:06 ` [PATCH 00/42] lpfc: Update lpfc to revision 12.4.0.0 Martin K. Petersen
2019-08-27 13:31   ` Hannes Reinecke
2019-08-28  0:10     ` James Smart

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:72353c9c0fa dfblob:fcc1c45f2d3 )
 OR (
bs:"[PATCH 13/42] lpfc: Fix oops when fewer hdwqs than cpus" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190814235712.4487-14-jsmart2021@gmail.com \
    --to=jsmart2021@gmail.com \
    --cc=dick.kennedy@broadcom.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).