public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4
@ 2023-03-29 13:47 Lu Baolu
  2023-03-29 13:47 ` [PATCH 1/3] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc() Lu Baolu
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Lu Baolu @ 2023-03-29 13:47 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kan Liang, Lu Baolu, iommu, linux-kernel

Hi Joerg,

Some fixes have been queued for v6.3. They aim to:

- Fix a lockdep splat
- Fix a supplement of the specification
- Fix a warning in perfmon code

This series is also available at github.
https://github.com/LuBaolu/intel-iommu/commits/vtd-fix-for-v6.3-rc4

Please consider them for the iommu/fix branch.

Best regards,
Lu Baolu

Kan Liang (1):
  iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug

Lu Baolu (2):
  iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()
  iommu/vt-d: Allow zero SAGAW if second-stage not supported

 include/linux/cpuhotplug.h          |  1 -
 drivers/iommu/intel/iommu.h         |  2 +
 drivers/iommu/intel/dmar.c          |  3 +-
 drivers/iommu/intel/irq_remapping.c |  6 ---
 drivers/iommu/intel/perfmon.c       | 68 +++++++++++++++++++----------
 5 files changed, 48 insertions(+), 32 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()
  2023-03-29 13:47 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Lu Baolu
@ 2023-03-29 13:47 ` Lu Baolu
  2023-03-29 13:47 ` [PATCH 2/3] iommu/vt-d: Allow zero SAGAW if second-stage not supported Lu Baolu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2023-03-29 13:47 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kan Liang, Lu Baolu, iommu, linux-kernel

The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.

Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
the DMAR global data structures are not touched there. Remove it to avoid
below lockdep warning.

 ======================================================
 WARNING: possible circular locking dependency detected
 6.3.0-rc2 #468 Not tainted
 ------------------------------------------------------
 swapper/0/1 is trying to acquire lock:
 ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
   at: __irq_domain_alloc_irqs+0x3b/0xa0

 but task is already holding lock:
 ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
   at: intel_iommu_init+0x58e/0x880

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (dmar_global_lock){++++}-{3:3}:
        lock_acquire+0xd6/0x320
        down_read+0x42/0x180
        intel_irq_remapping_alloc+0xad/0x750
        mp_irqdomain_alloc+0xb8/0x2b0
        irq_domain_alloc_irqs_locked+0x12f/0x2d0
        __irq_domain_alloc_irqs+0x56/0xa0
        alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
        mp_map_pin_to_irq+0x1dc/0x330
        setup_IO_APIC+0x128/0x210
        apic_intr_mode_init+0x67/0x110
        x86_late_time_init+0x24/0x40
        start_kernel+0x41e/0x7e0
        secondary_startup_64_no_verify+0xe0/0xeb

 -> #0 (&domain->mutex){+.+.}-{3:3}:
        check_prevs_add+0x160/0xef0
        __lock_acquire+0x147d/0x1950
        lock_acquire+0xd6/0x320
        __mutex_lock+0x9c/0xfc0
        __irq_domain_alloc_irqs+0x3b/0xa0
        dmar_alloc_hwirq+0x9e/0x120
        iommu_pmu_register+0x11d/0x200
        intel_iommu_init+0x5de/0x880
        pci_iommu_init+0x12/0x40
        do_one_initcall+0x65/0x350
        kernel_init_freeable+0x3ca/0x610
        kernel_init+0x1a/0x140
        ret_from_fork+0x29/0x50

 other info that might help us debug this:

 Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(dmar_global_lock);
                                lock(&domain->mutex);
                                lock(dmar_global_lock);
   lock(&domain->mutex);

                *** DEADLOCK ***

Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230314051836.23817-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/irq_remapping.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 6d01fa078c36..df9e261af0b5 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int apic)
 	if (!irte)
 		return -1;
 
-	down_read(&dmar_global_lock);
 	for (i = 0; i < MAX_IO_APICS; i++) {
 		if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
 			sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn;
 			break;
 		}
 	}
-	up_read(&dmar_global_lock);
 
 	if (sid == 0) {
 		pr_warn("Failed to set source-id of IOAPIC (%d)\n", apic);
@@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte, u8 id)
 	if (!irte)
 		return -1;
 
-	down_read(&dmar_global_lock);
 	for (i = 0; i < MAX_HPET_TBS; i++) {
 		if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
 			sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
 			break;
 		}
 	}
-	up_read(&dmar_global_lock);
 
 	if (sid == 0) {
 		pr_warn("Failed to set source-id of HPET block (%d)\n", id);
@@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 	if (!data)
 		goto out_free_parent;
 
-	down_read(&dmar_global_lock);
 	index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
-	up_read(&dmar_global_lock);
 	if (index < 0) {
 		pr_warn("Failed to allocate IRTE\n");
 		kfree(data);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] iommu/vt-d: Allow zero SAGAW if second-stage not supported
  2023-03-29 13:47 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Lu Baolu
  2023-03-29 13:47 ` [PATCH 1/3] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc() Lu Baolu
@ 2023-03-29 13:47 ` Lu Baolu
  2023-03-29 13:47 ` [PATCH 3/3] iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug Lu Baolu
  2023-03-31  8:06 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2023-03-29 13:47 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kan Liang, Lu Baolu, iommu, linux-kernel

The VT-d spec states (in section 11.4.2) that hardware implementations
reporting second-stage translation support (SSTS) field as Clear also
report the SAGAW field as 0. Fix an inappropriate check in alloc_iommu().

Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Suggested-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230318024824.124542-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/dmar.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 6acfe879589c..23828d189c2a 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1071,7 +1071,8 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
 	}
 
 	err = -EINVAL;
-	if (cap_sagaw(iommu->cap) == 0) {
+	if (!cap_sagaw(iommu->cap) &&
+	    (!ecap_smts(iommu->ecap) || ecap_slts(iommu->ecap))) {
 		pr_info("%s: No supported address widths. Not attempting DMA translation.\n",
 			iommu->name);
 		drhd->ignored = 1;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug
  2023-03-29 13:47 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Lu Baolu
  2023-03-29 13:47 ` [PATCH 1/3] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc() Lu Baolu
  2023-03-29 13:47 ` [PATCH 2/3] iommu/vt-d: Allow zero SAGAW if second-stage not supported Lu Baolu
@ 2023-03-29 13:47 ` Lu Baolu
  2023-03-31  8:06 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Lu Baolu @ 2023-03-29 13:47 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Kan Liang, Lu Baolu, iommu, linux-kernel

From: Kan Liang <kan.liang@linux.intel.com>

A warning can be triggered when hotplug CPU 0.
$ echo 0 > /sys/devices/system/cpu/cpu0/online

 ------------[ cut here ]------------
 Voluntary context switch within RCU read-side critical section!
 WARNING: CPU: 0 PID: 19 at kernel/rcu/tree_plugin.h:318
          rcu_note_context_switch+0x4f4/0x580
 RIP: 0010:rcu_note_context_switch+0x4f4/0x580
 Call Trace:
  <TASK>
  ? perf_event_update_userpage+0x104/0x150
  __schedule+0x8d/0x960
  ? perf_event_set_state.part.82+0x11/0x50
  schedule+0x44/0xb0
  schedule_timeout+0x226/0x310
  ? __perf_event_disable+0x64/0x1a0
  ? _raw_spin_unlock+0x14/0x30
  wait_for_completion+0x94/0x130
  __wait_rcu_gp+0x108/0x130
  synchronize_rcu+0x67/0x70
  ? invoke_rcu_core+0xb0/0xb0
  ? __bpf_trace_rcu_stall_warning+0x10/0x10
  perf_pmu_migrate_context+0x121/0x370
  iommu_pmu_cpu_offline+0x6a/0xa0
  ? iommu_pmu_del+0x1e0/0x1e0
  cpuhp_invoke_callback+0x129/0x510
  cpuhp_thread_fun+0x94/0x150
  smpboot_thread_fn+0x183/0x220
  ? sort_range+0x20/0x20
  kthread+0xe6/0x110
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>
 ---[ end trace 0000000000000000 ]---

The synchronize_rcu() will be invoked in the perf_pmu_migrate_context(),
when migrating a PMU to a new CPU. However, the current for_each_iommu()
is within RCU read-side critical section.

Two methods were considered to fix the issue.
- Use the dmar_global_lock to replace the RCU read lock when going
  through the drhd list. But it triggers a lockdep warning.
- Use the cpuhp_setup_state_multi() to set up a dedicated state for each
  IOMMU PMU. The lock can be avoided.

The latter method is implemented in this patch. Since each IOMMU PMU has
a dedicated state, add cpuhp_node and cpu in struct iommu_pmu to track
the state. The state can be dynamically allocated now. Remove the
CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE.

Fixes: 46284c6ceb5e ("iommu/vt-d: Support cpumask for IOMMU perfmon")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230328182028.1366416-1-kan.liang@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/cpuhotplug.h    |  1 -
 drivers/iommu/intel/iommu.h   |  2 ++
 drivers/iommu/intel/perfmon.c | 68 ++++++++++++++++++++++-------------
 3 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index c6fab004104a..5b2f8147d1ae 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -218,7 +218,6 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_X86_CQM_ONLINE,
 	CPUHP_AP_PERF_X86_CSTATE_ONLINE,
 	CPUHP_AP_PERF_X86_IDXD_ONLINE,
-	CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE,
 	CPUHP_AP_PERF_S390_CF_ONLINE,
 	CPUHP_AP_PERF_S390_SF_ONLINE,
 	CPUHP_AP_PERF_ARM_CCI_ONLINE,
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index d6df3b865812..694ab9b7d3e9 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -641,6 +641,8 @@ struct iommu_pmu {
 	DECLARE_BITMAP(used_mask, IOMMU_PMU_IDX_MAX);
 	struct perf_event	*event_list[IOMMU_PMU_IDX_MAX];
 	unsigned char		irq_name[16];
+	struct hlist_node	cpuhp_node;
+	int			cpu;
 };
 
 #define IOMMU_IRQ_ID_OFFSET_PRQ		(DMAR_UNITS_SUPPORTED)
diff --git a/drivers/iommu/intel/perfmon.c b/drivers/iommu/intel/perfmon.c
index e17d9743a0d8..cf43e798eca4 100644
--- a/drivers/iommu/intel/perfmon.c
+++ b/drivers/iommu/intel/perfmon.c
@@ -773,19 +773,34 @@ static void iommu_pmu_unset_interrupt(struct intel_iommu *iommu)
 	iommu->perf_irq = 0;
 }
 
-static int iommu_pmu_cpu_online(unsigned int cpu)
+static int iommu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 {
+	struct iommu_pmu *iommu_pmu = hlist_entry_safe(node, typeof(*iommu_pmu), cpuhp_node);
+
 	if (cpumask_empty(&iommu_pmu_cpu_mask))
 		cpumask_set_cpu(cpu, &iommu_pmu_cpu_mask);
 
+	if (cpumask_test_cpu(cpu, &iommu_pmu_cpu_mask))
+		iommu_pmu->cpu = cpu;
+
 	return 0;
 }
 
-static int iommu_pmu_cpu_offline(unsigned int cpu)
+static int iommu_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
 {
-	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
-	int target;
+	struct iommu_pmu *iommu_pmu = hlist_entry_safe(node, typeof(*iommu_pmu), cpuhp_node);
+	int target = cpumask_first(&iommu_pmu_cpu_mask);
+
+	/*
+	 * The iommu_pmu_cpu_mask has been updated when offline the CPU
+	 * for the first iommu_pmu. Migrate the other iommu_pmu to the
+	 * new target.
+	 */
+	if (target < nr_cpu_ids && target != iommu_pmu->cpu) {
+		perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target);
+		iommu_pmu->cpu = target;
+		return 0;
+	}
 
 	if (!cpumask_test_and_clear_cpu(cpu, &iommu_pmu_cpu_mask))
 		return 0;
@@ -795,45 +810,50 @@ static int iommu_pmu_cpu_offline(unsigned int cpu)
 	if (target < nr_cpu_ids)
 		cpumask_set_cpu(target, &iommu_pmu_cpu_mask);
 	else
-		target = -1;
+		return 0;
 
-	rcu_read_lock();
-
-	for_each_iommu(iommu, drhd) {
-		if (!iommu->pmu)
-			continue;
-		perf_pmu_migrate_context(&iommu->pmu->pmu, cpu, target);
-	}
-	rcu_read_unlock();
+	perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target);
+	iommu_pmu->cpu = target;
 
 	return 0;
 }
 
 static int nr_iommu_pmu;
+static enum cpuhp_state iommu_cpuhp_slot;
 
 static int iommu_pmu_cpuhp_setup(struct iommu_pmu *iommu_pmu)
 {
 	int ret;
 
-	if (nr_iommu_pmu++)
-		return 0;
+	if (!nr_iommu_pmu) {
+		ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+					      "driver/iommu/intel/perfmon:online",
+					      iommu_pmu_cpu_online,
+					      iommu_pmu_cpu_offline);
+		if (ret < 0)
+			return ret;
+		iommu_cpuhp_slot = ret;
+	}
 
-	ret = cpuhp_setup_state(CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE,
-				"driver/iommu/intel/perfmon:online",
-				iommu_pmu_cpu_online,
-				iommu_pmu_cpu_offline);
-	if (ret)
-		nr_iommu_pmu = 0;
+	ret = cpuhp_state_add_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node);
+	if (ret) {
+		if (!nr_iommu_pmu)
+			cpuhp_remove_multi_state(iommu_cpuhp_slot);
+		return ret;
+	}
+	nr_iommu_pmu++;
 
-	return ret;
+	return 0;
 }
 
 static void iommu_pmu_cpuhp_free(struct iommu_pmu *iommu_pmu)
 {
+	cpuhp_state_remove_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node);
+
 	if (--nr_iommu_pmu)
 		return;
 
-	cpuhp_remove_state(CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE);
+	cpuhp_remove_multi_state(iommu_cpuhp_slot);
 }
 
 void iommu_pmu_register(struct intel_iommu *iommu)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4
  2023-03-29 13:47 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Lu Baolu
                   ` (2 preceding siblings ...)
  2023-03-29 13:47 ` [PATCH 3/3] iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug Lu Baolu
@ 2023-03-31  8:06 ` Joerg Roedel
  3 siblings, 0 replies; 5+ messages in thread
From: Joerg Roedel @ 2023-03-31  8:06 UTC (permalink / raw)
  To: Lu Baolu; +Cc: Kan Liang, iommu, linux-kernel

On Wed, Mar 29, 2023 at 09:47:18PM +0800, Lu Baolu wrote:
> Kan Liang (1):
>   iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug
> 
> Lu Baolu (2):
>   iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()
>   iommu/vt-d: Allow zero SAGAW if second-stage not supported

Queued to the fixes branch, thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-31  8:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-29 13:47 [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Lu Baolu
2023-03-29 13:47 ` [PATCH 1/3] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc() Lu Baolu
2023-03-29 13:47 ` [PATCH 2/3] iommu/vt-d: Allow zero SAGAW if second-stage not supported Lu Baolu
2023-03-29 13:47 ` [PATCH 3/3] iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug Lu Baolu
2023-03-31  8:06 ` [PATCH 0/3] [PULL REQUEST] iommu/vt-d: Fixes for v6.3-rc4 Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox