linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] crypto: iaa - Fix race condition when probing IAA devices
@ 2025-06-03 23:55 Vinicius Costa Gomes
  2025-06-11  2:18 ` Herbert Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Vinicius Costa Gomes @ 2025-06-03 23:55 UTC (permalink / raw)
  To: Kristen Accardi, Vinicius Costa Gomes, Herbert Xu,
	David S. Miller, Tom Zanussi, linux-crypto, linux-kernel

While adding new devices the per-cpu workqueue map are temporarily
cleared, if a user tries to use them at that point it may fail. That
is, code calling wq_table_next_wq() may race during probe when
rebalance_wq_table() is called, which clears before rebuilding the
workqueue map. Add a spinlock to protect serialize that.

Add some lockdep asserts to document when this new lock should be
held.

It was reported that running iaa-deflate selftests using a IAA device
while another is being probed could fail:

[   45.432348] ------------[ cut here ]------------
[   45.432350] alg: self-tests for deflate using deflate-iaa failed (rc=-19)
[   45.432372] WARNING: CPU: 43 PID: 1402 at crypto/testmgr.c:6026 alg_test.part.0+0x493/0x570
[   45.432387] Modules linked in: iaa_crypto(+) mlx5_core qat_4xxx sr_mod ast igb mlxfw intel_qat cdrom drm_shmem_helper pci_hyperv_intf idxd
[   45.432405]  dca dh_generic crc8 rpcrdma rdma_cm iw_cm ib_cm sunrpc ib_core
[   45.432426] CPU: 43 UID: 0 PID: 1402 Comm: cryptomgr_test Not tainted 6.14.0-cwf.bkc.6.14.4.4.6.x86_64 #1
[   45.432436] RIP: 0010:alg_test.part.0+0x493/0x570
[   45.432442] Code: ef 8b 48 28 48 8b 50 20 e8 2a f9 ff ff 41 89 c0 e9 10 fe ff ff 44 89 c1 4c 89 e2 4c 89 ee 48 c7 c7 b8 86 3d 83 e8 dd 90 88 ff <0f> 0b 44 8b 44 24 04 e9 a9 fc ff ff 44 8d 7b 01 e9 90 fe ff ff 48
[   45.432458] RSP: 0018:ffffc90024bb3de8 EFLAGS: 00010286
[   45.432464] RAX: 0000000000000000 RBX: 000000000000004c RCX: 0000000000000000
[   45.432469] RDX: 0000000000000002 RSI: ffffffff834ee543 RDI: 00000000ffffffff
[   45.432472] RBP: 000000000000004d R08: 0000000000003d0e R09: ffffffff833d86e6
[   45.432476] R10: 0000000000000001 R11: 0000000000000004 R12: ffff88a099be1000
[   45.432480] R13: ffff88a099be1080 R14: 000000000000004d R15: 0000000000001340
[   45.432483] FS:  0000000000000000(0000) GS:ffff88bfff8c0000(0000) knlGS:0000000000000000
[   45.432488] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   45.432492] CR2: 00007fd70ad9c4e0 CR3: 000000000ce36001 CR4: 0000000108f70ef0
[   45.432495] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   45.432497] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[   45.432501] PKRU: 55555554
[   45.432504] Call Trace:
[   45.432508]  <TASK>
[   45.432513]  ? __warn+0x81/0x130
[   45.432524]  ? alg_test.part.0+0x493/0x570
[   45.432529]  ? report_bug+0x1c7/0x1d0
[   45.432540]  ? handle_bug+0x53/0x90
[   45.432546]  ? exc_invalid_op+0x18/0x70
[   45.432550]  ? asm_fred_entrypoint_kernel+0x45/0x60
[   45.432562]  ? alg_test.part.0+0x493/0x570
[   45.432570]  ? alg_test.part.0+0x493/0x570
[   45.432574]  ? _raw_spin_unlock+0x18/0x40
[   45.432583]  ? finish_task_switch.isra.0+0x97/0x290
[   45.432594]  ? __schedule+0x2fc/0x7a0
[   45.432603]  ? preempt_count_add+0x6d/0xa0
[   45.432611]  ? __pfx_cryptomgr_test+0x10/0x10
[   45.432619]  cryptomgr_test+0x24/0x40
[   45.432628]  kthread+0x10f/0x250
[   45.432636]  ? finish_task_switch.isra.0+0x97/0x290
[   45.432644]  ? __pfx_kthread+0x10/0x10
[   45.432650]  ? __pfx_kthread+0x10/0x10
[   45.432648] initcall iaa_crypto_init_module+0x0/0x2b0 [iaa_crypto] returned 0 after 2388 usecs
[   45.432657]  ret_from_fork+0x31/0x50
[   45.432668]  ? __pfx_kthread+0x10/0x10
[   45.432675]  ret_from_fork_asm+0x1a/0x30
[   45.432685]  </TASK>
[   45.432688] ---[ end trace 0000000000000000 ]---

Fixes: 2ec6761df889 ("crypto: iaa - Add support for deflate-iaa compression algorithm")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 48 ++++++++++++++++++----
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 23f585219fb4..2185c101bef3 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -35,28 +35,39 @@ static unsigned int cpus_per_iaa;
 
 /* Per-cpu lookup table for balanced wqs */
 static struct wq_table_entry __percpu *wq_table;
+static DEFINE_SPINLOCK(wq_table_lock);
 
 static struct idxd_wq *wq_table_next_wq(int cpu)
 {
-	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
+	struct wq_table_entry *entry;
+	struct idxd_wq *wq;
+	int id;
+
+	guard(spinlock)(&wq_table_lock);
+
+	entry = per_cpu_ptr(wq_table, cpu);
 
 	if (++entry->cur_wq >= entry->n_wqs)
 		entry->cur_wq = 0;
 
-	if (!entry->wqs[entry->cur_wq])
+	id = entry->cur_wq;
+	wq = entry->wqs[id];
+
+	if (!wq)
 		return NULL;
 
 	pr_debug("%s: returning wq at idx %d (iaa wq %d.%d) from cpu %d\n", __func__,
-		 entry->cur_wq, entry->wqs[entry->cur_wq]->idxd->id,
-		 entry->wqs[entry->cur_wq]->id, cpu);
+		 id, wq->idxd->id, wq->id, cpu);
 
-	return entry->wqs[entry->cur_wq];
+	return wq;
 }
 
 static void wq_table_add(int cpu, struct idxd_wq *wq)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
+	lockdep_assert_held(&wq_table_lock);
+
 	if (WARN_ON(entry->n_wqs == entry->max_wqs))
 		return;
 
@@ -71,6 +82,8 @@ static void wq_table_free_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
+	lockdep_assert_held(&wq_table_lock);
+
 	kfree(entry->wqs);
 	memset(entry, 0, sizeof(*entry));
 }
@@ -79,6 +92,8 @@ static void wq_table_clear_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
+	lockdep_assert_held(&wq_table_lock);
+
 	entry->n_wqs = 0;
 	entry->cur_wq = 0;
 	memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
@@ -702,14 +717,23 @@ static int iaa_wq_put(struct idxd_wq *wq)
 	return ret;
 }
 
-static void free_wq_table(void)
+static void __free_wq_table(void)
 {
 	int cpu;
 
+	lockdep_assert_held(&wq_table_lock);
+
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		wq_table_free_entry(cpu);
 
 	free_percpu(wq_table);
+}
+
+static void free_wq_table(void)
+{
+	guard(spinlock)(&wq_table_lock);
+
+	__free_wq_table();
 
 	pr_debug("freed wq table\n");
 }
@@ -719,15 +743,17 @@ static int alloc_wq_table(int max_wqs)
 	struct wq_table_entry *entry;
 	int cpu;
 
-	wq_table = alloc_percpu(struct wq_table_entry);
+	guard(spinlock)(&wq_table_lock);
+
+	wq_table = alloc_percpu_gfp(struct wq_table_entry, GFP_ATOMIC);
 	if (!wq_table)
 		return -ENOMEM;
 
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		entry = per_cpu_ptr(wq_table, cpu);
-		entry->wqs = kcalloc(max_wqs, sizeof(*entry->wqs), GFP_KERNEL);
+		entry->wqs = kcalloc(max_wqs, sizeof(*entry->wqs), GFP_ATOMIC);
 		if (!entry->wqs) {
-			free_wq_table();
+			__free_wq_table();
 			return -ENOMEM;
 		}
 
@@ -836,6 +862,8 @@ static int wq_table_add_wqs(int iaa, int cpu)
 	struct pci_dev *pdev;
 	struct device *dev;
 
+	lockdep_assert_held(&wq_table_lock);
+
 	list_for_each_entry(iaa_device, &iaa_devices, list) {
 		idxd = iaa_device->idxd;
 		pdev = idxd->pdev;
@@ -902,6 +930,8 @@ static void rebalance_wq_table(void)
 	pr_debug("rebalance: nr_nodes=%d, nr_cpus %d, nr_iaa %d, cpus_per_iaa %d\n",
 		 nr_nodes, nr_cpus, nr_iaa, cpus_per_iaa);
 
+	guard(spinlock)(&wq_table_lock);
+
 	clear_wq_table();
 
 	if (nr_iaa == 1) {
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] crypto: iaa - Fix race condition when probing IAA devices
  2025-06-03 23:55 [PATCH] crypto: iaa - Fix race condition when probing IAA devices Vinicius Costa Gomes
@ 2025-06-11  2:18 ` Herbert Xu
  2025-06-11 21:17   ` Vinicius Costa Gomes
  0 siblings, 1 reply; 5+ messages in thread
From: Herbert Xu @ 2025-06-11  2:18 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Kristen Accardi, David S. Miller, Tom Zanussi, linux-crypto,
	linux-kernel

On Tue, Jun 03, 2025 at 04:55:31PM -0700, Vinicius Costa Gomes wrote:
>
> diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
> index 23f585219fb4..2185c101bef3 100644
> --- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
> +++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
> @@ -35,28 +35,39 @@ static unsigned int cpus_per_iaa;
>  
>  /* Per-cpu lookup table for balanced wqs */
>  static struct wq_table_entry __percpu *wq_table;
> +static DEFINE_SPINLOCK(wq_table_lock);

This can be called in BH context so you need to disable BH when
taking the spinlock.

>  static struct idxd_wq *wq_table_next_wq(int cpu)
>  {
> -	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
> +	struct wq_table_entry *entry;
> +	struct idxd_wq *wq;
> +	int id;
> +
> +	guard(spinlock)(&wq_table_lock);
> +
> +	entry = per_cpu_ptr(wq_table, cpu);

You're taking a global spinlock around a per-cpu variable.  In that
case you might as well get rid of the per-cpu variable and use the
spinlock only.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] crypto: iaa - Fix race condition when probing IAA devices
  2025-06-11  2:18 ` Herbert Xu
@ 2025-06-11 21:17   ` Vinicius Costa Gomes
  2025-06-12  5:38     ` Herbert Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Vinicius Costa Gomes @ 2025-06-11 21:17 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Kristen Accardi, David S. Miller, Tom Zanussi, linux-crypto,
	linux-kernel

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Tue, Jun 03, 2025 at 04:55:31PM -0700, Vinicius Costa Gomes wrote:
>>
>> diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
>> index 23f585219fb4..2185c101bef3 100644
>> --- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
>> +++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
>> @@ -35,28 +35,39 @@ static unsigned int cpus_per_iaa;
>>  
>>  /* Per-cpu lookup table for balanced wqs */
>>  static struct wq_table_entry __percpu *wq_table;
>> +static DEFINE_SPINLOCK(wq_table_lock);
>
> This can be called in BH context so you need to disable BH when
> taking the spinlock.
>

My tests with lockdep enabled must have missed something, will verify
and fix this.

>>  static struct idxd_wq *wq_table_next_wq(int cpu)
>>  {
>> -	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
>> +	struct wq_table_entry *entry;
>> +	struct idxd_wq *wq;
>> +	int id;
>> +
>> +	guard(spinlock)(&wq_table_lock);
>> +
>> +	entry = per_cpu_ptr(wq_table, cpu);
>
> You're taking a global spinlock around a per-cpu variable.  In that
> case you might as well get rid of the per-cpu variable and use the
> spinlock only.
>

From what I could gather, the idea of the per-cpu workqueue table ("map"
really) is more to "spread" the workqueues to different CPUS than to
reduce contention.

If the question is more about the choice of using per-cpu variables, I
can look for alternatives.

> Cheers,
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] crypto: iaa - Fix race condition when probing IAA devices
  2025-06-11 21:17   ` Vinicius Costa Gomes
@ 2025-06-12  5:38     ` Herbert Xu
  2025-06-12 23:09       ` Vinicius Costa Gomes
  0 siblings, 1 reply; 5+ messages in thread
From: Herbert Xu @ 2025-06-12  5:38 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Kristen Accardi, David S. Miller, Tom Zanussi, linux-crypto,
	linux-kernel

On Wed, Jun 11, 2025 at 02:17:49PM -0700, Vinicius Costa Gomes wrote:
>
> >From what I could gather, the idea of the per-cpu workqueue table ("map"
> really) is more to "spread" the workqueues to different CPUS than to
> reduce contention.
> 
> If the question is more about the choice of using per-cpu variables, I
> can look for alternatives.

Prior to your patch, the compress/decompress paths simply did a
lockless per-cpu lookup to find the wq.  Now you're taking a global
spinlock to do the same lookup.

That makes no sense.  Either it should be redesigned to not use
a spinlock, or the per-cpu data structure should be removed since
it serves no purpose as you're always taking a global spinlock.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] crypto: iaa - Fix race condition when probing IAA devices
  2025-06-12  5:38     ` Herbert Xu
@ 2025-06-12 23:09       ` Vinicius Costa Gomes
  0 siblings, 0 replies; 5+ messages in thread
From: Vinicius Costa Gomes @ 2025-06-12 23:09 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Kristen Accardi, David S. Miller, Tom Zanussi, linux-crypto,
	linux-kernel

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Wed, Jun 11, 2025 at 02:17:49PM -0700, Vinicius Costa Gomes wrote:
>>
>> >From what I could gather, the idea of the per-cpu workqueue table ("map"
>> really) is more to "spread" the workqueues to different CPUS than to
>> reduce contention.
>> 
>> If the question is more about the choice of using per-cpu variables, I
>> can look for alternatives.
>
> Prior to your patch, the compress/decompress paths simply did a
> lockless per-cpu lookup to find the wq.  Now you're taking a global
> spinlock to do the same lookup.
>
> That makes no sense.  Either it should be redesigned to not use
> a spinlock, or the per-cpu data structure should be removed since
> it serves no purpose as you're always taking a global spinlock.
>

Will think a bit harder on this. It could be the code is trying too hard
being smart and there's a easier/simpler way out. I was only trying to
solve a bug that some folks found.


> Cheers,
> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Cheer,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-06-12 23:09 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-03 23:55 [PATCH] crypto: iaa - Fix race condition when probing IAA devices Vinicius Costa Gomes
2025-06-11  2:18 ` Herbert Xu
2025-06-11 21:17   ` Vinicius Costa Gomes
2025-06-12  5:38     ` Herbert Xu
2025-06-12 23:09       ` Vinicius Costa Gomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).