[PATCH net] xdp: fix invalid wait context of page_pool

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net] xdp: fix invalid wait context of page_pool_destroy()
@ 2024-07-12  9:51 Taehee Yoo
  2024-07-13 22:09 ` Jakub Kicinski
  2024-07-15  4:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Taehee Yoo @ 2024-07-12  9:51 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet, ast, daniel, hawk, john.fastabend,
	netdev, bpf
  Cc: ap420073, ilias.apalodimas, jonathan.lemon

If the driver uses a page pool, it creates a page pool with
page_pool_create().
The reference count of page pool is 1 as default.
A page pool will be destroyed only when a reference count reaches 0.
page_pool_destroy() is used to destroy page pool, it decreases a
reference count.
When a page pool is destroyed, ->disconnect() is called, which is
mem_allocator_disconnect().
This function internally acquires mutex_lock().

If the driver uses XDP, it registers a memory model with
xdp_rxq_info_reg_mem_model().
The xdp_rxq_info_reg_mem_model() internally increases a page pool
reference count if a memory model is a page pool.
Now the reference count is 2.

To destroy a page pool, the driver should call both page_pool_destroy()
and xdp_unreg_mem_model().
The xdp_unreg_mem_model() internally calls page_pool_destroy().
Only page_pool_destroy() decreases a reference count.

If a driver calls page_pool_destroy() then xdp_unreg_mem_model(), we
will face an invalid wait context warning.
Because xdp_unreg_mem_model() calls page_pool_destroy() with
rcu_read_lock().
The page_pool_destroy() internally acquires mutex_lock().

Splat looks like:
=============================
[ BUG: Invalid wait context ]
6.10.0-rc6+ #4 Tainted: G W
-----------------------------
ethtool/1806 is trying to lock:
ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150
other info that might help us debug this:
context-{5:5}
3 locks held by ethtool/1806:
stack backtrace:
CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
Call Trace:
<TASK>
dump_stack_lvl+0x7e/0xc0
__lock_acquire+0x1681/0x4de0
? _printk+0x64/0xe0
? __pfx_mark_lock.part.0+0x10/0x10
? __pfx___lock_acquire+0x10/0x10
lock_acquire+0x1b3/0x580
? mem_allocator_disconnect+0x73/0x150
? __wake_up_klogd.part.0+0x16/0xc0
? __pfx_lock_acquire+0x10/0x10
? dump_stack_lvl+0x91/0xc0
__mutex_lock+0x15c/0x1690
? mem_allocator_disconnect+0x73/0x150
? __pfx_prb_read_valid+0x10/0x10
? mem_allocator_disconnect+0x73/0x150
? __pfx_llist_add_batch+0x10/0x10
? console_unlock+0x193/0x1b0
? lockdep_hardirqs_on+0xbe/0x140
? __pfx___mutex_lock+0x10/0x10
? tick_nohz_tick_stopped+0x16/0x90
? __irq_work_queue_local+0x1e5/0x330
? irq_work_queue+0x39/0x50
? __wake_up_klogd.part.0+0x79/0xc0
? mem_allocator_disconnect+0x73/0x150
mem_allocator_disconnect+0x73/0x150
? __pfx_mem_allocator_disconnect+0x10/0x10
? mark_held_locks+0xa5/0xf0
? rcu_is_watching+0x11/0xb0
page_pool_release+0x36e/0x6d0
page_pool_destroy+0xd7/0x440
xdp_unreg_mem_model+0x1a7/0x2a0
? __pfx_xdp_unreg_mem_model+0x10/0x10
? kfree+0x125/0x370
? bnxt_free_ring.isra.0+0x2eb/0x500
? bnxt_free_mem+0x5ac/0x2500
xdp_rxq_info_unreg+0x4a/0xd0
bnxt_free_mem+0x1356/0x2500
bnxt_close_nic+0xf0/0x3b0
? __pfx_bnxt_close_nic+0x10/0x10
? ethnl_parse_bit+0x2c6/0x6d0
? __pfx___nla_validate_parse+0x10/0x10
? __pfx_ethnl_parse_bit+0x10/0x10
bnxt_set_features+0x2a8/0x3e0
__netdev_update_features+0x4dc/0x1370
? ethnl_parse_bitset+0x4ff/0x750
? __pfx_ethnl_parse_bitset+0x10/0x10
? __pfx___netdev_update_features+0x10/0x10
? mark_held_locks+0xa5/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x70
? __pm_runtime_resume+0x7d/0x110
ethnl_set_features+0x32d/0xa20

To fix this problem, it uses rhashtable_lookup_fast() instead of
rhashtable_lookup() with rcu_read_lock().
Using xa without rcu_read_lock() here is safe.
xa is freed by __xdp_mem_allocator_rcu_free() and this is called by
call_rcu() of mem_xa_remove().
The mem_xa_remove() is called by page_pool_destroy() if a reference
count reaches 0.
The xa is already protected by the reference count mechanism well in the
control plane.
So removing rcu_read_lock() for page_pool_destroy() is safe.

Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

Tested this patch with device memory TCP, which is not yet merged into the 
net and net-next branch.

 net/core/xdp.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 022c12059cf2..bcc5551c6424 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -127,10 +127,8 @@ void xdp_unreg_mem_model(struct xdp_mem_info *mem)
 		return;

 	if (type == MEM_TYPE_PAGE_POOL) {
-		rcu_read_lock();
-		xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
+		xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
 		page_pool_destroy(xa->page_pool);
-		rcu_read_unlock();
 	}
 }
 EXPORT_SYMBOL_GPL(xdp_unreg_mem_model);
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net] xdp: fix invalid wait context of page_pool_destroy()
  2024-07-12  9:51 [PATCH net] xdp: fix invalid wait context of page_pool_destroy() Taehee Yoo
@ 2024-07-13 22:09 ` Jakub Kicinski
  2024-07-15  4:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Jakub Kicinski @ 2024-07-13 22:09 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, pabeni, edumazet, ast, daniel, hawk, john.fastabend,
	netdev, bpf, ilias.apalodimas, jonathan.lemon

On Fri, 12 Jul 2024 09:51:16 +0000 Taehee Yoo wrote:
> Tested this patch with device memory TCP, which is not yet merged into the 
> net and net-next branch.

That's fine, I'm pretty sure I've hit this with just page pools and XDP
before.. LGTM:

Reviewed-by: Jakub Kicinski <kuba@kernel.org>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net] xdp: fix invalid wait context of page_pool_destroy()
  2024-07-12  9:51 [PATCH net] xdp: fix invalid wait context of page_pool_destroy() Taehee Yoo
  2024-07-13 22:09 ` Jakub Kicinski
@ 2024-07-15  4:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-15  4:40 UTC (permalink / raw)
  To: Taehee Yoo
  Cc: davem, kuba, pabeni, edumazet, ast, daniel, hawk, john.fastabend,
	netdev, bpf, ilias.apalodimas, jonathan.lemon

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 12 Jul 2024 09:51:16 +0000 you wrote:
> If the driver uses a page pool, it creates a page pool with
> page_pool_create().
> The reference count of page pool is 1 as default.
> A page pool will be destroyed only when a reference count reaches 0.
> page_pool_destroy() is used to destroy page pool, it decreases a
> reference count.
> When a page pool is destroyed, ->disconnect() is called, which is
> mem_allocator_disconnect().
> This function internally acquires mutex_lock().
> 
> [...]

Here is the summary with links:
  - [net] xdp: fix invalid wait context of page_pool_destroy()
    https://git.kernel.org/netdev/net/c/59a931c5b732

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-07-15  4:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-12  9:51 [PATCH net] xdp: fix invalid wait context of page_pool_destroy() Taehee Yoo
2024-07-13 22:09 ` Jakub Kicinski
2024-07-15  4:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).