* [PATCH net] xdp: use multi-buff only if receive queue supports page pool
From: Octavian Purdila @ 2025-09-24 6:08 UTC (permalink / raw)
To: kuba
Cc: davem, edumazet, pabeni, horms, ast, daniel, hawk, john.fastabend,
    sdf, kuniyu, ahmed.zaki, aleksander.lobakin, toke, lorenzo, netdev,
bpf, Octavian Purdila, syzbot+ff145014d6b0ce64a173
When a BPF program that supports BPF_F_XDP_HAS_FRAGS issues
bpf_xdp_adjust_tail and a large packet is injected via /dev/net/tun, a
crash occurs due to a bad page state being detected (page_pool leak).

This is because xdp_buff does not record the type of memory and
instead relies on the netdev receive queue xdp info. Since the TUN/TAP
driver is using a MEM_TYPE_PAGE_SHARED memory model buffer, shrinking
will eventually call page_frag_free. But with the current multi-buff
support for BPF_F_XDP_HAS_FRAGS programs, buffers are allocated via
the page pool.

To fix this issue, check that the receive queue memory model is
MEM_TYPE_PAGE_POOL before using multi-buffs.

Reported-by: syzbot+ff145014d6b0ce64a173@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6756c37b.050a0220.a30f1.019a.GAE@google.com/
Fixes: e6d5dbdd20aa ("xdp: add multi-buff support for xdp running in generic mode")
Signed-off-by: Octavian Purdila <tavip@google.com>
---
net/core/dev.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 8d49b2198d07..b195ee3068c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5335,13 +5335,18 @@ static int
 netif_skb_check_for_xdp(struct sk_buff **pskb, const struct bpf_prog *prog)
 {
 	struct sk_buff *skb = *pskb;
+	struct netdev_rx_queue *rxq;
 	int err, hroom, troom;
 
-	local_lock_nested_bh(&system_page_pool.bh_lock);
-	err = skb_cow_data_for_xdp(this_cpu_read(system_page_pool.pool), pskb, prog);
-	local_unlock_nested_bh(&system_page_pool.bh_lock);
-	if (!err)
-		return 0;
+	rxq = netif_get_rxqueue(skb);
+	if (rxq->xdp_rxq.mem.type == MEM_TYPE_PAGE_POOL) {
+		local_lock_nested_bh(&system_page_pool.bh_lock);
+		err = skb_cow_data_for_xdp(this_cpu_read(system_page_pool.pool),
+					   pskb, prog);
+		local_unlock_nested_bh(&system_page_pool.bh_lock);
+		if (!err)
+			return 0;
+	}
 
 	/* In case we have to go down the path and also linearize,
 	 * then lets do the pskb_expand_head() work just once here.
--
2.51.0.534.gc79095c0ca-goog
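
For readers piecing the bug together from the description above: when
bpf_xdp_adjust_tail() shrinks a multi-buff, the released frag is freed
according to the memory type the receive queue advertises. The sketch
below is a hedged illustration of that dispatch, not the upstream
__xdp_return(); the function name is made up, only the two memory types
discussed in this thread are shown, and the page->pp back-pointer is
assumed from the current page_pool layout.

#include <linux/mm.h>
#include <linux/gfp.h>
#include <net/xdp.h>
#include <net/page_pool/helpers.h>

/* Hedged illustration (not the upstream __xdp_return()): a released
 * frag is freed according to the memory type the rxq advertises. If the
 * page actually came from a page_pool but the rxq claims
 * MEM_TYPE_PAGE_SHARED, it ends up in page_frag_free() and the pool's
 * accounting breaks, which is the "bad page state" splat from the report.
 */
static void sketch_return_frag(void *data, enum xdp_mem_type type)
{
	struct page *page = virt_to_head_page(data);

	switch (type) {
	case MEM_TYPE_PAGE_POOL:
		/* hand the page back to the pool it was allocated from;
		 * the pool pointer is recovered from the page itself
		 */
		page_pool_put_full_page(page->pp, page, false);
		break;
	case MEM_TYPE_PAGE_SHARED:
		/* plain page-fragment refcount drop, no pool involved */
		page_frag_free(data);
		break;
	default:
		/* other memory types omitted from this sketch */
		break;
	}
}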

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Jakub Kicinski @ 2025-09-25 0:09 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: davem, edumazet, pabeni, horms, ast, daniel, hawk, john.fastabend,
      sdf, kuniyu, ahmed.zaki, aleksander.lobakin, toke, lorenzo, netdev,
      bpf, syzbot+ff145014d6b0ce64a173

On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> When a BPF program that supports BPF_F_XDP_HAS_FRAGS issues
> bpf_xdp_adjust_tail and a large packet is injected via /dev/net/tun, a
> crash occurs due to a bad page state being detected (page_pool leak).
>
> This is because xdp_buff does not record the type of memory and
> instead relies on the netdev receive queue xdp info. Since the TUN/TAP
> driver is using a MEM_TYPE_PAGE_SHARED memory model buffer, shrinking
> will eventually call page_frag_free. But with the current multi-buff
> support for BPF_F_XDP_HAS_FRAGS programs, buffers are allocated via
> the page pool.
>
> To fix this issue, check that the receive queue memory model is
> MEM_TYPE_PAGE_POOL before using multi-buffs.

This can also happen on veth, right? And veth re-stamps the Rx queues.

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Octavian Purdila @ 2025-09-25 7:53 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, edumazet, pabeni, horms, ast, daniel, hawk, john.fastabend,
      sdf, ahmed.zaki, aleksander.lobakin, toke, lorenzo, netdev, bpf,
      syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Wed, Sep 24, 2025 at 5:09 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> > When a BPF program that supports BPF_F_XDP_HAS_FRAGS issues
> > bpf_xdp_adjust_tail and a large packet is injected via /dev/net/tun, a
> > crash occurs due to a bad page state being detected (page_pool leak).
> >
> > This is because xdp_buff does not record the type of memory and
> > instead relies on the netdev receive queue xdp info. Since the TUN/TAP
> > driver is using a MEM_TYPE_PAGE_SHARED memory model buffer, shrinking
> > will eventually call page_frag_free. But with the current multi-buff
> > support for BPF_F_XDP_HAS_FRAGS programs, buffers are allocated via
> > the page pool.
> >
> > To fix this issue, check that the receive queue memory model is
> > MEM_TYPE_PAGE_POOL before using multi-buffs.
>
> This can also happen on veth, right? And veth re-stamps the Rx queues.

I am not sure if re-stamping will have ill effects.

The allocation and deallocation for this issue happen while
processing the same packet (receive skb -> skb_pp_cow_data ->
page_pool alloc ... __bpf_prog_run -> bpf_xdp_adjust_tail).

IIUC, if veth re-stamps the RX queue to MEM_TYPE_PAGE_POOL,
skb_pp_cow_data will proceed to allocate from the page_pool and
bpf_xdp_adjust_tail will correctly free from the page_pool.

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Maciej Fijalkowski @ 2025-09-25 9:42 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: Jakub Kicinski, davem, edumazet, pabeni, horms, ast, daniel, hawk,
      john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke, lorenzo,
      netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Thu, Sep 25, 2025 at 12:53:53AM -0700, Octavian Purdila wrote:
> On Wed, Sep 24, 2025 at 5:09 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> > > When a BPF program that supports BPF_F_XDP_HAS_FRAGS issues
> > > bpf_xdp_adjust_tail and a large packet is injected via /dev/net/tun, a
> > > crash occurs due to a bad page state being detected (page_pool leak).
> > >
> > > This is because xdp_buff does not record the type of memory and
> > > instead relies on the netdev receive queue xdp info. Since the TUN/TAP
> > > driver is using a MEM_TYPE_PAGE_SHARED memory model buffer, shrinking
> > > will eventually call page_frag_free. But with the current multi-buff
> > > support for BPF_F_XDP_HAS_FRAGS programs, buffers are allocated via
> > > the page pool.
> > >
> > > To fix this issue, check that the receive queue memory model is
> > > MEM_TYPE_PAGE_POOL before using multi-buffs.
> >
> > This can also happen on veth, right? And veth re-stamps the Rx queues.

What do you mean by 're-stamps' in this case?

> I am not sure if re-stamping will have ill effects.
>
> The allocation and deallocation for this issue happen while
> processing the same packet (receive skb -> skb_pp_cow_data ->
> page_pool alloc ... __bpf_prog_run -> bpf_xdp_adjust_tail).
>
> IIUC, if veth re-stamps the RX queue to MEM_TYPE_PAGE_POOL,
> skb_pp_cow_data will proceed to allocate from the page_pool and
> bpf_xdp_adjust_tail will correctly free from the page_pool.

netif_get_rxqueue() gives you a pointer to the netstack queue, not the
driver one. Then you take the xdp_rxq from there. Do we even register a
memory model for these queues? Or am I missing something here.

We're in the generic XDP hook where driver specifics should not matter
here IMHO.

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Jakub Kicinski @ 2025-09-26 2:12 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Octavian Purdila, davem, edumazet, pabeni, horms, ast, daniel,
      hawk, john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke,
      lorenzo, netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Thu, 25 Sep 2025 11:42:04 +0200 Maciej Fijalkowski wrote:
> On Thu, Sep 25, 2025 at 12:53:53AM -0700, Octavian Purdila wrote:
> > On Wed, Sep 24, 2025 at 5:09 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> [...]
> > >
> > > This can also happen on veth, right? And veth re-stamps the Rx queues.
>
> What do you mean by 're-stamps' in this case?
>
> > I am not sure if re-stamping will have ill effects.
> >
> > The allocation and deallocation for this issue happen while
> > processing the same packet (receive skb -> skb_pp_cow_data ->
> > page_pool alloc ... __bpf_prog_run -> bpf_xdp_adjust_tail).
> >
> > IIUC, if veth re-stamps the RX queue to MEM_TYPE_PAGE_POOL,
> > skb_pp_cow_data will proceed to allocate from the page_pool and
> > bpf_xdp_adjust_tail will correctly free from the page_pool.
>
> netif_get_rxqueue() gives you a pointer to the netstack queue, not the
> driver one. Then you take the xdp_rxq from there. Do we even register a
> memory model for these queues? Or am I missing something here.
>
> We're in the generic XDP hook where driver specifics should not matter
> here IMHO.

Well, IDK how helpful the flow below would be but:

veth_xdp_xmit() -> [ptr ring] -> veth_xdp_rcv() -> veth_xdp_rcv_one()
                                                          |
                        xdp_convert_frame_to_buff()  <----'
( "re-stamps" ;) ->     xdp->rxq = &rq->xdp_rxq;
can eat frags but       bpf_prog_run_xdp()
now rxq is veth's

I just glanced at the code so >50% chances I'm wrong, but that's what
I meant.

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Octavian Purdila @ 2025-09-26 7:33 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Maciej Fijalkowski, davem, edumazet, pabeni, horms, ast, daniel,
      hawk, john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke,
      lorenzo, netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Thu, Sep 25, 2025 at 7:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 25 Sep 2025 11:42:04 +0200 Maciej Fijalkowski wrote:
> > On Thu, Sep 25, 2025 at 12:53:53AM -0700, Octavian Purdila wrote:
> > > On Wed, Sep 24, 2025 at 5:09 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> > [...]
> > > >
> > > > This can also happen on veth, right? And veth re-stamps the Rx queues.
> >
> > What do you mean by 're-stamps' in this case?
> >
> > > I am not sure if re-stamping will have ill effects.
> > >
> > > The allocation and deallocation for this issue happen while
> > > processing the same packet (receive skb -> skb_pp_cow_data ->
> > > page_pool alloc ... __bpf_prog_run -> bpf_xdp_adjust_tail).
> > >
> > > IIUC, if veth re-stamps the RX queue to MEM_TYPE_PAGE_POOL,
> > > skb_pp_cow_data will proceed to allocate from the page_pool and
> > > bpf_xdp_adjust_tail will correctly free from the page_pool.
> >
> > netif_get_rxqueue() gives you a pointer to the netstack queue, not the
> > driver one. Then you take the xdp_rxq from there. Do we even register a
> > memory model for these queues? Or am I missing something here.

Ah, yes, you are right. So my comment in the commit message about
TUN/TAP registering a page shared memory model is wrong. But I think
the fix is still correct for the reported syzkaller issue. From
bpf_prog_run_generic_xdp:

    rxqueue = netif_get_rxqueue(skb);
    xdp_init_buff(xdp, frame_sz, rxq: &rxqueue->xdp_rxq);

So xdp_buff's rxq is set to the netstack queue for the generic XDP
hook. And adding the check in netif_skb_check_for_xdp based on the
netstack queue should be correct, right?

> > We're in the generic XDP hook where driver specifics should not matter
> > here IMHO.
>
> Well, IDK how helpful the flow below would be but:
>
> veth_xdp_xmit() -> [ptr ring] -> veth_xdp_rcv() -> veth_xdp_rcv_one()
>                                                           |
>                         xdp_convert_frame_to_buff()  <----'
> ( "re-stamps" ;) ->     xdp->rxq = &rq->xdp_rxq;
> can eat frags but       bpf_prog_run_xdp()
> now rxq is veth's
>
> I just glanced at the code so >50% chances I'm wrong, but that's what
> I meant.

Thanks for the clarification, I thought that "re-stamps" means the:

    xdp->rxq->mem.type = frame->mem_type;

from veth_xdp_rcv_one in the XDP_TX/XDP_REDIRECT cases.

And yes, now I think the same issue can happen because veth sets the
memory model to MEM_TYPE_PAGE_SHARED but veth_convert_skb_to_xdp_buff
calls skb_pp_cow_data, which uses a page_pool for allocations. I'll try
to see if I can adapt the syzkaller repro to trigger it for
confirmation.
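
To make the suspected veth case concrete, here is a rough, hypothetical
sketch of the mismatch being described. The function name is
illustrative, the frame size is simplified to PAGE_SIZE, frag setup and
error unwinding are omitted, and the rxq is assumed to have been
registered with MEM_TYPE_PAGE_SHARED as veth does.

#include <linux/bpf.h>
#include <linux/skbuff.h>
#include <net/page_pool/helpers.h>
#include <net/xdp.h>

/* Hypothetical sketch of the suspected veth mismatch: the skb data is
 * rebuilt from the queue's page_pool, but the xdp_rxq it is paired with
 * was registered with MEM_TYPE_PAGE_SHARED, so a later shrink via
 * bpf_xdp_adjust_tail() would release page_pool memory down the
 * page_frag_free() path.
 */
static int sketch_skb_to_xdp_buff(struct page_pool *pool,
				  struct xdp_rxq_info *rxq,
				  struct sk_buff **pskb,
				  struct xdp_buff *xdp)
{
	struct sk_buff *skb;
	int err;

	/* reallocate the head (and frags) from the page_pool */
	err = skb_pp_cow_data(pool, pskb, XDP_PACKET_HEADROOM);
	if (err)
		return err;
	skb = *pskb;

	/* ...but rxq->mem.type still says MEM_TYPE_PAGE_SHARED here */
	xdp_init_buff(xdp, PAGE_SIZE, rxq);
	xdp_prepare_buff(xdp, skb->head, skb_headroom(skb),
			 skb_headlen(skb), true);
	return 0;
}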

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Maciej Fijalkowski @ 2025-09-26 11:24 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: Jakub Kicinski, davem, edumazet, pabeni, horms, ast, daniel, hawk,
      john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke, lorenzo,
      netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Fri, Sep 26, 2025 at 12:33:46AM -0700, Octavian Purdila wrote:
> On Thu, Sep 25, 2025 at 7:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Thu, 25 Sep 2025 11:42:04 +0200 Maciej Fijalkowski wrote:
> > > On Thu, Sep 25, 2025 at 12:53:53AM -0700, Octavian Purdila wrote:
> > > > On Wed, Sep 24, 2025 at 5:09 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > > >
> > > > > On Wed, 24 Sep 2025 06:08:42 +0000 Octavian Purdila wrote:
> > > [...]
> > > > >
> > > > > This can also happen on veth, right? And veth re-stamps the Rx queues.
> > >
> > > What do you mean by 're-stamps' in this case?
> > >
> > > > I am not sure if re-stamping will have ill effects.
> > > >
> > > > The allocation and deallocation for this issue happen while
> > > > processing the same packet (receive skb -> skb_pp_cow_data ->
> > > > page_pool alloc ... __bpf_prog_run -> bpf_xdp_adjust_tail).
> > > >
> > > > IIUC, if veth re-stamps the RX queue to MEM_TYPE_PAGE_POOL,
> > > > skb_pp_cow_data will proceed to allocate from the page_pool and
> > > > bpf_xdp_adjust_tail will correctly free from the page_pool.
> > >
> > > netif_get_rxqueue() gives you a pointer to the netstack queue, not the
> > > driver one. Then you take the xdp_rxq from there. Do we even register a
> > > memory model for these queues? Or am I missing something here.
>
> Ah, yes, you are right. So my comment in the commit message about
> TUN/TAP registering a page shared memory model is wrong. But I think
> the fix is still correct for the reported syzkaller issue. From
> bpf_prog_run_generic_xdp:
>
>     rxqueue = netif_get_rxqueue(skb);
>     xdp_init_buff(xdp, frame_sz, rxq: &rxqueue->xdp_rxq);
>
> So xdp_buff's rxq is set to the netstack queue for the generic XDP
> hook. And adding the check in netif_skb_check_for_xdp based on the
> netstack queue should be correct, right?

Per my limited understanding, your change makes skb_cow_data_for_xdp()
dead code, as I don't see a mem model being registered for these stack
queues - netif_alloc_rx_queues() only calls xdp_rxq_info_reg() and
mem.type defaults to MEM_TYPE_PAGE_SHARED as it's defined as 0, which
means it's never going to be MEM_TYPE_PAGE_POOL.

IMHO that single case where we rewrite the skb to memory backed by a
page pool should have it reflected in mem.type so that __xdp_return(),
potentially used in bpf helpers, could act correctly.

> > > We're in the generic XDP hook where driver specifics should not matter
> > > here IMHO.
> >
> > Well, IDK how helpful the flow below would be but:
> >
> > veth_xdp_xmit() -> [ptr ring] -> veth_xdp_rcv() -> veth_xdp_rcv_one()
> >                                                           |
> >                         xdp_convert_frame_to_buff()  <----'
> > ( "re-stamps" ;) ->     xdp->rxq = &rq->xdp_rxq;
> > can eat frags but       bpf_prog_run_xdp()
> > now rxq is veth's
> >
> > I just glanced at the code so >50% chances I'm wrong, but that's what
> > I meant.
>
> Thanks for the clarification, I thought that "re-stamps" means the:
>
>     xdp->rxq->mem.type = frame->mem_type;
>
> from veth_xdp_rcv_one in the XDP_TX/XDP_REDIRECT cases.
>
> And yes, now I think the same issue can happen because veth sets the
> memory model to MEM_TYPE_PAGE_SHARED but veth_convert_skb_to_xdp_buff
> calls skb_pp_cow_data, which uses a page_pool for allocations. I'll try
> to see if I can adapt the syzkaller repro to trigger it for
> confirmation.

That is a good catch.
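
To illustrate the point about the netstack queues, here is a condensed,
hedged sketch of what their setup amounts to; the function name is
illustrative and only the xdp_rxq registration step is shown.

#include <linux/netdevice.h>
#include <net/netdev_rx_queue.h>
#include <net/xdp.h>

/* Condensed, hedged sketch: the per-netdev "stack" rx queues only get
 * xdp_rxq_info_reg(); xdp_rxq_info_reg_mem_model() is never called for
 * them, so xdp_rxq.mem.type stays 0, i.e. MEM_TYPE_PAGE_SHARED, and a
 * mem.type == MEM_TYPE_PAGE_POOL test on them can never be true.
 */
static void sketch_stack_rxq_setup(struct net_device *dev)
{
	unsigned int i;

	for (i = 0; i < dev->num_rx_queues; i++) {
		struct netdev_rx_queue *rx = &dev->_rx[i];

		rx->dev = dev;
		if (xdp_rxq_info_reg(&rx->xdp_rxq, dev, i, 0) < 0)
			return;	/* error unwinding elided in this sketch */
		/* note: no memory-model registration for this queue */
	}
}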

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Jakub Kicinski @ 2025-09-26 19:40 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Octavian Purdila, davem, edumazet, pabeni, horms, ast, daniel,
      hawk, john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke,
      lorenzo, netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Fri, 26 Sep 2025 13:24:12 +0200 Maciej Fijalkowski wrote:
> On Fri, Sep 26, 2025 at 12:33:46AM -0700, Octavian Purdila wrote:
> > On Thu, Sep 25, 2025 at 7:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > Ah, yes, you are right. So my comment in the commit message about
> > TUN/TAP registering a page shared memory model is wrong. But I think
> > the fix is still correct for the reported syzkaller issue. From
> > bpf_prog_run_generic_xdp:
> >
> >     rxqueue = netif_get_rxqueue(skb);
> >     xdp_init_buff(xdp, frame_sz, rxq: &rxqueue->xdp_rxq);
> >
> > So xdp_buff's rxq is set to the netstack queue for the generic XDP
> > hook. And adding the check in netif_skb_check_for_xdp based on the
> > netstack queue should be correct, right?
>
> Per my limited understanding, your change makes skb_cow_data_for_xdp()
> dead code, as I don't see a mem model being registered for these stack
> queues - netif_alloc_rx_queues() only calls xdp_rxq_info_reg() and
> mem.type defaults to MEM_TYPE_PAGE_SHARED as it's defined as 0, which
> means it's never going to be MEM_TYPE_PAGE_POOL.

Hah, that's a great catch, how did I miss that..

The reason for the cow is that frags can be shared, we are not allowed
to modify them. It's orthogonal.

> IMHO that single case where we rewrite the skb to memory backed by a
> page pool should have it reflected in mem.type so that __xdp_return(),
> potentially used in bpf helpers, could act correctly.
>
> > > Well, IDK how helpful the flow below would be but:
> > >
> > > veth_xdp_xmit() -> [ptr ring] -> veth_xdp_rcv() -> veth_xdp_rcv_one()
> > >                                                           |
> > >                         xdp_convert_frame_to_buff()  <----'
> > > ( "re-stamps" ;) ->     xdp->rxq = &rq->xdp_rxq;
> > > can eat frags but       bpf_prog_run_xdp()
> > > now rxq is veth's
> > >
> > > I just glanced at the code so >50% chances I'm wrong, but that's what
> > > I meant.
> >
> > Thanks for the clarification, I thought that "re-stamps" means the:
> >
> >     xdp->rxq->mem.type = frame->mem_type;
> >
> > from veth_xdp_rcv_one in the XDP_TX/XDP_REDIRECT cases.
> >
> > And yes, now I think the same issue can happen because veth sets the
> > memory model to MEM_TYPE_PAGE_SHARED but veth_convert_skb_to_xdp_buff
> > calls skb_pp_cow_data, which uses a page_pool for allocations. I'll try
> > to see if I can adapt the syzkaller repro to trigger it for
> > confirmation.
>
> That is a good catch.

FWIW I think all calls to xdp_convert_frame_to_buff() must come with
the hack that cpu_map_bpf_prog_run_xdp() is doing today. Declare rxq
on the stack and fill in the mem info there. I wonder if we should add
something to the API (xdp_convert_frame_to_buff()) to make sure people
don't forget to do this..
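
For context, the "rxq on the stack" pattern referred to above looks
roughly like the hedged sketch below, modeled on what
cpu_map_bpf_prog_run_xdp() does; the wrapper function itself is
illustrative and not code from the tree.

#include <linux/filter.h>
#include <net/xdp.h>

/* Illustrative wrapper: build a throwaway rxq on the stack and copy the
 * frame's memory model into it before converting to an xdp_buff, so any
 * frag shrink/free done by the program sees the right mem.type; this is
 * the same trick cpu_map_bpf_prog_run_xdp() uses.
 */
static u32 sketch_run_prog_on_frame(const struct bpf_prog *prog,
				    struct xdp_frame *frame)
{
	struct xdp_rxq_info rxq = {};
	struct xdp_buff xdp;

	rxq.dev = frame->dev_rx;
	rxq.mem.type = frame->mem_type;	/* carry the frame's own memory model */

	xdp_convert_frame_to_buff(frame, &xdp);
	xdp.rxq = &rxq;

	return bpf_prog_run_xdp(prog, &xdp);
}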

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Octavian Purdila @ 2025-09-30 0:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Maciej Fijalkowski, davem, edumazet, pabeni, horms, ast, daniel,
      hawk, john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke,
      lorenzo, netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Fri, Sep 26, 2025 at 12:40 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 26 Sep 2025 13:24:12 +0200 Maciej Fijalkowski wrote:
> > On Fri, Sep 26, 2025 at 12:33:46AM -0700, Octavian Purdila wrote:
> > > On Thu, Sep 25, 2025 at 7:12 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > Ah, yes, you are right. So my comment in the commit message about
> > > TUN/TAP registering a page shared memory model is wrong. But I think
> > > the fix is still correct for the reported syzkaller issue. From
> > > bpf_prog_run_generic_xdp:
> > >
> > >     rxqueue = netif_get_rxqueue(skb);
> > >     xdp_init_buff(xdp, frame_sz, rxq: &rxqueue->xdp_rxq);
> > >
> > > So xdp_buff's rxq is set to the netstack queue for the generic XDP
> > > hook. And adding the check in netif_skb_check_for_xdp based on the
> > > netstack queue should be correct, right?
> >
> > Per my limited understanding, your change makes skb_cow_data_for_xdp()
> > dead code, as I don't see a mem model being registered for these stack
> > queues - netif_alloc_rx_queues() only calls xdp_rxq_info_reg() and
> > mem.type defaults to MEM_TYPE_PAGE_SHARED as it's defined as 0, which
> > means it's never going to be MEM_TYPE_PAGE_POOL.
>
> Hah, that's a great catch, how did I miss that..
>
> The reason for the cow is that frags can be shared, we are not allowed
> to modify them. It's orthogonal.
>

Could we use the same hack that you mention below (declare rxq on the
stack and fill in the mem info there) for do_xdp_generic?

@@ -5442,7 +5448,10 @@ int do_xdp_generic(const struct bpf_prog *xdp_prog, struct sk_buff **pskb)
 	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;

 	if (xdp_prog) {
-		struct xdp_buff xdp;
+		struct xdp_rxq_info rxq = {};
+		struct xdp_buff xdp = {
+			.rxq = &rxq,
+		};
 		u32 act;
 		int err;

and then explicitly set the mem type:

@@ -5331,14 +5332,19 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
 	return act;
 }

-static int
-netif_skb_check_for_xdp(struct sk_buff **pskb, const struct bpf_prog *prog)
+static int netif_skb_check_for_xdp(struct sk_buff **pskb,
+				   const struct bpf_prog *prog,
+				   struct xdp_rxq_info *rxq)
 {
 	struct sk_buff *skb = *pskb;
 	int err, hroom, troom;
+	struct page_pool *pool;

 	local_lock_nested_bh(&system_page_pool.bh_lock);
-	err = skb_cow_data_for_xdp(this_cpu_read(system_page_pool.pool), pskb, prog);
+	pool = this_cpu_read(system_page_pool.pool);
+	err = skb_cow_data_for_xdp(pool, pskb, prog);
+	rxq->mem.type = MEM_TYPE_PAGE_POOL;
+	rxq->mem.id = pool->xdp_mem_id;
 	local_unlock_nested_bh(&system_page_pool.bh_lock);
 	if (!err)
 		return 0;

> > IMHO that single case where we rewrite the skb to memory backed by a
> > page pool should have it reflected in mem.type so that __xdp_return(),
> > potentially used in bpf helpers, could act correctly.
> >
> > > > Well, IDK how helpful the flow below would be but:
> > > >
> > > > veth_xdp_xmit() -> [ptr ring] -> veth_xdp_rcv() -> veth_xdp_rcv_one()
> > > >                                                           |
> > > >                         xdp_convert_frame_to_buff()  <----'
> > > > ( "re-stamps" ;) ->     xdp->rxq = &rq->xdp_rxq;
> > > > can eat frags but       bpf_prog_run_xdp()
> > > > now rxq is veth's
> > > >
> > > > I just glanced at the code so >50% chances I'm wrong, but that's what
> > > > I meant.
> > >
> > > Thanks for the clarification, I thought that "re-stamps" means the:
> > >
> > >     xdp->rxq->mem.type = frame->mem_type;
> > >
> > > from veth_xdp_rcv_one in the XDP_TX/XDP_REDIRECT cases.
> > >
> > > And yes, now I think the same issue can happen because veth sets the
> > > memory model to MEM_TYPE_PAGE_SHARED but veth_convert_skb_to_xdp_buff
> > > calls skb_pp_cow_data, which uses a page_pool for allocations. I'll try
> > > to see if I can adapt the syzkaller repro to trigger it for
> > > confirmation.
> >
> > That is a good catch.
>
> FWIW I think all calls to xdp_convert_frame_to_buff() must come with
> the hack that cpu_map_bpf_prog_run_xdp() is doing today. Declare rxq
> on the stack and fill in the mem info there. I wonder if we should add
> something to the API (xdp_convert_frame_to_buff()) to make sure people
> don't forget to do this..

* Re: [PATCH net] xdp: use multi-buff only if receive queue supports page pool

From: Jakub Kicinski @ 2025-09-30 17:41 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: Maciej Fijalkowski, davem, edumazet, pabeni, horms, ast, daniel,
      hawk, john.fastabend, sdf, ahmed.zaki, aleksander.lobakin, toke,
      lorenzo, netdev, bpf, syzbot+ff145014d6b0ce64a173, Kuniyuki Iwashima

On Mon, 29 Sep 2025 17:01:49 -0700 Octavian Purdila wrote:
> 	local_lock_nested_bh(&system_page_pool.bh_lock);
> -	err = skb_cow_data_for_xdp(this_cpu_read(system_page_pool.pool), pskb, prog);
> +	pool = this_cpu_read(system_page_pool.pool);
> +	err = skb_cow_data_for_xdp(pool, pskb, prog);
> +	rxq->mem.type = MEM_TYPE_PAGE_POOL;
> +	rxq->mem.id = pool->xdp_mem_id;
> 	local_unlock_nested_bh(&system_page_pool.bh_lock);

Yes, LGTM. I _think_ that only skb_cow_data_for_xdp() has to be under
the lock here, but doesn't really matter.
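
A hedged sketch of the narrower lock scope hinted at here, assuming the
three-argument netif_skb_check_for_xdp() from the proposed diff above;
the function is renamed and the linearize fallback is elided to mark it
as illustrative, and it only makes sense inside net/core/dev.c where
system_page_pool and skb_cow_data_for_xdp() live.

/* Sketch only: same logic as the proposed netif_skb_check_for_xdp()
 * change, but with the rxq stamping moved out of the critical section
 * so that only the page-pool allocation runs under bh_lock.
 */
static int netif_skb_check_for_xdp_sketch(struct sk_buff **pskb,
					  const struct bpf_prog *prog,
					  struct xdp_rxq_info *rxq)
{
	struct page_pool *pool;
	int err;

	/* only the allocation itself needs the per-CPU pool lock */
	local_lock_nested_bh(&system_page_pool.bh_lock);
	pool = this_cpu_read(system_page_pool.pool);
	err = skb_cow_data_for_xdp(pool, pskb, prog);
	local_unlock_nested_bh(&system_page_pool.bh_lock);

	/* the system page pool is per-CPU and long-lived, so stamping the
	 * rxq after dropping the lock should be fine
	 */
	rxq->mem.type = MEM_TYPE_PAGE_POOL;
	rxq->mem.id = pool->xdp_mem_id;

	if (!err)
		return 0;

	/* the existing linearize fallback would follow here (elided) */
	return err;
}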