From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
<andrii@kernel.org>, <netdev@vger.kernel.org>,
<magnus.karlsson@intel.com>, <bjorn@kernel.org>,
<echaudro@redhat.com>, <lorenzo@kernel.org>
Subject: Re: [PATCH v2 bpf 2/3] xsk: fix usage of multi-buffer BPF helpers for ZC XDP
Date: Tue, 12 Dec 2023 14:36:29 +0100 [thread overview]
Message-ID: <ZXhh3U9ZykgN+BGi@boxer> (raw)
In-Reply-To: <CAJ8uoz2MAquu8yMgyNAubFB+uj+Dk0jSwwr9GmngK6YTM6sH6A@mail.gmail.com>
On Tue, Dec 12, 2023 at 02:30:50PM +0100, Magnus Karlsson wrote:
> On Tue, 12 Dec 2023 at 13:58, Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > Currently when packet is shrunk via bpf_xdp_adjust_tail(), null ptr
> > dereference happens:
> >
> > [1136314.192256] BUG: kernel NULL pointer dereference, address:
> > 0000000000000034
> > [1136314.203943] #PF: supervisor read access in kernel mode
> > [1136314.213768] #PF: error_code(0x0000) - not-present page
> > [1136314.223550] PGD 0 P4D 0
> > [1136314.230684] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [1136314.239621] CPU: 8 PID: 54203 Comm: xdpsock Not tainted 6.6.0+ #257
> > [1136314.250469] Hardware name: Intel Corporation S2600WFT/S2600WFT,
> > BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
> > [1136314.265615] RIP: 0010:__xdp_return+0x6c/0x210
> > [1136314.274653] Code: ad 00 48 8b 47 08 49 89 f8 a8 01 0f 85 9b 01 00 00 0f 1f 44 00 00 f0 41 ff 48 34 75 32 4c 89 c7 e9 79 cd 80 ff 83 fe 03 75 17 <f6> 41 34 01 0f 85 02 01 00 00 48 89 cf e9 22 cc 1e 00 e9 3d d2 86
> > [1136314.302907] RSP: 0018:ffffc900089f8db0 EFLAGS: 00010246
> > [1136314.312967] RAX: ffffc9003168aed0 RBX: ffff8881c3300000 RCX:
> > 0000000000000000
> > [1136314.324953] RDX: 0000000000000000 RSI: 0000000000000003 RDI:
> > ffffc9003168c000
> > [1136314.336929] RBP: 0000000000000ae0 R08: 0000000000000002 R09:
> > 0000000000010000
> > [1136314.348844] R10: ffffc9000e495000 R11: 0000000000000040 R12:
> > 0000000000000001
> > [1136314.360706] R13: 0000000000000524 R14: ffffc9003168aec0 R15:
> > 0000000000000001
> > [1136314.373298] FS: 00007f8df8bbcb80(0000) GS:ffff8897e0e00000(0000)
> > knlGS:0000000000000000
> > [1136314.386105] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [1136314.396532] CR2: 0000000000000034 CR3: 00000001aa912002 CR4:
> > 00000000007706f0
> > [1136314.408377] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [1136314.420173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > 0000000000000400
> > [1136314.431890] PKRU: 55555554
> > [1136314.439143] Call Trace:
> > [1136314.446058] <IRQ>
> > [1136314.452465] ? __die+0x20/0x70
> > [1136314.459881] ? page_fault_oops+0x15b/0x440
> > [1136314.468305] ? exc_page_fault+0x6a/0x150
> > [1136314.476491] ? asm_exc_page_fault+0x22/0x30
> > [1136314.484927] ? __xdp_return+0x6c/0x210
> > [1136314.492863] bpf_xdp_adjust_tail+0x155/0x1d0
> > [1136314.501269] bpf_prog_ccc47ae29d3b6570_xdp_sock_prog+0x15/0x60
> > [1136314.511263] ice_clean_rx_irq_zc+0x206/0xc60 [ice]
> > [1136314.520222] ? ice_xmit_zc+0x6e/0x150 [ice]
> > [1136314.528506] ice_napi_poll+0x467/0x670 [ice]
> > [1136314.536858] ? ttwu_do_activate.constprop.0+0x8f/0x1a0
> > [1136314.546010] __napi_poll+0x29/0x1b0
> > [1136314.553462] net_rx_action+0x133/0x270
> > [1136314.561619] __do_softirq+0xbe/0x28e
> > [1136314.569303] do_softirq+0x3f/0x60
> >
> > This comes from __xdp_return() call with xdp_buff argument passed as
> > NULL which is supposed to be consumed by xsk_buff_free() call.
> >
> > To address this properly, in ZC case, a node that represents the frag
> > being removed has to be pulled out of xskb_list. Introduce
> > appriopriate xsk helpers to do such node operation and use them
> > accordingly within bpf_xdp_adjust_tail().
> >
> > Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> > include/net/xdp_sock_drv.h | 26 +++++++++++++++++++++
> > net/core/filter.c | 48 +++++++++++++++++++++++++++++++-------
> > 2 files changed, 65 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
> > index 81e02de3f453..123adc6d68c1 100644
> > --- a/include/net/xdp_sock_drv.h
> > +++ b/include/net/xdp_sock_drv.h
> > @@ -147,6 +147,23 @@ static inline struct xdp_buff *xsk_buff_get_frag(struct xdp_buff *first)
> > return ret;
> > }
> >
> > +static inline void xsk_buff_tail_del(struct xdp_buff *tail)
> > +{
>
> I think it would be easier to remember function calls if we are
> consistent in the naming. Most of them are _verb_noun(), so I would
> call it xsk_buff_del_tail().
Sounds good. I can address that once I'll be sure that bots are happy.
>
> Apart from that:
>
> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> # For the xsk header part.
>
> > + struct xdp_buff_xsk *xskb = container_of(tail, struct xdp_buff_xsk, xdp);
> > +
> > + list_del(&xskb->xskb_list_node);
> > +}
> > +
> > +static inline struct xdp_buff *xsk_buff_get_tail(struct xdp_buff *first)
> > +{
> > + struct xdp_buff_xsk *xskb = container_of(first, struct xdp_buff_xsk, xdp);
> > + struct xdp_buff_xsk *frag;
> > +
> > + frag = list_last_entry(&xskb->pool->xskb_list, struct xdp_buff_xsk,
> > + xskb_list_node);
> > + return &frag->xdp;
> > +}
> > +
> > static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
> > {
> > xdp->data = xdp->data_hard_start + XDP_PACKET_HEADROOM;
> > @@ -333,6 +350,15 @@ static inline struct xdp_buff *xsk_buff_get_frag(struct xdp_buff *first)
> > return NULL;
> > }
> >
> > +static inline void xsk_buff_tail_del(struct xdp_buff *tail)
> > +{
> > +}
> > +
> > +static inline struct xdp_buff *xsk_buff_get_tail(struct xdp_buff *first)
> > +{
> > + return NULL;
> > +}
> > +
> > static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
> > {
> > }
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index adcfc2c25754..8ce13d73a660 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -83,6 +83,7 @@
> > #include <net/netfilter/nf_conntrack_bpf.h>
> > #include <net/netkit.h>
> > #include <linux/un.h>
> > +#include <net/xdp_sock_drv.h>
> >
> > #include "dev.h"
> >
> > @@ -4075,6 +4076,42 @@ static int bpf_xdp_frags_increase_tail(struct xdp_buff *xdp, int offset)
> > return 0;
> > }
> >
> > +static void __shrink_data(struct xdp_buff *xdp, struct xdp_mem_info *mem_info,
> > + skb_frag_t *frag, int shrink)
> > +{
> > + if (mem_info->type == MEM_TYPE_XSK_BUFF_POOL) {
> > + struct xdp_buff *tail = xsk_buff_get_tail(xdp);
> > +
> > + if (tail)
> > + tail->data_end -= shrink;
> > + }
> > + skb_frag_size_sub(frag, shrink);
> > +}
> > +
> > +static bool shrink_data(struct xdp_buff *xdp, skb_frag_t *frag, int shrink)
> > +{
> > + struct xdp_mem_info *mem_info = &xdp->rxq->mem;
> > +
> > + if (skb_frag_size(frag) == shrink) {
> > + struct page *page = skb_frag_page(frag);
> > + struct xdp_buff *zc_frag = NULL;
> > +
> > + if (mem_info->type == MEM_TYPE_XSK_BUFF_POOL) {
> > + zc_frag = xsk_buff_get_tail(xdp);
> > +
> > + if (zc_frag) {
> > + xdp_buff_clear_frags_flag(zc_frag);
> > + xsk_buff_tail_del(zc_frag);
> > + }
> > + }
> > +
> > + __xdp_return(page_address(page), mem_info, false, zc_frag);
> > + return true;
> > + }
> > + __shrink_data(xdp, mem_info, frag, shrink);
> > + return false;
> > +}
> > +
> > static int bpf_xdp_frags_shrink_tail(struct xdp_buff *xdp, int offset)
> > {
> > struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > @@ -4089,17 +4126,10 @@ static int bpf_xdp_frags_shrink_tail(struct xdp_buff *xdp, int offset)
> >
> > len_free += shrink;
> > offset -= shrink;
> > -
> > - if (skb_frag_size(frag) == shrink) {
> > - struct page *page = skb_frag_page(frag);
> > -
> > - __xdp_return(page_address(page), &xdp->rxq->mem,
> > - false, NULL);
> > + if (shrink_data(xdp, frag, shrink))
> > n_frags_free++;
> > - } else {
> > - skb_frag_size_sub(frag, shrink);
> > + else
> > break;
> > - }
> > }
> > sinfo->nr_frags -= n_frags_free;
> > sinfo->xdp_frags_size -= len_free;
> > --
> > 2.34.1
> >
> >
next prev parent reply other threads:[~2023-12-12 13:36 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-12 12:57 [PATCH v2 bpf 0/3] net: bpf_xdp_adjust_tail() fixes Maciej Fijalkowski
2023-12-12 12:57 ` [PATCH v2 bpf 1/3] xsk: recycle buffer in case Rx queue was full Maciej Fijalkowski
2023-12-12 13:14 ` Magnus Karlsson
2023-12-12 12:57 ` [PATCH v2 bpf 2/3] xsk: fix usage of multi-buffer BPF helpers for ZC XDP Maciej Fijalkowski
2023-12-12 13:30 ` Magnus Karlsson
2023-12-12 13:36 ` Maciej Fijalkowski [this message]
2023-12-12 12:57 ` [PATCH v2 bpf 3/3] ice: work on pre-XDP prog frag count Maciej Fijalkowski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZXhh3U9ZykgN+BGi@boxer \
--to=maciej.fijalkowski@intel.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bjorn@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=echaudro@redhat.com \
--cc=lorenzo@kernel.org \
--cc=magnus.karlsson@gmail.com \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).