From: Jakub Kicinski <kuba@kernel.org>
To: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: Donald Hunter <donald.hunter@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	Alexander Lobakin <aleksander.lobakin@intel.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	KP Singh <kpsingh@kernel.org>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
Date: Mon, 23 Feb 2026 15:18:45 -0800
Message-ID: <20260223151845.06db43b0@kernel.org>
In-Reply-To: <aZyKWoxnywXKWth9@lore-desk>

On Mon, 23 Feb 2026 18:11:54 +0100 Lorenzo Bianconi wrote:
> > Off the top of my head drivers prefer reporting UNNECESSARY when they
> > have both, and reserve COMPLETE for cases where L4 could not be found
> > or is incorrect. Why don't we report both? We're using 3 args, we still
> > have 3 to go. We could turn ip_summed into a bitmap and have explicit
> > output args for both level and csum complete value?  
> 
> Ack, thx for the explanation. Just for sake of understanding, is there
> any NIC capable of reporting both csum_value and csum for the same packet
> in the DMA descriptor? Or is this change needed to be future-proof?

Both nfp and fbnic definitely can. Off the top of my head - mlx5 also
can, but I haven't double checked.
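
To make the "report both" idea concrete, here is a rough userspace
sketch of a callback that fills a status bitmap plus explicit level and
csum complete outputs. The flag names, struct rx_desc_csum and
rx_csum_report() are made up for illustration; they are not the API
from this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names: one bit per kind of checksum information. */
enum xdp_csum_flags {
	XDP_CSUM_F_UNNECESSARY = 1u << 0, /* HW verified the L4 checksum */
	XDP_CSUM_F_COMPLETE    = 1u << 1, /* HW supplied a csum complete value */
};

/* Hypothetical view of what a driver decoded from its RX descriptor. */
struct rx_desc_csum {
	bool     l4_ok;        /* L4 checksum was found and is correct */
	uint8_t  level;        /* extra encapsulation levels verified */
	bool     has_complete; /* descriptor carries a csum complete value */
	uint32_t complete;     /* the csum complete value itself */
};

/*
 * Report everything the descriptor offers instead of picking one.
 * Returns 0 if at least one bit is set, -ENODATA otherwise.
 */
static int rx_csum_report(const struct rx_desc_csum *d, uint32_t *flags,
			  uint8_t *level, uint32_t *csum)
{
	assert(d && flags && level && csum);

	*flags = 0;
	*level = 0;
	*csum = 0;

	if (d->l4_ok) {
		*flags |= XDP_CSUM_F_UNNECESSARY;
		*level = d->level;
	}
	if (d->has_complete) {
		*flags |= XDP_CSUM_F_COMPLETE;
		*csum = d->complete;
	}

	return *flags ? 0 : -ENODATA;
}
```

A caller would check the bitmap first and only read level / csum for
the bits that are set.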

> > One more thing I'd like us to at least have a plan for at this stage
> > is how to deal with COMPLETE + modified packet + XDP_PASS.
> > Right now some drivers discard COMPLETE when XDP is attached since
> > they can't be sure if XDP modifies the packet. Other drivers don't
> > and we end up with bad csum splat. Do we have a recommendation on
> > the correct behavior? If not - should we have a kfunc to adjust /
> > discard csum complete explicitly?  
> 
> At the moment there is no way to store the csum value we got running
> bpf_xdp_metadata_rx_checksum() in order to be consumed during
> xdp_buff/xdp_frame to skb conversion (this info can just be consumed in the
> ebpf program bound to the NIC) but

I think the scope here is much narrower than the xdp_buff to xdp_frame
to skb conversion. We are just passing information between the program
and the driver which owns the xdp_buff. Very similar to your new xmo.

We could either tell the driver to discard the csum complete or even
add a helper to "adjust" the csum value, similar to the helper
we have to adjust the csum in TC / skb context.
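
For reference, the arithmetic such an "adjust" helper would have to do
is the usual ones-complement incremental update. Below is a userspace
sketch for a single modified 16-bit word; csum_fold16(), csum_words()
and csum_adjust16() are hypothetical names, not a proposed kfunc
signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit ones-complement accumulator down to 16 bits. */
static uint16_t csum_fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones-complement sum over n 16-bit words (full recompute). */
static uint16_t csum_words(const uint16_t *w, size_t n)
{
	uint32_t sum = 0;

	assert(w || n == 0);
	for (size_t i = 0; i < n; i++)
		sum = csum_fold16(sum + w[i]); /* fold as we go, no overflow */
	return (uint16_t)sum;
}

/*
 * Adjust an existing sum after one word changed from old_w to new_w:
 *   sum' = sum + ~old_w + new_w   (ones-complement arithmetic)
 */
static uint16_t csum_adjust16(uint16_t sum, uint16_t old_w, uint16_t new_w)
{
	return csum_fold16((uint32_t)sum + (uint16_t)~old_w + new_w);
}
```

If the program rewrites one word and calls the adjust step with the old
and new values, the result matches a full recompute over the modified
data, so the driver would not have to throw the value away.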

> I guess the issue you pointed out can be solved in the verifier
> during program load time. What do you think?

It could, but at the verifier level we'd probably have to be fairly
coarse-grained. Any write to the packet data would mean csum complete
cannot be trusted, that's not too hard. But also any tail call / fentry?
I'm not really up to date on the latest in program chaining in BPF but
I think a lot of real-life deployments would use either chaining or
fentry. So it may be a lot of complexity only to end up with csum
complete always disabled w/ XDP, in practice.
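
Roughly, the coarse-grained rule would look like the sketch below.
struct prog_facts and its fields are hypothetical stand-ins for
whatever the verifier would track, and treating tail calls and fentry
as disqualifying is an assumption, not a settled answer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-program facts a load-time check might collect. */
struct prog_facts {
	bool writes_pkt_data;   /* any store to packet data */
	bool may_tail_call;     /* can jump to a program not analysed here */
	bool fentry_attachable; /* another program may hook in later */
};

/*
 * Coarse-grained decision: csum complete is only trusted when nothing
 * visible at load time could have changed the packet.
 */
static bool csum_complete_trusted(const struct prog_facts *p)
{
	assert(p);
	return !(p->writes_pkt_data || p->may_tail_call ||
		 p->fentry_attachable);
}
```

With chaining or fentry in the picture the function returns false for
most deployments, which is the "always disabled" outcome.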

Up to you. I'm totally okay to just say** that drivers should never
report csum complete with XDP (until appropriate API is built).
Perhaps this will force those who care about XDP+csum_complete to
tell us what their requirements are?

[**] "just say" == document and add driver kselftest that validates it

Thread overview: 34+ messages
2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the the capability to load HW RX checsum in eBPF programs Lorenzo Bianconi
2026-02-17  8:33 ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
2026-02-17  8:33   ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-18  1:01   ` Stanislav Fomichev
2026-02-18  1:01     ` [Intel-wired-lan] " Stanislav Fomichev
2026-02-18 10:58     ` Jesper Dangaard Brouer
2026-02-18 10:58       ` [Intel-wired-lan] " Jesper Dangaard Brouer
2026-02-19  1:47   ` Jakub Kicinski
2026-02-19  1:47     ` [Intel-wired-lan] " Jakub Kicinski
2026-02-19 11:04     ` Lorenzo Bianconi
2026-02-19 11:04       ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-19 17:13       ` Jakub Kicinski
2026-02-19 17:13         ` [Intel-wired-lan] " Jakub Kicinski
2026-02-23 17:11         ` Lorenzo Bianconi
2026-02-23 17:11           ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-23 23:18           ` Jakub Kicinski [this message]
2026-02-23 23:18             ` Jakub Kicinski
2026-02-27 13:21             ` Lorenzo Bianconi
2026-02-27 13:21               ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-27 23:32               ` Jakub Kicinski
2026-02-27 23:32                 ` [Intel-wired-lan] " Jakub Kicinski
2026-02-28 11:58                 ` Lorenzo Bianconi
2026-02-28 11:58                   ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 2/5] net: veth: Add xmo_rx_checksum callback to veth driver Lorenzo Bianconi
2026-02-17  8:33   ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 3/5] net: ice: Add xmo_rx_checksum callback Lorenzo Bianconi
2026-02-17  8:33   ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum Lorenzo Bianconi
2026-02-17  8:33   ` [Intel-wired-lan] " Lorenzo Bianconi
2026-02-17  9:17   ` bot+bpf-ci
2026-02-17  9:17     ` [Intel-wired-lan] " bot+bpf-ci
2026-02-17  8:34 ` [PATCH bpf-next v3 5/5] selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadat prog Lorenzo Bianconi
2026-02-17  8:34   ` [Intel-wired-lan] " Lorenzo Bianconi
