From: "Zaremba, Larysa" <larysa.zaremba@intel.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"ast@kernel.org" <ast@kernel.org>,
"daniel@iogearbox.net" <daniel@iogearbox.net>,
"andrii@kernel.org" <andrii@kernel.org>,
"martin.lau@linux.dev" <martin.lau@linux.dev>,
"song@kernel.org" <song@kernel.org>, "yhs@fb.com" <yhs@fb.com>,
"john.fastabend@gmail.com" <john.fastabend@gmail.com>,
"kpsingh@kernel.org" <kpsingh@kernel.org>,
"sdf@google.com" <sdf@google.com>,
"haoluo@google.com" <haoluo@google.com>,
"jolsa@kernel.org" <jolsa@kernel.org>,
David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
Willem de Bruijn <willemb@google.com>,
"Brouer, Jesper" <brouer@redhat.com>,
"Burakov, Anatoly" <anatoly.burakov@intel.com>,
"Lobakin, Aleksander" <aleksander.lobakin@intel.com>,
Magnus Karlsson <magnus.karlsson@gmail.com>,
"Tahhan, Maryam" <mtahhan@redhat.com>,
"xdp-hints@xdp-project.net" <xdp-hints@xdp-project.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 12/21] xdp: Add checksum hint
Date: Fri, 21 Jul 2023 08:01:18 +0000
Message-ID: <ZLo6Stj4HofGOcGO@lincoln>
In-Reply-To: <64b9b4ddae4e7_2c3d502944a@willemb.c.googlers.com.notmuch>
On Thu, Jul 20, 2023 at 06:27:41PM -0400, Willem de Bruijn wrote:
> Zaremba, Larysa wrote:
> > On Thu, Jul 20, 2023 at 09:55:16AM -0400, Willem de Bruijn wrote:
> > > Zaremba, Larysa wrote:
> > > > On Thu, Jul 20, 2023 at 09:57:05AM +0000, Zaremba, Larysa wrote:
> > > > > On Wed, Jul 19, 2023 at 05:42:04PM -0400, Willem de Bruijn wrote:
> > > > > > Larysa Zaremba wrote:
> > > > > > > Implement functionality that enables drivers to expose to XDP code checksum
> > > > > > > information that consists of:
> > > > > > >
> > > > > > > - Checksum status - bitfield that consists of
> > > > > > > - number of consecutive validated checksums. This is almost the same as
> > > > > > > csum_level in skb, but starts with 1. Enum names for those bits still
> > > > > > > use checksum level concept, so it is less confusing for driver
> > > > > > > developers.
> > > > > > > - Is checksum partial? This bit cannot coexist with any other
> > > > > > > - Is there a complete checksum available?
> > > > > > > - Additional checksum data, a union of:
> > > > > > > - checksum start and offset, if checksum is partial
> > > > > > > - complete checksum, if available
> > > > > > >
> > > > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > > > ---
> > > > > > > Documentation/networking/xdp-rx-metadata.rst | 3 ++
> > > > > > > include/linux/netdevice.h | 3 ++
> > > > > > > include/net/xdp.h | 46 ++++++++++++++++++++
> > > > > > > kernel/bpf/offload.c | 2 +
> > > > > > > net/core/xdp.c | 23 ++++++++++
> > > > > > > 5 files changed, 77 insertions(+)
> > > > > > >
> > > > > > > diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
> > > > > > > index ea6dd79a21d3..7f056a44f682 100644
> > > > > > > --- a/Documentation/networking/xdp-rx-metadata.rst
> > > > > > > +++ b/Documentation/networking/xdp-rx-metadata.rst
> > > > > > > @@ -26,6 +26,9 @@ metadata is supported, this set will grow:
> > > > > > > .. kernel-doc:: net/core/xdp.c
> > > > > > > :identifiers: bpf_xdp_metadata_rx_vlan_tag
> > > > > > >
> > > > > > > +.. kernel-doc:: net/core/xdp.c
> > > > > > > + :identifiers: bpf_xdp_metadata_rx_csum
> > > > > > > +
> > > > > > > An XDP program can use these kfuncs to read the metadata into stack
> > > > > > > variables for its own consumption. Or, to pass the metadata on to other
> > > > > > > consumers, an XDP program can store it into the metadata area carried
> > > > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > > > > > index 1749f4f75c64..4f6da36ac123 100644
> > > > > > > --- a/include/linux/netdevice.h
> > > > > > > +++ b/include/linux/netdevice.h
> > > > > > > @@ -1660,6 +1660,9 @@ struct xdp_metadata_ops {
> > > > > > > enum xdp_rss_hash_type *rss_type);
> > > > > > > int (*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tci,
> > > > > > > __be16 *vlan_proto);
> > > > > > > + int (*xmo_rx_csum)(const struct xdp_md *ctx,
> > > > > > > + enum xdp_csum_status *csum_status,
> > > > > > > + union xdp_csum_info *csum_info);
> > > > > > > };
> > > > > > >
> > > > > > > /**
> > > > > > > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > > > > > > index 89c58f56ffc6..2b7a7d678ff4 100644
> > > > > > > --- a/include/net/xdp.h
> > > > > > > +++ b/include/net/xdp.h
> > > > > > > @@ -391,6 +391,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
> > > > > > > bpf_xdp_metadata_rx_hash) \
> > > > > > > XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
> > > > > > > bpf_xdp_metadata_rx_vlan_tag) \
> > > > > > > + XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CSUM, \
> > > > > > > + bpf_xdp_metadata_rx_csum) \
> > > > > > >
> > > > > > > enum {
> > > > > > > #define XDP_METADATA_KFUNC(name, _) name,
> > > > > > > @@ -448,6 +450,50 @@ enum xdp_rss_hash_type {
> > > > > > > XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
> > > > > > > };
> > > > > > >
> > > > > > > +union xdp_csum_info {
> > > > > > > + /* Checksum referred to by ``csum_start + csum_offset`` is considered
> > > > > > > + * valid, but was never calculated, TX device has to do this,
> > > > > > > + * starting from csum_start packet byte.
> > > > > > > + * Any preceding checksums are also considered valid.
> > > > > > > + * Available, if ``status == XDP_CHECKSUM_PARTIAL``.
> > > > > > > + */
> > > > > > > + struct {
> > > > > > > + u16 csum_start;
> > > > > > > + u16 csum_offset;
> > > > > > > + };
> > > > > > > +
> > > > > > > + /* Checksum, calculated over the whole packet.
> > > > > > > + * Available, if ``status & XDP_CHECKSUM_COMPLETE``.
> > > > > > > + */
> > > > > > > + u32 checksum;
> > > > > > > +};
> > > > > > > +
> > > > > > > +enum xdp_csum_status {
> > > > > > > + /* HW had parsed several transport headers and validated their
> > > > > > > + * checksums, same as ``CHECKSUM_UNNECESSARY`` in ``sk_buff``.
> > > > > > > + * 3 least significat bytes contain number of consecutive checksum,
> > > > > >
> > > > > > typo: significant
> > > > > >
> > > > > > (I did not scan for typos, just came across this when trying to understand
> > > > > > the skb->csum_level + 1 trick. Probably good to run a spell check).
> > > > > >
> > > >
> > > > Oh, and about skb->csum_level + 1, maybe this way it would be more
> > > > understandable: XDP_CHECKSUM_VALID_LVL0 + skb->csum_level?
> > >
> > > Agreed, that would help document the intent.
> > >
> > > > Using number of valid checksums (starts with 1) instead of checksum level
> > > > (starts with 0) is a debatable decision, but I have decided to go with it under
> > > > 2 assumptions:
> > > >
> > > > - the only reason checksum level in skb starts with 0 is to use less bits
> > > > - checksum number would be more intuitive for XDP/AF_XDP application developers
> > > >
> > > > I encourage everyone to share their opinion on that.
> > >
> > > I assumed this offset by one was because csum_status zero implicitly
> > > meant XDP_CHECKSUM_NONE. Is that not correct? That should probably
> > > get an explicit name.
> > >
> >
> > Well, I was not sure, whether I should add XDP_CHECKSUM_NONE, because it would
> > be equal to returning -ENODATA from kfunc, but after giving it some thought now,
> > it is worth to have XDP_CHECKSUM_NONE for packets that have no checksum to
> > check, like for hash there is XDP_RSS_TYPE_L2.
>
> On receive, CHECKSUM_NONE means that the packet has not been checked, not
> necessarily that it has no checksum. Perhaps the device was unable to
> parse the protocol.
>
> (on transmit, it conveys that a transmit checksum is not required.)

Oh, my bad, I have re-read the docs: for packets without a checksum,
CHECKSUM_UNNECESSARY is used instead and conveys "CRC is OK". In that case,
XDP_CHECKSUM_NONE becomes a full equivalent of returning -ENODATA from the
kfunc, so I do not think an XDP_CHECKSUM_NONE enum value is worth including,
because it could lead to new people writing programs like this:

	if (bpf_xdp_metadata_rx_csum(ctx, &csum_status, &rx_csum_info))
		fallback();

	if (csum_status == XDP_CHECKSUM_NONE)
		fallback();
[...]