From: Stanislav Fomichev <sdf@google.com>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: brouer@redhat.com, bpf@vger.kernel.org, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
song@kernel.org, yhs@fb.com, john.fastabend@gmail.com,
kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org,
kuba@kernel.org, toke@kernel.org, willemb@google.com,
dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org,
maciej.fijalkowski@intel.com, hawk@kernel.org,
netdev@vger.kernel.org, xdp-hints@xdp-project.net
Subject: Re: [xdp-hints] [RFC net-next v4 2/8] xsk: add TX timestamp and TX checksum offload support
Date: Thu, 27 Jul 2023 09:34:30 -0700 [thread overview]
Message-ID: <ZMKclgdh4OVHEkJE@google.com> (raw)
In-Reply-To: <cce9db50-8c9d-ea97-cb88-171fa46cc064@redhat.com>
On 07/27, Jesper Dangaard Brouer wrote:
>
>
> On 26/07/2023 23.25, Stanislav Fomichev wrote:
> > On 07/26, Jesper Dangaard Brouer wrote:
> > >
> > >
> > > On 25/07/2023 01.59, Stanislav Fomichev wrote:
> > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > > index 11652e464f5d..8b40c80557aa 100644
> > > > --- a/include/linux/netdevice.h
> > > > +++ b/include/linux/netdevice.h
> > > > @@ -1660,6 +1660,31 @@ struct xdp_metadata_ops {
> > > > enum xdp_rss_hash_type *rss_type);
> > > > };
> > > > +/*
> > > > + * This structure defines the AF_XDP TX metadata hooks for network devices.
> > > > + * The following hooks can be defined; unless noted otherwise, they are
> > > > + * optional and can be filled with a null pointer.
> > > > + *
> > > > + * int (*tmo_request_timestamp)(void *priv)
> > > > + * This function is called when AF_XDP frame requested egress timestamp.
> > > > + *
> > > > + * int (*tmo_fill_timestamp)(void *priv)
> > > > + * This function is called when AF_XDP frame, that had requested
> > > > + * egress timestamp, received a completion. The hook needs to return
> > > > + * the actual HW timestamp.
> > > > + *
> > > > + * int (*tmo_request_timestamp)(u16 csum_start, u16 csum_offset, void *priv)
> > > > + * This function is called when AF_XDP frame requested HW checksum
> > > > + * offload. csum_start indicates position where checksumming should start.
> > > > + * csum_offset indicates position where checksum should be stored.
> > > > + *
> > > > + */
> > > > +struct xsk_tx_metadata_ops {
> > > > + void (*tmo_request_timestamp)(void *priv);
> > > > + u64 (*tmo_fill_timestamp)(void *priv);
> > > > + void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv);
> > > > +};
> > > > +
> > > > /**
> > > > * enum netdev_priv_flags - &struct net_device priv_flags
> > > > *
> > > > @@ -1844,6 +1869,7 @@ enum netdev_ml_priv_type {
> > > > * @netdev_ops: Includes several pointers to callbacks,
> > > > * if one wants to override the ndo_*() functions
> > > > * @xdp_metadata_ops: Includes pointers to XDP metadata callbacks.
> > > > + * @xsk_tx_metadata_ops: Includes pointers to AF_XDP TX metadata callbacks.
> > > > * @ethtool_ops: Management operations
> > > > * @l3mdev_ops: Layer 3 master device operations
> > > > * @ndisc_ops: Includes callbacks for different IPv6 neighbour
> > > > @@ -2100,6 +2126,7 @@ struct net_device {
> > > > unsigned long long priv_flags;
> > > > const struct net_device_ops *netdev_ops;
> > > > const struct xdp_metadata_ops *xdp_metadata_ops;
> > > > + const struct xsk_tx_metadata_ops *xsk_tx_metadata_ops;
> > > > int ifindex;
> > > > unsigned short gflags;
> > > > unsigned short hard_header_len;
> > > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > index faaba050f843..5febc1a5131e 100644
> > > > --- a/include/linux/skbuff.h
> > > > +++ b/include/linux/skbuff.h
> > > > @@ -581,7 +581,10 @@ struct skb_shared_info {
> > > > /* Warning: this field is not always filled in (UFO)! */
> > > > unsigned short gso_segs;
> > > > struct sk_buff *frag_list;
> > > > - struct skb_shared_hwtstamps hwtstamps;
> > > > + union {
> > > > + struct skb_shared_hwtstamps hwtstamps;
> > > > + struct xsk_tx_metadata *xsk_meta;
> > > > + };
> > > > unsigned int gso_type;
> > > > u32 tskey;
> > > > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> > > > index 467b9fb56827..288fa58c4665 100644
> > > > --- a/include/net/xdp_sock.h
> > > > +++ b/include/net/xdp_sock.h
> > > > @@ -90,6 +90,54 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
> > > > int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp);
> > > > void __xsk_map_flush(void);
> > > > +/**
> > > > + * xsk_tx_metadata_request - Evaluate AF_XDP TX metadata at submission
> > > > + * and call appropriate xsk_tx_metadata_ops operation.
> > > > + * @meta: pointer to AF_XDP metadata area
> > > > + * @ops: pointer to struct xsk_tx_metadata_ops
> > > > + * @priv: pointer to driver-private aread
> > > > + *
> > > > + * This function should be called by the networking device when
> > > > + * it prepares AF_XDP egress packet.
> > > > + */
> > > > +static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta,
> > > > + const struct xsk_tx_metadata_ops *ops,
> > > > + void *priv)
> > >
> > > (As you mentioned) this gets inlined in drivers for performance.
> > >
> > > > +{
> > > > + if (!meta)
> > > > + return;
> > > > +
> > > > + if (ops->tmo_request_timestamp)
> > > > + if (meta->flags & XDP_TX_METADATA_TIMESTAMP)
> > > > + ops->tmo_request_timestamp(priv);
> > >
> > > We still have the overhead of function pointer call.
> > > With RETPOLINE this is costly.
> > >
> > > Measured on my testlab CPU E5-1650 v4 @ 3.60GHz
> > > Type:function_call_cost: 3 cycles(tsc) 1.010 ns
> > > Type:func_ptr_call_cost: 30 cycles(tsc) 8.341 ns
> > >
> > > Given this get inlined in drivers, perhaps we can add some
> > > INDIRECT_CALL_1 macros to avoid these indirect calls?
> >
> > I'm assuming that the compiler is smart enough to resolve these const
> > struct ops definitions as long as they are in the same file.
> >
> > At least here is what I see for mlx5e_xmit_xdp_frame_mpwqe: those
> > indirect jumps are resolved and the calls themselves are unrolled.
> > TBF, I don't have retpolines enabled in the config, but I don't think
> > it will bring indirect jumps back? Am I missing anything?
> >
>
> I tried this with CONFIG_RETPOLINE and same results.
> The compiler is smart and inlines the call to mlx5e_xsk_request_checksum().
> This is great, no need for crazy INDIRECT_CALL_1 macros :-)
Perfect, thanks for checking!
> >
> > 0000000000001930 <mlx5e_xmit_xdp_frame_mpwqe>:
> > ; mlx5e_xmit_xdp_frame_mpwqe():
> > ; ./drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c:436
> [...]
> > ; ./drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c:381
> > ; stats->mpwqe++;
> > 1b4a: 49 ff 44 24 08 incq 0x8(%r12)
> > ; ././include/net/xdp_sock.h:107
>
> How do you get objdump to add these file:line annotations?
>
> I use:
> objdump -rS --visualize-jumps=color -Mintel | less -R
I use llvm tools (and complier), so I did:
llvm-objdump -r -S -l --disassemble xxx.o
Most likely it's that -l option? man objdump seems to have it..
next prev parent reply other threads:[~2023-07-27 16:34 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-24 23:59 [RFC net-next v4 0/8] xsk: TX metadata Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 1/8] xsk: Support XDP_TX_METADATA_LEN Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 2/8] xsk: add TX timestamp and TX checksum offload support Stanislav Fomichev
2023-07-25 19:28 ` Simon Horman
2023-07-25 20:30 ` Stanislav Fomichev
2023-07-25 21:28 ` Jakub Kicinski
2023-07-25 22:40 ` Stanislav Fomichev
2023-07-26 7:47 ` Simon Horman
2023-07-25 20:54 ` Willem de Bruijn
2023-07-25 22:39 ` Stanislav Fomichev
2023-07-25 23:10 ` Willem de Bruijn
2023-07-25 23:48 ` Stanislav Fomichev
2023-07-27 14:11 ` Jesper Dangaard Brouer
2023-07-27 16:37 ` Stanislav Fomichev
2023-07-25 20:58 ` Willem de Bruijn
2023-07-28 11:56 ` Jesper Dangaard Brouer
2023-07-28 13:19 ` Willem de Bruijn
2023-07-26 19:25 ` [xdp-hints] " Jesper Dangaard Brouer
2023-07-26 21:25 ` Stanislav Fomichev
2023-07-27 13:50 ` Jesper Dangaard Brouer
2023-07-27 16:34 ` Stanislav Fomichev [this message]
2023-08-07 5:47 ` Dan Carpenter
2023-07-24 23:59 ` [RFC net-next v4 3/8] net/mlx5e: Implement AF_XDP TX timestamp and checksum offload Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 4/8] tools: ynl: update netdev sample to dump xsk-features Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 5/8] selftests/xsk: Support XDP_TX_METADATA_LEN Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 6/8] selftests/bpf: Add csum helpers Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 7/8] selftests/bpf: Add TX side to xdp_metadata Stanislav Fomichev
2023-07-24 23:59 ` [RFC net-next v4 8/8] selftests/bpf: Add TX side to xdp_hw_metadata Stanislav Fomichev
2023-07-25 20:59 ` Willem de Bruijn
2023-07-25 22:36 ` Stanislav Fomichev
2023-07-25 22:55 ` Willem de Bruijn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZMKclgdh4OVHEkJE@google.com \
--to=sdf@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bjorn@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=dsahern@kernel.org \
--cc=haoluo@google.com \
--cc=hawk@kernel.org \
--cc=jbrouer@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=maciej.fijalkowski@intel.com \
--cc=magnus.karlsson@intel.com \
--cc=martin.lau@linux.dev \
--cc=netdev@vger.kernel.org \
--cc=song@kernel.org \
--cc=toke@kernel.org \
--cc=willemb@google.com \
--cc=xdp-hints@xdp-project.net \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.