All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>
Cc: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<andrii@kernel.org>, <martin.lau@linux.dev>, <song@kernel.org>,
	<yhs@fb.com>, <john.fastabend@gmail.com>, <kpsingh@kernel.org>,
	<sdf@google.com>, <haoluo@google.com>, <jolsa@kernel.org>,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>, <xdp-hints@xdp-project.net>,
	<netdev@vger.kernel.org>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	"Saeed Mahameed" <saeedm@mellanox.com>, <toke@kernel.org>
Subject: Re: [PATCH bpf-next v6 11/18] ice: put XDP meta sources assignment under a static key condition
Date: Thu, 2 Nov 2023 14:23:02 +0100	[thread overview]
Message-ID: <ZUOitlGHC0Kgrh5i@boxer> (raw)
In-Reply-To: <ZUENpw5GVqcL0GJc@lzaremba-mobl.ger.corp.intel.com>

On Tue, Oct 31, 2023 at 03:22:31PM +0100, Larysa Zaremba wrote:
> On Sat, Oct 28, 2023 at 09:55:52PM +0200, Maciej Fijalkowski wrote:
> > On Mon, Oct 23, 2023 at 11:35:46AM +0200, Larysa Zaremba wrote:
> > > On Fri, Oct 20, 2023 at 06:32:13PM +0200, Maciej Fijalkowski wrote:
> > > > On Thu, Oct 12, 2023 at 07:05:17PM +0200, Larysa Zaremba wrote:
> > > > > Usage of XDP hints requires putting additional information after the
> > > > > xdp_buff. In basic case, only the descriptor has to be copied on a
> > > > > per-packet basis, because xdp_buff permanently resides before per-ring
> > > > > metadata (cached time and VLAN protocol ID).
> > > > > 
> > > > > However, in ZC mode, xdp_buffs come from a pool, so memory after such
> > > > > buffer does not contain any reliable information, so everything has to be
> > > > > copied, damaging the performance.
> > > > > 
> > > > > Introduce a static key to enable meta sources assignment only when attached
> > > > > XDP program is device-bound.
> > > > > 
> > > > > This patch eliminates a 6% performance drop in ZC mode, which was a result
> > > > > of addition of XDP hints to the driver.
> > > > > 
> > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > ---
> > > > >  drivers/net/ethernet/intel/ice/ice.h      |  1 +
> > > > >  drivers/net/ethernet/intel/ice/ice_main.c | 14 ++++++++++++++
> > > > >  drivers/net/ethernet/intel/ice/ice_txrx.c |  3 ++-
> > > > >  drivers/net/ethernet/intel/ice/ice_xsk.c  |  3 +++
> > > > >  4 files changed, 20 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > > > > index 3d0f15f8b2b8..76d22be878a4 100644
> > > > > --- a/drivers/net/ethernet/intel/ice/ice.h
> > > > > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > > > > @@ -210,6 +210,7 @@ enum ice_feature {
> > > > >  };
> > > > >  
> > > > >  DECLARE_STATIC_KEY_FALSE(ice_xdp_locking_key);
> > > > > +DECLARE_STATIC_KEY_FALSE(ice_xdp_meta_key);
> > > > >  
> > > > >  struct ice_channel {
> > > > >  	struct list_head list;
> > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > > > > index 47e8920e1727..ee0df86d34b7 100644
> > > > > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > > > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > > > > @@ -48,6 +48,9 @@ MODULE_PARM_DESC(debug, "netif level (0=none,...,16=all)");
> > > > >  DEFINE_STATIC_KEY_FALSE(ice_xdp_locking_key);
> > > > >  EXPORT_SYMBOL(ice_xdp_locking_key);
> > > > >  
> > > > > +DEFINE_STATIC_KEY_FALSE(ice_xdp_meta_key);
> > > > > +EXPORT_SYMBOL(ice_xdp_meta_key);
> > > > > +
> > > > >  /**
> > > > >   * ice_hw_to_dev - Get device pointer from the hardware structure
> > > > >   * @hw: pointer to the device HW structure
> > > > > @@ -2634,6 +2637,11 @@ static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
> > > > >  	return -ENOMEM;
> > > > >  }
> > > > >  
> > > > > +static bool ice_xdp_prog_has_meta(struct bpf_prog *prog)
> > > > > +{
> > > > > +	return prog && prog->aux->dev_bound;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * ice_vsi_assign_bpf_prog - set or clear bpf prog pointer on VSI
> > > > >   * @vsi: VSI to set the bpf prog on
> > > > > @@ -2644,10 +2652,16 @@ static void ice_vsi_assign_bpf_prog(struct ice_vsi *vsi, struct bpf_prog *prog)
> > > > >  	struct bpf_prog *old_prog;
> > > > >  	int i;
> > > > >  
> > > > > +	if (ice_xdp_prog_has_meta(prog))
> > > > > +		static_branch_inc(&ice_xdp_meta_key);
> > > > 
> > > > i thought boolean key would be enough but inc/dec should serve properly
> > > > for example prog hotswap cases.
> > > >
> > > 
> > > My thought process on using counting instead of boolean was: there can be 
> > > several PFs that use the same driver, so therefore we need to keep track of how 
> > > many od them use hints. 
> > 
> > Very good point. This implies that if PF0 has hints-enabled prog loaded,
> > PF1 with non-hints prog will "suffer" from it.
> > 
> > Sorry for such a long delays in responses but I was having a hard time
> > making up my mind about it. In the end I have come up to some conclusions.
> > I know the timing for sending this response is not ideal, but I need to
> > get this off my chest and bring discussion back to life:)
> > 
> > IMHO having static keys to eliminate ZC overhead does not scale. I assume
> > every other driver would have to follow that.
> > 
> > XSK pool allows us to avoid initializing various things per each packet.
> > Instead, taking xdp_rxq_info as an example, each xdp_buff from pool has
> > xdp_rxq_info assigned at init time. With this in mind, we should have some
> > mechanism to set hints-specific things in xdp_buff_xsk::cb, at init time
> > as well. Such mechanism should not require us to expose driver's private
> > xdp_buff hints containers (such as ice_pkt_ctx) to XSK pool.
> > 
> > Right now you moved phctime down to ice_pkt_ctx and to me that's the main
> > reason we have to copy ice_pkt_ctx to each xdp_buff on ZC. What if we keep
> > the cached_phctime at original offset in ring but ice_pkt_ctx would get a
> > pointer to that?
> > 
> > This would allow us to init the pointer in each xdp_buff from XSK pool at
> > init time. I have come up with a way to program that via so called XSK
> > meta descriptors. Each desc would have data to write onto cb, offset
> > within cb and amount of bytes to write/copy.
> > 
> > I'll share the diff below but note that I didn't measure how much lower
> > the performance is degraded. My icelake machine where I used to measure
> > performance-sensitive code got broke. For now we can't escape initing
> > eop_desc per each xdp_buff, but I moved it to alloc side, as we mangle
> > descs there anyway.
> > 
> > I think mlx5 could benefit from that approach as well with initing the rq
> > ptr at init time.
> > 
> > Diff does mostly these things:
> > - move cached_phctime to old place in ice_rx_ring and add ptr to that in
> >   ice_pkt_ctx
> > - introduce xsk_pool_set_meta()
> > - use it from ice side.
> > 
> 
> Thank you for the code! I will probably send v7 with such changes. Are you OK, 
> if patch with core changes would go with you as an author?

Yes or I can produce a patch and share, up to you.

> 
> But also, I see a minor problem with that switching VLAN protocol does not 
> trigger buffer allocation, so we have to point to that too, this probably means 
> moving cached time back and finding 16 extra bits in CL3. Single pointer to 
> {cached time, vlan_proto} would be copied to be after xdp_buff.

It's not that it has to trigger buffer allocation, we could stop the
interface if pool is present and update vlan proto on pool's xdp_buffs
(from quick glance i don't see that we're stopping iface for setting vlan
features) but that sounds like more of a hassle to do...

So yeah maybe let's just have a ptr in ice_pkt_ctx as well.

[...]

  reply	other threads:[~2023-11-02 13:23 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-12 17:05 [PATCH bpf-next v6 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 01/18] ice: make RX hash reading code more reusable Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 02/18] ice: make RX HW timestamp " Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 03/18] ice: Make ptype internal to descriptor info processing Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 04/18] ice: Introduce ice_xdp_buff Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 05/18] ice: Support HW timestamp hint Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 06/18] ice: Support RX hash XDP hint Larysa Zaremba
2023-10-20 15:27   ` Maciej Fijalkowski
2023-10-23 10:04     ` Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 07/18] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
2023-10-17 16:13   ` Maciej Fijalkowski
2023-10-17 16:37     ` Magnus Karlsson
2023-10-17 16:45       ` Maciej Fijalkowski
2023-10-17 17:03         ` Larysa Zaremba
2023-10-18 10:43           ` Maciej Fijalkowski
2023-10-20 15:29   ` Maciej Fijalkowski
2023-10-23  9:37     ` Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 08/18] xdp: Add VLAN tag hint Larysa Zaremba
2023-10-18 23:59   ` Jakub Kicinski
2023-10-19  8:05     ` Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 09/18] ice: Implement " Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 10/18] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 11/18] ice: put XDP meta sources assignment under a static key condition Larysa Zaremba
2023-10-20 16:32   ` Maciej Fijalkowski
2023-10-23  9:35     ` Larysa Zaremba
2023-10-28 19:55       ` Maciej Fijalkowski
2023-10-31 14:22         ` Larysa Zaremba
2023-11-02 13:23           ` Maciej Fijalkowski [this message]
2023-11-02 13:48             ` Larysa Zaremba
2023-10-31 17:32         ` Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 12/18] veth: Implement VLAN tag XDP hint Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 13/18] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 14/18] mlx5: implement VLAN tag XDP hint Larysa Zaremba
2023-10-17 12:40   ` Tariq Toukan
2023-10-12 17:05 ` [PATCH bpf-next v6 15/18] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 16/18] selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 17/18] selftests/bpf: Use AF_INET for TX in xdp_metadata Larysa Zaremba
2023-10-12 17:05 ` [PATCH bpf-next v6 18/18] selftests/bpf: Check VLAN tag and proto " Larysa Zaremba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUOitlGHC0Kgrh5i@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=larysa.zaremba@intel.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=tariqt@mellanox.com \
    --cc=toke@kernel.org \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.