From: Alexander Lobakin <alexandr.lobakin@intel.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [PATCH net-next 13/19] iecm: implement splitq napi_poll
Date: Fri, 4 Feb 2022 12:50:50 +0100 [thread overview]
Message-ID: <20220204115050.80763-1-alexandr.lobakin@intel.com> (raw)
In-Reply-To: <CO1PR11MB5186937D2198CEAA43DD4E578F289@CO1PR11MB5186.namprd11.prod.outlook.com>
From: Alan Brady <alan.brady@intel.com>
Date: Thu, 3 Feb 2022 02:07:08 +0100
> > -----Original Message-----
> > From: Lobakin, Alexandr <alexandr.lobakin@intel.com>
> > Sent: Friday, January 28, 2022 9:39 AM
> > To: Brady, Alan <alan.brady@intel.com>
> > Cc: Lobakin, Alexandr <alexandr.lobakin@intel.com>; intel-wired-
> > lan at lists.osuosl.org; Burra, Phani R <phani.r.burra@intel.com>; Chittim, Madhu
> > <madhu.chittim@intel.com>; Linga, Pavan Kumar
> > <pavan.kumar.linga@intel.com>
> > Subject: Re: [Intel-wired-lan] [PATCH net-next 13/19] iecm: implement splitq
> > napi_poll
> >
> > From: Alan Brady <alan.brady@intel.com>
> > Date: Thu, 27 Jan 2022 16:10:03 -0800
> >
> > > This adds everything we need to actually receive packets and process spent
> > > buffers using the splitq model. This contrasts to more traditional queueing
> > > models by essentially splitting a normal queue of descriptors and mapped
> > > buffers into separate queues. This allows us to deal with both more
> > > efficiently and also allows us to implement asymmetric queuing setups where
> > > you have multiple completion queues associated with a single buffer queue.
> > >
> > > Signed-off-by: Phani Burra <phani.r.burra@intel.com>
> > > Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
> > > Signed-off-by: Madhu Chittim <madhu.chittim@intel.com>
> > > Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
> > > Signed-off-by: Alan Brady <alan.brady@intel.com>
> > > ---
> > > drivers/net/ethernet/intel/iecm/iecm_txrx.c | 1468 ++++++++++++++++-
> > > drivers/net/ethernet/intel/include/iecm.h | 4 +
> > > .../net/ethernet/intel/include/iecm_txrx.h | 20 +
> > > 3 files changed, 1490 insertions(+), 2 deletions(-)
> > >
--- 8< ---
> > > + bi = IECM_SPLITQ_RX_BI_DESC(refillq, nta);
> > > + /* store the buffer ID and the SW maintained GEN bit to the refillq */
> > > + *bi = ((buf_id << IECM_RX_BI_BUFID_S) & IECM_RX_BI_BUFID_M) |
> > > + (!!(test_bit(__IECM_Q_GEN_CHK, refillq->flags)) <<
> > > + IECM_RX_BI_GEN_S);
> >
> > Please use FIELD_GET() and FIELD_PREP() for masks. This won't pass
> > the maintainers.
> >
>
> We've never had a problem before; I'm assuming these are new? Will check.
Ok, maybe "won't pass" is not quite correct, but at least "can
provoke questions". It depends on the reviewer; e.g. I generally
discourage those when it could be just
	*bi = FIELD_PREP(IECM_RX_BI_BUFID_M, buf_id) |
	      FIELD_PREP(IECM_RX_BI_GEN_M,
			 test_bit(__IECM_Q_GEN_CHK, refillq->flags));

or even

(somewhere near the _M/_S definitions)

#define iecm_rx_bi_prep(gen, id) \
	(FIELD_PREP(IECM_RX_BI_BUFID_M, id) | FIELD_PREP(IECM_RX_BI_GEN_M, gen))

(inside the function itself)

	*bi = iecm_rx_bi_prep(test_bit(__IECM_Q_GEN_CHK, refillq->flags),
			      buf_id);
Note that there's no `!!` before test_bit() since it's boolean
already.
There's nothing bad in introducing macros and/or static inlines to
make the functions themselves more compact and readable. Repeating
`(value << SOME_KIND_OF_A_FIELD_S) & SOME_KIND_OF_A_FIELD_M` each
time we need to load or store from/to registers or descriptors
doesn't make any sense to me.
>
> > > +
> > > + nta++;
> > > + if (unlikely(nta == refillq->desc_count)) {
> >
> > Could be compressed into one line.
> >
>
> Could be, but we'd prefer not to.
Just a personal preference to consider, no pressure at all.
>
> > > + nta = 0;
> > > + change_bit(__IECM_Q_GEN_CHK, refillq->flags);
--- 8< ---
> > for (i = ...)
> > if (vport->txqs[i].q_id == q_id)
> > return tx_q;
> >
> > No need to create a variable.
> >
>
> It would actually look like
>
> for (i = ...)
> if (vport->txqs[i]->q_id == q_id)
> return vport->txqs[i];
>
>
> You had another comment about adding a vc_ops variable where it was being used twice. I'm not seeing a huge difference here, and it seems like splitting hairs. I think we would prefer to keep this.
I'm usually ok with using something not stored locally up to three
times, so please quote me if I really did that somewhere.
Creating a variable is really pointless here: apart from growing
the stack, it involves braces and a newline, which can easily be
avoided at no cost.
>
> > > +
> > > + return NULL;
> > > +}
> > > +
> > > +/**
> > > + * iecm_tx_handle_sw_marker - Handle queue marker packet
--- 8< ---
> > > +
> > > + /* modifies the skb - consumes the enet header */
> > > + skb->protocol = eth_type_trans(skb, rxq->vport->netdev);
> >
> > eth_type_trans() should generally be called *right* before
> > napi_gro_receive() to still have caches warm.
> >
>
> I'm pretty sure this is happening here because we need to consume the header before messing with checksum stuff, but I'll have to dig deeper. Will check.
You can always do skb_pull_inline() or skb_pull() or __skb_pull() to
shift skb->data to the right by ETH_HLEN, and then skb_push() (or
its inline variant) right before eth_type_trans(). They're all +/-
one operation (ptr increment/decrement) and are much cheaper than
the cache misses (esp. on DMA buffers).
Or even just use `skb->data + ETH_HLEN, skb->len - ETH_HLEN` if
there aren't many places using them.
>
> > > + iecm_rx_splitq_extract_csum_bits(rx_desc, &csum_bits);
> > > + iecm_rx_csum(rxq, skb, &csum_bits, &decoded);
> > > + /* process RSS/hash */
--- 8< ---
> > > +static bool iecm_rx_page_is_reserved(struct page *page)
> > > +{
> > > +	return (page_to_nid(page) != numa_mem_id()) ||
> > > +	       page_is_pfmemalloc(page);
> > > +}
> >
> > Please check generic dev_page_is_reusable(), it's almost the same
> > (a bit more optimized).
> >
>
> Will check and see if it does what we need.
I introduced it upstream exactly for that purpose and switched all
Intel drivers (and in general all the drivers which use such a
construct) to it, so it *does*.
>
> > > +
> > > +/**
> > > + * iecm_rx_buf_adjust_pg - Prepare rx buffer for reuse
--- 8< ---
> > > +/**
> > > + * iecm_rx_can_reuse_page - Determine if page can be reused for another rx
> > > + * @rx_buf: buffer containing the page
> > > + *
> > > + * If page is reusable, we have a green light for calling iecm_reuse_rx_page,
> > > + * which will assign the current buffer to the buffer that next_to_alloc is
> > > + * pointing to; otherwise, the dma mapping needs to be destroyed and
> > > + * page freed
> > > + */
> > > +bool iecm_rx_can_reuse_page(struct iecm_rx_buf *rx_buf)
BTW, this is a relatively small but very hotpath function, please
consider making it static inline. Compilers can still uninline it
if they find it big enough.
> > > +{
> > > +	struct iecm_page_info *page_info = &rx_buf->page_info[rx_buf->page_indx];
> > > +
> > > +#if (PAGE_SIZE >= 8192)
> > > + unsigned int last_offset = PAGE_SIZE - rx_buf->buf_size;
> > > +#endif /* PAGE_SIZE < 8192) */
> > > + unsigned int pagecnt_bias = page_info->pagecnt_bias;
> > > + struct page *page = page_info->page;
> > > +
> > > + /* avoid re-using remote pages */
> > > + if (unlikely(iecm_rx_page_is_reserved(page)))
> > > + return false;
> > > +
> > > +#if (PAGE_SIZE < 8192)
> > > + /* if we are only owner of page we can reuse it */
> > > + if (unlikely((page_count(page) - pagecnt_bias) > 1))
> > > + return false;
> > > +#else
> > > + if (rx_buf->page_offset > last_offset)
> > > + return false;
> > > +#endif /* PAGE_SIZE < 8192) */
> >
> > Same here 2 times.
> >
> > > +
> > > + /* If we have drained the page fragment pool we need to update
> > > + * the pagecnt_bias and page count so that we fully restock the
> > > + * number of references the driver holds.
> > > + */
> > > + if (unlikely(pagecnt_bias == 1)) {
> >
> > With 1532 byte frames, this condition will be hit 50% of the time.
> > It's definitely not a good place for unlikely().
> >
>
> I'm afraid I'm not following here, mind elaborating? Keep in mind the buffer is 4k not 2k on MEV.
4k doesn't make any difference, but I confused reaching the bottom
of @pagecnt_bias with reaching the end of the page, so never mind.
>
> > > + page_ref_add(page, USHRT_MAX - 1);
> > > + page_info->pagecnt_bias = USHRT_MAX;
> > > + }
> > > +
> > > + return true;
> > > +}
> > > +
> > > +/**
> > > + * iecm_rx_add_frag - Add contents of Rx buffer to sk_buff as a frag
--- 8< ---
> > > +
> > > + /* protocol */
> > > + if (unlikely(iecm_rx_process_skb_fields(rxq, skb,
> > > + splitq_flex_rx_desc))) {
> >
> > You can define variables with shorter names to avoid this.
> >
>
> The only variable I see here with a questionably long name is splitq_flex_rx_desc, but I don't see a way to shorten it without losing information. Note 'rx_desc' is also a variable that exists in this context, and `splitq_base_rx_desc` could conceivably also exist, so splitq_rx_desc doesn't work.
You don't actually need all that information. The SplitQ model has
its own separate file, and it's reflected in the name of this
function. Rx processing is reflected as well, so "flexd"
("flex_desc" after all) is enough to get it all.
@rxq is named "rxq", not "splitq_rx_queue", right?
>
> > > + dev_kfree_skb_any(skb);
> > > + skb = NULL;
> > > + continue;
> > > + }
--- 8< ---
> > > static int iecm_vport_splitq_napi_poll(struct napi_struct *napi, int budget)
> > > {
> > > - /* stub */
> > > - return 0;
> > > + struct iecm_q_vector *q_vector =
> > > + container_of(napi, struct iecm_q_vector, napi);
> >
> > You can assign it later (before clean_complete assignment) to avoid
> > this.
> >
>
> To avoid the word wrap? I'm not sure I see the difference between letting it wrap and assigning it on its own line.
Yes. As I mentioned in one of my previous replies, declaration and
assignment which have their own lines is a regular occasion, and
a line wrap of an assignment is an exception *only* for the cases
when there's no other way.
I can't get why people fight so hard against splitting
declarations and assignments. Like, you do
	clean_complete = iecm_tx_splitq_clean_all(q_vector, budget);
right after the declaration block, why don't you try to merge it
with the declaration itself? Because it's a non-read-only operation?
What is the actual problem with
	struct iecm_q_vector *q_vector;
	bool clean_complete;
	int work_done = 0;

	q_vector = container_of(napi, typeof(*q_vector), napi);
	clean_complete = iecm_tx_splitq_clean_all(q_vector, budget);
then? Is it less readable? To me it is actually more. Is it less
logical? Probably a newline between them will help if needed.
And also, if you have a really good reason for keeping it that way,
you should've wrapped it by arguments actually, not by operation.
	struct iecm_q_vector *q_vector = container_of(napi, typeof(*q_vector),
						      napi);
Wrapping by operation takes place *only* when there's no other way,
and here are at least two now.
>
> > > + bool clean_complete;
> > > + int work_done = 0;
> > > +
> > > + clean_complete = iecm_tx_splitq_clean_all(q_vector, budget);
> > > +
> > > + /* Handle case where we are called by netpoll with a budget of 0 */
--- 8< ---
> > > --
> > > 2.33.0
> >
> > Thanks,
> > Al
Thanks,
Al