From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexander Lobakin <aleksander.lobakin@intel.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
John Fastabend <john.fastabend@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Magnus Karlsson <magnus.karlsson@intel.com>,
nex.sw.ncis.osdt.itp.upstreaming@intel.com, bpf@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v3 09/18] page_pool: allow mixing PPs within one bulk
Date: Fri, 01 Nov 2024 14:09:59 +0100 [thread overview]
Message-ID: <87ldy39k2g.fsf@toke.dk> (raw)
In-Reply-To: <20241030165201.442301-10-aleksander.lobakin@intel.com>
Alexander Lobakin <aleksander.lobakin@intel.com> writes:
> The main reason for this change was to allow mixing pages from different
> &page_pools within one &xdp_buff/&xdp_frame. Why not?
> Adjust xdp_return_frame_bulk() and page_pool_put_page_bulk(), so that
> they won't be tied to a particular pool. Let the latter create a
> separate bulk of pages whose PP is different and flush it recursively.
> This greatly optimizes xdp_return_frame_bulk(): no more hashtable
> lookups. Also make xdp_flush_frame_bulk() inline, as it's just one if +
> function call + one u32 read, not worth extending the call ladder.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Neat idea, but one comment, see below:
[...]
> /**
> * page_pool_put_page_bulk() - release references on multiple pages
> - * @pool: pool from which pages were allocated
> * @data: array holding page pointers
> * @count: number of pages in @data
> + * @rec: whether it's called recursively by itself
> *
> * Tries to refill a number of pages into the ptr_ring cache holding ptr_ring
> * producer lock. If the ptr_ring is full, page_pool_put_page_bulk()
> @@ -854,21 +865,43 @@ EXPORT_SYMBOL(page_pool_put_unrefed_page);
> * Please note the caller must not use data area after running
> * page_pool_put_page_bulk(), as this function overwrites it.
> */
> -void page_pool_put_page_bulk(struct page_pool *pool, struct page **data,
> - u32 count)
> +void page_pool_put_page_bulk(struct page **data, u32 count, bool rec)
> {
> + struct page_pool *pool = NULL;
> + struct xdp_frame_bulk sub;
> int i, bulk_len = 0;
> bool allow_direct;
> bool in_softirq;
>
> - allow_direct = page_pool_napi_local(pool);
> + xdp_frame_bulk_init(&sub);
>
> for (i = 0; i < count; i++) {
> - netmem_ref netmem = page_to_netmem(compound_head(data[i]));
> + struct page *page;
> + netmem_ref netmem;
> +
> + if (!rec) {
> + page = compound_head(data[i]);
> + netmem = page_to_netmem(page);
>
> - /* It is not the last user for the page frag case */
> - if (!page_pool_is_last_ref(netmem))
> + /* It is not the last user for the page frag case */
> + if (!page_pool_is_last_ref(netmem))
> + continue;
> + } else {
> + page = data[i];
> + netmem = page_to_netmem(page);
> + }
> +
> + if (unlikely(!pool)) {
> + pool = page->pp;
> + allow_direct = page_pool_napi_local(pool);
> + } else if (page->pp != pool) {
> + /*
> + * If the page belongs to a different page_pool, save
> + * it to handle recursively after the main loop.
> + */
> + page_pool_bulk_rec_add(&sub, page);
> continue;
> + }
>
> netmem = __page_pool_put_page(pool, netmem, -1, allow_direct);
> /* Approved for bulk recycling in ptr_ring cache */
> @@ -876,6 +909,9 @@ void page_pool_put_page_bulk(struct page_pool *pool, struct page **data,
> data[bulk_len++] = (__force void *)netmem;
> }
>
> + if (sub.count)
> + page_pool_put_page_bulk(sub.q, sub.count, true);
> +
In the worst case here, this function can recursively call itself
XDP_BULK_QUEUE_SIZE (=16) times, which will burn ~2.5k of stack (each
recursion level keeps a 16-pointer struct xdp_frame_bulk plus its other
locals on the stack) and add a fair amount of function call overhead.
I'm not saying this level of recursion is likely to happen today, but
who knows about future uses? So why not make it iterative instead of
recursive (same basic idea, but with some kind of 'goto begin', or a
loop, instead of the recursive call)?
Something like:
void page_pool_put_page_bulk(struct page **data, u32 count)
{
	void *bulk_prod[XDP_BULK_QUEUE_SIZE];
	int page_count = 0, pages_left, bulk_len, i;
	struct page_pool *pool;
	bool allow_direct;
	bool in_softirq;

	/* Drop pages that still have other references; keep the rest */
	for (i = 0; i < count; i++) {
		struct page *p = compound_head(data[i]);

		if (page_pool_is_last_ref(page_to_netmem(p)))
			data[page_count++] = p;
	}

	if (!page_count)
		return;

begin:
	/* Handle all pages belonging to the same pool as the first page;
	 * pages from other pools are compacted to the front of @data and
	 * handled by the next iteration.
	 */
	pool = data[0]->pp;
	allow_direct = page_pool_napi_local(pool);
	pages_left = 0;
	bulk_len = 0;

	for (i = 0; i < page_count; i++) {
		struct page *p = data[i];
		netmem_ref netmem;

		if (unlikely(p->pp != pool)) {
			data[pages_left++] = p;
			continue;
		}

		netmem = __page_pool_put_page(pool, page_to_netmem(p), -1,
					      allow_direct);
		/* Approved for bulk recycling in ptr_ring cache */
		if (netmem)
			bulk_prod[bulk_len++] = (__force void *)netmem;
	}

	if (!bulk_len)
		goto out;

	/* Bulk producer into ptr_ring page_pool cache */
	in_softirq = page_pool_producer_lock(pool);

	for (i = 0; i < bulk_len; i++) {
		if (__ptr_ring_produce(&pool->ring, bulk_prod[i])) {
			/* ring full */
			recycle_stat_inc(pool, ring_full);
			break;
		}
	}
	recycle_stat_add(pool, ring, i);
	page_pool_producer_unlock(pool, in_softirq);

	/* Hopefully all pages were returned into ptr_ring */
	if (likely(i == bulk_len))
		goto out;

	/* ptr_ring cache full, free remaining pages outside producer lock
	 * since put_page() with refcnt == 1 can be an expensive operation
	 */
	for (; i < bulk_len; i++)
		page_pool_return_page(pool, (__force netmem_ref)bulk_prod[i]);

out:
	if (pages_left) {
		page_count = pages_left;
		goto begin;
	}
}
Personally I also think this is easier to read, and it gets rid of the
'rec' function parameter wart... :)
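For illustration, here's a minimal self-contained model of the same
sweep-by-owner pattern in plain C (hypothetical 'item'/'owner' types
standing in for page and page_pool, not kernel code): each pass handles
everything owned by the first remaining entry and compacts the rest to
the front of the array, so the work stays flat regardless of how many
pools end up mixed in one bulk.

/* Minimal sketch of the iterative sweep (hypothetical types) */
#include <stdio.h>

struct item {
	int owner;	/* stands in for page->pp */
	int id;
};

static void release_bulk(struct item **data, int count)
{
	while (count) {
		int owner = data[0]->owner;
		int left = 0;

		for (int i = 0; i < count; i++) {
			if (data[i]->owner != owner) {
				/* different owner, keep for the next pass */
				data[left++] = data[i];
				continue;
			}
			printf("release item %d to owner %d\n",
			       data[i]->id, owner);
		}
		count = left;
	}
}

int main(void)
{
	struct item it[] = { {1, 0}, {2, 1}, {1, 2}, {3, 3} };
	struct item *q[] = { &it[0], &it[1], &it[2], &it[3] };

	release_bulk(q, 4);
	return 0;
}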
-Toke
Thread overview: 33+ messages
2024-10-30 16:51 [PATCH net-next v3 00/18] xdp: a fistful of generic changes (+libeth_xdp) Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 01/18] jump_label: export static_key_slow_{inc,dec}_cpuslocked() Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 02/18] skbuff: allow 2-4-argument skb_frag_dma_map() Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 03/18] unroll: add generic loop unroll helpers Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 04/18] bpf, xdp: constify some bpf_prog * function arguments Alexander Lobakin
2024-11-01 11:46 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 05/18] xdp, xsk: constify read-only arguments of some static inline helpers Alexander Lobakin
2024-11-01 11:47 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 06/18] xdp: allow attaching already registered memory model to xdp_rxq_info Alexander Lobakin
2024-11-01 11:51 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 07/18] xdp: register system page pool as an XDP memory model Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 08/18] page_pool: make page_pool_put_page_bulk() actually handle array of pages Alexander Lobakin
2024-11-01 11:55 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 09/18] page_pool: allow mixing PPs within one bulk Alexander Lobakin
2024-11-01 13:09 ` Toke Høiland-Jørgensen [this message]
2024-11-04 14:32 ` Alexander Lobakin
2024-11-04 16:22 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 10/18] xdp: get rid of xdp_frame::mem.id Alexander Lobakin
2024-11-01 0:41 ` Jakub Kicinski
2024-11-04 14:36 ` Alexander Lobakin
2024-11-05 2:59 ` Jakub Kicinski
2024-11-01 13:13 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 11/18] xdp: add generic xdp_buff_add_frag() Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 12/18] xdp: add generic xdp_build_skb_from_buff() Alexander Lobakin
2024-11-01 13:18 ` Toke Høiland-Jørgensen
2024-11-04 14:39 ` Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 13/18] xsk: allow attaching XSk pool via xdp_rxq_info_reg_mem_model() Alexander Lobakin
2024-11-01 13:20 ` Toke Høiland-Jørgensen
2024-10-30 16:51 ` [PATCH net-next v3 14/18] xsk: make xsk_buff_add_frag really add a frag via __xdp_buff_add_frag() Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 15/18] xsk: add generic XSk &xdp_buff -> skb conversion Alexander Lobakin
2024-10-30 16:51 ` [PATCH net-next v3 16/18] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go Alexander Lobakin
2024-10-30 16:52 ` [PATCH net-next v3 17/18] libeth: support native XDP and register memory model Alexander Lobakin
2024-10-30 16:52 ` [PATCH net-next v3 18/18] libeth: add a couple of XDP helpers (libeth_xdp) Alexander Lobakin