* [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size
@ 2023-08-01 13:54 Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
alloc_skb_with_frags(), while being able to use high order allocations,
limits the payload size to PAGE_SIZE * MAX_SKB_FRAGS.
Reviewing Tahsin Erdogan's patch [1], it was clear to me that we need
to remove this limitation.
[1] https://lore.kernel.org/netdev/20230731230736.109216-1-trdgn@amazon.com/
Eric Dumazet (4):
net: allow alloc_skb_with_frags() to allocate bigger packets
net: tun: change tun_alloc_skb() to allow bigger paged allocations
net/packet: change packet_alloc_skb() to allow bigger paged
allocations
net: tap: change tap_alloc_skb() to allow bigger paged allocations
drivers/net/tap.c | 4 ++-
drivers/net/tun.c | 4 ++-
net/core/skbuff.c | 56 +++++++++++++++++++-----------------------
net/packet/af_packet.c | 4 ++-
4 files changed, 34 insertions(+), 34 deletions(-)
--
2.41.0.585.gd2178a4bd4-goog
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 15:44 ` Willem de Bruijn
2023-08-01 13:54 ` [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations Eric Dumazet
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
Refactor alloc_skb_with_frags() to allow bigger packet allocations.
Instead of assuming that only order-0 allocations will be attempted,
use the caller-supplied max order.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
1 file changed, 25 insertions(+), 31 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
*
* @header_len: size of linear part
* @data_len: needed length in frags
- * @max_page_order: max page order desired.
+ * @order: max page order desired.
* @errcode: pointer to error code if any
* @gfp_mask: allocation mask
*
@@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
*/
struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
unsigned long data_len,
- int max_page_order,
+ int order,
int *errcode,
gfp_t gfp_mask)
{
- int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
unsigned long chunk;
struct sk_buff *skb;
struct page *page;
- int i;
+ int nr_frags = 0;
*errcode = -EMSGSIZE;
- /* Note this test could be relaxed, if we succeed to allocate
- * high order pages...
- */
- if (npages > MAX_SKB_FRAGS)
+ if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
return NULL;
*errcode = -ENOBUFS;
@@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
if (!skb)
return NULL;
- skb->truesize += npages << PAGE_SHIFT;
-
- for (i = 0; npages > 0; i++) {
- int order = max_page_order;
-
- while (order) {
- if (npages >= 1 << order) {
- page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
- __GFP_COMP |
- __GFP_NOWARN,
- order);
- if (page)
- goto fill_page;
- /* Do not retry other high order allocations */
- order = 1;
- max_page_order = 0;
- }
+ while (data_len) {
+ if (nr_frags == MAX_SKB_FRAGS - 1)
+ goto failure;
+ while (order && data_len < (PAGE_SIZE << order))
order--;
+
+ if (order) {
+ page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
+ __GFP_COMP |
+ __GFP_NOWARN,
+ order);
+ if (!page) {
+ order--;
+ continue;
+ }
+ } else {
+ page = alloc_page(gfp_mask);
+ if (!page)
+ goto failure;
}
- page = alloc_page(gfp_mask);
- if (!page)
- goto failure;
-fill_page:
chunk = min_t(unsigned long, data_len,
PAGE_SIZE << order);
- skb_fill_page_desc(skb, i, page, 0, chunk);
+ skb_fill_page_desc(skb, nr_frags, page, 0, chunk);
+ nr_frags++;
+ skb->truesize += (PAGE_SIZE << order);
data_len -= chunk;
- npages -= 1 << order;
}
return skb;
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 3/4] net/packet: change packet_alloc_skb() " Eric Dumazet
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
tun_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max allocation size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
drivers/net/tun.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d75456adc62ac82a4f9da10c50c48266c5a9a0c0..8a48431e8c5b3c6435079f84577435d8cac0eacf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1526,8 +1526,10 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- &err, 0);
+ &err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return ERR_PTR(err);
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 3/4] net/packet: change packet_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 4/4] net: tap: change tap_alloc_skb() " Eric Dumazet
2023-08-03 1:50 ` [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
packet_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
net/packet/af_packet.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8e3ddec4c3d57e7492b55404db3119cbba6f6022..3b77d255d22d6d2ed23cfc50e69a32e9c7b94531 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2927,8 +2927,10 @@ static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- err, 0);
+ err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return NULL;
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 4/4] net: tap: change tap_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
` (2 preceding siblings ...)
2023-08-01 13:54 ` [PATCH net-next 3/4] net/packet: change packet_alloc_skb() " Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-03 1:50 ` [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
tap_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
drivers/net/tap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 9137fb8c1c420a792211cb70105144e8c2d73bc9..01574b9d410f0d9bfadbddf748d194e003d9da2f 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -614,8 +614,10 @@ static inline struct sk_buff *tap_alloc_skb(struct sock *sk, size_t prepad,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- err, 0);
+ err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return NULL;
--
2.41.0.585.gd2178a4bd4-goog
* RE: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
@ 2023-08-01 15:44 ` Willem de Bruijn
2023-08-01 16:33 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 15:44 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> Refactor alloc_skb_with_frags() to allow bigger packets allocations.
>
> Instead of assuming that only order-0 allocations will be attempted,
> use the caller supplied max order.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Tahsin Erdogan <trdgn@amazon.com>
> ---
> net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> 1 file changed, 25 insertions(+), 31 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> *
> * @header_len: size of linear part
> * @data_len: needed length in frags
> - * @max_page_order: max page order desired.
> + * @order: max page order desired.
> * @errcode: pointer to error code if any
> * @gfp_mask: allocation mask
> *
> @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> */
> struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> unsigned long data_len,
> - int max_page_order,
> + int order,
> int *errcode,
> gfp_t gfp_mask)
> {
> - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> unsigned long chunk;
> struct sk_buff *skb;
> struct page *page;
> - int i;
> + int nr_frags = 0;
>
> *errcode = -EMSGSIZE;
> - /* Note this test could be relaxed, if we succeed to allocate
> - * high order pages...
> - */
> - if (npages > MAX_SKB_FRAGS)
> + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> return NULL;
>
> *errcode = -ENOBUFS;
> @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> if (!skb)
> return NULL;
>
> - skb->truesize += npages << PAGE_SHIFT;
> -
> - for (i = 0; npages > 0; i++) {
> - int order = max_page_order;
> -
> - while (order) {
> - if (npages >= 1 << order) {
> - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> - __GFP_COMP |
> - __GFP_NOWARN,
> - order);
> - if (page)
> - goto fill_page;
> - /* Do not retry other high order allocations */
Is this heuristic to only try one type of compound pages and else
fall back onto regular pages still relevant? I don't know the story
behind it.
> - order = 1;
> - max_page_order = 0;
> - }
> + while (data_len) {
> + if (nr_frags == MAX_SKB_FRAGS - 1)
> + goto failure;
> + while (order && data_len < (PAGE_SIZE << order))
> order--;
Why decrement order on every iteration through the loop, not just when
alloc_pages fails?
> +
> + if (order) {
> + page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> + __GFP_COMP |
> + __GFP_NOWARN,
> + order);
> + if (!page) {
> + order--;
> + continue;
> + }
> + } else {
> + page = alloc_page(gfp_mask);
> + if (!page)
> + goto failure;
> }
> - page = alloc_page(gfp_mask);
> - if (!page)
> - goto failure;
> -fill_page:
> chunk = min_t(unsigned long, data_len,
> PAGE_SIZE << order);
> - skb_fill_page_desc(skb, i, page, 0, chunk);
> + skb_fill_page_desc(skb, nr_frags, page, 0, chunk);
> + nr_frags++;
> + skb->truesize += (PAGE_SIZE << order);
> data_len -= chunk;
> - npages -= 1 << order;
> }
> return skb;
>
> --
> 2.41.0.585.gd2178a4bd4-goog
>
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 15:44 ` Willem de Bruijn
@ 2023-08-01 16:33 ` Eric Dumazet
2023-08-01 17:56 ` Willem de Bruijn
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 16:33 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 5:44 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Eric Dumazet wrote:
> > Refactor alloc_skb_with_frags() to allow bigger packets allocations.
> >
> > Instead of assuming that only order-0 allocations will be attempted,
> > use the caller supplied max order.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Tahsin Erdogan <trdgn@amazon.com>
> > ---
> > net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> > 1 file changed, 25 insertions(+), 31 deletions(-)
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > *
> > * @header_len: size of linear part
> > * @data_len: needed length in frags
> > - * @max_page_order: max page order desired.
> > + * @order: max page order desired.
> > * @errcode: pointer to error code if any
> > * @gfp_mask: allocation mask
> > *
> > @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > */
> > struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > unsigned long data_len,
> > - int max_page_order,
> > + int order,
> > int *errcode,
> > gfp_t gfp_mask)
> > {
> > - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> > unsigned long chunk;
> > struct sk_buff *skb;
> > struct page *page;
> > - int i;
> > + int nr_frags = 0;
> >
> > *errcode = -EMSGSIZE;
> > - /* Note this test could be relaxed, if we succeed to allocate
> > - * high order pages...
> > - */
> > - if (npages > MAX_SKB_FRAGS)
> > + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> > return NULL;
> >
> > *errcode = -ENOBUFS;
> > @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > if (!skb)
> > return NULL;
> >
> > - skb->truesize += npages << PAGE_SHIFT;
> > -
> > - for (i = 0; npages > 0; i++) {
> > - int order = max_page_order;
> > -
> > - while (order) {
> > - if (npages >= 1 << order) {
> > - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> > - __GFP_COMP |
> > - __GFP_NOWARN,
> > - order);
> > - if (page)
> > - goto fill_page;
> > - /* Do not retry other high order allocations */
>
> Is this heuristic to only try one type of compound pages and else
> fall back onto regular pages still relevant? I don't know the story
> behind it.
I keep doing high-order attempts without direct reclaim;
they should be fine, and we eventually fall back to order-2 pages
if we have plenty of them.
An immediate fallback to order-0 seems pessimistic.
>
> > - order = 1;
> > - max_page_order = 0;
> > - }
> > + while (data_len) {
> > + if (nr_frags == MAX_SKB_FRAGS - 1)
> > + goto failure;
> > + while (order && data_len < (PAGE_SIZE << order))
> > order--;
>
> Why decrement order on every iteration through the loop, not just when
> alloc_pages fails?
Say we enter the function with initial @data_len == 4000 and @order == 3.
We do not want to allocate/waste an order-3 page (32768 bytes on x86)
when an order-0 one is good enough to fit the expected payload.
Same story if the initial data_len is 33000:
- We should allocate one order-3 page and one order-0 one, instead of
two order-3 pages.
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 16:33 ` Eric Dumazet
@ 2023-08-01 17:56 ` Willem de Bruijn
2023-08-01 18:10 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 17:56 UTC (permalink / raw)
To: Eric Dumazet, Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
Eric Dumazet wrote:
> On Tue, Aug 1, 2023 at 5:44 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Eric Dumazet wrote:
> > > Refactor alloc_skb_with_frags() to allow bigger packets allocations.
> > >
> > > Instead of assuming that only order-0 allocations will be attempted,
> > > use the caller supplied max order.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Cc: Tahsin Erdogan <trdgn@amazon.com>
> > > ---
> > > net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> > > 1 file changed, 25 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > > *
> > > * @header_len: size of linear part
> > > * @data_len: needed length in frags
> > > - * @max_page_order: max page order desired.
> > > + * @order: max page order desired.
> > > * @errcode: pointer to error code if any
> > > * @gfp_mask: allocation mask
> > > *
> > > @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > > */
> > > struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > > unsigned long data_len,
> > > - int max_page_order,
> > > + int order,
> > > int *errcode,
> > > gfp_t gfp_mask)
> > > {
> > > - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> > > unsigned long chunk;
> > > struct sk_buff *skb;
> > > struct page *page;
> > > - int i;
> > > + int nr_frags = 0;
> > >
> > > *errcode = -EMSGSIZE;
> > > - /* Note this test could be relaxed, if we succeed to allocate
> > > - * high order pages...
> > > - */
> > > - if (npages > MAX_SKB_FRAGS)
> > > + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> > > return NULL;
> > >
> > > *errcode = -ENOBUFS;
> > > @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > > if (!skb)
> > > return NULL;
> > >
> > > - skb->truesize += npages << PAGE_SHIFT;
> > > -
> > > - for (i = 0; npages > 0; i++) {
> > > - int order = max_page_order;
> > > -
> > > - while (order) {
> > > - if (npages >= 1 << order) {
> > > - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> > > - __GFP_COMP |
> > > - __GFP_NOWARN,
> > > - order);
> > > - if (page)
> > > - goto fill_page;
> > > - /* Do not retry other high order allocations */
> >
> > Is this heuristic to only try one type of compound pages and else
> > fall back onto regular pages still relevant? I don't know the story
> > behind it.
>
> I keep doing high-order attempts without direct reclaim,
> they should be fine and we eventually fallback to order-2 pages
> if we have plenty of them.
>
> Immediate fallback to order-0 seems pessimistic.
>
> >
> > > - order = 1;
> > > - max_page_order = 0;
> > > - }
> > > + while (data_len) {
> > > + if (nr_frags == MAX_SKB_FRAGS - 1)
> > > + goto failure;
> > > + while (order && data_len < (PAGE_SIZE << order))
> > > order--;
> >
> > Why decrement order on every iteration through the loop, not just when
> > alloc_pages fails?
>
> Say we enter the function with initial @data_len == 4000, and @order==3
>
> We do not want to allocate/waste an order-3 page (32768 bytes on x86)
> while an order-0 one should be good enough to fit the expected
> payload.
>
> Same story if initial data_len = 33000:
> - We should allocate one order-3 page, and one order-0 one, instead of
> two order-3 pages.
Thanks for the explanation. For @data_len == 5000, you would want to
allocate an order-1?
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 17:56 ` Willem de Bruijn
@ 2023-08-01 18:10 ` Eric Dumazet
2023-08-01 18:21 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 18:10 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Thanks for the explanation. For @data_len == 5000, you would want to
> allocate an order-1?
Presumably we could try a bit harder; I will send a V2.
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 18:10 ` Eric Dumazet
@ 2023-08-01 18:21 ` Eric Dumazet
2023-08-01 18:39 ` Willem de Bruijn
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 18:21 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 8:10 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
>
> > Thanks for the explanation. For @data_len == 5000, you would want to
> > allocate an order-1?
>
> Presumably we could try a bit harder, I will send a V2.
I will squash the following part:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0ac70a0144a7c1f4e7824ddc19980aee73e4c121..c6f98245582cd4dd01a7c4f5708163122500a4f0
100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6233,7 +6233,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned
long header_len,
while (data_len) {
if (nr_frags == MAX_SKB_FRAGS - 1)
goto failure;
- while (order && data_len < (PAGE_SIZE << order))
+ while (order && PAGE_ALIGN(data_len) < (PAGE_SIZE << order))
order--;
if (order) {
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 18:21 ` Eric Dumazet
@ 2023-08-01 18:39 ` Willem de Bruijn
0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 18:39 UTC (permalink / raw)
To: Eric Dumazet, Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
Eric Dumazet wrote:
> On Tue, Aug 1, 2023 at 8:10 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> >
> > > Thanks for the explanation. For @data_len == 5000, you would want to
> > > allocate an order-1?
> >
> > Presumably we could try a bit harder, I will send a V2.
>
> I will squash the following part:
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0ac70a0144a7c1f4e7824ddc19980aee73e4c121..c6f98245582cd4dd01a7c4f5708163122500a4f0
> 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -6233,7 +6233,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned
> long header_len,
> while (data_len) {
> if (nr_frags == MAX_SKB_FRAGS - 1)
> goto failure;
> - while (order && data_len < (PAGE_SIZE << order))
> + while (order && PAGE_ALIGN(data_len) < (PAGE_SIZE << order))
> order--;
>
> if (order) {
Thanks. Clearly not a serious concern. v1 looks great to me too. Was
just trying to understand fully.
* Re: [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
` (3 preceding siblings ...)
2023-08-01 13:54 ` [PATCH net-next 4/4] net: tap: change tap_alloc_skb() " Eric Dumazet
@ 2023-08-03 1:50 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-08-03 1:50 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, pabeni, willemb, trdgn, netdev, eric.dumazet
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 1 Aug 2023 13:54:51 +0000 you wrote:
> alloc_skb_with_frags(), while being able to use high order allocations,
> limits the payload size to PAGE_SIZE * MAX_SKB_FRAGS
>
> Reviewing Tahsin Erdogan patch [1], it was clear to me we need
> to remove this limitation.
>
> [1] https://lore.kernel.org/netdev/20230731230736.109216-1-trdgn@amazon.com/
>
> [...]
Here is the summary with links:
- [net-next,1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
(no matching commit)
- [net-next,2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/ce7c7fef1473
- [net-next,3/4] net/packet: change packet_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/ae6db08f8b56
- [net-next,4/4] net: tap: change tap_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/37dfe5b8ddeb
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html