* [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size
@ 2023-08-01 13:54 Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
alloc_skb_with_frags(), while being able to use high order allocations,
limits the payload size to PAGE_SIZE * MAX_SKB_FRAGS.
Reviewing Tahsin Erdogan's patch [1], it was clear to me that we need
to remove this limitation.
[1] https://lore.kernel.org/netdev/20230731230736.109216-1-trdgn@amazon.com/
Eric Dumazet (4):
net: allow alloc_skb_with_frags() to allocate bigger packets
net: tun: change tun_alloc_skb() to allow bigger paged allocations
net/packet: change packet_alloc_skb() to allow bigger paged
allocations
net: tap: change tap_alloc_skb() to allow bigger paged allocations
drivers/net/tap.c | 4 ++-
drivers/net/tun.c | 4 ++-
net/core/skbuff.c | 56 +++++++++++++++++++-----------------------
net/packet/af_packet.c | 4 ++-
4 files changed, 34 insertions(+), 34 deletions(-)
--
2.41.0.585.gd2178a4bd4-goog
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 15:44 ` Willem de Bruijn
2023-08-01 13:54 ` [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations Eric Dumazet
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
Refactor alloc_skb_with_frags() to allow bigger packet allocations.
Instead of assuming that only order-0 allocations will be attempted,
use the caller-supplied max order.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
1 file changed, 25 insertions(+), 31 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
*
* @header_len: size of linear part
* @data_len: needed length in frags
- * @max_page_order: max page order desired.
+ * @order: max page order desired.
* @errcode: pointer to error code if any
* @gfp_mask: allocation mask
*
@@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
*/
struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
unsigned long data_len,
- int max_page_order,
+ int order,
int *errcode,
gfp_t gfp_mask)
{
- int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
unsigned long chunk;
struct sk_buff *skb;
struct page *page;
- int i;
+ int nr_frags = 0;
*errcode = -EMSGSIZE;
- /* Note this test could be relaxed, if we succeed to allocate
- * high order pages...
- */
- if (npages > MAX_SKB_FRAGS)
+ if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
return NULL;
*errcode = -ENOBUFS;
@@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
if (!skb)
return NULL;
- skb->truesize += npages << PAGE_SHIFT;
-
- for (i = 0; npages > 0; i++) {
- int order = max_page_order;
-
- while (order) {
- if (npages >= 1 << order) {
- page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
- __GFP_COMP |
- __GFP_NOWARN,
- order);
- if (page)
- goto fill_page;
- /* Do not retry other high order allocations */
- order = 1;
- max_page_order = 0;
- }
+ while (data_len) {
+ if (nr_frags == MAX_SKB_FRAGS - 1)
+ goto failure;
+ while (order && data_len < (PAGE_SIZE << order))
order--;
+
+ if (order) {
+ page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
+ __GFP_COMP |
+ __GFP_NOWARN,
+ order);
+ if (!page) {
+ order--;
+ continue;
+ }
+ } else {
+ page = alloc_page(gfp_mask);
+ if (!page)
+ goto failure;
}
- page = alloc_page(gfp_mask);
- if (!page)
- goto failure;
-fill_page:
chunk = min_t(unsigned long, data_len,
PAGE_SIZE << order);
- skb_fill_page_desc(skb, i, page, 0, chunk);
+ skb_fill_page_desc(skb, nr_frags, page, 0, chunk);
+ nr_frags++;
+ skb->truesize += (PAGE_SIZE << order);
data_len -= chunk;
- npages -= 1 << order;
}
return skb;
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 3/4] net/packet: change packet_alloc_skb() " Eric Dumazet
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
tun_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max allocation size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
drivers/net/tun.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index d75456adc62ac82a4f9da10c50c48266c5a9a0c0..8a48431e8c5b3c6435079f84577435d8cac0eacf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1526,8 +1526,10 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- &err, 0);
+ &err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return ERR_PTR(err);
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 3/4] net/packet: change packet_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-01 13:54 ` [PATCH net-next 4/4] net: tap: change tap_alloc_skb() " Eric Dumazet
2023-08-03 1:50 ` [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
packet_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
net/packet/af_packet.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8e3ddec4c3d57e7492b55404db3119cbba6f6022..3b77d255d22d6d2ed23cfc50e69a32e9c7b94531 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2927,8 +2927,10 @@ static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- err, 0);
+ err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return NULL;
--
2.41.0.585.gd2178a4bd4-goog
* [PATCH net-next 4/4] net: tap: change tap_alloc_skb() to allow bigger paged allocations
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
` (2 preceding siblings ...)
2023-08-01 13:54 ` [PATCH net-next 3/4] net/packet: change packet_alloc_skb() " Eric Dumazet
@ 2023-08-01 13:54 ` Eric Dumazet
2023-08-03 1:50 ` [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 13:54 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
tap_alloc_skb() currently calls sock_alloc_send_pskb(),
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER to increase the max size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
---
drivers/net/tap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 9137fb8c1c420a792211cb70105144e8c2d73bc9..01574b9d410f0d9bfadbddf748d194e003d9da2f 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -614,8 +614,10 @@ static inline struct sk_buff *tap_alloc_skb(struct sock *sk, size_t prepad,
if (prepad + len < PAGE_SIZE || !linear)
linear = len;
+ if (len - linear > MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+ linear = len - MAX_SKB_FRAGS * (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER);
skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
- err, 0);
+ err, PAGE_ALLOC_COSTLY_ORDER);
if (!skb)
return NULL;
--
2.41.0.585.gd2178a4bd4-goog
* RE: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 13:54 ` [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets Eric Dumazet
@ 2023-08-01 15:44 ` Willem de Bruijn
2023-08-01 16:33 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 15:44 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Willem de Bruijn, Tahsin Erdogan, netdev, eric.dumazet,
Eric Dumazet
Eric Dumazet wrote:
> Refactor alloc_skb_with_frags() to allow bigger packets allocations.
>
> Instead of assuming that only order-0 allocations will be attempted,
> use the caller supplied max order.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Tahsin Erdogan <trdgn@amazon.com>
> ---
> net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> 1 file changed, 25 insertions(+), 31 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> *
> * @header_len: size of linear part
> * @data_len: needed length in frags
> - * @max_page_order: max page order desired.
> + * @order: max page order desired.
> * @errcode: pointer to error code if any
> * @gfp_mask: allocation mask
> *
> @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> */
> struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> unsigned long data_len,
> - int max_page_order,
> + int order,
> int *errcode,
> gfp_t gfp_mask)
> {
> - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> unsigned long chunk;
> struct sk_buff *skb;
> struct page *page;
> - int i;
> + int nr_frags = 0;
>
> *errcode = -EMSGSIZE;
> - /* Note this test could be relaxed, if we succeed to allocate
> - * high order pages...
> - */
> - if (npages > MAX_SKB_FRAGS)
> + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> return NULL;
>
> *errcode = -ENOBUFS;
> @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> if (!skb)
> return NULL;
>
> - skb->truesize += npages << PAGE_SHIFT;
> -
> - for (i = 0; npages > 0; i++) {
> - int order = max_page_order;
> -
> - while (order) {
> - if (npages >= 1 << order) {
> - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> - __GFP_COMP |
> - __GFP_NOWARN,
> - order);
> - if (page)
> - goto fill_page;
> - /* Do not retry other high order allocations */
Is this heuristic to only try one type of compound pages and else
fall back onto regular pages still relevant? I don't know the story
behind it.
> - order = 1;
> - max_page_order = 0;
> - }
> + while (data_len) {
> + if (nr_frags == MAX_SKB_FRAGS - 1)
> + goto failure;
> + while (order && data_len < (PAGE_SIZE << order))
> order--;
Why decrement order on every iteration through the loop, not just when
alloc_pages fails?
> +
> + if (order) {
> + page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> + __GFP_COMP |
> + __GFP_NOWARN,
> + order);
> + if (!page) {
> + order--;
> + continue;
> + }
> + } else {
> + page = alloc_page(gfp_mask);
> + if (!page)
> + goto failure;
> }
> - page = alloc_page(gfp_mask);
> - if (!page)
> - goto failure;
> -fill_page:
> chunk = min_t(unsigned long, data_len,
> PAGE_SIZE << order);
> - skb_fill_page_desc(skb, i, page, 0, chunk);
> + skb_fill_page_desc(skb, nr_frags, page, 0, chunk);
> + nr_frags++;
> + skb->truesize += (PAGE_SIZE << order);
> data_len -= chunk;
> - npages -= 1 << order;
> }
> return skb;
>
> --
> 2.41.0.585.gd2178a4bd4-goog
>
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 15:44 ` Willem de Bruijn
@ 2023-08-01 16:33 ` Eric Dumazet
2023-08-01 17:56 ` Willem de Bruijn
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 16:33 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 5:44 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Eric Dumazet wrote:
> > Refactor alloc_skb_with_frags() to allow bigger packets allocations.
> >
> > Instead of assuming that only order-0 allocations will be attempted,
> > use the caller supplied max order.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Cc: Tahsin Erdogan <trdgn@amazon.com>
> > ---
> > net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> > 1 file changed, 25 insertions(+), 31 deletions(-)
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > *
> > * @header_len: size of linear part
> > * @data_len: needed length in frags
> > - * @max_page_order: max page order desired.
> > + * @order: max page order desired.
> > * @errcode: pointer to error code if any
> > * @gfp_mask: allocation mask
> > *
> > @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > */
> > struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > unsigned long data_len,
> > - int max_page_order,
> > + int order,
> > int *errcode,
> > gfp_t gfp_mask)
> > {
> > - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> > unsigned long chunk;
> > struct sk_buff *skb;
> > struct page *page;
> > - int i;
> > + int nr_frags = 0;
> >
> > *errcode = -EMSGSIZE;
> > - /* Note this test could be relaxed, if we succeed to allocate
> > - * high order pages...
> > - */
> > - if (npages > MAX_SKB_FRAGS)
> > + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> > return NULL;
> >
> > *errcode = -ENOBUFS;
> > @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > if (!skb)
> > return NULL;
> >
> > - skb->truesize += npages << PAGE_SHIFT;
> > -
> > - for (i = 0; npages > 0; i++) {
> > - int order = max_page_order;
> > -
> > - while (order) {
> > - if (npages >= 1 << order) {
> > - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> > - __GFP_COMP |
> > - __GFP_NOWARN,
> > - order);
> > - if (page)
> > - goto fill_page;
> > - /* Do not retry other high order allocations */
>
> Is this heuristic to only try one type of compound pages and else
> fall back onto regular pages still relevant? I don't know the story
> behind it.
I keep doing high-order attempts without direct reclaim;
they should be fine, and we eventually fall back to order-2 pages
if we have plenty of them.
An immediate fallback to order-0 seems pessimistic.
>
> > - order = 1;
> > - max_page_order = 0;
> > - }
> > + while (data_len) {
> > + if (nr_frags == MAX_SKB_FRAGS - 1)
> > + goto failure;
> > + while (order && data_len < (PAGE_SIZE << order))
> > order--;
>
> Why decrement order on every iteration through the loop, not just when
> alloc_pages fails?
Say we enter the function with initial @data_len == 4000 and @order == 3.
We do not want to allocate/waste an order-3 page (32768 bytes on x86)
when an order-0 one is good enough to fit the expected payload.
Same story if the initial data_len is 33000:
- We should allocate one order-3 page and one order-0 one, instead of
two order-3 pages.
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 16:33 ` Eric Dumazet
@ 2023-08-01 17:56 ` Willem de Bruijn
2023-08-01 18:10 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 17:56 UTC (permalink / raw)
To: Eric Dumazet, Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
Eric Dumazet wrote:
> On Tue, Aug 1, 2023 at 5:44 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Eric Dumazet wrote:
> > > Refactor alloc_skb_with_frags() to allow bigger packets allocations.
> > >
> > > Instead of assuming that only order-0 allocations will be attempted,
> > > use the caller supplied max order.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Cc: Tahsin Erdogan <trdgn@amazon.com>
> > > ---
> > > net/core/skbuff.c | 56 +++++++++++++++++++++--------------------------
> > > 1 file changed, 25 insertions(+), 31 deletions(-)
> > >
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index a298992060e6efdecb87c7ffc8290eafe330583f..0ac70a0144a7c1f4e7824ddc19980aee73e4c121 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -6204,7 +6204,7 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > > *
> > > * @header_len: size of linear part
> > > * @data_len: needed length in frags
> > > - * @max_page_order: max page order desired.
> > > + * @order: max page order desired.
> > > * @errcode: pointer to error code if any
> > > * @gfp_mask: allocation mask
> > > *
> > > @@ -6212,21 +6212,17 @@ EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
> > > */
> > > struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > > unsigned long data_len,
> > > - int max_page_order,
> > > + int order,
> > > int *errcode,
> > > gfp_t gfp_mask)
> > > {
> > > - int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
> > > unsigned long chunk;
> > > struct sk_buff *skb;
> > > struct page *page;
> > > - int i;
> > > + int nr_frags = 0;
> > >
> > > *errcode = -EMSGSIZE;
> > > - /* Note this test could be relaxed, if we succeed to allocate
> > > - * high order pages...
> > > - */
> > > - if (npages > MAX_SKB_FRAGS)
> > > + if (unlikely(data_len > MAX_SKB_FRAGS * (PAGE_SIZE << order)))
> > > return NULL;
> > >
> > > *errcode = -ENOBUFS;
> > > @@ -6234,34 +6230,32 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
> > > if (!skb)
> > > return NULL;
> > >
> > > - skb->truesize += npages << PAGE_SHIFT;
> > > -
> > > - for (i = 0; npages > 0; i++) {
> > > - int order = max_page_order;
> > > -
> > > - while (order) {
> > > - if (npages >= 1 << order) {
> > > - page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
> > > - __GFP_COMP |
> > > - __GFP_NOWARN,
> > > - order);
> > > - if (page)
> > > - goto fill_page;
> > > - /* Do not retry other high order allocations */
> >
> > Is this heuristic to only try one type of compound pages and else
> > fall back onto regular pages still relevant? I don't know the story
> > behind it.
>
> I keep doing high-order attempts without direct reclaim,
> they should be fine and we eventually fallback to order-2 pages
> if we have plenty of them.
>
> Immediate fallback to order-0 seems pessimistic.
>
> >
> > > - order = 1;
> > > - max_page_order = 0;
> > > - }
> > > + while (data_len) {
> > > + if (nr_frags == MAX_SKB_FRAGS - 1)
> > > + goto failure;
> > > + while (order && data_len < (PAGE_SIZE << order))
> > > order--;
> >
> > Why decrement order on every iteration through the loop, not just when
> > alloc_pages fails?
>
> Say we enter the function with initial @data_len == 4000, and @order==3
>
> We do not want to allocate/waste an order-3 page (32768 bytes on x86)
> while an order-0 one should be good enough to fit the expected
> payload.
>
> Same story if initial data_len = 33000:
> - We should allocate one order-3 page, and one order-0 one, instead of
> two order-3 pages.
Thanks for the explanation. For @data_len == 5000, you would want to
allocate an order-1?
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 17:56 ` Willem de Bruijn
@ 2023-08-01 18:10 ` Eric Dumazet
2023-08-01 18:21 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 18:10 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Thanks for the explanation. For @data_len == 5000, you would want to
> allocate an order-1?
Presumably we could try a bit harder; I will send a V2.
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 18:10 ` Eric Dumazet
@ 2023-08-01 18:21 ` Eric Dumazet
2023-08-01 18:39 ` Willem de Bruijn
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2023-08-01 18:21 UTC (permalink / raw)
To: Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
On Tue, Aug 1, 2023 at 8:10 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
>
> > Thanks for the explanation. For @data_len == 5000, you would want to
> > allocate an order-1?
>
> Presumably we could try a bit harder, I will send a V2.
I will squash the following part:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0ac70a0144a7c1f4e7824ddc19980aee73e4c121..c6f98245582cd4dd01a7c4f5708163122500a4f0
100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6233,7 +6233,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned
long header_len,
while (data_len) {
if (nr_frags == MAX_SKB_FRAGS - 1)
goto failure;
- while (order && data_len < (PAGE_SIZE << order))
+ while (order && PAGE_ALIGN(data_len) < (PAGE_SIZE << order))
order--;
if (order) {
* Re: [PATCH net-next 1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
2023-08-01 18:21 ` Eric Dumazet
@ 2023-08-01 18:39 ` Willem de Bruijn
0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2023-08-01 18:39 UTC (permalink / raw)
To: Eric Dumazet, Willem de Bruijn
Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
Tahsin Erdogan, netdev, eric.dumazet
Eric Dumazet wrote:
> On Tue, Aug 1, 2023 at 8:10 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Aug 1, 2023 at 7:56 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> >
> > > Thanks for the explanation. For @data_len == 5000, you would want to
> > > allocate an order-1?
> >
> > Presumably we could try a bit harder, I will send a V2.
>
> I will squash the following part:
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 0ac70a0144a7c1f4e7824ddc19980aee73e4c121..c6f98245582cd4dd01a7c4f5708163122500a4f0
> 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -6233,7 +6233,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned
> long header_len,
> while (data_len) {
> if (nr_frags == MAX_SKB_FRAGS - 1)
> goto failure;
> - while (order && data_len < (PAGE_SIZE << order))
> + while (order && PAGE_ALIGN(data_len) < (PAGE_SIZE << order))
> order--;
>
> if (order) {
Thanks. Clearly not a serious concern. v1 looks great to me too. Was
just trying to understand fully.
* Re: [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size
2023-08-01 13:54 [PATCH net-next 0/4] net: relax alloc_skb_with_frags() max size Eric Dumazet
` (3 preceding siblings ...)
2023-08-01 13:54 ` [PATCH net-next 4/4] net: tap: change tap_alloc_skb() " Eric Dumazet
@ 2023-08-03 1:50 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-08-03 1:50 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, pabeni, willemb, trdgn, netdev, eric.dumazet
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 1 Aug 2023 13:54:51 +0000 you wrote:
> alloc_skb_with_frags(), while being able to use high order allocations,
> limits the payload size to PAGE_SIZE * MAX_SKB_FRAGS
>
> Reviewing Tahsin Erdogan patch [1], it was clear to me we need
> to remove this limitation.
>
> [1] https://lore.kernel.org/netdev/20230731230736.109216-1-trdgn@amazon.com/
>
> [...]
Here is the summary with links:
- [net-next,1/4] net: allow alloc_skb_with_frags() to allocate bigger packets
(no matching commit)
- [net-next,2/4] net: tun: change tun_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/ce7c7fef1473
- [net-next,3/4] net/packet: change packet_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/ae6db08f8b56
- [net-next,4/4] net: tap: change tap_alloc_skb() to allow bigger paged allocations
https://git.kernel.org/netdev/net-next/c/37dfe5b8ddeb
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html