All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Greg Thelen <gthelen@google.com>
Subject: Re: [PATCH net] Revert "virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb()"
Date: Wed, 13 Jan 2021 04:52:58 -0500	[thread overview]
Message-ID: <20210113043722-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20210113051207.142711-1-eric.dumazet@gmail.com>

On Tue, Jan 12, 2021 at 09:12:07PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> This reverts commit c67f5db82027ba6d2ea4ac9176bc45996a03ae6a.
> 
> While using page fragments instead of a kmalloc backed skb->head might give
> a small performance improvement in some cases, there is a huge risk of
> memory use under estimation.
> 
> GOOD_COPY_LEN is 128 bytes. This means that we need a small amount
> of memory to hold the headers and struct skb_shared_info
> 
> Yet, napi_alloc_skb() might use a whole 32KB page (or 64KB on PowerPc)
> for long lived incoming TCP packets.
> 
> We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
> but consuming far more memory for TCP buffers than instructed in tcp_mem[2]

Are you using virtio on the host then? Is this with a hardware virtio
device? These do exist, guest is just more common, so I wanted to
make sure this is not a mistake.

> Even if we force napi_alloc_skb() to only use order-0 pages, the issue
> would still be there on arches with PAGE_SIZE >= 32768
> 
> Using alloc_skb() and thus standard kmallloc() for skb->head allocations
> will get the benefit of letting other objects in each page being independently
> used by other skbs, regardless of the lifetime.
> 
> Note that a similar problem exists for skbs allocated from napi_get_frags(),
> this is handled in a separate patch.
> 
> I would like to thank Greg Thelen for his precious help on this matter,
> analysing crash dumps is always a time consuming task.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Greg Thelen <gthelen@google.com>

Just curious - is the way virtio used napi_alloc_skb wrong somehow?

The idea was to benefit from better batching and less play with irq save
...

It would be helpful to improve the comments for napi_alloc_skb
to make it clearer how to use it correctly.

Are other uses of napi_alloc_skb ok?

> ---
>  drivers/net/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 508408fbe78fbd8658dc226834b5b1b334b8b011..5886504c1acacf3f6148127b5c1cc7f6a906b827 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -386,7 +386,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>  	p = page_address(page) + offset;
>  
>  	/* copy small packet so we can reuse these pages for small data */
> -	skb = napi_alloc_skb(&rq->napi, GOOD_COPY_LEN);
> +	skb = netdev_alloc_skb_ip_align(vi->dev, GOOD_COPY_LEN);
>  	if (unlikely(!skb))
>  		return NULL;
>  
> -- 
> 2.30.0.284.gd98b1dd5eaa7-goog


  parent reply	other threads:[~2021-01-13  9:54 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-13  5:12 [PATCH net] Revert "virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb()" Eric Dumazet
2021-01-13  5:28 ` Eric Dumazet
2021-01-13  9:52 ` Michael S. Tsirkin [this message]
2021-01-13  9:57   ` Eric Dumazet
2021-01-13 15:44 ` Paolo Abeni
2021-01-13 15:51   ` Eric Dumazet
2021-01-13 15:53     ` Eric Dumazet
2021-01-13 16:14       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210113043722-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=gthelen@google.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.