netdev.vger.kernel.org archive mirror
From: Alexander Duyck <alexander.h.duyck@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Jiafei.Pan@freescale.com" <Jiafei.Pan@freescale.com>,
	David Miller <davem@davemloft.net>,
	"jkosina@suse.cz" <jkosina@suse.cz>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"LeoLi@freescale.com" <LeoLi@freescale.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH] net: use hardware buffer pool to allocate skb
Date: Thu, 16 Oct 2014 11:20:28 -0700	[thread overview]
Message-ID: <54400C6C.7010405@redhat.com> (raw)
In-Reply-To: <1413481529.28798.29.camel@edumazet-glaptop2.roam.corp.google.com>


On 10/16/2014 10:45 AM, Eric Dumazet wrote:
> On Thu, 2014-10-16 at 10:10 -0700, Alexander Duyck wrote:
>
>> So if a page is used twice, we are double-counting the page size for
>> the socket then, is that correct?  I just want to make sure, because
>> prior to this patch both flows did the same thing and counted the
>> portion of the page used in this pass.  Now, with this change, for a
>> PAGE_SIZE of 4K we count the entire page, and for all other cases we
>> count the portion of the page used.
> When a page is split in only 2 parts, the probability that a segment
> holds the entire 4K page is quite high (there is a single half page).

Actually, nothing seems to hold onto the 4K page for very long, at 
least from the driver's perspective.  That is one of the reasons I went 
with the page-reuse approach rather than partitioning a single large 
page: it allows us to avoid IOMMU map/unmap calls for the pages, since 
the entire page is usually back under driver ownership before we need 
to reuse the portion given to the stack.
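
To make that concrete, the reuse condition can be modeled roughly as follows (my own toy model, not the actual driver code; all names here are invented): the driver keeps its own reference on each RX page, hands one half to the stack, and only recycles the page once it is the sole owner again, so the DMA/IOMMU mapping never has to be torn down:

```python
class RxPage:
    """Toy model of a driver-owned RX page split into two halves."""
    def __init__(self):
        self.refcount = 1   # the driver's own reference
        self.mapped = True  # DMA/IOMMU mapping kept alive across reuse

def give_half_to_stack(page):
    page.refcount += 1      # the stack now holds the other half

def stack_frees_half(page):
    page.refcount -= 1      # the stack released its fragment

def driver_can_reuse(page):
    # Reuse only when the driver is the sole owner again;
    # in that case no IOMMU unmap/remap is needed.
    return page.refcount == 1

p = RxPage()
give_half_to_stack(p)
assert not driver_can_reuse(p)   # stack still holds its half
stack_frees_half(p)
assert driver_can_reuse(p) and p.mapped
```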

> When we split, say, 64KB into 42 segments, the probability that a
> single segment holds the full 64KB block is very low, so we are almost
> safe when we consider 'truesize = 1536'.

Yes, but the likelihood that only a few segments are holding the page is 
still very high.  So you might not have one segment holding the 64K 
page, but I find it very difficult to believe that all 42 would be 
holding it at the same time.  In that case, should we add some portion 
of the 64K to the truesize of every frame to account for this?
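
As a back-of-the-envelope illustration of the two accounting schemes being debated (my own sketch of the arithmetic, not kernel code): with 'truesize = 1536', a single lingering segment is charged far less than the 64KB block it can pin, whereas a proportional share spreads the block's cost across all 42 frames:

```python
# Illustrative arithmetic only: compare what a socket is charged
# with what it can actually pin, for a 64KB block in 42 segments.
PAGE_BLOCK = 64 * 1024    # 65536-byte block shared by the segments
SEG_TRUESIZE = 1536       # per-segment charge under 'truesize = 1536'
NUM_SEGS = 42

# One lingering segment is charged 1536 bytes...
charged_one = SEG_TRUESIZE
# ...yet it can keep the entire 64KB block from being freed.
pinned_one = PAGE_BLOCK
worst_case_ratio = pinned_one / charged_one   # under-accounting factor

# If every frame instead carried a proportional share of the block,
# the per-segment charge would be only slightly higher:
proportional_share = PAGE_BLOCK / NUM_SEGS    # ~1560 bytes
```

The worst case is pathological, as noted above; the proportional share shows how little extra each frame would carry to cover it.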

> Of course there are pathological cases, but attacker has to be quite
> smart.
>
> I am just saying that counting 2048 might have a big impact on memory
> consumption if all these incoming segments are stored for a long time
> in receive queues (TCP receive queues or out-of-order queues): we might
> be off by a factor of 2 on the real memory usage, and delay TCP
> collapsing too much.

My concern would be that we are off by a factor of 2 in the other 
direction and collapse the TCP queue prematurely with this change.  For 
example, a socket that is holding pages for a long period of time would 
have a good chance of ending up with both halves of the page.  In that 
case, is it fair to charge it for 8K of memory use when in reality it is 
only using 4K?
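
The 4K case reduces to simple arithmetic (again just an illustrative sketch, not the actual skb accounting code): if both halves of one split page land on the same socket and each half is charged the full page, the socket is billed double its real consumption:

```python
PAGE_SIZE = 4096

# Proposed accounting: each half-page fragment is charged a full page.
charged = 2 * PAGE_SIZE   # socket charged 8K for the two halves
# Actual memory pinned: both fragments share one physical page.
actual = PAGE_SIZE        # only 4K of real memory is in use
overcharge_factor = charged / actual   # off by a factor of 2
```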

Thanks,

Alex



Thread overview: 38+ messages
2014-10-15  3:26 [PATCH] net: use hardware buffer pool to allocate skb Pan Jiafei
2014-10-15  4:15 ` Eric Dumazet
2014-10-15  4:26   ` David Miller
2014-10-15  5:43   ` Jiafei.Pan
2014-10-15  9:15     ` Eric Dumazet
2014-10-15  4:25 ` David Miller
2014-10-15  5:34   ` Jiafei.Pan
2014-10-15  9:15     ` Eric Dumazet
2014-10-16  2:17       ` Jiafei.Pan
2014-10-16  4:15         ` Eric Dumazet
2014-10-16  5:15           ` Jiafei.Pan
2014-10-16 15:28             ` Alexander Duyck
2014-10-16 16:57               ` Eric Dumazet
2014-10-16 17:10                 ` Alexander Duyck
2014-10-16 17:45                   ` Eric Dumazet
2014-10-16 18:20                     ` Alexander Duyck [this message]
2014-10-16 21:40                       ` Eric Dumazet
2014-10-16 22:12                         ` Alexander Duyck
2014-10-17  9:11                       ` David Laight
2014-10-17 14:40                         ` Alexander Duyck
2014-10-17 16:55                           ` Eric Dumazet
2014-10-17 18:28                             ` Alexander Duyck
2014-10-17 18:53                               ` Eric Dumazet
2014-10-18  0:26                                 ` Eric Dumazet
2014-10-17 19:02                               ` Eric Dumazet
2014-10-17 19:38                                 ` Alexander Duyck
2014-10-17 19:51                                   ` Eric Dumazet
2014-10-17 22:13                                     ` Alexander Duyck
2014-10-17  2:35               ` Jiafei.Pan
2014-10-17 14:05                 ` Eric Dumazet
2014-10-17 14:12                   ` Alexander Duyck
2014-10-16  2:17       ` Jiafei.Pan
2014-10-15 15:51     ` David Miller
2014-10-15  4:59 ` Oliver Hartkopp
2014-10-15  5:47   ` Jiafei.Pan
2014-10-15  8:57 ` David Laight
2014-10-15  9:33 ` Stephen Hemminger
2014-10-16  2:30   ` Jiafei.Pan
