netdev.vger.kernel.org archive mirror
From: Alexander Duyck <alexander.h.duyck@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Jiafei.Pan@freescale.com" <Jiafei.Pan@freescale.com>,
	David Miller <davem@davemloft.net>,
	"jkosina@suse.cz" <jkosina@suse.cz>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"LeoLi@freescale.com" <LeoLi@freescale.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Subject: Re: [PATCH] net: use hardware buffer pool to allocate skb
Date: Thu, 16 Oct 2014 10:10:27 -0700
Message-ID: <543FFC03.1060207@redhat.com>
In-Reply-To: <1413478657.28798.22.camel@edumazet-glaptop2.roam.corp.google.com>


On 10/16/2014 09:57 AM, Eric Dumazet wrote:
> On Thu, 2014-10-16 at 08:28 -0700, Alexander Duyck wrote:
>
>> I think the part you are not getting is that this is how buffers are
>> essentially handled now.  So for example in the case of igb the only
>> part we have copied out is usually the header, or the entire frame in
>> the case of small packets.  This has to happen in order to allow for
>> changes to the header for routing and such.  Beyond that the frags that
>> are passed are the buffers that igb is still holding onto.  So
>> effectively what the other device transmits in a bridging/routing
>> scenario is my own net card specified buffer plus the copied/modified
>> header.
>>
>> For a brief period igb used build_skb but that isn't valid on most
>> systems as memory mapped for a device can be overwritten if the page is
>> unmapped resulting in any changes to the header for routing/bridging
>> purposes being invalidated.  Thus we cannot use the buffers for both the
>> skb->data header which may be changed and Rx DMA simultaneously.
> This reminds me that igb still has skb->truesize underestimation by 100%
>
> If a fragment is held in some socket receive buffer, a full page is
> consumed, not 2048 bytes.
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index a21b14495ebd..56ca6c78985e 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -6586,9 +6586,11 @@ static bool igb_add_rx_frag(struct igb_ring *rx_ring,
>   	struct page *page = rx_buffer->page;
>   	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
>   #if (PAGE_SIZE < 8192)
> -	unsigned int truesize = IGB_RX_BUFSZ;
> +	unsigned int segsize = IGB_RX_BUFSZ;
> +	unsigned int truesize = PAGE_SIZE;
>   #else
> -	unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
> +	unsigned int segsize = ALIGN(size, L1_CACHE_BYTES);
> +	unsigned int truesize = segsize;
>   #endif

So if a page is used twice, we are then double counting the page size
for the socket, is that correct?  I just want to make sure, because
prior to this patch both flows did the same thing and counted only the
portion of the page used in this pass.  With this change, for a
PAGE_SIZE of 4K we count the entire page, while in all other cases we
still count only the portion of the page used.
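
To make the arithmetic behind this question concrete, here is a minimal
sketch in plain C. It is not driver code: the SKETCH_* constants and the
three helper functions are hypothetical names introduced only for
illustration, and it assumes a 4096-byte page split into two 2048-byte
receive buffers (standing in for IGB_RX_BUFSZ), with both halves possibly
held by sockets at the same time.

```c
/* Hypothetical constants for illustration; not taken from kernel headers. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_RX_BUFSZ  2048u  /* stands in for IGB_RX_BUFSZ */

/* Total truesize charged when `halves_in_use` halves of one page are each
 * attached as a fragment, charging the buffer size per fragment
 * (the accounting before the quoted patch, for PAGE_SIZE < 8192). */
static unsigned int charged_before_patch(unsigned int halves_in_use)
{
	return halves_in_use * SKETCH_RX_BUFSZ;
}

/* Same situation, but charging a full page per fragment
 * (the accounting after the quoted patch, for PAGE_SIZE < 8192). */
static unsigned int charged_after_patch(unsigned int halves_in_use)
{
	return halves_in_use * SKETCH_PAGE_SIZE;
}

/* Memory actually pinned: the whole page stays allocated for as long as
 * at least one half is still referenced. */
static unsigned int pinned_bytes(unsigned int halves_in_use)
{
	return halves_in_use ? SKETCH_PAGE_SIZE : 0u;
}
```

With one half in use, the old accounting charges 2048 bytes against 4096
pinned, which is the underestimation described in the quoted text.  With
both halves in use, the new accounting charges 8192 bytes against the same
4096 pinned, which is the double counting asked about above.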

Thanks,

Alex


