netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* skbuff truesize incorrect.
@ 2014-05-22 19:07 Jim Baxter
  2014-05-22 19:21 ` David Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Jim Baxter @ 2014-05-22 19:07 UTC (permalink / raw)
  To: linux-usb, linux-kernel, netdev@vger.kernel.org
  Cc: David S. Miller, kamal, Ben Hutchings, edumazet, mszeredi, fw

Hi, I was hoping you can help me with some questions.

I have been investigating a network issue with bursts of network traffic
over USB CDC-NCM, the issue is that the kernel is dropping packets
because sk_rcvqueues_full() returns true due to skb2->truesize is always
32960 instead of SKB_TRUESIZE(skb2->len) which is about 1800.

The code I am trying to fix is this code below, it is splitting a set of
multiple network packets compressed into a single 16k packet into
individual skb's and sending them up the network stack.

    skb2 = skb_clone(skb, GFP_ATOMIC);
    if (skb2 == NULL)
        goto err;

    if (!skb_pull(skb2, index)) {
        ret = -EOVERFLOW;
        goto err;
    }

    skb_trim(skb2, dg_len - crc_len);

My questions are:

1) Which buffer size does truesize represent, is it the total buffer or
just the data related to the relevant skb?

2) If truesize is for the skb it is contained within should it be
updated during the call to skb_trim?

3) Why does the truesize default to 32960?


Thank you for any help,
Jim Baxter.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:07 skbuff truesize incorrect Jim Baxter
@ 2014-05-22 19:21 ` David Miller
  2014-05-22 20:21   ` Jim Baxter
  2014-05-22 19:25 ` Vlad Yasevich
  2014-05-22 20:58 ` Eric Dumazet
  2 siblings, 1 reply; 32+ messages in thread
From: David Miller @ 2014-05-22 19:21 UTC (permalink / raw)
  To: jim_baxter
  Cc: linux-usb, linux-kernel, netdev, kamal, bhutchings, edumazet,
	mszeredi, fw

From: Jim Baxter <jim_baxter@mentor.com>
Date: Thu, 22 May 2014 20:07:41 +0100

> My questions are:
> 
> 1) Which buffer size does truesize represent, is it the total buffer or
> just the data related to the relevant skb?
> 
> 2) If truesize is for the skb it is contained within should it be
> updated during the call to skb_trim?
> 
> 3) Why does the truesize default to 32960?

skb->truesize represents the total amount of memory that the SKB
is holding up.

This is the size of the sk_buff metadata plus all of the buffer
memory.

For example, if the buffer is using 16K buffer yet only 1500 of it is
actually used for the packet, 16K is what should be plugged into the
equation to compute truesize.

Otherwise it would be easy to compromise a system by sending lots of 1
byte packets to a socket, even though the actual memory consumed by
that "1 byte" packet is significantly larger.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:07 skbuff truesize incorrect Jim Baxter
  2014-05-22 19:21 ` David Miller
@ 2014-05-22 19:25 ` Vlad Yasevich
  2014-05-22 19:39   ` Jim Baxter
  2014-05-22 20:58 ` Eric Dumazet
  2 siblings, 1 reply; 32+ messages in thread
From: Vlad Yasevich @ 2014-05-22 19:25 UTC (permalink / raw)
  To: Jim Baxter, linux-usb, linux-kernel, netdev@vger.kernel.org
  Cc: David S. Miller, kamal, Ben Hutchings, edumazet, mszeredi, fw

On 05/22/2014 03:07 PM, Jim Baxter wrote:
> Hi, I was hoping you can help me with some questions.
> 
> I have been investigating a network issue with bursts of network traffic
> over USB CDC-NCM, the issue is that the kernel is dropping packets
> because sk_rcvqueues_full() returns true due to skb2->truesize is always
> 32960 instead of SKB_TRUESIZE(skb2->len) which is about 1800.
> 
> The code I am trying to fix is this code below, it is splitting a set of
> multiple network packets compressed into a single 16k packet into
> individual skb's and sending them up the network stack.
> 
>     skb2 = skb_clone(skb, GFP_ATOMIC);
>     if (skb2 == NULL)
>         goto err;
> 
>     if (!skb_pull(skb2, index)) {
>         ret = -EOVERFLOW;
>         goto err;
>     }

This assumes that you original 16K packet is linear.  Is that
always the case?

> 
>     skb_trim(skb2, dg_len - crc_len);
> 
> My questions are:
> 
> 1) Which buffer size does truesize represent, is it the total buffer or
> just the data related to the relevant skb?

Total buffer size (including the struct sk_buff).

> 
> 2) If truesize is for the skb it is contained within should it be
> updated during the call to skb_trim?

No, because the the buffer/memory is still there.  skb_trim just
sets the tail pointer and adjusts skb->len.


> 
> 3) Why does the truesize default to 32960?

Probably because that's how much buffer space was actually allocated
for the original skb.  When you cloned it, you inherited the truesize
since a clone is just the struct sk_buff that points into the data
of the original.

This is the very same problem that I ran into with SCTP since it
has similar code in it.  You can play games with truesize manually,
but you have to be very careful here.

-vlad
> 
> 
> Thank you for any help,
> Jim Baxter.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:25 ` Vlad Yasevich
@ 2014-05-22 19:39   ` Jim Baxter
  2014-05-22 19:59     ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Jim Baxter @ 2014-05-22 19:39 UTC (permalink / raw)
  To: Vlad Yasevich, linux-usb, linux-kernel, netdev@vger.kernel.org
  Cc: David S. Miller, kamal, Ben Hutchings, edumazet, mszeredi, fw

On 22/05/14 20:25, Vlad Yasevich wrote:
> On 05/22/2014 03:07 PM, Jim Baxter wrote:
>>
>>     skb2 = skb_clone(skb, GFP_ATOMIC);
>>     if (skb2 == NULL)
>>         goto err;
>>
>>     if (!skb_pull(skb2, index)) {
>>         ret = -EOVERFLOW;
>>         goto err;
>>     }
> 
> This assumes that you original 16K packet is linear.  Is that
> always the case?
The packets within the original packet are linear, however they could be
in an arbitrary order because they are just offsets from the start of
the 16k skb.


> 
> This is the very same problem that I ran into with SCTP since it
> has similar code in it.  You can play games with truesize manually,
> but you have to be very careful here.
> 
> -vlad

I now think that the correct solution here is to create a new smaller
skb and copy the data from the sub packets into it.

Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:39   ` Jim Baxter
@ 2014-05-22 19:59     ` Eric Dumazet
  2014-05-22 20:21       ` Jim Baxter
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-05-22 19:59 UTC (permalink / raw)
  To: Jim Baxter
  Cc: Vlad Yasevich, linux-usb, linux-kernel, netdev@vger.kernel.org,
	David S. Miller, kamal, Ben Hutchings, edumazet, mszeredi, fw

On Thu, 2014-05-22 at 20:39 +0100, Jim Baxter wrote:

> I now think that the correct solution here is to create a new smaller
> skb and copy the data from the sub packets into it.

For low speed devices, this is indeed the best way.

(this is called copybreak in some nic drivers)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:59     ` Eric Dumazet
@ 2014-05-22 20:21       ` Jim Baxter
  0 siblings, 0 replies; 32+ messages in thread
From: Jim Baxter @ 2014-05-22 20:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Vlad Yasevich, linux-usb, linux-kernel, netdev@vger.kernel.org,
	David S. Miller, kamal, Ben Hutchings, edumazet, mszeredi, fw

On 22/05/14 20:59, Eric Dumazet wrote:
> On Thu, 2014-05-22 at 20:39 +0100, Jim Baxter wrote:
> 
>> I now think that the correct solution here is to create a new smaller
>> skb and copy the data from the sub packets into it.
> 
> For low speed devices, this is indeed the best way.
> 
> (this is called copybreak in some nic drivers)
> 
> 
> 
Thank you Eric.

It is USB device that can handle 480Mbits/s so a middle speed device.

It looks like it is the normal trade off of speed versus memory. Either
I can use skb_clone for speed but that will impact memory with multiple
skb's of 16k or I copy the data to keep the memory usage low. Though
copying 2k of data into an skb will not likely be a massive overhead.

Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:21 ` David Miller
@ 2014-05-22 20:21   ` Jim Baxter
       [not found]     ` <537E5C63.7080607-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org>
  2014-05-22 20:58     ` David Miller
  0 siblings, 2 replies; 32+ messages in thread
From: Jim Baxter @ 2014-05-22 20:21 UTC (permalink / raw)
  To: David Miller
  Cc: linux-usb, linux-kernel, netdev, kamal, bhutchings, edumazet,
	mszeredi, fw

On 22/05/14 20:21, David Miller wrote:
> From: Jim Baxter <jim_baxter@mentor.com>
> Date: Thu, 22 May 2014 20:07:41 +0100
> 
>> My questions are:
>>
>> 1) Which buffer size does truesize represent, is it the total buffer or
>> just the data related to the relevant skb?
>>
>> 2) If truesize is for the skb it is contained within should it be
>> updated during the call to skb_trim?
>>
>> 3) Why does the truesize default to 32960?
> 
> skb->truesize represents the total amount of memory that the SKB
> is holding up.
> 
> This is the size of the sk_buff metadata plus all of the buffer
> memory.
> 
> For example, if the buffer is using 16K buffer yet only 1500 of it is
> actually used for the packet, 16K is what should be plugged into the
> equation to compute truesize.
> 
> Otherwise it would be easy to compromise a system by sending lots of 1
> byte packets to a socket, even though the actual memory consumed by
> that "1 byte" packet is significantly larger.
> 

OK, so it is the value of the memory that has been allocated for the SKB.
If there are multiple clones for an skb all pointing at the same data,
will that distort the memory used when they all have the same truesize?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
       [not found]     ` <537E5C63.7080607-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org>
@ 2014-05-22 20:30       ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-22 20:30 UTC (permalink / raw)
  To: Jim Baxter
  Cc: David Miller, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, kamal-Z7WLFzj8eWMS+FvcfC7Uqw,
	bhutchings-s/n/eUQHGBpZroRs9YW3xA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, mszeredi-AlSwsSmVLrQ,
	fw-HFFVJYpyMKqzQB+pC5nmwQ

On Thu, 2014-05-22 at 21:21 +0100, Jim Baxter wrote:


> OK, so it is the value of the memory that has been allocated for the SKB.
> If there are multiple clones for an skb all pointing at the same data,
> will that distort the memory used when they all have the same truesize?

Its always better to over estimate memory uses, than under estimating.

Its all about guarding against OOM.

Keep in mind some parts of networking stack do not like clones.

TCP coalescing works beautifully with non cloned skbs.

Instead of cloning skb or doing copies, you could play with having frame
on a page fragment. Only headers might need to be pulled from the frame
to linear part of skb.





--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 19:07 skbuff truesize incorrect Jim Baxter
  2014-05-22 19:21 ` David Miller
  2014-05-22 19:25 ` Vlad Yasevich
@ 2014-05-22 20:58 ` Eric Dumazet
  2014-05-22 21:03   ` Eric Dumazet
  2014-05-23  8:52   ` David Laight
  2 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-22 20:58 UTC (permalink / raw)
  To: Jim Baxter
  Cc: linux-usb, linux-kernel, netdev@vger.kernel.org, David S. Miller,
	kamal, Ben Hutchings, edumazet, mszeredi, fw

On Thu, 2014-05-22 at 20:07 +0100, Jim Baxter wrote:

> 
> I have been investigating a network issue with bursts of network traffic
> over USB CDC-NCM, the issue is that the kernel is dropping packets
> because sk_rcvqueues_full() returns true due to skb2->truesize is always
> 32960 instead of SKB_TRUESIZE(skb2->len) which is about 1800.
> 
> The code I am trying to fix is this code below, it is splitting a set of
> multiple network packets compressed into a single 16k packet into
> individual skb's and sending them up the network stack.

if skb are allocated with 16k = 16384 bytes, adding the struct
skb_shared_info overhead and rounding up to power of two gives 32768
bytes.

Chances to be able to allocate 32KB contiguous memory are slim after a
while.

I would set rx_max (rx_urb_size) to SKB_MAX_HEAD(0) so that you do not
use high order allocations.

What is the exact 'overhead' using ~4K instead of 16K exactly on this
hardware ?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 20:21   ` Jim Baxter
       [not found]     ` <537E5C63.7080607-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org>
@ 2014-05-22 20:58     ` David Miller
  2014-05-23  9:21       ` Jim Baxter
  1 sibling, 1 reply; 32+ messages in thread
From: David Miller @ 2014-05-22 20:58 UTC (permalink / raw)
  To: jim_baxter
  Cc: linux-usb, linux-kernel, netdev, kamal, bhutchings, edumazet,
	mszeredi, fw

From: Jim Baxter <jim_baxter@mentor.com>
Date: Thu, 22 May 2014 21:21:55 +0100

> If there are multiple clones for an skb all pointing at the same data,
> will that distort the memory used when they all have the same truesize?

What is distorted about it?

Each clone references exactly that much backing memory.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 20:58 ` Eric Dumazet
@ 2014-05-22 21:03   ` Eric Dumazet
  2014-05-22 21:10     ` David Miller
  2014-05-23  8:52   ` David Laight
  1 sibling, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-05-22 21:03 UTC (permalink / raw)
  To: Jim Baxter
  Cc: linux-usb, linux-kernel, netdev@vger.kernel.org, David S. Miller,
	kamal, Ben Hutchings, edumazet, mszeredi, fw

On Thu, 2014-05-22 at 13:58 -0700, Eric Dumazet wrote:

> I would set rx_max (rx_urb_size) to SKB_MAX_HEAD(0) so that you do not
> use high order allocations.

Correction, that would need SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN),
because drivers/net/usb/usbnet.c calls __netdev_alloc_skb_ip_align().

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 21:03   ` Eric Dumazet
@ 2014-05-22 21:10     ` David Miller
  2014-05-23  7:07       ` Bjørn Mork
  0 siblings, 1 reply; 32+ messages in thread
From: David Miller @ 2014-05-22 21:10 UTC (permalink / raw)
  To: eric.dumazet
  Cc: jim_baxter, linux-usb, linux-kernel, netdev, kamal, bhutchings,
	edumazet, mszeredi, fw, bjorn

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 22 May 2014 14:03:21 -0700

> On Thu, 2014-05-22 at 13:58 -0700, Eric Dumazet wrote:
> 
>> I would set rx_max (rx_urb_size) to SKB_MAX_HEAD(0) so that you do not
>> use high order allocations.
> 
> Correction, that would need SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN),
> because drivers/net/usb/usbnet.c calls __netdev_alloc_skb_ip_align().

I seem to recall that this driver has a lot of strange buffering
restrictions and that Bjorn Mork was talking a lot about this
recently.

Bjorn people are getting really burnt because the driver ends up
having a skb->truesize of 32K for the buffers it allocates on receive
and this chokes up TCP and SCTP because the socket memory limits
are hitting earlier than they should.

We've just in the past few postings been discussing whether the just
copy every packet into a more appropriately sized buffer, and it isn't
clear if that's a good idea of the data rates handled here.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 21:10     ` David Miller
@ 2014-05-23  7:07       ` Bjørn Mork
  2014-05-23  8:58         ` Jim Baxter
  0 siblings, 1 reply; 32+ messages in thread
From: Bjørn Mork @ 2014-05-23  7:07 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, jim_baxter, linux-usb, linux-kernel, netdev, kamal,
	bhutchings, edumazet, mszeredi, fw

David Miller <davem@davemloft.net> writes:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 22 May 2014 14:03:21 -0700
>
>> On Thu, 2014-05-22 at 13:58 -0700, Eric Dumazet wrote:
>> 
>>> I would set rx_max (rx_urb_size) to SKB_MAX_HEAD(0) so that you do not
>>> use high order allocations.
>> 
>> Correction, that would need SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN),
>> because drivers/net/usb/usbnet.c calls __netdev_alloc_skb_ip_align().
>
> I seem to recall that this driver has a lot of strange buffering
> restrictions and that Bjorn Mork was talking a lot about this
> recently.

Yes, I have been following this discussion with great interest, learning
a lot.

But although the problem is the same, I believe the driver in question
isn't the one I have been looking at recently.  The posted code snippet
was from the NCM gadget driver (drivers/usb/gadget/f_ncm.c), isn't that
right Jim?

That driver implements the same protocol with the same inherent issues,
but for the device side.  The implementations are independent for
historical reasons.  This happened before I started looking at these
things so I don't know exactly why.  In an ideal world they should have
shared pretty much everything except setup.

> Bjorn people are getting really burnt because the driver ends up
> having a skb->truesize of 32K for the buffers it allocates on receive
> and this chokes up TCP and SCTP because the socket memory limits
> are hitting earlier than they should.
>
> We've just in the past few postings been discussing whether the just
> copy every packet into a more appropriately sized buffer, and it isn't
> clear if that's a good idea of the data rates handled here.

Yes, judging by this discussion I guess we should unconditionally copy
instead of cloning in these drivers.  We'll always have bad
payload/truesize ratio for cloned skbs, often less than 1/10 even for
max size payload.

Actually, I thought we already did copy in the host cdc_ncm driver.  But
I was wrong.  I was thinking of the cdc_mbim driver (which is different
enough to have its own implementation of this part of the rx code). The
cdc_ncm driver is cloning and the cdc_mbim driver is copying.  So we're
not even consistent...

I'll create and test a patch for the cdc_ncm host driver unless someone
else wants to do that. I haven't really played with the gadget driver
before, so I'd prefer if someone knowing it (Jim maybe?) could take care
of it.  If not, then I can always make an attempt using dummy_hcd to
test it.

BTW, wrt the data rates: These drivers are USB class drivers and we
should really think of *all* possible rates, even future ones. This is
not limited to 480 Mbps USB2. AFAICS, there isn't anything preventing
the gadget driver from being used with e.g. a USB3380 controller to
create a 5 Gbps NCM device.  I'm sure the future will bring us even
faster USB devices.  The drivers will be the same.  Which is sort of
beautiful and scaring at the same time :-)

But I assume the bad payload/truesize ratio is the most important factor
here, so we should still copy?


Bjørn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-22 20:58 ` Eric Dumazet
  2014-05-22 21:03   ` Eric Dumazet
@ 2014-05-23  8:52   ` David Laight
  2014-05-23  9:48     ` Bjørn Mork
  2014-05-23 20:18     ` David Miller
  1 sibling, 2 replies; 32+ messages in thread
From: David Laight @ 2014-05-23  8:52 UTC (permalink / raw)
  To: 'Eric Dumazet', Jim Baxter
  Cc: linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, David S. Miller, kamal@canonical.com,
	Ben Hutchings, edumazet@google.com, mszeredi@suse.cz,
	fw@strlen.de

From: Eric Dumazet
> On Thu, 2014-05-22 at 20:07 +0100, Jim Baxter wrote:
> 
> >
> > I have been investigating a network issue with bursts of network traffic
> > over USB CDC-NCM, the issue is that the kernel is dropping packets
> > because sk_rcvqueues_full() returns true due to skb2->truesize is always
> > 32960 instead of SKB_TRUESIZE(skb2->len) which is about 1800.
> >
> > The code I am trying to fix is this code below, it is splitting a set of
> > multiple network packets compressed into a single 16k packet into
> > individual skb's and sending them up the network stack.
> 
> if skb are allocated with 16k = 16384 bytes, adding the struct
> skb_shared_info overhead and rounding up to power of two gives 32768
> bytes.
> 
> Chances to be able to allocate 32KB contiguous memory are slim after a
> while.
> 
> I would set rx_max (rx_urb_size) to SKB_MAX_HEAD(0) so that you do not
> use high order allocations.
> 
> What is the exact 'overhead' using ~4K instead of 16K exactly on this
> hardware ?

Many of the usb ethernet drivers have the same issue.

The hardware will put multiple ethernet frames into a single USB bulk data
message. To handle this the driver generates a URB that is (hopefully) long
enough for the longest USB message (typically 32k is assumed to be enough).
The URB that usb_net generated have the data in a linear skb - which then
has a large 'truesize'.

Since USB bulk data are terminated by a short fragment there is actually
no need for the URB be long enough for the full message. Provided the
URB are multiples of the USB message size (1k for USB 3) the message
can be received into multiple URB - the driver just has to be willing
to merge URB buffers (as well as split them) when generating the ethernet
frames.

What the driver needs to do us allocate URB with 2k (or 4k) buffers and
only allocate the skb when processing the receive data.
Unfortunately this is a major rework of usb_net.c

Note that some of the usb ethernet drivers allocate large skb then
lie about the truesize.

	David


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  7:07       ` Bjørn Mork
@ 2014-05-23  8:58         ` Jim Baxter
  2014-05-23  9:33           ` Bjørn Mork
  0 siblings, 1 reply; 32+ messages in thread
From: Jim Baxter @ 2014-05-23  8:58 UTC (permalink / raw)
  To: Bjørn Mork, David Miller
  Cc: eric.dumazet, linux-usb, linux-kernel, netdev, kamal, bhutchings,
	edumazet, mszeredi, fw

> But although the problem is the same, I believe the driver in question
> isn't the one I have been looking at recently.  The posted code snippet
> was from the NCM gadget driver (drivers/usb/gadget/f_ncm.c), isn't that
> right Jim?

Yes this is the NCM Gadget driver.
> 
> Yes, judging by this discussion I guess we should unconditionally copy
> instead of cloning in these drivers.  We'll always have bad
> payload/truesize ratio for cloned skbs, often less than 1/10 even for
> max size payload.
> 
> Actually, I thought we already did copy in the host cdc_ncm driver.  But
> I was wrong.  I was thinking of the cdc_mbim driver (which is different
> enough to have its own implementation of this part of the rx code). The
> cdc_ncm driver is cloning and the cdc_mbim driver is copying.  So we're
> not even consistent...
> 
> I'll create and test a patch for the cdc_ncm host driver unless someone
> else wants to do that. I haven't really played with the gadget driver
> before, so I'd prefer if someone knowing it (Jim maybe?) could take care
> of it.  If not, then I can always make an attempt using dummy_hcd to
> test it.
I can create a patch for the host driver, I will issue the gadget patch
first to resolve any issues, the fix would be similar.

> 
> BTW, wrt the data rates: These drivers are USB class drivers and we
> should really think of *all* possible rates, even future ones. This is
> not limited to 480 Mbps USB2. AFAICS, there isn't anything preventing
> the gadget driver from being used with e.g. a USB3380 controller to
> create a 5 Gbps NCM device.  I'm sure the future will bring us even
> faster USB devices.  The drivers will be the same.  Which is sort of
> beautiful and scaring at the same time :-)
> 
> But I assume the bad payload/truesize ratio is the most important factor
> here, so we should still copy?
I will test the copy implementation for any performance impact.


Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-22 20:58     ` David Miller
@ 2014-05-23  9:21       ` Jim Baxter
  2014-05-23  9:27         ` David Laight
  2014-05-23 16:46         ` David Miller
  0 siblings, 2 replies; 32+ messages in thread
From: Jim Baxter @ 2014-05-23  9:21 UTC (permalink / raw)
  To: David Miller
  Cc: linux-usb, linux-kernel, netdev, kamal, bhutchings, edumazet,
	mszeredi, fw

On 22/05/14 21:58, David Miller wrote:
> From: Jim Baxter <jim_baxter@mentor.com>
> Date: Thu, 22 May 2014 21:21:55 +0100
> 
>> If there are multiple clones for an skb all pointing at the same data,
>> will that distort the memory used when they all have the same truesize?
> 
> What is distorted about it?
> 
> Each clone references exactly that much backing memory.
>

Thank you for you help, I think distorted was the wrong word.


I think I am confused what truesize is used for it looks like it is used
to deallocate memory, is this correct?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-23  9:21       ` Jim Baxter
@ 2014-05-23  9:27         ` David Laight
  2014-05-23 16:46         ` David Miller
  1 sibling, 0 replies; 32+ messages in thread
From: David Laight @ 2014-05-23  9:27 UTC (permalink / raw)
  To: 'Jim Baxter', David Miller
  Cc: linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, kamal@canonical.com,
	bhutchings@solarflare.com, edumazet@google.com, mszeredi@suse.cz,
	fw@strlen.de

From: Jim Baxter
> On 22/05/14 21:58, David Miller wrote:
> > From: Jim Baxter <jim_baxter@mentor.com>
> > Date: Thu, 22 May 2014 21:21:55 +0100
> >
> >> If there are multiple clones for an skb all pointing at the same data,
> >> will that distort the memory used when they all have the same truesize?
> >
> > What is distorted about it?
> >
> > Each clone references exactly that much backing memory.
> >
> 
> Thank you for you help, I think distorted was the wrong word.
> 
> 
> I think I am confused what truesize is used for it looks like it is used
> to deallocate memory, is this correct?

No, it is used for receive flow control.
You need to count allocated memory (not data bytes) in order to
actually stop the system running out of memory.

	David

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  8:58         ` Jim Baxter
@ 2014-05-23  9:33           ` Bjørn Mork
  2014-05-23 14:00             ` Eric Dumazet
  2014-05-23 15:44             ` Rick Jones
  0 siblings, 2 replies; 32+ messages in thread
From: Bjørn Mork @ 2014-05-23  9:33 UTC (permalink / raw)
  To: Jim Baxter
  Cc: David Miller, eric.dumazet, linux-usb, linux-kernel, netdev,
	kamal, edumazet, mszeredi, fw

[-- Attachment #1: Type: text/plain, Size: 740 bytes --]

Jim Baxter <jim_baxter@mentor.com> writes:

>> I'll create and test a patch for the cdc_ncm host driver unless someone
>> else wants to do that. I haven't really played with the gadget driver
>> before, so I'd prefer if someone knowing it (Jim maybe?) could take care
>> of it.  If not, then I can always make an attempt using dummy_hcd to
>> test it.
> I can create a patch for the host driver, I will issue the gadget patch
> first to resolve any issues, the fix would be similar.

Well, I couldn't help myself.  I just had to test it.  The attached
patch works for me, briefly tested with an Ericsson H5321gw NCM device.
I have no ideas about the performance impact as that modem is limited to
21 Mbps HSDPA.


Bjørn


[-- Attachment #2: 0001-net-cdc_ncm-reduce-skb-truesize-in-rx-path.patch --]
[-- Type: text/x-diff, Size: 1443 bytes --]

>From 4c7431cd046a6972e153e23249ad490a3692f9a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= <bjorn@mork.no>
Date: Fri, 23 May 2014 09:40:51 +0200
Subject: [RFC] net: cdc_ncm: reduce skb truesize in rx path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cloning the big skbs we use for USB buffering chokes up TCP and
SCTP because the socket memory limits are hitting earlier than
they should. It is better to unconditionally copy the unwrapped
packets to freshly allocated skbs.

Reported-by: Jim Baxter <jim_baxter@mentor.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/net/usb/cdc_ncm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 93c9ca9924eb..2bbbd65591c7 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1289,12 +1289,11 @@ next_ndp:
 			break;
 
 		} else {
-			skb = skb_clone(skb_in, GFP_ATOMIC);
+			/* create a fresh copy to reduce truesize */
+			skb = netdev_alloc_skb_ip_align(dev->net,  len);
 			if (!skb)
 				goto error;
-			skb->len = len;
-			skb->data = ((u8 *)skb_in->data) + offset;
-			skb_set_tail_pointer(skb, len);
+			memcpy(skb_put(skb, len), skb_in->data + offset, len);
 			usbnet_skb_return(dev, skb);
 			payload += len;	/* count payload bytes in this NTB */
 		}
-- 
2.0.0.rc4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  8:52   ` David Laight
@ 2014-05-23  9:48     ` Bjørn Mork
  2014-05-23 10:45       ` David Laight
  2014-05-23 20:18     ` David Miller
  1 sibling, 1 reply; 32+ messages in thread
From: Bjørn Mork @ 2014-05-23  9:48 UTC (permalink / raw)
  To: David Laight
  Cc: 'Eric Dumazet', Jim Baxter, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, kamal@canonical.com, edumazet@google.com,
	mszeredi@suse.cz, fw@strlen.de

David Laight <David.Laight@ACULAB.COM> writes:

> Since USB bulk data are terminated by a short fragment there is actually
> no need for the URB be long enough for the full message. Provided the
> URB are multiples of the USB message size (1k for USB 3) the message
> can be received into multiple URB - the driver just has to be willing
> to merge URB buffers (as well as split them) when generating the ethernet
> frames.
>
> What the driver needs to do us allocate URB with 2k (or 4k) buffers and
> only allocate the skb when processing the receive data.
> Unfortunately this is a major rework of usb_net.c

Yes, I believe this is the way to go as well.

I have been thinking about it in the context of the cdc_ncm driver.  One
problem with the NCM protocol is that it will have to merge all the
buffers of a "NCM Transfer Block" (NTB) before it can process any part
of it.  The protocol puts no restrictions on the internal pointers in a
NTB, so there is no guarantee that we can start at the beginning and
work our way towards the end of it. The contained packets can be
interleaved with index(es) or the index can be at the end etc.

But this should still be possible to parse using a list of shorter
buffers to hold each NTB.

> Note that some of the usb ethernet drivers allocate large skb then
> lie about the truesize.

Hmm, which drivers are these?


Bjørn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-23  9:48     ` Bjørn Mork
@ 2014-05-23 10:45       ` David Laight
  2014-05-23 11:13         ` Jim Baxter
  0 siblings, 1 reply; 32+ messages in thread
From: David Laight @ 2014-05-23 10:45 UTC (permalink / raw)
  To: 'Bjørn Mork'
  Cc: 'Eric Dumazet', Jim Baxter, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, kamal@canonical.com, edumazet@google.com,
	mszeredi@suse.cz, fw@strlen.de

From: Bjørn Mork [mailto:bjorn@mork.no]
> David Laight <David.Laight@ACULAB.COM> writes:
...
> > Note that some of the usb ethernet drivers allocate large skb then
> > lie about the truesize.
> 
> Hmm, which drivers are these?

$ grep truesize linux/drivers/net/usb/*

asix_88179_178a, smsc95xx, sr9700.

	David


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23 10:45       ` David Laight
@ 2014-05-23 11:13         ` Jim Baxter
  2014-05-23 13:47           ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Jim Baxter @ 2014-05-23 11:13 UTC (permalink / raw)
  To: David Laight, 'Bjørn Mork'
  Cc: 'Eric Dumazet', linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, kamal@canonical.com, edumazet@google.com,
	mszeredi@suse.cz, fw@strlen.de

On 23/05/14 11:45, David Laight wrote:
> From: Bjørn Mork [mailto:bjorn@mork.no]
>> David Laight <David.Laight@ACULAB.COM> writes:
> ...
>>> Note that some of the usb ethernet drivers allocate large skb then
>>> lie about the truesize.
>>
>> Hmm, which drivers are these?
> 
> $ grep truesize linux/drivers/net/usb/*
> 
> asix_88179_178a, smsc95xx, sr9700.
> 
> 	David
> 


What are the side effects of changing the truesize, if the original
uncloned skb has the full truesize then isn't the potential memory usage
still counted for the avoidance of OOM?

I suppose if the uncloned skb is deleted you would then have a problem
so a chain of URB's would be the safest solution.

Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23 11:13         ` Jim Baxter
@ 2014-05-23 13:47           ` Eric Dumazet
  2014-05-23 15:00             ` Jim Baxter
  2014-05-23 15:30             ` David Laight
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-23 13:47 UTC (permalink / raw)
  To: Jim Baxter
  Cc: David Laight, 'Bjørn Mork',
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, David S. Miller, kamal@canonical.com,
	edumazet@google.com, mszeredi@suse.cz, fw@strlen.de

On Fri, 2014-05-23 at 12:13 +0100, Jim Baxter wrote:

> What are the side effects of changing the truesize, if the original
> uncloned skb has the full truesize then isn't the potential memory usage
> still counted for the avoidance of OOM?

Nope. This can be disastrous.

A malicious remote peer can crash your host by sending specially cooked
TCP messages.

Send messages with one byte of payload, and out of order so that they
cant be consumed by receiver, and cant be coalesced/collapsed.

If you claim the true size is sizeof(sk_buff) + 512, TCP stack will
accumulate these messages in out of order queue, and will not bother
with them, unless you hit sk_rcvbuf limit.

But in reality these messages uses sizeof(sk_buff) + 32768 bytes.

Divide your physical memory by 32768 : How many such messages will fit
in memory before the host crashes ?

I've seen that kind of attacks in real cases.

Even the fast clones sk_buff mismatch can be noticed. Luckily a 10%
error has no severe impact.

TCP stack uses fast clones, and current stack gives them a truesize of
2048 + sizeof(sk_buff), while it really should be 2048 +
2*sizeof(sk_buff)

Luckily, GSO/TSO tends to reduce the error, as skbs overhead is lower.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  9:33           ` Bjørn Mork
@ 2014-05-23 14:00             ` Eric Dumazet
  2014-05-23 15:44             ` Rick Jones
  1 sibling, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-23 14:00 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Jim Baxter, David Miller, linux-usb, linux-kernel, netdev, kamal,
	edumazet, mszeredi, fw

On Fri, 2014-05-23 at 11:33 +0200, Bjørn Mork wrote:
> Jim Baxter <jim_baxter@mentor.com> writes:
> 
> >> I'll create and test a patch for the cdc_ncm host driver unless someone
> >> else wants to do that. I haven't really played with the gadget driver
> >> before, so I'd prefer if someone knowing it (Jim maybe?) could take care
> >> of it.  If not, then I can always make an attempt using dummy_hcd to
> >> test it.
> > I can create a patch for the host driver, I will issue the gadget patch
> > first to resolve any issues, the fix would be similar.
> 
> Well, I couldn't help myself.  I just had to test it.  The attached
> patch works for me, briefly tested with an Ericsson H5321gw NCM device.
> I have no ideas about the performance impact as that modem is limited to
> 21 Mbps HSDPA.

Ideally, the skb_in should not be freed/reallocated, but recycled,
especially if using 32KB.

But thats a minor detail, this patch is already a huge win for a 21Mbps
device.

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23 13:47           ` Eric Dumazet
@ 2014-05-23 15:00             ` Jim Baxter
  2014-05-23 15:30             ` David Laight
  1 sibling, 0 replies; 32+ messages in thread
From: Jim Baxter @ 2014-05-23 15:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Laight, 'Bjørn Mork',
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, David S. Miller, kamal@canonical.com,
	edumazet@google.com, mszeredi@suse.cz, fw@strlen.de

On 23/05/14 14:47, Eric Dumazet wrote:
> On Fri, 2014-05-23 at 12:13 +0100, Jim Baxter wrote:
> 
>> What are the side effects of changing the truesize, if the original
>> uncloned skb has the full truesize then isn't the potential memory usage
>> still counted for the avoidance of OOM?
> 
> Nope. This can be disastrous.
> 
> A malicious remote peer can crash your host by sending specially cooked
> TCP messages.
> 
> Send messages with one byte of payload, and out of order so that they
> cant be consumed by receiver, and cant be coalesced/collapsed.
> 
> If you claim the true size is sizeof(sk_buff) + 512, TCP stack will
> accumulate these messages in out of order queue, and will not bother
> with them, unless you hit sk_rcvbuf limit.
> 
> But in reality these messages uses sizeof(sk_buff) + 32768 bytes.
> 
> Divide your physical memory by 32768 : How many such messages will fit
> in memory before the host crashes ?
> 
> I've seen that kind of attacks in real cases.
> 
> Even the fast clones sk_buff mismatch can be noticed. Luckily a 10%
> error has no severe impact.
> 
> TCP stack uses fast clones, and current stack gives them a truesize of
> 2048 + sizeof(sk_buff), while it really should be 2048 +
> 2*sizeof(sk_buff)
> 
> Luckily, GSO/TSO tends to reduce the error, as skbs overhead is lower.
> 
> 

Thank you for clarifying, that is useful to know.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-23 13:47           ` Eric Dumazet
  2014-05-23 15:00             ` Jim Baxter
@ 2014-05-23 15:30             ` David Laight
  2014-05-23 15:41               ` Eric Dumazet
  1 sibling, 1 reply; 32+ messages in thread
From: David Laight @ 2014-05-23 15:30 UTC (permalink / raw)
  To: 'Eric Dumazet', Jim Baxter
  Cc: 'Bjørn Mork', linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, kamal@canonical.com, edumazet@google.com,
	mszeredi@suse.cz, fw@strlen.de

From: Eric Dumazet
...
> TCP stack uses fast clones, and current stack gives them a truesize of
> 2048 + sizeof(sk_buff), while it really should be 2048 +
> 2*sizeof(sk_buff)
> 
> Luckily, GSO/TSO tends to reduce the error, as skbs overhead is lower.

Doesn't that affect the tx side - where the truesize doesn't matter as much?

	David



^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-23 15:30             ` David Laight
@ 2014-05-23 15:41               ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-23 15:41 UTC (permalink / raw)
  To: David Laight
  Cc: Jim Baxter, 'Bjørn Mork', linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, kamal@canonical.com, edumazet@google.com,
	mszeredi@suse.cz, fw@strlen.de

On Fri, 2014-05-23 at 15:30 +0000, David Laight wrote:
> From: Eric Dumazet
> ...
> > TCP stack uses fast clones, and current stack gives them a truesize of
> > 2048 + sizeof(sk_buff), while it really should be 2048 +
> > 2*sizeof(sk_buff)
> > 
> > Luckily, GSO/TSO tends to reduce the error, as skbs overhead is lower.
> 
> Doesn't that affect the tx side - where the truesize doesn't matter as much?

Its not a matter of tx or rx, but percentage of error.

If truesize accounting is wrong by 10%, its not a big deal, because we
normally limit tcp_mem[] to about 16% of available physical memory.

Using 16% of physical memory instead of 16% should not really matter.

Now, if the truesize is wrong by 1600%, then its pretty clear we can
consume all the meory.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  9:33           ` Bjørn Mork
  2014-05-23 14:00             ` Eric Dumazet
@ 2014-05-23 15:44             ` Rick Jones
  2014-05-23 16:00               ` Eric Dumazet
  1 sibling, 1 reply; 32+ messages in thread
From: Rick Jones @ 2014-05-23 15:44 UTC (permalink / raw)
  To: Bjørn Mork, Jim Baxter
  Cc: David Miller, eric.dumazet, linux-usb, linux-kernel, netdev,
	kamal, edumazet, mszeredi, fw

On 05/23/2014 02:33 AM, Bjørn Mork wrote:
> Jim Baxter <jim_baxter@mentor.com> writes:
>
>>> I'll create and test a patch for the cdc_ncm host driver unless someone
>>> else wants to do that. I haven't really played with the gadget driver
>>> before, so I'd prefer if someone knowing it (Jim maybe?) could take care
>>> of it.  If not, then I can always make an attempt using dummy_hcd to
>>> test it.
>> I can create a patch for the host driver, I will issue the gadget patch
>> first to resolve any issues, the fix would be similar.
>
> Well, I couldn't help myself.  I just had to test it.  The attached
> patch works for me, briefly tested with an Ericsson H5321gw NCM device.
> I have no ideas about the performance impact as that modem is limited to
> 21 Mbps HSDPA.

If you are measuring performance with the likes of netperf, you should 
be able to get an idea of the performance effect from the change in 
service demand (CPU consumed per unit of work) even if the maximum 
throughput remains capped.

You can run a netperf TCP_STREAM test along the lines of:

netperf -H <otherguy> -c -C -t TCP_STREAM

and also

netperf -H <otherguy> -c -C -t TCP_RR

For extra added credit you can consider either multiple runs and 
post-processing, or adding a -i 30,3 to the command line to tell netperf 
to run at least three iterations, no more than thirty and it will try to 
achieve a 99% confidence that the reported means for throughput, local 
and remote CPU utilization are within +/- 2.5% of the actual mean.  You 
can narrow or widen that with a -I 99,<width>.  A width of 5% is what 
gives the +/- 2.5% (and/or demonstrates my lack of accurate statistics 
knowledge :) )

happy benchmarking,

rick jones

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23 15:44             ` Rick Jones
@ 2014-05-23 16:00               ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-05-23 16:00 UTC (permalink / raw)
  To: Rick Jones
  Cc: Bjørn Mork, Jim Baxter, David Miller, linux-usb,
	linux-kernel, netdev, kamal, edumazet, mszeredi, fw

On Fri, 2014-05-23 at 08:44 -0700, Rick Jones wrote:

> If you are measuring performance with the likes of netperf, you should 
> be able to get an idea of the performance effect from the change in 
> service demand (CPU consumed per unit of work) even if the maximum 
> throughput remains capped.

This wont be enough to truly get an idea of the gains this patch brings.

Add some random drops in the equation, and instead of dropping packets,
we gently add them in the out of order queue, gently coalesce them,
gently allow SACK enabled flows to recover thanks to fast retransmits,
with no latency effect (No expensive collapse/prunes)

# nstat -az|egrep "TCPRcvCoalesce|TCPOFOQueue|TCPRcvCollapsed|Prune"
TcpExtPruneCalled               118                0.0
TcpExtRcvPruned                 0                  0.0
TcpExtOfoPruned                 0                  0.0
TcpExtTCPRcvCollapsed           1009               0.0
TcpExtTCPRcvCoalesce            857806             0.0
TcpExtTCPOFOQueue               201246             0.0

Yep, we _might_ consume some cpu cycles in the perfect case where no
packet was dropped, but this is kind of an utopia.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  9:21       ` Jim Baxter
  2014-05-23  9:27         ` David Laight
@ 2014-05-23 16:46         ` David Miller
  1 sibling, 0 replies; 32+ messages in thread
From: David Miller @ 2014-05-23 16:46 UTC (permalink / raw)
  To: jim_baxter
  Cc: linux-usb, linux-kernel, netdev, kamal, bhutchings, edumazet,
	mszeredi, fw

From: Jim Baxter <jim_baxter@mentor.com>
Date: Fri, 23 May 2014 10:21:19 +0100

> On 22/05/14 21:58, David Miller wrote:
>> From: Jim Baxter <jim_baxter@mentor.com>
>> Date: Thu, 22 May 2014 21:21:55 +0100
>> 
>>> If there are multiple clones for an skb all pointing at the same data,
>>> will that distort the memory used when they all have the same truesize?
>> 
>> What is distorted about it?
>> 
>> Each clone references exactly that much backing memory.
>>
> 
> Thank you for you help, I think distorted was the wrong word.
> 
> 
> I think I am confused what truesize is used for it looks like it is used
> to deallocate memory, is this correct?

It's used to determine how much to charge sockets against their read
and write memory limits.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
  2014-05-23  8:52   ` David Laight
  2014-05-23  9:48     ` Bjørn Mork
@ 2014-05-23 20:18     ` David Miller
  2014-05-27 15:23       ` David Laight
  1 sibling, 1 reply; 32+ messages in thread
From: David Miller @ 2014-05-23 20:18 UTC (permalink / raw)
  To: David.Laight
  Cc: eric.dumazet, jim_baxter, linux-usb, linux-kernel, netdev, kamal,
	bhutchings, edumazet, mszeredi, fw

From: David Laight <David.Laight@ACULAB.COM>
Date: Fri, 23 May 2014 08:52:13 +0000

> The hardware will put multiple ethernet frames into a single USB bulk data
> message. To handle this the driver generates a URB that is (hopefully) long
> enough for the longest USB message (typically 32k is assumed to be enough).
> The URB that usb_net generated have the data in a linear skb - which then
> has a large 'truesize'.
> 
> Since USB bulk data are terminated by a short fragment there is actually
> no need for the URB be long enough for the full message. Provided the
> URB are multiples of the USB message size (1k for USB 3) the message
> can be received into multiple URB - the driver just has to be willing
> to merge URB buffers (as well as split them) when generating the ethernet
> frames.

I think we could take a less invasive approach.

Use whatever order page is needed for that 32K chunk for the URB,
but split up the compound page so that the individual pages can be
accounted for separately.

Hook up the pages into the SKB frag array, and only account PAGE_SIZE
into the SKB truesize for the individual pages actually used from
the superpage.

This will largely retain the URB allocation and processing scheme,
yet at the same time dramatically decrease the truesize overrage
factor.

> What the driver needs to do us allocate URB with 2k (or 4k) buffers and
> only allocate the skb when processing the receive data.
> Unfortunately this is a major rework of usb_net.c

> Note that some of the usb ethernet drivers allocate large skb then
> lie about the truesize.

Which has to be fixed, as Eric explained.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: skbuff truesize incorrect.
  2014-05-23 20:18     ` David Miller
@ 2014-05-27 15:23       ` David Laight
       [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D1724E565-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 32+ messages in thread
From: David Laight @ 2014-05-27 15:23 UTC (permalink / raw)
  To: 'David Miller'
  Cc: eric.dumazet@gmail.com, jim_baxter@mentor.com,
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, kamal@canonical.com,
	bhutchings@solarflare.com, edumazet@google.com, mszeredi@suse.cz,
	fw@strlen.de

From: David Miller
> From: David Laight <David.Laight@ACULAB.COM>
> Date: Fri, 23 May 2014 08:52:13 +0000
> 
> > The hardware will put multiple ethernet frames into a single USB bulk data
> > message. To handle this the driver generates a URB that is (hopefully) long
> > enough for the longest USB message (typically 32k is assumed to be enough).
> > The URB that usb_net generated have the data in a linear skb - which then
> > has a large 'truesize'.
> >
> > Since USB bulk data are terminated by a short fragment there is actually
> > no need for the URB be long enough for the full message. Provided the
> > URB are multiples of the USB message size (1k for USB 3) the message
> > can be received into multiple URB - the driver just has to be willing
> > to merge URB buffers (as well as split them) when generating the ethernet
> > frames.
> 
> I think we could take a less invasive approach.
> 
> Use whatever order page is needed for that 32K chunk for the URB,
> but split up the compound page so that the individual pages can be
> accounted for separately.

Using 4k 'pages' would be fine (or even 1k).
All the USB interfaces support SG provided the fragments are all
aligned.
Recycling the unused 4k pages would probably be better than using
separate URB for each fragment.

Note that the xhci driver has to split buffers that cross 64k
boundaries into separate ring entries - and these must not cross
the end of the ring (thank you hardware engineers...).
So not receiving into the 'linear' part of an skb helps.

Although I suspect that 32k is excessive - provided the driver can
merge URB (as well as pages) to generate ethernet frames.

> Hook up the pages into the SKB frag array, and only account PAGE_SIZE
> into the SKB truesize for the individual pages actually used from
> the superpage.
> 
> This will largely retain the URB allocation and processing scheme,
> yet at the same time dramatically decrease the truesize overrage
> factor.

True, you still need to sort out how to handle multiple ethernet
frames in the same 4k page, and arbitrary page boundaries within a frame.

I suspect that using 1k or 2k pages and using 'copybreak' to never pass
up shared pages would give the best overall performance.

	David

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: skbuff truesize incorrect.
       [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D1724E565-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
@ 2014-05-27 15:52           ` David Miller
  0 siblings, 0 replies; 32+ messages in thread
From: David Miller @ 2014-05-27 15:52 UTC (permalink / raw)
  To: David.Laight-ZS65k/vG3HxXrIkS9f7CXA
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	jim_baxter-nmGgyN9QBj3QT0dZR+AlfA,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, kamal-Z7WLFzj8eWMS+FvcfC7Uqw,
	bhutchings-s/n/eUQHGBpZroRs9YW3xA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, mszeredi-AlSwsSmVLrQ,
	fw-HFFVJYpyMKqzQB+pC5nmwQ

From: David Laight <David.Laight-ZS65k/vG3HxXrIkS9f7CXA@public.gmane.org>
Date: Tue, 27 May 2014 15:23:14 +0000

> True, you still need to sort out how to handle multiple ethernet
> frames in the same 4k page, and arbitrary page boundaries within a
> frame.

Multiple ethernet frames in the same 4K page is used by several ethernet
drivers already.  NIU for one.

Several chips have various pools of pages, chopped up into different
power-of-2 chunks.

As the packet arrives, they build up the packet using buffers from the
different sized pools.

Anyways, you just have to charge the page to all packets contained
within.  This doesn't seem to be much of an issue to address.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2014-05-27 15:52 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-22 19:07 skbuff truesize incorrect Jim Baxter
2014-05-22 19:21 ` David Miller
2014-05-22 20:21   ` Jim Baxter
     [not found]     ` <537E5C63.7080607-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org>
2014-05-22 20:30       ` Eric Dumazet
2014-05-22 20:58     ` David Miller
2014-05-23  9:21       ` Jim Baxter
2014-05-23  9:27         ` David Laight
2014-05-23 16:46         ` David Miller
2014-05-22 19:25 ` Vlad Yasevich
2014-05-22 19:39   ` Jim Baxter
2014-05-22 19:59     ` Eric Dumazet
2014-05-22 20:21       ` Jim Baxter
2014-05-22 20:58 ` Eric Dumazet
2014-05-22 21:03   ` Eric Dumazet
2014-05-22 21:10     ` David Miller
2014-05-23  7:07       ` Bjørn Mork
2014-05-23  8:58         ` Jim Baxter
2014-05-23  9:33           ` Bjørn Mork
2014-05-23 14:00             ` Eric Dumazet
2014-05-23 15:44             ` Rick Jones
2014-05-23 16:00               ` Eric Dumazet
2014-05-23  8:52   ` David Laight
2014-05-23  9:48     ` Bjørn Mork
2014-05-23 10:45       ` David Laight
2014-05-23 11:13         ` Jim Baxter
2014-05-23 13:47           ` Eric Dumazet
2014-05-23 15:00             ` Jim Baxter
2014-05-23 15:30             ` David Laight
2014-05-23 15:41               ` Eric Dumazet
2014-05-23 20:18     ` David Miller
2014-05-27 15:23       ` David Laight
     [not found]         ` <063D6719AE5E284EB5DD2968C1650D6D1724E565-VkEWCZq2GCInGFn1LkZF6NBPR1lH4CV8@public.gmane.org>
2014-05-27 15:52           ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).