* truesize for pages shared between SKBs
@ 2014-09-02 12:20 Johannes Berg
2014-09-02 15:51 ` Eric Dumazet
From: Johannes Berg @ 2014-09-02 12:20 UTC
To: netdev, linux-wireless; +Cc: Ido Yariv, Emmanuel Grumbach
Hi,
In our driver, we have 4k receive buffers, but usually ~1500 byte
packets.
How do other drivers handle this? We currently set up the truesize of
each SKB to be its size plus the 4k page size, but we see performance
improvements when we lie and pretend the truesize is just 4k/(# of
packets in the page), which is correct as long as the packets are all
pending in the stack since they share the page.
How do other drivers handle this? Should the truesize maybe be aware of
this kind of sharing? Should we just lie about it and risk that the
truesize is accounted erroneously if some but not all of the packets are
freed?
Thanks,
johannes
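The two accounting schemes described in this message can be compared with a small model. This is only a sketch: `SKB_OVERHEAD` is an assumed placeholder for the SKB's "size" (the real value depends on the kernel build), and the function names are made up for illustration rather than taken from any driver.

```python
PAGE_SIZE = 4096     # 4k receive buffer, as in the message above
SKB_OVERHEAD = 256   # assumed placeholder for the SKB's own size; not a real kernel constant


def truesize_full_page(skb_overhead: int = SKB_OVERHEAD) -> int:
    """Current scheme: each SKB is charged its own size plus the whole page."""
    return skb_overhead + PAGE_SIZE


def truesize_shared(packets_in_page: int) -> int:
    """The 'lie': each SKB is charged only its share of the page."""
    if packets_in_page < 1:
        raise ValueError("a page holds at least one packet")
    return PAGE_SIZE // packets_in_page
```

Under these assumptions, with two ~1500 byte packets in a page, `truesize_shared(2)` charges 2048 bytes per SKB where `truesize_full_page()` charges 4352.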
* Re: truesize for pages shared between SKBs
From: Eric Dumazet @ 2014-09-02 15:51 UTC
To: Johannes Berg; +Cc: netdev, linux-wireless, Ido Yariv, Emmanuel Grumbach
On Tue, 2014-09-02 at 14:20 +0200, Johannes Berg wrote:
> Hi,
>
> In our driver, we have 4k receive buffers, but usually ~1500 byte
> packets.
Which driver exactly is that ?
>
> How do other drivers handle this? We currently set up the truesize of
> each SKB to be its size plus the 4k page size, but we see performance
> improvements when we lie and pretend the truesize is just 4k/(# of
> packets in the page), which is correct as long as the packets are all
> pending in the stack since they share the page.
Can you elaborate on 'they share the page' ?
If a 4K page is really split into 2 2KB subpages, then yes, truesize can
be 2KB + skb->head + sizeof(struct sk_buff)
Some drivers do that (Intel IGBVF for example)
If a single 4KB page can be used by a single 1500-byte frame, then it's
not shared ;)
>
> How do other drivers handle this? Should the truesize maybe be aware of
> this kind of sharing? Should we just lie about it and risk that the
> truesize is accounted erroneously if some but not all of the packets are
> freed?
Lies are not worth crashing hosts under memory pressure.
skb->truesize is really how many bytes are consumed by an skb. This
serves in TCP stack to trigger collapses when a socket reaches its
limits.
Your performance is better when you lie because, for the same
sk->sk_rcvbuf value (typically tcp_rmem[2]), the TCP window can be
bigger, which allows the TCP sender to send more packets (bigger cwnd).
Workaround : make tcp_rmem[2] larger, so that we still have an
appropriate memory limit per socket, acting for OOM prevention, and
allowing better performance for large BDP flows.
Current value is 6MB, which is already quite big IMO for well behaving
drivers.
Real fix would be to make your skb as slim as possible of course.
It helps even if GRO or TCP coalescing can reduce the memory
requirements for bulk flows.
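The effect of truesize on the receive buffer can be illustrated with a crude model. It assumes that a socket simply stops queueing once the summed truesize reaches sk_rcvbuf; the real stack derives the advertised window through more elaborate heuristics, and `queued_payload` is a hypothetical helper, not a kernel function.

```python
TCP_RMEM_MAX = 6 * 1024 * 1024  # the 6MB tcp_rmem[2] value mentioned above


def queued_payload(rcvbuf: int, truesize: int, payload: int = 1500) -> int:
    """Payload bytes that fit before the summed truesize reaches rcvbuf."""
    if truesize <= 0:
        raise ValueError("truesize must be positive")
    return (rcvbuf // truesize) * payload


honest = queued_payload(TCP_RMEM_MAX, 4096)    # whole page charged per packet
understated = queued_payload(TCP_RMEM_MAX, 2048)  # half a page charged per packet
```

In this model, halving the reported truesize doubles the payload the same buffer limit admits, which matches the explanation that a larger window becomes possible without sk_rcvbuf changing.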
* Re: truesize for pages shared between SKBs
From: Johannes Berg @ 2014-09-02 16:00 UTC
To: Eric Dumazet; +Cc: netdev, linux-wireless, Ido Yariv, Emmanuel Grumbach
On Tue, 2014-09-02 at 08:51 -0700, Eric Dumazet wrote:
> > In our driver, we have 4k receive buffers, but usually ~1500 byte
> > packets.
>
> Which driver exactly is that ?
iwlwifi/iwlmvm, of course :)
> Can you elaborate on 'they share the page' ?
>
> If a 4K page is really split into 2 2KB subpages, then yes, truesize can
> be 2KB + skb->head + sizeof(struct sk_buff)
What do you mean by "split"?
There could be any number of packets (though usually two in the
interesting case) in the page, and we call skb_add_rx_frag() for both
packets, pointing to the same page, with different offsets.
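The sharing described here can be pictured with a toy reference-count model. It is not kernel code: `SharedPage`, `attach` and `release` are invented names, `attach` only stands in for the driver adding a fragment with skb_add_rx_frag() while a page reference is held for it, and the offsets are made up.

```python
class SharedPage:
    """Toy stand-in for a receive page referenced by several SKB fragments."""

    def __init__(self, size: int = 4096):
        self.size = size
        self.refs = 0
        self.freed = False

    def attach(self, offset: int, length: int) -> tuple:
        """Record a fragment at (offset, length), taking one page reference."""
        if offset < 0 or length <= 0 or offset + length > self.size:
            raise ValueError("fragment does not fit in the page")
        self.refs += 1
        return (self, offset, length)

    def release(self) -> None:
        """Drop one reference; the page is freed only with the last one."""
        if self.refs == 0:
            raise RuntimeError("release without matching attach")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


page = SharedPage()
first = page.attach(0, 1500)      # first packet
second = page.attach(2048, 1500)  # second packet: same page, different offset
page.release()                    # the stack frees the first packet
still_pinned = not page.freed     # the whole page stays held by the second
```

The point of the model is that freeing one of the two packets releases no memory at all; the page goes away only when the last fragment referencing it is released.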
> Some drivers do that (Intel IGBVF for example)
It seems to split into two unconditionally, which is interesting.
> If a single 4KB page can be used by a single 1500-byte frame, then it's
> not shared ;)
Right, obviously :)
> > How do other drivers handle this? Should the truesize maybe be aware of
> > this kind of sharing? Should we just lie about it and risk that the
> > truesize is accounted erroneously if some but not all of the packets are
> > freed?
>
> Lies are not worth crashing hosts under memory pressure.
>
> skb->truesize is really how many bytes are consumed by an skb. This
> serves in TCP stack to trigger collapses when a socket reaches its
> limits.
Right - the question is more what "consumed" means in this case. Should
it be correct at the time of SKB creation (in which case we should split
by the number of packets created from the page) or should it be correct
for the worst case (all but one packet are freed quickly, one remains
stuck on some socket with the full 4k page allocated to it) ...
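The two readings of "consumed" can be put side by side. The sketch below is a simplification that ignores the SKB's own overhead and assumes every packet from a page is charged the same amount; `accounted_vs_pinned` and its dictionary keys are hypothetical names.

```python
PAGE_SIZE = 4096  # 4k receive buffer


def accounted_vs_pinned(packets_in_page: int, packets_alive: int) -> dict:
    """Compare charged bytes with the bytes actually held for one page."""
    if packets_in_page < 1 or not 0 <= packets_alive <= packets_in_page:
        raise ValueError("need 0 <= packets_alive <= packets_in_page, page non-empty")
    return {
        # correct at SKB creation: the page is split by the number of packets
        "per_share": packets_alive * (PAGE_SIZE // packets_in_page),
        # correct for the worst case: every packet is charged the full page
        "worst_case": packets_alive * PAGE_SIZE,
        # what is really held: the page stays allocated while any packet lives
        "pinned": PAGE_SIZE if packets_alive > 0 else 0,
    }
```

With two packets in the page and both alive, the per-share charge equals the pinned 4096 bytes while the worst-case charge counts the page twice. Once one packet is freed, the worst-case charge is exact and the per-share charge reports only half of what is still pinned.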
Or maybe we should make it even more complex and check the page sharing
in conjunction with GRO...
> Your performance is better when you lie because, for the same
> sk->sk_rcvbuf value (typically tcp_rmem[2]), the TCP window can be
> bigger, which allows the TCP sender to send more packets (bigger cwnd).
>
> Workaround : make tcp_rmem[2] larger, so that we still have an
> appropriate memory limit per socket, acting for OOM prevention, and
> allowing better performance for large BDP flows.
>
> Current value is 6MB, which is already quite big IMO for well behaving
> drivers.
>
> Real fix would be to make your skb as slim as possible of course.
> It helps even if GRO or TCP coalescing can reduce the memory
> requirements for bulk flows.
Sure, it's always a trade-off though. If we want to *actually* make it
smaller we'd have to copy the data, which doesn't buy us much either.
The hardware is limited in the RX buffer handling, and there's actually
a chance we might receive close to 4k in a single frame with A-MSDU.
If you wanted to use the medium more efficiently, you'd raise the MTU to
a little under 2000 (a perfectly valid value for 802.11), which would
also give better page utilisation. Nobody really does that though :-)
johannes
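The page-utilisation remark can be checked with a few lines of arithmetic. This sketch treats the MTU as the number of bytes a frame occupies in the page, which ignores 802.11 headers and any metadata the hardware places in the buffer; `page_utilisation` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # 4k receive buffer


def page_utilisation(frame_size: int) -> tuple:
    """Return (frames per page, fraction of the page carrying frame data)."""
    if not 0 < frame_size <= PAGE_SIZE:
        raise ValueError("frame must be non-empty and fit in one page")
    frames = PAGE_SIZE // frame_size
    return frames, frames * frame_size / PAGE_SIZE


frames_1500, used_1500 = page_utilisation(1500)
frames_2000, used_2000 = page_utilisation(2000)
```

Under that simplification, two frames fit in a page at either size, but at 1500 bytes they fill about 73% of it, while at 2000 bytes they fill about 98%.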