* FlameGraph of mlx4 early drop with order-0 pages
From: Jesper Dangaard Brouer @ 2016-04-15 19:40 UTC (permalink / raw)
To: Mel Gorman, linux-mm, netdev@vger.kernel.org, Brenden Blanco
Cc: brouer, tom, alexei.starovoitov, ogerlitz, daniel
Hi Mel,
I did an experiment that you might find interesting, using Brenden's
early drop with eBPF in the mlx4 driver. I changed the mlx4 driver to
use order-0 pages. It usually uses order-3 pages to amortize the cost
of calling the page allocator (which is problematic for other reasons,
like memory pin-down, latency spikes and multi-CPU scalability).
With this change I could do around 12Mpps (million packets per sec) of drops;
it usually does 14.5Mpps (limited by a HW setup/limit, with idle cycles).
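As a rough aid for reading these numbers, the two rates can be turned into a per-packet time budget. This is only a back-of-the-envelope sketch: the 4 KiB base page size and the helper names are assumptions for illustration, not something stated in the mail. Because the 14.5Mpps figure is hardware-limited with idle cycles, the difference computed here is probably a lower bound on the extra CPU time spent per packet.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumption: 4 KiB base pages (typical on x86-64); not stated in the mail.
PAGE_SIZE = 4096


def pages_in_order(order: int) -> int:
    """An order-n allocation covers 2**n contiguous base pages."""
    return 1 << order


def ns_per_packet(mpps: float) -> float:
    """Average time available per packet, in nanoseconds, at a given rate."""
    return 1e9 / (mpps * 1e6)


order3_bytes = pages_in_order(3) * PAGE_SIZE   # size of one order-3 allocation
budget_order3 = ns_per_packet(14.5)            # usual setup (HW-limited)
budget_order0 = ns_per_packet(12.0)            # order-0 experiment
extra_ns = budget_order0 - budget_order3       # added time per packet

print(f"order-3 allocation: {pages_in_order(3)} pages, {order3_bytes} bytes")
print(f"14.5 Mpps -> {budget_order3:.2f} ns/packet")
print(f"12.0 Mpps -> {budget_order0:.2f} ns/packet")
print(f"difference: {extra_ns:.2f} ns/packet")
```

At these rates the whole per-packet budget is well under 100 ns, so an extra cost of roughly 14 ns per packet is a large share of it.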
Looking at the perf report as a FlameGraph, the page allocator clearly
shows up as the bottleneck:
http://people.netfilter.org/hawk/FlameGraph/flamegraph-mlx4-order0-pages-eBPF-XDP-drop.svg
Signing off, heading for the plane soon... see you at MM-summit!
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: FlameGraph of mlx4 early drop with order-0 pages
From: Mel Gorman @ 2016-04-17 13:23 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: linux-mm, netdev@vger.kernel.org, Brenden Blanco, tom,
alexei.starovoitov, ogerlitz, daniel, eric.dumazet, ecree,
john.fastabend, tgraf, johannes, eranlinuxmellanox
On Fri, Apr 15, 2016 at 09:40:34PM +0200, Jesper Dangaard Brouer wrote:
> Hi Mel,
>
> I did an experiment that you might find interesting, using Brenden's
> early drop with eBPF in the mlx4 driver. I changed the mlx4 driver to
> use order-0 pages. It usually uses order-3 pages to amortize the cost
> of calling the page allocator (which is problematic for other reasons,
> like memory pin-down, latency spikes and multi-CPU scalability).
>
> With this change I could do around 12Mpps (million packets per sec) of drops;
> it usually does 14.5Mpps (limited by a HW setup/limit, with idle cycles).
>
> Looking at the perf report as a FlameGraph, the page allocator clearly
> shows up as the bottleneck:
>
Yeah, it's very obvious there. You didn't say if this had the optimisations
included or not but it doesn't really matter. Even halving the cost would
still be a lot.
FWIW, the latest series included an optimisation around the debugging
check. I also have an extreme patch that creates a special fast path for
order-0 pages only when there is plenty of free memory. It halved the
cost of the allocation side even on top of the current optimisations. I'm
not super-happy with it though as it duplicates some code and it requires
node-lru to be merged. Right now, node-lru is colliding very badly with
what's in mmotm so there is legwork required.
I also prototyped something that caches high-order pages on the per-cpu
lists on the flight over. It is at the "it builds so it must be ok"
stage. It's a horrible hack and the accounting is questionable but
something like it may be justified for SLUB even if network drivers move
away from high-order pages.
> Signing off, heading for the plane soon... see you at MM-summit!
Indeed and we'll slap some sort of plan together. If there is a slot free,
we might spend 15-30 minutes on it. Failing that, we'll grab a table
somewhere. We'll see how far we can get before considering a page-recycle
layer that preserves cache coherent state.
--
Mel Gorman
SUSE Labs
* Re: FlameGraph of mlx4 early drop with order-0 pages
From: Jesper Dangaard Brouer @ 2016-04-17 17:24 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-mm, netdev@vger.kernel.org, Brenden Blanco, tom,
alexei.starovoitov, ogerlitz, daniel, eric.dumazet, ecree,
john.fastabend, tgraf, johannes, brouer
On Sun, 17 Apr 2016 14:23:57 +0100
Mel Gorman <mgorman@techsingularity.net> wrote:
> > Signing off, heading for the plane soon... see you at MM-summit!
>
> Indeed and we'll slap some sort of plan together. If there is a slot free,
> we might spend 15-30 minutes on it. Failing that, we'll grab a table
> somewhere. We'll see how far we can get before considering a page-recycle
> layer that preserves cache coherent state.
We have a plenum slot tomorrow between 16:00-16:30, called "Generic
Page Pool Facility".
I'm at the Marriott now. I'm wearing my Red Hat/fedora, so I should be
easy to spot... ;-)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
* Re: FlameGraph of mlx4 early drop with order-0 pages
From: Mel Gorman @ 2016-04-17 17:52 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: linux-mm, netdev@vger.kernel.org, Brenden Blanco, tom,
alexei.starovoitov, ogerlitz, daniel, eric.dumazet, ecree,
john.fastabend, tgraf, johannes
On Sun, Apr 17, 2016 at 07:24:32PM +0200, Jesper Dangaard Brouer wrote:
> On Sun, 17 Apr 2016 14:23:57 +0100
> Mel Gorman <mgorman@techsingularity.net> wrote:
>
> > > Signing off, heading for the plane soon... see you at MM-summit!
> >
> > Indeed and we'll slap some sort of plan together. If there is a slot free,
> > we might spend 15-30 minutes on it. Failing that, we'll grab a table
> > somewhere. We'll see how far we can get before considering a page-recycle
> > layer that preserves cache coherent state.
>
> We have a plenum slot tomorrow between 16:00-16:30, called "Generic
> Page Pool Facility".
>
Yeah. We can use part of that if you like to discuss page allocator
concerns. I didn't want to accidentally hijack a session if it was going
to focus on an API for storing cache coherent pages. My focus will still
be on improving the allocator itself and what would and would not be
acceptable there.
> I'm at the Marriott now. I'm wearing my Red Hat/fedora, so I should be
> easy to spot... ;-)
>
I'll keep an eye out!
--
Mel Gorman
SUSE Labs