From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	brouer@redhat.com
Subject: Re: [PATCH 3/4] mm, page_allocator: Only use per-cpu allocator for irq-safe requests
Date: Wed, 4 Jan 2017 15:20:24 +0100
Message-ID: <20170104152024.7e050b84@redhat.com>
In-Reply-To: <20170104111049.15501-4-mgorman@techsingularity.net>

On Wed,  4 Jan 2017 11:10:48 +0000 Mel Gorman <mgorman@techsingularity.net> wrote:

> Many workloads that allocate pages are not handling an interrupt at a
> time. As allocation requests may be from IRQ context, it's necessary to
> disable/enable IRQs for every page allocation. This cost is the bulk
> of the free path but also a significant percentage of the allocation
> path.
> 
> This patch alters the locking and checks such that only irq-safe allocation
> requests use the per-cpu allocator. All others acquire the irq-safe
> zone->lock and allocate from the buddy allocator. It relies on disabling
> preemption to safely access the per-cpu structures. 

I love this idea and patch :-)

> It could be slightly
> modified to avoid soft IRQs using it but it's not clear it's worthwhile.

NICs usually refill their RX-ring from SoftIRQ context (NAPI).
Thus, we do want this optimization to work in softirq.
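
To make the two variants concrete, here is a small userspace model of the dispatch decision. It is only a sketch: the enum and function names are hypothetical and are not taken from the patch, and it deliberately does not model how the per-cpu lists would have to be protected against softirq re-entry if softirq is allowed to use them.

```c
#include <stdbool.h>

/* Hypothetical execution contexts: a userspace stand-in for what the
 * kernel would derive from its own context tracking. */
enum exec_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ };

/* Which path serves the request in this model. */
enum alloc_path { PATH_PCP, PATH_BUDDY_LOCKED };

/*
 * Model of the dispatch described in the quoted text: order-0 requests
 * from a suitable context use the per-cpu lists, everything else goes
 * to the buddy allocator under the irq-safe zone->lock.
 *
 * softirq_uses_pcp selects the variant under discussion:
 *   true  - softirq (e.g. NAPI RX-ring refill) may use the per-cpu lists
 *   false - softirq is treated like hard IRQ context
 */
static enum alloc_path choose_path(enum exec_ctx ctx, unsigned int order,
                                   bool softirq_uses_pcp)
{
    if (order != 0)
        return PATH_BUDDY_LOCKED;   /* higher orders skip the per-cpu lists */
    if (ctx == CTX_HARDIRQ)
        return PATH_BUDDY_LOCKED;   /* hard IRQ always takes zone->lock */
    if (ctx == CTX_SOFTIRQ && !softirq_uses_pcp)
        return PATH_BUDDY_LOCKED;
    return PATH_PCP;
}
```

In this model, choose_path(CTX_SOFTIRQ, 0, true) selects the per-cpu path, which is the behaviour a NAPI-based driver would want, while choose_path(CTX_SOFTIRQ, 0, false) falls back to the locked buddy path.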

 
> This modification may slow allocations from IRQ context slightly but the main
> gain from the per-cpu allocator is that it scales better for allocations
> from multiple contexts. There is an implicit assumption that intensive
> allocations from IRQ contexts on multiple CPUs from a single NUMA node are
> rare and that the vast majority of scaling issues are encountered in !IRQ
> contexts such as page faulting. 

I agree with this implicit assumption.


> It's worth noting that this patch is not
> required for a bulk page allocator but it significantly reduces the overhead.
> 
> The following are results from a page allocator micro-benchmark. Only
> order-0 is interesting as higher orders do not use the per-cpu allocator

I'm seeing an approx 34% reduction in an order-0 micro-benchmark [1]! Amazing! :-)
[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/

>                                           4.10.0-rc2                 4.10.0-rc2
>                                              vanilla               irqsafe-v1r5
> Amean    alloc-odr0-1               287.15 (  0.00%)           219.00 ( 23.73%)
> Amean    alloc-odr0-2               221.23 (  0.00%)           183.23 ( 17.18%)
> Amean    alloc-odr0-4               187.00 (  0.00%)           151.38 ( 19.05%)
> Amean    alloc-odr0-8               167.54 (  0.00%)           132.77 ( 20.75%)
> Amean    alloc-odr0-16              156.00 (  0.00%)           123.00 ( 21.15%)
> Amean    alloc-odr0-32              149.00 (  0.00%)           118.31 ( 20.60%)
> Amean    alloc-odr0-64              138.77 (  0.00%)           116.00 ( 16.41%)
> Amean    alloc-odr0-128             145.00 (  0.00%)           118.00 ( 18.62%)
> Amean    alloc-odr0-256             136.15 (  0.00%)           125.00 (  8.19%)
> Amean    alloc-odr0-512             147.92 (  0.00%)           121.77 ( 17.68%)
> Amean    alloc-odr0-1024            147.23 (  0.00%)           126.15 ( 14.32%)
> Amean    alloc-odr0-2048            155.15 (  0.00%)           129.92 ( 16.26%)
> Amean    alloc-odr0-4096            164.00 (  0.00%)           136.77 ( 16.60%)
> Amean    alloc-odr0-8192            166.92 (  0.00%)           138.08 ( 17.28%)
> Amean    alloc-odr0-16384           159.00 (  0.00%)           138.00 ( 13.21%)
> Amean    free-odr0-1                165.00 (  0.00%)            89.00 ( 46.06%)
> Amean    free-odr0-2                113.00 (  0.00%)            63.00 ( 44.25%)
> Amean    free-odr0-4                 99.00 (  0.00%)            54.00 ( 45.45%)
> Amean    free-odr0-8                 88.00 (  0.00%)            47.38 ( 46.15%)
> Amean    free-odr0-16                83.00 (  0.00%)            46.00 ( 44.58%)
> Amean    free-odr0-32                80.00 (  0.00%)            44.38 ( 44.52%)
> Amean    free-odr0-64                72.62 (  0.00%)            43.00 ( 40.78%)
> Amean    free-odr0-128               78.00 (  0.00%)            42.00 ( 46.15%)
> Amean    free-odr0-256               80.46 (  0.00%)            57.00 ( 29.16%)
> Amean    free-odr0-512               96.38 (  0.00%)            64.69 ( 32.88%)
> Amean    free-odr0-1024             107.31 (  0.00%)            72.54 ( 32.40%)
> Amean    free-odr0-2048             108.92 (  0.00%)            78.08 ( 28.32%)
> Amean    free-odr0-4096             113.38 (  0.00%)            82.23 ( 27.48%)
> Amean    free-odr0-8192             112.08 (  0.00%)            82.85 ( 26.08%)
> Amean    free-odr0-16384            110.38 (  0.00%)            81.92 ( 25.78%)
> Amean    total-odr0-1               452.15 (  0.00%)           308.00 ( 31.88%)
> Amean    total-odr0-2               334.23 (  0.00%)           246.23 ( 26.33%)
> Amean    total-odr0-4               286.00 (  0.00%)           205.38 ( 28.19%)
> Amean    total-odr0-8               255.54 (  0.00%)           180.15 ( 29.50%)
> Amean    total-odr0-16              239.00 (  0.00%)           169.00 ( 29.29%)
> Amean    total-odr0-32              229.00 (  0.00%)           162.69 ( 28.96%)
> Amean    total-odr0-64              211.38 (  0.00%)           159.00 ( 24.78%)
> Amean    total-odr0-128             223.00 (  0.00%)           160.00 ( 28.25%)
> Amean    total-odr0-256             216.62 (  0.00%)           182.00 ( 15.98%)
> Amean    total-odr0-512             244.31 (  0.00%)           186.46 ( 23.68%)
> Amean    total-odr0-1024            254.54 (  0.00%)           198.69 ( 21.94%)
> Amean    total-odr0-2048            264.08 (  0.00%)           208.00 ( 21.24%)
> Amean    total-odr0-4096            277.38 (  0.00%)           219.00 ( 21.05%)
> Amean    total-odr0-8192            279.00 (  0.00%)           220.92 ( 20.82%)
> Amean    total-odr0-16384           269.38 (  0.00%)           219.92 ( 18.36%)
> 
> This is the alloc, free and total overhead of allocating order-0 pages in
> batches of 1 page up to 16384 pages. Avoiding disabling/enabling overhead
> massively reduces overhead. Alloc overhead is roughly reduced by 14-20% in
> most cases. The free path is reduced by 26-46% and the total reduction
> is significant.
> 
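
For readers checking the table: the percentage in parentheses appears to be the relative reduction against the vanilla column, i.e. (vanilla - patched) / vanilla. A minimal sketch of that arithmetic follows; the function name is hypothetical and the formula is inferred from the table values, not from the benchmark source.

```c
/*
 * Relative reduction of `patched` versus `baseline`, in percent.
 * Assumption: this is how the bracketed column in the quoted table
 * is derived; it reproduces the quoted rows to two decimals.
 */
static double reduction_pct(double baseline, double patched)
{
    return (baseline - patched) / baseline * 100.0;
}
```

For example, the total-odr0-1 row gives (452.15 - 308.00) / 452.15, which is about 31.88%, and the free-odr0-1 row gives (165.00 - 89.00) / 165.00, which is about 46.06%.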
[...]
> 
> Similarly, little benefit was seen on networking benchmarks both localhost
> and between physical server/clients where other costs dominate. It's
> possible that this will only be noticeable on very high speed networks.

The networking results depend heavily on the NIC driver.  As you mention
in the cover letter, (1) some drivers (e.g. mlx4) allocate high-order
pages to work around order-0 page allocation and DMA mapping being too
slow (for their HW use case), and (2) drivers that do use order-0 pages
have driver-specific page-recycling tricks (e.g. mlx5 and ixgbe).  The
page_pool work targets a more generic recycle mechanism for drivers to use.

I'm very excited to see improvements in this area! :-)))
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
