From: Tony Battersby <tonyb@cybernetics.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: iommu@lists.linux-foundation.org, kernel-team@fb.com,
Matthew Wilcox <willy@infradead.org>,
Keith Busch <kbusch@kernel.org>,
Andy Shevchenko <andy.shevchenko@gmail.com>,
Robin Murphy <robin.murphy@arm.com>,
Tony Lindgren <tony@atomide.com>
Subject: [PATCH 00/10] mpt3sas and dmapool scalability
Date: Tue, 31 May 2022 14:11:16 -0400
Message-ID: <9b08ab7c-b80b-527d-9adf-7716b0868fbc@cybernetics.com>
This patch series improves dmapool scalability by replacing linear scans
with red-black trees.
History:
In 2018 this patch series made it through 4 versions. v1 used red-black
trees; v2 - v4 put the dma pool info directly into struct page and used
virt_to_page() to get at it. v4 made a brief appearance in linux-next,
but it caused problems on non-x86 archs where virt_to_page() doesn't
work with dma_alloc_coherent, so it was reverted. I was too busy at the
time to repost the red-black tree version, and I forgot about it until
now. This version is based on the red-black trees of v1, but it
addresses all the review comments I got at the time and adds cleanup
patches.
Note that Keith Busch is also working on improving dmapool scalability,
so for now I would recommend not merging my scalability patches until
Keith's approach can be evaluated. In the meantime, my patches can
serve as a benchmark comparison. I also have a number of cleanup
patches in my series that could be useful on their own.
References:
v1
    https://lore.kernel.org/linux-mm/73ec1f52-d758-05df-fb6a-41d269e910d0@cybernetics.com/
v2
    https://lore.kernel.org/linux-mm/ec701153-fdc9-37f3-c267-f056159b4606@cybernetics.com/
v3
    https://lore.kernel.org/linux-mm/d48854ff-995d-228e-8356-54c141c32117@cybernetics.com/
v4
    https://lore.kernel.org/linux-mm/88395080-efc1-4e7b-f813-bb90c86d0745@cybernetics.com/
problem caused by virt_to_page()
    https://lore.kernel.org/linux-kernel/20181206013054.GI6707@atomide.com/
Keith Busch's dmapool performance enhancements
    https://lore.kernel.org/linux-mm/20220428202714.17630-1-kbusch@kernel.org/
Below is my original description of the motivation for these patches.
drivers/scsi/mpt3sas is running into a scalability problem with the
kernel's DMA pool implementation. With an LSI/Broadcom SAS 9300-8i
12Gb/s HBA and max_sgl_entries=256, during modprobe, mpt3sas does the
equivalent of:
    chain_dma_pool = dma_pool_create(size = 128);
    for (i = 0; i < 373959; i++)
    {
        dma_addr[i] = dma_pool_alloc(chain_dma_pool);
    }
And at rmmod, system shutdown, or system reboot, mpt3sas does the
equivalent of:
    for (i = 0; i < 373959; i++)
    {
        dma_pool_free(chain_dma_pool, dma_addr[i]);
    }
    dma_pool_destroy(chain_dma_pool);
With this usage, both dma_pool_alloc() and dma_pool_free() exhibit
O(n^2) complexity, although dma_pool_free() is much worse due to
implementation details. On my system, the dma_pool_free() loop above
takes about 9 seconds to run. Note that the problem was even worse
before commit 74522a92bbf0 ("scsi: mpt3sas: Optimize I/O memory
consumption in driver."), where the dma_pool_free() loop could take ~30
seconds.
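
To illustrate why a per-call linear scan makes the free loop quadratic, here is a minimal userspace sketch. It is not the kernel's dmapool code: struct demo_page and both lookup functions are hypothetical names invented for this example, and a sorted array with binary search stands in for the red-black tree, since both give O(log n) lookups by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pool page: a contiguous DMA range. */
struct demo_page {
	uint64_t dma;	/* start address of the range */
	uint64_t size;	/* length of the range in bytes */
};

/*
 * Linear scan: examine pages in list order until one contains addr.
 * Each lookup is O(n), so n frees cost O(n^2) in total.
 * Returns the page index or -1; *steps counts the pages examined.
 */
long find_page_linear(const struct demo_page *pages, size_t n,
		      uint64_t addr, size_t *steps)
{
	assert(pages != NULL || n == 0);
	*steps = 0;
	for (size_t i = 0; i < n; i++) {
		(*steps)++;
		if (addr >= pages[i].dma && addr - pages[i].dma < pages[i].size)
			return (long)i;
	}
	return -1;
}

/*
 * Ordered lookup: pages must be sorted by start address and must not
 * overlap. Binary search halves the candidate range on every step,
 * so each lookup is O(log n), the same bound as a balanced tree.
 */
long find_page_sorted(const struct demo_page *pages, size_t n,
		      uint64_t addr, size_t *steps)
{
	size_t lo = 0, hi = n;

	assert(pages != NULL || n == 0);
	*steps = 0;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		(*steps)++;
		if (addr < pages[mid].dma)
			hi = mid;		/* addr lies before this page */
		else if (addr - pages[mid].dma >= pages[mid].size)
			lo = mid + 1;		/* addr lies after this page */
		else
			return (long)mid;	/* addr falls inside this page */
	}
	return -1;
}
```

With 1024 pages of 4096 bytes each, looking up an address in the last page examines all 1024 pages in the linear scan, but at most 11 in the ordered lookup.
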
mpt3sas also has some other DMA pools, but chain_dma_pool is the only
one with so many allocations:
cat /sys/devices/pci0000:80/0000:80:07.0/0000:85:00.0/pools
(manually cleaned up column alignment)
poolinfo - 0.1
reply_post_free_array pool       1      21     192      1
reply_free pool                  1       1   41728      1
reply pool                       1       1 1335296      1
sense pool                       1       1  970272      1
chain pool                  373959  386048     128  12064
reply_post_free pool            12      12  166528     12
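
The chain pool row can be cross-checked with a little arithmetic. The sketch below assumes the columns are pool name, blocks in use, total blocks, block size in bytes, and pages, and that each page is one 4096-byte allocation. Neither assumption is stated in the text above, and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Whole blocks that fit in one page; any remainder bytes are unused. */
size_t blocks_per_page(size_t page_bytes, size_t block_size)
{
	assert(block_size > 0);
	return page_bytes / block_size;
}

/* Total blocks provided by a pool that has allocated the given pages. */
size_t total_blocks(size_t pages, size_t page_bytes, size_t block_size)
{
	return pages * blocks_per_page(page_bytes, block_size);
}
```

Under those assumptions, blocks_per_page(4096, 128) is 32 and total_blocks(12064, 4096, 128) is 386048, which matches the chain pool row. The same calculation for the 192-byte pool gives 21 blocks in one page, which matches the first row.
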
The patches in this series improve the scalability of the DMA pool
implementation, which significantly reduces the running time of the
DMA alloc/free loops. With the patches applied, "modprobe mpt3sas",
"rmmod mpt3sas", and system shutdown/reboot with mpt3sas loaded are
significantly faster. Here are some benchmarks (of DMA alloc/free
only, not the entire modprobe/rmmod):
dma_pool_create() + dma_pool_alloc() loop, size = 128, count = 373959
    original:          350 ms (  1x)
    dmapool patches:    18 ms ( 19x)
dma_pool_free() loop + dma_pool_destroy(), size = 128, count = 373959
    original:         8901 ms (  1x)
    dmapool patches:    19 ms (477x)