From: Matthew Wilcox <willy@infradead.org>
To: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, hch@lst.de
Subject: Re: [PATCH v2 02/18] mm/filemap: Remove dynamically allocated array from filemap_read
Date: Thu, 5 Nov 2020 04:52:14 +0000
Message-ID: <20201105045214.GH17076@casper.infradead.org>
In-Reply-To: <20201104213005.GB3365678@moria.home.lan>
On Wed, Nov 04, 2020 at 04:30:05PM -0500, Kent Overstreet wrote:
> On Wed, Nov 04, 2020 at 08:42:03PM +0000, Matthew Wilcox (Oracle) wrote:
> > Increasing the batch size runs into diminishing returns. It's probably
> > better to make, eg, three calls to filemap_get_pages() than it is to
> > call into kmalloc().
>
> I have to disagree. Working with PAGEVEC_SIZE pages is eventually going to be
> like working with 4k pages today, and have you actually read the slub code for
> the kmalloc fast path? It's _really_ fast, there's no atomic operations and it
> doesn't even have to disable preemption - which is why you never see it showing
> up in profiles ever since we switched to slub.
I've been puzzling over this, and trying to run some benchmarks to figure
it out. My test VM is too noisy though; the error bars are too large to
get solid data.
There are three reasons why I think we hit diminishing returns:
1. Cost of going into the slab allocator (one alloc, one free).
Maybe that's not as high as I think it is.
2. Let's say the per-page overhead of walking i_pages is 10% of the
CPU time for a 128kB I/O with a batch size of 1. Increasing the batch
size to 15 means we walk the array 3 times instead of 32 times, or 0.9%
of the CPU time -- total reduction in CPU time of 9.1%. Increasing the
batch size to 32 means we only walk the array once, which cuts it down
from 10% to 0.3% -- reduction in CPU time of 9.7%.
If we are doing 2MB I/Os (and most applications I've looked at recently
only do 128kB), and the 10% remains constant, then the batch-size-15
case walks the tree 35 times instead of 512 times -- 0.7%, whereas the
batch-size-512 case walks the tree once -- 0.02%. But that only looks
like an overall savings of 9.98% versus 9.3%. And is an extra 0.7%
saving worth it?
3. By the time we're doing such large I/Os, we're surely dominated by
memcpy() and not walking the tree. Even if the file you're working on
is a terabyte in size, the radix tree is only 5 layers deep. So that's
five pointer dereferences to find the struct page, and they should stay
in cache (maybe they'd fall out to L2, but surely not as far as L3).
And generally radix tree cachelines stay clean so there shouldn't be any
contention on them from other CPUs unless they're dirtying the pages or
writing them back.
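
A rough way to check the arithmetic in points 2 and 3 is the sketch
below. It is not kernel code: the helper names (walks, walk_overhead,
tree_depth) are made up for illustration, and it assumes 4kB pages, a
walk cost that scales linearly with the number of walks, and 64 slots
per tree node (an XA_CHUNK_SHIFT of 6).

```python
PAGE_SIZE = 4096  # assumption: 4kB pages


def pages_in(nbytes):
    """Number of pages needed to cover nbytes (rounded up)."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE


def walks(io_bytes, batch):
    """Number of i_pages walks needed for one I/O at a given batch size."""
    return (pages_in(io_bytes) + batch - 1) // batch


def walk_overhead(io_bytes, batch, base=0.10):
    """Share of CPU time spent walking i_pages.

    base is the assumed share at batch size 1 (10% in point 2); the
    share is assumed to shrink in proportion to the number of walks.
    """
    return base * walks(io_bytes, batch) / pages_in(io_bytes)


def tree_depth(file_bytes, slots_shift=6):
    """Tree levels needed to index every page of a file.

    slots_shift=6 assumes 64 slots per node.
    """
    index_bits = max(1, (pages_in(file_bytes) - 1).bit_length())
    return (index_bits + slots_shift - 1) // slots_shift


if __name__ == "__main__":
    for io in (128 << 10, 2 << 20):
        for batch in (15, pages_in(io)):
            print(f"{io >> 10:5d} kB, batch {batch:3d}: "
                  f"{walks(io, batch):3d} walks, "
                  f"{walk_overhead(io, batch):.2%} of CPU time")
    print("1 TiB file:", tree_depth(1 << 40), "levels")
```

With these assumptions tree_depth(1 << 40) returns 5, which matches the
five layers mentioned in point 3.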