From: Christoph Hellwig <hch@lst.de>
To: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>,
Dave Chinner <david@fromorbit.com>,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH 1/2] iomap: Support large pages
Date: Fri, 2 Aug 2019 10:27:53 +0200
Message-ID: <20190802082753.GA10664@lst.de>
In-Reply-To: <20190801174500.GL4700@bombadil.infradead.org>

On Thu, Aug 01, 2019 at 10:45:00AM -0700, Matthew Wilcox wrote:
> On Thu, Aug 01, 2019 at 06:21:47PM +0200, Christoph Hellwig wrote:
> > On Wed, Jul 31, 2019 at 08:59:55PM -0700, Matthew Wilcox wrote:
> > > - nbits = BITS_TO_LONGS(page_size(page) / SECTOR_SIZE);
> > > - iop = kmalloc(struct_size(iop, uptodate, nbits),
> > > - GFP_NOFS | __GFP_NOFAIL);
> > > - atomic_set(&iop->read_count, 0);
> > > - atomic_set(&iop->write_count, 0);
> > > - bitmap_zero(iop->uptodate, nbits);
> > > + n = BITS_TO_LONGS(page_size(page) >> inode->i_blkbits);
> > > + iop = kmalloc(struct_size(iop, uptodate, n),
> > > + GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO);
> >
> > I am really worried about potential very large GFP_NOFS | __GFP_NOFAIL
> > allocations here.
>
> I don't think it gets _very_ large here. Assuming a 4kB block size
> filesystem, that's 512 bits (64 bytes, plus 16 bytes for the two counters)
> for a 2MB page. For machines with an 8MB PMD page, it's 272 bytes.
> Not a very nice fraction of a page size, so probably rounded up to a 512
> byte allocation, but well under the one page that the MM is supposed to
> guarantee being able to allocate.

And if we use GB pages?

Or 512-byte blocks or at least 1k blocks, which we need to handle even
if they are not preferred by any means. The real issue here is not just
the VM's capability to allocate these by some means, but that we do
__GFP_NOFAIL allocations in nofs context.

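To put rough numbers on the sizes being argued about, here is a minimal
sketch of the bitmap arithmetic. It is plain userspace C, not kernel code.
The helper name uptodate_bitmap_bytes() and the WORD_BITS macro are made up
for illustration, and the page and block sizes are just the ones mentioned
in this thread.

```c
#include <stddef.h>

/* Number of bits in an unsigned long on this machine. */
#define WORD_BITS (8 * sizeof(unsigned long))

/*
 * Hypothetical helper: bytes needed for a bitmap holding one uptodate bit
 * per filesystem block, rounded up to whole unsigned longs (the rounding
 * that BITS_TO_LONGS() performs in the quoted patch). The fixed counters
 * in the structure come on top of this.
 */
static size_t uptodate_bitmap_bytes(unsigned long long page_bytes,
				    unsigned int block_bytes)
{
	unsigned long long bits = page_bytes / block_bytes;
	unsigned long long longs = (bits + WORD_BITS - 1) / WORD_BITS;

	return (size_t)(longs * sizeof(unsigned long));
}
```

With these inputs, a 2MB page with 4kB blocks needs 64 bytes of bitmap and
an 8MB page needs 256 bytes, matching the quoted estimate. A 1GB page needs
32kB of bitmap with 4kB blocks and 256kB with 512-byte blocks, which is far
beyond a single page.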
> > And thinking about this a bit more while walking
> > at the beach I wonder if a better option is to just allocate one
> > iomap per tail page if needed rather than blowing the head page one
> > up. We'd still always use the read_count and write_count in the
> > head page, but the bitmaps in the tail pages, which should be pretty
> > easily doable.
>
> We wouldn't need to allocate an iomap per tail page, even. We could
> just use one bit of tail-page->private per block. That'd work except
> for 512-byte block size on machines with a 64kB page. I doubt many
> people expect that combination to work well.

We'd still need to deal with the T10 PI tuples for a case like that,
though.

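For reference, the "one bit of tail-page->private per block" idea from the
quoted text comes down to the arithmetic sketched below. This is a userspace
illustration under stated assumptions, not a proposed implementation: the
struct block_slot type and both helpers are hypothetical names, and the word
size is passed in explicitly so the 64-bit case can be checked anywhere.

```c
#include <stdbool.h>

/* Hypothetical: where the uptodate bit for one block would live. */
struct block_slot {
	unsigned long subpage;	/* index of the base page within the large page */
	unsigned int bit;	/* bit within that base page's private word */
};

/* True if a single word of word_bits bits can hold one bit per block. */
static bool blocks_fit_in_word(unsigned int base_page_bytes,
			       unsigned int block_bytes,
			       unsigned int word_bits)
{
	return base_page_bytes / block_bytes <= word_bits;
}

/* Map a block index within the large page to its base page and bit. */
static struct block_slot locate_block(unsigned long block,
				      unsigned int blocks_per_base_page)
{
	struct block_slot slot = {
		.subpage = block / blocks_per_base_page,
		.bit = (unsigned int)(block % blocks_per_base_page),
	};

	return slot;
}
```

With a 64-bit word, 4kB base pages with 512-byte blocks need 8 bits and
64kB base pages with 1k blocks need exactly 64 bits, so both fit. 64kB base
pages with 512-byte blocks need 128 bits and do not fit, which is the
exception named above. Note that subpage 0 is the head page; the quoted
proposal only talks about tail pages, so this sketch says nothing about how
the head page's own blocks would be tracked.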
>
> One of my longer-term ambitions is to do away with tail pages under
> certain situations; eg partition the memory between allocatable-as-4kB
> pages and allocatable-as-2MB pages. We'd need a different solution for
> that, but it's a bit of a pipe dream right now anyway.

Yes, let's focus on that. Maybe at some point we'll also get extent
based VM instead of pages ;-)