From: Ryan Roberts <ryan.roberts@arm.com>
To: Dave Chinner <david@fromorbit.com>, Matthew Wilcox <willy@infradead.org>
Cc: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
chandan.babu@oracle.com, djwong@kernel.org, brauner@kernel.org,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
yang@os.amperecomputing.com, linux-mm@kvack.org,
john.g.garry@oracle.com, linux-fsdevel@vger.kernel.org,
hare@suse.de, p.raghav@samsung.com, mcgrof@kernel.org,
gost.dev@samsung.com, cl@os.amperecomputing.com,
linux-xfs@vger.kernel.org, hch@lst.de, Zi Yan <zi.yan@sent.com>
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes
Date: Fri, 5 Jul 2024 10:03:31 +0100 [thread overview]
Message-ID: <03b9ea6c-a3c4-41e8-ad47-4e82344da419@arm.com> (raw)
In-Reply-To: <Zod3ZQizBL7MyWEA@dread.disaster.area>
On 05/07/2024 05:32, Dave Chinner wrote:
> On Fri, Jul 05, 2024 at 12:56:28AM +0100, Matthew Wilcox wrote:
>> On Fri, Jul 05, 2024 at 08:06:51AM +1000, Dave Chinner wrote:
>>>>> It seems strange to silently clamp these? Presumably for the bs>ps usecase,
>>>>> whatever values are passed in are a hard requirement? So wouldn't want them to
>>>>> be silently reduced. (Especially given the recent change to reduce the size of
>>>>> MAX_PAGECACHE_ORDER to less then PMD size in some cases).
>>>>
>>>> Hm, yes. We should probably make this return an errno. Including
>>>> returning an errno for !IS_ENABLED() and min > 0.
>>>
>>> What are callers supposed to do with an error? In the case of
>>> setting up a newly allocated inode in XFS, the error would be
>>> returned in the middle of a transaction and so this failure would
>>> result in a filesystem shutdown.
>>
>> I suggest you handle it better than this. If the device is asking for a
>> blocksize > PMD_SIZE, you should fail to mount it.
A detail, but MAX_PAGECACHE_ORDER may be smaller than PMD_SIZE even on systems
with CONFIG_TRANSPARENT_HUGEPAGE as of a fix that is currently in mm-unstable:
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define PREFERRED_MAX_PAGECACHE_ORDER HPAGE_PMD_ORDER
#else
#define PREFERRED_MAX_PAGECACHE_ORDER 8
#endif
/*
* xas_split_alloc() does not support arbitrary orders. This implies no
* 512MB THP on ARM64 with 64KB base page size.
*/
#define MAX_XAS_ORDER (XA_CHUNK_SHIFT * 2 - 1)
#define MAX_PAGECACHE_ORDER min(MAX_XAS_ORDER,
PREFERRED_MAX_PAGECACHE_ORDER)
But that also implies that the page cache can handle up to order-8 without
CONFIG_TRANSPARENT_HUGEPAGE so sounds like there isn't a dependcy on
CONFIG_TRANSPARENT_HUGEPAGE in this respect?
>
> That's my point: we already do that.
>
> The largest block size we support is 64kB and that's way smaller
> than PMD_SIZE on all platforms and we always check for bs > ps
> support at mount time when the filesystem bs > ps.
>
> Hence we're never going to set the min value to anything unsupported
> unless someone makes a massive programming mistake. At which point,
> we want a *hard, immediate fail* so the developer notices their
> mistake immediately. All filesystems and block devices need to
> behave this way so the limits should be encoded as asserts in the
> function to trigger such behaviour.
>
>> If the device is
>> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
>> not set, you should also decline to mount the filesystem.
>
> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
> being able to use large folios?
>
> If that's an actual dependency of using large folios, then we're at
> the point where the mm side of large folios needs to be divorced
> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
> block layer and also every filesystem that wants to support
> sector/blocks sizes larger than PAGE_SIZE. IOWs, large folio
> support needs to *always* be enabled on systems that say
> CONFIG_BLOCK=y.
>
> I'd much prefer the former occurs, because making the block layer
> and filesystems dependent on an mm feature they don't actually use
> is kinda weird...
>
> -Dave.
>
next prev parent reply other threads:[~2024-07-05 9:03 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-25 11:44 [PATCH v8 00/10] enable bs > ps in XFS Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes Pankaj Raghav (Samsung)
2024-07-04 12:23 ` Ryan Roberts
2024-07-04 15:20 ` Matthew Wilcox
2024-07-04 15:52 ` Ryan Roberts
2024-07-04 21:28 ` Pankaj Raghav (Samsung)
2024-07-04 22:06 ` Dave Chinner
2024-07-04 23:56 ` Matthew Wilcox
2024-07-05 4:32 ` Dave Chinner
2024-07-05 9:03 ` Ryan Roberts [this message]
2024-07-05 12:45 ` Pankaj Raghav (Samsung)
2024-07-05 13:24 ` Pankaj Raghav (Samsung)
2024-07-05 13:31 ` Ryan Roberts
2024-07-05 14:14 ` Pankaj Raghav (Samsung)
2024-07-08 23:01 ` Dave Chinner
2024-07-09 8:11 ` Ryan Roberts
2024-07-09 13:08 ` Pankaj Raghav (Samsung)
2024-07-05 15:14 ` Matthew Wilcox
2024-07-04 21:34 ` Pankaj Raghav (Samsung)
2024-07-09 16:29 ` Pankaj Raghav (Samsung)
2024-07-09 16:38 ` Matthew Wilcox
2024-07-09 17:33 ` Pankaj Raghav (Samsung)
2024-07-09 16:50 ` Darrick J. Wong
2024-07-09 21:08 ` Pankaj Raghav (Samsung)
2024-07-09 21:59 ` Darrick J. Wong
2024-06-25 11:44 ` [PATCH v8 02/10] filemap: allocate mapping_min_order folios in the page cache Pankaj Raghav (Samsung)
2024-06-25 15:52 ` Matthew Wilcox
2024-06-25 18:06 ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 03/10] readahead: allocate folios with mapping_min_order in readahead Pankaj Raghav (Samsung)
2024-07-02 19:38 ` Darrick J. Wong
2024-07-03 14:10 ` Pankaj Raghav (Samsung)
2024-07-04 14:24 ` Ryan Roberts
2024-07-04 14:29 ` Matthew Wilcox
2024-06-25 11:44 ` [PATCH v8 04/10] mm: split a folio in minimum folio order chunks Pankaj Raghav (Samsung)
2024-06-25 14:45 ` Zi Yan
2024-06-25 17:20 ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 05/10] filemap: cap PTE range to be created to allowed zero fill in folio_map_range() Pankaj Raghav (Samsung)
2024-07-01 23:39 ` Darrick J. Wong
2024-06-25 11:44 ` [PATCH v8 06/10] iomap: fix iomap_dio_zero() for fs bs > system page size Pankaj Raghav (Samsung)
2024-07-01 2:37 ` Dave Chinner
2024-07-01 11:22 ` Pankaj Raghav (Samsung)
2024-07-01 23:40 ` Darrick J. Wong
2024-07-02 7:42 ` Christoph Hellwig
2024-07-02 10:15 ` Pankaj Raghav (Samsung)
2024-07-02 12:02 ` Christoph Hellwig
2024-07-02 14:01 ` Pankaj Raghav (Samsung)
2024-07-02 15:42 ` Christoph Hellwig
2024-07-02 16:13 ` Pankaj Raghav (Samsung)
2024-07-02 16:51 ` Matthew Wilcox
2024-07-02 17:10 ` Pankaj Raghav (Samsung)
2024-07-03 5:16 ` Christoph Hellwig
2024-07-02 16:50 ` Matthew Wilcox
2024-07-02 13:49 ` Luis Chamberlain
2024-06-25 11:44 ` [PATCH v8 07/10] xfs: use kvmalloc for xattr buffers Pankaj Raghav (Samsung)
2024-06-25 18:07 ` Pankaj Raghav (Samsung)
2024-06-25 11:44 ` [PATCH v8 08/10] xfs: expose block size in stat Pankaj Raghav (Samsung)
2024-07-01 2:33 ` Dave Chinner
2024-06-25 11:44 ` [PATCH v8 09/10] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Pankaj Raghav (Samsung)
2024-07-01 2:34 ` Dave Chinner
2024-06-25 11:44 ` [PATCH v8 10/10] xfs: enable block size larger than page size support Pankaj Raghav (Samsung)
2024-07-01 2:34 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=03b9ea6c-a3c4-41e8-ad47-4e82344da419@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=cl@os.amperecomputing.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=gost.dev@samsung.com \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=john.g.garry@oracle.com \
--cc=kernel@pankajraghav.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=p.raghav@samsung.com \
--cc=willy@infradead.org \
--cc=yang@os.amperecomputing.com \
--cc=zi.yan@sent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).