From: Luis Chamberlain <mcgrof@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Kiryl Shutsemau <kirill@shutemov.name>,
"Darrick J. Wong" <djwong@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Pankaj Raghav <p.raghav@samsung.com>,
Zorro Lang <zlang@redhat.com>,
akpm@linux-foundation.org, linux-mm <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
xfs <linux-xfs@vger.kernel.org>
Subject: Re: Regression in generic/749 with 8k fsblock size on 6.18-rc1
Date: Tue, 21 Oct 2025 10:02:17 -0700
Message-ID: <aPe8merkg654_sVp@bombadil.infradead.org>
In-Reply-To: <aPFyqwdv1prLXw5I@dread.disaster.area>

On Fri, Oct 17, 2025 at 09:33:15AM +1100, Dave Chinner wrote:
> On Thu, Oct 16, 2025 at 11:22:00AM +0100, Kiryl Shutsemau wrote:
> > On Wed, Oct 15, 2025 at 10:57:26AM -0700, Darrick J. Wong wrote:
> > > On Wed, Oct 15, 2025 at 04:59:03PM +0100, Kiryl Shutsemau wrote:
> > > > On Tue, Oct 14, 2025 at 10:52:14AM -0700, Darrick J. Wong wrote:
> > > > > Hi there,
> > > > >
> > > > > On 6.18-rc1, generic/749[1] running on XFS with an 8k fsblock size fails
> > > > > with the following:
> > > > >
> > > > > --- /run/fstests/bin/tests/generic/749.out 2025-07-15 14:45:15.170416031 -0700
> > > > > +++ /var/tmp/fstests/generic/749.out.bad 2025-10-13 17:48:53.079872054 -0700
> > > > > @@ -1,2 +1,10 @@
> > > > > QA output created by 749
> > > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > > +Expected SIGBUS when mmap() reading beyond page boundary
> > > > > +Expected SIGBUS when mmap() writing beyond page boundary
> > > > > Silence is golden
> > > > >
> > > > > This test creates small files of various sizes, maps the EOF block, and
> > > > > checks that you can read and write to the mmap'd page up to (but not
> > > > > beyond) the next page boundary.
> > > > >
> > > > > For 8k fsblock filesystems on x86, the pagecache creates a single 8k
> > > > > folio to cache the entire fsblock containing EOF. If EOF is in the
> > > > > first 4096 bytes of that 8k fsblock, then it should be possible to do a
> > > > > mmap read/write of the first 4k, but not the second 4k. Memory accesses
> > > > > to the second 4096 bytes should produce a SIGBUS.
> > > >
> > > > > Does anybody actually rely on this behaviour (beyond xfstests)?
> > >
> > > Beats me, but the mmap manpage says:
> > ...
> > > POSIX 2024 says:
> > ...
> > > From both I would surmise that it's a reasonable expectation that you
> > > can't map basepages beyond EOF and have page faults on those pages
> > > succeed.
> >
> > <Added folks from the commit that introduced generic/749>
> >
> > Modern kernel with large folios blurs the line of what is the page.
> >
> > I don't want to play spec lawyer. Let's look at real workloads.
>
> Or, more importantly, consider the security-related implications of
> the change....
>
> > If there's anything that actually relies on this SIGBUS corner case,
> > let's see how we can fix the kernel. But it will cost some CPU cycles.
> >
> > If it only broke a synthetic test case, I'm inclined to say WONTFIX.
> >
> > Any opinions?
>
> Mapping beyond EOF ranges into userspace address spaces is a
> potential security risk. If there is ever a zeroing-beyond-EOF bug
> related to large folios (history tells us we are *guaranteed* to
> screw this up somewhere in future), then allowing mapping all the
> way to the end of the large folio could expose a -lot more- stale
> kernel data to userspace than just what the tail of a PAGE_SIZE
> faulted region would expose.
>
> Hence allowing applications to successfully fault an (unpredictable)
> distance far beyond EOF because the page cache used a large folio
> spanning EOF seems, to me, to be a very undesirable behaviour to
> expose to userspace.

I think in retrospect, having been involved in carefully crafting
this test, this was certainly an overlooked but clearly valuable use
case for the test. It should be documented, as otherwise others may
stumble upon the failure and be tempted to fight the test rather than
the kernel behaviour it guards against.

So extending the test docs to cover this concern is valuable.
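
To make the concern concrete for such docs, here is a minimal sketch of
the arithmetic involved. It is not taken from generic/749; the helper
names are hypothetical, and it assumes page and folio sizes are powers
of two. It computes the first file offset where an mmap access is
expected to raise SIGBUS, and how many bytes past EOF become reachable
if faults are instead allowed up to the end of a larger unit such as
the folio spanning EOF.

```c
#include <stdint.h>

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: first file offset at which an mmap access is
 * expected to raise SIGBUS, i.e. the end of the base page containing
 * the last byte of the file. For an empty file this is offset 0.
 */
static uint64_t first_sigbus_offset(uint64_t file_size, uint64_t page_size)
{
	return round_up_pow2(file_size, page_size);
}

/*
 * Hypothetical helper: bytes beyond EOF that become reachable when
 * faults succeed up to the end of a unit of the given size (a base
 * page or a folio) that spans EOF.
 */
static uint64_t bytes_reachable_past_eof(uint64_t file_size, uint64_t unit_size)
{
	return round_up_pow2(file_size, unit_size) - file_size;
}
```

For a 100 byte file with 4096 byte base pages, first_sigbus_offset()
gives 4096, so the second 4k of an 8k folio must fault. The window past
EOF is 3996 bytes when bounded by the base page, 8092 bytes when
bounded by an 8k folio, and 2097052 bytes if the folio were 2M, which
is the kind of growth Dave is worried about.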
Luis
Thread overview: 12+ messages
2025-10-14 17:52 Regression in generic/749 with 8k fsblock size on 6.18-rc1 Darrick J. Wong
2025-10-15 7:39 ` Kirill A. Shutemov
2025-10-15 17:45 ` Darrick J. Wong
2025-10-15 15:59 ` Kiryl Shutsemau
2025-10-15 17:57 ` Darrick J. Wong
2025-10-16 10:22 ` Kiryl Shutsemau
2025-10-16 22:33 ` Dave Chinner
2025-10-17 14:28 ` Kiryl Shutsemau
2025-10-17 16:02 ` Darrick J. Wong
2025-10-17 17:00 ` Kiryl Shutsemau
2025-10-17 17:14 ` Matthew Wilcox
2025-10-21 17:02 ` Luis Chamberlain [this message]