* [PATCHv3 0/2] Fix SIGBUS semantics with large folios
@ 2025-10-27 11:56 Kiryl Shutsemau
2025-10-27 11:56 ` [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size Kiryl Shutsemau
2025-10-27 11:56 ` [PATCHv3 2/2] mm/truncate: Unmap large folio on split failure Kiryl Shutsemau
0 siblings, 2 replies; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-10-27 11:56 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Hugh Dickins, Matthew Wilcox,
Alexander Viro, Christian Brauner
Cc: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Rik van Riel, Harry Yoo,
Johannes Weiner, Shakeel Butt, Baolin Wang, Darrick J. Wong,
Dave Chinner, linux-mm, linux-fsdevel, linux-kernel,
Kiryl Shutsemau
From: Kiryl Shutsemau <kas@kernel.org>
Accessing memory within a VMA, but beyond i_size rounded up to the next
page boundary, is supposed to generate SIGBUS.
Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749
failed due to missing SIGBUS. This was caused by my recent changes that
try to fault in the whole folio where possible:
19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
357b92761d94 ("mm/filemap: map entire large folio faultaround")
These changes did not consider i_size when setting up PTEs, leading to
xfstest breakage.
However, the problem has been present in the kernel for a long time -
since huge tmpfs was introduced in 2016. The kernel happily maps
PMD-sized folios as PMD without checking i_size. And huge=always tmpfs
allocates PMD-size folios on any writes.
I considered this corner case when I implemented huge tmpfs, and my
conclusion was that no one in their right mind should rely on receiving
a SIGBUS signal when accessing beyond i_size. I cannot imagine how it
could be useful for a workload.
But apparently filesystem folks care a lot about preserving strict
SIGBUS semantics.
generic/749 was introduced last year with a reference to POSIX, but no
real workloads were mentioned. It also acknowledged that tmpfs deviates
from the behavior the test expects.
POSIX indeed says[3]:
	References within the address range starting at pa and
	continuing for len bytes to whole pages following the end of an
	object shall result in delivery of a SIGBUS signal.
The patchset fixes the regression introduced by recent changes as well
as more subtle SIGBUS breakage due to split failure on truncation.
v3:
- Make an exception for tmpfs/shmem, code restructured;
- Rebased to mm-everything (v2 of the patchset reverted);
v2:
- Fix try_to_unmap() flags;
- Add warning if try_to_unmap() fails to unmap the folio;
- Adjust comments and commit messages;
- Whitespace fixes;
v1:
- Drop RFC;
- Add Signed-off-bys;
Kiryl Shutsemau (2):
mm/memory: Do not populate page table entries beyond i_size
mm/truncate: Unmap large folio on split failure
mm/filemap.c | 28 ++++++++++++++++++++--------
mm/memory.c | 20 +++++++++++++++++++-
mm/truncate.c | 35 +++++++++++++++++++++++++++++------
3 files changed, 68 insertions(+), 15 deletions(-)
--
2.50.1
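The SIGBUS rule described in the cover letter above can be observed from
userspace. The following is a minimal sketch, not taken from the patchset
or from generic/749: it maps two pages of a one-byte file and touches the
page that lies wholly beyond EOF. The function name
probe_sigbus_beyond_eof() and the /tmp path are assumptions made for
illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_jmp, 1);
}

/*
 * Hypothetical helper: returns 1 if reading the first page wholly beyond
 * EOF raised SIGBUS, 0 if the read succeeded, -1 on setup failure.
 */
static int probe_sigbus_beyond_eof(void)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";	/* assumed scratch location */
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile int result = -1;
	volatile char *map;
	int fd;

	assert(page > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* i_size = 1 byte: page 0 is backed by the file, page 1 is not. */
	if (ftruncate(fd, 1) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, 2 * (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sigaction(SIGBUS, &sa, &old);

	if (sigsetjmp(sigbus_jmp, 1) == 0) {
		(void)map[page];	/* volatile read beyond rounded-up i_size */
		result = 0;
	} else {
		result = 1;
	}

	sigaction(SIGBUS, &old, NULL);
	munmap((void *)map, 2 * (size_t)page);
	close(fd);
	return result;
}
```

Because the probe touches the out-of-range page first, it should only show
the baseline semantics the series aims to preserve; it is not expected to
exercise the fault-around or PMD-mapping paths that the patches change.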
* [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-27 11:56 [PATCHv3 0/2] Fix SIGBUS semantics with large folios Kiryl Shutsemau
@ 2025-10-27 11:56 ` Kiryl Shutsemau
2025-10-27 22:33 ` Andrew Morton
2025-10-27 11:56 ` [PATCHv3 2/2] mm/truncate: Unmap large folio on split failure Kiryl Shutsemau
1 sibling, 1 reply; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-10-27 11:56 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Hugh Dickins, Matthew Wilcox,
Alexander Viro, Christian Brauner
Cc: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Rik van Riel, Harry Yoo,
Johannes Weiner, Shakeel Butt, Baolin Wang, Darrick J. Wong,
Dave Chinner, linux-mm, linux-fsdevel, linux-kernel,
Kiryl Shutsemau
From: Kiryl Shutsemau <kas@kernel.org>
Accesses within a VMA, but beyond i_size rounded up to PAGE_SIZE, are
supposed to generate SIGBUS.
Recent changes attempted to fault in the full folio where possible. They
did not respect i_size, which led to populating PTEs beyond i_size and
breaking SIGBUS semantics.
Darrick reported generic/749 breakage because of this.
However, the problem existed before the recent changes. With huge=always
tmpfs, any write to a file leads to a PMD-size allocation. A subsequent
fault-in of the folio installs a PMD mapping regardless of i_size.
Fix filemap_map_pages() and finish_fault() to not install:
- PTEs beyond i_size;
- PMD mappings across i_size;
Make an exception for shmem/tmpfs, which for a long time has
intentionally mapped with PMDs across i_size.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Fixes: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
Fixes: 357b92761d94 ("mm/filemap: map entire large folio faultaround")
Fixes: 01c70267053d ("fs: add a filesystem flag for THPs")
Reported-by: "Darrick J. Wong" <djwong@kernel.org>
---
mm/filemap.c | 28 ++++++++++++++++++++--------
mm/memory.c | 20 +++++++++++++++++++-
2 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index b7b297c1ad4f..ff75bd89b68c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3690,7 +3690,8 @@ static struct folio *next_uptodate_folio(struct xa_state *xas,
 static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
 			struct folio *folio, unsigned long start,
 			unsigned long addr, unsigned int nr_pages,
-			unsigned long *rss, unsigned short *mmap_miss)
+			unsigned long *rss, unsigned short *mmap_miss,
+			bool can_map_large)
 {
 	unsigned int ref_from_caller = 1;
 	vm_fault_t ret = 0;
@@ -3705,7 +3706,7 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
 	 * The folio must not cross VMA or page table boundary.
 	 */
 	addr0 = addr - start * PAGE_SIZE;
-	if (folio_within_vma(folio, vmf->vma) &&
+	if (can_map_large && folio_within_vma(folio, vmf->vma) &&
 	    (addr0 & PMD_MASK) == ((addr0 + folio_size(folio) - 1) & PMD_MASK)) {
 		vmf->pte -= start;
 		page -= start;
@@ -3820,13 +3821,27 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	unsigned long rss = 0;
 	unsigned int nr_pages = 0, folio_type;
 	unsigned short mmap_miss = 0, mmap_miss_saved;
+	bool can_map_large;
 	rcu_read_lock();
 	folio = next_uptodate_folio(&xas, mapping, end_pgoff);
 	if (!folio)
 		goto out;
-	if (filemap_map_pmd(vmf, folio, start_pgoff)) {
+	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+	end_pgoff = min(end_pgoff, file_end);
+
+	/*
+	 * Do not allow to map with PTEs beyond i_size and with PMD
+	 * across i_size to preserve SIGBUS semantics.
+	 *
+	 * Make an exception for shmem/tmpfs that for long time
+	 * intentionally mapped with PMDs across i_size.
+	 */
+	can_map_large = shmem_mapping(mapping) ||
+		file_end >= folio_next_index(folio);
+
+	if (can_map_large && filemap_map_pmd(vmf, folio, start_pgoff)) {
 		ret = VM_FAULT_NOPAGE;
 		goto out;
 	}
@@ -3839,10 +3854,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		goto out;
 	}
-	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
-	if (end_pgoff > file_end)
-		end_pgoff = file_end;
-
 	folio_type = mm_counter_file(folio);
 	do {
 		unsigned long end;
@@ -3859,7 +3870,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		else
 			ret |= filemap_map_folio_range(vmf, folio,
 					xas.xa_index - folio->index, addr,
-					nr_pages, &rss, &mmap_miss);
+					nr_pages, &rss, &mmap_miss,
+					can_map_large);
 		folio_unlock(folio);
 	} while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL);
diff --git a/mm/memory.c b/mm/memory.c
index 39e21688e74b..1a3eb070f8df 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -77,6 +77,7 @@
 #include <linux/sched/sysctl.h>
 #include <linux/pgalloc.h>
 #include <linux/uaccess.h>
+#include <linux/shmem_fs.h>
 #include <trace/events/kmem.h>
@@ -5545,8 +5546,25 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 			return ret;
 	}
+	if (!needs_fallback && vma->vm_file) {
+		struct address_space *mapping = vma->vm_file->f_mapping;
+		pgoff_t file_end;
+
+		file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
+
+		/*
+		 * Do not allow to map with PTEs beyond i_size and with PMD
+		 * across i_size to preserve SIGBUS semantics.
+		 *
+		 * Make an exception for shmem/tmpfs that for long time
+		 * intentionally mapped with PMDs across i_size.
+		 */
+		needs_fallback = !shmem_mapping(mapping) &&
+			file_end < folio_next_index(folio);
+	}
+
 	if (pmd_none(*vmf->pmd)) {
-		if (folio_test_pmd_mappable(folio)) {
+		if (!needs_fallback && folio_test_pmd_mappable(folio)) {
 			ret = do_set_pmd(vmf, folio, page);
 			if (ret != VM_FAULT_FALLBACK)
 				return ret;
--
2.50.1
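The i_size check that the patch above adds to finish_fault() can be
modelled in userspace as plain arithmetic. This is a hedged sketch rather
than kernel code: MODEL_PAGE_SIZE, pages_covering() and
may_map_whole_folio() are hypothetical names, a 4 KiB page size is
assumed, and the folio is described only by its starting page index and
its length in pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Mirrors DIV_ROUND_UP(i_size, PAGE_SIZE): pages needed to cover i_size. */
static uint64_t pages_covering(uint64_t i_size)
{
	return (i_size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/*
 * Models the finish_fault() hunk:
 *   needs_fallback = !shmem_mapping(mapping) &&
 *                    file_end < folio_next_index(folio);
 * and returns the negation, i.e. whether the whole folio may be mapped.
 */
static bool may_map_whole_folio(uint64_t i_size, uint64_t folio_index,
				uint64_t folio_nr_pages, bool is_shmem)
{
	uint64_t file_end = pages_covering(i_size);
	uint64_t next_index = folio_index + folio_nr_pages;

	assert(folio_nr_pages > 0);

	/* shmem/tmpfs is exempt; others must end at or before rounded-up EOF */
	return is_shmem || file_end >= next_index;
}
```

For example, with i_size = 3 * 4096 + 1 the file covers four pages, so a
four-page folio at index 0 may be mapped whole while an eight-page folio
falls back to per-page mapping. Note that this model follows the
finish_fault() hunk only: the filemap_map_pages() hunk in the same patch
computes file_end with a trailing "- 1" (the last valid index) before
applying the same ">=" comparison.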
* [PATCHv3 2/2] mm/truncate: Unmap large folio on split failure
2025-10-27 11:56 [PATCHv3 0/2] Fix SIGBUS semantics with large folios Kiryl Shutsemau
2025-10-27 11:56 ` [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size Kiryl Shutsemau
@ 2025-10-27 11:56 ` Kiryl Shutsemau
1 sibling, 0 replies; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-10-27 11:56 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Hugh Dickins, Matthew Wilcox,
Alexander Viro, Christian Brauner
Cc: Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Rik van Riel, Harry Yoo,
Johannes Weiner, Shakeel Butt, Baolin Wang, Darrick J. Wong,
Dave Chinner, linux-mm, linux-fsdevel, linux-kernel,
Kiryl Shutsemau
From: Kiryl Shutsemau <kas@kernel.org>
Accesses within a VMA, but beyond i_size rounded up to PAGE_SIZE, are
supposed to generate SIGBUS.
This behavior might not be respected on truncation.
During truncation, the kernel splits a large folio in order to reclaim
memory. As a side effect, it unmaps the folio and destroys PMD mappings
of the folio. The folio will be refaulted as PTEs and SIGBUS semantics
are preserved.
However, if the split fails, PMD mappings are preserved and the user
will not receive SIGBUS on any accesses within the PMD.
Unmap the folio on split failure. This leads to a refault as PTEs and
preserves SIGBUS semantics.
Make an exception for shmem/tmpfs, which for a long time has
intentionally mapped with PMDs across i_size.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
---
mm/truncate.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/mm/truncate.c b/mm/truncate.c
index 9210cf808f5c..3c5a50ae3274 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -177,6 +177,32 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
 	return 0;
 }
+static int try_folio_split_or_unmap(struct folio *folio, struct page *split_at,
+				    unsigned long min_order)
+{
+	enum ttu_flags ttu_flags =
+		TTU_SYNC |
+		TTU_SPLIT_HUGE_PMD |
+		TTU_IGNORE_MLOCK;
+	int ret;
+
+	ret = try_folio_split_to_order(folio, split_at, min_order);
+
+	/*
+	 * If the split fails, unmap the folio, so it will be refaulted
+	 * with PTEs to respect SIGBUS semantics.
+	 *
+	 * Make an exception for shmem/tmpfs that for long time
+	 * intentionally mapped with PMDs across i_size.
+	 */
+	if (ret && !shmem_mapping(folio->mapping)) {
+		try_to_unmap(folio, ttu_flags);
+		WARN_ON(folio_mapped(folio));
+	}
+
+	return ret;
+}
+
 /*
  * Handle partial folios. The folio may be entirely within the
  * range if a split has raced with us. If not, we zero the part of the
@@ -226,7 +252,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 	min_order = mapping_min_folio_order(folio->mapping);
 	split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE);
-	if (!try_folio_split_to_order(folio, split_at, min_order)) {
+	if (!try_folio_split_or_unmap(folio, split_at, min_order)) {
 		/*
 		 * try to split at offset + length to make sure folios within
 		 * the range can be dropped, especially to avoid memory waste
@@ -250,13 +276,10 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 		if (!folio_trylock(folio2))
 			goto out;
-		/*
-		 * make sure folio2 is large and does not change its mapping.
-		 * Its split result does not matter here.
-		 */
+		/* make sure folio2 is large and does not change its mapping */
 		if (folio_test_large(folio2) &&
 		    folio2->mapping == folio->mapping)
-			try_folio_split_to_order(folio2, split_at2, min_order);
+			try_folio_split_or_unmap(folio2, split_at2, min_order);
 		folio_unlock(folio2);
 out:
--
2.50.1
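The control flow of try_folio_split_or_unmap() in the patch above can be
summarised with a small userspace state model. This is only a sketch under
stated assumptions: struct folio_model and model_split_or_unmap() are
hypothetical, the split outcome is supplied by the caller instead of being
computed, and try_to_unmap() is assumed to always succeed (the real helper
warns if the folio is still mapped afterwards).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the patch cares about. */
struct folio_model {
	bool shmem;		/* folio belongs to shmem/tmpfs */
	bool pmd_mapped;	/* folio is currently mapped by a PMD */
	bool can_split;		/* whether the split attempt would succeed */
};

/* Returns 0 if the split succeeded, nonzero otherwise, like the patch. */
static int model_split_or_unmap(struct folio_model *f)
{
	assert(f != NULL);

	if (f->can_split) {
		/* A successful split unmaps the folio as a side effect. */
		f->pmd_mapped = false;
		return 0;
	}

	/* Split failed: unmap so a later access refaults with PTEs. */
	if (!f->shmem)
		f->pmd_mapped = false;

	return -1;
}
```

In this model a failed split leaves the PMD mapping in place only for
shmem/tmpfs, which matches the exception described in the commit message;
for every other mapping the folio ends up unmapped whether or not the
split succeeds.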
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-27 11:56 ` [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size Kiryl Shutsemau
@ 2025-10-27 22:33 ` Andrew Morton
2025-10-28 10:23 ` Kiryl Shutsemau
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2025-10-27 22:33 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: David Hildenbrand, Hugh Dickins, Matthew Wilcox, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel, Kiryl Shutsemau
On Mon, 27 Oct 2025 11:56:35 +0000 Kiryl Shutsemau <kirill@shutemov.name> wrote:
> From: Kiryl Shutsemau <kas@kernel.org>
>
> Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
> supposed to generate SIGBUS.
>
> Recent changes attempted to fault in full folio where possible. They did
> not respect i_size, which led to populating PTEs beyond i_size and
> breaking SIGBUS semantics.
>
> Darrick reported generic/749 breakage because of this.
>
> However, the problem existed before the recent changes. With huge=always
> tmpfs, any write to a file leads to PMD-size allocation. Following the
> fault-in of the folio will install PMD mapping regardless of i_size.
>
> Fix filemap_map_pages() and finish_fault() to not install:
> - PTEs beyond i_size;
> - PMD mappings across i_size;
>
> Make an exception for shmem/tmpfs that for long time intentionally
> mapped with PMDs across i_size.
>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> Fixes: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
> Fixes: 357b92761d94 ("mm/filemap: map entire large folio faultaround")
> Fixes: 01c70267053d ("fs: add a filesystem flag for THPs")
Multiple Fixes: are confusing.
We have two 6.18-rcX targets and one from 2020. Are we asking people
to backport this all the way back to 2020? If so I'd suggest the
removal of the more recent Fixes: targets.
Also, is [2/2] to be backported? The changelog makes it sound that way,
but no Fixes: was identified?
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-27 22:33 ` Andrew Morton
@ 2025-10-28 10:23 ` Kiryl Shutsemau
2025-10-29 9:45 ` Hugh Dickins
0 siblings, 1 reply; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-10-28 10:23 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Hugh Dickins, Matthew Wilcox, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Mon, Oct 27, 2025 at 03:33:23PM -0700, Andrew Morton wrote:
> On Mon, 27 Oct 2025 11:56:35 +0000 Kiryl Shutsemau <kirill@shutemov.name> wrote:
>
> > From: Kiryl Shutsemau <kas@kernel.org>
> >
> > Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
> > supposed to generate SIGBUS.
> >
> > Recent changes attempted to fault in full folio where possible. They did
> > not respect i_size, which led to populating PTEs beyond i_size and
> > breaking SIGBUS semantics.
> >
> > Darrick reported generic/749 breakage because of this.
> >
> > However, the problem existed before the recent changes. With huge=always
> > tmpfs, any write to a file leads to PMD-size allocation. Following the
> > fault-in of the folio will install PMD mapping regardless of i_size.
> >
> > Fix filemap_map_pages() and finish_fault() to not install:
> > - PTEs beyond i_size;
> > - PMD mappings across i_size;
> >
> > Make an exception for shmem/tmpfs that for long time intentionally
> > mapped with PMDs across i_size.
> >
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > Fixes: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
> > Fixes: 357b92761d94 ("mm/filemap: map entire large folio faultaround")
> > Fixes: 01c70267053d ("fs: add a filesystem flag for THPs")
>
> Multiple Fixes: are confusing.
>
> We have two 6.18-rcX targets and one from 2020. Are we asking people
> to backport this all the way back to 2020? If so I'd suggest the
> removal of the more recent Fixes: targets.
Okay, fair enough.
> Also, is [2/2] to be backported? The changelog makes it sound that way,
> but no Fixes: was identified?
Looking at split-on-truncate history, looks like this is the right
commit to point to:
Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
It moves split logic from shmem-specific to generic truncate.
As with the first patch, it will not be a trivial backport, but I am
around to help with this.
--
Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-28 10:23 ` Kiryl Shutsemau
@ 2025-10-29 9:45 ` Hugh Dickins
2025-10-29 10:23 ` Kiryl Shutsemau
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Hugh Dickins @ 2025-10-29 9:45 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Andrew Morton, David Hildenbrand, Hugh Dickins, Matthew Wilcox,
Alexander Viro, Christian Brauner, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Rik van Riel, Harry Yoo,
Johannes Weiner, Shakeel Butt, Baolin Wang, Darrick J. Wong,
Dave Chinner, linux-mm, linux-fsdevel, linux-kernel
On Tue, 28 Oct 2025, Kiryl Shutsemau wrote:
> On Mon, Oct 27, 2025 at 03:33:23PM -0700, Andrew Morton wrote:
> > On Mon, 27 Oct 2025 11:56:35 +0000 Kiryl Shutsemau <kirill@shutemov.name> wrote:
> >
> > > From: Kiryl Shutsemau <kas@kernel.org>
> > >
> > > Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
> > > supposed to generate SIGBUS.
> > >
> > > Recent changes attempted to fault in full folio where possible. They did
> > > not respect i_size, which led to populating PTEs beyond i_size and
> > > breaking SIGBUS semantics.
> > >
> > > Darrick reported generic/749 breakage because of this.
> > >
> > > However, the problem existed before the recent changes. With huge=always
> > > tmpfs, any write to a file leads to PMD-size allocation. Following the
> > > fault-in of the folio will install PMD mapping regardless of i_size.
> > >
> > > Fix filemap_map_pages() and finish_fault() to not install:
> > > - PTEs beyond i_size;
> > > - PMD mappings across i_size;
> > >
> > > Make an exception for shmem/tmpfs that for long time intentionally
> > > mapped with PMDs across i_size.
Thanks for the v3 patches, which do now suit huge tmpfs.
Not beautiful, but no longer regressing.
> > >
> > > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > > Fixes: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()")
> > > Fixes: 357b92761d94 ("mm/filemap: map entire large folio faultaround")
> > > Fixes: 01c70267053d ("fs: add a filesystem flag for THPs")
> >
> > Multiple Fixes: are confusing.
> >
> > We have two 6.18-rcX targets and one from 2020. Are we asking people
> > to backport this all the way back to 2020? If so I'd suggest the
> > removal of the more recent Fixes: targets.
>
> Okay, fair enough.
>
> > Also, is [2/2] to be backported? The changelog makes it sound that way,
> > but no Fixes: was identified?
>
> Looking at split-on-truncate history, looks like this is the right
> commit to point to:
>
> Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
I agree that's the right Fixee for 2/2: the one which introduced
splitting a large folio to non-shmem filesystems in 5.17.
But you're giving yourself too hard a time of backporting with your
5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
flag then was tmpfs, which you're now excepting. The flag got
renamed later (in 5.16) and then in 5.17 at last there was another
filesystem to set it. So, this 1/2 would be
Fixes: 6795801366da ("xfs: Support large folios")
>
> It moves split logic from shmem-specific to generic truncate.
>
> As with the first patch, it will not be a trivial backport, but I am
> around to help with this.
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-29 9:45 ` Hugh Dickins
@ 2025-10-29 10:23 ` Kiryl Shutsemau
2025-11-01 4:17 ` Andrew Morton
2025-11-01 5:00 ` Matthew Wilcox
2 siblings, 0 replies; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-10-29 10:23 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, David Hildenbrand, Matthew Wilcox, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Wed, Oct 29, 2025 at 02:45:52AM -0700, Hugh Dickins wrote:
> On Tue, 28 Oct 2025, Kiryl Shutsemau wrote:
> >
> > > Also, is [2/2] to be backported? The changelog makes it sound that way,
> > > but no Fixes: was identified?
> >
> > Looking at split-on-truncate history, looks like this is the right
> > commit to point to:
> >
> > Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
>
> I agree that's the right Fixee for 2/2: the one which introduced
> splitting a large folio to non-shmem filesystems in 5.17.
>
> But you're giving yourself too hard a time of backporting with your
> 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> flag then was tmpfs, which you're now excepting. The flag got
> renamed later (in 5.16) and then in 5.17 at last there was another
> filesystem to set it. So, this 1/2 would be
>
> Fixes: 6795801366da ("xfs: Support large folios")
Good point.
--
Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-29 9:45 ` Hugh Dickins
2025-10-29 10:23 ` Kiryl Shutsemau
@ 2025-11-01 4:17 ` Andrew Morton
2025-11-01 5:00 ` Matthew Wilcox
2 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2025-11-01 4:17 UTC (permalink / raw)
To: Hugh Dickins
Cc: Kiryl Shutsemau, David Hildenbrand, Matthew Wilcox,
Alexander Viro, Christian Brauner, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Rik van Riel, Harry Yoo,
Johannes Weiner, Shakeel Butt, Baolin Wang, Darrick J. Wong,
Dave Chinner, linux-mm, linux-fsdevel, linux-kernel
On Wed, 29 Oct 2025 02:45:52 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:
> > Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
>
> I agree that's the right Fixee for 2/2: the one which introduced
> splitting a large folio to non-shmem filesystems in 5.17.
>
> But you're giving yourself too hard a time of backporting with your
> 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> flag then was tmpfs, which you're now excepting. The flag got
> renamed later (in 5.16) and then in 5.17 at last there was another
> filesystem to set it. So, this 1/2 would be
>
> Fixes: 6795801366da ("xfs: Support large folios")
I updated the changelog in mm.git's copy of this patch, thanks.
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-10-29 9:45 ` Hugh Dickins
2025-10-29 10:23 ` Kiryl Shutsemau
2025-11-01 4:17 ` Andrew Morton
@ 2025-11-01 5:00 ` Matthew Wilcox
2025-11-03 10:59 ` Kiryl Shutsemau
2 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2025-11-01 5:00 UTC (permalink / raw)
To: Hugh Dickins
Cc: Kiryl Shutsemau, Andrew Morton, David Hildenbrand, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Wed, Oct 29, 2025 at 02:45:52AM -0700, Hugh Dickins wrote:
> But you're giving yourself too hard a time of backporting with your
> 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> flag then was tmpfs, which you're now excepting. The flag got
> renamed later (in 5.16) and then in 5.17 at last there was another
> filesystem to set it. So, this 1/2 would be
>
> Fixes: 6795801366da ("xfs: Support large folios")
I haven't been able to keep up with this patchset -- sorry.
But this problem didn't exist until bs>PS support was added because we
would never add a folio to the page cache which extended beyond i_size
before. We'd shrink the folio order allocated in do_page_cache_ra()
(actually, we still do, but page_cache_ra_unbounded() rounds it up
again). So it doesn't fix that commit at all, but something far more
recent.
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-11-01 5:00 ` Matthew Wilcox
@ 2025-11-03 10:59 ` Kiryl Shutsemau
2025-11-03 14:35 ` Matthew Wilcox
0 siblings, 1 reply; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-11-03 10:59 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Hugh Dickins, Andrew Morton, David Hildenbrand, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Sat, Nov 01, 2025 at 05:00:47AM +0000, Matthew Wilcox wrote:
> On Wed, Oct 29, 2025 at 02:45:52AM -0700, Hugh Dickins wrote:
> > But you're giving yourself too hard a time of backporting with your
> > 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> > flag then was tmpfs, which you're now excepting. The flag got
> > renamed later (in 5.16) and then in 5.17 at last there was another
> > filesystem to set it. So, this 1/2 would be
> >
> > Fixes: 6795801366da ("xfs: Support large folios")
>
> I haven't been able to keep up with this patchset -- sorry.
>
> But this problem didn't exist until bs>PS support was added because we
> would never add a folio to the page cache which extended beyond i_size
> before. We'd shrink the folio order allocated in do_page_cache_ra()
> (actually, we still do, but page_cache_ra_unbounded() rounds it up
> again). So it doesn't fix that commit at all, but something far more
> recent.
What about the truncate path? We could allocate within i_size at first
and then truncate; if truncation fails to split the folio, the mapping
stays beyond i_size.
--
Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-11-03 10:59 ` Kiryl Shutsemau
@ 2025-11-03 14:35 ` Matthew Wilcox
2025-11-03 15:18 ` Kiryl Shutsemau
0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2025-11-03 14:35 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Hugh Dickins, Andrew Morton, David Hildenbrand, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Mon, Nov 03, 2025 at 10:59:00AM +0000, Kiryl Shutsemau wrote:
> On Sat, Nov 01, 2025 at 05:00:47AM +0000, Matthew Wilcox wrote:
> > On Wed, Oct 29, 2025 at 02:45:52AM -0700, Hugh Dickins wrote:
> > > But you're giving yourself too hard a time of backporting with your
> > > 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> > > flag then was tmpfs, which you're now excepting. The flag got
> > > renamed later (in 5.16) and then in 5.17 at last there was another
> > > filesystem to set it. So, this 1/2 would be
> > >
> > > Fixes: 6795801366da ("xfs: Support large folios")
> >
> > I haven't been able to keep up with this patchset -- sorry.
> >
> > But this problem didn't exist until bs>PS support was added because we
> > would never add a folio to the page cache which extended beyond i_size
> > before. We'd shrink the folio order allocated in do_page_cache_ra()
> > (actually, we still do, but page_cache_ra_unbounded() rounds it up
> > again). So it doesn't fix that commit at all, but something far more
> > recent.
>
> What about truncate path? We could allocate within i_size at first, then
> truncate, if truncation failed to split the folio the mapping stays
> beyond i_size.
Is it worth backporting all this way to solve this niche case?
* Re: [PATCHv3 1/2] mm/memory: Do not populate page table entries beyond i_size
2025-11-03 14:35 ` Matthew Wilcox
@ 2025-11-03 15:18 ` Kiryl Shutsemau
0 siblings, 0 replies; 12+ messages in thread
From: Kiryl Shutsemau @ 2025-11-03 15:18 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Hugh Dickins, Andrew Morton, David Hildenbrand, Alexander Viro,
Christian Brauner, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Rik van Riel, Harry Yoo, Johannes Weiner, Shakeel Butt,
Baolin Wang, Darrick J. Wong, Dave Chinner, linux-mm,
linux-fsdevel, linux-kernel
On Mon, Nov 03, 2025 at 02:35:57PM +0000, Matthew Wilcox wrote:
> On Mon, Nov 03, 2025 at 10:59:00AM +0000, Kiryl Shutsemau wrote:
> > On Sat, Nov 01, 2025 at 05:00:47AM +0000, Matthew Wilcox wrote:
> > > On Wed, Oct 29, 2025 at 02:45:52AM -0700, Hugh Dickins wrote:
> > > > But you're giving yourself too hard a time of backporting with your
> > > > 5.10 Fixee 01c70267053d for 1/2: the only filesystem which set the
> > > > flag then was tmpfs, which you're now excepting. The flag got
> > > > renamed later (in 5.16) and then in 5.17 at last there was another
> > > > filesystem to set it. So, this 1/2 would be
> > > >
> > > > Fixes: 6795801366da ("xfs: Support large folios")
> > >
> > > I haven't been able to keep up with this patchset -- sorry.
> > >
> > > But this problem didn't exist until bs>PS support was added because we
> > > would never add a folio to the page cache which extended beyond i_size
> > > before. We'd shrink the folio order allocated in do_page_cache_ra()
> > > (actually, we still do, but page_cache_ra_unbounded() rounds it up
> > > again). So it doesn't fix that commit at all, but something far more
> > > recent.
> >
> > What about truncate path? We could allocate within i_size at first, then
> > truncate, if truncation failed to split the folio the mapping stays
> > beyond i_size.
>
> Is it worth backporting all this way to solve this niche case?
Dave says it is a correctness issue, so... yes?
--
Kiryl Shutsemau / Kirill A. Shutemov