* [PATCH v3 0/6] Fix for potential deadlock in pre-content event
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
Jan,
This is the mmap solution proposed by Josef to solve the potential
deadlock with faulting in user pages [1].
I've added test coverage for mmap() pre-content events and verified that
no pre-content events are generated on page fault [2].
After some pushback on [v2] for disabling page fault pre-content hooks
while leaving their code in the kernel, this series reverts the page
fault pre-content hooks.
This leaves DAX file access without pre-content hooks, but that was
never a goal for this feature, so I think that is fine.
Thanks,
Amir.
Changes since v2:
- Revert page fault pre-content hooks
- Remove mmap hook from remap_file_pages() (Lorenzo)
- Create fsnotify_mmap_perm() wrapper (Lorenzo)
[1] https://lore.kernel.org/linux-fsdevel/20250307154614.GA59451@perftesting/
[2] https://github.com/amir73il/ltp/commits/fan_hsm/
[v2] https://lore.kernel.org/linux-fsdevel/20250311114153.1763176-1-amir73il@gmail.com/
[v1] https://lore.kernel.org/linux-fsdevel/20250309115207.908112-1-amir73il@gmail.com/
Amir Goldstein (6):
fsnotify: add pre-content hooks on mmap()
Revert "ext4: add pre-content fsnotify hook for DAX faults"
Revert "xfs: add pre-content fsnotify hook for DAX faults"
Revert "fsnotify: generate pre-content permission event on page fault"
Revert "mm: don't allow huge faults for files with pre content
watches"
Revert "fanotify: disable readahead if we have pre-content watches"
fs/ext4/file.c | 3 --
fs/xfs/xfs_file.c | 13 ------
include/linux/fsnotify.h | 21 ++++++++++
include/linux/mm.h | 1 -
mm/filemap.c | 86 ----------------------------------------
mm/memory.c | 19 ---------
mm/nommu.c | 7 ----
mm/readahead.c | 14 -------
mm/util.c | 3 ++
9 files changed, 24 insertions(+), 143 deletions(-)
--
2.34.1
* [PATCH v3 1/6] fsnotify: add pre-content hooks on mmap()
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
Pre-content hooks in page faults introduce a potential deadlock of the
HSM handler in userspace with filesystem freezing.
The requirement for pre-content events is that, for every accessed file
range, an event covering at least this range is generated at least
once before the file data is accessed.
In preparation for disabling pre-content event hooks on page faults,
add pre-content hooks to the mmap() variants for the entire mmapped range,
so HSM can fill content when the user requests to map a portion of the file.
Note that the exec() variants also call vm_mmap_pgoff() internally to map
code sections, so pre-content events are also generated in this case.
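As a rough userspace illustration of the coverage requirement above (not
kernel code), the sketch below converts a page offset and mapping length
into the byte range an mmap()-time event would need to cover, and checks
whether a later access falls inside it. The names ex_range,
ex_mmap_event_range(), ex_covers() and EX_PAGE_SHIFT are hypothetical,
and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define EX_PAGE_SHIFT 12

/* Hypothetical half-open byte range [start, end) within a file. */
struct ex_range {
	int64_t start;
	int64_t end;
};

/*
 * Byte range covered by a mapping of len bytes that starts at page
 * offset pgoff. Note the left shift: a page offset is converted to a
 * byte offset by multiplying by the page size.
 */
static struct ex_range ex_mmap_event_range(uint64_t pgoff, size_t len)
{
	struct ex_range r;

	assert(len > 0);	/* mmap() rejects zero-length mappings */
	r.start = (int64_t)(pgoff << EX_PAGE_SHIFT);
	r.end = r.start + (int64_t)len;
	return r;
}

/* Nonzero if the event range ev covers the access [pos, pos + count). */
static int ex_covers(struct ex_range ev, int64_t pos, size_t count)
{
	return pos >= ev.start && pos + (int64_t)count <= ev.end;
}
```

Under these assumptions, a mapping at page offset 2 with a length of 8192
bytes corresponds to the byte range [8192, 16384), so a later access to
the page at byte offset 12288 is already covered by the mmap()-time event.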
Link: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
include/linux/fsnotify.h | 21 +++++++++++++++++++++
mm/util.c | 3 +++
2 files changed, 24 insertions(+)
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 6a33288bd6a1f..83d3ac97f8262 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -170,6 +170,21 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
return fsnotify_path(&file->f_path, FS_ACCESS_PERM);
}
+/*
+ * fsnotify_mmap_perm - permission hook before mmap of file range
+ */
+static inline int fsnotify_mmap_perm(struct file *file, int prot,
+ const loff_t off, size_t len)
+{
+ /*
+ * mmap() generates only pre-content events.
+ */
+ if (!file || likely(!FMODE_FSNOTIFY_HSM(file->f_mode)))
+ return 0;
+
+ return fsnotify_pre_content(&file->f_path, &off, len);
+}
+
/*
* fsnotify_truncate_perm - permission hook before file truncate
*/
@@ -223,6 +238,12 @@ static inline int fsnotify_file_area_perm(struct file *file, int perm_mask,
return 0;
}
+static inline int fsnotify_mmap_perm(struct file *file, int prot,
+ const loff_t off, size_t len)
+{
+ return 0;
+}
+
static inline int fsnotify_truncate_perm(const struct path *path, loff_t length)
{
return 0;
diff --git a/mm/util.c b/mm/util.c
index b6b9684a14388..8c965474d329f 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -23,6 +23,7 @@
#include <linux/processor.h>
#include <linux/sizes.h>
#include <linux/compat.h>
+#include <linux/fsnotify.h>
#include <linux/uaccess.h>
@@ -569,6 +570,8 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
LIST_HEAD(uf);
ret = security_mmap_file(file, prot, flag);
+ if (!ret)
+ ret = fsnotify_mmap_perm(file, prot, (loff_t)pgoff << PAGE_SHIFT, len);
if (!ret) {
if (mmap_write_lock_killable(mm))
return -EINTR;
--
2.34.1
* [PATCH v3 2/6] Revert "ext4: add pre-content fsnotify hook for DAX faults"
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
This reverts commit bb480760ffc7018e21ee6f60241c2b99ff26ee0e.
---
fs/ext4/file.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index a5205149adba3..3bd96c3d4cd0c 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -756,9 +756,6 @@ static vm_fault_t ext4_dax_huge_fault(struct vm_fault *vmf, unsigned int order)
return VM_FAULT_SIGBUS;
}
} else {
- result = filemap_fsnotify_fault(vmf);
- if (unlikely(result))
- return result;
filemap_invalidate_lock_shared(mapping);
}
result = dax_iomap_fault(vmf, order, &pfn, &error, &ext4_iomap_ops);
--
2.34.1
* [PATCH v3 3/6] Revert "xfs: add pre-content fsnotify hook for DAX faults"
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
This reverts commit 7f4796a46571ced5d3d5b0942e1bfea1eedaaecd.
---
fs/xfs/xfs_file.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index f7a7d89c345ec..9a435b1ff2647 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1451,9 +1451,6 @@ xfs_dax_read_fault(
trace_xfs_read_fault(ip, order);
- ret = filemap_fsnotify_fault(vmf);
- if (unlikely(ret))
- return ret;
xfs_ilock(ip, XFS_MMAPLOCK_SHARED);
ret = xfs_dax_fault_locked(vmf, order, false);
xfs_iunlock(ip, XFS_MMAPLOCK_SHARED);
@@ -1482,16 +1479,6 @@ xfs_write_fault(
vm_fault_t ret;
trace_xfs_write_fault(ip, order);
- /*
- * Usually we get here from ->page_mkwrite callback but in case of DAX
- * we will get here also for ordinary write fault. Handle HSM
- * notifications for that case.
- */
- if (IS_DAX(inode)) {
- ret = filemap_fsnotify_fault(vmf);
- if (unlikely(ret))
- return ret;
- }
sb_start_pagefault(inode->i_sb);
file_update_time(vmf->vma->vm_file);
--
2.34.1
* [PATCH v3 4/6] Revert "fsnotify: generate pre-content permission event on page fault"
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
This reverts commit 8392bc2ff8c8bf7c4c5e6dfa71ccd893a3c046f6.
In the use case of a buffered write whose input buffer is an mmapped file
on a filesystem with a pre-content mark, the prefaulting of the buffer can
happen under the filesystem freeze protection (obtained in vfs_write()),
which breaks the assumptions of the pre-content hook and introduces a
potential deadlock of the HSM handler in userspace with filesystem freezing.
Now that we have pre-content hooks at file mmap() time, remove the
pre-content event hooks on page fault to avoid the potential deadlock.
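For readers unfamiliar with the access pattern described above, here is a
minimal userspace sketch of a write() whose source buffer is a mapping of
another file. It only shows the shape of the call sequence; it does not
set up any fanotify marks or trigger a freeze. The helper names
ex_write_from_mapping() and ex_tmpfile_with() are hypothetical, and the
comment about when pages are faulted in restates the description above
rather than a guarantee for every kernel version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write the first len bytes of src_fd into dst_fd, using a read-only
 * mapping of src_fd as the write() buffer. Returns the number of bytes
 * written, or -1 on error.
 */
static ssize_t ex_write_from_mapping(int src_fd, int dst_fd, size_t len)
{
	void *buf;
	ssize_t ret;

	assert(len > 0);	/* mmap() rejects zero-length mappings */
	buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, src_fd, 0);
	if (buf == MAP_FAILED)
		return -1;

	/*
	 * The pages of buf may not be resident yet, so the kernel may
	 * have to fault them in while it is servicing this write().
	 */
	ret = write(dst_fd, buf, len);
	munmap(buf, len);
	return ret;
}

/* Create an unlinked temporary file holding len bytes of data. */
static int ex_tmpfile_with(const char *data, size_t len)
{
	char path[] = "/tmp/ex_mmap_XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);
	if (len && write(fd, data, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With a pre-content mark on the source filesystem, the fault taken inside
write() is the point where the reverted hook could have blocked on the
HSM handler while freeze protection was held.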
Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/
Fixes: 8392bc2ff8c8b ("fsnotify: generate pre-content permission event on page fault")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
include/linux/mm.h | 1 -
mm/filemap.c | 74 ----------------------------------------------
mm/nommu.c | 7 -----
3 files changed, 82 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068ddcbb70..8483e09aeb2cd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3420,7 +3420,6 @@ extern vm_fault_t filemap_fault(struct vm_fault *vmf);
extern vm_fault_t filemap_map_pages(struct vm_fault *vmf,
pgoff_t start_pgoff, pgoff_t end_pgoff);
extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
-extern vm_fault_t filemap_fsnotify_fault(struct vm_fault *vmf);
extern unsigned long stack_guard_gap;
/* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
diff --git a/mm/filemap.c b/mm/filemap.c
index 2974691fdfad2..ff5fcdd961364 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -47,7 +47,6 @@
#include <linux/splice.h>
#include <linux/rcupdate_wait.h>
#include <linux/sched/mm.h>
-#include <linux/fsnotify.h>
#include <asm/pgalloc.h>
#include <asm/tlbflush.h>
#include "internal.h"
@@ -3336,48 +3335,6 @@ static vm_fault_t filemap_fault_recheck_pte_none(struct vm_fault *vmf)
return ret;
}
-/**
- * filemap_fsnotify_fault - maybe emit a pre-content event.
- * @vmf: struct vm_fault containing details of the fault.
- *
- * If we have a pre-content watch on this file we will emit an event for this
- * range. If we return anything the fault caller should return immediately, we
- * will return VM_FAULT_RETRY if we had to emit an event, which will trigger the
- * fault again and then the fault handler will run the second time through.
- *
- * Return: a bitwise-OR of %VM_FAULT_ codes, 0 if nothing happened.
- */
-vm_fault_t filemap_fsnotify_fault(struct vm_fault *vmf)
-{
- struct file *fpin = NULL;
- int mask = (vmf->flags & FAULT_FLAG_WRITE) ? MAY_WRITE : MAY_ACCESS;
- loff_t pos = vmf->pgoff >> PAGE_SHIFT;
- size_t count = PAGE_SIZE;
- int err;
-
- /*
- * We already did this and now we're retrying with everything locked,
- * don't emit the event and continue.
- */
- if (vmf->flags & FAULT_FLAG_TRIED)
- return 0;
-
- /* No watches, we're done. */
- if (likely(!FMODE_FSNOTIFY_HSM(vmf->vma->vm_file->f_mode)))
- return 0;
-
- fpin = maybe_unlock_mmap_for_io(vmf, fpin);
- if (!fpin)
- return VM_FAULT_SIGBUS;
-
- err = fsnotify_file_area_perm(fpin, mask, &pos, count);
- fput(fpin);
- if (err)
- return VM_FAULT_SIGBUS;
- return VM_FAULT_RETRY;
-}
-EXPORT_SYMBOL_GPL(filemap_fsnotify_fault);
-
/**
* filemap_fault - read in file data for page fault handling
* @vmf: struct vm_fault containing details of the fault
@@ -3481,37 +3438,6 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
* or because readahead was otherwise unable to retrieve it.
*/
if (unlikely(!folio_test_uptodate(folio))) {
- /*
- * If this is a precontent file we have can now emit an event to
- * try and populate the folio.
- */
- if (!(vmf->flags & FAULT_FLAG_TRIED) &&
- unlikely(FMODE_FSNOTIFY_HSM(file->f_mode))) {
- loff_t pos = folio_pos(folio);
- size_t count = folio_size(folio);
-
- /* We're NOWAIT, we have to retry. */
- if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) {
- folio_unlock(folio);
- goto out_retry;
- }
-
- if (mapping_locked)
- filemap_invalidate_unlock_shared(mapping);
- mapping_locked = false;
-
- folio_unlock(folio);
- fpin = maybe_unlock_mmap_for_io(vmf, fpin);
- if (!fpin)
- goto out_retry;
-
- error = fsnotify_file_area_perm(fpin, MAY_ACCESS, &pos,
- count);
- if (error)
- ret = VM_FAULT_SIGBUS;
- goto out_retry;
- }
-
/*
* If the invalidate lock is not held, the folio was in cache
* and uptodate and now it is not. Strange but possible since we
diff --git a/mm/nommu.c b/mm/nommu.c
index baa79abdaf037..9cb6e99215e2b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1613,13 +1613,6 @@ int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
}
EXPORT_SYMBOL(remap_vmalloc_range);
-vm_fault_t filemap_fsnotify_fault(struct vm_fault *vmf)
-{
- BUG();
- return 0;
-}
-EXPORT_SYMBOL_GPL(filemap_fsnotify_fault);
-
vm_fault_t filemap_fault(struct vm_fault *vmf)
{
BUG();
--
2.34.1
* [PATCH v3 5/6] Revert "mm: don't allow huge faults for files with pre content watches"
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
This reverts commit 20bf82a898b65c129af76deb96a1b415d3098a28.
---
mm/memory.c | 19 -------------------
1 file changed, 19 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index b4d3d4893267c..34e65f6bf0d96 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -76,7 +76,6 @@
#include <linux/ptrace.h>
#include <linux/vmalloc.h>
#include <linux/sched/sysctl.h>
-#include <linux/fsnotify.h>
#include <trace/events/kmem.h>
@@ -5743,17 +5742,8 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;
-
if (vma_is_anonymous(vma))
return do_huge_pmd_anonymous_page(vmf);
- /*
- * Currently we just emit PAGE_SIZE for our fault events, so don't allow
- * a huge fault if we have a pre content watch on this file. This would
- * be trivial to support, but there would need to be tests to ensure
- * this works properly and those don't exist currently.
- */
- if (unlikely(FMODE_FSNOTIFY_HSM(vma->vm_file->f_mode)))
- return VM_FAULT_FALLBACK;
if (vma->vm_ops->huge_fault)
return vma->vm_ops->huge_fault(vmf, PMD_ORDER);
return VM_FAULT_FALLBACK;
@@ -5777,9 +5767,6 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
}
if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
- /* See comment in create_huge_pmd. */
- if (unlikely(FMODE_FSNOTIFY_HSM(vma->vm_file->f_mode)))
- goto split;
if (vma->vm_ops->huge_fault) {
ret = vma->vm_ops->huge_fault(vmf, PMD_ORDER);
if (!(ret & VM_FAULT_FALLBACK))
@@ -5802,9 +5789,6 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
/* No support for anonymous transparent PUD pages yet */
if (vma_is_anonymous(vma))
return VM_FAULT_FALLBACK;
- /* See comment in create_huge_pmd. */
- if (unlikely(FMODE_FSNOTIFY_HSM(vma->vm_file->f_mode)))
- return VM_FAULT_FALLBACK;
if (vma->vm_ops->huge_fault)
return vma->vm_ops->huge_fault(vmf, PUD_ORDER);
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -5822,9 +5806,6 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
if (vma_is_anonymous(vma))
goto split;
if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
- /* See comment in create_huge_pmd. */
- if (unlikely(FMODE_FSNOTIFY_HSM(vma->vm_file->f_mode)))
- goto split;
if (vma->vm_ops->huge_fault) {
ret = vma->vm_ops->huge_fault(vmf, PUD_ORDER);
if (!(ret & VM_FAULT_FALLBACK))
--
2.34.1
* [PATCH v3 6/6] Revert "fanotify: disable readahead if we have pre-content watches"
From: Amir Goldstein @ 2025-03-12 7:38 UTC
To: Jan Kara; +Cc: Josef Bacik, Christian Brauner, linux-fsdevel
This reverts commit fac84846a28c0950d4433118b3dffd44306df62d.
---
mm/filemap.c | 12 ------------
mm/readahead.c | 14 --------------
2 files changed, 26 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index ff5fcdd961364..6d616bb9001eb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3197,14 +3197,6 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
unsigned long vm_flags = vmf->vma->vm_flags;
unsigned int mmap_miss;
- /*
- * If we have pre-content watches we need to disable readahead to make
- * sure that we don't populate our mapping with 0 filled pages that we
- * never emitted an event for.
- */
- if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
- return fpin;
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/* Use the readahead code, even if readahead is disabled */
if ((vm_flags & VM_HUGEPAGE) && HPAGE_PMD_ORDER <= MAX_PAGECACHE_ORDER) {
@@ -3273,10 +3265,6 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
struct file *fpin = NULL;
unsigned int mmap_miss;
- /* See comment in do_sync_mmap_readahead. */
- if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
- return fpin;
-
/* If we don't want any read-ahead, don't bother */
if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
return fpin;
diff --git a/mm/readahead.c b/mm/readahead.c
index 220155a5c9646..6a4e96b69702b 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -128,7 +128,6 @@
#include <linux/blk-cgroup.h>
#include <linux/fadvise.h>
#include <linux/sched/mm.h>
-#include <linux/fsnotify.h>
#include "internal.h"
@@ -558,15 +557,6 @@ void page_cache_sync_ra(struct readahead_control *ractl,
unsigned long max_pages, contig_count;
pgoff_t prev_index, miss;
- /*
- * If we have pre-content watches we need to disable readahead to make
- * sure that we don't find 0 filled pages in cache that we never emitted
- * events for. Filesystems supporting HSM must make sure to not call
- * this function with ractl->file unset for files handled by HSM.
- */
- if (ractl->file && unlikely(FMODE_FSNOTIFY_HSM(ractl->file->f_mode)))
- return;
-
/*
* Even if readahead is disabled, issue this request as readahead
* as we'll need it to satisfy the requested range. The forced
@@ -645,10 +635,6 @@ void page_cache_async_ra(struct readahead_control *ractl,
if (!ra->ra_pages)
return;
- /* See the comment in page_cache_sync_ra. */
- if (ractl->file && unlikely(FMODE_FSNOTIFY_HSM(ractl->file->f_mode)))
- return;
-
/*
* Same bit is used for PG_readahead and PG_reclaim.
*/
--
2.34.1
* Re: [PATCH v3 0/6] Fix for potential deadlock in pre-content event
From: Jan Kara @ 2025-03-12 16:56 UTC
To: Amir Goldstein; +Cc: Jan Kara, Josef Bacik, Christian Brauner, linux-fsdevel
Hi!
On Wed 12-03-25 08:38:46, Amir Goldstein wrote:
> This is the mmap solution proposed by Josef to solve the potential
> deadlock with faulting in user pages [1].
>
> I've added test coverage for mmap() pre-content events and verified that
> no pre-content events are generated on page fault [2].
Yeah, sorry for the slightly delayed reply, but this seems like the least
controversial path forward for now. I was thinking for some time about a
proper solution for the deadlock, but so far I haven't come up with anything
clever.
> After some pushback on [v2] for disabling page fault pre-content hooks
> while leaving their code in the kernel, this series reverts the page
> fault pre-content hooks.
>
> This leaves DAX file access without pre-content hooks, but that was
> never a goal for this feature, so I think that is fine.
Yes, I think we can live with that for now.
I'll take the patches to my tree with a view of sending them to Linus over
the weekend after some exposure in linux-next.
Thanks for taking care of this!
Honza
> Changes since v2:
> - Revert page fault pre-content hooks
> - Remove mmap hook from remap_file_pages() (Lorenzo)
> - Create fsnotify_mmap_perm() wrapper (Lorenzo)
>
> [1] https://lore.kernel.org/linux-fsdevel/20250307154614.GA59451@perftesting/
> [2] https://github.com/amir73il/ltp/commits/fan_hsm/
> [v2] https://lore.kernel.org/linux-fsdevel/20250311114153.1763176-1-amir73il@gmail.com/
> [v1] https://lore.kernel.org/linux-fsdevel/20250309115207.908112-1-amir73il@gmail.com/
>
> Amir Goldstein (6):
> fsnotify: add pre-content hooks on mmap()
> Revert "ext4: add pre-content fsnotify hook for DAX faults"
> Revert "xfs: add pre-content fsnotify hook for DAX faults"
> Revert "fsnotify: generate pre-content permission event on page fault"
> Revert "mm: don't allow huge faults for files with pre content
> watches"
> Revert "fanotify: disable readahead if we have pre-content watches"
>
> fs/ext4/file.c | 3 --
> fs/xfs/xfs_file.c | 13 ------
> include/linux/fsnotify.h | 21 ++++++++++
> include/linux/mm.h | 1 -
> mm/filemap.c | 86 ----------------------------------------
> mm/memory.c | 19 ---------
> mm/nommu.c | 7 ----
> mm/readahead.c | 14 -------
> mm/util.c | 3 ++
> 9 files changed, 24 insertions(+), 143 deletions(-)
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR