* bad things when too many negative dentries in a directory
@ 2025-04-11 9:40 Miklos Szeredi
2025-04-11 14:47 ` Christian Brauner
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Miklos Szeredi @ 2025-04-11 9:40 UTC (permalink / raw)
To: linux-fsdevel
Cc: Al Viro, Christian Brauner, Amir Goldstein, Jan Kara, Ian Kent
There are reports of softlockups in fsnotify if there are large numbers
of negative dentries (e.g. ~300M) in a directory. This can happen if
lots of temp files are created and removed and there's not enough
memory pressure to trigger the lru shrinker.
These are on old kernels and some of this is possibly due to missing
172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I
managed to reproduce the softlockup on a recent kernel in
fsnotify_set_children_dentry_flags() (see end of mail).
This was with ~1.2G negative dentries. Doing "rmdir testdir"
afterwards does not trigger the softlockup detector, due to the
reschedules in shrink_dcache_parent() code, but it took 10 minutes(!)
to finish removing that empty directory.
So I wonder, do we really want negative dentries on ->d_children?
Except for shrink_dcache_parent() I don't see any uses. And it's also
a question whether shrinking negative dentries is useful or not. If
they've been around for so long that hundreds of millions of them
could accumulate and that memory wasn't needed by anybody, then it
shouldn't make a big difference if they kept hanging around. On
umount, at the latest, the lru list can be used to kill everything,
AFAICT.
I'm curious if this is the right path? Any better ideas?
Thanks,
Miklos
[96789.366007] watchdog: BUG: soft lockup - CPU#79 stuck for 26s!
[fanotify4:52805]
[96789.373396] Modules linked in: rfkill mlx5_ib ib_uverbs macsec
ib_core vfat fat mlx5_core acpi_ipmi ast ipmi_ssif arm_spe_pmu igb
mlxfw psample i2c_algo_bit tls pci_hyperv_intf ipmi_devintf
ipmi_msghandler arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq loop
fuse nfnetlink xfs nvme crct10dif_ce ghash_ce sha2_ce sha256_arm64
nvme_core sha1_ce sbsa_gwdt nvme_auth i2c_designware_platform
i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod
[96789.413624] CPU: 79 UID: 0 PID: 52805 Comm: fanotify4 Kdump: loaded
Not tainted 6.12.0-55.9.1.el10_0.aarch64 #1
[96789.423698] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS
F31n (SCP: 2.10.20220810) 09/30/2022
[96789.432990] pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[96789.439939] pc : fsnotify_set_children_dentry_flags+0x80/0xf0
[96789.445675] lr : fsnotify_set_children_dentry_flags+0xa4/0xf0
[96789.451408] sp : ffff8000cc77b8c0
[96789.454710] x29: ffff8000cc77b8c0 x28: 0000000000000001 x27: 0000000000000000
[96789.461833] x26: ffff07ff8463dc50 x25: ffff080e6e44dc50 x24: 0000000000000001
[96789.468956] x23: ffff07ff9d94eec0 x22: ffff07fff2cf01b8 x21: ffff07ff9d94ee40
[96789.476079] x20: ffff0800eb6dff40 x19: ffff0800eb6df2c0 x18: 0000000000000014
[96789.483202] x17: 00000000cec6e315 x16: 00000000ed365140 x15: 00000000ae8684a4
[96789.490325] x14: 000000000d831309 x13: 00000000387d7ee0 x12: 0000000000000000
[96789.497448] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffc3bacc1864bc
[96789.504570] x8 : 000000001007ffff x7 : ffffc3bace89a4c0 x6 : 0000000000000001
[96789.511694] x5 : 0000000008000020 x4 : 0000000000000000 x3 : 0000000000000003
[96789.518816] x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffff0800eb6df358
[96789.525939] Call trace:
[96789.528373] fsnotify_set_children_dentry_flags+0x80/0xf0
[96789.533759] fsnotify_recalc_mask.part.0+0x94/0xc8
[96789.538538] fsnotify_recalc_mask+0x1c/0x40
[96789.542709] fanotify_add_mark+0x15c/0x360
[96789.546794] do_fanotify_mark+0x3c0/0x7a0
[96789.550791] __arm64_sys_fanotify_mark+0x30/0x60
[96789.555396] invoke_syscall.constprop.0+0x74/0xd0
[96789.560090] do_el0_svc+0xb0/0xe8
[96789.563393] el0_svc+0x44/0x1d0
[96789.566525] el0t_64_sync_handler+0x120/0x130
[96789.570870] el0t_64_sync+0x1a4/0x1a8
[151513.714945] INFO: task (ostnamed):77658 blocked for more than 122 seconds.
[151513.721903] Tainted: G L ------- ---
6.12.0-55.9.1.el10_0.aarch64 #1
[151513.730334] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[151513.738241] task:(ostnamed) state:D stack:0 pid:77658
tgid:77658 ppid:1 flags:0x00000205
[151513.747625] Call trace:
[151513.750146] __switch_to+0xec/0x148
[151513.753712] __schedule+0x234/0x738
[151513.757278] schedule+0x3c/0xe0
[151513.760493] schedule_preempt_disabled+0x2c/0x58
[151513.765188] rwsem_down_write_slowpath+0x1e4/0x720
[151513.770054] down_write+0xac/0xc0
[151513.773444] do_lock_mount+0x3c/0x220
[151513.777185] path_mount+0x378/0x810
[151513.780748] __arm64_sys_mount+0x158/0x2d8
[151513.784921] invoke_syscall.constprop.0+0x74/0xd0
[151513.789702] do_el0_svc+0xb0/0xe8
[151513.793093] el0_svc+0x44/0x1d0
[151513.796312] el0t_64_sync_handler+0x120/0x130
[151513.800744] el0t_64_sync+0x1a4/0x1a8

* Re: bad things when too many negative dentries in a directory
2025-04-11 9:40 bad things when too many negative dentries in a directory Miklos Szeredi
@ 2025-04-11 14:47 ` Christian Brauner
2025-04-11 15:40 ` Miklos Szeredi
2025-04-12 1:48 ` Ian Kent
2025-04-11 21:02 ` Mateusz Guzik
2025-04-20 4:49 ` Al Viro
2 siblings, 2 replies; 23+ messages in thread
From: Christian Brauner @ 2025-04-11 14:47 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
> There are reports of softlockups in fsnotify if there are large numbers
> of negative dentries (e.g. ~300M) in a directory. This can happen if
> lots of temp files are created and removed and there's not enough
> memory pressure to trigger the lru shrinker.
>
> These are on old kernels and some of this is possibly due to missing
> 172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I
> managed to reproduce the softlockup on a recent kernel in
> fsnotify_set_children_dentry_flags() (see end of mail).
>
> This was with ~1.2G negative dentries. Doing "rmdir testdir"
> afterwards does not trigger the softlockup detector, due to the
> reschedules in shrink_dcache_parent() code, but it took 10 minutes(!)
> to finish removing that empty directory.
>
> So I wonder, do we really want negative dentries on ->d_children?
> Except for shrink_dcache_parent() I don't see any uses. And it's also
> a question whether shrinking negative dentries is useful or not. If
> they've been around for so long that hundreds of millions of them
> could accumulate and that memory wasn't needed by anybody, then it
> shouldn't make a big difference if they kept hanging around. On
> umount, at the latest, the lru list can be used to kill everything,
> AFAICT.
>
> I'm curious if this is the right path? Any better ideas?

Note that we have a new sysctl:

/proc/sys/fs/dentry-negative

that can be used to control the negative dentry policy because any
generic change that we tried to make has always resulted in unacceptable
regressions for someone's workload. Currently we only allow it to be set
to 1 (default 0). If set to 1 it will not create negative dentries
during unlink. If that's sufficient, then recommend this to users that
suffer from this problem; if not, consider adding another sensitive
policy.

> Thanks,
> Miklos
>
> [...]
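
A minimal sketch of how the sysctl mentioned above could be inspected and
set from user space. The path and the values 0 and 1 are taken from the
message; the helper names and the `path` parameter (there so the logic can
be tried against an ordinary file) are assumptions made for illustration.
The file only exists on kernels that carry this sysctl, and writing to it
needs root.

```python
from pathlib import Path

# Path named in the message above; absent on kernels without this sysctl.
DENTRY_NEGATIVE = "/proc/sys/fs/dentry-negative"


def get_dentry_negative(path=DENTRY_NEGATIVE):
    """Return the current policy as an int, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


def set_dentry_negative(value, path=DENTRY_NEGATIVE):
    """Write the policy value; the message describes only 0 (default) and 1."""
    if value not in (0, 1):
        raise ValueError("only 0 and 1 are described as valid values")
    Path(path).write_text(f"{value}\n")
```

Calling `set_dentry_negative(1)` is roughly what
"echo 1 > /proc/sys/fs/dentry-negative" does from a root shell.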

* Re: bad things when too many negative dentries in a directory
2025-04-11 14:47 ` Christian Brauner
@ 2025-04-11 15:40 ` Miklos Szeredi
2025-04-11 16:01 ` Matthew Wilcox
2025-04-14 6:28 ` Ian Kent
2025-04-12 1:48 ` Ian Kent
1 sibling, 2 replies; 23+ messages in thread
From: Miklos Szeredi @ 2025-04-11 15:40 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Fri, 11 Apr 2025 at 16:47, Christian Brauner <brauner@kernel.org> wrote:
> Note that we have a new sysctl:
>
> /proc/sys/fs/dentry-negative
>
> that can be used to control the negative dentry policy because any
> generic change that we tried to make has always resulted in unacceptable
> regressions for someone's workload. Currently we only allow it to be set
> to 1 (default 0). If set to 1 it will not create negative dentries
> during unlink. If that's sufficient, then recommend this to users that
> suffer from this problem; if not, consider adding another sensitive
> policy.

Okay, I'll forward that info.

However, hundreds of millions of negative dentries can be created
rather efficiently without unlink, though this one probably doesn't
happen under normal circumstances.

Allowing this to starve the scheduler for an arbitrarily long time is
not a good idea in any case, so the fsnotify problem needs some other
solution, and I suspect that it's not to disable negative caching
completely, as that would be a major bummer.

But the idea of leaving negative dentries off d_children is
independent of caching policy. The lookup cache would work fine
without d_sib being chained; it only needs careful thought in

1) putting the dentry on d_children when it's turned into positive
2) getting the dentry off d_children when it's turned into negative.

Thanks,
Miklos
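
The two transitions listed above can be pictured with a small user-space
toy model. This is not kernel code: `ToyDir`, `cache` and `children` are
hypothetical names, with `cache` standing in for the lookup cache and
`children` for the parent's d_children list. The sketch only shows that
lookups keep working when negative entries are never linked into the
child list.

```python
class ToyDir:
    """Toy directory: every cached name is in `cache`, only positives in `children`."""

    def __init__(self):
        self.cache = {}     # name -> True (positive) or False (negative)
        self.children = []  # stand-in for d_children: positive names only

    def lookup(self, name):
        # A miss is cached as a negative entry but not linked into children.
        return self.cache.setdefault(name, False)

    def create(self, name):
        if not self.cache.get(name, False):
            self.cache[name] = True
            self.children.append(name)  # 1) negative -> positive: link it

    def unlink(self, name):
        if self.cache.get(name, False):
            self.cache[name] = False
            self.children.remove(name)  # 2) positive -> negative: unlink it
```

A walk over `children` (the analogue of what
fsnotify_set_children_dentry_flags() iterates) then scales with the number
of positive entries rather than with the number of cached misses.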

* Re: bad things when too many negative dentries in a directory
2025-04-11 15:40 ` Miklos Szeredi
@ 2025-04-11 16:01 ` Matthew Wilcox
2025-04-14 14:07 ` James Bottomley
2025-04-14 6:28 ` Ian Kent
1 sibling, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2025-04-11 16:01 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Fri, Apr 11, 2025 at 05:40:08PM +0200, Miklos Szeredi wrote:
> However, hundreds of millions of negative dentries can be created
> rather efficiently without unlink, though this one probably doesn't
> happen under normal circumstances.

Depends on your userspace. Since we don't have union directories,
consider the not uncommon case of having a search path A:B:C.
Application looks for D in directory A, doesn't find it, creates a
negative dentry. Application looks for D in directory B, creates a
negative dentry. Application looks for D in directory C, doesn't find
it, so it creates it. Now we have two negative dentries and one
positive dentry.

And for some applications, the name "D" is going to be unique, so the
negative dentries have _no_ further use. The application isn't even
going to open C/D again. If there's no memory pressure, we can build
up billions of dentries. I believe the customer is currently echoing
2 to /proc/sys/vm/drop_caches every hour.
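
The A:B:C pattern described above can be sketched from user space as
follows. `lookup_or_create` is a hypothetical helper, not taken from any
real application. `parse_dentry_state` is an addition that is not part of
the thread: it assumes the field order documented for
/proc/sys/fs/dentry-state (nr_dentry, nr_unused, age_limit, want_pages,
nr_negative, dummy), which is one way to watch the negative dentry count
grow on kernels that report it.

```python
import os


def lookup_or_create(name, search_path):
    """Look for `name` in each directory in order; create it in the last on a miss.

    Returns (path, misses), where `misses` counts the failed lookups.
    """
    misses = 0
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate, misses
        misses += 1  # the kernel may cache this miss as a negative dentry
    target = os.path.join(search_path[-1], name)
    with open(target, "w"):
        pass  # creating the file turns the last miss into a positive dentry
    return target, misses


def parse_dentry_state(text):
    """Parse the contents of /proc/sys/fs/dentry-state into a dict of ints."""
    fields = ("nr_dentry", "nr_unused", "age_limit",
              "want_pages", "nr_negative", "dummy")
    return dict(zip(fields, map(int, text.split())))
```

For a name that does not exist yet, the helper reports three misses; the
one in the last directory becomes positive when the file is created, which
leaves the two negative dentries described in the message.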

* Re: bad things when too many negative dentries in a directory
2025-04-11 16:01 ` Matthew Wilcox
@ 2025-04-14 14:07 ` James Bottomley
2025-04-14 14:30 ` Matthew Wilcox
0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2025-04-14 14:07 UTC (permalink / raw)
To: Matthew Wilcox, Miklos Szeredi
Cc: Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Fri, 2025-04-11 at 17:01 +0100, Matthew Wilcox wrote:
> On Fri, Apr 11, 2025 at 05:40:08PM +0200, Miklos Szeredi wrote:
> > However, hundreds of millions of negative dentries can be created
> > rather efficiently without unlink, though this one probably doesn't
> > happen under normal circumstances.
>
> Depends on your userspace. Since we don't have union directories,
> consider the not uncommon case of having a search path A:B:C.
> Application looks for D in directory A, doesn't find it, creates a
> negative dentry. Application looks for D in directory B, creates a
> negative dentry. Application looks for D in directory C, doesn't find
> it, so it creates it. Now we have two negative dentries and one
> positive dentry.

If an application does an A:B:C directory search pattern it's usually
because it doesn't directly own the file location and hence suggests
that other applications would also be looking for it, which would seem
to indicate, if the search pattern gets repeated, that the two negative
dentries do serve a purpose.

> And for some applications, the name "D" is going to be unique, so the
> negative dentries have _no_ further use. The application isn't even
> going to open C/D again. If there's no memory pressure, we can build
> up billions of dentries. I believe the customer is currently echoing
> 2 to /proc/sys/vm/drop_caches every hour.

So this is an application that's the sole owner of D (i.e. sole
controller of the entire path) yet it still does a search for it, why
is that (if it's something like to update the location, it would be
better served by first looking in the default location before searching
others)? The problem is the pattern exactly matches the shared file
one above so there doesn't seem to be a heuristic way to distinguish
them.

Regards,

James

* Re: bad things when too many negative dentries in a directory
2025-04-14 14:07 ` James Bottomley
@ 2025-04-14 14:30 ` Matthew Wilcox
2025-04-14 15:40 ` James Bottomley
0 siblings, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2025-04-14 14:30 UTC (permalink / raw)
To: James Bottomley
Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Mon, Apr 14, 2025 at 10:07:09AM -0400, James Bottomley wrote:
> On Fri, 2025-04-11 at 17:01 +0100, Matthew Wilcox wrote:
> > On Fri, Apr 11, 2025 at 05:40:08PM +0200, Miklos Szeredi wrote:
> > > However, hundreds of millions of negative dentries can be created
> > > rather efficiently without unlink, though this one probably doesn't
> > > happen under normal circumstances.
> >
> > Depends on your userspace. Since we don't have union directories,
> > consider the not uncommon case of having a search path A:B:C.
> > Application looks for D in directory A, doesn't find it, creates a
> > negative dentry. Application looks for D in directory B, creates a
> > negative dentry. Application looks for D in directory C, doesn't find
> > it, so it creates it. Now we have two negative dentries and one
> > positive dentry.
>
> If an application does an A:B:C directory search pattern it's usually
> because it doesn't directly own the file location and hence suggests
> that other applications would also be looking for it, which would seem
> to indicate, if the search pattern gets repeated, that the two negative
> dentries do serve a purpose.

Not in this case. It's doing something like looking in /etc/app.d
/usr/share/app/defaults/ and then /var/run/app/ . Don't quote me on
the exact paths, or suggest alternatives based on these names; it's
been a few years since I last looked. But I can assure you no other
app is looking at these dentries; they're looked up exactly once.

> > And for some applications, the name "D" is going to be unique, so the
> > negative dentries have _no_ further use. The application isn't even
> > going to open C/D again. If there's no memory pressure, we can build
> > up billions of dentries. I believe the customer is currently echoing
> > 2 to /proc/sys/vm/drop_caches every hour.
>
> So this is an application that's the sole owner of D (i.e. sole
> controller of the entire path) yet it still does a search for it, why
> is that (if it's something like to update the location, it would be
> better served by first looking in the default location before searching
> others)? The problem is the pattern exactly matches the shared file
> one above so there doesn't seem to be a heuristic way to distinguish
> them.

Everything works fine when there's memory pressure. The problem is
that negative dentry growth is only constrained by available memory;
there's no reclaim of negative dentries which haven't been looked at
in seconds or minutes.

* Re: bad things when too many negative dentries in a directory
2025-04-14 14:30 ` Matthew Wilcox
@ 2025-04-14 15:40 ` James Bottomley
2025-04-14 16:14 ` Matthew Wilcox
0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2025-04-14 15:40 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Mon, 2025-04-14 at 15:30 +0100, Matthew Wilcox wrote:
> On Mon, Apr 14, 2025 at 10:07:09AM -0400, James Bottomley wrote:
> > On Fri, 2025-04-11 at 17:01 +0100, Matthew Wilcox wrote:
> > > On Fri, Apr 11, 2025 at 05:40:08PM +0200, Miklos Szeredi wrote:
> > > > However, hundreds of millions of negative dentries can be
> > > > created rather efficiently without unlink, though this one
> > > > probably doesn't happen under normal circumstances.
> > >
> > > Depends on your userspace. Since we don't have union
> > > directories, consider the not uncommon case of having a search
> > > path A:B:C. Application looks for D in directory A, doesn't find
> > > it, creates a negative dentry. Application looks for D in
> > > directory B, creates a negative dentry. Application looks for D
> > > in directory C, doesn't find it, so it creates it. Now we have
> > > two negative dentries and one positive dentry.
> >
> > If an application does an A:B:C directory search pattern it's
> > usually because it doesn't directly own the file location and hence
> > suggests that other applications would also be looking for it,
> > which would seem to indicate, if the search pattern gets repeated,
> > that the two negative dentries do serve a purpose.
>
> Not in this case. It's doing something like looking in /etc/app.d
> /usr/share/app/defaults/ and then /var/run/app/ . Don't quote me on
> the exact paths, or suggest alternatives based on these names; it's
> been a few years since I last looked. But I can assure you no other
> app is looking at these dentries; they're looked up exactly once.

I got that's what it's doing, and why the negative dentries are useless
since the file name is app specific, I'm just curious why an app that
knows it's the only consumer of a file places it in the last place it
looks rather than the first ... it seems to be suboptimal and difficult
for us to detect heuristically.

Regards,

James

* Re: bad things when too many negative dentries in a directory
2025-04-14 15:40 ` James Bottomley
@ 2025-04-14 16:14 ` Matthew Wilcox
2025-04-14 17:58 ` James Bottomley
0 siblings, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2025-04-14 16:14 UTC (permalink / raw)
To: James Bottomley
Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Mon, Apr 14, 2025 at 11:40:36AM -0400, James Bottomley wrote:
> On Mon, 2025-04-14 at 15:30 +0100, Matthew Wilcox wrote:
> > > If an application does an A:B:C directory search pattern it's
> > > usually because it doesn't directly own the file location and hence
> > > suggests that other applications would also be looking for it,
> > > which would seem to indicate, if the search pattern gets repeated,
> > > that the two negative dentries do serve a purpose.
> >
> > Not in this case. It's doing something like looking in /etc/app.d
> > /usr/share/app/defaults/ and then /var/run/app/ . Don't quote me on
> > the exact paths, or suggest alternatives based on these names; it's
> > been a few years since I last looked. But I can assure you no other
> > app is looking at these dentries; they're looked up exactly once.
>
> I got that's what it's doing, and why the negative dentries are useless
> since the file name is app specific, I'm just curious why an app that
> knows it's the only consumer of a file places it in the last place it
> looks rather than the first ... it seems to be suboptimal and difficult
> for us to detect heuristically.

The first two are read only. One is where the package could have an
override, the second is where the local sysadmin could have an
override. The third is writable. It's not entirely insane.

Another way to solve this would be to notice "hey, this directory only
has three entries and umpteen negative entries, let's do the thing that
ramfs does to tell the dcache that it knows about all positive entries
in this directory and delete all the negative ones". I forget what flag
that is.

* Re: bad things when too many negative dentries in a directory
2025-04-14 16:14 ` Matthew Wilcox
@ 2025-04-14 17:58 ` James Bottomley
2025-04-15 17:22 ` Andreas Dilger
0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2025-04-14 17:58 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Mon, 2025-04-14 at 17:14 +0100, Matthew Wilcox wrote:
[...]
> > I got that's what it's doing, and why the negative dentries are
> > useless since the file name is app specific, I'm just curious why
> > an app that knows it's the only consumer of a file places it in the
> > last place it looks rather than the first ... it seems to be
> > suboptimal and difficult for us to detect heuristically.
>
> The first two are read only. One is where the package could have an
> override, the second is where the local sysadmin could have an
> override. The third is writable. It's not entirely insane.
>
> Another way to solve this would be to notice "hey, this directory
> only has three entries and umpteen negative entries, let's do the
> thing that ramfs does to tell the dcache that it knows about all
> positive entries in this directory and delete all the negative
> ones". I forget what flag that is.

It's not a flag, it's the dentry operations for pseudo filesystems
(simple_lookup sets simple_dentry_operations which provides a d_delete
that always says don't retain). However, that's really because all
pseudo filesystems have a complete dentry cache (all visible files have
dentries), so there's no benefit caching negative lookups (and the
d_delete trick only affects negative dentries because positive ones
have a non zero refcount).

There is a DCACHE_DONTCACHE flag that dumps a dentry (regardless of
positive or negative) on final dput I suppose that could be set in
lookup_open() on negative under some circumstances (open flag, sysctl,
etc.).

Regards,

James

* Re: bad things when too many negative dentries in a directory
2025-04-14 17:58 ` James Bottomley
@ 2025-04-15 17:22 ` Andreas Dilger
2025-04-16 15:18 ` Miklos Szeredi
2025-04-16 15:26 ` James Bottomley
0 siblings, 2 replies; 23+ messages in thread
From: Andreas Dilger @ 2025-04-15 17:22 UTC (permalink / raw)
To: James Bottomley
Cc: Matthew Wilcox, Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Apr 14, 2025, at 11:58 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
> On Mon, 2025-04-14 at 17:14 +0100, Matthew Wilcox wrote:
> [...]
>>> I got that's what it's doing, and why the negative dentries are
>>> useless since the file name is app specific, I'm just curious why
>>> an app that knows it's the only consumer of a file places it in the
>>> last place it looks rather than the first ... it seems to be
>>> suboptimal and difficult for us to detect heuristically.
>>
>> The first two are read only. One is where the package could have an
>> override, the second is where the local sysadmin could have an
>> override. The third is writable. It's not entirely insane.
>>
>> Another way to solve this would be to notice "hey, this directory
>> only has three entries and umpteen negative entries, let's do the
>> thing that ramfs does to tell the dcache that it knows about all
>> positive entries in this directory and delete all the negative
>> ones". I forget what flag that is.
>
> It's not a flag, it's the dentry operations for pseudo filesystems
> (simple_lookup sets simple_dentry_operations which provides a d_delete
> that always says don't retain). However, that's really because all
> pseudo filesystems have a complete dentry cache (all visible files have
> dentries), so there's no benefit caching negative lookups (and the
> d_delete trick only affects negative dentries because positive ones
> have a non zero refcount).
>
> There is a DCACHE_DONTCACHE flag that dumps a dentry (regardless of
> positive or negative) on final dput I suppose that could be set in
> lookup_open() on negative under some circumstances (open flag, sysctl,
> etc.).

Negative dentries are only useful if there are fewer of them than the
number of entries in that directory.

If the negative dentry count exceeds the actual entry count, it would
be more efficient to just cache all of the positive dentries and mark
the directory with a "full dentry list" flag that indicates all of the
names are already present in dcache and any miss is authoritative.
In essence that gives an "infinite" negative lookup cache instead of
explicitly storing all of the possible negative entries.

For directories like ~/bin, /usr/bin, /usr/lib64, etc. (or any
directory) where negative lookups are frequent, it should be possible
to determine this threshold automatically. Once the negative dentry
count exceeds the size of the directory by some factor (e.g. directory
size / 16, or the actual entry count if the filesystem knows this, it
doesn't have to be exactly correct) then a readdir could load all of
the names to fully populate the dcache, and setting the "full dentry
list" flag on the directory would allow dropping all negative dentries
in that directory.

The VFS/VM should avoid dropping directories/dentries from cache in
this case, since it is saving more memory (and avoiding filesystem IO)
to keep them pinned rather than dropping them from cache. There might
need to be a matching "part of full dentry list" flag on the positive
dentries to avoid dcache shrinking of those entries (which would
invalidate the premise that the parent holds all of the possible
entries in that directory), if checking the parent's flag is too
expensive.

Cheers, Andreas
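
The "full dentry list" idea above can be illustrated with a user-space
toy model. Nothing here is kernel code: `ToyDirCache`, its `factor`
parameter and the `complete` attribute are hypothetical names, and
`on_disk` stands in for whatever a real readdir would return. The
threshold used (negative count greater than `factor` times the entry
count) is one possible reading of the "by some factor" wording.

```python
class ToyDirCache:
    """Toy per-directory cache that switches to a complete positive list."""

    def __init__(self, on_disk_names, factor=1):
        self.on_disk = set(on_disk_names)  # stand-in for real directory contents
        self.positive = set()
        self.negative = set()
        self.complete = False              # the "full dentry list" flag
        self.factor = factor

    def lookup(self, name):
        if name in self.positive:
            return True
        if self.complete or name in self.negative:
            return False                   # with `complete` set, any miss is authoritative
        if name in self.on_disk:
            self.positive.add(name)
            return True
        self.negative.add(name)
        if len(self.negative) > self.factor * len(self.on_disk):
            self._populate()
        return False

    def _populate(self):
        # The "readdir" step: cache every real name, then drop all negatives.
        self.positive = set(self.on_disk)
        self.negative.clear()
        self.complete = True
```

Once `complete` is set, further misses cost no memory at all, which is the
"infinite" negative cache effect described in the message. The model
ignores the hard parts raised in the replies (background population,
reclaim of the pinned entries, revalidation on network filesystems).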

* Re: bad things when too many negative dentries in a directory
2025-04-15 17:22 ` Andreas Dilger
@ 2025-04-16 15:18 ` Miklos Szeredi
2025-04-16 15:37 ` Matthew Wilcox
2025-04-16 21:41 ` Dave Chinner
2025-04-16 15:26 ` James Bottomley
1 sibling, 2 replies; 23+ messages in thread
From: Miklos Szeredi @ 2025-04-16 15:18 UTC (permalink / raw)
To: Andreas Dilger
Cc: James Bottomley, Matthew Wilcox, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Tue, 15 Apr 2025 at 19:22, Andreas Dilger <adilger@dilger.ca> wrote:

> If the negative dentry count exceeds the actual entry count, it would
> be more efficient to just cache all of the positive dentries and mark
> the directory with a "full dentry list" flag that indicates all of the
> names are already present in dcache and any miss is authoritative.
> In essence that gives an "infinite" negative lookup cache instead of
> explicitly storing all of the possible negative entries.

This sounds nice in theory, but there are quite a number of things to sort out:

- The "full dir read" needs to be done in the background to avoid
large latencies, right?

- Instantiate inodes during this, or have some dentry flag indicating
that it's to be done later?

- When does the whole directory get reclaimed?

- What about revalidation in netfs? How often should a "full dir
read" get triggered?

I feel that it's just too complex.

What's wrong with just trying to get rid of the bad effects of
negative dentries, instead of getting rid of the dentries themselves?

Lack of memory pressure should mean that nobody else needs that
memory, so it should make no difference if it's used up in negative
dentries instead of being free memory. Maybe I'm missing something
fundamental?

Thanks,
Miklos

* Re: bad things when too many negative dentries in a directory
2025-04-16 15:18 ` Miklos Szeredi
@ 2025-04-16 15:37 ` Matthew Wilcox
2025-04-16 21:41 ` Dave Chinner
1 sibling, 0 replies; 23+ messages in thread
From: Matthew Wilcox @ 2025-04-16 15:37 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Andreas Dilger, James Bottomley, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Wed, Apr 16, 2025 at 05:18:17PM +0200, Miklos Szeredi wrote:
> Lack of memory pressure should mean that nobody else needs that
> memory, so it should make no difference if it's used up in negative
> dentries instead of being free memory. Maybe I'm missing something
> fundamental?

You're missing two things:

- The dentry hash table is a fixed size. Long chains give poor
performance, so polluting the hash table with unused entries has
a cost.

- Eventually, we do trigger reclaim. And then we wait for hours
while the reclaiming process tries to shrink billions of entries.
I think we had a report on one machine of it taking more than 24
hours ("more than" because the customer decided enough was enough
and rebooted the machine).
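
The first point above can be made concrete with a user-space toy model of
a fixed-size chained hash table. `chain_lengths` is a hypothetical helper,
`zlib.crc32` is only a stand-in for the kernel's dentry hash, and the
bucket counts a caller passes in are arbitrary; the sketch just shows that
with a fixed bucket count the average chain grows linearly with the number
of cached names, used or not.

```python
import zlib
from collections import Counter


def chain_lengths(names, nr_buckets):
    """Map bucket index -> chain length for a fixed-size chained hash table."""
    return Counter(zlib.crc32(n.encode()) % nr_buckets for n in names)


def average_chain_length(nr_entries, nr_buckets):
    """Average number of entries per bucket (the load factor)."""
    return nr_entries / nr_buckets
```

As a rough illustration, `average_chain_length(1_200_000_000, 2**24)`
gives the load for the ~1.2G dentries from the original report in a
hypothetical table of 2**24 buckets; the real table size is chosen by the
kernel at boot and is not stated in the thread.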
* Re: bad things when too many negative dentries in a directory
2025-04-16 15:18 ` Miklos Szeredi
2025-04-16 15:37 ` Matthew Wilcox
@ 2025-04-16 21:41 ` Dave Chinner
1 sibling, 0 replies; 23+ messages in thread
From: Dave Chinner @ 2025-04-16 21:41 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Andreas Dilger, James Bottomley, Matthew Wilcox, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Wed, Apr 16, 2025 at 05:18:17PM +0200, Miklos Szeredi wrote:
> On Tue, 15 Apr 2025 at 19:22, Andreas Dilger <adilger@dilger.ca> wrote:
>
> > If the negative dentry count exceeds the actual entry count, it would
> > be more efficient to just cache all of the positive dentries and mark
> > the directory with a "full dentry list" flag that indicates all of the
> > names are already present in dcache and any miss is authoritative.
> > In essence that gives an "infinite" negative lookup cache instead of
> > explicitly storing all of the possible negative entries.
>
> This sounds nice in theory, but there are quite a number of things to sort out:
>
> - The "full dir read" needs to be done in the background to avoid
> large latencies, right?
>
> - Instantiate inodes during this, or have some dentry flag indicating
> that it's to be done later?
>
> - When does the whole directory get reclaimed?
>
> - What about revalidation in netfs? How often should a "full dir
> read" get triggered?
>
> I feel that it's just too complex.
>
> What's wrong with just trying to get rid of the bad effects of
> negative dentries, instead of getting rid of the dentries themselves?
>
> Lack of memory pressure should mean that nobody else needs that
> memory, so it should make no difference if it's used up in negative
> dentries instead of being free memory. Maybe I'm missing something
> fundamental?

There is no issue with the existence of huge numbers of negative
dentries. The issue is that the overhead and latency of reclaiming
hundreds of millions of tiny objects to release the memory is
prohibitive. Dentry reclaim is generally pretty slow, especially if
it is being done by a single background thread like kswapd.

FWIW, I think there is a simpler version of this "per-directory
dentry count" heuristic that might work well enough to bound the
upper maximum: apply the same heuristic to the entire dentry cache.

I'm pretty sure this has been proposed in the past, but we should
probably revisit it anyway because this problem hasn't gone away.

i.e. if the number of negative dentries exceeds the number of
positive dentries and the total number of dentries exceeds a certain
amount of memory, kick a background thread to reap some negative
dentries from the LRU.

e.g. every 30s check if dentries exceed 10% of memory and negative
dentries exceed positive. If so, reap the oldest 10% of negative
dentries.

That will still allow a system with free memory to build up a -lot-
of negative dentries, but also largely bound the amount of free
memory that can be consumed by negative dentries to around 5% of
total memory.

-Dave.
--
Dave Chinner
david@fromorbit.com
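To make the proposed global heuristic easier to reason about, here is a
minimal sketch of just the decision step, written as a userspace toy rather
than kernel code. `DcacheStats`, `negatives_to_reap` and the parameter names
are hypothetical; only the 10% memory threshold, the "negative exceeds
positive" test and the "reap the oldest 10%" amount come from the message
above. The periodic (e.g. 30s) trigger and the actual LRU walk are left out.

```python
from dataclasses import dataclass


@dataclass
class DcacheStats:
    """Snapshot of dentry cache counters (hypothetical structure)."""
    positive: int          # dentries that point at an inode
    negative: int          # dentries recording "this name does not exist"
    dentry_bytes: int      # memory currently held by all dentries
    total_mem_bytes: int   # total system memory


def negatives_to_reap(stats: DcacheStats,
                      mem_percent: int = 10,
                      reap_percent: int = 10) -> int:
    """Return how many of the oldest negative dentries to reap this round.

    Reaping is requested only when both conditions hold:
      1. dentries occupy more than mem_percent of total memory, and
      2. negative dentries outnumber positive ones.
    """
    over_memory = stats.dentry_bytes * 100 > stats.total_mem_bytes * mem_percent
    mostly_negative = stats.negative > stats.positive
    if over_memory and mostly_negative:
        return stats.negative * reap_percent // 100
    return 0


# Example: dentries hold 20% of memory and are overwhelmingly negative.
snapshot = DcacheStats(positive=1_000_000, negative=300_000_000,
                       dentry_bytes=20, total_mem_bytes=100)
print(negatives_to_reap(snapshot))
```

Keeping the check to two cheap global comparisons avoids any per-directory
accounting in the lookup fast path, which is the appeal of this variant.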
* Re: bad things when too many negative dentries in a directory
2025-04-15 17:22 ` Andreas Dilger
2025-04-16 15:18 ` Miklos Szeredi
@ 2025-04-16 15:26 ` James Bottomley
2025-04-22 6:57 ` Andreas Dilger
1 sibling, 1 reply; 23+ messages in thread
From: James Bottomley @ 2025-04-16 15:26 UTC (permalink / raw)
To: Andreas Dilger
Cc: Matthew Wilcox, Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Tue, 2025-04-15 at 11:22 -0600, Andreas Dilger wrote:
[...]
> Negative dentries are only useful if there are fewer than the number
> of entries in that directory.

I agree with this, yes.

> If the negative dentry count exceeds the actual entry count,

Yes, but finding this number is going to be hard. We can't iterate a
directory to count them in the fast path and a directory i_size is
extremely filesystem and format dependent. However, since we only need
a rough count, perhaps having the filesystem export its average
directory entry size and simply dividing by that would give a good
enough approximation to the number?

> it would be more efficient to just cache all of the positive dentries
> and mark the directory with a "full dentry list" flag that indicates
> all of the names are already present in dcache and any miss is
> authoritative. In essence that gives an "infinite" negative lookup
> cache instead of explicitly storing all of the possible negative
> entries.

Practically, I think directories with that flag would probably
automatically retain positive child dentries as an addition to our
retain_dentry() logic and automatically kill negative ones.

This behaviour, though, would remove them from the shrinkers, so
probably there would have to be a global count of the number of
unshrinkable children this gives us and have that factor into the
superblock shrinkers somehow. Probably add the parent to the lru list
but make dentry_lru_isolate() always skip until the tipping point for
shrinking filled directories is reached?

> For directories like ~/bin, /usr/bin, /usr/lib64, etc. (or any
> directory) where negative lookups are frequent, it should be possible
> to determine this threshold automatically. Once the negative dentry
> count exceeds the size of the directory by some factor (e.g.
> directory size / 16, or the actual entry count if the filesystem
> knows this, it doesn't have to be exactly correct) then a readdir
> could load all of the names to fully populate the dcache and set the
> "full dentry list" flag on the directory would allow dropping all
> negative dentries in that directory.

All this supposes we have some per directory count of the negative
dentries. I think there'd be push back on adding this to struct dentry
and making it an exact count in the fast path. The next logical place
to evaluate it would be the shrinkers but then that wouldn't solve
Matthew's use case where the shrinkers don't get activated. I suppose
some flag that userspace could add to directories it identifies as hot
might be the next best thing?

> The VFS/VM should avoid dropping directories/dentries from cache in
> this case, since it is saving more memory (and avoiding filesystem
> IO) to keep them pinned rather than dropping them from cache. There
> might need to be a matching "part of full dentry list" flag on the
> positive dentries to avoid dcache shrinking of those entries (which
> would invalidate the premise that the parent holds all of the
> possible entries in that directory), if checking the parent's flag is
> too expensive.

As I said above, I think simply checking the parent flags in
retain_dentry should do. Since you don't need it to be exact and the
parent should have a positive refcount, it should be possible to do a
READ_ONCE rather than locking.

Regards,

James
* Re: bad things when too many negative dentries in a directory
2025-04-16 15:26 ` James Bottomley
@ 2025-04-22 6:57 ` Andreas Dilger
0 siblings, 0 replies; 23+ messages in thread
From: Andreas Dilger @ 2025-04-22 6:57 UTC (permalink / raw)
To: James Bottomley
Cc: Matthew Wilcox, Miklos Szeredi, Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara, Ian Kent

On Apr 16, 2025, at 9:26 AM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
> On Tue, 2025-04-15 at 11:22 -0600, Andreas Dilger wrote:
> [...]
>> Negative dentries are only useful if there are fewer than the number
>> of entries in that directory.
>
> I agree with this, yes.
>
>> If the negative dentry count exceeds the actual entry count,
>
> Yes, but finding this number is going to be hard. We can't iterate a
> directory to count them in the fast path and a directory i_size is
> extremely filesystem and format dependent.

This depends. Some filesystems will store the actual number of entries
in the directory, or it can be estimated based on the number of blocks
in the directory.

> However, since we only need a rough count, perhaps having the filesystem
> export its average directory entry size and simply dividing by that
> would give a good enough approximation to the number?

I would suggest adding an inode method that can be called on the
directory to request the (estimated) number of entries in a directory.
If the fs has a good idea of this it can return that number, or it can
estimate based on allocated blocks. It does not need to be exact, but
provides an upper bound on the useful number of negative dcache entries
to keep.

>> it would be more efficient to just cache all of the positive dentries
>> and mark the directory with a "full dentry list" flag that indicates
>> all of the names are already present in dcache and any miss is
>> authoritative. In essence that gives an "infinite" negative lookup
>> cache instead of explicitly storing all of the possible negative
>> entries.
>
> Practically, I think directories with that flag would probably
> automatically retain positive child dentries as an addition to our
> retain_dentry() logic and automatically kill negative ones.
>
> This behaviour, though, would remove them from the shrinkers, so
> probably there would have to be a global count of the number of
> unshrinkable children this gives us and have that factor into the
> superblock shrinkers somehow. Probably add the parent to the lru list
> but make dentry_lru_isolate() always skip until the tipping point for
> shrinking filled directories is reached?

It's true that this flag would (generally) remove the directory and its
immediate children from the dcache shrinkers. However, the point of a
shrinker is to reduce memory usage, and shrinking such a directory, so
that it can no longer guarantee that all positive dentries are cached
(and hence that no negative dentries are needed), would generally
*increase* memory usage in the end.

I could imagine that such directories would eventually be reaped, but
it should be much harder to do so. For example, every negative lookup
in such a directory should refresh it in the LRU, since the parent
dentry prevented a negative entry from being added to the dcache.

>> For directories like ~/bin, /usr/bin, /usr/lib64, etc. (or any
>> directory) where negative lookups are frequent, it should be possible
>> to determine this threshold automatically. Once the negative dentry
>> count exceeds the size of the directory by some factor (e.g.
>> directory size / 16, or the actual entry count if the filesystem
>> knows this, it doesn't have to be exactly correct) then a readdir
>> could load all of the names to fully populate the dcache and set the
>> "full dentry list" flag on the directory would allow dropping all
>> negative dentries in that directory.
>
> All this supposes we have some per directory count of the negative
> dentries. I think there'd be push back on adding this to struct dentry
> and making it an exact count in the fast path. The next logical place
> to evaluate it would be the shrinkers but then that wouldn't solve
> Matthew's use case where the shrinkers don't get activated. I suppose
> some flag that userspace could add to directories it identifies as hot
> might be the next best thing?

No. Kernel memory management shouldn't be dependent on userspace doing
the right thing, and no userspace would ever be taught to consistently
set such a flag.

Again, the numbers don't have to be exact, but if negative dcache is 2x
the number of dir entries (or e.g. 1000 more as a directory gets
larger) then it is time to change to caching only positive entries.

Having the negative dcache be directly linked to the parent would be
fine too. It doesn't make sense to cache negative dentries longer than
the parent, and an upper bound on how many negative entries can exist
on a directory avoids the need to shrink them independently.

If there is lots of memory pressure on the dcache then directories with
inactive negative dentries would eventually be reaped, and even "full
dentry list" directories would eventually come around for shrinking if
they were inactive for a long time.

>> The VFS/VM should avoid dropping directories/dentries from cache in
>> this case, since it is saving more memory (and avoiding filesystem
>> IO) to keep them pinned rather than dropping them from cache. There
>> might need to be a matching "part of full dentry list" flag on the
>> positive dentries to avoid dcache shrinking of those entries (which
>> would invalidate the premise that the parent holds all of the
>> possible entries in that directory), if checking the parent's flag is
>> too expensive.
>
> As I said above, I think simply checking the parent flags in
> retain_dentry should do. Since you don't need it to be exact and the
> parent should have a positive refcount, it should be possible to do a
> READ_ONCE rather than locking.

Cheers, Andreas
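The per-directory threshold discussed in this message can be sketched as a
small toy model. Everything here is an assumption made for illustration:
`estimate_entries` stands in for the proposed (not existing) inode method,
the 32-byte average entry size is invented, and combining the "2x" and
"1000 more" rules with `min()` is only one possible reading of the text.

```python
def estimate_entries(dir_size_bytes: int, avg_entry_bytes: int) -> int:
    """Rough entry count: directory size divided by an average entry size.

    Stand-in for a filesystem-provided estimate; it does not need to be exact.
    """
    if avg_entry_bytes <= 0:
        raise ValueError("avg_entry_bytes must be positive")
    return dir_size_bytes // avg_entry_bytes


def should_switch_to_full_list(negative_count: int, est_entries: int,
                               factor: int = 2, slack: int = 1000) -> bool:
    """Decide whether to stop caching negatives and cache all positives.

    Small directories use the multiplicative bound (factor * entries);
    large directories hit the additive bound (entries + slack) first.
    """
    threshold = min(factor * est_entries, est_entries + slack)
    return negative_count > threshold


# A 4 KiB directory with a hypothetical 32-byte average entry size.
entries = estimate_entries(4096, 32)
print(entries, should_switch_to_full_list(300, entries))
```

Both inputs are approximate by design, so the result is a hint about when
to populate the "full dentry list", not an exact accounting.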
* Re: bad things when too many negative dentries in a directory
2025-04-11 15:40 ` Miklos Szeredi
2025-04-11 16:01 ` Matthew Wilcox
@ 2025-04-14 6:28 ` Ian Kent
2025-04-14 7:17 ` Miklos Szeredi
1 sibling, 1 reply; 23+ messages in thread
From: Ian Kent @ 2025-04-14 6:28 UTC (permalink / raw)
To: Miklos Szeredi, Christian Brauner
Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara

On 11/4/25 23:40, Miklos Szeredi wrote:
> On Fri, 11 Apr 2025 at 16:47, Christian Brauner <brauner@kernel.org> wrote:
>
>> Note that we have a new sysctl:
>>
>> /proc/sys/fs/dentry-negative
>>
>> that can be used to control the negative dentry policy because any
>> generic change that we tried to make has always resulted in unacceptable
>> regressions for someone's workload. Currently we only allow it to be set
>> to 1 (default 0). If set to 1 it will not create negative dentries
>> during unlink. If that's sufficient than recommend this to users that
>> suffer from this problem if not consider adding another sensitive
>> policy.
> Okay, I'll forward that info.
>
> However, hundreds of millions of negative dentries can be created
> rather efficiently without unlink, though this one probably doesn't
> happen under normal circumstances. Allowing this to starve the
> scheduler for an arbitrary long time is not a good idea in any case,
> so the fsnotify problem needs some other solution, and I suspect that
> it's not to disable negative caching completely, as that would be a
> major bummer.

I know that the most recent case we have seen of this would probably
be resolved by the sysctl, but this was not the first recent case we
had. Unfortunately I can't remember the details; all I remember is
that it was similar but not quite the same.

In any case it's quite possible that many files are processed, opened
and then closed, but not unlinked. So I think it's worth considering
this as well.

>
> But the idea of leaving negative dentries off d_children is
> independent of caching policy. The lookup cache would work fine
> without d_sib being chained, it only needs careful thought in
>
> 1) putting the dentry on d_children when it's turned into positive
> 2) getting the dentry off d_children when it's turned into negative.

That shouldn't be too difficult to do ... sounds like a good idea to
me.

Ian
* Re: bad things when too many negative dentries in a directory
2025-04-14 6:28 ` Ian Kent
@ 2025-04-14 7:17 ` Miklos Szeredi
0 siblings, 0 replies; 23+ messages in thread
From: Miklos Szeredi @ 2025-04-14 7:17 UTC (permalink / raw)
To: Ian Kent
Cc: Christian Brauner, linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara

On Mon, 14 Apr 2025 at 08:28, Ian Kent <raven@themaw.net> wrote:

> > 1) putting the dentry on d_children when it's turned into positive
> > 2) getting the dentry off d_children when it's turned into negative.
>
> That shouldn't be too difficult to do ... sounds like a good idea to me.

I hadn't reckoned with parent pointers. While not actually
dereferenced, they are compared on cache lookup. So if the parent is
removed and a directory dentry is recreated with the same pointer,
the cache becomes corrupted.

Keeping the parent alive while any negative child dentries remain
doesn't sound too difficult, e.g. we'd need an additional refcount
that is incremented in the parent on child unlink and decremented on
child reclaim. But that's more space in struct dentry and more
complexity...

Thanks,
Miklos
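The parent-pointer hazard and the extra refcount described above can be
illustrated with a toy model. This is not how the kernel dcache is
implemented: `ToyDcache` and its methods are hypothetical, addresses are
plain integers, and the only point is to show why a lookup keyed on
(parent pointer, name) needs the parent pinned while negative children
remain.

```python
NEGATIVE = None  # cached value meaning "this name is known not to exist"


class ToyDcache:
    """Lookup cache keyed on (parent address, name), like a pointer compare."""

    def __init__(self):
        self.table = {}         # (parent_addr, name) -> inode number or NEGATIVE
        self.neg_children = {}  # parent_addr -> count of negative children (extra refcount)

    def add_negative(self, parent_addr, name):
        # Corresponds to "incremented in the parent on child unlink".
        self.table[(parent_addr, name)] = NEGATIVE
        self.neg_children[parent_addr] = self.neg_children.get(parent_addr, 0) + 1

    def reclaim_negative(self, parent_addr, name):
        # Corresponds to "decremented on child reclaim".
        del self.table[(parent_addr, name)]
        self.neg_children[parent_addr] -= 1

    def can_free_parent(self, parent_addr):
        # Freeing (and later reusing) the address is only safe once no
        # negative child still refers to it.
        return self.neg_children.get(parent_addr, 0) == 0

    def lookup(self, parent_addr, name):
        """Return (hit, value); a hit with NEGATIVE means 'does not exist'."""
        key = (parent_addr, name)
        return key in self.table, self.table.get(key)


cache = ToyDcache()
cache.add_negative(0x1000, "tmp123")
print(cache.can_free_parent(0x1000))   # parent is pinned while the child exists
cache.reclaim_negative(0x1000, "tmp123")
print(cache.can_free_parent(0x1000))   # now the address may be reused safely
```

Without the pin, a new directory allocated at address 0x1000 would inherit
a stale "tmp123 does not exist" answer, which is the corruption described.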
* Re: bad things when too many negative dentries in a directory
2025-04-11 14:47 ` Christian Brauner
2025-04-11 15:40 ` Miklos Szeredi
@ 2025-04-12 1:48 ` Ian Kent
2025-04-12 1:56 ` Ian Kent
2025-04-12 6:31 ` Ian Kent
1 sibling, 2 replies; 23+ messages in thread
From: Ian Kent @ 2025-04-12 1:48 UTC (permalink / raw)
To: Christian Brauner, Miklos Szeredi
Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara

On 11/4/25 22:47, Christian Brauner wrote:
> On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
>> There are reports of soflockups in fsnotify if there are large numbers
>> of negative dentries (e.g. ~300M) in a directory. This can happen if
>> lots of temp files are created and removed and there's not enough
>> memory pressure to trigger the lru shrinker.
>>
>> These are on old kernels and some of this is possibly due to missing
>> 172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I
>> managed to reproduce the softlockup on a recent kernel in
>> fsnotify_set_children_dentry_flags() (see end of mail).
>>
>> This was with ~1.2G negative dentries. Doing "rmdir testdir"
>> afterwards does not trigger the softlockup detector, due to the
>> reschedules in shrink_dcache_parent() code, but it took 10 minutes(!)
>> to finish removing that empty directory.
>>
>> So I wonder, do we really want negative dentries on ->d_children?
>> Except for shrink_dcache_parent() I don't see any uses. And it's also
>> a question whether shrinking negative dentries is useful or not. If
>> they've been around for so long that hundreds of millions of them
>> could accumulate and that memory wasn't needed by anybody, then it
>> shouldn't make a big difference if they kept hanging around. On
>> umount, at the latest, the lru list can be used to kill everything,
>> AFAICT.
>>
>> I'm curious if this is the right path? Any better ideas?
> Note that we have a new sysctl:
>
> /proc/sys/fs/dentry-negative
>
> that can be used to control the negative dentry policy because any
> generic change that we tried to make has always resulted in unacceptable
> regressions for someone's workload. Currently we only allow it to be set
> to 1 (default 0). If set to 1 it will not create negative dentries
> during unlink. If that's sufficient than recommend this to users that
> suffer from this problem if not consider adding another sensitive
> policy.

Interesting, I wasn't sure how the negative dentries were accumulating,
but I didn't actually look at the unlink code (I'll take a look). I
thought the most likely cause was laziness in not unlinking temporary
files (the file names in question "looked" like temporary file names).

When I do look at unlink I suspect I'll find the VFS is justified in
caching these and that the responsibility lies (or should lie) with the
file system callback to unhash the dentry if it doesn't want this
caching ... but the file system always doing this is not ideal either
... maybe we need a hint so that the relevant file system callbacks can
make this decision for themselves.

Ian

>
>> Thanks,
>> Miklos
>>
>>
>> [96789.366007] watchdog: BUG: soft lockup - CPU#79 stuck for 26s!
>> [fanotify4:52805] >> [96789.373396] Modules linked in: rfkill mlx5_ib ib_uverbs macsec >> ib_core vfat fat mlx5_core acpi_ipmi ast ipmi_ssif arm_spe_pmu igb >> mlxfw psample i2c_algo_bit tls pci_hyperv_intf ipmi_devintf >> ipmi_msghandler arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq loop >> fuse nfnetlink xfs nvme crct10dif_ce ghash_ce sha2_ce sha256_arm64 >> nvme_core sha1_ce sbsa_gwdt nvme_auth i2c_designware_platform >> i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod >> [96789.413624] CPU: 79 UID: 0 PID: 52805 Comm: fanotify4 Kdump: loaded >> Not tainted 6.12.0-55.9.1.el10_0.aarch64 #1 >> [96789.423698] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS >> F31n (SCP: 2.10.20220810) 09/30/2022 >> [96789.432990] pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) >> [96789.439939] pc : fsnotify_set_children_dentry_flags+0x80/0xf0 >> [96789.445675] lr : fsnotify_set_children_dentry_flags+0xa4/0xf0 >> [96789.451408] sp : ffff8000cc77b8c0 >> [96789.454710] x29: ffff8000cc77b8c0 x28: 0000000000000001 x27: 0000000000000000 >> [96789.461833] x26: ffff07ff8463dc50 x25: ffff080e6e44dc50 x24: 0000000000000001 >> [96789.468956] x23: ffff07ff9d94eec0 x22: ffff07fff2cf01b8 x21: ffff07ff9d94ee40 >> [96789.476079] x20: ffff0800eb6dff40 x19: ffff0800eb6df2c0 x18: 0000000000000014 >> [96789.483202] x17: 00000000cec6e315 x16: 00000000ed365140 x15: 00000000ae8684a4 >> [96789.490325] x14: 000000000d831309 x13: 00000000387d7ee0 x12: 0000000000000000 >> [96789.497448] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffc3bacc1864bc >> [96789.504570] x8 : 000000001007ffff x7 : ffffc3bace89a4c0 x6 : 0000000000000001 >> [96789.511694] x5 : 0000000008000020 x4 : 0000000000000000 x3 : 0000000000000003 >> [96789.518816] x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffff0800eb6df358 >> [96789.525939] Call trace: >> [96789.528373] fsnotify_set_children_dentry_flags+0x80/0xf0 >> [96789.533759] fsnotify_recalc_mask.part.0+0x94/0xc8 >> 
[96789.538538] fsnotify_recalc_mask+0x1c/0x40 >> [96789.542709] fanotify_add_mark+0x15c/0x360 >> [96789.546794] do_fanotify_mark+0x3c0/0x7a0 >> [96789.550791] __arm64_sys_fanotify_mark+0x30/0x60 >> [96789.555396] invoke_syscall.constprop.0+0x74/0xd0 >> [96789.560090] do_el0_svc+0xb0/0xe8 >> [96789.563393] el0_svc+0x44/0x1d0 >> [96789.566525] el0t_64_sync_handler+0x120/0x130 >> [96789.570870] el0t_64_sync+0x1a4/0x1a8 >> [151513.714945] INFO: task (ostnamed):77658 blocked for more than 122 seconds. >> [151513.721903] Tainted: G L ------- --- >> 6.12.0-55.9.1.el10_0.aarch64 #1 >> [151513.730334] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. >> [151513.738241] task:(ostnamed) state:D stack:0 pid:77658 >> tgid:77658 ppid:1 flags:0x00000205 >> [151513.747625] Call trace: >> [151513.750146] __switch_to+0xec/0x148 >> [151513.753712] __schedule+0x234/0x738 >> [151513.757278] schedule+0x3c/0xe0 >> [151513.760493] schedule_preempt_disabled+0x2c/0x58 >> [151513.765188] rwsem_down_write_slowpath+0x1e4/0x720 >> [151513.770054] down_write+0xac/0xc0 >> [151513.773444] do_lock_mount+0x3c/0x220 >> [151513.777185] path_mount+0x378/0x810 >> [151513.780748] __arm64_sys_mount+0x158/0x2d8 >> [151513.784921] invoke_syscall.constprop.0+0x74/0xd0 >> [151513.789702] do_el0_svc+0xb0/0xe8 >> [151513.793093] el0_svc+0x44/0x1d0 >> [151513.796312] el0t_64_sync_handler+0x120/0x130 >> [151513.800744] el0t_64_sync+0x1a4/0x1a8 ^ permalink raw reply [flat|nested] 23+ messages in thread
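For the sysctl discussed in this message, a small userspace helper can
report the current policy. This is a sketch under stated assumptions: only
the path `/proc/sys/fs/dentry-negative` and the meaning of the values (0 is
the default, 1 suppresses negative dentries on unlink) come from the thread;
the function name is hypothetical, and kernels without the sysctl simply
have no such file.

```python
from pathlib import Path

# Path taken from the thread; absent on kernels that lack the sysctl.
DENTRY_NEGATIVE = Path("/proc/sys/fs/dentry-negative")


def read_negative_dentry_policy(path: Path = DENTRY_NEGATIVE):
    """Return the sysctl value as an int, or None if the file does not exist."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


policy = read_negative_dentry_policy()
if policy is None:
    print("fs.dentry-negative is not available on this kernel")
else:
    print(f"fs.dentry-negative = {policy}")
```

Changing the value requires root and is done by writing to the same file.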
* Re: bad things when too many negative dentries in a directory
2025-04-12 1:48 ` Ian Kent
@ 2025-04-12 1:56 ` Ian Kent
2025-04-12 6:31 ` Ian Kent
1 sibling, 0 replies; 23+ messages in thread
From: Ian Kent @ 2025-04-12 1:56 UTC (permalink / raw)
To: Christian Brauner, Miklos Szeredi
Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara

On 12/4/25 09:48, Ian Kent wrote:
>
> On 11/4/25 22:47, Christian Brauner wrote:
>> On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
>>> There are reports of soflockups in fsnotify if there are large numbers
>>> of negative dentries (e.g. ~300M) in a directory. This can happen if
>>> lots of temp files are created and removed and there's not enough
>>> memory pressure to trigger the lru shrinker.
>>>
>>> These are on old kernels and some of this is possibly due to missing
>>> 172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I
>>> managed to reproduce the softlockup on a recent kernel in
>>> fsnotify_set_children_dentry_flags() (see end of mail).
>>>
>>> This was with ~1.2G negative dentries. Doing "rmdir testdir"
>>> afterwards does not trigger the softlockup detector, due to the
>>> reschedules in shrink_dcache_parent() code, but it took 10 minutes(!)
>>> to finish removing that empty directory.
>>>
>>> So I wonder, do we really want negative dentries on ->d_children?
>>> Except for shrink_dcache_parent() I don't see any uses. And it's also
>>> a question whether shrinking negative dentries is useful or not. If
>>> they've been around for so long that hundreds of millions of them
>>> could accumulate and that memory wasn't needed by anybody, then it
>>> shouldn't make a big difference if they kept hanging around. On
>>> umount, at the latest, the lru list can be used to kill everything,
>>> AFAICT.
>>>
>>> I'm curious if this is the right path? Any better ideas?
>> Note that we have a new sysctl:
>>
>> /proc/sys/fs/dentry-negative
>>
>> that can be used to control the negative dentry policy because any
>> generic change that we tried to make has always resulted in unacceptable
>> regressions for someone's workload. Currently we only allow it to be set
>> to 1 (default 0). If set to 1 it will not create negative dentries
>> during unlink. If that's sufficient than recommend this to users that
>> suffer from this problem if not consider adding another sensitive
>> policy.
>
> Interesting, I wasn't sure how the negative dentries were accumulating,
> but I didn't actually look at the unlink code (I'll take a look). I
> thought the most likely cause was laziness in not unlinking temporary
> files (the file names in question "looked" like temporary file names).
>
> When I do look at unlink I suspect I'll find the VFS is justified in
> caching these and that the responsibility lies (or should lie) with the
> file system callback to unhash the dentry if it doesn't want this
> caching ... but the file system always doing this is not ideal either
> ... maybe we need a hint so that the relevant file system callbacks can
> make this decision for themselves.

Crikey, I thought I had seen something like this in the VFS.

We already have a hint, DCACHE_DONTCACHE, and an exported VFS function
to handle it, d_mark_dontcache(), with several file systems using it.

I'll keep looking ...

>
> Ian
>
>>
>>> Thanks,
>>> Miklos
>>>
>>>
>>> [96789.366007] watchdog: BUG: soft lockup - CPU#79 stuck for 26s!
>>> [...]
* Re: bad things when too many negative dentries in a directory 2025-04-12 1:48 ` Ian Kent 2025-04-12 1:56 ` Ian Kent @ 2025-04-12 6:31 ` Ian Kent 1 sibling, 0 replies; 23+ messages in thread From: Ian Kent @ 2025-04-12 6:31 UTC (permalink / raw) To: Christian Brauner, Miklos Szeredi Cc: linux-fsdevel, Al Viro, Amir Goldstein, Jan Kara On 12/4/25 09:48, Ian Kent wrote: > > On 11/4/25 22:47, Christian Brauner wrote: >> On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote: >>> There are reports of soflockups in fsnotify if there are large numbers >>> of negative dentries (e.g. ~300M) in a directory. This can happen if >>> lots of temp files are created and removed and there's not enough >>> memory pressure to trigger the lru shrinker. >>> >>> These are on old kernels and some of this is possibly due to missing >>> 172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I >>> managed to reproduce the softlockup on a recent kernel in >>> fsnotify_set_children_dentry_flags() (see end of mail). >>> >>> This was with ~1.2G negative dentries. Doing "rmdir testdir" >>> afterwards does not trigger the softlockup detector, due to the >>> reschedules in shrink_dcache_parent() code, but it took 10 minutes(!) >>> to finish removing that empty directory. >>> >>> So I wonder, do we really want negative dentries on ->d_children? >>> Except for shrink_dcache_parent() I don't see any uses. And it's also >>> a question whether shrinking negative dentries is useful or not. If >>> they've been around for so long that hundreds of millions of them >>> could accumulate and that memory wasn't needed by anybody, then it >>> shouldn't make a big difference if they kept hanging around. On >>> umount, at the latest, the lru list can be used to kill everything, >>> AFAICT. >>> >>> I'm curious if this is the right path? Any better ideas? 
>> Note that we have a new sysctl:
>>
>> /proc/sys/fs/dentry-negative
>>
>> that can be used to control the negative dentry policy because any
>> generic change that we tried to make has always resulted in unacceptable
>> regressions for someone's workload. Currently we only allow it to be set
>> to 1 (default 0). If set to 1 it will not create negative dentries
>> during unlink. If that's sufficient than recommend this to users that
>> suffer from this problem if not consider adding another sensitive
>> policy.
>
> Interesting, I wasn't sure how the negative dentries were accumulating but
> I didn't actually look at the unlink code (I'll take a look). I thought the
> most likely cause was laziness not unlinking temporary files (the file names
> in question "looked" like temporary file names).
>
> When I do look at unlink I suspect I'll find the VFS is justified in caching
> these and the responsibility (or should) lies with the file system call back
> to unhash the dentry if it doesn't want this caching ... but the file system
> always doing this is not ideal either ... maybe we need a hint so that the
> relevant file system callbacks can make this decision for themselves.
But I didn't find this to be the case at all.
Assuming the customer application is behaving sensibly and calling unlink()
on the files that are no longer needed (for temporary files that should be
the case) then unhashing the dentry before final dput() will indeed result
in the dentry being discarded.
It looks like all we need is e6957c99dca5f ("vfs: Add a sysctl for automated
deletion of dentry") and that looks like it will apply cleanly to the kernel
we are concerned with. It will be interesting to test this to see if the
application is actually behaving.
>
> Ian
>
>>
>>> Thanks,
>>> Miklos
>>>
>>>
>>> [96789.366007] watchdog: BUG: soft lockup - CPU#79 stuck for 26s!
>>> [fanotify4:52805]
>>> [96789.373396] Modules linked in: rfkill mlx5_ib ib_uverbs macsec
>>> ib_core vfat fat mlx5_core acpi_ipmi ast ipmi_ssif arm_spe_pmu igb
>>> mlxfw psample i2c_algo_bit tls pci_hyperv_intf ipmi_devintf
>>> ipmi_msghandler arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq loop
>>> fuse nfnetlink xfs nvme crct10dif_ce ghash_ce sha2_ce sha256_arm64
>>> nvme_core sha1_ce sbsa_gwdt nvme_auth i2c_designware_platform
>>> i2c_designware_core xgene_hwmon dm_mirror dm_region_hash dm_log dm_mod
>>> [96789.413624] CPU: 79 UID: 0 PID: 52805 Comm: fanotify4 Kdump: loaded
>>> Not tainted 6.12.0-55.9.1.el10_0.aarch64 #1
>>> [96789.423698] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS
>>> F31n (SCP: 2.10.20220810) 09/30/2022
>>> [96789.432990] pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS
>>> BTYPE=--)
>>> [96789.439939] pc : fsnotify_set_children_dentry_flags+0x80/0xf0
>>> [96789.445675] lr : fsnotify_set_children_dentry_flags+0xa4/0xf0
>>> [96789.451408] sp : ffff8000cc77b8c0
>>> [96789.454710] x29: ffff8000cc77b8c0 x28: 0000000000000001 x27:
>>> 0000000000000000
>>> [96789.461833] x26: ffff07ff8463dc50 x25: ffff080e6e44dc50 x24:
>>> 0000000000000001
>>> [96789.468956] x23: ffff07ff9d94eec0 x22: ffff07fff2cf01b8 x21:
>>> ffff07ff9d94ee40
>>> [96789.476079] x20: ffff0800eb6dff40 x19: ffff0800eb6df2c0 x18:
>>> 0000000000000014
>>> [96789.483202] x17: 00000000cec6e315 x16: 00000000ed365140 x15:
>>> 00000000ae8684a4
>>> [96789.490325] x14: 000000000d831309 x13: 00000000387d7ee0 x12:
>>> 0000000000000000
>>> [96789.497448] x11: 0000000000000001 x10: 0000000000000001 x9 :
>>> ffffc3bacc1864bc
>>> [96789.504570] x8 : 000000001007ffff x7 : ffffc3bace89a4c0 x6 :
>>> 0000000000000001
>>> [96789.511694] x5 : 0000000008000020 x4 : 0000000000000000 x3 :
>>> 0000000000000003
>>> [96789.518816] x2 : 0000000000000001 x1 : 0000000000000000 x0 :
>>> ffff0800eb6df358
>>> [96789.525939] Call trace:
>>> [96789.528373] fsnotify_set_children_dentry_flags+0x80/0xf0
>>> [96789.533759] fsnotify_recalc_mask.part.0+0x94/0xc8
>>> [96789.538538] fsnotify_recalc_mask+0x1c/0x40
>>> [96789.542709] fanotify_add_mark+0x15c/0x360
>>> [96789.546794] do_fanotify_mark+0x3c0/0x7a0
>>> [96789.550791] __arm64_sys_fanotify_mark+0x30/0x60
>>> [96789.555396] invoke_syscall.constprop.0+0x74/0xd0
>>> [96789.560090] do_el0_svc+0xb0/0xe8
>>> [96789.563393] el0_svc+0x44/0x1d0
>>> [96789.566525] el0t_64_sync_handler+0x120/0x130
>>> [96789.570870] el0t_64_sync+0x1a4/0x1a8
>>> [151513.714945] INFO: task (ostnamed):77658 blocked for more than
>>> 122 seconds.
>>> [151513.721903] Tainted: G L ------- ---
>>> 6.12.0-55.9.1.el10_0.aarch64 #1
>>> [151513.730334] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>> [151513.738241] task:(ostnamed) state:D stack:0 pid:77658
>>> tgid:77658 ppid:1 flags:0x00000205
>>> [151513.747625] Call trace:
>>> [151513.750146] __switch_to+0xec/0x148
>>> [151513.753712] __schedule+0x234/0x738
>>> [151513.757278] schedule+0x3c/0xe0
>>> [151513.760493] schedule_preempt_disabled+0x2c/0x58
>>> [151513.765188] rwsem_down_write_slowpath+0x1e4/0x720
>>> [151513.770054] down_write+0xac/0xc0
>>> [151513.773444] do_lock_mount+0x3c/0x220
>>> [151513.777185] path_mount+0x378/0x810
>>> [151513.780748] __arm64_sys_mount+0x158/0x2d8
>>> [151513.784921] invoke_syscall.constprop.0+0x74/0xd0
>>> [151513.789702] do_el0_svc+0xb0/0xe8
>>> [151513.793093] el0_svc+0x44/0x1d0
>>> [151513.796312] el0t_64_sync_handler+0x120/0x130
>>> [151513.800744] el0t_64_sync+0x1a4/0x1a8
>
^ permalink raw reply [flat|nested] 23+ messages in thread
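[Editorial sketch, not part of the thread.] The message above suggests testing whether the /proc/sys/fs/dentry-negative sysctl changes how negative dentries accumulate under a create-and-unlink workload. A minimal way to observe that from userspace might look like the following. It assumes the six-field layout of /proc/sys/fs/dentry-state documented for recent kernels, where the fifth field is nr_negative (older kernels report 0 there), and it treats a missing dentry-negative file as "sysctl not available". The function names parse_dentry_state, read_negative_policy and churn are hypothetical helpers introduced here for illustration only.

```python
import os
import tempfile
from pathlib import Path

# Paths as named in the thread and in the kernel's sysctl documentation.
DENTRY_STATE = Path("/proc/sys/fs/dentry-state")
DENTRY_NEGATIVE = Path("/proc/sys/fs/dentry-negative")

# Assumed field order of dentry-state on recent kernels.
FIELDS = ("nr_dentry", "nr_unused", "age_limit",
          "want_pages", "nr_negative", "dummy")


def parse_dentry_state(text):
    """Map the whitespace-separated counters to their field names."""
    values = [int(v) for v in text.split()]
    return dict(zip(FIELDS, values))


def read_negative_policy():
    """Return the dentry-negative value, or None if it cannot be read."""
    try:
        return int(DENTRY_NEGATIVE.read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


def churn(directory, count):
    """Create and immediately unlink `count` files, like a temp-file workload."""
    for i in range(count):
        path = os.path.join(directory, f"tmp-{i}")
        with open(path, "w"):
            pass
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as testdir:
        before = parse_dentry_state(DENTRY_STATE.read_text())
        churn(testdir, 100_000)
        after = parse_dentry_state(DENTRY_STATE.read_text())
        print("dentry-negative policy:", read_negative_policy())
        print("nr_negative before:", before["nr_negative"])
        print("nr_negative after: ", after["nr_negative"])
```

Running it once with the sysctl at 0 and once at 1 (on a kernel that has it) would give a rough comparison. The counter is system-wide, so other activity on the machine adds noise to the difference.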
* Re: bad things when too many negative dentries in a directory
2025-04-11 9:40 bad things when too many negative dentries in a directory Miklos Szeredi
2025-04-11 14:47 ` Christian Brauner
@ 2025-04-11 21:02 ` Mateusz Guzik
2025-04-20 4:49 ` Al Viro
2 siblings, 0 replies; 23+ messages in thread
From: Mateusz Guzik @ 2025-04-11 21:02 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, Al Viro, Christian Brauner, Amir Goldstein, Jan Kara, Ian Kent
On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
> There are reports of soflockups in fsnotify if there are large numbers
> of negative dentries (e.g. ~300M) in a directory. This can happen if
> lots of temp files are created and removed and there's not enough
> memory pressure to trigger the lru shrinker.
>
> These are on old kernels and some of this is possibly due to missing
> 172e422ffea2 ("fsnotify: clear PARENT_WATCHED flags lazily"), but I
> managed to reproduce the softlockup on a recent kernel in
> fsnotify_set_children_dentry_flags() (see end of mail).
>
> This was with ~1.2G negative dentries. Doing "rmdir testdir"
> afterwards does not trigger the softlockup detector, due to the
> reschedules in shrink_dcache_parent() code, but it took 10 minutes(!)
> to finish removing that empty directory.
>
I wrote about this some time ago:
https://lore.kernel.org/linux-fsdevel/f7bp3ggliqbb7adyysonxgvo6zn76mo4unroagfcuu3bfghynu@7wkgqkfb5c43/#t
bottom line is only a small subset of negative entries is useful in the
long run while a great policy to tame the total count while not
hindering performance is left as an exercise for the reader(tm), I
outlined something which should be *tolerable*.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: bad things when too many negative dentries in a directory
2025-04-11 9:40 bad things when too many negative dentries in a directory Miklos Szeredi
2025-04-11 14:47 ` Christian Brauner
2025-04-11 21:02 ` Mateusz Guzik
@ 2025-04-20 4:49 ` Al Viro
2025-05-08 15:45 ` Miklos Szeredi
2 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2025-04-20 4:49 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, Christian Brauner, Amir Goldstein, Jan Kara, Ian Kent
On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
> Except for shrink_dcache_parent() I don't see any uses. And it's also
> a question whether shrinking negative dentries is useful or not.
One-word answer: umount.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: bad things when too many negative dentries in a directory
2025-04-20 4:49 ` Al Viro
@ 2025-05-08 15:45 ` Miklos Szeredi
0 siblings, 0 replies; 23+ messages in thread
From: Miklos Szeredi @ 2025-05-08 15:45 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Christian Brauner, Amir Goldstein, Jan Kara, Ian Kent
On Sun, 20 Apr 2025 at 06:49, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Fri, Apr 11, 2025 at 11:40:28AM +0200, Miklos Szeredi wrote:
>
> > Except for shrink_dcache_parent() I don't see any uses. And it's also
> > a question whether shrinking negative dentries is useful or not.
>
> One-word answer: umount.
shrink_dcache_sb() should work fine in that situation. The only thing
it can't do is hunt down spurious references to dentries, but that's a
debug thing and not something that is needed in production.
Am I missing something?
Thanks,
Miklos
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2025-05-08 15:46 UTC | newest]
Thread overview: 23+ messages
2025-04-11 9:40 bad things when too many negative dentries in a directory Miklos Szeredi
2025-04-11 14:47 ` Christian Brauner
2025-04-11 15:40 ` Miklos Szeredi
2025-04-11 16:01 ` Matthew Wilcox
2025-04-14 14:07 ` James Bottomley
2025-04-14 14:30 ` Matthew Wilcox
2025-04-14 15:40 ` James Bottomley
2025-04-14 16:14 ` Matthew Wilcox
2025-04-14 17:58 ` James Bottomley
2025-04-15 17:22 ` Andreas Dilger
2025-04-16 15:18 ` Miklos Szeredi
2025-04-16 15:37 ` Matthew Wilcox
2025-04-16 21:41 ` Dave Chinner
2025-04-16 15:26 ` James Bottomley
2025-04-22 6:57 ` Andreas Dilger
2025-04-14 6:28 ` Ian Kent
2025-04-14 7:17 ` Miklos Szeredi
2025-04-12 1:48 ` Ian Kent
2025-04-12 1:56 ` Ian Kent
2025-04-12 6:31 ` Ian Kent
2025-04-11 21:02 ` Mateusz Guzik
2025-04-20 4:49 ` Al Viro
2025-05-08 15:45 ` Miklos Szeredi