* [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file @ 2024-05-11 2:27 Yafang Shao 2024-05-11 2:53 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Yafang Shao @ 2024-05-11 2:27 UTC (permalink / raw) To: viro, brauner, jack Cc: linux-fsdevel, Yafang Shao, Wangkai, Colin Walters, Waiman Long, Linus Torvalds Our applications, built on Elasticsearch[0], frequently create and delete files. These applications operate within containers, some with a memory limit exceeding 100GB. Over prolonged periods, the accumulation of negative dentries within these containers can amount to tens of gigabytes. Upon container exit, directories are deleted. However, due to the numerous associated dentries, this process can be time-consuming. Our users have expressed frustration with this prolonged exit duration, which constitutes our first issue. Simultaneously, other processes may attempt to access the parent directory of the Elasticsearch directories. Since the task responsible for deleting the dentries holds the inode lock, processes attempting directory lookup experience significant delays. This issue, our second problem, is easily demonstrated: - Task 1 generates negative dentries: $ pwd ~/test $ mkdir es && cd es/ && ./create_and_delete_files.sh [ After generating tens of GB dentries ] $ cd ~/test && rm -rf es [ It will take a long duration to finish ] - Task 2 attempts to lookup the 'test/' directory $ pwd ~/test $ ls The 'ls' command in Task 2 experiences prolonged execution as Task 1 is deleting the dentries. We've devised a solution to address both issues by deleting associated dentry when removing a file. Interestingly, we've noted that a similar patch was proposed years ago[1], although it was rejected citing the absence of tangible issues caused by negative dentries. Given our current challenges, we're resubmitting the proposal. All relevant stakeholders from previous discussions have been included for reference. [0]. https://github.com/elastic/elasticsearch [1]. https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: Wangkai <wangkai86@huawei.com> Cc: Jan Kara <jack@suse.cz> Cc: Colin Walters <walters@verbum.org> Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> --- fs/dcache.c | 4 ++++ include/linux/dcache.h | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index 71a8e943a0fa..4b97f60f0e64 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -701,6 +701,9 @@ static inline bool retain_dentry(struct dentry *dentry, bool locked) if (unlikely(d_flags & DCACHE_DONTCACHE)) return false; + if (unlikely(dentry->d_flags & DCACHE_FILE_DELETED)) + return false; + // At this point it looks like we ought to keep it. We also might // need to do something - put it on LRU if it wasn't there already // and mark it referenced if it was on LRU, but not marked yet. 
@@ -2392,6 +2395,7 @@ void d_delete(struct dentry * dentry) spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); } + dentry->d_flags |= DCACHE_FILE_DELETED; } EXPORT_SYMBOL(d_delete); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index bf53e3894aae..55a69682918c 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -210,7 +210,7 @@ struct dentry_operations { #define DCACHE_NOKEY_NAME BIT(25) /* Encrypted name encoded without key */ #define DCACHE_OP_REAL BIT(26) - +#define DCACHE_FILE_DELETED BIT(27) /* File is deleted */ #define DCACHE_PAR_LOOKUP BIT(28) /* being looked up (with parent locked shared) */ #define DCACHE_DENTRY_CURSOR BIT(29) #define DCACHE_NORCU BIT(30) /* No RCU delay for freeing */ -- 2.39.1 ^ permalink raw reply related [flat|nested] 43+ messages in thread
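A side note for anyone reproducing this: the growth of negative dentries can be watched from userspace while the create/delete loop runs by polling /proc/sys/fs/dentry-state. The field order assumed below (nr_dentry, nr_unused, age_limit, want_pages, nr_negative, dummy) is my reading of recent kernels - on older kernels the nr_negative slot is just a zero - so treat the layout as an assumption rather than a guarantee.

/*
 * Minimal sketch: print the dentry counters so negative-dentry growth
 * can be observed while the reproducer runs.
 */
#include <stdio.h>

int main(void)
{
	long nr_dentry, nr_unused, age_limit, want_pages, nr_negative, dummy;
	FILE *f = fopen("/proc/sys/fs/dentry-state", "r");

	if (!f) {
		perror("/proc/sys/fs/dentry-state");
		return 1;
	}
	if (fscanf(f, "%ld %ld %ld %ld %ld %ld",
		   &nr_dentry, &nr_unused, &age_limit,
		   &want_pages, &nr_negative, &dummy) != 6) {
		fprintf(stderr, "unexpected dentry-state format\n");
		fclose(f);
		return 1;
	}
	printf("dentries: %ld total, %ld unused (LRU), %ld negative\n",
	       nr_dentry, nr_unused, nr_negative);
	fclose(f);
	return 0;
}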
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 2:27 [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file Yafang Shao @ 2024-05-11 2:53 ` Linus Torvalds 2024-05-11 3:35 ` Yafang Shao 2024-05-11 3:36 ` [RFC PATCH] fs: dcache: " Al Viro 0 siblings, 2 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 2:53 UTC (permalink / raw) To: Yafang Shao Cc: viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Fri, 10 May 2024 at 19:28, Yafang Shao <laoar.shao@gmail.com> wrote: > > We've devised a solution to address both issues by deleting associated > dentry when removing a file. This patch is buggy. You are modifying d_flags outside the locked region. So at a minimum, the DCACHE_FILE_DELETED bit setting would need to just go into the if (dentry->d_lockref.count == 1) { side of the conditional, since the other side of that conditional already unhashes the dentry which makes this all moot anyway. That said, I think it's buggy in another way too: what if somebody else looks up the dentry before it actually gets unhashed? Then you have another ref to it, and the dentry might live long enough that it then gets re-used for a newly created file (which is why we have those negative dentries in the first place). So you'd have to clear the DCACHE_FILE_DELETED if the dentry is then made live by a file creation or rename or whatever. So that d_flags thing is actually pretty complicated. But since you made all this unconditional anyway, I think having a new dentry flag is unnecessary in the first place, and I suspect you are better off just unhashing the dentry unconditionally instead. IOW, I think the simpler patch is likely just something like this: --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2381,6 +2381,7 @@ void d_delete(struct dentry * dentry) spin_lock(&inode->i_lock); spin_lock(&dentry->d_lock); + __d_drop(dentry); /* * Are we the only user? */ @@ -2388,7 +2389,6 @@ void d_delete(struct dentry * dentry) dentry->d_flags &= ~DCACHE_CANT_MOUNT; dentry_unlink_inode(dentry); } else { - __d_drop(dentry); spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); } although I think Al needs to ACK this, and I suspect that unhashing the dentry also makes that dentry->d_flags &= ~DCACHE_CANT_MOUNT; pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT just won't matter). I do worry that there are loads that actually love our current behavior, but maybe it's worth doing the simple unconditional "make d_delete() always unhash" and only worry about whether that causes performance problems for people who commonly create a new file in its place when we get such a report. IOW, the more complex thing might be to actually take other behavior into account (eg "do we have so many negative dentries that we really don't want to create new ones"). Al - can you please step in and tell us what else I've missed, and why my suggested version of the patch is also broken garbage? Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
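For concreteness, the first fix described above - setting the flag under d_lock, and only on the sole-user side of the conditional - would look roughly like the sketch below. This is only an illustration of the locking point, not a proposed patch: it still has the reuse problem Linus points out (nothing clears DCACHE_FILE_DELETED if the dentry is made live again by a later create or rename), which is why the unconditional __d_drop() variant ends up simpler.

/*
 * Sketch only: the flag update moved inside the locked region, on the
 * "we are the only user" path.  The other path already unhashes the
 * dentry via __d_drop(), so the flag would be pointless there.
 */
void d_delete(struct dentry *dentry)
{
	struct inode *inode = dentry->d_inode;

	spin_lock(&inode->i_lock);
	spin_lock(&dentry->d_lock);
	/*
	 * Are we the only user?
	 */
	if (dentry->d_lockref.count == 1) {
		dentry->d_flags &= ~DCACHE_CANT_MOUNT;
		dentry->d_flags |= DCACHE_FILE_DELETED;	/* now under ->d_lock */
		dentry_unlink_inode(dentry);
	} else {
		__d_drop(dentry);
		spin_unlock(&dentry->d_lock);
		spin_unlock(&inode->i_lock);
	}
}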
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 2:53 ` Linus Torvalds @ 2024-05-11 3:35 ` Yafang Shao 2024-05-11 4:54 ` Waiman Long 2024-05-15 2:18 ` [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file Yafang Shao 2024-05-11 3:36 ` [RFC PATCH] fs: dcache: " Al Viro 1 sibling, 2 replies; 43+ messages in thread From: Yafang Shao @ 2024-05-11 3:35 UTC (permalink / raw) To: Linus Torvalds Cc: viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Sat, May 11, 2024 at 10:54 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Fri, 10 May 2024 at 19:28, Yafang Shao <laoar.shao@gmail.com> wrote: > > > > We've devised a solution to address both issues by deleting associated > > dentry when removing a file. > > This patch is buggy. You are modifying d_flags outside the locked region. > > So at a minimum, the DCACHE_FILE_DELETED bit setting would need to > just go into the > > if (dentry->d_lockref.count == 1) { > > side of the conditional, since the other side of that conditional > already unhashes the dentry which makes this all moot anyway. > > That said, I think it's buggy in another way too: what if somebody > else looks up the dentry before it actually gets unhashed? Then you > have another ref to it, and the dentry might live long enough that it > then gets re-used for a newly created file (which is why we have those > negative dentries in the first place). > > So you'd have to clear the DCACHE_FILE_DELETED if the dentry is then > made live by a file creation or rename or whatever. > > So that d_flags thing is actually pretty complicated. > > But since you made all this unconditional anyway, I think having a new > dentry flag is unnecessary in the first place, and I suspect you are > better off just unhashing the dentry unconditionally instead. > > IOW, I think the simpler patch is likely just something like this: It's simpler. I used to contemplate handling it that way, but lack the knowledge and courage to proceed, hence I opted for the d_flags solution. I'll conduct tests on the revised change. Appreciate your suggestion. > > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -2381,6 +2381,7 @@ void d_delete(struct dentry * dentry) > > spin_lock(&inode->i_lock); > spin_lock(&dentry->d_lock); > + __d_drop(dentry); > /* > * Are we the only user? > */ > @@ -2388,7 +2389,6 @@ void d_delete(struct dentry * dentry) > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > dentry_unlink_inode(dentry); > } else { > - __d_drop(dentry); > spin_unlock(&dentry->d_lock); > spin_unlock(&inode->i_lock); > } > > although I think Al needs to ACK this, and I suspect that unhashing > the dentry also makes that > > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > > pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT > just won't matter). > > I do worry that there are loads that actually love our current > behavior, but maybe it's worth doing the simple unconditional "make > d_delete() always unhash" and only worry about whether that causes > performance problems for people who commonly create a new file in its > place when we get such a report. > > IOW, the more complex thing might be to actually take other behavior > into account (eg "do we have so many negative dentries that we really > don't want to create new ones"). This poses a substantial challenge. Despite recurrent discussions within the community about improving negative dentry over and over, there hasn't been a consensus on how to address it. 
> > Al - can you please step in and tell us what else I've missed, and why > my suggested version of the patch is also broken garbage? > > Linus -- Regards Yafang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 3:35 ` Yafang Shao @ 2024-05-11 4:54 ` Waiman Long 2024-05-11 15:58 ` Matthew Wilcox 2024-05-15 2:18 ` [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file Yafang Shao 1 sibling, 1 reply; 43+ messages in thread From: Waiman Long @ 2024-05-11 4:54 UTC (permalink / raw) To: Yafang Shao, Linus Torvalds Cc: viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters On 5/10/24 23:35, Yafang Shao wrote: >> pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT >> just won't matter). >> >> I do worry that there are loads that actually love our current >> behavior, but maybe it's worth doing the simple unconditional "make >> d_delete() always unhash" and only worry about whether that causes >> performance problems for people who commonly create a new file in its >> place when we get such a report. >> >> IOW, the more complex thing might be to actually take other behavior >> into account (eg "do we have so many negative dentries that we really >> don't want to create new ones"). > This poses a substantial challenge. Despite recurrent discussions > within the community about improving negative dentry over and over, > there hasn't been a consensus on how to address it. I had suggested in the past to have a sysctl parameter to set a threshold on how many negative dentries (as a percentage of total system memory) are regarded as too many and activate some processes to control dentry creation. The hard part is to discard the oldest negative dentries as we can have many memcg LRU lists to look at and it can be a time consuming process. We could theoretically start removing dentries when removing files when 50% of the threshold is reached. At 100%, we start discarding old negative dentries. At 150%, we stop negative dentry creation altogether. My 2 cents. Cheers, Longman ^ permalink raw reply [flat|nested] 43+ messages in thread
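A rough sketch of that tiered policy, purely for illustration - the names and the helper are hypothetical, the threshold would come from the proposed sysctl (a percentage of total memory), and the genuinely hard part Waiman mentions (finding and discarding the oldest negative dentries across many memcg LRU lists) is exactly what a check like this does not solve:

/* Hypothetical policy helper mapping the current negative-dentry count
 * to one of the escalation levels described above. */
enum neg_dentry_action {
	NEG_DENTRY_OK,		/* below 50% of threshold: do nothing */
	NEG_DENTRY_PRUNE,	/* >= 50%: drop dentries when files are removed */
	NEG_DENTRY_RECLAIM,	/* >= 100%: start discarding old negative dentries */
	NEG_DENTRY_BLOCK,	/* >= 150%: stop creating negative dentries */
};

static enum neg_dentry_action neg_dentry_policy(unsigned long nr_negative,
						unsigned long threshold)
{
	if (nr_negative >= threshold + threshold / 2)
		return NEG_DENTRY_BLOCK;
	if (nr_negative >= threshold)
		return NEG_DENTRY_RECLAIM;
	if (nr_negative >= threshold / 2)
		return NEG_DENTRY_PRUNE;
	return NEG_DENTRY_OK;
}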
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 4:54 ` Waiman Long @ 2024-05-11 15:58 ` Matthew Wilcox 2024-05-11 16:07 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Matthew Wilcox @ 2024-05-11 15:58 UTC (permalink / raw) To: Waiman Long Cc: Yafang Shao, Linus Torvalds, viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters On Sat, May 11, 2024 at 12:54:18AM -0400, Waiman Long wrote: > I had suggested in the past to have a sysctl parameter to set a threshold on > how many negative dentries (as a percentage of total system memory) are Yes, but that's obviously bogus. System memory is completely uncorrelated with the optimum number of negative dentries to keep around for workload performance. Some multiple of "number of positive dentries" might make sense. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 15:58 ` Matthew Wilcox @ 2024-05-11 16:07 ` Linus Torvalds 2024-05-11 16:13 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 16:07 UTC (permalink / raw) To: Matthew Wilcox Cc: Waiman Long, Yafang Shao, viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters On Sat, 11 May 2024 at 08:59, Matthew Wilcox <willy@infradead.org> wrote: > > Some multiple of "number of positive dentries" might make sense. The thing is, it would probably have to be per parent-dentry to be useful, since that is typically what causes latency concerns: not "global negative dentries", but "negative dentries that have to be flushed for this parent as it is deleted". And you'd probably need to make the child list be some kind of LRU list to make this all effective. Which is all likely rather expensive in both CPU time (stats tend to cause lots of dirty caches) and in memory (the dentry would grow for both stats and for the d_children list - it would almost certainly have to change from a 'struct hlist_head' to a 'struct list_head' so that you can put things either at the to the beginning or end). I dunno. I haven't looked at just *how* nasty it would be. Maybe it wouldn't be as bad as I suspect it would be. Now, the other option might be to just make the latency concerns smaller. It's not like removing negative dentries is very costly per se. I think the issue has always been the dcache_lock, not the work to remove the dentries themselves. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
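The data-structure point is worth spelling out: an hlist has no O(1) tail insertion, so keeping a parent's children in rough LRU order (freshly used entries at one end, stale negative ones at the other) needs a full list_head, which is where the extra memory cost comes from. A toy illustration with made-up names, not the real struct dentry layout:

#include <linux/list.h>

/* Toy illustration only. */
struct parent_children {
	struct list_head children;	/* doubly linked, so it has a usable tail */
};

static void child_mark_used(struct parent_children *p, struct list_head *child)
{
	/* O(1) move to the "recently used" end - not possible with an hlist */
	list_move_tail(child, &p->children);
}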
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 16:07 ` Linus Torvalds @ 2024-05-11 16:13 ` Linus Torvalds 2024-05-11 18:05 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 16:13 UTC (permalink / raw) To: Matthew Wilcox Cc: Waiman Long, Yafang Shao, viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters On Sat, 11 May 2024 at 09:07, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Now, the other option might be to just make the latency concerns > smaller. It's not like removing negative dentries is very costly per > se. I think the issue has always been the dcache_lock, not the work to > remove the dentries themselves. Actually, going back to re-read this particular report, at least this time it was the inode lock of the parent, not the dcache_lock. But the point ends up being the same - lots of negative dentries aren't necessarily a problem in themselves, because the common operation that matters is the hash lookup, which scales fairly well. They mainly tend to become a problem when they hold up something else. So better batching, or maybe just walking the negative child dentry list after having marked the parent dead and then released the lock, might also be the solution. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 16:13 ` Linus Torvalds @ 2024-05-11 18:05 ` Linus Torvalds 2024-05-11 18:26 ` [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 18:05 UTC (permalink / raw) To: Matthew Wilcox Cc: Waiman Long, Yafang Shao, viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters On Sat, 11 May 2024 at 09:13, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So better batching, or maybe just walking the negative child dentry > list after having marked the parent dead and then released the lock, > might also be the solution. IOW, maybe the solution is something as simple as this UNTESTED patch instead? --- a/fs/namei.c +++ b/fs/namei.c @@ -4207,16 +4207,19 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) goto out; - shrink_dcache_parent(dentry); dentry->d_inode->i_flags |= S_DEAD; dont_mount(dentry); detach_mounts(dentry); + inode_unlock(dentry->d_inode); + + shrink_dcache_parent(dentry); + dput(dentry); + d_delete_notify(dir, dentry); + return 0; out: inode_unlock(dentry->d_inode); dput(dentry); - if (!error) - d_delete_notify(dir, dentry); return error; } EXPORT_SYMBOL(vfs_rmdir); where that "shrink_dcache_parent()" will still be quite expensive if the directory has a ton of negative dentries, but because we now free them after we've marked the dentry dead and released the inode, nobody much cares any more? Note (as usual) that this is untested, and maybe I haven't thought everything through and I might be missing some important detail. But I think shrinking the children later should be just fine - there's nothing people can *do* with them. At worst they are reachable for some lookup that doesn't take the locks, but that will just result in a negative dentry which is all good, since they cannot become positive dentries any more at this point. So I think the inode lock is irrelevant for negative dentries, and shrinking them outside the lock feels like a very natural thing to do. Again: this is more of a "brainstorming patch" than an actual suggestion. Yafang - it might be worth testing for your load, but please do so knowing that it might have some consistency issues, so don't test it on any production machinery, please ;) Can anybody point out why I'm being silly, and the above change is completely broken garbage? Please use small words to point out my mental deficiencies to make sure I get it. Just to clarify: this obviously will *not* speed up the actual rmdir itself. *ALL* it does is to avoid holding the lock over the potentially long cleanup operation. So the latency of the rmdir is still the same, but now it should no longer matter for anything else. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 18:05 ` Linus Torvalds @ 2024-05-11 18:26 ` Linus Torvalds 2024-05-11 18:42 ` Linus Torvalds 2024-05-11 19:24 ` [PATCH] " Al Viro 0 siblings, 2 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 18:26 UTC (permalink / raw) To: torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, viro, walters, wangkai86, willy Yafang Shao reports that he has seen loads that generate billions of negative dentries in a directory, which then when the directory is removed causes excessive latencies for other users because the dentry shrinking is done under the directory inode lock. There seems to be no actual reason for holding the inode lock any more by the time we get rid of the now uninteresting negative dentries, and it's an effect of just code layout (the shared error path). Reorganize the code trivially to just have a separate success path, which simplifies the code (since 'd_delete_notify()' is only called in the success path anyway) and makes it trivial to just move the dentry shrinking outside the inode lock. Reported-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/all/20240511022729.35144-1-laoar.shao@gmail.com/ Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- Ok, this is the same patch, just with a commit message. And I actually am running a kernel with that patch, so it compiles and boots. Very limited testing: I created a directory with ten million negative dentries, and then did a "rmdir" in one terminal window while doing a "ls" inside that directory in another to kind of reproduce (on a smaller scale) what Yafang was reporting. The "ls" was not affected at all. But honestly, this was *ONE* single trivial test, so it's almost completely worthless. fs/namei.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 28e62238346e..474b1ee3266d 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -4217,16 +4217,19 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) goto out; - shrink_dcache_parent(dentry); dentry->d_inode->i_flags |= S_DEAD; dont_mount(dentry); detach_mounts(dentry); + inode_unlock(dentry->d_inode); + + shrink_dcache_parent(dentry); + dput(dentry); + d_delete_notify(dir, dentry); + return 0; out: inode_unlock(dentry->d_inode); dput(dentry); - if (!error) - d_delete_notify(dir, dentry); return error; } EXPORT_SYMBOL(vfs_rmdir); -- 2.44.0.330.g4d18c88175 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 18:26 ` [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' Linus Torvalds @ 2024-05-11 18:42 ` Linus Torvalds 2024-05-11 19:28 ` Al Viro 2024-05-11 20:02 ` [PATCH v2] " Linus Torvalds 2024-05-11 19:24 ` [PATCH] " Al Viro 1 sibling, 2 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 18:42 UTC (permalink / raw) To: torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, viro, walters, wangkai86, willy On Sat, 11 May 2024 at 11:29, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Reorganize the code trivially to just have a separate success path, > which simplifies the code (since 'd_delete_notify()' is only called in > the success path anyway) and makes it trivial to just move the dentry > shrinking outside the inode lock. Bah. I think this might need more work. The *caller* of vfs_rmdir() also holds a lock, ie we have do_rmdir() doing inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags); ... error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); dput(dentry); inode_unlock(path.dentry->d_inode); so we have another level of locking going on, and my patch only moved the dcache pruning outside the lock of the directory we're removing (not outside the lock of the directory that contains the removed directory). And that outside lock is the much more important one, I bet. So I still think this approach may be the right one, but that patch of mine didn't go far enough. Sadly, while do_rmdir() itself is trivial to fix up to do this, we have several other users of vfs_rmdir() (ecryptfs, devtmpfs, overlayfs in addition to nfsd and ksmbd), so the more complete patch would be noticeably bigger. My bad. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 18:42 ` Linus Torvalds @ 2024-05-11 19:28 ` Al Viro 2024-05-11 19:55 ` Linus Torvalds 2024-05-12 15:45 ` James Bottomley 2024-05-11 20:02 ` [PATCH v2] " Linus Torvalds 1 sibling, 2 replies; 43+ messages in thread From: Al Viro @ 2024-05-11 19:28 UTC (permalink / raw) To: Linus Torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > so we have another level of locking going on, and my patch only moved > the dcache pruning outside the lock of the directory we're removing > (not outside the lock of the directory that contains the removed > directory). > > And that outside lock is the much more important one, I bet. ... and _that_ is where taking d_delete outside of the lock might take an unpleasant analysis of a lot of code. We have places where we assume that holding the parent locked will prevent ->d_inode changes of children. It might be possible to get rid of that, but it will take a non-trivial amount of work. In any case, I think the original poster said that parent directories were not removed, so I doubt that rmdir() behaviour is relevant for their load. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 19:28 ` Al Viro @ 2024-05-11 19:55 ` Linus Torvalds 2024-05-11 20:31 ` Al Viro 2024-05-12 15:45 ` James Bottomley 1 sibling, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 19:55 UTC (permalink / raw) To: Al Viro Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy [-- Attachment #1: Type: text/plain, Size: 2457 bytes --] On Sat, 11 May 2024 at 12:28, Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > > > > And that outside lock is the much more important one, I bet. > > ... and _that_ is where taking d_delete outside of the lock might > take an unpleasant analysis of a lot of code. Hmm. It really shouldn't matter. There can only be negative children of the now deleted directory, so there are no actual effects on inodes. It only affects the d_child list, which is protected by d_lock (and can be modified outside of the inode lock anyway due to memory pressure). What am I missing? > In any case, I think the original poster said that parent directories > were not removed, so I doubt that rmdir() behaviour is relevant for > their load. I don't see that at all. The load was a "rm -rf" of a directory tree, and all of that was successful as far as I can see from the report. The issue was that an unrelated process just looking at the directory (either one - I clearly tested the wrong one) would be held up by the directory lock while the pruning was going on. And yes, the pruning can take a couple of seconds with "just" a few million negative dentries. The negative dentries obviously don't even have to be the result of a 'delete' - the easy way to see this is to do a lot of bogus lookups. Attached is my excessively stupid test-case in case you want to play around with it: [dirtest]$ time ./a.out dir ; time rmdir dir real 0m12.592s user 0m1.153s sys 0m11.412s real 0m1.892s user 0m0.001s sys 0m1.528s so you can see how it takes almost two seconds to then flush those negative dentries - even when there were no 'unlink()' calls at all, just failed lookups. It's maybe instructive to do the same on tmpfs, which has /* * Retaining negative dentries for an in-memory filesystem just wastes * memory and lookup time: arrange for them to be deleted immediately. */ int always_delete_dentry(const struct dentry *dentry) { return 1; } and so if you do the same test on /tmp, the results are very different: [dirtest]$ time ./a.out /tmp/sillydir ; time rmdir /tmp/sillydir real 0m8.129s user 0m1.164s sys 0m6.592s real 0m0.001s user 0m0.000s sys 0m0.001s so it does show very different patterns and you can test the whole "what happens without negative dentries" case. Linus [-- Attachment #2: main.c --] [-- Type: text/x-csrc, Size: 575 bytes --] #include <unistd.h> #include <fcntl.h> #include <sys/stat.h> #include <string.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> #define FATAL(x) do { if (x) die(#x); } while (0) static void die(const char *s) { fprintf(stderr, "%s: %s\n", s, strerror(errno)); exit(1); } int main(int argc, char ** argv) { char *dirname = argv[1]; FATAL(argc < 2); FATAL(mkdir(dirname, 0700)); for (int i = 0; i < 10000000; i++) { int fd; char name[128]; snprintf(name, sizeof(name), "%s/name-%09d", dirname, i); FATAL(open(name, O_RDONLY) >= 0); } return 0; } ^ permalink raw reply [flat|nested] 43+ messages in thread
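For completeness: as far as I recall, tmpfs gets that behaviour simply by pointing the superblock's default dentry_operations at a ->d_delete that always returns 1 (the simple_dentry_operations / always_delete_dentry pair in fs/libfs.c), so a filesystem that never wants negative dentries can already opt out the same way. A rough sketch with hypothetical names, since the exact wiring should be double-checked against mm/shmem.c:

#include <linux/dcache.h>
#include <linux/fs.h>

/* ->d_delete returning 1 tells dput() not to retain the dentry once it
 * turns negative. */
static int example_always_delete_dentry(const struct dentry *dentry)
{
	return 1;
}

static const struct dentry_operations example_dentry_ops = {
	.d_delete = example_always_delete_dentry,
};

/* Called from the filesystem's fill_super; s_d_op is applied to every
 * dentry the filesystem allocates afterwards. */
static void example_disable_negative_dentries(struct super_block *sb)
{
	sb->s_d_op = &example_dentry_ops;
}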
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 19:55 ` Linus Torvalds @ 2024-05-11 20:31 ` Al Viro 2024-05-11 21:17 ` Al Viro 0 siblings, 1 reply; 43+ messages in thread From: Al Viro @ 2024-05-11 20:31 UTC (permalink / raw) To: Linus Torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sat, May 11, 2024 at 12:55:29PM -0700, Linus Torvalds wrote: > On Sat, 11 May 2024 at 12:28, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > > > > > > And that outside lock is the much more important one, I bet. > > > > ... and _that_ is where taking d_delete outside of the lock might > > take an unpleasant analysis of a lot of code. > > Hmm. It really shouldn't matter. There can only be negative children > of the now deleted directory, so there are no actual effects on > inodes. > > It only affects the d_child list, which is protected by d_lock (and > can be modified outside of the inode lock anyway due to memory > pressure). > > What am I missing? fsnotify and related fun, basically. I'll need to redo the analysis, but IIRC there had been places where correctness had been guaranteed by the fact that this had been serialized by the lock on parent. No idea how easy would it be to adapt to that change; I'll check, but it'll take a few days, I'm afraid. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 20:31 ` Al Viro @ 2024-05-11 21:17 ` Al Viro 0 siblings, 0 replies; 43+ messages in thread From: Al Viro @ 2024-05-11 21:17 UTC (permalink / raw) To: Linus Torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sat, May 11, 2024 at 09:31:43PM +0100, Al Viro wrote: > On Sat, May 11, 2024 at 12:55:29PM -0700, Linus Torvalds wrote: > > On Sat, 11 May 2024 at 12:28, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > > > On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > > > > > > > > And that outside lock is the much more important one, I bet. > > > > > > ... and _that_ is where taking d_delete outside of the lock might > > > take an unpleasant analysis of a lot of code. > > > > Hmm. It really shouldn't matter. There can only be negative children > > of the now deleted directory, so there are no actual effects on > > inodes. > > > > It only affects the d_child list, which is protected by d_lock (and > > can be modified outside of the inode lock anyway due to memory > > pressure). > > > > What am I missing? > > fsnotify and related fun, basically. I'll need to redo the analysis, > but IIRC there had been places where correctness had been guaranteed > by the fact that this had been serialized by the lock on parent. As an aside, I'd really love to see d_rehash() gone - the really old nest of users is gone (used to be in nfs), but there's still a weird corner case in exfat + a bunch in AFS. Life would be simpler if those had been gone - many correctness proofs around dcache have unpleasant warts dealing with that crap. Not relevant in this case, but it makes analysis harder in general... ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 19:28 ` Al Viro 2024-05-11 19:55 ` Linus Torvalds @ 2024-05-12 15:45 ` James Bottomley 2024-05-12 16:16 ` Al Viro 1 sibling, 1 reply; 43+ messages in thread From: James Bottomley @ 2024-05-12 15:45 UTC (permalink / raw) To: Al Viro, Linus Torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sat, 2024-05-11 at 20:28 +0100, Al Viro wrote: > On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > > > so we have another level of locking going on, and my patch only > > moved > > the dcache pruning outside the lock of the directory we're removing > > (not outside the lock of the directory that contains the removed > > directory). > > > > And that outside lock is the much more important one, I bet. > > ... and _that_ is where taking d_delete outside of the lock might > take an unpleasant analysis of a lot of code. Couldn't you obviate this by doing it from a workqueue? Even if the directory is recreated, the chances are most of the negative dentries that were under it will still exist and be removable by the time the workqueue runs. James ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 15:45 ` James Bottomley @ 2024-05-12 16:16 ` Al Viro 2024-05-12 19:59 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Al Viro @ 2024-05-12 16:16 UTC (permalink / raw) To: James Bottomley Cc: Linus Torvalds, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sun, May 12, 2024 at 11:45:44AM -0400, James Bottomley wrote: > On Sat, 2024-05-11 at 20:28 +0100, Al Viro wrote: > > On Sat, May 11, 2024 at 11:42:34AM -0700, Linus Torvalds wrote: > > > > > so we have another level of locking going on, and my patch only > > > moved > > > the dcache pruning outside the lock of the directory we're removing > > > (not outside the lock of the directory that contains the removed > > > directory). > > > > > > And that outside lock is the much more important one, I bet. > > > > ... and _that_ is where taking d_delete outside of the lock might > > take an unpleasant analysis of a lot of code. > > Couldn't you obviate this by doing it from a workqueue? Even if the > directory is recreated, the chances are most of the negative dentries > that were under it will still exist and be removable by the time the > workqueue runs. Eviction in general - sure. shrink_dcache_parent() in particular... not really - you'd need to keep the dentry pinned for that, and that'd cause all kinds of fun for umount. d_delete() - even worse (you don't want dcache lookups to find that sucker after rmdir(2) returned success to userland). ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 16:16 ` Al Viro @ 2024-05-12 19:59 ` Linus Torvalds 2024-05-12 20:29 ` Linus Torvalds 2024-05-13 5:31 ` Al Viro 0 siblings, 2 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-12 19:59 UTC (permalink / raw) To: Al Viro Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sun, 12 May 2024 at 09:16, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Eviction in general - sure > shrink_dcache_parent() in particular... not really - you'd need to keep > dentry pinned for that and that'd cause all kinds of fun for umount > d_delete() - even worse (you don't want dcache lookups to find that > sucker after rmdir(2) returned success to userland). All of those dentries *should* be negative dentries, so I really don't see the worry. For example, any child dentries will obviously still elevate the dentry count on the parent, so I really think we could actually skip the dentry shrink *entirely* and it wouldn't really have any semantic effect. IOW, I really think the "shrink dentries on rmdir" is a "let's get rid of pointless children that will never be used" than a semantic issue - cleaning things up and releasing memory, rather than about behavior. We will have already marked the inode dead, and called d_delete() on the directory: dentry->d_inode->i_flags |= S_DEAD; dont_mount(dentry); detach_mounts(dentry); ... if (!error) d_delete_notify(dir, dentry); // does d_delete(dentry) so the removed directory entry itself will have either turned into a negatve dentry or will unhash it (if there are other users). So the children are already unreachable through that name, and can only be reached through somebody who still has the directory open. And I do not see how "rmdir()" can *possibly* have any valid semantic effect on any user that has that directory as its PWD, so I claim that the dentries that exist at this point must already not be relevant from a semantic standpoint. So Al, this is my argument: the only dentry that *matters* is the dentry of the removed directory itself, and that's the one that sees the "d_delete()" (and all the noise to make sure you can't do new lookups and can't mount on top of it). The dentries *underneath* it cannot matter for semantics, and can happily be pruned - or not pruned - at any later date. Can you address *THIS* issue? Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 19:59 ` Linus Torvalds @ 2024-05-12 20:29 ` Linus Torvalds 2024-05-13 5:31 ` Al Viro 1 sibling, 0 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-12 20:29 UTC (permalink / raw) To: Al Viro Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sun, 12 May 2024 at 12:59, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So the children are already unreachable through that name, and can > only be reached through somebody who still has the directory open. And > I do not see how "rmdir()" can *possibly* have any valid semantic > effect on any user that has that directory as its PWD, so I claim that > the dentries that exist at this point must already not be relevant > from a semantic standpoint. Side note: our IS_DEADDIR() checks seem a bit inconsistent. That's actually what should catch some of the "I'm an a directory that has been removed", but we have that check in lookup_open(), in lookup_one_qstr_excl(), and in __lookup_slow(). I wonder if we should check it in lookup_dcache() too? Because yes, that's a difference that rmdir makes, in how it sets S_DEAD, and this seems to be an area where existing cached dentries are handled differently. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 19:59 ` Linus Torvalds 2024-05-12 20:29 ` Linus Torvalds @ 2024-05-13 5:31 ` Al Viro 2024-05-13 15:58 ` Linus Torvalds 1 sibling, 1 reply; 43+ messages in thread From: Al Viro @ 2024-05-13 5:31 UTC (permalink / raw) To: Linus Torvalds Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sun, May 12, 2024 at 12:59:57PM -0700, Linus Torvalds wrote: > so the removed directory entry itself will have either turned into a > negatve dentry or will unhash it (if there are other users). > > So the children are already unreachable through that name, and can > only be reached through somebody who still has the directory open. And > I do not see how "rmdir()" can *possibly* have any valid semantic > effect on any user that has that directory as its PWD, so I claim that > the dentries that exist at this point must already not be relevant > from a semantic standpoint. > > So Al, this is my argument: the only dentry that *matters* is the > dentry of the removed directory itself, and that's the one that sees > the "d_delete()" (and all the noise to make sure you can't do new > lookups and can't mount on top of it). Recall what d_delete() will do if you have other references. That's why we want shrink_dcache_parent() *before* d_delete(). BTW, example of the reasons why d_delete() without directory being locked is painful: simple_positive() is currently called either under ->d_lock on dentry in question or under ->i_rwsem on the parent. Either is enough to stabilize it. Get d_delete() happening without parent locked and all callers of simple_positive() must take ->d_lock. This one is not hard to adjust, but we need to find all such places. Currently positivity of hashed dentry can change only with parent held exclusive. It's going to take some digging... ^ permalink raw reply [flat|nested] 43+ messages in thread
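For readers following along: simple_positive() is, as far as I remember, just the two checks below (include/linux/fs.h), quoted here for reference. Both pieces of state are only stable today because positivity of a hashed dentry can change only with the parent held exclusive, which is exactly the invariant an unlocked d_delete() would break, leaving ->d_lock as the only remaining protection for such callers.

static inline bool simple_positive(const struct dentry *dentry)
{
	return d_really_is_positive(dentry) && !d_unhashed(dentry);
}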
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-13 5:31 ` Al Viro @ 2024-05-13 15:58 ` Linus Torvalds 2024-05-13 16:33 ` Al Viro 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-13 15:58 UTC (permalink / raw) To: Al Viro Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sun, 12 May 2024 at 22:31, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Recall what d_delete() will do if you have other references. > That's why we want shrink_dcache_parent() *before* d_delete(). Ahh. Yes. So with the shrinking done after as my patch, that means that now the directory that gets removed is always unhashed. > BTW, example of the reasons why d_delete() without directory being locked > is painful Oh, I would never suggest moving the d_delete() of the directory being removed itself outside the lock. I agree 100% on that side. That said, maybe it's ok to just always unhash the directory when doing the vfs_rmdir(). I certainly think it's better than unhashing all the negative dentries like the original patch did. Ho humm. Let me think about this some more. We *could* strive for a hybrid approach, where we handle the common case ("not a ton of child dentries") differently, and just get rid of them synchronously, and handle the "millions of children" case by unhashing the directory and dealing with shrinking the children async. But now it's the merge window, and I have 60+ pull requests in my inbox, so I will have to go back to work on that. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-13 15:58 ` Linus Torvalds @ 2024-05-13 16:33 ` Al Viro 2024-05-13 16:44 ` Linus Torvalds 2024-05-23 7:18 ` Dave Chinner 0 siblings, 2 replies; 43+ messages in thread From: Al Viro @ 2024-05-13 16:33 UTC (permalink / raw) To: Linus Torvalds Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Mon, May 13, 2024 at 08:58:33AM -0700, Linus Torvalds wrote: > We *could* strive for a hybrid approach, where we handle the common > case ("not a ton of child dentries") differently, and just get rid of > them synchronously, and handle the "millions of children" case by > unhashing the directory and dealing with shrinking the children async. try_to_shrink_children()? Doable, and not even that hard to do, but as for shrinking async... We can easily move it out of inode_lock on parent, but doing that really async would either need to be tied into e.g. remount r/o logics or we'd get userland regressions. I mean, "I have an opened unlinked file, can't remount r/o" is one thing, but "I've done rm -rf ~luser, can't remount r/o for a while" when all luser's processes had been killed and nothing is holding any of that opened... ouch. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-13 16:33 ` Al Viro @ 2024-05-13 16:44 ` Linus Torvalds 2024-05-23 7:18 ` Dave Chinner 1 sibling, 0 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-13 16:44 UTC (permalink / raw) To: Al Viro Cc: James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Mon, 13 May 2024 at 09:33, Al Viro <viro@zeniv.linux.org.uk> wrote: > > try_to_shrink_children()? Doable, and not even that hard to do, but > as for shrinking async... We can easily move it out of inode_lock > on parent, but doing that really async would either need to be > tied into e.g. remount r/o logics or we'd get userland regressions. Let's aim to just get it out from under the inode lock first, and then look at anything more exciting later, perhaps? Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-13 16:33 ` Al Viro 2024-05-13 16:44 ` Linus Torvalds @ 2024-05-23 7:18 ` Dave Chinner 1 sibling, 0 replies; 43+ messages in thread From: Dave Chinner @ 2024-05-23 7:18 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, James Bottomley, brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Mon, May 13, 2024 at 05:33:32PM +0100, Al Viro wrote: > On Mon, May 13, 2024 at 08:58:33AM -0700, Linus Torvalds wrote: > > > We *could* strive for a hybrid approach, where we handle the common > > case ("not a ton of child dentries") differently, and just get rid of > > them synchronously, and handle the "millions of children" case by > > unhashing the directory and dealing with shrinking the children async. > > try_to_shrink_children()? Doable, and not even that hard to do, but > as for shrinking async... We can easily move it out of inode_lock > on parent, but doing that really async would either need to be > tied into e.g. remount r/o logics or we'd get userland regressions. > > I mean, "I have an opened unlinked file, can't remount r/o" is one > thing, but "I've done rm -rf ~luser, can't remount r/o for a while" > when all luser's processes had been killed and nothing is holding > any of that opened... ouch. There is no ouch for the vast majority of users: XFS has been doing background async inode unlink processing since 5.14 (i.e. for almost 3 years now). See commit ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") for more of the background on this change - it was implemented because we needed to allow the scrub code to drop inode references from within transaction contexts, and evict() processing could run a nested transaction which could then deadlock the filesystem. Hence XFS offloads the inode freeing half of the unlink operation (i.e. the bit that happens in evict() context) to per-cpu workqueues instead of doing the work directly in evict() context. We allow evict() to completely tear down the VFS inode context, but don't free it in ->destroy_inode() because we still have work to do on it. XFS doesn't need an active VFS inode context to do the internal metadata updates needed to free an inode, so it's trivial to defer this work to a background context outside the VFS inode life cycle. Hence over half the processing work of every unlink() operation on XFS is now done in kworker threads rather than via the unlink() syscall context. Yes, that means freeze, remount r/o and unmount will block in xfs_inodegc_stop() waiting for these async inode freeing operations to be flushed and completed. However, there have been very few reported issues with freeze, remount r/o or unmount being significantly delayed - there's an occasional report of an inodes with tens of millions of extents to free delaying an operation, but that's no change from unlink() taking minutes to run and delaying the operation that way, anyway.... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH v2] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 18:42 ` Linus Torvalds 2024-05-11 19:28 ` Al Viro @ 2024-05-11 20:02 ` Linus Torvalds 2024-05-12 3:06 ` Yafang Shao 1 sibling, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 20:02 UTC (permalink / raw) To: torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, viro, walters, wangkai86, willy Yafang Shao reports that he has seen loads that generate billions of negative dentries in a directory, which then when the directory is removed causes excessive latencies for other users because the dentry shrinking is done under the directory inode lock. There seems to be no actual reason for holding the inode lock any more by the time we get rid of the now uninteresting negative dentries, and it's an effect of the calling convention. Split the 'vfs_rmdir()' function into two separate phases: - 'vfs_rmdir_raw()' does the actual main rmdir heavy lifting - 'vfs_rmdir_cleanup()' needs to be run by the caller after a successful raw call, after the caller has dropped the inode locks. We leave the 'vfs_rmdir()' function around, since it has multiple callers, and only convert the main system call path to the new two-phase model. The other uses will be left as an exercise for the reader for when people find they care. [ Side note: I think the 'dget()/dput()' pair in vfs_rmdir_raw() is superfluous, since callers always have to have a dentry reference over the call anyway. That's a separate issue. - Linus ] Reported-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/all/20240511022729.35144-1-laoar.shao@gmail.com/ Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --- Second version - this time doing the dentry pruning even later, after releasing the parent inode lock too. I did the same amount of "extensive testing" on this one as the previous one. IOW, little-to-none. fs/namei.c | 61 ++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 41 insertions(+), 20 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 28e62238346e..15b4ff6ed1e5 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -4176,21 +4176,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode) return do_mkdirat(AT_FDCWD, getname(pathname), mode); } -/** - * vfs_rmdir - remove directory - * @idmap: idmap of the mount the inode was found from - * @dir: inode of @dentry - * @dentry: pointer to dentry of the base directory - * - * Remove a directory. - * - * If the inode has been found through an idmapped mount the idmap of - * the vfsmount must be passed through @idmap. This function will then take - * care to map the inode according to @idmap before checking permissions. - * On non-idmapped mounts or if permission checking is to be performed on the - * raw inode simply pass @nop_mnt_idmap. 
- */ -int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, +static int vfs_rmdir_raw(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry) { int error = may_delete(idmap, dir, dentry, 1); @@ -4217,18 +4203,43 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) goto out; - shrink_dcache_parent(dentry); dentry->d_inode->i_flags |= S_DEAD; dont_mount(dentry); detach_mounts(dentry); - out: inode_unlock(dentry->d_inode); dput(dentry); - if (!error) - d_delete_notify(dir, dentry); return error; } + +static inline void vfs_rmdir_cleanup(struct inode *dir, struct dentry *dentry) +{ + shrink_dcache_parent(dentry); + d_delete_notify(dir, dentry); +} + +/** + * vfs_rmdir - remove directory + * @idmap: idmap of the mount the inode was found from + * @dir: inode of @dentry + * @dentry: pointer to dentry of the base directory + * + * Remove a directory. + * + * If the inode has been found through an idmapped mount the idmap of + * the vfsmount must be passed through @idmap. This function will then take + * care to map the inode according to @idmap before checking permissions. + * On non-idmapped mounts or if permission checking is to be performed on the + * raw inode simply pass @nop_mnt_idmap. + */ +int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry) +{ + int retval = vfs_rmdir_raw(idmap, dir, dentry); + if (!retval) + vfs_rmdir_cleanup(dir, dentry); + return retval; +} EXPORT_SYMBOL(vfs_rmdir); int do_rmdir(int dfd, struct filename *name) @@ -4272,7 +4283,17 @@ int do_rmdir(int dfd, struct filename *name) error = security_path_rmdir(&path, dentry); if (error) goto exit4; - error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); + error = vfs_rmdir_raw(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); + if (error) + goto exit4; + inode_unlock(path.dentry->d_inode); + mnt_drop_write(path.mnt); + vfs_rmdir_cleanup(path.dentry->d_inode, dentry); + dput(dentry); + path_put(&path); + putname(name); + return 0; + exit4: dput(dentry); exit3: -- 2.44.0.330.g4d18c88175 ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [PATCH v2] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 20:02 ` [PATCH v2] " Linus Torvalds @ 2024-05-12 3:06 ` Yafang Shao 2024-05-12 3:30 ` Al Viro 0 siblings, 1 reply; 43+ messages in thread From: Yafang Shao @ 2024-05-12 3:06 UTC (permalink / raw) To: Linus Torvalds Cc: brauner, jack, linux-fsdevel, longman, viro, walters, wangkai86, willy On Sun, May 12, 2024 at 4:03 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Yafang Shao reports that he has seen loads that generate billions of > negative dentries in a directory, which then when the directory is > removed causes excessive latencies for other users because the dentry > shrinking is done under the directory inode lock. > > There seems to be no actual reason for holding the inode lock any more > by the time we get rid of the now uninteresting negative dentries, and > it's an effect of the calling convention. > > Split the 'vfs_rmdir()' function into two separate phases: > > - 'vfs_rmdir_raw()' does the actual main rmdir heavy lifting > > - 'vfs_rmdir_cleanup()' needs to be run by the caller after a > successful raw call, after the caller has dropped the inode locks. > > We leave the 'vfs_rmdir()' function around, since it has multiple > callers, and only convert the main system call path to the new two-phase > model. The other uses will be left as an exercise for the reader for > when people find they care. > > [ Side note: I think the 'dget()/dput()' pair in vfs_rmdir_raw() is > superfluous, since callers always have to have a dentry reference over > the call anyway. That's a separate issue. - Linus ] > > Reported-by: Yafang Shao <laoar.shao@gmail.com> > Link: https://lore.kernel.org/all/20240511022729.35144-1-laoar.shao@gmail.com/ > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Christian Brauner <brauner@kernel.org> > Cc: Jan Kara <jack@suse.cz> > Cc: Waiman Long <longman@redhat.com> > Cc: Matthew Wilcox <willy@infradead.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> This could resolve the secondary concern. Tested-by: Yafang Shao <laoar.shao@gmail.com> Might it be feasible to execute the vfs_rmdir_cleanup() within a kwoker? Such an approach could potentially mitigate the initial concern as well. > --- > > Second version - this time doing the dentry pruning even later, after > releasing the parent inode lock too. > > I did the same amount of "extensive testing" on this one as the previous > one. IOW, little-to-none. > > fs/namei.c | 61 ++++++++++++++++++++++++++++++++++++------------------ > 1 file changed, 41 insertions(+), 20 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 28e62238346e..15b4ff6ed1e5 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -4176,21 +4176,7 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode) > return do_mkdirat(AT_FDCWD, getname(pathname), mode); > } > > -/** > - * vfs_rmdir - remove directory > - * @idmap: idmap of the mount the inode was found from > - * @dir: inode of @dentry > - * @dentry: pointer to dentry of the base directory > - * > - * Remove a directory. > - * > - * If the inode has been found through an idmapped mount the idmap of > - * the vfsmount must be passed through @idmap. This function will then take > - * care to map the inode according to @idmap before checking permissions. > - * On non-idmapped mounts or if permission checking is to be performed on the > - * raw inode simply pass @nop_mnt_idmap. 
> - */ > -int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, > +static int vfs_rmdir_raw(struct mnt_idmap *idmap, struct inode *dir, > struct dentry *dentry) > { > int error = may_delete(idmap, dir, dentry, 1); > @@ -4217,18 +4203,43 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, > if (error) > goto out; > > - shrink_dcache_parent(dentry); > dentry->d_inode->i_flags |= S_DEAD; > dont_mount(dentry); > detach_mounts(dentry); > - > out: > inode_unlock(dentry->d_inode); > dput(dentry); > - if (!error) > - d_delete_notify(dir, dentry); > return error; > } > + > +static inline void vfs_rmdir_cleanup(struct inode *dir, struct dentry *dentry) > +{ > + shrink_dcache_parent(dentry); > + d_delete_notify(dir, dentry); > +} > + > +/** > + * vfs_rmdir - remove directory > + * @idmap: idmap of the mount the inode was found from > + * @dir: inode of @dentry > + * @dentry: pointer to dentry of the base directory > + * > + * Remove a directory. > + * > + * If the inode has been found through an idmapped mount the idmap of > + * the vfsmount must be passed through @idmap. This function will then take > + * care to map the inode according to @idmap before checking permissions. > + * On non-idmapped mounts or if permission checking is to be performed on the > + * raw inode simply pass @nop_mnt_idmap. > + */ > +int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, > + struct dentry *dentry) > +{ > + int retval = vfs_rmdir_raw(idmap, dir, dentry); > + if (!retval) > + vfs_rmdir_cleanup(dir, dentry); > + return retval; > +} > EXPORT_SYMBOL(vfs_rmdir); > > int do_rmdir(int dfd, struct filename *name) > @@ -4272,7 +4283,17 @@ int do_rmdir(int dfd, struct filename *name) > error = security_path_rmdir(&path, dentry); > if (error) > goto exit4; > - error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); > + error = vfs_rmdir_raw(mnt_idmap(path.mnt), path.dentry->d_inode, dentry); > + if (error) > + goto exit4; > + inode_unlock(path.dentry->d_inode); > + mnt_drop_write(path.mnt); > + vfs_rmdir_cleanup(path.dentry->d_inode, dentry); > + dput(dentry); > + path_put(&path); > + putname(name); > + return 0; > + > exit4: > dput(dentry); > exit3: > -- > 2.44.0.330.g4d18c88175 > -- Regards Yafang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 3:06 ` Yafang Shao @ 2024-05-12 3:30 ` Al Viro 2024-05-12 3:36 ` Yafang Shao 0 siblings, 1 reply; 43+ messages in thread From: Al Viro @ 2024-05-12 3:30 UTC (permalink / raw) To: Yafang Shao Cc: Linus Torvalds, brauner, jack, linux-fsdevel, longman, walters, wangkai86, willy On Sun, May 12, 2024 at 11:06:07AM +0800, Yafang Shao wrote: > This could resolve the secondary concern. > Tested-by: Yafang Shao <laoar.shao@gmail.com> > > Might it be feasible to execute the vfs_rmdir_cleanup() within a > kwoker? Such an approach could potentially mitigate the initial > concern as well. I'm honestly not sure I understood you correctly; in case I have and you really want to make that asynchronous, the answer's "we can't do that". What's more, we can not even delay that past the call of mnt_drop_write() in do_rmdir(). ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH v2] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-12 3:30 ` Al Viro @ 2024-05-12 3:36 ` Yafang Shao 0 siblings, 0 replies; 43+ messages in thread From: Yafang Shao @ 2024-05-12 3:36 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, brauner, jack, linux-fsdevel, longman, walters, wangkai86, willy On Sun, May 12, 2024 at 11:30 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Sun, May 12, 2024 at 11:06:07AM +0800, Yafang Shao wrote: > > This could resolve the secondary concern. > > Tested-by: Yafang Shao <laoar.shao@gmail.com> > > > > Might it be feasible to execute the vfs_rmdir_cleanup() within a > > kwoker? Such an approach could potentially mitigate the initial > > concern as well. > > I'm honestly not sure I understood you correctly; in case I > have and you really want to make that asynchronous, Exactly, that's precisely my point. > the answer's "we > can't do that". What's more, we can not even delay that past the call > of mnt_drop_write() in do_rmdir(). Thanks for your explanation. -- Regards Yafang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' 2024-05-11 18:26 ` [PATCH] vfs: move dentry shrinking outside the inode lock in 'rmdir()' Linus Torvalds 2024-05-11 18:42 ` Linus Torvalds @ 2024-05-11 19:24 ` Al Viro 1 sibling, 0 replies; 43+ messages in thread From: Al Viro @ 2024-05-11 19:24 UTC (permalink / raw) To: Linus Torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, walters, wangkai86, willy On Sat, May 11, 2024 at 11:26:26AM -0700, Linus Torvalds wrote: > Yafang Shao reports that he has seen loads that generate billions of > negative dentries in a directory, which then when the directory is > removed causes excessive latencies for other users because the dentry > shrinking is done under the directory inode lock. > > There seems to be no actual reason for holding the inode lock any more > by the time we get rid of the now uninteresting negative dentries, and > it's an effect of just code layout (the shared error path). > > Reorganize the code trivially to just have a separate success path, > which simplifies the code (since 'd_delete_notify()' is only called in > the success path anyway) and makes it trivial to just move the dentry > shrinking outside the inode lock. Wait, I thought he had confirmed that in the setup in question directories were *not* removed, hadn't he? I've no objections against that patch, but if the above is accurate it's not going to help... ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 3:35 ` Yafang Shao 2024-05-11 4:54 ` Waiman Long @ 2024-05-15 2:18 ` Yafang Shao 2024-05-15 2:36 ` Linus Torvalds 1 sibling, 1 reply; 43+ messages in thread From: Yafang Shao @ 2024-05-15 2:18 UTC (permalink / raw) To: Linus Torvalds Cc: viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Sat, May 11, 2024 at 11:35 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > On Sat, May 11, 2024 at 10:54 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > On Fri, 10 May 2024 at 19:28, Yafang Shao <laoar.shao@gmail.com> wrote: > > > > > > We've devised a solution to address both issues by deleting associated > > > dentry when removing a file. > > > > This patch is buggy. You are modifying d_flags outside the locked region. > > > > So at a minimum, the DCACHE_FILE_DELETED bit setting would need to > > just go into the > > > > if (dentry->d_lockref.count == 1) { > > > > side of the conditional, since the other side of that conditional > > already unhashes the dentry which makes this all moot anyway. > > > > That said, I think it's buggy in another way too: what if somebody > > else looks up the dentry before it actually gets unhashed? Then you > > have another ref to it, and the dentry might live long enough that it > > then gets re-used for a newly created file (which is why we have those > > negative dentries in the first place). > > > > So you'd have to clear the DCACHE_FILE_DELETED if the dentry is then > > made live by a file creation or rename or whatever. > > > > So that d_flags thing is actually pretty complicated. > > > > But since you made all this unconditional anyway, I think having a new > > dentry flag is unnecessary in the first place, and I suspect you are > > better off just unhashing the dentry unconditionally instead. > > > > IOW, I think the simpler patch is likely just something like this: > > It's simpler. I used to contemplate handling it that way, but lack the > knowledge and courage to proceed, hence I opted for the d_flags > solution. > I'll conduct tests on the revised change. Appreciate your suggestion. > We have successfully applied a hotfix to a subset of our production servers, totaling several thousand. The hotfix is as follows: diff --git a/fs/dcache.c b/fs/dcache.c index 52e6d5f..30eb733 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2557,14 +2557,14 @@ void d_delete(struct dentry * dentry) spin_lock(&inode->i_lock); spin_lock(&dentry->d_lock); + __d_drop(dentry); + /* * Are we the only user? */ if (dentry->d_lockref.count == 1) { - dentry->d_flags &= ~DCACHE_CANT_MOUNT; dentry_unlink_inode(dentry); } else { - __d_drop(dentry); spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); } So far, it has been functioning well without any regressions. We are planning to roll this update out to our entire fleet, which consists of hundreds of thousands of servers. I believe this change is still necessary. Would you prefer to commit it directly, or should I send an official patch? If the "unlink-create" issue is a concern, perhaps we can address it by adding a /sys/kernel/debug/vfs/delete_file_legacy entry? > > > > --- a/fs/dcache.c > > +++ b/fs/dcache.c > > @@ -2381,6 +2381,7 @@ void d_delete(struct dentry * dentry) > > > > spin_lock(&inode->i_lock); > > spin_lock(&dentry->d_lock); > > + __d_drop(dentry); > > /* > > * Are we the only user? 
> > */ > > @@ -2388,7 +2389,6 @@ void d_delete(struct dentry * dentry) > > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > > dentry_unlink_inode(dentry); > > } else { > > - __d_drop(dentry); > > spin_unlock(&dentry->d_lock); > > spin_unlock(&inode->i_lock); > > } > > > > although I think Al needs to ACK this, and I suspect that unhashing > > the dentry also makes that > > > > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > > > > pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT > > just won't matter). > > > > I do worry that there are loads that actually love our current > > behavior, but maybe it's worth doing the simple unconditional "make > > d_delete() always unhash" and only worry about whether that causes > > performance problems for people who commonly create a new file in its > > place when we get such a report. > > > > IOW, the more complex thing might be to actually take other behavior > > into account (eg "do we have so many negative dentries that we really > > don't want to create new ones"). > > This poses a substantial challenge. Despite recurrent discussions > within the community about improving negative dentry over and over, > there hasn't been a consensus on how to address it. > > > > > Al - can you please step in and tell us what else I've missed, and why > > my suggested version of the patch is also broken garbage? > > > > Linus > > > -- > Regards > Yafang -- Regards Yafang ^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-15 2:18 ` [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file Yafang Shao @ 2024-05-15 2:36 ` Linus Torvalds 2024-05-15 9:17 ` [PATCH] vfs: " Yafang Shao 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-15 2:36 UTC (permalink / raw) To: Yafang Shao Cc: viro, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Tue, 14 May 2024 at 19:19, Yafang Shao <laoar.shao@gmail.com> wrote: > > I believe this change is still necessary. Would you prefer to commit > it directly, or should I send an official patch? Sending an official patch just for people to try out sounds good regardless, but I'd really like some real performance testing on other loads too before just taking it. It *may* be acceptable, and it's certainly simple. It's also almost certainly safe from a correctness angle. But it might regress performance on other important loads even if it fixes an issue on your load. That's why we had the whole discussion about alternatives. I'm still of the opinion that we should probably *try* this simple approach, but I really would hope we could have some test tree that people run a lot of benchmarks on. Does anybody do filesystem benchmarking on the mm tree? Or do we have some good tree to just give it a good testing? It feels a bit excessive to put it in my development tree just to get some performance testing coverage, but maybe that's what we have to do.. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-15 2:36 ` Linus Torvalds @ 2024-05-15 9:17 ` Yafang Shao 2024-05-15 16:05 ` Linus Torvalds 2024-05-22 8:11 ` kernel test robot 0 siblings, 2 replies; 43+ messages in thread From: Yafang Shao @ 2024-05-15 9:17 UTC (permalink / raw) To: torvalds Cc: brauner, jack, laoar.shao, linux-fsdevel, longman, viro, walters, wangkai86, linux-mm, linux-kernel, Matthew Wilcox Our applications, built on Elasticsearch[0], frequently create and delete files. These applications operate within containers, some with a memory limit exceeding 100GB. Over prolonged periods, the accumulation of negative dentries within these containers can amount to tens of gigabytes. Upon container exit, directories are deleted. However, due to the numerous associated dentries, this process can be time-consuming. Our users have expressed frustration with this prolonged exit duration, which constitutes our first issue. Simultaneously, other processes may attempt to access the parent directory of the Elasticsearch directories. Since the task responsible for deleting the dentries holds the inode lock, processes attempting directory lookup experience significant delays. This issue, our second problem, is easily demonstrated: - Task 1 generates negative dentries: $ pwd ~/test $ mkdir es && cd es/ && ./create_and_delete_files.sh [ After generating tens of GB dentries ] $ cd ~/test && rm -rf es [ It will take a long duration to finish ] - Task 2 attempts to lookup the 'test/' directory $ pwd ~/test $ ls The 'ls' command in Task 2 experiences prolonged execution as Task 1 is deleting the dentries. We've devised a solution to address both issues by deleting associated dentry when removing a file. Interestingly, we've noted that a similar patch was proposed years ago[1], although it was rejected citing the absence of tangible issues caused by negative dentries. Given our current challenges, we're resubmitting the proposal. All relevant stakeholders from previous discussions have been included for reference. Some alternative solutions are also under discussion[2][3], such as shrinking child dentries outside of the parent inode lock or even asynchronously shrinking child dentries. However, given the straightforward nature of the current solution, I believe this approach is still necessary. [0]. https://github.com/elastic/elasticsearch [1]. https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [2]. https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [3]. https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Wangkai <wangkai86@huawei.com> Cc: Colin Walters <walters@verbum.org> --- fs/dcache.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 71a8e943a0fa..2ffdb98e9166 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2360,19 +2360,17 @@ EXPORT_SYMBOL(d_hash_and_lookup); * - unhash this dentry and free it. 
* * Usually, we want to just turn this into - * a negative dentry, but if anybody else is - * currently using the dentry or the inode - * we can't do that and we fall back on removing - * it from the hash queues and waiting for - * it to be deleted later when it has no users + * a negative dentry, but certain workloads can + * generate a large number of negative dentries. + * Therefore, it would be better to simply + * unhash it. */ /** * d_delete - delete a dentry * @dentry: The dentry to delete * - * Turn the dentry into a negative dentry if possible, otherwise - * remove it from the hash queues so it can be deleted later + * Remove the dentry from the hash queues so it can be deleted later. */ void d_delete(struct dentry * dentry) @@ -2381,14 +2379,14 @@ void d_delete(struct dentry * dentry) spin_lock(&inode->i_lock); spin_lock(&dentry->d_lock); + __d_drop(dentry); + /* * Are we the only user? */ if (dentry->d_lockref.count == 1) { - dentry->d_flags &= ~DCACHE_CANT_MOUNT; dentry_unlink_inode(dentry); } else { - __d_drop(dentry); spin_unlock(&dentry->d_lock); spin_unlock(&inode->i_lock); } -- 2.30.1 (Apple Git-130) ^ permalink raw reply related [flat|nested] 43+ messages in thread
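For reference, a minimal sketch of the kind of churn described in the changelog above: repeatedly creating and unlinking files in one long-lived directory is what leaves the piles of negative dentries behind on unpatched kernels. The reporter's actual create_and_delete_files.sh is not shown in the thread, so this loop is only an illustrative stand-in, not their workload.

/* Illustrative stand-in for the create/delete churn described above;
 * not the reporter's actual script. On kernels without this patch,
 * every unlink() below leaves a cached negative dentry behind. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        char name[64];

        for (long i = 0; i < 10000000; i++) {
                snprintf(name, sizeof(name), "f-%ld", i);
                int fd = open(name, O_CREAT | O_WRONLY, 0644);
                if (fd < 0) {
                        perror("open");
                        exit(1);
                }
                close(fd);
                unlink(name);   /* the dentry turns negative and stays cached */
        }
        return 0;
}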
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-15 9:17 ` [PATCH] vfs: " Yafang Shao @ 2024-05-15 16:05 ` Linus Torvalds 2024-05-16 13:44 ` Oliver Sang 2024-05-22 8:51 ` Oliver Sang 2024-05-22 8:11 ` kernel test robot 1 sibling, 2 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-15 16:05 UTC (permalink / raw) To: Yafang Shao, kernel test robot Cc: brauner, jack, linux-fsdevel, longman, viro, walters, wangkai86, linux-mm, linux-kernel, Matthew Wilcox Oliver, is there any chance you could run this through the test robot performance suite? The original full patch at https://lore.kernel.org/all/20240515091727.22034-1-laoar.shao@gmail.com/ and it would be interesting if the test robot could see if the patch makes any difference on any other loads? Thanks, Linus On Wed, 15 May 2024 at 02:17, Yafang Shao <laoar.shao@gmail.com> wrote: > > Our applications, built on Elasticsearch[0], frequently create and delete > files. These applications operate within containers, some with a memory > limit exceeding 100GB. Over prolonged periods, the accumulation of negative > dentries within these containers can amount to tens of gigabytes. > > Upon container exit, directories are deleted. However, due to the numerous > associated dentries, this process can be time-consuming. Our users have > expressed frustration with this prolonged exit duration, which constitutes > our first issue. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-15 16:05 ` Linus Torvalds @ 2024-05-16 13:44 ` Oliver Sang 2024-05-22 8:51 ` Oliver Sang 1 sibling, 0 replies; 43+ messages in thread From: Oliver Sang @ 2024-05-16 13:44 UTC (permalink / raw) To: Linus Torvalds Cc: Yafang Shao, brauner, jack, linux-fsdevel, longman, viro, walters, wangkai86, linux-mm, linux-kernel, Matthew Wilcox, oliver.sang hi, Linus, On Wed, May 15, 2024 at 09:05:24AM -0700, Linus Torvalds wrote: > Oliver, > is there any chance you could run this through the test robot > performance suite? The original full patch at > > https://lore.kernel.org/all/20240515091727.22034-1-laoar.shao@gmail.com/ > > and it would be interesting if the test robot could see if the patch > makes any difference on any other loads? Sure, it's our pleasure; thanks a lot for involving us. The tests have been started, but due to resource constraints it may take 2-3 days to run enough tests. Will keep you updated. Thanks! > > Thanks, > Linus > > On Wed, 15 May 2024 at 02:17, Yafang Shao <laoar.shao@gmail.com> wrote: > > > > Our applications, built on Elasticsearch[0], frequently create and delete > > files. These applications operate within containers, some with a memory > > limit exceeding 100GB. Over prolonged periods, the accumulation of negative > > dentries within these containers can amount to tens of gigabytes. > > > > Upon container exit, directories are deleted. However, due to the numerous > > associated dentries, this process can be time-consuming. Our users have > > expressed frustration with this prolonged exit duration, which constitutes > > our first issue. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-15 16:05 ` Linus Torvalds 2024-05-16 13:44 ` Oliver Sang @ 2024-05-22 8:51 ` Oliver Sang 2024-05-23 2:21 ` Yafang Shao 1 sibling, 1 reply; 43+ messages in thread From: Oliver Sang @ 2024-05-22 8:51 UTC (permalink / raw) To: Linus Torvalds Cc: Yafang Shao, brauner, jack, linux-fsdevel, longman, viro, walters, wangkai86, linux-mm, linux-kernel, Matthew Wilcox, ying.huang, feng.tang, fengwei.yin, philip.li, yujie.liu, oliver.sang hi, Linus, hi, Yafang Shao, On Wed, May 15, 2024 at 09:05:24AM -0700, Linus Torvalds wrote: > Oliver, > is there any chance you could run this through the test robot > performance suite? The original full patch at > > https://lore.kernel.org/all/20240515091727.22034-1-laoar.shao@gmail.com/ > > and it would be interesting if the test robot could see if the patch > makes any difference on any other loads? > We just reported a stress-ng performance improvement from this patch [1]. The test robot applied this patch on top of 3c999d1ae3 ("Merge tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq"). Filesystems are not our team's main domain, so we only made a limited review of the results and decided to send out the report FYI. At the first stage, we decided to check the below categories of tests as a priority: stress-ng filesystem filebench mailserver reaim fileserver We also picked sysbench-fileio and blogbench into the coverage. Here is a summary. For stress-ng, besides [1] which was already reported, we got the data below comparing this patch to 3c999d1ae3. Either there is no significant performance change, or the change is smaller than the noise, which makes the test robot's bisect fail, so this information is just FYI. If you have any doubt about any subtests, please let us know and we can check further.
(also included some net test results) 12.87 ± 6% -0.6% 12.79 stress-ng.xattr.ops_per_sec 6721 ± 5% +7.5% 7224 ± 27% stress-ng.rawdev.ops_per_sec 9002 ± 7% -8.7% 8217 stress-ng.dirmany.ops_per_sec 8594743 ± 4% -3.0% 8337417 stress-ng.rawsock.ops_per_sec 2056 ± 3% +2.9% 2116 stress-ng.dirdeep.ops_per_sec 4307 ± 21% -6.9% 4009 stress-ng.dir.ops_per_sec 137946 ± 18% +5.8% 145942 stress-ng.fiemap.ops_per_sec 22413006 ± 2% +2.5% 22982512 ± 2% stress-ng.sockdiag.ops_per_sec 286714 ± 2% -3.8% 275876 ± 5% stress-ng.udp-flood.ops_per_sec 82904 ± 46% -31.6% 56716 stress-ng.sctp.ops_per_sec 9853408 -0.3% 9826387 stress-ng.ping-sock.ops_per_sec 84667 ± 12% -26.7% 62050 ± 17% stress-ng.dccp.ops_per_sec 61750 ± 25% -24.2% 46821 ± 38% stress-ng.open.ops_per_sec 583443 ± 3% -3.4% 563822 stress-ng.file-ioctl.ops_per_sec 11919 ± 28% -34.3% 7833 stress-ng.dentry.ops_per_sec 18.59 ± 12% -23.9% 14.15 ± 27% stress-ng.swap.ops_per_sec 246.37 ± 2% +15.9% 285.58 ± 12% stress-ng.aiol.ops_per_sec 7.45 -4.8% 7.10 ± 7% stress-ng.fallocate.ops_per_sec 207.97 ± 7% +5.2% 218.70 stress-ng.copy-file.ops_per_sec 69.87 ± 7% +5.8% 73.93 ± 5% stress-ng.fpunch.ops_per_sec 0.25 ± 21% +24.0% 0.31 stress-ng.inode-flags.ops_per_sec 849.35 ± 6% +1.4% 861.51 stress-ng.mknod.ops_per_sec 926144 ± 4% -5.2% 877558 stress-ng.lease.ops_per_sec 82924 -2.1% 81220 stress-ng.fcntl.ops_per_sec 6.19 ±124% -50.7% 3.05 stress-ng.chattr.ops_per_sec 676.90 ± 4% -1.9% 663.94 ± 5% stress-ng.iomix.ops_per_sec 0.93 ± 6% +5.6% 0.98 ± 7% stress-ng.symlink.ops_per_sec 1703608 -3.8% 1639057 ± 3% stress-ng.eventfd.ops_per_sec 1735861 -0.6% 1726072 stress-ng.sockpair.ops_per_sec 85440 -2.0% 83705 stress-ng.rawudp.ops_per_sec 6198 +0.6% 6236 stress-ng.sockabuse.ops_per_sec 39226 +0.0% 39234 stress-ng.sock.ops_per_sec 1358 +0.3% 1363 stress-ng.tun.ops_per_sec 9794021 -1.7% 9623340 stress-ng.icmp-flood.ops_per_sec 1324728 +0.3% 1328244 stress-ng.epoll.ops_per_sec 146150 -2.0% 143231 stress-ng.rawpkt.ops_per_sec 6381112 -0.4% 6352696 stress-ng.udp.ops_per_sec 1234258 +0.2% 1236738 stress-ng.sockfd.ops_per_sec 23954 -0.1% 23932 stress-ng.sockmany.ops_per_sec 257030 -0.1% 256860 stress-ng.netdev.ops_per_sec 6337097 +0.1% 6341130 stress-ng.flock.ops_per_sec 173212 -0.3% 172728 stress-ng.rename.ops_per_sec 199.69 +0.6% 200.82 stress-ng.sync-file.ops_per_sec 606.57 +0.8% 611.53 stress-ng.chown.ops_per_sec 183549 -0.9% 181975 stress-ng.handle.ops_per_sec 1299 +0.0% 1299 stress-ng.hdd.ops_per_sec 98371066 +0.2% 98571113 stress-ng.lockofd.ops_per_sec 25.49 -4.3% 24.39 stress-ng.ioprio.ops_per_sec 96745191 -1.5% 95333632 stress-ng.locka.ops_per_sec 582.35 +0.1% 582.86 stress-ng.chmod.ops_per_sec 2075897 -2.2% 2029552 stress-ng.getdent.ops_per_sec 60.47 -1.9% 59.34 stress-ng.metamix.ops_per_sec 14161 -0.3% 14123 stress-ng.io.ops_per_sec 23.98 -1.5% 23.61 stress-ng.link.ops_per_sec 27514 +0.0% 27528 stress-ng.filename.ops_per_sec 44955 +1.6% 45678 stress-ng.dnotify.ops_per_sec 160.94 +0.4% 161.51 stress-ng.inotify.ops_per_sec 2452224 +4.0% 2549607 stress-ng.lockf.ops_per_sec 6761 +0.3% 6779 stress-ng.fsize.ops_per_sec 775083 -1.5% 763487 stress-ng.fanotify.ops_per_sec 309124 -4.2% 296285 stress-ng.utime.ops_per_sec 25567 -0.1% 25530 stress-ng.dup.ops_per_sec 1858 +0.9% 1876 stress-ng.procfs.ops_per_sec 105804 -3.9% 101658 stress-ng.access.ops_per_sec 1.04 -1.9% 1.02 stress-ng.chdir.ops_per_sec 82753 -0.3% 82480 stress-ng.fstat.ops_per_sec 681128 +3.7% 706375 stress-ng.acl.ops_per_sec 11892 -0.1% 11875 stress-ng.bind-mount.ops_per_sec for filebench, similar results, but data 
is less noisy than for stress-ng, which means that for most of them we regard them as not impacted by this patch. For reaim/sysbench-fileio/blogbench, the data are quite stable and we didn't notice any significant performance changes; we even doubt whether they go through the code path changed by this patch, so we don't list the full results here. BTW, besides filesystem tests, this patch is also piped into other performance test categories such as net, scheduler, mm and others, and since it also goes into our so-called hourly kernels, it will be run by the full set of other performance test suites the test robot supports. So in the following 2-3 weeks, it's still possible for us to report other results, including regressions. Thanks. [1] https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/ > Thanks, > Linus > > On Wed, 15 May 2024 at 02:17, Yafang Shao <laoar.shao@gmail.com> wrote: > > > > Our applications, built on Elasticsearch[0], frequently create and delete > > files. These applications operate within containers, some with a memory > > limit exceeding 100GB. Over prolonged periods, the accumulation of negative > > dentries within these containers can amount to tens of gigabytes. > > > > Upon container exit, directories are deleted. However, due to the numerous > > associated dentries, this process can be time-consuming. Our users have > > expressed frustration with this prolonged exit duration, which constitutes > > our first issue. ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-22 8:51 ` Oliver Sang @ 2024-05-23 2:21 ` Yafang Shao 0 siblings, 0 replies; 43+ messages in thread From: Yafang Shao @ 2024-05-23 2:21 UTC (permalink / raw) To: Oliver Sang Cc: Linus Torvalds, brauner, jack, linux-fsdevel, longman, viro, walters, wangkai86, linux-mm, linux-kernel, Matthew Wilcox, ying.huang, feng.tang, fengwei.yin, philip.li, yujie.liu On Wed, May 22, 2024 at 4:51 PM Oliver Sang <oliver.sang@intel.com> wrote: > > > hi, Linus, hi, Yafang Shao, > > > On Wed, May 15, 2024 at 09:05:24AM -0700, Linus Torvalds wrote: > > Oliver, > > is there any chance you could run this through the test robot > > performance suite? The original full patch at > > > > https://lore.kernel.org/all/20240515091727.22034-1-laoar.shao@gmail.com/ > > > > and it would be interesting if the test robot could see if the patch > > makes any difference on any other loads? > > > > we just reported a stress-ng performance improvement by this patch [1] Awesome! > > test robot applied this patch upon > 3c999d1ae3 ("Merge tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq") > > filesystem is not our team's major domain, so we just made some limited review > of the results, and decided to send out the report FYI. > > at first stage, we decided to check below catagories of tests as priority: > > stress-ng filesystem > filebench mailserver > reaim fileserver > > we also pick sysbench-fileio, blogbench into coverage. > > here is a summary. > > for stress-ng, besided [1] which was reported, we got below data that are > about this patch comparing to 3c999d1ae3. > > either there is no significant performance change, or the change is smaller > than the noise which will make test robot's bisect fail, so these information > is just FYI. and if you have any doubt about any subtests, could you let us know > then we could check further? 
> > (also included some net test results) > > 12.87 ą 6% -0.6% 12.79 stress-ng.xattr.ops_per_sec > 6721 ą 5% +7.5% 7224 ą 27% stress-ng.rawdev.ops_per_sec > 9002 ą 7% -8.7% 8217 stress-ng.dirmany.ops_per_sec > 8594743 ą 4% -3.0% 8337417 stress-ng.rawsock.ops_per_sec > 2056 ą 3% +2.9% 2116 stress-ng.dirdeep.ops_per_sec > 4307 ą 21% -6.9% 4009 stress-ng.dir.ops_per_sec > 137946 ą 18% +5.8% 145942 stress-ng.fiemap.ops_per_sec > 22413006 ą 2% +2.5% 22982512 ą 2% stress-ng.sockdiag.ops_per_sec > 286714 ą 2% -3.8% 275876 ą 5% stress-ng.udp-flood.ops_per_sec > 82904 ą 46% -31.6% 56716 stress-ng.sctp.ops_per_sec > 9853408 -0.3% 9826387 stress-ng.ping-sock.ops_per_sec > 84667 ą 12% -26.7% 62050 ą 17% stress-ng.dccp.ops_per_sec > 61750 ą 25% -24.2% 46821 ą 38% stress-ng.open.ops_per_sec > 583443 ą 3% -3.4% 563822 stress-ng.file-ioctl.ops_per_sec > 11919 ą 28% -34.3% 7833 stress-ng.dentry.ops_per_sec > 18.59 ą 12% -23.9% 14.15 ą 27% stress-ng.swap.ops_per_sec > 246.37 ą 2% +15.9% 285.58 ą 12% stress-ng.aiol.ops_per_sec > 7.45 -4.8% 7.10 ą 7% stress-ng.fallocate.ops_per_sec > 207.97 ą 7% +5.2% 218.70 stress-ng.copy-file.ops_per_sec > 69.87 ą 7% +5.8% 73.93 ą 5% stress-ng.fpunch.ops_per_sec > 0.25 ą 21% +24.0% 0.31 stress-ng.inode-flags.ops_per_sec > 849.35 ą 6% +1.4% 861.51 stress-ng.mknod.ops_per_sec > 926144 ą 4% -5.2% 877558 stress-ng.lease.ops_per_sec > 82924 -2.1% 81220 stress-ng.fcntl.ops_per_sec > 6.19 ą124% -50.7% 3.05 stress-ng.chattr.ops_per_sec > 676.90 ą 4% -1.9% 663.94 ą 5% stress-ng.iomix.ops_per_sec > 0.93 ą 6% +5.6% 0.98 ą 7% stress-ng.symlink.ops_per_sec > 1703608 -3.8% 1639057 ą 3% stress-ng.eventfd.ops_per_sec > 1735861 -0.6% 1726072 stress-ng.sockpair.ops_per_sec > 85440 -2.0% 83705 stress-ng.rawudp.ops_per_sec > 6198 +0.6% 6236 stress-ng.sockabuse.ops_per_sec > 39226 +0.0% 39234 stress-ng.sock.ops_per_sec > 1358 +0.3% 1363 stress-ng.tun.ops_per_sec > 9794021 -1.7% 9623340 stress-ng.icmp-flood.ops_per_sec > 1324728 +0.3% 1328244 stress-ng.epoll.ops_per_sec > 146150 -2.0% 143231 stress-ng.rawpkt.ops_per_sec > 6381112 -0.4% 6352696 stress-ng.udp.ops_per_sec > 1234258 +0.2% 1236738 stress-ng.sockfd.ops_per_sec > 23954 -0.1% 23932 stress-ng.sockmany.ops_per_sec > 257030 -0.1% 256860 stress-ng.netdev.ops_per_sec > 6337097 +0.1% 6341130 stress-ng.flock.ops_per_sec > 173212 -0.3% 172728 stress-ng.rename.ops_per_sec > 199.69 +0.6% 200.82 stress-ng.sync-file.ops_per_sec > 606.57 +0.8% 611.53 stress-ng.chown.ops_per_sec > 183549 -0.9% 181975 stress-ng.handle.ops_per_sec > 1299 +0.0% 1299 stress-ng.hdd.ops_per_sec > 98371066 +0.2% 98571113 stress-ng.lockofd.ops_per_sec > 25.49 -4.3% 24.39 stress-ng.ioprio.ops_per_sec > 96745191 -1.5% 95333632 stress-ng.locka.ops_per_sec > 582.35 +0.1% 582.86 stress-ng.chmod.ops_per_sec > 2075897 -2.2% 2029552 stress-ng.getdent.ops_per_sec > 60.47 -1.9% 59.34 stress-ng.metamix.ops_per_sec > 14161 -0.3% 14123 stress-ng.io.ops_per_sec > 23.98 -1.5% 23.61 stress-ng.link.ops_per_sec > 27514 +0.0% 27528 stress-ng.filename.ops_per_sec > 44955 +1.6% 45678 stress-ng.dnotify.ops_per_sec > 160.94 +0.4% 161.51 stress-ng.inotify.ops_per_sec > 2452224 +4.0% 2549607 stress-ng.lockf.ops_per_sec > 6761 +0.3% 6779 stress-ng.fsize.ops_per_sec > 775083 -1.5% 763487 stress-ng.fanotify.ops_per_sec > 309124 -4.2% 296285 stress-ng.utime.ops_per_sec > 25567 -0.1% 25530 stress-ng.dup.ops_per_sec > 1858 +0.9% 1876 stress-ng.procfs.ops_per_sec > 105804 -3.9% 101658 stress-ng.access.ops_per_sec > 1.04 -1.9% 1.02 stress-ng.chdir.ops_per_sec > 82753 -0.3% 82480 stress-ng.fstat.ops_per_sec 
> 681128 +3.7% 706375 stress-ng.acl.ops_per_sec > 11892 -0.1% 11875 stress-ng.bind-mount.ops_per_sec > > > for filebench, similar results, but data is less unstable than stress-ng, which > means for most of them, we regarded them as that they should not be impacted by > this patch. > > for reaim/sysbench-fileio/blogbench, the data are quite stable, and we didn't > notice any significant performance changes. we even doubt whether they go > through the code path changed by this patch. > > so for these, we don't list full results here. > > BTW, besides filesystem tests, this patch is also piped into other performance > test categories such like net, scheduler, mm and others, and since it also goes > into our so-called hourly kernels, it could run by full other performance test > suites which test robot supports. so in following 2-3 weeks, it's still possible > for us to report other results including regression. > That's great. Many thanks for your help. -- Regards Yafang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-15 9:17 ` [PATCH] vfs: " Yafang Shao 2024-05-15 16:05 ` Linus Torvalds @ 2024-05-22 8:11 ` kernel test robot 2024-05-22 16:00 ` Linus Torvalds 1 sibling, 1 reply; 43+ messages in thread From: kernel test robot @ 2024-05-22 8:11 UTC (permalink / raw) To: Yafang Shao Cc: oe-lkp, lkp, Linus Torvalds, Al Viro, Christian Brauner, Jan Kara, Waiman Long, Matthew Wilcox, Wangkai, Colin Walters, linux-fsdevel, ying.huang, feng.tang, fengwei.yin, laoar.shao, linux-mm, linux-kernel, oliver.sang Hello, kernel test robot noticed a 6.7% improvement of stress-ng.touch.ops_per_sec on: commit: 3681ce364442ce2ec7c7fbe90ad0aca651db475b ("[PATCH] vfs: Delete the associated dentry when deleting a file") url: https://github.com/intel-lab-lkp/linux/commits/Yafang-Shao/vfs-Delete-the-associated-dentry-when-deleting-a-file/20240516-144340 base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 3c999d1ae3c75991902a1a7dad0cb62c2a3008b4 patch link: https://lore.kernel.org/all/20240515091727.22034-1-laoar.shao@gmail.com/ patch subject: [PATCH] vfs: Delete the associated dentry when deleting a file testcase: stress-ng test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory parameters: nr_threads: 100% disk: 1HDD testtime: 60s fs: ext4 test: touch cpufreq_governor: performance Details are as below: --------------------------------------------------------------------------------------------------> The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20240522/202405221518.ecea2810-oliver.sang@intel.com ========================================================================================= compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/touch/stress-ng/60s commit: 3c999d1ae3 ("Merge tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq") 3681ce3644 ("vfs: Delete the associated dentry when deleting a file") 3c999d1ae3c75991 3681ce364442ce2ec7c7fbe90ad ---------------- --------------------------- %stddev %change %stddev \ | \ 9.884e+08 -49.1% 5.027e+08 ± 4% cpuidle..time 23.46 ± 2% -38.3% 14.47 ± 5% iostat.cpu.idle 75.71 +11.7% 84.60 iostat.cpu.system 620100 ± 5% -24.0% 471157 ± 6% numa-numastat.node0.local_node 650333 ± 3% -22.3% 505122 ± 5% numa-numastat.node0.numa_hit 736690 ± 3% -21.7% 577182 ± 4% numa-numastat.node1.local_node 772759 ± 3% -21.1% 609537 ± 3% numa-numastat.node1.numa_hit 23.42 -38.1% 14.50 ± 5% vmstat.cpu.id 54.39 ± 3% +13.4% 61.67 ± 2% vmstat.procs.r 75507 ± 6% +24.6% 94113 ± 5% vmstat.system.cs 177863 +14.2% 203176 vmstat.system.in 21.30 ± 2% -9.5 11.76 ± 7% mpstat.cpu.all.idle% 0.37 ± 2% +0.1 0.42 ± 2% mpstat.cpu.all.irq% 0.14 -0.0 0.11 ± 2% mpstat.cpu.all.soft% 77.37 +9.4 86.79 mpstat.cpu.all.sys% 0.81 +0.1 0.92 ± 2% mpstat.cpu.all.usr% 68.24 -11.5% 60.40 stress-ng.time.elapsed_time 68.24 -11.5% 60.40 stress-ng.time.elapsed_time.max 175478 +5.3% 184765 stress-ng.time.involuntary_context_switches 5038 +12.4% 5663 stress-ng.time.percent_of_cpu_this_job_got 259551 ± 2% +6.7% 277010 stress-ng.touch.ops_per_sec 454143 ± 2% -9.2% 412504 ± 2% meminfo.Active 445748 ± 2% -9.3% 404154 ± 2% meminfo.Active(anon) 1714332 ± 2% -92.8% 123761 ± 2% meminfo.KReclaimable 7111721 -23.7% 5423125 meminfo.Memused 1714332 ± 2% -92.8% 123761 ± 2% meminfo.SReclaimable 291963 -35.6% 
188122 meminfo.SUnreclaim 2006296 -84.5% 311883 meminfo.Slab 7289014 -24.0% 5539852 meminfo.max_used_kB 875347 ± 3% -91.3% 75880 ± 30% numa-meminfo.node0.KReclaimable 875347 ± 3% -91.3% 75880 ± 30% numa-meminfo.node0.SReclaimable 161663 ± 7% -31.3% 111087 ± 12% numa-meminfo.node0.SUnreclaim 1037010 ± 3% -82.0% 186968 ± 16% numa-meminfo.node0.Slab 838678 ± 3% -94.3% 48016 ± 50% numa-meminfo.node1.KReclaimable 838678 ± 3% -94.3% 48016 ± 50% numa-meminfo.node1.SReclaimable 130259 ± 8% -40.8% 77055 ± 18% numa-meminfo.node1.SUnreclaim 968937 ± 3% -87.1% 125072 ± 25% numa-meminfo.node1.Slab 218872 ± 3% -91.3% 18961 ± 31% numa-vmstat.node0.nr_slab_reclaimable 40425 ± 7% -31.3% 27772 ± 12% numa-vmstat.node0.nr_slab_unreclaimable 649973 ± 3% -22.3% 504715 ± 5% numa-vmstat.node0.numa_hit 619740 ± 5% -24.0% 470750 ± 6% numa-vmstat.node0.numa_local 209697 ± 3% -94.3% 12019 ± 50% numa-vmstat.node1.nr_slab_reclaimable 32567 ± 8% -40.8% 19264 ± 18% numa-vmstat.node1.nr_slab_unreclaimable 771696 ± 3% -21.1% 608506 ± 3% numa-vmstat.node1.numa_hit 735626 ± 3% -21.7% 576154 ± 4% numa-vmstat.node1.numa_local 111469 ± 2% -9.3% 101103 ± 2% proc-vmstat.nr_active_anon 446.10 -36.6% 283.05 proc-vmstat.nr_dirtied 18812 +2.5% 19287 proc-vmstat.nr_kernel_stack 428165 ± 2% -92.8% 30970 ± 2% proc-vmstat.nr_slab_reclaimable 72954 -35.5% 47032 proc-vmstat.nr_slab_unreclaimable 111469 ± 2% -9.3% 101103 ± 2% proc-vmstat.nr_zone_active_anon 1424982 -21.7% 1116423 proc-vmstat.numa_hit 1358679 -22.7% 1050103 ± 2% proc-vmstat.numa_local 4261989 ± 3% -14.9% 3625822 ± 3% proc-vmstat.pgalloc_normal 399826 -4.6% 381396 proc-vmstat.pgfault 3996977 ± 3% -15.9% 3359964 ± 3% proc-vmstat.pgfree 13500 -5.9% 12708 proc-vmstat.pgpgout 0.13 ± 3% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.d_lookup 0.12 ± 4% -0.0 0.08 perf-profile.children.cycles-pp.__d_lookup 0.11 ± 2% +0.0 0.13 ± 3% perf-profile.children.cycles-pp.irq_exit_rcu 0.06 ± 7% +0.0 0.09 ± 2% perf-profile.children.cycles-pp.__slab_free 0.02 ±100% +0.0 0.06 ± 8% perf-profile.children.cycles-pp.__memcg_slab_free_hook 0.07 ± 7% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.run_ksoftirqd 0.09 ± 5% +0.0 0.14 ± 4% perf-profile.children.cycles-pp.smpboot_thread_fn 0.18 ± 3% +0.1 0.24 ± 3% perf-profile.children.cycles-pp.kmem_cache_free 0.10 ± 7% +0.1 0.16 ± 7% perf-profile.children.cycles-pp.kthread 0.10 ± 7% +0.1 0.16 ± 7% perf-profile.children.cycles-pp.ret_from_fork 0.10 ± 7% +0.1 0.16 ± 7% perf-profile.children.cycles-pp.ret_from_fork_asm 0.14 ± 4% +0.1 0.20 ± 3% perf-profile.children.cycles-pp.rcu_do_batch 0.15 ± 3% +0.1 0.21 ± 2% perf-profile.children.cycles-pp.rcu_core 0.18 ± 3% +0.1 0.24 ± 2% perf-profile.children.cycles-pp.handle_softirqs 0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.list_lru_del 0.82 ± 2% +0.1 0.89 ± 3% perf-profile.children.cycles-pp._raw_spin_lock 0.01 ±238% +0.1 0.08 ± 4% perf-profile.children.cycles-pp.__call_rcu_common 0.00 +0.1 0.08 ± 3% perf-profile.children.cycles-pp.d_lru_del 0.00 +0.2 0.17 ± 3% perf-profile.children.cycles-pp.__dentry_kill 0.33 ± 11% +0.2 0.50 ± 6% perf-profile.children.cycles-pp.dput 0.09 ± 4% -0.0 0.05 ± 22% perf-profile.self.cycles-pp.__d_lookup 0.06 ± 5% +0.0 0.09 ± 5% perf-profile.self.cycles-pp.__slab_free 0.74 ± 2% +0.1 0.79 perf-profile.self.cycles-pp._raw_spin_lock 2.55 ± 4% -40.6% 1.51 ± 5% perf-stat.i.MPKI 1.219e+10 +12.8% 1.376e+10 perf-stat.i.branch-instructions 0.27 ± 2% -0.1 0.21 ± 2% perf-stat.i.branch-miss-rate% 22693772 +14.3% 25941855 perf-stat.i.branch-misses 39.43 -4.6 34.83 perf-stat.i.cache-miss-rate% 
95834044 ± 3% +10.9% 1.063e+08 ± 4% perf-stat.i.cache-misses 2.658e+08 ± 2% +14.8% 3.051e+08 ± 4% perf-stat.i.cache-references 77622 ± 6% +25.5% 97395 ± 5% perf-stat.i.context-switches 2.82 +3.3% 2.91 perf-stat.i.cpi 1.821e+11 +12.5% 2.049e+11 perf-stat.i.cpu-cycles 9159 ± 3% +22.9% 11257 ± 3% perf-stat.i.cpu-migrations 6.223e+10 +12.8% 7.019e+10 perf-stat.i.instructions 0.36 -4.8% 0.34 perf-stat.i.ipc 1.19 ± 9% +27.6% 1.51 ± 5% perf-stat.i.metric.K/sec 4716 +6.3% 5011 perf-stat.i.minor-faults 4716 +6.3% 5011 perf-stat.i.page-faults 35.93 -1.2 34.77 perf-stat.overall.cache-miss-rate% 1.214e+10 +11.5% 1.353e+10 perf-stat.ps.branch-instructions 22276184 +13.3% 25240906 perf-stat.ps.branch-misses 95133798 ± 3% +9.8% 1.045e+08 ± 4% perf-stat.ps.cache-misses 2.648e+08 ± 2% +13.5% 3.004e+08 ± 4% perf-stat.ps.cache-references 77612 ± 6% +23.8% 96101 ± 5% perf-stat.ps.context-switches 1.813e+11 +11.1% 2.015e+11 perf-stat.ps.cpu-cycles 9093 ± 3% +21.5% 11047 ± 2% perf-stat.ps.cpu-migrations 6.192e+10 +11.5% 6.901e+10 perf-stat.ps.instructions 4639 +5.6% 4899 perf-stat.ps.minor-faults 4640 +5.6% 4899 perf-stat.ps.page-faults Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-22 8:11 ` kernel test robot @ 2024-05-22 16:00 ` Linus Torvalds 2024-05-22 17:13 ` Matthew Wilcox 0 siblings, 1 reply; 43+ messages in thread From: Linus Torvalds @ 2024-05-22 16:00 UTC (permalink / raw) To: kernel test robot Cc: Yafang Shao, oe-lkp, lkp, Al Viro, Christian Brauner, Jan Kara, Waiman Long, Matthew Wilcox, Wangkai, Colin Walters, linux-fsdevel, ying.huang, feng.tang, fengwei.yin, linux-mm, linux-kernel On Wed, 22 May 2024 at 01:12, kernel test robot <oliver.sang@intel.com> wrote: > > kernel test robot noticed a 6.7% improvement of stress-ng.touch.ops_per_sec on: > > commit: 3681ce364442ce2ec7c7fbe90ad0aca651db475b ("[PATCH] vfs: Delete the associated dentry when deleting a file") Ok, since everything else is at least tentatively in the noise, and the only hard numbers we have are the ones from Yafang's Elasticsearch load and this - both of which say that this is a good patch - I decided to just apply this ASAP just to get more testing. I just wanted to note very explicitly that this is very much tentative: this will be reverted very aggressively if somebody reports some other real-world load performance issues, and we'll have to look at other solutions. But I just don't think we'll get much more actual testing of this without just saying "let's try it". Also, I ended up removing the part of the patch that stopped clearing the DCACHE_CANT_MOUNT bit. I think it's right, but it's really unrelated to the actual problem at hand, and there are other cleanups - like the unnecessary dget/dput pair - in this path that could also be looked at. Anyway, let's see if somebody notices any issues with this. And I think we should look at the "shrink dentries" case anyway for other reasons, since it's easy to create a ton of negative dentries with just lots of lookups (rather than lots of unlinking of existing files). Of course, if you do billions of lookups of different files that do not exist in the same directory, I suspect you just have yourself to blame, so the "lots of negative lookups" load doesn't sound particularly realistic. TLDR; I applied it for testing because we're in the merge window and there's no reason not to try. Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-22 16:00 ` Linus Torvalds @ 2024-05-22 17:13 ` Matthew Wilcox 2024-05-22 18:11 ` Linus Torvalds 0 siblings, 1 reply; 43+ messages in thread From: Matthew Wilcox @ 2024-05-22 17:13 UTC (permalink / raw) To: Linus Torvalds Cc: kernel test robot, Yafang Shao, oe-lkp, lkp, Al Viro, Christian Brauner, Jan Kara, Waiman Long, Wangkai, Colin Walters, linux-fsdevel, ying.huang, feng.tang, fengwei.yin, linux-mm, linux-kernel On Wed, May 22, 2024 at 09:00:03AM -0700, Linus Torvalds wrote: > Of course, if you do billions of lookups of different files that do > not exist in the same directory, I suspect you just have yourself to > blame, so the "lots of negative lookups" load doesn't sound > particularly realistic. Oh no. We have real customers that this hits and it's not even stupid. Every time an "event" happens, they look up something like a hash in three different directories (like $PATH or multiple -I flags to the compiler). Order is important, so they can't just look it up in the directory that it's most likely to exist in. It usually fails to exist in directory A and B, so we create dentries that say it doesn't. And those dentries are literally never referenced again. Then directory C has the file they're looking for (or it doesn't and it gets created because the process has write access to C and not A or B). plan9 handles this so much better because it has that union-mount stuff instead of search paths. So it creates one dentry that tells it which of those directories it actually exists in. But we're stuck with unix-style search paths, so life is pain. ^ permalink raw reply [flat|nested] 43+ messages in thread
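As a concrete illustration of the search-path pattern described above (the directory names and probing loop here are made up for this sketch, not taken from any customer workload): each miss in A and B generally leaves a negative dentry that will never be hit again, while only C ever produces a positive one.

/* Sketch of a $PATH-style search: the name is probed in order, so the
 * misses in "A" and "B" each leave a negative dentry cached there. */
#include <stdio.h>
#include <sys/stat.h>

static const char *search_dirs[] = { "A", "B", "C" };

static int find_in_path(const char *name, char *out, size_t len)
{
        struct stat st;

        for (size_t i = 0; i < sizeof(search_dirs) / sizeof(search_dirs[0]); i++) {
                snprintf(out, len, "%s/%s", search_dirs[i], name);
                if (stat(out, &st) == 0)
                        return 0;       /* hit: positive dentry in this directory */
                /* miss: the kernel caches a negative dentry for this name */
        }
        return -1;
}

int main(void)
{
        char path[512];

        if (find_in_path("some-hash-named-file", path, sizeof(path)) == 0)
                printf("found %s\n", path);
        return 0;
}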
* Re: [PATCH] vfs: Delete the associated dentry when deleting a file 2024-05-22 17:13 ` Matthew Wilcox @ 2024-05-22 18:11 ` Linus Torvalds 0 siblings, 0 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-22 18:11 UTC (permalink / raw) To: Matthew Wilcox Cc: kernel test robot, Yafang Shao, oe-lkp, lkp, Al Viro, Christian Brauner, Jan Kara, Waiman Long, Wangkai, Colin Walters, linux-fsdevel, ying.huang, feng.tang, fengwei.yin, linux-mm, linux-kernel On Wed, 22 May 2024 at 10:14, Matthew Wilcox <willy@infradead.org> wrote: > > On Wed, May 22, 2024 at 09:00:03AM -0700, Linus Torvalds wrote: > > Of course, if you do billions of lookups of different files that do > > not exist in the same directory, I suspect you just have yourself to > > blame, so the "lots of negative lookups" load doesn't sound > > particularly realistic. > > Oh no. We have real customers that this hits and it's not even stupid. Oh, negative dentries exist, and yes, they are a major feature on some loads. Exactly because of the kinds of situations you describe. In fact, that's the _point_. Things like PATH lookup require negative dentries for good performance, because otherwise all the missed cases would force a lookup all the way out to the filesystem. So having thousands of negative dentries is normal and expected. And it will grow for bigger machines and loads, of course. That said, I don't think we've had people on real loads complain about them being in the hundreds of millions like Yafang's case. > plan9 handles this so much better because it has that union-mount stuff > instead of search paths. So it creates one dentry that tells it which of > those directories it actually exists in. But we're stuck with unix-style > search paths, so life is pain. I suspect you are just not as aware of the downsides of the plan9 models. People tend to think plan9 was great. It had huge and fundamental design mistakes. Union mounts à la plan9 aren't hugely wonderful, and there's a reason overlayfs does things differently (not that that is hugely wonderful either). Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 2:53 ` Linus Torvalds 2024-05-11 3:35 ` Yafang Shao @ 2024-05-11 3:36 ` Al Viro 2024-05-11 3:51 ` Yafang Shao ` (2 more replies) 1 sibling, 3 replies; 43+ messages in thread From: Al Viro @ 2024-05-11 3:36 UTC (permalink / raw) To: Linus Torvalds Cc: Yafang Shao, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Fri, May 10, 2024 at 07:53:49PM -0700, Linus Torvalds wrote: > although I think Al needs to ACK this, and I suspect that unhashing > the dentry also makes that > > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > > pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT > just won't matter). > > I do worry that there are loads that actually love our current > behavior, but maybe it's worth doing the simple unconditional "make > d_delete() always unhash" and only worry about whether that causes > performance problems for people who commonly create a new file in its > place when we get such a report. > > IOW, the more complex thing might be to actually take other behavior > into account (eg "do we have so many negative dentries that we really > don't want to create new ones"). > > Al - can you please step in and tell us what else I've missed, and why > my suggested version of the patch is also broken garbage? Need to RTFS and think for a while; I think it should be OK, but I'll need to dig through the tree to tell if there's anything nasty for e.g. network filesystems. Said that, I seriously suspect that there are loads where it would become painful. unlink() + creat() is _not_ a rare sequence, and this would shove an extra negative lookup into each of those. I would like to see the details on original posters' setup. Note that successful rmdir() evicts all children, so that it would seem that their arseloads of negative dentries come from a bunch of unlinks in surviving directories. It would be interesting to see if using something like mkdir graveyard rename victim over there, then unlink in new place rename next victim over there, then unlink in new place .... rmdir graveyard would change the situation with memory pressure - it would trigger eviction of all those negatives at controlled point(s) (rmdir). I'm not saying that it's a good mitigation, but it would get more details on that memory pressure. ^ permalink raw reply [flat|nested] 43+ messages in thread
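A minimal sketch of the graveyard scheme described above, assuming the application can be changed to unlink via a scratch directory (all names here are illustrative): the negative dentries then pile up under the graveyard, and the final rmdir() evicts them all at a controlled point.

/* Sketch of the "graveyard" mitigation: rename each victim into a
 * scratch directory and unlink it there, so the negative dentries end
 * up under graveyard/ and are evicted by the final rmdir(). */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static int graveyard_unlink(const char *victim)
{
        char tmp[4096];

        snprintf(tmp, sizeof(tmp), "graveyard/%s", victim);
        if (rename(victim, tmp))        /* move the victim over there */
                return -1;
        return unlink(tmp);             /* unlink in the new place */
}

int main(void)
{
        mkdir("graveyard", 0700);

        graveyard_unlink("victim-1");
        graveyard_unlink("victim-2");
        /* ... */

        rmdir("graveyard");     /* evicts all children, negatives included */
        return 0;
}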
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 3:36 ` [RFC PATCH] fs: dcache: " Al Viro @ 2024-05-11 3:51 ` Yafang Shao 2024-05-11 5:18 ` Al Viro 2024-05-11 5:32 ` Linus Torvalds 2 siblings, 0 replies; 43+ messages in thread From: Yafang Shao @ 2024-05-11 3:51 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Sat, May 11, 2024 at 11:36 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Fri, May 10, 2024 at 07:53:49PM -0700, Linus Torvalds wrote: > > > although I think Al needs to ACK this, and I suspect that unhashing > > the dentry also makes that > > > > dentry->d_flags &= ~DCACHE_CANT_MOUNT; > > > > pointless (because the dentry won't be reused, so DCACHE_CANT_MOUNT > > just won't matter). > > > > I do worry that there are loads that actually love our current > > behavior, but maybe it's worth doing the simple unconditional "make > > d_delete() always unhash" and only worry about whether that causes > > performance problems for people who commonly create a new file in its > > place when we get such a report. > > > > IOW, the more complex thing might be to actually take other behavior > > into account (eg "do we have so many negative dentries that we really > > don't want to create new ones"). > > > > Al - can you please step in and tell us what else I've missed, and why > > my suggested version of the patch is also broken garbage? > > Need to RTFS and think for a while; I think it should be OK, but I'll need > to dig through the tree to tell if there's anything nasty for e.g. network > filesystems. > > Said that, I seriously suspect that there are loads where it would become > painful. unlink() + creat() is _not_ a rare sequence, and this would > shove an extra negative lookup into each of those. > > I would like to see the details on original posters' setup. Note that > successful rmdir() evicts all children, so that it would seem that their > arseloads of negative dentries come from a bunch of unlinks in surviving > directories. Right, the directories are fixed. We've engaged in discussions with ES users regarding changing the directory, but it would entail a significant adjustment for them. > > It would be interesting to see if using something like > mkdir graveyard > rename victim over there, then unlink in new place > rename next victim over there, then unlink in new place > .... > rmdir graveyard > would change the situation with memory pressure - it would trigger > eviction of all those negatives at controlled point(s) (rmdir). > I'm not saying that it's a good mitigation, but it would get more > details on that memory pressure. -- Regards Yafang ^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 3:36 ` [RFC PATCH] fs: dcache: " Al Viro 2024-05-11 3:51 ` Yafang Shao @ 2024-05-11 5:18 ` Al Viro 2024-05-11 5:32 ` Linus Torvalds 2 siblings, 0 replies; 43+ messages in thread From: Al Viro @ 2024-05-11 5:18 UTC (permalink / raw) To: Linus Torvalds Cc: Yafang Shao, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Sat, May 11, 2024 at 04:36:19AM +0100, Al Viro wrote: > Said that, I seriously suspect that there are loads where it would become > painful. unlink() + creat() is _not_ a rare sequence, and this would > shove an extra negative lookup into each of those. > > I would like to see the details on original posters' setup. Note that > successful rmdir() evicts all children, so that it would seem that their > arseloads of negative dentries come from a bunch of unlinks in surviving > directories. > > It would be interesting to see if using something like > mkdir graveyard > rename victim over there, then unlink in new place > rename next victim over there, then unlink in new place > .... > rmdir graveyard > would change the situation with memory pressure - it would trigger > eviction of all those negatives at controlled point(s) (rmdir). > I'm not saying that it's a good mitigation, but it would get more > details on that memory pressure. BTW, how about adding to do_vfs_ioctl() something like case FS_IOC_FORGET: shrink_dcache_parent(file->f_path.dentry); return 0; possibly with option for dropping only negatives? Even in the minimal form it would allow userland to force eviction in given directory tree, without disrupting things anywhere else. That's a trivial (completely untested) patch, really - diff --git a/fs/ioctl.c b/fs/ioctl.c index 1d5abfdf0f22..342bb71cf76c 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -878,6 +878,12 @@ static int do_vfs_ioctl(struct file *filp, unsigned int fd, case FS_IOC_GETFSSYSFSPATH: return ioctl_get_fs_sysfs_path(filp, argp); + case FS_IOC_FORGET: + if (arg != 0) // only 0 for now + break; + shrink_dcache_parent(filp->f_path.dentry); + return 0; + default: if (S_ISREG(inode->i_mode)) return file_ioctl(filp, cmd, argp); diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 45e4e64fd664..143129510289 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -222,6 +222,7 @@ struct fsxattr { #define FS_IOC_GETFLAGS _IOR('f', 1, long) #define FS_IOC_SETFLAGS _IOW('f', 2, long) +#define FS_IOC_FORGET _IOW('f', 255, int) #define FS_IOC_GETVERSION _IOR('v', 1, long) #define FS_IOC_SETVERSION _IOW('v', 2, long) #define FS_IOC_FIEMAP _IOWR('f', 11, struct fiemap) ^ permalink raw reply related [flat|nested] 43+ messages in thread
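If something along those lines were ever merged, using it from userspace would be trivial. A sketch of a hypothetical caller follows; note that FS_IOC_FORGET is only the draft ioctl proposed above and does not exist in any released kernel, so its definition is copied from the draft patch rather than from an installed header.

/* Hypothetical user of the FS_IOC_FORGET ioctl proposed above. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>
#include <unistd.h>

#ifndef FS_IOC_FORGET
#define FS_IOC_FORGET   _IOW('f', 255, int)     /* from the draft patch */
#endif

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <directory>\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDONLY | O_DIRECTORY);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* arg must be 0 in the draft; shrinks the dcache under the dir */
        if (ioctl(fd, FS_IOC_FORGET, 0))
                perror("ioctl(FS_IOC_FORGET)");

        close(fd);
        return 0;
}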
* Re: [RFC PATCH] fs: dcache: Delete the associated dentry when deleting a file 2024-05-11 3:36 ` [RFC PATCH] fs: dcache: " Al Viro 2024-05-11 3:51 ` Yafang Shao 2024-05-11 5:18 ` Al Viro @ 2024-05-11 5:32 ` Linus Torvalds 2 siblings, 0 replies; 43+ messages in thread From: Linus Torvalds @ 2024-05-11 5:32 UTC (permalink / raw) To: Al Viro Cc: Yafang Shao, brauner, jack, linux-fsdevel, Wangkai, Colin Walters, Waiman Long On Fri, 10 May 2024 at 20:36, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Said that, I seriously suspect that there are loads where it would become > painful. unlink() + creat() is _not_ a rare sequence, and this would > shove an extra negative lookup into each of those. The other side of this is that the "lots of negative dentries after people have removed files" is definitely something that has come up before as a latency problem. So I do wonder if we've just been too worried about not changing the status quo, and maybe the whole "make unlink() turn a positive dentry into a negative one" was always a mis-optimization. I do agree that we might have to do something more complicated, but I think that before we just _assume_ we have to do that, maybe we should just do the simple and stupid thing. Because while "unlink and create" is most definitely a very real pattern, maybe it's not really _so_ common that we should care about it as a primary issue? The reason to keep the negative dentry around is to avoid the unnecessary "->lookup()" followed by "->create()" directory operations. But network filesystems - where that is _particularly_ expensive - end up having done the whole ->atomic_open thing anyway, so maybe that reason isn't as big of a deal as it used to be? And I don't particularly like your "give people a flush ioctl" either. There are security concerns with that one - both for timing and for just basically messing with performance. At the very least, I think it should require write permissions on the directory you are flushing, but I really think we should instead aim to be better about this in the first place. Anyway, in how many loads is that "unlink -> create" pattern more than a very occasional thing? Which is why I think maybe we're just overthinking this and being too timid when saying "we should count negative dentries or something". Linus ^ permalink raw reply [flat|nested] 43+ messages in thread
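For reference, the sequence being weighed here is nothing more exotic than the snippet below (illustrative, not taken from any particular workload): with the old behaviour the unlink leaves a negative dentry behind, so the immediate re-create can avoid a ->lookup() call into the filesystem; with d_delete() unhashing unconditionally, that re-create pays for one extra lookup.

/* The unlink-then-recreate pattern under discussion. A cached negative
 * dentry lets the second open() skip a ->lookup() call into the
 * filesystem before ->create(); without it, the name must be looked up
 * again. */
#include <fcntl.h>
#include <unistd.h>

int replace_file(const char *path)
{
        unlink(path);   /* old behaviour: dentry stays cached as a negative one */
        return open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
}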