* [LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted
@ 2024-10-18 7:13 Zizhi Wo via ltp
2024-10-18 11:52 ` Jan Kara
0 siblings, 1 reply; 3+ messages in thread
From: Zizhi Wo via ltp @ 2024-10-18 7:13 UTC (permalink / raw)
To: ltp, jack, amir73il, pvorel; +Cc: yangerkun
In commmit 6df425bb7040 ("fanotify10: Calling drop_cache twice to ensure
the inode is evicted"), the number of drop_cache operations was increased
to two in order to prevent inodes from being left uncleared due to inodes
and dentries being on different NUMA nodes, which would cause the testcase
to fail.
However, during our local testing, I found that this testcase still
occasionally fails:
fanotify10.c:771: TINFO: Test #27: don't ignore fs events created inside a parent with evicted ignore mark
fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
......
fanotify10.c:841: TPASS: group 0 (8) got 16 events: mask 20 pid=22962
fanotify10.c:841: TPASS: group 1 (8) got 16 events: mask 20 pid=22962
fanotify10.c:841: TPASS: group 2 (8) got 16 events: mask 20 pid=22962
fanotify10.c:836: TFAIL: group 0 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
fanotify10.c:836: TFAIL: group 1 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
fanotify10.c:836: TFAIL: group 2 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
......
In this testcase(Test #27), ignore_path is "fs_mnt/testdir", and event_path
is "fs_mnt/testdir/testfile". The purpose of ignore_path is to verify that
the ignore_mark does not pin the corresponding inode. If any inodes remain
after drop_cache, the test case will fail. Here, the ignore_path is located
in a two-level directory structure, and two drop_cache operations might
still not be enough.
Consider the scenario where the parent inode is on NUMA0, the parent dentry
is on NUMA1, the child inode is on NUMA2, and the child dentry is on NUMA3.
After the first drop_cache, the child dentry is cleared, its inode and
parent dentry are placed in the *corresponding* lru link list; after the
second drop_cache, the parent dentry and the child inode are cleared; only
after the third drop_cache would the parent inode be fully released. The
corresponding kernel flow is as follows:
drop_caches_sysctl_handler
drop_slab
// Traverse NUMA in order.
drop_slab_node
...
// Free dentry, child dentry pin parent dentry and its inode.
prune_dcache_sb
...
dentry_unlink_inode
...
// Place the inode into its corresponding NUMA lru link list.
list_lru_add
/*
* Free inode. If the NUMA where the inode resides is different from
* its dentry, the inode may not be released this time.
*/
prune_icache_sb
drop_cache1 drop_cache2 drop_cache3
NUMA0: parent inode exist exist free
NUMA1: parent dentry exist free free
NUMA2: child inode exist free free
NUMA3: child dentry free free free
Due to the release of the dependency chain, the drop_cache cleanup also
takes several times. Therefore, to be safe, three drop_cache operations are
needed to handle the two-level directory structure.
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
testcases/kernel/syscalls/fanotify/fanotify10.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/testcases/kernel/syscalls/fanotify/fanotify10.c b/testcases/kernel/syscalls/fanotify/fanotify10.c
index ce5469580..eedd1442f 100644
--- a/testcases/kernel/syscalls/fanotify/fanotify10.c
+++ b/testcases/kernel/syscalls/fanotify/fanotify10.c
@@ -515,7 +515,11 @@ static void drop_caches(void)
if (syncfs(fd_syncfs) < 0)
tst_brk(TBROK | TERRNO, "Unexpected error when syncing filesystem");
- /* Need to drop twice to ensure the inode is evicted. */
+ /*
+ * In order to ensure that the inode can be released in the two-tier
+ * directory structure, drop_cache is required three times.
+ */
+ SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
}
--
2.39.2
--
Mailing list info: https://lists.linux.it/listinfo/ltp
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted
2024-10-18 7:13 [LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted Zizhi Wo via ltp
@ 2024-10-18 11:52 ` Jan Kara
2024-10-18 17:18 ` Petr Vorel
0 siblings, 1 reply; 3+ messages in thread
From: Jan Kara @ 2024-10-18 11:52 UTC (permalink / raw)
To: Zizhi Wo; +Cc: yangerkun, jack, ltp
On Fri 18-10-24 15:13:53, Zizhi Wo wrote:
> In commmit 6df425bb7040 ("fanotify10: Calling drop_cache twice to ensure
> the inode is evicted"), the number of drop_cache operations was increased
> to two in order to prevent inodes from being left uncleared due to inodes
> and dentries being on different NUMA nodes, which would cause the testcase
> to fail.
>
> However, during our local testing, I found that this testcase still
> occasionally fails:
>
> fanotify10.c:771: TINFO: Test #27: don't ignore fs events created inside a parent with evicted ignore mark
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 0 ignore marks which is in expected range 0-0
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> fanotify10.c:510: TPASS: Found 1 ignore marks which is in expected range 0-8
> ......
> fanotify10.c:841: TPASS: group 0 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:841: TPASS: group 1 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:841: TPASS: group 2 (8) got 16 events: mask 20 pid=22962
> fanotify10.c:836: TFAIL: group 0 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> fanotify10.c:836: TFAIL: group 1 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> fanotify10.c:836: TFAIL: group 2 (4) with FAN_MARK_FILESYSTEM got unexpected number of events (15 != 16)
> ......
Sigh, I'm not sure we can ever make this work reliably.
> In this testcase(Test #27), ignore_path is "fs_mnt/testdir", and event_path
> is "fs_mnt/testdir/testfile". The purpose of ignore_path is to verify that
> the ignore_mark does not pin the corresponding inode. If any inodes remain
> after drop_cache, the test case will fail. Here, the ignore_path is located
> in a two-level directory structure, and two drop_cache operations might
> still not be enough.
>
> Consider the scenario where the parent inode is on NUMA0, the parent dentry
> is on NUMA1, the child inode is on NUMA2, and the child dentry is on NUMA3.
> After the first drop_cache, the child dentry is cleared, its inode and
> parent dentry are placed in the *corresponding* lru link list; after the
> second drop_cache, the parent dentry and the child inode are cleared; only
> after the third drop_cache would the parent inode be fully released. The
> corresponding kernel flow is as follows:
>
> drop_caches_sysctl_handler
> drop_slab
> // Traverse NUMA in order.
> drop_slab_node
> ...
> // Free dentry, child dentry pin parent dentry and its inode.
> prune_dcache_sb
> ...
> dentry_unlink_inode
> ...
> // Place the inode into its corresponding NUMA lru link list.
> list_lru_add
> /*
> * Free inode. If the NUMA where the inode resides is different from
> * its dentry, the inode may not be released this time.
> */
> prune_icache_sb
>
> drop_cache1 drop_cache2 drop_cache3
> NUMA0: parent inode exist exist free
> NUMA1: parent dentry exist free free
> NUMA2: child inode exist free free
> NUMA3: child dentry free free free
Well, this is right but there's also the while ((freed >> shift++) > 1)
loop in drop_slab() which should generally make us loop as long as there's
something to reclaim. But yes, if in theory the only thing we can reclaim
is the child dentry in the first round, then what you suggest may happen.
> Due to the release of the dependency chain, the drop_cache cleanup also
> takes several times. Therefore, to be safe, three drop_cache operations are
> needed to handle the two-level directory structure.
OK, I'm willing to give this one last try. If it doesn't work out, I'd just
drop these tests until we can find a more reliable way of testing this.
Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
> ---
> testcases/kernel/syscalls/fanotify/fanotify10.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/testcases/kernel/syscalls/fanotify/fanotify10.c b/testcases/kernel/syscalls/fanotify/fanotify10.c
> index ce5469580..eedd1442f 100644
> --- a/testcases/kernel/syscalls/fanotify/fanotify10.c
> +++ b/testcases/kernel/syscalls/fanotify/fanotify10.c
> @@ -515,7 +515,11 @@ static void drop_caches(void)
> if (syncfs(fd_syncfs) < 0)
> tst_brk(TBROK | TERRNO, "Unexpected error when syncing filesystem");
>
> - /* Need to drop twice to ensure the inode is evicted. */
> + /*
> + * In order to ensure that the inode can be released in the two-tier
> + * directory structure, drop_cache is required three times.
> + */
> + SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
> SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
> SAFE_FILE_PRINTF(DROP_CACHES_FILE, "3");
> }
> --
> 2.39.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
--
Mailing list info: https://lists.linux.it/listinfo/ltp
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted
2024-10-18 11:52 ` Jan Kara
@ 2024-10-18 17:18 ` Petr Vorel
0 siblings, 0 replies; 3+ messages in thread
From: Petr Vorel @ 2024-10-18 17:18 UTC (permalink / raw)
To: Jan Kara; +Cc: yangerkun, ltp
Hi Jan, Zizhi Wo,
...
> > NUMA0: parent inode exist exist free
> > NUMA1: parent dentry exist free free
> > NUMA2: child inode exist free free
> > NUMA3: child dentry free free free
> Well, this is right but there's also the while ((freed >> shift++) > 1)
> loop in drop_slab() which should generally make us loop as long as there's
> something to reclaim. But yes, if in theory the only thing we can reclaim
> is the child dentry in the first round, then what you suggest may happen.
> > Due to the release of the dependency chain, the drop_cache cleanup also
> > takes several times. Therefore, to be safe, three drop_cache operations are
> > needed to handle the two-level directory structure.
> OK, I'm willing to give this one last try. If it doesn't work out, I'd just
> drop these tests until we can find a more reliable way of testing this.
> Feel free to add:
> Reviewed-by: Jan Kara <jack@suse.cz>
Thanks you both for yet another fix and review, both is very much appreciated.
Hopefully this will finally work.
Kind regards,
Petr
> Honza
...
--
Mailing list info: https://lists.linux.it/listinfo/ltp
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-10-18 17:18 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-18 7:13 [LTP] [PATCH] fanotify10: Calling drop_cache three times to ensure the inode is evicted Zizhi Wo via ltp
2024-10-18 11:52 ` Jan Kara
2024-10-18 17:18 ` Petr Vorel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox