* [syzbot] [cgroups?] BUG: sleeping function called from invalid context in cgroup_rstat_flush
@ 2024-06-27 14:03 syzbot
2024-06-27 16:31 ` Johannes Weiner
2024-06-27 20:17 ` [PATCH] cachestat: do not flush stats in recency check Nhat Pham
0 siblings, 2 replies; 6+ messages in thread
From: syzbot @ 2024-06-27 14:03 UTC (permalink / raw)
To: cgroups, hannes, linux-kernel, lizefan.x, syzkaller-bugs, tj
Hello,
syzbot found the following issue on:
HEAD commit: 7c16f0a4ed1c Merge tag 'i2c-for-6.10-rc5' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1511528e980000
kernel config: https://syzkaller.appspot.com/x/.config?x=12f98862a3c0c799
dashboard link: https://syzkaller.appspot.com/bug?extid=b7f13b2d0cc156edf61a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/50560e9024e5/disk-7c16f0a4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/080c27daee72/vmlinux-7c16f0a4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/c528e0da4544/bzImage-7c16f0a4.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b7f13b2d0cc156edf61a@syzkaller.appspotmail.com
BUG: sleeping function called from invalid context at kernel/cgroup/rstat.c:351
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 17332, name: syz-executor.4
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
1 lock held by syz-executor.4/17332:
#0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
#0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
#0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: filemap_cachestat mm/filemap.c:4251 [inline]
#0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __do_sys_cachestat mm/filemap.c:4407 [inline]
#0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __se_sys_cachestat+0x3ee/0xbb0 mm/filemap.c:4372
CPU: 1 PID: 17332 Comm: syz-executor.4 Not tainted 6.10.0-rc4-syzkaller-00330-g7c16f0a4ed1c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
__might_resched+0x5d4/0x780 kernel/sched/core.c:10196
cgroup_rstat_flush+0x1e/0x50 kernel/cgroup/rstat.c:351
workingset_test_recent+0x48a/0xa90 mm/workingset.c:473
filemap_cachestat mm/filemap.c:4314 [inline]
__do_sys_cachestat mm/filemap.c:4407 [inline]
__se_sys_cachestat+0x795/0xbb0 mm/filemap.c:4372
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff329e7d0a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff32ac600c8 EFLAGS: 00000246 ORIG_RAX: 00000000000001c3
RAX: ffffffffffffffda RBX: 00007ff329fb4120 RCX: 00007ff329e7d0a9
RDX: 0000000020000080 RSI: 0000000020000040 RDI: 0000000000000005
RBP: 00007ff329eec074 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007ff329fb4120 R15: 00007ffd3e0ff4a8
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [cgroups?] BUG: sleeping function called from invalid context in cgroup_rstat_flush
From: Johannes Weiner @ 2024-06-27 16:31 UTC (permalink / raw)
To: syzbot
Cc: cgroups, linux-kernel, lizefan.x, syzkaller-bugs, tj, Nhat Pham

On Thu, Jun 27, 2024 at 07:03:21AM -0700, syzbot wrote:
> [...]
> BUG: sleeping function called from invalid context at kernel/cgroup/rstat.c:351
> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 17332, name: syz-executor.4
> preempt_count: 0, expected: 0
> RCU nest depth: 1, expected: 0
> 1 lock held by syz-executor.4/17332:
> #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
> #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
> #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: filemap_cachestat mm/filemap.c:4251 [inline]
> #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __do_sys_cachestat mm/filemap.c:4407 [inline]
> #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __se_sys_cachestat+0x3ee/0xbb0 mm/filemap.c:4372
> CPU: 1 PID: 17332 Comm: syz-executor.4 Not tainted 6.10.0-rc4-syzkaller-00330-g7c16f0a4ed1c #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> __might_resched+0x5d4/0x780 kernel/sched/core.c:10196
> cgroup_rstat_flush+0x1e/0x50 kernel/cgroup/rstat.c:351
> workingset_test_recent+0x48a/0xa90 mm/workingset.c:473
> filemap_cachestat mm/filemap.c:4314 [inline]
> __do_sys_cachestat mm/filemap.c:4407 [inline]
> __se_sys_cachestat+0x795/0xbb0 mm/filemap.c:4372
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f

Ok yeah, cachestat() holds the rcu read lock, so
workingset_test_recent() can't do a sleepable rstat flush.

I think the easiest fix would be to flush rstat from the root down
(NULL) in filemap_cachestat(), before the rcu section, and add a flag
to workingset_test_recent() to forego it. Nhat?
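To illustrate the diagnosis in this reply, here is a minimal userspace sketch in C. It is a model, not kernel code: model_ctx, model_rcu_read_lock(), model_rcu_read_unlock(), model_might_sleep(), model_flush(), model_test_recent() and model_cachestat_buggy() are hypothetical names made up for this illustration. The check only mimics the "RCU nest depth: 1, expected: 0" line of the report by counting every sleepable call made while the modelled depth is non-zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_ctx {
	int rcu_nest_depth;	/* mirrors "RCU nest depth" in the report */
	int splats;		/* how many times the modelled check fired */
};

static void model_rcu_read_lock(struct model_ctx *c)
{
	c->rcu_nest_depth++;
}

static void model_rcu_read_unlock(struct model_ctx *c)
{
	c->rcu_nest_depth--;
}

/* Stand-in for a might-sleep debug check: expected depth is 0. */
static void model_might_sleep(struct model_ctx *c)
{
	if (c->rcu_nest_depth != 0)
		c->splats++;
}

/* Stand-in for a stats flush that is allowed to sleep. */
static void model_flush(struct model_ctx *c)
{
	model_might_sleep(c);
}

/* Stand-in for a recency check that always flushes (pre-fix shape). */
static bool model_test_recent(struct model_ctx *c)
{
	model_flush(c);
	return true;
}

/* Stand-in for the cachestat walk: every entry is checked under "RCU". */
static int model_cachestat_buggy(struct model_ctx *c, int nr_entries)
{
	int recent = 0;

	model_rcu_read_lock(c);
	for (int i = 0; i < nr_entries; i++)
		if (model_test_recent(c))
			recent++;
	model_rcu_read_unlock(c);

	return recent;
}
```

In this model, walking three entries on a zeroed model_ctx trips the check three times, once per flush attempted inside the read-side section. The real debug check does more than bump a counter, so treat this only as a picture of the call ordering.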
* Re: [syzbot] [cgroups?] BUG: sleeping function called from invalid context in cgroup_rstat_flush
From: Nhat Pham @ 2024-06-27 19:09 UTC (permalink / raw)
To: Johannes Weiner
Cc: syzbot, cgroups, linux-kernel, lizefan.x, syzkaller-bugs, tj

On Thu, Jun 27, 2024 at 9:31 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Thu, Jun 27, 2024 at 07:03:21AM -0700, syzbot wrote:
> > [...]
>
> Ok yeah, cachestat() holds the rcu read lock, so
> workingset_test_recent() can't do a sleepable rstat flush.
>
> I think the easiest fix would be to flush rstat from the root down
> (NULL) in filemap_cachestat(), before the rcu section, and add a flag
> to workingset_test_recent() to forego it. Nhat?

You're right. I think it's been broken since this commit:

b00684722262 mm: workingset: move the stats flush into workingset_test_recent()

which moves the stats flushing from the refault step (before rcu read
lock section) to inside workingset_test_recent().

I believe that's 6.8, 6.9, and 6.10 we need to fix?

The fix sounds reasonable to me :) Let me whip up something real quick.
* [PATCH] cachestat: do not flush stats in recency check
From: Nhat Pham @ 2024-06-27 20:17 UTC (permalink / raw)
To: akpm
Cc: hannes, kernel-team, linux-mm, linux-kernel, stable, willy, david, ryan.roberts, ying.huang, viro, kasong, yosryahmed, shakeel.butt, linux-fsdevel

syzbot detects that cachestat() is flushing stats, which can sleep, in
its RCU read section (see [1]). This is done in the
workingset_test_recent() step (which checks if the folio's eviction is
recent).

Move the stat flushing step to before the RCU read section of cachestat,
and skip stat flushing during the recency check.

[1]: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/

Reported-by: syzbot+b7f13b2d0cc156edf61a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
Debugged-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Fixes: b00684722262 ("mm: workingset: move the stats flush into workingset_test_recent()")
Cc: stable@vger.kernel.org # v6.8+
---
 include/linux/swap.h |  3 ++-
 mm/filemap.c         |  5 ++++-
 mm/workingset.c      | 14 +++++++++++---
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index bd450023b9a4..e685e93ba354 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -354,7 +354,8 @@ static inline swp_entry_t page_swap_entry(struct page *page)
 }
 
 /* linux/mm/workingset.c */
-bool workingset_test_recent(void *shadow, bool file, bool *workingset);
+bool workingset_test_recent(void *shadow, bool file, bool *workingset,
+				bool flush);
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg);
 void workingset_refault(struct folio *folio, void *shadow);
diff --git a/mm/filemap.c b/mm/filemap.c
index fedefb10d947..298485d4b992 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4248,6 +4248,9 @@ static void filemap_cachestat(struct address_space *mapping,
 	XA_STATE(xas, &mapping->i_pages, first_index);
 	struct folio *folio;
 
+	/* Flush stats (and potentially sleep) outside the RCU read section. */
+	mem_cgroup_flush_stats_ratelimited(NULL);
+
 	rcu_read_lock();
 	xas_for_each(&xas, folio, last_index) {
 		int order;
@@ -4311,7 +4314,7 @@ static void filemap_cachestat(struct address_space *mapping,
 					goto resched;
 			}
 #endif
-			if (workingset_test_recent(shadow, true, &workingset))
+			if (workingset_test_recent(shadow, true, &workingset, false))
 				cs->nr_recently_evicted += nr_pages;
 
 			goto resched;
diff --git a/mm/workingset.c b/mm/workingset.c
index c22adb93622a..a2b28e356e68 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -412,10 +412,12 @@ void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
  * @file: whether the corresponding folio is from the file lru.
  * @workingset: where the workingset value unpacked from shadow should
  * be stored.
+ * @flush: whether to flush cgroup rstat.
  *
  * Return: true if the shadow is for a recently evicted folio; false otherwise.
  */
-bool workingset_test_recent(void *shadow, bool file, bool *workingset)
+bool workingset_test_recent(void *shadow, bool file, bool *workingset,
+				bool flush)
 {
 	struct mem_cgroup *eviction_memcg;
 	struct lruvec *eviction_lruvec;
@@ -467,10 +469,16 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset)
 
 	/*
 	 * Flush stats (and potentially sleep) outside the RCU read section.
+	 *
+	 * Note that workingset_test_recent() itself might be called in RCU read
+	 * section (for e.g, in cachestat) - these callers need to skip flushing
+	 * stats (via the flush argument).
+	 *
 	 * XXX: With per-memcg flushing and thresholding, is ratelimiting
 	 * still needed here?
 	 */
-	mem_cgroup_flush_stats_ratelimited(eviction_memcg);
+	if (flush)
+		mem_cgroup_flush_stats_ratelimited(eviction_memcg);
 
 	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
 	refault = atomic_long_read(&eviction_lruvec->nonresident_age);
@@ -558,7 +566,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
 
-	if (!workingset_test_recent(shadow, file, &workingset))
+	if (!workingset_test_recent(shadow, file, &workingset, true))
 		return;
 
 	folio_set_active(folio);

base-commit: a5c6fededf806aba1ff9b0f01278f7d089da5725
-- 
2.43.0
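The shape of the patch above (one flush hoisted in front of the read-side section, plus a flag so the recency check skips its own flush) can be sketched as a small userspace model in C. This is only an illustration under stated assumptions: fix_ctx, fix_flush(), fix_test_recent(), fix_cachestat() and fix_refault() are hypothetical stand-ins invented here, not the kernel functions, and the depth counter merely imitates the debug check from the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fix_ctx {
	int rcu_nest_depth;	/* modelled read-side nesting depth */
	int splats;		/* sleepable calls made while depth != 0 */
	int flushes;		/* total number of flushes performed */
};

/* Stand-in for a sleepable, rate-limited stats flush. */
static void fix_flush(struct fix_ctx *c)
{
	if (c->rcu_nest_depth != 0)
		c->splats++;
	c->flushes++;
}

/* Stand-in for the recency check with the new flush argument. */
static bool fix_test_recent(struct fix_ctx *c, bool flush)
{
	if (flush)
		fix_flush(c);
	return true;
}

/* Stand-in for the cachestat walk after the patch. */
static int fix_cachestat(struct fix_ctx *c, int nr_entries)
{
	int recent = 0;

	fix_flush(c);			/* once, before the read-side section */

	c->rcu_nest_depth++;		/* models rcu_read_lock() */
	for (int i = 0; i < nr_entries; i++)
		if (fix_test_recent(c, false))	/* caller opts out of flushing */
			recent++;
	c->rcu_nest_depth--;		/* models rcu_read_unlock() */

	return recent;
}

/* Stand-in for the refault path, which passes flush = true in the patch. */
static bool fix_refault(struct fix_ctx *c)
{
	return fix_test_recent(c, true);
}
```

In the model, fix_cachestat() performs exactly one flush regardless of how many entries it walks and never trips the check, while fix_refault() still flushes on every call. Whether a single up-front flush is fresh enough for a long walk is a trade-off the model does not capture.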
* Re: [PATCH] cachestat: do not flush stats in recency check
From: Johannes Weiner @ 2024-06-27 20:41 UTC (permalink / raw)
To: Nhat Pham
Cc: akpm, kernel-team, linux-mm, linux-kernel, stable, willy, david, ryan.roberts, ying.huang, viro, kasong, yosryahmed, shakeel.butt, linux-fsdevel

On Thu, Jun 27, 2024 at 01:17:37PM -0700, Nhat Pham wrote:
> syzbot detects that cachestat() is flushing stats, which can sleep, in
> its RCU read section (see [1]). This is done in the
> workingset_test_recent() step (which checks if the folio's eviction is
> recent).
>
> Move the stat flushing step to before the RCU read section of cachestat,
> and skip stat flushing during the recency check.
>
> [1]: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
>
> Reported-by: syzbot+b7f13b2d0cc156edf61a@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
> Debugged-by: Johannes Weiner <hannes@cmpxchg.org>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>
> Fixes: b00684722262 ("mm: workingset: move the stats flush into workingset_test_recent()")
> Cc: stable@vger.kernel.org # v6.8+

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* Re: [PATCH] cachestat: do not flush stats in recency check
From: Shakeel Butt @ 2024-06-28 1:58 UTC (permalink / raw)
To: Nhat Pham
Cc: akpm, hannes, kernel-team, linux-mm, linux-kernel, stable, willy, david, ryan.roberts, ying.huang, viro, kasong, yosryahmed, linux-fsdevel

On Thu, Jun 27, 2024 at 01:17:37PM GMT, Nhat Pham wrote:
> syzbot detects that cachestat() is flushing stats, which can sleep, in
> its RCU read section (see [1]). This is done in the
> workingset_test_recent() step (which checks if the folio's eviction is
> recent).
>
> Move the stat flushing step to before the RCU read section of cachestat,
> and skip stat flushing during the recency check.
>
> [1]: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
>
> Reported-by: syzbot+b7f13b2d0cc156edf61a@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/cgroups/000000000000f71227061bdf97e0@google.com/
> Debugged-by: Johannes Weiner <hannes@cmpxchg.org>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>
> Fixes: b00684722262 ("mm: workingset: move the stats flush into workingset_test_recent()")
> Cc: stable@vger.kernel.org # v6.8+

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>