[syzbot] WARNING in p9_client_destroy

netdev.vger.kernel.org archive mirror

* [syzbot] WARNING in p9_client_destroy
@ 2022-02-28  0:53 syzbot
  2022-02-28  1:38 ` asmadeus
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: syzbot @ 2022-02-28  0:53 UTC (permalink / raw)
  To: asmadeus, davem, ericvh, kuba, linux-kernel, linux_oss, lucho,
	netdev, syzkaller-bugs, v9fs-developer

Hello,

syzbot found the following issue on:

HEAD commit:    23d04328444a Merge tag 'for-5.17/parisc-4' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1614a812700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f2a8c25b60d49d24
dashboard link: https://syzkaller.appspot.com/bug?extid=5e28cdb7ebd0f2389ca4
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com

kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502 kmem_cache_destroy mm/slab_common.c:502 [inline]
WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502 kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
Modules linked in:
CPU: 1 PID: 3701 Comm: syz-executor.3 Not tainted 5.17.0-rc5-syzkaller-00021-g23d04328444a #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:kmem_cache_destroy mm/slab_common.c:502 [inline]
RIP: 0010:kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
Code: da a8 0e 48 89 ee e8 44 6e 15 00 eb c1 c3 48 8b 55 58 48 c7 c6 60 cd b6 89 48 c7 c7 30 83 3a 8b 48 8b 4c 24 18 e8 9b 30 60 07 <0f> 0b eb a0 90 41 55 49 89 d5 41 54 49 89 f4 55 48 89 fd 53 48 83
RSP: 0018:ffffc90002767cf0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 1ffff920004ecfa5 RCX: 0000000000000000
RDX: ffff88801e56a280 RSI: ffffffff815f4b38 RDI: fffff520004ecf90
RBP: ffff888020ba8b00 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815ef1ce R11: 0000000000000000 R12: 0000000000000001
R13: ffffc90002767d68 R14: dffffc0000000000 R15: 0000000000000000
FS:  00005555561b0400(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555556ead708 CR3: 0000000068b97000 CR4: 0000000000150ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 p9_client_destroy+0x213/0x370 net/9p/client.c:1100
 v9fs_session_close+0x45/0x2d0 fs/9p/v9fs.c:504
 v9fs_kill_super+0x49/0x90 fs/9p/vfs_super.c:226
 deactivate_locked_super+0x94/0x160 fs/super.c:332
 deactivate_super+0xad/0xd0 fs/super.c:363
 cleanup_mnt+0x3a2/0x540 fs/namespace.c:1173
 task_work_run+0xdd/0x1a0 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:207
 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f5ff63ed4c7
Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff01862e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f5ff63ed4c7
RDX: 00007fff01862f6c RSI: 000000000000000a RDI: 00007fff01862f60
RBP: 00007fff01862f60 R08: 00000000ffffffff R09: 00007fff01862d30
R10: 00005555561b18b3 R11: 0000000000000246 R12: 00007f5ff64451ea
R13: 00007fff01864020 R14: 00005555561b1810 R15: 00007fff01864060
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


* Re: [syzbot] WARNING in p9_client_destroy
  2022-02-28  0:53 [syzbot] WARNING in p9_client_destroy syzbot
@ 2022-02-28  1:38 ` asmadeus
  2022-07-24  8:28 ` syzbot
  2022-07-24 13:17 ` syzbot
  2 siblings, 0 replies; 13+ messages in thread
From: asmadeus @ 2022-02-28  1:38 UTC (permalink / raw)
  To: syzbot
  Cc: davem, ericvh, kuba, linux-kernel, linux_oss, lucho, netdev,
	syzkaller-bugs, v9fs-developer

syzbot wrote on Sun, Feb 27, 2022 at 04:53:29PM -0800:
> kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when
> called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100

Hmm, there is no previous "Packet with tag %d has still references"
(sic) message, so this is probably because p9_tag_cleanup() only relies on
the RCU read lock for consistency: even if the connection has been closed
above (clnt->trans_mod->close), a request could have been sent
(= tag added) just before that and not yet be visible on the destroying
side?

I guess adding an rcu_barrier() is what makes most sense here to protect
this case?
I'll send a patch in the next few days unless it was a stupid idea.
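
A userspace sketch of the rcu_barrier() idea, assuming that a free can
still be pending when the cache is destroyed (the thread does not confirm
that this is what happens in 9p). Every toy_* name is made up for
illustration and none of this is kernel code; the queue only stands in
for callbacks that a barrier would wait for.

```c
/* Userspace model only: the toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_cache {
    int live;                          /* objects allocated and not yet freed */
};

struct toy_obj {
    struct toy_cache *cache;
    struct toy_obj *next_deferred;     /* link in the deferred-free queue */
};

static struct toy_obj *deferred_head;  /* frees queued but not yet run */

static struct toy_obj *toy_alloc(struct toy_cache *c)
{
    struct toy_obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->cache = c;
    o->next_deferred = NULL;
    c->live++;
    return o;
}

/* Deferred free: the object stays live until the queue is run. */
static void toy_free_deferred(struct toy_obj *o)
{
    o->next_deferred = deferred_head;
    deferred_head = o;
}

/* Barrier: run every queued free before returning. */
static void toy_barrier(void)
{
    while (deferred_head) {
        struct toy_obj *o = deferred_head;

        deferred_head = o->next_deferred;
        assert(o->cache->live > 0);
        o->cache->live--;
        free(o);
    }
}

/* Returns the number of objects still live at destroy time (0 is clean). */
static int toy_cache_destroy(struct toy_cache *c)
{
    return c->live;
}

/* Tear down one cache, with or without a barrier before the destroy. */
int toy_teardown(int use_barrier)
{
    struct toy_cache c = { 0 };
    struct toy_obj *o = toy_alloc(&c);
    int leftover;

    if (!o)
        return -1;
    toy_free_deferred(o);
    if (use_barrier)
        toy_barrier();
    leftover = toy_cache_destroy(&c);
    toy_barrier();                     /* keep the model itself leak-free */
    return leftover;
}
```

In this model toy_teardown(0) sees one object still live at destroy time
and toy_teardown(1) sees none. If the leftover objects are not waiting on
a deferred free at all, a barrier would not change the count.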
-- 
Dominique


* Re: [syzbot] WARNING in p9_client_destroy
  2022-02-28  0:53 [syzbot] WARNING in p9_client_destroy syzbot
  2022-02-28  1:38 ` asmadeus
@ 2022-07-24  8:28 ` syzbot
  2022-07-24 13:17 ` syzbot
  2 siblings, 0 replies; 13+ messages in thread
From: syzbot @ 2022-07-24  8:28 UTC (permalink / raw)
  To: asmadeus, davem, edumazet, ericvh, k.kahurani, kuba, linux-kernel,
	linux_oss, lucho, netdev, pabeni, syzkaller-bugs, v9fs-developer

syzbot has found a reproducer for the following issue on:

HEAD commit:    cb71b93c2dc3 Add linux-next specific files for 20220628
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=106a4022080000
kernel config:  https://syzkaller.appspot.com/x/.config?x=badbc1adb2d582eb
dashboard link: https://syzkaller.appspot.com/bug?extid=5e28cdb7ebd0f2389ca4
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=156f74ee080000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com

------------[ cut here ]------------
kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
WARNING: CPU: 0 PID: 3687 at mm/slab_common.c:505 kmem_cache_destroy mm/slab_common.c:505 [inline]
WARNING: CPU: 0 PID: 3687 at mm/slab_common.c:505 kmem_cache_destroy+0x138/0x140 mm/slab_common.c:493
Modules linked in:
CPU: 1 PID: 3687 Comm: syz-executor.0 Not tainted 5.19.0-rc4-next-20220628-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
RIP: 0010:kmem_cache_destroy mm/slab_common.c:505 [inline]
RIP: 0010:kmem_cache_destroy+0x138/0x140 mm/slab_common.c:493
Code: 95 18 00 48 89 ef e8 07 96 18 00 eb cc c3 48 8b 55 60 48 c7 c6 80 da d7 89 48 c7 c7 88 e8 61 8b 48 8b 4c 24 18 e8 f2 3a 86 07 <0f> 0b eb ab 0f 1f 40 00 41 56 41 89 d6 41 55 49 89 f5 41 54 49 89
RSP: 0018:ffffc900034efcf0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 1ffff9200069dfa5 RCX: 0000000000000000
RDX: ffff88807513ba80 RSI: ffffffff81610608 RDI: fffff5200069df90
RBP: ffff88801f0cc8c0 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffffc900034efd68 R14: dffffc0000000000 R15: 0000000000000000
FS:  0000555556019400(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe57b1fe718 CR3: 00000000728bc000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 p9_client_destroy+0x213/0x370 net/9p/client.c:1100
 v9fs_session_close+0x45/0x2d0 fs/9p/v9fs.c:504
 v9fs_kill_super+0x49/0x90 fs/9p/vfs_super.c:226
 deactivate_locked_super+0x94/0x160 fs/super.c:332
 deactivate_super+0xad/0xd0 fs/super.c:363
 cleanup_mnt+0x3a2/0x540 fs/namespace.c:1186
 task_work_run+0xdd/0x1a0 kernel/task_work.c:177
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:169 [inline]
 exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fe57ba8a677
Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff19aa4578 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fe57ba8a677
RDX: 00007fff19aa464c RSI: 000000000000000a RDI: 00007fff19aa4640
RBP: 00007fff19aa4640 R08: 00000000ffffffff R09: 00007fff19aa4410
R10: 000055555601a8b3 R11: 0000000000000246 R12: 00007fe57bae22a6
R13: 00007fff19aa5700 R14: 000055555601a810 R15: 00007fff19aa5740
 </TASK>



* Re: [syzbot] WARNING in p9_client_destroy
  2022-02-28  0:53 [syzbot] WARNING in p9_client_destroy syzbot
  2022-02-28  1:38 ` asmadeus
  2022-07-24  8:28 ` syzbot
@ 2022-07-24 13:17 ` syzbot
  2022-07-25 10:15   ` Vlastimil Babka
  2 siblings, 1 reply; 13+ messages in thread
From: syzbot @ 2022-07-24 13:17 UTC (permalink / raw)
  To: akpm, asmadeus, davem, edumazet, elver, ericvh, hdanton,
	k.kahurani, kuba, linux-kernel, linux_oss, lucho, netdev, pabeni,
	rientjes, syzkaller-bugs, torvalds, v9fs-developer, vbabka

syzbot has bisected this issue to:

commit 7302e91f39a81a9c2efcf4bc5749d18128366945
Author: Marco Elver <elver@google.com>
Date:   Fri Jan 14 22:03:58 2022 +0000

    mm/slab_common: use WARN() if cache still has objects on destroy

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=142882ce080000
start commit:   cb71b93c2dc3 Add linux-next specific files for 20220628
git tree:       linux-next
final oops:     https://syzkaller.appspot.com/x/report.txt?x=162882ce080000
console output: https://syzkaller.appspot.com/x/log.txt?x=122882ce080000
kernel config:  https://syzkaller.appspot.com/x/.config?x=badbc1adb2d582eb
dashboard link: https://syzkaller.appspot.com/bug?extid=5e28cdb7ebd0f2389ca4
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=156f74ee080000

Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com
Fixes: 7302e91f39a8 ("mm/slab_common: use WARN() if cache still has objects on destroy")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: [syzbot] WARNING in p9_client_destroy
  2022-07-24 13:17 ` syzbot
@ 2022-07-25 10:15   ` Vlastimil Babka
  2022-07-25 11:50     ` asmadeus
  0 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2022-07-25 10:15 UTC (permalink / raw)
  To: syzbot, akpm, asmadeus, davem, edumazet, elver, ericvh, hdanton,
	k.kahurani, kuba, linux-kernel, linux_oss, lucho, netdev, pabeni,
	rientjes, syzkaller-bugs, torvalds, v9fs-developer

On 7/24/22 15:17, syzbot wrote:
> syzbot has bisected this issue to:
> 
> commit 7302e91f39a81a9c2efcf4bc5749d18128366945
> Author: Marco Elver <elver@google.com>
> Date:   Fri Jan 14 22:03:58 2022 +0000
> 
>     mm/slab_common: use WARN() if cache still has objects on destroy

Just to state the obvious, bisection pointed to a commit that added the
warning, but the reason for the warning would be that p9 is destroying a
kmem_cache without freeing all the objects there first, and that would be
true even before the commit.

> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=142882ce080000
> start commit:   cb71b93c2dc3 Add linux-next specific files for 20220628
> git tree:       linux-next
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=162882ce080000
> console output: https://syzkaller.appspot.com/x/log.txt?x=122882ce080000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=badbc1adb2d582eb
> dashboard link: https://syzkaller.appspot.com/bug?extid=5e28cdb7ebd0f2389ca4
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=156f74ee080000
> 
> Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com
> Fixes: 7302e91f39a8 ("mm/slab_common: use WARN() if cache still has objects on destroy")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection



* Re: [syzbot] WARNING in p9_client_destroy
  2022-07-25 10:15   ` Vlastimil Babka
@ 2022-07-25 11:50     ` asmadeus
  2022-07-25 12:45       ` Dmitry Vyukov
  0 siblings, 1 reply; 13+ messages in thread
From: asmadeus @ 2022-07-25 11:50 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: syzbot, akpm, davem, edumazet, elver, ericvh, hdanton, k.kahurani,
	kuba, linux-kernel, linux_oss, lucho, netdev, pabeni, rientjes,
	syzkaller-bugs, torvalds, v9fs-developer

Vlastimil Babka wrote on Mon, Jul 25, 2022 at 12:15:24PM +0200:
> On 7/24/22 15:17, syzbot wrote:
> > syzbot has bisected this issue to:
> > 
> > commit 7302e91f39a81a9c2efcf4bc5749d18128366945
> > Author: Marco Elver <elver@google.com>
> > Date:   Fri Jan 14 22:03:58 2022 +0000
> > 
> >     mm/slab_common: use WARN() if cache still has objects on destroy
> 
> Just to state the obvious, bisection pointed to a commit that added the
> warning, but the reason for the warning would be that p9 is destroying a
> kmem_cache without freeing all the objects there first, and that would be
> true even before the commit.

Probably true from the moment that cache/idr was introduced... I've got
a couple of fixes in next, but given that syzkaller claims that's the
tree it was produced on, I guess there can be more such leaks.
(Well, the lines it sent in the backtrace yesterday don't match next,
but I wouldn't count on it.)

If someone wants to have a look, please feel free. I would bet the
problem is just that p9_fd_close() doesn't call p9_conn_cancel() or do
anything equivalent, so some requests simply haven't been sent yet when
the mount is closed.
But I can't and don't want to take the time to check right now, as I
consider such a leak harmless enough: someone has to be root or
equivalent to do 9p mounts in most cases.
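
The guess above can be pictured with a userspace toy, assuming a
connection that keeps a list of queued-but-unsent requests. The toy_*
names are hypothetical and this is not the trans_fd code; it only shows
why closing without a cancel step would leave requests allocated.

```c
/* Userspace model only: the toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_req {
    struct toy_req *next;
};

struct toy_conn {
    struct toy_req *unsent;   /* requests queued but not yet written out */
    int live_reqs;            /* requests allocated and not yet released */
};

/* Queue one request on the connection; returns 0 on success. */
int toy_queue_request(struct toy_conn *c)
{
    struct toy_req *r = malloc(sizeof(*r));

    if (!r)
        return -1;
    r->next = c->unsent;
    c->unsent = r;
    c->live_reqs++;
    return 0;
}

/* Cancel step: drop every unsent request and release it. */
void toy_conn_cancel(struct toy_conn *c)
{
    while (c->unsent) {
        struct toy_req *r = c->unsent;

        c->unsent = r->next;
        assert(c->live_reqs > 0);
        c->live_reqs--;
        free(r);
    }
}

/* Close the connection; returns how many requests are still live. */
int toy_conn_close(struct toy_conn *c, int cancel_pending)
{
    if (cancel_pending)
        toy_conn_cancel(c);
    return c->live_reqs;
}
```

With two requests queued, toy_conn_close(&conn, 0) reports two live
requests and toy_conn_close(&conn, 1) reports none. Whether the real
close path misses such a step is exactly what nobody has checked yet.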

-- 
Dominique


* Re: [syzbot] WARNING in p9_client_destroy
  2022-07-25 11:50     ` asmadeus
@ 2022-07-25 12:45       ` Dmitry Vyukov
  2022-07-26 12:09         ` Christian Schoenebeck
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Vyukov @ 2022-07-25 12:45 UTC (permalink / raw)
  To: asmadeus
  Cc: Vlastimil Babka, syzbot, akpm, davem, edumazet, elver, ericvh,
	hdanton, k.kahurani, kuba, linux-kernel, linux_oss, lucho, netdev,
	pabeni, rientjes, syzkaller-bugs, torvalds, v9fs-developer

On Mon, 25 Jul 2022 at 13:51, <asmadeus@codewreck.org> wrote:
>
> Vlastimil Babka wrote on Mon, Jul 25, 2022 at 12:15:24PM +0200:
> > On 7/24/22 15:17, syzbot wrote:
> > > syzbot has bisected this issue to:
> > >
> > > commit 7302e91f39a81a9c2efcf4bc5749d18128366945
> > > Author: Marco Elver <elver@google.com>
> > > Date:   Fri Jan 14 22:03:58 2022 +0000
> > >
> > >     mm/slab_common: use WARN() if cache still has objects on destroy
> >
> > Just to state the obvious, bisection pointed to a commit that added the
> > warning, but the reason for the warning would be that p9 is destroying a
> > kmem_cache without freeing all the objects there first, and that would be
> > true even before the commit.
>
> Probably true from the moment that cache/idr was introduced... I've got
> a couple of fixes in next but given syzkaller claims that's the tree it
> was produced on I guess there can be more such leaks.
> (well, the lines it sent in the backtrace yesterday don't match next,
> but I wouldn't count on it)
>
> If someone wants to have a look please feel free, I would bet the
> problem is just that p9_fd_close() doesn't call or does something
> equivalent to p9_conn_cancel() and there just are some requests that
> haven't been sent yet when the mount is closed..
> But I don't have/can/want to take the time to check right now as I
> consider such a leak harmless enough, someone has to be root or
> equivalent to do 9p mounts in most cases.

FWIW with KASAN we have allocation stacks for each heap object. So
when KASAN is enabled that warning could list all live object
allocation stacks.
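
A rough userspace sketch of what such a report could look like, assuming
each object remembers where it was allocated. It records a single
file:line string per object rather than a full stack, and the toy_*
names are made up for illustration; this is not how KASAN stores its
metadata.

```c
/* Userspace model only: the toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_MAX_OBJS 16
#define TOY_STR2(x) #x
#define TOY_STR(x) TOY_STR2(x)

struct toy_slot {
    void *ptr;            /* NULL when the slot is unused */
    const char *site;     /* "file:line" of the allocation */
};

struct toy_cache {
    struct toy_slot slots[TOY_MAX_OBJS];
};

/* Allocate an object and remember the caller-supplied site string. */
void *toy_alloc_at(struct toy_cache *c, size_t size, const char *site)
{
    for (int i = 0; i < TOY_MAX_OBJS; i++) {
        if (!c->slots[i].ptr) {
            void *p = malloc(size);

            if (!p)
                return NULL;
            c->slots[i].ptr = p;
            c->slots[i].site = site;
            return p;
        }
    }
    return NULL;          /* tracking table is full */
}

/* Convenience wrapper that captures the call site automatically. */
#define toy_alloc(c, size) \
    toy_alloc_at((c), (size), __FILE__ ":" TOY_STR(__LINE__))

/* Free an object that was handed out by this cache. */
void toy_free(struct toy_cache *c, void *p)
{
    if (!p)
        return;
    for (int i = 0; i < TOY_MAX_OBJS; i++) {
        if (c->slots[i].ptr == p) {
            free(p);
            c->slots[i].ptr = NULL;
            return;
        }
    }
    assert(!"toy_free: pointer does not belong to this cache");
}

/* Print the site of every live object, release it, return the count. */
int toy_cache_destroy(struct toy_cache *c, FILE *out)
{
    int live = 0;

    for (int i = 0; i < TOY_MAX_OBJS; i++) {
        if (c->slots[i].ptr) {
            fprintf(out, "leaked object allocated at %s\n",
                    c->slots[i].site);
            free(c->slots[i].ptr);
            c->slots[i].ptr = NULL;
            live++;
        }
    }
    return live;
}
```

Allocating two objects with toy_alloc(), freeing one, and then calling
toy_cache_destroy(&cache, stderr) prints one line naming the remaining
allocation site and returns 1.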


* Re: [syzbot] WARNING in p9_client_destroy
  2022-07-25 12:45       ` Dmitry Vyukov
@ 2022-07-26 12:09         ` Christian Schoenebeck
  2022-07-29 12:31           ` Dmitry Vyukov
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Schoenebeck @ 2022-07-26 12:09 UTC (permalink / raw)
  To: asmadeus, Dmitry Vyukov
  Cc: Vlastimil Babka, syzbot, akpm, davem, edumazet, elver, ericvh,
	hdanton, k.kahurani, kuba, linux-kernel, lucho, netdev, pabeni,
	rientjes, syzkaller-bugs, torvalds, v9fs-developer

On Monday, 25 July 2022 14:45:08 CEST Dmitry Vyukov wrote:
> On Mon, 25 Jul 2022 at 13:51, <asmadeus@codewreck.org> wrote:
> > Vlastimil Babka wrote on Mon, Jul 25, 2022 at 12:15:24PM +0200:
> > > On 7/24/22 15:17, syzbot wrote:
> > > > syzbot has bisected this issue to:
> > > > 
> > > > commit 7302e91f39a81a9c2efcf4bc5749d18128366945
> > > > Author: Marco Elver <elver@google.com>
> > > > Date:   Fri Jan 14 22:03:58 2022 +0000
> > > > 
> > > >     mm/slab_common: use WARN() if cache still has objects on destroy
> > > 
> > > Just to state the obvious, bisection pointed to a commit that added the
> > > warning, but the reason for the warning would be that p9 is destroying a
> > > kmem_cache without freeing all the objects there first, and that would
> > > be
> > > true even before the commit.
> > 
> > Probably true from the moment that cache/idr was introduced... I've got
> > a couple of fixes in next but given syzkaller claims that's the tree it
> > was produced on I guess there can be more such leaks.
> > (well, the lines it sent in the backtrace yesterday don't match next,
> > but I wouldn't count on it)
> > 
> > If someone wants to have a look please feel free, I would bet the
> > problem is just that p9_fd_close() doesn't call or does something
> > equivalent to p9_conn_cancel() and there just are some requests that
> > haven't been sent yet when the mount is closed..
> > But I don't have/can/want to take the time to check right now as I
> > consider such a leak harmless enough, someone has to be root or
> > equivalent to do 9p mounts in most cases.
> 
> FWIW with KASAN we have allocation stacks for each heap object. So
> when KASAN is enabled that warning could list all live object
> allocation stacks.

By allocation stack, do you mean the backtrace/call stack at the point in
time when the memory was originally acquired?

If so, then sure: if someone has a chance to post those backtraces, that
would help us take a closer look at where this leak might happen.
Otherwise I fear it will end up among the other "lack of priority" issues.

Best regards,
Christian Schoenebeck




* Re: [syzbot] WARNING in p9_client_destroy
  2022-07-26 12:09         ` Christian Schoenebeck
@ 2022-07-29 12:31           ` Dmitry Vyukov
  0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Vyukov @ 2022-07-29 12:31 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: asmadeus, Vlastimil Babka, syzbot, akpm, davem, edumazet, elver,
	ericvh, hdanton, k.kahurani, kuba, linux-kernel, lucho, netdev,
	pabeni, rientjes, syzkaller-bugs, torvalds, v9fs-developer

On Tue, 26 Jul 2022 at 14:10, Christian Schoenebeck
<linux_oss@crudebyte.com> wrote:
>
> On Monday, 25 July 2022 14:45:08 CEST Dmitry Vyukov wrote:
> > On Mon, 25 Jul 2022 at 13:51, <asmadeus@codewreck.org> wrote:
> > > Vlastimil Babka wrote on Mon, Jul 25, 2022 at 12:15:24PM +0200:
> > > > On 7/24/22 15:17, syzbot wrote:
> > > > > syzbot has bisected this issue to:
> > > > >
> > > > > commit 7302e91f39a81a9c2efcf4bc5749d18128366945
> > > > > Author: Marco Elver <elver@google.com>
> > > > > Date:   Fri Jan 14 22:03:58 2022 +0000
> > > > >
> > > > >     mm/slab_common: use WARN() if cache still has objects on destroy
> > > >
> > > > Just to state the obvious, bisection pointed to a commit that added the
> > > > warning, but the reason for the warning would be that p9 is destroying a
> > > > kmem_cache without freeing all the objects there first, and that would
> > > > be
> > > > true even before the commit.
> > >
> > > Probably true from the moment that cache/idr was introduced... I've got
> > > a couple of fixes in next but given syzkaller claims that's the tree it
> > > was produced on I guess there can be more such leaks.
> > > (well, the lines it sent in the backtrace yesterday don't match next,
> > > but I wouldn't count on it)
> > >
> > > If someone wants to have a look please feel free, I would bet the
> > > problem is just that p9_fd_close() doesn't call or does something
> > > equivalent to p9_conn_cancel() and there just are some requests that
> > > haven't been sent yet when the mount is closed..
> > > But I don't have/can/want to take the time to check right now as I
> > > consider such a leak harmless enough, someone has to be root or
> > > equivalent to do 9p mounts in most cases.
> >
> > FWIW with KASAN we have allocation stacks for each heap object. So
> > when KASAN is enabled that warning could list all live object
> > allocation stacks.
>
> With allocation stack you mean the backtrace/call stack at the point in time
> when the memory originally was acquired?
>
> If the answer is yes, then sure, if someone had a chance to post those
> backtraces, then that would help us to take a closer look at where this leak
> might happen. Otherwise I fear it will end up among those other "lack of
> priority" issues.

Yes, I meant providing allocation stacks for leaked objects.
Filed https://bugzilla.kernel.org/show_bug.cgi?id=216306 for this feature.


[parent not found: <CAAZOf26g-L2nSV-Siw6mwWQv1nv6on8c0fWqB4bKmX73QAFzow@mail.gmail.com>]

* Re: [syzbot] WARNING in p9_client_destroy
       [not found] <CAAZOf26g-L2nSV-Siw6mwWQv1nv6on8c0fWqB4bKmX73QAFzow@mail.gmail.com>
@ 2022-03-26 11:46 ` David Kahurani
  2022-03-26 11:48 ` Christian Schoenebeck
  1 sibling, 0 replies; 13+ messages in thread
From: David Kahurani @ 2022-03-26 11:46 UTC (permalink / raw)
  To: davem, ericvh, kuba, linux-kernel, linux_oss, lucho, netdev,
	syzkaller-bugs, v9fs-developer, syzbot+5e28cdb7ebd0f2389ca4

Sorry, got to resend this in plain text. It doesn't look like it is
getting through to the mailing lists.

On Thu, Mar 24, 2022 at 3:13 PM David Kahurani <k.kahurani@gmail.com> wrote:
>
> On Monday, February 28, 2022 at 4:38:57 AM UTC+3 asmadeus@codewreck.org wrote:
>>
>> syzbot wrote on Sun, Feb 27, 2022 at 04:53:29PM -0800:
>> > kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when
>> > called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
>>
>> hmm, there is no previous "Packet with tag %d has still references"
>> (sic) message, so this is probably because p9_tag_cleanup only relies on
>> rcu read lock for consistency, so even if the connection has been closed
>> above (clnt->trans_mod->close) there could have been a request sent
>> (= tag added) just before that which isn't visible on the destroying
>> side?
>>
>> I guess adding an rcu_barrier() is what makes most sense here to protect
>> this case?
>> I'll send a patch in the next few days unless it was a stupid idea.
>
>
> Looking at this brought me to the same conclusion.
>
> ---------------------
>
> From cd5a11207a140004bf55005fac7f7e4cec2fd075 Mon Sep 17 00:00:00 2001
> From: David Kahurani <k.kahurani@gmail.com>
> Date: Thu, 24 Mar 2022 15:00:23 +0300
> Subject: [PATCH] net/9p: Flush any delayed rcu free
>
> As is best practice
>
> kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
> WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502 kmem_cache_destroy mm/slab_common.c:502 [inline]
> WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502 kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
> Modules linked in:
> CPU: 1 PID: 3701 Comm: syz-executor.3 Not tainted 5.17.0-rc5-syzkaller-00021-g23d04328444a #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> RIP: 0010:kmem_cache_destroy mm/slab_common.c:502 [inline]
> RIP: 0010:kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
> Code: da a8 0e 48 89 ee e8 44 6e 15 00 eb c1 c3 48 8b 55 58 48 c7 c6 60 cd b6 89 48 c7 c7 30 83 3a 8b 48 8b 4c 24 18 e8 9b 30 60 07 <0f> 0b eb a0 90 41 55 49 89 d5 41 54 49 89 f4 55 48 89 fd 53 48 83
> RSP: 0018:ffffc90002767cf0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 1ffff920004ecfa5 RCX: 0000000000000000
> RDX: ffff88801e56a280 RSI: ffffffff815f4b38 RDI: fffff520004ecf90
> RBP: ffff888020ba8b00 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffffff815ef1ce R11: 0000000000000000 R12: 0000000000000001
> R13: ffffc90002767d68 R14: dffffc0000000000 R15: 0000000000000000
> FS:  00005555561b0400(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555556ead708 CR3: 0000000068b97000 CR4: 0000000000150ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  p9_client_destroy+0x213/0x370 net/9p/client.c:1100
>  v9fs_session_close+0x45/0x2d0 fs/9p/v9fs.c:504
>  v9fs_kill_super+0x49/0x90 fs/9p/vfs_super.c:226
>  deactivate_locked_super+0x94/0x160 fs/super.c:332
>  deactivate_super+0xad/0xd0 fs/super.c:363
>  cleanup_mnt+0x3a2/0x540 fs/namespace.c:1173
>  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
>  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
>  exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:207
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
>  syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
>  do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f5ff63ed4c7
> Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fff01862e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f5ff63ed4c7
> RDX: 00007fff01862f6c RSI: 000000000000000a RDI: 00007fff01862f60
> RBP: 00007fff01862f60 R08: 00000000ffffffff R09: 00007fff01862d30
> R10: 00005555561b18b3 R11: 0000000000000246 R12: 00007f5ff64451ea
> R13: 00007fff01864020 R14: 00005555561b1810 R15: 00007fff01864060
>  </TASK>
>
> Signed-off-by: David Kahurani <k.kahurani@gmail.com>
> Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com
> ---
>  net/9p/client.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/9p/client.c b/net/9p/client.c
> index 8bba0d9cf..67c51913a 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -1097,6 +1097,7 @@ void p9_client_destroy(struct p9_client *clnt)
>
>   p9_tag_cleanup(clnt);
>
> + rcu_barrier();
>   kmem_cache_destroy(clnt->fcall_cache);
>   kfree(clnt);
>  }
> --
> 2.25.1
>
>


* Re: [syzbot] WARNING in p9_client_destroy
       [not found] <CAAZOf26g-L2nSV-Siw6mwWQv1nv6on8c0fWqB4bKmX73QAFzow@mail.gmail.com>
  2022-03-26 11:46 ` David Kahurani
@ 2022-03-26 11:48 ` Christian Schoenebeck
  2022-03-26 12:24   ` asmadeus
  1 sibling, 1 reply; 13+ messages in thread
From: Christian Schoenebeck @ 2022-03-26 11:48 UTC (permalink / raw)
  To: David Kahurani
  Cc: davem, ericvh, kuba, linux-kernel, lucho, netdev, syzkaller-bugs,
	v9fs-developer, syzbot+5e28cdb7ebd0f2389ca4, asmadeus

On Thursday, 24 March 2022 13:13:25 CET David Kahurani wrote:
> On Monday, February 28, 2022 at 4:38:57 AM UTC+3 asmadeus@codewreck.org wrote:
> > syzbot wrote on Sun, Feb 27, 2022 at 04:53:29PM -0800:
> > > kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when
> > > called from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
> > 
> > hmm, there is no previous "Packet with tag %d has still references"
> > (sic) message, so this is probably because p9_tag_cleanup only relies on
> > rcu read lock for consistency, so even if the connection has been closed
> > above (clnt->trans_mod->close) there could have been a request sent
> > (= tag added) just before that which isn't visible on the destroying
> > side?
> > 
> > I guess adding an rcu_barrier() is what makes most sense here to protect
> > this case?
> > I'll send a patch in the next few days unless it was a stupid idea.
> 
> Looking at this brought me to the same conclusion.
> 
> ---------------------
> 
> From cd5a11207a140004bf55005fac7f7e4cec2fd075 Mon Sep 17 00:00:00 2001
> From: David Kahurani <k.kahurani@gmail.com>
> Date: Thu, 24 Mar 2022 15:00:23 +0300
> Subject: [PATCH] net/9p: Flush any delayed rcu free
> 
> As is best practice
> 
> kmem_cache_destroy 9p-fcall-cache: Slab cache still has objects when called
> from p9_client_destroy+0x213/0x370 net/9p/client.c:1100
> WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502 kmem_cache_destroy
> mm/slab_common.c:502 [inline]
> WARNING: CPU: 1 PID: 3701 at mm/slab_common.c:502
> kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
> Modules linked in:
> CPU: 1 PID: 3701 Comm: syz-executor.3 Not tainted
> 5.17.0-rc5-syzkaller-00021-g23d04328444a #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> RIP: 0010:kmem_cache_destroy mm/slab_common.c:502 [inline]
> RIP: 0010:kmem_cache_destroy+0x13b/0x140 mm/slab_common.c:490
> Code: da a8 0e 48 89 ee e8 44 6e 15 00 eb c1 c3 48 8b 55 58 48 c7 c6 60 cd
> b6 89 48 c7 c7 30 83 3a 8b 48 8b 4c 24 18 e8 9b 30 60 07 <0f> 0b eb a0 90
> 41 55 49 89 d5 41 54 49 89 f4 55 48 89 fd 53 48 83
> RSP: 0018:ffffc90002767cf0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 1ffff920004ecfa5 RCX: 0000000000000000
> RDX: ffff88801e56a280 RSI: ffffffff815f4b38 RDI: fffff520004ecf90
> RBP: ffff888020ba8b00 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffffff815ef1ce R11: 0000000000000000 R12: 0000000000000001
> R13: ffffc90002767d68 R14: dffffc0000000000 R15: 0000000000000000
> FS:  00005555561b0400(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555556ead708 CR3: 0000000068b97000 CR4: 0000000000150ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  p9_client_destroy+0x213/0x370 net/9p/client.c:1100
>  v9fs_session_close+0x45/0x2d0 fs/9p/v9fs.c:504
>  v9fs_kill_super+0x49/0x90 fs/9p/vfs_super.c:226
>  deactivate_locked_super+0x94/0x160 fs/super.c:332
>  deactivate_super+0xad/0xd0 fs/super.c:363
>  cleanup_mnt+0x3a2/0x540 fs/namespace.c:1173
>  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
>  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
>  exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:207
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
>  syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
>  do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f5ff63ed4c7
> Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fff01862e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f5ff63ed4c7
> RDX: 00007fff01862f6c RSI: 000000000000000a RDI: 00007fff01862f60
> RBP: 00007fff01862f60 R08: 00000000ffffffff R09: 00007fff01862d30
> R10: 00005555561b18b3 R11: 0000000000000246 R12: 00007f5ff64451ea
> R13: 00007fff01864020 R14: 00005555561b1810 R15: 00007fff01864060
>  </TASK>
> 
> Signed-off-by: David Kahurani <k.kahurani@gmail.com>
> Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com

I'm not absolutely sure that this will really fix this issue, but it seems to
be a good idea to add an rcu_barrier() call here nevertheless.

Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>

> ---
>  net/9p/client.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/9p/client.c b/net/9p/client.c
> index 8bba0d9cf..67c51913a 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -1097,6 +1097,7 @@ void p9_client_destroy(struct p9_client *clnt)
> 
>   p9_tag_cleanup(clnt);
> 
> + rcu_barrier();
>   kmem_cache_destroy(clnt->fcall_cache);
>   kfree(clnt);
>  }
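
To illustrate why a barrier before destroying the cache could matter, here is
a minimal userspace sketch. It is only an analogy and makes no claim about
what net/9p actually does: toy_cache, toy_free_deferred(), toy_barrier() and
toy_cache_destroy() are hypothetical stand-ins for a slab cache, a call_rcu()
style deferred free, rcu_barrier() and kmem_cache_destroy(). The assumption,
which this thread has not confirmed, is that some objects are still waiting
in a deferred-free queue when the cache is torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a slab cache: it only counts live objects. */
struct toy_cache {
	size_t live;
};

/* Toy stand-in for a queued deferred-free callback. */
struct deferred_free {
	struct toy_cache *cache;
	void *obj;
	struct deferred_free *next;
};

/* Frees that have been requested but have not run yet. */
static struct deferred_free *pending;

/* Allocate one object and account for it in the cache. */
static void *toy_alloc(struct toy_cache *c)
{
	void *obj = malloc(16);

	if (obj)
		c->live++;
	return obj;
}

/* Queue the free instead of doing it now (hypothetical call_rcu() analogue). */
static bool toy_free_deferred(struct toy_cache *c, void *obj)
{
	struct deferred_free *d = malloc(sizeof(*d));

	if (!d)
		return false;
	d->cache = c;
	d->obj = obj;
	d->next = pending;
	pending = d;
	return true;
}

/* Run every queued free before returning (hypothetical rcu_barrier() analogue). */
static void toy_barrier(void)
{
	while (pending) {
		struct deferred_free *d = pending;

		pending = d->next;
		assert(d->cache->live > 0);
		d->cache->live--;
		free(d->obj);
		free(d);
	}
}

/*
 * Hypothetical kmem_cache_destroy() analogue: returns false when objects
 * are still accounted to the cache, which corresponds to the
 * "Slab cache still has objects" warning.
 */
static bool toy_cache_destroy(const struct toy_cache *c)
{
	return c->live == 0;
}
```

In this toy model, calling toy_cache_destroy() right after
toy_free_deferred() reports leftover objects, while calling toy_barrier()
first lets the destroy succeed. Whether the syzbot warning has the same cause
is exactly the open question here.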


* Re: [syzbot] WARNING in p9_client_destroy
  2022-03-26 11:48 ` Christian Schoenebeck
@ 2022-03-26 12:24   ` asmadeus
  2022-03-26 12:36     ` Christian Schoenebeck
  0 siblings, 1 reply; 13+ messages in thread
From: asmadeus @ 2022-03-26 12:24 UTC (permalink / raw)
  To: Christian Schoenebeck
  Cc: David Kahurani, davem, ericvh, kuba, linux-kernel, lucho, netdev,
	syzkaller-bugs, v9fs-developer, syzbot+5e28cdb7ebd0f2389ca4

Christian Schoenebeck wrote on Sat, Mar 26, 2022 at 12:48:26PM +0100:
> [...]
>
> > Signed-off-by: David Kahurani <k.kahurani@gmail.com>
> > Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com

Looks good to me - it's pretty much what I'd have done if I hadn't
forgotten!
It doesn't strike me as critical, and I don't have anything else for
this cycle, so I'll just queue it in -next for now and submit it at the
start of the 5.19 cycle in ~2 months.

> I'm not absolutely sure that this will really fix this issue, but it seems to
> be a good idea to add an rcu_barrier() call here nevertheless.

Yeah, I'm not really sure either, but this is the only idea I have, given
that the debug code doesn't list anything left in the cache, and David
came to the same conclusion :/

Can't hurt though, so let's try and see if syzbot complains
again. Thanks for the review!

-- 
Dominique

* Re: [syzbot] WARNING in p9_client_destroy
  2022-03-26 12:24   ` asmadeus
@ 2022-03-26 12:36     ` Christian Schoenebeck
  0 siblings, 0 replies; 13+ messages in thread
From: Christian Schoenebeck @ 2022-03-26 12:36 UTC (permalink / raw)
  To: asmadeus
  Cc: David Kahurani, davem, ericvh, kuba, linux-kernel, lucho, netdev,
	syzkaller-bugs, v9fs-developer, syzbot+5e28cdb7ebd0f2389ca4

On Saturday, 26 March 2022 13:24:10 CET asmadeus@codewreck.org wrote:
> Christian Schoenebeck wrote on Sat, Mar 26, 2022 at 12:48:26PM +0100:
> > [...]
> > 
> > > Signed-off-by: David Kahurani <k.kahurani@gmail.com>
> > > Reported-by: syzbot+5e28cdb7ebd0f2389ca4@syzkaller.appspotmail.com
> 
> Looks good to me - it's pretty much what I'd have done if I hadn't
> forgotten!
> It doesn't strike me as critical, and I don't have anything else for
> this cycle, so I'll just queue it in -next for now and submit it at the
> start of the 5.19 cycle in ~2 months.

BTW, another issue that I have been seeing for a long time affects fs-cache.
When I use cache=mmap, things seem to be harmless: I periodically see messages
like these, but that's about it:

[90763.435562] FS-Cache: Duplicate cookie detected
[90763.436514] FS-Cache: O-cookie c=00dcb42f [p=00000003 fl=216 nc=0 na=0]
[90763.437795] FS-Cache: O-cookie d=0000000000000000{?} n=0000000000000000
[90763.440096] FS-Cache: O-key=[8] 'a7ab2c0000000000'
[90763.441656] FS-Cache: N-cookie c=00dcb4a7 [p=00000003 fl=2 nc=0 na=1]
[90763.446753] FS-Cache: N-cookie d=000000005b583d5a{9p.inode} n=00000000212184fb
[90763.448196] FS-Cache: N-key=[8] 'a7ab2c0000000000'

The real trouble starts when I use cache=loose, though. In this case I get all
sorts of misbehaviour from time to time, especially complaints about invalid
file descriptors.

Any clues?

Best regards,
Christian Schoenebeck


end of thread, other threads:[~2022-07-29 12:31 UTC | newest]

Thread overview: 13+ messages
2022-02-28  0:53 [syzbot] WARNING in p9_client_destroy syzbot
2022-02-28  1:38 ` asmadeus
2022-07-24  8:28 ` syzbot
2022-07-24 13:17 ` syzbot
2022-07-25 10:15   ` Vlastimil Babka
2022-07-25 11:50     ` asmadeus
2022-07-25 12:45       ` Dmitry Vyukov
2022-07-26 12:09         ` Christian Schoenebeck
2022-07-29 12:31           ` Dmitry Vyukov
     [not found] <CAAZOf26g-L2nSV-Siw6mwWQv1nv6on8c0fWqB4bKmX73QAFzow@mail.gmail.com>
2022-03-26 11:46 ` David Kahurani
2022-03-26 11:48 ` Christian Schoenebeck
2022-03-26 12:24   ` asmadeus
2022-03-26 12:36     ` Christian Schoenebeck