public inbox for linux-kernel@vger.kernel.org
* [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2)
@ 2025-11-07  7:30 syzbot
  2025-11-08  7:05 ` Edward Adam Davis
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: syzbot @ 2025-11-07  7:30 UTC
  To: agruenba, gfs2, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    c2c2ccfd4ba7 Merge tag 'net-6.18-rc5' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11a39084580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=cb128cd5cb439809
dashboard link: https://syzkaller.appspot.com/bug?extid=63ba84f14f62e61a5fd0
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=171a7812580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1375dbcd980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/b0451ba3fe41/disk-c2c2ccfd.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d3e8c67119ab/vmlinux-c2c2ccfd.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1d8e176e5054/bzImage-c2c2ccfd.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/1af9667b349a/mount_0.gz
  fsck result: failed (log: https://syzkaller.appspot.com/x/fsck.log?x=131a7812580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com

BUG: memory leak
unreferenced object 0xffff888126cf1000 (size 144):
  comm "syz.2.26", pid 6030, jiffies 4294942626
  hex dump (first 32 bytes):
    c0 ef 59 82 ff ff ff ff 05 00 00 00 db 1a 00 00  ..Y.............
    0b 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00  ................
  backtrace (crc f56b339f):
    kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
    slab_post_alloc_hook mm/slub.c:4975 [inline]
    slab_alloc_node mm/slub.c:5280 [inline]
    kmem_cache_alloc_noprof+0x397/0x5a0 mm/slub.c:5287
    gfs2_trans_begin+0x29/0xa0 fs/gfs2/trans.c:115
    alloc_dinode fs/gfs2/inode.c:418 [inline]
    gfs2_create_inode+0xca0/0x1890 fs/gfs2/inode.c:807
    gfs2_atomic_open+0x98/0x190 fs/gfs2/inode.c:1387
    atomic_open fs/namei.c:3656 [inline]
    lookup_open fs/namei.c:3767 [inline]
    open_last_lookups fs/namei.c:3895 [inline]
    path_openat+0x13ef/0x1eb0 fs/namei.c:4131
    do_filp_open+0x102/0x1f0 fs/namei.c:4161
    do_sys_openat2+0xc1/0x140 fs/open.c:1437
    do_sys_open fs/open.c:1452 [inline]
    __do_sys_openat fs/open.c:1468 [inline]
    __se_sys_openat fs/open.c:1463 [inline]
    __x64_sys_openat+0xb2/0x100 fs/open.c:1463
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xa4/0xfa0 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

connection error: failed to recv *flatrpc.ExecutorMessageRawT: EOF


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2)
  2025-11-07  7:30 [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2) syzbot
@ 2025-11-08  7:05 ` Edward Adam Davis
  2025-11-08  7:34   ` syzbot
  2025-11-08  9:13 ` [PATCH] gfs2: Fix memory leak in gfs2_trans_begin Edward Adam Davis
  2025-11-08 22:43 ` Forwarded: Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2) syzbot
  2 siblings, 1 reply; 6+ messages in thread
From: Edward Adam Davis @ 2025-11-08  7:05 UTC
  To: syzbot+63ba84f14f62e61a5fd0; +Cc: linux-kernel, syzkaller-bugs

#syz test

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 115c4ac457e9..7bba7951dbdb 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1169,11 +1169,13 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	 * never queued onto any of the ail lists. Here we add it to
 	 * ail1 just so that ail_drain() will find and free it.
 	 */
-	spin_lock(&sdp->sd_ail_lock);
-	if (tr && list_empty(&tr->tr_list))
-		list_add(&tr->tr_list, &sdp->sd_ail1_list);
-	spin_unlock(&sdp->sd_ail_lock);
-	tr = NULL;
+	if (gfs2_withdrawing(sdp)) {
+		spin_lock(&sdp->sd_ail_lock);
+		if (tr && list_empty(&tr->tr_list))
+			list_add(&tr->tr_list, &sdp->sd_ail1_list);
+		spin_unlock(&sdp->sd_ail_lock);
+		tr = NULL;
+	}
 	goto out_end;
 }
 



* Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2)
  2025-11-08  7:05 ` Edward Adam Davis
@ 2025-11-08  7:34   ` syzbot
  0 siblings, 0 replies; 6+ messages in thread
From: syzbot @ 2025-11-08  7:34 UTC
  To: eadavis, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com
Tested-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com

Tested on:

commit:         e811c33b Merge tag 'drm-fixes-2025-11-08' of https://g..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14e60b42580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=cb128cd5cb439809
dashboard link: https://syzkaller.appspot.com/bug?extid=63ba84f14f62e61a5fd0
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1549117c580000

Note: testing is done by a robot and is best-effort only.


* [PATCH] gfs2: Fix memory leak in gfs2_trans_begin
  2025-11-07  7:30 [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2) syzbot
  2025-11-08  7:05 ` Edward Adam Davis
@ 2025-11-08  9:13 ` Edward Adam Davis
  2025-11-08 20:00   ` Andreas Gruenbacher
  2025-11-08 22:43 ` Forwarded: Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2) syzbot
  2 siblings, 1 reply; 6+ messages in thread
From: Edward Adam Davis @ 2025-11-08  9:13 UTC
  To: syzbot+63ba84f14f62e61a5fd0; +Cc: agruenba, gfs2, linux-kernel, syzkaller-bugs

According to log [1], a "bad magic number" was found when checking the
metatype, which caused gfs2 to withdraw.

The root cause is that the log flush does not distinguish a non-delayed
withdraw from a delayed one, so the transaction is queued on the ail1
list after gfs2_ail_drain() has already run and nobody reclaims its
memory. See the call sequence below for details.

	CPU1					CPU2
	====					====
gfs2_meta_buffer()
gfs2_metatype_check()
gfs2_metatype_check_i()
gfs2_metatype_check_ii()		gfs2_log_flush()
gfs2_withdraw()				tr = sdp->sd_log_tr
signal_our_withdraw()			sdp->sd_log_tr = NULL
gfs2_ail_drain()			goto out_withdraw
spin_unlock(&sdp->sd_ail_lock)    	trans_drain()
					spin_lock(&sdp->sd_ail_lock)
					list_add(&tr->tr_list, &sdp->sd_ail1_list)
					tr = NULL
					goto out_end

Add a delayed withdraw check around the transaction handling to avoid
this kind of memory leak.
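
The interleaving in the diagram can be illustrated with a small user-space model. This is only a sketch: `model_trans`, `model_sb`, `model_trans_begin()`, `model_ail_drain()` and `model_flush_park()` are hypothetical stand-ins, not gfs2 code, and the `live` counter plays the role kmemleak plays in the report. It assumes, as the diagram does, that the drain runs exactly once on the withdraw side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a gfs2 transaction; not kernel code. */
struct model_trans {
	struct model_trans *next;	/* link on the model "ail1" list */
};

/* Hypothetical stand-in for the superblock state relevant to the race. */
struct model_sb {
	struct model_trans *ail1;	/* models sdp->sd_ail1_list */
	int live;			/* allocated but not yet freed */
};

/* Allocate a transaction and count it as live. */
static struct model_trans *model_trans_begin(struct model_sb *sb)
{
	struct model_trans *tr = calloc(1, sizeof(*tr));

	if (tr)
		sb->live++;
	return tr;
}

/* Withdraw side: free everything currently on the list. */
static void model_ail_drain(struct model_sb *sb)
{
	while (sb->ail1) {
		struct model_trans *tr = sb->ail1;

		sb->ail1 = tr->next;
		assert(sb->live > 0);
		free(tr);
		sb->live--;
	}
}

/* Flush side: park the transaction on the list for a later drain. */
static void model_flush_park(struct model_sb *sb, struct model_trans *tr)
{
	tr->next = sb->ail1;
	sb->ail1 = tr;
}
```

If `model_ail_drain()` runs before `model_flush_park()`, the parked transaction stays allocated with no later drain to free it, which matches the CPU1/CPU2 ordering above. In the opposite order the drain finds the transaction and `live` returns to zero.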

syzbot reported:
[1]
gfs2: fsid=syz:syz.0: fatal: invalid metadata block - bh = 9381 (bad magic number), function = gfs2_meta_buffer, file = fs/gfs2/meta_io.c, line = 499

[2]
BUG: memory leak
unreferenced object 0xffff888126cf1000 (size 144):
  backtrace (crc f56b339f):
    gfs2_trans_begin+0x29/0xa0 fs/gfs2/trans.c:115
    alloc_dinode fs/gfs2/inode.c:418 [inline]
    gfs2_create_inode+0xca0/0x1890 fs/gfs2/inode.c:807


Fixes: f5456b5d67cf ("gfs2: Clean up revokes on normal withdraws")
Reported-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=63ba84f14f62e61a5fd0
Tested-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/gfs2/log.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 115c4ac457e9..7bba7951dbdb 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1169,11 +1169,13 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
 	 * never queued onto any of the ail lists. Here we add it to
 	 * ail1 just so that ail_drain() will find and free it.
 	 */
-	spin_lock(&sdp->sd_ail_lock);
-	if (tr && list_empty(&tr->tr_list))
-		list_add(&tr->tr_list, &sdp->sd_ail1_list);
-	spin_unlock(&sdp->sd_ail_lock);
-	tr = NULL;
+	if (gfs2_withdrawing(sdp)) {
+		spin_lock(&sdp->sd_ail_lock);
+		if (tr && list_empty(&tr->tr_list))
+			list_add(&tr->tr_list, &sdp->sd_ail1_list);
+		spin_unlock(&sdp->sd_ail_lock);
+		tr = NULL;
+	}
 	goto out_end;
 }
 
-- 
2.43.0



* Re: [PATCH] gfs2: Fix memory leak in gfs2_trans_begin
  2025-11-08  9:13 ` [PATCH] gfs2: Fix memory leak in gfs2_trans_begin Edward Adam Davis
@ 2025-11-08 20:00   ` Andreas Gruenbacher
  0 siblings, 0 replies; 6+ messages in thread
From: Andreas Gruenbacher @ 2025-11-08 20:00 UTC
  To: Edward Adam Davis
  Cc: syzbot+63ba84f14f62e61a5fd0, gfs2, linux-kernel, syzkaller-bugs

Hello,

On Sat, Nov 8, 2025 at 10:13 AM Edward Adam Davis <eadavis@qq.com> wrote:
> According to log [1], a "bad magic number" was found when checking the
> metatype, which caused gfs2 to withdraw.
>
> The root cause is that the log flush does not distinguish a non-delayed
> withdraw from a delayed one, so the transaction is queued on the ail1
> list after gfs2_ail_drain() has already run and nobody reclaims its
> memory. See the call sequence below for details.
>
>         CPU1                                    CPU2
>         ====                                    ====
> gfs2_meta_buffer()
> gfs2_metatype_check()
> gfs2_metatype_check_i()
> gfs2_metatype_check_ii()                gfs2_log_flush()
> gfs2_withdraw()                         tr = sdp->sd_log_tr
> signal_our_withdraw()                   sdp->sd_log_tr = NULL
> gfs2_ail_drain()                        goto out_withdraw
> spin_unlock(&sdp->sd_ail_lock)          trans_drain()
>                                         spin_lock(&sdp->sd_ail_lock)
>                                         list_add(&tr->tr_list, &sdp->sd_ail1_list)
>                                         tr = NULL
>                                         goto out_end
>

this bug report is against upstream commit c2c2ccfd4ba7, which
precedes the withdraw rework on gfs2's for-next branch. With those
patches, the race you are describing is no longer possible because
do_withdraw() now uses sdp->sd_log_flush_lock and the SDF_JOURNAL_LIVE
flag to synchronize with gfs2_log_flush().

I don't know why Bob chose to push the transaction onto the ail1 list
instead of freeing it in gfs2_log_flush(); that's something to clean
up. I've pushed an untested patch doing that to for-later.
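
For illustration only, the cleanup direction mentioned above (freeing the transaction in the flush path rather than parking it on a list for someone else to free) might look like the following user-space sketch. It is not the patch on for-later, which is not shown in this thread; `sketch_trans`, `sketch_sb`, `sketch_attach()` and `sketch_flush_bail_out()` are hypothetical names, and `live` is just a counter used to check that nothing is left allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in fs/gfs2. */
struct sketch_trans {
	int unused;
};

struct sketch_sb {
	struct sketch_trans *log_tr;	/* models sdp->sd_log_tr */
	int live;			/* allocated but not yet freed */
};

/* Attach a freshly allocated transaction; returns 0 on success. */
static int sketch_attach(struct sketch_sb *sb)
{
	struct sketch_trans *tr = calloc(1, sizeof(*tr));

	if (!tr)
		return -1;
	sb->log_tr = tr;
	sb->live++;
	return 0;
}

/* Bail-out path that owns the detached transaction and frees it itself. */
static void sketch_flush_bail_out(struct sketch_sb *sb)
{
	struct sketch_trans *tr = sb->log_tr;

	sb->log_tr = NULL;	/* detach first, as in the quoted diagram */
	if (tr) {
		assert(sb->live > 0);
		free(tr);
		sb->live--;
	}
}
```

Because the bail-out path releases what it detached, the result in this model no longer depends on whether a separate drain has already run.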

Related commits:
58e08e8d83ab ("gfs2: fix trans slab error when withdraw occurs inside
log_flush")
f5456b5d67cf ("gfs2: Clean up revokes on normal withdraws")

Thanks,
Andreas

> Add a delayed withdraw check around the transaction handling to avoid
> this kind of memory leak.
>
> syzbot reported:
> [1]
> gfs2: fsid=syz:syz.0: fatal: invalid metadata block - bh = 9381 (bad magic number), function = gfs2_meta_buffer, file = fs/gfs2/meta_io.c, line = 499
>
> [2]
> BUG: memory leak
> unreferenced object 0xffff888126cf1000 (size 144):
>   backtrace (crc f56b339f):
>     gfs2_trans_begin+0x29/0xa0 fs/gfs2/trans.c:115
>     alloc_dinode fs/gfs2/inode.c:418 [inline]
>     gfs2_create_inode+0xca0/0x1890 fs/gfs2/inode.c:807
>
>
> Fixes: f5456b5d67cf ("gfs2: Clean up revokes on normal withdraws")
> Reported-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=63ba84f14f62e61a5fd0
> Tested-by: syzbot+63ba84f14f62e61a5fd0@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  fs/gfs2/log.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> index 115c4ac457e9..7bba7951dbdb 100644
> --- a/fs/gfs2/log.c
> +++ b/fs/gfs2/log.c
> @@ -1169,11 +1169,13 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
>          * never queued onto any of the ail lists. Here we add it to
>          * ail1 just so that ail_drain() will find and free it.
>          */
> -       spin_lock(&sdp->sd_ail_lock);
> -       if (tr && list_empty(&tr->tr_list))
> -               list_add(&tr->tr_list, &sdp->sd_ail1_list);
> -       spin_unlock(&sdp->sd_ail_lock);
> -       tr = NULL;
> +       if (gfs2_withdrawing(sdp)) {
> +               spin_lock(&sdp->sd_ail_lock);
> +               if (tr && list_empty(&tr->tr_list))
> +                       list_add(&tr->tr_list, &sdp->sd_ail1_list);
> +               spin_unlock(&sdp->sd_ail_lock);
> +               tr = NULL;
> +       }
>         goto out_end;
>  }
>
> --
> 2.43.0
>



* Forwarded: Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2)
  2025-11-07  7:30 [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2) syzbot
  2025-11-08  7:05 ` Edward Adam Davis
  2025-11-08  9:13 ` [PATCH] gfs2: Fix memory leak in gfs2_trans_begin Edward Adam Davis
@ 2025-11-08 22:43 ` syzbot
  2 siblings, 0 replies; 6+ messages in thread
From: syzbot @ 2025-11-08 22:43 UTC
  To: linux-kernel, syzkaller-bugs

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com.

***

Subject: Re: [syzbot] [gfs2?] memory leak in gfs2_trans_begin (2)
Author: agruenba@redhat.com

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git
withdraw


