public inbox for linux-xfs@vger.kernel.org
* [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5
@ 2023-04-06  2:34 Pengfei Xu
  2023-04-11  0:33 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Pengfei Xu @ 2023-04-06  2:34 UTC (permalink / raw)
  To: dchinner; +Cc: linux-xfs, djwong, heng.su, lkp

Hi Dave Chinner and xfs experts,

Greetings!

There is a task hang in xlog_grant_head_check in the v6.3-rc5 kernel.

Platform: x86 platforms

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230405_094839_xlog_grant_head_check
Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.c
Syzkaller analysis repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.report
Syzkaller analysis repro.stats: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.stats
Reproduced prog repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.prog
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/bisect_info.log

It can be reproduced within 2100s at most.
Bisection identified the following bad commit:
"
fe08cc5044486096bfb5ce9d3db4e915e53281ea
xfs: open code sb verifier feature checks
"
This is only the suspected commit: reverting it on top of v6.3-rc5 produced a
kernel that failed, so I could not double-check the result.

"
[   24.818100] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=339 'systemd'
[   28.230533] loop0: detected capacity change from 0 to 65536
[   28.232522] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
[   28.233447] XFS (loop0): Mounting V10 Filesystem d28317a9-9e04-4f2a-be27-e55b4c413ff6
[   28.234235] XFS (loop0): Log size 66 blocks too small, minimum size is 1968 blocks
[   28.234856] XFS (loop0): Log size out of supported range.
[   28.235289] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
[   28.239290] XFS (loop0): Starting recovery (logdev: internal)
[   28.240979] XFS (loop0): Ending recovery (logdev: internal)
[  300.150944] INFO: task repro:541 blocked for more than 147 seconds.
[  300.151523]       Not tainted 6.3.0-rc5-7e364e56293b+ #1
[  300.152102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  300.152716] task:repro           state:D stack:0     pid:541   ppid:540    flags:0x00004004
[  300.153373] Call Trace:
[  300.153580]  <TASK>
[  300.153765]  __schedule+0x40a/0xc30
[  300.154078]  schedule+0x5b/0xe0
[  300.154349]  xlog_grant_head_wait+0x53/0x3a0
[  300.154715]  xlog_grant_head_check+0x1a5/0x1c0
[  300.155113]  xfs_log_reserve+0x145/0x380
[  300.155442]  xfs_trans_reserve+0x226/0x270
[  300.155780]  xfs_trans_alloc+0x147/0x470
[  300.156112]  xfs_qm_qino_alloc+0xcf/0x510
[  300.156441]  ? write_comp_data+0x2f/0x90
[  300.156770]  xfs_qm_init_quotainos+0x30a/0x400
[  300.157139]  xfs_qm_init_quotainfo+0x9d/0x4b0
[  300.157499]  ? write_comp_data+0x2f/0x90
[  300.157827]  xfs_qm_mount_quotas+0x40/0x3c0
[  300.158167]  xfs_mountfs+0xc37/0xce0
[  300.158467]  xfs_fs_fill_super+0x7aa/0xdc0
[  300.158817]  get_tree_bdev+0x24b/0x350
[  300.159126]  ? __pfx_xfs_fs_fill_super+0x10/0x10
[  300.159503]  xfs_fs_get_tree+0x25/0x30
[  300.159815]  vfs_get_tree+0x3b/0x140
[  300.160118]  path_mount+0x769/0x10f0
[  300.160415]  ? write_comp_data+0x2f/0x90
[  300.160743]  do_mount+0xaf/0xd0
[  300.161009]  __x64_sys_mount+0x14b/0x160
[  300.161331]  do_syscall_64+0x3b/0x90
[  300.161632]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  300.162041] RIP: 0033:0x7fece24223ae
[  300.162333] RSP: 002b:00007fff584561e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  300.162937] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fece24223ae
[  300.163494] RDX: 000000002000ad00 RSI: 000000002000ad40 RDI: 00007fff58456320
[  300.164051] RBP: 00007fff584563b0 R08: 00007fff58456220 R09: 0000000000000000
[  300.164612] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000401240
[  300.165168] R13: 00007fff584564f0 R14: 0000000000000000 R15: 0000000000000000
[  300.165732]  </TASK>
[  300.165919] 
[  300.165919] Showing all locks held in the system:
[  300.166402] 1 lock held by rcu_tasks_kthre/11:
[  300.166773]  #0: ffffffff83d63450 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.167530] 1 lock held by rcu_tasks_rude_/12:
[  300.167886]  #0: ffffffff83d631d0 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.168683] 1 lock held by rcu_tasks_trace/13:
[  300.169039]  #0: ffffffff83d62f10 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x420
[  300.169839] 1 lock held by khungtaskd/29:
[  300.170160]  #0: ffffffff83d63e60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x1b/0x1e0
[  300.170891] 2 locks held by repro/541:
[  300.171194]  #0: ffff88800de780e0 (&type->s_umount_key#47/1){+.+.}-{3:3}, at: alloc_super+0x12b/0x480
[  300.171926]  #1: ffff88800de78638 (sb_internal#2){.+.+}-{0:0}, at: xfs_qm_qino_alloc+0xcf/0x510
[  300.172634] 
[  300.172769] =============================================
"

I hope the info is helpful.

Thanks!

---

If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
   // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
   // You could change the bzImage_xxx as you want
Use the command below to log in; there is no password for root.
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, you can transfer the
reproducer binary to the VM as shown below and reproduce the problem there:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for the target kernel:
Use the target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage           // x should be less than or equal to the number of CPUs your PC has

Set the bzImage file in start3.sh above to load the target kernel in the VM.


Tips:
If you already have qemu-system-x86_64, please ignore the info below.
To install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!
BR.

* Re: [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5
  2023-04-06  2:34 [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5 Pengfei Xu
@ 2023-04-11  0:33 ` Dave Chinner
  2023-04-11  8:15   ` Pengfei Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2023-04-11  0:33 UTC (permalink / raw)
  To: Pengfei Xu; +Cc: dchinner, linux-xfs, djwong, heng.su, lkp

On Thu, Apr 06, 2023 at 10:34:02AM +0800, Pengfei Xu wrote:
> Hi Dave Chinner and xfs experts,
> 
> Greetings!
> 
> There is a task hang in xlog_grant_head_check in the v6.3-rc5 kernel.
> 
> Platform: x86 platforms
> 
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230405_094839_xlog_grant_head_check
> Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.c
> Syzkaller analysis repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.report
> Syzkaller analysis repro.stats: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.stats
> Reproduced prog repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.prog
> Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/kconfig_origin
> Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/bisect_info.log
> 
> It can be reproduced within 2100s at most.
> Bisection identified the following bad commit:
> "
> fe08cc5044486096bfb5ce9d3db4e915e53281ea
> xfs: open code sb verifier feature checks
> "
> This is only the suspected commit: reverting it on top of v6.3-rc5 produced a
> kernel that failed, so I could not double-check the result.
> 
> "
> [   24.818100] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=339 'systemd'
> [   28.230533] loop0: detected capacity change from 0 to 65536
> [   28.232522] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
> [   28.233447] XFS (loop0): Mounting V10 Filesystem d28317a9-9e04-4f2a-be27-e55b4c413ff6

Yeah, there's the issue that the bisect found - has nothing to do
with the log hang. fe08cc5044486 allowed filesystem versions > 5 to
be mounted, prior to that it wasn't allowed. I think this was just a
simple oversight.

Not a big deal, everything is based on feature support checks and
not version numbers, so it's not a critical issue.

Low severity, low priority, but something we should fix and push
back to stable kernels sooner rather than later.
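
A minimal userspace sketch of the kind of version check being discussed,
not the kernel's actual verifier code. It assumes the format version is
carried in the low four bits of the superblock's sb_versionnum field and
that only V4 and V5 are mountable; the macro and function names here are
hypothetical. With an explicit check like this, the "V10" value from the
log above would be rejected instead of falling through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the format version lives in the low 4 bits of sb_versionnum. */
#define SB_VERSION_NUMBITS 0x000f
#define SB_VERSION_4       4
#define SB_VERSION_5       5

/* Extract the format version number from the raw on-disk field. */
static unsigned int sb_version_num(uint16_t sb_versionnum)
{
	return sb_versionnum & SB_VERSION_NUMBITS;
}

/*
 * Accept only the formats this sketch knows about. Anything else,
 * including future or fuzzed values such as 10, is refused.
 */
static bool sb_version_supported(uint16_t sb_versionnum)
{
	unsigned int v = sb_version_num(sb_versionnum);

	return v == SB_VERSION_4 || v == SB_VERSION_5;
}

/* Small self-check that exercises the cases from this thread. */
void sb_version_selftest(void)
{
	assert(sb_version_supported(SB_VERSION_4));
	assert(sb_version_supported(SB_VERSION_5));
	assert(!sb_version_supported(10));	/* the fuzzed "V10" image */
}
```

The point of the sketch is only that the version number needs an upper
bound as well as feature-bit checks; how the real verifier expresses that
is up to the eventual fix.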

> [   28.234235] XFS (loop0): Log size 66 blocks too small, minimum size is 1968 blocks
> [   28.234856] XFS (loop0): Log size out of supported range.
> [   28.235289] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> [   28.239290] XFS (loop0): Starting recovery (logdev: internal)
> [   28.240979] XFS (loop0): Ending recovery (logdev: internal)
> [  300.150944] INFO: task repro:541 blocked for more than 147 seconds.
> [  300.151523]       Not tainted 6.3.0-rc5-7e364e56293b+ #1
> [  300.152102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  300.152716] task:repro           state:D stack:0     pid:541   ppid:540    flags:0x00004004
> [  300.153373] Call Trace:
> [  300.153580]  <TASK>
> [  300.153765]  __schedule+0x40a/0xc30
> [  300.154078]  schedule+0x5b/0xe0
> [  300.154349]  xlog_grant_head_wait+0x53/0x3a0
> [  300.154715]  xlog_grant_head_check+0x1a5/0x1c0
> [  300.155113]  xfs_log_reserve+0x145/0x380
> [  300.155442]  xfs_trans_reserve+0x226/0x270
> [  300.155780]  xfs_trans_alloc+0x147/0x470
> [  300.156112]  xfs_qm_qino_alloc+0xcf/0x510

This log hang is *not a bug*. It is -expected- given that syzbot is
screwing around with fuzzed V4 filesystems. I almost just threw this
report in the bin because I saw it was a V4 filesystem being
mounted.

That is, V5 filesystems will refuse to mount a filesystem with a log
that is too small, completely avoiding this sort of hang caused by
the log being way smaller than a transaction reservation (guaranteed
hang). But we cannot do the same thing for V4 filesystems, because
there were bugs in and inconsistencies between mkfs and the kernel
over the minimum valid log size. Hence when we hit a V4 filesystem
in that situation, we issue a warning and allow operation to
continue because that's historical V4 filesystem behaviour.

This kernel issued the "log size too small" warning, and then there
was a log space hang which is entirely predictable and not a kernel
bug. syzbot is doing something stupid, syzbot needs to be taught not
to do stupid things.
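
The V4/V5 difference described above can be sketched roughly as follows.
This is a simplified userspace illustration under stated assumptions, not
the kernel's mount code: check_log_size(), the enum and the has_crc flag
are hypothetical names, and only the "log too small" case is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Possible outcomes of the (hypothetical) mount-time log size check. */
enum log_check {
	LOG_OK,			/* log is large enough */
	LOG_WARN_CONTINUE,	/* V4: warn, then carry on mounting */
	LOG_REFUSE_MOUNT,	/* V5: fail the mount */
};

/*
 * has_crc is true for V5 (crc=1) filesystems and false for V4 (crc=0).
 * min_log_blocks stands in for the minimum the kernel computes from the
 * largest transaction reservation.
 */
static enum log_check check_log_size(unsigned int log_blocks,
				     unsigned int min_log_blocks,
				     bool has_crc)
{
	if (log_blocks >= min_log_blocks)
		return LOG_OK;

	fprintf(stderr,
		"Log size %u blocks too small, minimum size is %u blocks\n",
		log_blocks, min_log_blocks);

	/* V5 refuses the mount, so the hang can never be reached. */
	if (has_crc)
		return LOG_REFUSE_MOUNT;

	/* V4 keeps its historical behaviour: warn and continue. */
	return LOG_WARN_CONTINUE;
}
```

With the numbers from the report, check_log_size(66, 1968, false) takes
the warn-and-continue path, which is the situation where a later
transaction reservation can wait on log space indefinitely.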

-Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5
  2023-04-11  0:33 ` Dave Chinner
@ 2023-04-11  8:15   ` Pengfei Xu
  2023-04-11 15:03     ` Darrick J. Wong
  0 siblings, 1 reply; 5+ messages in thread
From: Pengfei Xu @ 2023-04-11  8:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: dchinner, linux-xfs, djwong, heng.su, lkp

Hi Dave,

On 2023-04-11 at 10:33:53 +1000, Dave Chinner wrote:
> On Thu, Apr 06, 2023 at 10:34:02AM +0800, Pengfei Xu wrote:
> > Hi Dave Chinner and xfs experts,
> > 
> > Greetings!
> > 
> > There is a task hang in xlog_grant_head_check in the v6.3-rc5 kernel.
> > 
> > Platform: x86 platforms
> > 
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230405_094839_xlog_grant_head_check
> > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.c
> > Syzkaller analysis repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.report
> > Syzkaller analysis repro.stats: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.stats
> > Reproduced prog repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.prog
> > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/kconfig_origin
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/bisect_info.log
> > 
> > It can be reproduced within 2100s at most.
> > Bisection identified the following bad commit:
> > "
> > fe08cc5044486096bfb5ce9d3db4e915e53281ea
> > xfs: open code sb verifier feature checks
> > "
> > This is only the suspected commit: reverting it on top of v6.3-rc5 produced a
> > kernel that failed, so I could not double-check the result.
> > 
> > "
> > [   24.818100] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=339 'systemd'
> > [   28.230533] loop0: detected capacity change from 0 to 65536
> > [   28.232522] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
> > [   28.233447] XFS (loop0): Mounting V10 Filesystem d28317a9-9e04-4f2a-be27-e55b4c413ff6
> 
> Yeah, there's the issue that the bisect found - has nothing to do
> with the log hang. fe08cc5044486 allowed filesystem versions > 5 to
> be mounted, prior to that it wasn't allowed. I think this was just a
> simple oversight.
> 
> Not a big deal, everything is based on feature support checks and
> not version numbers, so it's not a critical issue.
> 
> Low severity, low priority, but something we should fix and push
> back to stable kernels sooner rather than later.
> 
  Ah, so the bisect found a different issue from the one I was targeting,
  and it turned out to be useful rather than a waste of your time.
  That is a lucky result this time!  :)


> > [   28.234235] XFS (loop0): Log size 66 blocks too small, minimum size is 1968 blocks
> > [   28.234856] XFS (loop0): Log size out of supported range.
> > [   28.235289] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > [   28.239290] XFS (loop0): Starting recovery (logdev: internal)
> > [   28.240979] XFS (loop0): Ending recovery (logdev: internal)
> > [  300.150944] INFO: task repro:541 blocked for more than 147 seconds.
> > [  300.151523]       Not tainted 6.3.0-rc5-7e364e56293b+ #1
> > [  300.152102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  300.152716] task:repro           state:D stack:0     pid:541   ppid:540    flags:0x00004004
> > [  300.153373] Call Trace:
> > [  300.153580]  <TASK>
> > [  300.153765]  __schedule+0x40a/0xc30
> > [  300.154078]  schedule+0x5b/0xe0
> > [  300.154349]  xlog_grant_head_wait+0x53/0x3a0
> > [  300.154715]  xlog_grant_head_check+0x1a5/0x1c0
> > [  300.155113]  xfs_log_reserve+0x145/0x380
> > [  300.155442]  xfs_trans_reserve+0x226/0x270
> > [  300.155780]  xfs_trans_alloc+0x147/0x470
> > [  300.156112]  xfs_qm_qino_alloc+0xcf/0x510
> 
> This log hang is *not a bug*. It is -expected- given that syzbot is
> screwing around with fuzzed V4 filesystems. I almost just threw this
> report in the bin because I saw it was a V4 filesystem being
> mounted.
> 
> That is, V5 filesystems will refuse to mount a filesystem with a log
> that is too small, completely avoiding this sort of hang caused by
> the log being way smaller than a transaction reservation (guaranteed
> hang). But we cannot do the same thing for V4 filesystems, because
> there were bugs in and inconsistencies between mkfs and the kernel
> over the minimum valid log size. Hence when we hit a V4 filesystem
> in that situation, we issue a warning and allow operation to
> continue because that's historical V4 filesystem behaviour.
> 
> This kernel issued the "log size too small" warning, and then there
> was a log space hang which is entirely predictable and not a kernel
> bug. syzbot is doing something stupid, syzbot needs to be taught not
> to do stupid things.
> 
 Thanks for pointing out this syzkaller issue. I will report the problem to
 syzkaller and the relevant syzkaller author.

 Thanks again!
 BR.
 -Pengfei

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

* Re: [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5
  2023-04-11  8:15   ` Pengfei Xu
@ 2023-04-11 15:03     ` Darrick J. Wong
  2023-04-12  7:18       ` Pengfei Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2023-04-11 15:03 UTC (permalink / raw)
  To: Pengfei Xu; +Cc: Dave Chinner, dchinner, linux-xfs, heng.su, lkp

On Tue, Apr 11, 2023 at 04:15:20PM +0800, Pengfei Xu wrote:
> Hi Dave,
> 
> On 2023-04-11 at 10:33:53 +1000, Dave Chinner wrote:
> > On Thu, Apr 06, 2023 at 10:34:02AM +0800, Pengfei Xu wrote:
> > > Hi Dave Chinner and xfs experts,
> > > 
> > > Greetings!
> > > 
> > > There is a task hang in xlog_grant_head_check in the v6.3-rc5 kernel.
> > > 
> > > Platform: x86 platforms
> > > 
> > > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230405_094839_xlog_grant_head_check
> > > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.c
> > > Syzkaller analysis repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.report
> > > Syzkaller analysis repro.stats: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.stats
> > > Reproduced prog repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.prog
> > > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/kconfig_origin
> > > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/bisect_info.log
> > > 
> > > It can be reproduced within 2100s at most.
> > > Bisection identified the following bad commit:
> > > "
> > > fe08cc5044486096bfb5ce9d3db4e915e53281ea
> > > xfs: open code sb verifier feature checks
> > > "
> > > This is only the suspected commit: reverting it on top of v6.3-rc5 produced a
> > > kernel that failed, so I could not double-check the result.
> > > 
> > > "
> > > [   24.818100] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=339 'systemd'
> > > [   28.230533] loop0: detected capacity change from 0 to 65536
> > > [   28.232522] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
> > > [   28.233447] XFS (loop0): Mounting V10 Filesystem d28317a9-9e04-4f2a-be27-e55b4c413ff6
> > 
> > Yeah, there's the issue that the bisect found - has nothing to do
> > with the log hang. fe08cc5044486 allowed filesystem versions > 5 to
> > be mounted, prior to that it wasn't allowed. I think this was just a
> > simple oversight.
> > 
> > Not a big deal, everything is based on feature support checks and
> > not version numbers, so it's not a critical issue.
> > 
> > Low severity, low priority, but something we should fix and push
> > back to stable kernels sooner rather than later.
> > 
>   Ah, so the bisect found a different issue from the one I was targeting,
>   and it turned out to be useful rather than a waste of your time.
>   That is a lucky result this time!  :)
> 
> 
> > > [   28.234235] XFS (loop0): Log size 66 blocks too small, minimum size is 1968 blocks
> > > [   28.234856] XFS (loop0): Log size out of supported range.
> > > [   28.235289] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > > [   28.239290] XFS (loop0): Starting recovery (logdev: internal)
> > > [   28.240979] XFS (loop0): Ending recovery (logdev: internal)
> > > [  300.150944] INFO: task repro:541 blocked for more than 147 seconds.
> > > [  300.151523]       Not tainted 6.3.0-rc5-7e364e56293b+ #1
> > > [  300.152102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  300.152716] task:repro           state:D stack:0     pid:541   ppid:540    flags:0x00004004
> > > [  300.153373] Call Trace:
> > > [  300.153580]  <TASK>
> > > [  300.153765]  __schedule+0x40a/0xc30
> > > [  300.154078]  schedule+0x5b/0xe0
> > > [  300.154349]  xlog_grant_head_wait+0x53/0x3a0
> > > [  300.154715]  xlog_grant_head_check+0x1a5/0x1c0
> > > [  300.155113]  xfs_log_reserve+0x145/0x380
> > > [  300.155442]  xfs_trans_reserve+0x226/0x270
> > > [  300.155780]  xfs_trans_alloc+0x147/0x470
> > > [  300.156112]  xfs_qm_qino_alloc+0xcf/0x510
> > 
> > This log hang is *not a bug*. It is -expected- given that syzbot is
> > screwing around with fuzzed V4 filesystems. I almost just threw this
> > report in the bin because I saw it was a V4 filesystem being
> > mounted.
> > 
> > That is, V5 filesystems will refuse to mount a filesystem with a log
> > that is too small, completely avoiding this sort of hang caused by
> > the log being way smaller than a transaction reservation (guaranteed
> > hang). But we cannot do the same thing for V4 filesystems, because
> > there were bugs in and inconsistencies between mkfs and the kernel
> > over the minimum valid log size. Hence when we hit a V4 filesystem
> > in that situation, we issue a warning and allow operation to
> > continue because that's historical V4 filesystem behaviour.
> > 
> > This kernel issued the "log size too small" warning, and then there
> > was a log space hang which is entirely predictable and not a kernel
> > bug. syzbot is doing something stupid, syzbot needs to be taught not
> > to do stupid things.
> > 
>  Thanks for pointing out this syzkaller issue. I will report the problem to
>  syzkaller and the relevant syzkaller author.

Don't bother, we already had this discussion *five years ago*:

https://lore.kernel.org/linux-xfs/20180523044742.GZ23861@dastard/

The same points there still apply -- we cannot break existing V4 users,
the format is scheduled for removal, and it's *really unfair* for
megacorporations like Intel and Google to dump zeroday reproducers onto
public mailing lists expecting the maintainers will just magically come
up with engineering resources to go fix all these corner cases.

Silicon Valley tech companies just laid off what, like 295,000
programmers in the last 9 months?  Just think about what we could do if
1% of that went back to work fixing all the broken crap.

Hire a team to triage and fix the damn bugs or stop sending them.

--D

>  Thanks again!
>  BR.
>  -Pengfei
> 
> > -Dave.
> > -- 
> > Dave Chinner
> > david@fromorbit.com

* Re: [Syzkaller & bisect] There is task hung in xlog_grant_head_check in v6.3-rc5
  2023-04-11 15:03     ` Darrick J. Wong
@ 2023-04-12  7:18       ` Pengfei Xu
  0 siblings, 0 replies; 5+ messages in thread
From: Pengfei Xu @ 2023-04-12  7:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, dchinner, linux-xfs, heng.su, lkp

Hi Darrick,

On 2023-04-11 at 08:03:36 -0700, Darrick J. Wong wrote:
> On Tue, Apr 11, 2023 at 04:15:20PM +0800, Pengfei Xu wrote:
> > Hi Dave,
> > 
> > On 2023-04-11 at 10:33:53 +1000, Dave Chinner wrote:
> > > On Thu, Apr 06, 2023 at 10:34:02AM +0800, Pengfei Xu wrote:
> > > > Hi Dave Chinner and xfs experts,
> > > > 
> > > > Greetings!
> > > > 
> > > > There is a task hang in xlog_grant_head_check in the v6.3-rc5 kernel.
> > > > 
> > > > Platform: x86 platforms
> > > > 
> > > > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230405_094839_xlog_grant_head_check
> > > > Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.c
> > > > Syzkaller analysis repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.report
> > > > Syzkaller analysis repro.stats: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.stats
> > > > Reproduced prog repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/repro.prog
> > > > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/kconfig_origin
> > > > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230405_094839_xlog_grant_head_check/bisect_info.log
> > > > 
> > > > It can be reproduced within 2100s at most.
> > > > Bisection identified the following bad commit:
> > > > "
> > > > fe08cc5044486096bfb5ce9d3db4e915e53281ea
> > > > xfs: open code sb verifier feature checks
> > > > "
> > > > This is only the suspected commit: reverting it on top of v6.3-rc5 produced a
> > > > kernel that failed, so I could not double-check the result.
> > > > 
> > > > "
> > > > [   24.818100] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=339 'systemd'
> > > > [   28.230533] loop0: detected capacity change from 0 to 65536
> > > > [   28.232522] XFS (loop0): Deprecated V4 format (crc=0) will not be supported after September 2030.
> > > > [   28.233447] XFS (loop0): Mounting V10 Filesystem d28317a9-9e04-4f2a-be27-e55b4c413ff6
> > > 
> > > Yeah, there's the issue that the bisect found - has nothing to do
> > > with the log hang. fe08cc5044486 allowed filesystem versions > 5 to
> > > be mounted, prior to that it wasn't allowed. I think this was just a
> > > simple oversight.
> > > 
> > > Not a big deal, everything is based on feature support checks and
> > > not version numbers, so it's not a critical issue.
> > > 
> > > Low severity, low priority, but something we should fix and push
> > > back to stable kernels sooner rather than later.
> > > 
> >   Ah, so the bisect found a different issue from the one I was targeting,
> >   and it turned out to be useful rather than a waste of your time.
> >   That is a lucky result this time!  :)
> > 
> > 
> > > > [   28.234235] XFS (loop0): Log size 66 blocks too small, minimum size is 1968 blocks
> > > > [   28.234856] XFS (loop0): Log size out of supported range.
> > > > [   28.235289] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > > > [   28.239290] XFS (loop0): Starting recovery (logdev: internal)
> > > > [   28.240979] XFS (loop0): Ending recovery (logdev: internal)
> > > > [  300.150944] INFO: task repro:541 blocked for more than 147 seconds.
> > > > [  300.151523]       Not tainted 6.3.0-rc5-7e364e56293b+ #1
> > > > [  300.152102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > [  300.152716] task:repro           state:D stack:0     pid:541   ppid:540    flags:0x00004004
> > > > [  300.153373] Call Trace:
> > > > [  300.153580]  <TASK>
> > > > [  300.153765]  __schedule+0x40a/0xc30
> > > > [  300.154078]  schedule+0x5b/0xe0
> > > > [  300.154349]  xlog_grant_head_wait+0x53/0x3a0
> > > > [  300.154715]  xlog_grant_head_check+0x1a5/0x1c0
> > > > [  300.155113]  xfs_log_reserve+0x145/0x380
> > > > [  300.155442]  xfs_trans_reserve+0x226/0x270
> > > > [  300.155780]  xfs_trans_alloc+0x147/0x470
> > > > [  300.156112]  xfs_qm_qino_alloc+0xcf/0x510
> > > 
> > > This log hang is *not a bug*. It is -expected- given that syzbot is
> > > screwing around with fuzzed V4 filesystems. I almost just threw this
> > > report in the bin because I saw it was a V4 filesystem being
> > > mounted.
> > > 
> > > That is, V5 filesystems will refuse to mount a filesystem with a log
> > > that is too small, completely avoiding this sort of hang caused by
> > > the log being way smaller than a transaction reservation (guaranteed
> > > hang). But we cannot do the same thing for V4 filesystems, because
> > > there were bugs in and inconsistencies between mkfs and the kernel
> > > over the minimum valid log size. Hence when we hit a V4 filesystem
> > > in that situation, we issue a warning and allow operation to
> > > continue because that's historical V4 filesystem behaviour.
> > > 
> > > This kernel issued the "log size too small" warning, and then there
> > > was a log space hang which is entirely predictable and not a kernel
> > > bug. syzbot is doing something stupid, syzbot needs to be taught not
> > > to do stupid things.
> > > 
> >  Thanks for pointing out this syzkaller issue. I will report the problem to
> >  syzkaller and the relevant syzkaller author.
> 
> Don't bother, we already had this discussion *five years ago*:
> 
> https://lore.kernel.org/linux-xfs/20180523044742.GZ23861@dastard/
> 
> The same points there still apply -- we cannot break existing V4 users,
> the format is scheduled for removal, and it's *really unfair* for
> megacorporations like Intel and Google to dump zeroday reproducers onto
> public mailing lists expecting the maintainers will just magically come
> up with engineering resources to go fix all these corner cases.
> 
> Silicon Valley tech companies just laid off what, like 295,000
> programmers in the last 9 months?  Just think about what we could do if
> 1% of that went back to work fixing all the broken crap.
> 
> Hire a team to triage and fix the damn bugs or stop sending them.
> 
  Thanks for sharing the history of this issue!
  I had already sent an issue report to syzkaller before I received your email.
  Yes, we should not send useless reports to the Linux community.
  Thanks for the suggestion!
  In any case, we will carefully review reports of V4 filesystem issues before
  sending them, to cut down on useless reports.

  Thanks!
  BR.
  -Pengfei(Intel)

> --D
> 
> >  Thanks again!
> >  BR.
> >  -Pengfei
> > 
> > > -Dave.
> > > -- 
> > > Dave Chinner
> > > david@fromorbit.com
