linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andres Freund <andres@anarazel.de>
To: Theodore Ts'o <tytso@mit.edu>
Cc: "Shreeya Patel" <shreeya.patel@collabora.com>,
	linux-ext4@vger.kernel.org,
	"Ricardo Cañuelo" <ricardo.canuelo@collabora.com>,
	"gustavo.padovan@collabora.com" <gustavo.padovan@collabora.com>,
	groeck@google.com, zsm@google.com, garrick@google.com,
	"Linux regressions mailing list" <regressions@lists.linux.dev>
Subject: Re: [syzbot] INFO: task hung in ext4_fallocate
Date: Tue, 3 Oct 2023 07:11:38 -0700	[thread overview]
Message-ID: <20231003141138.owt6qwqyf4slgqgp@alap3.anarazel.de> (raw)
In-Reply-To: <ZQjKOjrjDYsoXBXj@mit.edu>

Hi,

On 2023-09-18 18:07:54 -0400, Theodore Ts'o wrote:
> On Mon, Sep 18, 2023 at 08:13:30PM +0530, Shreeya Patel wrote:
> > When I tried to reproduce this issue on mainline linux kernel using the
> > reproducer provided by syzbot, I see an endless loop of following errors :-
> >
> > [   89.689751][ T3922] loop1: detected capacity change from 0 to 512
> > [   89.690514][ T3916] EXT4-fs error (device loop4): ext4_map_blocks:577:
> > inode #3: block 9: comm poc: lblock 0 mapped to illegal pblock 9 (length 1)
> > [   89.694709][ T3890] EXT4-fs error (device loop0): ext4_map_blocks:577:
>
> Please note that maliciously corrupted file system is considered low
> priority by ext4 developers.

FWIW, I am seeing occasional hangs in ext4_fallocate with 6.6-rc4 as well,
just doing database development on my laptop.

Oct 03 06:48:30 alap5 kernel: INFO: task postgres:132485 blocked for more than 122 seconds.
Oct 03 06:48:30 alap5 kernel:       Not tainted 6.6.0-rc4-andres-00009-ge38f6d0c8360 #74
Oct 03 06:48:30 alap5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 03 06:48:30 alap5 kernel: task:postgres        state:D stack:0     pid:132485 ppid:132017 flags:0x00020006
Oct 03 06:48:30 alap5 kernel: Call Trace:
Oct 03 06:48:30 alap5 kernel:  <TASK>
Oct 03 06:48:30 alap5 kernel:  __schedule+0x388/0x13e0
Oct 03 06:48:30 alap5 kernel:  ? nvme_queue_rqs+0x1e6/0x290
Oct 03 06:48:30 alap5 kernel:  schedule+0x5f/0xe0
Oct 03 06:48:30 alap5 kernel:  inode_dio_wait+0xd5/0x100
Oct 03 06:48:30 alap5 kernel:  ? membarrier_register_private_expedited+0xa0/0xa0
Oct 03 06:48:30 alap5 kernel:  ext4_fallocate+0x12f/0x1040
Oct 03 06:48:30 alap5 kernel:  ? io_submit_sqes+0x392/0x660
Oct 03 06:48:30 alap5 kernel:  vfs_fallocate+0x135/0x360
Oct 03 06:48:30 alap5 kernel:  __x64_sys_fallocate+0x42/0x70
Oct 03 06:48:30 alap5 kernel:  do_syscall_64+0x38/0x80
Oct 03 06:48:30 alap5 kernel:  entry_SYSCALL_64_after_hwframe+0x46/0xb0
Oct 03 06:48:30 alap5 kernel: RIP: 0033:0x7fda32415f82
Oct 03 06:48:30 alap5 kernel: RSP: 002b:00007ffedbea0828 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
Oct 03 06:48:30 alap5 kernel: RAX: ffffffffffffffda RBX: 00000000002a6000 RCX: 00007fda32415f82
Oct 03 06:48:30 alap5 kernel: RDX: 0000000014b22000 RSI: 0000000000000000 RDI: 0000000000000016
Oct 03 06:48:30 alap5 kernel: RBP: 0000000014b22000 R08: 0000000014b22000 R09: 0000558f70eabed0
Oct 03 06:48:30 alap5 kernel: R10: 00000000002a6000 R11: 0000000000000246 R12: 00000000000000e0
Oct 03 06:48:30 alap5 kernel: R13: 000000000a000013 R14: 0000000000000004 R15: 0000558f709d4e70
Oct 03 06:48:30 alap5 kernel:  </TASK>

cat /proc/132485/stack
[<0>] inode_dio_wait+0xd5/0x100
[<0>] ext4_fallocate+0x12f/0x1040
[<0>] vfs_fallocate+0x135/0x360
[<0>] __x64_sys_fallocate+0x42/0x70
[<0>] do_syscall_64+0x38/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

cat /sys/kernel/debug/block/nvme1n1/hctx*/busy
doesn't show any active IO. Nor does the issue resolve when resetting the
controller - so I don't think this is a disk/block level issue.


This is on 8f1b4600373f, with one local, unrelated, commit ontop.

Unfortunately, so far, I've only reproduced this every couple hours of
interactive work, so bisecting isn't really feasible. I've not hit it on 6.5
over a considerably longer time, so I am reasonably confident this isn't an
older issue.

Greetings,

Andres Freund

  reply	other threads:[~2023-10-03 14:11 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-18 14:43 [syzbot] INFO: task hung in ext4_fallocate Shreeya Patel
2023-09-18 22:07 ` Theodore Ts'o
2023-10-03 14:11   ` Andres Freund [this message]
2023-10-03 23:25     ` Theodore Ts'o
2023-10-04  0:42       ` Andres Freund
2023-10-05  9:04         ` Shreeya Patel
2023-10-10  9:21           ` Thorsten Leemhuis
2023-10-10 16:25             ` Andres Freund
2023-10-12 20:58             ` Shreeya Patel
2023-10-14  1:54               ` Theodore Ts'o
2023-10-17 14:58                 ` Shreeya Patel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231003141138.owt6qwqyf4slgqgp@alap3.anarazel.de \
    --to=andres@anarazel.de \
    --cc=garrick@google.com \
    --cc=groeck@google.com \
    --cc=gustavo.padovan@collabora.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=ricardo.canuelo@collabora.com \
    --cc=shreeya.patel@collabora.com \
    --cc=tytso@mit.edu \
    --cc=zsm@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).