From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
David Sterba <dsterba@suse.com>,
Filipe Manana <fdmanana@suse.com>
Subject: [PATCH 5.10 28/29] btrfs: fix crash after non-aligned direct IO write with O_DSYNC
Date: Mon, 22 Feb 2021 13:13:22 +0100
Message-ID: <20210222121025.628566179@linuxfoundation.org>
In-Reply-To: <20210222121019.444399883@linuxfoundation.org>
From: Filipe Manana <fdmanana@suse.com>
Whenever we attempt to do a non-aligned direct IO write with O_DSYNC, we
end up triggering an assertion and crashing. Example reproducer:
  $ cat test.sh
  #!/bin/bash

  DEV=/dev/sdj
  MNT=/mnt/sdj

  mkfs.btrfs -f $DEV > /dev/null
  mount $DEV $MNT

  # Do a direct IO write with O_DSYNC into a non-aligned range...
  xfs_io -f -d -s -c "pwrite -S 0xab -b 64K 1111 64K" $MNT/foobar

  umount $MNT
When running the reproducer an assertion fails and produces the following
trace:
[ 2418.403134] assertion failed: !current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA, in fs/btrfs/space-info.c:1467
[ 2418.403745] ------------[ cut here ]------------
[ 2418.404306] kernel BUG at fs/btrfs/ctree.h:3286!
[ 2418.404862] invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC PTI
[ 2418.405451] CPU: 1 PID: 64705 Comm: xfs_io Tainted: G D 5.10.15-btrfs-next-87 #1
[ 2418.406026] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2418.407228] RIP: 0010:assertfail.constprop.0+0x18/0x26 [btrfs]
[ 2418.407835] Code: e6 48 c7 (...)
[ 2418.409078] RSP: 0018:ffffb06080d13c98 EFLAGS: 00010246
[ 2418.409696] RAX: 000000000000006c RBX: ffff994c1debbf08 RCX: 0000000000000000
[ 2418.410302] RDX: 0000000000000000 RSI: 0000000000000027 RDI: 00000000ffffffff
[ 2418.410904] RBP: ffff994c21770000 R08: 0000000000000000 R09: 0000000000000000
[ 2418.411504] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000010000
[ 2418.412111] R13: ffff994c22198400 R14: ffff994c21770000 R15: 0000000000000000
[ 2418.412713] FS: 00007f54fd7aff00(0000) GS:ffff994d35200000(0000) knlGS:0000000000000000
[ 2418.413326] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2418.413933] CR2: 000056549596d000 CR3: 000000010b928003 CR4: 0000000000370ee0
[ 2418.414528] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2418.415109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2418.415669] Call Trace:
[ 2418.416254] btrfs_reserve_data_bytes.cold+0x22/0x22 [btrfs]
[ 2418.416812] btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
[ 2418.417380] btrfs_buffered_write+0x1b0/0x7f0 [btrfs]
[ 2418.418315] btrfs_file_write_iter+0x2a9/0x770 [btrfs]
[ 2418.418920] new_sync_write+0x11f/0x1c0
[ 2418.419430] vfs_write+0x2bb/0x3b0
[ 2418.419972] __x64_sys_pwrite64+0x90/0xc0
[ 2418.420486] do_syscall_64+0x33/0x80
[ 2418.420979] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2418.421486] RIP: 0033:0x7f54fda0b986
[ 2418.421981] Code: 48 c7 c0 (...)
[ 2418.423019] RSP: 002b:00007ffc40569c38 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
[ 2418.423547] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54fda0b986
[ 2418.424075] RDX: 0000000000010000 RSI: 000056549595e000 RDI: 0000000000000003
[ 2418.424596] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000400
[ 2418.425119] R10: 0000000000000400 R11: 0000000000000246 R12: 00000000ffffffff
[ 2418.425644] R13: 0000000000000400 R14: 0000000000010000 R15: 0000000000000000
[ 2418.426148] Modules linked in: btrfs blake2b_generic (...)
[ 2418.429540] ---[ end trace ef2aeb44dc0afa34 ]---
This happens because:
1) At btrfs_file_write_iter() we set current->journal_info to
   BTRFS_DIO_SYNC_STUB;
2) We then call __btrfs_direct_write(), which calls btrfs_direct_IO();
3) We can't do the direct IO write because it starts at a non-aligned
   offset (1111). check_direct_IO(), which does the alignment check,
   returns -EINVAL, so btrfs_direct_IO() bails out early and returns 0,
   but we leave current->journal_info set to BTRFS_DIO_SYNC_STUB - we
   only clear it at btrfs_dio_iomap_begin(), because we assume we always
   get there;
4) Then at __btrfs_direct_write() we see that the attempt to do the
   direct IO write was not successful, 0 bytes written, so we fall back
   to a buffered write by calling btrfs_buffered_write();
5) There we call btrfs_check_data_free_space() which in turn calls
   btrfs_alloc_data_chunk_ondemand() and that calls
   btrfs_reserve_data_bytes() with flush == BTRFS_RESERVE_FLUSH_DATA;
6) Then at btrfs_reserve_data_bytes() we have current->journal_info set to
   BTRFS_DIO_SYNC_STUB, therefore not NULL, and flush has the value
   BTRFS_RESERVE_FLUSH_DATA, triggering the second assertion:
  int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
                               enum btrfs_reserve_flush_enum flush)
  {
      struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
      int ret;

      ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
             flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE);
      ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
  (...)
So fix that by resetting current->journal_info to NULL whenever
check_direct_IO() returns a failure.
This bug only affects 5.10 kernels, and the regression was introduced in
5.10-rc1 by commit 0eb79294dbe328 ("btrfs: dio iomap DSYNC workaround").
The bug does not exist in 5.11 kernels due to commit ecfdc08b8cc65d
("btrfs: remove dio iomap DSYNC workaround"), which depends on a large
patchset that went into the merge window for 5.11. So this is a fix only
for 5.10.x stable kernels, as there are people hitting this bug.
Fixes: 0eb79294dbe328 ("btrfs: dio iomap DSYNC workaround")
CC: stable@vger.kernel.org # 5.10 (and only 5.10)
Acked-by: David Sterba <dsterba@suse.com>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1181605
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/inode.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8026,8 +8026,12 @@ ssize_t btrfs_direct_IO(struct kiocb *io
 	bool relock = false;
 	ssize_t ret;
 
-	if (check_direct_IO(fs_info, iter, offset))
+	if (check_direct_IO(fs_info, iter, offset)) {
+		ASSERT(current->journal_info == NULL ||
+		       current->journal_info == BTRFS_DIO_SYNC_STUB);
+		current->journal_info = NULL;
 		return 0;
+	}
 
 	count = iov_iter_count(iter);
 	if (iov_iter_rw(iter) == WRITE) {