From: Dave Chinner <david@fromorbit.com>
To: "hch@lst.de" <hch@lst.de>
Cc: "Verma, Vishal L" <vishal.l.verma@intel.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"darrick.wong@oracle.com" <darrick.wong@oracle.com>
Subject: Re: 5.3-rc1 regression with XFS log recovery
Date: Mon, 19 Aug 2019 15:31:22 +1000
Message-ID: <20190819053122.GK7777@dread.disaster.area>
In-Reply-To: <20190819044012.GA15800@lst.de>
On Mon, Aug 19, 2019 at 06:40:12AM +0200, hch@lst.de wrote:
> On Mon, Aug 19, 2019 at 06:29:05AM +0200, hch@lst.de wrote:
> > On Mon, Aug 19, 2019 at 02:22:59PM +1000, Dave Chinner wrote:
> > > That implies a kmalloc heap issue.
> > >
> > > Oh, is memory poisoning or something that modifies the alignment of
> > > slabs turned on?
> > >
> > > i.e. 4k/8k allocations from the kmalloc heap slabs might not be
> > > appropriately aligned for IO, similar to the problems we have with
> > > the xen blk driver?
> >
> > That is what I suspect, and as you can see in the attached config I
> > usually run with slab debugging on.
>
> Yep, looks like an unaligned allocation:
*nod*
> With the following debug patch. Based on that I think I'll just
> formally submit the vmalloc switch as we're at -rc5, and then we
> can restart the unaligned slub allocation drama..
But I'm not so sure that's all there is to it. I turned KASAN on and
it works for the first few mkfs/mount cycles, then a mount basically
pegs a CPU and it's spending most of its time inside KASAN
accounting code like this:
.....
save_stack+0x19/0x80
__kasan_kmalloc.constprop.10+0xc1/0xd0
kmem_cache_alloc+0xd2/0x220
mempool_alloc+0xda/0x230
bio_alloc_bioset+0x12d/0x2d0
xfs_rw_bdev+0x53/0x290
xlog_do_io+0xd1/0x1c0
xlog_bread+0x23/0x70
xlog_rseek_logrec_hdr+0x207/0x2a0
After a couple of minutes, the mount fails an ASSERT:
XFS: Assertion failed: head_blk != tail_blk, file: fs/xfs/xfs_log_recover.c, line: 5236
And moments after KASAN spews this:
BUG: KASAN: use-after-free in rwsem_down_read_slowpath+0x685/0x8f0
Read of size 4 at addr ffff88806f152778 by task systemd-udevd/2231
CPU: 4 PID: 2231 Comm: systemd-udevd Tainted: G D 5.3.0-rc5-dgc+ #1506
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
dump_stack+0x7c/0xc0
print_address_description+0x6c/0x322
? rwsem_down_read_slowpath+0x685/0x8f0
__kasan_report.cold.6+0x1c/0x3e
? rwsem_down_read_slowpath+0x685/0x8f0
? rwsem_down_read_slowpath+0x685/0x8f0
kasan_report+0xe/0x12
rwsem_down_read_slowpath+0x685/0x8f0
? unwind_get_return_address_ptr+0x50/0x50
? unwind_next_frame+0x6d6/0x8a0
? __down_timeout+0x1c0/0x1c0
? unwind_next_frame+0x6d6/0x8a0
? _raw_spin_lock+0x87/0xe0
? _raw_spin_lock+0x87/0xe0
? __cpuidle_text_end+0x5/0x5
? set_init_blocksize+0xe0/0xe0
? preempt_count_sub+0x43/0x50
? __might_sleep+0x31/0xd0
? set_init_blocksize+0xe0/0xe0
? ___might_sleep+0xc8/0xe0
down_read+0x18d/0x1a0
? refcount_sub_and_test_checked+0xaf/0x150
? rwsem_down_read_slowpath+0x8f0/0x8f0
? _raw_spin_lock+0x87/0xe0
__get_super.part.12+0xf8/0x130
fsync_bdev+0xf/0x60
invalidate_partition+0x1e/0x40
rescan_partitions+0x8a/0x420
blkdev_reread_part+0x1e/0x30
blkdev_ioctl+0xb0b/0xe60
? __blkdev_driver_ioctl+0x80/0x80
? __bpf_prog_run64+0xc0/0xc0
? stack_trace_save+0x8a/0xb0
? save_stack+0x4d/0x80
? __seccomp_filter+0x133/0xa10
? preempt_count_sub+0x43/0x50
block_ioctl+0x6d/0x80
do_vfs_ioctl+0x134/0x9c0
? ioctl_preallocate+0x140/0x140
? selinux_file_ioctl+0x2b7/0x360
? selinux_capable+0x20/0x20
? syscall_trace_enter+0x233/0x540
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x3d/0x50
do_syscall_64+0x70/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fade328a427
Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 8
RSP: 002b:00007ffdc4755928 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007fade328a427
RDX: 0000000000000000 RSI: 000000000000125f RDI: 000000000000000e
RBP: 0000000000000000 R08: 0000559597306140 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000055959736dbc0
R13: 0000000000000000 R14: 00007ffdc47569c8 R15: 000055959730dac0
Allocated by task 4739:
save_stack+0x19/0x80
__kasan_kmalloc.constprop.10+0xc1/0xd0
kmem_cache_alloc_node+0xf3/0x240
copy_process+0x1f91/0x2f20
_do_fork+0xe0/0x530
__x64_sys_clone+0x10e/0x160
do_syscall_64+0x70/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 0:
save_stack+0x19/0x80
__kasan_slab_free+0x12e/0x180
kmem_cache_free+0x84/0x2c0
rcu_core+0x35f/0x810
__do_softirq+0x15f/0x476
The buggy address belongs to the object at ffff888237048000
which belongs to the cache task_struct of size 9792
The buggy address is located 56 bytes inside of
9792-byte region [ffff888237048000, ffff88823704a640)
The buggy address belongs to the page:
page:ffffea0008dc1200 refcount:1 mapcount:0 mapping:ffff888078a91800 index:0x0 compound_mapcount: 0
flags: 0x57ffffc0010200(slab|head)
raw: 0057ffffc0010200 dead000000000100 dead000000000122 ffff888078a91800
raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888237047f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888237047f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888237048000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888237048080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888237048100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Oh, hell, it's an rwsem that is referencing a freed task.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com