From: Xiao Ni <xni@redhat.com>
To: Song Liu <song@kernel.org>
Cc: guoqing.jiang@linux.dev, linux-raid@vger.kernel.org, ffan@redhat.com
Subject: Re: [PATCH 1/1] Add mddev->io_acct_cnt for raid0_quiesce
Date: Sun, 23 Oct 2022 17:15:51 +0800
Message-ID: <CALTww29M6sdefmCaP5Btd6BvR0i6YK9R-zo0YmnSTM76o2rTeA@mail.gmail.com>
In-Reply-To: <CAPhsuW5p=Hu8r+8qH1sxWBmna68hjJ+=aZL-fmRbFbEpsb2vQQ@mail.gmail.com>
On Sat, Oct 22, 2022 at 5:10 AM Song Liu <song@kernel.org> wrote:
>
> On Fri, Oct 21, 2022 at 3:07 AM Xiao Ni <xni@redhat.com> wrote:
> >
> > On Fri, Oct 21, 2022 at 3:50 AM Song Liu <song@kernel.org> wrote:
> > >
> > > On Sun, Oct 16, 2022 at 7:11 PM Xiao Ni <xni@redhat.com> wrote:
> > > >
> > > > > io_acct_set was added for raid0/raid5 I/O accounting, and it requires
> > > > > allocating an md_io_acct in the I/O path. Each one is freed when its bio
> > > > > comes back from the member disks. Currently there is no way to tell
> > > > > whether all of those bios have come back. During takeover, the raid0
> > > > > memory resources are freed, including the memory pool for md_io_acct, but
> > > > > some bios may not have returned yet. When those bios return, they can
> > > > > cause a panic because of a NULL pointer dereference or an invalid address.
> > > > >
> > > > > This patch adds io_acct_cnt, so that stopping raid0 can use it
> > > > > to wait until all bios have come back.
> > > >
> > > > Reported-by: Fine Fan <ffan@redhat.com>
> > > > Signed-off-by: Xiao Ni <xni@redhat.com>
> > >
> > > I have seen a lot of warnings and errors in dmesg with this patch. For example:
> > >
> > > [ 402.116463] =============================================================================
> > > > [ 402.117176] BUG bio-144 (Tainted: G B W ): Right Redzone overwritten
> > > [ 402.117837] -----------------------------------------------------------------------------
> > > [ 402.117837]
> > > [ 402.118713] 0xffff88816f683cd0-0xffff88816f683cd7 @offset=15568.
> > > First byte 0x0 instead of 0xcc
> > > [ 402.119505] Allocated in mempool_alloc+0x79/0x1a0 age=1038 cpu=19 pid=1130
> > > [ 402.120133] kmem_cache_alloc+0x2dc/0x3c0
> > > [ 402.120510] mempool_alloc+0x79/0x1a0
> > > [ 402.120840] bio_alloc_bioset+0xcb/0x530
> > > [ 402.121205] bio_alloc_clone+0x20/0x60
> > > [ 402.121560] md_account_bio+0x41/0x80
> > > [ 402.121890] raid5_make_request+0x1cf/0x1450
> > > [ 402.122327] md_handle_request+0x26c/0x3f0
> > > [ 402.122700] __submit_bio+0x53/0x180
> > > [ 402.123030] submit_bio_noacct_nocheck+0xe8/0x2b0
> > > [ 402.123453] __blkdev_direct_IO_async+0x109/0x1d0
> > > [ 402.123897] generic_file_direct_write+0x9c/0x1e0
> > > [ 402.124332] __generic_file_write_iter+0x95/0x170
> > > [ 402.124771] blkdev_write_iter+0xe9/0x180
> > > [ 402.125162] aio_write+0x11a/0x2e0
> > > [ 402.125503] io_submit_one+0x627/0xd20
> > > [ 402.125844] __x64_sys_io_submit+0x88/0x250
> > > > [ 402.126223] Slab 0xffffea0005bda000 objects=51 used=51 fp=0x0000000000000000 flags=0x200000000010200(slab|head|node=0|zone=2)
> > > [ 402.127227] Object 0xffff88816f683c40 @offset=15424 fp=0x0000000000000000
> > > [ 402.127227]
> > > > [ 402.127960] Redzone ffff88816f683c00: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
> > > > [ 402.128797] Redzone ffff88816f683c10: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
> > > > [ 402.129665] Redzone ffff88816f683c20: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
> > > > [ 402.130503] Redzone ffff88816f683c30: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
> > > > [ 402.131336] Object ffff88816f683c40: 80 a3 68 6f 81 88 ff ff af 21 00 00 01 00 00 00 ..ho.....!......
> > > > [ 402.132166] Object ffff88816f683c50: 00 00 00 00 00 00 00 00 80 23 09 0b 81 88 ff ff .........#......
> > > > [ 402.132996] Object ffff88816f683c60: 01 88 00 00 02 00 04 40 00 5a 5a 5a 00 00 00 00 .......@.ZZZ....
> > > > [ 402.133822] Object ffff88816f683c70: 88 86 1c 00 00 00 00 00 00 10 00 00 00 00 00 00 ................
> > > > [ 402.134647] Object ffff88816f683c80: 00 00 00 00 ff ff ff ff e0 a9 a8 81 ff ff ff ff ................
> > > > [ 402.135501] Object ffff88816f683c90: 40 3c 68 6f 81 88 ff ff 00 00 00 00 00 00 00 00 @<ho............
> > > > [ 402.136354] Object ffff88816f683ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > > > [ 402.137174] Object ffff88816f683cb0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
> > > > [ 402.138027] Object ffff88816f683cc0: 00 a4 68 6f 81 88 ff ff 40 2f c4 73 81 88 ff ff ..ho....@/.s....
> > > > [ 402.138857] Redzone ffff88816f683cd0: 00 20 c4 73 81 88 ff ff . .s....
> > > > [ 402.139657] Padding ffff88816f683d20: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
> > > > [ 402.140510] Padding ffff88816f683d30: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
> > > > [ 402.141345] CPU: 29 PID: 1092 Comm: md0_raid5 Tainted: G B W 6.1.0-rc1+ #145
> > > > [ 402.142083] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
> > > [ 402.143127] Call Trace:
> > > [ 402.143365] <TASK>
> > > [ 402.143563] dump_stack_lvl+0x45/0x5d
> > > [ 402.143899] check_bytes_and_report.cold+0x6d/0x85
> > > [ 402.144343] check_object+0x1fa/0x2d0
> > > [ 402.144675] free_debug_processing+0x1bc/0x660
> > > [ 402.145091] ? md_end_io_acct+0x3c/0x80
> > > [ 402.145464] ? md_end_io_acct+0x3c/0x80
> > > [ 402.145812] kmem_cache_free+0x55f/0x5b0
> > > [ 402.146164] md_end_io_acct+0x3c/0x80
> > > [ 402.146498] handle_stripe+0x11a5/0x1d70
> > > [ 402.146849] handle_active_stripes.constprop.0+0x487/0x5e0
> > > [ 402.147353] raid5d+0x40d/0x680
> > > [ 402.147640] ? lock_acquire+0x1ad/0x310
> > > [ 402.147989] md_thread+0xc2/0x170
> > > [ 402.148319] ? prepare_to_wait_exclusive+0xe0/0xe0
> > > [ 402.148749] ? register_md_personality+0x90/0x90
> > > [ 402.149162] kthread+0xf2/0x120
> > > [ 402.149455] ? kthread_complete_and_exit+0x20/0x20
> > > [ 402.149884] ret_from_fork+0x22/0x30
> > > [ 402.150211] </TASK>
> > > > [ 402.150431] FIX bio-144: Restoring Right Redzone 0xffff88816f683cd0-0xffff88816f683cd7=0xcc
> > > [ 402.151196] FIX bio-144: Object at 0xffff88816f683c40 not freed
> > >
> > > Please fix them and resend.
> > >
> > > Thanks,
> > > Song
> >
> > Hi Song
> >
> > What commands do you run? I've run some tests and didn't see the messages.
> > By the way, what disks do you use?
>
> I see these with regular IO. Some fio-libaio-direct workload should trigger it.
> This is running in Qemu on virtual nvme devices, and with some debug
> options enabled (KASAN, LOCKDEP, etc.).

Hi Song

I have reproduced this. Thanks for pointing this out. I'll fix this and re-send v2.

Regards
Xiao
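
The quoted patch description proposes counting outstanding accounting bios so that the quiesce path can wait for all of them before the md_io_acct pool is freed. As a rough illustration of that idea only, here is a minimal user-space sketch using pthreads. It is not the patch's code and does not use kernel APIs; struct acct_counter, acct_get(), acct_put(), acct_wait_idle() and run_demo() are hypothetical names chosen for this example, and a thread stands in for the member disks completing bios.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the in-flight counter that the
 * quoted patch description calls io_acct_cnt. */
struct acct_counter {
    pthread_mutex_t lock;
    pthread_cond_t  idle;      /* signalled when in_flight reaches zero */
    long            in_flight; /* bios submitted but not yet completed */
};

static void acct_init(struct acct_counter *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->idle, NULL);
    c->in_flight = 0;
}

/* Submission path: count the bio before it goes to a member disk. */
static void acct_get(struct acct_counter *c)
{
    pthread_mutex_lock(&c->lock);
    c->in_flight++;
    pthread_mutex_unlock(&c->lock);
}

/* Completion path: drop the count and wake any waiter on the last one. */
static void acct_put(struct acct_counter *c)
{
    pthread_mutex_lock(&c->lock);
    assert(c->in_flight > 0);
    if (--c->in_flight == 0)
        pthread_cond_broadcast(&c->idle);
    pthread_mutex_unlock(&c->lock);
}

/* Quiesce path: block until every outstanding bio has completed. */
static void acct_wait_idle(struct acct_counter *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->in_flight != 0)
        pthread_cond_wait(&c->idle, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

struct demo {
    struct acct_counter cnt;
    int nbios;
};

/* Plays the role of the member disks completing bios asynchronously. */
static void *complete_bios(void *arg)
{
    struct demo *d = arg;

    for (int i = 0; i < d->nbios; i++)
        acct_put(&d->cnt);
    return NULL;
}

/* Submits nbios, quiesces, and returns how many bios were still in flight
 * at the point where the pool could be freed (-1 if no thread could start). */
static long run_demo(int nbios)
{
    struct demo d;
    pthread_t completer;
    long left;

    acct_init(&d.cnt);
    d.nbios = nbios;
    for (int i = 0; i < nbios; i++)
        acct_get(&d.cnt);
    if (pthread_create(&completer, NULL, complete_bios, &d) != 0)
        return -1;

    acct_wait_idle(&d.cnt);    /* only after this would freeing be safe */
    left = d.cnt.in_flight;
    pthread_join(completer, NULL);
    pthread_cond_destroy(&d.cnt.idle);
    pthread_mutex_destroy(&d.cnt.lock);
    return left;
}
```

Calling run_demo(8) from a small main() submits eight simulated bios and returns the number still outstanding once the wait finishes; older toolchains may need -pthread at build time. The sketch only shows the ordering constraint (free after the count drains) and says nothing about the redzone corruption reported above.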