From: Guoqing Jiang <guoqing.jiang@linux.dev>
To: Benjamin Sonntag <benjamin@octopuce.fr>,
linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Bug in reshape+discard?
Date: Tue, 14 Mar 2023 15:35:08 +0800
Message-ID: <af276fee-83ef-9ea1-9cac-780b8ec4d36c@linux.dev>
In-Reply-To: <1132955858.4035.1677965762570.JavaMail.zimbra@z1.octopuce.fr>
Hi,
On 3/5/23 05:36, Benjamin Sonntag wrote:
> Hi everyone,
>
> I think we found a bug in the mdadm code here at Octopuce.
Probably something wrong inside md raid.
> I'm reporting it here, please tell me if that's not the right place to report it, or if you need any other information.
>
> This bug "hangs" processes in the Device-busy (D) state forever, until we reboot. It has been tested on both Debian 5.10 and 6.0 Linux kernels.
Do you mean it happened on both kernel versions? Could you share the
relevant stacks with "cat /proc/${pid of D state process}/stack"?
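A minimal sketch of one way to collect those stacks for every stuck process at once. The function names are made up for illustration, and it assumes a procps-style "ps" and that /proc/PID/stack is readable (this normally requires root):

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of any tool.

# Read "PID STAT" lines on stdin and print the PIDs whose state starts with D
# (uninterruptible sleep).
filter_d_state() {
    awk '$2 ~ /^D/ { print $1 }'
}

# List the PIDs of all processes currently in the D state.
list_d_state_pids() {
    ps -eo pid=,stat= | filter_d_state
}

# Print the kernel stack of each D-state process.
dump_d_state_stacks() {
    for pid in $(list_d_state_pids); do
        echo "=== pid $pid ($(cat "/proc/$pid/comm" 2>/dev/null)) ==="
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable; try as root)"
    done
}
```

Running "dump_d_state_stacks" as root right after the sync hangs should show where in the kernel each process is waiting.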
> How to trigger the bug:
>
> - create a raid5 or raid6 block device using mdadm
> mdadm --create /dev/md0 -l 5 -n 3 /dev/sd{a,b,c}2
>
> - create a partition on it and mount it USING DISCARD/TRIM (important) (the underlying device must support trim)
> mkfs.ext4 /dev/md0
> mount /dev/md0 /mnt -o discard
>
> - create a few files
> for i in $( seq 1 1000 ) ; do dd if=/dev/zero of=/mnt/$i bs=10M count=1 ; done
>
> - expand the raid by adding a new drive
> mdadm --add /dev/md0 /dev/sdd2
> mdadm --grow /dev/md0 -n 4
>
> - the last command will start a "reshape" operation on md0
> - DURING THE RESHAPE (it's important) erase some file (it goes fine)
> rm /mnt/* -rf
>
> - THEN, still during the reshape (important) try to sync or fsync
> sync
>
> - the sync process gets stuck in the D state. No way to kill it until reboot
> (in fact, any process that does sync during the reshape after some files were deleted will get stuck, such as mariadbd or rsyslog...)
>
> - If you don't mount your partition with discard, the 'sync' works properly
>
>
> An easy way to circumvent this problem:
>
> - before reshaping, remount without discard
> mount /mnt -o remount,nodiscard
>
> - after the reshaping ends, remount with discard
> mount /mnt -o remount,discard
>
>
> We don't really know how to start searching for a solution, since it requires knowing the internals of MD & Discard pretty well :/ (and I'm definitely not a kernel coder ;) )
>
> thanks for your help on this issue,
Assuming reshape + discard worked with a previous kernel version, maybe
you can try to bisect the kernel tree to see which commit might have
caused the bug.
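A rough sketch of how such a bisection could be started with git. The helper name is hypothetical, and the "good" revision is a placeholder: no known-good kernel version has been identified in this thread, so one would have to be found by testing older releases first.

```shell
#!/bin/sh
# Sketch only: start_bisect is a hypothetical helper, not an existing tool.

# start_bisect REPO BAD GOOD
#   REPO: path to an existing kernel git checkout
#   BAD:  a revision that shows the hang (e.g. the tag of the affected kernel)
#   GOOD: a revision that does NOT show the hang (assumption: one exists)
start_bisect() {
    repo=$1
    bad=$2
    good=$3
    # "git bisect start <bad> <good>" marks both ends and checks out a
    # revision roughly halfway between them.
    git -C "$repo" bisect start "$bad" "$good"
}

# After each checkout: build and boot that kernel, run the reproducer, then
# record the result in the checkout with one of:
#   git bisect good     # sync returned normally
#   git bisect bad      # sync got stuck in D state
# When git names the first bad commit, clean up with:
#   git bisect reset
```

Each step needs a kernel build and a reboot, so it helps to script the reproducer so that every round is tested the same way.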
Thanks,
Guoqing