From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
To: Mateusz Kusiak <mateusz.kusiak@linux.intel.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: MD: Long delay for container drive removal
Date: Mon, 24 Jun 2024 09:14:10 +0200
Message-ID: <20240624091410.00007100@linux.intel.com>
In-Reply-To: <24cf4b0e-2cb5-4b50-8867-f7feadaf367d@linux.intel.com>
On Thu, 20 Jun 2024 14:43:50 +0200
Mateusz Kusiak <mateusz.kusiak@linux.intel.com> wrote:
> On 18.06.2024 16:24, Mateusz Kusiak wrote:
> > Hi all,
> > we have an issue submitted for SLES15SP6 that is caused by huge delays when
> > trying to remove drive from a container.
> >
> > The scenario is as follows:
> > 1. Create two drive imsm container
> > # mdadm --create --run /dev/md/imsm --metadata=imsm --raid-devices=2 /dev/nvme[0-1]n1
> > 2. Remove single drive from container
> > # mdadm /dev/md127 --remove /dev/nvme0n1
> >
> > The problem is that drive removal may take up to 7 seconds, which causes
> > timeouts for other components that are mdadm dependent.
> >
> > We narrowed it down to be MD related. We tested this with inbox mdadm-4.3
> > and mdadm-4.2 on SP6 and delay time is pretty much the same. SP5 is free of
> > this issue.
> >
> > I also tried RHEL 8.9 and drive removal is almost instant.
> >
> > Is it default behavior now, or should we treat this as an issue?
> >
> > Thanks,
> > Mateusz
> >
>
> I dug into this more. I retested this on:
> - Ubuntu 24.04 with inbox kernel 6.6.0: No reproduction
> - RHEL 9.4 with upstream kernel 6.9.5-1: Got reproduction
> (Note that SLES15SP6 comes with 6.8.0-rc4 inbox)
>
> I plugged into mdadm with gdb and found out that ioctl call in
> hot_remove_disk() fails and it's causing a delay. The function looks as
> follows:
>
> int hot_remove_disk(int mdfd, unsigned long dev, int force)
> {
> 	int cnt = force ? 500 : 5;
> 	int ret;
>
> 	/* HOT_REMOVE_DISK can fail with EBUSY if there are
> 	 * outstanding IO requests to the device.
> 	 * In this case, it can be helpful to wait a little while,
> 	 * up to 5 seconds if 'force' is set, or 50 msec if not.
> 	 */
> 	while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
> 	       errno == EBUSY &&
> 	       cnt-- > 0)
> 		sleep_for(0, MSEC_TO_NSEC(10), true);
>
> 	return ret;
> }
> ... if it fails, then it defaults to removing drive via sysfs call.
>
> Looks like a kernel ioctl issue...
>

Hello,
I investigated this. It looks like the HOT_REMOVE_DISK ioctl almost always fails
for an array with no raid personality. At some point it was allowed, but it was
blocked 6 years ago in c42a0e2675 (this id leads to a merge commit, so I am
giving the title: "md: fix NULL dereference of mddev->pers in
remove_and_add_spares()").

That explains why we have an outdated comment in mdadm:
	if (err && errno == ENODEV) {
		/* Old kernels rejected this if no personality
		 * is registered */

I'm working on a fix in mdadm (for kernels with this hang). I will remove the
ioctl call for external containers:
https://github.com/md-raid-utilities/mdadm/pull/31
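
To illustrate the direction of that change, here is a minimal sketch of removing
a member through sysfs only, without issuing the ioctl first. This is not the
code from the pull request. The function names and the sysfs_root parameter are
hypothetical, and the sketch assumes the member's state attribute lives at
<root>/block/<md>/md/dev-<disk>/state and accepts the string "remove".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the path of a member's sysfs "state" file.
 * Returns 0 on success, -1 if the path does not fit in buf. */
int member_state_path(char *buf, size_t len, const char *sysfs_root,
                      const char *md_name, const char *disk_name)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "%s/block/%s/md/dev-%s/state",
                 sysfs_root, md_name, disk_name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: ask md to drop the member by writing "remove" to
 * its state file. Returns 0 on success, -1 on failure (errno is set by
 * the failing stdio call). */
int sysfs_request_remove(const char *state_path)
{
    static const char cmd[] = "remove";
    FILE *f = fopen(state_path, "w");
    int rc = 0;

    if (f == NULL)
        return -1;
    if (fwrite(cmd, 1, strlen(cmd), f) != strlen(cmd))
        rc = -1;
    /* stdio buffers the write, so a rejected request may only show up here */
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

For the reproduction scenario quoted above, the arguments would be "/sys",
"md127" and "nvme0n1".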

On the HOT_REMOVE_DISK ioctl path, there is a wait for the MD_RECOVERY_NEEDED
flag to be cleared, with the timeout set to 5 seconds. When I disable this wait
for arrays with no personality, the issue goes away. However, I'm not sure it
is the right fix. I would expect MD_RECOVERY_NEEDED not to be set for arrays
with no MD personality.

Kuai and Song, could you please advise?

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c0426a6d2fd1..bd1cedeb105b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7827,7 +7827,7 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
 		return get_bitmap_file(mddev, argp);
 	}
 
-	if (cmd == HOT_REMOVE_DISK)
+	if (cmd == HOT_REMOVE_DISK && mddev->pers)
 		/* need to ensure recovery thread has run */
 		wait_event_interruptible_timeout(mddev->sb_wait,
 						 !test_bit(MD_RECOVERY_NEEDED,
Thanks,
Mariusz