linux-raid.vger.kernel.org archive mirror
From: Xiao Ni <xni@redhat.com>
To: neilb@suse.com
Cc: linux-raid@vger.kernel.org
Subject: Re: The dev node can't be released at once after stopping raid
Date: Wed, 30 Aug 2017 23:55:17 -0400 (EDT)
Message-ID: <1941784023.3852671.1504151717389.JavaMail.zimbra@redhat.com>
In-Reply-To: <1471667815.16472496.1496296238179.JavaMail.zimbra@redhat.com>

Hi Neil

I searched the list archives and found many threads on this topic. Sorry for raising it
again, but the situation I encountered looks different. There is a 1-second window between
stopping the RAID device and the removal of the /dev/md0 node. The node is removed
successfully after that 1 second.

No process opens /dev/md0 after mdadm -S /dev/md0:

mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1 --assume-clean
dmesg:
[36416.860525] Opened by mdadm, pid is 3523
[36416.984160] md/raid1:md0: active with 2 out of 2 mirrors
[36416.984181] md0: detected capacity change from 0 to 523239424
[36416.984219] Released by mdadm, pid is 3523
[36416.984228] remove_and_add_spares
[36416.991588] Opened by mdadm, pid is 3541
[36416.997183] Released by mdadm, pid is 3541
[36417.001376] Opened by systemd-udevd, pid is 3525
[36417.007128] Released by systemd-udevd, pid is 3525

udev:
KERNEL[36419.830817] add      /devices/virtual/bdi/9:0 (bdi)
KERNEL[36419.831045] add      /devices/virtual/block/md0 (block)
UDEV  [36419.832911] add      /devices/virtual/bdi/9:0 (bdi)
UDEV  [36419.836380] add      /devices/virtual/block/md0 (block)
KERNEL[36419.877705] change   /devices/virtual/block/loop0 (block)
KERNEL[36419.878057] change   /devices/virtual/block/loop0 (block)
KERNEL[36419.926761] change   /devices/virtual/block/loop1 (block)
KERNEL[36419.927015] change   /devices/virtual/block/loop1 (block)
UDEV  [36419.953112] change   /devices/virtual/block/loop0 (block)
UDEV  [36419.953141] change   /devices/virtual/block/loop1 (block)
KERNEL[36419.954765] change   /devices/virtual/block/md0 (block)
UDEV  [36419.955973] change   /devices/virtual/block/loop0 (block)
UDEV  [36419.962799] change   /devices/virtual/block/loop1 (block)
UDEV  [36419.982934] change   /devices/virtual/block/md0 (block)

mdadm -S /dev/md0
dmesg:
[36493.068054] Opened by mdadm, pid is 3552
[36493.072051] Released by mdadm, pid is 3552
[36493.076123] Opened by mdadm, pid is 3552
[36493.080073] md0: detected capacity change from 523239424 to 0
[36493.080077] md: md0 stopped.
[36493.273011] Released by mdadm, pid is 3552
udev:
KERNEL[36496.300219] remove   /devices/virtual/bdi/9:0 (bdi)
KERNEL[36496.300335] remove   /devices/virtual/block/md0 (block)
UDEV  [36496.300736] remove   /devices/virtual/bdi/9:0 (bdi)
UDEV  [36496.301812] remove   /devices/virtual/block/md0 (block)

Only remove events appear during the mdadm -S /dev/md0 command.
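
To see how quickly udev handles each event, the monitor output can be post-processed.
The following is only a sketch: udev_latency is a hypothetical helper name (it is not part
of udev or mdadm), and it assumes the "udevadm monitor" line layout shown above. For each
UDEV line it prints the delay since the most recent KERNEL line with the same action and
device path.

```shell
#!/bin/sh
# udev_latency: hypothetical helper, not part of udev or mdadm.
# Reads "udevadm monitor" output on stdin and prints one line per UDEV event:
#   ACTION DEVPATH SECONDS_SINCE_MATCHING_KERNEL_EVENT
# Limitation: if the same action/devpath repeats, a UDEV line is paired
# with the latest KERNEL line seen so far, which may not be its true match.
udev_latency() {
    awk '
    { sub(/\[/, " [") }                   # split "KERNEL[ts]" the same way as "UDEV  [ts]"
    $1 == "KERNEL" || $1 == "UDEV" {
        ts = $2
        gsub(/[^0-9.]/, "", ts)           # strip the surrounding brackets
        key = $3 " " $4                   # action + device path
        if ($1 == "KERNEL")
            seen[key] = ts
        else if (key in seen)
            printf "%s %s %.6f\n", $3, $4, ts - seen[key]
    }'
}
```

Fed the four remove lines above, it reports about 0.5 ms for the bdi device and about
1.5 ms for md0 between the KERNEL line and the corresponding UDEV line.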

I also created an LVM logical volume and removed it, to check whether LVM has the same problem.

pvcreate /dev/md0 
vgcreate vg /dev/md0 
lvcreate -L 100M -n test vg
lvremove vg/test -y
ls /dev/mapper/vg-test
ls /dev/dm-3

The nodes /dev/mapper/vg-test and /dev/dm-3 are removed in time, with no such window,
so this looks like a problem in md. Could you give me some suggestions?
What should I do next?

If it's not a bug, why is there a 1-second window?
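
In the meantime, a test script can poll until the node is gone instead of sleeping for a
fixed second. This is a minimal sketch under stated assumptions: wait_gone is a
hypothetical helper (not an mdadm feature), and it relies on a fractional "sleep 0.1",
which GNU coreutils and busybox accept but strict POSIX does not guarantee.

```shell
#!/bin/sh
# wait_gone PATH [TRIES]: hypothetical helper, not part of mdadm.
# Polls every 0.1 s until PATH no longer exists.
# Returns 0 once PATH is gone, 1 if it still exists after TRIES polls.
wait_gone() {
    path=$1
    tries=${2:-50}              # default: 50 polls x 0.1 s, about 5 seconds
    while [ -e "$path" ]; do
        [ "$tries" -le 0 ] && return 1
        sleep 0.1               # fractional sleep: an assumption, see above
        tries=$((tries - 1))
    done
    return 0
}
```

For example, "mdadm -S /dev/md0 && wait_gone /dev/md0" could replace the fixed "sleep 1"
in a test loop. "udevadm settle" may be an alternative; it waits for the udev event queue
to empty but does not check for a specific node.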

Best Regards
Xiao

----- Original Message -----
> From: "Xiao Ni" <xni@redhat.com>
> To: "Zhilong Liu" <zlliu@suse.com>
> Cc: linux-raid@vger.kernel.org
> Sent: Thursday, June 1, 2017 1:50:38 PM
> Subject: Re: The dev node can't be released at once after stopping raid
> 
> 
> 
> ----- Original Message -----
> > From: "Zhilong Liu" <zlliu@suse.com>
> > To: "Xiao Ni" <xni@redhat.com>, linux-raid@vger.kernel.org
> > Sent: Thursday, June 1, 2017 12:43:49 PM
> > Subject: Re: The dev node can't be released at once after stopping raid
> > 
> > 
> > 
> > On 06/01/2017 11:47 AM, Xiao Ni wrote:
> > > Hi all
> > >
> > > I tried with the latest linux stable kernel and latest mdadm.
> > >
> > > After stopping a RAID device, the device node isn't released
> > > immediately. I ran a simple test with this script:
> > >
> > > #!/bin/sh
> > >
> > > while [ 1 ]; do
> > > mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1
> > > sleep 5
> > > mdadm -S /dev/md0
> > > ls /dev/md0
> > > sleep 1
> > > ls /dev/md0
> > > done
> > >
> > > mdadm: stopped /dev/md0
> > > /dev/md0
> > > ls: cannot access /dev/md0: No such file or directory
> > >
> > > The script usually finds that the /dev/md0 node is still present right
> > > after the array is stopped. I'm not sure whether it's a bug. Do we need
> > > to do something to make sure the node is released before mdadm -S
> > > returns?
> > 
> > It's waiting for the udev events to be processed. We can monitor them with
> > "# udevadm monitor".
> > 
> > For mdadm -S /dev/md0, Manage_stop() already does the errno checking.
> > 
> > Here is the relevant piece of code from Manage.c:
> > .. .. .. ..
> > done:
> > 
> >      /* As we have an O_EXCL open, any use of the device
> >       * which blocks STOP_ARRAY is probably a transient use,
> >       * so it is reasonable to retry for a while - 5 seconds.
> >       */
> >      count = 25; err = 0;
> >      while (count && fd >= 0 &&
> >             (err = ioctl(fd, STOP_ARRAY, NULL)) < 0 && errno == EBUSY) {
> >          usleep(200000);
> >          count --;
> >      }
> 
> Hi Zhilong
> 
> Good suggestion. I tried it, and I can add some code to the script to wait.
> Would it be better to check the udev events in mdadm itself? We could check
> after closing mdfd, when Manage_stop returns, since that is mdadm's job, right?
> 
> Regards
> Xiao
> > 
> > Best regards,
> > -Zhilong
> > 
> > > Best Regards
> > > Xiao
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > >
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
