dm-devel.redhat.com archive mirror
From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: dm-devel@redhat.com
Subject: Re: How do you force-close a dm device after a disk failure?
Date: Thu, 17 Sep 2015 16:04:13 +0200
Message-ID: <20150917140413.GH7519@soda.linbit>
In-Reply-To: <20150914194552.213afd64@korath.teln.shikadi.net>

On Mon, Sep 14, 2015 at 07:45:52PM +1000, Adam Nielsen wrote:
> > The whole dm table, with all its deps, needs to be known.
> 
> $ dmsetup table
> backup: 0 11720531968 crypt aes-xts-plain64
>   0000000000000000000000000000000000000000000000000000000000000000 0
>   9:10 4096
> 
> $ dmsetup status
> backup: 0 11720531968 crypt
> 
> $ dmsetup ls --tree
> backup (253:0)
>  └─ (9:10)
> 
> $ dmsetup info -f
> Name:              backup
> State:             ACTIVE (DEFERRED REMOVE)
> Read Ahead:        4096
> Tables present:    LIVE
> Open count:        1
> Event number:      0
> Major, minor:      253, 0
> Number of targets: 1
> UUID: CRYPT-LUKS1-d0b3d38e421545908537dc50f59fb217-backup
> 
> All I'm using it for is to encrypt an mdadm-style RAID array composed
> of two external disks, connected temporarily via USB to do a full
> system backup with rsync.
> 
> > > I'm not sure how to do this, could you please elaborate?  I thought
> > > "dmsetup remove --force" would do this but as that doesn't work
> > 
> > Really, the state of the whole table needs to be known.
> > 
> > >> Also note - dmsetup remove  supports --deferred removal (see man
> > >> page).
> > >
> > > Oh I didn't notice that.  It doesn't seem to have much of an effect
> > > though:
> > 
> > Sure it will not fix your problem - it's like lazy umount...
> 
> So replacing the table with the 'error' target won't release the
> underlying device, even though that device is not used by the new
> target?
> 
> > What is not clear to me is - what is your expectation here ?
> > Obviously your system is far more broken - so placing 'error' target
> > for your backup device will not fix it.
> > 
> > You should likely attach also portion of 'dmesg' - there surely will
> > be written what is going wrong with your system.
> 
> What happened was in the middle of the backup, there was some USB
> interruption and the disks dropped out, so the writes started failing.
> The kernel logs were full of write errors to various sector numbers.  I
> think you would have the same result if you set things up with a USB
> stick and then unplugged it during a data transfer.
> 
> The devices are connected like this:
> 
>   dm device "backup"
>    |
>    +-- mdadm device /dev/md10
>         |
>         +-- USB/SATA disk A (/dev/sdd)
>         |
>         +-- USB/SATA disk B (/dev/sde)

mdadm /dev/md10 --fail /dev/sdd --remove /dev/sdd
mdadm /dev/md10 --fail /dev/sde --remove /dev/sde
(or maybe combine them into one command line, if that is supposed to work)

That should kick both disks out of the MD array,
should make md10 fail all pending (and new) requests,
and should even get the stuck dm suspend unstuck.

No?
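
A minimal sketch of the sequence suggested above, wrapped in a dry-run
helper so that it only prints the commands unless RUN=1 is set. The names
`run`, `release_stack` and the RUN variable are made up for illustration.
The follow-up `dmsetup remove` and `mdadm --stop` steps are an assumption
about what one would try next, and none of this has been tested against a
stack in the failed state described in this thread.

```shell
#!/bin/sh
# Sketch only. Device names are the ones quoted in this thread;
# adjust them for your own system.

run() {
    # Print each command; execute it only when RUN=1 is set.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

release_stack() {
    md=$1    # MD array, e.g. /dev/md10
    dm=$2    # dm mapping name, e.g. backup
    shift 2  # remaining arguments are the member disks

    # Mark each member faulty, then remove it from the array.
    for disk in "$@"; do
        run mdadm "$md" --fail "$disk" --remove "$disk"
    done

    # Assumption: once MD fails the outstanding I/O, the dm mapping
    # can be removed and the array stopped.
    run dmsetup remove "$dm"
    run mdadm --stop "$md"
}

release_stack /dev/md10 backup /dev/sdd /dev/sde
```

If the member device nodes have already vanished, mdadm(8) documents the
keyword `detached` as an alternative to a device name for both --fail and
--remove, which may be the easier form to use here.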

Cheers,

	Lars Ellenberg

> The problem is that I can't just reconnect the disks and rerun the
> backup.  mdadm refuses to stop the RAID array as it is in use by
> the dm device, and it thinks the array is active despite the disks being
> unplugged and in a drawer.  If I reconnect the disks they appear as
> different devices (sdf and sdg) but I still can't start the "new" array
> from these new disk devices, as it tells me the disks are already part
> of an active array.
> 
> So the only way I can have another go at running this backup is to
> close down /dev/md10, and it seems the only way I can do that is to
> tell dm to release that device.  It doesn't matter if the dm device
> "backup" is unusable, I will just create "backup2" to use for the
> second attempt.
> 
> But until I can figure out how to get dm to release the underlying
> device, I'm stuck!
> 
> > i.e. you cannot expect 'remove --force' to work when your machine
> > starts to show kernel errors.
> 
> There were no kernel crashes, just errors related to USB transfers.  I
> would assume this is not much different to how a real failed disk might
> behave, so I figure it is a situation that should be encountered
> relatively often!
> 
> Thanks again,
> Adam.
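
The situation described in the quoted text, where mdadm refuses to stop
the array because dm still holds it, can be checked from sysfs. The sketch
below lists the holders of a block device. The function name `holders_of`
and the SYSFS_ROOT override are hypothetical and exist only so that the
function can be tried against a fake directory tree; on a real system it
reads /sys/block/<dev>/holders.

```shell
#!/bin/sh
# Sketch: show which block devices hold a given device open.

holders_of() {
    dev=$1                          # kernel name, e.g. md10
    root=${SYSFS_ROOT:-/sys}        # hypothetical override for testing
    dir="$root/block/$dev/holders"

    [ -d "$dir" ] || { echo "no such device: $dev" >&2; return 1; }

    for h in "$dir"/*; do
        [ -e "$h" ] || continue     # no holders: the glob stays literal
        name=${h##*/}
        # dm devices expose their mapping name in dm/name.
        if [ -r "$root/block/$name/dm/name" ]; then
            echo "$name ($(cat "$root/block/$name/dm/name"))"
        else
            echo "$name"
        fi
    done
}

# Example: holders_of md10
```

For the stack in this thread, one would expect md10 to list the dm device
behind "backup" for as long as the mapping keeps the array open, and to
list nothing once dm has released it.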
