deleting a dead device

linux-btrfs.vger.kernel.org archive mirror

* deleting a dead device
@ 2014-09-21  1:39 Russell Coker
  2014-09-21  7:01 ` Duncan
  2014-09-21 17:05 ` Chris Murphy
  0 siblings, 2 replies; 5+ messages in thread
From: Russell Coker @ 2014-09-21  1:39 UTC
  To: linux-btrfs

On a system running the Debian 3.14.15-2 kernel I added a new drive to a 
RAID-1 array.  My aim was to add a device and remove one of the old devices.

Sep 21 11:26:51 server kernel: [2070145.375221] BTRFS: lost page write due to 
I/O error on /dev/sdc3
Sep 21 11:26:51 server kernel: [2070145.375225] BTRFS: bdev /dev/sdc3 errs: wr 
269, rd 0, flush 0, corrupt 0, gen 0
Sep 21 11:27:21 server kernel: [2070175.517691] BTRFS: lost page write due to 
I/O error on /dev/sdc3
Sep 21 11:27:21 server kernel: [2070175.517699] BTRFS: bdev /dev/sdc3 errs: wr 
270, rd 0, flush 0, corrupt 0, gen 0
Sep 21 11:27:21 server kernel: [2070175.517712] BTRFS: lost page write due to 
I/O error on /dev/sdc3
Sep 21 11:27:21 server kernel: [2070175.517715] BTRFS: bdev /dev/sdc3 errs: wr 
271, rd 0, flush 0, corrupt 0, gen 0
Sep 21 11:27:51 server kernel: [2070205.665947] BTRFS: lost page write due to 
I/O error on /dev/sdc3
Sep 21 11:27:51 server kernel: [2070205.665955] BTRFS: bdev /dev/sdc3 errs: wr 
272, rd 0, flush 0, corrupt 0, gen 0
Sep 21 11:27:51 server kernel: [2070205.665967] BTRFS: lost page write due to 
I/O error on /dev/sdc3
Sep 21 11:27:51 server kernel: [2070205.665971] BTRFS: bdev /dev/sdc3 errs: wr 
273, rd 0, flush 0, corrupt 0, gen 0

Anyway, the new drive turned out to have some errors: writes failed and I've 
got a heap of errors such as the above.  The errors started immediately after 
adding the drive and the system wasn't actively writing to the filesystem.  So 
very few (if any) writes made it to the device.

# btrfs device delete /dev/sdc3 /
ERROR: error removing the device '/dev/sdc3' - Invalid argument

It seems that I can't remove the device because removing requires writing.

# btrfs device delete /dev/sdc3 /
ERROR: error removing the device '/dev/sdc3' - No such file or directory
# btrfs device stats /
[/dev/sda3].write_io_errs   0
[/dev/sda3].read_io_errs    0
[/dev/sda3].flush_io_errs   0
[/dev/sda3].corruption_errs 57
[/dev/sda3].generation_errs 0
[/dev/sdb3].write_io_errs   0
[/dev/sdb3].read_io_errs    0
[/dev/sdb3].flush_io_errs   0
[/dev/sdb3].corruption_errs 0
[/dev/sdb3].generation_errs 0
[/dev/sdc3].write_io_errs   267
[/dev/sdc3].read_io_errs    0
[/dev/sdc3].flush_io_errs   0
[/dev/sdc3].corruption_errs 0
[/dev/sdc3].generation_errs 0

The drive is attached by USB so I turned off the USB device and then got the 
above result.  So it still seems impossible to remove the device even though 
it's physically not present.  I've connected a new USB disk which is now 
/dev/sdd, so it seems that BTRFS is keeping the name /dev/sdc locked.
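
With fifteen counters in the stats output above, the non-zero ones are easy to overlook. The following is a minimal sketch for filtering them, assuming the two-column `[device].counter value` layout shown in this message. `nonzero_stats` is a hypothetical helper name, not part of btrfs-progs.

```shell
# Print only the non-zero counters from "btrfs device stats" output.
# nonzero_stats is a hypothetical helper, not a btrfs-progs command.
nonzero_stats() {
    # $1 is "[device].counter", $2 is the count; "+ 0" forces a numeric compare.
    awk '$2 + 0 > 0 { print $1, $2 }'
}

# Live usage (needs root): btrfs device stats / | nonzero_stats
# Here it is fed three lines copied from the output above.
printf '%s\n' \
    '[/dev/sda3].corruption_errs 57' \
    '[/dev/sdb3].write_io_errs   0' \
    '[/dev/sdc3].write_io_errs   267' | nonzero_stats
```

Run against the full output above, this would list the write errors on /dev/sdc3 and also the 57 corruption errors on /dev/sda3.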

Should there be a way to fix this without rebooting or anything?

Also as an aside, while the stats about write errors are useful, in this case 
it would be really good to also have a count of successful writes, so that I 
could see whether the successful write count was close to 0.  My understanding 
of the BTRFS design is that there would be no performance penalty for adding 
counts of the number of successful reads and writes to the superblock.  Could 
this be done?

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

* Re: deleting a dead device
  2014-09-21  1:39 deleting a dead device Russell Coker
@ 2014-09-21  7:01 ` Duncan
  2014-09-21  7:34   ` Russell Coker
  2014-09-21 17:05 ` Chris Murphy
  1 sibling, 1 reply; 5+ messages in thread
From: Duncan @ 2014-09-21  7:01 UTC
  To: linux-btrfs

Russell Coker posted on Sun, 21 Sep 2014 11:39:17 +1000 as excerpted:

> On a system running the Debian 3.14.15-2 kernel I added a new drive to a
> RAID-1 array.  My aim was to add a device and remove one of the old
> devices.

That's an old kernel and presumably an old btrfs-progs.  Quite a number 
of device management fixes have gone in recently, and you'd likely not be 
in quite that predicament were you running a current kernel (make sure 
it's 3.16.2+ or 3.17-rc2+ to get the fix for the workqueues bug that 
affected 3.15 thru 3.16.1, as well as 3.17-rc1).

And the recommended way to handle a device replace now would be btrfs 
replace, doing the add and delete in a single (albeit possibly long) step 
instead of separately.

[snip most of the problem description]

> The drive is attached by USB so I turned off the USB device and then got
> the above result.  So it still seems impossible to remove the device
> even though it's physically not present.  I've connected a new USB disk
> which is now /dev/sdd, so it seems that BTRFS is keeping the name
> /dev/sdc locked.
> 
> Should there be a way to fix this without rebooting or anything?

Did you try btrfs device delete missing?  It's documented on the wiki but 
apparently not yet on the manpage.  According to the wiki that deletes 
the first device that was in the metadata but not found when booting, so 
you may have to reboot to do it, but it should work.  Tho with the recent 
stale-devices fixes, were that a current kernel you may not actually have 
to reboot to have delete missing work.  But you probably will on 3.14, 
and of course to upgrade kernels you'd have to reboot anyway, so...
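
As a minimal sketch of the "delete missing" suggestion: `missing` is a literal keyword given in place of a device path. The mount point / is taken from the original report. Whether a reboot or a degraded mount is needed first depends on the kernel version, as discussed above, so the command is only printed here rather than run.

```shell
# Sketch only: remove a device that btrfs considers missing.
# MNT is the mount point; the original report uses /.
MNT=/

# "missing" is a literal keyword, not a device path.
cmd="btrfs device delete missing $MNT"
echo "would run: $cmd"

# Uncomment to actually run it (needs root and modifies the filesystem):
# $cmd
```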

> Also as an aside, while the stats about write errors are useful, in this
> case it would be really good if there was a count of successful writes,
> it would be useful to know if the successful write count was close to 0.
>  My understanding of the BTRFS design is that there would be no
> performance penalty for adding counts of the number of successful reads
> and writes to the superblock.  Could this be done?

Not necessarily for reads, consider the case when the filesystem is read-
only as my btrfs root filesystem is by default -- lots of reads but 
likely no writes and no super-block updates for the entire uptime.  But I 
believe you're correct for writes, since they'd ultimately update the 
superblocks anyway.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: deleting a dead device
  2014-09-21  7:01 ` Duncan
@ 2014-09-21  7:34   ` Russell Coker
  0 siblings, 0 replies; 5+ messages in thread
From: Russell Coker @ 2014-09-21  7:34 UTC
  To: Duncan; +Cc: linux-btrfs

On Sun, 21 Sep 2014, Duncan <1i5t5.duncan@cox.net> wrote:
> Russell Coker posted on Sun, 21 Sep 2014 11:39:17 +1000 as excerpted:
> > On a system running the Debian 3.14.15-2 kernel I added a new drive to a
> > RAID-1 array.  My aim was to add a device and remove one of the old
> > devices.
> 
> That's an old kernel and presumably an old btrfs-progs.  Quite a number
> of device management fixes have gone in recently, and you'd likely not be
> in quite that predicament were you running a current kernel (make sure
> it's 3.16.2+ or 3.17-rc2+ to get the fix for the workqueues bug that
> affected 3.15 and thru 3.16.1 and 3.17-rc1).

3.16.2 is in Debian, I'm in the process of upgrading to it.

> And the recommended way to handle a device replace now would be btrfs
> replace, doing the add and delete in a single (albeit possibly long) step
> instead of separately.

I'm changing from a 500G single filesystem to a 200G RAID-1 (there's only 150G 
of data).  The change from 500G to 200G can't be done with a replace as a 
replace requires an equal or greater size.

I did a 2 step process, add/delete to go to a 200G USB attached device for 
half the array and then replace to go from 200G on USB to 200G internal.
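
The size constraint can be checked before choosing between the two approaches. This is a rough sketch, assuming the rule stated above (the replacement must be at least as large as the device being replaced). `fits_for_replace` is a hypothetical helper, and the byte counts are illustrative round numbers rather than measured sizes.

```shell
# fits_for_replace SRC_BYTES TARGET_BYTES
# Hypothetical helper: succeeds if the target is at least as large as the source.
fits_for_replace() {
    [ "$2" -ge "$1" ]
}

# On a live system the sizes could come from: blockdev --getsize64 /dev/sdX
old=500000000000   # roughly 500G, illustrative value
new=200000000000   # roughly 200G, illustrative value

if fits_for_replace "$old" "$new"; then
    echo "replace is possible"
else
    echo "target too small: use add + delete instead"
fi
```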

> > The drive is attached by USB so I turned off the USB device and then got
> > the above result.  So it still seems impossible to remove the device
> > even though it's physically not present.  I've connected a new USB disk
> > which is now /dev/sdd, so it seems that BTRFS is keeping the name
> > /dev/sdc locked.
> > 
> > Should there be a way to fix this without rebooting or anything?
> 
> Did you try btrfs device delete missing?  It's documented on the wiki but
> apparently not yet on the manpage.

I did that after rebooting.  It didn't occur to me to try a "missing" 
operation when the drive really wasn't missing.

> According to the wiki that deletes
> the first device that was in the metadata but not found when booting, so
> you may have to reboot to do it, but it should work.

That would be a bug.  There's no reason a reboot should be required if we can 
remove a drive and add a new one with the kernel recognising it.  Hot-swap 
disks aren't any sort of new feature.

> Tho with the recent
> stale-devices fixes, were that a current kernel you may not actually have
> to reboot to have delete missing work.  But you probably will on 3.14,
> and of course to upgrade kernels you'd have to reboot anyway, so...

Yes a reboot was needed anyway.  But I'd have liked to delay that.

> > Also as an aside, while the stats about write errors are useful, in this
> > case it would be really good if there was a count of successful writes,
> > it would be useful to know if the successful write count was close to 0.
> > 
> >  My understanding of the BTRFS design is that there would be no
> > 
> > performance penalty for adding counts of the number of successful reads
> > and writes to the superblock.  Could this be done?
> 
> Not necessarily for reads, consider the case when the filesystem is read-
> only as my btrfs root filesystem is by default -- lots of reads but
> likely no writes and no super-block updates for the entire uptime.  But I
> believe you're correct for writes, since they'd ultimately update the
> superblocks anyway.

For the case of a read-only filesystem it's OK to skip read stats.  It would 
also be a bad idea to update read stats without writing data.  But there's no 
reason why read stats couldn't be accumulated in-memory and written out the 
next time something was written to disk.  That would give a slight inaccuracy 
in the case where there was a power failure after some period of reading 
without writing, but that's an unusual corner case.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

* Re: deleting a dead device
  2014-09-21  1:39 deleting a dead device Russell Coker
  2014-09-21  7:01 ` Duncan
@ 2014-09-21 17:05 ` Chris Murphy
  2014-09-24  3:20   ` Russell Coker
  1 sibling, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2014-09-21 17:05 UTC
  To: Btrfs BTRFS

On Sep 20, 2014, at 7:39 PM, Russell Coker <russell@coker.com.au> wrote:
> 
> Anyway the new drive turned out to have some errors, writes failed and I've 
> got a heap of errors such as the above.

I'm curious whether smartctl -t conveyance reveals any problems. It's not a full surface test, but it is designed to catch the (typical?) problems drives have due to shipment damage, and it doesn't take very long.

>  The errors started immediately after 
> adding the drive and the system wasn't actively writing to the filesystem.  So 
> very few (if any) writes made it to the device.
> 
> # btrfs device delete /dev/sdc3 /
> ERROR: error removing the device '/dev/sdc3' - Invalid argument
> 
> It seems that I can't remove the device because removing requires writing.

What kernel message do you get associated with this? Try using the devid instead of /dev/.

For future reference, btrfs replace start is better to use than add+delete. It's an optimization but it also makes it possible to ignore the device being replaced for reads; and you can also get a status on the progress with "btrfs replace status". And it looks like it does some additional error checking.

> 
> # btrfs device delete /dev/sdc3 /
> ERROR: error removing the device '/dev/sdc3' - No such file or directory
> # btrfs device stats /
> [/dev/sda3].write_io_errs   0
> [/dev/sda3].read_io_errs    0
> [/dev/sda3].flush_io_errs   0
> [/dev/sda3].corruption_errs 57
> [/dev/sda3].generation_errs 0
> [/dev/sdb3].write_io_errs   0
> [/dev/sdb3].read_io_errs    0
> [/dev/sdb3].flush_io_errs   0
> [/dev/sdb3].corruption_errs 0
> [/dev/sdb3].generation_errs 0
> [/dev/sdc3].write_io_errs   267
> [/dev/sdc3].read_io_errs    0
> [/dev/sdc3].flush_io_errs   0
> [/dev/sdc3].corruption_errs 0
> [/dev/sdc3].generation_errs 0
> 
> The drive is attached by USB so I turned off the USB device and then got the 
> above result.  So it still seems impossible to remove the device even though 
> it's physically not present.  I've connected a new USB disk which is now 
> /dev/sdd, so it seems that BTRFS is keeping the name /dev/sdc locked.

Pretty sure the kernel identifies devices by major:minor, and the names under /dev/ are assigned by udev. What do you get for
btrfs fi show

Unfortunately this won't show devid for missing devices, so you might have to infer this. But you can use btrfs replace start -r <devid> /dev/sddX <mountpoint>
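
The replace invocation above could be assembled as follows. This is a sketch only: `replace_cmd` is a hypothetical helper that prints the command instead of running it, and devid 3 and /dev/sdd3 are assumed values for illustration, to be checked against the output of "btrfs fi show".

```shell
# replace_cmd DEVID TARGET MOUNTPOINT
# Hypothetical helper: prints the replace command instead of running it.
replace_cmd() {
    printf 'btrfs replace start -r %s %s %s\n' "$1" "$2" "$3"
}

# devid 3 and /dev/sdd3 are assumptions; verify them with "btrfs fi show".
replace_cmd 3 /dev/sdd3 /

# Once a replace is running, progress can be checked with:
#   btrfs replace status /
```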

> 
> Also as an aside, while the stats about write errors are useful, in this case 
> it would be really good if there was a count of successful writes, it would be 
> useful to know if the successful write count was close to 0.

I think this is for other tools. Btrfs is a file system; it's responsible for the integrity of the data it writes. I don't think it's responsible for prequalifying drives.

Even a simple dd if=/dev/zero of=/dev/sdc bs=64k count=1600 will write out 100MB, and dmesg will show if there are any controller or drive problems on writes. You may have to do more than 100MB for problems to show up but you get the idea.
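
The 100MB figure follows from the dd arguments, since dd treats bs=64k as 64 * 1024 bytes. A quick check of the arithmetic:

```shell
# dd interprets bs=64k as 64 * 1024 bytes per block.
bs_kib=64
count=1600

bytes=$((bs_kib * 1024 * count))
echo "$bytes bytes = $((bytes / 1024 / 1024)) MiB"
```

To write more than that, raising count scales the total linearly. Note that the dd command itself overwrites the start of the target device.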

You can also use badblocks -swv (progress, destructive write/read, verbose) which will also show writes the drive says succeeded but are actually corrupt.

Use smartctl -t conveyance/short/long to isolate the drive mechanism itself. This obviously doesn't test writes.

Consumer drives should fairly quickly report persistent write failures, which libata will report in dmesg. A common problem, though, is that they keep retrying a failing read for much longer than the Linux SCSI layer's default timeout. Either the SCT ERC timeout of the drive needs to be reduced below 30 seconds, or the Linux SCSI layer timeout needs to be raised above the drive's SCT ERC timeout. Otherwise the drive keeps retrying the read, the Linux SCSI layer gives up on the non-communicating drive (which is busy recovering) and resets the link. The read error then never actually gets reported, the offending sector is never identified, and Btrfs can't fix the problem.
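
The timeout relationship described above can be expressed as a small check. This is a sketch under stated assumptions: smartctl reports and sets SCT ERC in tenths of a second, the SCSI command timeout under /sys is in seconds, and an ERC value of 0 is treated here as "disabled". `timeouts_ok` is a hypothetical helper, and sdX is a placeholder device name.

```shell
# timeouts_ok ERC_DECISECONDS SCSI_TIMEOUT_SECONDS
# Hypothetical helper: succeeds if SCT ERC is enabled (non-zero) and the
# drive gives up before the SCSI layer does.
timeouts_ok() {
    [ "$1" -gt 0 ] && [ "$1" -lt $(( $2 * 10 )) ]
}

# Where the two numbers come from on a live system (sdX is a placeholder):
#   smartctl -l scterc /dev/sdX            # show current SCT ERC settings
#   cat /sys/block/sdX/device/timeout      # SCSI command timeout in seconds
# Ways to change them:
#   smartctl -l scterc,70,70 /dev/sdX      # 7.0 s for reads and writes
#   echo 180 > /sys/block/sdX/device/timeout

if timeouts_ok 70 30; then
    echo "drive reports the error before the link is reset"
fi
```

Not every drive supports SCT ERC, and some reset the setting at power-on, so it would need to be checked per drive.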

Chris Murphy

* Re: deleting a dead device
  2014-09-21 17:05 ` Chris Murphy
@ 2014-09-24  3:20   ` Russell Coker
  0 siblings, 0 replies; 5+ messages in thread
From: Russell Coker @ 2014-09-24  3:20 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Sun, 21 Sep 2014 11:05:46 Chris Murphy wrote:
> On Sep 20, 2014, at 7:39 PM, Russell Coker <russell@coker.com.au> wrote:
> > Anyway the new drive turned out to have some errors, writes failed and
> > I've
> > got a heap of errors such as the above.
> 
> I'm curious if smartctl -t conveyance reveals any problems, it's not a full
> surface test but is designed to be a test for (typical?) problems drives
> have due to shipment damage, and doesn't take very long.

Unfortunately I got to your message after sending the defective drive to e-
waste.  But I expect that any test that involves real reads/writes would 
report a failure (if the USB-SATA device supported passing them through) as 
the drive seemed to fail for everything.

> > # btrfs device delete /dev/sdc3 /
> > ERROR: error removing the device '/dev/sdc3' - Invalid argument
> > 
> > It seems that I can't remove the device because removing requires writing.
> 
> What kernel message do you get associated with this? Try using the devid
> instead of /dev/.

I'll keep that in mind.

device delete <dev> [<dev>...] <path>
              Remove device(s) from a filesystem identified by <path>.

The man page has the above text which makes no mention of devid, so I think we 
need a documentation patch for this.

> For future reference, btrfs replace start is better to use than add+delete.
> It's an optimization but it also makes it possible to ignore the device
> being replaced for reads; and you can also get a status on the progress
> with "btrfs replace status". And it looks like it does some additional
> error checking.

Oh yes, I've done this and it works well.  However it doesn't work if the 
replacement is smaller than the device being replaced.

> > Also as an aside, while the stats about write errors are useful, in this
> > case it would be really good if there was a count of successful writes,
> > it would be useful to know if the successful write count was close to 0.
> 
> I think this is for other tools. Btrfs is a file system its responsible for
> the integrity of the data it writes, I don't think it's responsible for
> prequalifying drives.

I agree that it doesn't have to prequalify drives.  But it should expose all 
data it has which can be of use to the sysadmin.  After it was too late I 
realised that I could have used iostat to get stats for the block device.  But 
it would still be nice to have stats from btrfs.
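
Besides iostat, the block layer's own per-device counters can be read directly. The sketch below assumes the /sys/block/<dev>/stat layout, in which field 1 is completed read requests and field 5 is completed write requests. `io_counts` is a hypothetical helper and the sample line is made up for illustration.

```shell
# io_counts: reads one line in /sys/block/<dev>/stat format on stdin.
# Field 1 = completed read requests, field 5 = completed write requests.
io_counts() {
    awk '{ print "reads=" $1, "writes=" $5 }'
}

# Live usage: io_counts < /sys/block/sdc/stat
# The sample line below is invented, not taken from a real device.
echo '  120 0 960 40 3 0 24 10 0 50 50' | io_counts
```

These are requests the block layer saw complete, which is weaker than a guarantee that the data reached the platters intact, but a write count near zero would still have been a useful hint in this case.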

Also btrfs has to deal with the fact that drives may fail at any time.  
Admittedly I was using a drive I knew to be slightly sub-standard (I got it 
free because it gave an error in a client's RAID-Z array).  But sometimes 
drives like that last for years, it's difficult to predict.

> Even a simple dd if=/dev/zero of=/dev/sdc bs=64k count=1600 will write out
> 100MB, and dmesg will show if there are any controller or drive problems on
> writes. You may have to do more than 100MB for problems to show up but you
> get the idea.

True.  But a drive can fail after 101MB.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
