BtrFs on drives with error recovery control / TLER?

linux-btrfs.vger.kernel.org archive mirror

* BtrFs on drives with error recovery control / TLER?
@ 2015-01-15 19:54 Daniel Pocock
  2015-01-15 22:52 ` Duncan
  0 siblings, 1 reply; 2+ messages in thread
From: Daniel Pocock @ 2015-01-15 19:54 UTC (permalink / raw)
  To: linux-btrfs



Hi,

Can anybody comment on how BtrFs (particularly RAID1 mirroring)
interacts with drives that offer error recovery control (or TLER in WDC
terms)?

I generally prefer to buy this type of drive for any serious data
storage purposes.

I notice ZFS gets a mention in the Wikipedia article about the topic:
http://en.wikipedia.org/wiki/Error_recovery_control

Should BtrFs be mentioned there too?

Regards,

Daniel


* Re: BtrFs on drives with error recovery control / TLER?
  2015-01-15 19:54 BtrFs on drives with error recovery control / TLER? Daniel Pocock
@ 2015-01-15 22:52 ` Duncan
  0 siblings, 0 replies; 2+ messages in thread
From: Duncan @ 2015-01-15 22:52 UTC (permalink / raw)
  To: linux-btrfs

Daniel Pocock posted on Thu, 15 Jan 2015 20:54:10 +0100 as excerpted:

> Can anybody comment on how BtrFs (particularly RAID1 mirroring)
> interacts with drives that offer error recovery control (or TLER in WDC
> terms)?
> 
> I generally prefer to buy this type of drive for any serious data
> storage purposes
> 
> I notice ZFS gets a mention in the Wikipedia article about the topic:
> http://en.wikipedia.org/wiki/Error_recovery_control
> 
> Should BtrFs be mentioned there too?

I make no claims to being an expert in this area and others with more 
expertise will likely be along shortly.  However...

In general you have a valid worry, and the recommendation is the same as 
with other raid technology: if possible, set your device to a recovery 
time under 30 seconds.  That's the default Linux SCSI-level link reset 
time, and if the device is still retrying when it expires, the reset 
short-circuits the process, so the bad sector doesn't get marked as such 
and remapped to a reserve sector on the device.

On consumer-level devices where setting the device recovery time isn't 
possible, the hard-wired recovery time can be near two minutes, so the 
recommendation is to set the Linux SCSI-level link reset time to 120 
seconds or so, thus allowing the hardware device to time out first so it 
can again recognize the bad sector and do its remapping thing.
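
As a rough sketch of the two knobs described above (my illustration, not 
a tested procedure): the kernel-side timeout is normally exposed per 
device at /sys/block/<dev>/device/timeout, in seconds, and smartctl's 
"-l scterc,READ,WRITE" option takes the drive-side limit in tenths of a 
second.  The helper names, the sysfs_root parameter and the 7-second 
default below are assumptions made for the example.

```python
from pathlib import Path


def kernel_timeout_path(dev, sysfs_root="/sys"):
    """Path of the per-device SCSI command timeout (value is in seconds)."""
    return Path(sysfs_root) / "block" / dev / "device" / "timeout"


def read_kernel_timeout(dev, sysfs_root="/sys"):
    """Return the current kernel-side timeout for dev, in seconds."""
    return int(kernel_timeout_path(dev, sysfs_root).read_text().strip())


def write_kernel_timeout(dev, seconds, sysfs_root="/sys"):
    """Set the kernel-side timeout; on a real system this needs root."""
    kernel_timeout_path(dev, sysfs_root).write_text(f"{int(seconds)}\n")


def scterc_command(dev, seconds=7.0):
    """Build (but do not run) a smartctl call that sets the drive's
    read and write error recovery limits.  smartctl expects tenths of
    a second, so 7.0 s becomes 70.  The 7 s default is an assumption,
    chosen only because it is well under the kernel's 30 s."""
    deci = round(seconds * 10)
    return ["smartctl", "-l", f"scterc,{deci},{deci}", f"/dev/{dev}"]
```

For a drive that can't be configured, write_kernel_timeout("sda", 120) 
would correspond to the 120-second setting mentioned above; for one that 
can, the list returned by scterc_command("sda") could be handed to 
subprocess.run() by someone who has checked it against their own 
hardware first.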

In general, this recommendation should apply to all Linux-kernel-based 
soft-raid technologies (including btrfs, mdraid, dmraid...) where the 
raid redundancy can fill in the missing data, so letting the device fail 
the read and potentially trigger a remap is the best strategy.

OTOH, the shorter time wouldn't be recommended (tho a longer SCSI reset 
time well could be) for a single-device btrfs or a multi-device btrfs in 
raid0 or single mode, because in those cases the assumption is that 
there are no other copies of the data, so letting the device take up to 
two minutes to try to retrieve that data, in the hope that the extra 
tries will finally be successful, can very possibly save that data... of 
course at the cost of a system that goes unresponsive for up to two 
minutes at a time, which clearly isn't going to work if it's happening 
frequently.
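
Putting the cases above together, the choice could be sketched roughly 
like this.  It is only a restatement of the reasoning in this message; 
the function name, its parameters and the 7-second short limit are 
assumptions for illustration, and only the 30 and 120 second figures 
come from the text above.

```python
KERNEL_DEFAULT_TIMEOUT = 30   # default Linux SCSI-level reset time, seconds
LONG_KERNEL_TIMEOUT = 120     # "120 seconds or so" for slow-recovering drives
SHORT_DRIVE_LIMIT = 7         # assumed value, anything well under 30 fits


def suggest_timeouts(has_redundancy, recovery_time_settable):
    """Return (drive_limit_s, kernel_timeout_s) for one device.

    drive_limit_s is None when the drive's own recovery time should
    be left alone (or can't be changed at all).
    """
    if has_redundancy and recovery_time_settable:
        # Another copy exists, so let the drive give up quickly and
        # report the bad sector before the kernel resets the link.
        return (SHORT_DRIVE_LIMIT, KERNEL_DEFAULT_TIMEOUT)
    # Either the drive can't be told to give up early, or there is no
    # other copy and the long retries are worth waiting for.  In both
    # cases the kernel has to wait longer than the drive does.
    return (None, LONG_KERNEL_TIMEOUT)
```

So a raid1 btrfs on drives with settable error recovery would get 
suggest_timeouts(True, True), while a single-device or raid0 filesystem 
would get suggest_timeouts(False, ...) regardless of the drive type.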

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
