Re: BtrFs on drives with error recovery control / TLER?

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: BtrFs on drives with error recovery control / TLER?
Date: Thu, 15 Jan 2015 22:52:45 +0000 (UTC)
Message-ID: <pan$4400a$2a1b3182$7ae7e911$d2acb24@cox.net>
In-Reply-To: 54B81AE2.7050902@pocock.pro

Daniel Pocock posted on Thu, 15 Jan 2015 20:54:10 +0100 as excerpted:

> Can anybody comment on how BtrFs (particularly RAID1 mirroring)
> interacts with drives that offer error recovery control (or TLER in WDC
> terms)?
> 
> I generally prefer to buy this type of drive for any serious data
> storage purposes
> 
> I notice ZFS gets a mention in the Wikipedia article about the topic:
> http://en.wikipedia.org/wiki/Error_recovery_control
> 
> Should BtrFs be mentioned there too?

I make no claims to being an expert in this area and others with more 
expertise will likely be along shortly.  However...

In general you have a valid worry, and the recommendation is the same as 
with other raid technology: if possible, set your device to a recovery 
time under 30 seconds, as that's the default Linux SCSI level link reset 
time.  If the device is still retrying when that timer expires, the 
kernel resets the link, which short-circuits the device's own recovery 
process, so the bad sector never gets marked as such and remapped to a 
reserve sector on the device.
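
As a rough illustration of the first recommendation: on drives that 
support SCT Error Recovery Control, smartmontools' 
`smartctl -l scterc,READ,WRITE` sets the read and write recovery limits 
in tenths of a second.  The Python sketch below only builds that command 
line and does not run it.  The helper name `scterc_command`, the device 
name and the 7-second value are assumptions for illustration, not 
something from the post above, and whether a given drive accepts the 
setting (or keeps it across power cycles) varies.

```python
def scterc_command(device, seconds=7.0, linux_timeout=30):
    """Build (not run) a smartctl command that caps drive error recovery.

    smartctl takes the SCT ERC limits in deciseconds (tenths of a second),
    one value for reads and one for writes.  The 7 s default here is an
    illustrative assumption; the only constraint taken from the post is
    that it must be below the kernel's link reset time (30 s by default).
    """
    if not 0 < seconds < linux_timeout:
        raise ValueError("recovery time must be positive and below the kernel timeout")
    deciseconds = int(round(seconds * 10))
    return ["smartctl", "-l", f"scterc,{deciseconds},{deciseconds}", device]


if __name__ == "__main__":
    # Print the command for a hypothetical /dev/sda; run it as root yourself.
    print(" ".join(scterc_command("/dev/sda")))
```

The check against `linux_timeout` is the point of the exercise: the drive 
has to give up before the kernel does.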

On consumer level devices where setting the device recovery time isn't 
possible, the hard-wired recovery time can be near two minutes, so the 
recommendation is to set the Linux SCSI level link reset time to 120 
seconds or so, thus allowing the hardware device to time out first so it 
can again recognize the bad sector and do its remapping thing.
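
For the second case, the kernel exposes the per-device SCSI command 
timer, in seconds, as a sysfs file at /sys/block/<dev>/device/timeout.  
A minimal Python sketch of raising it follows.  The function name 
`set_scsi_timeout` and its `sysfs_root` parameter (there so the function 
can be tried against a scratch directory) are assumptions for 
illustration; writing the real file needs root, and the value does not 
survive a reboot, so it would normally be reapplied at boot.

```python
import os


def set_scsi_timeout(device, seconds=120, sysfs_root="/sys"):
    """Write a new command timeout for one block device; return the old one.

    `device` is a kernel name such as "sda" (hypothetical here), not a
    /dev path.  120 s is the "or so" figure from the post above.
    """
    path = os.path.join(sysfs_root, "block", device, "device", "timeout")
    with open(path) as f:
        previous = int(f.read().strip())
    with open(path, "w") as f:
        f.write(str(seconds))
    return previous
```

Called as `set_scsi_timeout("sda")` by root, this would return the 
previous value (30 on an untouched system) and leave 120 in place.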

In general, this recommendation should apply to all Linux-kernel-based 
soft-raid technologies (including btrfs, mdraid, dmraid...) where the 
raid redundancy can fill in the missing data, so letting the device fail 
the read and potentially trigger a remap is the best strategy.

OTOH, the shorter time wouldn't be recommended (tho a longer SCSI reset 
time well could be) for a single-device btrfs or a multi-device btrfs in 
raid0 or single mode, because in those cases, the assumption is that 
there are no other copies of the data, so letting the device take up to 
two minutes to try to retrieve that data, in the hope that the extra 
tries will finally be successful, can very possibly save that data... of 
course at the cost of a system that goes unresponsive for up to two 
minutes at a time, which clearly isn't going to work if it's happening 
frequently.
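
Putting the cases from this message side by side, the choice depends on 
whether another copy of the data exists and whether the drive's recovery 
time can be set.  A small Python sketch of that decision follows.  The 
function name, the dictionary keys and the 7-second figure are 
illustrative assumptions; the 30 and 120 second figures are the ones 
mentioned above.

```python
def recommend(has_redundancy, erc_settable):
    """Return suggested settings; None means 'leave the drive as it is'."""
    if has_redundancy and erc_settable:
        # Drive gives up before the kernel's 30 s reset; raid fills in the data.
        return {"drive_recovery_s": 7, "linux_timeout_s": 30,
                "why": "fail fast, let redundancy repair"}
    if has_redundancy:
        # Recovery time is hard-wired: make the kernel wait longer instead.
        return {"drive_recovery_s": None, "linux_timeout_s": 120,
                "why": "let the drive time out first and remap"}
    # Single copy (single device, raid0, single mode): never shorten the
    # drive's recovery; a longer kernel timeout may give it room to succeed.
    return {"drive_recovery_s": None, "linux_timeout_s": 120,
            "why": "only copy, let the drive keep trying"}


if __name__ == "__main__":
    for redundant in (True, False):
        for settable in (True, False):
            print(redundant, settable, recommend(redundant, settable))
```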

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
