From: Wols Lists <antlists@youngman.org.uk>
To: Alireza Haghdoost <alireza@cs.umn.edu>
Cc: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>,
Dan Williams <dan.j.williams@intel.com>, Shaohua Li <shli@fb.com>,
Neil Brown <neilb@suse.de>,
linux-raid <linux-raid@vger.kernel.org>,
Song Liu <songliubraving@fb.com>,
Kernel-team@fb.com
Subject: Re: [RFC] raid5: add a log device to fix raid5/6 write hole issue
Date: Wed, 01 Apr 2015 21:18:36 +0100
Message-ID: <551C529C.2040503@youngman.org.uk>
In-Reply-To: <CAB-428kMJ7g9RL+P4ZYsMRJVdJc7jeU-wmA1zJGJ4UyKWwSeJA@mail.gmail.com>
On 01/04/15 21:04, Alireza Haghdoost wrote:
> On Wed, Apr 1, 2015 at 2:57 PM, Wols Lists <antlists@youngman.org.uk> wrote:
>> On 01/04/15 19:46, Alireza Haghdoost wrote:
>>>> Now, how can be assured, in that case, that the "cache"
>>>>> device is safe after the power is restored?
>>> You do sync write-ahead logging on the Flash cache. If it return
>>> successful, you do fire the writes to the RAID. If system crash/fails
>>> during the RAID writes (Write-hole), you just recover data by scanning
>>> write-ahead log in the flash cache and replay the logs into the RAID
>>> drives.
>>>
>> Just to throw something nasty into the mix, I'm not sure whether it's
>> SSDs or SD-cards, but there certainly *was* a spate of corrupted
>> *controllers*.
>>
>> In other words, a power failure would RELIABLY TRASH the device, if it
>> happened at the wrong moment. Hopefully that's been fixed ...
>>
>
> That is certainly true. As Dan mentioned, the cache device it-self
> should be safe against power failure. I agree this is not the case for
> all SSD cards in the market but might be the case for Facebook. I hate
> to say this but It seems these efforts are useful dependent to what
> kind of hardware is deployed for cache device.
>
It would be nice, but probably not possible, to have some form of
blacklist of devices known to be unsafe or dangerous. Something along
the lines of "mdadm --probe /dev/sda" or whatever, that gets the device
type, checks it against the list, and says "this SSD can be destroyed by
a power failure" or "this is a cheap disk with the timeout problem" or
something. But even if someone built it, the database would probably
bit-rot fairly quickly :-(
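
To make the idea concrete, here is a minimal sketch of such a lookup.
Everything in it is an assumption for illustration: the model patterns
are made up, the table is hard-coded rather than a maintained database,
and the probe() function is hypothetical, not anything mdadm provides.

```python
import fnmatch

# Hypothetical blacklist: (model pattern, warning) pairs.
# The model names are invented placeholders, not real devices.
KNOWN_HAZARDS = [
    ("ExampleSSD-1*", "this SSD can be destroyed by a power failure"),
    ("ExampleDisk-Cheap*", "this is a cheap disk with the timeout problem"),
]


def probe(model):
    """Return every warning whose pattern matches the model string."""
    return [
        warning
        for pattern, warning in KNOWN_HAZARDS
        if fnmatch.fnmatchcase(model, pattern)
    ]


if __name__ == "__main__":
    for model in ("ExampleSSD-100", "SomeOtherDisk"):
        warnings = probe(model)
        print(model, "->", warnings or "no known problems listed")
```

The lookup itself is trivial; the hard part, as noted above, is keeping
the table accurate as devices and firmware change.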
Cheers,
Wol