From: Michael Lyle <mlyle@lyle.org>
To: Alexandr Kuznetsov <progmachine@xenlab.one>,
linux-bcache@vger.kernel.org
Subject: Re: bcache failure hangs something in kernel
Date: Fri, 13 Oct 2017 01:11:41 -0700 [thread overview]
Message-ID: <024c8d28-9ddb-09fe-c2b0-8a7d0aed493d@lyle.org> (raw)
In-Reply-To: <b4cf93d3-7e15-9f20-941b-7f11269c510b@xenlab.one>
On 10/13/2017 12:59 AM, Alexandr Kuznetsov wrote:
> Hi
>
>> It looks like probably the superblock of md0p2 and other data structures
>> were corrupted during the lvm commands, and in turn this is triggering
>> bugs with bcache (bcache should detect the situation and abort
>> everything, but instead is left with the bucket_lock held and freezes).
> This immediately raises questions about the reliability and safety of
> lvm and bcache.
Neither is safe if you overwrite the superblock with an errant command.
If you pvcreate'd on the backing device directly, or did something
similar, that would be expected to go badly.
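For context on why writing to the raw backing device is fatal: the bcache superblock sits at a fixed offset near the start of the device, so an errant pvcreate (or any tool that labels the raw device) can clobber it. The sketch below checks whether the magic is still intact in a device image; the offsets and magic value are my reading of the kernel's on-disk struct cache_sb (superblock at byte 4096, 16-byte magic after the csum/offset/version fields), so treat it as illustrative rather than a supported tool:

```python
# Sketch: check a device image for an intact bcache superblock magic.
# Assumed layout (from struct cache_sb): superblock at byte 4096
# (sector 8); csum, offset, version are three u64s, so the 16-byte
# magic starts at byte 4096 + 24.
SB_START = 4096
MAGIC_OFFSET = SB_START + 24
BCACHE_MAGIC = bytes.fromhex("c68573f64e1a45ca8265f57f48ba6d81")

def has_bcache_superblock(path):
    """Return True if the file/device at `path` still carries the
    bcache magic at the expected offset."""
    with open(path, "rb") as f:
        f.seek(MAGIC_OFFSET)
        return f.read(16) == BCACHE_MAGIC
```

If a stray write has landed in that region, the check fails and bcache will refuse to (or be unable to) attach the device; bcache-super-show from bcache-tools does a fuller version of the same inspection.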
> I thought that lvm is an old, mature, and safe technology, but here it
> got stuck, was manually interrupted, and the result is catastrophic
> data corruption. lvm sits on top of that sandwich of block devices, on
> the layer of /dev/bcache* devices. Another question here is how a
> misbehaving lvm could damage data outside of the /dev/bcache* devices?
> This suggests that some necessary I/O buffer range checks are missing
> inside bcache.
I don't know what commands you ran. I've never seen or heard of a
bcache superblock being corrupted, and I believe the mappings/shrink
are appropriate.
> Unfortunately these md0p* block devices are not separate from each
> other - there is one 2 TB volume on top of them inside lvm. Loss of
> one 100 GiB part plus dirty data in another 100 GiB part can kill the
> entire file system with very high probability. Yesterday I read that
> bcache failures are nasty, because file system root data often resides
> on the cache and is dirty on the backing device.
> Does any tool like fsck exist that can check and maybe try to recover
> data from the caching and backing devices? Or could the developers get
> these corrupted images to experiment with for bugfixing?
Sorry, no. Other filesystems / block devices will not behave well if
you overwrite their superblock, either. This is not a situation bcache
is expected to recover from gracefully (though it shouldn't hang).
Re: the dirty data in the 100 GiB part: having a filesystem with its
superblock marked dirty is fine as long as the cache device is available.
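Whether a backing device currently has dirty data outstanding can be read straight from sysfs: bcache exposes a per-backing-device state ("clean", "dirty", "inconsistent", ...) and a dirty-data figure. A minimal sketch, assuming the standard /sys/block/<dev>/bcache/ layout; the sysfs parameter is an assumption added only so the sketch can be exercised against a mock directory:

```python
import os

def bcache_dirty_state(dev="bcache0", sysfs="/sys/block"):
    """Read bcache's reported state and dirty-data figure for a
    backing device from its sysfs directory."""
    base = os.path.join(sysfs, dev, "bcache")
    with open(os.path.join(base, "state")) as f:
        state = f.read().strip()
    with open(os.path.join(base, "dirty_data")) as f:
        dirty = f.read().strip()
    return state, dirty
```

While the cache device is attached and healthy, a "dirty" state is normal for writeback caching; it only becomes data loss if the cache holding that dirty data is destroyed.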
Mike
Thread overview: 13+ messages
2017-10-12 12:49 bcache failure hangs something in kernel Alexandr Kuznetsov
2017-10-12 18:12 ` Michael Lyle
2017-10-13 7:59 ` Alexandr Kuznetsov
2017-10-13 8:11 ` Michael Lyle [this message]
2017-10-13 9:10 ` Alexandr Kuznetsov
2017-10-13 9:13 ` Michael Lyle
2017-10-13 10:11 ` Alexandr Kuznetsov
2017-11-14 13:27 ` Nix
2017-11-14 17:20 ` Michael Lyle
2017-11-14 18:25 ` Nix
2017-11-14 19:03 ` Michael Lyle
2017-11-17 20:13 ` Nix
2017-11-15 8:44 ` Alexandr Kuznetsov