linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Zdenek Kabelac <zkabelac@redhat.com>,
	Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Nikolay Borisov <nborisov@suse.com>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	linux-fsdevel@vger.kernel.org
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Ideas to reuse filesystem's checksum to enhance dm-raid1/10/5/6?
Date: Thu, 16 Nov 2017 07:41:22 -0500
Message-ID: <5e5f8561-655f-7a9b-78a2-c775443b2adb@gmail.com>
In-Reply-To: <9b81b628-d10b-b62e-74f7-86c8ba2f939b@redhat.com>

On 2017-11-16 07:33, Zdenek Kabelac wrote:
> On 16.11.2017 at 11:04, Qu Wenruo wrote:
>>
>>
>> On 2017-11-16 17:43, Zdenek Kabelac wrote:
>>> On 16.11.2017 at 09:08, Qu Wenruo wrote:
>>>>
>>>>
>>>>>>>>>
>>>>>>>> [What we have]
>>>>>>>> The nearest infrastructure I found in kernel is
>>>>>>>> bio_integrity_payload.
>>>>>>>>
>>>
>>> Hi
>>>
>>> We already have  dm-integrity target upstream.
>>> What's missing in this target ?
>>
>> If I'm not missing anything, dm-integrity is designed to calculate the
>> csum and store it in its own space to verify integrity.
>> The csum is calculated when the bio reaches dm-integrity.
>>
>> However, what I want is for the fs to generate a bio with an attached
>> verification hook and pass it to the lower layers to verify.
>>
>> For example, if we use the following device mapper layout:
>>
>>          FS (can be any fs with metadata csum)
>>                  |
>>               dm-integrity
>>                  |
>>               dm-raid1
>>                 / \
>>           disk1     disk2
>>
>> If some data on disk1 gets corrupted (the disk itself is still good) and
>> dm-raid1 reads the corrupted data, it may return the corrupted copy,
>> which is then caught by dm-integrity, and -EIO is finally returned to
>> the FS.
>>
>> But the truth is, we could at least try to read out data in disk2 if we
>> know the csum for it.
>> And use the checksum to verify if it's the correct data.
>>
>>
>> So my idea will be:
>>       FS (with metadata csum, or even data csum support)
>>                  |  READ bio for metadata
>>                  |  -With metadata verification hook
>>              dm-raid1
>>                 / \
>>            disk1   disk2
>>
>> dm-raid1 handles the bio, reading out data from disk1.
>> But the result can't pass the verification hook.
>> Then it retries with disk2.
>>
>> If the result from disk2 passes the verification hook, that's good: we
>> return the result from disk2 to the upper layer (fs).
>> And we can even submit a WRITE bio to try to write the good result back
>> to disk1.
>>
>> If the result from disk2 doesn't pass the verification hook, then we
>> return -EIO to the upper layer.
>>
>> That's what btrfs already does for DUP/RAID1/10 (RAID5/6 will also try
>> to rebuild data, but it still has some problems).
>>
>> I just want to make device-mapper raid able to handle such cases too,
>> especially since most filesystems support checksums for their metadata.
>>
> 
> Hi
> 
> IMHO you are looking for too complicated a solution.
> 
> If your checksum is calculated and checked at FS level there is no added 
> value when you spread this logic to other layers.
> 
> dm-integrity adds basic checksumming to any filesystem without the
> need to modify the fs itself. The price paid is that a bug in passing
> data from the fs to dm-integrity cannot be caught.
But that is true of pretty much any layering, not just dm-integrity. 
There's just a slightly larger window for corruption with dm-integrity.
> 
> The advantage of having separate 'fs' and 'block' layers is the
> separation and simplicity at each level.
> 
> If you want an integrated solution - you are simply looking for btrfs,
> where multiple layers are integrated together.
> 
> You are also possibly missing a feature of dm-integrity - it's not just
> giving you a 'checksum' - it also assures you that the device has the
> proper content - you can't just 'replace a block', even one with a proper
> checksum, somewhere in the middle of your device... and when joined with
> crypto - it makes it way more secure...
And to expand a bit further, the correct way to integrate dm-integrity 
into the stack when RAID is involved is to put it _below_ the RAID 
layer, so each underlying device is its own dm-integrity target.
Assuming I understand the way dm-raid and md handle -EIO, that should 
get you a similar level of protection to BTRFS (worse in some ways, 
better in others).
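
To make that layering concrete, here is a toy Python model of the read
path. It is only a sketch under stated assumptions, not how dm-integrity,
dm-raid or md are implemented: the class names IntegrityLeg and Mirror are
made up for illustration, CRC32 stands in for whatever checksum the
integrity target is configured with, and the write-back repair of the bad
leg is assumed behavior that has not been checked against the kernel.

```python
import errno
import zlib


class IntegrityLeg:
    """Hypothetical stand-in for one disk wrapped in its own integrity target."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # sector -> data as stored on the "disk"
        self.csums = {}   # sector -> checksum kept in the integrity metadata

    def write(self, sector, data):
        self.blocks[sector] = data
        self.csums[sector] = zlib.crc32(data)

    def corrupt(self, sector, data):
        # Simulate silent corruption: the data changes, the checksum does not.
        self.blocks[sector] = data

    def read(self, sector):
        data = self.blocks[sector]
        if zlib.crc32(data) != self.csums[sector]:
            # The leg reports the mismatch as an I/O error to the layer above.
            raise OSError(errno.EIO, f"checksum mismatch on {self.name}")
        return data


class Mirror:
    """Hypothetical RAID1 layer sitting on top of the integrity legs."""

    def __init__(self, legs):
        self.legs = legs

    def write(self, sector, data):
        for leg in self.legs:
            leg.write(sector, data)

    def read(self, sector):
        failed = []
        for leg in self.legs:
            try:
                data = leg.read(sector)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise
                failed.append(leg)  # remember the bad leg, try the next one
                continue
            for bad in failed:
                bad.write(sector, data)  # assumed repair: rewrite the bad copy
            return data
        raise OSError(errno.EIO, "all mirror legs failed verification")


disk1, disk2 = IntegrityLeg("disk1"), IntegrityLeg("disk2")
md = Mirror([disk1, disk2])
md.write(0, b"metadata block")
disk1.corrupt(0, b"metadata blocK")  # silent corruption on one leg only
data = md.read(0)                    # served from disk2, disk1 is rewritten
```

The point of the sketch is that the RAID layer never needs to know about
checksums: it only sees -EIO from one leg and falls back to the other,
which is why the integrity target has to sit below the RAID layer rather
than above it.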
