does btrfs-receive use/compare the checksums from the btrfs-send side?

linux-btrfs.vger.kernel.org archive mirror

* does btrfs-receive use/compare the checksums from the btrfs-send side?
@ 2016-08-28  3:46 Christoph Anton Mitterer
  2016-08-28 17:35 ` Chris Murphy
  2016-08-29  8:25 ` Qu Wenruo
  0 siblings, 2 replies; 9+ messages in thread
From: Christoph Anton Mitterer @ 2016-08-28  3:46 UTC (permalink / raw)
  To: linux-btrfs


Hey.

I've often wondered:
When I do a send/receive, does the receiving side use the checksums
from the sending side (either by directly storing them or by comparing
them with calculated checksums and failing if they don't match after
the transfer)?

Cause that would effectively secure any transport in between against
transmission errors ... and also against badblocks/etc. on the
receiving fs.

If btrfs is already that smart, then I think this feature should be
mentioned in the send/receive manpages.

Cheers,
Chris.


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28  3:46 does btrfs-receive use/compare the checksums from the btrfs-send side? Christoph Anton Mitterer
@ 2016-08-28 17:35 ` Chris Murphy
  2016-08-28 17:50   ` Christoph Anton Mitterer
  2016-08-29  8:25 ` Qu Wenruo
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2016-08-28 17:35 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: Btrfs BTRFS

On Sat, Aug 27, 2016 at 9:46 PM, Christoph Anton Mitterer
<calestyo@scientia.net> wrote:
> Hey.
>
> I've often wondered:
> When I do a send/receive, does the receiving side use the checksums
> from the sending side (either by directly storing them or by comparing
> them with calculated checksums and failing if they don't match after
> the transfer)?
>
> Cause that would effectively secure any transport in between against
> transmission errors ... and also against badblocks/etc. on the
> receiving fs.
>
> If btrfs is already that smart, then I think this feature should be
> mentioned in the send/receive manpages.

I don't see evidence of them in the btrfs send file, so I don't think
csums are in the stream.


-- 
Chris Murphy


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28 17:35 ` Chris Murphy
@ 2016-08-28 17:50   ` Christoph Anton Mitterer
  2016-08-28 20:19     ` Adam Borowski
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Anton Mitterer @ 2016-08-28 17:50 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS


On Sun, 2016-08-28 at 11:35 -0600, Chris Murphy wrote:
> I don't see evidence of them in the btrfs send file, so I don't think
> csums are in the stream.

hmm... isn't that kinda unfortunate not to make use of the information
that's already there?

IMO, to the extent this is possible, btrfs should generally re-use
csums (or compare freshly created ones with the ones already known) on
copy-like operations.


Cheers,
Chris.


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28 17:50   ` Christoph Anton Mitterer
@ 2016-08-28 20:19     ` Adam Borowski
  2016-08-28 20:25       ` Christoph Anton Mitterer
  0 siblings, 1 reply; 9+ messages in thread
From: Adam Borowski @ 2016-08-28 20:19 UTC (permalink / raw)
  To: linux-btrfs

On Sun, Aug 28, 2016 at 07:50:42PM +0200, Christoph Anton Mitterer wrote:
> On Sun, 2016-08-28 at 11:35 -0600, Chris Murphy wrote:
> > I don't see evidence of them in the btrfs send file, so I don't think
> > csums are in the stream.
> 
> hmm... isn't that kinda unfortunate not to make use of the information
> that's already there?

Transports over which you're likely to send a filesystem stream already
protect against corruption.

It'd still be nice to have something for those which don't, of course.

-- 
An imaginary friend squared is a real enemy.


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28 20:19     ` Adam Borowski
@ 2016-08-28 20:25       ` Christoph Anton Mitterer
  2016-08-30 17:14         ` Sean Greenslade
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Anton Mitterer @ 2016-08-28 20:25 UTC (permalink / raw)
  To: Adam Borowski, linux-btrfs


On Sun, 2016-08-28 at 22:19 +0200, Adam Borowski wrote:
> Transports over which you're likely to send a filesystem stream
> already
> protect against corruption.
Well... in some cases,... but not always... just consider a plain old
netcat...


> It'd still be nice to have something for those which don't, of
> course.
And it would be even nicer in the case of doing e.g. backups, even
if it's from local fs to another local fs.... so that one doesn't have
to do another round of diff'ing, because one already knows the copy is
guaranteed to be valid (or at least the checksum is from the source,
and a further scrub on the copy would reveal any silent block
corruption).

Cheers.


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28  3:46 does btrfs-receive use/compare the checksums from the btrfs-send side? Christoph Anton Mitterer
  2016-08-28 17:35 ` Chris Murphy
@ 2016-08-29  8:25 ` Qu Wenruo
  2016-09-04  4:29   ` Christoph Anton Mitterer
  1 sibling, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2016-08-29  8:25 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



At 08/28/2016 11:46 AM, Christoph Anton Mitterer wrote:
> Hey.
>
> I've often wondered:
> When I do a send/receive, does the receiving side use the checksums
> from the sending side (either by directly storing them or by comparing
> them with calculated checksums and failing if they don't match after
> the transfer)?

Send generates a checksum for each command.

https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive#Overal_strucutre

(Finally the new page is useful now)

However, the checksum is not calculated the same way as inside the
btrfs filesystem itself.
For btrfs, the data checksum is a CRC32 (CRC32C, to be precise) per
sectorsize (4K for x86_64), plus a CRC32 for each whole tree block.

For the send stream, it's one CRC32 for the whole command.
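
The difference in granularity can be sketched roughly like this. It is
only an illustration, not how btrfs implements it: zlib.crc32 (plain
CRC-32) stands in for the real checksum, and the helper names and the
16 KiB sample buffer are made up for the example.

```python
import zlib

SECTOR_SIZE = 4096  # sectorsize mentioned above (4K on x86_64)


def per_sector_checksums(data):
    """Filesystem-style: one checksum per 4K sector (last one zero-padded)."""
    sums = []
    for off in range(0, len(data), SECTOR_SIZE):
        sector = data[off:off + SECTOR_SIZE].ljust(SECTOR_SIZE, b"\0")
        sums.append(zlib.crc32(sector))
    return sums


def whole_command_checksum(command):
    """Stream-style: one checksum covering the entire command buffer."""
    return zlib.crc32(command)


data = bytes(range(256)) * 64      # 16 KiB of sample data, i.e. 4 sectors
damaged = bytearray(data)
damaged[5000] ^= 0x01              # flip one bit inside the second sector
damaged = bytes(damaged)

before = per_sector_checksums(data)
after = per_sector_checksums(damaged)

# Per-sector checksums tell you which sector went bad ...
bad_sectors = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
# ... while a single checksum only tells you that the whole buffer is bad.
command_ok = whole_command_checksum(data) == whole_command_checksum(damaged)

print("bad sectors:", bad_sectors, "command ok:", command_ok)
```

Both schemes notice the flipped bit; they differ only in how much data
one checksum covers.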

Thanks,
Qu

>
> Cause that would effectively secure any transport in between against
> transmission errors ... and also against badblocks/etc. on the
> receiving fs.
>
> If btrfs is already that smart, then I think this feature should be
> mentioned in the send/receive manpages.
>
> Cheers,
> Chris.
>




* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-28 20:25       ` Christoph Anton Mitterer
@ 2016-08-30 17:14         ` Sean Greenslade
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Greenslade @ 2016-08-30 17:14 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: Adam Borowski, linux-btrfs

On Sun, Aug 28, 2016 at 10:25:32PM +0200, Christoph Anton Mitterer wrote:
> On Sun, 2016-08-28 at 22:19 +0200, Adam Borowski wrote:
> > Transports over which you're likely to send a filesystem stream
> > already
> > protect against corruption.
> Well... in some cases,... but not always... just consider a plain old
> netcat...

Netcat uses TCP by default, so there is checksum-based error detection
with retransmission, which gives a reliable (though not strictly
guaranteed-correct) stream transfer there.
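
For what it's worth, the TCP checksum is a 16-bit ones' complement sum,
so it is fairly weak on its own. A minimal sketch of that sum (RFC 1071
style, without the pseudo-header that real TCP includes; the sample
payload is made up) shows one class of error it cannot see, namely two
16-bit words trading places:

```python
def internet_checksum(data):
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\0"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:             # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"btrfs send stream payload!"
# Swap the first two aligned 16-bit words: "bt" and "rf".
swapped = original[2:4] + original[0:2] + original[4:]

print(original != swapped)                                        # data differs
print(internet_checksum(original) == internet_checksum(swapped))  # sum does not
```

In practice link-layer CRCs catch most of what this sum misses, so this
is an argument for an end-to-end check, not a claim that TCP transfers
are commonly corrupted.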

--Sean


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-08-29  8:25 ` Qu Wenruo
@ 2016-09-04  4:29   ` Christoph Anton Mitterer
  2016-09-05  7:45     ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Anton Mitterer @ 2016-09-04  4:29 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs


On Mon, 2016-08-29 at 16:25 +0800, Qu Wenruo wrote:
> Send will generate checksum for each command.
What does "command" mean here? Or better said how much data is secured
with one CRC32?

> For send stream, it's CRC32 for the whole command.
And this is verified then on the receiving end?

Wouldn't it be useful (if this is technically possible) to use the
checksums directly from the sent blocks? That way one could also catch
any errors on the receiving side, that occurred after the checksum from
the receive was verified (e.g. memory errors).

And couldn't one do something similar locally, when btrfs copies
blocks?

At least something like this would seem to me like the most native way:
- One wants checksum protection
- One copies data
- One has already checksums for that data

=> thus that should be used, as it's the most canonical version of a
   checksum for that data... anything that is newly calculated could
   in the best case just be good, and in the worst add new errors
   (unnoticed),... e.g. when memory is broken and the new checksum is
   calculated over that.

Cheers,
Chris.


* Re: does btrfs-receive use/compare the checksums from the btrfs-send side?
  2016-09-04  4:29   ` Christoph Anton Mitterer
@ 2016-09-05  7:45     ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-09-05  7:45 UTC (permalink / raw)
  To: Christoph Anton Mitterer, linux-btrfs



At 09/04/2016 12:29 PM, Christoph Anton Mitterer wrote:
> On Mon, 2016-08-29 at 16:25 +0800, Qu Wenruo wrote:
>> Send will generate checksum for each command.
> What does "command" mean here? Or better said how much data is secured
> with one CRC32?

A command is one unit of the send stream, containing all the info
needed for an operation,
like the subvolume command, containing UUID and transid,
the chown command, containing gid/uid,
the chmod command, containing the mode,
the utimes command, containing the a/c/m times, and a lot of others.

How much data is secured by one CRC32 depends on the size of the
command.

A normal command is quite small; the exception is the write command,
where more than 48K bytes can be secured by one CRC32.
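
A rough sketch of what "one CRC32 over the whole command" means for the
receiver follows. Several things here are assumptions from memory of
the stream format and should be checked against the kernel's send.h:
the 10-byte little-endian header (payload length, command id, crc), the
crc field being zeroed while the checksum is calculated, the seed of 0,
and the command id 15. The payload is an opaque blob, not real
attributes, and the function names are made up.

```python
import struct


def crc32c_raw(seed, data):
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78), no implicit inversion."""
    crc = seed & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc


# Assumed header layout: payload length (u32), command id (u16), crc (u32).
HEADER = struct.Struct("<IHI")
WRITE_CMD = 15  # illustrative command id, not checked against the real enum


def build_command(cmd_id, payload):
    """Sender side: checksum header plus payload with the crc field zeroed."""
    zeroed = HEADER.pack(len(payload), cmd_id, 0)
    crc = crc32c_raw(0, zeroed + payload)
    return HEADER.pack(len(payload), cmd_id, crc) + payload


def verify_command(buf):
    """Receiver side: recompute the checksum and compare with the stored one."""
    length, cmd_id, stored = HEADER.unpack_from(buf, 0)
    payload = buf[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated command")
    zeroed = HEADER.pack(length, cmd_id, 0)
    return crc32c_raw(0, zeroed + payload) == stored


command = build_command(WRITE_CMD, b"x" * (48 * 1024))  # one large write
ok_before = verify_command(command)

corrupted = bytearray(command)
corrupted[HEADER.size + 1234] ^= 0x40   # single bit flip in transit
ok_after = verify_command(bytes(corrupted))

print(ok_before, ok_after)
```

Since the header is part of the checksummed bytes, a damaged length or
command id is caught the same way as damaged file data.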

>
>
>> For send stream, it's CRC32 for the whole command.
> And this is verified then on the receiving end?

Yes.

>
>
> Wouldn't it be useful (if this is technically possible) to use the
> checksums directly from the sent blocks? That way one could also catch
> any errors on the receiving side, that occurred after the checksum from
> the receive was verified (e.g. memory errors).
>
> And couldn't one do something similar locally, when btrfs copies
> blocks?
>
> At least something like this would seem to me like the most native way:
> - One wants checksum protection
> - One copies data
> - One has already checksums for that data

You can try my dump-send command branch, to verify how send/receive works:
https://github.com/adam900710/btrfs-progs/tree/dump_send_stream

After a few tries, you will find at least the following reasons:

1) Not all data has a checksum
    Only non-inlined data has a checksum.
    Inlined data has no checksum of its own (it is protected by the
    leaf checksum instead).

2) Send doesn't follow the sectorsize unit for non-inlined data
    Just create a 6K file, send the subvolume out, and use dump-send
    to examine it.
    You'll find that the send stream contains exactly 6K of data, not 8K
    (2 * 4K, where 4K is the sectorsize).

    Data checksums, on the other hand, are all in sectorsize units.

3) We need to protect the whole command, not the file data only.
    Even a write command contains metadata info, like the offset and
    length of the write.
    Since we need to protect the whole command anyway, why introduce the
    complexity of using TWO CRC32s, one for meta and one for data?
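
Reason 2 can be sketched with a bit of arithmetic. This is only an
illustration under assumptions: zlib.crc32 stands in for the real
checksum, the tail of the last sector is assumed to be zero-filled on
disk, and the file content is made up.

```python
import zlib

SECTOR_SIZE = 4096           # 4K sectorsize
file_size = 6 * 1024         # the 6K file from the example above
payload = b"\xab" * file_size

# On disk: data checksums cover whole sectors, so 6K needs 2 sectors (8K).
sectors = -(-file_size // SECTOR_SIZE)      # ceiling division
covered_on_disk = sectors * SECTOR_SIZE

# In the stream: the write carries exactly the 6K of file data.
carried_in_stream = len(payload)

# The last sector's checksum is over 2K of data plus 2K of padding,
# which is not the checksum of the 2K tail the stream actually carries.
tail = payload[SECTOR_SIZE:]
padded_tail = tail.ljust(SECTOR_SIZE, b"\0")
tail_sums_match = zlib.crc32(tail) == zlib.crc32(padded_tail)

print(sectors, covered_on_disk, carried_in_stream, tail_sums_match)
```

So even if the per-sector checksums were put into the stream, the
receiver could not check them against the bytes of the write alone; it
would first have to know how the sender padded the last sector.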

>
> => thus that should be used, as it's the most canonical version of a
>    checksum for that data... anything that is newly calculated could
>    in the best case just be good, and in the worst add new errors
>    (unnoticed),... e.g. when memory is broken and the new checksum is
>    calculated over that.

However, most bugs are caused not by memory corruption but by humans.
So the per-command checksum design seems quite good to me.
It's unified, simple in structure, and extensible.

Thanks,
Qu

>
>
> Cheers,
> Chris.
>


