is cryptographically secure integrity checking possible with btrfs?

* is cryptographically secure integrity checking possible with btrfs?
@ 2015-03-12  3:35 Christoph Anton Mitterer
  2015-03-12  4:07 ` Liu Bo
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-12  3:35 UTC (permalink / raw)
  To: linux-btrfs

Hey.

For encryption we have dm-crypt and in principle I'm happy with having
that at the block device level below the filesystem - perhaps except for
any possible performance issues, especially when used with software RAID
(regardless of whether MD or btrfs')[0].

But obviously integrity protection is still missing... and I guess for
many people this would actually be much more interesting than
encryption.

Now as far as I understand btrfs, everything, data and metadata is
checksummed (right now only with CRC32, but enough space would be left
for something "better"). And further, these checksums are verified on
each read.

Are these checksums chained? I.e. when a datablock is written/changed,
not only its checksum is affected, but also those of all meta-data
blocks up to the super blocks?

If so, one could perhaps use this like git for integrity protection.
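A minimal sketch of what "chained" would mean here, assuming a simple
binary hash tree over the blocks. This is not btrfs's actual on-disk
layout; SHA-256 and the names `h` and `root_digest` are illustrative
choices only. The point is that changing any one block changes the
single top-level digest:

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an arbitrary choice for the sketch."""
    return hashlib.sha256(data).digest()


def root_digest(blocks: List[bytes]) -> bytes:
    """Hash every block, then hash pairs of digests upward to one root."""
    if not blocks:
        return h(b"")
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"data block 0", b"data block 1", b"data block 2"]
before = root_digest(blocks)
blocks[1] = b"data block 1 (tampered)"
after = root_digest(blocks)
print(before != after)  # the root changes when any block changes
```

With such a structure only the root digest has to be stored somewhere
trusted; everything below it can be verified on read.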

For example when unmounting a btrfs filesystem, it could print the final
checksum of the superblock.
The user could store this on some safe device (for the paranoid one e.g.
on a USB stick that always travels along and that by coincidence also
contains boot loader, kernel+initrd and the dm-crypt keys necessary to
decrypt the fully encrypted system).

On mount one could then specify the expected checksum for the
superblock. If it already differs, then the mount should obviously fail
right away (or try backup superblocks and the like, but again only use
them if they match the sum).
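The mount-time check could look roughly like the sketch below. It is
not an existing btrfs mount option; `pick_superblock`, the SHA-256
digest and the byte strings standing in for superblocks are all
assumptions made for illustration:

```python
import hashlib
import hmac
from typing import Iterable, Optional


def digest(raw: bytes) -> bytes:
    """Digest of a raw superblock; SHA-256 is an assumed stand-in."""
    return hashlib.sha256(raw).digest()


def pick_superblock(candidates: Iterable[bytes], expected: bytes) -> Optional[bytes]:
    """Return the first candidate (primary first, then backups) whose
    digest equals the user-supplied value, or None if none matches."""
    for raw in candidates:
        if hmac.compare_digest(digest(raw), expected):
            return raw
    return None  # the caller should refuse to mount


primary = b"superblock, generation 42"
backup = b"superblock, generation 41"
trusted = digest(primary)  # value the user stored off-device at unmount

print(pick_superblock([primary, backup], trusted) == primary)
print(pick_superblock([b"forged superblock", backup], trusted) is None)
```

Note that a backup superblock from an older generation will not match
the stored value either, so it is only usable if it is identical to the
one whose digest was recorded.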

Obviously one would also need an operation mode in which btrfs dies with
bells and red signs as soon as something (data or metadata) is read
afterwards, which doesn't match the expected checksums.
Again, RAID copies, etc. could of course be tried - but if no valid copy
is found, then it should assume compromise and the read operation should
error out and not deliver any data at all (it may be compromised and
nothing should use it).

Of course people might still want to read such "compromised" blocks
(e.g. when they are sure that they've only suffered from accidental data
corruption and try to rescue as much as possible), but that should then
require a special mount option.

Well, I'm no crypto expert... and btrfs would probably need at least
something stronger than CRC32.
Unfortunately the 512bit version of Keccak wouldn't fit in, would it? ;)

Cheers,
Chris.

[0] Does anyone know the current status of this? When I have say a btrfs
RAID6 with n disks, each actually being a dm-crypt device... will this
run all in one IO thread killing the performance?


* Re: is cryptographically secure integrity checking possible with btrfs?
  2015-03-12  3:35 is cryptographically secure integrity checking possible with btrfs? Christoph Anton Mitterer
@ 2015-03-12  4:07 ` Liu Bo
  2015-03-12  4:48   ` Christoph Anton Mitterer
  0 siblings, 1 reply; 3+ messages in thread
From: Liu Bo @ 2015-03-12  4:07 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: linux-btrfs

Hi, 

On Thu, Mar 12, 2015 at 04:35:19AM +0100, Christoph Anton Mitterer wrote:
> Hey.
> 
> For encryption we have dm-crypt and in principle I'm happy with having
> that at the block device level below the filesystem - perhaps except for
> any possible performance issues, especially when used with software RAID
> (regardless of whether MD or btrfs')[0].
> 
> But obviously integrity protection is still missing... and I guess for
> many people this would actually be much more interesting than
> encryption.
> 
> 
> Now as far as I understand btrfs, everything, data and metadata is
> checksummed (right now only with CRC32, but enough space would be left
> for something "better"). And further, these checksums are verified on
> each read.
> 
> Are these checksums chained? I.e. when a datablock is written/changed,
> not only its checksum is affected, but also those of all meta-data
> blocks up to the super blocks?

The checksum is updated along with the corresponding data block's
change, but when the superblock is updated depends on when btrfs starts
committing the transaction.

> 
> If so, one could perhaps use this like git for integrity protection.
> 
> For example when unmounting a btrfs filesystem, it could print the final
> checksum of the superblock.
> The user could store this on some safe device (for the paranoid one e.g.
> on a USB stick that always travels along and that by coincidence also
> contains boot loader, kernel+initrd and the dm-crypt keys necessary to
> decrypt the fully encrypted system).
> 
> 
> On mount one could then specify the expected checksum for the
> superblock. If it already differs, then the mount should obviously fail
> right away (or try backup superblocks and the like, but again only use
> them if they match the sum).

That's exactly what we have now in btrfs, but the superblock checksum
only verifies the superblock itself, nothing more.

> 
> Obviously one would also need an operation mode in which btrfs dies with
> bells and red signs as soon as something (data or metadata) is read
> afterwards, which doesn't match the expected checksums.
> Again, RAID copies, etc. could of course be tried - but if no valid copy
> is found, then it should assume compromise and the read operation should
> error out and not deliver any data at all (it may be compromised and
> nothing should use it).

Okay, if you keep using btrfs for a while, you'll find that the above
is basically true, but there are some differences:
a) if we find a metadata checksum mismatch, we try to get a good copy
from another mirror (we usually have two copies of metadata); if there
is none, we don't make btrfs die but throw a warning (actually it can
refuse to mount if you hit such errors during mount and no good copy
can be found).
b) if we find a data checksum mismatch, we do the same thing, but log
the bad news to dmesg and return EIO.
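The read path described in a) and b) can be sketched as follows. This
is a user-space illustration, not kernel code: `read_verified` is a
hypothetical name, and `zlib.crc32` (plain CRC-32) is only a stand-in
for whatever checksum the filesystem really uses:

```python
import errno
import zlib
from typing import List


def read_verified(copies: List[bytes], expected_crc: int) -> bytes:
    """Try each available copy in turn and return the first one whose
    checksum matches; fail with EIO if no copy verifies."""
    for raw in copies:
        if zlib.crc32(raw) == expected_crc:
            return raw
    # No good copy found: do not hand back unverified data.
    raise OSError(errno.EIO, "checksum mismatch on all copies")


good = b"block payload"
crc = zlib.crc32(good)
print(read_verified([b"corrupted copy", good], crc) == good)
```

With a non-cryptographic checksum such as CRC-32 this only catches
accidental corruption; an attacker can construct a modified block with
the same checksum.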


> 
> Of course people might still want to read such "compromised" blocks
> (e.g. when they are sure that they've only suffered from accidental data
> corruption and try to rescue as much as possible), but that should then
> require a special mount option.

So you're asking for a strict mode, but for datablock corruption, is it
OK for you to just flip btrfs into readonly mode instead of making it
die?

> 
> 
> Well, I'm no crypto expert... and btrfs would probably need at least
> something stronger than CRC32.
> Unfortunately the 512bit version of Keccak wouldn't fit in, would it? ;)

IIRC, we have only 256bit space, so 512bit may need some other tricks.
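As a quick check of the sizes involved, here is a sketch using Python's
hashlib, assuming the 256-bit figure recalled above (`CSUM_FIELD_BYTES`
and `fits` are names made up for the example; `sha3_512` is the
standardized SHA-3 variant derived from Keccak):

```python
import hashlib

CSUM_FIELD_BYTES = 32  # 256 bits, the figure recalled in this thread


def fits(name: str) -> bool:
    """True if the named hash's default digest fits the checksum field."""
    return hashlib.new(name).digest_size <= CSUM_FIELD_BYTES


for name in ("sha3_256", "sha3_512", "blake2b"):
    bits = hashlib.new(name).digest_size * 8
    print(name, bits, "bits:", "fits" if fits(name) else "too large")

# Two possible "tricks" for a 32-byte field: truncate a longer digest,
# or use a hash that supports a shorter output natively.
truncated = hashlib.sha3_512(b"block").digest()[:CSUM_FIELD_BYTES]
native = hashlib.blake2b(b"block", digest_size=CSUM_FIELD_BYTES).digest()
print(len(truncated), len(native))
```

Truncating lowers collision resistance to that of a 256-bit hash, so it
gains nothing over using a 256-bit digest in the first place.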

Thanks,

-liubo
> 
> 
> Cheers,
> Chris.
> 
> 
> [0] Does anyone know the current status of this? When I have say a btrfs
> RAID6 with n disks, each actually being a dm-crypt device... will this
> run all in one IO thread killing the performance?



* Re: is cryptographically secure integrity checking possible with btrfs?
  2015-03-12  4:07 ` Liu Bo
@ 2015-03-12  4:48   ` Christoph Anton Mitterer
  0 siblings, 0 replies; 3+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-12  4:48 UTC (permalink / raw)
  To: linux-btrfs

On Thu, 2015-03-12 at 12:07 +0800, Liu Bo wrote: 
> The checksum is updated along with the corresponding data block's
> change, but when the superblock is updated depends on when btrfs starts
> committing the transaction.
Well, I guess it was clear that you don't update the whole chain on
every single write =) (otherwise btrfs would write to it all the
time)... but AFAIU it would be enough if everything is flushed out to
disk at the end (i.e. when the user unmounts and the checksum is
printed).


Of course one general problem remains (which I didn't think of before):
What happens in case the system crashes? Then the user would have no
way to know the last valid super-checksum.

I'm not sure whether this can be easily solved.. if at all.
One idea would be to make a snapshot of the whole fs in the beginning,
so in case of a crash, the user could go at least back to that validated
state.
But since snapshots are on subvols and not the whole fs, this wouldn't
really work.
And even if there was some way implemented to keep the pre-mounted state
of the fs (until unmount) this could cost a lot of space.

Hmm,... I think this is actually a bigger problem with the whole idea.
Maybe it makes it unreasonable to use such functionality on "unattended"
filesystems, and even when I sit next to my hard disk while the system
crashes (where I can be sure that no one forged any data after the
crash), one would need to trust the storage device.


> > On mount one could then specify the expected checksum for the
> > superblock. If it already differs, then the mount should obviously fail
> > right away (or try backup superblocks and the like, but again only use
> > them if they match the sum).
> 
> That's exactly what we have now in btrfs, but the superblock checksum
> only verifies the superblock itself, nothing more.

Sure but that's obviously different and not enough for cryptographic
integrity validation of *all* the data in the fs.

 
> > Obviously one would also need an operation mode in which btrfs dies with
> > bells and red signs as soon as something (data or metadata) is read
> > afterwards, which doesn't match the expected checksums.
> > Again, RAID copies, etc. could of course be tried - but if no valid copy
> > is found, then it should assume compromise and the read operation should
> > error out and not deliver any data at all (it may be compromised and
> > nothing should use it).
> 
> Okay, if you keep using btrfs for a while, you'll find that the above
> is basically true
Hehe... well, I wrote about that some time ago: IMHO (and no one here
should take offense at this) the documentation is in a suboptimal state,
especially since there are still many things going on.

Especially what one could really expect from the system, things like:
- Over which data are checksums calculated, and what exactly happens on
errors (both data and metadata)? End users do not necessarily know
that btrfs will always verify checksums and, with RAID, look for a valid
block and give back only that (unlike what MD or hardware RAID usually
do).
- What happens if I abort defrag (e.g. Ctrl-C)? Will it abort cleanly?
Will all data be lost? What if the system crashes during defrag or
balance? Are these procedures so safe that they cannot lose data in
these cases? Will the procedure continue automatically after reboot?
- Why does btrfs still need a log, and what is it used for?

Anyway,... different topic ;)


> but there are some differences:
> a) if we find a metadata checksum mismatch, we try to get a good copy
> from another mirror (we usually have two copies of metadata); if there
> is none, we don't make btrfs die but throw a warning (actually it can
> refuse to mount if you hit such errors during mount and no good copy
> can be found).
And what if bad metadata is found during normal operation (e.g.
read/write) and no good copy can be found? Does it just give a warning
and try to do its best to give back the file that was read, its
attrs/permissions/etc. (which could already be a problem)? Or will it
also fail to read any files/dirs/etc. for which that metadata would
have been needed (e.g. their checksums)?


> > Of course people might still want to read such "compromised" blocks
> > (e.g. when they are sure that they've only suffered from accidental data
> > corruption and try to rescue as much as possible), but that should then
> > require a special mount option.
> 
> So you're asking for a strict mode, but for datablock corruption, is it
> OK for you to just flip btrfs into readonly mode instead of making it
> die?
Not sure what you mean.
If btrfs is operating in "normal" mode (i.e. not a kind of data
recovery mode explicitly chosen by the user at mount time), then it
should fail to give back any data/metadata which it cannot verify.

I'm not sure whether it would need to go into read-only mode... perhaps
it would be a good idea, because one could theoretically think of
blocking attacks against such an integrity-protected filesystem, e.g.
something like:
|
+--trusted-keys.d
|  \-- ...
+--revoked-keys.d
   \-- ...

If an attacker managed to corrupt just the metadata for some of the
revoked keys and the fs simply didn't give them back, he could use that
for tricky attacks.
So indeed, it might be better to go into read-only mode when an
unrecoverable validation error occurs... or perhaps even allow the user
to choose a combination of remount-ro + kernel panic (some software
might still continue to work even when the fs is ro, and for some users
it might be better to die completely than to use bad/insecure data).
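
Seen from the application side, the attack above works only if the
consumer "fails open". A minimal sketch of a fail-closed check, where
`key_is_acceptable` and the use of `None` for "revoked-keys.d could not
be read and verified" are purely hypothetical conventions:

```python
from typing import Optional, Set


def key_is_acceptable(key_id: str,
                      trusted: Set[str],
                      revoked: Optional[Set[str]]) -> bool:
    """Accept a key only if it is trusted and provably not revoked.

    `revoked` is None when the revocation directory could not be read
    and verified, e.g. because the fs returned an error for it.
    """
    if revoked is None:
        return False  # fail closed: a missing list proves nothing
    return key_id in trusted and key_id not in revoked


print(key_is_acceptable("alice", {"alice"}, set()))      # list readable
print(key_is_acceptable("alice", {"alice"}, {"alice"}))  # key revoked
print(key_is_acceptable("alice", {"alice"}, None))       # list unreadable
```

This only helps if the application can tell "entry missing" apart from
"entry unreadable", which is an argument for the fs reporting the error
loudly rather than silently omitting the affected files.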


Cheers,
Chris.
