Btrfs Fscrypt Design Document

* Btrfs Fscrypt Design Document
@ 2021-10-21 18:34 Omar Sandoval
  2021-10-22 15:47 ` Neal Gompa
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Omar Sandoval @ 2021-10-21 18:34 UTC (permalink / raw)
  To: linux-btrfs, linux-fscrypt, Theodore Y. Ts'o, Jaegeuk Kim,
	Eric Biggers, kernel-team

Hello,

I've been working on adding fscrypt support to Btrfs. Btrfs has some
features (namely, reflinks and snapshots) that don't work well with the
existing fscrypt encryption policies. I've been discussing and
prototyping how to support these Btrfs features with fscrypt, so I
figured it was high time I write it down and loop in the fscrypt
developers as well.

Here is the Google Doc:
https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing

Please feel free to comment there or via email.

Thanks,
Omar

* Re: Btrfs Fscrypt Design Document
  2021-10-21 18:34 Btrfs Fscrypt Design Document Omar Sandoval
@ 2021-10-22 15:47 ` Neal Gompa
       [not found] ` <CAMnT83tLqZU-bdsOJX9L==c82EvmQ2QTiOYCLch=kasscU+MiA@mail.gmail.com>
  2021-10-25 19:49 ` Eric Biggers
  2 siblings, 0 replies; 8+ messages in thread
From: Neal Gompa @ 2021-10-22 15:47 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: Btrfs BTRFS, linux-fscrypt, Theodore Y. Ts'o, Jaegeuk Kim,
	Eric Biggers, kernel-team

On Thu, Oct 21, 2021 at 2:35 PM Omar Sandoval <osandov@osandov.com> wrote:
>
> Hello,
>
> I've been working on adding fscrypt support to Btrfs. Btrfs has some
> features (namely, reflinks and snapshots) that don't work well with the
> existing fscrypt encryption policies. I've been discussing and
> prototyping how to support these Btrfs features with fscrypt, so I
> figured it was high time I write it down and loop in the fscrypt
> developers as well.
>
> Here is the Google Doc:
> https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing
>
> Please feel free to comment there or via email.
>

This looks great! I'm looking over it and leaving comments in the doc.


-- 
真実はいつも一つ！/ Always, there's only one truth!

* Re: Btrfs Fscrypt Design Document
       [not found] ` <CAMnT83tLqZU-bdsOJX9L==c82EvmQ2QTiOYCLch=kasscU+MiA@mail.gmail.com>
@ 2021-10-22 19:59   ` Omar Sandoval
  2021-10-25 19:25     ` Eric Biggers
  0 siblings, 1 reply; 8+ messages in thread
From: Omar Sandoval @ 2021-10-22 19:59 UTC (permalink / raw)
  To: Vadim Akimov
  Cc: linux-btrfs, linux-fscrypt, Theodore Y. Ts'o, Jaegeuk Kim,
	Eric Biggers, kernel-team

On Fri, Oct 22, 2021 at 10:14:11PM +0300, Vadim Akimov wrote:
> Hi!
> 
> On Thu, 21 Oct 2021 at 21:34, Omar Sandoval <osandov@osandov.com> wrote:
> 
> > Here is the Google Doc:
> >
> > https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing
> >
> 
> As I understand it, you are inclined to use a single key and only change
> the IV for each extent. This might be dangerous as per this answer (and the
> comments below it):  https://crypto.stackexchange.com/a/70630/71448

Correct me if I'm wrong, but I don't think this is a practical concern
in the fscrypt threat model. The birthday bound for AES is 256 EiB
(2^(128 / 2) blocks * 16 bytes per block). The theoretical maximum size
of a Btrfs filesystem is 16 EiB (since we use 64-bit byte addresses).
fscrypt protects against a "single point-in-time permanent offline
compromise". This means that the attacker only has what was on disk at
the time that they stole your disk. In this case, they won't have enough
data for a birthday attack. I'm curious where that post got the
"multiple petabytes" number.
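
To spell out the arithmetic above, here is a minimal sketch in C. The
helper names are made up for illustration; the only inputs are the AES
block size (128 bits, 16 bytes) and the definition 1 EiB = 2^60 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* AES has a 128-bit (16-byte) block regardless of key size. */
enum { AES_BLOCK_BITS = 128, EIB_SHIFT = 60 };

/*
 * log2 of the birthday bound in bytes:
 * 2^(128 / 2) blocks * 2^4 bytes per block = 2^68 bytes.
 */
static unsigned int birthday_bound_log2_bytes(void)
{
	unsigned int log2_blocks = AES_BLOCK_BITS / 2;	/* 64 */
	unsigned int log2_block_bytes = 4;		/* 16 = 2^4 */

	return log2_blocks + log2_block_bytes;
}

/* Express 2^log2_bytes in EiB (1 EiB = 2^60 bytes). */
static uint64_t pow2_bytes_to_eib(unsigned int log2_bytes)
{
	assert(log2_bytes >= EIB_SHIFT && log2_bytes - EIB_SHIFT < 64);
	return UINT64_C(1) << (log2_bytes - EIB_SHIFT);
}
```

pow2_bytes_to_eib(birthday_bound_log2_bytes()) gives the 256 EiB figure,
and pow2_bytes_to_eib(64) gives the 16 EiB limit implied by 64-bit byte
addresses, so a full filesystem holds 1/16 of the birthday bound.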

* Re: Btrfs Fscrypt Design Document
  2021-10-22 19:59   ` Omar Sandoval
@ 2021-10-25 19:25     ` Eric Biggers
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Biggers @ 2021-10-25 19:25 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: Vadim Akimov, linux-btrfs, linux-fscrypt, Theodore Y. Ts'o,
	Jaegeuk Kim, kernel-team

On Fri, Oct 22, 2021 at 12:59:35PM -0700, Omar Sandoval wrote:
> On Fri, Oct 22, 2021 at 10:14:11PM +0300, Vadim Akimov wrote:
> > Hi!
> > 
> > On Thu, 21 Oct 2021 at 21:34, Omar Sandoval <osandov@osandov.com> wrote:
> > 
> > > Here is the Google Doc:
> > >
> > > https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing
> > >
> > 
> > As I understand it, you are inclined to use a single key and only change
> > the IV for each extent. This might be dangerous as per this answer (and the
> > comments below it):  https://crypto.stackexchange.com/a/70630/71448
> 
> Correct me if I'm wrong, but I don't think this is a practical concern
> in the fscrypt threat model. The birthday bound for AES is 256 EiB
> (2^(128 / 2) blocks * 16 bytes per block). The theoretical maximum size
> of a Btrfs filesystem is 16 EiB (since we use 64-bit byte addresses).
> fscrypt protects against a "single point-in-time permanent offline
> compromise". This means that the attacker only has what was on disk at
> the time that they stole your disk. In this case, they won't have enough
> data for a birthday attack. I'm curious where that post got the
> "multiple petabytes" number.

So, fscrypt originally only supported per-file keys.  The reason we added
support for some "one key per encryption policy" settings is that there are
cases where many keys can't be handled efficiently.  In the case of Adiantum
encryption (which is intended for devices which might not have a lot of memory)
a key takes a lot of memory, so we didn't want to have one for every file.
Similarly, in the case where file contents encryption is done using UFS or eMMC
inline encryption hardware rather than in software, there might be only a small
number of hardware keyslots and changing them can be slow, so we didn't want to
have to change keys for every file.

There are definitely some advantages to per-file keys, including reducing the
amount of data which is encrypted with each key, increasing the difficulty of
recovering deleted files, and eliminating the need to distinguish between
different files in the IVs.

None of these are too important in practice, though.  E.g. we don't get anywhere
near the cryptographic bounds in practice anyway, and secure deletion isn't
guaranteed even with per-file keys.

For btrfs, it sounds like per-file keys won't work out due to reflinks anyway.
However you could do per-extent keys in the same way, where the key for each
extent is derived from a nonce (stored in the metadata describing the extent)
and the master key.

Did you consider per-extent keys?  If they are practical, that would be the best
approach cryptographically.  But if they aren't practical (more likely IMO,
given that a file can contain a large number of extents), I think it would be
acceptable to not use them.

- Eric

* Re: Btrfs Fscrypt Design Document
  2021-10-21 18:34 Btrfs Fscrypt Design Document Omar Sandoval
  2021-10-22 15:47 ` Neal Gompa
       [not found] ` <CAMnT83tLqZU-bdsOJX9L==c82EvmQ2QTiOYCLch=kasscU+MiA@mail.gmail.com>
@ 2021-10-25 19:49 ` Eric Biggers
  2021-10-26  7:00   ` Vadim Akimov
  2021-10-26  7:53   ` Omar Sandoval
  2 siblings, 2 replies; 8+ messages in thread
From: Eric Biggers @ 2021-10-25 19:49 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: linux-btrfs, linux-fscrypt, Theodore Y. Ts'o, Jaegeuk Kim,
	kernel-team

On Thu, Oct 21, 2021 at 11:34:19AM -0700, Omar Sandoval wrote:
> Hello,
> 
> I've been working on adding fscrypt support to Btrfs. Btrfs has some
> features (namely, reflinks and snapshots) that don't work well with the
> existing fscrypt encryption policies. I've been discussing and
> prototyping how to support these Btrfs features with fscrypt, so I
> figured it was high time I write it down and loop in the fscrypt
> developers as well.
> 
> Here is the Google Doc:
> https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing
> 
> Please feel free to comment there or via email.
> 

Just some preliminary comments:

Given that you need reflinking to remain supported, for file contents encryption
I think it's the right choice to store the IVs explicitly rather than have them
determined by the offset within the file.
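
A small sketch of why reflinks push towards explicit IVs. Everything here
is hypothetical (the struct, the function names, the 4096-byte block size,
and the use of a plain 64-bit counter as the IV); it only models the
arithmetic, not any real fscrypt or Btrfs interface.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u	/* assumed filesystem block size */

/* Hypothetical: one file's reference to (part of) a shared extent. */
struct file_extent_ref {
	uint64_t file_offset;	/* where the data appears in the file, bytes */
	uint64_t extent_offset;	/* where the reference starts in the extent */
};

/* Offset-based scheme: the IV is the logical block number in the file. */
static uint64_t iv_from_file_offset(const struct file_extent_ref *ref,
				    uint64_t pos)
{
	assert(pos >= ref->file_offset);
	return pos / SKETCH_BLOCK_SIZE;
}

/*
 * Explicit scheme: the IV is the extent's stored starting IV plus the
 * block index within the extent, independent of the file offset.
 */
static uint64_t iv_from_extent(const struct file_extent_ref *ref,
			       uint64_t starting_iv, uint64_t pos)
{
	assert(pos >= ref->file_offset);
	return starting_iv +
	       (ref->extent_offset + (pos - ref->file_offset)) /
	       SKETCH_BLOCK_SIZE;
}
```

If the same extent is referenced at file offset 0 in one file and at 1 MiB
in another, iv_from_file_offset() yields 0 and 256 for the same on-disk
block, so one ciphertext cannot serve both files. iv_from_extent() yields
the same value for both references.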

How many derived encryption keys to use is somewhat orthogonal to that.  As I
mentioned in my other mail, you could still have one key per extent rather than
one per encryption policy as you're proposing.  I'm *guessing* it wouldn't be
practical, and I don't consider it to be required (just preferable), but the
document doesn't discuss this possibility at all.

Storing just the "starting IV" for each extent also makes sense, assuming that
you only want to support an unauthenticated mode such as AES-XTS.  However,
given that btrfs is a copy-on-write filesystem and thus can support per-block
metadata, a natural question is why not support an authenticated mode such as
AES-GCM, with a nonce and authentication tag stored per block?  Have you thought
about this?

Now, I personally think that authenticating file contents only wouldn't give
much benefit, and whole-filesystem authentication would be needed to get a real
benefit.  But "why aren't you using an authenticated mode" is a *very* common
question, so you need an answer to that -- or ideally, just support it if it
isn't much work.

What is your proposal for how filenames encryption would work when the
EXPLICIT_IV flag is used?  That doesn't appear to be mentioned.

Finally, the proposal to allow encrypting the changed data of snapshots is a
larger departure from the fscrypt model.  I'm still trying to wrap my head
around how that could work.  Could you provide any more details about that?
E.g. what metadata would actually be stored on-disk, and how would it be used?
When would things be done in terms of filesystem operations?  E.g. let's say I
open a file for writing -- would the encryption key be set up right away, or
would it not happen until I actually write data?

- Eric

* Re: Btrfs Fscrypt Design Document
  2021-10-25 19:49 ` Eric Biggers
@ 2021-10-26  7:00   ` Vadim Akimov
  2021-10-26  7:53   ` Omar Sandoval
  1 sibling, 0 replies; 8+ messages in thread
From: Vadim Akimov @ 2021-10-26  7:00 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Omar Sandoval, linux-btrfs, linux-fscrypt, Theodore Y. Ts'o,
	Jaegeuk Kim, kernel-team

On Mon, 25 Oct 2021 at 22:59, Eric Biggers <ebiggers@kernel.org> wrote:

> However,
> given that btrfs is a copy-on-write filesystem and thus can support per-block
> metadata, a natural question is why not support an authenticated mode such as
> AES-GCM, with a nonce and authentication tag stored per block?  Have you thought
> about this?

Can't the existing checksum fields just be reused to store HMACs? This
way even the unencrypted metadata could be authenticated.

* Re: Btrfs Fscrypt Design Document
  2021-10-25 19:49 ` Eric Biggers
  2021-10-26  7:00   ` Vadim Akimov
@ 2021-10-26  7:53   ` Omar Sandoval
  2021-10-26 14:56     ` David Sterba
  1 sibling, 1 reply; 8+ messages in thread
From: Omar Sandoval @ 2021-10-26  7:53 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-btrfs, linux-fscrypt, Theodore Y. Ts'o, Jaegeuk Kim,
	kernel-team

On Mon, Oct 25, 2021 at 12:49:51PM -0700, Eric Biggers wrote:
> On Thu, Oct 21, 2021 at 11:34:19AM -0700, Omar Sandoval wrote:
> > Hello,
> > 
> > I've been working on adding fscrypt support to Btrfs. Btrfs has some
> > features (namely, reflinks and snapshots) that don't work well with the
> > existing fscrypt encryption policies. I've been discussing and
> > prototyping how to support these Btrfs features with fscrypt, so I
> > figured it was high time I write it down and loop in the fscrypt
> > developers as well.
> > 
> > Here is the Google Doc:
> > https://docs.google.com/document/d/1iNnrqyZqJ2I5nfWKt7cd1T9xwU0iHhjhk9ALQW3XuII/edit?usp=sharing
> > 
> > Please feel free to comment there or via email.
> > 
> 
> Just some preliminary comments:
> 
> Given that you need reflinking to remain supported, for file contents encryption
> I think it's the right choice to store the IVs explicitly rather than have them
> determined by the offset within the file.
> 
> How many derived encryption keys to use is somewhat orthogonal to that.  As I
> mentioned in my other mail, you could still have one key per extent rather than
> one per encryption policy as you're proposing.  I'm *guessing* it wouldn't be
> practical, and I don't consider it to be required (just preferable), but the
> document doesn't discuss this possibility at all.

I overlooked this option because my gut instinct was that the memory
usage would be prohibitive. It looks like one AES-256-XTS prepared key
is about 1k in memory (960 bytes for the encryption and decryption key
schedules for each key, plus a bit more for the crypto API structures).
I thought it'd be too expensive to store this naively for each cached
extent.
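
For reference, the 960-byte figure can be reproduced with the sketch
below. It assumes that AES-256 expands to 15 round keys of 16 bytes, and
that XTS keeps two AES keys (data and tweak), each with both an
encryption and a decryption schedule prepared. The function names and the
per-key total passed to the last helper are illustrative, not taken from
the crypto API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AES256_ROUND_KEYS  15u	/* 14 rounds plus the initial round key */
#define AES_ROUND_KEY_SIZE 16u	/* bytes per round key */

/* Bytes for one expanded AES-256 key schedule (one direction). */
static size_t aes256_schedule_bytes(void)
{
	return (size_t)AES256_ROUND_KEYS * AES_ROUND_KEY_SIZE;
}

/* XTS: two AES keys, each with encrypt and decrypt schedules. */
static size_t xts_aes256_schedule_bytes(void)
{
	return 2u * 2u * aes256_schedule_bytes();
}

/* Total memory if every cached object carries its own prepared key. */
static uint64_t prepared_keys_total_bytes(uint64_t nr_keys,
					  uint64_t bytes_per_key)
{
	assert(bytes_per_key == 0 || nr_keys <= UINT64_MAX / bytes_per_key);
	return nr_keys * bytes_per_key;
}
```

As a made-up example, 1,000,000 cached extents at 1024 bytes per prepared
key would come to 1,024,000,000 bytes, i.e. roughly 1 GB.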

However, across various machines I checked, the number of cached inodes
and the number of cached extents are of the same order of magnitude (and
in fact, almost equal in many cases). So per-extent keys aren't out of the
question. We can store a 16-byte nonce in the extent, use that to derive
the per-extent key from the master key, and use the offset in the extent
as the IV. I'll think about it some more and make sure I'm not missing
anything.
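
A rough sketch of the IV half of that scheme, with the key derivation left
out (it would need a real KDF keyed by the master key and the nonce). The
struct, the names, the 4096-byte block size, and the little-endian 64-bit
encoding of the block index are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 4096u	/* assumed data block size */
#define SKETCH_IV_SIZE    16u	/* AES-XTS takes a 16-byte tweak */
#define SKETCH_NONCE_SIZE 16u	/* per-extent nonce size from the text */

/* Hypothetical per-extent metadata: only the nonce needs to be stored. */
struct extent_crypt_sketch {
	uint8_t nonce[SKETCH_NONCE_SIZE];
};

/*
 * Fill iv with the index of the block containing offset_in_extent,
 * encoded as a little-endian 64-bit integer and zero-padded to 16 bytes.
 */
static void extent_block_iv(uint64_t offset_in_extent,
			    uint8_t iv[SKETCH_IV_SIZE])
{
	uint64_t block = offset_in_extent / SKETCH_BLOCK_SIZE;
	unsigned int i;

	assert(iv != NULL);
	memset(iv, 0, SKETCH_IV_SIZE);
	for (i = 0; i < 8; i++)
		iv[i] = (uint8_t)(block >> (8 * i));
}
```

Because the key is unique to the extent, the IV can restart at zero for
every extent, and nothing about the file or its offset enters the IV.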

> Storing just the "starting IV" for each extent also makes sense, assuming that
> you only want to support an unauthenticated mode such as AES-XTS.  However,
> given that btrfs is a copy-on-write filesystem and thus can support per-block
> metadata, a natural question is why not support an authenticated mode such as
> AES-GCM, with a nonce and authentication tag stored per block?  Have you thought
> about this?
> 
> Now, I personally think that authenticating file contents only wouldn't give
> much benefit, and whole-filesystem authentication would be needed to get a real
> benefit.  But "why aren't you using an authenticated mode" is a *very* common
> question, so you need an answer to that -- or ideally, just support it if it
> isn't much work.

We already store a checksum per block; I don't see any reason that it
couldn't be a MAC. Johannes Thumshirn had a proof of concept for storing
an HMAC for all blocks:
https://lore.kernel.org/linux-btrfs/20191015121405.19066-1-jthumshirn@suse.de/#b
Plumbing it through for authenticated encryption would be a little
harder, but probably not by much.

> What is your proposal for how filenames encryption would work when the
> EXPLICIT_IV flag is used?  That doesn't appear to be mentioned.

Since there's no such thing as "reflinking" filenames, I think filename
encryption can be unchanged, i.e., per-directory encryption keys. (This
would probably be the case with per-extent keys for data, as well.)

> Finally, the proposal to allow encrypting the changed data of snapshots is a
> larger departure from the fscrypt model.  I'm still trying to wrap my head
> around how that could work.  Could you provide any more details about that?
> E.g. what metadata would actually be stored on-disk, and how would it be used?
> When would things be done in terms of filesystem operations?  E.g. let's say I
> open a file for writing -- would the encryption key be set up right away, or
> would it not happen until I actually write data?

On disk, we still only need to store the usual fscrypt context. It will
always be present for the top-level of the snapshot. It may or may not
be present for any files or directories under that.

In memory, we'd store whether the subvolume is encrypted. This would be
set when enabling encryption and when caching the subvolume. Since every
inode has a reference to the subvolume it is in, and inodes can't move
between subvolumes, all we need is a check like:

if (IS_ENCRYPTED(inode->subvolume) && !IS_ENCRYPTED(inode))
	set_up_encryption(inode);

I'm leaning towards doing that either at the time that userspace writes
the data, or at the time that we're flushing the data to disk, whichever
ends up being more convenient for Btrfs. I'd rather not do it at open
time.
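
To make the write-time variant concrete, here is a self-contained toy
model of the check above. The types and functions are stand-ins invented
for this sketch (they are not the Btrfs or fscrypt structures), and
prepare_write() simply marks where the check would sit on the write path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the in-memory subvolume and inode. */
struct subvolume_sketch {
	bool encrypted;
};

struct inode_sketch {
	struct subvolume_sketch *subvolume;	/* never changes for an inode */
	bool encrypted;
	unsigned int setup_calls;		/* for illustration only */
};

/* Placeholder for creating the fscrypt context and preparing the key. */
static void set_up_encryption(struct inode_sketch *inode)
{
	inode->encrypted = true;
	inode->setup_calls++;
}

/* Called when data is written, not when the file is opened. */
static void prepare_write(struct inode_sketch *inode)
{
	assert(inode != NULL && inode->subvolume != NULL);
	if (inode->subvolume->encrypted && !inode->encrypted)
		set_up_encryption(inode);
}
```

In this model an inode in an encrypted subvolume is set up on its first
write only, and an inode that is only ever read stays untouched.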

Thanks for the very helpful reply!

* Re: Btrfs Fscrypt Design Document
  2021-10-26  7:53   ` Omar Sandoval
@ 2021-10-26 14:56     ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2021-10-26 14:56 UTC (permalink / raw)
  To: Omar Sandoval
  Cc: Eric Biggers, linux-btrfs, linux-fscrypt, Theodore Y. Ts'o,
	Jaegeuk Kim, kernel-team

On Tue, Oct 26, 2021 at 12:53:25AM -0700, Omar Sandoval wrote:
> On Mon, Oct 25, 2021 at 12:49:51PM -0700, Eric Biggers wrote:
> > On Thu, Oct 21, 2021 at 11:34:19AM -0700, Omar Sandoval wrote:
> > Now, I personally think that authenticating file contents only wouldn't give
> > much benefit, and whole-filesystem authentication would be needed to get a real
> > benefit.  But "why aren't you using an authenticated mode" is a *very* common
> > question, so you need an answer to that -- or ideally, just support it if it
> > isn't much work.
> 
> We already store a checksum per block; I don't see any reason that it
> couldn't be a MAC. Johannes Thumshirn had a proof of concept for storing
> an HMAC for all blocks:
> https://lore.kernel.org/linux-btrfs/20191015121405.19066-1-jthumshirn@suse.de/#b
> Plumbing it through for authenticated encryption would be a little
> harder, but probably not by much.

I've been working on the HMAC as checksums and still want to finish as
time permits, so if you have any potential changes beyond "hmac is just
another checksum", please let me know.
