linux-fsdevel.vger.kernel.org archive mirror
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Andreas Dilger <adilger@dilger.ca>, Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Mimi Zohar <zohar@linux.vnet.ibm.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] fs-verity: file system-level integrity protection
Date: Fri, 02 Feb 2018 06:34:41 +0100
Message-ID: <1517549681.3222.12.camel@HansenPartnership.com>
In-Reply-To: <F38F74DF-0C5E-440C-A850-607FADAA9129@dilger.ca>

On Thu, 2018-02-01 at 16:43 -0700, Andreas Dilger wrote:
> On Feb 1, 2018, at 4:04 PM, Dave Chinner <david@fromorbit.com> wrote:
> > 
> > 
> > On Wed, Jan 31, 2018 at 07:03:16PM -0500, Theodore Ts'o wrote:
> > > 
> > > On Wed, Jan 31, 2018 at 12:41:13PM -0800, James Bottomley wrote:
> > > > 
> > > > > 
> > > > > > Like fscrypto, where most of the code is in fs/crypto, most
> > > > > > of the fs-verity code will be in fs/verity.  There will be
> > > > > > minimal hooks in a particular file system, so if another file
> > > > > > system wants to play, they can do so relatively easily.
> > > > 
> > > > OK, sounds good ... I notice, now I look, that fscrypt uses
> > > > xattrs (albeit hidden under the covers of get/set_context),
> > > > will dm-verity use the same trick or do people really need
> > > > space in the inode?
> > > 
> > > I assume you mean fs-verity above, and no, we aren't going to use
> > > xattrs because the Merkle tree won't fit in the xattr.  So the
> > > plan was to put the fs-verity header, the PKCS7 signature, and
> > > the Merkle tree after i_size (rounded to a blocksize
> > > > boundary).  Remember, in the fs-verity case we only worry about
> > > > the read-only case.
> > 
> > I think putting valid data beyond EOF is going to be problematic
> > for many filesystems. Getting things like truncate right is hard
> > enough without having to special case a bunch of new functionality
> > that specifically allows IO access beyond EOF. Indeed, how does
> > "truncate isize but leave special data behind" work and what's the
> > userspace API to drive it? And how does it interact with all the
> > page cache code that checks for page->index beyond EOF to detect a
> > truncated page that should not be accessed?
> > 
> > There are also further complications for filesystems like XFS, e.g.
> > how do we tell the difference between valid data beyond EOF and
> > speculative allocation (done by delalloc) beyond EOF that contains
> > no data and can be removed if it is not written to in a short
> > while?
> > 
> > This just seems like a horrible can of worms to me and is not
> > something we should be building generic infrastructure around.
> > 
> > Just how big do these merkle trees get, anyway?
> 
> The Merkle tree will have one checksum per "leaf block" of the
> filesystem (though I'd recommend using a fixed-size checksum leaf
> block like 4KB so that userspace doesn't need to care about the
> actual filesystem blocksize on disk).  After that, there is a tree of
> checksums from the leaf blocks up to the root.  If there was a weak
> checksum like CRC32 (4 bytes/leaf) then the tree size would be
> somewhat over 0.1% of the file size.  If the tree has a strong
> checksum like SHA256 (32 bytes/leaf) then the overhead is over 0.8%.

Actually, based on what Ted has already said about his use case, I
don't believe we need a binary Merkle tree.  The binary tree itself is
only useful if you're doing partial hash verifications on random chunks
as you download, like in p2p.  In this use case we only verify the leaf
nodes on read and the signature is only verified at open, so it sounds
like all we actually need is the leaf hashes and a signature over the
hash of all of them: in fact, a simple hash list.  That also protects us
from having to worry about second pre-image attacks, which are a known
problem of binary Merkle trees.
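
To make the hash-list idea concrete, here is a minimal userspace sketch in
Python.  It is not kernel code and not the proposed on-disk format: the
4KB block size, SHA-256, and all function names are assumptions chosen
for illustration, and the signature check is stood in for by comparing
against a trusted top-level hash rather than verifying a real PKCS7 blob.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed leaf size, independent of the fs blocksize


def leaf_hashes(data: bytes) -> list:
    """SHA-256 digest of each BLOCK_SIZE chunk (the last chunk may be short)."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(data), BLOCK_SIZE)]


def top_hash(hashes: list) -> bytes:
    """One hash over the concatenated leaf hashes; this is what would be signed."""
    return hashlib.sha256(b"".join(hashes)).digest()


def verify_at_open(hashes: list, trusted_top: bytes) -> bool:
    """Done once at open: the stored hash list must match the signed top hash."""
    return top_hash(hashes) == trusted_top


def verify_block_on_read(block: bytes, index: int, hashes: list) -> bool:
    """Done per read: only the one leaf hash covering this block is checked."""
    return hashlib.sha256(block).digest() == hashes[index]
```

The trade-off this shows is that open has to hash the entire list, which
grows linearly with file size, whereas each read only touches one leaf.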

Of course, the leaf nodes are still about 50% of the binary tree size,
so I've only made a small space improvement.
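
As a rough check on the sizes involved, the sketch below counts the bytes
of hashes needed for a plain hash list versus a full tree.  The 4KB leaf
block and 32-byte SHA-256 digest come from Andreas's example; the helper
names and the fan-out of 128 (one 4KB block of 32-byte hashes per interior
node) are my assumptions, not anything specified in the proposal.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating point."""
    return -(-a // b)


def hash_list_bytes(file_size: int, block_size: int = 4096,
                    digest_size: int = 32) -> int:
    """Bytes needed to store one digest per leaf block."""
    return ceil_div(file_size, block_size) * digest_size


def tree_bytes(file_size: int, fanout: int, block_size: int = 4096,
               digest_size: int = 32) -> int:
    """Bytes of digests in a hash tree: the leaf level plus every level up to the root."""
    level = ceil_div(file_size, block_size)
    total = level * digest_size
    while level > 1:
        level = ceil_div(level, fanout)
        total += level * digest_size
    return total


one_gib = 1 << 30
print(hash_list_bytes(one_gib))   # 8388608  (8 MiB, about 0.78% of the file)
print(tree_bytes(one_gib, 2))     # 16777184 (binary tree: leaves are ~50%)
print(tree_bytes(one_gib, 128))   # 8454688  (fan-out 128: leaves are ~99%)
```

So the 50% figure applies to a strictly binary tree; with a wide fan-out
the interior levels add under 1% on top of the leaf hashes, and dropping
them saves even less space.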

James
