linux-fsdevel.vger.kernel.org archive mirror
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Eric Biggers <ebiggers@kernel.org>,
	<linux-fscrypt@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	<linux-ext4@vger.kernel.org>,
	<linux-f2fs-devel@lists.sourceforge.net>
Subject: Re: Proposal: A new fs-verity interface
Date: Thu, 24 Jan 2019 18:22:37 -0500
Message-ID: <20190124232237.GH8785@mit.edu>
In-Reply-To: <CAHk-=wh7jq6U3+ou_FDujB-ipg9JcJAGSjEQ_niCwLG313x6UA@mail.gmail.com>

On Fri, Jan 25, 2019 at 10:40:31AM +1300, Linus Torvalds wrote:
> 
> I _assume_ (but it's exactly that - just an assumption) this whole
> design decision comes from basically having a transport layer that is
> entirely unaware of the merkle data, so the data is brought in some
> entirely traditional way that can only transfer regular file contents
> (ie tar/zip/ar kind of thing, but presumably actually just in the form
> of an android APK). And then the new interface is just a way to
> "convert" that into the actual final security model.

How the transport layer is going to send the Merkle data is really
unrelated (e.g., it's not necessarily going to be at the end of the
file data).

> One thing that is also unclear to me is whether that "secure" model
> needs to be stable on disk (ie is this considered an actual write that
> *modifies* the underlying filesystem, and the merkle tree data ends up
> being associated long-term and over reboots), or whether it would be
> acceptable to just have it be a temporary "view" of the file where the
> filesystem itself can be read-only, and all that happens is that now
> the merkle tree is associated with that file as long as the filesystem
> is mounted (or until it is disassociated).

It's the first.  We need to keep the Merkle tree and associated
metadata information (which might include a PKCS 7 digital signature)
permanently associated with the file.  So it has to be stored in the
file; it's associated metadata.

> Maybe this was answered in some of the earlier email threads that (at
> least for me) were then somewhat overshadowed by the merge window work
> and the holidays. So it's possible that I repeat myself. But I do have
> to say that I think I'd *still* prefer this to be something more like
> an xattr, and that maybe we'd be better off actually improving our
> "write to xattr" interface or something.

The main issue is that for a 128 MB file, the Merkle data is going to
be about a megabyte.  So using a set/get interface, a la our current
xattr interface, seems awkward.  Also, currently for most file systems,
xattrs are limited in size to around 4k to 32k, and most xattrs are
relatively small (e.g., SELinux labels, ACLs).  So even if we used
the xattr interface, for many file systems, something that might
be 1 megabyte (for a 128 MB file to be protected by fs-verity)
would almost certainly be stored in a different location than other
xattrs.  Similarly, changing our xattr interfaces to handle big blobs,
when the vast majority of xattrs are small ones, doesn't seem to be a
great use of time.
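
To make the size arithmetic concrete, here is a rough sketch in Python.
It assumes SHA-256 digests (32 bytes) and 4096-byte blocks, with every
tree level padded to whole blocks; none of those parameters are stated
above, and the function name is made up for illustration.

```python
import math

def merkle_tree_size(file_size, block_size=4096, digest_size=32):
    """Return (total_bytes, blocks_per_level) for a Merkle tree over file_size bytes.

    Assumption: each level is padded to whole blocks; the leaf level comes first.
    """
    hashes_per_block = block_size // digest_size
    # Number of data blocks that need a digest at the leaf level.
    blocks = math.ceil(file_size / block_size)
    levels = []
    while blocks > 1:
        # Each level stores one digest per block of the level below it.
        blocks = math.ceil(blocks / hashes_per_block)
        levels.append(blocks)
    return sum(levels) * block_size, levels

total, levels = merkle_tree_size(128 * 1024 * 1024)
print(levels, total)  # [256, 2, 1] 1060864
```

Under those assumptions a 128 MiB file needs 259 tree blocks, a little
over 1 MiB, which is well beyond the 4k to 32k that a typical xattr can
hold.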

The other thing I'll point out is that file system developers
generally have frowned on xattrs whose setting has magic side
effects, since that would mean making the xattr set/get interface
act more like an ioctl.  When we make a file fs-verity
protected, it does have the side effect of making the file immutable.
That's not a huge side effect, but it's another reason why the
xattr interface feels like the wrong fit.

> I understand that you don't want to load the whole merkle tree into
> memory, and that is the reason that you want to point to some "stable
> on disk" area, but the hole punching does seem to be a particularly
> nasty part of it. It would be much better to have the merkle data in
> some place where it doesn't then need to be hidden again, no?

It's not really a "hole punch", but we are moving the data around.
That's because Dave Chinner and Christoph demanded it.  The original
approach was to put it at the end of the file, and then hide it.  If
the question is "why hide the metadata", it's because it's metadata.
We certainly don't want to make it be visible as part of the file
stream.

We could store the metadata somewhere else --- for example, we could
store it in another inode.  But inodes have overhead, and that would
mean using two inodes for every fs-verity protected file --- and we
don't need all of the other metadata (mtime, ctime, etc.) for the
Merkle tree.  So that's how we got to where we were.  I think the
approach of storing it using the same extent tree where we map logical
block numbers to physical block numbers makes a lot of sense for ext4
and f2fs.
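
As a toy illustration of that idea (not ext4 or f2fs code), the sketch
below keeps file data and Merkle tree blocks in one logical block map
and hides the tree simply by clamping reads at i_size.  The class name,
the method names, and the layout that puts the tree in the blocks
immediately after the data are all hypothetical.

```python
BLOCK = 4096

class ToyVerityFile:
    """Toy model: one logical block map holds both file data and Merkle blocks."""

    def __init__(self, data, tree_blocks):
        self.i_size = len(data)
        self.extents = {}  # logical block number -> block contents
        n_data = -(-len(data) // BLOCK)  # ceiling division
        for i in range(n_data):
            self.extents[i] = data[i * BLOCK:(i + 1) * BLOCK]
        # Metadata goes at the logical blocks just past the data (hypothetical layout).
        self.tree_start = n_data
        for j, blk in enumerate(tree_blocks):
            self.extents[self.tree_start + j] = blk

    def read(self, offset, length):
        """Userspace-visible read: clamped at i_size so metadata never shows up."""
        end = min(offset + length, self.i_size)
        out = bytearray()
        pos = offset
        while pos < end:
            blk, off = divmod(pos, BLOCK)
            chunk = self.extents[blk][off:off + (end - pos)]
            out += chunk
            pos += len(chunk)
        return bytes(out)

    def read_tree_block(self, index):
        """Internal lookup for the verifier, going through the same block map."""
        return self.extents[self.tree_start + index]

f = ToyVerityFile(b"a" * 5000, [b"\x01" * BLOCK])
print(len(f.read(0, 10000)))  # 5000: the tree block past EOF is not visible
```

No second inode is involved here; the only extra state is where the
tree starts in the logical block space.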

It seems that the developers of some file systems (which may never even
implement fs-verity) hate that particular approach.  So that's
where the suggestion of using a separate file descriptor to convey the
Merkle tree data to the file system came from.  It wasn't my first
choice.

						- Ted
