linux-fsdevel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: samba-technical@lists.samba.org, Eric Biggers <ebiggers@kernel.org>
Subject: Streams support in Linux
Date: Sat, 25 Aug 2018 06:51:07 -0700
Message-ID: <20180825135107.GA12251@bombadil.infradead.org>


[starting a separate thread to not hijack the fs-verity submission]

Eric Biggers wrote:
> In theory it would be a much cleaner design to store verity metadata
> separately from the data.  But the Merkle tree can be very large.
> For example, a 1 GB file using SHA-512 would have a 16.6 MB Merkle tree.
> So the Merkle tree can't be an extended attribute, since the xattrs API
> requires xattrs to be small (<= 64 KB), and most filesystems further limit
> xattr sizes in their on-disk format to as little as 4 KB.  Furthermore,
> even if both of these limits were to be increased, the xattrs functions
> (both the syscalls, and the internal functions that filesystems have)
> are all based around getting/setting the entire xattr value.
> 
> Also when used with fscrypt, we want the Merkle tree and
> fsverity_descriptor to be encrypted, so they doesn't leak plaintext
> hashes.  And we want the Merkle tree to be paged into memory, just like
> the file contents, to take advantage of the usual Linux memory management.
> 
> What we really need is *streams*, like NTFS has.  But the filesystems
> we're targetting don't support streams, nor does the Linux syscall
> interface have any API for accessing streams, nor does the VFS support
> them.
> 
> Adding streams support to all those things would be a huge multi-year
> effort, controversial, and almost certainly not worth it just for
> fs-verity.

There are, of course, other clients for file streams.  Samba is one,
GNOME could use streams for various desktoppy things, and I'm certain
other users would come out of the woodwork if we had them.

Let's go over the properties of a file stream:

 - It has no life independent of the file it's attached to; you can't move
   it from one file to another
 - If the file is deleted, it is also deleted
 - If the file is renamed, it travels with the file
 - If the file is copied, the copying program decides whether any named
   streams are copied along with it.
 - Can be created, deleted.  Can be renamed?
 - Openable, seekable, cacheable
 - Does not have sub-streams of its own
 - Directories may also have streams which are distinct from the files
   in the directory
 - Can pipes / sockets / device nodes / symlinks / ... have streams?  Unclear.
   Probably not useful.

NTFS, UDF and SMB all support streams already.  Microsoft opted to
include the functionality in ReFS (which dropped some of the less-used
functionality of NTFS), so it's clearly useful.

Here's my proposed syscall API for this:

openat()
To access a named stream, we need to be able to get a file descriptor for
it.  The new openat() syscall seems like the best way to accomplish this;
specify a file descriptor, a new AT_NAMED_STREAM flag and a filename,
and the last component of the filename will be treated as the name of
the stream within the object.  This permits us to distinguish between
a named stream on a directory and a file within a directory.
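
A minimal sketch of how a caller might use this, under stated assumptions:
AT_NAMED_STREAM does not exist in any released kernel, the bit value below
is a placeholder picked only so the code compiles, and passing the flag in
the flags argument of openat() is one possible reading of the proposal.
open_named_stream() is a hypothetical wrapper, not an existing function.

```c
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical flag: placeholder value, not defined by any kernel header. */
#ifndef AT_NAMED_STREAM
#define AT_NAMED_STREAM 0x800000
#endif

/*
 * Open the stream called 'name' on the object that 'fd' refers to.
 * 'fd' may be a regular file or a directory; with the proposed flag the
 * last path component names a stream rather than a directory entry.
 * Returns a new file descriptor, or -1 with errno set.
 */
int open_named_stream(int fd, const char *name, int flags, mode_t mode)
{
	/* 'mode' is only consulted by openat() when O_CREAT is in 'flags'. */
	return openat(fd, name, flags | AT_NAMED_STREAM, mode);
}
```

With this, something like open_named_stream(fd, "thumbnail", O_RDONLY, 0)
would be the stream equivalent of a plain open().  On existing kernels the
call does not open a stream; it only shows the intended calling convention.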

fstat()
st_ino may be different for different names.  st_dev may be different.
st_mode will match the object for files, even if it is changed after
creation.  For directories, it will match except that execute permission
will be removed and the S_IFMT bits will be S_IFREG (do we want to define a
new S_IFSTRM?).  st_nlink will be 1.  st_uid and st_gid will match.  It
will have its own st_atime/st_mtime/st_ctime.  Accessing a stream will not
update its parent's atime/mtime/ctime.

mmap(), read(), write(), close(), splice(), sendfile(), fallocate(),
ftruncate(), dup(), dup2(), dup3(), utimensat(), futimens(), select(), poll(),
lseek(),
fcntl() (F_DUPFD, F_GETFD, F_GETFL, F_SETFL, F_SETLK, F_SETLKW, F_GETLK,
F_GETOWN, F_SETOWN, F_GETSIG, F_SETSIG, F_SETLEASE, F_GETLEASE)
These system calls work as expected.

linkat(), symlinkat(), mknodat(), mkdirat()
These system calls will return -EPERM.

renameat()
If olddirfd + oldpath refers to a stream, then newdirfd + newpath must
refer to a stream within the same parent object.  If olddirfd + oldpath
does not refer to a stream, then newdirfd + newpath must not refer to a
stream either.

In other words, renameat() can rename a stream within an object, but it
cannot move a stream from one object to another.  If newpath refers to
an existing named stream, that stream is removed.
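
The renameat() constraints amount to a two-line check.  The following is a
sketch only: struct rename_operand and stream_rename_check() are
hypothetical names, and the error codes are assumptions, since no
particular errno is specified for these cases.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, already-resolved view of one renameat() operand. */
struct rename_operand {
	unsigned long parent_id;	/* identity of the owning file or directory */
	bool is_stream;			/* true if the path names a stream */
};

/*
 * Return 0 if the rename is allowed under the proposed rules, or a
 * negative errno.  Error codes here are illustrative assumptions.
 */
int stream_rename_check(const struct rename_operand *from,
			const struct rename_operand *to)
{
	/* A stream may only become a stream, a non-stream a non-stream. */
	if (from->is_stream != to->is_stream)
		return -EINVAL;

	/* Streams cannot move from one parent object to another. */
	if (from->is_stream && from->parent_id != to->parent_id)
		return -EXDEV;

	return 0;
}
```

Replacing an existing destination stream is not checked here; as with an
ordinary rename(), the old destination would simply be removed.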

unlinkat()
This is how you remove an individual named stream.

unlink()
Unlinking a file with named streams removes all named streams from that
file and then unlinks the file.  Open streams will continue to exist in
the filesystem until they are closed, just as unlinked files do.

link(), rename()
Renaming or linking to a file with named streams does not affect the streams.

We may need a new system call for enumerating the streams associated
with a file or directory.  We can't use getdents() because there's no
way to distinguish between wanting to read the contents of a directory
and the named streams on a directory.


For shell programming, I would suggest a new program:

	strcat [FILE] [STREAM]...

which opens [FILE], then each named stream within that file, concatenating
said STREAMs to stdout.  We probably need a strls too.

