From: Theodore Ts'o <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Andreas Dilger <adilger@dilger.ca>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	James Simmons <jsimmons@infradead.org>,
	tahsin@google.com, nauman@google.com, tytso@google.com
Subject: Re: [PATCH] ext4: xattr-in-inode support
Date: Thu, 20 Apr 2017 17:24:40 -0400
Message-ID: <20170420212440.w4oek4rbzxeu2qqk@thunk.org>
In-Reply-To: <20170420075823.GA18523@quack2.suse.cz>

On Thu, Apr 20, 2017 at 09:58:23AM +0200, Jan Kara wrote:
> So the proposal seems to have implicit in it that we will be
> "deduplicating" xattr values. Currently we deduplicate only full external
> xattr blocks (which possibly contain more xattrs). Any idea how big win
> that is going to be over deduplicating only full sets of xattrs?

So in Windows, the security ID can be larger than what can fit in the
inode (if the file creator belongs to foreign domains; I'm told that the
SID in some cases can be 12k or more).  And of course the Windows/Rich
ACL can also be substantially bigger than what can fit in the inode.

So if you have a directory hierarchy whose files all have the same ACLs,
and a large number of users writing into that hierarchy (so there is a
large number of different SIDs), the resulting cross product can be
large.
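
As a rough illustration of why that cross product matters, the
following Python sketch counts how many distinct objects would have to
be stored if sharing happens only between files whose whole xattr sets
match, versus sharing each value individually.  The function name
stored_objects, the counts (10 ACLs, 1000 SIDs), and the assumption that
every ACL/SID combination actually occurs are made up for illustration;
none of this is ext4 code.

```python
from itertools import product


def stored_objects(acls, sids):
    """Count distinct stored objects under two sharing granularities."""
    # Assumption: each file carries one (ACL, SID) pair and every
    # combination occurs at least once.
    files = list(product(acls, sids))
    # Whole-set sharing: two files share storage only if all xattrs match.
    whole_set = len(set(files))
    # Per-value sharing: each distinct value is stored once, no matter
    # which other values it is paired with.
    per_value = len({acl for acl, _ in files} | {sid for _, sid in files})
    return whole_set, per_value


acls = [f"acl-{i}" for i in range(10)]
sids = [f"sid-{i}" for i in range(1000)]
whole_set, per_value = stored_objects(acls, sids)
print("whole-set sharing:", whole_set)
print("per-value sharing:", per_value)
```

Under those assumptions, whole-set sharing needs one object per
(ACL, SID) pair, while per-value sharing needs one per distinct ACL plus
one per distinct SID.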

Windows also has a large number of other use cases for extended
attributes that will be unique.  Some of them, such as the Unix
timestamps, file owner, and permission bits for files written by the
Windows Subsystem for Linux, will fit in the inode table.  The
information that a particular file was downloaded from
"http://russia.phish.org/rootme.exe", so the user can be asked whether
they really want to open it, is also stored in an xattr.

It's certainly true that adding some heuristics to sort certain
xattrs into the in-inode xattr space will help.  (For example, this
will help the Android SE Linux label / ext4 encryption context
overflow case.)  But there will be some cases, probably mostly with
Windows CIFS serving, where Microsoft is using enough xattrs that
this will probably be useful.

> One idea I had in mind was that one way of supporting larger xattrs would
> be to support something like xattr fork - i.e., in the xattr space of the
> inode we would have root of an extent tree describing xattr space of the
> inode. Then inside the space described by the extent tree would be stored
> xattrs - possibly in the same format as they are currently stored in a
> block (we would just redefine that e_value_block+e_value_offs describe the
> offset of xattr value inside the xattr space). From the perspective of
> "disk reads required to get the xattrs" this proposal should be similar as
> above (xattr space description will mostly fully fit in the xattr space of
> the inode) so we will just go and read the xattr headers and then value.
> It has an advantage that it basically does not limit xattr size or number
> of xattrs. It has the disadvantage that deduplication possibilities are
> lower.

The number of disk reads required to get the xattrs is especially of
concern for those things that are needed every time the file is
accessed (e.g., Rich ACLs).  It's the sharing that fixes the disk
seeks, and so the lower deduplication possibilities are a major
weakness of the scheme you've proposed above.
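
The effect of sharing on reads can be sketched with a toy model,
assuming an unbounded cache and exactly one disk read per uncached
xattr value.  The function reads_needed, the inode numbers, and the
four ACL values are hypothetical; this does not reflect how the ext4
code is structured.

```python
def reads_needed(files, shared):
    """Count simulated disk reads to fetch the ACL of every file once."""
    cache = set()
    reads = 0
    for inode, acl in files:
        # With sharing, identical ACL values live in one place, so the
        # cache key is the value itself.  Without sharing, every inode
        # has a private copy, so the key is per-inode.
        key = acl if shared else (inode, acl)
        if key not in cache:
            cache.add(key)
            reads += 1
    return reads


# Assumption: 1000 files that use only 4 distinct ACL values.
files = [(inode, f"acl-{inode % 4}") for inode in range(1000)]
print("shared values:  ", reads_needed(files, shared=True))
print("private copies: ", reads_needed(files, shared=False))
```

In this model the shared case costs one read per distinct ACL, and the
private-copy case costs one read per file.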

I'm personally not that interested in supporting a large number of
large xattrs.  If we allow xattr values in inodes, that will allow
for a small number of large xattrs, which ought to be sufficient, no?

      	    	   	 	  	- Ted
