From: Jan Kara <jack@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Christian Brauner <brauner@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Oleksandr Tymoshenko <ovt@google.com>,
Carlos Maiolino <cem@kernel.org>,
Jeff Layton <jlayton@kernel.org>,
Chuck Lever <chuck.lever@oracle.com>, Jan Kara <jack@suse.cz>,
Miklos Szeredi <miklos@szeredi.hu>, Daniel Xu <dxu@dxuuu.xyz>,
Chris Down <chris@chrisdown.name>, Tejun Heo <tj@kernel.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Matthew Wilcox <willy@infradead.org>,
Christoph Hellwig <hch@infradead.org>,
Pete Zaitcev <zaitcev@redhat.com>, Helge Deller <deller@gmx.de>,
Topi Miettinen <toiwoton@gmail.com>, Yu Kuai <yukuai3@huawei.com>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH vfs.tmpfs 3/5] tmpfs,xattr: enable limited user extended attributes
Date: Wed, 9 Aug 2023 11:50:54 +0200 [thread overview]
Message-ID: <20230809095054.w3cjtggejv3nph5e@quack3> (raw)
In-Reply-To: <2e63b26e-df46-5baa-c7d6-f9a8dd3282c5@google.com>
On Tue 08-08-23 21:33:56, Hugh Dickins wrote:
> Enable "user." extended attributes on tmpfs, limiting them by tracking
> the space they occupy, and deducting that space from the limited ispace
> (unless tmpfs mounted with nr_inodes=0 to leave that ispace unlimited).
>
> tmpfs inodes and simple xattrs are both unswappable, and have to be in
> lowmem on a 32-bit highmem kernel: so the ispace limit is appropriate
> for xattrs, without any need for a further mount option.
>
> Add simple_xattr_space() to give approximate but deterministic estimate
> of the space taken up by each xattr: with simple_xattrs_free() outputting
> the space freed if required (but kernfs and even some tmpfs usages do not
> require that, so don't waste time on strlen'ing if not needed).
>
> Security and trusted xattrs were already supported: for consistency and
> simplicity, account them from the same pool; though there's a small risk
> that a tmpfs with enough space before would now be considered too small.
>
> When extended attributes are used, "df -i" does show more IUsed and less
> IFree than can be explained by the inodes: document that (manpage later).
>
> xfstests tests/generic which were not run on tmpfs before but now pass:
> 020 037 062 070 077 097 103 117 337 377 454 486 523 533 611 618 728
> with no new failures.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> Documentation/filesystems/tmpfs.rst | 7 ++-
> fs/Kconfig | 4 +-
> fs/kernfs/dir.c | 2 +-
> fs/xattr.c | 28 ++++++++++-
> include/linux/xattr.h | 3 +-
> mm/shmem.c | 78 +++++++++++++++++++++++++++----
> 6 files changed, 106 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
> index 67422ee10e03..56a26c843dbe 100644
> --- a/Documentation/filesystems/tmpfs.rst
> +++ b/Documentation/filesystems/tmpfs.rst
> @@ -21,8 +21,8 @@ explained further below, some of which can be reconfigured dynamically on the
> fly using a remount ('mount -o remount ...') of the filesystem. A tmpfs
> filesystem can be resized but it cannot be resized to a size below its current
> usage. tmpfs also supports POSIX ACLs, and extended attributes for the
> -trusted.* and security.* namespaces. ramfs does not use swap and you cannot
> -modify any parameter for a ramfs filesystem. The size limit of a ramfs
> +trusted.*, security.* and user.* namespaces. ramfs does not use swap and you
> +cannot modify any parameter for a ramfs filesystem. The size limit of a ramfs
> filesystem is how much memory you have available, and so care must be taken if
> used so to not run out of memory.
>
> @@ -97,6 +97,9 @@ mount with such options, since it allows any user with write access to
> use up all the memory on the machine; but enhances the scalability of
> that instance in a system with many CPUs making intensive use of it.
>
> +If nr_inodes is not 0, that limited space for inodes is also used up by
> +extended attributes: "df -i"'s IUsed and IUse% increase, IFree decreases.
> +
> tmpfs blocks may be swapped out, when there is a shortage of memory.
> tmpfs has a mount option to disable its use of swap:
>
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 8218a71933f9..7da21f563192 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -205,8 +205,8 @@ config TMPFS_XATTR
> Extended attributes are name:value pairs associated with inodes by
> the kernel or by users (see the attr(5) manual page for details).
>
> - Currently this enables support for the trusted.* and
> - security.* namespaces.
> + This enables support for the trusted.*, security.* and user.*
> + namespaces.
>
> You need this for POSIX ACL support on tmpfs.
>
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 5a1a4af9d3d2..660995856a04 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -556,7 +556,7 @@ void kernfs_put(struct kernfs_node *kn)
> kfree_const(kn->name);
>
> if (kn->iattr) {
> - simple_xattrs_free(&kn->iattr->xattrs);
> + simple_xattrs_free(&kn->iattr->xattrs, NULL);
> kmem_cache_free(kernfs_iattrs_cache, kn->iattr);
> }
> spin_lock(&kernfs_idr_lock);
> diff --git a/fs/xattr.c b/fs/xattr.c
> index ba37a8f5cfd1..2d607542281b 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -1039,6 +1039,26 @@ const char *xattr_full_name(const struct xattr_handler *handler,
> }
> EXPORT_SYMBOL(xattr_full_name);
>
> +/**
> + * simple_xattr_space - estimate the memory used by a simple xattr
> + * @name: the full name of the xattr
> + * @size: the size of its value
> + *
> + * This takes no account of how much larger the two slab objects actually are:
> + * that would depend on the slab implementation, when what is required is a
> + * deterministic number, which grows with name length and size and quantity.
> + *
> + * Return: The approximate number of bytes of memory used by such an xattr.
> + */
> +size_t simple_xattr_space(const char *name, size_t size)
> +{
> + /*
> + * Use "40" instead of sizeof(struct simple_xattr), to return the
> + * same result on 32-bit and 64-bit, and even if simple_xattr grows.
> + */
> + return 40 + size + strlen(name);
> +}
> +
> /**
> * simple_xattr_free - free an xattr object
> * @xattr: the xattr object
> @@ -1363,14 +1383,17 @@ void simple_xattrs_init(struct simple_xattrs *xattrs)
> /**
> * simple_xattrs_free - free xattrs
> * @xattrs: xattr header whose xattrs to destroy
> + * @freed_space: approximate number of bytes of memory freed from @xattrs
> *
> * Destroy all xattrs in @xattr. When this is called no one can hold a
> * reference to any of the xattrs anymore.
> */
> -void simple_xattrs_free(struct simple_xattrs *xattrs)
> +void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space)
> {
> struct rb_node *rbp;
>
> + if (freed_space)
> + *freed_space = 0;
> rbp = rb_first(&xattrs->rb_root);
> while (rbp) {
> struct simple_xattr *xattr;
> @@ -1379,6 +1402,9 @@ void simple_xattrs_free(struct simple_xattrs *xattrs)
> rbp_next = rb_next(rbp);
> xattr = rb_entry(rbp, struct simple_xattr, rb_node);
> rb_erase(&xattr->rb_node, &xattrs->rb_root);
> + if (freed_space)
> + *freed_space += simple_xattr_space(xattr->name,
> + xattr->size);
> simple_xattr_free(xattr);
> rbp = rbp_next;
> }
> diff --git a/include/linux/xattr.h b/include/linux/xattr.h
> index e37fe667ae04..d20051865800 100644
> --- a/include/linux/xattr.h
> +++ b/include/linux/xattr.h
> @@ -114,7 +114,8 @@ struct simple_xattr {
> };
>
> void simple_xattrs_init(struct simple_xattrs *xattrs);
> -void simple_xattrs_free(struct simple_xattrs *xattrs);
> +void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space);
> +size_t simple_xattr_space(const char *name, size_t size);
> struct simple_xattr *simple_xattr_alloc(const void *value, size_t size);
> void simple_xattr_free(struct simple_xattr *xattr);
> int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
> diff --git a/mm/shmem.c b/mm/shmem.c
> index c39471384168..7420b510a9f3 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -393,12 +393,12 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop)
> return 0;
> }
>
> -static void shmem_free_inode(struct super_block *sb)
> +static void shmem_free_inode(struct super_block *sb, size_t freed_ispace)
> {
> struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
> if (sbinfo->max_inodes) {
> raw_spin_lock(&sbinfo->stat_lock);
> - sbinfo->free_ispace += BOGO_INODE_SIZE;
> + sbinfo->free_ispace += BOGO_INODE_SIZE + freed_ispace;
> raw_spin_unlock(&sbinfo->stat_lock);
> }
> }
> @@ -1232,6 +1232,7 @@ static void shmem_evict_inode(struct inode *inode)
> {
> struct shmem_inode_info *info = SHMEM_I(inode);
> struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
> + size_t freed;
>
> if (shmem_mapping(inode->i_mapping)) {
> shmem_unacct_size(info->flags, inode->i_size);
> @@ -1258,9 +1259,9 @@ static void shmem_evict_inode(struct inode *inode)
> }
> }
>
> - simple_xattrs_free(&info->xattrs);
> + simple_xattrs_free(&info->xattrs, sbinfo->max_inodes ? &freed : NULL);
> + shmem_free_inode(inode->i_sb, freed);
> WARN_ON(inode->i_blocks);
> - shmem_free_inode(inode->i_sb);
> clear_inode(inode);
> #ifdef CONFIG_TMPFS_QUOTA
> dquot_free_inode(inode);
> @@ -2440,7 +2441,7 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
> inode = new_inode(sb);
>
> if (!inode) {
> - shmem_free_inode(sb);
> + shmem_free_inode(sb, 0);
> return ERR_PTR(-ENOSPC);
> }
>
> @@ -3281,7 +3282,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
> ret = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
> if (ret) {
> if (inode->i_nlink)
> - shmem_free_inode(inode->i_sb);
> + shmem_free_inode(inode->i_sb, 0);
> goto out;
> }
>
> @@ -3301,7 +3302,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
> struct inode *inode = d_inode(dentry);
>
> if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
> - shmem_free_inode(inode->i_sb);
> + shmem_free_inode(inode->i_sb, 0);
>
> simple_offset_remove(shmem_get_offset_ctx(dir), dentry);
>
> @@ -3554,21 +3555,40 @@ static int shmem_initxattrs(struct inode *inode,
> void *fs_info)
> {
> struct shmem_inode_info *info = SHMEM_I(inode);
> + struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
> const struct xattr *xattr;
> struct simple_xattr *new_xattr;
> + size_t ispace = 0;
> size_t len;
>
> + if (sbinfo->max_inodes) {
> + for (xattr = xattr_array; xattr->name != NULL; xattr++) {
> + ispace += simple_xattr_space(xattr->name,
> + xattr->value_len + XATTR_SECURITY_PREFIX_LEN);
> + }
> + if (ispace) {
> + raw_spin_lock(&sbinfo->stat_lock);
> + if (sbinfo->free_ispace < ispace)
> + ispace = 0;
> + else
> + sbinfo->free_ispace -= ispace;
> + raw_spin_unlock(&sbinfo->stat_lock);
> + if (!ispace)
> + return -ENOSPC;
> + }
> + }
> +
> for (xattr = xattr_array; xattr->name != NULL; xattr++) {
> new_xattr = simple_xattr_alloc(xattr->value, xattr->value_len);
> if (!new_xattr)
> - return -ENOMEM;
> + break;
>
> len = strlen(xattr->name) + 1;
> new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
> GFP_KERNEL);
> if (!new_xattr->name) {
> kvfree(new_xattr);
> - return -ENOMEM;
> + break;
> }
>
> memcpy(new_xattr->name, XATTR_SECURITY_PREFIX,
> @@ -3579,6 +3599,16 @@ static int shmem_initxattrs(struct inode *inode,
> simple_xattr_add(&info->xattrs, new_xattr);
> }
>
> + if (xattr->name != NULL) {
> + if (ispace) {
> + raw_spin_lock(&sbinfo->stat_lock);
> + sbinfo->free_ispace += ispace;
> + raw_spin_unlock(&sbinfo->stat_lock);
> + }
> + simple_xattrs_free(&info->xattrs, NULL);
> + return -ENOMEM;
> + }
> +
> return 0;
> }
>
> @@ -3599,16 +3629,39 @@ static int shmem_xattr_handler_set(const struct xattr_handler *handler,
> size_t size, int flags)
> {
> struct shmem_inode_info *info = SHMEM_I(inode);
> + struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
> struct simple_xattr *old_xattr;
> + size_t ispace = 0;
>
> name = xattr_full_name(handler, name);
> + if (value && sbinfo->max_inodes) {
> + ispace = simple_xattr_space(name, size);
> + raw_spin_lock(&sbinfo->stat_lock);
> + if (sbinfo->free_ispace < ispace)
> + ispace = 0;
> + else
> + sbinfo->free_ispace -= ispace;
> + raw_spin_unlock(&sbinfo->stat_lock);
> + if (!ispace)
> + return -ENOSPC;
> + }
> +
> old_xattr = simple_xattr_set(&info->xattrs, name, value, size, flags);
> if (!IS_ERR(old_xattr)) {
> + ispace = 0;
> + if (old_xattr && sbinfo->max_inodes)
> + ispace = simple_xattr_space(old_xattr->name,
> + old_xattr->size);
> simple_xattr_free(old_xattr);
> old_xattr = NULL;
> inode->i_ctime = current_time(inode);
> inode_inc_iversion(inode);
> }
> + if (ispace) {
> + raw_spin_lock(&sbinfo->stat_lock);
> + sbinfo->free_ispace += ispace;
> + raw_spin_unlock(&sbinfo->stat_lock);
> + }
> return PTR_ERR(old_xattr);
> }
>
> @@ -3624,9 +3677,16 @@ static const struct xattr_handler shmem_trusted_xattr_handler = {
> .set = shmem_xattr_handler_set,
> };
>
> +static const struct xattr_handler shmem_user_xattr_handler = {
> + .prefix = XATTR_USER_PREFIX,
> + .get = shmem_xattr_handler_get,
> + .set = shmem_xattr_handler_set,
> +};
> +
> static const struct xattr_handler *shmem_xattr_handlers[] = {
> &shmem_security_xattr_handler,
> &shmem_trusted_xattr_handler,
> + &shmem_user_xattr_handler,
> NULL
> };
>
> --
> 2.35.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2023-08-09 9:50 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-09 4:28 [PATCH vfs.tmpfs 0/5] tmpfs: user xattrs and direct IO Hugh Dickins
2023-08-09 4:30 ` [PATCH vfs.tmpfs 1/5] xattr: simple_xattr_set() return old_xattr to be freed Hugh Dickins
2023-08-09 9:21 ` Jan Kara
2023-08-09 11:37 ` Christian Brauner
2023-08-09 13:19 ` Carlos Maiolino
2023-08-09 4:32 ` [PATCH vfs.tmpfs 2/5] tmpfs: track free_ispace instead of free_inodes Hugh Dickins
2023-08-09 9:33 ` Jan Kara
2023-08-09 13:29 ` Carlos Maiolino
2023-08-09 4:33 ` [PATCH vfs.tmpfs 3/5] tmpfs,xattr: enable limited user extended attributes Hugh Dickins
2023-08-09 9:50 ` Jan Kara [this message]
2023-08-09 13:52 ` Carlos Maiolino
2023-08-09 4:34 ` [PATCH vfs.tmpfs 4/5] tmpfs: trivial support for direct IO Hugh Dickins
2023-08-09 9:54 ` Jan Kara
2023-08-09 13:41 ` Christoph Hellwig
2023-08-10 23:41 ` Darrick J. Wong
2023-08-11 6:16 ` Hugh Dickins
2023-08-11 8:34 ` Christoph Hellwig
2023-08-11 6:08 ` Hugh Dickins
2023-08-11 6:27 ` [PATCH vfs.tmpfs v2 " Hugh Dickins
2023-08-11 8:35 ` Christoph Hellwig
2023-08-11 8:56 ` Jan Kara
2023-08-11 11:00 ` (subset) " Christian Brauner
2023-08-09 4:36 ` [PATCH vfs.tmpfs 5/5] mm: invalidation check mapping before folio_contains Hugh Dickins
2023-08-09 9:27 ` Jan Kara
2023-08-09 6:45 ` [PATCH vfs.tmpfs 0/5] tmpfs: user xattrs and direct IO Christian Brauner
2023-08-09 11:33 ` Christian Brauner
2023-08-10 5:50 ` Hugh Dickins
2023-08-10 10:07 ` Christian Brauner
2023-08-21 17:39 ` [PATCH vfs.tmpfs] tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs Hugh Dickins
2023-08-21 17:57 ` Jan Kara
2023-08-22 8:58 ` (subset) " Christian Brauner
2023-08-10 23:23 ` [PATCH vfs.tmpfs 0/5] tmpfs: user xattrs and direct IO Pete Zaitcev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230809095054.w3cjtggejv3nph5e@quack3 \
--to=jack@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=chris@chrisdown.name \
--cc=chuck.lever@oracle.com \
--cc=deller@gmx.de \
--cc=dxu@dxuuu.xyz \
--cc=gregkh@linuxfoundation.org \
--cc=hch@infradead.org \
--cc=hughd@google.com \
--cc=jlayton@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=miklos@szeredi.hu \
--cc=ovt@google.com \
--cc=tj@kernel.org \
--cc=toiwoton@gmail.com \
--cc=willy@infradead.org \
--cc=yukuai3@huawei.com \
--cc=zaitcev@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).