Re: [RFC PATCH] getvalues(2) prototype

linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: Miklos Szeredi <mszeredi@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, linux-man@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	Karel Zak <kzak@redhat.com>, Ian Kent <raven@themaw.net>,
	David Howells <dhowells@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <christian@brauner.io>,
	Amir Goldstein <amir73il@gmail.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>
Subject: Re: [RFC PATCH] getvalues(2) prototype
Date: Thu, 24 Mar 2022 09:58:43 +1100	[thread overview]
Message-ID: <20220323225843.GI1609613@dread.disaster.area> (raw)
In-Reply-To: <20220322192712.709170-1-mszeredi@redhat.com>

On Tue, Mar 22, 2022 at 08:27:12PM +0100, Miklos Szeredi wrote:
> Add a new userspace API that allows getting multiple short values in a
> single syscall.
> 
> This would be useful for the following reasons:
> 
> - Calling open/read/close for many small files is inefficient.  E.g. on my
>   desktop invoking lsof(1) results in ~60k open + read + close calls under
>   /proc and 90% of those are 128 bytes or less.

How does doing the open/read/close in a single syscall make this any
more efficient? All it saves is the overhead of a couple of
syscalls, it doesn't reduce any of the setup or teardown overhead
needed to read the data itself....

> - Interfaces for getting various attributes and statistics are fragmented.
>   For files we have basic stat, statx, extended attributes, file attributes
>   (for which there are two overlapping ioctl interfaces).  For mounts and
>   superblocks we have stat*fs as well as /proc/$PID/{mountinfo,mountstats}.
>   The latter also has the problem on not allowing queries on a specific
>   mount.

https://xkcd.com/927/

> - Some attributes are cheap to generate, some are expensive.  Allowing
>   userspace to select which ones it needs should allow optimizing queries.
> 
> - Adding an ascii namespace should allow easy extension and self
>   description.
> 
> - The values can be text or binary, whichever is fits best.
> 
> The interface definition is:
> 
> struct name_val {
> 	const char *name;	/* in */
> 	struct iovec value_in;	/* in */
> 	struct iovec value_out;	/* out */
> 	uint32_t error;		/* out */
> 	uint32_t reserved;
> };

Ahhh, XFS_IOC_ATTRMULTI_BY_HANDLE reborn. This is how xfsdump gets
and sets attributes efficiently when dumping and restoring files -
it's an interface that allows batches of xattr operations to be run
on a file in a single syscall.

I've said in the past when discussing things like statx() that maybe
everything should be addressable via the xattr namespace and
set/queried via xattr names regardless of how the filesystem stores
the data. The VFS/filesystem simply translates the name to the
storage location of the information. It might be held in xattrs, but
it could just be a flag bit in an inode field.

Then we just get named xattrs in batches from an open fd.

> int getvalues(int dfd, const char *path, struct name_val *vec, size_t num,
> 	      unsigned int flags);
> 
> @dfd and @path are used to lookup object $ORIGIN.  @vec contains @num
> name/value descriptors.  @flags contains lookup flags for @path.
> 
> The syscall returns the number of values filled or an error.
> 
> A single name/value descriptor has the following fields:
> 
> @name describes the object whose value is to be returned.  E.g.
> 
> mnt                    - list of mount parameters
> mnt:mountpoint         - the mountpoint of the mount of $ORIGIN
> mntns                  - list of mount ID's reachable from the current root
> mntns:21:parentid      - parent ID of the mount with ID of 21
> xattr:security.selinux - the security.selinux extended attribute
> data:foo/bar           - the data contained in file $ORIGIN/foo/bar

How are these different from just declaring new xattr namespaces for
these things. e.g. open any file and list the xattrs in the
xattr:mount.mnt namespace to get the list of mount parameters for
that mount.

Why do we need a new "xattr in everything but name" interface when
we could just extend the one we've already got and formalise a new,
cleaner version of xattr batch APIs that have been around for 20-odd
years already?

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

next prev parent reply	other threads:[~2022-03-23 22:58 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-22 19:27 [RFC PATCH] getvalues(2) prototype Miklos Szeredi
2022-03-22 19:30 ` Miklos Szeredi
2022-03-22 20:36 ` Casey Schaufler
2022-03-22 20:53   ` Casey Schaufler
2022-03-23  7:14   ` Greg KH
2022-03-23  7:16 ` Greg KH
2022-03-23 10:26   ` Bernd Schubert
2022-03-23 11:42     ` Greg KH
2022-03-23 12:06       ` Bernd Schubert
2022-03-23 12:13         ` Greg KH
2022-03-23 19:29     ` J. Bruce Fields
2022-03-23 11:42 ` Christian Brauner
2022-03-23 13:24   ` Miklos Szeredi
2022-03-23 13:38     ` Greg KH
2022-03-23 15:23       ` Miklos Szeredi
2022-03-24  6:56         ` Greg KH
2022-03-23 13:51     ` Casey Schaufler
2022-03-23 14:00       ` Miklos Szeredi
2022-03-23 22:39         ` Casey Schaufler
2022-03-23 22:19     ` Theodore Ts'o
2022-03-24  6:34       ` Christoph Hellwig
2022-03-24  8:44       ` Miklos Szeredi
2022-03-24 16:15         ` Eric W. Biederman
2022-03-25  8:46         ` Karel Zak
2022-03-25  8:54           ` Greg KH
2022-03-25  9:25             ` Karel Zak
2022-03-26  4:19               ` Theodore Ts'o
2022-03-25 18:40           ` Linus Torvalds
2022-03-25 11:02         ` Cyril Hrubis
2022-03-23 22:58 ` Dave Chinner [this message]
2022-03-23 23:17   ` Casey Schaufler
2022-03-24  8:57   ` Miklos Szeredi
2022-03-24 10:34     ` Amir Goldstein
2022-03-24 20:31     ` Dave Chinner
2022-03-25  9:10       ` Karel Zak
2022-03-25 16:42       ` Trond Myklebust
2022-03-27 21:03         ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220323225843.GI1609613@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=amir73il@gmail.com \
    --cc=christian@brauner.io \
    --cc=dhowells@redhat.com \
    --cc=kzak@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=raven@themaw.net \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).