public inbox for linux-bcachefs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: NeilBrown <neilb@suse.de>, Frank Filz <ffilzlnx@mindspring.com>,
	'Theodore Ts'o' <tytso@mit.edu>,
	'Donald Buczek' <buczek@molgen.mpg.de>,
	linux-bcachefs@vger.kernel.org,
	'Stefan Krueger' <stefan.krueger@aei.mpg.de>,
	'David Howells' <dhowells@redhat.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: file handle in statx
Date: Wed, 13 Dec 2023 10:44:07 +1100
Message-ID: <ZXjwR/6jfxFbLq9Y@dread.disaster.area>
In-Reply-To: <20231212223927.comwbwcmpvrd7xk4@moria.home.lan>

On Tue, Dec 12, 2023 at 05:39:27PM -0500, Kent Overstreet wrote:
> On Wed, Dec 13, 2023 at 09:23:18AM +1100, Dave Chinner wrote:
> > On Wed, Dec 13, 2023 at 08:57:43AM +1100, NeilBrown wrote:
> > > On Wed, 13 Dec 2023, Dave Chinner wrote:
> > > > On Tue, Dec 12, 2023 at 09:15:29AM -0800, Frank Filz wrote:
> > > > > > On Tue, Dec 12, 2023 at 10:10:23AM +0100, Donald Buczek wrote:
> > > > > > > On 12/12/23 06:53, Dave Chinner wrote:
> > > > > > >
> > > > > > > > So can someone please explain to me why we need to try to re-invent
> > > > > > > > > a generic filehandle concept in statx when we already have a
> > > > > > > > working and widely supported user API that provides exactly this
> > > > > > > > functionality?
> > > > > > >
> > > > > > > name_to_handle_at() is fine, but userspace could profit from being
> > > > > > > able to retrieve the filehandle together with the other metadata in a
> > > > > > > single system call.
> > > > > > 
> > > > > > > Can you say more?  What, specifically, is the application that would
> > > > > > > want to do that, and is it really in such a hot path that it would be
> > > > > > > a user-visible improvement, let alone something that can actually be
> > > > > > > measured?
> > > > > 
> > > > > A user space NFS server like Ganesha could benefit from getting attributes
> > > > > and file handle in a single system call.
> > > > 
> > > > At the cost of every other application that doesn't need those
> > > > attributes.
> > > 
> > > Why do you think there would be a cost?
> > 
> > It's as much maintenance and testing cost as it is a runtime cost.
> > We have to test and check this functionality works as advertised,
> > and we have to maintain that in working order forever more. That's
> > not free, especially if it is decided that the implementation needs
> > to be hyper-optimised in each individual filesystem because of
> > performance cost reasons.
> > 
> > Indeed, even the runtime "do we need to fetch this information"
> > checks have a measurable cost, especially as statx() is a very hot
> > kernel path. We've been optimising branches out of things like
> > setting up kiocbs because when that path is taken millions of times
> > every second each logic branch that decides if something needs to be
> > done or not has a direct measurable cost. statx() is a hot path that
> > can be called millions of times a second.....
> 
> Like Neil mentioned we won't even be fetching the fh if it wasn't
> explicitly requested - and like I mentioned, we can avoid the
> .encode_fh() call for local filesystems with a bit of work at the VFS
> layer.
> 
> OTOH, when you're running rsync in incremental mode, and detecting
> hardlinks, your point that "statx can be called millions of times per
> second" would apply just as much to the additional name_to_handle_at()
> call - we'd be nearly doubling their overhead for scanning files that
> don't need to be sent.

Hardlinked files are indicated by st_nlink > 1, not by requiring
userspace to store every st_ino/dev it sees and having to compare
the st_ino/dev of every newly stat()d inode against that ino/dev
cache.

We only need ino/dev/filehandles for hardlink path disambiguation.

IOWs, this use case does not need name_to_handle_at() for millions
of inodes - it is just needed on the regular file inodes that have
st_nlink > 1.

Hence even for workloads like rsync with hardlink detection, we
don't need filehandles for every inode being stat()d.  And that's
ignoring the fact that, outside of certain niche use cases,
hardlinks are rare.
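
To make the filtering argument concrete, here is a minimal sketch, not
rsync's actual logic. The function name find_hardlink_groups() is
hypothetical, and the (st_dev, st_ino) pair stands in for the file handle
that a real tool might fetch with name_to_handle_at() at the same point.
Only regular files with st_nlink > 1 reach the identity lookup at all:

```python
import os
import stat
from collections import defaultdict


def find_hardlink_groups(root):
    """Group paths under root that share a multiply-linked inode.

    Returns (groups, lookups). groups maps (st_dev, st_ino) to the paths
    seen for that inode; lookups counts how many files needed an identity
    lookup at all.
    """
    groups = defaultdict(list)
    lookups = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Files with a single link cannot be hardlinked to another
            # path, so they are skipped without being tracked.
            if not stat.S_ISREG(st.st_mode) or st.st_nlink <= 1:
                continue
            # Assumption: this is where a real tool would ask for a file
            # handle; here (st_dev, st_ino) serves as the identity.
            lookups += 1
            groups[(st.st_dev, st.st_ino)].append(path)
    return dict(groups), lookups
```

In a tree where hardlinks are rare, lookups stays small relative to the
number of files stat()d, which is the point being made above.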

I'm really struggling to see what filehandles in statx() actually
optimises in any meaningful manner....

> > And then comes the cost of encoding dynamically sized information in
> > struct statx - filehandles are not fixed size - and statx is most
> > definitely not set up or intended for dynamically sized attribute
> > data. This adds more complexity to statx because it wasn't designed
> > or intended to handle dynamically sized attributes. Optional
> > attributes, yes, but not attributes that might vary in size from fs
> > to fs or even inode type to inode type within a filesystem (e.g. dir
> > filehandles can, optionally, encode the parent inode in them).
> 
> Since it looks like expanding statx is not going to be quite as easy as
> hoped, I proposed elsewhere in the thread that we reserve a smaller
> fixed size in statx (32 bytes) and set a flag if it won't fit,
> indicating that userspace needs to fall back to name_to_handle_at().

struct btrfs_fid is 40 bytes in size. Sure, that's not all used for
name_to_handle_at(), but we already have in-kernel filehandles that
can optionally be configured to be bigger than 32 bytes...

> Stuffing a _dynamically_ sized attribute into statx would indeed be
> painful - I believe we were always talking about a fixed size buffer in
> statx, the discussion's been over how big it needs to be...

The contents of the buffer are still dynamically sized, so there's
still a length attribute that needs to be emitted to userspace with
the buffer.
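
As a rough illustration only, here is what a fixed 32-byte slot implies.
This is not the real struct statx layout; the names SLOT_SIZE,
FLAG_TOO_BIG, pack_handle_slot() and unpack_handle_slot(), and the byte
layout itself, are all assumptions made up for this sketch. It shows that a
length field has to travel with the fixed buffer, and that a 40-byte handle
forces the fallback path:

```python
import struct

SLOT_SIZE = 32       # fixed size proposed in the thread
FLAG_TOO_BIG = 0x1   # hypothetical flag: handle did not fit in the slot

# Hypothetical layout: 1-byte flags, 1-byte length, then the fixed slot.
_LAYOUT = struct.Struct(f"<BB{SLOT_SIZE}s")


def pack_handle_slot(handle: bytes) -> bytes:
    """Pack a variable-length handle into the fixed-size slot."""
    if len(handle) > SLOT_SIZE:
        # The caller would have to fall back to name_to_handle_at().
        return _LAYOUT.pack(FLAG_TOO_BIG, 0, b"")
    # struct pads the "s" field with NUL bytes up to SLOT_SIZE.
    return _LAYOUT.pack(0, len(handle), handle)


def unpack_handle_slot(blob: bytes):
    """Return the handle bytes, or None if a fallback is required."""
    flags, length, slot = _LAYOUT.unpack(blob)
    if flags & FLAG_TOO_BIG:
        return None
    # Without the length field, trailing NUL bytes in a handle would be
    # indistinguishable from padding.
    return slot[:length]
```

Every record carries the full 34 bytes regardless of how many handle bytes
are valid, and a 40-byte handle always takes the fallback path.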

And then what happens with the next attribute that someone wants
statx() to expose that can be dynamically sized? Are we really
planning to allow the struct statx to be expanded indefinitely
with largely unused static data arrays?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
