public inbox for linux-bcachefs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Subject: Re: [GIT PULL] bcachefs changes for 6.12-rc1
Date: Wed, 25 Sep 2024 14:43:15 +1000
Message-ID: <ZvOU4+3IIn48g43v@dread.disaster.area>
In-Reply-To: <v5lvhjauvcx27fcsyhyugzexdk7sik7an2soyxtx5dxj3oxjqa@gbvyu2kc7vpy>

On Tue, Sep 24, 2024 at 10:13:01PM -0400, Kent Overstreet wrote:
> On Wed, Sep 25, 2024 at 11:00:10AM GMT, Dave Chinner wrote:
> > > Eh? Of course it'd have to be coherent, but just checking if an inode is
> > > present in the VFS cache is what, 1-2 cache misses? Depending on hash
> > > table fill factor...
> > 
> > Sure, when there is no contention and you have CPU to spare. But the
> > moment the lookup hits contention problems (i.e. we are exceeding
> > the cache lookup scalability capability), we are straight back to
> > running a VFS cache speed instead of uncached speed.
> 
> The cache lookups are just reads; they don't introduce scalability
> issues, unless they're contending with other cores writing to those
> cachelines - checking if an item is present in a hash table is trivial
> to do locklessly.

Which was not something the VFS inode cache did until a couple of
months ago. Just because something is possible/easy today, it
doesn't mean it was possible or viable 15-20 years ago.
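
[Editorial sketch, not part of the original message.] The lockless
presence check being discussed can be illustrated with a minimal
userspace example in C11. This is not the kernel's inode hash code, and
every name in it (struct table, table_insert(), table_contains(),
NBUCKET_BITS) is hypothetical. It assumes that inserts are serialized by
the caller and that nodes are never removed; safe removal would need
something like RCU-deferred freeing, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKET_BITS 6u                  /* hypothetical table size: 64 buckets */
#define NBUCKETS     (1u << NBUCKET_BITS)

struct node {
	uint64_t ino;                    /* key: an inode number */
	struct node *next;               /* never modified after publication */
};

struct table {
	_Atomic(struct node *) bucket[NBUCKETS];
};

/* Multiplicative hash: keep the top NBUCKET_BITS bits of the product. */
static unsigned bucket_of(uint64_t ino)
{
	return (unsigned)((ino * 0x9E3779B97F4A7C15ull) >> (64 - NBUCKET_BITS));
}

static void table_init(struct table *t)
{
	for (unsigned i = 0; i < NBUCKETS; i++)
		atomic_init(&t->bucket[i], NULL);
}

/*
 * Writer side. Assumption: callers serialize inserts (for example with a
 * lock), so only one writer updates a bucket head at a time.
 */
static bool table_insert(struct table *t, uint64_t ino)
{
	assert(t != NULL);

	struct node *n = malloc(sizeof(*n));
	if (!n)
		return false;

	_Atomic(struct node *) *head = &t->bucket[bucket_of(ino)];

	n->ino = ino;
	n->next = atomic_load_explicit(head, memory_order_relaxed);
	/* Release store: the node is fully initialized before readers see it. */
	atomic_store_explicit(head, n, memory_order_release);
	return true;
}

/* Reader side: takes no lock and performs no stores to shared memory. */
static bool table_contains(struct table *t, uint64_t ino)
{
	struct node *n = atomic_load_explicit(&t->bucket[bucket_of(ino)],
					      memory_order_acquire);

	for (; n; n = n->next)
		if (n->ino == ino)
			return true;
	return false;
}
```

The reader only loads from the bucket chain, so concurrent lookups do
not dirty each other's cachelines. The sketch leaves out removal and
object lifetime entirely, and those are where the real difficulty lies.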

> But pulling an inode into and then evicting it from the inode cache
> entails a lot more work - just initializing a struct inode is
> nontrivial, and then there's the (multiple) shared data structures you
> have to manipulate.

Yes, but to avoid this we'd need to come up with a mechanism that is
generally safe for most filesystems, not just bcachefs.

I mean, if you can come up with a stat() mechanism that is safe
enough for us to read straight out of the XFS buffer cache on inode
cache misses, then we'll switch over to using it ASAP.

That's your challenge - if you want bcachefs to be able to do this,
then you have to make sure the infrastructure required works for
other filesystems just as safely, too.

> And incidentally this sort of "we have a cache on top of the btree, but
> sometimes we have to do direct access" is already something that comes
> up a lot in bcachefs, primarily for the alloc btree. _Tons_ of fun, but
> doesn't actually come up here for us since we don't use the vfs inode
> cache as a writeback cache.

And therein lies the problem for the general case. Most filesystems
do use writeback caching of inode metadata via the VFS inode state,
XFS included, and that's where all the dragons are hiding.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
