From: Dave Chinner <david@fromorbit.com>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Bypass filesystems for reading cached pages
Date: Tue, 23 Jun 2020 10:52:18 +1000
Message-ID: <20200623005218.GF2040@dread.disaster.area>
In-Reply-To: <CAHc6FU4b_z+vhjVPmaU46VhqoD+Y7jLN3=BRDZPrS2v=_pVpfw@mail.gmail.com>
On Mon, Jun 22, 2020 at 04:35:05PM +0200, Andreas Gruenbacher wrote:
> On Mon, Jun 22, 2020 at 2:32 AM Dave Chinner <david@fromorbit.com> wrote:
> > On Fri, Jun 19, 2020 at 08:50:36AM -0700, Matthew Wilcox wrote:
> > >
> > > This patch lifts the IOCB_CACHED idea expressed by Andreas to the VFS.
> > > The advantage of this patch is that we can avoid taking any filesystem
> > > lock, as long as the pages being accessed are in the cache (and we don't
> > > need to readahead any pages into the cache). We also avoid an indirect
> > > function call in these cases.
> >
> > What does this micro-optimisation actually gain us except for more
> > complexity in the IO path?
> >
> > i.e. if a filesystem lock has such massive overhead that it slows
> > down the cached readahead path in production workloads, then that's
> > something the filesystem needs to address, not unconditionally
> > bypass the filesystem before the IO gets anywhere near it.
>
> I'm fine with not moving that functionality into the VFS. The problem
> I have in gfs2 is that taking glocks is really expensive. Part of that
> overhead is accidental, but we definitely won't be able to fix it in
> the short term. So something like the IOCB_CACHED flag that prevents
> generic_file_read_iter from issuing readahead I/O would save the day
> for us. Does that idea stand a chance?
I have no problem with a "NOREADAHEAD" flag being passed to
generic_file_read_iter(). It's not an "already cached" flag though,
it's a "don't start any IO" directive, just like the NOWAIT flag is
a "don't block on locks or IO in progress" directive and not an
"already cached" flag. Readahead is something we should be doing,
unless a filesystem has a very good reason not to, such as the gfs2
locking case here...
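
To make the distinction between the two directives concrete, here is a
minimal userspace sketch. It is a toy model that takes the wording above
literally, not kernel code and not the actual generic_file_read_iter()
logic. The names TOY_NOWAIT, TOY_NOREADAHEAD, toy_read(), start_io() and
wait_for_io() are all hypothetical, as is the three-state page model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's IOCB_* values. */
#define TOY_NOWAIT      (1u << 0)  /* don't block on locks or IO in progress */
#define TOY_NOREADAHEAD (1u << 1)  /* don't start any IO */

/* Simplified page states (an assumption of this model). */
enum page_state { PAGE_ABSENT, PAGE_IN_FLIGHT, PAGE_UPTODATE };

#define NPAGES 4
static enum page_state cache[NPAGES];  /* zero-initialised: all PAGE_ABSENT */
static int ios_started;                /* counts IOs this model has issued */

/* Pretend to submit a read for one page. */
static void start_io(size_t idx)
{
    cache[idx] = PAGE_IN_FLIGHT;
    ios_started++;
}

/* Pretend to block until the in-flight read completes. */
static void wait_for_io(size_t idx)
{
    assert(cache[idx] == PAGE_IN_FLIGHT);
    cache[idx] = PAGE_UPTODATE;
}

/*
 * Read up to 'count' pages starting at 'first'.
 * Returns the number of pages read, or -EAGAIN if the flags
 * prevented reading even the first page.
 */
static int toy_read(size_t first, size_t count, unsigned int flags)
{
    int done = 0;

    for (size_t i = first; i < first + count && i < NPAGES; i++) {
        if (cache[i] == PAGE_ABSENT) {
            if (flags & TOY_NOREADAHEAD)
                break;          /* directive: do not start IO */
            start_io(i);
        }
        if (cache[i] == PAGE_IN_FLIGHT) {
            if (flags & TOY_NOWAIT)
                break;          /* directive: do not block on IO */
            wait_for_io(i);
        }
        done++;                 /* page is up to date, "copy" it */
    }
    return done ? done : -EAGAIN;
}
```

In this model, neither flag asserts anything about what is cached. With
TOY_NOREADAHEAD, toy_read() still waits for a page whose IO someone else
already started; it only refuses to issue IO itself. With TOY_NOWAIT, it
refuses to wait for that same page.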
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com