From: Matthew Wilcox <willy@infradead.org>
To: Mike Marshall <hubcapsc@gmail.com>
Cc: Mike Marshall <hubcap@omnibond.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: problem with orangefs readpage...
Date: Sat, 2 Jan 2021 02:07:33 +0000 [thread overview]
Message-ID: <20210102020733.GA431927@casper.infradead.org> (raw)
In-Reply-To: <CAAoczXbw9A+kqMemEsJax+CaPkQsJzZNw6Y7XFhTsBqDnGD6hw@mail.gmail.com>
On Fri, Jan 01, 2021 at 05:15:32PM -0500, Mike Marshall wrote:
> Hi Matthew... Thanks so much for the suggestions!
> > This is some new version of orangefs_readpage(), right?
> No, that code has been upstream for a while... that readahead_control
> thing looks very interesting :-) ...
Oh, my, I was looking at a tree from before 2018 that still had
orangefs_readpages. So, yes, I think what's happening is that
orangefs_readpage() is being called from the readahead code.
You'll hit this path:
next_page = find_get_page(inode->i_mapping, index);
if (next_page) {
gossip_debug(GOSSIP_FILE_DEBUG,
"%s: found next page, quitting\n",
__func__);
put_page(next_page);
goto out;
because readahead already allocated those pages for you and is trying
to fill them one-at-a-time.
Implementing ->readahead, even without dhowells' patch to expand
the ractl will definitely help you!
> -Mike
>
> On Thu, Dec 31, 2020 at 11:08 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Thu, Dec 31, 2020 at 04:51:53PM -0500, Mike Marshall wrote:
> > > Greetings...
> > >
> > > I hope some of you will suffer through reading this long message :-) ...
> >
> > Hi Mike! Happy New Year!
> >
> > > Orangefs isn't built to do small IO. Reading a
> > > big file in page cache sized chunks is slow and painful.
> > > I tried to write orangefs_readpage so that it would do a reasonable
> > > sized hard IO, fill the page that was being called for, and then
> > > go ahead and fill a whole bunch of the following pages into the
> > > page cache with the extra data in the IO buffer.
> >
> > This is some new version of orangefs_readpage(), right? I don't see
> > anything resembling this in the current codebase. Did you disable
> > orangefs_readpages() as part of this work? Because the behaviour you're
> > describing sounds very much like what the readahead code might do to a
> > filesystem which implements readpage and neither readahead nor readpages.
> >
> > > orangefs_readpage gets called for the first four pages and then my
> > > prefill kicks in and fills the next pages and the right data ends
> > > up in /tmp/nine. I, of course, wished and planned for orangefs_readpage
> > > to only get called once, I don't understand why it gets called four
> > > times, which results in three extraneous expensive hard IOs.
> >
> > I might suggest some judicious calling of dump_stack() to understand
> > exactly what's calling you. My suspicion is that it's this loop in
> > read_pages():
> >
> > while ((page = readahead_page(rac))) {
> > aops->readpage(rac->file, page);
> > put_page(page);
> > }
> >
> > which doesn't test for PageUptodate before calling you.
> >
> > It'd probably be best if you implemented ->readahead, which has its own
> > ideas about which pages would be the right ones to read. It's not always
> > correct, but generally better to have that logic in the VFS than in each
> > filesystem.
> >
> > You probably want to have a look at Dave Howells' work to allow
> > the filesystem to expand the ractl:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
> >
> > specifically this patch:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=fscache-iter&id=f582790b32d5d1d8b937df95a8b2b5fdb8380e46
> >
prev parent reply other threads:[~2021-01-02 2:08 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-31 21:51 problem with orangefs readpage Mike Marshall
2021-01-01 4:08 ` Matthew Wilcox
[not found] ` <CAAoczXbw9A+kqMemEsJax+CaPkQsJzZNw6Y7XFhTsBqDnGD6hw@mail.gmail.com>
2021-01-01 22:23 ` Fwd: " Mike Marshall
2021-01-02 2:07 ` Matthew Wilcox [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210102020733.GA431927@casper.infradead.org \
--to=willy@infradead.org \
--cc=hubcap@omnibond.com \
--cc=hubcapsc@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).