From: Mike Marshall <hubcap@omnibond.com>
To: adilger@dilger.ca
Cc: Mike Marshall <hubcap@clemson.edu>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 00/17] orangefs: page cache
Date: Tue, 2 Oct 2018 13:58:38 -0400
Message-ID: <CAOg9mSSd6qoG9VBHUEFW-fTJ9afonXyPS3R0v3szfpWLjqSpcw@mail.gmail.com>
In-Reply-To: <D61D4718-1AD0-45CD-92B2-BDCAC61AA295@dilger.ca>
That seems like one of several writeback errors that might occur...
It looks to me like our code would set PG_error through SetPageError,
and the error would then be returned to the application through close
or fsync, which seems a little late.
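
To make "through close or fsync" concrete from the application's side,
here is a minimal userspace sketch. It is not OrangeFS code, and
write_and_sync() is a hypothetical helper name made up for illustration.
It assumes only the standard POSIX calls open, write, fsync and close.
The point is that write() can succeed while the data is still only in
the page cache, so a failed writeback (ENOSPC, EIO, ...) may first be
seen when fsync() or close() returns.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of OrangeFS): write len bytes to path and
 * report the first error seen, including errors that only surface at
 * fsync() or close(). Returns 0 on success or a negative errno value.
 */
static int write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int ret = 0;
	int fd;

	assert(path != NULL && buf != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -errno;

	/* A buffered write() normally just copies into the page cache, so
	 * success here does not mean the data reached stable storage. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			ret = -errno;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* fsync() forces writeback and waits for it; a writeback failure
	 * is reported here rather than by the earlier write(). */
	if (ret == 0 && fsync(fd) < 0)
		ret = -errno;

	/* close() can report an error too. Do not retry it, but do not
	 * ignore its return value either. */
	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller that only checks write() and ignores the other two return
values would never notice a late failure of this kind.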
I guess this is the kind of writeback error "mess" that Jeff Layton
was talking about at LSF/MM a couple of years ago...
-Mike
On Mon, Oct 1, 2018 at 4:03 PM Andreas Dilger <adilger@dilger.ca> wrote:
>
> On Sep 20, 2018, at 12:31 PM, Mike Marshall <hubcap@omnibond.com> wrote:
> >
> > Using the page cache seems like a game changer for the Orangefs kernel module.
> > Workloads with small IO suffer trying to push a parallel filesystem
> > with just a handful of bytes at a time. Below, vm2 with Fedora's 4.17
> > has /pvfsmnt mounted from an Orangefs filesystem that is itself running
> > on vm2. vm1 with 4.19.0-rc2 plus the Orangefs page cache patch, also has
> > its /pvfsmnt mounted from a local Orangefs filesystem.
>
> Is there some mechanism to prevent the client cache size exceeding the amount
> of free space on the filesystem? If not, then the client may write data that
> can never be flushed to disk on the server.
>
> Cheers, Andreas
>
> > [vm2]$ dd if=/dev/zero of=/pvfsmnt/d.vm2/d.foo/dds.out bs=128 count=4194304
> > 4194304+0 records in
> > 4194304+0 records out
> > 536870912 bytes (537 MB, 512 MiB) copied, 662.013 s, 811 kB/s
> >
> > [vm1]$ dd if=/dev/zero of=/pvfsmnt/d.vm1/d.foo/dds.out bs=128 count=4194304
> > 4194304+0 records in
> > 4194304+0 records out
> > 536870912 bytes (537 MB, 512 MiB) copied, 11.3072 s, 47.5 MB/s
> >
> > Small IO collects in the page cache until a reasonable amount of
> > data is available for writeback.
> >
> > The trick, it seems, is to improve small IO without harming large IO.
> > Aligning writeback sizes, when possible, with the size of the IO buffer
> > that the Orangefs kernel module shares with its userspace component seems
> > promising on my dinky vm tests.
> >
> > -Mike
> >
> > On Mon, Sep 17, 2018 at 4:11 PM Martin Brandenburg <martin@omnibond.com> wrote:
> >>
> >> If no major issues are found in review or in our testing, we intend to
> >> submit this during the next merge window.
> >>
> >> The goal of all this is to significantly reduce the number of network
> >> requests made to the OrangeFS
> >>
> >> First the xattr cache is needed because otherwise we make a ton of
> >> getxattr calls from security_inode_need_killpriv.
> >>
> >> Then there's some reorganization so inode changes can be cached.
> >> Finally, we enable write_inode.
> >>
> >> Then remove the old readpages. Next there's some reorganization to
> >> support readpage/writepage. Finally, enable readpage/writepage which
> >> is fairly straightforward except for the need to separate writes from
> >> different uid/gid pairs due to the design of our server.
> >>
> >> Martin Brandenburg (17):
> >> orangefs: implement xattr cache
> >> orangefs: do not invalidate attributes on inode create
> >> orangefs: simply orangefs_inode_getattr interface
> >> orangefs: update attributes rather than relying on server
> >> orangefs: hold i_lock during inode_getattr
> >> orangefs: set up and use backing_dev_info
> >> orangefs: let setattr write to cached inode
> >> orangefs: reorganize setattr functions to track attribute changes
> >> orangefs: remove orangefs_readpages
> >> orangefs: service ops done for writeback are not killable
> >> orangefs: migrate to generic_file_read_iter
> >> orangefs: implement writepage
> >> orangefs: skip inode writeout if nothing to write
> >> orangefs: write range tracking
> >> orangefs: avoid fsync service operation on flush
> >> orangefs: use kmem_cache for orangefs_write_request
> >> orangefs: implement writepages
> >>
> >>  fs/orangefs/acl.c             |   4 +-
> >>  fs/orangefs/file.c            | 193 ++++--------
> >>  fs/orangefs/inode.c           | 576 +++++++++++++++++++++++++++-------
> >>  fs/orangefs/namei.c           |  41 ++-
> >>  fs/orangefs/orangefs-cache.c  |  24 +-
> >>  fs/orangefs/orangefs-kernel.h |  56 +++-
> >>  fs/orangefs/orangefs-mod.c    |  10 +-
> >>  fs/orangefs/orangefs-utils.c  | 181 +++++------
> >>  fs/orangefs/super.c           |  38 ++-
> >>  fs/orangefs/waitqueue.c       |  18 +-
> >>  fs/orangefs/xattr.c           | 104 ++++++
> >>  11 files changed, 839 insertions(+), 406 deletions(-)
> >>
> >> --
> >> 2.19.0
> >>
>
>
> Cheers, Andreas
>
>
>
>
>
Thread overview: 23+ messages
2018-09-17 20:10 [PATCH 00/17] orangefs: page cache Martin Brandenburg
2018-09-17 20:10 ` [PATCH 01/17] orangefs: implement xattr cache Martin Brandenburg
2018-09-17 20:10 ` [PATCH 02/17] orangefs: do not invalidate attributes on inode create Martin Brandenburg
2018-09-17 20:10 ` [PATCH 03/17] orangefs: simply orangefs_inode_getattr interface Martin Brandenburg
2018-09-17 20:10 ` [PATCH 04/17] orangefs: update attributes rather than relying on server Martin Brandenburg
2018-09-17 20:10 ` [PATCH 05/17] orangefs: hold i_lock during inode_getattr Martin Brandenburg
2018-09-17 20:10 ` [PATCH 06/17] orangefs: set up and use backing_dev_info Martin Brandenburg
2018-09-17 20:10 ` [PATCH 07/17] orangefs: let setattr write to cached inode Martin Brandenburg
2018-09-17 20:10 ` [PATCH 08/17] orangefs: reorganize setattr functions to track attribute changes Martin Brandenburg
2018-09-17 20:10 ` [PATCH 09/17] orangefs: remove orangefs_readpages Martin Brandenburg
2018-09-17 20:10 ` [PATCH 10/17] orangefs: service ops done for writeback are not killable Martin Brandenburg
2018-09-17 20:10 ` [PATCH 11/17] orangefs: migrate to generic_file_read_iter Martin Brandenburg
2018-09-17 20:10 ` [PATCH 12/17] orangefs: implement writepage Martin Brandenburg
2018-09-17 20:10 ` [PATCH 13/17] orangefs: skip inode writeout if nothing to write Martin Brandenburg
2018-09-17 20:10 ` [PATCH 14/17] orangefs: write range tracking Martin Brandenburg
2018-09-17 20:10 ` [PATCH 15/17] orangefs: avoid fsync service operation on flush Martin Brandenburg
2018-09-17 20:10 ` [PATCH 16/17] orangefs: use kmem_cache for orangefs_write_request Martin Brandenburg
2018-09-17 20:10 ` [PATCH 17/17] orangefs: implement writepages Martin Brandenburg
2018-09-18 21:46 ` martin
2018-09-20 18:31 ` [PATCH 00/17] orangefs: page cache Mike Marshall
2018-10-01 20:03 ` Andreas Dilger
2018-10-02 17:58 ` Mike Marshall [this message]
2018-10-02 20:13 ` martin