linux-fsdevel.vger.kernel.org archive mirror
From: Mike Marshall <hubcap@omnibond.com>
To: adilger@dilger.ca
Cc: Mike Marshall <hubcap@clemson.edu>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 00/17] orangefs: page cache
Date: Tue, 2 Oct 2018 13:58:38 -0400
Message-ID: <CAOg9mSSd6qoG9VBHUEFW-fTJ9afonXyPS3R0v3szfpWLjqSpcw@mail.gmail.com>
In-Reply-To: <D61D4718-1AD0-45CD-92B2-BDCAC61AA295@dilger.ca>

That seems like one of several writeback errors that might occur...
It looks to me like our code would set PG_error through SetPageError, and
then the error would be returned to the application through close or
fsync, which seems a little late.
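
To make "returned through close or fsync" concrete from the application's
side, here is a minimal userspace sketch. It assumes nothing about the
Orangefs patches themselves: write_and_sync() is a hypothetical helper name,
and it only uses the standard POSIX open/write/fsync/close calls. The point
is that a successful write() only says the data reached the page cache, so
a caller that cares has to check fsync() and close() as well.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path, then force writeback so
 * that a deferred writeback error is seen here rather than silently lost.
 * Returns 0 on success or a negative errno value.
 */
int write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, ret = 0;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -errno;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			ret = -errno;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* write() succeeding only means the data is in the page cache. */
	if (ret == 0 && fsync(fd) < 0)
		ret = -errno;

	/* close() may also report an error; record it but do not retry. */
	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

An application that skips the fsync() and ignores the return value of
close() would never learn that its data did not reach the server.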

I guess this is the kind of writeback error "mess" that Jeff Layton
was talking about at LSF/MM a couple of years ago...

-Mike

On Mon, Oct 1, 2018 at 4:03 PM Andreas Dilger <adilger@dilger.ca> wrote:
>
> On Sep 20, 2018, at 12:31 PM, Mike Marshall <hubcap@omnibond.com> wrote:
> >
> > Using the page cache seems like a game changer for the Orangefs kernel module.
> > Workloads with small IO suffer trying to push a parallel filesystem
> > with just a handful of bytes at a time. Below, vm2 with Fedora's 4.17
> > has /pvfsmnt mounted from an Orangefs filesystem that is itself running
> > on vm2. vm1, with 4.19.0-rc2 plus the Orangefs page cache patch, also has
> > its /pvfsmnt mounted from a local Orangefs filesystem.
>
> Is there some mechanism to prevent the client cache size exceeding the amount
> of free space on the filesystem?  If not, then the client may write data that
> can never be flushed to disk on the server.
>
> Cheers, Andreas
>
> > [vm2]$ dd if=/dev/zero of=/pvfsmnt/d.vm2/d.foo/dds.out bs=128 count=4194304
> > 4194304+0 records in
> > 4194304+0 records out
> > 536870912 bytes (537 MB, 512 MiB) copied, 662.013 s, 811 kB/s
> >
> > [vm1]$ dd if=/dev/zero of=/pvfsmnt/d.vm1/d.foo/dds.out bs=128 count=4194304
> > 4194304+0 records in
> > 4194304+0 records out
> > 536870912 bytes (537 MB, 512 MiB) copied, 11.3072 s, 47.5 MB/s
> >
> > Small IO collects in the page cache until a reasonable amount of
> > data is available for writeback.
> >
> > The trick, it seems, is to improve small IO without harming large IO.
> > Aligning writeback sizes, when possible, with the size of the IO buffer
> > that the Orangefs kernel module shares with its userspace component seems
> > promising on my dinky vm tests.
> >
> > -Mike
> >
> > On Mon, Sep 17, 2018 at 4:11 PM Martin Brandenburg <martin@omnibond.com> wrote:
> >>
> >> If no major issues are found in review or in our testing, we intend to
> >> submit this during the next merge window.
> >>
> >> The goal of all this is to significantly reduce the number of network
> >> requests made to the OrangeFS server.
> >>
> >> First the xattr cache is needed because otherwise we make a ton of
> >> getxattr calls from security_inode_need_killpriv.
> >>
> >> Then there's some reorganization so inode changes can be cached.
> >> Finally, we enable write_inode.
> >>
> >> Then remove the old readpages.  Next there's some reorganization to
> >> support readpage/writepage.  Finally, enable readpage/writepage which
> >> is fairly straightforward except for the need to separate writes from
> >> different uid/gid pairs due to the design of our server.
> >>
> >> Martin Brandenburg (17):
> >>  orangefs: implement xattr cache
> >>  orangefs: do not invalidate attributes on inode create
> >>  orangefs: simply orangefs_inode_getattr interface
> >>  orangefs: update attributes rather than relying on server
> >>  orangefs: hold i_lock during inode_getattr
> >>  orangefs: set up and use backing_dev_info
> >>  orangefs: let setattr write to cached inode
> >>  orangefs: reorganize setattr functions to track attribute changes
> >>  orangefs: remove orangefs_readpages
> >>  orangefs: service ops done for writeback are not killable
> >>  orangefs: migrate to generic_file_read_iter
> >>  orangefs: implement writepage
> >>  orangefs: skip inode writeout if nothing to write
> >>  orangefs: write range tracking
> >>  orangefs: avoid fsync service operation on flush
> >>  orangefs: use kmem_cache for orangefs_write_request
> >>  orangefs: implement writepages
> >>
> >> fs/orangefs/acl.c             |   4 +-
> >> fs/orangefs/file.c            | 193 ++++--------
> >> fs/orangefs/inode.c           | 576 +++++++++++++++++++++++++++-------
> >> fs/orangefs/namei.c           |  41 ++-
> >> fs/orangefs/orangefs-cache.c  |  24 +-
> >> fs/orangefs/orangefs-kernel.h |  56 +++-
> >> fs/orangefs/orangefs-mod.c    |  10 +-
> >> fs/orangefs/orangefs-utils.c  | 181 +++++------
> >> fs/orangefs/super.c           |  38 ++-
> >> fs/orangefs/waitqueue.c       |  18 +-
> >> fs/orangefs/xattr.c           | 104 ++++++
> >> 11 files changed, 839 insertions(+), 406 deletions(-)
> >>
> >> --
> >> 2.19.0
> >>
>
>
> Cheers, Andreas
