public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Filipe Manana <fdmanana@suse.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	David Hildenbrand <david@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Buffered I/O broken on s390x with page faults disabled (gfs2)
Date: Thu, 10 Mar 2022 12:13:25 +0000	[thread overview]
Message-ID: <YinrZVoAnEU8/wpa@debian9.Home> (raw)
In-Reply-To: <CAHc6FU5+AgDcoXE4Qfh_9hpn9d_it4aFyhoS=TKpqrBPe4GP+w@mail.gmail.com>

On Wed, Mar 09, 2022 at 10:08:32PM +0100, Andreas Gruenbacher wrote:
> On Wed, Mar 9, 2022 at 8:08 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Wed, Mar 9, 2022 at 10:42 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> > > With a large enough buffer, a simple malloc() will return unmapped
> > > pages, and reading into such a buffer will result in fault-in.  So page
> > > faults during read() are actually pretty normal, and it's not the user's
> > > fault.
> >
> > Agreed. But that wasn't the case here:
> >
> > > In my test case, the buffer was pre-initialized with memset() to avoid
> > > those kinds of page faults, which meant that the page faults in
> > > gfs2_file_read_iter() only started to happen when we were out of memory.
> > > But that's not the common case.
> >
> > Exactly. I do not think this is a case that we should - or need to -
> > optimize for.
> >
> > And doing too much pre-faulting is actually counter-productive.
> >
> > > * Get rid of max_size: it really makes no sense to second-guess what the
> > >   caller needs.
> >
> > It's not about "what caller needs". It's literally about latency
> > issues. If you can force a busy loop in kernel space by having one
> > unmapped page and then do a 2GB read(), that's a *PROBLEM*.
> >
> > Now, we can try this thing, because I think we end up having other
> > size limitations in the IO subsystem that means that the filesystem
> > won't actually do that, but the moment I hear somebody talk about
> > latencies, that max_size goes back.
> 
> Thanks, this puts fault_in_safe_writeable() in line with
> fault_in_readable() and fault_in_writeable().
> 
> There currently are two users of
> fault_in_safe_writeable()/fault_in_iov_iter_writeable(): gfs2 and
> btrfs.
> In gfs2, we cap the size at BIO_MAX_VECS pages (256). I don't see an
> explicit cap in btrfs; adding Filipe.

On btrfs, for buffered writes, we have some cap (done at btrfs_buffered_write()),
for buffered reads we don't have any control on that as we use filemap_read().

For direct IO we don't have any cap, we try to fault in everything that's left.
However we keep track if we are doing any progress, and if we aren't making any
progress we just fall back to the buffered IO path. So that prevents infinite
or long loops.

There's really no good reason to not cap how much we try to fault in in the
direct IO paths. We should do it, as it probably has a negative performance
impact for very large direct IO reads/writes.

Thanks.

> 
> Andreas
> 


  reply	other threads:[~2022-03-10 12:13 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-07 22:52 Buffered I/O broken on s390x with page faults disabled (gfs2) Andreas Gruenbacher
2022-03-07 23:18 ` Linus Torvalds
2022-03-08  8:21   ` David Hildenbrand
2022-03-08  8:37     ` David Hildenbrand
2022-03-08 12:11       ` David Hildenbrand
2022-03-08 12:24         ` David Hildenbrand
2022-03-08 13:20           ` Gerald Schaefer
2022-03-08 13:32             ` David Hildenbrand
2022-03-08 14:14               ` Gerald Schaefer
2022-03-08 17:23           ` David Hildenbrand
2022-03-08 17:26     ` Linus Torvalds
2022-03-08 17:40       ` Linus Torvalds
2022-03-08 19:27         ` Linus Torvalds
2022-03-08 20:03           ` Linus Torvalds
2022-03-08 23:24             ` Andreas Gruenbacher
2022-03-09  0:22               ` Linus Torvalds
2022-03-09 18:42                 ` Andreas Gruenbacher
2022-03-09 19:08                   ` Linus Torvalds
2022-03-09 20:57                     ` Andreas Gruenbacher
2022-03-09 21:08                     ` Andreas Gruenbacher
2022-03-10 12:13                       ` Filipe Manana [this message]
2022-03-09 19:21                   ` Linus Torvalds
2022-03-09 19:35                     ` Andreas Gruenbacher
2022-03-09 20:18                       ` Linus Torvalds
2022-03-09 20:36                         ` Andreas Gruenbacher
2022-03-09 20:48                           ` Linus Torvalds
2022-03-09 20:54                             ` Linus Torvalds
2022-03-10 17:13           ` David Hildenbrand
2022-03-10 18:00             ` Andreas Gruenbacher
2022-03-10 18:35             ` Linus Torvalds
2022-03-10 18:38               ` David Hildenbrand
2022-03-10 18:47               ` Andreas Gruenbacher
2022-03-10 19:22                 ` Linus Torvalds
2022-03-10 19:56                   ` Linus Torvalds
2022-03-10 20:23                     ` Andreas Gruenbacher
2022-03-08 17:47       ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YinrZVoAnEU8/wpa@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=agruenba@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox