From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>, Jiri Kosina <jikos@kernel.org>,
    Vlastimil Babka <vbabka@suse.cz>,
    Andrew Morton <akpm@linux-foundation.org>,
    Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
    Linux-MM <linux-mm@kvack.org>,
    Linux API <linux-api@vger.kernel.org>,
    Peter Zijlstra <peterz@infradead.org>,
    Greg KH <gregkh@linuxfoundation.org>,
    Jann Horn <jannh@google.com>,
    Dominique Martinet <asmadeus@codewreck.org>,
    Andy Lutomirski <luto@amacapital.net>,
    Kevin Easton <kevin@guarana.org>,
    Matthew Wilcox <willy@infradead.org>,
    Cyril Hrubis <chrubis@suse.cz>, Tejun Heo <tj@kernel.org>,
    "Kirill A . Shutemov" <kirill@shutemov.name>,
    Daniel Gruss <daniel@gruss.cc>,
    linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 2/3] mm/filemap: initiate readahead even if IOCB_NOWAIT is set for the I/O
Date: Fri, 1 Feb 2019 16:13:55 +1100
Message-ID: <20190201051355.GV6173@dastard>
In-Reply-To: <CAHk-=wjkiNPWb97JXV6=J6DzscB1g7moGJ6G_nSe=AEbMugTNw@mail.gmail.com>
On Thu, Jan 31, 2019 at 09:54:16AM -0800, Linus Torvalds wrote:
> On Thu, Jan 31, 2019 at 2:23 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > OK, I guess my question was not precise. What does prevent taking fs
> > locks down the path?
>
> IOCB_NOWAIT has never meant that, and will never mean it.
I think you're wrong, Linus. IOCB_NOWAIT was specifically designed
to prevent blocking on filesystem locks during AIO submission. The
initial commits spell that out pretty clearly:
commit b745fafaf70c0a98a2e1e7ac8cb14542889ceb0e
Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
Date:   Tue Jun 20 07:05:43 2017 -0500

    fs: Introduce RWF_NOWAIT and FMODE_AIO_NOWAIT

    RWF_NOWAIT informs kernel to bail out if an AIO request will block
    for reasons such as file allocations, or a writeback triggered,
    or would block while allocating requests while performing
    direct I/O.

    RWF_NOWAIT is translated to IOCB_NOWAIT for iocb->ki_flags.

    FMODE_AIO_NOWAIT is a flag which identifies the file opened is capable
    of returning -EAGAIN if the AIO call will block. This must be set by
    supporting filesystems in the ->open() call.

    Filesystems xfs, btrfs and ext4 would be supported in the following patches.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit 29a5d29ec181ebdc98a26cedbd76ce9870248892
Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
Date:   Tue Jun 20 07:05:48 2017 -0500

    xfs: nowait aio support

    If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable
    immediately.

    IF IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin
    if it needs allocation either due to file extension, writing to a hole,
    or COW or waiting for other DIOs to finish.

    Return -EAGAIN if we don't have extent list in memory.

    Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

commit 728fbc0e10b7f3ce2ee043b32e3453fd5201c055
Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
Date:   Tue Jun 20 07:05:47 2017 -0500

    ext4: nowait aio support

    Return EAGAIN if any of the following checks fail for direct I/O:
      + i_rwsem is lockable
      + Writing beyond end of file (will trigger allocation)
      + Blocks are not allocated at the write location

    Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
> We will never give user space those kinds of guarantees. We do locking
> for various reasons. For example, we'll do the mm lock just when
> fetching/storing data from/to user space if there's a page fault.
You are conflating "best effort non-blocking operation" with
"atomic guarantee". RWF_NOWAIT/IOCB_NOWAIT is the former, not the
latter.

That is, RWF_NOWAIT addresses the "every second IO submission blocks"
problems that AIO submission suffered from due to filesystem lock
contention, not the rare and unusual things like "page fault during
get_user_pages in direct IO submission". Maybe one day, but right
now those rare cases are not pain points for applications that
require non-blocking AIO submission via RWF_NOWAIT.
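
From user space, the best-effort contract looks roughly like the sketch below: try the read with RWF_NOWAIT and, if the kernel reports that it would have to block, redo it somewhere blocking is acceptable. It assumes Linux with preadv2(2) available. `read_nowait_or_fallback()` and `read_file_head()` are hypothetical helper names, and the fallback is an inline pread() only to keep the example short; an application would more likely hand the request to a worker thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008   /* value from the kernel UAPI headers */
#endif

/*
 * Attempt a read that should not block. If the kernel answers EAGAIN
 * (it would have blocked) or the file or kernel does not support
 * RWF_NOWAIT (EOPNOTSUPP, ENOSYS), fall back to a blocking pread().
 * Returns bytes read (possibly short) or -errno; *blocked is set to 1
 * when the fallback path was taken.
 */
ssize_t read_nowait_or_fallback(int fd, void *buf, size_t len, off_t off,
                                int *blocked)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t n;

    assert(blocked != NULL);
    *blocked = 0;

    n = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
    if (n >= 0)
        return n;
    if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
        return -errno;          /* a real error, not "would block" */

    *blocked = 1;
    n = pread(fd, buf, len, off);
    return n < 0 ? -errno : n;
}

/* Open path read-only and read up to len bytes from offset 0. */
ssize_t read_file_head(const char *path, void *buf, size_t len, int *blocked)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -errno;
    n = read_nowait_or_fallback(fd, buf, len, 0, blocked);
    close(fd);
    return n;
}
```

Nothing here promises that the first attempt never sleeps; it only moves the common blocking cases into an error return the caller can act on.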
> Or -
> more obviously - we'll also check for - and sleep on - mandatory locks
> in rw_verify_area().
Well, only if you don't set O_NONBLOCK on the file (e.g. via
fcntl(F_SETFL)) to tell mandatory locking to fail with -EAGAIN instead
of sleeping.
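
For reference, setting O_NONBLOCK on an already-open descriptor is done with fcntl(2) F_GETFL/F_SETFL. A minimal sketch, with `set_nonblocking()` as a hypothetical helper name:

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Set O_NONBLOCK on an open descriptor while keeping its other file
 * status flags. Returns 0 on success, -1 on failure with errno set.
 */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Reading the flags first matters because F_SETFL replaces the whole set of changeable status flags rather than adding one to it.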
-Dave.
--
Dave Chinner
david@fromorbit.com