linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>, Hao Xu <haoxu@linux.alibaba.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-fsdevel@vger.kernel.org,
	Jeffle Xu <jefflexu@linux.alibaba.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	io-uring@vger.kernel.org, Joseph Qi <joseph.qi@linux.alibaba.com>
Subject: Re: [PATCH v3 RESEND] iomap: set REQ_NOWAIT according to IOCB_NOWAIT in Direct IO
Date: Mon, 7 Dec 2020 16:40:38 -0700	[thread overview]
Message-ID: <3b33a9e3-0f03-38cc-d484-3f355f75df73@kernel.dk> (raw)
In-Reply-To: <20201207022130.GC4170059@dread.disaster.area>

On 12/6/20 7:21 PM, Dave Chinner wrote:
> On Fri, Dec 04, 2020 at 05:44:56PM +0800, Hao Xu wrote:
>> Currently, IOCB_NOWAIT is ignored in Direct IO, REQ_NOWAIT is only set
>> when IOCB_HIPRI is set. But REQ_NOWAIT should be set as well when
>> IOCB_NOWAIT is set.
>>
>> Suggested-by: Jeffle Xu <jefflexu@linux.alibaba.com>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
>> ---
>>
>> Hi all,
>> I tested fio io_uring direct read for a file on ext4 filesystem on a
>> nvme ssd. I found that IOCB_NOWAIT is ignored in iomap layer, which
>> means REQ_NOWAIT is not set in bio->bi_opf.
> 
> What iomap is doing is correct behaviour. IOCB_NOWAIT applies to the
> filesystem behaviour, not the block device.
> 
> REQ_NOWAIT can result in partial IO failures because the error is
> only reported to the iomap layer via IO completions. Hence we can
> split a DIO into multiple bios and have random bios in that IO fail
> with EAGAIN because REQ_NOWAIT is set. This error will
> get reported to the submitter via completion, and it will override
> any of the partial IOs that actually completed.
> 
> Hence, like the recently reported multi-mapping IOCB_NOWAIT bug
> reported by Jens and fixed in commit 883a790a8440 ("xfs: don't allow
> NOWAIT DIO across extent boundaries") we'll get silent partial
> writes occurring because the second submitted bio in an IO can
> trigger EAGAIN errors with partial IO completion having already
> occurred.
> 
> Further, we don't allow partial IO completion for DIO on XFS at all.
> DIO must be completely submitted and completed or return an error
> without having issued any IO at all.  Hence using REQ_NOWAIT for
> DIO bios is incorrect and not desirable.

What you say makes total sense for a user using RWF_NOWAIT, but it
doesn't make a lot of sense for io_uring where we really want
IOCB_NOWAIT to be what it suggests it is - don't wait for other IO to
complete, if avoidable. One of the things that really suck with
aio/libai is the "yeah it's probably async, but lol, might not be"
aspect of it.

For io_uring, if we do get -EAGAIN, we'll retry without NOWAIT set. So
the concern about fractured/short writes doesn't bubble up to the
application. Hence we really want an IOCB_NOWAIT_REALLY on that side,
instead of the poor mans IOCB_MAYBE_NOWAIT semantics.

-- 
Jens Axboe


  reply	other threads:[~2020-12-07 23:41 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04  9:44 [PATCH v3 RESEND] iomap: set REQ_NOWAIT according to IOCB_NOWAIT in Direct IO Hao Xu
2020-12-04 11:44 ` Pavel Begunkov
2020-12-07  2:21 ` Dave Chinner
2020-12-07 23:40   ` Jens Axboe [this message]
2020-12-09 21:15     ` Dave Chinner
2020-12-10  2:33       ` JeffleXu
2020-12-08  5:46   ` JeffleXu
2020-12-09 21:23     ` Dave Chinner
2020-12-10  1:55       ` JeffleXu
2020-12-10  5:18         ` Dave Chinner
2020-12-11  2:50           ` JeffleXu
2020-12-14  2:56             ` Dave Chinner
2020-12-15  9:43               ` JeffleXu
2021-04-02 14:32                 ` Pavel Begunkov
2021-04-02 16:26                   ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b33a9e3-0f03-38cc-d484-3f355f75df73@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=haoxu@linux.alibaba.com \
    --cc=hch@infradead.org \
    --cc=io-uring@vger.kernel.org \
    --cc=jefflexu@linux.alibaba.com \
    --cc=joseph.qi@linux.alibaba.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).