From: Liu Bo <bo.li.liu@oracle.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: fdmanana@gmail.com, linux-btrfs <linux-btrfs@vger.kernel.org>,
Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH] btrfs: Fix locking during DIO read
Date: Wed, 21 Feb 2018 10:14:13 -0800 [thread overview]
Message-ID: <20180221181412.GA9910@lim.localdomain> (raw)
In-Reply-To: <de457af6-6a9c-43f4-b9c6-429b69deec77@suse.com>
On Wed, Feb 21, 2018 at 04:15:38PM +0200, Nikolay Borisov wrote:
>
>
> On 21.02.2018 15:51, Filipe Manana wrote:
> > On Wed, Feb 21, 2018 at 11:41 AM, Nikolay Borisov <nborisov@suse.com> wrote:
> >> Currently the DIO read cases uses a botched idea from ext4 to ensure
> >> that DIO reads don't race with truncate. The idea is that if we have a
> >> pending truncate we set BTRFS_INODE_READDIO_NEED_LOCK which in turn
> >> forces the dio read case to fallback to inode_locking to prevent
> >> read/truncate races. Unfortunately this is subtly broken for at least
> >> 2 reasons:
> >>
> >> 1. inode_dio_begin in btrfs_direct_IO is called outside of inode_lock
> >> (for the read case). This means that there is no ordering guarantee
> >> between the invocation of inode_dio_wait and the increment of
> >> i_dio_count in btrfs_direct_IO in the tread case.
> >
> > Also, looking at this changelog, the diff and the code, why is it a
> > problem not calling inode_dio_begin without the inode lock in the dio
> > read path?
> > The truncate path calls inode_dio_wait after setting the bit
> > BTRFS_INODE_READDIO_NEED_LOCK and before clearing it.
> > Assuming the functions to set and clear that bit are correct, I don't
> > see what problem this brings.
>
> Assume you have a truncate and a dio READ in parallel. So the following
> execution is possible:
>
> T1: T2:
> btrfs_setattr
> set_bit(BTRFS_INODE_READDIO_NEED_LOCK)
> inode_dio_wait (reads i_dio_count) btrfs_direct_IO
> clear_bit(BTRFS_INODE_READDIO_NEED_LOCK) inode_dio_begin (inc's i_dio_count)
>
> Since we have no ordering between beginning a dio and waiting for it then
> truncate can assume there isn't any pending dio. At the same time
> btrfs_direct_IO will increment i_dio_count but won't see BTRFS_INODE_READDIO_NEED_LOCK
> ever being set and so will proceed servicing the read.
>
->setattr here has truncated the inode's isize and
do_blockdev_direct_IO() then checks if the offset to read from is
within the isize, if true, dio read is fine with this truncate, if
not, it returns.
Apart from this, do you have other concerns?
thanks,
-liubo
prev parent reply other threads:[~2018-02-21 18:20 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-21 11:41 [PATCH] btrfs: Fix locking during DIO read Nikolay Borisov
2018-02-21 13:06 ` Filipe Manana
2018-02-21 13:10 ` Nikolay Borisov
2018-02-21 13:27 ` Filipe Manana
2018-02-21 13:51 ` Filipe Manana
2018-02-21 14:15 ` Nikolay Borisov
2018-02-21 14:42 ` Filipe Manana
2018-02-21 18:28 ` Liu Bo
2018-02-21 18:38 ` Nikolay Borisov
2018-02-21 19:05 ` Filipe Manana
2018-02-21 22:38 ` Liu Bo
2018-02-22 6:49 ` Nikolay Borisov
2018-02-22 19:09 ` Liu Bo
2018-02-22 19:24 ` Liu Bo
2018-02-22 23:39 ` David Sterba
2018-02-23 6:36 ` Nikolay Borisov
2018-02-22 10:05 ` Filipe Manana
2018-02-21 18:14 ` Liu Bo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180221181412.GA9910@lim.localdomain \
--to=bo.li.liu@oracle.com \
--cc=fdmanana@gmail.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=nborisov@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).