From: Jan Kara <jack@suse.cz>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>, Ted Tso <tytso@mit.edu>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH 03/11] ext4: Convert DAX reads to iomap infrastructure
Date: Fri, 11 Nov 2016 11:17:51 +0100
Message-ID: <20161111101750.GD2730@quack2.suse.cz>
In-Reply-To: <20161110215431.GC27200@linux.intel.com>

On Thu 10-11-16 14:54:31, Ross Zwisler wrote:
> On Tue, Nov 08, 2016 at 12:08:09PM +0100, Jan Kara wrote:
> > Implement basic iomap_begin function that handles reading and use it for
> > DAX reads.
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext4/ext4.h  |  2 ++
> >  fs/ext4/file.c  | 38 +++++++++++++++++++++++++++++++++++++-
> >  fs/ext4/inode.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 93 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> > index 282a51b07c57..098b39910001 100644
> > --- a/fs/ext4/ext4.h
> > +++ b/fs/ext4/ext4.h
> > @@ -3271,6 +3271,8 @@ static inline bool ext4_aligned_io(struct inode *inode, loff_t off, loff_t len)
> >  	return IS_ALIGNED(off, blksize) && IS_ALIGNED(len, blksize);
> >  }
> >  
> > +extern struct iomap_ops ext4_iomap_ops;
> > +
> >  #endif	/* __KERNEL__ */
> >  
> >  #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
> > diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> > index 9facb4dc5c70..1f25c644cb12 100644
> > --- a/fs/ext4/file.c
> > +++ b/fs/ext4/file.c
> > @@ -31,6 +31,42 @@
> >  #include "xattr.h"
> >  #include "acl.h"
> >  
> > +#ifdef CONFIG_FS_DAX
> > +static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to)
> > +{
> > +	struct inode *inode = file_inode(iocb->ki_filp);
> > +	ssize_t ret;
> > +
> > +	inode_lock_shared(inode);
> > +	/*
> > +	 * Recheck under inode lock - at this point we are sure it cannot
> > +	 * change anymore
> > +	 */
> > +	if (!IS_DAX(inode)) {
> > +		inode_unlock_shared(inode);
> > +		/* Fallback to buffered IO in case we cannot support DAX */
> > +		return generic_file_read_iter(iocb, to);
> 
> Is this not also racy, since we've just dropped the inode lock?  What's to
> prevent this sequence?
> 
> Thread 0				Thread 1
> --------				--------
> ext4_file_read_iter()
>   IS_DAX() returns true
>   					changes S_DAX to false
>   ext4_dax_read_iter()
>     inode_lock_shared()
>     IS_DAX() returns false
>     inode_unlock_shared()
>   					changes S_DAX to true
>     generic_file_read_iter() on a DAX inode
> 
> 
> Or are we okay in this scenario?

Yup, I'm aware of this. The real problem is that there is no way to
serialize against buffered reads in ext4 (they take only page locks), so
currently buffered reads can be in flight when the inode gets switched to
DAX mode. I agree there is potential for breakage and it needs to be
resolved eventually, but the problem is not new and these patches don't
really make it any worse. So I just partially fixed it up in patch 2/11 and
left the full solution to a separate patch set.
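
To make the window concrete, here is a minimal userspace sketch that replays
the interleaving from the diagram above in a single thread. It is not kernel
code: struct fake_inode, try_set_dax() and model_dax_read() are hypothetical
stand-ins, and the "lock" is just a holder counter that blocks flag changes
while a shared holder exists. It only illustrates that the recheck is stable
while the lock is held and stops being stable once the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode: a shared-lock counter plus S_DAX. */
struct fake_inode {
	int shared_holders;	/* models inode_lock_shared() holders */
	bool dax;		/* models the S_DAX flag */
};

static void lock_shared(struct fake_inode *inode)
{
	inode->shared_holders++;
}

static void unlock_shared(struct fake_inode *inode)
{
	assert(inode->shared_holders > 0);
	inode->shared_holders--;
}

/* Models Thread 1: the flag can change only when nobody holds the lock. */
static bool try_set_dax(struct fake_inode *inode, bool value)
{
	if (inode->shared_holders > 0)
		return false;
	inode->dax = value;
	return true;
}

enum read_path { PATH_DAX, PATH_BUFFERED };

struct read_result {
	enum read_path path;		/* path the reader chose */
	bool dax_at_io_time;		/* flag value when the "I/O" ran */
	bool flip_blocked_under_lock;	/* was Thread 1 held off by the lock? */
};

/* Models the recheck-then-fallback shape of the quoted read function. */
static struct read_result model_dax_read(struct fake_inode *inode)
{
	struct read_result res = { PATH_DAX, false, false };

	lock_shared(inode);
	/* Thread 1 tries to flip the flag here and cannot. */
	res.flip_blocked_under_lock = !try_set_dax(inode, true);
	if (inode->dax) {
		res.dax_at_io_time = inode->dax;
		unlock_shared(inode);
		return res;
	}
	unlock_shared(inode);

	/* Window: the lock is gone, so Thread 1 can now set the flag. */
	try_set_dax(inode, true);

	res.path = PATH_BUFFERED;
	res.dax_at_io_time = inode->dax;
	return res;
}

/* Starts after Thread 1 has already cleared the flag, as in the diagram. */
static struct read_result replay_interleaving(void)
{
	struct fake_inode inode = { 0, false };

	return model_dax_read(&inode);
}
```

Calling replay_interleaving() in this model yields a reader that took the
buffered path while the flag was already set again, which is the case the
diagram describes. Closing the window would need the fallback read itself
to be serialized against the flag change, which is the part left to a
separate patch set.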

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
