public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: jack@suse.cz, xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: don't invalidate whole file on DAX read/write
Date: Wed, 3 Aug 2016 17:34:37 +0200	[thread overview]
Message-ID: <20160803153437.GC4576@quack2.suse.cz> (raw)
In-Reply-To: <1470181226-20935-1-git-send-email-david@fromorbit.com>

On Wed 03-08-16 09:40:26, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When we do DAX IO, we try to invalidate the entire page cache held
> on the file. This is incorrect as it will trash the entire mapping
> tree that now tracks dirty state in exceptional entries in the radix
> tree slots.
> 
> What we are trying to do is remove cached pages (e.g from reads
> into holes) that sit in the radix tree over the range we are about
> to write to. Hence we should just limit the invalidation to the
> range we are about to overwrite.

The patch looks good. Just one comment below.

> 
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_file.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index ed95e5b..e612a02 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -741,9 +741,20 @@ xfs_file_dax_write(
>  	 * page is inserted into the pagecache when we have to serve a write
>  	 * fault on a hole.  It should never be dirtied and can simply be
>  	 * dropped from the pagecache once we get real data for the page.
> +	 *
> +	 * XXX: This is racy against mmap, and there's nothing we can do about
> +	 * it. dax_do_io() should really do this invalidation internally as
> +	 * it will know if we've allocated over a holei for this specific IO and
> +	 * if so it needs to update the mapping tree and invalidate existing
> +	 * PTEs over the newly allocated range. Remove this invalidation when
> +	 * dax_do_io() is fixed up.

And would it be OK for XFS if dax_do_io() actually invalidated page cache /
PTEs under just XFS_IOLOCK_SHARED? Because currently you seem to be careful
to call invalidate_inode_pages2() only when holding the lock exclusively
and then demote it to a shared one when calling dax_do_io().

								Honza

>  	 */
>  	if (mapping->nrpages) {
> -		ret = invalidate_inode_pages2(mapping);
> +		loff_t end = iocb->ki_pos + iov_iter_count(from) - 1;
> +
> +		ret = invalidate_inode_pages2_range(mapping,
> +						    iocb->ki_pos >> PAGE_SHIFT,
> +						    end >> PAGE_SHIFT);
>  		WARN_ON_ONCE(ret);
>  	}
>  
> -- 
> 2.8.0.rc3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2016-08-03 15:34 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-02 23:40 [PATCH] xfs: don't invalidate whole file on DAX read/write Dave Chinner
2016-08-03  9:25 ` Christoph Hellwig
2016-08-03 15:34 ` Jan Kara [this message]
2016-08-03 22:32   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160803153437.GC4576@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox