From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Dave Chinner <david@fromorbit.com>,
	Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	Matthew Wilcox <willy@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-nvdimm@ml01.01.org, x86@kernel.org,
	xfs@oss.sgi.com, Dan Williams <dan.j.williams@intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v5 4/7] dax: add support for fsync/sync
Date: Tue, 22 Dec 2015 16:51:23 -0700
Message-ID: <20151222235123.GA24124@linux.intel.com>
In-Reply-To: <20151222144625.f400e12e362cf9b00f6ffb36@linux-foundation.org>

On Tue, Dec 22, 2015 at 02:46:25PM -0800, Andrew Morton wrote:
> On Fri, 18 Dec 2015 22:22:17 -0700 Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> 
> > To properly handle fsync/msync in an efficient way DAX needs to track dirty
> > pages so it is able to flush them durably to media on demand.
> > 
> > The tracking of dirty pages is done via the radix tree in struct
> > address_space.  This radix tree is already used by the page writeback
> > infrastructure for tracking dirty pages associated with an open file, and
> > it already has support for exceptional (non struct page*) entries.  We
> > build upon these features to add exceptional entries to the radix tree for
> > DAX dirty PMD or PTE pages at fault time.
> 
> I'm getting a few rejects here against other pending changes.  Things
> look OK to me but please do runtime test the end result as it resides
> in linux-next.  Which will be next year.

Sounds good.  I'm hoping to soon send out an updated version of this series
which merges with Dan's changes to dax.c.  Thank you for pulling these into
-mm.

> --- a/fs/dax.c~dax-add-support-for-fsync-sync-fix
> +++ a/fs/dax.c
> @@ -383,10 +383,8 @@ static void dax_writeback_one(struct add
>  	struct radix_tree_node *node;
>  	void **slot;
>  
> -	if (type != RADIX_DAX_PTE && type != RADIX_DAX_PMD) {
> -		WARN_ON_ONCE(1);
> +	if (WARN_ON_ONCE(type != RADIX_DAX_PTE && type != RADIX_DAX_PMD))
>  		return;
> -	}

This is much cleaner, thanks.  I'll make this change throughout my set.
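
For readers unfamiliar with the idiom: the condensed form works because
the kernel's WARN_ON_ONCE() evaluates to the truth value of its condition,
so it can sit directly inside an if().  Below is a minimal userspace
sketch of that behaviour.  MOCK_WARN_ON_ONCE, the MOCK_DAX_* constants,
check_type() and check_type_selftest() are hypothetical names made up for
this illustration, not kernel code, and the macro relies on GNU C
statement expressions (gcc/clang).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace stand-in for WARN_ON_ONCE(): prints a warning at most once
 * per call site and evaluates to the truth value of the condition.
 */
#define MOCK_WARN_ON_ONCE(cond) ({					\
	static bool warned_;						\
	bool ret_ = !!(cond);						\
	if (ret_ && !warned_) {						\
		warned_ = true;						\
		fprintf(stderr, "WARNING at %s:%d\n",			\
			__FILE__, __LINE__);				\
	}								\
	ret_;								\
})

/* Hypothetical entry types, standing in for RADIX_DAX_PTE/PMD. */
enum { MOCK_DAX_PTE = 1, MOCK_DAX_PMD = 2 };

/* Returns 0 if the type is accepted, -1 if it is rejected. */
static int check_type(int type)
{
	if (MOCK_WARN_ON_ONCE(type != MOCK_DAX_PTE && type != MOCK_DAX_PMD))
		return -1;
	return 0;
}

/* Exercises both paths; the bad type is rejected on every call. */
static void check_type_selftest(void)
{
	assert(check_type(MOCK_DAX_PTE) == 0);
	assert(check_type(MOCK_DAX_PMD) == 0);
	assert(check_type(7) == -1);
	assert(check_type(7) == -1);
}
```

The early return still fires on every bad call; only the warning itself is
limited to once per call site.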

> > +/*
> > + * Flush the mapping to the persistent domain within the byte range of [start,
> > + * end]. This is required by data integrity operations to ensure file data is
> > + * on persistent storage prior to completion of the operation.
> > + */
> > +void dax_writeback_mapping_range(struct address_space *mapping, loff_t start,
> > +		loff_t end)
> > +{
> > +	struct inode *inode = mapping->host;
> > +	pgoff_t indices[PAGEVEC_SIZE];
> > +	pgoff_t start_page, end_page;
> > +	struct pagevec pvec;
> > +	void *entry;
> > +	int i;
> > +
> > +	if (inode->i_blkbits != PAGE_SHIFT) {
> > +		WARN_ON_ONCE(1);
> > +		return;
> > +	}
> 
> again
> 
> > +	rcu_read_lock();
> > +	entry = radix_tree_lookup(&mapping->page_tree, start & PMD_MASK);
> > +	rcu_read_unlock();
> 
> What stabilizes the memory at *entry after rcu_read_unlock()?

Nothing in this function.  We use the entry that is currently in the tree to
know whether or not to expand the range of offsets that we need to flush.
Even if we are racing with someone, expanding our flushing range is
non-destructive.
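
As an illustration of why the expansion is harmless, here is a small
userspace sketch of the arithmetic.  It assumes 4 KiB pages and 2 MiB
PMDs, and it works in page indices rather than byte offsets; both are
assumptions of this sketch, as are all of the SK_* names and the two
helper functions.  The only effect of the expansion is to move the start
of the range earlier, never later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 2 MiB PMDs (512 pages per PMD). */
#define SK_PAGE_SHIFT		12
#define SK_PMD_SHIFT		21
#define SK_PAGES_PER_PMD	(1UL << (SK_PMD_SHIFT - SK_PAGE_SHIFT))

/* Round a page index down to the first page of its PMD. */
static uint64_t pmd_start_index(uint64_t page_index)
{
	return page_index & ~(uint64_t)(SK_PAGES_PER_PMD - 1);
}

/*
 * Compute the first page index to flush.  If the start of the range is
 * believed to be covered by a PMD entry, begin at that PMD's first page.
 * A stale answer for covered_by_pmd costs at most some extra flushing.
 */
static uint64_t flush_start(uint64_t start_byte, bool covered_by_pmd)
{
	uint64_t start_page = start_byte >> SK_PAGE_SHIFT;
	uint64_t result = covered_by_pmd ? pmd_start_index(start_page)
					 : start_page;

	/* Expansion can only move the start earlier. */
	assert(result <= start_page);
	return result;
}
```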

We get a list of entries based on what is dirty later in this function via
find_get_entries_tag(), and before we take any action on those entries we
re-verify them while holding the tree_lock in dax_writeback_one().
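
The collect-then-re-verify pattern can be sketched in userspace roughly as
follows.  This is not the code from the patch: struct mock_tree,
collect_dirty() and writeback_one() are hypothetical stand-ins, a pthread
mutex plays the role of tree_lock, and a fixed array plays the role of the
radix tree.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical stand-in for the radix tree plus its tree_lock. */
struct mock_tree {
	pthread_mutex_t lock;
	void *slot[NSLOTS];
	bool dirty[NSLOTS];
	int flushed;		/* counts simulated flushes */
};

/*
 * Step 1: snapshot the dirty entries, then drop the lock.  The snapshot
 * may be stale by the time the caller acts on it.
 */
static size_t collect_dirty(struct mock_tree *t, size_t *idx, void **ent)
{
	size_t n = 0;

	pthread_mutex_lock(&t->lock);
	for (size_t i = 0; i < NSLOTS; i++) {
		if (t->dirty[i]) {
			idx[n] = i;
			ent[n] = t->slot[i];
			n++;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return n;
}

/*
 * Step 2: re-verify one snapshot entry under the lock before acting on
 * it.  Returns true if the entry was still current and was flushed.
 */
static bool writeback_one(struct mock_tree *t, size_t idx, void *ent)
{
	bool ok;

	assert(idx < NSLOTS);
	pthread_mutex_lock(&t->lock);
	ok = t->slot[idx] == ent && t->dirty[idx];
	if (ok)
		t->dirty[idx] = false;	/* claim the entry */
	pthread_mutex_unlock(&t->lock);

	/* Stands in for the real flush; the counter is not thread-safe. */
	if (ok)
		t->flushed++;
	return ok;
}
```

If a slot is replaced between the two steps, writeback_one() notices the
mismatch and skips it, and the still-dirty replacement is picked up by the
next collect_dirty() pass.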

The next version of this series will have an updated version of this code
that also accounts for block device removal by calling dax_map_atomic()
inside dax_writeback_one().
