[V2 PATCH 1/2] aio: clean up aio_migratepage() and related code much

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Gu Zheng <guz.fnst@cn.fujitsu.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>,
	Dave Jones <davej@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>,
	jmoyer@redhat.com, kosaki.motohiro@jp.fujitsu.com,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	miaox@cn.fujitsu.com, linux-aio@kvack.org,
	fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [V2 PATCH 1/2] aio: clean up aio_migratepage() and related code much
Date: Mon, 24 Mar 2014 18:59:30 +0800	[thread overview]
Message-ID: <53301012.7040306@cn.fujitsu.com> (raw)
In-Reply-To: <20140321183509.GC23173@kvack.org>

As the page migration framework holds lock_page() to protect the pages
(both old and new) while migrating, so while the page migrating, both
of old page and new page are locked. And the aio context teardown
routine will call *truncate*(in put_aio_ring_file()) to truncate
pagecache which needs to acquire page_lock() for each page one by one.
So there is a native mutual exclusion between *migrate page* v.s. truncate().

If put_aio_ring_file() is called at first of the context teardown flow
(aio_free_ring). Then, page migration and ctx freeing will have mutual
execution guarded by lock_page() v.s. truncate(). Once a page is removed
from radix-tree, it will not be migrated. On the other hand, the context
can not be freed while the page migraiton are ongoing.

aio_free_ring
-put_aio_ring_file()		  |
 |-truncate_setsize()             |migrate_pages()
 | |-truncate_inode_pages_range() | |-__unmap_and_move()
 |  |-trylock_page(page)          |  |-lock_page(old)
 |-i_mapping->private_data = NULL |  ...
 |-ctx->aio_ring_file=NULL        |   |-move_to_new_page()
 |                                |    |-trylock_page(newpage)
			          |     |-aio_migratepage()
So that, the additional spinlock(address_space's private_lock) used to
protect use and updates of the mapping's private_data, and the sane check
of context is needless, so remove it here.

Besides, we also move filling ring_pages[] array and ctx->nr_pages into the
page_lock protection in aio_setup_ring to keep the semantic unanimous.

Moreover, after migrate_page_move_mapping() success, page migration should
never fail, so here we merge the flow in one routine.

Thanks KAMEZAWA Hiroyuki very much for giving directions on this.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
v2: Rebased on latest linus' tree.
---
 fs/aio.c |   79 ++++++++++++++++++--------------------------------------------
 1 files changed, 23 insertions(+), 56 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 062a5f6..6453c12 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -229,11 +229,8 @@ static void put_aio_ring_file(struct kioctx *ctx)
 	if (aio_ring_file) {
 		truncate_setsize(aio_ring_file->f_inode, 0);
 
-		/* Prevent further access to the kioctx from migratepages */
-		spin_lock(&aio_ring_file->f_inode->i_mapping->private_lock);
 		aio_ring_file->f_inode->i_mapping->private_data = NULL;
 		ctx->aio_ring_file = NULL;
-		spin_unlock(&aio_ring_file->f_inode->i_mapping->private_lock);
 
 		fput(aio_ring_file);
 	}
@@ -243,6 +240,8 @@ static void aio_free_ring(struct kioctx *ctx)
 {
 	int i;
 
+	put_aio_ring_file(ctx);
+
 	for (i = 0; i < ctx->nr_pages; i++) {
 		struct page *page;
 		pr_debug("pid(%d) [%d] page->count=%d\n", current->pid, i,
@@ -254,8 +253,6 @@ static void aio_free_ring(struct kioctx *ctx)
 		put_page(page);
 	}
 
-	put_aio_ring_file(ctx);
-
 	if (ctx->ring_pages && ctx->ring_pages != ctx->internal_pages) {
 		kfree(ctx->ring_pages);
 		ctx->ring_pages = NULL;
@@ -283,32 +280,22 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 {
 	struct kioctx *ctx;
 	unsigned long flags;
-	int rc;
-
-	rc = 0;
+	pgoff_t index;
+	int rc = MIGRATEPAGE_SUCCESS;
 
-	/* Make sure the old page hasn't already been changed */
-	spin_lock(&mapping->private_lock);
+	/*
+	 * Because old page is *locked*, if we see mapping, the page isn't
+	 * truncated, andmapping , mapping->private_data etc...are all valid.
+	 */
 	ctx = mapping->private_data;
-	if (ctx) {
-		pgoff_t idx;
-		spin_lock_irqsave(&ctx->completion_lock, flags);
-		idx = old->index;
-		if (idx < (pgoff_t)ctx->nr_pages) {
-			if (ctx->ring_pages[idx] != old)
-				rc = -EAGAIN;
-		} else
-			rc = -EINVAL;
-		spin_unlock_irqrestore(&ctx->completion_lock, flags);
-	} else
-		rc = -EINVAL;
-	spin_unlock(&mapping->private_lock);
 
-	if (rc != 0)
-		return rc;
+	index = old->index;
+	BUG_ON(index >= (pgoff_t)ctx->nr_pages);
+	VM_BUG_ON(ctx->ring_pages[index] != old);
 
 	/* Writeback must be complete */
 	BUG_ON(PageWriteback(old));
+	/* Extra ref cnt for rind_pages[] array */
 	get_page(new);
 
 	rc = migrate_page_move_mapping(mapping, new, old, NULL, mode, 1);
@@ -317,34 +304,13 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 		return rc;
 	}
 
-	/* We can potentially race against kioctx teardown here.  Use the
-	 * address_space's private data lock to protect the mapping's
-	 * private_data.
-	 */
-	spin_lock(&mapping->private_lock);
-	ctx = mapping->private_data;
-	if (ctx) {
-		pgoff_t idx;
-		spin_lock_irqsave(&ctx->completion_lock, flags);
-		migrate_page_copy(new, old);
-		idx = old->index;
-		if (idx < (pgoff_t)ctx->nr_pages) {
-			/* And only do the move if things haven't changed */
-			if (ctx->ring_pages[idx] == old)
-				ctx->ring_pages[idx] = new;
-			else
-				rc = -EAGAIN;
-		} else
-			rc = -EINVAL;
-		spin_unlock_irqrestore(&ctx->completion_lock, flags);
-	} else
-		rc = -EBUSY;
-	spin_unlock(&mapping->private_lock);
+	spin_lock_irqsave(&ctx->completion_lock, flags);
+	/* Migration should not fail if migrate_page_move_mapping SUCCESS */
+	migrate_page_copy(new, old);
+	ctx->ring_pages[old->index] = new;
+	spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
-	if (rc == MIGRATEPAGE_SUCCESS)
-		put_page(old);
-	else
-		put_page(new);
+	put_page(old);
 
 	return rc;
 }
@@ -397,6 +363,8 @@ static int aio_setup_ring(struct kioctx *ctx)
 		}
 	}
 
+	ctx->nr_pages = 0;
+
 	for (i = 0; i < nr_pages; i++) {
 		struct page *page;
 		page = find_or_create_page(file->f_inode->i_mapping,
@@ -407,13 +375,12 @@ static int aio_setup_ring(struct kioctx *ctx)
 			 current->pid, i, page_count(page));
 		SetPageUptodate(page);
 		SetPageDirty(page);
-		unlock_page(page);
-
 		ctx->ring_pages[i] = page;
+		ctx->nr_pages++;
+		unlock_page(page);
 	}
-	ctx->nr_pages = i;
 
-	if (unlikely(i != nr_pages)) {
+	if (unlikely(ctx->nr_pages != nr_pages)) {
 		aio_free_ring(ctx);
 		return -EAGAIN;
 	}
-- 
1.7.7

next prev parent reply	other threads:[~2014-03-24 11:08 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-20  5:46 [PATCH 2/2] aio: fix the confliction of read events and migrating ring page Gu Zheng
2014-03-20 14:32 ` Dave Jones
2014-03-20 16:30   ` Benjamin LaHaise
2014-03-21  1:56     ` Gu Zheng
2014-03-21 17:35       ` Benjamin LaHaise
2014-03-21 18:35       ` [PATCH] aio: ensure access to ctx->ring_pages is correctly serialised Benjamin LaHaise
2014-03-24 10:56         ` Gu Zheng
2014-03-24 10:59         ` Gu Zheng [this message]
2014-03-24 13:20           ` [V2 PATCH 1/2] aio: clean up aio_migratepage() and related code much Benjamin LaHaise
2014-03-25 10:11             ` Gu Zheng
2014-03-24 10:59         ` [V2 PATCH 2/2] aio: fix the confliction of aio read events and aio migrate page Gu Zheng
2014-03-24 18:22         ` [PATCH] aio: ensure access to ctx->ring_pages is correctly serialised Sasha Levin
2014-03-24 19:07           ` Benjamin LaHaise
2014-03-25 17:47             ` Sasha Levin
2014-03-25 18:57               ` Benjamin LaHaise

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:062a5f6 dfblob:6453c12 )
 OR (
bs:"[V2 PATCH 1/2] aio: clean up aio_migratepage() and related code much" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53301012.7040306@cn.fujitsu.com \
    --to=guz.fnst@cn.fujitsu.com \
    --cc=akpm@linux-foundation.org \
    --cc=bcrl@kvack.org \
    --cc=davej@redhat.com \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=jmoyer@redhat.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miaox@cn.fujitsu.com \
    --cc=tangchen@cn.fujitsu.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox