From: Gu Zheng <guz.fnst@cn.fujitsu.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>,
viro@zeniv.linux.org.uk, jmoyer@redhat.com,
kosaki.motohiro@gmail.com, kosaki.motohiro@jp.fujitsu.com,
isimatu.yasuaki@jp.fujitsu.com, linux-fsdevel@vger.kernel.org,
linux-aio@kvack.org, linux-kernel@vger.kernel.org,
miaox@cn.fujitsu.com
Subject: Re: [RESEND v2 PATCH 1/2] aio, memory-hotplug: Fix confliction when migrating and accessing ring pages.
Date: Fri, 14 Mar 2014 18:25:16 +0800
Message-ID: <5322D90C.5050207@cn.fujitsu.com>
In-Reply-To: <20140312221735.GF32444@kvack.org>
Hi Ben,
On 03/13/2014 06:17 AM, Benjamin LaHaise wrote:
> Hello Tang,
>
> On Wed, Mar 12, 2014 at 01:25:26PM +0800, Tang Chen wrote:
> ... <snip> ...
>
>>> Another spot is in
>>> aio_read_events_ring() where head and tail are fetched from the ring
>>> without
>>> any locking. I also fear we'll be introducing new performance issues with
>>> all the additional spinlock bouncing, despite the fact that is only ever
>>> needed for migration. I'm going to continue looking into this today and
>>> will try to send out a followup to this email later.
>>
>> In the beginning of aio_read_events_ring(), it reads head and tail, not
>> write.
>> So even if ring pages are migrated, the contents of the pages will not
>> be changed.
>> So reading it is OK, from old page or from the new page, I think.
>
> Your assumption that reading it is okay is incorrect. Since we do not have
> a reference on the page at that point, it is possible that the read of the
> page takes place after the page has been freed and allocated to another part
> of the kernel. This would result in the read returning invalid information.
What about the following patch? It takes an additional reference on the page
so that the page cannot be freed while we are reading it.
PS: It applies on top of linux-next (3-13).
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
fs/aio.c | 16 ++++++++++++----
1 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 4133ba9..a4f3a4f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -283,7 +283,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 {
 	struct kioctx *ctx;
 	unsigned long flags;
-	int rc;
+	int rc, extra_count;
 	rc = 0;
@@ -311,7 +311,10 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	BUG_ON(PageWriteback(old));
 	get_page(new);
-	rc = migrate_page_move_mapping(mapping, new, old, NULL, mode, 1);
+	extra_count = page_count(old) - page_has_private(old) - 2;
+
+	rc = migrate_page_move_mapping(mapping, new, old,
+				       NULL, mode, extra_count);
 	if (rc != MIGRATEPAGE_SUCCESS) {
 		put_page(new);
 		return rc;
@@ -1047,13 +1050,17 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	unsigned head, tail, pos;
 	long ret = 0;
 	int copy_ret;
+	struct page *page;
 	mutex_lock(&ctx->ring_lock);
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	page = ctx->ring_pages[0];
+	get_page(page);
+	ring = kmap_atomic(page);
 	head = ring->head;
 	tail = ring->tail;
 	kunmap_atomic(ring);
+	put_page(page);
 	pr_debug("h%u t%u m%u\n", head, tail, ctx->nr_events);
@@ -1063,7 +1070,6 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	while (ret < nr) {
 		long avail;
 		struct io_event *ev;
-		struct page *page;
 		avail = (head <= tail ? tail : ctx->nr_events) - head;
 		if (head == tail)
@@ -1075,6 +1081,7 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		pos = head + AIO_EVENTS_OFFSET;
 		page = ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE];
+		get_page(page);
 		pos %= AIO_EVENTS_PER_PAGE;
 		/*
@@ -1087,6 +1094,7 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		copy_ret = copy_to_user(event + ret, ev + pos,
 					sizeof(*ev) * avail);
 		kunmap(page);
+		put_page(page);
 		if (unlikely(copy_ret)) {
 			ret = -EFAULT;
--
1.7.7
>
> -ben