From: Andrea Arcangeli <aarcange@redhat.com>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Izik Eidus <ieidus@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kvm@vger.kernel.org, chrisw@redhat.com, avi@redhat.com,
izike@qumranet.com
Subject: Re: [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another
Date: Wed, 12 Nov 2008 03:27:01 +0100
Message-ID: <20081112022701.GT10818@random.random>
In-Reply-To: <Pine.LNX.4.64.0811111823030.31625@quilx.com>
On Tue, Nov 11, 2008 at 06:27:09PM -0600, Christoph Lameter wrote:
> Then page migration will not occur because there is an unresolved
> reference.
So are you checking if there's an unresolved reference only in the
very place I just quoted in the previous email? If the answer is yes:
what should prevent get_user_pages from running in parallel from
another thread? get_user_pages will trigger a minor fault and get the
elevated reference just after you read page_count. To you it looks
like there is no o_direct in progress when you proceed to the core of
the migration code, but in effect o_direct just started a moment after
you read the page count.
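
As an illustration only (this is not code from the thread and not kernel
code), the window described above can be replayed in a toy userspace
model. The names toy_page, toy_no_extra_ref, toy_gup and toy_replay_race
are hypothetical; "count" merely stands in for page_count() and toy_gup
for the elevated reference taken by get_user_pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: "count" stands in for page_count(), "expected" for the
 * references the migration code already knows about. */
struct toy_page {
    int count;
    int expected;
};

/* The unlocked check: true if no unexplained reference is visible right now. */
bool toy_no_extra_ref(const struct toy_page *p)
{
    return p->count == p->expected;
}

/* Stands in for get_user_pages() taking its elevated reference for o_direct. */
void toy_gup(struct toy_page *p)
{
    p->count++;
}

/* Replays the bad interleaving in program order. Returns true if the
 * migration side would proceed although the page is pinned by then. */
bool toy_replay_race(void)
{
    struct toy_page page = { .count = 1, .expected = 1 };

    /* Migration thread: reads the count, sees nothing unusual. */
    bool check_passed = toy_no_extra_ref(&page);

    /* Other thread: pins the page a moment after the check. */
    toy_gup(&page);
    assert(page.count == 2);

    /* The earlier check result is stale: the page is pinned now. */
    return check_passed && !toy_no_extra_ref(&page);
}
```

The point of the sketch is only that a check performed without any lock
says nothing about the state one instruction later.
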
What can protect you is the PG lock or mmap_sem in _write_ mode (and
they have to be held for the whole duration of the migration). I don't
see either of the two being held while you read the page count... You
don't seem to be using stop_machine either (stop_machine is pretty
expensive on the 4096-way, I guess).
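
A minimal sketch of why holding a lock for the whole duration helps,
again as a toy userspace model and not kernel code. The "locked" flag is
a stand-in for the PG lock or mmap_sem held in write mode, and it assumes
the pinning side has to take the same lock; all names here are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: "locked" stands in for the PG lock or for mmap_sem
 * held in write mode. */
struct toy_locked_page {
    int count;
    int expected;
    bool locked;
};

bool toy_trylock(struct toy_locked_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

void toy_unlock(struct toy_locked_page *p)
{
    p->locked = false;
}

/* A pinning path that respects the lock: returns false (it would have to
 * wait) while somebody else holds it. */
bool toy_gup_locked(struct toy_locked_page *p)
{
    if (!toy_trylock(p))
        return false;
    p->count++;
    toy_unlock(p);
    return true;
}

/* Returns true if the count stays stable from the check until the end of
 * the (simulated) migration because the lock is held throughout. */
bool toy_replay_with_lock(void)
{
    struct toy_locked_page page = { .count = 1, .expected = 1, .locked = false };

    bool got_lock = toy_trylock(&page);
    assert(got_lock);

    bool check_passed = (page.count == page.expected);
    bool gup_got_in = toy_gup_locked(&page);   /* attempted mid-migration */
    bool still_unpinned = (page.count == page.expected);

    toy_unlock(&page);   /* released only after the migration is over */

    return check_passed && !gup_got_in && still_unpinned;
}
```
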
This wasn't reproduced in practice, but it should be possible to
reproduce it by just writing a testcase with three threads: one forks
in a loop (the child just quits); the second, in a loop, memsets the
first 512 bytes of a page to 0, then does an o_direct read from a
512-byte region of 0xff into them and checks that the first 512 bytes
are all non-zero; and the third writes 1 byte to the last 512 bytes of
the page in a loop. Eventually the comparison should show zero data in
the page.
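
The per-iteration bodies of the three loops above could look roughly
like the following. This is an untested sketch, not a verified
reproducer: it assumes a 4096-byte page, a page-aligned buffer, and a
file descriptor that the caller has opened with O_RDONLY | O_DIRECT on a
file whose first 512 bytes are 0xff. The function names are hypothetical,
and each body would be driven in a loop from its own thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SECTOR     512
#define PAGE_BYTES 4096   /* assumed page size */

/* Returns 1 if any byte in buf[0..len) is zero, 0 otherwise. */
int has_zero_byte(const unsigned char *buf, size_t len)
{
    return memchr(buf, 0, len) != NULL;
}

/* Thread 1 body: fork, the child quits at once, the parent reaps it. */
int fork_once(void)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(0);
    if (pid < 0)
        return -1;
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}

/* Thread 2 body: zero the first sector of the page, read 512 bytes of
 * 0xff over it from fd, then look for zero bytes.
 * Returns 1 if zero data showed up, 0 if not, -1 on I/O error. */
int read_and_check_once(int fd, unsigned char *page)
{
    /* o_direct needs a suitably aligned buffer. */
    assert(((uintptr_t)page % SECTOR) == 0);

    memset(page, 0, SECTOR);
    if (pread(fd, page, SECTOR, 0) != SECTOR)
        return -1;
    return has_zero_byte(page, SECTOR);
}

/* Thread 3 body: write one byte in the last 512 bytes of the same page. */
void touch_last_sector(volatile unsigned char *page)
{
    page[PAGE_BYTES - 1]++;
}
```
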
To reproduce it with migration, just start the thread that, in a loop,
memsets to 0, reads a 0xff region with o_direct, and checks it's all
0xff, and then migrate the memory of this thread back and forth between
two nodes with sys_move_pages (mpol is safe by luck because it
surrounds migrate_pages with the mmap_sem in write mode). Eventually
you should see zero bytes even though the I/O is complete.
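
The migrating side of that test might be sketched as below. It is an
untested illustration that calls move_pages through syscall(2) so it
does not depend on libnuma; the helper names, the node numbers passed in
by the caller, and the locally defined flag constant are assumptions of
this sketch.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same value as MPOL_MF_MOVE in <linux/mempolicy.h>; redefined here so
 * the sketch does not depend on that header being installed. */
#define SKETCH_MPOL_MF_MOVE (1 << 1)

/* Given the two nodes to bounce between, return the one we are not on. */
int other_node(int current, int node_a, int node_b)
{
    assert(current == node_a || current == node_b);
    return current == node_a ? node_b : node_a;
}

/* Ask the kernel to move one page of the calling process to target_node.
 * Returns the raw syscall result; *status receives the per-page result
 * (a node number, or a negative errno value). */
long move_one_page(void *page, int target_node, int *status)
{
    void *pages[1] = { page };
    int nodes[1] = { target_node };

    return syscall(SYS_move_pages, 0 /* pid 0 = calling process */, 1UL,
                   pages, nodes, status, SKETCH_MPOL_MF_MOVE);
}

/* Bounce the page between node_a and node_b for the given number of
 * rounds. Returns how many of the move attempts did not succeed. */
int bounce_page(void *page, int node_a, int node_b, int rounds)
{
    int node = node_a;
    int failures = 0;
    int status = 0;

    for (int i = 0; i < rounds; i++) {
        node = other_node(node, node_a, node_b);
        if (move_one_page(page, node, &status) != 0 || status < 0)
            failures++;
    }
    return failures;
}
```

bounce_page would run in one thread on the buffer that the o_direct
reader thread keeps zeroing, reading into and checking.
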
Reproducing this in normal life would take time, and for the fork bug
it may not be reproducible depending on what the app is doing. Mixing
sys_move_pages with o_direct in the same process on two different
threads, instead, should eventually reproduce it. And with gup_fast it
is now unfixable until more infrastructure is added to slow down
gup_fast a bit (unless Nick finds an RCU way of doing it).