linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Izik Eidus <ieidus@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kvm@vger.kernel.org, chrisw@redhat.com, avi@redhat.com,
	izike@qumranet.com
Subject: Re: [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another
Date: Thu, 13 Nov 2008 03:31:35 +0100	[thread overview]
Message-ID: <20081113023135.GD10818@random.random> (raw)
In-Reply-To: <20081113020059.GC10818@random.random>

On Thu, Nov 13, 2008 at 03:00:59AM +0100, Andrea Arcangeli wrote:
> CPU0 migrate.c			CPU1 filemap.c
> -------				----------
> 				find_get_page
> 				radix_tree_lookup_slot returns the oldpage
> page_count still = expected_count
> freeze_ref (oldpage->count = 0)
> radix_tree_replace (too late, other side already got the oldpage)
> unfreeze_ref (oldpage->count = 2)
> 				page_cache_get_speculative(old_page)
> 				set count to 3 and succeeds

After reading more of this lockless radix tree code, I realized this
below check is the one that was intended to restart find_get_page and
prevent it to return the oldpage:

				    if (unlikely(page != *pagep)) {

But there's no barrier there, atomic_add_unless would need to provide
an atomic smp_mb() _after_ atomic_add_unless executed. In the old days
the atomic_* routines had no implied memory barriers, you had to use
smp_mb__after_atomic_add_unless if you wanted to avoid the race. I
don't see much in the ppc implementation of atomic_add_unless that
would provide an implicit smb_mb after the page_cache_get_speculative
returns, so I can't see why the pagep can't be by find_get_page read
before the other cpu executes radix_tree_replace in the above
timeline.

I guess you intended to put an smp_mb() in between the
page_cache_get_speculative and the *pagep to make the code safe on ppc
too, but there isn't, and I think it must be fixed, either that or I
don't understand ppc assembly right. The other side has a smp_wmb
implicit inside radix_tree_replace_slot so it should be ok already to
ensure we see the refcount going to 0 before we see the pagep changed
(the fact the other side has a memory barrier, further confirms this
side needs it too).

BTW, the radix_tree_deref_slot might miss a rcu_barrier_depends()
after radix_tree_deref_slot returns but I'm not entirely sure and only
alpha would be affected in the worst case.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-11-13  2:31 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-11 13:21 [PATCH 0/4] ksm - dynamic page sharing driver for linux Izik Eidus
2008-11-11 13:21 ` [PATCH 1/4] rmap: add page_wrprotect() function, Izik Eidus, Izik Eidus
2008-11-11 13:21   ` [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another Izik Eidus, Izik Eidus
2008-11-11 13:21     ` [PATCH 3/4] add ksm kernel shared memory driver Izik Eidus, Izik Eidus
2008-11-11 13:21       ` [PATCH 4/4] MMU_NOTIFIRES: add set_pte_at_notify() Izik Eidus, Izik Eidus
2008-11-11 20:38       ` [PATCH 3/4] add ksm kernel shared memory driver Andrew Morton
2008-11-11 22:03         ` Andrea Arcangeli
2008-11-11 22:03       ` Jonathan Corbet
2008-11-11 22:17         ` Izik Eidus
2008-11-11 22:25           ` Jonathan Corbet
2008-11-11 22:31             ` Izik Eidus
2008-11-11 22:30           ` Jonathan Corbet
2008-11-11 22:38             ` Izik Eidus
2008-11-11 23:02             ` Izik Eidus
2008-11-11 23:03             ` Andrea Arcangeli
2008-11-11 22:49           ` Avi Kivity
2008-11-11 22:40         ` Valdis.Kletnieks
2008-11-13  6:13           ` Eric Rannaud
2008-11-11 22:43         ` Avi Kivity
2008-11-11 19:45     ` [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another Andrew Morton
2008-11-11 20:57       ` Izik Eidus
2008-11-11 21:21         ` Christoph Lameter
2008-11-11 21:23           ` Izik Eidus
2008-11-11 21:31             ` Christoph Lameter
2008-11-11 21:37               ` Izik Eidus
2008-11-11 22:24               ` Andrea Arcangeli
2008-11-12  2:19                 ` KAMEZAWA Hiroyuki
2008-11-12 10:05                   ` Avi Kivity
2008-11-12 11:11                     ` Izik Eidus
2008-11-13  6:11                       ` KAMEZAWA Hiroyuki
2008-11-13 10:38                         ` Izik Eidus
2008-11-13 11:32                           ` KAMEZAWA Hiroyuki
2008-11-11 21:35           ` Andrea Arcangeli
2008-11-11 21:06       ` Andrea Arcangeli
2008-11-11 21:26         ` Christoph Lameter
2008-11-11 21:39           ` Avi Kivity
2008-11-11 21:47             ` Christoph Lameter
2008-11-11 21:55               ` Izik Eidus
2008-11-11 22:36               ` Avi Kivity
2008-11-11 22:17           ` Andrea Arcangeli
2008-11-11 22:30             ` Christoph Lameter
2008-11-11 23:17               ` Andrea Arcangeli
2008-11-11 23:25                 ` Andrea Arcangeli
2008-11-12  0:27                 ` Christoph Lameter
2008-11-12  2:27                   ` Andrea Arcangeli
2008-11-12  3:10                     ` Christoph Lameter
2008-11-12 17:32                       ` Andrea Arcangeli
2008-11-12 20:08                         ` Lee Schermerhorn
2008-11-12 20:31                           ` Christoph Lameter
2008-11-12 20:27                         ` Christoph Lameter
2008-11-12 22:09                           ` Lee Schermerhorn
2008-11-13  2:00                             ` Andrea Arcangeli
2008-11-13  2:31                               ` Andrea Arcangeli [this message]
2008-11-13  4:02                                 ` Nick Piggin
2008-11-11 19:39   ` [PATCH 1/4] rmap: add page_wrprotect() function, Andrew Morton
2008-11-11 20:38     ` Andrea Arcangeli
2008-11-11 21:01       ` Andrew Morton
2008-11-11 21:17         ` Andrea Arcangeli
2008-11-11 18:30 ` [PATCH 0/4] ksm - dynamic page sharing driver for linux Andrew Morton
2008-11-11 18:48   ` Avi Kivity
2008-11-11 19:08     ` Izik Eidus
2008-11-11 19:11     ` Andrew Morton
2008-11-11 19:18       ` Izik Eidus
2008-11-11 19:32         ` Andrew Morton
2008-11-11 19:52           ` Izik Eidus
2008-11-11 20:08             ` Izik Eidus
2008-11-11 19:29       ` Avi Kivity
2008-11-11 19:55       ` Andrea Arcangeli
2008-11-11 19:07   ` Izik Eidus
2008-11-11 19:20     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081113023135.GD10818@random.random \
    --to=aarcange@redhat.com \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=chrisw@redhat.com \
    --cc=cl@linux-foundation.org \
    --cc=ieidus@redhat.com \
    --cc=izike@qumranet.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).