public inbox for linux-kernel@vger.kernel.org
* thp and memory barrier assumptions
@ 2012-07-26 20:31 Peter Zijlstra
  2012-07-26 20:33 ` Peter Zijlstra
From: Peter Zijlstra @ 2012-07-26 20:31 UTC
  To: Andrea Arcangeli, Rik van Riel
  Cc: Andrew Morton, paulmck, Oleg Nesterov, linux-kernel, Hugh Dickins


__do_huge_pmd_anonymous_page() contains:

                /*
                 * The spinlocking to take the lru_lock inside
                 * page_add_new_anon_rmap() acts as a full memory
                 * barrier to be sure clear_huge_page writes become
                 * visible after the set_pmd_at() write.
                 */
                page_add_new_anon_rmap(page, vma, haddr);


page_add_new_anon_rmap() doesn't look to actually do a LOCK+UNLOCK
except for unevictable pages.

But even if it did do an unconditional LOCK+UNLOCK that doesn't make a
full memory barrier, see Documentation/memory-barriers.txt.

In particular:

        *A = a;
        LOCK
        UNLOCK
        *B = b;

may occur as:

        LOCK, STORE *B, STORE *A, UNLOCK
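
To illustrate the ordering that the quoted comment wants, here is a minimal user-space sketch in C11. It is not the kernel code and not the fix in aa.git. It assumes the only requirement is that the page-clearing stores become visible before the store that publishes the page. The names `fake_page`, `fake_pmd`, `publish_page()` and `page_is_zeroed_if_visible()` are hypothetical stand-ins for the huge page, the pmd entry and the two sides of the fault path. A release store on the publishing write gives that ordering without leaning on a LOCK+UNLOCK pair:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define FAKE_PAGE_SIZE 4096

/* Hypothetical stand-in for a freshly allocated huge page. */
struct fake_page {
	unsigned char data[FAKE_PAGE_SIZE];
};

/* Hypothetical stand-in for the pmd entry that makes the page reachable. */
static _Atomic(struct fake_page *) fake_pmd;

/* Writer side: clear the page, then publish it. */
static void publish_page(struct fake_page *page)
{
	assert(page != NULL);

	/* Analogue of clear_huge_page(): plain stores to the page. */
	memset(page->data, 0, sizeof(page->data));

	/*
	 * Analogue of set_pmd_at(): the release ordering keeps the
	 * stores above from being reordered after this store.
	 */
	atomic_store_explicit(&fake_pmd, page, memory_order_release);
}

/* Reader side: returns 1 if no page is visible yet or it is all zero. */
static int page_is_zeroed_if_visible(void)
{
	struct fake_page *page =
		atomic_load_explicit(&fake_pmd, memory_order_acquire);
	size_t i;

	if (!page)
		return 1;

	for (i = 0; i < sizeof(page->data); i++)
		if (page->data[i])
			return 0;
	return 1;
}
```

In kernel code the analogous tool would be an explicit write barrier such as smp_wmb() between the clearing and the set_pmd_at(); this sketch does not claim that is what the real fix does.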



* Re: thp and memory barrier assumptions
  2012-07-26 20:31 thp and memory barrier assumptions Peter Zijlstra
@ 2012-07-26 20:33 ` Peter Zijlstra
  2012-08-03 19:30   ` Andrea Arcangeli
From: Peter Zijlstra @ 2012-07-26 20:33 UTC
  To: Andrea Arcangeli
  Cc: Rik van Riel, Andrew Morton, paulmck, Oleg Nesterov, linux-kernel,
	Hugh Dickins

On Thu, 2012-07-26 at 22:31 +0200, Peter Zijlstra wrote:
> __do_huge_pmd_anonymous_page() contains:
> 
>                 /*
>                  * The spinlocking to take the lru_lock inside
>                  * page_add_new_anon_rmap() acts as a full memory
>                  * barrier to be sure clear_huge_page writes become
>                  * visible after the set_pmd_at() write.
>                  */
>                 page_add_new_anon_rmap(page, vma, haddr);
> 
> 
> page_add_new_anon_rmap() doesn't look to actually do a LOCK+UNLOCK
> except for unevictable pages.
> 
> But even if it did do an unconditional LOCK+UNLOCK that doesn't make a
> full memory barrier, see Documentation/memory-barriers.txt.
> 
> In particular:
> 
>         *A = a;
>         LOCK
>         UNLOCK
>         *B = b;
> 
> may occur as:
> 
>         LOCK, STORE *B, STORE *A, UNLOCK
> 


Also, what is that barrier() in handle_mm_fault() doing? And why doesn't
it have a comment explaining that?


* Re: thp and memory barrier assumptions
  2012-07-26 20:33 ` Peter Zijlstra
@ 2012-08-03 19:30   ` Andrea Arcangeli
From: Andrea Arcangeli @ 2012-08-03 19:30 UTC
  To: Peter Zijlstra
  Cc: Rik van Riel, Andrew Morton, paulmck, Oleg Nesterov, linux-kernel,
	Hugh Dickins

On Thu, Jul 26, 2012 at 10:33:25PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-07-26 at 22:31 +0200, Peter Zijlstra wrote:
> > __do_huge_pmd_anonymous_page() contains:
> > 
> >                 /*
> >                  * The spinlocking to take the lru_lock inside
> >                  * page_add_new_anon_rmap() acts as a full memory
> >                  * barrier to be sure clear_huge_page writes become
> >                  * visible after the set_pmd_at() write.
> >                  */
> >                 page_add_new_anon_rmap(page, vma, haddr);
> > 
> > 
> > page_add_new_anon_rmap() doesn't look to actually do a LOCK+UNLOCK
> > except for unevictable pages.
> > 
> > But even if it did do an unconditional LOCK+UNLOCK that doesn't make a
> > full memory barrier, see Documentation/memory-barriers.txt.
> > 
> > In particular:
> > 
> >         *A = a;
> >         LOCK
> >         UNLOCK
> >         *B = b;
> > 
> > may occur as:
> > 
> >         LOCK, STORE *B, STORE *A, UNLOCK
> > 
> 

I fixed that last year (I think Mel pointed out the bug) but I've been
so busy with other things I forgot to push that theoretical fix from
aa.git to -mm. As soon as autonuma is merged, I'll return to focus on
pushing the other pending patches in my queue that are being starved.

http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=commitdiff;h=d598a3f7ae4ca9d2c2a8653fbe790aab9b1a3141

Can you review it? If it's OK, I'll submit it so it doesn't starve any
longer. Also note that the other bugfix, the one in fair.c, is I think
only needed with AutoNUMA applied, which is why I didn't submit it
separately.

This can't affect x86 where even a locked bitop is the equivalent of a
full memory barrier.

> Also, what is that barrier() in handle_mm_fault() doing? And why doesn't
> it have a comment explaining that?

I added the docs below:

=====
From ad51771a2c3fa697fa0267edda23b48d0b85f023 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarcange@redhat.com>
Date: Fri, 3 Aug 2012 21:10:44 +0200
Subject: [PATCH] thp: document barrier() in wrprotect THP fault path

Inline doc.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/memory.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 420a449..9ec5bba 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3522,6 +3522,12 @@ retry:
 		pmd_t orig_pmd = *pmd;
 		int ret;
 
+		/*
+		 * flush orig_pmd on the stack to avoid invalidating
+		 * the pmd_trans_huge(orig_pmd) check and to allow
+		 * do_huge_pmd_wp_page to run a reliable
+		 * pmd_same(*pmd, orig_pmd).
+		 */
 		barrier();
 		if (pmd_trans_huge(orig_pmd)) {
 			if (flags & FAULT_FLAG_WRITE &&
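
The snapshot pattern that the new comment documents can be sketched in user space as below. This is a hedged illustration, not the mm/memory.c code: `fake_pmd_t`, `FAKE_PMD_HUGE`, `snapshot_pmd()`, `fake_pmd_trans_huge()` and `fake_pmd_same()` are hypothetical names, and `compiler_barrier()` is assumed to be the GCC-style empty asm with a "memory" clobber, which is how barrier() is commonly defined for GCC builds.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang compiler-only barrier; emits no CPU fence instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for pmd_t and its "huge" bit. */
typedef struct {
	unsigned long val;
} fake_pmd_t;

#define FAKE_PMD_HUGE 0x80UL

/*
 * Copy the shared entry into a local exactly once. Without the
 * barrier the compiler would be free to re-read *pmd later instead
 * of using the local copy, so later checks could see different values.
 */
static fake_pmd_t snapshot_pmd(const fake_pmd_t *pmd)
{
	fake_pmd_t orig;

	assert(pmd != NULL);
	orig = *pmd;
	compiler_barrier();
	return orig;
}

/* Analogue of pmd_trans_huge(): tests the snapshot only. */
static int fake_pmd_trans_huge(fake_pmd_t pmd)
{
	return (pmd.val & FAKE_PMD_HUGE) != 0;
}

/* Analogue of pmd_same(*pmd, orig_pmd): current value vs. snapshot. */
static int fake_pmd_same(const fake_pmd_t *pmd, fake_pmd_t orig)
{
	return pmd->val == orig.val;
}
```

A caller takes the snapshot once, makes every decision on that copy, and uses the comparison against the live entry only to detect that it changed in the meantime.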

