* [PATCH] THP: Use explicit memory barrier
@ 2013-03-31 23:45 Minchan Kim
2013-04-01 1:01 ` Kamezawa Hiroyuki
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Minchan Kim @ 2013-03-31 23:45 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Minchan Kim, Mel Gorman, Andrea Arcangeli,
Hugh Dickins
__do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
spinlock for making sure that clear_huge_page write become visible
after set set_pmd_at() write.
But lru_cache_add_lru uses pagevec so it could miss spinlock
easily so above rule was broken so user may see inconsistent data.
This patch fixes it with using explict barrier rather than depending
on lru spinlock.
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/huge_memory.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bfa142e..fad800e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -725,11 +725,10 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
pmd_t entry;
entry = mk_huge_pmd(page, vma);
/*
- * The spinlocking to take the lru_lock inside
- * page_add_new_anon_rmap() acts as a full memory
- * barrier to be sure clear_huge_page writes become
- * visible after the set_pmd_at() write.
+ * clear_huge_page write become visible after the
+ * set_pmd_at() write.
*/
+ smp_wmb();
page_add_new_anon_rmap(page, vma, haddr);
set_pmd_at(mm, haddr, pmd, entry);
pgtable_trans_huge_deposit(mm, pgtable);
--
1.8.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-03-31 23:45 [PATCH] THP: Use explicit memory barrier Minchan Kim
@ 2013-04-01 1:01 ` Kamezawa Hiroyuki
2013-04-01 4:45 ` Minchan Kim
2013-04-01 23:35 ` David Rientjes
2013-04-04 2:45 ` Simon Jeons
2 siblings, 1 reply; 10+ messages in thread
From: Kamezawa Hiroyuki @ 2013-04-01 1:01 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Hugh Dickins
(2013/04/01 8:45), Minchan Kim wrote:
> __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> spinlock for making sure that clear_huge_page write become visible
> after set set_pmd_at() write.
>
> But lru_cache_add_lru uses pagevec so it could miss spinlock
> easily so above rule was broken so user may see inconsistent data.
> This patch fixes it with using explict barrier rather than depending
> on lru spinlock.
>
Hmm...how about do_anonymous_page() ? there are no comments/locks/barriers.
Users can see non-zero value after page fault in theory ?
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-01 1:01 ` Kamezawa Hiroyuki
@ 2013-04-01 4:45 ` Minchan Kim
0 siblings, 0 replies; 10+ messages in thread
From: Minchan Kim @ 2013-04-01 4:45 UTC (permalink / raw)
To: Kamezawa Hiroyuki
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Hugh Dickins
Hey Kame,
On Mon, Apr 01, 2013 at 10:01:49AM +0900, Kamezawa Hiroyuki wrote:
> (2013/04/01 8:45), Minchan Kim wrote:
> > __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> > spinlock for making sure that clear_huge_page write become visible
> > after set set_pmd_at() write.
> >
> > But lru_cache_add_lru uses pagevec so it could miss spinlock
> > easily so above rule was broken so user may see inconsistent data.
> > This patch fixes it with using explict barrier rather than depending
> > on lru spinlock.
> >
>
> Hmm...how about do_anonymous_page() ? there are no comments/locks/barriers.
> Users can see non-zero value after page fault in theory ?
Maybe, but as you know well, we didn't see any report about that until now
so I'm not sure. Ccing people in this thread could have clear answer rather
than me.
In THP, I just found inconsistency between Andrea's comment and code.
That's why I sent a patch.
If your concern turns out truth, Ya, you might find ancient bug. :)
Thanks.
>
> Thanks,
> -Kame
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Kind regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-03-31 23:45 [PATCH] THP: Use explicit memory barrier Minchan Kim
2013-04-01 1:01 ` Kamezawa Hiroyuki
@ 2013-04-01 23:35 ` David Rientjes
2013-04-02 0:37 ` Minchan Kim
2013-04-04 2:45 ` Simon Jeons
2 siblings, 1 reply; 10+ messages in thread
From: David Rientjes @ 2013-04-01 23:35 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Hugh Dickins
On Mon, 1 Apr 2013, Minchan Kim wrote:
> __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> spinlock for making sure that clear_huge_page write become visible
> after set set_pmd_at() write.
>
> But lru_cache_add_lru uses pagevec so it could miss spinlock
> easily so above rule was broken so user may see inconsistent data.
>
> This patch fixes it with using explict barrier rather than depending
> on lru spinlock.
>
Is this the same issue that Andrea responded to in the "thp and memory
barrier assumptions" thread at http://marc.info/?t=134333512700004 ?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-01 23:35 ` David Rientjes
@ 2013-04-02 0:37 ` Minchan Kim
2013-04-02 19:20 ` David Rientjes
2013-04-02 19:30 ` Hugh Dickins
0 siblings, 2 replies; 10+ messages in thread
From: Minchan Kim @ 2013-04-02 0:37 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Hugh Dickins
On Mon, Apr 01, 2013 at 04:35:38PM -0700, David Rientjes wrote:
> On Mon, 1 Apr 2013, Minchan Kim wrote:
>
> > __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> > spinlock for making sure that clear_huge_page write become visible
> > after set set_pmd_at() write.
> >
> > But lru_cache_add_lru uses pagevec so it could miss spinlock
> > easily so above rule was broken so user may see inconsistent data.
> >
> > This patch fixes it with using explict barrier rather than depending
> > on lru spinlock.
> >
>
> Is this the same issue that Andrea responded to in the "thp and memory
> barrier assumptions" thread at http://marc.info/?t=134333512700004 ?
Yes and Peter pointed out further step.
Thanks for pointing out.
Not that I know that Andrea alreay noticed it, I don't care about this
patch.
Remaining question is Kame's one.
Isn't there anyone could answer it?
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Kind regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-02 0:37 ` Minchan Kim
@ 2013-04-02 19:20 ` David Rientjes
2013-04-02 19:30 ` Hugh Dickins
1 sibling, 0 replies; 10+ messages in thread
From: David Rientjes @ 2013-04-02 19:20 UTC (permalink / raw)
To: Minchan Kim, Andrea Arcangeli
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman, Hugh Dickins
On Tue, 2 Apr 2013, Minchan Kim wrote:
> Yes and Peter pointed out further step.
> Thanks for pointing out.
> Not that I know that Andrea alreay noticed it, I don't care about this
> patch.
>
Andrea, do you have time to send
c08e0c9ee786 ("thp: add memory barrier to __do_huge_pmd_anonymous_page")
b08d75a595ec ("thp: document barrier() in wrprotect THP fault path")
from the master branch of aa.git to Andrew? I would do it, but one isn't
signed-off in your tree.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-02 0:37 ` Minchan Kim
2013-04-02 19:20 ` David Rientjes
@ 2013-04-02 19:30 ` Hugh Dickins
2013-04-03 0:14 ` Minchan Kim
1 sibling, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2013-04-02 19:30 UTC (permalink / raw)
To: Minchan Kim
Cc: David Rientjes, Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Kamezawa Hiroyuki
On Tue, 2 Apr 2013, Minchan Kim wrote:
> On Mon, Apr 01, 2013 at 04:35:38PM -0700, David Rientjes wrote:
> > On Mon, 1 Apr 2013, Minchan Kim wrote:
> >
> > > __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> > > spinlock for making sure that clear_huge_page write become visible
> > > after set set_pmd_at() write.
> > >
> > > But lru_cache_add_lru uses pagevec so it could miss spinlock
> > > easily so above rule was broken so user may see inconsistent data.
> > >
> > > This patch fixes it with using explict barrier rather than depending
> > > on lru spinlock.
> > >
> >
> > Is this the same issue that Andrea responded to in the "thp and memory
> > barrier assumptions" thread at http://marc.info/?t=134333512700004 ?
>
> Yes and Peter pointed out further step.
> Thanks for pointing out.
> Not that I know that Andrea alreay noticed it, I don't care about this
> patch.
>
> Remaining question is Kame's one.
> > Hmm...how about do_anonymous_page() ? there are no comments/locks/barriers.
> > Users can see non-zero value after page fault in theory ?
> Isn't there anyone could answer it?
See Nick's 2008 0ed361dec "mm: fix PageUptodate data race", which gave us
static inline void __SetPageUptodate(struct page *page)
{
smp_wmb();
__set_bit(PG_uptodate, &(page)->flags);
}
So both do_anonymous_page() and __do_huge_pmd_anonymous_page() look safe
to me already, though the huge_memory one could do with a fixed comment.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-02 19:30 ` Hugh Dickins
@ 2013-04-03 0:14 ` Minchan Kim
2013-04-04 13:45 ` Andrea Arcangeli
0 siblings, 1 reply; 10+ messages in thread
From: Minchan Kim @ 2013-04-03 0:14 UTC (permalink / raw)
To: Hugh Dickins
Cc: David Rientjes, Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Kamezawa Hiroyuki, Peter Zijlstra
On Tue, Apr 02, 2013 at 12:30:15PM -0700, Hugh Dickins wrote:
> On Tue, 2 Apr 2013, Minchan Kim wrote:
> > On Mon, Apr 01, 2013 at 04:35:38PM -0700, David Rientjes wrote:
> > > On Mon, 1 Apr 2013, Minchan Kim wrote:
> > >
> > > > __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> > > > spinlock for making sure that clear_huge_page write become visible
> > > > after set set_pmd_at() write.
> > > >
> > > > But lru_cache_add_lru uses pagevec so it could miss spinlock
> > > > easily so above rule was broken so user may see inconsistent data.
> > > >
> > > > This patch fixes it with using explict barrier rather than depending
> > > > on lru spinlock.
> > > >
> > >
> > > Is this the same issue that Andrea responded to in the "thp and memory
> > > barrier assumptions" thread at http://marc.info/?t=134333512700004 ?
> >
> > Yes and Peter pointed out further step.
> > Thanks for pointing out.
> > Not that I know that Andrea alreay noticed it, I don't care about this
> > patch.
> >
> > Remaining question is Kame's one.
> > > Hmm...how about do_anonymous_page() ? there are no comments/locks/barriers.
> > > Users can see non-zero value after page fault in theory ?
> > Isn't there anyone could answer it?
>
> See Nick's 2008 0ed361dec "mm: fix PageUptodate data race", which gave us
>
> static inline void __SetPageUptodate(struct page *page)
> {
> smp_wmb();
> __set_bit(PG_uptodate, &(page)->flags);
> }
>
> So both do_anonymous_page() and __do_huge_pmd_anonymous_page() look safe
> to me already, though the huge_memory one could do with a fixed comment.
Thanks you very much!
That's one everybody are really missing.
Here it goes!
==================== 8< =====================
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-04-03 0:14 ` Minchan Kim
@ 2013-04-04 13:45 ` Andrea Arcangeli
0 siblings, 0 replies; 10+ messages in thread
From: Andrea Arcangeli @ 2013-04-04 13:45 UTC (permalink / raw)
To: Minchan Kim
Cc: Hugh Dickins, David Rientjes, Andrew Morton, linux-kernel,
linux-mm, Mel Gorman, Kamezawa Hiroyuki, Peter Zijlstra
On Wed, Apr 03, 2013 at 09:14:01AM +0900, Minchan Kim wrote:
> clear_huge_page(page, haddr, HPAGE_PMD_NR);
> + /*
> + * The memory barrier inside __SetPageUptodate makes sure that
> + * clear_huge_page writes become visible after the set_pmd_at()
s/after/before/
> + * write.
> + */
> __SetPageUptodate(page);
>
> spin_lock(&mm->page_table_lock);
> @@ -724,12 +729,6 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
> } else {
> pmd_t entry;
> entry = mk_huge_pmd(page, vma);
> - /*
> - * The spinlocking to take the lru_lock inside
> - * page_add_new_anon_rmap() acts as a full memory
> - * barrier to be sure clear_huge_page writes become
> - * visible after the set_pmd_at() write.
> - */
> page_add_new_anon_rmap(page, vma, haddr);
> set_pmd_at(mm, haddr, pmd, entry);
> pgtable_trans_huge_deposit(mm, pgtable);
> diff --git a/mm/memory.c b/mm/memory.c
> index 494526a..d0da51e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3196,6 +3196,11 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
> page = alloc_zeroed_user_highpage_movable(vma, address);
> if (!page)
> goto oom;
> + /*
> + * The memory barrier inside __SetPageUptodate makes sure that
> + * preceeding stores to the page contents become visible after
> + * the set_pte_at() write.
> + */
s/after/before/
After the above correction it looks nice cleanup, thanks!
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] THP: Use explicit memory barrier
2013-03-31 23:45 [PATCH] THP: Use explicit memory barrier Minchan Kim
2013-04-01 1:01 ` Kamezawa Hiroyuki
2013-04-01 23:35 ` David Rientjes
@ 2013-04-04 2:45 ` Simon Jeons
2 siblings, 0 replies; 10+ messages in thread
From: Simon Jeons @ 2013-04-04 2:45 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, linux-kernel, linux-mm, Mel Gorman,
Andrea Arcangeli, Hugh Dickins
Hi Minchan,
On 04/01/2013 07:45 AM, Minchan Kim wrote:
> __do_huge_pmd_anonymous_page depends on page_add_new_anon_rmap's
> spinlock for making sure that clear_huge_page write become visible
> after set set_pmd_at() write.
1. There are no pte modify, why take page_table_lock here?
2. What's the meaning of "clear_huge_page write become visible after set
set_pmd_at() write"?
>
> But lru_cache_add_lru uses pagevec so it could miss spinlock
> easily so above rule was broken so user may see inconsistent data.
>
> This patch fixes it with using explict barrier rather than depending
> on lru spinlock.
>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> mm/huge_memory.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index bfa142e..fad800e 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -725,11 +725,10 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
> pmd_t entry;
> entry = mk_huge_pmd(page, vma);
> /*
> - * The spinlocking to take the lru_lock inside
> - * page_add_new_anon_rmap() acts as a full memory
> - * barrier to be sure clear_huge_page writes become
> - * visible after the set_pmd_at() write.
> + * clear_huge_page write become visible after the
> + * set_pmd_at() write.
> */
> + smp_wmb();
> page_add_new_anon_rmap(page, vma, haddr);
> set_pmd_at(mm, haddr, pmd, entry);
> pgtable_trans_huge_deposit(mm, pgtable);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-04-04 13:46 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-31 23:45 [PATCH] THP: Use explicit memory barrier Minchan Kim
2013-04-01 1:01 ` Kamezawa Hiroyuki
2013-04-01 4:45 ` Minchan Kim
2013-04-01 23:35 ` David Rientjes
2013-04-02 0:37 ` Minchan Kim
2013-04-02 19:20 ` David Rientjes
2013-04-02 19:30 ` Hugh Dickins
2013-04-03 0:14 ` Minchan Kim
2013-04-04 13:45 ` Andrea Arcangeli
2013-04-04 2:45 ` Simon Jeons
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).