* [PATCH] mm/rmap: fix BUG at rmap_walk
@ 2013-12-19 0:16 Wanpeng Li
2013-12-19 0:28 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2013-12-19 0:16 UTC (permalink / raw)
To: Andrew Morton
Cc: Sasha Levin, Hugh Dickins, Joonsoo Kim, linux-mm, linux-kernel,
Wanpeng Li
page_get_anon_vma() called in page_referenced_anon() will lock and
increase the refcount of anon_vma, page won't be locked for anonymous
page. This patch fix it by skip check anonymous page locked.
[ 588.698828] kernel BUG at mm/rmap.c:1663!
[ 588.699380] invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC
[ 588.700347] Dumping ftrace buffer:
[ 588.701186] (ftrace buffer empty)
[ 588.702062] Modules linked in:
[ 588.702759] CPU: 0 PID: 4647 Comm: kswapd0 Tainted: G D W 3.13.0-rc4-next-20131218-sasha-00012-g1962367-dirty #4155
[ 588.704330] task: ffff880062bcb000 ti: ffff880062450000 task.ti: ffff880062450000
[ 588.705507] RIP: 0010:[<ffffffff81289c80>] [<ffffffff81289c80>] rmap_walk+0x10/0x50
[ 588.706800] RSP: 0018:ffff8800624518d8 EFLAGS: 00010246
[ 588.707515] RAX: 000fffff80080048 RBX: ffffea00000227c0 RCX: 0000000000000000
[ 588.707515] RDX: 0000000000000000 RSI: ffff8800624518e8 RDI: ffffea00000227c0
[ 588.707515] RBP: ffff8800624518d8 R08: ffff8800624518e8 R09: 0000000000000000
[ 588.707515] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800624519d8
[ 588.707515] R13: 0000000000000000 R14: ffffea00000227e0 R15: 0000000000000000
[ 588.707515] FS: 0000000000000000(0000) GS:ffff880065200000(0000) knlGS:0000000000000000
[ 588.707515] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 588.707515] CR2: 00007fec40cbe0f8 CR3: 00000000c2382000 CR4: 00000000000006f0
[ 588.707515] Stack:
[ 588.707515] ffff880062451958 ffffffff81289f4b ffff880062451918 ffffffff81289f80
[ 588.707515] 0000000000000000 0000000000000000 ffffffff8128af60 0000000000000000
[ 588.707515] 0000000000000024 0000000000000000 0000000000000000 0000000000000286
[ 588.707515] Call Trace:
[ 588.707515] [<ffffffff81289f4b>] page_referenced+0xcb/0x100
[ 588.707515] [<ffffffff81289f80>] ? page_referenced+0x100/0x100
[ 588.707515] [<ffffffff8128af60>] ? invalid_page_referenced_vma+0x170/0x170
[ 588.707515] [<ffffffff81264302>] shrink_active_list+0x212/0x330
[ 588.707515] [<ffffffff81260e23>] ? inactive_file_is_low+0x33/0x50
[ 588.707515] [<ffffffff812646f5>] shrink_lruvec+0x2d5/0x300
[ 588.707515] [<ffffffff812647b6>] shrink_zone+0x96/0x1e0
[ 588.707515] [<ffffffff81265b06>] kswapd_shrink_zone+0xf6/0x1c0
[ 588.707515] [<ffffffff81265f43>] balance_pgdat+0x373/0x550
[ 588.707515] [<ffffffff81266d63>] kswapd+0x2f3/0x350
[ 588.707515] [<ffffffff81266a70>] ? perf_trace_mm_vmscan_lru_isolate_template+0x120/0x120
[ 588.707515] [<ffffffff8115c9c5>] kthread+0x105/0x110
[ 588.707515] [<ffffffff8115c8c0>] ? set_kthreadd_affinity+0x30/0x30
[ 588.707515] [<ffffffff843a6a7c>] ret_from_fork+0x7c/0xb0
[ 588.707515] [<ffffffff8115c8c0>] ? set_kthreadd_affinity+0x30/0x30
[ 588.707515] Code: c0 48 83 c4 18 89 d0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 66 0f 1f 84
00 00 00 00 00 55 48 89 e5 66 66 66 66 90 48 8b 07 a8 01 75 10 <0f> 0b 66 0f 1f 44 00 0
0 eb fe 66 0f 1f 44 00 00 f6 47 08 01 74
[ 588.707515] RIP [<ffffffff81289c80>] rmap_walk+0x10/0x50
[ 588.707515] RSP <ffff8800624518d8>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
---
mm/rmap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index d792e71..daf20d5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1660,7 +1660,8 @@ done:
int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
{
- VM_BUG_ON(!PageLocked(page));
+ if (!PageAnon(page) || PageKsm(page))
+ VM_BUG_ON(!PageLocked(page));
if (unlikely(PageKsm(page)))
return rmap_walk_ksm(page, rwc);
--
1.8.3.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:16 [PATCH] mm/rmap: fix BUG at rmap_walk Wanpeng Li
@ 2013-12-19 0:28 ` Andrew Morton
2013-12-19 0:41 ` Sasha Levin
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Andrew Morton @ 2013-12-19 0:28 UTC (permalink / raw)
To: Wanpeng Li; +Cc: Sasha Levin, Hugh Dickins, Joonsoo Kim, linux-mm, linux-kernel
On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
> page_get_anon_vma() called in page_referenced_anon() will lock and
> increase the refcount of anon_vma, page won't be locked for anonymous
> page. This patch fix it by skip check anonymous page locked.
>
> [ 588.698828] kernel BUG at mm/rmap.c:1663!
Why is all this suddenly happening. Did we change something, or did a
new test get added to trinity?
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1660,7 +1660,8 @@ done:
>
> int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
> {
> - VM_BUG_ON(!PageLocked(page));
> + if (!PageAnon(page) || PageKsm(page))
> + VM_BUG_ON(!PageLocked(page));
>
> if (unlikely(PageKsm(page)))
> return rmap_walk_ksm(page, rwc);
Is there any reason why rmap_walk_ksm() and rmap_walk_file() *need*
PageLocked() whereas rmap_walk_anon() does not? If so, let's implement
it like this:
--- a/mm/rmap.c~a
+++ a/mm/rmap.c
@@ -1716,6 +1716,10 @@ static int rmap_walk_file(struct page *p
struct vm_area_struct *vma;
int ret = SWAP_AGAIN;
+ /*
+ * page must be locked because <reason goes here>
+ */
+ VM_BUG_ON(!PageLocked(page));
if (!mapping)
return ret;
mutex_lock(&mapping->i_mmap_mutex);
@@ -1737,8 +1741,6 @@ static int rmap_walk_file(struct page *p
int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
struct vm_area_struct *, unsigned long, void *), void *arg)
{
- VM_BUG_ON(!PageLocked(page));
-
if (unlikely(PageKsm(page)))
return rmap_walk_ksm(page, rmap_one, arg);
else if (PageAnon(page))
--- a/mm/ksm.c~a
+++ a/mm/ksm.c
@@ -2006,6 +2006,9 @@ int rmap_walk_ksm(struct page *page, int
int search_new_forks = 0;
VM_BUG_ON(!PageKsm(page));
+ /*
+ * page must be locked because <reason goes here>
+ */
VM_BUG_ON(!PageLocked(page));
stable_node = page_stable_node(page);
Or if there is no reason why the page must be locked for
rmap_walk_ksm() and rmap_walk_file(), let's just remove rmap_walk()'s
VM_BUG_ON()? And rmap_walk_ksm()'s as well - it's duplicative anyway.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:28 ` Andrew Morton
@ 2013-12-19 0:41 ` Sasha Levin
2013-12-19 0:50 ` Andrew Morton
2013-12-19 0:56 ` Wanpeng Li
2013-12-19 0:58 ` Joonsoo Kim
2 siblings, 1 reply; 10+ messages in thread
From: Sasha Levin @ 2013-12-19 0:41 UTC (permalink / raw)
To: Andrew Morton, Wanpeng Li
Cc: Hugh Dickins, Joonsoo Kim, linux-mm, linux-kernel, Dave Jones
On 12/18/2013 07:28 PM, Andrew Morton wrote:
> On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
>
>> page_get_anon_vma() called in page_referenced_anon() will lock and
>> increase the refcount of anon_vma, page won't be locked for anonymous
>> page. This patch fix it by skip check anonymous page locked.
>>
>> [ 588.698828] kernel BUG at mm/rmap.c:1663!
>
> Why is all this suddenly happening. Did we change something, or did a
> new test get added to trinity?
Dave has improved mmap testing in trinity, maybe it's related?
> Or if there is no reason why the page must be locked for
> rmap_walk_ksm() and rmap_walk_file(), let's just remove rmap_walk()'s
> VM_BUG_ON()? And rmap_walk_ksm()'s as well - it's duplicative anyway.
IMO, removing all these VM_BUG_ON()s (which is happening quite often recently) will
lead to having bugs sneak by causing obscure undetected corruption instead of
being very obvious through a BUG.
I know it might be silly, but if we're removing a lot of these - can we "balance"
it back by asking people to introduce new VM_BUG_ON()s, along with a short comment
explaining why to places where these assumptions are going unchecked and are obvious
to them but not to many others?
I'll be more than happy to fuzz through patches that do that to make sure
we catch corner cases.
Thanks,
Sasha
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:41 ` Sasha Levin
@ 2013-12-19 0:50 ` Andrew Morton
2013-12-19 1:30 ` Dave Jones
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2013-12-19 0:50 UTC (permalink / raw)
To: Sasha Levin
Cc: Wanpeng Li, Hugh Dickins, Joonsoo Kim, linux-mm, linux-kernel,
Dave Jones
On Wed, 18 Dec 2013 19:41:44 -0500 Sasha Levin <sasha.levin@oracle.com> wrote:
> On 12/18/2013 07:28 PM, Andrew Morton wrote:
> > On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
> >
> >> page_get_anon_vma() called in page_referenced_anon() will lock and
> >> increase the refcount of anon_vma, page won't be locked for anonymous
> >> page. This patch fix it by skip check anonymous page locked.
> >>
> >> [ 588.698828] kernel BUG at mm/rmap.c:1663!
> >
> > Why is all this suddenly happening. Did we change something, or did a
> > new test get added to trinity?
>
> Dave has improved mmap testing in trinity, maybe it's related?
Dave, can you please summarise recent trinity changes for us?
> > Or if there is no reason why the page must be locked for
> > rmap_walk_ksm() and rmap_walk_file(), let's just remove rmap_walk()'s
> > VM_BUG_ON()? And rmap_walk_ksm()'s as well - it's duplicative anyway.
>
> IMO, removing all these VM_BUG_ON()s (which is happening quite often recently) will
> lead to having bugs sneak by causing obscure undetected corruption instead of
> being very obvious through a BUG.
>
Well. a) My patch was functionally the same as the one Wanpeng
proposed, only better ;) and b) we shouldn't just assert X because we
observed that the existing code does X. If a particular function
*needs* PageLocked(page) then sure, it can and should assert that the
page is locked. Preferably with a comment explaining *why*
PageLocked() is needed. That way we don't end up with years-old
assertions which nobody understands any more, which is what we have
now.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:28 ` Andrew Morton
2013-12-19 0:41 ` Sasha Levin
@ 2013-12-19 0:56 ` Wanpeng Li
2013-12-19 0:58 ` Joonsoo Kim
2 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2013-12-19 0:56 UTC (permalink / raw)
To: Andrew Morton
Cc: Sasha Levin, Hugh Dickins, Joonsoo Kim, linux-mm, linux-kernel
Hi Andrew,
On Wed, Dec 18, 2013 at 04:28:58PM -0800, Andrew Morton wrote:
>On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
>
>> page_get_anon_vma() called in page_referenced_anon() will lock and
>> increase the refcount of anon_vma, page won't be locked for anonymous
>> page. This patch fix it by skip check anonymous page locked.
>>
>> [ 588.698828] kernel BUG at mm/rmap.c:1663!
>
>Why is all this suddenly happening. Did we change something, or did a
>new test get added to trinity?
>
They are introduced by Joonsoo's rmap_walk.
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1660,7 +1660,8 @@ done:
>>
>> int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
>> {
>> - VM_BUG_ON(!PageLocked(page));
>> + if (!PageAnon(page) || PageKsm(page))
>> + VM_BUG_ON(!PageLocked(page));
>>
>> if (unlikely(PageKsm(page)))
>> return rmap_walk_ksm(page, rwc);
>
>Is there any reason why rmap_walk_ksm() and rmap_walk_file() *need*
>PageLocked() whereas rmap_walk_anon() does not? If so, let's implement
>it like this:
All callsites of rmap_walk()
try_to_unmap() page should be lockecd (checked in rmap_walk())
try_to_munlock() pages should be locked (checked in try_to_munlock())
page_referenced() pages should be locked except anonymous page (checked in rmap_walk())
page_mkclean() pages should be locked (checked in page_mkclean())
remove_migration_ptes() pages should be locked (checked in rmap_walk())
We can move PageLocked(page) check to the callsites instead of in
rmap_walk() since anonymous page is not locked in page_referenced().
Regards,
Wanpeng Li
>
>
>--- a/mm/rmap.c~a
>+++ a/mm/rmap.c
>@@ -1716,6 +1716,10 @@ static int rmap_walk_file(struct page *p
> struct vm_area_struct *vma;
> int ret = SWAP_AGAIN;
>
>+ /*
>+ * page must be locked because <reason goes here>
>+ */
>+ VM_BUG_ON(!PageLocked(page));
> if (!mapping)
> return ret;
> mutex_lock(&mapping->i_mmap_mutex);
>@@ -1737,8 +1741,6 @@ static int rmap_walk_file(struct page *p
> int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
> struct vm_area_struct *, unsigned long, void *), void *arg)
> {
>- VM_BUG_ON(!PageLocked(page));
>-
> if (unlikely(PageKsm(page)))
> return rmap_walk_ksm(page, rmap_one, arg);
> else if (PageAnon(page))
>--- a/mm/ksm.c~a
>+++ a/mm/ksm.c
>@@ -2006,6 +2006,9 @@ int rmap_walk_ksm(struct page *page, int
> int search_new_forks = 0;
>
> VM_BUG_ON(!PageKsm(page));
>+ /*
>+ * page must be locked because <reason goes here>
>+ */
> VM_BUG_ON(!PageLocked(page));
>
> stable_node = page_stable_node(page);
>
>
>Or if there is no reason why the page must be locked for
>rmap_walk_ksm() and rmap_walk_file(), let's just remove rmap_walk()'s
>VM_BUG_ON()? And rmap_walk_ksm()'s as well - it's duplicative anyway.
>
>--
>To unsubscribe, send a message with 'unsubscribe linux-mm' in
>the body to majordomo@kvack.org. For more info on Linux MM,
>see: http://www.linux-mm.org/ .
>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:28 ` Andrew Morton
2013-12-19 0:41 ` Sasha Levin
2013-12-19 0:56 ` Wanpeng Li
@ 2013-12-19 0:58 ` Joonsoo Kim
2013-12-19 1:04 ` Andrew Morton
2 siblings, 1 reply; 10+ messages in thread
From: Joonsoo Kim @ 2013-12-19 0:58 UTC (permalink / raw)
To: Andrew Morton
Cc: Wanpeng Li, Sasha Levin, Hugh Dickins, linux-mm, linux-kernel
Hello, Andrew.
On Wed, Dec 18, 2013 at 04:28:58PM -0800, Andrew Morton wrote:
> On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
>
> > page_get_anon_vma() called in page_referenced_anon() will lock and
> > increase the refcount of anon_vma, page won't be locked for anonymous
> > page. This patch fix it by skip check anonymous page locked.
> >
> > [ 588.698828] kernel BUG at mm/rmap.c:1663!
>
> Why is all this suddenly happening. Did we change something, or did a
> new test get added to trinity?
It is my fault.
I should remove this VM_BUG_ON() since rmap_walk() can be called
without holding PageLock() in this case.
I think that adding VM_BUG_ON() to each rmap_walk calllers is better
than this patch, because, now, rmap_walk() is called by many places and
each places has different contexts.
>
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1660,7 +1660,8 @@ done:
> >
> > int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
> > {
> > - VM_BUG_ON(!PageLocked(page));
> > + if (!PageAnon(page) || PageKsm(page))
> > + VM_BUG_ON(!PageLocked(page));
> >
> > if (unlikely(PageKsm(page)))
> > return rmap_walk_ksm(page, rwc);
>
> Is there any reason why rmap_walk_ksm() and rmap_walk_file() *need*
> PageLocked() whereas rmap_walk_anon() does not? If so, let's implement
> it like this:
I will investigate this more.
I don't know why the pagelock is needed only (!PageAnon || PageKsm) case
on page_referenced(). I changed page_referenced() from it's own rmap walker
to common rmap walker directly. So for now, I don't know exact reason.
Thanks.
>
>
> --- a/mm/rmap.c~a
> +++ a/mm/rmap.c
> @@ -1716,6 +1716,10 @@ static int rmap_walk_file(struct page *p
> struct vm_area_struct *vma;
> int ret = SWAP_AGAIN;
>
> + /*
> + * page must be locked because <reason goes here>
> + */
> + VM_BUG_ON(!PageLocked(page));
> if (!mapping)
> return ret;
> mutex_lock(&mapping->i_mmap_mutex);
> @@ -1737,8 +1741,6 @@ static int rmap_walk_file(struct page *p
> int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
> struct vm_area_struct *, unsigned long, void *), void *arg)
> {
> - VM_BUG_ON(!PageLocked(page));
> -
> if (unlikely(PageKsm(page)))
> return rmap_walk_ksm(page, rmap_one, arg);
> else if (PageAnon(page))
> --- a/mm/ksm.c~a
> +++ a/mm/ksm.c
> @@ -2006,6 +2006,9 @@ int rmap_walk_ksm(struct page *page, int
> int search_new_forks = 0;
>
> VM_BUG_ON(!PageKsm(page));
> + /*
> + * page must be locked because <reason goes here>
> + */
> VM_BUG_ON(!PageLocked(page));
>
> stable_node = page_stable_node(page);
>
>
> Or if there is no reason why the page must be locked for
> rmap_walk_ksm() and rmap_walk_file(), let's just remove rmap_walk()'s
> VM_BUG_ON()? And rmap_walk_ksm()'s as well - it's duplicative anyway.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:58 ` Joonsoo Kim
@ 2013-12-19 1:04 ` Andrew Morton
2013-12-19 1:12 ` Wanpeng Li
2013-12-19 1:14 ` Joonsoo Kim
0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2013-12-19 1:04 UTC (permalink / raw)
To: Joonsoo Kim; +Cc: Wanpeng Li, Sasha Levin, Hugh Dickins, linux-mm, linux-kernel
On Thu, 19 Dec 2013 09:58:05 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> On Wed, Dec 18, 2013 at 04:28:58PM -0800, Andrew Morton wrote:
> > On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
> >
> > > page_get_anon_vma() called in page_referenced_anon() will lock and
> > > increase the refcount of anon_vma, page won't be locked for anonymous
> > > page. This patch fix it by skip check anonymous page locked.
> > >
> > > [ 588.698828] kernel BUG at mm/rmap.c:1663!
> >
> > Why is all this suddenly happening. Did we change something, or did a
> > new test get added to trinity?
>
> It is my fault.
> I should remove this VM_BUG_ON() since rmap_walk() can be called
> without holding PageLock() in this case.
>
> I think that adding VM_BUG_ON() to each rmap_walk calllers is better
> than this patch, because, now, rmap_walk() is called by many places and
> each places has different contexts.
I don't think that putting the assertion into the caller makes a lot of
sense, particularly if that code just did a lock_page()! If a *callee*
needs PageLocked() then that callee should assert that the page is
locked. So
VM_BUG_ON(!PageLocked(page));
means "this code requires that the page be locked". And if that code
requires PageLocked(), there must be reasons for this. Let's also
include an explanation of those reasons.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 1:04 ` Andrew Morton
@ 2013-12-19 1:12 ` Wanpeng Li
2013-12-19 1:14 ` Joonsoo Kim
1 sibling, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2013-12-19 1:12 UTC (permalink / raw)
To: Andrew Morton
Cc: Joonsoo Kim, Sasha Levin, Hugh Dickins, linux-mm, linux-kernel
Hi Andrew,
On Wed, Dec 18, 2013 at 05:04:29PM -0800, Andrew Morton wrote:
>On Thu, 19 Dec 2013 09:58:05 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
>
>> On Wed, Dec 18, 2013 at 04:28:58PM -0800, Andrew Morton wrote:
>> > On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
>> >
>> > > page_get_anon_vma() called in page_referenced_anon() will lock and
>> > > increase the refcount of anon_vma, page won't be locked for anonymous
>> > > page. This patch fix it by skip check anonymous page locked.
>> > >
>> > > [ 588.698828] kernel BUG at mm/rmap.c:1663!
>> >
>> > Why is all this suddenly happening. Did we change something, or did a
>> > new test get added to trinity?
>>
>> It is my fault.
>> I should remove this VM_BUG_ON() since rmap_walk() can be called
>> without holding PageLock() in this case.
>>
>> I think that adding VM_BUG_ON() to each rmap_walk calllers is better
>> than this patch, because, now, rmap_walk() is called by many places and
>> each places has different contexts.
>
>I don't think that putting the assertion into the caller makes a lot of
>sense, particularly if that code just did a lock_page()! If a *callee*
>needs PageLocked() then that callee should assert that the page is
>locked. So
>
> VM_BUG_ON(!PageLocked(page));
>
>means "this code requires that the page be locked". And if that code
>requires PageLocked(), there must be reasons for this. Let's also
>include an explanation of those reasons.
I will add this check and explanation to the callee rmap_one hook of
rmap_walk_control and send another version of the patch. ;-)
Regards,
Wanpeng Li
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 1:04 ` Andrew Morton
2013-12-19 1:12 ` Wanpeng Li
@ 2013-12-19 1:14 ` Joonsoo Kim
1 sibling, 0 replies; 10+ messages in thread
From: Joonsoo Kim @ 2013-12-19 1:14 UTC (permalink / raw)
To: Andrew Morton
Cc: Wanpeng Li, Sasha Levin, Hugh Dickins, linux-mm, linux-kernel
On Wed, Dec 18, 2013 at 05:04:29PM -0800, Andrew Morton wrote:
> On Thu, 19 Dec 2013 09:58:05 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
>
> > On Wed, Dec 18, 2013 at 04:28:58PM -0800, Andrew Morton wrote:
> > > On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
> > >
> > > > page_get_anon_vma() called in page_referenced_anon() will lock and
> > > > increase the refcount of anon_vma, page won't be locked for anonymous
> > > > page. This patch fix it by skip check anonymous page locked.
> > > >
> > > > [ 588.698828] kernel BUG at mm/rmap.c:1663!
> > >
> > > Why is all this suddenly happening. Did we change something, or did a
> > > new test get added to trinity?
> >
> > It is my fault.
> > I should remove this VM_BUG_ON() since rmap_walk() can be called
> > without holding PageLock() in this case.
> >
> > I think that adding VM_BUG_ON() to each rmap_walk calllers is better
> > than this patch, because, now, rmap_walk() is called by many places and
> > each places has different contexts.
>
> I don't think that putting the assertion into the caller makes a lot of
> sense, particularly if that code just did a lock_page()! If a *callee*
> needs PageLocked() then that callee should assert that the page is
> locked. So
>
> VM_BUG_ON(!PageLocked(page));
>
> means "this code requires that the page be locked". And if that code
> requires PageLocked(), there must be reasons for this. Let's also
> include an explanation of those reasons.
Yes, if this condition is invariant for rmap_walk(), we should put this on
rmap_walk(). But if not, we should put this on the other place. I will
investigate more and send good solution :)
Thanks.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/rmap: fix BUG at rmap_walk
2013-12-19 0:50 ` Andrew Morton
@ 2013-12-19 1:30 ` Dave Jones
0 siblings, 0 replies; 10+ messages in thread
From: Dave Jones @ 2013-12-19 1:30 UTC (permalink / raw)
To: Andrew Morton
Cc: Sasha Levin, Wanpeng Li, Hugh Dickins, Joonsoo Kim, linux-mm,
linux-kernel
On Wed, Dec 18, 2013 at 04:50:49PM -0800, Andrew Morton wrote:
> On Wed, 18 Dec 2013 19:41:44 -0500 Sasha Levin <sasha.levin@oracle.com> wrote:
>
> > On 12/18/2013 07:28 PM, Andrew Morton wrote:
> > > On Thu, 19 Dec 2013 08:16:35 +0800 Wanpeng Li <liwanp@linux.vnet.ibm.com> wrote:
> > >
> > >> page_get_anon_vma() called in page_referenced_anon() will lock and
> > >> increase the refcount of anon_vma, page won't be locked for anonymous
> > >> page. This patch fix it by skip check anonymous page locked.
> > >>
> > >> [ 588.698828] kernel BUG at mm/rmap.c:1663!
> > >
> > > Why is all this suddenly happening. Did we change something, or did a
> > > new test get added to trinity?
> >
> > Dave has improved mmap testing in trinity, maybe it's related?
>
> Dave, can you please summarise recent trinity changes for us?
In the past, the only mmaps we did were created on startup by the initial process,
and were then inherited by the child processes when they fork()'d.
After the recent changes, that's still true, but in addition, we now have
some smarts so that when a child does a random mmap() call, if it succeeds,
we store the result in a per-child list for re-use in subsequent syscalls by
that same child process.
Another reason that a lot of hugepage stuff seems to be falling out is that
trinity didn't do big mmaps before, because after a few children forked
and the maps got touched, we'd run into oom's pretty quickly.
Now that we have per-child mmapping, sometimes it successfully does a hugepage map.
http://git.codemonkey.org.uk/?p=trinity.git;a=blob_plain;f=syscalls/mmap.c;hb=HEAD
is the code that's generating the random map arguments. It should be pretty obvious,
but holler if there's something that needs explaining.
dirty_mapping() is here http://git.codemonkey.org.uk/?p=trinity.git;a=blob_plain;f=maps.c;hb=HEAD
Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-12-19 1:30 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-19 0:16 [PATCH] mm/rmap: fix BUG at rmap_walk Wanpeng Li
2013-12-19 0:28 ` Andrew Morton
2013-12-19 0:41 ` Sasha Levin
2013-12-19 0:50 ` Andrew Morton
2013-12-19 1:30 ` Dave Jones
2013-12-19 0:56 ` Wanpeng Li
2013-12-19 0:58 ` Joonsoo Kim
2013-12-19 1:04 ` Andrew Morton
2013-12-19 1:12 ` Wanpeng Li
2013-12-19 1:14 ` Joonsoo Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).