From: Andrea Arcangeli <andrea@novell.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andi Kleen <ak@suse.de>, Andrew Morton <akpm@osdl.org>
Subject: Re: fix iounmap and a pageattr memleak (x86 and x86-64)
Date: Tue, 2 Nov 2004 23:07:20 +0100 [thread overview]
Message-ID: <20041102220720.GV3571@dualathlon.random> (raw)
In-Reply-To: <4187FA6D.3070604@us.ibm.com>
On Tue, Nov 02, 2004 at 01:21:49PM -0800, Dave Hansen wrote:
> This patch:
>
> >From: Andrea Arcangeli <andrea@novell.com>
> >
> >- fix silent memleak in the pageattr code that I found while searching
> > for the bug Andi fixed in the second patch below (basically reference
> > counting in split page was done on the pmd instead of the pte).
> >
> >- Part of this patch is also needed to make the above work on x86
> >(otherwise
> > one of my new above BUGS() will trigger signalling the fact a bug was
> > there). The below patch creates a subtle dependency that (_PAGE_PCD <<
> > 24)
> > must not be zero. It's not the cleanest thing ever, but since it's an
> > hardware bitflag I doubt it's going to break.
> >
> >Signed-off-by: Andi Kleen <ak@suse.de>
> >Signed-off-by: Andrea Arcangeli <andrea@novell.com>
> >Signed-off-by: Andrew Morton <akpm@osdl.org>
> >---
> >
> > 25-akpm/arch/i386/mm/ioremap.c | 4 ++--
> > 25-akpm/arch/i386/mm/pageattr.c | 13 +++++++------
> > 25-akpm/arch/x86_64/mm/ioremap.c | 14 +++++++-------
> > 25-akpm/arch/x86_64/mm/pageattr.c | 23 ++++++++++++++---------
> > 4 files changed, 30 insertions(+), 24 deletions(-)
>
> is hitting this BUG() during bootup:
>
> /* memleak and potential failed 2M page regeneration */
> BUG_ON(!page_count(kpte_page));
>
> in 2.6.10-rc1-mm2.
>
> Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> Memory: 511144k/524288k available (1856k kernel code, 12608k reserved,
> 1186k data, 164k init, 0k highmem)
> Checking if this processor honours the WP bit even in supervisor mode... Ok.
> Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/pageattr.c:136!
> invalid operand: 0000 [#1]
> SMP DEBUG_PAGEALLOC
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c0113f48>] Not tainted VLI
> EFLAGS: 00010046 (2.6.10-rc1-mm2)
> EIP is at __change_page_attr+0x28c/0x358
> eax: ffffffff ebx: 017ff163 ecx: 00000000 edx: c10001e0
> esi: 00000000 edi: c000fff8 ebp: c10001e0 esp: c03f9d98
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=c03f9000 task=c0345b40)
> Stack: c102ffe0 00000000 00000000 00000046 c0113ceb c17f9000 c102ff20
> c102ff20 017ff163 00000000 017ff163 00000000 c0113ceb c17fe000
> c102ffc0 c102ffc0 c1000000 c17ff000 c000fff8 017ff163 00000000
> 017ff163 00000000 00000000
> Call Trace:
> [<c0113ceb>] __change_page_attr+0x2f/0x358
> [<c0113ceb>] __change_page_attr+0x2f/0x358
> [<c011404a>] change_page_attr+0x36/0x54
> [<c0114148>] kernel_map_pages+0x30/0x5f
> [<c0137d80>] __alloc_pages+0x340/0x350
> [<c0137dad>] __get_free_pages+0x1d/0x30
> [<c013adfa>] kmem_getpages+0x26/0xd4
> [<c013c221>] cache_grow+0xb1/0x150
> [<c013c84a>] cache_alloc_refill+0x232/0x280
> [<c013ccbe>] kmem_cache_alloc+0x5a/0x78
> [<c01d4970>] idr_pre_get+0x1c/0x44
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c01572cc>] set_anon_super+0x10/0xb8
> [<c0156cff>] sget+0xb3/0x148
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c0157636>] get_sb_single+0x26/0x8c
> [<c0157608>] compare_single+0x0/0x8
> [<c01572bc>] set_anon_super+0x0/0xb8
> [<c0181c1d>] sysfs_get_sb+0x19/0x1d
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c01576ea>] do_kern_mount+0x4e/0xd0
> [<c015777d>] kern_mount+0x11/0x15
> [<c0409962>] sysfs_init+0x1e/0x50
> [<c0409430>] mnt_init+0xb4/0xc0
> [<c040917a>] vfs_caches_init+0x7e/0x94
> [<c03fa831>] start_kernel+0x12d/0x150
> Code: c7 0f 75 f5 f0 ff 4d 04 eb 08 0f 0b 85 00 88 c1 2d c0 8b 45 00 89
> ea f6 c4 80 74 07 8b 55 0c 8d 74 26 00 8b 42 04 83 f8 ff 75 08 <0f> 0b
> 88 00 88 c1 2d c0 a1 ac 6c 34 c0 a8 08 0f 84 aa 00 00 00
>
> I'm tracking down now to see exactly what's going on. This just a
> regular, plain 4-way x86 box with 4GB of RAM. Removing that BUG_ON()
> lets me boot just fine.
you've a debugging option enabled.
I'm afraid somebody wrote common code with the hope that
change_page_attr had a natural universal API.
change_page_attr API has alwasy required you a certain level of symmetry
to work. Now I made it completley symmetric just to make it simpler to
use and I added BUG where the previous code would screwup.
If you do something like this you'll trigger a BUG_ON too:
change_page_attr(page, PAGE_KERNEL_NOCACHE)
change_page_attr(page, PAGE_KERNEL)
change_page_attr(page, PAGE_KERNEL)
the last one will BUG().
At least the new API with these changes will not leak memory anymore as
far as you retain simmetry (even if you run the change_page_attr on the
same page from different context).
I suspect that such debugging option has never worked right.
pageattr will need fixing so that to provide a natural universal API
that keeps track of each page. Previously this would have failed:
A change_page_attr(page1, UNCACHED) -> pte page_count = 2
B change_page_attr(page1, WHATEVER) -> pte page_count still 2
C change_page_attr(page2, UNCACHED) -> pte page count = 3
A change_page_attr(page1, PAGE_KERNEL) -> pte page count = 2
B change_page_attr(page1, PAGE_KERNEL) -> pte page count = 1 ->screwup
Now the above can work fine. But still examples like the above one will
trigger bugs since they're simply violating the current restricted
change_page_attr API.
I believe if you disable the DEBUG_PAGEALLOC it should work fine again.
Still I recommend investigating _why_ debug_pagealloc is violating the
API. It might not be necessary to wait for the pageattr universal
feature to make DEBUG_PAGEALLOC work safe.
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <andrea@novell.com>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andi Kleen <ak@suse.de>, Andrew Morton <akpm@osdl.org>
Subject: Re: fix iounmap and a pageattr memleak (x86 and x86-64)
Date: Tue, 2 Nov 2004 23:07:20 +0100 [thread overview]
Message-ID: <20041102220720.GV3571@dualathlon.random> (raw)
In-Reply-To: <4187FA6D.3070604@us.ibm.com>
On Tue, Nov 02, 2004 at 01:21:49PM -0800, Dave Hansen wrote:
> This patch:
>
> >From: Andrea Arcangeli <andrea@novell.com>
> >
> >- fix silent memleak in the pageattr code that I found while searching
> > for the bug Andi fixed in the second patch below (basically reference
> > counting in split page was done on the pmd instead of the pte).
> >
> >- Part of this patch is also needed to make the above work on x86
> >(otherwise
> > one of my new above BUGS() will trigger signalling the fact a bug was
> > there). The below patch creates a subtle dependency that (_PAGE_PCD <<
> > 24)
> > must not be zero. It's not the cleanest thing ever, but since it's an
> > hardware bitflag I doubt it's going to break.
> >
> >Signed-off-by: Andi Kleen <ak@suse.de>
> >Signed-off-by: Andrea Arcangeli <andrea@novell.com>
> >Signed-off-by: Andrew Morton <akpm@osdl.org>
> >---
> >
> > 25-akpm/arch/i386/mm/ioremap.c | 4 ++--
> > 25-akpm/arch/i386/mm/pageattr.c | 13 +++++++------
> > 25-akpm/arch/x86_64/mm/ioremap.c | 14 +++++++-------
> > 25-akpm/arch/x86_64/mm/pageattr.c | 23 ++++++++++++++---------
> > 4 files changed, 30 insertions(+), 24 deletions(-)
>
> is hitting this BUG() during bootup:
>
> /* memleak and potential failed 2M page regeneration */
> BUG_ON(!page_count(kpte_page));
>
> in 2.6.10-rc1-mm2.
>
> Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
> Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
> Memory: 511144k/524288k available (1856k kernel code, 12608k reserved,
> 1186k data, 164k init, 0k highmem)
> Checking if this processor honours the WP bit even in supervisor mode... Ok.
> Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/pageattr.c:136!
> invalid operand: 0000 [#1]
> SMP DEBUG_PAGEALLOC
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c0113f48>] Not tainted VLI
> EFLAGS: 00010046 (2.6.10-rc1-mm2)
> EIP is at __change_page_attr+0x28c/0x358
> eax: ffffffff ebx: 017ff163 ecx: 00000000 edx: c10001e0
> esi: 00000000 edi: c000fff8 ebp: c10001e0 esp: c03f9d98
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=c03f9000 task=c0345b40)
> Stack: c102ffe0 00000000 00000000 00000046 c0113ceb c17f9000 c102ff20
> c102ff20 017ff163 00000000 017ff163 00000000 c0113ceb c17fe000
> c102ffc0 c102ffc0 c1000000 c17ff000 c000fff8 017ff163 00000000
> 017ff163 00000000 00000000
> Call Trace:
> [<c0113ceb>] __change_page_attr+0x2f/0x358
> [<c0113ceb>] __change_page_attr+0x2f/0x358
> [<c011404a>] change_page_attr+0x36/0x54
> [<c0114148>] kernel_map_pages+0x30/0x5f
> [<c0137d80>] __alloc_pages+0x340/0x350
> [<c0137dad>] __get_free_pages+0x1d/0x30
> [<c013adfa>] kmem_getpages+0x26/0xd4
> [<c013c221>] cache_grow+0xb1/0x150
> [<c013c84a>] cache_alloc_refill+0x232/0x280
> [<c013ccbe>] kmem_cache_alloc+0x5a/0x78
> [<c01d4970>] idr_pre_get+0x1c/0x44
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c01572cc>] set_anon_super+0x10/0xb8
> [<c0156cff>] sget+0xb3/0x148
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c0157636>] get_sb_single+0x26/0x8c
> [<c0157608>] compare_single+0x0/0x8
> [<c01572bc>] set_anon_super+0x0/0xb8
> [<c0181c1d>] sysfs_get_sb+0x19/0x1d
> [<c0181b60>] sysfs_fill_super+0x0/0xa4
> [<c01576ea>] do_kern_mount+0x4e/0xd0
> [<c015777d>] kern_mount+0x11/0x15
> [<c0409962>] sysfs_init+0x1e/0x50
> [<c0409430>] mnt_init+0xb4/0xc0
> [<c040917a>] vfs_caches_init+0x7e/0x94
> [<c03fa831>] start_kernel+0x12d/0x150
> Code: c7 0f 75 f5 f0 ff 4d 04 eb 08 0f 0b 85 00 88 c1 2d c0 8b 45 00 89
> ea f6 c4 80 74 07 8b 55 0c 8d 74 26 00 8b 42 04 83 f8 ff 75 08 <0f> 0b
> 88 00 88 c1 2d c0 a1 ac 6c 34 c0 a8 08 0f 84 aa 00 00 00
>
> I'm tracking down now to see exactly what's going on. This just a
> regular, plain 4-way x86 box with 4GB of RAM. Removing that BUG_ON()
> lets me boot just fine.
you've a debugging option enabled.
I'm afraid somebody wrote common code with the hope that
change_page_attr had a natural universal API.
change_page_attr API has alwasy required you a certain level of symmetry
to work. Now I made it completley symmetric just to make it simpler to
use and I added BUG where the previous code would screwup.
If you do something like this you'll trigger a BUG_ON too:
change_page_attr(page, PAGE_KERNEL_NOCACHE)
change_page_attr(page, PAGE_KERNEL)
change_page_attr(page, PAGE_KERNEL)
the last one will BUG().
At least the new API with these changes will not leak memory anymore as
far as you retain simmetry (even if you run the change_page_attr on the
same page from different context).
I suspect that such debugging option has never worked right.
pageattr will need fixing so that to provide a natural universal API
that keeps track of each page. Previously this would have failed:
A change_page_attr(page1, UNCACHED) -> pte page_count = 2
B change_page_attr(page1, WHATEVER) -> pte page_count still 2
C change_page_attr(page2, UNCACHED) -> pte page count = 3
A change_page_attr(page1, PAGE_KERNEL) -> pte page count = 2
B change_page_attr(page1, PAGE_KERNEL) -> pte page count = 1 ->screwup
Now the above can work fine. But still examples like the above one will
trigger bugs since they're simply violating the current restricted
change_page_attr API.
I believe if you disable the DEBUG_PAGEALLOC it should work fine again.
Still I recommend investigating _why_ debug_pagealloc is violating the
API. It might not be necessary to wait for the pageattr universal
feature to make DEBUG_PAGEALLOC work safe.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next prev parent reply other threads:[~2004-11-03 0:48 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-11-02 21:21 fix iounmap and a pageattr memleak (x86 and x86-64) Dave Hansen
2004-11-02 21:21 ` Dave Hansen
2004-11-02 22:07 ` Andrea Arcangeli [this message]
2004-11-02 22:07 ` Andrea Arcangeli
2004-11-02 22:21 ` Dave Hansen
2004-11-02 22:21 ` Dave Hansen
2004-11-02 22:29 ` Andrew Morton
2004-11-02 22:29 ` Andrew Morton
2004-11-02 22:34 ` Dave Hansen
2004-11-02 22:34 ` Dave Hansen
2004-11-03 0:54 ` Andrea Arcangeli
2004-11-03 0:54 ` Andrea Arcangeli
2004-11-02 22:45 ` Dave Hansen
2004-11-02 23:00 ` Dave Hansen
2004-11-02 23:00 ` Dave Hansen
2004-11-03 1:35 ` Andrea Arcangeli
2004-11-03 1:35 ` Andrea Arcangeli
2004-11-03 1:43 ` Dave Hansen
2004-11-03 1:43 ` Dave Hansen
2004-11-03 2:26 ` Andrea Arcangeli
2004-11-03 2:26 ` Andrea Arcangeli
2004-11-03 2:48 ` Dave Hansen
2004-11-03 2:48 ` Dave Hansen
2004-11-03 3:05 ` Andrea Arcangeli
2004-11-03 3:05 ` Andrea Arcangeli
2004-11-03 19:37 ` Dave Hansen
2004-11-03 19:37 ` Dave Hansen
2004-11-05 0:02 ` Dave Hansen
2004-11-05 0:40 ` Dave Hansen
2004-11-05 0:53 ` Andrea Arcangeli
2004-11-05 0:53 ` Andrea Arcangeli
2004-11-05 1:55 ` Dave Hansen
2004-11-05 1:55 ` Dave Hansen
2004-11-05 2:08 ` Andrea Arcangeli
2004-11-05 2:08 ` Andrea Arcangeli
2004-11-05 2:23 ` Dave Hansen
2004-11-05 4:03 ` Andrea Arcangeli
2004-11-05 4:03 ` Andrea Arcangeli
2004-11-05 4:20 ` Andrea Arcangeli
2004-11-05 4:20 ` Andrea Arcangeli
2004-11-02 23:04 ` Andrew Morton
2004-11-02 23:04 ` Andrew Morton
2004-11-03 1:40 ` Andrea Arcangeli
2004-11-03 1:40 ` Andrea Arcangeli
2004-11-02 22:34 ` Jason Baron
2004-11-02 22:34 ` Jason Baron
2004-11-02 23:12 ` Andrea Arcangeli
2004-11-02 23:12 ` Andrea Arcangeli
-- strict thread matches above, loose matches on Subject: below --
2004-10-28 19:21 Andrea Arcangeli
2004-11-05 8:07 ` Andrea Arcangeli
2004-11-05 8:31 ` Andi Kleen
2004-11-05 8:49 ` Andrea Arcangeli
2005-02-14 23:15 ` Andrea Arcangeli
2005-02-15 10:39 ` Andi Kleen
2005-02-15 10:48 ` Andrea Arcangeli
2005-02-15 10:51 ` Andi Kleen
2005-02-15 11:11 ` Andrea Arcangeli
2005-02-15 13:14 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20041102220720.GV3571@dualathlon.random \
--to=andrea@novell.com \
--cc=ak@suse.de \
--cc=akpm@osdl.org \
--cc=haveblue@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.