* [SPARC32] NULL pointer derefference
@ 2007-07-30 2:18 Mark Fortescue
2007-07-30 4:19 ` David Miller
2007-07-31 5:10 ` Ollie Wild
0 siblings, 2 replies; 8+ messages in thread
From: Mark Fortescue @ 2007-07-30 2:18 UTC (permalink / raw)
To: Ollie Wild
Cc: Andrew Morton, linux-arch, sparclinux, wli, linux-mm,
linux-kernel, davem
Hi All,
Unfortunatly Sparc32 sun4c low level memory management apears to be
incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
mm: variable length argument support.
For some reason, this commit corrupts the memory used by the low level
context/pte handling ring buffers in arch/sparc/mm/sun4c (in
add_ring_ordered, head->next becomes set to a NULL pointer).
I had a quick look at http://www.linux-mm.org to see if there were any
diagrams that show what is going on in the memory management systems, to
see if there was something that I could use to help me work out what is
going on, but I could not see any.
Can any one help?
Regards
Mark Fortescue.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [SPARC32] NULL pointer derefference
2007-07-30 2:18 [SPARC32] NULL pointer derefference Mark Fortescue
@ 2007-07-30 4:19 ` David Miller
2007-07-30 10:39 ` Mark Fortescue
2007-07-31 5:35 ` Mark Fortescue
2007-07-31 5:10 ` Ollie Wild
1 sibling, 2 replies; 8+ messages in thread
From: David Miller @ 2007-07-30 4:19 UTC (permalink / raw)
To: mark; +Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
From: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)
> Unfortunatly Sparc32 sun4c low level memory management apears to be
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.
>
> For some reason, this commit corrupts the memory used by the low level
> context/pte handling ring buffers in arch/sparc/mm/sun4c (in
> add_ring_ordered, head->next becomes set to a NULL pointer).
>
> I had a quick look at http://www.linux-mm.org to see if there were any
> diagrams that show what is going on in the memory management systems, to
> see if there was something that I could use to help me work out what is
> going on, but I could not see any.
One possible issue is sequencing, perhaps the stack argument copy
is occuring before the new context is setup properly on sun4c.
Another issue might be the new flush_cache_page() call in this
new code in fs/exec.c, there are now cases where flush_cache_page()
will be called on kernel addresses, and sun4c's implementation might
not like that at all.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [SPARC32] NULL pointer derefference
2007-07-30 4:19 ` David Miller
@ 2007-07-30 10:39 ` Mark Fortescue
2007-07-31 5:35 ` Mark Fortescue
1 sibling, 0 replies; 8+ messages in thread
From: Mark Fortescue @ 2007-07-30 10:39 UTC (permalink / raw)
To: David Miller
Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
Hi David,
Thanks for the comments.
On Sun, 29 Jul 2007, David Miller wrote:
> From: Mark Fortescue <mark@mtfhpc.demon.co.uk>
> Date: Mon, 30 Jul 2007 03:18:42 +0100 (BST)
>
>> Unfortunatly Sparc32 sun4c low level memory management apears to be
>> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
>> mm: variable length argument support.
>>
>> For some reason, this commit corrupts the memory used by the low level
>> context/pte handling ring buffers in arch/sparc/mm/sun4c (in
>> add_ring_ordered, head->next becomes set to a NULL pointer).
>>
>> I had a quick look at http://www.linux-mm.org to see if there were any
>> diagrams that show what is going on in the memory management systems, to
>> see if there was something that I could use to help me work out what is
>> going on, but I could not see any.
>
> One possible issue is sequencing, perhaps the stack argument copy
> is occuring before the new context is setup properly on sun4c.
>
I will see if I can generate some debug code to check out this posibility.
> Another issue might be the new flush_cache_page() call in this
> new code in fs/exec.c, there are now cases where flush_cache_page()
> will be called on kernel addresses, and sun4c's implementation might
> not like that at all.
>
I backed the commit out of my latest git pull (app 2am this morning) and I
end up with a working kernel so this confirms that is is somthing
specific to this patch.
I will try adding in a flush_cache_page() at an appropriate point on the
pre-commit version of the code to see if that makes a mess of things.
Regards
Mark Fortescue.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [SPARC32] NULL pointer derefference
2007-07-30 2:18 [SPARC32] NULL pointer derefference Mark Fortescue
2007-07-30 4:19 ` David Miller
@ 2007-07-31 5:10 ` Ollie Wild
1 sibling, 0 replies; 8+ messages in thread
From: Ollie Wild @ 2007-07-31 5:10 UTC (permalink / raw)
To: Mark Fortescue
Cc: Andrew Morton, linux-arch, sparclinux, wli, linux-mm,
linux-kernel, davem
On 7/29/07, Mark Fortescue <mark@mtfhpc.demon.co.uk> wrote:
> Hi All,
>
> Unfortunatly Sparc32 sun4c low level memory management apears to be
> incompatible with commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
> mm: variable length argument support.
I feel like I ought to help out with this since it's my change which
broke things, but I don't have access to a Sparc32 box. Does anyone
have a remotely rebootable machine I can use?
Ollie
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [SPARC32] NULL pointer derefference
2007-07-30 4:19 ` David Miller
2007-07-30 10:39 ` Mark Fortescue
@ 2007-07-31 5:35 ` Mark Fortescue
2007-07-31 6:42 ` David Miller
1 sibling, 1 reply; 8+ messages in thread
From: Mark Fortescue @ 2007-07-31 5:35 UTC (permalink / raw)
To: David Miller
Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
Hi David,
>
> One possible issue is sequencing, perhaps the stack argument copy
> is occuring before the new context is setup properly on sun4c.
>
I think it is somthing related to this but too much has changed for me to
work out what is going on. At present, I don't have a good enough
understanding of the virtual memory system and how it interracts with the
sun4c mmu.
The original code did a job lot of pte stuf in install_arg_page. The new
code seems to replace this using get_user_pages but I have not worked out
how get_user_pages gets to the point at which it allocated pte's i.e.
maps the stack memory it is about to put the arguments into.
> Another issue might be the new flush_cache_page() call in this
> new code in fs/exec.c, there are now cases where flush_cache_page()
> will be called on kernel addresses, and sun4c's implementation might
> not like that at all.
I commented out the flush_cache_page callmade in the new code. This had no
effect on the problem. Other tests have shown it is breaking earlier than
this.
I am going to try to narrow down exactly where the pointer gets messed up
as this should help.
Regards
Mark Fortescue.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [SPARC32] NULL pointer derefference
2007-07-31 5:35 ` Mark Fortescue
@ 2007-07-31 6:42 ` David Miller
2007-07-31 7:55 ` [PATCH] " Mark Fortescue
0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2007-07-31 6:42 UTC (permalink / raw)
To: mark; +Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
From: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date: Tue, 31 Jul 2007 06:35:29 +0100 (BST)
> The original code did a job lot of pte stuf in install_arg_page. The
> new code seems to replace this using get_user_pages but I have not
> worked out how get_user_pages gets to the point at which it
> allocated pte's i.e. maps the stack memory it is about to put the
> arguments into.
get_user_pages() essentially walks through the requested user address
space, faults in pages if necessary, and returns references to those
pages.
The logic of get_user_pages() you need to be concerned about is
this inner loop:
while (!(page = follow_page(vma, start, foll_flags))) {
int ret;
ret = handle_mm_fault(mm, vma, start,
foll_flags & FOLL_WRITE);
...
}
handle_mm_fault() does all the dirty work of a page fault, and
is how we get to update_mmu_cache(), the sun4c implementation of
which is where you see the crash.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Re: [SPARC32] NULL pointer derefference
2007-07-31 6:42 ` David Miller
@ 2007-07-31 7:55 ` Mark Fortescue
2007-07-31 9:02 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: Mark Fortescue @ 2007-07-31 7:55 UTC (permalink / raw)
To: David Miller
Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 891 bytes --]
Hi David,
I have formulated a patch that prevents the update_mmu_cache from doing
enything if there is no context available. This apears to have no
immediate, undesirable side effects.
This worked better than the alternative of setting up a context to work with.
Can you for see any issues in doing this?
If not, can you check+apply the attached (un-mangled) patch.
diff -ruNpd linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.000000000 +0100
+++ linux-test/arch/sparc/mm/sun4c.c 2007-07-31 08:28:13.000000000 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;
+ if (vma->vm_mm->context == NO_CONTEXT)
+ return;
+
local_irq_save(flags);
address &= PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {
Regards
Mark Fortescue.
[-- Attachment #2: Type: TEXT/PLAIN, Size: 1184 bytes --]
From: Mark Fortescue <mark@mtfhpc.demon.co.uk>
This deals with a sun4c issue caused by commit b6a2fea39318e43fee84fa7b0b90d68bed92d2ba:
mm: variable length argument support.
The new way the code works means that sun4c_update_mmu_cache gets called before a context
has been selected, which results in invalid operation of the underling mm code.
Simply ignoring update requests when there is no valid context solves the problem.
Signed-off-by Mark Fortescue <mark@mtfhpc.demon.co.uk>
---
This worked better than the alternative of setting up a context to work with.
I definatly need to spend some time writting up the sun4c MMU and how Linux code uses it.
diff -ruNpd -x '.[a-z]*' linux-2.6/arch/sparc/mm/sun4c.c linux-test/arch/sparc/mm/sun4c.c
--- linux-2.6/arch/sparc/mm/sun4c.c 2007-07-30 03:19:15.000000000 +0100
+++ linux-test/arch/sparc/mm/sun4c.c 2007-07-31 08:28:13.000000000 +0100
@@ -1999,6 +2029,9 @@ void sun4c_update_mmu_cache(struct vm_ar
unsigned long flags;
int pseg;
+ if (vma->vm_mm->context == NO_CONTEXT)
+ return;
+
local_irq_save(flags);
address &= PAGE_MASK;
if ((pseg = sun4c_get_segmap(address)) == invalid_segment) {
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Re: [SPARC32] NULL pointer derefference
2007-07-31 7:55 ` [PATCH] " Mark Fortescue
@ 2007-07-31 9:02 ` David Miller
0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2007-07-31 9:02 UTC (permalink / raw)
To: mark; +Cc: aaw, akpm, linux-arch, sparclinux, wli, linux-mm, linux-kernel
From: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date: Tue, 31 Jul 2007 08:55:20 +0100 (BST)
> I have formulated a patch that prevents the update_mmu_cache from doing
> enything if there is no context available. This apears to have no
> immediate, undesirable side effects.
>
> This worked better than the alternative of setting up a context to work with.
>
> Can you for see any issues in doing this?
>
> If not, can you check+apply the attached (un-mangled) patch.
Thanks for tracking this down Mark.
The issue is that, when exec()'ing to userspace from a kernel thread,
we need activate_context() to be invoked before we try to touch
userspace at all. This new argument handling is invoking
get_user_pages() before that happens.
activate_context() happens via flush_old_exec(), but that occurs via
load_elf_binary() et al. which is long after the argument fetching
code runs in fs/exec.c that is using get_user_pages().
(Mark, hint: activate_context() is defined to switch_mm() on
sparc32, which is sun4c_switch_mm() which you thought was only
invoked from context switches :-))
Touching userspace before activate_context() is questionable at best,
in my opinion. But I can't come up with a good way to fix this right
now other than Mark's sparc patch, so I will apply it.
Thanks again Mark!
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-07-31 9:02 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-30 2:18 [SPARC32] NULL pointer derefference Mark Fortescue
2007-07-30 4:19 ` David Miller
2007-07-30 10:39 ` Mark Fortescue
2007-07-31 5:35 ` Mark Fortescue
2007-07-31 6:42 ` David Miller
2007-07-31 7:55 ` [PATCH] " Mark Fortescue
2007-07-31 9:02 ` David Miller
2007-07-31 5:10 ` Ollie Wild
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).