* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
@ 2007-05-17 11:38 ` Keith Owens
2007-05-17 13:00 ` Robin Holt
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Keith Owens @ 2007-05-17 11:38 UTC (permalink / raw)
To: linux-ia64
Robin Holt (on Thu, 17 May 2007 06:16:52 -0500) wrote:
>Make ia64's get_wchan safer by not unwinding a running tasks stack.
>...
>All that said, I have put together the following simple patch stolen
>directly from i386's get_wchan. If the task is running, why even try.
>
>
>Index: linux-tot-20070517/arch/ia64/kernel/process.c
>=================================>--- linux-tot-20070517.orig/arch/ia64/kernel/process.c 2007-05-17 05:39:54.000000000 -0500
>+++ linux-tot-20070517/arch/ia64/kernel/process.c 2007-05-17 05:44:26.820535382 -0500
>@@ -763,6 +763,9 @@ get_wchan (struct task_struct *p)
> unsigned long ip;
> int count = 0;
>
>+ if (!p || p = current || p->state = TASK_RUNNING)
>+ return 0;
>+
> /*
> * Note: p may not be a blocked task (it could be current or
> * another process running on some other CPU. Rather than
AFAICT there is no lock on struct task_struct p that stops it being
scheduled to run after you test if it is running. proc_task_lookup()
only does get_task_struct() which prevents the task from being deleted,
it does not prevent the task from being scheduled while you are looking
at it.
So even with that check, it can race between not running and running
while you do the unwind, and still get the MCA.
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
2007-05-17 11:38 ` Keith Owens
@ 2007-05-17 13:00 ` Robin Holt
2007-05-17 13:05 ` Keith Owens
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Robin Holt @ 2007-05-17 13:00 UTC (permalink / raw)
To: linux-ia64
On Thu, May 17, 2007 at 09:38:59PM +1000, Keith Owens wrote:
> Robin Holt (on Thu, 17 May 2007 06:16:52 -0500) wrote:
...
> AFAICT there is no lock on struct task_struct p that stops it being
> scheduled to run after you test if it is running. proc_task_lookup()
> only does get_task_struct() which prevents the task from being deleted,
> it does not prevent the task from being scheduled while you are looking
> at it.
>
> So even with that check, it can race between not running and running
> while you do the unwind, and still get the MCA.
I realized the problem, but don't see an easy fix. I think the hope
is we would be far enough back in the call trace so that portion of the
stack would be constant.
To me, this still seems like a reasonable check. Do you object to
this patch or are you just pointing out the hole. I see the other
architectures have the same problem. Would you suggest repeating the
check before each call to unw_unwind()?
Thanks,
Robin
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
2007-05-17 11:38 ` Keith Owens
2007-05-17 13:00 ` Robin Holt
@ 2007-05-17 13:05 ` Keith Owens
2007-05-17 14:16 ` David Mosberger-Tang
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Keith Owens @ 2007-05-17 13:05 UTC (permalink / raw)
To: linux-ia64
Robin Holt (on Thu, 17 May 2007 08:00:07 -0500) wrote:
>On Thu, May 17, 2007 at 09:38:59PM +1000, Keith Owens wrote:
>> So even with that check, it can race between not running and running
>> while you do the unwind, and still get the MCA.
>
>To me, this still seems like a reasonable check. Do you object to
>this patch or are you just pointing out the hole.
No objection to the patch, but the race means that any test on the
task's status is going to be an incomplete fix. David Mosberger
reckons that unwind should never cause an error, maybe we should be
looking at adding more checks to the unwind code to cope with spurious
addresses?
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (2 preceding siblings ...)
2007-05-17 13:05 ` Keith Owens
@ 2007-05-17 14:16 ` David Mosberger-Tang
2007-05-18 3:02 ` Robin Holt
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: David Mosberger-Tang @ 2007-05-17 14:16 UTC (permalink / raw)
To: linux-ia64
On 5/17/07, Keith Owens <kaos@sgi.com> wrote:
> David Mosberger
> reckons that unwind should never cause an error, maybe we should be
> looking at adding more checks to the unwind code to cope with spurious
> addresses?
That's correct. If the unwinder causes MCAs, it's broken. Robin, can
you look into why the memory-access safety-checks in the unwinder
aren't sufficient to avoid the MCAs you're seeing?
--david
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (3 preceding siblings ...)
2007-05-17 14:16 ` David Mosberger-Tang
@ 2007-05-18 3:02 ` Robin Holt
2007-05-18 18:23 ` David Mosberger-Tang
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Robin Holt @ 2007-05-18 3:02 UTC (permalink / raw)
To: linux-ia64
On Thu, May 17, 2007 at 08:16:55AM -0600, David Mosberger-Tang wrote:
> On 5/17/07, Keith Owens <kaos@sgi.com> wrote:
>
> >David Mosberger
> >reckons that unwind should never cause an error, maybe we should be
> >looking at adding more checks to the unwind code to cope with spurious
> >addresses?
>
> That's correct. If the unwinder causes MCAs, it's broken. Robin, can
> you look into why the memory-access safety-checks in the unwinder
> aren't sufficient to avoid the MCAs you're seeing?
I don't think it got very far at all.
The task in question is calling get_wchan on itself. It is at
>> px *(task_struct *)0xe003819a00000000 | grep ksp
ksp = 0xe003819a00007900
>> px 0xe003819a00007900 + 16
0xe003819a00007910
>> px *(switch_stack *)0xe003819a00007910 | grep bsp
ar_bspstore = 0xe003819a00000000
Here we start to run into difficulties. ar_bspstore is the same address
as our task_struct. info->regstk.top = 0xe003819a00000000 which leads
to unw_init_frame_info calculating info->bsp = 0xe0038199ffffff30
which is near the addresses causing problems (0xe0038199ffffff80 and
0xe0038199ffffffe0). Notice it is in the page before our task_struct.
Well, time for bed.
Thanks,
Robin
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (4 preceding siblings ...)
2007-05-18 3:02 ` Robin Holt
@ 2007-05-18 18:23 ` David Mosberger-Tang
2007-05-18 18:35 ` Robin Holt
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: David Mosberger-Tang @ 2007-05-18 18:23 UTC (permalink / raw)
To: linux-ia64
Hmmh, I thought we had a little helper-routine to check
memory-validity before dereferencing a pointer, but I seem to be
mixing up the (old) kernel unwinder with the one based on libunwind.
Maybe your problem will convince everybody that the libunwind-based
unwinder should be merged into mainline? ;-) I posted the patch a
couple of months ago to the kernel list. Probably it would need to be
freshened-up a bit. Tony, what do you think?
--david
On 5/17/07, Robin Holt <holt@sgi.com> wrote:
> On Thu, May 17, 2007 at 08:16:55AM -0600, David Mosberger-Tang wrote:
> > On 5/17/07, Keith Owens <kaos@sgi.com> wrote:
> >
> > >David Mosberger
> > >reckons that unwind should never cause an error, maybe we should be
> > >looking at adding more checks to the unwind code to cope with spurious
> > >addresses?
> >
> > That's correct. If the unwinder causes MCAs, it's broken. Robin, can
> > you look into why the memory-access safety-checks in the unwinder
> > aren't sufficient to avoid the MCAs you're seeing?
>
> I don't think it got very far at all.
>
> The task in question is calling get_wchan on itself. It is at
> >> px *(task_struct *)0xe003819a00000000 | grep ksp
> ksp = 0xe003819a00007900
> >> px 0xe003819a00007900 + 16
> 0xe003819a00007910
> >> px *(switch_stack *)0xe003819a00007910 | grep bsp
> ar_bspstore = 0xe003819a00000000
>
>
> Here we start to run into difficulties. ar_bspstore is the same address
> as our task_struct. info->regstk.top = 0xe003819a00000000 which leads
> to unw_init_frame_info calculating info->bsp = 0xe0038199ffffff30
> which is near the addresses causing problems (0xe0038199ffffff80 and
> 0xe0038199ffffffe0). Notice it is in the page before our task_struct.
>
> Well, time for bed.
>
> Thanks,
> Robin
>
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (5 preceding siblings ...)
2007-05-18 18:23 ` David Mosberger-Tang
@ 2007-05-18 18:35 ` Robin Holt
2007-05-18 23:01 ` Luck, Tony
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Robin Holt @ 2007-05-18 18:35 UTC (permalink / raw)
To: linux-ia64
On Thu, May 17, 2007 at 10:02:45PM -0500, Robin Holt wrote:
> On Thu, May 17, 2007 at 08:16:55AM -0600, David Mosberger-Tang wrote:
> > On 5/17/07, Keith Owens <kaos@sgi.com> wrote:
> >
> > >David Mosberger
> > >reckons that unwind should never cause an error, maybe we should be
> > >looking at adding more checks to the unwind code to cope with spurious
> > >addresses?
> >
> > That's correct. If the unwinder causes MCAs, it's broken. Robin, can
> > you look into why the memory-access safety-checks in the unwinder
> > aren't sufficient to avoid the MCAs you're seeing?
>
> I don't think it got very far at all.
>
> The task in question is calling get_wchan on itself. It is at
> >> px *(task_struct *)0xe003819a00000000 | grep ksp
> ksp = 0xe003819a00007900
> >> px 0xe003819a00007900 + 16
> 0xe003819a00007910
> >> px *(switch_stack *)0xe003819a00007910 | grep bsp
> ar_bspstore = 0xe003819a00000000
>
>
> Here we start to run into difficulties. ar_bspstore is the same address
> as our task_struct. info->regstk.top = 0xe003819a00000000 which leads
> to unw_init_frame_info calculating info->bsp = 0xe0038199ffffff30
> which is near the addresses causing problems (0xe0038199ffffff80 and
> 0xe0038199ffffffe0). Notice it is in the page before our task_struct.
I think I have everything figured out now. Address range for our tasks
switch stack is 0xe001849a00007910 to 0xe001849a00007b20.
Or unw_frame_info structure allocated by get_wchan() on the memory stack
happens to reside at 0xe001849a00007b20 to 0xe001849a00007ce8.
Assume we are in get_wchan and r12 = 0xe001849a00007b20. We take
an interrupt. The switch stack gets allocated on the memory stack in
the address ranges above. Upon return from the interrupt, we proceed
to call unw_init_from_blocked_task() which called unw_init_frame_info().
unw_init_frame_info does:
0xa000000100041aa0 <unw_init_frame_info>: [MMI] alloc r36=ar.pfs,9,6,0
0xa000000100041aa6 <unw_init_frame_info+0x6>: adds r12=-16,r12
0xa000000100041aac <unw_init_frame_info+0xc>: mov r35°
0xa000000100041ab0 <unw_init_frame_info+0x10>: [MII] nop.m 0x0
0xa000000100041ab6 <unw_init_frame_info+0x16>: mov r38=r32;;
0xa000000100041abc <unw_init_frame_info+0x1c>: adds r9\x16,r12
0xa000000100041ac0 <unw_init_frame_info+0x20>: [MMI] mov r39=r0
0xa000000100041ac6 <unw_init_frame_info+0x26>: nop.m 0x0
0xa000000100041acc <unw_init_frame_info+0x2c>: mov r40E6;;
0xa000000100041ad0 <unw_init_frame_info+0x30>: [MIB] st8 [r9]=r33
which ends up placing r33 (struct task_struct *) onto the stack at
exactly the location of the no longer valid switch_stack struct pointed
to by this threads ->ksp.
This comes down to we need to take an interrupt in get_wchan when called
on our own task between the time when r12 is updated to allocate the
unw_frame_info structure and when unw_init_from_blocked_task() is called.
Seeing how that is only a few instructions, I would expect this to be
a fairly small window of opportunity.
I am going to submit two patches. One which improves the error checking
in the unwind functions. The other is essentially the patch I produced
yesterday.
Thanks,
Robin
^ permalink raw reply [flat|nested] 12+ messages in thread* RE: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (6 preceding siblings ...)
2007-05-18 18:35 ` Robin Holt
@ 2007-05-18 23:01 ` Luck, Tony
2007-05-19 2:08 ` Robin Holt
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Luck, Tony @ 2007-05-18 23:01 UTC (permalink / raw)
To: linux-ia64
If the problem *only* occurs when trying to get_wchan on the
current task (a futile endevour) then the original patch
should be fine all by itself ... there are no races involved
the current task is going to remain in TASK_RUNNING state at
any point that it happens to look at itself.
A quick scan of my e-mail archives shows an updated libunwind
patch from you dated August 2005 ... which is a lot more than
a couple of months ago :-) I assume that I've missed some more
recent posting.
-Tony
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (7 preceding siblings ...)
2007-05-18 23:01 ` Luck, Tony
@ 2007-05-19 2:08 ` Robin Holt
2007-05-19 2:26 ` David Mosberger-Tang
2007-05-23 3:32 ` Nick Piggin
10 siblings, 0 replies; 12+ messages in thread
From: Robin Holt @ 2007-05-19 2:08 UTC (permalink / raw)
To: linux-ia64
On Fri, May 18, 2007 at 04:01:21PM -0700, Luck, Tony wrote:
> If the problem *only* occurs when trying to get_wchan on the
> current task (a futile endevour) then the original patch
> should be fine all by itself ... there are no races involved
> the current task is going to remain in TASK_RUNNING state at
> any point that it happens to look at itself.
The problem I was seeing was only on the current task, but we could
have two seperate threads creating that situation. I could certain see
the argument for adding the sanity checks to the current unwind code or
verifying that libunwind does not suffer the same race conditions.
I will leave it to your discretion.
Thanks,
Robin
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (8 preceding siblings ...)
2007-05-19 2:08 ` Robin Holt
@ 2007-05-19 2:26 ` David Mosberger-Tang
2007-05-23 3:32 ` Nick Piggin
10 siblings, 0 replies; 12+ messages in thread
From: David Mosberger-Tang @ 2007-05-19 2:26 UTC (permalink / raw)
To: linux-ia64
I don't think the original patch solves the problem. The issue is
that even if a task is stopped at the time unwinding is initiated, it
may be resumed later on (e.g., by another CPU), so the unwinding MUST
be safe even on a running task. get_wchan() unwinding the currently
running task is a special case, but fixing just that special case will
only make it harder to debug and fix the fundamental bug later on.
August 2005 sounds about right, I'm afraid. I'd love to update the
patch, but with Summer travel coming up etc., it won't happen before
July, for sure. It would be best if somebody else could champion the
libunwind-in-the-kernel effort. It's really needed but my time
available for this project will remain spotty for a while...
--david
On 5/18/07, Luck, Tony <tony.luck@intel.com> wrote:
> If the problem *only* occurs when trying to get_wchan on the
> current task (a futile endevour) then the original patch
> should be fine all by itself ... there are no races involved
> the current task is going to remain in TASK_RUNNING state at
> any point that it happens to look at itself.
>
> A quick scan of my e-mail archives shows an updated libunwind
> patch from you dated August 2005 ... which is a lot more than
> a couple of months ago :-) I assume that I've missed some more
> recent posting.
>
> -Tony
>
--
Mosberger Consulting LLC, http://www.mosberger-consulting.com/
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH] get_wchan on running task sometimes MCAs the machine.
2007-05-17 11:16 [PATCH] get_wchan on running task sometimes MCAs the machine Robin Holt
` (9 preceding siblings ...)
2007-05-19 2:26 ` David Mosberger-Tang
@ 2007-05-23 3:32 ` Nick Piggin
10 siblings, 0 replies; 12+ messages in thread
From: Nick Piggin @ 2007-05-23 3:32 UTC (permalink / raw)
To: linux-ia64
Hi, sorry not to do a proper threaded reply, but I can't find an mbox
of the relevant message (or know another good way to do it).
Well I don't presume to know the ia64 unwind code, however from the
sounds of it, we _might_ be able to get away with simply blocking the
task from running at the scheduler, during the unwind? Note this
requires we have interrupts disabled and the rq lock held, so it may
not work if the unwinder does anything nontrivial...
Longer term, it seems better to make the unwinder more robust, but
maybe this gives us a fix until then? Just an idea (untested!)
---
Index: linux-2.6/arch/ia64/kernel/process.c
=================================--- linux-2.6.orig/arch/ia64/kernel/process.c
+++ linux-2.6/arch/ia64/kernel/process.c
@@ -760,7 +760,9 @@ unsigned long
get_wchan (struct task_struct *p)
{
struct unw_frame_info info;
+ unsigned long flags;
unsigned long ip;
+ unsigned long ret = 0;
int count = 0;
/*
@@ -771,15 +773,24 @@ get_wchan (struct task_struct *p)
* gracefully if the process wasn't really blocked after all.
* --davidm 99/12/15
*/
+ if (!lock_blocked_task(p, &flags))
+ goto out;
+
unw_init_from_blocked_task(&info, p);
do {
if (unw_unwind(&info) < 0)
- return 0;
+ goto unlock;
unw_get_ip(&info, &ip);
- if (!in_sched_functions(ip))
- return ip;
+ if (!in_sched_functions(ip)) {
+ ret = ip;
+ goto unlock;
+ }
} while (count++ < 16);
- return 0;
+
+unlock:
+ unlock_blocked_task(p, &flags);
+out:
+ return ret;
}
void
Index: linux-2.6/include/linux/sched.h
=================================--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -1332,6 +1332,8 @@ extern void FASTCALL(wake_up_new_task(st
#endif
extern void FASTCALL(sched_fork(struct task_struct * p, int clone_flags));
extern void FASTCALL(sched_exit(struct task_struct * p));
+extern int lock_blocked_task(struct task_struct *p, unsigned long *flags);
+extern void unlock_blocked_task(struct task_struct *p, unsigned long *flags);
extern int in_group_p(gid_t);
extern int in_egroup_p(gid_t);
Index: linux-2.6/kernel/sched.c
=================================--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -1177,6 +1177,25 @@ repeat:
task_rq_unlock(rq, &flags);
}
+int lock_blocked_task(struct task_struct *p, unsigned long *flags)
+{
+ struct rq *rq;
+
+ rq = task_rq_lock(p, flags);
+ if (p->state = TASK_RUNNING || task_running(rq, p)) {
+ task_rq_unlock(rq, flags);
+ return 0;
+ }
+
+ return 1;
+}
+
+void unlock_blocked_task(struct task_struct *p, unsigned long *flags)
+{
+ task_rq_unlock(task_rq(p), flags);
+}
+
+
/***
* kick_process - kick a running thread to enter/exit the kernel
* @p: the to-be-kicked thread
^ permalink raw reply [flat|nested] 12+ messages in thread