public inbox for linux-kernel@vger.kernel.org
* Re: ptrace patch fails stress testing
@ 2003-04-03 15:22 James Cownie
  2003-04-03 19:53 ` Chris Wright
  0 siblings, 1 reply; 6+ messages in thread
From: James Cownie @ 2003-04-03 15:22 UTC (permalink / raw)
  To: linux-kernel

Alan wrote :-

> On Tue, 2003-04-01 at 19:22, linas@austin.ibm.com wrote:
> > The problem appears to be that task->mm is dereferenced without
> > looking to see if mm is NULL. e.g. in the sched.h in the
> > is_dumpable() macro, we have task->mm->dumpable . I'm sitting
> > in front of a KDB session and I'm clearly looking at task->mm
> > which is NULL.
> > Why, how and under what conditions this race condition occurs,
> > I don't know. What the best fix is, I don't know.
> 
> Zombie process. The patch checks ->mm but must also check ->mm != NULL
> first.

We're seeing this 100% reliably with our TotalView debugger, and as
Alan suggests it happens when trying to make a ptrace call on a zombie
process.

FWIW the oops looks like this 

  >>EIP; c01197f3 <ptrace_check_attach+13/50>   <=====
  Trace; c0109bc6 <sys_ptrace+ba/580>
  Trace; c0106cb8 <error_code+34/3c>
  Trace; c0106bc7 <system_call+33/38>
  Code;  c01197f3 <ptrace_check_attach+13/50>
  00000000 <_EIP>:
  Code;  c01197f3 <ptrace_check_attach+13/50>   <=====
     0:   f6 40 7c 01               testb  $0x1,0x7c(%eax)   <=====
  Code;  c01197f7 <ptrace_check_attach+17/50>
     4:   75 07                     jne    d <_EIP+0xd> c0119800 <ptrace_check_attach+20/50>
  Code;  c01197f9 <ptrace_check_attach+19/50>
     6:   b8 ff ff ff ff            mov    $0xffffffff,%eax
  Code;  c01197fe <ptrace_check_attach+1e/50>
     b:   c3                        ret    
  Code;  c01197ff <ptrace_check_attach+1f/50>
     c:   90                        nop    
  Code;  c0119800 <ptrace_check_attach+20/50>
     d:   f6 42 18 01               testb  $0x1,0x18(%edx)
  Code;  c0119804 <ptrace_check_attach+24/50>
    11:   75 0a                     jne    1d <_EIP+0x1d> c0119810 <ptrace_check_attach+30/50>
  Code;  c0119806 <ptrace_check_attach+26/50>
    13:   b8 00 00 00 00            mov    $0x0,%eax

which corresponds to dereferencing a NULL mm.

Following Alan, the fix, then, is to have is_dumpable look like this :-

#define is_dumpable(tsk)	((tsk)->task_dumpable && (tsk)->mm && (tsk)->mm->dumpable)

(and be prepared in user space to get EPERM back from some ptrace
calls which previously "worked" ok.)
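
A minimal user-space sketch of why the extra test helps. The types
mock_mm and mock_task and the helper mock_may_trace() are stand-ins
invented for illustration; the real kernel structures have many more
fields. Because && short-circuits, (tsk)->mm->dumpable is never
evaluated when (tsk)->mm is NULL:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * that the macro touches are modelled here. */
struct mock_mm {
	int dumpable;
};

struct mock_task {
	int task_dumpable;
	struct mock_mm *mm;	/* NULL for a zombie, as discussed above */
};

/* Same shape as the macro above: the mm pointer is tested before it
 * is dereferenced. */
#define is_dumpable(tsk) \
	((tsk)->task_dumpable && (tsk)->mm && (tsk)->mm->dumpable)

/* Returns 1 if the mock task passes the check, 0 otherwise. */
static int mock_may_trace(const struct mock_task *tsk)
{
	return is_dumpable(tsk) ? 1 : 0;
}
```

With a mock task whose mm is NULL, mock_may_trace() simply returns 0
instead of faulting, which is the case where the caller would then
see the ptrace call refused.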

-- Jim 

James Cownie	<jcownie@etnus.com>
Etnus, LLC.     +44 117 9071438
http://www.etnus.com

* ptrace patch fails stress testing
@ 2003-04-01 18:22 linas
  2003-04-01 21:25 ` John M Flinchbaugh
  2003-04-02 11:49 ` Alan Cox
  0 siblings, 2 replies; 6+ messages in thread
From: linas @ 2003-04-01 18:22 UTC (permalink / raw)
  To: alan; +Cc: linas, ppc, linux-kernel

Hi,

I've got a number of machines here that crash after installing
the recent ptrace fix.  The crash only occurs when machines
are highly stressed.

The problem appears to be that task->mm is dereferenced without 
looking to see if mm is NULL.  e.g. in the sched.h in the 
is_dumpable() macro, we have task->mm->dumpable .  I'm sitting
in front of a KDB session and I'm clearly looking at task->mm
which is NULL. 

In my particular case, the crash is *always* in kernel/ptrace.c
in access_process_vm(),  (which is called when something tries
to read /proc/pid/cmdline).  There seem to be a few other places
in the kernel where task->mm is dereferenced without checking mm,
but these are rare (?)  Most (?) places seem to make a point of
checking for NULL before using mm.

Why, how and under what conditions this race condition occurs, 
I don't know.  What the best fix is, I don't know.

I can try to just add a check for NULL, but I'd like someone 
to tell me that 'yes this is the right way to fix this.' 
(As opposed to trying to get some lock or trying to force 
the process to get paged in or whatever.)
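
For illustration only, this is roughly what the "just add a check
for NULL" option would look like, written as a user-space sketch.
demo_mm, demo_task and demo_read_args() are hypothetical names, not
kernel code, and the sketch says nothing about whether a lock is
also needed:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not the kernel's structures. */
struct demo_mm {
	const char *args;	/* pretend command line, NUL-terminated */
};

struct demo_task {
	struct demo_mm *mm;	/* may be NULL */
};

/* Copy up to len bytes of the pretend command line into buf.
 * Returns the number of bytes copied, or 0 if the task has no mm. */
static size_t demo_read_args(const struct demo_task *tsk, char *buf, size_t len)
{
	struct demo_mm *mm = tsk->mm;	/* read the pointer once */
	size_t n;

	if (mm == NULL)
		return 0;	/* nothing to read, and nothing to dereference */

	n = strlen(mm->args);
	if (n > len)
		n = len;
	memcpy(buf, mm->args, n);
	return n;
}
```

The pointer is copied into a local first so that the test and the
later use look at the same value.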

BTW, this is an SMP machine, don't know if that matters.

Comments? Suggestions?

--linas



end of thread, other threads:[~2003-04-03 19:45 UTC | newest]

Thread overview: 6+ messages
2003-04-03 15:22 ptrace patch fails stress testing James Cownie
2003-04-03 19:53 ` Chris Wright
  -- strict thread matches above, loose matches on Subject: below --
2003-04-01 18:22 linas
2003-04-01 21:25 ` John M Flinchbaugh
2003-04-02 11:49 ` Alan Cox
2003-04-02 14:45   ` Keith Owens
