[PATCH, for 2.6.29] ptrace: fix the usage of ptrace

public inbox for linux-kernel@vger.kernel.org

* [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
@ 2009-02-09  1:02 Oleg Nesterov
  2009-02-09  1:28 ` Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-02-09  1:02 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar, Markus Metzger, Roland McGrath; +Cc: linux-kernel

I noticed by pure accident we have ptrace_fork() and friends. This was
added by "x86, bts: add fork and exit handling", commit
bf53de907dfdaac178c92d774aae7370d7b97d20

I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
believe this needs the fix. I think something like this program

	int main(void)
	{
		int pid = fork();

		if (!pid) {
			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
			kill(getpid(), SIGSTOP);
			fork();
		} else {
			struct ptrace_bts_config bts = {
				.flags = PTRACE_BTS_O_ALLOC,
				.size  = 4 * 4096,
			};

			wait(NULL);

			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
			ptrace(PTRACE_CONT, pid, NULL, NULL);

			sleep(1);
		}

		return 0;
	}

should crash the kernel.

If the task is traced by its natural parent ptrace_reparented() returns 0
but we should clear ->btsxxx anyway.

This is a minimal fix for 2.6.29, we need further cleanups imho.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>

--- 6.29-rc3/kernel/fork.c~BTS_FIX	2009-01-29 01:13:55.000000000 +0100
+++ 6.29-rc3/kernel/fork.c	2009-02-09 01:03:48.000000000 +0100
@@ -1093,7 +1093,7 @@ static struct task_struct *copy_process(
 #ifdef CONFIG_DEBUG_MUTEXES
 	p->blocked_on = NULL; /* not blocked yet */
 #endif
-	if (unlikely(ptrace_reparented(current)))
+	if (unlikely(current->ptrace))
 		ptrace_fork(p, clone_flags);

 	/* Perform scheduler related setup. Assign this task to a CPU. */
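
A rough userspace model of the condition change in the hunk above, not kernel code. It assumes that ptrace_reparented() compares ->real_parent with ->parent; the struct and every model_* name are made up for illustration. With PTRACE_TRACEME the tracer is the natural parent, so the old test is false even though the task is traced:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the kernel's task_struct. */
struct task_model {
	struct task_model *real_parent; /* the process that forked this task */
	struct task_model *parent;      /* the tracer while traced, else real_parent */
	unsigned int ptrace;            /* non-zero while the task is traced */
};

/* Old condition (assumed definition): tracer differs from natural parent. */
static bool model_ptrace_reparented(const struct task_model *t)
{
	return t->real_parent != t->parent;
}

/* New condition from the patch: true for any traced task. */
static bool model_is_traced(const struct task_model *t)
{
	return t->ptrace != 0;
}

/* PTRACE_TRACEME case from the test program: the parent itself is the tracer. */
static struct task_model model_traceme_child(struct task_model *natural_parent)
{
	struct task_model child = {
		.real_parent = natural_parent,
		.parent      = natural_parent,
		.ptrace      = 1,
	};
	return child;
}
```

For a task built by model_traceme_child(), model_ptrace_reparented() is false while model_is_traced() is true, which is the case the old check skipped.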


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  1:02 [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork() Oleg Nesterov
@ 2009-02-09  1:28 ` Oleg Nesterov
  2009-02-09  1:54   ` Roland McGrath
  2009-02-09  9:28   ` Metzger, Markus T
  2009-02-10 20:08 ` Andrew Morton
  2009-02-11  9:33 ` Ingo Molnar
  2 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-02-09  1:28 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar, Markus Metzger, Roland McGrath; +Cc: linux-kernel

On 02/09, Oleg Nesterov wrote:
>
> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20

Hmm. Looks like we have more problems here...

"x86, bts: memory accounting", commit c5dee6177f4bd2095aab7d9be9f6ebdddd6deee9.

PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
which updates mm->total_vm/locked_vm.

ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.

But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
which uses a plain kfree(), in that case we don't update mm->xxx_vm ?

Note that the exiting tracer can have sub-threads, so the whole process
does not necessarily die.

Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
case we don't update ->mm either.

Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
tracer's sub-thread which does do_wait()->release_task() (if the tracee
was killed before detach takes tasklist), the kernel can crash in this
case.

Unless I missed something, this all looks rather wrong, and I wasn't
aware of these changes :(

Oleg.
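
The accounting mismatch described in this message can be sketched in user space. This is only a model under stated assumptions: the structs and model_* functions are hypothetical, the counters are assumed to be kept in pages of 4096 bytes, and malloc()/free() stand in for the kernel allocator. The detach path refunds the counters, the untrace path only frees:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the two mm counters named in the thread. */
struct mm_model {
	unsigned long total_vm;
	unsigned long locked_vm;
};

/* Hypothetical stand-in for the per-tracee BTS buffer state. */
struct tracee_model {
	void *bts_buffer;
	unsigned long bts_pages;
};

/* Models PTRACE_BTS_CONFIG: allocate the buffer and charge the tracer's mm. */
static int model_bts_config(struct mm_model *mm, struct tracee_model *t,
			    unsigned long pages)
{
	t->bts_buffer = malloc(pages * 4096);
	if (!t->bts_buffer)
		return -1;
	t->bts_pages = pages;
	mm->total_vm += pages;
	mm->locked_vm += pages;
	return 0;
}

/* Models the ptrace_detach() path: refund the counters, then free. */
static void model_bts_detach(struct mm_model *mm, struct tracee_model *t)
{
	if (!t->bts_buffer)
		return;
	mm->total_vm -= t->bts_pages;
	mm->locked_vm -= t->bts_pages;
	free(t->bts_buffer);
	t->bts_buffer = NULL;
	t->bts_pages = 0;
}

/* Models the __ptrace_unlink() path: plain free, counters left untouched. */
static void model_bts_untrace(struct tracee_model *t)
{
	free(t->bts_buffer);
	t->bts_buffer = NULL;
	t->bts_pages = 0;
}
```

After model_bts_config() followed by model_bts_detach() both counters are back at their starting values; after model_bts_config() followed by model_bts_untrace() the memory is freed but the counters stay charged.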


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  1:28 ` Oleg Nesterov
@ 2009-02-09  1:54   ` Roland McGrath
  2009-02-09  9:28   ` Metzger, Markus T
  1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-02-09  1:54 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Ingo Molnar, Markus Metzger, linux-kernel

I didn't review the new ptrace BTS magic in detail either.  I'm not
surprised it's fraught with these cans of worms.  Personally, I doubt it's
worth trying to make it right before we clean up ptrace a bunch more so it
becomes much more straightforward to manage this stuff.

Thanks,
Roland


* RE: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  1:28 ` Oleg Nesterov
  2009-02-09  1:54   ` Roland McGrath
@ 2009-02-09  9:28   ` Metzger, Markus T
  2009-02-09 19:36     ` Oleg Nesterov
  1 sibling, 1 reply; 13+ messages in thread
From: Metzger, Markus T @ 2009-02-09  9:28 UTC (permalink / raw)
  To: Oleg Nesterov, Andrew Morton, Ingo Molnar, Roland McGrath
  Cc: linux-kernel@vger.kernel.org

>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@redhat.com]
>Sent: Monday, February 09, 2009 2:28 AM
>To: Andrew Morton; Ingo Molnar; Metzger, Markus T; Roland McGrath
>Cc: linux-kernel@vger.kernel.org

>> I noticed by pure accident we have ptrace_fork() and friends. This was
>> added by "x86, bts: add fork and exit handling", commit
>> bf53de907dfdaac178c92d774aae7370d7b97d20
>
>Hmm. Looks like we have more problems here...

Thanks for pointing out these problems. I did not get too many reviews when I sent out the patch.

>PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
>which updates mm->total_vm/locked_vm.
>
>ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.

That's what I expect to be the normal case.

>But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
>which uses a plain kfree(), in that case we don't update mm->xxx_vm ?

That's correct.
When the tracer dies, do_exit() first calls exit_mm() and then calls exit_notify(), which eventually calls __ptrace_unlink() and ptrace_bts_untrace().

At this time, the tracer's mm is already gone, but the bts buffer is not.
The code reclaims the memory; so we should not leak memory.
The user should not see any problems with his ulimit, either, since the task that had the memory accounted against his locked and total limit is as good as dead.

Where exactly do you see the problem?

>Note that the exiting tracer can have sub-threads, so the whole process
>does not necessarily die.

In that case, the process would lose the child's branch trace. Doesn't the process lose control over the ptraced task, anyway, when we call __ptrace_unlink()?

The task that paid for the buffer, though, is dead, and the memory has been reclaimed by the kernel.

>Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
>case we don't update ->mm either.

I'm not sure I understand that scenario.
A task ptraces another task, requests branch trace, and does not detach when the tracee dies?

In that case, the tracer would continue to pay for the buffer (which had been freed in __ptrace_unlink()) until it dies.
If the tracer task still lives, shouldn't we enforce a proper detach?

The problem I was trying to solve is that a dying ptracer does not detach properly. It seems that there are more ways to bypass ptrace detach.

Ideally (that is, from my point of view, at least), the tracer would call ptrace_detach() very early in do_exit(), so ptrace would not have to distinguish the various ways a tracer or tracee could die. That view might be a bit naïve, I admit.

>Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
>tracer's sub-thread which does do_wait()->release_task() (if the tracee
>was killed before detach takes tasklist), the kernel can crash in this
>case.

Are you saying that ptrace_detach() should call ptrace_disable() with tasklist_lock held for write?
There's a comment in ptrace_detach() before it does write_lock_irq(&tasklist_lock) and calls __ptrace_detach().

>Unless I missed something, this all looks rather wrong, and I wasn't
>aware of these changes :(

I wish I had gotten this review when I sent out the patch.

regards,
markus.

---------------------------------------------------------------------
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen Germany
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
Registergericht: Muenchen HRB 47456 Ust.-IdNr.
VAT Registration No.: DE129385895
Citibank Frankfurt (BLZ 502 109 00) 600119052

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  9:28   ` Metzger, Markus T
@ 2009-02-09 19:36     ` Oleg Nesterov
  2009-02-10  9:47       ` Metzger, Markus T
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-02-09 19:36 UTC (permalink / raw)
  To: Metzger, Markus T
  Cc: Andrew Morton, Ingo Molnar, Roland McGrath,
	linux-kernel@vger.kernel.org

On 02/09, Metzger, Markus T wrote:
>
> >PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
> >which updates mm->total_vm/locked_vm.
> >
> >ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.
>
> That's what I expect to be the normal case.
>
>
> >But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
> >which uses a plain kfree(), in that case we don't update mm->xxx_vm ?
>
> That's correct.
> When the tracer dies, do_exit() first calls exit_mm() and then calls exit_notify(), which eventually calls __ptrace_unlink() and ptrace_bts_untrace().
>
> At this time, the tracer's mm is already gone,

It is not if we have sub-threads (which share the same ->mm),

> but the bts buffer is not.
> The code reclaims the memory; so we should not leak memory.

yes, there is no memory leak,

> The user should not see any problems with his ulimit, either, since the task that had the memory accounted against his locked and total limit is as good as dead.

again, if the process is multithreaded, it is not dead. It (other threads)
continues to run with the same ->mm. Only the tracer thread is dead.

> >Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
> >case we don't update ->mm either.
>
> I'm not sure I understand that scenario.
> A task ptraces another task, requests branch trace, and does not detach when the tracee dies?

Yes. Again, we don't leak the memory, but the accounting in mm->total_vm/locked_vm
is not right.

> Ideally (that is, from my point of view, at least), the tracer would call
> ptrace_detach() very early in do_exit(), so ptrace would not have to distinguish
> the various ways a tracer or tracee could die.

Well, yes I agree. Please look at http://marc.info/?t=123411902800001

> >Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
> >tracer's sub-thread which does do_wait()->release_task() (if the tracee
> >was killed before detach takes tasklist), the kernel can crash in this
> >case.
>
> Are you saying that ptrace_detach() should call ptrace_disable() with tasklist_lock held for write?

We can't do this. And btw one of the reasons we can't is that
ptrace_bts_free_buffer() needs ->mmap_sem ;)

Oleg.



* RE: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09 19:36     ` Oleg Nesterov
@ 2009-02-10  9:47       ` Metzger, Markus T
  2009-02-10 18:40         ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Metzger, Markus T @ 2009-02-10  9:47 UTC (permalink / raw)
  To: Oleg Nesterov, Ingo Molnar, Roland McGrath
  Cc: Andrew Morton, linux-kernel@vger.kernel.org, Markus Metzger

>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@redhat.com]
>Sent: Monday, February 09, 2009 8:36 PM
>To: Metzger, Markus T

>> When the tracer dies, do_exit() first calls exit_mm() and then calls exit_notify(), which eventually
>> calls __ptrace_unlink() and ptrace_bts_untrace().
>>
>> At this time, the tracer's mm is already gone,
>
>It is not if we have sub-threads (which share the same ->mm),

 [....]

>> >Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
>> >tracer's sub-thread which does do_wait()->release_task() (if the tracee
>> >was killed before detach takes tasklist), the kernel can crash in this
>> >case.
>>
>> Are you saying that ptrace_detach() should call ptrace_disable() with tasklist_lock held for write?
>
>We can't do this. And btw one of the reasons we can't is that
>ptrace_bts_free_buffer() needs ->mmap_sem ;)

Thanks for your explanations.

If I understand this correctly, we have two problems left:
1. if the tracer thread dies without detaching, the process will not get the (locked) memory refunded.
2. there is a race between a thread detaching and another thread releasing the same task.

I do not really understand the second problem.

As far as I know, there can only be one ptracer per task. This ptracer can either detach or release,
but not both. That other thread that does do_wait() should not be able to see the tracee as long as
it is ptraced (wait_consider_task() will ignore it). Since ptrace_disable() is called before
__ptrace_unlink(), we free the BTS buffer before do_wait() will consider the tracee.
I do not see the race. Am I missing something?

Regarding the first problem, the userland impact would be that multi-threaded debuggers whose ptrace
controlling threads die without detaching will (more or less slowly) run out of memory and not be able
to collect branch trace after some time. Such debuggers will have lost control of the debuggee and
need to reattach/restart the debuggee process. Users would have to restart those debuggers to be able
to collect branch trace again.

Even though mm is shared, the pointer to it has already been set to NULL before arch_ptrace_untrace()
is called; we can't access it in ptrace_bts_untrace().
Utrace picked a better place; tracehook_report_exit() is called very early in do_exit() with everything
still in place.

We could try to mimic that and add a ptrace_notify_exit() function that is called early in do_exit().
As long as I only put the ptrace_bts_detach() into the arch version of it, the changes should be
relatively safe.
We might even be able to call ptrace_detach() from that function, although I would not dare to make
those changes myself.
This would leave us with two exit notifications to ptrace, though, which does not make the code any
cleaner or simpler.

What do you and Roland think about it? Do you have a better idea?

Thanks again for pointing out those problems. I would appreciate it if you reviewed future patches
in that area.

thanks and regards,
markus.


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-10  9:47       ` Metzger, Markus T
@ 2009-02-10 18:40         ` Oleg Nesterov
  2009-02-10 20:21           ` Markus Metzger
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-02-10 18:40 UTC (permalink / raw)
  To: Metzger, Markus T
  Cc: Ingo Molnar, Roland McGrath, Andrew Morton,
	linux-kernel@vger.kernel.org, Markus Metzger

On 02/10, Metzger, Markus T wrote:
>
> If I understand this correctly, we have two problems left: 1. if the
> tracer thread dies without detaching, the process will not get the
> (locked) memory refunded.

Yes.

> 2. there is a race between a thread detaching
> and another thread releasing the same task.
>
>
> I do not really understand the second problem.
>
> As far as I know, there can only be one ptracer per task. This ptracer
> can either detach or release, but not both. That other thread that does
> do_wait() should not be able to see the tracee as long as it is ptraced
> (wait_consider_task() will ignore it).

Please note that do_wait() does

	tsk = current;
	do {

		ptrace_do_wait(tsk, ...);

		tsk = next_thread(tsk);
	} while (tsk != current);

So the sub-thread of the tracer can reap the tracee, please see below.
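
The shape of that loop can be mimicked in user space. This is a sketch, not the kernel's do_wait(): the struct, the has_reapable_tracee flag and model_do_wait() are hypothetical, and the circular next pointer stands in for next_thread():

```c
#include <stddef.h>

/* Hypothetical stand-in for one thread of the tracer's thread group. */
struct thread_model {
	struct thread_model *next;  /* next thread in the group, circular */
	int has_reapable_tracee;    /* made-up flag: this thread traces a zombie */
};

/*
 * Start at the caller and visit every thread in the group, as the quoted
 * loop does. Returns the thread whose tracee would be reaped, or NULL if
 * no thread in the group has one.
 */
static const struct thread_model *model_do_wait(const struct thread_model *current)
{
	const struct thread_model *tsk = current;

	do {
		if (tsk->has_reapable_tracee)
			return tsk;
		tsk = tsk->next;
	} while (tsk != current);

	return NULL;
}
```

With two threads T1 and T2 linked in a ring and only T1 tracing a zombie, model_do_wait() called on behalf of T2 still returns T1, so the wait issued by T2 reaches T1's tracee.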

> Since ptrace_disable() is called
> before __ptrace_unlink(), we free the BTS buffer before do_wait() will
> consider the tracee.

They both can free it in parallel.

Suppose we have 2 threads T1 and T2, C is a child of T1 (this is not
strictly necessary, just for simplicity),

T1 attaches to C, does PTRACE_BTS_CONFIG, and then starts PTRACE_DETACH.
When it calls ptrace_detach(), C is TASK_TRACED.

C is killed by SIGKILL, C exits and becomes a zombie. Not a problem
for T1, it has a reference to task_struct.

T1 calls ptrace_disable()->ptrace_bts_detach().

T2 calls do_wait(), the second iteration of the "do while" loop above
finds the "eligible" child C, and calls wait_task_zombie(), which in
turn does release_task()->ptrace_unlink()->...->ptrace_bts_untrace().

Now, T1->ptrace_bts_detach() can race with T2->ptrace_bts_untrace(), they
both can see ->bts != NULL, and they both can do kfree/ds_release_bts.

(and we have another similar race with de_thread() which can call
 release_task() too).
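
The interleaving above can be replayed deterministically in user space. This is a sketch only: the struct and the model_* functions are hypothetical, the two "threads" are just two reads performed in the problematic order, and the atomic-exchange variant is an illustration of claiming the pointer in one step, not something proposed in this thread:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracee whose ->bts handle is shared state. */
struct tracee_model {
	_Atomic(int *) bts;
};

/*
 * Check-then-clear, replayed in the order from the scenario: T1 and T2
 * both test ->bts before either of them clears it. Returns how many
 * paths would go on to release the handle.
 */
static int model_racy_releases(struct tracee_model *c)
{
	int *seen_by_t1 = atomic_load(&c->bts); /* T1: ptrace_bts_detach() */
	int *seen_by_t2 = atomic_load(&c->bts); /* T2: ptrace_bts_untrace() */
	int releases = 0;

	if (seen_by_t1) {
		atomic_store(&c->bts, NULL);
		releases++;
	}
	if (seen_by_t2) {
		atomic_store(&c->bts, NULL);
		releases++;
	}
	return releases;
}

/*
 * Illustration only: each path claims the handle with one atomic
 * exchange, so at most one of them ever sees a non-NULL value.
 */
static int model_claimed_releases(struct tracee_model *c)
{
	int *seen_by_t1 = atomic_exchange(&c->bts, NULL);
	int *seen_by_t2 = atomic_exchange(&c->bts, NULL);

	return (seen_by_t1 != NULL) + (seen_by_t2 != NULL);
}
```

Starting from a non-NULL handle, model_racy_releases() counts two releases (the double free), while model_claimed_releases() counts one.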

> I do not see the race. Am I missing something?

Or perhaps it is me who missed something, I didn't try to verify the
problem...

> We could try to mimic that and add a ptrace_notify_exit() function that is
> called early in do_exit().  As long as I only put the ptrace_bts_detach()
> into the arch version of it, the changes should be relatively safe.

Yes, we can do untrace earlier, but we still have the problems with tasklist_lock.
Of course, we can add the special function which does ptrace_bts_untrace()
for each tracee under tasklist and returns the size of the freed buffer,
then we drop tasklist and update ->mm. But this is soooo ugly...

And this can't resolve the problem with do_wait/de_thread which
can do ptrace_bts_untrace() before us.

> What do you and Roland think about it? Do you have a better idea?

We should clean up ptrace first ;) IOW, I don't have a good idea.

Perhaps, for 2.6.29, we can do something like the "patch" below?

(btw, do you agree with the change in copy_process() I sent? )

> I would appreciate it if
> you reviewed future patches in that area.

Please CC me, I'll try to review. But I only understand (more or
less) the process-management part of ptrace...

Oleg.

--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta

 static void ptrace_bts_detach(struct task_struct *child)
 {
+	// We can race with de_thread/do_wait which
+	// can do ptrace_bts_untrace() before us
 	if (unlikely(child->bts)) {
-		ds_release_bts(child->bts);
-		child->bts = NULL;
-
-		ptrace_bts_free_buffer(child);
+		// This all will be freed by ptrace_bts_untrace()
+		// later, but we should update ->mm
+		down_write(->mmap_sem);
+		mm->total_vm  -= bts_size;
+		mm->locked_vm -= bts_size;
+		up_write(->mmap_sem);
 	}
 }
 #else


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-10 18:40         ` Oleg Nesterov
@ 2009-02-10 20:21           ` Markus Metzger
  2009-02-10 21:00             ` Markus Metzger
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Metzger @ 2009-02-10 20:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton,
	linux-kernel@vger.kernel.org, Markus Metzger

On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:

> > 2. there is a race between a thread detaching
> > and another thread releasing the same task.

I think I now see the problem. Ptrace uses the tasklist_lock to protect
against __ptrace_unlink() races.

I could either introduce a separate lock to protect bts buffer
deallocation, or put the kfree part under the tasklist_lock,
as you suggest below.



> Perhaps, for 2.6.29, we can do something like the "patch" below?
> 
> (btw, do you agree with the change in copy_process() I sent? )

Both patches look good to me.


> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
>  
>  static void ptrace_bts_detach(struct task_struct *child)
>  {
> +	// We can race with de_thread/do_wait which
> +	// can do ptrace_bts_untrace() before us
>  	if (unlikely(child->bts)) {
> -		ds_release_bts(child->bts);
> -		child->bts = NULL;
> -
> -		ptrace_bts_free_buffer(child);
> +		// This all will be freed by ptrace_bts_untrace()
> +		// later, but we should update ->mm
> +		down_write(->mmap_sem);
> +		mm->total_vm  -= bts_size;
> +		mm->locked_vm -= bts_size;
> +		up_write(->mmap_sem);
>  	}
>  }
>  #else
> 


You already sent out the first one. I don't have access to any
test machine from home. I could send the patch tomorrow (evening).

thanks and regards,
markus.



* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-10 20:21           ` Markus Metzger
@ 2009-02-10 21:00             ` Markus Metzger
  2009-02-10 21:48               ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Metzger @ 2009-02-10 21:00 UTC (permalink / raw)
  To: Markus Metzger
  Cc: Oleg Nesterov, Metzger, Markus T, Ingo Molnar, Roland McGrath,
	Andrew Morton, linux-kernel@vger.kernel.org

On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
> On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:

> > Perhaps, for 2.6.29, we can do something like the "patch" below?
> > 
> > --- a/arch/x86/kernel/ptrace.c
> > +++ b/arch/x86/kernel/ptrace.c
> > @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
> >  
> >  static void ptrace_bts_detach(struct task_struct *child)
> >  {
> > +	// We can race with de_thread/do_wait which
> > +	// can do ptrace_bts_untrace() before us
> >  	if (unlikely(child->bts)) {
> > -		ds_release_bts(child->bts);
> > -		child->bts = NULL;
> > -
> > -		ptrace_bts_free_buffer(child);
> > +		// This all will be freed by ptrace_bts_untrace()
> > +		// later, but we should update ->mm
> > +		down_write(->mmap_sem);
> > +		mm->total_vm  -= bts_size;
> > +		mm->locked_vm -= bts_size;
> > +		up_write(->mmap_sem);
> >  	}
> >  }
> >  #else
> > 
> 

There's still a race.
The kfree() is safe, now, but ptrace_bts_untrace() might have cleared
child->bts_size before we can refund the memory.

We need to make ptrace_bts_untrace() ignore child->bts_size and clear
it in ptrace_bts_detach().


regards,
markus.




* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-10 21:00             ` Markus Metzger
@ 2009-02-10 21:48               ` Oleg Nesterov
  2009-02-11  7:03                 ` Markus Metzger
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-02-10 21:48 UTC (permalink / raw)
  To: Markus Metzger
  Cc: Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton,
	linux-kernel@vger.kernel.org

On 02/10, Markus Metzger wrote:
>
> On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
> > On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:
>
> > > Perhaps, for 2.6.29, we can do something like the "patch" below?
> > > 
> > > --- a/arch/x86/kernel/ptrace.c
> > > +++ b/arch/x86/kernel/ptrace.c
> > > @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
> > >  
> > >  static void ptrace_bts_detach(struct task_struct *child)
> > >  {
> > > +	// We can race with de_thread/do_wait which
> > > +	// can do ptrace_bts_untrace() before us
> > >  	if (unlikely(child->bts)) {
> > > -		ds_release_bts(child->bts);
> > > -		child->bts = NULL;
> > > -
> > > -		ptrace_bts_free_buffer(child);
> > > +		// This all will be freed by ptrace_bts_untrace()
> > > +		// later, but we should update ->mm
> > > +		down_write(->mmap_sem);
> > > +		mm->total_vm  -= bts_size;
> > > +		mm->locked_vm -= bts_size);
> > > +		up_write(->mmap_sem);
> > >  	}
> > >  }
> > >  #else
> > > 
> > 
>
> There's still a race.
> The kfree() is safe, now, but ptrace_bts_untrace() might have cleared
> child->bts_size before we can refund the memory.

Yes sure, please note the "We can race..." comment at the top
of ptrace_bts_detach().

The goal of this patch is to avoid the crash. The memory accounting
in ->mm is still not right. But at least, the tracer can not "steal"
the memory above the limits. And the "good" tracer should not exit
without detach, and it shouldn't release the tracee from sub-thread
if this can race with detach.

So, afaics, the worst thing which can happen is: the "bad" tracer
is punished by the "unfair" mm->xxx_vm numbers.
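
The two orderings discussed here can be compared with a small userspace model of the "patch". It is a sketch under assumptions: the structs, the frees counter and the model_* functions are hypothetical, counters are assumed to be in pages, and free() stands in for kfree()/ds_release_bts(). Detach only refunds the counters; untrace is the only path that frees:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the two mm counters named in the thread. */
struct mm_model {
	unsigned long total_vm;
	unsigned long locked_vm;
};

/* Hypothetical stand-in for the tracee's BTS state. */
struct tracee_model {
	void *bts;               /* non-NULL while a buffer is attached */
	unsigned long bts_pages; /* size that was charged to the tracer's mm */
	int frees;               /* made-up counter: how often the buffer was freed */
};

/* Models ptrace_bts_detach() after the "patch": refund only, no free. */
static void model_patched_detach(struct mm_model *mm, struct tracee_model *c)
{
	if (c->bts) {
		mm->total_vm -= c->bts_pages;
		mm->locked_vm -= c->bts_pages;
	}
}

/* Models ptrace_bts_untrace(): the single place that frees the buffer. */
static void model_untrace(struct tracee_model *c)
{
	if (c->bts) {
		free(c->bts);
		c->bts = NULL;
		c->bts_pages = 0;
		c->frees++;
	}
}
```

If detach runs before untrace, the counters are refunded and the buffer is freed once. If untrace wins, the buffer is still freed exactly once, but detach finds nothing to refund and the counters stay charged, which is the "unfair" numbers case.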

Except exec() can release the main thread whatever the tracer does...

> We need to make ptrace_bts_untrace() ignore child->bts_size and clear
> it in ptrace_bts_detach().

This is worse, now we can leak the memory if the tracer doesn't
do ptrace_detach().

Oleg.



* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-10 21:48               ` Oleg Nesterov
@ 2009-02-11  7:03                 ` Markus Metzger
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Metzger @ 2009-02-11  7:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Markus Metzger, Metzger, Markus T, Ingo Molnar, Roland McGrath,
	Andrew Morton, linux-kernel@vger.kernel.org

Oleg Nesterov wrote:
> On 02/10, Markus Metzger wrote:
>> On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
>>> On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:
>>>> Perhaps, for 2.6.29, we can do something like the "patch" below?
>>>>
>>>> --- a/arch/x86/kernel/ptrace.c
>>>> +++ b/arch/x86/kernel/ptrace.c
>>>> @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
>>>>  
>>>>  static void ptrace_bts_detach(struct task_struct *child)
>>>>  {
>>>> +	// We can race with de_thread/do_wait which
>>>> +	// can do ptrace_bts_untrace() before us
>>>>  	if (unlikely(child->bts)) {
>>>> -		ds_release_bts(child->bts);
>>>> -		child->bts = NULL;
>>>> -
>>>> -		ptrace_bts_free_buffer(child);
>>>> +		// This all will be freed by ptrace_bts_untrace()
>>>> +		// later, but we should update ->mm
>>>> +		down_write(->mmap_sem);
>>>> +		mm->total_vm  -= bts_size;
>>>> +		mm->locked_vm -= bts_size;
>>>> +		up_write(->mmap_sem);
>>>>  	}
>>>>  }
>>>>  #else
>>>>

> The goal of this patch is to avoid the crash. The memory accounting
> in ->mm is still not right. But at least, the tracer can not "steal"
> the memory above the limits. And the "good" tracer should not exit
> without detach, and it shouldn't release the tracee from sub-thread
> if this can race with detach.
> 
> So, afaics, the worst thing which can happen is: the "bad" tracer
> is punished by the "unfair" mm->xxx_vm numbers.
> 
> Except exec() can release the main thread whatever the tracer does...
> 
>> We need to make ptrace_bts_untrace() ignore child->bts_size and clear
>> it in ptrace_bts_detach().
> 
> This is worse, now we can leak the memory if the tracer doesn't
> do ptrace_detach().

I see.

If the tracer dies and bypasses detach, the next tracer to trace the tracee
would get the memory refunded when he configures branch tracing - unless we take 
care about this in ptrace_bts_configure() and only refund the memory when there 
was a buffer to free.

But this would complicate the code even more.

I think that the underlying problem is that ptrace_detach() can be bypassed.
This also bypasses arch-specific cleanup code - that's why I added
arch_ptrace_untrace().
It would all be very simple if that were not the case.

regards,
markus.


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  1:02 [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork() Oleg Nesterov
  2009-02-09  1:28 ` Oleg Nesterov
@ 2009-02-10 20:08 ` Andrew Morton
  2009-02-11  9:33 ` Ingo Molnar
  2 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2009-02-10 20:08 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: mingo, markus.t.metzger, roland, linux-kernel

On Mon, 9 Feb 2009 02:02:33 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20
> 
> I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
> believe this needs the fix. I think something like this program
> 
> 	int main(void)
> 	{
> 		int pid = fork();
> 
> 		if (!pid) {
> 			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> 			kill(getpid(), SIGSTOP);
> 			fork();
> 		} else {
> 			struct ptrace_bts_config bts = {
> 				.flags = PTRACE_BTS_O_ALLOC,
> 				.size  = 4 * 4096,
> 			};
> 
> 			wait(NULL);
> 
> 			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
> 			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
> 			ptrace(PTRACE_CONT, pid, NULL, NULL);
> 
> 			sleep(1);
> 		}
> 
> 		return 0;
> 	}
> 
> should crash the kernel.
> 
> If the task is traced by its natural parent ptrace_reparented() returns 0
> but we should clear ->btsxxx anyway.
> 
> This is a minimal fix for 2.6.29, we need further cleanups imho.
> 

This changelog is all a bit tentative-sounding.

> 
> --- 6.29-rc3/kernel/fork.c~BTS_FIX	2009-01-29 01:13:55.000000000 +0100
> +++ 6.29-rc3/kernel/fork.c	2009-02-09 01:03:48.000000000 +0100
> @@ -1093,7 +1093,7 @@ static struct task_struct *copy_process(
>  #ifdef CONFIG_DEBUG_MUTEXES
>  	p->blocked_on = NULL; /* not blocked yet */
>  #endif
> -	if (unlikely(ptrace_reparented(current)))
> +	if (unlikely(current->ptrace))
>  		ptrace_fork(p, clone_flags);
>  
>  	/* Perform scheduler related setup. Assign this task to a CPU. */

Can we please confirm that this patch is indeed correct and needed?


* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
  2009-02-09  1:02 [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork() Oleg Nesterov
  2009-02-09  1:28 ` Oleg Nesterov
  2009-02-10 20:08 ` Andrew Morton
@ 2009-02-11  9:33 ` Ingo Molnar
  2 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-02-11  9:33 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Markus Metzger, Roland McGrath, linux-kernel


* Oleg Nesterov <oleg@redhat.com> wrote:

> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20
> 
> I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
> believe this needs the fix. I think something like this program
> 
> 	int main(void)
> 	{
> 		int pid = fork();
> 
> 		if (!pid) {
> 			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> 			kill(getpid(), SIGSTOP);
> 			fork();
> 		} else {
> 			struct ptrace_bts_config bts = {
> 				.flags = PTRACE_BTS_O_ALLOC,
> 				.size  = 4 * 4096,
> 			};
> 
> 			wait(NULL);
> 
> 			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
> 			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
> 			ptrace(PTRACE_CONT, pid, NULL, NULL);
> 
> 			sleep(1);
> 		}
> 
> 		return 0;
> 	}
> 
> should crash the kernel.
> 
> If the task is traced by its natural parent ptrace_reparented() returns 0
> but we should clear ->btsxxx anyway.
> 
> This is a minimal fix for 2.6.29, we need further cleanups imho.

I've applied this fix to tip:x86/urgent for now, until the other fix
from Markus gets finalized.

	Ingo

