All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: "Metzger, Markus T" <markus.t.metzger@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, Roland McGrath <roland@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Markus Metzger <markus.t.metzger@googlemail.com>
Subject: Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
Date: Tue, 10 Feb 2009 19:40:14 +0100	[thread overview]
Message-ID: <20090210184014.GA30545@redhat.com> (raw)
In-Reply-To: <928CFBE8E7CB0040959E56B4EA41A77E4A1E82FE@irsmsx504.ger.corp.intel.com>

On 02/10, Metzger, Markus T wrote:
>
> If I understand this correctly, we have two problems left: 1. if the
> tracer thread dies without detaching, the process will not get the
> (locked) memory refunded.

Yes.

> 2. there is a race between a thread detaching
> and another thread releasing the same task.
>
>
> I do not really understand the second problem.
>
> As far as I know, there can only be one ptracer per task. This ptracer
> can either detach or release, but not both. That other thread that does
> do_wait() should not be able to see the tracee as long as it is ptraced
> (wait_consider_task() will ignore it).

Please note that do_wait() does

	tsk = current;
	do {

		ptrace_do_wait(tsk, ...);

		tsk = next_thread(tsk);
	} while (tsk != current);

So the sub-thread of the tracer can reap the tracee, please see below.

> Since ptrace_disable() is called
> before __ptrace_unlink(), we free the BTS buffer before do_wait() will
> consider the tracee.

They both can free it in parallel.

Suppose we have 2 threads T1 and T2, C is a child of T1 (this is not
strictly necessary, just for simplicity),

T1 attaches to C, does PTRACE_BTS_CONFIG, and then starts PTRACE_DETACH.
When it calls ptrace_detach(), C is TASK_TRACED.

C is killed by SIGKILL, C exits and becomes a zombie. Not a problem
for T1, it has a reference to task_struct.

T1 calls ptrace_disable()->ptrace_bts_detach().

T2 calls do_wait(), the second iteration of the "do while" loop above
finds the "eligible" child C, and calls wait_task_zombie(), which in
turn does release_task()->ptrace_unlink()->...->ptrace_bts_untrace().

Now, T1->ptrace_bts_detach() can race with T2->ptrace_bts_untrace(), they
both can see ->bts != NULL, and they both can do kfree/ds_release_bts.

(and we have another similar race with de_thread() which can call
 release_task() too).

> I do not see the race. Am I missing something?

Or perhaps it is me who missed something, I didn't try to verify the
problem...

> We could try to mimic that and add a ptrace_notify_exit() function that is
> called early in do_exit().  As long as I only put the ptrace_bts_detach()
> into the arch version of it, the changes should be relatively safe.

Yes, we can do untrace earlier, but we still have the problems with tasklist_lock.
Of course, we can add the special function which does ptrace_bts_untrace()
for each tracee under tasklist and returns the size of the freed buffer,
then we drop tasklist and update ->mm. But this is soooo ugly...

And this can't resolve the problem with do_wait/de_thread which
can do ptrace_bts_untrace() before us.

> What do you and Roland think about it? Do you have a better idea?

We should cleanup ptrace first ;) IOW, I don't have a good idea.

Perhaps, for 2.6.29, we can do something like the "patch" below?

(btw, do you agree with the change in copy_process() I sent? )

> I would appreciate, if
> you reviewed future patches in that area.

Please CC me, I'll try to review. But I only understand (more or
less) the process-management part of ptrace...

Oleg.

--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
 
 static void ptrace_bts_detach(struct task_struct *child)
 {
+	// We can race with de_thread/do_wait which
+	// can do ptrace_bts_untrace() before us
 	if (unlikely(child->bts)) {
-		ds_release_bts(child->bts);
-		child->bts = NULL;
-
-		ptrace_bts_free_buffer(child);
+		// This all will be freed by ptrace_bts_untrace()
+		// later, but we should update ->mm
+		down_write(->mmap_sem);
+		mm->total_vm  -= bts_size;
+		mm->locked_vm -= bts_size);
+		up_write(->mmap_sem);
 	}
 }
 #else


  reply	other threads:[~2009-02-10 18:43 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-09  1:02 [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork() Oleg Nesterov
2009-02-09  1:28 ` Oleg Nesterov
2009-02-09  1:54   ` Roland McGrath
2009-02-09  9:28   ` Metzger, Markus T
2009-02-09 19:36     ` Oleg Nesterov
2009-02-10  9:47       ` Metzger, Markus T
2009-02-10 18:40         ` Oleg Nesterov [this message]
2009-02-10 20:21           ` Markus Metzger
2009-02-10 21:00             ` Markus Metzger
2009-02-10 21:48               ` Oleg Nesterov
2009-02-11  7:03                 ` Markus Metzger
2009-02-10 20:08 ` Andrew Morton
2009-02-11  9:33 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090210184014.GA30545@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=markus.t.metzger@googlemail.com \
    --cc=markus.t.metzger@intel.com \
    --cc=mingo@elte.hu \
    --cc=roland@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.