* [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
@ 2009-02-09 1:02 Oleg Nesterov
From: Oleg Nesterov @ 2009-02-09 1:02 UTC
To: Andrew Morton, Ingo Molnar, Markus Metzger, Roland McGrath; +Cc: linux-kernel
I noticed by pure accident we have ptrace_fork() and friends. This was
added by "x86, bts: add fork and exit handling", commit
bf53de907dfdaac178c92d774aae7370d7b97d20
I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
believe this needs the fix. I think something like this program
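
/*
 * struct ptrace_bts_config and the PTRACE_BTS_* constants are assumed to
 * come from the 2.6.29-era x86 <asm/ptrace-abi.h> (CONFIG_X86_PTRACE_BTS).
 */
#include <signal.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <asm/ptrace-abi.h>

int main(void)
{
	int pid = fork();

	if (!pid) {
		/* tracee: get traced by the natural parent, stop, then fork */
		ptrace(PTRACE_TRACEME, 0, NULL, NULL);
		kill(getpid(), SIGSTOP);
		fork();
	} else {
		struct ptrace_bts_config bts = {
			.flags = PTRACE_BTS_O_ALLOC,
			.size = 4 * 4096,
		};

		/* tracer: wait for the SIGSTOP, then enable BTS and resume */
		wait(NULL);

		ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
		ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
		ptrace(PTRACE_CONT, pid, NULL, NULL);

		sleep(1);
	}

	return 0;
}
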
should crash the kernel.
If the task is traced by its natural parent ptrace_reparented() returns 0
but we should clear ->btsxxx anyway.
This is a minimal fix for 2.6.29, we need further cleanups imho.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>

--- 6.29-rc3/kernel/fork.c~BTS_FIX	2009-01-29 01:13:55.000000000 +0100
+++ 6.29-rc3/kernel/fork.c	2009-02-09 01:03:48.000000000 +0100
@@ -1093,7 +1093,7 @@ static struct task_struct *copy_process(
 #ifdef CONFIG_DEBUG_MUTEXES
 	p->blocked_on = NULL; /* not blocked yet */
 #endif
-	if (unlikely(ptrace_reparented(current)))
+	if (unlikely(current->ptrace))
 		ptrace_fork(p, clone_flags);

 	/* Perform scheduler related setup. Assign this task to a CPU. */
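
The difference between the two conditions in the hunk above can be illustrated with a small user-space model. This is only a sketch: `struct task`, `old_check()` and `new_check()` are hypothetical names, and it assumes that ptrace_reparented() amounts to comparing ->real_parent with ->parent, which is consistent with the changelog's remark that it returns 0 when the tracer is the natural parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for task_struct: only the fields
 * that matter for the two checks are modelled. */
struct task {
	struct task *real_parent;	/* natural parent (the task that forked us) */
	struct task *parent;		/* the tracer while traced, else real_parent */
	unsigned int ptrace;		/* non-zero while the task is being ptraced */
};

/* Models the old condition (assumed semantics of ptrace_reparented()):
 * true only if tracing changed ->parent. */
static bool old_check(const struct task *t)
{
	return t->real_parent != t->parent;
}

/* Models the patched condition: true whenever the task is traced,
 * no matter who the tracer is. */
static bool new_check(const struct task *t)
{
	return t->ptrace != 0;
}
```

For a task that did PTRACE_TRACEME, both parent pointers name the same task, so old_check() is false while new_check() is true. That is the case in which the old code would skip ptrace_fork().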

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Oleg Nesterov @ 2009-02-09 1:28 UTC
To: Andrew Morton, Ingo Molnar, Markus Metzger, Roland McGrath; +Cc: linux-kernel

On 02/09, Oleg Nesterov wrote:
>
> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20

Hmm. Looks like we have more problems here...

"x86, bts: memory accounting", commit
c5dee6177f4bd2095aab7d9be9f6ebdddd6deee9.

PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
which updates mm->total_vm/locked_vm.

ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.

But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
which uses a plain kfree(), in that case we don't update mm->xxx_vm ?

Note that the exiting tracer can have sub-threads, so the whole process
does not necessarily die.

Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
case we don't update ->mm too.

Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
tracer's sub-thread which does do_wait()->release_task() (if the tracee
was killed before detach takes tasklist), the kernel can crash in this
case.

Unless I missed something, this all looks rather wrong, and I wasn't
aware of these changes :(

Oleg.
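
The accounting imbalance described in this message can be sketched in user space. Everything below is a hypothetical model, not kernel code: `struct mm_model`, `struct buf_model` and the `model_*` helpers merely mimic the charge, refund and plain-free paths named in the message (alloc_locked_buffer(), free_locked_buffer() and the kfree() in ptrace_bts_untrace()), with sizes counted in pages.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two mm counters mentioned above. */
struct mm_model {
	unsigned long total_vm;
	unsigned long locked_vm;
};

/* Hypothetical stand-in for the per-tracee BTS buffer. */
struct buf_model {
	void *mem;
	unsigned long pages;
};

/* Models the PTRACE_BTS_CONFIG path: allocate and charge the mm. */
static int model_alloc_locked(struct mm_model *mm, struct buf_model *b,
			      unsigned long pages)
{
	b->mem = malloc(pages * 4096UL);
	if (!b->mem)
		return -1;
	b->pages = pages;
	mm->total_vm += pages;
	mm->locked_vm += pages;
	return 0;
}

/* Models the PTRACE_DETACH path: free the buffer and refund the mm. */
static void model_free_locked(struct mm_model *mm, struct buf_model *b)
{
	free(b->mem);
	b->mem = NULL;
	mm->total_vm -= b->pages;
	mm->locked_vm -= b->pages;
	b->pages = 0;
}

/* Models the untrace path: the memory is freed, but nothing is refunded. */
static void model_plain_free(struct buf_model *b)
{
	free(b->mem);
	b->mem = NULL;
	b->pages = 0;
}
```

In this model, allocating and then calling model_free_locked() returns both counters to their starting values, while allocating and then calling model_plain_free() leaves them inflated by the buffer size. No memory is leaked in either case; only the counters drift.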

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Roland McGrath @ 2009-02-09 1:54 UTC
To: Oleg Nesterov; +Cc: Andrew Morton, Ingo Molnar, Markus Metzger, linux-kernel

I didn't review the new ptrace BTS magic in detail either. I'm not
surprised it's fraught with these cans of worms. Personally, I doubt it's
worth trying to make it right before we clean up ptrace a bunch more so it
becomes much more straightforward to manage this stuff.

Thanks,
Roland

* RE: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Metzger, Markus T @ 2009-02-09 9:28 UTC
To: Oleg Nesterov, Andrew Morton, Ingo Molnar, Roland McGrath; +Cc: linux-kernel@vger.kernel.org

>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@redhat.com]
>Sent: Monday, February 09, 2009 2:28 AM
>To: Andrew Morton; Ingo Molnar; Metzger, Markus T; Roland McGrath
>Cc: linux-kernel@vger.kernel.org

>> I noticed by pure accident we have ptrace_fork() and friends. This was
>> added by "x86, bts: add fork and exit handling", commit
>> bf53de907dfdaac178c92d774aae7370d7b97d20
>
>Hmm. Looks like we have more problems here...

Thanks for pointing out these problems. I did not get too many reviews
when I sent out the patch.

>PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
>which updates mm->total_vm/locked_vm.
>
>ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.

That's what I expect to be the normal case.

>But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
>which uses a plain kfree(), in that case we don't update mm->xxx_vm ?

That's correct.
When the tracer dies, do_exit() first calls exit_mm() and then calls
exit_notify(), which eventually calls __ptrace_unlink() and
ptrace_bts_untrace().

At this time, the tracer's mm is already gone, but the bts buffer is not.
The code reclaims the memory; so we should not leak memory.
The user should not see any problems with his ulimit, either, since the
task that had the memory accounted against his locked and total limit is
as good as dead.

Where exactly do you see the problem?

>Note that the exiting tracer can have sub-threads, so the whole process
>does not necessary dies.

In that case, the process would lose the child's branch trace. Doesn't the
process lose control over the ptraced task, anyway, when we call
__ptrace_unlink()?
The task that paid for the buffer, though, is dead, and the memory has
been reclaimed by the kernel.

>Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
>case we don't update ->mm too.

I'm not sure I understand that scenario.
A task ptraces another task, requests branch trace, and does not detach
when the tracee dies?
In that case, the tracer would continue to pay for the buffer (which had
been freed in __ptrace_unlink()) until it dies.

If the tracer task still lives, shouldn't we enforce a proper detach?

The problem I was trying to solve is that a dying ptracer does not detach
properly. It seems that there are more ways to bypass ptrace detach.

Ideally (that is, from my point of view, at least), the tracer would call
ptrace_detach() very early in do_exit(), so ptrace would not have to
distinguish the various ways a tracer or tracee could die.
That view might be a bit naïve, I admit.

>Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
>tracer's sub-thread which does do_wait()->release_task() (if the tracee
>was killed before detach takes tasklist), the kernel can crash in this
>case.

Are you saying that ptrace_detach() should call ptrace_disable() with
tasklist_lock held for write?
There's a comment in ptrace_detach() before it does
write_lock_irq(&tasklist_lock) and calls __ptrace_detach().

>Unless I missed something, This all looks rather wrong, and I wasn't
>aware about these changes :(

I wish I had got this review when I sent out the patch.

regards,
markus.

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Oleg Nesterov @ 2009-02-09 19:36 UTC
To: Metzger, Markus T; +Cc: Andrew Morton, Ingo Molnar, Roland McGrath, linux-kernel@vger.kernel.org

On 02/09, Metzger, Markus T wrote:
>
> >PTRACE_BTS_CONFIG allocates ->bts_buffer via alloc_locked_buffer()
> >which updates mm->total_vm/locked_vm.
> >
> >ptrace_detach() does free_locked_buffer() which "restores" mm->xxx_vm.
>
> That's what I expect to be the normal case.
>
> >But if the tracer exits we are doing __ptrace_unlink()->ptrace_bts_untrace()
> >which uses a plain kfree(), in that case we don't update mm->xxx_vm ?
>
> That's correct.
> When the tracer dies, do_exit() first calls exit_mm() and then calls
> exit_notify(), which eventually calls __ptrace_unlink() and
> ptrace_bts_untrace().
>
> At this time, the tracer's mm is already gone,

It is not if we have sub-threads (which share the same ->mm),

> but the bts buffer is not.
> The code reclaims the memory; so we should not leak memory.

yes, there is no memory leak,

> The user should not see any problems with his ulimit, either, since the
> task that had the memory accounted against his locked and total limit is
> as good as dead.

again, if the process is multithreaded, it is not dead. It (other threads)
continues to run with the same ->mm. Only the tracer thread is dead.

> >Or, the tracer can reap a zombie tracee without PTRACE_DETACH, in that
> >case we don't update ->mm too.
>
> I'm not sure I understand that scenario.
> A task ptraces another task, requests branch trace, and does not detach
> when the tracee dies?

Yes. Again, we don't leak the memory, but the accounting in
mm->total_vm/locked_vm is not right.

> Ideally (that is, from my point of view, at least), the tracer would call
> ptrace_detach() very early in do_exit(), so ptrace would not have to distinguish
> the various ways a tracer or tracee could die.

Well, yes I agree. Please look at

	http://marc.info/?t=123411902800001

> >Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
> >tracer's sub-thread which does do_wait()->release_task() (if the tracee
> >was killed before detach takes tasklist), the kernel can crash in this
> >case.
>
> Are you saying that ptrace_detach() should call ptrace_disable() with
> tasklist_lock held for write?

We can't do this. And btw one of the reasons we can't is that
ptrace_bts_free_buffer() needs ->mmap_sem ;)

Oleg.

* RE: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Metzger, Markus T @ 2009-02-10 9:47 UTC
To: Oleg Nesterov, Ingo Molnar, Roland McGrath; +Cc: Andrew Morton, linux-kernel@vger.kernel.org, Markus Metzger

>-----Original Message-----
>From: Oleg Nesterov [mailto:oleg@redhat.com]
>Sent: Monday, February 09, 2009 8:36 PM
>To: Metzger, Markus T

>> When the tracer dies, do_exit() first calls exit_mm() and then calls
>> exit_notify(), which eventually calls __ptrace_unlink() and
>> ptrace_bts_untrace().
>>
>> At this time, the tracer's mm is already gone,
>
>It is not if we have sub-threads (which share the same ->mm),

[....]

>> >Oh, and afaics ptrace_detach()->ptrace_bts_detach() can race with the
>> >tracer's sub-thread which does do_wait()->release_task() (if the tracee
>> >was killed before detach takes tasklist), the kernel can crash in this
>> >case.
>>
>> Are you saying that ptrace_detach() should call ptrace_disable() with
>> tasklist_lock held for write?
>
>We can't do this. And btw one of the reasons we can't is that
>ptrace_bts_free_buffer() needs ->mmap_sem ;)

Thanks for your explanations.

If I understand this correctly, we have two problems left:
1. if the tracer thread dies without detaching, the process will not get
   the (locked) memory refunded.
2. there is a race between a thread detaching and another thread releasing
   the same task.

I do not really understand the second problem.

As far as I know, there can only be one ptracer per task. This ptracer can
either detach or release, but not both. That other thread that does
do_wait() should not be able to see the tracee as long as it is ptraced
(wait_consider_task() will ignore it). Since ptrace_disable() is called
before __ptrace_unlink(), we free the BTS buffer before do_wait() will
consider the tracee.

I do not see the race. Am I missing something?

Regarding the first problem, the userland impact would be that
multi-threaded debuggers whose ptrace controlling threads die without
detaching will (more or less slowly) run out of memory and not be able to
collect branch trace after some time. Such debuggers will have lost
control of the debuggee and need to reattach/restart the debuggee process.
Users would have to restart those debuggers to be able to collect branch
trace again.

Even though mm is shared, the pointer to it has already been set to NULL
before arch_ptrace_untrace() is called; we can't access it in
ptrace_bts_untrace().

Utrace picked a better place; tracehook_report_exit() is called very early
in do_exit() with everything still in place.

We could try to mimic that and add a ptrace_notify_exit() function that is
called early in do_exit(). As long as I only put the ptrace_bts_detach()
into the arch version of it, the changes should be relatively safe. We
might even be able to call ptrace_detach() from that function, although I
would not dare to make those changes myself.

This would leave us with two exit notifications to ptrace, though, which
does not make the code any cleaner or simpler.

What do you and Roland think about it? Do you have a better idea?

Thanks again for pointing out those problems. I would appreciate it if you
reviewed future patches in that area.

thanks and regards,
markus.

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Oleg Nesterov @ 2009-02-10 18:40 UTC
To: Metzger, Markus T; +Cc: Ingo Molnar, Roland McGrath, Andrew Morton, linux-kernel@vger.kernel.org, Markus Metzger

On 02/10, Metzger, Markus T wrote:
>
> If I understand this correctly, we have two problems left: 1. if the
> tracer thread dies without detaching, the process will not get the
> (locked) memory refunded.

Yes.

> 2. there is a race between a thread detaching
> and another thread releasing the same task.
>
>
> I do not really understand the second problem.
>
> As far as I know, there can only be one ptracer per task. This ptracer
> can either detach or release, but not both. That other thread that does
> do_wait() should not be able to see the tracee as long as it is ptraced
> (wait_consider_task() will ignore it).

Please note that do_wait() does

	tsk = current;
	do {
		ptrace_do_wait(tsk, ...);
		tsk = next_thread(tsk);
	} while (tsk != current);

So the sub-thread of the tracer can reap the tracee, please see below.

> Since ptrace_disable() is called
> before __ptrace_unlink(), we free the BTS buffer before do_wait() will
> consider the tracee.

They both can free it in parallel. Suppose we have 2 threads T1 and T2,
C is a child of T1 (this is not strictly necessary, just for simplicity),

	T1 attaches to C, does PTRACE_BTS_CONFIG, and then starts
	PTRACE_DETACH. When it calls ptrace_detach(), C is TASK_TRACED.

	C is killed by SIGKILL, C exits and becomes a zombie. Not a
	problem for T1, it has a reference to task_struct.

	T1 calls ptrace_disable()->ptrace_bts_detach().

	T2 calls do_wait(), the second iteration of the "do while" loop
	above finds the "eligible" child C, and calls wait_task_zombie(),
	which in turn does
	release_task()->ptrace_unlink()->...->ptrace_bts_untrace().

Now, T1->ptrace_bts_detach() can race with T2->ptrace_bts_untrace(), they
both can see ->bts != NULL, and they both can do kfree/ds_release_bts.

(and we have another similar race with de_thread() which can call
 release_task() too).

> I do not see the race. Am I missing something?

Or perhaps it is me who missed something, I didn't try to verify the
problem...

> We could try to mimic that and add a ptrace_notify_exit() function that is
> called early in do_exit(). As long as I only put the ptrace_bts_detach()
> into the arch version of it, the changes should be relatively safe.

Yes, we can do untrace earlier, but we still have the problems with
tasklist_lock. Of course, we can add the special function which does
ptrace_bts_untrace() for each tracee under tasklist and returns the
size of the freed buffer, then we drop tasklist and update ->mm.
But this is soooo ugly... And this can't resolve the problem with
do_wait/de_thread which can do ptrace_bts_untrace() before us.

> What do you and Roland think about it? Do you have a better idea?

We should cleanup ptrace first ;) IOW, I don't have a good idea.

Perhaps, for 2.6.29, we can do something like the "patch" below?

(btw, do you agree with the change in copy_process() I sent? )

> I would appreciate, if
> you reviewed future patches in that area.

Please CC me, I'll try to review. But I only understand (more or less)
the process-management part of ptrace...

Oleg.

--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta

 static void ptrace_bts_detach(struct task_struct *child)
 {
+	// We can race with de_thread/do_wait which
+	// can do ptrace_bts_untrace() before us
 	if (unlikely(child->bts)) {
-		ds_release_bts(child->bts);
-		child->bts = NULL;
-
-		ptrace_bts_free_buffer(child);
+		// This all will be freed by ptrace_bts_untrace()
+		// later, but we should update ->mm
+		down_write(->mmap_sem);
+		mm->total_vm -= bts_size;
+		mm->locked_vm -= bts_size);
+		up_write(->mmap_sem);
 	}
 }
 #else
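
The point that a sub-thread can reap another thread's tracee follows from the do_wait() loop walking the whole thread group. A minimal user-space model of that walk might look like the following. The types and `model_do_wait()` are hypothetical simplifications: each thread has at most one tracee here, and the real ptrace_do_wait() and wait_task_zombie() logic is reduced to a zombie flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tracee; only its zombie state is modelled. */
struct tracee_model {
	int is_zombie;
};

/* Hypothetical stand-in for one thread of the tracer's thread group. */
struct thread_model {
	struct thread_model *next;	/* circular list, like next_thread() */
	struct tracee_model *tracee;	/* at most one tracee per thread here */
};

/* Walks the thread group starting at 'self', in the manner of the quoted
 * do-while loop, and returns the first zombie tracee that it finds. */
static struct tracee_model *model_do_wait(struct thread_model *self)
{
	struct thread_model *tsk = self;

	do {
		if (tsk->tracee && tsk->tracee->is_zombie)
			return tsk->tracee;
		tsk = tsk->next;
	} while (tsk != self);

	return NULL;
}
```

With T1 holding tracee C and T2 holding none, a call made on behalf of T2 still returns C once C is a zombie: the second iteration of the loop visits T1. In the scenario above, that is the moment at which T2's release path can overlap with T1's detach.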

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Markus Metzger @ 2009-02-10 20:21 UTC
To: Oleg Nesterov; +Cc: Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton, linux-kernel@vger.kernel.org, Markus Metzger

On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:
> > 2. there is a race between a thread detaching
> > and another thread releasing the same task.

I think I now see the problem.

Ptrace uses the tasklist_lock to protect against __ptrace_unlink() races.

I could either introduce a separate lock to protect bts buffer
deallocation, or I put the kfree part under the tasklist_lock, as you
suggest below.

> Perhaps, for 2.6.29, we can do something like the "patch" below?
>
> (btw, do you agree with the change in copy_process() I sent? )

Both patches look good to me.

> --- a/arch/x86/kernel/ptrace.c
> +++ b/arch/x86/kernel/ptrace.c
> @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
>
>  static void ptrace_bts_detach(struct task_struct *child)
>  {
> +	// We can race with de_thread/do_wait which
> +	// can do ptrace_bts_untrace() before us
>  	if (unlikely(child->bts)) {
> -		ds_release_bts(child->bts);
> -		child->bts = NULL;
> -
> -		ptrace_bts_free_buffer(child);
> +		// This all will be freed by ptrace_bts_untrace()
> +		// later, but we should update ->mm
> +		down_write(->mmap_sem);
> +		mm->total_vm -= bts_size;
> +		mm->locked_vm -= bts_size);
> +		up_write(->mmap_sem);
>  	}
>  }
>  #else
>

You already sent out the first one. I don't have access to any test
machine from home. I could send the patch tomorrow (evening).

thanks and regards,
markus.

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Markus Metzger @ 2009-02-10 21:00 UTC
To: Markus Metzger; +Cc: Oleg Nesterov, Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton, linux-kernel@vger.kernel.org

On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
> On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:

> > Perhaps, for 2.6.29, we can do something like the "patch" below?
> >
> > --- a/arch/x86/kernel/ptrace.c
> > +++ b/arch/x86/kernel/ptrace.c
> > @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
> >
> >  static void ptrace_bts_detach(struct task_struct *child)
> >  {
> > +	// We can race with de_thread/do_wait which
> > +	// can do ptrace_bts_untrace() before us
> >  	if (unlikely(child->bts)) {
> > -		ds_release_bts(child->bts);
> > -		child->bts = NULL;
> > -
> > -		ptrace_bts_free_buffer(child);
> > +		// This all will be freed by ptrace_bts_untrace()
> > +		// later, but we should update ->mm
> > +		down_write(->mmap_sem);
> > +		mm->total_vm -= bts_size;
> > +		mm->locked_vm -= bts_size);
> > +		up_write(->mmap_sem);
> >  	}
> >  }
> >  #else
> >

There's still a race.
The kfree() is safe, now, but ptrace_bts_untrace() might have cleared
child->bts_size before we can refund the memory.

We need to make ptrace_bts_untrace() ignore child->bts_size and clear
it in ptrace_bts_detach().

regards,
markus.

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Oleg Nesterov @ 2009-02-10 21:48 UTC
To: Markus Metzger; +Cc: Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton, linux-kernel@vger.kernel.org

On 02/10, Markus Metzger wrote:
>
> On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
> > On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:
>
> > > Perhaps, for 2.6.29, we can do something like the "patch" below?
> > >
> > > --- a/arch/x86/kernel/ptrace.c
> > > +++ b/arch/x86/kernel/ptrace.c
> > > @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
> > >
> > >  static void ptrace_bts_detach(struct task_struct *child)
> > >  {
> > > +	// We can race with de_thread/do_wait which
> > > +	// can do ptrace_bts_untrace() before us
> > >  	if (unlikely(child->bts)) {
> > > -		ds_release_bts(child->bts);
> > > -		child->bts = NULL;
> > > -
> > > -		ptrace_bts_free_buffer(child);
> > > +		// This all will be freed by ptrace_bts_untrace()
> > > +		// later, but we should update ->mm
> > > +		down_write(->mmap_sem);
> > > +		mm->total_vm -= bts_size;
> > > +		mm->locked_vm -= bts_size);
> > > +		up_write(->mmap_sem);
> > >  	}
> > >  }
> > >  #else
> > >
>
> There's still a race.
> The kfree() is safe, now, but ptrace_bts_untrace() might have cleared
> child->bts_size before we can refund the memory.

Yes sure, please note the "We can race..." comment at the top of
ptrace_bts_detach().

The goal of this patch is to avoid the crash. The memory accounting
in ->mm is still not right. But at least, the tracer can not "steal"
the memory above the limits. And the "good" tracer should not exit
without detach, and it shouldn't release the tracee from sub-thread
if this can race with detach.

So, afaics, the worst thing which can happen is: the "bad" tracer
is punished by the "unfair" mm->xxx_vm numbers.

Except exec() can release the main thread whatever the tracer does...

> We need to make ptrace_bts_untrace() ignore child->bts_size and clear
> it in ptrace_bts_detach().

This is worse, now we can leak the memory if the tracer doesn't
do ptrace_detach().

Oleg.
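
The trade-off discussed here (no double free, but possibly wrong counters) can be sketched with a sequential user-space model of the proposed split. All names below are hypothetical, real concurrency is not modelled, and it is an assumption of this sketch that the untrace path clears both ->bts and ->bts_size when it frees the buffer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the tracer's mm accounting. */
struct acct_model {
	long locked_vm;
};

/* Hypothetical stand-in for the tracee's BTS state. */
struct child_model {
	void *bts;	/* non-NULL while a BTS context exists */
	long bts_size;	/* amount charged to the tracer's mm */
	int frees;	/* how many times the buffer was freed */
};

/* Models ptrace_bts_untrace(): the only path that frees in this scheme. */
static void model_untrace(struct child_model *c)
{
	if (c->bts) {
		c->frees++;
		c->bts = NULL;
		c->bts_size = 0;
	}
}

/* Models the proposed ptrace_bts_detach(): it refunds but never frees. */
static void model_detach(struct acct_model *mm, struct child_model *c)
{
	if (c->bts)
		mm->locked_vm -= c->bts_size;
}
```

Running model_detach() before model_untrace() leaves the counter balanced with exactly one free. Running them in the opposite order still gives exactly one free, but the refund is skipped and the counter stays inflated, which corresponds to the "unfair" mm->xxx_vm numbers mentioned above.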

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Markus Metzger @ 2009-02-11 7:03 UTC
To: Oleg Nesterov; +Cc: Markus Metzger, Metzger, Markus T, Ingo Molnar, Roland McGrath, Andrew Morton, linux-kernel@vger.kernel.org

Oleg Nesterov wrote:
> On 02/10, Markus Metzger wrote:
>> On Tue, 2009-02-10 at 21:21 +0100, Markus Metzger wrote:
>>> On Tue, 2009-02-10 at 19:40 +0100, Oleg Nesterov wrote:
>>>> Perhaps, for 2.6.29, we can do something like the "patch" below?
>>>>
>>>> --- a/arch/x86/kernel/ptrace.c
>>>> +++ b/arch/x86/kernel/ptrace.c
>>>> @@ -810,11 +810,15 @@ static void ptrace_bts_untrace(struct ta
>>>>
>>>>  static void ptrace_bts_detach(struct task_struct *child)
>>>>  {
>>>> +	// We can race with de_thread/do_wait which
>>>> +	// can do ptrace_bts_untrace() before us
>>>>  	if (unlikely(child->bts)) {
>>>> -		ds_release_bts(child->bts);
>>>> -		child->bts = NULL;
>>>> -
>>>> -		ptrace_bts_free_buffer(child);
>>>> +		// This all will be freed by ptrace_bts_untrace()
>>>> +		// later, but we should update ->mm
>>>> +		down_write(->mmap_sem);
>>>> +		mm->total_vm -= bts_size;
>>>> +		mm->locked_vm -= bts_size);
>>>> +		up_write(->mmap_sem);
>>>>  	}
>>>>  }
>>>>  #else
>>>>

> The goal of this patch is to avoid the crash. The memory accounting
> in ->mm is still not right. But at least, the tracer can not "steal"
> the memory above the limits. And the "good" tracer should not exit
> without detach, and it shouldn't release the tracee from sub-thread
> if this can race with detach.
>
> So, afaics, the worst thing which can happen is: the "bad" tracer
> is punished by the "unfair" mm->xxx_vm numbers.
>
> Except exec() can release the main thread whatever the tracer does...
>
>> We need to make ptrace_bts_untrace() ignore child->bts_size and clear
>> it in ptrace_bts_detach().
>
> This is worse, now we can leak the memory if the tracer doesn't
> do ptrace_detach().

I see. If the tracer dies and bypasses detach, the next tracer to trace
the tracee would get the memory refunded when he configures branch
tracing - unless we take care about this in ptrace_bts_configure() and
only refund the memory when there was a buffer to free. But this would
complicate the code even more.

I think that the underlying problem is that ptrace_detach() can be
bypassed. This bypasses also arch-specific cleanup code - that's why I
added arch_ptrace_untrace(). It would all be very simple if that were not
the case.

regards,
markus.

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Andrew Morton @ 2009-02-10 20:08 UTC
To: Oleg Nesterov; +Cc: mingo, markus.t.metzger, roland, linux-kernel

On Mon, 9 Feb 2009 02:02:33 +0100 Oleg Nesterov <oleg@redhat.com> wrote:

> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20
>
> I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
> believe this needs the fix. I think something like this program
>
> 	int main(void)
> 	{
> 		int pid = fork();
>
> 		if (!pid) {
> 			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> 			kill(getpid(), SIGSTOP);
> 			fork();
> 		} else {
> 			struct ptrace_bts_config bts = {
> 				.flags = PTRACE_BTS_O_ALLOC,
> 				.size = 4 * 4096,
> 			};
>
> 			wait(NULL);
>
> 			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
> 			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
> 			ptrace(PTRACE_CONT, pid, NULL, NULL);
>
> 			sleep(1);
> 		}
>
> 		return 0;
> 	}
>
> should crash the kernel.
>
> If the task is traced by its natural parent ptrace_reparented() returns 0
> but we should clear ->btsxxx anyway.
>
> This is a minimal fix for 2.6.29, we need further cleanups imho.
>

This changelog is all a bit tentative-sounding.

>
> --- 6.29-rc3/kernel/fork.c~BTS_FIX	2009-01-29 01:13:55.000000000 +0100
> +++ 6.29-rc3/kernel/fork.c	2009-02-09 01:03:48.000000000 +0100
> @@ -1093,7 +1093,7 @@ static struct task_struct *copy_process(
>  #ifdef CONFIG_DEBUG_MUTEXES
>  	p->blocked_on = NULL; /* not blocked yet */
>  #endif
> -	if (unlikely(ptrace_reparented(current)))
> +	if (unlikely(current->ptrace))
>  		ptrace_fork(p, clone_flags);
>
>  	/* Perform scheduler related setup. Assign this task to a CPU. */

Can we please confirm that this patch is indeed correct and needed?

* Re: [PATCH, for 2.6.29] ptrace: fix the usage of ptrace_fork()
From: Ingo Molnar @ 2009-02-11 9:33 UTC
To: Oleg Nesterov; +Cc: Andrew Morton, Markus Metzger, Roland McGrath, linux-kernel

* Oleg Nesterov <oleg@redhat.com> wrote:

> I noticed by pure accident we have ptrace_fork() and friends. This was
> added by "x86, bts: add fork and exit handling", commit
> bf53de907dfdaac178c92d774aae7370d7b97d20
>
> I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
> believe this needs the fix. I think something like this program
>
> 	int main(void)
> 	{
> 		int pid = fork();
>
> 		if (!pid) {
> 			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> 			kill(getpid(), SIGSTOP);
> 			fork();
> 		} else {
> 			struct ptrace_bts_config bts = {
> 				.flags = PTRACE_BTS_O_ALLOC,
> 				.size = 4 * 4096,
> 			};
>
> 			wait(NULL);
>
> 			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
> 			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
> 			ptrace(PTRACE_CONT, pid, NULL, NULL);
>
> 			sleep(1);
> 		}
>
> 		return 0;
> 	}
>
> should crash the kernel.
>
> If the task is traced by its natural parent ptrace_reparented() returns 0
> but we should clear ->btsxxx anyway.
>
> This is a minimal fix for 2.6.29, we need further cleanups imho.

I've applied this fix to tip:x86/urgent for now, until the other fix from
Markus gets finalized.

	Ingo