public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
@ 2008-05-22  2:47 Luming Yu
  2008-05-22  8:47 ` Petr Tesarik
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-22  2:47 UTC (permalink / raw)
  To: LKML, linux-ia64

[-- Attachment #1: Type: text/plain, Size: 783 bytes --]

Hello list,

The following patch is to fixed a race in ptrace_stop handling which
causes "strace" hang if the target process blocks SIGTRAP with the
test case filed at
https://bugzilla.redhat.com/show_bug.cgi?id=446200#c16.
Please note this is just IA64 problem because just IA64 has
arch_ptrace_stop_needed defined, and has arch_ptrace_stop defined that
would set notify_resume flags for syncing rbs...but it also opens the
door to invoke ia64_do_signal->get_signal_to_deliver before setting
current PTRACED flag. Please help review.

**The patch is enclosed in text attachment*
**I'm using web client to send the patch* *

Signed-off-by: Yu Luming <luming.yu@intel.com>
--------------------------------------
 signal.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

[-- Attachment #2: 446200.patch --]
[-- Type: application/force-download, Size: 624 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22  2:47 [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race Luming Yu
@ 2008-05-22  8:47 ` Petr Tesarik
  2008-05-22  9:16   ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Petr Tesarik @ 2008-05-22  8:47 UTC (permalink / raw)
  To: Luming Yu; +Cc: LKML, linux-ia64, Roland McGrath

On Thu, 2008-05-22 at 10:47 +0800, Luming Yu wrote:
> Hello list,
> 
> The following patch is to fixed a race in ptrace_stop handling which
> causes "strace" hang if the target process blocks SIGTRAP with the
> test case filed at
> https://bugzilla.redhat.com/show_bug.cgi?id=446200#c16.
> Please note this is just IA64 problem because just IA64 has
> arch_ptrace_stop_needed defined, and has arch_ptrace_stop defined that
> would set notify_resume flags for syncing rbs...but it also opens the
> door to invoke ia64_do_signal->get_signal_to_deliver before setting
> current PTRACED flag. Please help review.
> 
> **The patch is enclosed in text attachment*
> **I'm using web client to send the patch* *

I'm inlining the patch for sake of convenience:

> 
> Signed-off-by: Yu Luming <luming.yu@intel.com>
> --------------------------------------
>  signal.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> --- 0/kernel/signal.c	2008-05-14 02:24:51.000000000 +0800
> +++ 1/kernel/signal.c	2008-05-22 13:54:42.000000000 +0800
> @@ -1488,6 +1488,9 @@
>  {
>  	int killed = 0;
>  
> +	/* Let the debugger run.  */
> +	__set_current_state(TASK_TRACED);
> +

That's probably not what we want. What happens if the task then sleeps
during the user-space access? Unless I forgot something obvious, it will
never get scheduled again...

Petr Tesarik

>  	if (arch_ptrace_stop_needed(exit_code, info)) {
>  		/*
>  		 * The arch code has something special to do before a
> @@ -1516,8 +1519,6 @@
>  	current->last_siginfo = info;
>  	current->exit_code = exit_code;
>  
> -	/* Let the debugger run.  */
> -	__set_current_state(TASK_TRACED);
>  	spin_unlock_irq(&current->sighand->siglock);
>  	read_lock(&tasklist_lock);
>  	if (!unlikely(killed) && may_ptrace_stop()) {


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22  8:47 ` Petr Tesarik
@ 2008-05-22  9:16   ` Luming Yu
  2008-05-22 11:18     ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-22  9:16 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: LKML, linux-ia64, Roland McGrath

On Thu, May 22, 2008 at 4:47 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> On Thu, 2008-05-22 at 10:47 +0800, Luming Yu wrote:
>> Hello list,
>>
>> The following patch is to fixed a race in ptrace_stop handling which
>> causes "strace" hang if the target process blocks SIGTRAP with the
>> test case filed at
>> https://bugzilla.redhat.com/show_bug.cgi?id=446200#c16.
>> Please note this is just IA64 problem because just IA64 has
>> arch_ptrace_stop_needed defined, and has arch_ptrace_stop defined that
>> would set notify_resume flags for syncing rbs...but it also opens the
>> door to invoke ia64_do_signal->get_signal_to_deliver before setting
>> current PTRACED flag. Please help review.
>>
>> **The patch is enclosed in text attachment*
>> **I'm using web client to send the patch* *
>
> I'm inlining the patch for sake of convenience:
>

thanks.

>>
>> Signed-off-by: Yu Luming <luming.yu@intel.com>
>> --------------------------------------
>>  signal.c |    5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> --- 0/kernel/signal.c 2008-05-14 02:24:51.000000000 +0800
>> +++ 1/kernel/signal.c 2008-05-22 13:54:42.000000000 +0800
>> @@ -1488,6 +1488,9 @@
>>  {
>>       int killed = 0;
>>
>> +     /* Let the debugger run.  */
>> +     __set_current_state(TASK_TRACED);
>> +
>
> That's probably not what we want. What happens if the task then sleeps
> during the user-space access? Unless I forgot something obvious, it will
> never get scheduled again...

My intention is to disable signal delivering before TASK_TRACED flag
is set for correctly handling ptrace_stop() with SIGTRAP masked.
Although this patch totally is a hack, but it should clearly shows
where the problem is that I want to solve..

>
> Petr Tesarik
>
>>       if (arch_ptrace_stop_needed(exit_code, info)) {
>>               /*
>>                * The arch code has something special to do before a
>> @@ -1516,8 +1519,6 @@
>>       current->last_siginfo = info;
>>       current->exit_code = exit_code;
>>
>> -     /* Let the debugger run.  */
>> -     __set_current_state(TASK_TRACED);
>>       spin_unlock_irq(&current->sighand->siglock);
>>       read_lock(&tasklist_lock);
>>       if (!unlikely(killed) && may_ptrace_stop()) {
>
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22  9:16   ` Luming Yu
@ 2008-05-22 11:18     ` Roland McGrath
  2008-05-22 12:12       ` Petr Tesarik
  2008-05-22 13:24       ` Luming Yu
  0 siblings, 2 replies; 41+ messages in thread
From: Roland McGrath @ 2008-05-22 11:18 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

I really cannot figure out from anything you've said what the failure mode
is or how you think it should be affected.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 11:18     ` Roland McGrath
@ 2008-05-22 12:12       ` Petr Tesarik
  2008-05-22 20:39         ` Roland McGrath
  2008-05-23 12:33         ` Luming Yu
  2008-05-22 13:24       ` Luming Yu
  1 sibling, 2 replies; 41+ messages in thread
From: Petr Tesarik @ 2008-05-22 12:12 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Luming Yu, LKML, linux-ia64

On Thu, 2008-05-22 at 04:18 -0700, Roland McGrath wrote:
> I really cannot figure out from anything you've said what the failure mode
> is or how you think it should be affected.

At least you should be able to see the related bug report at:

https://bugzilla.redhat.com/show_bug.cgi?id=446200

I'm getting Access Denied. :(

Petr Tesarik


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 11:18     ` Roland McGrath
  2008-05-22 12:12       ` Petr Tesarik
@ 2008-05-22 13:24       ` Luming Yu
  2008-05-22 20:34         ` Roland McGrath
  1 sibling, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-22 13:24 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

On Thu, May 22, 2008 at 7:18 PM, Roland McGrath <roland@redhat.com> wrote:
> I really cannot figure out from anything you've said what the failure mode
> is or how you think it should be affected.

Sorry for confusion, Let me try to explain it more:

For ia64,the code path is like:
ptrace_notify (to let the debugger run)--> ptrace_stop
-->spin_unlock_irq->arch_ptrace_stop (ia64_ptrace_stop) ->[sync rbs
and set NOTIFY_RESUME.....]-->spin_lock_irq->set TASK_TRACED flag (to
let the debugger run)

For x86, the code path is like:
ptrace_notify (to let the debugger run) ->ptrace_stop->set TASK_TRACED
flag (to let the debugger run)-->spin_unlock_irq

If TASK_TRACED is not set earlier before arch_ptrace_stop on ia64
ptrace_notify code path, some signals would be delivered without
letting debugger run.. (i.e. PTRACED logica in get_signal_to_deliver
would be ignored totally!). These should cause the test case hang on
ia64. And x86 just works..

If you have any question , I would dig further..

Thanks,
Luming

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 13:24       ` Luming Yu
@ 2008-05-22 20:34         ` Roland McGrath
  2008-05-23  3:42           ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-05-22 20:34 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

> Sorry for confusion, Let me try to explain it more:

I understand these code paths (I wrote them).

> If TASK_TRACED is not set earlier before arch_ptrace_stop on ia64
> ptrace_notify code path, some signals would be delivered without
> letting debugger run.. (i.e. PTRACED logica in get_signal_to_deliver
> would be ignored totally!). These should cause the test case hang on
> ia64. And x86 just works..

I do not understand this at all, and it has given no information you did
not give before.  Please describe the scenario you see in fine-grained
terms.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 12:12       ` Petr Tesarik
@ 2008-05-22 20:39         ` Roland McGrath
  2008-05-23 12:33         ` Luming Yu
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2008-05-22 20:39 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Luming Yu, LKML, linux-ia64

> On Thu, 2008-05-22 at 04:18 -0700, Roland McGrath wrote:
> > I really cannot figure out from anything you've said what the failure mode
> > is or how you think it should be affected.
> 
> At least you should be able to see the related bug report at:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=446200

That really doesn't help (yes, I can see it).  That is a bug for the RHEL4
kernel, which is based on a much older kernel and has none of the code
we're discussing here.  The test case is a simple program that blocks
SIGTRAP using sigprocmask, then forks and exec's a simple shell script that
just runs some random command; run this under strace.  There is no further
information in that bugzilla report that helps us understand the behavior
of that test case in the current kernel, which Luming says has a problem.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 20:34         ` Roland McGrath
@ 2008-05-23  3:42           ` Luming Yu
  2008-05-23  4:19             ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-23  3:42 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

On Fri, May 23, 2008 at 4:34 AM, Roland McGrath <roland@redhat.com> wrote:
>> Sorry for confusion, Let me try to explain it more:
>
> I understand these code paths (I wrote them).
>
>> If TASK_TRACED is not set earlier before arch_ptrace_stop on ia64
>> ptrace_notify code path, some signals would be delivered without
>> letting debugger run.. (i.e. PTRACED logica in get_signal_to_deliver
>> would be ignored totally!). These should cause the test case hang on
>> ia64. And x86 just works..
>
> I do not understand this at all, and it has given no information you did
> not give before.  Please describe the scenario you see in fine-grained
> terms.
>
In the code path mentioned above, I see only ia64 has chance to let
ptraced thread deliver those pending signals before TASK_TRACED is
set. Then debugger thread would lose chance to interfere the
delivering of those signals if I correctly understand PT_PTRACED logic
in get_signal_to_deliver, and the relationship between the two flag :
TASK_TRACED and PT_TRACED.

Since you write those code, Please clarify, in ptrace_notify code
path, is it allowed that ptraced thread can run signal handler without
telling debugger what happened?

I noticed the only difference between x86 and IA64 ,  and it does make
the test case work on
x86, and fail on IA64... So I made the patch trying to eliminate the
difference. It indeed seems to solve my problem although it is still
hack, and I don't know what kind of signals strace handled has such
magic..

As for how the door is only open for ia64, I can explain further if
you want to know.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-23  3:42           ` Luming Yu
@ 2008-05-23  4:19             ` Roland McGrath
  2008-05-23  5:24               ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-05-23  4:19 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

> In the code path mentioned above, I see only ia64 has chance to let
> ptraced thread deliver those pending signals before TASK_TRACED is

This is the key thing that makes no sense to me.  What do you mean by
"deliver" here?  The normal meaning of to "deliver" a signal means that
get_signal_to_deliver() dequeues and processes it.  This can never happen
inside any other kernel code path.  It certainly can never happen inside
ptrace_stop or ptrace_notify.

The difference between ia64 and others is that inside ptrace_stop(), it can
release the siglock and then can block (in page faults, or by preemption).
No signal delivery can happen.  

What perhaps can happen is that TIF_SIGPENDING being set can cause a
TASK_INTERRUPTIBLE wait in the page fault path to return early and
ia64_sync_user_rbs will bail out before doing all its work.  But you
haven't described any problem like that.

Please tell us the exact code path you think is happening in the error case
you can reproduce.  Describe the actual code path, not high-level notions
like "deliver those pending signals".


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-23  4:19             ` Roland McGrath
@ 2008-05-23  5:24               ` Luming Yu
  2008-05-26  0:15                 ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-23  5:24 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

> This is the key thing that makes no sense to me.  What do you mean by
> "deliver" here?  The normal meaning of to "deliver" a signal means that
> get_signal_to_deliver() dequeues and processes it.  This can never happen
> inside any other kernel code path.  It certainly can never happen inside
> ptrace_stop or ptrace_notify.

"deliver" here I mean get_signal_to_deliver()...
Please clarify, in ptrace_notify code path, is it allowed that ptraced
thread can ignore signals (such as SIGTRAP) without telling debugger
what happened? This is key problem I'm not quite sure... Because if
the test doesn't block SIGTRAP, it works on IA64 too.. Hope I have
made the problem clear...

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-22 12:12       ` Petr Tesarik
  2008-05-22 20:39         ` Roland McGrath
@ 2008-05-23 12:33         ` Luming Yu
  1 sibling, 0 replies; 41+ messages in thread
From: Luming Yu @ 2008-05-23 12:33 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Roland McGrath, LKML, linux-ia64

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

On Thu, May 22, 2008 at 8:12 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> On Thu, 2008-05-22 at 04:18 -0700, Roland McGrath wrote:
>> I really cannot figure out from anything you've said what the failure mode
>> is or how you think it should be affected.
>
> At least you should be able to see the related bug report at:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=446200
>
> I'm getting Access Denied. :(

ok , I just got permit to share the test case here..  Please check out
the attachment.
 1. tar jxvf testpro.tar.bz2
       testpro/test1.c
               test2.sh
2.cd testpro
3.gcc -o test1 test1.c
4.strace -f -o log.txt ./test1

hope you can reproduce the problem.

Thanks,
Luming

[-- Attachment #2: testpro.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 694 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-23  5:24               ` Luming Yu
@ 2008-05-26  0:15                 ` Roland McGrath
  2008-05-26  1:30                   ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-05-26  0:15 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

> "deliver" here I mean get_signal_to_deliver()...

This can never happen inside ptrace_stop.  
So your line of inquiry doesn't make sense to me.

> the test doesn't block SIGTRAP, it works on IA64 too.. Hope I have
> made the problem clear...

It was already quite clear what elements have to be present to cause a
problem with the test case (ia64 and blocked SIGTRAP).  Nothing else about
it is yet clear to me, sorry.

Please tell us the exact code path you think is happening in the error case
you can reproduce.  


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-26  0:15                 ` Roland McGrath
@ 2008-05-26  1:30                   ` Luming Yu
  2008-05-27  3:31                     ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-26  1:30 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

On Mon, May 26, 2008 at 8:15 AM, Roland McGrath <roland@redhat.com> wrote:
>> "deliver" here I mean get_signal_to_deliver()...
>
> This can never happen inside ptrace_stop.
> So your line of inquiry doesn't make sense to me.

if happens, it should be a bug, right?

>
> Please tell us the exact code path you think is happening in the error case
> you can reproduce.
>
I will try to customize kernel to capture call trace for a precise code path.

--Luming

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-26  1:30                   ` Luming Yu
@ 2008-05-27  3:31                     ` Luming Yu
  2008-05-27  4:04                       ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-27  3:31 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

>
> if happens, it should be a bug, right?

It does happen!!

Call Trace:
 [<a000000100011bd0>] show_stack+0x50/0xa0
                                sp=e000000146bbfbb0 bsp=e000000146bb0e08
 [<a000000100011c50>] dump_stack+0x30/0x60
                                sp=e000000146bbfd80 bsp=e000000146bb0de8
 [<a0000001000979a0>] get_signal_to_deliver+0x60/0x6e0
                                sp=e000000146bbfd80 bsp=e000000146bb0d80
 [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
                                sp=e000000146bbfd80 bsp=e000000146bb0cd8
 [<a000000100012650>] do_notify_resume_user+0xf0/0x140
                                sp=e000000146bbfe20 bsp=e000000146bb0ca8
 [<a00000010000aac0>] notify_resume_user+0x40/0x60
                                sp=e000000146bbfe20 bsp=e000000146bb0c58
 [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
                                sp=e000000146bbfe30 bsp=e000000146bb0c58
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000146bc0000 bsp=e000000146bb0c58

>>
> I will try to customize kernel to capture call trace for a precise code path.

I applied the following patch , and got the call trace above..
If apply my RFC patch as antidote,  I don't see "deliver" ...
Is the problem clear now?  I will serve you until every thing is clear to you.

Thanks,
Luming

Signed-off-by: Yu Luming <luming.yu@intel.com>

diff -Bru 1/kernel/signal.c 0/kernel/signal.c
--- 1/kernel/signal.c   2008-05-27 15:18:48.000000000 +0800
+++ 0/kernel/signal.c   2008-05-27 15:08:51.000000000 +0800
@@ -38,6 +38,7 @@
  */

 static struct kmem_cache *sigqueue_cachep;
+unsigned long global_arch_ptrace_stop_flag =0;

 static int __sig_ignored(struct task_struct *t, int sig)
 {
@@ -1501,9 +1502,12 @@
                 * siglock.  That must prevent us from sleeping in TASK_TRACED.
                 * So after regaining the lock, we must check for SIGKILL.
                 */
+               global_arch_ptrace_stop_flag = 1;
                spin_unlock_irq(&current->sighand->siglock);
                arch_ptrace_stop(exit_code, info);
+
                spin_lock_irq(&current->sighand->siglock);
+               global_arch_ptrace_stop_flag = 0;
                killed = sigkill_pending(current);
        }

@@ -1701,6 +1705,8 @@
        struct signal_struct *signal = current->signal;
        int signr;

+       if (global_arch_ptrace_stop_flag)
+               dump_stack();
 relock:
        /*
         * We'll jump back here after any time we were stopped in TASK_STOPPED.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  3:31                     ` Luming Yu
@ 2008-05-27  4:04                       ` Roland McGrath
  2008-05-27  5:49                         ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-05-27  4:04 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

> > if happens, it should be a bug, right?

It doesn't even make sense that it should be possible.
So if it somehow is possible, that is certainly a bug.
But the mind boggles as to exactly what sort of bug it could be.

> It does happen!!

Um.  Really?  What does happen exactly?

> Call Trace:
>  [<a000000100011bd0>] show_stack+0x50/0xa0
>                                 sp=e000000146bbfbb0 bsp=e000000146bb0e08
>  [<a000000100011c50>] dump_stack+0x30/0x60
>                                 sp=e000000146bbfd80 bsp=e000000146bb0de8
>  [<a0000001000979a0>] get_signal_to_deliver+0x60/0x6e0
>                                 sp=e000000146bbfd80 bsp=e000000146bb0d80
>  [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
>                                 sp=e000000146bbfd80 bsp=e000000146bb0cd8
>  [<a000000100012650>] do_notify_resume_user+0xf0/0x140
>                                 sp=e000000146bbfe20 bsp=e000000146bb0ca8
>  [<a00000010000aac0>] notify_resume_user+0x40/0x60
>                                 sp=e000000146bbfe20 bsp=e000000146bb0c58
>  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>                                 sp=e000000146bbfe30 bsp=e000000146bb0c58
>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>                                 sp=e000000146bc0000 bsp=e000000146bb0c58

So this here shows a perfectly normal trace that bottoms out at a syscall
entry from user mode.  You seem to be saying that, somehow, inside
ptrace_stop(), we tried to return to user mode--I guess you mean losing the
kernel stack with the call chain leading to ptrace_stop()--and then
reentered the kernel as for a signal after a syscall.  

> I applied the following patch , and got the call trace above..
> If apply my RFC patch as antidote,  I don't see "deliver" ...

With just that diagnostic patch as shown, these might be two different
threads.  But I guess you've ruled that out somehow?  If this does in fact
happen in the thread that is supposed to be in ptrace_stop(), then the
trail we need to follow is in arch_ptrace_stop(), i.e. ia64_ptrace_stop().

> Is the problem clear now?  

I'm sorry, it's not at all clear to me.

> I will serve you until every thing is clear to you.

That's quite a commitment!  My full enlightenment may be a long time off.
I won't hold you to it once we've fixed this particular bug, though. ;-)

What should be happening is that ia64_ptrace_stop() should do its work,
possibly blocking, and then return to its caller in ptrace_stop().  At no
point should it be possible for ia64_ptrace_stop() to return directly to
user mode, or to reenter notify_resume_user() in any fashion.

Please focus on the exact code path taken inside the ia64_ptrace_stop()
call.  It should be possible to identify every step of that and see exactly
where it goes astray from what we expect.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  4:04                       ` Roland McGrath
@ 2008-05-27  5:49                         ` Luming Yu
  2008-05-27  6:12                           ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-27  5:49 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

>
>> It does happen!!
>
> Um.  Really?  What does happen exactly?

I revised patch a bit, and managed to get this:

Call Trace:
 [<a000000100011bd0>] show_stack+0x50/0xa0
                                sp=e000000141c9fbb0 bsp=e000000141c90ea8
 [<a000000100011c50>] dump_stack+0x30/0x60
                                sp=e000000141c9fd80 bsp=e000000141c90e90
 [<a000000100097a20>] get_signal_to_deliver+0xa0/0x720
                                sp=e000000141c9fd80 bsp=e000000141c90e28
 [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
                                sp=e000000141c9fd80 bsp=e000000141c90d78
 [<a000000100012650>] do_notify_resume_user+0xf0/0x140
                                sp=e000000141c9fe20 bsp=e000000141c90d48
 [<a00000010000aac0>] notify_resume_user+0x40/0x60
                                sp=e000000141c9fe20 bsp=e000000141c90cf8
 [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
                                sp=e000000141c9fe30 bsp=e000000141c90cf8
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000141ca0000 bsp=e000000141c90cf8
 [<a0000001000971c0>] ptrace_stop+0xa0/0x3e0
                                sp=e00000014716fdb0 bsp=e000000147160ca8
 [<a000000100097650>] ptrace_notify+0x150/0x1c0
                                sp=e00000014716fdb0 bsp=e000000147160c88
 [<a00000010002adb0>] syscall_trace+0x30/0xc0
                                sp=e00000014716fe30 bsp=e000000147160c70
 [<a00000010002aea0>] syscall_trace_enter+0x60/0xa0
                                sp=e00000014716fe30 bsp=e000000147160c18
 [<a00000010000a300>] ia64_trace_syscall+0x40/0x110
                                sp=e00000014716fe30 bsp=e000000147160c18
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000147170000 bsp=e000000147160c18

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  5:49                         ` Luming Yu
@ 2008-05-27  6:12                           ` Roland McGrath
  2008-05-27  6:25                             ` Petr Tesarik
                                               ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Roland McGrath @ 2008-05-27  6:12 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

>  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>  [<a0000001000971c0>] ptrace_stop+0xa0/0x3e0
>                                 sp=e00000014716fdb0 bsp=e000000147160ca8
>  [<a000000100097650>] ptrace_notify+0x150/0x1c0
>                                 sp=e00000014716fdb0 bsp=e000000147160c88

If this is a backtrace of a single thread on the same kernel stack, it has
a stack depth of over 84MB.  You still need to follow the path inside
ia64_ptrace_stop() to see how any of this is possible.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  6:12                           ` Roland McGrath
@ 2008-05-27  6:25                             ` Petr Tesarik
  2008-06-03  6:04                               ` Luming Yu
  2008-05-27  6:34                             ` Luming Yu
  2008-05-27  8:48                             ` Luming Yu
  2 siblings, 1 reply; 41+ messages in thread
From: Petr Tesarik @ 2008-05-27  6:25 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Luming Yu, LKML, linux-ia64

On Mon, 2008-05-26 at 23:12 -0700, Roland McGrath wrote:
> >  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
> >                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
> >  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
> >                                 sp=e000000141ca0000 bsp=e000000141c90cf8

Indeed, there seems to be a large hole here. So, this is either a bug in
the unwinder, or a bug in the RBS synchronization, which causes
corruption. My test machine currently needs some work to run 2.6.25
again, but I'll try your test case as soon as I re-install it later this
week.

Cheers,
Petr Tesarik

> >  [<a0000001000971c0>] ptrace_stop+0xa0/0x3e0
> >                                 sp=e00000014716fdb0 bsp=e000000147160ca8
> >  [<a000000100097650>] ptrace_notify+0x150/0x1c0
> >                                 sp=e00000014716fdb0 bsp=e000000147160c88
> 
> If this is a backtrace of a single thread on the same kernel stack, it has
> a stack depth of over 84MB.  You still need to follow the path inside
> ia64_ptrace_stop() to see how any of this is possible.
> 
> 
> Thanks,
> Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  6:12                           ` Roland McGrath
  2008-05-27  6:25                             ` Petr Tesarik
@ 2008-05-27  6:34                             ` Luming Yu
  2008-05-27  8:48                             ` Luming Yu
  2 siblings, 0 replies; 41+ messages in thread
From: Luming Yu @ 2008-05-27  6:34 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

On Tue, May 27, 2008 at 2:12 PM, Roland McGrath <roland@redhat.com> wrote:
>>  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>>                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>>                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>>  [<a0000001000971c0>] ptrace_stop+0xa0/0x3e0
>>                                 sp=e00000014716fdb0 bsp=e000000147160ca8
>>  [<a000000100097650>] ptrace_notify+0x150/0x1c0
>>                                 sp=e00000014716fdb0 bsp=e000000147160c88
>
> If this is a backtrace of a single thread on the same kernel stack, it has
> a stack depth of over 84MB.  You still need to follow the path inside

you are right, I just find outputs of two different thread are mixed!
Please just ignore the call trace above..

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  6:12                           ` Roland McGrath
  2008-05-27  6:25                             ` Petr Tesarik
  2008-05-27  6:34                             ` Luming Yu
@ 2008-05-27  8:48                             ` Luming Yu
  2008-05-28  9:14                               ` Luming Yu
  2008-05-30  8:05                               ` Roland McGrath
  2 siblings, 2 replies; 41+ messages in thread
From: Luming Yu @ 2008-05-27  8:48 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

> You still need to follow the path inside
> ia64_ptrace_stop() to see how any of this is possible.

I tend to believe singal "deliver" in ptrace_notify path is impossible..
Given the fact that if not block SIGTRAP, the test case just works on
ia64, I still believe some signals have been consumed without letting
debugger know.. but it is just my guess, We still need to investigate
how it happened..I assume you should know how the test case failed. PS
shows the test case hangs in process 6553 which seem to be impossible
... Please help me understand what kind of situation could cause it
happen. That would be very helpful.

root      6516  0.0  0.0 57344 5184 ttyS1    Ss+  19:33   0:00 -bash
root      6551  0.0  0.0  4992 2560 ttyS1    T    19:33   0:00  \_
strace -f ./test1
root      6552  0.0  0.0  4416 1792 ttyS1    T    19:33   0:00      \_ ./test1
root      6553  0.0  0.0 56832 4288 ttyS1    T    19:33   0:01
 \_ /bin/bash ./test2.sh

Thanks,
Luming

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  8:48                             ` Luming Yu
@ 2008-05-28  9:14                               ` Luming Yu
  2008-06-03  6:02                                 ` Luming Yu
  2008-05-30  8:05                               ` Roland McGrath
  1 sibling, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-05-28  9:14 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

Some correction about the test case hang,  the ps output should looks
like the following:
13925 S+ strace -f ./test1
13926 S+ ./test1
13927 T+  /bin/bash ./test2

I'm trying upstream kdump to get more detailed data for help analysis
the scenario..
But unfortunately upstream kernel just hang when I echo c to
sysrq-trigger.  Downgrade the kernel to 2.6.22,  'echo c' doesn't
hang, but I just got "zero" dump file...
Will try F9 later... But from the symptom shouldn't we ask this
question: why  process 13927 can't wake up?  Roland,  please confirm
if ptrace_untrace is the only way to get Traced process wake up?  Back
up to caller of ptrace_untrace, maybe we can find out why my RFC patch
happen to fix the problem... Any suggestions are welcome!
--Luming

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  8:48                             ` Luming Yu
  2008-05-28  9:14                               ` Luming Yu
@ 2008-05-30  8:05                               ` Roland McGrath
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2008-05-30  8:05 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

I'd say the first thing to do is understand the sequence of ptrace calls,
wait results seen by the ptracer, and the ptrace_stop() path, in your test
case.  Compare that on x86 and ia64, where the behavior you see differs.
When you pinpoint the difference in that sequence, we will have some sense
of what it is that really matters to your test.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-28  9:14                               ` Luming Yu
@ 2008-06-03  6:02                                 ` Luming Yu
  0 siblings, 0 replies; 41+ messages in thread
From: Luming Yu @ 2008-06-03  6:02 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

> I'm trying upstream kdump to get more detailed data for help analysis
> the scenario..
Upstream kdump doesn't work! don't know what cause the regression,
possibly kexec-tools...
Need investigation here!

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-05-27  6:25                             ` Petr Tesarik
@ 2008-06-03  6:04                               ` Luming Yu
  2008-06-03  9:01                                 ` Petr Tesarik
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-06-03  6:04 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Roland McGrath, LKML, linux-ia64

On Tue, May 27, 2008 at 2:25 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> On Mon, 2008-05-26 at 23:12 -0700, Roland McGrath wrote:
>> >  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>> >                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>> >  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>> >                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>
> Indeed, there seems to be a large hole here. So, this is either a bug in
> the unwinder, or a bug in the RBS synchronization, which causes
> corruption. My test machine currently needs some work to run 2.6.25
> again, but I'll try your test case as soon as I re-install it later this
> week.

Just want to check if the test case works for you?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03  6:04                               ` Luming Yu
@ 2008-06-03  9:01                                 ` Petr Tesarik
  2008-06-03 14:32                                   ` Petr Tesarik
  0 siblings, 1 reply; 41+ messages in thread
From: Petr Tesarik @ 2008-06-03  9:01 UTC (permalink / raw)
  To: Luming Yu; +Cc: Roland McGrath, LKML, linux-ia64

Luming Yu wrote:
> On Tue, May 27, 2008 at 2:25 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
>> On Mon, 2008-05-26 at 23:12 -0700, Roland McGrath wrote:
>>>>  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>>>>                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>>>>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>>>>                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>> Indeed, there seems to be a large hole here. So, this is either a bug in
>> the unwinder, or a bug in the RBS synchronization, which causes
>> corruption. My test machine currently needs some work to run 2.6.25
>> again, but I'll try your test case as soon as I re-install it later this
>> week.
> 
> Just want to check if the test case works for you?

Yes, the test case hangs here too. But the problem seems to be
elsewhere. Did you look into the strace output? This line is pretty
suspicious:

3258  clone2(child_stack=0, stack_size=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x200000000004e290) = 1

Obviously, strace cannot attach PID 1, and since it is not designed to
handle this situation, it hangs. I'm going to investigate why the return
value of the clone2 syscall is seen as 1 by the tracer. Might even turn
out to be a bug in strace...

Petr Tesarik


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03  9:01                                 ` Petr Tesarik
@ 2008-06-03 14:32                                   ` Petr Tesarik
  2008-06-03 21:01                                     ` Roland McGrath
  2008-06-04  2:16                                     ` Luming Yu
  0 siblings, 2 replies; 41+ messages in thread
From: Petr Tesarik @ 2008-06-03 14:32 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Luming Yu, Roland McGrath, LKML, linux-ia64

Petr Tesarik wrote:
> Luming Yu wrote:
>> On Tue, May 27, 2008 at 2:25 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
>>> On Mon, 2008-05-26 at 23:12 -0700, Roland McGrath wrote:
>>>>>  [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
>>>>>                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>>>>>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>>>>>                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>>> Indeed, there seems to be a large hole here. So, this is either a bug in
>>> the unwinder, or a bug in the RBS synchronization, which causes
>>> corruption. My test machine currently needs some work to run 2.6.25
>>> again, but I'll try your test case as soon as I re-install it later this
>>> week.
>> Just want to check if the test case works for you?
> 
> Yes, the test case hangs here too. But the problem seems to be
> elsewhere. Did you look into the strace output? This line is pretty
> suspicious:
> 
> 3258  clone2(child_stack=0, stack_size=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x200000000004e290) = 1
> 
> Obviously, strace cannot attach PID 1, and since it is not designed to
> handle this situation, it hangs. I'm going to investigate why the return
> value of the clone2 syscall is seen as 1 by the tracer. Might even turn
> out to be a bug in strace...

It's definitely a bug in strace. For some reason (I don't care about)
the execve() syscall produces an extra notification. However, this
notification message is suppressed when SIGTRAP is blocked. This
explains why the test case fails only when SIGTRAP is blocked.

Now, you may ask why it only fails on ia64 and not on i386 or x86_64.
Well, I was so good that I even looked into strace sources to make sure.
Whereas for i386 and x86_64, the value of EAX/RAX is checked for -ENOSYS
in syscall_fixup(), for ia64 the first ptrace() after an execve() is
unconditionally ignored, see code in get_scno().

I don't know why Luming's fix helps here, but, please, fix strace, don't
introduce weird behaviour in the kernel.

The only thing I'm willing to talk about is why the extra notification
message is sent, and how userspace (strace) is supposed to recognize it.
FWIW the backtrace (system tap was at __group_send_sig_info):

 0xa0000001000b1a60 : __group_send_sig_info+0x0/0x180 []
 0xa0000001000b1e30 : do_notify_parent_cldstop+0x250/0x2c0 []
 0xa0000001000b2230 : ptrace_stop+0x2b0/0x3c0 []
 0xa0000001000b5200 : get_signal_to_deliver+0x200/0xa40 []
 0xa000000100035920 : ia64_do_signal+0xa0/0xee0 []
 0xa000000100014b60 : do_notify_resume_user+0x100/0x160 []
 0xa00000010000d040 : notify_resume_user+0x40/0x60 []
 0xa00000010000cf40 : skip_rbs_switch+0xf0/0x150 []
 0xa000000000010620 : __kernel_syscall_via_break+0x0/0x20 []

Regards,
Petr Tesarik


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03 14:32                                   ` Petr Tesarik
@ 2008-06-03 21:01                                     ` Roland McGrath
  2008-06-03 21:31                                       ` Luck, Tony
  2008-06-04  2:16                                     ` Luming Yu
  1 sibling, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-06-03 21:01 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Luming Yu, LKML, linux-ia64

> It's definitely a bug in strace. For some reason (I don't care about)
> the execve() syscall produces an extra notification. However, this
> notification message is suppressed when SIGTRAP is blocked. This
> explains why the test case fails only when SIGTRAP is blocked.

What do you mean by "extra"?  There is a SIGTRAP sent after execve
completes when ptraced, even when PTRACE_SYSCALL is not being used.
So for an execve that succeeds under PTRACE_SYSCALL, there is a
ptrace_notify at syscall entry, then a SIGTRAP queued (i.e., not seen
by ptrace if blocked), then a ptrace_notify at syscall exit.  If
that's what's happening (including the blocked SIGTRAP not being seen
by the ptracer, i.e. strace), then there is no mystery (and no bug).


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03 21:01                                     ` Roland McGrath
@ 2008-06-03 21:31                                       ` Luck, Tony
  2008-06-03 22:13                                         ` Roland McGrath
  2008-06-10  8:23                                         ` Luming Yu
  0 siblings, 2 replies; 41+ messages in thread
From: Luck, Tony @ 2008-06-03 21:31 UTC (permalink / raw)
  To: Roland McGrath, Petr Tesarik; +Cc: Luming Yu, LKML, linux-ia64

> What do you mean by "extra"?  There is a SIGTRAP sent after execve
> completes when ptraced, even when PTRACE_SYSCALL is not being used.
> So for an execve that succeeds under PTRACE_SYSCALL, there is a
> ptrace_notify at syscall entry, then a SIGTRAP queued (i.e., not seen
> by ptrace if blocked), then a ptrace_notify at syscall exit.  If
> that's what's happening (including the blocked SIGTRAP not being seen
> by the ptracer, i.e. strace), then there is no mystery (and no bug).

This might not be the same bug ... but I do have a definite 100%
reproducible bug (latest git kernel, old version of strace (4.5.15-1.el4.1))

Run:

	$ strace -o logit -f make

in any directory where make is actually going to have to do some
work.  You'll see that the command hangs after make outputs the
first action that it will take.  Looking at the stack traces of
the 3 processes involved it seems that make forked, the child
stopped in ptrace waiting for some action from strace, but strace
isn't woken from its sleep in wait().

Backtrace of pid 6442 (strace)

Call Trace:
 [<a0000001007069b0>] schedule+0x11f0/0x1380
                                sp=e0000001b28cfdb0 bsp=e0000001b28c0e00
 [<a0000001000842d0>] do_wait+0x1110/0x1520
                                sp=e0000001b28cfdd0 bsp=e0000001b28c0d58
 [<a0000001000849c0>] sys_wait4+0x140/0x1a0
                                sp=e0000001b28cfe30 bsp=e0000001b28c0cd8
 [<a00000010000aa60>] ia64_ret_from_syscall+0x0/0x20
                                sp=e0000001b28cfe30 bsp=e0000001b28c0cd8
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e0000001b28d0000 bsp=e0000001b28c0cd8

Backtrace of pid 6443 (make)

Call Trace:
 [<a0000001007069b0>] schedule+0x11f0/0x1380
                                sp=e0000001b768fb40 bsp=e0000001b7680d58
 [<a000000100707800>] schedule_timeout+0x40/0x180
                                sp=e0000001b768fb60 bsp=e0000001b7680d28
 [<a000000100706d60>] wait_for_common+0x220/0x380
                                sp=e0000001b768fb90 bsp=e0000001b7680cd8
 [<a000000100706f00>] wait_for_completion+0x40/0x60
                                sp=e0000001b768fbf0 bsp=e0000001b7680cb8
 [<a0000001000794d0>] do_fork+0x430/0x4a0
                                sp=e0000001b768fbf0 bsp=e0000001b7680c60
 [<a00000010000a340>] sys_clone+0x60/0x80
                                sp=e0000001b768fc20 bsp=e0000001b7680c10
 [<a00000010000a990>] ia64_trace_syscall+0xd0/0x110
                                sp=e0000001b768fe30 bsp=e0000001b7680c10
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e0000001b7690000 bsp=e0000001b7680c10

Backtrace of pid 6444 (make)

Call Trace:
 [<a0000001007069b0>] schedule+0x11f0/0x1380
                                sp=e0000001b803fd60 bsp=e0000001b8030dd8
 [<a000000100097590>] ptrace_stop+0x2d0/0x380
                                sp=e0000001b803fd80 bsp=e0000001b8030da0
 [<a000000100097c90>] get_signal_to_deliver+0x1d0/0x6a0
                                sp=e0000001b803fd80 bsp=e0000001b8030d38
 [<a000000100034a10>] ia64_do_signal+0xb0/0xd00
                                sp=e0000001b803fd80 bsp=e0000001b8030c90
 [<a000000100012c60>] do_notify_resume_user+0x100/0x180
                                sp=e0000001b803fe20 bsp=e0000001b8030c60
 [<a00000010000b0c0>] notify_resume_user+0x40/0x60
                                sp=e0000001b803fe20 bsp=e0000001b8030c10
 [<a00000010000aff0>] skip_rbs_switch+0xe0/0x110
                                sp=e0000001b803fe30 bsp=e0000001b8030c10
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e0000001b8040000 bsp=e0000001b8030c10


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03 21:31                                       ` Luck, Tony
@ 2008-06-03 22:13                                         ` Roland McGrath
  2008-06-10  8:23                                         ` Luming Yu
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2008-06-03 22:13 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Petr Tesarik, Luming Yu, LKML, linux-ia64

> This might not be the same bug ... but I do have a definite 100%
> reproducible bug (latest git kernel, old version of strace (4.5.15-1.el4.1))

Please start a thread with a sensical subject line about that.

> Backtrace of pid 6443 (make)
> 
> Call Trace:
>  [<a0000001007069b0>] schedule+0x11f0/0x1380
>                                 sp=e0000001b768fb40 bsp=e0000001b7680d58
>  [<a000000100707800>] schedule_timeout+0x40/0x180
>                                 sp=e0000001b768fb60 bsp=e0000001b7680d28
>  [<a000000100706d60>] wait_for_common+0x220/0x380
>                                 sp=e0000001b768fb90 bsp=e0000001b7680cd8
>  [<a000000100706f00>] wait_for_completion+0x40/0x60
>                                 sp=e0000001b768fbf0 bsp=e0000001b7680cb8
>  [<a0000001000794d0>] do_fork+0x430/0x4a0
>                                 sp=e0000001b768fbf0 bsp=e0000001b7680c60
>  [<a00000010000a340>] sys_clone+0x60/0x80
>                                 sp=e0000001b768fc20 bsp=e0000001b7680c10
>  [<a00000010000a990>] ia64_trace_syscall+0xd0/0x110
>                                 sp=e0000001b768fe30 bsp=e0000001b7680c10
>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>                                 sp=e0000001b7690000 bsp=e0000001b7680c10

This trace (do_fork->wait_for_completion) tells us this is a vfork call.
It is waiting for its child (presumably 6444) to exit or exec.

> Backtrace of pid 6444 (make)
> 
> Call Trace:
>  [<a0000001007069b0>] schedule+0x11f0/0x1380
>                                 sp=e0000001b803fd60 bsp=e0000001b8030dd8
>  [<a000000100097590>] ptrace_stop+0x2d0/0x380
>                                 sp=e0000001b803fd80 bsp=e0000001b8030da0
>  [<a000000100097c90>] get_signal_to_deliver+0x1d0/0x6a0
>                                 sp=e0000001b803fd80 bsp=e0000001b8030d38
>  [<a000000100034a10>] ia64_do_signal+0xb0/0xd00
>                                 sp=e0000001b803fd80 bsp=e0000001b8030c90
>  [<a000000100012c60>] do_notify_resume_user+0x100/0x180
>                                 sp=e0000001b803fe20 bsp=e0000001b8030c60
>  [<a00000010000b0c0>] notify_resume_user+0x40/0x60
>                                 sp=e0000001b803fe20 bsp=e0000001b8030c10
>  [<a00000010000aff0>] skip_rbs_switch+0xe0/0x110
>                                 sp=e0000001b803fe30 bsp=e0000001b8030c10
>  [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
>                                 sp=e0000001b8040000 bsp=e0000001b8030c10

This is the normal trace for the child having received a signal and stopped
to tell ptrace about it (not a syscall tracing stop).

I think you need to look into what strace is doing.  There is far too much
going to know much of anything just from the kernel state where the
processes sit.  In particular, the sequence of ptrace and wait calls strace
made.  If the same strace (identical everything) behaved differently with
an older kernel, then compare the sequence of ptrace and wait calls and see
where it differs.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03 14:32                                   ` Petr Tesarik
  2008-06-03 21:01                                     ` Roland McGrath
@ 2008-06-04  2:16                                     ` Luming Yu
  2008-06-04  9:16                                       ` Petr Tesarik
  1 sibling, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-06-04  2:16 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Roland McGrath, LKML, linux-ia64

> It's definitely a bug in strace. For some reason (I don't care about)
> the execve() syscall produces an extra notification. However, this
> notification message is suppressed when SIGTRAP is blocked. This
> explains why the test case fails only when SIGTRAP is blocked.

This is exact problem I suspected and I was trying to address in my hack..
Since there are several processes involved in the pretty complex
ptrace scenario.,
I need to capture all processes context with kdump to confirm this is
exact root-cause
for the problem. But kdump doesn't work for me..I'm trying to solve it now..

I'm also in doubt about the semantic correctness of the test case..
Since SIGTRAP is so necessary to get ptrace work, is it legitimate to
block it in test case?

One more thing I need to say is:
Same strace works for utrace enabled kernel on IA64.. If the bug is in
strace, how could it happen?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-04  2:16                                     ` Luming Yu
@ 2008-06-04  9:16                                       ` Petr Tesarik
  2008-06-05  1:49                                         ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Petr Tesarik @ 2008-06-04  9:16 UTC (permalink / raw)
  To: Luming Yu; +Cc: Roland McGrath, LKML, linux-ia64

Luming Yu wrote:
>> It's definitely a bug in strace. For some reason (I don't care about)
>> the execve() syscall produces an extra notification. However, this
>> notification message is suppressed when SIGTRAP is blocked. This
>> explains why the test case fails only when SIGTRAP is blocked.
> 
> This is exact problem I suspected and I was trying to address in my hack..
> Since there are several processes involved in the pretty complex
> ptrace scenario.,
> I need to capture all processes context with kdump to confirm this is
> exact root-cause
> for the problem. But kdump doesn't work for me..I'm trying to solve it now..
> 
> I'm also in doubt about the semantic correctness of the test case..
> Since SIGTRAP is so necessary to get ptrace work, is it legitimate to
> block it in test case?
> 
> One more thing I need to say is:
> Same strace works for utrace enabled kernel on IA64.. If the bug is in
> strace, how could it happen?

No idea, but send me the strace.log file from running

strace -o strace.log strace -f -o log.txt ./test1

and I may be able to tell.

Petr Tesarik

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-04  9:16                                       ` Petr Tesarik
@ 2008-06-05  1:49                                         ` Luming Yu
  2008-06-05 11:16                                           ` Petr Tesarik
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-06-05  1:49 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Roland McGrath, LKML, linux-ia64

[-- Attachment #1: Type: text/plain, Size: 1225 bytes --]

On Wed, Jun 4, 2008 at 5:16 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> Luming Yu wrote:
>>> It's definitely a bug in strace. For some reason (I don't care about)
>>> the execve() syscall produces an extra notification. However, this
>>> notification message is suppressed when SIGTRAP is blocked. This
>>> explains why the test case fails only when SIGTRAP is blocked.
>>
>> This is exact problem I suspected and I was trying to address in my hack..
>> Since there are several processes involved in the pretty complex
>> ptrace scenario.,
>> I need to capture all processes context with kdump to confirm this is
>> exact root-cause
>> for the problem. But kdump doesn't work for me..I'm trying to solve it now..
>>
>> I'm also in doubt about the semantic correctness of the test case..
>> Since SIGTRAP is so necessary to get ptrace work, is it legitimate to
>> block it in test case?
>>
>> One more thing I need to say is:
>> Same strace works for utrace enabled kernel on IA64.. If the bug is in
>> strace, how could it happen?
>
> No idea, but send me the strace.log file from running
>
> strace -o strace.log strace -f -o log.txt ./test1
>
> and I may be able to tell.

Please check the attachment!

>
> Petr Tesarik
>

[-- Attachment #2: strace.log --]
[-- Type: application/octet-stream, Size: 16090 bytes --]

5717  execve("./test1", ["./test1"], [/* 23 vars */]) = 1
5717  brk(0)                            = 0x6000000000004000
5717  uname({sys="Linux", node="tigerF", ...}) = 0
5717  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
5717  open("/etc/ld.so.cache", O_RDONLY) = 3
5717  fstat(3, {st_mode=S_IFREG|0644, st_size=47098, ...}) = 0
5717  mmap(NULL, 47098, PROT_READ, MAP_PRIVATE, 3, 0) = 0x200000000003c000
5717  close(3)                          = 0
5717  open("/lib/libc.so.6.1", O_RDONLY) = 3
5717  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\340:\2\0\0\0\0\0"..., 832) = 832
5717  fstat(3, {st_mode=S_IFREG|0755, st_size=2865232, ...}) = 0
5717  mmap(NULL, 2633344, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000050000
5717  mprotect(0x20000000002bc000, 49152, PROT_NONE) = 0
5717  mmap(0x20000000002c8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x268000) = 0x20000000002c8000
5717  mmap(0x20000000002d0000, 11904, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000002d0000
5717  close(3)                          = 0
5717  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20000000002d4000
5717  mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x20000000002d8000
5717  munmap(0x200000000003c000, 47098) = 0
5717  rt_sigprocmask(SIG_SETMASK, [TRAP], NULL, 8) = 0
5717  pipe([3, 4])                      = 3
5717  clone2(child_stack=0, stack_size=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x20000000002d4090) = 5718
5717  close(4 <unfinished ...>
5718  close(3 <unfinished ...>
5717  <... close resumed> )             = 0
5718  <... close resumed> )             = 0
5718  close(1)                          = 0
5718  dup(4)                            = 1
5718  close(4)                          = 0
5717  read(3,  <unfinished ...>
5718  execve("./test2.sh", [], [/* 23 vars */]) = 1
5718  execve("", [0x840c001000100003, 0x26230c14203032, 0x8cb0008800140a81, 0xa643100801808402, 0x2400905000040088, 0x11600a0072000001, 0xad814a00402e0, 0x2200012464009344, 0x1180418512c40026, 0x400003081880008, 0x2100010840910404, 0x8045120000800003, 0x6400000c0000600, 0xc20063440501400, 0x1048015002008081, 0xe02226005008c010, ...], [/* 0 vars */]) = 1
5718  uname({sys="Linux", node="tigerF", ...}) = 0
5718  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
5718  open("/etc/ld.so.cache", O_RDONLY) = 3
5718  fstat(3, {st_mode=S_IFREG|0644, st_size=47098, ...}) = 0
5718  mmap(NULL, 47098, PROT_READ, MAP_PRIVATE, 3, 0) = 0x200000000003c000
5718  close(3)                          = 0
5718  open("/lib/libtermcap.so.2", O_RDONLY) = 3
5718  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\300\25\0\0\0\0\0\0"..., 832) = 832
5718  fstat(3, {st_mode=S_IFREG|0755, st_size=25272, ...}) = 0
5718  mmap(NULL, 88760, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000050000
5718  mprotect(0x2000000000058000, 49152, PROT_NONE) = 0
5718  mmap(0x2000000000064000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x2000000000064000
5718  close(3)                          = 0
5718  open("/lib/libdl.so.2", O_RDONLY) = 3
5718  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\300\24\0\0\0\0\0\0"..., 832) = 832
5718  fstat(3, {st_mode=S_IFREG|0755, st_size=27088, ...}) = 0
5718  mmap(NULL, 84288, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000068000
5718  mprotect(0x2000000000070000, 49152, PROT_NONE) = 0
5718  mmap(0x200000000007c000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x200000000007c000
5718  close(3)                          = 0
5718  open("/lib/libc.so.6.1", O_RDONLY) = 3
5718  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\340:\2\0\0\0\0\0"..., 832) = 832
5718  fstat(3, {st_mode=S_IFREG|0755, st_size=2865232, ...}) = 0
5718  mmap(NULL, 2633344, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000080000
5718  mprotect(0x20000000002ec000, 49152, PROT_NONE) = 0
5718  mmap(0x20000000002f8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x268000) = 0x20000000002f8000
5718  mmap(0x2000000000300000, 11904, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2000000000300000
5718  close(3)                          = 0
5718  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000304000
5718  mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000308000
5718  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000310000
5718  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000314000
5718  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000318000
5718  munmap(0x200000000003c000, 47098) = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  open("/dev/tty", O_RDWR|O_NONBLOCK) = 3
5718  close(3)                          = 0
5718  brk(0)                            = 0x600000000001c000
5718  brk(0x6000000000040000)           = 0x6000000000040000
5718  open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
5718  fstat(3, {st_mode=S_IFREG|0644, st_size=58759552, ...}) = 0
5718  mmap(NULL, 58759552, PROT_READ, MAP_PRIVATE, 3, 0) = 0x200000000031c000
5718  close(3)                          = 0
5718  getuid()                          = 0
5718  getgid()                          = 0
5718  geteuid()                         = 0
5718  getegid()                         = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  gettimeofday({1212673242, 35736}, NULL) = 0
5718  open("/proc/meminfo", O_RDONLY)   = 3
5718  fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
5718  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000003b28000
5718  read(3, "MemTotal:      7597840 kB\nMemFre"..., 16384) = 780
5718  close(3)                          = 0
5718  munmap(0x2000000003b28000, 65536) = 0
5718  rt_sigaction(SIGCHLD, {SIG_DFL}, {SIG_DFL}, 8) = 0
5718  rt_sigaction(SIGCHLD, {SIG_DFL}, {SIG_DFL}, 8) = 0
5718  rt_sigaction(SIGINT, {SIG_DFL}, {SIG_IGN}, 8) = 0
5718  rt_sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 8) = 0
5718  rt_sigaction(SIGQUIT, {SIG_DFL}, {SIG_IGN}, 8) = 0
5718  rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_DFL}, 8) = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_IGN}, 8) = 0
5718  uname({sys="Linux", node="tigerF", ...}) = 0
5718  stat("/tmp/testpro", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
5718  stat(".", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
5718  getpid()                          = 5718
5718  open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 3
5718  fstat(3, {st_mode=S_IFREG|0644, st_size=25462, ...}) = 0
5718  mmap(NULL, 25462, PROT_READ, MAP_SHARED, 3, 0) = 0x2000000003c00000
5718  close(3)                          = 0
5718  getppid()                         = 5717
5718  getpgid(0)                        = 5716
5718  rt_sigaction(SIGCHLD, {0x4000000000163b30, [], 0}, {SIG_DFL}, 8) = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  open("./test2.sh", O_RDONLY)      = 3
5718  ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x60000fffffac34c0) = -1 ENOTTY (Inappropriate ioctl for device)
5718  lseek(3, 0, SEEK_CUR)             = 0
5718  read(3, "#!/bin/sh\n\n/bin/date\n", 80) = 21
5718  lseek(3, 0, SEEK_SET)             = 0
5718  getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
5718  dup2(3, 255)                      = 255
5718  close(3)                          = 0
5718  fcntl(255, F_SETFD, FD_CLOEXEC)   = 0
5718  fcntl(255, F_GETFL)               = 0x8000 (flags O_RDONLY|0x8000)
5718  fstat(255, {st_mode=S_IFREG|0775, st_size=21, ...}) = 0
5718  lseek(255, 0, SEEK_CUR)           = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  read(255, "#!/bin/sh\n\n/bin/date\n", 21) = 21
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  rt_sigprocmask(SIG_BLOCK, [INT CHLD], [TRAP], 8) = 0
5718  clone2(child_stack=0, stack_size=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x20000000003047c0) = 5719
5718  rt_sigprocmask(SIG_SETMASK, [TRAP],  <unfinished ...>
5719  close(255 <unfinished ...>
5718  <... rt_sigprocmask resumed> NULL, 8) = 0
5719  <... close resumed> )             = 0
5718  rt_sigprocmask(SIG_BLOCK, [CHLD],  <unfinished ...>
5719  rt_sigprocmask(SIG_SETMASK, [TRAP],  <unfinished ...>
5718  <... rt_sigprocmask resumed> [TRAP], 8) = 0
5719  <... rt_sigprocmask resumed> NULL, 8) = 0
5719  rt_sigaction(SIGTSTP, {SIG_DFL},  <unfinished ...>
5718  rt_sigprocmask(SIG_SETMASK, [TRAP],  <unfinished ...>
5719  <... rt_sigaction resumed> {SIG_DFL}, 8) = 0
5718  <... rt_sigprocmask resumed> NULL, 8) = 0
5719  rt_sigaction(SIGTTIN, {SIG_DFL},  <unfinished ...>
5718  rt_sigprocmask(SIG_BLOCK, [CHLD],  <unfinished ...>
5719  <... rt_sigaction resumed> {SIG_IGN}, 8) = 0
5718  <... rt_sigprocmask resumed> [TRAP], 8) = 0
5719  rt_sigaction(SIGTTOU, {SIG_DFL},  <unfinished ...>
5718  rt_sigaction(SIGINT, {0x4000000000163b80, [], 0},  <unfinished ...>
5719  <... rt_sigaction resumed> {SIG_IGN}, 8) = 0
5718  <... rt_sigaction resumed> {SIG_IGN}, 8) = 0
5719  rt_sigaction(SIGINT, {SIG_IGN},  <unfinished ...>
5718  rt_sigaction(SIGINT, {SIG_IGN},  <unfinished ...>
5719  <... rt_sigaction resumed> {SIG_IGN}, 8) = 0
5718  <... rt_sigaction resumed> {0x4000000000163b80, [], 0}, 8) = 0
5719  rt_sigaction(SIGQUIT, {SIG_IGN},  <unfinished ...>
5718  wait4(-1,  <unfinished ...>
5719  <... rt_sigaction resumed> {SIG_IGN}, 8) = 0
5719  rt_sigaction(SIGCHLD, {SIG_DFL}, {0x4000000000163b30, [], 0}, 8) = 0
5719  execve("/bin/date", ["/bin/date"...], [/* 22 vars */]) = 1
5719  brk(0)                            = 0x6000000000008000
5719  uname({sys="Linux", node="tigerF", ...}) = 0
5719  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
5719  open("/etc/ld.so.cache", O_RDONLY) = 3
5719  fstat(3, {st_mode=S_IFREG|0644, st_size=47098, ...}) = 0
5719  mmap(NULL, 47098, PROT_READ, MAP_PRIVATE, 3, 0) = 0x200000000003c000
5719  close(3)                          = 0
5719  open("/lib/librt.so.1", O_RDONLY) = 3
5719  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\0/\0\0\0\0\0\0"..., 832) = 832
5719  fstat(3, {st_mode=S_IFREG|0755, st_size=77416, ...}) = 0
5719  mmap(NULL, 128000, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000050000
5719  mprotect(0x2000000000060000, 49152, PROT_NONE) = 0
5719  mmap(0x200000000006c000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xc000) = 0x200000000006c000
5719  close(3)                          = 0
5719  open("/lib/libc.so.6.1", O_RDONLY) = 3
5719  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\340:\2\0\0\0\0\0"..., 832) = 832
5719  fstat(3, {st_mode=S_IFREG|0755, st_size=2865232, ...}) = 0
5719  mmap(NULL, 2633344, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2000000000070000
5719  mprotect(0x20000000002dc000, 49152, PROT_NONE) = 0
5719  mmap(0x20000000002e8000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x268000) = 0x20000000002e8000
5719  mmap(0x20000000002f0000, 11904, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000002f0000
5719  close(3)                          = 0
5719  open("/lib/libpthread.so.0", O_RDONLY) = 3
5719  read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0pt\0\0\0\0\0\0"..., 832) = 832
5719  fstat(3, {st_mode=S_IFREG|0755, st_size=210024, ...}) = 0
5719  mmap(NULL, 242272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x20000000002f4000
5719  mprotect(0x200000000031c000, 49152, PROT_NONE) = 0
5719  mmap(0x2000000000328000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x24000) = 0x2000000000328000
5719  mmap(0x200000000032c000, 12896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x200000000032c000
5719  close(3)                          = 0
5719  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000330000
5719  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000334000
5719  mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000338000
5719  mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000340000
5719  munmap(0x200000000003c000, 47098) = 0
5719  set_tid_address(0x20000000003309a0) = 5719
5719  get_robust_list(0x20000000003309b0, 0x18, 0x4000000000000040) = 0
5719  rt_sigaction(SIGRT0, {0x2000000000048f30, [], SA_SIGINFO}, NULL, 8) = 0
5719  rt_sigaction(SIGRT1, {0x2000000000048f20, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
5719  rt_sigprocmask(SIG_UNBLOCK, [RT0 RT1], NULL, 8) = 0
5719  getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM_INFINITY}) = 0
5719  brk(0)                            = 0x6000000000008000
5719  brk(0x600000000002c000)           = 0x600000000002c000
5719  open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
5719  fstat(3, {st_mode=S_IFREG|0644, st_size=58759552, ...}) = 0
5719  mmap(NULL, 58759552, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2000000000344000
5719  close(3)                          = 0
5719  clock_gettime(CLOCK_REALTIME, {1212673242, 58465043}) = 0
5719  open("/etc/localtime", O_RDONLY)  = 3
5719  fstat(3, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
5719  fstat(3, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
5719  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000003b50000
5719  read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 16384) = 3519
5719  lseek(3, -2252, SEEK_CUR)         = 1267
5719  read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 16384) = 2252
5719  close(3)                          = 0
5719  munmap(0x2000000003b50000, 65536) = 0
5719  stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
5719  fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
5719  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000003b50000
5719  stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
5719  write(1, "Thu Jun  5 09:40:42 EDT 2008\n", 29) = 29
5717  <... read resumed> "Thu Jun  5 09:40:42 EDT 2008\n", 8192) = 29
5719  close(1 <unfinished ...>
5717  fstat(1,  <unfinished ...>
5719  <... close resumed> )             = 0
5719  munmap(0x2000000003b50000, 65536 <unfinished ...>
5717  <... fstat resumed> {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
5719  <... munmap resumed> )            = 0
5717  mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
5719  exit_group(0)                     = ?
5717  <... mmap resumed> )              = 0x20000000002e0000
5718  <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 5719
5717  write(1, "Thu Jun  5 09:40:42 EDT 2008\n", 29 <unfinished ...>
5718  rt_sigprocmask(SIG_SETMASK, [TRAP], NULL, 8) = 0
5717  <... write resumed> )             = 29
5718  --- SIGCHLD (Child exited) @ a000000000010621 (1657) ---
5717  read(3,  <unfinished ...>
5718  wait4(-1, 0x60000fffffac29cc, WNOHANG, NULL) = -1 ECHILD (No child processes)
5718  rt_sigreturn()                    = ? (mask now [HUP INT ILL BUS KILL SEGV USR2 PIPE STKFLT CHLD STOP TTOU URG XCPU XFSZ VTALRM PROF PWR SYS])
5718  rt_sigaction(SIGINT, {SIG_IGN}, {SIG_IGN}, 8) = 0
5718  rt_sigprocmask(SIG_BLOCK, NULL, [TRAP], 8) = 0
5718  read(255, "", 21)                 = 0
5718  exit_group(0)                     = ?
5717  <... read resumed> "", 8192)      = 0
5717  --- SIGCHLD (Child exited) @ a000000000010621 (1656) ---
5717  close(3)                          = 0
5717  wait4(-1, NULL, 0, NULL)          = 5718
5717  exit_group(0)                     = ?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-05  1:49                                         ` Luming Yu
@ 2008-06-05 11:16                                           ` Petr Tesarik
  2008-06-06  0:07                                             ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Petr Tesarik @ 2008-06-05 11:16 UTC (permalink / raw)
  To: Luming Yu; +Cc: Roland McGrath, LKML, linux-ia64

On Thu, 2008-06-05 at 09:49 +0800, Luming Yu wrote:
> On Wed, Jun 4, 2008 at 5:16 PM, Petr Tesarik <ptesarik@suse.cz> wrote:
> > Luming Yu wrote:
> >>> It's definitely a bug in strace. For some reason (I don't care about)
> >>> the execve() syscall produces an extra notification. However, this
> >>> notification message is suppressed when SIGTRAP is blocked. This
> >>> explains why the test case fails only when SIGTRAP is blocked.
> >>
> >> This is exact problem I suspected and I was trying to address in my hack..
> >> Since there are several processes involved in the pretty complex
> >> ptrace scenario.,
> >> I need to capture all processes context with kdump to confirm this is
> >> exact root-cause
> >> for the problem. But kdump doesn't work for me..I'm trying to solve it now..
> >>
> >> I'm also in doubt about the semantic correctness of the test case..
> >> Since SIGTRAP is so necessary to get ptrace work, is it legitimate to
> >> block it in test case?
> >>
> >> One more thing I need to say is:
> >> Same strace works for utrace enabled kernel on IA64.. If the bug is in
> >> strace, how could it happen?
> >
> > No idea, but send me the strace.log file from running
> >
> > strace -o strace.log strace -f -o log.txt ./test1
> >
> > and I may be able to tell.
> 
> Please check the attachment!
> 
> >
> > Petr Tesarik
> >

Hm, I think without utrace, it gets out-of-sync once, so syscall entries
and exits are swapped from that point on. With utrace, it gets
out-of-sync _twice_, so it eventually looks fine. But the strace output
definitely looks incorrect even with utrace:

5718  execve("./test2.sh", [], [/* 23 vars */]) = 1
5718  execve("", [0x840c001000100003, 0x26230c14203032, 0x8cb0008800140a81, 0xa643100801808402, 0x2400905000040088, 0x11600a0072000001, 0xad814a00402e0, 0x2200012464009344, 0x1180418512c40026, 0x400003081880008, 0x2100010840910404, 0x8045120000800003, 0x6400000c0000600, 0xc20063440501400, 0x1048015002008081, 0xe02226005008c010, ...], [/* 0 vars */]) = 1
5718  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)

Note that strace missed a brk() syscall, although I can actually see
this in the other trace you sent me:

wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5704
ptrace(PTRACE_PEEKUSER, 5704, psr, NULL) = 4398046511120
ptrace(PTRACE_PEEKUSER, 5704, r15, NULL) = 1060
ptrace(PTRACE_SYSCALL, 5704, 0x1, SIG_0) = 0

Look at the value of r15, and compare it with unistd.h:
#define __NR_brk                        1060

I _guess_ this is caused by the fact that test2.sh is a shell script, so
the kernel executes the shell, and maybe utrace produces a second execve
notifications in this case? Roland, can you shed some light?

Petr Tesarik

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-05 11:16                                           ` Petr Tesarik
@ 2008-06-06  0:07                                             ` Roland McGrath
  2008-09-09  3:06                                               ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-06-06  0:07 UTC (permalink / raw)
  To: Petr Tesarik; +Cc: Luming Yu, LKML, linux-ia64

> I _guess_ this is caused by the fact that test2.sh is a shell script, so
> the kernel executes the shell, and maybe utrace produces a second execve
> notifications in this case? Roland, can you shed some light?

Not really.  The utrace kernels Luming is trying are intended to match the
vanilla ptrace behavior.  I don't think it's very useful to worry about the
difference between some utrace kernel and the current vanilla kernel.
Let's just look at what the current vanilla kernel is doing and compare
that to what an older vanilla kernel did if older versions produced
different results for the test case.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-03 21:31                                       ` Luck, Tony
  2008-06-03 22:13                                         ` Roland McGrath
@ 2008-06-10  8:23                                         ` Luming Yu
  1 sibling, 0 replies; 41+ messages in thread
From: Luming Yu @ 2008-06-10  8:23 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Roland McGrath, Petr Tesarik, LKML, linux-ia64

On Wed, Jun 4, 2008 at 5:31 AM, Luck, Tony <tony.luck@intel.com> wrote:
>> What do you mean by "extra"?  There is a SIGTRAP sent after execve
>> completes when ptraced, even when PTRACE_SYSCALL is not being used.
>> So for an execve that succeeds under PTRACE_SYSCALL, there is a
>> ptrace_notify at syscall entry, then a SIGTRAP queued (i.e., not seen
>> by ptrace if blocked), then a ptrace_notify at syscall exit.  If
>> that's what's happening (including the blocked SIGTRAP not being seen
>> by the ptracer, i.e. strace), then there is no mystery (and no bug).
>
> This might not be the same bug ... but I do have a definite 100%
> reproducible bug (latest git kernel, old version of strace (4.5.15-1.el4.1))
>
> Run:
>
>        $ strace -o logit -f make

For this test case, utrace doesn't work too..

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-06-06  0:07                                             ` Roland McGrath
@ 2008-09-09  3:06                                               ` Luming Yu
  2008-09-10  5:55                                                 ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-09-09  3:06 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

[-- Attachment #1: Type: text/plain, Size: 5131 bytes --]

Just have chance to re-check the problem. I have spotted the place that strace
misbehaves when SIGTRAP is blocked by task being straced with -f flag
(follow trace)
The problem is that after debugee blocking SIGTRAP the task's exec
path would not wake up debugger (strace) with this signal. But strace
don't know about it, and expect such a wake up. Upon receiving a
subsequent wake up,strace still thought it was a wake up from "SIGTRAP
in previous exec path". From now on, Debuger and Debugee misunderstand
each other..

With the test case posted before in this thread and a customized
kernel, I got the following debug info from kernel:

syscall_trace_enter exiting (4717)
report_exec:pid= 4717 parent=4715, real_parent=4716,current->ptrace =00000001,
report_exec:pid= 4717 parent=4715, real_parent=4716,current->ptrace =00000001,
syscall_trace_leave entering (4717, syscall:1033)
 ..sig_ignored=1..
4715, PTRACE_PEEKUSR 4717 @addr 00000830 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008c0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008d0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008d0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008c0 return 00000000
syscall_trace current(4717)->exit_code(0)
syscall_trace_leave exiting (4717)
syscall_trace_enter beginning (4717, syscall: 1060)
 ..sig_ignored=1..
4715, PTRACE_PEEKUSR 4717 @addr 00000830 return 00000010
4715, PTRACE_PEEKUSR 4717 @addr 000008b8 return 00000424
syscall_trace current(4717)->exit_code(0)
syscall_trace_enter exiting (4717)

Delete the line that blocks SIGTRAP in the test case, rerun it, I got:

syscall_trace current(4724)->exit_code(0)
syscall_trace_enter exiting (4724)
report_exec:pid= 4724 parent=4722, real_parent=4723,current->ptrace =00000001,
report_exec:pid= 4724 parent=4722, real_parent=4723,current->ptrace =00000001,
syscall_trace_leave entering (4724, syscall:1033)
 ..sig_ignored=1..
4722, PTRACE_PEEKUSR 4724 @addr 00000830 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008c0 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008d0 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008d0 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008c0 return 00000000
syscall_trace current(4724)->exit_code(0)
syscall_trace_leave exiting (4724)
 ..sig_ignored=1..
4722, PTRACE_PEEKUSR 4724 @addr 00000830 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008b8 return 00000409
syscall_trace_enter beginning (4724, syscall: 1060)
 ..sig_ignored=1..


The attahced patch is a proposal for this problem. But I still don't know
why it is just specific to IA64.

***Note: The following patch is copy&paste, If you want apply, please
check out the attached version


Signed-off-by: Yu Luming <luming.yu@intel.com>
--------------------------------------
 include/linux/tracehook.h |    4 +++-
 kernel/signal.c           |   11 ++++++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff -Bru 0/include/linux/tracehook.h 1/include/linux/tracehook.h
--- 0/include/linux/tracehook.h 2008-08-18 20:43:21.000000000 -0400
+++ 1/include/linux/tracehook.h 2008-09-09 09:51:14.000000000 -0400
@@ -195,8 +195,10 @@
                                         struct pt_regs *regs)
 {
        if (!ptrace_event(PT_TRACE_EXEC, PTRACE_EVENT_EXEC, 0) &&
-           unlikely(task_ptrace(current) & PT_PTRACED))
+           unlikely(task_ptrace(current) & PT_PTRACED)){
                send_sig(SIGTRAP, current, 0);
+               set_tsk_thread_flag(current,TIF_SIGPENDING);
+       }
 }

 /**
diff -Bru 0/kernel/signal.c 1/kernel/signal.c
--- 0/kernel/signal.c   2008-08-18 20:43:21.000000000 -0400
+++ 1/kernel/signal.c   2008-09-09 09:50:38.000000000 -0400
@@ -384,8 +384,14 @@
 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
                        siginfo_t *info)
 {
-       int sig = next_signal(pending, mask);
+       int sig = 0;
+       int allow_trap = 0;

+       if((current->ptrace & PT_PTRACED) && sigismember(mask,SIGTRAP)){
+               sigdelset(mask,SIGTRAP);
+               allow_trap = 1;
+       }
+       sig = next_signal(pending, mask);
        if (sig) {
                if (current->notifier) {
                        if (sigismember(current->notifier_mask, sig)) {
@@ -399,6 +405,9 @@
                collect_signal(sig, pending, info);
        }

+       if(allow_trap)
+               sigaddset(mask,SIGTRAP);
+
        return sig;
}

On Fri, Jun 6, 2008 at 8:07 AM, Roland McGrath <roland@redhat.com> wrote:
>> I _guess_ this is caused by the fact that test2.sh is a shell script, so
>> the kernel executes the shell, and maybe utrace produces a second execve
>> notifications in this case? Roland, can you shed some light?
>
> Not really.  The utrace kernels Luming is trying are intended to match the
> vanilla ptrace behavior.  I don't think it's very useful to worry about the
> difference between some utrace kernel and the current vanilla kernel.
> Let's just look at what the current vanilla kernel is doing and compare
> that to what an older vanilla kernel did if older versions produced
> different results for the test case.
>
>
> Thanks,
> Roland
>

[-- Attachment #2: 44620-upstream.patch --]
[-- Type: application/octet-stream, Size: 1251 bytes --]

diff -Bru 0/include/linux/tracehook.h 1/include/linux/tracehook.h
--- 0/include/linux/tracehook.h	2008-08-18 20:43:21.000000000 -0400
+++ 1/include/linux/tracehook.h	2008-09-09 09:51:14.000000000 -0400
@@ -195,8 +195,10 @@
 					 struct pt_regs *regs)
 {
 	if (!ptrace_event(PT_TRACE_EXEC, PTRACE_EVENT_EXEC, 0) &&
-	    unlikely(task_ptrace(current) & PT_PTRACED))
+	    unlikely(task_ptrace(current) & PT_PTRACED)){
 		send_sig(SIGTRAP, current, 0);
+		set_tsk_thread_flag(current,TIF_SIGPENDING);
+	}
 }
 
 /**
diff -Bru 0/kernel/signal.c 1/kernel/signal.c
--- 0/kernel/signal.c	2008-08-18 20:43:21.000000000 -0400
+++ 1/kernel/signal.c	2008-09-09 09:50:38.000000000 -0400
@@ -384,8 +384,14 @@
 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
 			siginfo_t *info)
 {
-	int sig = next_signal(pending, mask);
+	int sig = 0;
+	int allow_trap = 0;
 
+	if((current->ptrace & PT_PTRACED) && sigismember(mask,SIGTRAP)){
+		sigdelset(mask,SIGTRAP);
+		allow_trap = 1;
+	}
+	sig = next_signal(pending, mask);
 	if (sig) {
 		if (current->notifier) {
 			if (sigismember(current->notifier_mask, sig)) {
@@ -399,6 +405,9 @@
 		collect_signal(sig, pending, info);
 	}
 
+	if(allow_trap)
+		sigaddset(mask,SIGTRAP);
+
 	return sig;
 }
 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-09-09  3:06                                               ` Luming Yu
@ 2008-09-10  5:55                                                 ` Roland McGrath
  2008-09-16  8:50                                                   ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-09-10  5:55 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

This subject line is quite far from what you're now addressing, btw.

I certainly wouldn't recommend that patch.  The interference between ptrace
uses of SIGTRAP and a program that blocks SIGTRAP is just a fact of life
with ptrace.  If you wanted to change the exec SIGTRAP to be more
consistent with other signals induced by debuggers (such as breakpoints and
single-step), you could just change send_sig to force_sig.  That unblocks
SIGTRAP and resets its handler when it's blocked.  Conversely, a tracer
program could use PTRACE_O_TRACEEXEC if it cared to deal with tracing a
process that blocks SIGTRAP.  (That uses ptrace_notify rather than any
actual signal that can be blocked.)

But, none of this goes to your actual problem at all AFAICT.  You've said
your actual problem with strace only arises on ia64.  The behavior of
ptrace'd exec, and the effect of a program blocking SIGTRAP, is exactly the
same across all machines.  Changing it does not explain your situation.
I'm in favor of understanding what is actually going on before changing
things.

If the SIGTRAP from exec is actually seen by strace on a different machine
like it's not on ia64, that suggests that something different happened
between the two.  In the case that doesn't exhibit the problem, either
SIGTRAP got unblocked by something before the exec, or userland (strace)
behaves differently so that it doesn't care about not seeing that SIGTRAP.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-09-10  5:55                                                 ` Roland McGrath
@ 2008-09-16  8:50                                                   ` Luming Yu
  2008-09-17 17:01                                                     ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Luming Yu @ 2008-09-16  8:50 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

On Wed, Sep 10, 2008 at 1:55 PM, Roland McGrath <roland@redhat.com> wrote:
> This subject line is quite far from what you're now addressing, btw.
>
> I certainly wouldn't recommend that patch.  The interference between ptrace
> uses of SIGTRAP and a program that blocks SIGTRAP is just a fact of life
> with ptrace.  If you wanted to change the exec SIGTRAP to be more
> consistent with other signals induced by debuggers (such as breakpoints and
> single-step), you could just change send_sig to force_sig.  That unblocks
> SIGTRAP and resets its handler when it's blocked.  Conversely, a tracer
> program could use PTRACE_O_TRACEEXEC if it cared to deal with tracing a
> process that blocks SIGTRAP.  (That uses ptrace_notify rather than any
> actual signal that can be blocked.)
>
> But, none of this goes to your actual problem at all AFAICT.  You've said
> your actual problem with strace only arises on ia64.  The behavior of
> ptrace'd exec, and the effect of a program blocking SIGTRAP, is exactly the
> same across all machines.  Changing it does not explain your situation.
> I'm in favor of understanding what is actually going on before changing
> things.
>
> If the SIGTRAP from exec is actually seen by strace on a different machine
> like it's not on ia64, that suggests that something different happened
> between the two.  In the case that doesn't exhibit the problem, either
> SIGTRAP got unblocked by something before the exec, or userland (strace)
> behaves differently so that it doesn't care about not seeing that SIGTRAP.

The reason is not as complicated as I thought. It is because x86
strace don't test TCB_WAITEXECVE. please take a look at defs.h
(strace-4.5.16), the flag is defined as follows:

#ifdef LINUX
# if defined(ALPHA) || defined(SPARC) || defined(SPARC64) ||
defined(POWERPC) || defined(IA64) || defined(HPPA) || defined(SH) ||
defined(SH64) || defined(S390) || defined(S390X) || defined(ARM)
#  define TCB_WAITEXECVE 02000  /* ignore SIGTRAP after exceve */
# endif

I'm not an strace expert, so I have no idea what the TCB_WAITEXECVE means.
And why x86 strace can handle SIGTRAP after execve but ALL other Arch can not..
Could anybody shield some lights on the flag?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-09-16  8:50                                                   ` Luming Yu
@ 2008-09-17 17:01                                                     ` Roland McGrath
  2008-09-18  5:44                                                       ` Luming Yu
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2008-09-17 17:01 UTC (permalink / raw)
  To: Luming Yu; +Cc: Petr Tesarik, LKML, linux-ia64

So is there or is there not any regression (change) in the kernel's behavior?

If you are now just talking about strace's own code, then the place for
that is strace-devel@lists.sourceforge.net, not any kernel lists.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
  2008-09-17 17:01                                                     ` Roland McGrath
@ 2008-09-18  5:44                                                       ` Luming Yu
  0 siblings, 0 replies; 41+ messages in thread
From: Luming Yu @ 2008-09-18  5:44 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Petr Tesarik, LKML, linux-ia64

The another mysterious that I completely don't understand  probably
needs your help.
Why kernels with utrace patch applied on ia64 just work?
Sounds like utrace delivers an extra SIGTRAP comparing with kernel
without utrace code..
Dose it make sense? Hmm.. Sounds like a bug to me..

On Thu, Sep 18, 2008 at 1:01 AM, Roland McGrath <roland@redhat.com> wrote:
> So is there or is there not any regression (change) in the kernel's behavior?
>
> If you are now just talking about strace's own code, then the place for
> that is strace-devel@lists.sourceforge.net, not any kernel lists.
>
>
> Thanks,
> Roland
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2008-09-18  5:44 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-22  2:47 [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race Luming Yu
2008-05-22  8:47 ` Petr Tesarik
2008-05-22  9:16   ` Luming Yu
2008-05-22 11:18     ` Roland McGrath
2008-05-22 12:12       ` Petr Tesarik
2008-05-22 20:39         ` Roland McGrath
2008-05-23 12:33         ` Luming Yu
2008-05-22 13:24       ` Luming Yu
2008-05-22 20:34         ` Roland McGrath
2008-05-23  3:42           ` Luming Yu
2008-05-23  4:19             ` Roland McGrath
2008-05-23  5:24               ` Luming Yu
2008-05-26  0:15                 ` Roland McGrath
2008-05-26  1:30                   ` Luming Yu
2008-05-27  3:31                     ` Luming Yu
2008-05-27  4:04                       ` Roland McGrath
2008-05-27  5:49                         ` Luming Yu
2008-05-27  6:12                           ` Roland McGrath
2008-05-27  6:25                             ` Petr Tesarik
2008-06-03  6:04                               ` Luming Yu
2008-06-03  9:01                                 ` Petr Tesarik
2008-06-03 14:32                                   ` Petr Tesarik
2008-06-03 21:01                                     ` Roland McGrath
2008-06-03 21:31                                       ` Luck, Tony
2008-06-03 22:13                                         ` Roland McGrath
2008-06-10  8:23                                         ` Luming Yu
2008-06-04  2:16                                     ` Luming Yu
2008-06-04  9:16                                       ` Petr Tesarik
2008-06-05  1:49                                         ` Luming Yu
2008-06-05 11:16                                           ` Petr Tesarik
2008-06-06  0:07                                             ` Roland McGrath
2008-09-09  3:06                                               ` Luming Yu
2008-09-10  5:55                                                 ` Roland McGrath
2008-09-16  8:50                                                   ` Luming Yu
2008-09-17 17:01                                                     ` Roland McGrath
2008-09-18  5:44                                                       ` Luming Yu
2008-05-27  6:34                             ` Luming Yu
2008-05-27  8:48                             ` Luming Yu
2008-05-28  9:14                               ` Luming Yu
2008-06-03  6:02                                 ` Luming Yu
2008-05-30  8:05                               ` Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox