* ptrace(), fork(), sleep(), exit(), SIGCHLD
@ 2001-08-13 8:29 Bruce Janson
2001-08-14 7:28 ` christophe barbé
0 siblings, 1 reply; 20+ messages in thread
From: Bruce Janson @ 2001-08-13 8:29 UTC (permalink / raw)
To: linux-kernel
Hi,
The following program behaves incorrectly when traced:
$ uname -a
Linux dependo 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i686 unknown
$ cc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
$ strace -V
strace -- version 4.2
$ cat t.c
main()
{
switch (fork())
{
case -1:
write(2, "fork\n", 5);
break;
case 0:
usleep(1000000);
break;
default:
if (usleep(5000000) == -1)
write(2, "wrong\n", 6);
break;
}
exit(0);
}
$ cc t.c
$ time ./a.out
real 0m5.011s
user 0m0.000s
sys 0m0.000s
$ time strace -o /dev/null ./a.out
wrong
real 0m1.025s
user 0m0.010s
sys 0m0.010s
$
The problem appears to be that, when traced, the child process' exit()
interrupts the parent's usleep() with a SIGCHLD, the latter returning EINTR.
It also fails in the same way under Linux 2.2.16 and 2.2.19.
What am I missing?
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD 2001-08-13 8:29 ptrace(), fork(), sleep(), exit(), SIGCHLD Bruce Janson @ 2001-08-14 7:28 ` christophe barbé 2001-08-14 15:06 ` Bruce Janson 0 siblings, 1 reply; 20+ messages in thread From: christophe barbé @ 2001-08-14 7:28 UTC (permalink / raw) To: linux-kernel Have you receive off-line answers? I guess that it's certainly more a strace issue and that it's perhaps 'trivial' for most people here. But I would be interesting in knowing the why behind this. Christophe Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > Hi, > The following program behaves incorrectly when traced: > > $ uname -a > Linux dependo 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i686 unknown > $ cc -v > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs > gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81) > $ strace -V > strace -- version 4.2 > $ cat t.c > main() > { > switch (fork()) > { > case -1: > write(2, "fork\n", 5); > break; > > case 0: > usleep(1000000); > break; > > default: > if (usleep(5000000) == -1) > write(2, "wrong\n", 6); > break; > } > > exit(0); > } > $ cc t.c > $ time ./a.out > > real 0m5.011s > user 0m0.000s > sys 0m0.000s > $ time strace -o /dev/null ./a.out > wrong > > real 0m1.025s > user 0m0.010s > sys 0m0.010s > $ > > The problem appears to be that, when traced, the child process' exit() > interrupts the parent's usleep() with a SIGCHLD, the latter returning > EINTR. > It also fails in the same way under Linux 2.2.16 and 2.2.19. > > What am I missing? > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" > in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Christophe Barbé Software Engineer - christophe.barbe@lineo.fr Lineo France - Lineo High Availability Group 42-46, rue Médéric - 92110 Clichy - France phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01 http://www.lineo.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD 2001-08-14 7:28 ` christophe barbé @ 2001-08-14 15:06 ` Bruce Janson 2001-08-15 15:46 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: Bruce Janson @ 2001-08-14 15:06 UTC (permalink / raw) To: linux-kernel [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 815 bytes --] In article <20010814092849.E13892@pc8.lineo.fr>, christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> wrote: .. >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : .. >> The following program behaves incorrectly when traced: .. >Have you receive off-line answers? .. No, though I did receive an offline reply from someone who appeared to have misunderstood the post. In case it wasn't clear, the problem is that the above program behaves differently when traced to how it behaves when not traced. (I do realise that in general, under newer Unices, when not ignored, a SIGCHLD signal may accompany the death of a child.) >I guess that it's certainly more a strace issue and that it's perhaps .. It's not clear to me whether it is a kernel, glibc or strace bug, but it does appear to be a bug. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD 2001-08-14 15:06 ` Bruce Janson @ 2001-08-15 15:46 ` george anzinger 2001-08-15 17:53 ` george anzinger 2001-08-15 18:02 ` george anzinger 0 siblings, 2 replies; 20+ messages in thread From: george anzinger @ 2001-08-15 15:46 UTC (permalink / raw) To: Bruce Janson; +Cc: linux-kernel Bruce Janson wrote: > > In article <20010814092849.E13892@pc8.lineo.fr>, > christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> wrote: > .. > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > .. > >> The following program behaves incorrectly when traced: > .. > >Have you receive off-line answers? > .. > > No, though I did receive an offline reply from someone who appeared > to have misunderstood the post. In case it wasn't clear, the problem > is that the above program behaves differently when traced to how it > behaves when not traced. (I do realise that in general, under newer > Unices, when not ignored, a SIGCHLD signal may accompany the death of > a child.) > > >I guess that it's certainly more a strace issue and that it's perhaps > .. > > It's not clear to me whether it is a kernel, glibc or strace bug, but > it does appear to be a bug. I don't have the code for usleep() handy and the man page is not much help, but here goes: I think strace is using ptrace() which causes signals to be redirected to wake up the parent (strace in this case). In particular, blocked signals are no longer blocked. What this means is that a.) SIG CHILD is posted, b.) the signal, not being blocked, the child is wakened, c.) ptrace returns to the parent, d) the parent does what ever and tells the kernel (ptrace) to continue the child with the original mask, e.) the signal code returns 0 with out delivering the signal to the child. Looks good, right? Wrong! The wake up at (b) pulls the child out of the timer queue so when signal returns, the sleep (I assume nano sleep is actually used here) call returns with the remaining sleep time as a value. This is an issue for debugging also (same ptrace...). The fix is to fix nano_sleep to match the standard which says it should only return on a signal if the signal is delivered to the program (i.e. not on internal "do nothing" signals). Signal in the kernel returns 1 if it calls the task and 0 otherwise, thus nano sleep might be changed as follows: expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); current->state = TASK_INTERRUPTIBLE; - expire = schedule_timeout(expire); + while (expire = schedule_timeout(expire) && !signal()); if (expire) { if (rmtp) { jiffies_to_timespec(expire, &t); if (copy_to_user(rmtp, &t, sizeof(struct timespec))) return -EFAULT; } return -EINTR; } return 0; This code is in ../kernel/timer.c Note that this assumes that nano_sleep() underlies usleep(). If setitimer (via sleep() or otherwise) is used, the problem and fix is in the library. In that case, the code needs to notice that it was awakened but the alarm handler was not called. Still, with out the full spec on usleep() it is not clear what it should do. In any case, this is a bug in nano_sleep(), where the spec is clear on this point. George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD 2001-08-15 15:46 ` george anzinger @ 2001-08-15 17:53 ` george anzinger 2001-08-15 18:02 ` george anzinger 1 sibling, 0 replies; 20+ messages in thread From: george anzinger @ 2001-08-15 17:53 UTC (permalink / raw) To: Bruce Janson, linux-kernel george anzinger wrote: > > Bruce Janson wrote: > > > > In article <20010814092849.E13892@pc8.lineo.fr>, > > christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> wrote: > > .. > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > > .. > > >> The following program behaves incorrectly when traced: > > .. > > >Have you receive off-line answers? > > .. > > > > No, though I did receive an offline reply from someone who appeared > > to have misunderstood the post. In case it wasn't clear, the problem > > is that the above program behaves differently when traced to how it > > behaves when not traced. (I do realise that in general, under newer > > Unices, when not ignored, a SIGCHLD signal may accompany the death of > > a child.) > > > > >I guess that it's certainly more a strace issue and that it's perhaps > > .. > > > > It's not clear to me whether it is a kernel, glibc or strace bug, but > > it does appear to be a bug. > > I don't have the code for usleep() handy and the man page is not much > help, but here goes: > > I think strace is using ptrace() which causes signals to be redirected > to wake up the parent (strace in this case). In particular, blocked > signals are no longer blocked. What this means is that a.) SIG CHILD is > posted, b.) the signal, not being blocked, the child is wakened, c.) > ptrace returns to the parent, d) the parent does what ever and tells the > kernel (ptrace) to continue the child with the original mask, e.) the > signal code returns 0 with out delivering the signal to the child. > Looks good, right? Wrong! The wake up at (b) pulls the child out of > the timer queue so when signal returns, the sleep (I assume nano sleep > is actually used here) call returns with the remaining sleep time as a > value. > > This is an issue for debugging also (same ptrace...). The fix is to fix > nano_sleep to match the standard which says it should only return on a > signal if the signal is delivered to the program (i.e. not on internal > "do nothing" signals). Signal in the kernel returns 1 if it calls the > task and 0 otherwise, thus nano sleep might be changed as follows: > > expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > > current->state = TASK_INTERRUPTIBLE; > - expire = schedule_timeout(expire); > + while (expire = schedule_timeout(expire) && !signal()); Oops! should be: + while (expire = schedule_timeout(expire) && !do_signal()); > > if (expire) { > if (rmtp) { > jiffies_to_timespec(expire, &t); > if (copy_to_user(rmtp, &t, sizeof(struct timespec))) > return -EFAULT; > } > return -EINTR; > } > return 0; > > This code is in ../kernel/timer.c > > Note that this assumes that nano_sleep() underlies usleep(). If > setitimer (via sleep() or otherwise) is used, the problem and fix is in > the library. In that case, the code needs to notice that it was > awakened but the alarm handler was not called. Still, with out the full > spec on usleep() it is not clear what it should do. > > In any case, this is a bug in nano_sleep(), where the spec is clear on > this point. > > George > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: ptrace(), fork(), sleep(), exit(), SIGCHLD 2001-08-15 15:46 ` george anzinger 2001-08-15 17:53 ` george anzinger @ 2001-08-15 18:02 ` george anzinger 2001-08-16 0:59 ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger 1 sibling, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-15 18:02 UTC (permalink / raw) To: Bruce Janson, linux-kernel george anzinger wrote: > > Bruce Janson wrote: > > > > In article <20010814092849.E13892@pc8.lineo.fr>, > > christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> wrote: > > .. > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > > .. > > >> The following program behaves incorrectly when traced: > > .. > > >Have you receive off-line answers? > > .. > > > > No, though I did receive an offline reply from someone who appeared > > to have misunderstood the post. In case it wasn't clear, the problem > > is that the above program behaves differently when traced to how it > > behaves when not traced. (I do realise that in general, under newer > > Unices, when not ignored, a SIGCHLD signal may accompany the death of > > a child.) > > > > >I guess that it's certainly more a strace issue and that it's perhaps > > .. > > > > It's not clear to me whether it is a kernel, glibc or strace bug, but > > it does appear to be a bug. > > I don't have the code for usleep() handy and the man page is not much > help, but here goes: > > I think strace is using ptrace() which causes signals to be redirected > to wake up the parent (strace in this case). In particular, blocked > signals are no longer blocked. What this means is that a.) SIG CHILD is > posted, b.) the signal, not being blocked, the child is wakened, c.) > ptrace returns to the parent, d) the parent does what ever and tells the > kernel (ptrace) to continue the child with the original mask, e.) the > signal code returns 0 with out delivering the signal to the child. > Looks good, right? Wrong! The wake up at (b) pulls the child out of > the timer queue so when signal returns, the sleep (I assume nano sleep > is actually used here) call returns with the remaining sleep time as a > value. > > This is an issue for debugging also (same ptrace...). The fix is to fix > nano_sleep to match the standard which says it should only return on a > signal if the signal is delivered to the program (i.e. not on internal > "do nothing" signals). Signal in the kernel returns 1 if it calls the > task and 0 otherwise, thus nano sleep might be changed as follows: > > expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > > current->state = TASK_INTERRUPTIBLE; > - expire = schedule_timeout(expire); > + while (expire = schedule_timeout(expire) && !signal()); Still not quite right. regs needs to be dummied up (see sys_sigpause) and then: + while (expire = schedule_timeout(expire) && !do_signal(regs, NULL)); > > if (expire) { > if (rmtp) { > jiffies_to_timespec(expire, &t); > if (copy_to_user(rmtp, &t, sizeof(struct timespec))) > return -EFAULT; > } > return -EINTR; > } > return 0; > > This code is in ../kernel/timer.c > > Note that this assumes that nano_sleep() underlies usleep(). If > setitimer (via sleep() or otherwise) is used, the problem and fix is in > the library. In that case, the code needs to notice that it was > awakened but the alarm handler was not called. Still, with out the full > spec on usleep() it is not clear what it should do. > > In any case, this is a bug in nano_sleep(), where the spec is clear on > this point. > > George > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-15 18:02 ` george anzinger @ 2001-08-16 0:59 ` george anzinger 2001-08-16 10:17 ` christophe barbé 0 siblings, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-16 0:59 UTC (permalink / raw) To: Bruce Janson, linux-kernel The problem is that nano_sleep needs to continue sleeping if it wakes up on a signal which is not delivered to the task. This happens when "strace" or "ptrace" cause otherwise blocked signals to be delivered. Do_signal() returns 0 if it does not deliver a signal, a 1 if it does so I propose the following changes to nano_sleep: asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec *rmtp) { struct timespec t; unsigned long expire; + struct pt_regs * regs = (struct pt_regs *) &rqtp; if(copy_from_user(&t, rqtp, sizeof(struct timespec))) return -EFAULT; if (t.tv_nsec >= 1000000000L || t.tv_nsec < 0 || t.tv_sec < 0) return -EINVAL; if (t.tv_sec == 0 && t.tv_nsec <= 2000000L && current->policy != SCHED_OTHER) { /* * Short delay requests up to 2 ms will be handled with * high precision by a busy wait for all real-time processes. * * Its important on SMP not to do this holding locks. */ udelay((t.tv_nsec + 999) / 1000); return 0; } expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); - current->state = TASK_INTERRUPTIBLE; - expire = schedule_timeout(expire); + regs->eax = -EINTR; + do { + current->state = TASK_INTERRUPTIBLE; + } while((expire = schedule_timeout(expire)) && !do_signal(regs,NULL)); if (expire) { if (rmtp) { jiffies_to_timespec(expire, &t); if (copy_to_user(rmtp, &t, sizeof(struct timespec))) return -EFAULT; } return -EINTR; } return 0; } BUT, it turns out that do_signal() is in the "arch" code, and further that different arch's have different calling sequences (and, of course, pt_regs is different also). This is the ONLY place in the kernel where platform independent code needs to call do_signal() :( There does not seem to be a clean answer for this issue. I suppose we could put something like this in timer.c: #ifndef _do_signal #define _do_signal(a,b) 1 #endif and then leave it to the platform code to define a uniform _do_signal(a,b) interface. This way each platform will work as it does now and better once they define the macro. If this is the path to take, what should the parameters "a" & "b" be? We need to cover all platforms needs. Another way to solve this issue is to move nano_sleep into the "arch" signal.c file, but then each would have to change and things would be broken (i.e. nano_sleep would not work) until the platform made the move. I suppose the current nano_sleep could stay in the kernel and each platform could implement one in their area with a different name. When all were done, the current code could be deleted. How should this be approached? George george anzinger wrote: > > george anzinger wrote: > > > > Bruce Janson wrote: > > > > > > In article <20010814092849.E13892@pc8.lineo.fr>, > > > christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> wrote: > > > .. > > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > > > .. > > > >> The following program behaves incorrectly when traced: > > > .. > > > >Have you receive off-line answers? > > > .. > > > > > > No, though I did receive an offline reply from someone who appeared > > > to have misunderstood the post. In case it wasn't clear, the problem > > > is that the above program behaves differently when traced to how it > > > behaves when not traced. (I do realise that in general, under newer > > > Unices, when not ignored, a SIGCHLD signal may accompany the death of > > > a child.) > > > > > > >I guess that it's certainly more a strace issue and that it's perhaps > > > .. > > > > > > It's not clear to me whether it is a kernel, glibc or strace bug, but > > > it does appear to be a bug. > > > > I don't have the code for usleep() handy and the man page is not much > > help, but here goes: > > > > I think strace is using ptrace() which causes signals to be redirected > > to wake up the parent (strace in this case). In particular, blocked > > signals are no longer blocked. What this means is that a.) SIG CHILD is > > posted, b.) the signal, not being blocked, the child is wakened, c.) > > ptrace returns to the parent, d) the parent does what ever and tells the > > kernel (ptrace) to continue the child with the original mask, e.) the > > signal code returns 0 with out delivering the signal to the child. > > Looks good, right? Wrong! The wake up at (b) pulls the child out of > > the timer queue so when signal returns, the sleep (I assume nano sleep > > is actually used here) call returns with the remaining sleep time as a > > value. > > > > This is an issue for debugging also (same ptrace...). The fix is to fix > > nano_sleep to match the standard which says it should only return on a > > signal if the signal is delivered to the program (i.e. not on internal > > "do nothing" signals). Signal in the kernel returns 1 if it calls the > > task and 0 otherwise, thus nano sleep might be changed as follows: > > > > expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > > > > current->state = TASK_INTERRUPTIBLE; > > - expire = schedule_timeout(expire); > > + while (expire = schedule_timeout(expire) && !signal()); > Still not quite right. regs needs to be dummied up (see sys_sigpause) > and then: > + while (expire = schedule_timeout(expire) && !do_signal(regs, > NULL)); > > > > if (expire) { > > if (rmtp) { > > jiffies_to_timespec(expire, &t); > > if (copy_to_user(rmtp, &t, sizeof(struct timespec))) > > return -EFAULT; > > } > > return -EINTR; > > } > > return 0; > > > > This code is in ../kernel/timer.c > > > > Note that this assumes that nano_sleep() underlies usleep(). If > > setitimer (via sleep() or otherwise) is used, the problem and fix is in > > the library. In that case, the code needs to notice that it was > > awakened but the alarm handler was not called. Still, with out the full > > spec on usleep() it is not clear what it should do. > > > > In any case, this is a bug in nano_sleep(), where the spec is clear on > > this point. > > > > George > > - > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 0:59 ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger @ 2001-08-16 10:17 ` christophe barbé 2001-08-16 10:29 ` Russell King 0 siblings, 1 reply; 20+ messages in thread From: christophe barbé @ 2001-08-16 10:17 UTC (permalink / raw) To: linux-kernel Thank george for your infos. First I was thinking that it was a bug in a user-mode tool. Now you convinced me that it's a nanosleep kernel bug. I've not looked in the user_mode usleed code but I've tested the bruce code with nanosleep and got the same result. What still astonish me is that this looks like a not so unprobable bug when running strace and should have been detected before. IT should be better to let sys_nanosleep in non-arch directory and use a macro as you have proposed but this macro should include more than do_signal : #ifndef _sys_nanosleep #define _sys_nanosleep \ current->state = TASK_INTERRUPTIBLE; \ expire = schedule_timeout(expire); #endif and for x86 : #define _sys_nanosleep \ { \ struct pt_regs * regs = (struct pt_regs *) &rqtp; \ regs->eax = -EINTR; \ do { \ current->state = TASK_INTERRUPTIBLE; \ } while((expire = schedule_timeout(expire)) && !do_signal(regs,NULL)); \ } Is it bad to use directly the variable in a macro like this one. We could still use something like _sys_nanosleep(current,rqtp) but it could be not generic enough. Christophe Le jeu, 16 aoû 2001 02:59:51, george anzinger a écrit : > The problem is that nano_sleep needs to continue sleeping if it wakes up > on a signal which is not delivered to the task. This happens when > "strace" or "ptrace" cause otherwise blocked signals to be delivered. > Do_signal() returns 0 if it does not deliver a signal, a 1 if it does so > I propose the following changes to nano_sleep: > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec > *rmtp) > { > struct timespec t; > unsigned long expire; > + struct pt_regs * regs = (struct pt_regs *) &rqtp; > > if(copy_from_user(&t, rqtp, sizeof(struct timespec))) > return -EFAULT; > > if (t.tv_nsec >= 1000000000L || t.tv_nsec < 0 || t.tv_sec < 0) > return -EINVAL; > > > if (t.tv_sec == 0 && t.tv_nsec <= 2000000L && > current->policy != SCHED_OTHER) > { > /* > * Short delay requests up to 2 ms will be handled with > * high precision by a busy wait for all real-time > processes. > * > * Its important on SMP not to do this holding locks. > */ > I udelay((t.tv_nsec + 999) / 1000); > return 0; > } > > expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > > - current->state = TASK_INTERRUPTIBLE; > - expire = schedule_timeout(expire); > + regs->eax = -EINTR; > + do { > + current->state = TASK_INTERRUPTIBLE; > + } while((expire = schedule_timeout(expire)) && > !do_signal(regs,NULL)); > > if (expire) { > if (rmtp) { > jiffies_to_timespec(expire, &t); > if (copy_to_user(rmtp, &t, sizeof(struct > timespec))) > return -EFAULT; > } > return -EINTR; > } > return 0; > } > > BUT, it turns out that do_signal() is in the "arch" code, and further > that different arch's have different calling sequences (and, of course, > pt_regs is different also). This is the ONLY place in the kernel where > platform independent code needs to call do_signal() :( There does not > seem to be a clean answer for this issue. I suppose we could put > something like this in timer.c: > > #ifndef _do_signal > #define _do_signal(a,b) 1 > #endif > > and then leave it to the platform code to define a uniform > _do_signal(a,b) interface. This way each platform will work as it does > now and better once they define the macro. If this is the path to take, > what should the parameters "a" & "b" be? We need to cover all platforms > needs. > > Another way to solve this issue is to move nano_sleep into the "arch" > signal.c file, but then each would have to change and things would be > broken (i.e. nano_sleep would not work) until the platform made the > move. I suppose the current nano_sleep could stay in the kernel and > each platform could implement one in their area with a different name. > When all were done, the current code could be deleted. > > How should this be approached? > > George > > > george anzinger wrote: > > > > george anzinger wrote: > > > > > > Bruce Janson wrote: > > > > > > > > In article <20010814092849.E13892@pc8.lineo.fr>, > > > > christophe =?iso-8859-1?Q?barb=E9?= <christophe.barbe@lineo.fr> > wrote: > > > > .. > > > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit : > > > > .. > > > > >> The following program behaves incorrectly when traced: > > > > .. > > > > >Have you receive off-line answers? > > > > .. > > > > > > > > No, though I did receive an offline reply from someone who appeared > > > > to have misunderstood the post. In case it wasn't clear, the > problem > > > > is that the above program behaves differently when traced to how it > > > > behaves when not traced. (I do realise that in general, under > newer > > > > Unices, when not ignored, a SIGCHLD signal may accompany the death > of > > > > a child.) > > > > > > > > >I guess that it's certainly more a strace issue and that it's > perhaps > > > > .. > > > > > > > > It's not clear to me whether it is a kernel, glibc or strace bug, > but > > > > it does appear to be a bug. > > > > > > I don't have the code for usleep() handy and the man page is not much > > > help, but here goes: > > > > > > I think strace is using ptrace() which causes signals to be > redirected > > > to wake up the parent (strace in this case). In particular, blocked > > > signals are no longer blocked. What this means is that a.) SIG CHILD > is > > > posted, b.) the signal, not being blocked, the child is wakened, c.) > > > ptrace returns to the parent, d) the parent does what ever and tells > the > > > kernel (ptrace) to continue the child with the original mask, e.) the > > > signal code returns 0 with out delivering the signal to the child. > > > Looks good, right? Wrong! The wake up at (b) pulls the child out of > > > the timer queue so when signal returns, the sleep (I assume nano > sleep > > > is actually used here) call returns with the remaining sleep time as > a > > > value. > > > > > > This is an issue for debugging also (same ptrace...). The fix is to > fix > > > nano_sleep to match the standard which says it should only return on > a > > > signal if the signal is delivered to the program (i.e. not on > internal > > > "do nothing" signals). Signal in the kernel returns 1 if it calls > the > > > task and 0 otherwise, thus nano sleep might be changed as follows: > > > > > > expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec); > > > > > > current->state = TASK_INTERRUPTIBLE; > > > - expire = schedule_timeout(expire); > > > + while (expire = schedule_timeout(expire) && !signal()); > > Still not quite right. regs needs to be dummied up (see sys_sigpause) > > and then: > > + while (expire = schedule_timeout(expire) && !do_signal(regs, > > NULL)); > > > > > > if (expire) { > > > if (rmtp) { > > > jiffies_to_timespec(expire, &t); > > > if (copy_to_user(rmtp, &t, sizeof(struct > timespec))) > > > return -EFAULT; > > > } > > > return -EINTR; > > > } > > > return 0; > > > > > > This code is in ../kernel/timer.c > > > > > > Note that this assumes that nano_sleep() underlies usleep(). If > > > setitimer (via sleep() or otherwise) is used, the problem and fix is > in > > > the library. In that case, the code needs to notice that it was > > > awakened but the alarm handler was not called. Still, with out the > full > > > spec on usleep() it is not clear what it should do. > > > > > > In any case, this is a bug in nano_sleep(), where the spec is clear > on > > > this point. > > > > > > George > > > - > > > To unsubscribe from this list: send the line "unsubscribe > linux-kernel" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > Please read the FAQ at http://www.tux.org/lkml/ > > - > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" > in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" > in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Christophe Barbé Software Engineer - christophe.barbe@lineo.fr Lineo France - Lineo High Availability Group 42-46, rue Médéric - 92110 Clichy - France phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01 http://www.lineo.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 10:17 ` christophe barbé @ 2001-08-16 10:29 ` Russell King 2001-08-16 14:16 ` george anzinger 2001-08-16 16:00 ` christophe barbé 0 siblings, 2 replies; 20+ messages in thread From: Russell King @ 2001-08-16 10:29 UTC (permalink / raw) To: christophe barbé; +Cc: linux-kernel On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote: > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec > > *rmtp) > > { > > struct timespec t; > > unsigned long expire; > > + struct pt_regs * regs = (struct pt_regs *) &rqtp; Note also that this is bogus as an architecture invariant. On ARM, we have to pass a pt_regs pointer into any function that requires it. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 10:29 ` Russell King @ 2001-08-16 14:16 ` george anzinger 2001-08-16 16:00 ` christophe barbé 1 sibling, 0 replies; 20+ messages in thread From: george anzinger @ 2001-08-16 14:16 UTC (permalink / raw) To: Russell King; +Cc: christophe barbé, linux-kernel Russell King wrote: > > On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote: > > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec > > > *rmtp) > > > { > > > struct timespec t; > > > unsigned long expire; > > > + struct pt_regs * regs = (struct pt_regs *) &rqtp; > > Note also that this is bogus as an architecture invariant. > > On ARM, we have to pass a pt_regs pointer into any function that requires > it. > Does that mean that any function that requires pt_regs _must_ not be in common code? Isn't there a better solution than implementing these functions over and over for each platform? George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 10:29 ` Russell King 2001-08-16 14:16 ` george anzinger @ 2001-08-16 16:00 ` christophe barbé 2001-08-16 16:12 ` Russell King 1 sibling, 1 reply; 20+ messages in thread From: christophe barbé @ 2001-08-16 16:00 UTC (permalink / raw) To: linux-kernel; +Cc: Russell King Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit : > On Thu, Aug 16, 2001 at 12:17:46PM +0200, christophe barbé wrote: > > > asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec > > > *rmtp) > > > { > > > struct timespec t; > > > unsigned long expire; > > > + struct pt_regs * regs = (struct pt_regs *) &rqtp; > > Note also that this is bogus as an architecture invariant. > > On ARM, we have to pass a pt_regs pointer into any function that requires > it. I'm not sure to understand your point. The first sentence tell me that the "struct pt_regs ..." line is x86 specific and this was the reason behind my proposition to not add a _signal macro but a _sys_nanosleep macro to include this too. The second sentence seem's to indicate that this is a classic problem for the ARM port. So if this is correct what is the best way to solve it ? Christophe > -- > Russell King (rmk@arm.linux.org.uk) The developer of ARM > Linux > http://www.arm.linux.org.uk/personal/aboutme.html > -- Christophe Barbé Software Engineer - christophe.barbe@lineo.fr Lineo France - Lineo High Availability Group 42-46, rue Médéric - 92110 Clichy - France phone (33).1.41.40.02.12 - fax (33).1.41.40.02.01 http://www.lineo.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 16:00 ` christophe barbé @ 2001-08-16 16:12 ` Russell King 2001-08-16 18:17 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: Russell King @ 2001-08-16 16:12 UTC (permalink / raw) To: christophe barbé; +Cc: linux-kernel On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote: > Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit : > > Note also that this is bogus as an architecture invariant. > > > > On ARM, we have to pass a pt_regs pointer into any function that requires > > it. > > I'm not sure to understand your point. Its quite simple: int sys_foo(struct pt_regs regs) { } does not reveal the user space registers on ARM. It instead reveals crap. Why? The ARM procedure call standard specifies that the first 4 words of "regs" in this case are in 4 processor registers. The other words are on the stack immediately above the frame created by foo. This is not how the stack is layed out on ARM on entry to a sys_* function due to the requirement for these to be restartable. Instead, we must pass a pointer thusly: int sys_foo(struct pt_regs *regs) { } and the pointer is specifically setup and passed in by a very small assembler wrapper. > The first sentence tell me that the "struct pt_regs ..." line is x86 > specific and this was the reason behind my proposition to not add a _signal > macro but a _sys_nanosleep macro to include this too. Correct. But the act of getting "struct pt_regs" on entry to the function is also architecture specific. > The second sentence seem's to indicate that this is a classic problem for > the ARM port. So if this is correct what is the best way to solve it ? It used to be with such functions as sys_execve. Then, sys_execve became an architecture specific wrapper around do_execve (not by my hand), so I guess that its not an ARM specific problem. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 16:12 ` Russell King @ 2001-08-16 18:17 ` george anzinger 2001-08-17 18:25 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-16 18:17 UTC (permalink / raw) To: Russell King; +Cc: christophe barbé, linux-kernel Russell King wrote: > > On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote: > > Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit : > > > Note also that this is bogus as an architecture invariant. > > > > > > On ARM, we have to pass a pt_regs pointer into any function that requires > > > it. > > > > I'm not sure to understand your point. > > Its quite simple: > > int sys_foo(struct pt_regs regs) > { > } > > does not reveal the user space registers on ARM. It instead reveals crap. > Why? The ARM procedure call standard specifies that the first 4 words > of "regs" in this case are in 4 processor registers. The other words > are on the stack immediately above the frame created by foo. This is > not how the stack is layed out on ARM on entry to a sys_* function > due to the requirement for these to be restartable. > > Instead, we must pass a pointer thusly: > > int sys_foo(struct pt_regs *regs) > { > } > > and the pointer is specifically setup and passed in by a very small > assembler wrapper. > > > The first sentence tell me that the "struct pt_regs ..." line is x86 > > specific and this was the reason behind my proposition to not add a _signal > > macro but a _sys_nanosleep macro to include this too. > > Correct. But the act of getting "struct pt_regs" on entry to the function > is also architecture specific. > > > The second sentence seem's to indicate that this is a classic problem for > > the ARM port. So if this is correct what is the best way to solve it ? > > It used to be with such functions as sys_execve. Then, sys_execve > became an architecture specific wrapper around do_execve (not by my > hand), so I guess that its not an ARM specific problem. > > -- So, it seems we need an arch. specific wrapper for nano_sleep. Now, how to do it so it is a smooth transition? George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-16 18:17 ` george anzinger @ 2001-08-17 18:25 ` george anzinger 2001-08-17 18:57 ` Victor Yodaiken 0 siblings, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-17 18:25 UTC (permalink / raw) To: Russell King, christophe barbé, linux-kernel george anzinger wrote: > > Russell King wrote: > > > > On Thu, Aug 16, 2001 at 06:00:10PM +0200, christophe barbé wrote: > > > Le jeu, 16 aoû 2001 12:29:05, Russell King a écrit : > > > > Note also that this is bogus as an architecture invariant. > > > > > > > > On ARM, we have to pass a pt_regs pointer into any function that requires > > > > it. > > > > > > I'm not sure to understand your point. > > > > Its quite simple: > > > > int sys_foo(struct pt_regs regs) > > { > > } > > > > does not reveal the user space registers on ARM. It instead reveals crap. > > Why? The ARM procedure call standard specifies that the first 4 words > > of "regs" in this case are in 4 processor registers. The other words > > are on the stack immediately above the frame created by foo. This is > > not how the stack is layed out on ARM on entry to a sys_* function > > due to the requirement for these to be restartable. > > > > Instead, we must pass a pointer thusly: > > > > int sys_foo(struct pt_regs *regs) > > { > > } > > > > and the pointer is specifically setup and passed in by a very small > > assembler wrapper. > > > > > The first sentence tell me that the "struct pt_regs ..." line is x86 > > > specific and this was the reason behind my proposition to not add a _signal > > > macro but a _sys_nanosleep macro to include this too. > > > > Correct. But the act of getting "struct pt_regs" on entry to the function > > is also architecture specific. > > > > > The second sentence seem's to indicate that this is a classic problem for > > > the ARM port. So if this is correct what is the best way to solve it ? > > > > It used to be with such functions as sys_execve. Then, sys_execve > > became an architecture specific wrapper around do_execve (not by my > > hand), so I guess that its not an ARM specific problem. > > > > -- > So, it seems we need an arch. specific wrapper for nano_sleep. Now, how > to do it so it is a smooth transition? > How about something like: In ../asm/signal.h (for i386) #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \ type name(p1_type p1,p2_typ p2)\ { struct pt_regs *regs = (struct pt_regs *)&p1; #define _do_signal() do_signal(regs, NULL) in ../asm_arm/signal.h #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \ type name(p1_type p1,p2_typ p2, struct pt_regs *regs) \ { #define _do_signal() do_signal(NULL,regs, NULL) and other archs as needed. in ../linux/signal.h (the default) : #include <asm/signal.h> : : #ifndef PT_REGS_ENTRY #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \ type name(p1_type p1,p2_typ p2)\ { #define _do_signal() 1 #endif and finally in nano_sleep : PT_REGS_ENTRY(asmlinkage long,sys_nanosleep,struct timespec *,rqtp,struct timespec *,rmtp) struct timespec t; unsigned long expire; : : etc... : do { current->state = TASK_INTERRUPTIBLE; } while((expire = schedule_timeout(expire)) && !_do_signal()); The notion is that this could be used for other system calls, in particular I want to implement clock_nanosleep() as part of the extended POSIX standard. Comments, flames,... George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-17 18:25 ` george anzinger @ 2001-08-17 18:57 ` Victor Yodaiken 2001-08-17 19:56 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: Victor Yodaiken @ 2001-08-17 18:57 UTC (permalink / raw) To: george anzinger; +Cc: Russell King, christophe barbé, linux-kernel On Fri, Aug 17, 2001 at 11:25:42AM -0700, george anzinger wrote: > > > How about something like: > > In ../asm/signal.h (for i386) > > #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \ > type name(p1_type p1,p2_typ p2)\ > { struct pt_regs *regs = (struct pt_regs *)&p1; In RTLinux we define MACHDEPREGS as an arch dependent type. PPC defines this as a pointer and x87 as the structure etc. The small number of functions that actually need to manipulate this can be made machine dependent too. Came in handy during the port to BSD too. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-17 18:57 ` Victor Yodaiken @ 2001-08-17 19:56 ` george anzinger 2001-08-22 18:40 ` Russell King 0 siblings, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-17 19:56 UTC (permalink / raw) To: Victor Yodaiken; +Cc: Russell King, christophe barbé, linux-kernel Victor Yodaiken wrote: > > On Fri, Aug 17, 2001 at 11:25:42AM -0700, george anzinger wrote: > > > > > How about something like: > > > > In ../asm/signal.h (for i386) > > > > #define PT_REGS_ENTRY(type,name,p1_type,p1, p2_type,p2) \ > > type name(p1_type p1,p2_typ p2)\ > > { struct pt_regs *regs = (struct pt_regs *)&p1; > > In RTLinux we define MACHDEPREGS as an arch dependent type. PPC defines > this as a pointer and x87 as the structure etc. The small number of functions > that actually need to manipulate this can be made machine dependent too. > Came in handy during the port to BSD too. Uh..? I though that was what I was allowing. It seems to me to be a lot of extra work to put the same code in 15 different archs. Especially if one does not really know each of them, nor can any one group (or individual) be expected to be able to test (or even have the hardware to test) each of them. This said, what I am trying to define is a way to write common code that will handle each arch as the arch coder defines the macro for his arch. In the given case, for example, the type of the arch dependent structure is _only_ used in the macros which are defined in the asm/*.h file. In this particular case, the common macros, (which the arch macros over ride) do not call the arch dependent function, but assume that it returns true. For nano_sleep, this means that it will return early on ptrace signals, a standard violation, but what the current code does, so each arch can fix their system by defining the macro. I also would like a way to avoid writing (and supporting) 15 different nano_sleeps, clock_nano_sleeps, etc. If I miss understand what you are saying, please help me to understand it. But do keep in mind, that I _really_ don't want to support 15 incarnations of nano_sleep and clock_nano_sleep. I would, however, like to find a simple and elegant way to write and support the one common nano_sleep and clock_nano_sleep, or what ever code. George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-17 19:56 ` george anzinger @ 2001-08-22 18:40 ` Russell King 2001-08-23 20:04 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: Russell King @ 2001-08-22 18:40 UTC (permalink / raw) To: george anzinger; +Cc: Victor Yodaiken, christophe barbé, linux-kernel On Fri, Aug 17, 2001 at 12:56:31PM -0700, george anzinger wrote: > Uh..? I though that was what I was allowing. It seems to me to be a > lot of extra work to put the same code in 15 different archs. > Especially if one does not really know each of them, nor can any one > group (or individual) be expected to be able to test (or even have the > hardware to test) each of them. Umm, my best advice is to look at sys_fork() and do_fork(), sys_execve() and do_execve(). -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-22 18:40 ` Russell King @ 2001-08-23 20:04 ` george anzinger 2001-08-23 20:11 ` Russell King 0 siblings, 1 reply; 20+ messages in thread From: george anzinger @ 2001-08-23 20:04 UTC (permalink / raw) To: Russell King; +Cc: Victor Yodaiken, christophe barbé, linux-kernel Russell King wrote: > > On Fri, Aug 17, 2001 at 12:56:31PM -0700, george anzinger wrote: > > Uh..? I though that was what I was allowing. It seems to me to be a > > lot of extra work to put the same code in 15 different archs. > > Especially if one does not really know each of them, nor can any one > > group (or individual) be expected to be able to test (or even have the > > hardware to test) each of them. > > Umm, my best advice is to look at sys_fork() and do_fork(), sys_execve() > and do_execve(). > Sorry, but none of those system calls requires the registers which is where the problem is. George ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-23 20:04 ` george anzinger @ 2001-08-23 20:11 ` Russell King 2001-08-23 21:13 ` george anzinger 0 siblings, 1 reply; 20+ messages in thread From: Russell King @ 2001-08-23 20:11 UTC (permalink / raw) To: george anzinger; +Cc: Victor Yodaiken, christophe barbé, linux-kernel On Thu, Aug 23, 2001 at 01:04:09PM -0700, george anzinger wrote: > Sorry, but none of those system calls requires the registers which is > where the problem is. /* Fork a new task - this creates a new program thread. * This is called indirectly via a small wrapper */ asmlinkage int sys_fork(struct pt_regs *regs) ^^^^^^^^^^^^^^^^^^^^ { return do_fork(SIGCHLD, regs->ARM_sp, regs, 0); } /* sys_execve() executes a new program. * This is called indirectly via a small wrapper */ asmlinkage int sys_execve(char *filenamei, char **argv, char **envp, struct pt_regs *regs) ^^^^^^^^^^^^^^^^^^^^^ { int error; char * filename; filename = getname(filenamei); error = PTR_ERR(filename); if (IS_ERR(filename)) goto out; error = do_execve(filename, argv, envp, regs); putname(filename); out: return error; } Certainly looks to me like they do. See the highlighted arguments. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) 2001-08-23 20:11 ` Russell King @ 2001-08-23 21:13 ` george anzinger 0 siblings, 0 replies; 20+ messages in thread From: george anzinger @ 2001-08-23 21:13 UTC (permalink / raw) To: Russell King; +Cc: Victor Yodaiken, christophe barbé, linux-kernel Russell King wrote: > > On Thu, Aug 23, 2001 at 01:04:09PM -0700, george anzinger wrote: > > Sorry, but none of those system calls requires the registers which is > > where the problem is. > > /* Fork a new task - this creates a new program thread. > * This is called indirectly via a small wrapper > */ > asmlinkage int sys_fork(struct pt_regs *regs) > ^^^^^^^^^^^^^^^^^^^^ > { > return do_fork(SIGCHLD, regs->ARM_sp, regs, 0); > } > > /* sys_execve() executes a new program. > * This is called indirectly via a small wrapper > */ > asmlinkage int > sys_execve(char *filenamei, char **argv, char **envp, struct pt_regs *regs) > ^^^^^^^^^^^^^^^^^^^^^ > { > int error; > char * filename; > > filename = getname(filenamei); > error = PTR_ERR(filename); > if (IS_ERR(filename)) > goto out; > error = do_execve(filename, argv, envp, regs); > putname(filename); > out: > return error; > } > > Certainly looks to me like they do. See the highlighted arguments. > I stand corrected! Guess I should look before I leap, uh email. OK, here is the issue I am dealing with. I would like to define a new system call \x1f(clock_nanosleep()) to conform to the extended POSIX standard. It, as well as the current nanosleep() needs the regs. to call do_signal(), however, it is strictly a pass thru, do_signal() uses the regs, not the main body of the sleep functions. The various platforms have different ways of passing the regs to the system call and also different ways of calling do_signal(). The failure to be able to call do_signal() results in the call not properly ignoring ptrace signals, see the beginning of this thread. What I would like to do is to abstract the interface to the system calls (just the nanosleep ones will do for now) so that the common code will work (except for the ptrace problem) while leaving room for the platform code to pick up the ball as they support people feel so inclined. This means that some of them would have to write the required wrappers and all would have to define the macros that abstract the interface. Here is what I have so far: In include/linux/signal.h #ifndef PT_REGS_ENTRY #define PT_REGS_ENTRY(type,name,p1_type,p1,p2_type,p2) \ type name(pi_type p1,p2_type,p2) { #define _do_signal() 1 #endif In include/asm_i386/signal.h asmlinkage int FASTCALL(do_signal(struct pt_regs *regs, sigset_t *oldset)); #define PT_REGS_ENTRY(type,name,p1_type,p1,p2_type,p2) \ type name(p1_type p1,p2_type p2) \ { struct pt_regs *regs=(struct pt_regs *)&p1; #define _do_signal() do_signal(regs,NULL) and in kernel/timer.c PT_REGS_ENTRY(asmlinkage long,sys_nanosleep,struct timespec*,rqtp,struct timespec* ,rmtp) struct timespec t;..... do{ current->state = TASK_INTERRUPTABLE; }while((expire = schedule_timeout(expire)) && !_do_signal()); I think this lets us get on with fixing the i386 branch without breaking any other branch and still leaves it easy for other branches to catch up as they find the time to do so. The question I haven't had time to research is if there is enough flexibility for all the other platforms. Certainly ARM can insert the additional parameter in the system call using this. If you want my $.02 worth, the pt_regs should be addressable without reference to the call stack. To the best of my knowledge, they are always at the base of the kernel stack (assuming we ignore the case of kernel interrupts where we don't allow signal delivery in any case) and so should be find able by doing a bit of math on the stack pointer. How does the hardware come up with a stack pointer when interrupting/ calling from user space? Food for thought, and of course, it is different for each platform. George ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2001-08-23 21:16 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2001-08-13 8:29 ptrace(), fork(), sleep(), exit(), SIGCHLD Bruce Janson 2001-08-14 7:28 ` christophe barbé 2001-08-14 15:06 ` Bruce Janson 2001-08-15 15:46 ` george anzinger 2001-08-15 17:53 ` george anzinger 2001-08-15 18:02 ` george anzinger 2001-08-16 0:59 ` How should nano_sleep be fixed (was: ptrace(), fork(), sleep(), exit(), SIGCHLD) george anzinger 2001-08-16 10:17 ` christophe barbé 2001-08-16 10:29 ` Russell King 2001-08-16 14:16 ` george anzinger 2001-08-16 16:00 ` christophe barbé 2001-08-16 16:12 ` Russell King 2001-08-16 18:17 ` george anzinger 2001-08-17 18:25 ` george anzinger 2001-08-17 18:57 ` Victor Yodaiken 2001-08-17 19:56 ` george anzinger 2001-08-22 18:40 ` Russell King 2001-08-23 20:04 ` george anzinger 2001-08-23 20:11 ` Russell King 2001-08-23 21:13 ` george anzinger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox