* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
[not found] <200604101640.04255.lbocseg@domain.hid>
@ 2006-04-11 12:29 ` Jan Kiszka
2006-04-11 12:54 ` Rodrigo Rosenfeld Rosas
2006-04-11 14:01 ` Jan Kiszka
0 siblings, 2 replies; 7+ messages in thread
From: Jan Kiszka @ 2006-04-11 12:29 UTC (permalink / raw)
To: Rodrigo Rosenfeld Rosas; +Cc: xenomai-core
Rodrigo Rosenfeld Rosas wrote:
> BTW, please, could someone confirm the rt_task_delete(NULL) bug in SVN?
Half-confirmed, there is something fishy. I'm struggling with the
debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete of
the skin module is entered with task != NULL...
Jan
* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
2006-04-11 12:29 ` [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version) Jan Kiszka
@ 2006-04-11 12:54 ` Rodrigo Rosenfeld Rosas
2006-04-11 14:01 ` Jan Kiszka
1 sibling, 0 replies; 7+ messages in thread
From: Rodrigo Rosenfeld Rosas @ 2006-04-11 12:54 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai-core
Maybe NULL is defined differently in kernel mode and in userspace? Maybe it
would be good to test the function against 0 (zero) in both cases, just to
make sure...
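For what it's worth, a minimal sketch of that check in plain C. The helper name null_checks_agree is made up for illustration and is not part of any Xenomai source. In standard C, NULL is a null pointer constant, so comparing against NULL, comparing against 0, and using !p should all agree; the sketch only makes that visible and says nothing about how the skin module actually receives its argument.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 when the three usual spellings of a null check
 * give the same answer for the pointer p.
 */
static int null_checks_agree(const void *p)
{
    int a = (p == NULL); /* comparison against the NULL macro */
    int b = (p == 0);    /* comparison against the literal zero */
    int c = !p;          /* logical negation of the pointer */

    return a == b && b == c;
}
```

If this holds in both build environments, a differing NULL definition is unlikely to be the cause.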
Rodrigo.
____________________________________________________________
On Tuesday 11 April 2006 09:29, Jan Kiszka wrote:
>Rodrigo Rosenfeld Rosas wrote:
>> BTW, please, could someone confirm the rt_task_delete(NULL) bug in SVN?
>
>Half-confirmed, there is something fishy. I'm struggling with the
>debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete of
>the skin module is entered with task != NULL...
>
>Jan
* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
2006-04-11 12:29 ` [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version) Jan Kiszka
2006-04-11 12:54 ` Rodrigo Rosenfeld Rosas
@ 2006-04-11 14:01 ` Jan Kiszka
2006-04-11 20:25 ` Philippe Gerum
1 sibling, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2006-04-11 14:01 UTC (permalink / raw)
To: Rodrigo Rosenfeld Rosas; +Cc: xenomai-core
[a few interruptions later]
Jan Kiszka wrote:
> Rodrigo Rosenfeld Rosas wrote:
>> BTW, please, could someone confirm the rt_task_delete(NULL) bug in SVN?
>
> Half-confirmed, there is something fishy. I'm struggling with the
> debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete of
> the skin module is entered with task != NULL...
...which turns out to be fine, just appears redundant to me when
comparing __rt_task_delete and rt_task_delete for the task=NULL case.
Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
to the whole process instead of just the task (pthread). This lets your
program terminate unexpectedly - I would say: a bug. And this doesn't
happen with 2.1?
I guess the easiest way to solve this is to catch NULL in userspace and
call pthread_exit() in favour of the skin service (the POSIX skin uses
pthread_exit anyway), see attached patch. Someone just has to confirm
that there will be no problem hidden by this approach.
Jan
PS: What's the reason for "if (err == -ESRCH) return 0" in
src/skins/native/task.c, rt_task_delete? Why is that error generated in
the first place if it is zeroed out here?
[-- Attachment #1.2: task-delete-null.patch --]
[-- Type: text/plain, Size: 456 bytes --]
Index: src/skins/native/task.c
===================================================================
--- src/skins/native/task.c (revision 923)
+++ src/skins/native/task.c (working copy)
@@ -212,7 +212,10 @@ int rt_task_delete (RT_TASK *task)
{
int err;
- if (task && task->opaque2) {
+ if (!task)
+ pthread_exit(NULL);
+
+ if (task->opaque2) {
err = pthread_cancel((pthread_t)task->opaque2);
if (err)
return -err;
* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
2006-04-11 14:01 ` Jan Kiszka
@ 2006-04-11 20:25 ` Philippe Gerum
2006-04-11 20:41 ` Philippe Gerum
0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2006-04-11 20:25 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai-core
Jan Kiszka wrote:
> [a few interruptions later]
>
> Jan Kiszka wrote:
>
>>Rodrigo Rosenfeld Rosas wrote:
>>
>>>BTW, please, could someone confirm the rt_task_delete(NULL) bug in SVN?
>>
>>Half-confirmed, there is something fishy. I'm struggling with the
>>debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete of
>>the skin module is entered with task != NULL...
>
>
> ...which turns out to be fine, just appears redundant to me when
> comparing __rt_task_delete and rt_task_delete for the task=NULL case.
>
> Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
> to the whole process instead of just the task (pthread). This lets your
> program terminate unexpectedly - I would say: a bug. And this doesn't
> happen with 2.1?
>
It's a side-effect of a recent bug fix in ksrc/nucleus/shadow.c; now
killing a thread raises a group signal wiping out the entire process.
Ok, it's a bit drastic, will fix.
> I guess the easiest way to solve this is to catch NULL in userspace and
> call pthread_exit() in favour of the skin service (the POSIX skin uses
> pthread_exit anyway), see attached patch. Someone just has to confirm
> that there will be no problem hidden by this approach.
Passing NULL needs to work including from user-space; the kernel-space
is ok with this, and the API must behave the same way regardless of the
execution space. Should fix as needed.
>
> Jan
>
>
> PS: What's the reason for "if (err == -ESRCH) return 0" in
> src/skins/native/task.c, rt_task_delete? Why is that error generated in
> the first place if it is zeroed out here?
>
>
> ------------------------------------------------------------------------
>
> Index: src/skins/native/task.c
> ===================================================================
> --- src/skins/native/task.c (revision 923)
> +++ src/skins/native/task.c (working copy)
> @@ -212,7 +212,10 @@ int rt_task_delete (RT_TASK *task)
> {
> int err;
>
> - if (task && task->opaque2) {
> + if (!task)
> + pthread_exit(NULL);
> +
> + if (task->opaque2) {
> err = pthread_cancel((pthread_t)task->opaque2);
> if (err)
> return -err;
--
Philippe.
* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
2006-04-11 20:25 ` Philippe Gerum
@ 2006-04-11 20:41 ` Philippe Gerum
2006-04-11 21:27 ` Jan Kiszka
0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2006-04-11 20:41 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Jan Kiszka, xenomai-core
Philippe Gerum wrote:
> Jan Kiszka wrote:
>
>> [a few interruptions later]
>>
>> Jan Kiszka wrote:
>>
>>> Rodrigo Rosenfeld Rosas wrote:
>>>
>>>> BTW, please, could someone confirm the rt_task_delete(NULL) bug in SVN?
>>>
>>>
>>> Half-confirmed, there is something fishy. I'm struggling with the
>>> debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete of
>>> the skin module is entered with task != NULL...
>>
>>
>>
>> ...which turns out to be fine, just appears redundant to me when
>> comparing __rt_task_delete and rt_task_delete for the task=NULL case.
>>
>> Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
>> to the whole process instead of just the task (pthread). This lets your
>> program terminate unexpectedly - I would say: a bug. And this doesn't
>> happen with 2.1?
>>
>
> It's a side-effect of a recent bug fix in ksrc/nucleus/shadow.c; now
> killing
Er, "deleting" is the right word here. Sending a thread a termination
signal must kill the entire process as per POSIX, and will continue to
do so. Calling rt_task_delete() to explicitly delete a single thread
from within the containing process is another story. The current issue
is due to the fact that no distinction is made on the caller:
rt_task_delete() targeting a thread from another process should wipe out
the entire target process; otherwise, only the local target thread
should be deleted. It's not clear whether we should still wipe out the
entire process when the target thread is not the current one, regardless
of whether that thread is a member of the same process or not.
I'm open to suggestions.
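To make the distinction concrete, here is a rough sketch of one possible reading of the paragraph above. Every name in it (enum delete_scope, pick_delete_scope and its parameters) is hypothetical and does not exist in the nucleus or the native skin; it only encodes the three cases discussed, with the still-open case left explicitly undecided.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical outcome of a deletion request; not a Xenomai type. */
enum delete_scope {
    DELETE_THREAD_ONLY,   /* only the target thread goes away */
    DELETE_WHOLE_PROCESS, /* the entire target process is wiped out */
    DELETE_UNDECIDED      /* the open question raised above */
};

/*
 * Illustrative policy only.
 * caller_pid / target_pid identify the containing processes;
 * target_is_caller is true when a thread deletes itself,
 * i.e. the rt_task_delete(NULL) case.
 */
static enum delete_scope pick_delete_scope(pid_t caller_pid,
                                           pid_t target_pid,
                                           bool target_is_caller)
{
    if (caller_pid != target_pid)
        return DELETE_WHOLE_PROCESS; /* cross-process deletion */
    if (target_is_caller)
        return DELETE_THREAD_ONLY;   /* self-deletion within the process */
    return DELETE_UNDECIDED;         /* same process, other thread */
}
```

The third branch is where input is wanted: it could reasonably map to either of the first two outcomes.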
> a thread raises a group signal wiping out the entire process.
> Ok, it's a bit drastic, will fix.
>
>> I guess the easiest way to solve this is to catch NULL in userspace and
>> call pthread_exit() in favour of the skin service (the POSIX skin uses
>> pthread_exit anyway), see attached patch. Someone just has to confirm
>> that there will be no problem hidden by this approach.
>
>
> Passing NULL needs to work including from user-space; the kernel-space
> is ok with this, and the API must behave the same way regardless of the
> execution space. Should fix as needed.
>
>>
>> Jan
>>
>>
>> PS: What's the reason for "if (err == -ESRCH) return 0" in
>> src/skins/native/task.c, rt_task_delete? Why is that error generated in
>> the first place if it is zeroed out here?
>>
>>
>> ------------------------------------------------------------------------
>>
>> Index: src/skins/native/task.c
>> ===================================================================
>> --- src/skins/native/task.c (revision 923)
>> +++ src/skins/native/task.c (working copy)
>> @@ -212,7 +212,10 @@ int rt_task_delete (RT_TASK *task)
>> {
>> int err;
>>
>> - if (task && task->opaque2) {
>> + if (!task)
>> + pthread_exit(NULL);
>> +
>> + if (task->opaque2) {
>> err = pthread_cancel((pthread_t)task->opaque2);
>> if (err)
>> return -err;
>
>
>
--
Philippe.
* Re: [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version)
2006-04-11 20:41 ` Philippe Gerum
@ 2006-04-11 21:27 ` Jan Kiszka
2006-04-16 10:27 ` [Xenomai-core] rt_task_delete() behaviour Philippe Gerum
0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2006-04-11 21:27 UTC (permalink / raw)
To: Philippe Gerum; +Cc: xenomai-core
Philippe Gerum wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>
>>> [a few interruptions later]
>>>
>>> Jan Kiszka wrote:
>>>
>>>> Rodrigo Rosenfeld Rosas wrote:
>>>>
>>>>> BTW, please, could someone confirm the rt_task_delete(NULL) bug in
>>>>> SVN?
>>>>
>>>>
>>>> Half-confirmed, there is something fishy. I'm struggling with the
>>>> debugger ATM, not sure yet who's wrong ;). It tells me
>>>> rt_task_delete of
>>>> the skin module is entered with task != NULL...
>>>
>>>
>>>
>>> ...which turns out to be fine, just appears redundant to me when
>>> comparing __rt_task_delete and rt_task_delete for the task=NULL case.
>>>
>>> Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
>>> to the whole process instead of just the task (pthread). This lets your
>>> program terminate unexpectedly - I would say: a bug. And this doesn't
>>> happen with 2.1?
>>>
>>
>> It's a side-effect of a recent bug fix in ksrc/nucleus/shadow.c; now
>> killing
>
> Er, "deleting" is the right word here. Sending a thread a termination
> signal must kill the entire process as per POSIX, and will continue to
> do so. Calling rt_task_delete() to explicitly delete a single thread
> from within the containing process is another story. The current issue
> is due to the fact that no distinction is made on the caller:
> rt_task_delete() targeting a thread from another process should wipe out
> the entire target process; otherwise, only the local target thread
> should be deleted. It's not clear whether we should still wipe out the
> entire process when the target thread is not the current one, regardless
> of the fact such thread is a member of the same process or not.
> I'm open to suggestions.
Killing other threads within the same process currently only works due
to pthread_cancel. I don't see a portable equivalent for foreign
processes yet either. :-/
I guess the thread termination signal sent by pthread_cancel depends on
glibc internals, specifically its variant (NPTL or linux-threads),
doesn't it? Didn't we already have this discussion??
For now I would say the best we can do is to avoid the
rt_task_delete(NULL) side effect in userspace (as I suggested) and live
with the limitation of terminating the whole process when using the
(rather unusual) cross-process rt_task_delete.
>
> a thread raises a group signal wiping out the entire process.
>> Ok, it's a bit drastic, will fix.
>>
>>> I guess the easiest way to solve this is to catch NULL in userspace and
>>> call pthread_exit() in favour of the skin service (the POSIX skin uses
>>> pthread_exit anyway), see attached patch. Someone just has to confirm
>>> that there will be no problem hidden by this approach.
>>
>>
>> Passing NULL needs to work including from user-space; the kernel-space
>> is ok with this, and the API must behave the same way regardless of
>> the execution space. Should fix as needed.
>>
>>>
>>> Jan
>>>
>>>
>>> PS: What's the reason for "if (err == -ESRCH) return 0" in
>>> src/skins/native/task.c, rt_task_delete? Why is that error generated in
>>> the first place if it is zeroed out here?
>>>
<attention: unanswered question above> ;)
Jan
* [Xenomai-core] rt_task_delete() behaviour
2006-04-11 21:27 ` Jan Kiszka
@ 2006-04-16 10:27 ` Philippe Gerum
0 siblings, 0 replies; 7+ messages in thread
From: Philippe Gerum @ 2006-04-16 10:27 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai-core
Jan Kiszka wrote:
>>>>Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
>>>>to the whole process instead of just the task (pthread). This lets your
>>>>program terminate unexpectedly - I would say: a bug. And this doesn't
>>>>happen with 2.1?
>>>>
>>>
>>>It's a side-effect of a recent bug fix in ksrc/nucleus/shadow.c; now
>>>killing
>>
>>Er, "deleting" is the right word here. Sending a thread a termination
>>signal must kill the entire process as per POSIX, and will continue to
>>do so. Calling rt_task_delete() to explicitly delete a single thread
>>from within the containing process is another story. The current issue
>>is due to the fact that no distinction is made on the caller:
>>rt_task_delete() targeting a thread from another process should wipe out
>>the entire target process; otherwise, only the local target thread
>>should be deleted. It's not clear whether we should still wipe out the
>>entire process when the target thread is not the current one, regardless
>>of the fact such thread is a member of the same process or not.
>>I'm open to suggestions.
>
>
> Killing other threads within the same process currently only works due
> to pthread_cancel. I don't see a portable equivalent for foreign
> processes yet as well. :-/
>
> I guess the thread termination signal sent by pthread_cancel depends on
> glibc internals, specifically its variant (NPTL or linux-threads),
> doesn't it? Didn't we already have this discussion??
>
Actually, the issue is different: it depends on the underlying kernel
support. It's Xenomai's shadow manager that sends the termination signal
when demoting threads from kernel space; the pthread API is not involved
here. The nucleus happens to kill the thread group over 2.6 because
thread group support is fully implemented on this kernel, and calling
the kill_proc() API with a termination signal properly kills all
threads belonging to the group the target thread belongs to. This does
not work over 2.4, which puts every new thread in its own group by
default, de facto making it a group leader, regardless of whether the
CLONE_THREAD attribute is set when glibc calls the clone()
service. IOW, you actually end up with two different behaviours when
calling rt_task_delete(), depending on whether 2.4 or 2.6 is used, even
if both setups rely on the NPTL on the application side.
> For now I would say the best we can do is to avoid the
> rt_task_delete(NULL) side effect in userspace (as I suggested) and live
> with the limitation of terminating the whole process when using the
> (rather unusual) cross-process rt_task_delete.
>
This would not actually be a limitation in some cases: e.g. continuing
an application that had thread(s) killed from another _process_ would
most often be meaningless.
>
>> a thread raises a group signal wiping out the entire process.
>>
>>>Ok, it's a bit drastic, will fix.
>>>
>>>
>>>>I guess the easiest way to solve this is to catch NULL in userspace and
>>>>call pthread_exit() in favour of the skin service (the POSIX skin uses
>>>>pthread_exit anyway), see attached patch. Someone just has to confirm
>>>>that there will be no problem hidden by this approach.
>>>
>>>
>>>Passing NULL needs to work including from user-space; the kernel-space
>>>is ok with this, and the API must behave the same way regardless of
>>>the execution space. Should fix as needed.
>>>
>>>
>>>>Jan
>>>>
>>>>
>>>>PS: What's the reason for "if (err == -ESRCH) return 0" in
>>>>src/skins/native/task.c, rt_task_delete? Why is that error generated in
>>>>the first place if it is zeroed out here?
>>>>
>
>
> <attention: unanswered question above> ;)
>
I don't think I've coded this stuff, but reading it, I would say that
since the preceding call to pthread_cancel() might have caused the
target thread to be wiped out before the nucleus syscall is issued,
-ESRCH would not be a real error.
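As a minimal sketch of that reasoning, assuming the syscall result arrives as a negative errno value: the helper name finish_delete is invented for illustration, and this is not the actual code from src/skins/native/task.c.

```c
#include <errno.h>

/*
 * Hypothetical post-processing of the nucleus syscall result.
 * syscall_err is 0 on success or a negative errno value.
 */
static int finish_delete(int syscall_err)
{
    /*
     * The earlier pthread_cancel() may already have removed the
     * target, so "no such thread" means the job is done.
     */
    if (syscall_err == -ESRCH)
        return 0;

    /* Any other value is passed through unchanged. */
    return syscall_err;
}
```

With this mapping the caller sees success whether the thread was removed by the cancellation or by the syscall itself.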
> Jan
>
--
Philippe.
Thread overview: 7+ messages
[not found] <200604101640.04255.lbocseg@domain.hid>
2006-04-11 12:29 ` [Xenomai-core] [PATCH] Fixs doxygen doc on rt_queue_read in ksrc/native/queue.c (for SVN version) Jan Kiszka
2006-04-11 12:54 ` Rodrigo Rosenfeld Rosas
2006-04-11 14:01 ` Jan Kiszka
2006-04-11 20:25 ` Philippe Gerum
2006-04-11 20:41 ` Philippe Gerum
2006-04-11 21:27 ` Jan Kiszka
2006-04-16 10:27 ` [Xenomai-core] rt_task_delete() behaviour Philippe Gerum