[Xenomai-core] Potential heap corruption on thread cleanup

* [Xenomai-core] Potential heap corruption on thread cleanup
@ 2010-03-03  8:58 Jan Kiszka
  2010-03-03  9:04 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2010-03-03  8:58 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Wolfgang Mauerer, xenomai-core, Gernot Hillier

Hi Gilles,

I'm pushing your findings to the list, also as my colleagues showed
strong interest - this thing may explain rare corruptions for us as well.

I thought a bit about that likely u_mode-related crash in your test case
and have the following theory so far: If the xeno_current_mode storage
is allocated on the application heap (!HAVE_THREAD, that's also what we
are forced to use), it is automatically freed on thread termination in
the context of the dying thread. If the thread is already migrated to
secondary or if that happens while it is cleaned up (i.e. before calling
for exit into the kernel), there is no problem, Xenomai will not touch
the mode storage anymore. But if the thread happens to delete the
storage "silently", without any migration, the final exit will trigger
one further access. And that takes place against an invalid heap area at
this point.
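
A minimal sketch of this failure mode, as a pure user-space simulation.
Nothing here is Xenomai code: fake_kthread, kernel_relax and exit_is_safe
are hypothetical names, and the storage_live flag replaces the actual write
to freed memory so the example stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's view of one thread. */
struct fake_kthread {
    unsigned long *u_mode; /* user-space address registered at thread setup */
    bool in_primary;       /* true while the thread runs in primary mode */
};

/* Tracks whether the user-space mode storage is still allocated. */
static bool storage_live;

static unsigned long *alloc_mode_storage(void)
{
    unsigned long *p = malloc(sizeof(*p));

    assert(p != NULL);
    storage_live = true;
    return p;
}

/* What the TSD destructor does in the non-TLS case: free the storage. */
static void mode_storage_destructor(void *p)
{
    free(p);
    storage_live = false;
}

/*
 * Models a migration to secondary mode, which makes the kernel update
 * u_mode. Returns false when that update would land in freed memory.
 */
static bool kernel_relax(struct fake_kthread *t)
{
    if (!t->in_primary)
        return true;          /* already relaxed, nothing to write */
    t->in_primary = false;
    if (!storage_live)
        return false;         /* the real kernel would write here anyway */
    *t->u_mode = 0;
    return true;
}

/* Runs one thread lifetime; returns true if the final exit was harmless. */
bool exit_is_safe(bool migrate_before_free)
{
    struct fake_kthread t = { alloc_mode_storage(), true };

    if (migrate_before_free)
        kernel_relax(&t);     /* update happens while storage is valid */
    mode_storage_destructor(t.u_mode);
    return kernel_relax(&t);  /* final exit implies one more migration */
}
```

In this model exit_is_safe(true) reports a harmless exit, while
exit_is_safe(false) corresponds to the "silent" deletion case.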

Does this make sense?

If that is true, all we need to do is to force a migration before
releasing the mode storage. Could you check this?

Jan

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-03  8:58 [Xenomai-core] Potential heap corruption on thread cleanup Jan Kiszka
@ 2010-03-03  9:04 ` Gilles Chanteperdrix
  2010-03-03  9:13   ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-03  9:04 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Wolfgang Mauerer, xenomai-core, Gernot Hillier

Jan Kiszka wrote:
> Hi Gilles,
> 
> I'm pushing your findings to the list, also as my colleagues showed
> strong interest - this thing may explain rare corruptions for us as well.
> 
> I thought a bit about that likely u_mode-related crash in your test case
> and have the following theory so far: If the xeno_current_mode storage
> is allocated on the application heap (!HAVE_THREAD, that's also what we
> are forced to use), it is automatically freed on thread termination in
> the context of the dying thread. If the thread is already migrated to
> secondary or if that happens while it is cleaned up (i.e. before calling
> for exit into the kernel), there is no problem, Xenomai will not touch
> the mode storage anymore. But if the thread happens to delete the
> storage "silently", without any migration, the final exit will trigger
> one further access. And that takes place against an invalid heap area at
> this point.
> 
> Does this make sense?

Yes, it is the issue we observed.

> 
> If that is true, all we need to do is to force a migration before
> releasing the mode storage. Could you check this?

No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
in another TSD cleanup function, which could be called after the
current_mode TSD cleanup, is allowed and could trigger a switch to
primary mode and a write to the u_mode.
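
The ordering problem can be sketched with two plain POSIX keys. This is an
illustration only: mode_key and app_key are made-up names, and the counter
late_accesses stands in for the write to u_mode that a real RT mutex lock
could cause. POSIX leaves the order of destructor calls unspecified, so the
sketch records what happened instead of assuming an order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

static pthread_key_t mode_key; /* stands in for the current_mode key */
static pthread_key_t app_key;  /* some unrelated application key */

static bool mode_freed;    /* set once the mode storage is gone */
static int destructors_run;
static int late_accesses;  /* uses of the mode storage after it was freed */

static void mode_destructor(void *p)
{
    free(p);
    mode_freed = true;
    destructors_run++;
}

/* An application destructor that would lock an RT mutex during cleanup. */
static void app_destructor(void *p)
{
    (void)p;
    destructors_run++;
    if (mode_freed)
        late_accesses++;   /* a real lock here could write to u_mode */
}

static void *worker(void *arg)
{
    unsigned long *mode = malloc(sizeof(*mode));

    (void)arg;
    assert(mode != NULL);
    pthread_setspecific(mode_key, mode);
    pthread_setspecific(app_key, &app_key); /* any non-NULL value */
    return NULL;
}

/* Returns how many destructors ran for the exiting thread. */
int run_two_destructors(void)
{
    pthread_t tid;

    mode_freed = false;
    destructors_run = 0;
    late_accesses = 0;
    pthread_key_create(&mode_key, mode_destructor);
    pthread_key_create(&app_key, app_destructor);
    pthread_create(&tid, NULL, worker, NULL);
    pthread_join(tid, NULL);
    pthread_key_delete(mode_key);
    pthread_key_delete(app_key);
    return destructors_run;
}
```

Whenever late_accesses ends up non-zero, the application destructor ran
after the mode storage had already been released.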

-- 
					    Gilles.


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-03  9:04 ` Gilles Chanteperdrix
@ 2010-03-03  9:13   ` Jan Kiszka
  2010-03-03  9:16     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2010-03-03  9:13 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Wolfgang Mauerer, xenomai-core, Gernot Hillier

Gilles Chanteperdrix wrote:
> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
> in another TSD cleanup function, which could be called after the
> current_mode TSD cleanup, is allowed and could trigger a switch to
> primary mode and a write to the u_mode.
> 

Good point. Mmh. Another, but ABI-breaking, way would be to add a
syscall for deregistering the u_mode pointer...

Jan


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-03  9:13   ` Jan Kiszka
@ 2010-03-03  9:16     ` Gilles Chanteperdrix
  2010-03-04 18:28       ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-03  9:16 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Wolfgang Mauerer, xenomai-core, Gernot Hillier

Jan Kiszka wrote:
> Good point. Mmh. Another, but ABI-breaking, way would be to add a
> syscall for deregistering the u_mode pointer...

That is the thing we did to verify that we had this bug. But this
syscall would be also called too soon, and suffers from the TSD cleanup
functions order again.

-- 
					    Gilles.


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-03  9:16     ` Gilles Chanteperdrix
@ 2010-03-04 18:28       ` Jan Kiszka
  2010-03-04 18:36         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2010-03-04 18:28 UTC (permalink / raw)
  To: Gilles Chanteperdrix
  Cc: Wolfgang Mauerer, Jan Kiszka, xenomai-core, Gernot Hillier

Gilles Chanteperdrix wrote:
> That is the thing we did to verify that we had this bug. But this
> syscall would be also called too soon, and suffers from the TSD cleanup
> functions order again.
> 

Right, the only complete fix without losing functionality is to add an
option to our ABI for requesting kernel-managed memory if dynamic
allocation is necessary (i.e. no TLS is available).

But I thought a bit more about a workaround for the existing ABI. We
basically need a way to free some memory as late as possible on thread
deletion. Even when leaving garbage collection that no one really wants
aside, there might be some semi-perfect user-space-only solution:

pthread_key_create says that TSD destructors are re-run after the first
round if their key value is still non-NULL. So we could at least work
around the already rare case that some TSD destructor past ours tries to
access an RT mutex or otherwise migrates the thread to RT again. For
this, we just need a counter (next to the mode storages) for the round.
If we are in round #1, we would restore the key value again instead of
freeing it. On run #n < PTHREAD_DESTRUCTOR_ITERATIONS, we would finally
free it in the hope we are the last interested in it. This just requires
PTHREAD_DESTRUCTOR_ITERATIONS > 1, and that the application does not do
this ugly dance as well AND also performs Xenomai calls.
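
A rough sketch of that re-arming trick with plain POSIX calls. The names
mode_slot, FREE_ROUND and run_rearm_demo are invented for illustration, and
freeing in round 2 is an arbitrary choice; any round below
PTHREAD_DESTRUCTOR_ITERATIONS would follow the same pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Destructor round in which the storage is finally released (assumed). */
#define FREE_ROUND 2

struct mode_slot {
    unsigned long mode; /* the mode word itself */
    int round;          /* destructor rounds seen so far */
};

static pthread_key_t mode_key;
static int freed_in_round; /* recorded for inspection after the join */

static void mode_destructor(void *p)
{
    struct mode_slot *slot = p;

    slot->round++;
    if (slot->round < FREE_ROUND) {
        /* The key value is NULL by now; restore it to get another round. */
        pthread_setspecific(mode_key, slot);
        return;
    }
    freed_in_round = slot->round;
    free(slot);
}

static void *worker(void *arg)
{
    struct mode_slot *slot = calloc(1, sizeof(*slot));

    (void)arg;
    assert(slot != NULL);
    pthread_setspecific(mode_key, slot);
    return NULL;
}

/* Returns the destructor round in which the slot was freed. */
int run_rearm_demo(void)
{
    pthread_t tid;

    freed_in_round = 0;
    pthread_key_create(&mode_key, mode_destructor);
    pthread_create(&tid, NULL, worker, NULL);
    pthread_join(tid, NULL);
    pthread_key_delete(mode_key);
    return freed_in_round;
}
```

The storage survives every destructor that runs in the first round, which
is exactly the window the workaround tries to cover.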

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-04 18:28       ` Jan Kiszka
@ 2010-03-04 18:36         ` Gilles Chanteperdrix
  2010-03-04 20:25           ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-04 18:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Wolfgang Mauerer, Jan Kiszka, xenomai-core, Gernot Hillier

Jan Kiszka wrote:
> Right, the only complete fix without losing functionality is to add an
> option to our ABI for requesting kernel-managed memory if dynamic
> allocation is necessary (i.e. no TLS is available).

No. TLS may as well suffer from the same issue, since it is handled by
the glibc or libgcc, over which we have no control. So yes, it may work
by chance today, but may as well stop working tomorrow. We use
kernel-managed memory all the time, final point.

> 
> But I thought a bit more about a workaround for the existing ABI. We
> basically need a way to free some memory as late as possible on thread
> deletion. Even when leaving garbage collection that no one really wants
> aside, there might be some semi-perfect user-space-only solution:
> 
> pthread_key_create says that TSD destructors are re-run after the first
> round if their key value is still non-NULL. So we could at least work
> around the already rare case that some TSD destructor past ours tries to
> access an RT mutex or otherwise migrates the thread to RT again. For
> this, we just need a counter (next to the mode storages) for the round.
> If we are in round #1, we would restore the key value again instead of
> freeing it. On run #n < PTHREAD_DESTRUCTOR_ITERATIONS, we would finally
> free it in the hope we are the last interested in it. This just requires
> PTHREAD_DESTRUCTOR_ITERATIONS > 1, and that the application does not do
> this ugly dance as well AND also performs Xenomai calls.

I have thought of another simpler fix: we leak the u_mode when
kernel-support is too old, and whine loudly about it. For the other case
(newer kernel-support with older user-space support), I was thinking
about something else, which I still find complicated and far from
perfect: handling the exit syscall by setting the u_mode pointer to NULL
because we know at that time the u_mode pointer points to free memory.
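
The second idea (clearing the pointer when the exit syscall is seen) could
look roughly like this. It is a user-space simulation with invented names
(fake_tcb, update_u_mode, on_exit_syscall), not the actual nucleus code, and
it leaves out the detection of old user-space entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice of the kernel-side thread control block. */
struct fake_tcb {
    unsigned long *u_mode; /* registered user-space address, or NULL */
};

static int mode_writes; /* counts updates that actually reached user space */

/* Every kernel-side update goes through this check. */
static void update_u_mode(struct fake_tcb *tcb, unsigned long mode)
{
    if (tcb->u_mode == NULL)
        return;            /* storage is assumed gone, skip the write */
    *tcb->u_mode = mode;
    mode_writes++;
}

/* Hook for the exit syscall: the storage must be treated as freed. */
static void on_exit_syscall(struct fake_tcb *tcb)
{
    tcb->u_mode = NULL;
}

/* Returns the number of writes performed across one thread lifetime. */
int run_exit_demo(void)
{
    unsigned long storage = 0;
    struct fake_tcb tcb = { &storage };

    mode_writes = 0;
    update_u_mode(&tcb, 1);  /* normal operation: write goes through */
    on_exit_syscall(&tcb);
    update_u_mode(&tcb, 0);  /* final migration: write is dropped */
    assert(storage == 1);
    return mode_writes;
}
```

Only the first update reaches user space; the one triggered after the exit
syscall is dropped.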

-- 
					    Gilles.


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-04 18:36         ` Gilles Chanteperdrix
@ 2010-03-04 20:25           ` Jan Kiszka
  2010-03-04 20:42             ` Gilles Chanteperdrix
  2010-03-05 11:08             ` Wolfgang Mauerer
  0 siblings, 2 replies; 15+ messages in thread
From: Jan Kiszka @ 2010-03-04 20:25 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Gilles Chanteperdrix wrote:
>> Right, the only complete fix without losing functionality is to add an
>> option to our ABI for requesting kernel-managed memory if dynamic
>> allocation is necessary (i.e. no TLS is available).
> 
> No. TLS may as well suffer from the same issue, since it is handled by
> the glibc or libgcc, over which we have no control. So yes, it may work
> by chance today, but may as well stop working tomorrow. We use
> kernel-managed memory all the time, final point.

I think we are still in the solution finding process, no need for early
conclusions.

See, we actually do not need kernel-managed storage for u_mode at all.
u_mode is an optimization, mostly for our fast user space mutexes. We
can indeed switch off all updates by the kernel and will still be able
to provide all required features - just less optimally. Adding a third
state, "invalid", we can make all mutex users assume they need the slow
syscall path on uncontended acquisition. And assert_nrt will probably be
happy about a syscall replacement for u_mode when it became invalid.

This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
without it) is entered during thread clean up with the help of a TSD
destructor. The destructor will then deregister our u_mode storage from
the kernel so that it doesn't matter if we release the memory
immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
And in this model, it also doesn't matter when precisely the destructor
is called.
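
A small sketch of the three-state idea, under several assumptions: the enum
values other than -1, the fake_deregister_u_mode stub (no such syscall
exists at this point) and the rule that only primary mode may use the fast
lock path are all illustrative, not taken from the Xenomai sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* -1 follows the "u_mode == -1" idea; the other two values are assumed. */
enum mode_state { MODE_INVALID = -1, MODE_PRIMARY = 0, MODE_SECONDARY = 1 };

enum lock_path { LOCK_FAST, LOCK_SYSCALL };

struct fake_thread {
    long u_mode;      /* user-space mode word */
    bool registered;  /* does the kernel still update u_mode? */
};

/* Stand-in for a deregistration syscall; hypothetical. */
static void fake_deregister_u_mode(struct fake_thread *t)
{
    t->registered = false;
}

/* Body of the TSD destructor: invalidate first, then tell the kernel. */
void mode_cleanup(struct fake_thread *t)
{
    assert(t != NULL);
    t->u_mode = MODE_INVALID;
    fake_deregister_u_mode(t);
    /* From here on the storage may be freed at any time. */
}

/* Uncontended acquisition may stay in user space only in primary mode. */
enum lock_path choose_lock_path(const struct fake_thread *t)
{
    return t->u_mode == MODE_PRIMARY ? LOCK_FAST : LOCK_SYSCALL;
}
```

Once mode_cleanup has run, every lock attempt takes the syscall path, so a
destructor that runs later can no longer depend on the released storage.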

> 
> I have thought of another simpler fix: we leak the u_mode when
> kernel-support is too old, and whine loudly about it. For the other case

Leaking is not nice, but I guess an application will crash sooner over
the bug than this leak becomes a reason for a failure.

> (newer kernel-support with older user-space support), I was thinking
> about something else, which I still find complicated and far from
> perfect: handling the exit syscall by setting the u_mode pointer to NULL
> because we know at that time the u_mode pointer points to free memory.
> 

That would reduce the probability of a crash, right. Probably the best
we can do for old user land. And I don't think it should take more than
two lines of code in the syscall dispatching path.

Jan


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-04 20:25           ` Jan Kiszka
@ 2010-03-04 20:42             ` Gilles Chanteperdrix
  2010-03-05 11:21               ` Jan Kiszka
  2010-03-05 11:08             ` Wolfgang Mauerer
  1 sibling, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-04 20:42 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Jan Kiszka wrote:
> I think we are still in the solution finding process, no need for early
> conclusions.
> 
> See, we actually do not need kernel-managed storage for u_mode at all.
> u_mode is an optimization, mostly for our fast user space mutexes. We
> can indeed switch off all updates by the kernel and will still be able
> to provide all required features - just less optimally. Adding a third
> state, "invalid", we can make all mutex users assume they need the slow
> syscall path on uncontended acquisition. And assert_nrt will probably be
> happy about a syscall replacement for u_mode when it became invalid.
> 
> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
> without it) is entered during thread clean up with the help of a TSD
> destructor. The destructor will then deregister our u_mode storage from
> the kernel so that it doesn't matter if we release the memory
> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
> And in this model, it also doesn't matter when precisely the destructor
> is called.

We have to add a syscall to propagate this value to kernel-space, and
clutter the kernel-space code which uses u_mode with tests to see if
u_mode is valid or not, and we have to clutter the code which uses
u_mode in user-space to handle that invalid state. And every time we add
a user of u_mode, we have to think about the invalid state. A lot of
clutter.

The two last issues may be removed by handling the invalid state only in
the function which returns the current mode. If the state is invalid,
then issue the syscall. Admittedly, we get two syscalls for mutex locks,
but who cares.
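
Confining the invalid state to one accessor might look like the sketch
below. MODE_INVALID, fake_query_mode and current_mode are hypothetical
names; the "syscall" is a plain function so the cost can be counted.

```c
#include <assert.h>
#include <stddef.h>

#define MODE_INVALID (-1L) /* assumed marker for "do not trust u_mode" */

static long kernel_mode;     /* what the fake kernel knows */
static int syscalls_issued;  /* counts slow-path queries */

/* Stand-in for a mode-query syscall; hypothetical. */
static long fake_query_mode(void)
{
    syscalls_issued++;
    return kernel_mode;
}

/* The only place that has to know about the invalid state. */
long current_mode(const long *u_mode)
{
    assert(u_mode != NULL);
    if (*u_mode == MODE_INVALID)
        return fake_query_mode();
    return *u_mode;
}
```

Callers such as a mutex lock path would then see a valid mode in every
case, at the price of the extra syscall once the state is invalid.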

However, what for? Allocating u_mode in the process private sem_heap, as
I have suggested since the beginning, looks so much simpler. No test, no
special case, the address is always valid as long as the tcb is valid.
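
The lifetime argument can be illustrated with a toy pool. This is not the
sem_heap implementation: the fixed-size array, fake_tcb, tcb_init and
tcb_destroy are stand-ins chosen only to show that the slot lives exactly as
long as the tcb, independent of any user-space cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SLOTS 4

/* Toy stand-in for the process-private sem_heap. */
static unsigned long pool[POOL_SLOTS];
static bool slot_used[POOL_SLOTS];

struct fake_tcb {
    unsigned long *u_mode; /* points into the pool for the tcb's lifetime */
};

static unsigned long *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            return &pool[i];
        }
    }
    return NULL;
}

static void pool_free(unsigned long *p)
{
    slot_used[p - pool] = false;
}

/* The slot is taken when the tcb is set up... */
bool tcb_init(struct fake_tcb *tcb)
{
    tcb->u_mode = pool_alloc();
    return tcb->u_mode != NULL;
}

/* ...and only given back when the tcb itself goes away. */
void tcb_destroy(struct fake_tcb *tcb)
{
    assert(tcb->u_mode != NULL);
    pool_free(tcb->u_mode);
    tcb->u_mode = NULL;
}

/* Number of slots currently handed out. */
int slots_in_use(void)
{
    int n = 0;

    for (size_t i = 0; i < POOL_SLOTS; i++)
        n += slot_used[i];
    return n;
}
```

A write through tcb.u_mode stays valid at any point between tcb_init and
tcb_destroy, no matter which TSD destructors have already run.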

> 
>> I have thought of another simpler fix: we leak the u_mode when
>> kernel-support is too old, and whine loudly about it. For the other case
> 
> Leaking is not nice, but I guess an application will crash sooner over
> the bug than this leak becomes a reason for a failure.
> 
>> (newer kernel-support with older user-space support), I was thinking
>> about something else, which I still find complicated and far from
>> perfect: handling the exit syscall by setting the u_mode pointer to NULL
>> because we know at that time the u_mode pointer points to free memory.
>>
> 
> That would reduce the probability of a crash, right. Probably the best
> we can do for old user land. And I don't think it should take more than
> two lines of code in the syscall dispatching path.

A bit more, we have to deal with the old user-space detection.
Noise would come from the kernel with this solution too.

-- 
					    Gilles.


* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-04 20:42             ` Gilles Chanteperdrix
@ 2010-03-05 11:21               ` Jan Kiszka
  2010-03-05 11:30                 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2010-03-05 11:21 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi Gilles,
>>>>>>>>
>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>
>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>> one further access. And that takes place against an invalid head area at
>>>>>>>> this point.
>>>>>>>>
>>>>>>>> Does this make sense?
>>>>>>> Yes, it is the issue we observed.
>>>>>>>
>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>> in another TSD cleanup function is which could be called after the
>>>>>>> current_mode TSD cleanup is allowed and could trigger a switch to
>>>>>>> primary mode and a write to the u_mode.
>>>>>>>
>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>> syscall for deregistering the u_mode pointer...
>>>>> That is the thing we did to verify that we had this bug. But this
>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>> functions order again.
>>>>>
>>>> Right, the only complete fix without losing functionality is to add an
>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>> allocation is necessary (i.e. no TLS is available).
>>> No. TLS may as well suffer from the same issue, since it is handled by
>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>> by chance today, but may as well stop working tomorrow. We use
>>> kernel-managed memory all the time, final point.
>> I think we are still in the solution finding process, no need for early
>> conclusions.
>>
>> See, we actually do not need kernel-managed storage for u_mode at all.
>> u_mode is an optimization, mostly for our fast user space mutexes. We
>> can indeed switch off all updates by the kernel and will still be able
>> to provide all required features - just less optimally. Adding a third
>> state, "invalid", we can make all mutex users assume they need the slow
>> syscall path on uncontended acquisition. And assert_nrt will probably be
>> happy about a syscall replacement for u_mode when it became invalid.
>>
>> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
>> without it) is entered during thread clean up with the help of a TSD
>> destructor. The destructor will then deregister our u_mode storage from
>> the kernel so that it doesn't matter if we release the memory
>> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
>> And in this model, it also doesn't matter when precisely the destructor
>> is called.
> 
> We have to add a syscall to propagate this value to kernel-space, and
> clutter the kernel-space code which uses u_mode with tests to see if
> u_mode is valid or not, and we have to clutter the code which uses
> u_mode in user-space to handle that invalid state. And every time we add
> a user of u_mode, we have to think about the invalid state. A lot of
> clutter.
> 
> The two last issues may be removed by handling the invalid state only in
> the function which returns the current mode. If the state is invalid,
> then issue the syscall. Admittedly, we get two syscalls for mutex locks,
> but who cares.
> 
> However, what for? Allocating u_mode in the process private sem_heap, as
> I have suggested since the beginning, looks so much simpler. No test, no
> special case, the address is always valid as long as the tcb is valid.

Try implementing it.

I will post a prototype for my approach within a minute. Its major
implementation advantage is that there is no need to touch any skin,
neither on user nor kernel side, and that there is no need for backward
compatible syscalls.

Another advantage of my approach is that it does not touch the fast
paths of mutex handling (before deregistration) - well, at least almost
for non-TLS, but absolutely not for TLS.
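
To make the idea concrete, here is a rough user-space sketch of how the
"invalid" state could be confined to the single helper that reads the
mode, so that callers never see it. Everything in it (the MODE_* values,
struct mode_slot, current_mode(), and the query callback standing in for
a syscall) is a hypothetical name chosen for illustration, not existing
Xenomai code:

```c
#include <stdint.h>

/* Hypothetical mode values, not Xenomai's real constants. */
#define MODE_PRIMARY   0u
#define MODE_SECONDARY 1u
#define MODE_INVALID   UINT32_MAX   /* the third state, i.e. u_mode == -1 */

/* Illustrative per-thread slot; query() stands in for a slow syscall. */
struct mode_slot {
	uint32_t u_mode;            /* written by the kernel while registered */
	uint32_t (*query)(void);    /* slow path: ask the kernel directly */
	unsigned int slow_calls;    /* counts slow-path hits, for the demo */
};

/* Fast path while u_mode is valid, slow path after deregistration. */
uint32_t current_mode(struct mode_slot *slot)
{
	if (slot->u_mode != MODE_INVALID)
		return slot->u_mode;
	slot->slow_calls++;
	return slot->query();
}

/* What a TSD destructor would do after deregistering with the kernel. */
void mode_slot_invalidate(struct mode_slot *slot)
{
	slot->u_mode = MODE_INVALID;
}

/* Stub used in place of a real syscall. */
uint32_t query_stub_secondary(void)
{
	return MODE_SECONDARY;
}
```

Before invalidation the helper adds one compare to the load; only a
thread that is already in cleanup pays for the extra call.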

> 
>>>> But I thought a bit more about a workaround for the existing ABI. We
>>>> basically need a way to free some memory as late as possible on thread
>>>> deletion. Even when leaving garbage collection that no one really wants
>>>> aside, there might be some semi-perfect user-space-only solution:
>>>>
>>>> pthread_key_create says that TSD destructors are re-run after the first
>>>> round if their key value is still non-NULL. So we could at least work
>>>> around the already rare case that some TSD destructor past ours tries to
>>>> access an RT mutex or otherwise migrates the thread to RT again. For
>>>> this, we just need a counter (next to the mode storages) for the round.
>>>> If we are in round #1, we would restore the key value again instead of
>>>> freeing it. On run #n < PTHREAD_DESTRUCTOR_ITERATIONS, we would finally
>>>> free it in the hope we are the last interested in it. This just requires
>>>> PTHREAD_DESTRUCTOR_ITERATIONS > 1, and that the application does not do
>>>> this ugly dance as well AND also performs Xenomai calls.
>>> I have thought of another simpler fix: we leak the u_mode when
>>> kernel-support is too old, and whine loudly about it. For the other case
>> Leaking is not nice, but I guess an application will crash sooner over
>> the bug than this leak becomes a reason for a failure.
>>
>>> (newer kernel-support with older user-space support), I was thinking
>>> about something else, which I still find complicated and far from
>>> perfect: handling the exit syscall by setting the u_mode pointer to NULL
>>> because we know at that time the u_mode pointer points to free memory.
>>>
>> That would reduce the probability of a crash, right. Probably the best
>> we can do for old user land. And I don't think it should take more than
>> two lines of code in the syscall dispatching path.
> 
> A bit more, we have to deal with the old user-space detection.

Nope, we can clear (more precisely: redirect) u_mode unconditionally.

> Noise would come from the kernel with this solution too.
> 

Yes, it remains a band-aid.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-05 11:21               ` Jan Kiszka
@ 2010-03-05 11:30                 ` Gilles Chanteperdrix
  2010-03-05 11:39                   ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-05 11:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>> Hi Gilles,
>>>>>>>>>
>>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>>
>>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>>>> this point.
>>>>>>>>>
>>>>>>>>> Does this make sense?
>>>>>>>> Yes, it is the issue we observed.
>>>>>>>>
>>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>>> in another TSD cleanup function, which could be called after the
>>>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>>>> primary mode and a write to the u_mode.
>>>>>>>>
>>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>>> syscall for deregistering the u_mode pointer...
>>>>>> That is the thing we did to verify that we had this bug. But this
>>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>>> functions order again.
>>>>>>
>>>>> Right, the only complete fix without losing functionality is to add an
>>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>>> allocation is necessary (i.e. no TLS is available).
>>>> No. TLS may as well suffer from the same issue, since it is handled by
>>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>>> by chance today, but may as well stop working tomorrow. We use
>>>> kernel-managed memory all the time, final point.
>>> I think we are still in the solution finding process, no need for early
>>> conclusions.
>>>
>>> See, we actually do not need kernel-managed storage for u_mode at all.
>>> u_mode is an optimization, mostly for our fast user space mutexes. We
>>> can indeed switch off all updates by the kernel and will still be able
>>> to provide all required features - just less optimally. Adding a third
>>> state, "invalid", we can make all mutex users assume they need the slow
>>> syscall path on uncontended acquisition. And assert_nrt will probably be
>>> happy about a syscall replacement for u_mode when it became invalid.
>>>
>>> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
>>> without it) is entered during thread clean up with the help of a TSD
>>> destructor. The destructor will then deregister our u_mode storage from
>>> the kernel so that it doesn't matter if we release the memory
>>> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
>>> And in this model, it also doesn't matter when precisely the destructor
>>> is called.
>> We have to add a syscall to propagate this value to kernel-space, and
>> clutter the kernel-space code which uses u_mode with tests to see if
>> u_mode is valid or not, and we have to clutter the code which uses
>> u_mode in user-space to handle that invalid state. And every time we add
>> a user of u_mode, we have to think about the invalid state. A lot of
>> clutter.
>>
>> The two last issues may be removed by handling the invalid state only in
>> the function which returns the current mode. If the state is invalid,
>> then issue the syscall. Admittedly, we get two syscalls for mutex locks,
>> but who cares.
>>
>> However, what for? Allocating u_mode in the process private sem_heap, as
>> I have suggested since the beginning, looks so much simpler. No test, no
>> special case, the address is always valid as long as the tcb is valid.
> 
> Try implementing it.
> 
> I will post a prototype for my approach within a minute. Its major
> implementation advantage is that there is no need to touch any skin,
> neither on user nor kernel side, and that there is no need for backward
> compatible syscalls.
> 
> Another advantage of my approach is that it does not touch the fast
> paths of mutex handling (before deregistration) - well, at least almost
> for non-TLS, but absolutely not for TLS.

Do not forget the kernel-space part which detects whether we are using
the older or newer user-space.

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-05 11:30                 ` Gilles Chanteperdrix
@ 2010-03-05 11:39                   ` Jan Kiszka
  2010-03-05 11:42                     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Kiszka @ 2010-03-05 11:39 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> Hi Gilles,
>>>>>>>>>>
>>>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>>>
>>>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>>>>> this point.
>>>>>>>>>>
>>>>>>>>>> Does this make sense?
>>>>>>>>> Yes, it is the issue we observed.
>>>>>>>>>
>>>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>>>> in another TSD cleanup function, which could be called after the
>>>>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>>>>> primary mode and a write to the u_mode.
>>>>>>>>>
>>>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>>>> syscall for deregistering the u_mode pointer...
>>>>>>> That is the thing we did to verify that we had this bug. But this
>>>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>>>> functions order again.
>>>>>>>
>>>>>> Right, the only complete fix without losing functionality is to add an
>>>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>>>> allocation is necessary (i.e. no TLS is available).
>>>>> No. TLS may as well suffer from the same issue, since it is handled by
>>>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>>>> by chance today, but may as well stop working tomorrow. We use
>>>>> kernel-managed memory all the time, final point.
>>>> I think we are still in the solution finding process, no need for early
>>>> conclusions.
>>>>
>>>> See, we actually do not need kernel-managed storage for u_mode at all.
>>>> u_mode is an optimization, mostly for our fast user space mutexes. We
>>>> can indeed switch off all updates by the kernel and will still be able
>>>> to provide all required features - just less optimally. Adding a third
>>>> state, "invalid", we can make all mutex users assume they need the slow
>>>> syscall path on uncontended acquisition. And assert_nrt will probably be
>>>> happy about a syscall replacement for u_mode when it became invalid.
>>>>
>>>> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
>>>> without it) is entered during thread clean up with the help of a TSD
>>>> destructor. The destructor will then deregister our u_mode storage from
>>>> the kernel so that it doesn't matter if we release the memory
>>>> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
>>>> And in this model, it also doesn't matter when precisely the destructor
>>>> is called.
>>> We have to add a syscall to propagate this value to kernel-space, and
>>> clutter the kernel-space code which uses u_mode with tests to see if
>>> u_mode is valid or not, and we have to clutter the code which uses
>>> u_mode in user-space to handle that invalid state. And every time we add
>>> a user of u_mode, we have to think about the invalid state. A lot of
>>> clutter.
>>>
>>> The two last issues may be removed by handling the invalid state only in
>>> the function which returns the current mode. If the state is invalid,
>>> then issue the syscall. Admittedly, we get two syscalls for mutex locks,
>>> but who cares.
>>>
>>> However, what for? Allocating u_mode in the process private sem_heap, as
>>> I have suggested since the beginning, looks so much simpler. No test, no
>>> special case, the address is always valid as long as the tcb is valid.
>> Try implementing it.
>>
>> I will post a prototype for my approach within a minute. Its major
>> implementation advantage is that there is no need to touch any skin,
>> neither on user nor kernel side, and that there is no need for backward
>> compatible syscalls.
>>
>> Another advantage of my approach is that it does not touch the fast
>> paths of mutex handling (before deregistration) - well, at least almost
>> for non-TLS, but absolutely not for TLS.
> 
> Do not forget the kernel-space part which detects whether we are using
> the older or newer user-space.

Not required (famous last words).

The only bit that should be missing in my RFC is the exit() trap,
probably a two-liner. Will look into this soon.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-05 11:39                   ` Jan Kiszka
@ 2010-03-05 11:42                     ` Gilles Chanteperdrix
  2010-03-05 11:45                       ` Jan Kiszka
  0 siblings, 1 reply; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-05 11:42 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>
>>>>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>>>>
>>>>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>>>>>> this point.
>>>>>>>>>>>
>>>>>>>>>>> Does this make sense?
>>>>>>>>>> Yes, it is the issue we observed.
>>>>>>>>>>
>>>>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>>>>> in another TSD cleanup function, which could be called after the
>>>>>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>>>>>> primary mode and a write to the u_mode.
>>>>>>>>>>
>>>>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>>>>> syscall for deregistering the u_mode pointer...
>>>>>>>> That is the thing we did to verify that we had this bug. But this
>>>>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>>>>> functions order again.
>>>>>>>>
>>>>>>> Right, the only complete fix without losing functionality is to add an
>>>>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>>>>> allocation is necessary (i.e. no TLS is available).
>>>>>> No. TLS may as well suffer from the same issue, since it is handled by
>>>>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>>>>> by chance today, but may as well stop working tomorrow. We use
>>>>>> kernel-managed memory all the time, final point.
>>>>> I think we are still in the solution finding process, no need for early
>>>>> conclusions.
>>>>>
>>>>> See, we actually do not need kernel-managed storage for u_mode at all.
>>>>> u_mode is an optimization, mostly for our fast user space mutexes. We
>>>>> can indeed switch off all updates by the kernel and will still be able
>>>>> to provide all required features - just less optimally. Adding a third
>>>>> state, "invalid", we can make all mutex users assume they need the slow
>>>>> syscall path on uncontended acquisition. And assert_nrt will probably be
>>>>> happy about a syscall replacement for u_mode when it became invalid.
>>>>>
>>>>> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
>>>>> without it) is entered during thread clean up with the help of a TSD
>>>>> destructor. The destructor will then deregister our u_mode storage from
>>>>> the kernel so that it doesn't matter if we release the memory
>>>>> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
>>>>> And in this model, it also doesn't matter when precisely the destructor
>>>>> is called.
>>>> We have to add a syscall to propagate this value to kernel-space, and
>>>> clutter the kernel-space code which uses u_mode with tests to see if
>>>> u_mode is valid or not, and we have to clutter the code which uses
>>>> u_mode in user-space to handle that invalid state. And every time we add
>>>> a user of u_mode, we have to think about the invalid state. A lot of
>>>> clutter.
>>>>
>>>> The two last issues may be removed by handling the invalid state only in
>>>> the function which returns the current mode. If the state is invalid,
>>>> then issue the syscall. Admittedly, we get two syscalls for mutex locks,
>>>> but who cares.
>>>>
>>>> However, what for? Allocating u_mode in the process private sem_heap, as
>>>> I have suggested since the beginning, looks so much simpler. No test, no
>>>> special case, the address is always valid as long as the tcb is valid.
>>> Try implementing it.
>>>
>>> I will post a prototype for my approach within a minute. Its major
>>> implementation advantage is that there is no need to touch any skin,
>>> neither on user nor kernel side, and that there is no need for backward
>>> compatible syscalls.
>>>
>>> Another advantage of my approach is that it does not touch the fast
>>> paths of mutex handling (before deregistration) - well, at least almost
>>> for non-TLS, but absolutely not for TLS.
>> Do not forget the kernel-space part which detects whether we are using
>> the older or newer user-space.
> 
> Not required (famous last words).
> 
> The only bit that should be missing in my RFC is the exit() trap,
> probably a two-liner. Will look into this soon.

We discussed it yesterday. The exit() trap does not guarantee that
the system will be working. So the warning is required.

-- 
					    Gilles.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-05 11:42                     ` Gilles Chanteperdrix
@ 2010-03-05 11:45                       ` Jan Kiszka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kiszka @ 2010-03-05 11:45 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Mauerer, Wolfgang, xenomai-core, Hillier, Gernot

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>>>>>
>>>>>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>>>>>>> this point.
>>>>>>>>>>>>
>>>>>>>>>>>> Does this make sense?
>>>>>>>>>>> Yes, it is the issue we observed.
>>>>>>>>>>>
>>>>>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>>>>>> in another TSD cleanup function, which could be called after the
>>>>>>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>>>>>>> primary mode and a write to the u_mode.
>>>>>>>>>>>
>>>>>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>>>>>> syscall for deregistering the u_mode pointer...
>>>>>>>>> That is the thing we did to verify that we had this bug. But this
>>>>>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>>>>>> functions order again.
>>>>>>>>>
>>>>>>>> Right, the only complete fix without losing functionality is to add an
>>>>>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>>>>>> allocation is necessary (i.e. no TLS is available).
>>>>>>> No. TLS may as well suffer from the same issue, since it is handled by
>>>>>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>>>>>> by chance today, but may as well stop working tomorrow. We use
>>>>>>> kernel-managed memory all the time, final point.
>>>>>> I think we are still in the solution finding process, no need for early
>>>>>> conclusions.
>>>>>>
>>>>>> See, we actually do not need kernel-managed storage for u_mode at all.
>>>>>> u_mode is an optimization, mostly for our fast user space mutexes. We
>>>>>> can indeed switch off all updates by the kernel and will still be able
>>>>>> to provide all required features - just less optimally. Adding a third
>>>>>> state, "invalid", we can make all mutex users assume they need the slow
>>>>>> syscall path on uncontended acquisition. And assert_nrt will probably be
>>>>>> happy about a syscall replacement for u_mode when it became invalid.
>>>>>>
>>>>>> This invalid state (maybe u_mode == -1 with TLS, and mode_key == NULL
>>>>>> without it) is entered during thread clean up with the help of a TSD
>>>>>> destructor. The destructor will then deregister our u_mode storage from
>>>>>> the kernel so that it doesn't matter if we release the memory
>>>>>> immediately and explicitly (w/o TLS) or leave this to glibc (w/ TLS).
>>>>>> And in this model, it also doesn't matter when precisely the destructor
>>>>>> is called.
>>>>> We have to add a syscall to propagate this value to kernel-space, and
>>>>> clutter the kernel-space code which uses u_mode with tests to see if
>>>>> u_mode is valid or not, and we have to clutter the code which uses
>>>>> u_mode in user-space to handle that invalid state. And every time we add
>>>>> a user of u_mode, we have to think about the invalid state. A lot of
>>>>> clutter.
>>>>>
>>>>> The two last issues may be removed by handling the invalid state only in
>>>>> the function which returns the current mode. If the state is invalid,
>>>>> then issue the syscall. Admittedly, we get two syscalls for mutex locks,
>>>>> but who cares.
>>>>>
>>>>> However, what for? Allocating u_mode in the process private sem_heap, as
>>>>> I have suggested since the beginning, looks so much simpler. No test, no
>>>>> special case, the address is always valid as long as the tcb is valid.
>>>> Try implementing it.
>>>>
>>>> I will post a prototype for my approach within a minute. Its major
>>>> implementation advantage is that there is no need to touch any skin,
>>>> neither on user nor kernel side, and that there is no need for backward
>>>> compatible syscalls.
>>>>
>>>> Another advantage of my approach is that it does not touch the fast
>>>> paths of mutex handling (before deregistration) - well, at least almost
>>>> for non-TLS, but absolutely not for TLS.
>>> Do not forget the kernel-space part which detects whether we are using
>>> the older or newer user-space.
>> Not required (famous last words).
>>
>> The only bit that should be missing in my RFC is the exit() trap,
>> probably a two-liner. Will look into this soon.
> 
> We discussed it yesterday. The exit() trap does not guarantee that
> the system will be working. So the warning is required.

Right, four lines:

if (syscall == exit) {
	WARN_ON_ONCE(cur->u_mode != &cur->u_mode_dump);
	cur->u_mode = &cur->u_mode_dump;
}
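
Spelled out as a self-contained sketch that also builds in user space:
struct thread_cb and redirect_u_mode_on_exit() are hypothetical names,
and the WARN_ON_ONCE is replaced by a return value so that nothing
kernel-specific is needed. The caller would emit the one-time warning
when the function returns true:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread control block. */
struct thread_cb {
	unsigned long *u_mode;      /* where mode updates are written */
	unsigned long u_mode_dump;  /* kernel-owned dummy target */
};

/*
 * Called when the exit syscall is seen. Points u_mode at the dummy
 * word so that later mode updates can no longer hit freed user memory.
 * Returns true if user space had not deregistered its storage before.
 */
bool redirect_u_mode_on_exit(struct thread_cb *cur)
{
	bool still_registered = cur->u_mode != &cur->u_mode_dump;

	cur->u_mode = &cur->u_mode_dump;
	return still_registered;
}
```

The redirect is unconditional, so it needs no knowledge of which
user-space version is running; only the warning depends on that state.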

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-04 20:25           ` Jan Kiszka
  2010-03-04 20:42             ` Gilles Chanteperdrix
@ 2010-03-05 11:08             ` Wolfgang Mauerer
  2010-03-05 11:29               ` Gilles Chanteperdrix
  1 sibling, 1 reply; 15+ messages in thread
From: Wolfgang Mauerer @ 2010-03-05 11:08 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core, Hillier, Gernot

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Hi Gilles,
>>>>>>>
>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>
>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>> this point.
>>>>>>>
>>>>>>> Does this make sense?
>>>>>> Yes, it is the issue we observed.
>>>>>>
>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>> releasing the mode storage. Could you check this?
>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>> in another TSD cleanup function, which could be called after the
>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>> primary mode and a write to the u_mode.
>>>>>>
>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>> syscall for deregistering the u_mode pointer...
>>>> That is the thing we did to verify that we had this bug. But this
>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>> functions order again.
>>>>
>>> Right, the only complete fix without losing functionality is to add an
>>> option to our ABI for requesting kernel-managed memory if dynamic
>>> allocation is necessary (i.e. no TLS is available).
>> No. TLS may as well suffer from the same issue, since it is handled by
>> the glibc or libgcc, over which we have no control. So yes, it may work
>> by chance today, but may as well stop working tomorrow. We use
>> kernel-managed memory all the time, final point.
> 
> I think we are still in the solution finding process, no need for early
> conclusions.
> 
> See, we actually do not need kernel-managed storage for u_mode at all.
> u_mode is an optimization, mostly for our fast user space mutexes. We
> can indeed switch off all updates by the kernel and will still be able
> to provide all required features - just less optimally. Adding a third
> state, "invalid", we can make all mutex users assume they need the slow
> syscall path on uncontended acquisition. And assert_nrt will probably be
> happy about a syscall replacement for u_mode when it became invalid.

Thinking about the "fast" part in "fast userspace mutex": Would it be an
argument in favour of not using the global semaphore heap that said
memory is uncached on some architectures? Or is that irrelevant?

Regards,

Wolfgang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Xenomai-core] Potential heap corruption on thread cleanup
  2010-03-05 11:08             ` Wolfgang Mauerer
@ 2010-03-05 11:29               ` Gilles Chanteperdrix
  0 siblings, 0 replies; 15+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-05 11:29 UTC (permalink / raw)
  To: Wolfgang Mauerer; +Cc: Jan Kiszka, xenomai-core, Hillier, Gernot

Wolfgang Mauerer wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi Gilles,
>>>>>>>>
>>>>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>>>>
>>>>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>>>>> are forced to use), it is automatically freed on thread termination in
>>>>>>>> the context of the dying thread. If the thread is already migrated to
>>>>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>>>>> storage "silently", without any migration, the final exit will trigger
>>>>>>>> one further access. And that takes place against an invalid heap area at
>>>>>>>> this point.
>>>>>>>>
>>>>>>>> Does this make sense?
>>>>>>> Yes, it is the issue we observed.
>>>>>>>
>>>>>>>> If that is true, all we need to do is to force a migration before
>>>>>>>> releasing the mode storage. Could you check this?
>>>>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>>>>> in another TSD cleanup function, which could run after the
>>>>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>>>>> primary mode and a write to the u_mode.
>>>>>>>
>>>>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>>>>> syscall for deregistering the u_mode pointer...
>>>>> That is the thing we did to verify that we had this bug. But this
>>>>> syscall would be also called too soon, and suffers from the TSD cleanup
>>>>> functions order again.
>>>>>
>>>> Right, the only complete fix without losing functionality is to add an
>>>> option to our ABI for requesting kernel-managed memory if dynamic
>>>> allocation is necessary (i.e. no TLS is available).
>>> No. TLS may as well suffer from the same issue, since it is handled by
>>> the glibc or libgcc, over which we have no control. So yes, it may work
>>> by chance today, but may as well stop working tomorrow. We use
>>> kernel-managed memory all the time, final point.
>> I think we are still in the solution finding process, no need for early
>> conclusions.
>>
>> See, we actually do not need kernel-managed storage for u_mode at all.
>> u_mode is an optimization, mostly for our fast user space mutexes. We
>> can indeed switch off all updates by the kernel and will still be able
>> to provide all required features - just less optimally. Adding a third
>> state, "invalid", we can make all mutex users assume they need the slow
>> syscall path on uncontended acquisition. And assert_nrt will probably be
>> happy about a syscall replacement for u_mode when it became invalid.
> 
> Thinking about the "fast" part in "fast userspace mutex": Would it be an
> argument in favour of not using the global semaphore heap that said
> memory is uncached on some architectures? Or is that irrelevant?

Several answers:
- whether memory is cached is mostly irrelevant for real-time: the worst
case remains the same. Of course, it is a bit less irrelevant when we are
trying to free up time to run a general-purpose OS as the idle task.
- we are talking about ARM, right? When not using FCSE, the cache is
small and the kernel flushes it at every context switch, so the chances
of getting a cache miss are in fact pretty high.
- on ARM, the cost of a system call is still way higher than two RAM
accesses.
- simpler solutions usually yield fewer bugs, as shown by the bug we are
discussing.
- according to the comments we read on the mailing list about 2.5.1
stability, having something which works correctly is kind of urgent.
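
To make the trade-off concrete, here is a minimal sketch of the fast-path
decision with the third "invalid" state from the quoted proposal. The enum
values and the function name are hypothetical, made up for illustration;
they are not the actual Xenomai definitions, and the real mutex code does
more than this.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the actual Xenomai definitions. */
enum u_mode_state {
	U_MODE_PRIMARY,   /* thread runs in primary (real-time) mode */
	U_MODE_SECONDARY, /* thread is relaxed, under Linux control */
	U_MODE_INVALID    /* mode storage released, current mode unknown */
};

/*
 * Return true if an uncontended acquisition may stay in user space.
 * Anything but a known primary mode forces the slow syscall path.
 */
static bool mutex_fast_path_ok(enum u_mode_state mode, bool uncontended)
{
	return uncontended && mode == U_MODE_PRIMARY;
}
```

With kernel-managed storage the mode word would presumably stay valid for
the whole thread lifetime, so the third state and its extra syscalls would
not be needed.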

-- 
					    Gilles.



