From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: Wolfgang Mauerer <wolfgang.mauerer@domain.hid>,
	Jan Kiszka <jan.kiszka@domain.hid>,
	xenomai-core <xenomai@xenomai.org>,
	Gernot Hillier <gernot.hillier@domain.hid>
Subject: Re: [Xenomai-core] Potential heap corruption on thread cleanup
Date: Thu, 04 Mar 2010 19:36:48 +0100
Message-ID: <4B8FFDC0.4080903@domain.hid>
In-Reply-To: <4B8FFBCD.9020305@domain.hid>

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Hi Gilles,
>>>>>
>>>>> I'm pushing your findings to the list, also as my colleagues showed
>>>>> strong interest - this thing may explain rare corruptions for us as well.
>>>>>
>>>>> I thought a bit about that likely u_mode-related crash in your test case
>>>>> and have the following theory so far: If the xeno_current_mode storage
>>>>> is allocated on the application heap (!HAVE_THREAD, that's also what we
>>>>> are forced to use), it is automatically freed on thread termination in
>>>>> the context of the dying thread. If the thread is already migrated to
>>>>> secondary or if that happens while it is cleaned up (i.e. before calling
>>>>> for exit into the kernel), there is no problem, Xenomai will not touch
>>>>> the mode storage anymore. But if the thread happens to delete the
>>>>> storage "silently", without any migration, the final exit will trigger
>>>>> one further access. And that takes place against an invalid heap area at
>>>>> this point.
>>>>>
>>>>> Does this make sense?
>>>> Yes, it is the issue we observed.
>>>>
>>>>> If that is true, all we need to do is to force a migration before
>>>>> releasing the mode storage. Could you check this?
>>>> No, that does not fly. Calling, for instance, __wrap_pthread_mutex_lock
>>>> in another TSD cleanup function, which could run after the
>>>> current_mode TSD cleanup, is allowed and could trigger a switch to
>>>> primary mode and a write to the u_mode.
>>>>
>>> Good point. Mmh. Another, but ABI-breaking, way would be to add a
>>> syscall for deregistering the u_mode pointer...
>> That is what we did to verify that we had this bug. But this
>> syscall would also be called too soon, and would suffer from the TSD
>> cleanup function ordering problem again.
>>
> 
> Right, the only complete fix without losing functionality is to add an
> option to our ABI for requesting kernel-managed memory if dynamic
> allocation is necessary (i.e. no TLS is available).

No. TLS may well suffer from the same issue, since it is handled by
glibc or libgcc, over which we have no control. So yes, it may work
by chance today, but it may just as well stop working tomorrow. We use
kernel-managed memory all the time, period.

> 
> But I thought a bit more about a workaround for the existing ABI. We
> basically need a way to free some memory as late as possible on thread
> deletion. Even leaving aside garbage collection, which no one really
> wants, there might be some semi-perfect user-space-only solution:
> 
> pthread_key_create says that TSD destructors are re-run after the first
> round if their key value is still non-NULL. So we could at least work
> around the already rare case that some TSD destructor running after ours
> tries to access an RT mutex or otherwise migrates the thread to RT again.
> For this, we just need a round counter (next to the mode storage).
> If we are in round #1, we would restore the key value again instead of
> freeing the storage. In round #n < PTHREAD_DESTRUCTOR_ITERATIONS, we would
> finally free it in the hope that we are the last one interested in it.
> This just requires PTHREAD_DESTRUCTOR_ITERATIONS > 1, and that the
> application does not do this ugly dance as well AND also perform Xenomai
> calls.
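
For reference, the re-arming scheme quoted above could look roughly like
the sketch below. This is only an illustration under assumptions, not
Xenomai code: struct mode_slot, mode_key, FREE_ROUND and
run_rearm_demo() are made-up names, and the storage is freed in round 2
for simplicity. It relies on the POSIX rule that a destructor is called
again in a later round if it leaves its key non-NULL.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical: the destructor round in which the storage is finally freed. */
#define FREE_ROUND 2

#if defined(PTHREAD_DESTRUCTOR_ITERATIONS) && PTHREAD_DESTRUCTOR_ITERATIONS < FREE_ROUND
#error "this sketch needs at least FREE_ROUND destructor iterations"
#endif

/* Hypothetical stand-in for the mode storage plus the round counter. */
struct mode_slot {
	unsigned long mode;	/* stand-in for the u_mode word */
	int round;		/* destructor rounds seen so far */
};

static pthread_key_t mode_key;
static int rounds_seen;		/* demo bookkeeping, read after join */
static int storage_freed;	/* demo bookkeeping, read after join */

static void mode_slot_dtor(void *p)
{
	struct mode_slot *slot = p;

	assert(slot != NULL);	/* destructors only run for non-NULL values */
	rounds_seen = ++slot->round;

	if (slot->round < FREE_ROUND) {
		/* Early round: re-arm the key so that we are called again
		 * after the other destructors of this round have run. */
		pthread_setspecific(mode_key, slot);
		return;
	}

	/* Last round we care about: hope nobody needs the storage anymore. */
	free(slot);
	storage_freed = 1;
}

static void *demo_thread(void *arg)
{
	struct mode_slot *slot = calloc(1, sizeof(*slot));

	(void)arg;
	if (slot)
		pthread_setspecific(mode_key, slot);
	return NULL;		/* thread exit triggers the destructor rounds */
}

/* Returns 0 on success, -1 if a pthread call failed. */
static int run_rearm_demo(void)
{
	pthread_t tid;

	if (pthread_key_create(&mode_key, mode_slot_dtor))
		return -1;
	if (pthread_create(&tid, NULL, demo_thread, NULL))
		return -1;
	return pthread_join(tid, NULL) ? -1 : 0;
}
```

After run_rearm_demo() returns, rounds_seen should equal FREE_ROUND and
storage_freed should be set. Note that this only narrows the window: a
destructor that runs after ours in the final round can still touch the
freed storage.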

I have thought of another, simpler fix: we leak the u_mode when
kernel support is too old, and whine loudly about it. For the other case
(newer kernel support with older user-space support), I was thinking
about something else, which I still find complicated and far from
perfect: handling the exit syscall by setting the u_mode pointer to NULL,
because we know that at that time the u_mode pointer points to freed
memory.
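
To make the two ideas above a bit more concrete, here is a rough
user-space sketch. Everything in it is an assumption made for
illustration: u_mode_key, u_mode_dtor(), struct thread_model and the
model_*() helpers are made-up names, and the "exit syscall" part is only
a plain C model of what the kernel side would do, not kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_key_t u_mode_key;
static pthread_once_t warn_once = PTHREAD_ONCE_INIT;
static atomic_int leaked_blocks;	/* demo bookkeeping only */
static void *_Atomic last_leaked;	/* demo bookkeeping only */

static void warn_about_leak(void)
{
	fprintf(stderr, "warning: kernel support too old, "
			"leaking per-thread u_mode storage\n");
}

/* Idea 1: with old kernel support, never free the storage, since the
 * kernel may still write to it on the final exit. Whine once instead. */
static void u_mode_dtor(void *p)
{
	assert(p != NULL);	/* destructors only run for non-NULL values */
	pthread_once(&warn_once, warn_about_leak);
	atomic_store(&last_leaked, p);	/* remembered, deliberately not freed */
	atomic_fetch_add(&leaked_blocks, 1);
}

static void *old_abi_thread(void *arg)
{
	unsigned long *u_mode = calloc(1, sizeof(*u_mode));

	(void)arg;
	if (u_mode)
		pthread_setspecific(u_mode_key, u_mode);
	return NULL;
}

/* Returns 0 on success, -1 if a pthread call failed. */
static int run_leak_demo(void)
{
	pthread_t tid;

	if (pthread_key_create(&u_mode_key, u_mode_dtor))
		return -1;
	if (pthread_create(&tid, NULL, old_abi_thread, NULL))
		return -1;
	return pthread_join(tid, NULL) ? -1 : 0;
}

/* Idea 2, modelled in plain C: the kernel-side thread object forgets the
 * user pointer once the exit syscall is seen. */
struct thread_model {
	unsigned long *u_mode;	/* user-space mode word, or NULL */
};

static void model_exit_syscall(struct thread_model *t)
{
	t->u_mode = NULL;	/* the memory behind it may already be freed */
}

/* Returns 1 if the mode word was written, 0 if the write was skipped. */
static int model_set_mode(struct thread_model *t, unsigned long mode)
{
	if (!t->u_mode)
		return 0;
	*t->u_mode = mode;
	return 1;
}
```

The first part trades a small per-thread leak for safety; the second
part only helps if every later mode update checks the pointer before
writing, which is what model_set_mode() stands for.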

-- 
					    Gilles.

