From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Kai Bollue <mlist1@bollue.de>
Cc: Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] native/heap: "removing non-linked element"
Date: Thu, 09 May 2013 18:31:59 +0200
Message-ID: <518BCF7F.3000407@xenomai.org>
In-Reply-To: <518BCB32.6060007@xenomai.org>
On 05/09/2013 06:13 PM, Gilles Chanteperdrix wrote:
> On 05/02/2013 08:45 PM, Kai Bollue wrote:
>
>> Hello,
>>
>> we experience a crash upon unbinding of a previously deleted (and
>> cleaned up) shared heap.
>> Scheme:
>> - Process A calls rt_heap_create() (with H_SHARED flag), waits for some
>> time and then terminates.
>> - Process B calls rt_heap_bind() on that heap, uses it and calls
>> rt_heap_unbind() (or terminates) after process A has terminated.
>>
>> Then the system crashes after the output of "Xenomai: removing
>> non-linked element, holder=ffffc900125e4940, qslot=ffff880427aa90f8 at
>> kernel/xenomai/skins/native/heap.c:374".
>>
>> The crash does not always happen, but can quite reliably be reproduced
>> by starting process A in a loop from bash (while [ TRUE ]; do ...) and
>> keeping process B running.
>>
>> Two aspects seem to be crucial:
>> - Calling rt_heap_delete() in process A is not sufficient to reproduce
>> the problem, the process has to terminate (the cleaning up seems to be
>> relevant).
>> - We could only reproduce the crash as long as process B accessed the
>> heap after process A had terminated (e.g. using memcpy).
>>
>> As a workaround, one could try to avoid accessing a deleted heap,
>> but it is not always possible to detect the termination of process A in
>> time in such a setup.
>>
>> The system:
>> - AMD AM3 FX-8350
>> - Debian 6.0
>> - Kernel 3.5.7
>> - Xenomai 2.6.2.1
>>
>> We also tested this on an older system (Xenomai 2.6.0, Kernel 2.6.37):
>> Here, both processes hung indefinitely and could not be killed, but the
>> system did not crash.
>>
>> Any hints are appreciated.
>>
>> Attachments:
>> - Console output
>> - Code of process A
>> - Code of process B
>
>
> Hi Kai,
>
> thank you very much for your test case; it allowed me to reproduce the
> issue and to try to understand what happens.
>
> From what I understand, processA creates the shared heap, which is added
> to the list of the objects it holds (xeno_get_rholder()). When processA
> dies, the heap is removed from the list, but not destroyed because it is
> also bound to processB.
>
> Then processB unbinds the heap, which triggers an auto-destruction,
> which tries to remove the heap from processA's list again. If processA's
> control block has not been re-used, this works, because the list is
> still there. If processA has been re-launched, the control block has been
> reinitialized, as well as the list, so removing the element from the
> list fails.
>
> I see several possible corrections:
> - make rt_heap_delete return an error (EBUSY for instance) when the heap
> is currently bound to another process, while still unmapping it from
> the current process. This will cause __xeno_flush_rq to move the heap to
> the "global" resource holder, where it can safely be deleted later
> - put any rt_heap with the H_MAPPABLE flag directly on the global
> resource holder, as it is a global object anyway. This means that when
> a process which created a mappable heap dies, the heap survives, but
> this is maybe what should be expected from shareable heaps.
- or remove the rt_heap from the list directly in rt_heap_delete, it
does not seem to make sense to keep it in the list after it has been
deleted: it will be automatically deleted when the last process bound to
it unbinds it anyway.
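
To make the last option concrete, here is a toy model of it in plain C.
It is only a sketch under assumptions: struct holder, struct queue,
struct heap and every function below are hypothetical stand-ins, not the
nucleus queue types or the native skin code. The idea shown is that
delete unlinks the heap from the creator's resource list right away, so
the final teardown triggered by the last unbind never touches a list
that may have been reinitialized in the meantime.

```c
#include <assert.h>

/* Toy circular doubly linked list (hypothetical, not the Xenomai types). */
struct holder { struct holder *next, *prev; };
struct queue { struct holder head; };

static void queue_init(struct queue *q) { q->head.next = q->head.prev = &q->head; }
static void holder_init(struct holder *h) { h->next = h->prev = h; }
static int holder_linked(const struct holder *h) { return h->next != h; }
static int queue_empty(const struct queue *q) { return q->head.next == &q->head; }

static void queue_append(struct queue *q, struct holder *h)
{
    h->prev = q->head.prev;
    h->next = &q->head;
    q->head.prev->next = h;
    q->head.prev = h;
}

/* Unlink h; returns -1 instead of touching the list if h is not linked. */
static int holder_unlink(struct holder *h)
{
    if (!holder_linked(h))
        return -1;
    h->prev->next = h->next;
    h->next->prev = h->prev;
    holder_init(h);
    return 0;
}

struct heap {
    struct holder rlink; /* link in the creator's resource list */
    int refs;            /* creator plus bound processes */
    int destroyed;       /* set once the last reference is dropped */
};

static void heap_create(struct heap *h, struct queue *owner_rq)
{
    holder_init(&h->rlink);
    h->refs = 1;
    h->destroyed = 0;
    queue_append(owner_rq, &h->rlink);
}

static void heap_bind(struct heap *h) { h->refs++; }

/* Drop one reference; the final teardown does not access any list. */
static void heap_put(struct heap *h)
{
    assert(h->refs > 0);
    if (--h->refs == 0)
        h->destroyed = 1;
}

static void heap_unbind(struct heap *h) { heap_put(h); }

/* Delete: unlink from the creator's list immediately, then drop its ref. */
static int heap_delete(struct heap *h)
{
    int ret = holder_unlink(&h->rlink);
    if (ret)
        return ret; /* already deleted */
    heap_put(h);
    return 0;
}
```

In this model, the sequence create, bind, delete, queue_init() on the
creator's list (standing in for processA being re-launched), then unbind
ends with the heap destroyed and no second removal attempted.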
--
Gilles.