From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Kai Bollue <mlist1@bollue.de>
Cc: Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] native/heap: "removing non-linked element"
Date: Thu, 09 May 2013 18:13:38 +0200
Message-ID: <518BCB32.6060007@xenomai.org>
In-Reply-To: <5182B42F.9090907@bollue.de>

On 05/02/2013 08:45 PM, Kai Bollue wrote:
> Hello,
>
> we experience a crash upon unbinding of a previously deleted (and
> cleaned up) shared heap.
> Scheme:
> - Process A calls rt_heap_create() (with H_SHARED flag), waits for some
> time and then terminates.
> - Process B calls rt_heap_bind() on that heap, uses it and calls
> rt_heap_unbind() (or terminates) after process A has terminated.
>
> Then the system crashes after the output of "Xenomai: removing
> non-linked element, holder=ffffc900125e4940, qslot=ffff880427aa90f8 at
> kernel/xenomai/skins/native/heap.c:374".
>
> The crash does not always happen, but can quite reliably be reproduced
> by starting process A in a loop from bash (while [ TRUE ]; do ...) and
> keeping process B running.
>
> Two aspects seem to be crucial:
> - Calling rt_heap_delete() in process A is not sufficient to reproduce
> the problem, the process has to terminate (the cleaning up seems to be
> relevant).
> - We could only reproduce the crash as long as process B accessed the
> heap after process A had terminated (e.g. using memcpy).
>
> As a workaround, it could be tried to avoid access to a deleted heap,
> but it is not always possible to detect the termination of process A on
> time in such a constellation.
>
> The system:
> - AMD AM3 FX-8350
> - Debian 6.0
> - Kernel 3.5.7
> - Xenomai 2.6.2.1
>
> We also tested this on an older system (Xenomai 2.6.0, Kernel 2.6.37):
> Here, both processes hung indefinitely and could not be killed, but the
> system did not crash.
>
> Any hints are appreciated.
>
> Attachments:
> - Console output
> - Code of process A
> - Code of process B

Hi Kai,

thank you very much for your test case. It allowed me to reproduce the
issue and to try to understand what happens.

From what I understand, processA creates the shared heap, which is added
to the list of the objects it holds (xeno_get_rholder()). When processA
dies, the heap is removed from that list, but not destroyed, because it
is also bound to processB.

Then processB unbinds the heap, which triggers an auto-destruction,
which tries to remove the heap from processA's list again. If processA's
control block has not been re-used, this works, because the list is
still there. If processA has been re-launched, the control block has
been reinitialized, as well as the list, so removing the element from
the list fails.
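
To illustrate the failure mode described above, here is a minimal
user-space sketch. It is only a toy model: struct holder, struct queue
and simulate_unbind() are hypothetical names, not the Xenomai types or
functions, and the model only covers the reinitialization of the list
head, not the earlier removal at process exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a circular doubly linked queue and its links. */
struct holder {
    struct holder *next, *prev;
};

struct queue {
    struct holder head;
    int count;
};

void queue_init(struct queue *q)
{
    q->head.next = q->head.prev = &q->head;
    q->count = 0;
}

void holder_init(struct holder *h)
{
    h->next = h->prev = h;
}

void queue_append(struct queue *q, struct holder *h)
{
    assert(h->next == h); /* the holder must not be linked anywhere yet */
    h->prev = q->head.prev;
    h->next = &q->head;
    q->head.prev->next = h;
    q->head.prev = h;
    q->count++;
}

/* Returns 0 on success, -1 if h is not reachable from q ("non-linked"). */
int queue_remove(struct queue *q, struct holder *h)
{
    struct holder *p;

    for (p = q->head.next; p != &q->head; p = p->next) {
        if (p == h) {
            h->prev->next = h->next;
            h->next->prev = h->prev;
            holder_init(h);
            q->count--;
            return 0;
        }
    }
    return -1;
}

/*
 * The heap link is queued on the creator's resource list. If the creator's
 * control block is re-used before the last unbind, the list head is
 * reinitialized and no longer reaches the heap link.
 */
int simulate_unbind(bool owner_relaunched)
{
    struct queue owner_rq;   /* resource list in the creator's control block */
    struct holder heap_link; /* link embedded in the heap object */

    queue_init(&owner_rq);
    holder_init(&heap_link);
    queue_append(&owner_rq, &heap_link);

    if (owner_relaunched)
        queue_init(&owner_rq); /* control block re-used: list wiped */

    return queue_remove(&owner_rq, &heap_link);
}
```

In this model, simulate_unbind(false) returns 0 and
simulate_unbind(true) returns -1, which corresponds to the case where
the removal is reported as operating on a non-linked element.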

I see several possible corrections:
- get rt_heap_delete to return an error (EBUSY for instance) when the
heap is currently bound to another process, while still unmapping it
from the current process. This will cause __xeno_flush_rq to move the
heap to the "global" resource holder, where it can safely be deleted
later;
- put any rt_heap with the H_MAPPABLE flag directly on the global
resource holder, as it is a global object anyway. This means that when
a process which created a mappable heap dies, the heap survives, but
this is maybe what should be expected from shareable heaps.
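
The first option can be sketched as follows. This is a simplified
user-space model under stated assumptions, not a patch: struct toy_heap,
toy_heap_delete(), toy_flush_on_exit() and toy_heap_unbind() are
hypothetical names, and the ownership transfer stands in for what
__xeno_flush_rq would do with a resource it cannot delete.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Which resource holder currently tracks the heap (hypothetical model). */
enum toy_owner { TOY_OWNER_PROCESS, TOY_OWNER_GLOBAL, TOY_OWNER_NONE };

struct toy_heap {
    int bindings;           /* other processes currently bound */
    bool mapped_by_creator; /* creator still has the heap mapped */
    enum toy_owner owner;
};

/* Deletion refuses to destroy a bound heap, but still unmaps the creator. */
int toy_heap_delete(struct toy_heap *h)
{
    assert(h != NULL);
    h->mapped_by_creator = false;
    if (h->bindings > 0)
        return -EBUSY;
    h->owner = TOY_OWNER_NONE; /* really destroyed */
    return 0;
}

/* Creator exit: a heap that could not be deleted moves to the global holder. */
void toy_flush_on_exit(struct toy_heap *h)
{
    if (toy_heap_delete(h) == -EBUSY)
        h->owner = TOY_OWNER_GLOBAL;
}

/* Last unbind destroys the heap through the global holder only. */
int toy_heap_unbind(struct toy_heap *h)
{
    assert(h != NULL);
    if (h->bindings <= 0)
        return -EINVAL;
    h->bindings--;
    if (h->bindings == 0 && h->owner == TOY_OWNER_GLOBAL)
        h->owner = TOY_OWNER_NONE;
    return 0;
}
```

The point of the sketch is that the final unbind never touches the
creator's per-process list, so it does not matter whether the creator's
control block has been re-used in the meantime.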
Regards.
--
Gilles.