* [Xenomai-help] rt_queue_write return 1, with no receiver
@ 2009-03-10 0:27 Mark Saiia
2009-03-10 9:37 ` Philippe Gerum
0 siblings, 1 reply; 18+ messages in thread
From: Mark Saiia @ 2009-03-10 0:27 UTC (permalink / raw)
To: xenomai
Hello all,
Running Xenomai 2.4.7 w/ linux 2.6.27.19 on Geode GX1.
We are writing to a queue. This write is returning a 1, and we
believe that there is no corresponding read pending. Are the API docs
correct in regards to the return value of rt_queue_write? Has anyone
seen anything related to this issue?
Thanks,
Mark
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 0:27 [Xenomai-help] rt_queue_write return 1, with no receiver Mark Saiia
@ 2009-03-10 9:37 ` Philippe Gerum
2009-03-10 18:21 ` Steven Seeger
2009-03-10 18:57 ` Steven Seeger
0 siblings, 2 replies; 18+ messages in thread
From: Philippe Gerum @ 2009-03-10 9:37 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai
Mark Saiia wrote:
> Hello all,
>
> Running Xenomai 2.4.7 w/ linux 2.6.27.19 on Geode GX1.
>
> We are writing to a queue. This write is returning a 1, and we
> believe that there is no corresponding read pending. Are the API docs
> correct in regards to the return value of rt_queue_write?
Yes, the docs are correct, and the code looks sane as well. You may want to
double-check your findings using rt_queue_inquire() before calling
rt_queue_write(), even if this won't be 100% reliable in case your reader is
polling the queue.
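A minimal sketch of that cross-check, written in plain C so it builds without the Xenomai headers. It assumes the caller has already obtained the waiter count from the nwaiters field filled in by rt_queue_inquire() and the return value of rt_queue_write(), which on success is the number of receivers woken up. The helper classify_write() and the enum are hypothetical names, not part of the native API.

```c
/* Hypothetical verdicts for one inquire-then-write pair. */
enum write_verdict {
    WRITE_ERROR,       /* negative return: an error code came back */
    WRITE_ENQUEUED,    /* 0: no receiver was woken, message was queued */
    WRITE_CONSISTENT,  /* woke no more readers than were seen waiting */
    WRITE_SUSPICIOUS   /* woke more readers than were seen waiting */
};

/*
 * nwaiters_before: waiter count sampled just before the write.
 * write_ret:       value returned by the write call.
 */
enum write_verdict classify_write(int nwaiters_before, int write_ret)
{
    if (write_ret < 0)
        return WRITE_ERROR;
    if (write_ret == 0)
        return WRITE_ENQUEUED;
    if (write_ret <= nwaiters_before)
        return WRITE_CONSISTENT;
    return WRITE_SUSPICIOUS;
}
```

WRITE_SUSPICIOUS is only a hint: a reader can start waiting between the two calls, which is why the check cannot be fully reliable.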
> Has anyone
> seen anything related to this issue?
>
> Thanks,
>
> Mark
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 9:37 ` Philippe Gerum
@ 2009-03-10 18:21 ` Steven Seeger
2009-03-10 18:57 ` Steven Seeger
1 sibling, 0 replies; 18+ messages in thread
From: Steven Seeger @ 2009-03-10 18:21 UTC (permalink / raw)
To: rpm; +Cc: xenomai
The reader waits forever (TM_INFINITE) on the queue. We used values
from rt_timer_read() to timestamp some logging messages that go out
another queue to confirm that the reader thread is busy and not
reading from the queue. We get a log at time x1 before we do the
write, and then another time value at x2 before we do the read. The
higher priority thread is doing the writing.
Hopefully it's a problem with our testing.
Steven
On Mar 10, 2009, at 5:37 AM, Philippe Gerum wrote:
> Mark Saiia wrote:
>> Hello all,
>>
>> Running Xenomai 2.4.7 w/ linux 2.6.27.19 on Geode GX1.
>>
>> We are writing to a queue. This write is returning a 1, and we
>> believe that there is no corresponding read pending. Are the API
>> docs
>> correct in regards to the return value of rt_queue_write?
>
> Yes, the docs are correct, and the code looks sane as well. You may
> want to
> double-check your findings using rt_queue_inquire() before calling
> rt_queue_write(), even if this won't be 100% reliable in case your
> reader is
> polling the queue.
>
>> Has anyone
>> seen anything related to this issue?
>>
>> Thanks,
>>
>> Mark
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 9:37 ` Philippe Gerum
2009-03-10 18:21 ` Steven Seeger
@ 2009-03-10 18:57 ` Steven Seeger
2009-03-10 19:56 ` Philippe Gerum
1 sibling, 1 reply; 18+ messages in thread
From: Steven Seeger @ 2009-03-10 18:57 UTC (permalink / raw)
To: rpm; +Cc: xenomai
> Yes, the docs are correct, and the code looks sane as well. You may
> want to
> double-check your findings using rt_queue_inquire() before calling
> rt_queue_write(), even if this won't be 100% reliable in case your
> reader is
> polling the queue.
Philippe,
We took your advice and tried rt_queue_inquire(). If we use a timeout
on the read, it seems there are always 3 waiters, which is strange
because we have only one thread reading from the queue. If we remove
the timeout, there are either 1 or 2 waiters. It got 0 waiters once and
still hung. Very strange.
Thanks,
Steven
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 18:57 ` Steven Seeger
@ 2009-03-10 19:56 ` Philippe Gerum
[not found] ` <67b6b3430903101403r183d6d4cwe100619a293abae2@domain.hid>
0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2009-03-10 19:56 UTC (permalink / raw)
To: Steven Seeger; +Cc: xenomai
Steven Seeger wrote:
>> Yes, the docs are correct, and the code looks sane as well. You may
>> want to
>> double-check your findings using rt_queue_inquire() before calling
>> rt_queue_write(), even if this won't be 100% reliable in case your
>> reader is
>> polling the queue.
>
>
> Philippe,
>
> We took your advice and tried rt_queue_inquire(). If we use a timeout on
> the read, it seems there are always 3 waiters, which is strange because
> we have only one thread reading from the queue. If we remove the timeout,
> there are either 1 or 2 waiters. It got 0 waiters once and still hung.
> Very strange.
>
/proc/xenomai/registry/native/queues/* will tell you which threads are pending
on the queue.
> Thanks,
> Steven
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
[not found] ` <67b6b3430903101403r183d6d4cwe100619a293abae2@domain.hid>
@ 2009-03-10 21:07 ` Philippe Gerum
2009-03-10 22:16 ` Mark Saiia
0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2009-03-10 21:07 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai-help
Mark Saiia wrote:
> When our app's log shows the number of waiters is 1 (via
> rt_queue_inquire), catting the proc entry shows the queue info, and a
> + on the next line. However, when the log shows the number of waiters
> is 2, catting the proc entry crashes the system hard, which
> necessitates a reboot. This behavior is completely reproducible.
>
This is evidence that some internal data structures are terminally broken.
You may want to enable CONFIG_XENO_OPT_DEBUG, CONFIG_XENO_OPT_DEBUG_NUCLEUS and
CONFIG_XENO_OPT_DEBUG_QUEUES in your kernel config. The nucleus will pull the
brake when a corruption is detected at runtime.
> Mark
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 21:07 ` Philippe Gerum
@ 2009-03-10 22:16 ` Mark Saiia
2009-03-10 22:24 ` Philippe Gerum
0 siblings, 1 reply; 18+ messages in thread
From: Mark Saiia @ 2009-03-10 22:16 UTC (permalink / raw)
To: rpm; +Cc: xenomai-help
With the debugging options enabled, at the point when the application
has previously shifted from 1 waiter to 2 waiters, the system
hardlocks, just as when I was catting the queue proc entry.
On 3/10/09, Philippe Gerum <rpm@xenomai.org> wrote:
> Mark Saiia wrote:
>> When our app's log shows the number of waiters is 1 (via
>> rt_queue_inquire), catting the proc entry shows the queue info, and a
>> + on the next line. However, when the log shows the number of waiters
>> is 2, catting the proc entry crashes the system hard, which
>> necessitates a reboot. This behavior is completely reproducible.
>>
>
> This is evidence that some internal data structures are terminally
> broken.
> You may want to enable CONFIG_XENO_OPT_DEBUG, CONFIG_XENO_OPT_DEBUG_NUCLEUS
> and CONFIG_XENO_OPT_DEBUG_QUEUES in your kernel config. The nucleus will
> pull the brake when a corruption is detected at runtime.
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-10 22:16 ` Mark Saiia
@ 2009-03-10 22:24 ` Philippe Gerum
[not found] ` <67b6b3430903101552u37244233s587898c4d0f9ef3d@domain.hid>
0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2009-03-10 22:24 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai-help
Mark Saiia wrote:
> With the debugging options enabled, at the point when the application
> has previously shifted from 1 waiter to 2 waiters, the system
> hardlocks, just as when I was catting the queue proc entry.
>
That is expected, but what does the kernel log say at that point?
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
[not found] ` <67b6b3430903101552u37244233s587898c4d0f9ef3d@domain.hid>
@ 2009-03-11 9:14 ` Philippe Gerum
[not found] ` <67b6b3430903110951m71679f89ud83859654f04aabb@domain.hid>
0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2009-03-11 9:14 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai-help
Mark Saiia wrote:
> I am unable to examine the kernel log. When I say hard crash I mean
> that everything locks up, including the OS.
When the system detects a corruption, it first dumps a report to the console,
then halts the CPU. So what you need is a way to get the console output
over a serial link, or maybe over netconsole.
> Therefore, I cannot do a
> logread. I modified syslogd to output to file. Klogd is outputting to
> syslog (the version being used does not have the option to output
> directly to file). When I examine the log on disk after reboot, there
> is nothing relevant in there.
The report can't be synced to disk, so you can't find it after the next boot anyway.
> The last log message is prior to the
> crash.
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
[not found] ` <67b6b3430903110951m71679f89ud83859654f04aabb@domain.hid>
@ 2009-03-11 17:04 ` Philippe Gerum
2009-03-11 17:07 ` Steven Seeger
2009-03-11 17:17 ` [Xenomai-help] " Jan Kiszka
0 siblings, 2 replies; 18+ messages in thread
From: Philippe Gerum @ 2009-03-11 17:04 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai-help
Mark Saiia wrote:
> I still get no output when the crash occurs, even over a serial link.
> I also do not see any output when running the app over telnet. Right
> now the app is not running in graphics mode, so I should see output
> even when I was running it locally.
>
Not if the kernel is terminally broken due to this bug. Wild guess: I would
suggest checking how message queue buffers are used by your application,
particularly to detect out-of-bound writes.
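One way to look for out-of-bound writes from the application side is to surround each message buffer with guard words and verify them after every fill. The sketch below is plain C and does not touch the Xenomai API; struct guarded_snapshot, GUARD_WORD, PAYLOAD_SIZE and the three helpers are hypothetical names, and the payload size is an arbitrary assumption.

```c
#include <stdint.h>
#include <string.h>

#define GUARD_WORD   0xDEADBEEFu  /* arbitrary pattern (assumption) */
#define PAYLOAD_SIZE 64           /* illustrative size (assumption) */

/* A message payload fenced by two guard words. */
struct guarded_snapshot {
    uint32_t head_guard;
    unsigned char payload[PAYLOAD_SIZE];
    uint32_t tail_guard;
};

/* Arm both guards and clear the payload. */
void snapshot_init(struct guarded_snapshot *s)
{
    s->head_guard = GUARD_WORD;
    memset(s->payload, 0, sizeof(s->payload));
    s->tail_guard = GUARD_WORD;
}

/*
 * Copy len bytes into the payload, refusing anything that would overrun.
 * Returns 0 on success, -1 if the request was rejected.
 */
int snapshot_fill(struct guarded_snapshot *s, const void *src, size_t len)
{
    if (len > sizeof(s->payload))
        return -1;
    memcpy(s->payload, src, len);
    return 0;
}

/* Returns 1 if both guard words are intact, 0 otherwise. */
int snapshot_guards_ok(const struct guarded_snapshot *s)
{
    return s->head_guard == GUARD_WORD && s->tail_guard == GUARD_WORD;
}
```

Calling snapshot_guards_ok() before each write to the queue and after each read narrows down which code path scribbles past the payload, if any does. It only catches overruns that land on the guard words.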
>
> Mark
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:04 ` Philippe Gerum
@ 2009-03-11 17:07 ` Steven Seeger
2009-03-11 17:15 ` Philippe Gerum
2009-03-11 17:17 ` [Xenomai-help] " Jan Kiszka
1 sibling, 1 reply; 18+ messages in thread
From: Steven Seeger @ 2009-03-11 17:07 UTC (permalink / raw)
To: rpm; +Cc: xenomai-help
> Not if the kernel is terminally broken due to this bug. Wild guess:
> I would
> suggest to check how message queue buffers are used by your
> application,
> particularly to detect out-of-bound writes.
This particular queue is always writing a structure of a constant
size. So our reads and writes are always constant size. If reading
does not return a size that we expect then we assert.
Steven
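The constant-size check described above can be sketched as follows, assuming the byte count comes from a read call that returns the message length on success and a negative error code on failure. struct snapshot, its fields, and check_read() are hypothetical stand-ins for the application's own types.

```c
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical fixed-size message; the real structure is not shown in the thread. */
struct snapshot {
    int seq;
    double value;
};

enum read_status { READ_OK, READ_ERROR, READ_SIZE_MISMATCH };

/* Classify the byte count returned by one queue read. */
enum read_status check_read(ssize_t got)
{
    if (got < 0)
        return READ_ERROR;
    if ((size_t)got != sizeof(struct snapshot))
        return READ_SIZE_MISMATCH;
    return READ_OK;
}
```

Returning a status instead of asserting on the spot lets the caller log the offending size first. A matching size only validates the length reported for that message; it does not rule out a stray write elsewhere in the shared pool.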
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:07 ` Steven Seeger
@ 2009-03-11 17:15 ` Philippe Gerum
2009-03-11 17:18 ` Steven Seeger
0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2009-03-11 17:15 UTC (permalink / raw)
To: Steven Seeger; +Cc: xenomai-help
Steven Seeger wrote:
>> Not if the kernel is terminally broken due to this bug. Wild guess: I
>> would
>> suggest to check how message queue buffers are used by your application,
>> particularly to detect out-of-bound writes.
>
> This particular queue is always writing a structure of a constant size.
> So our reads and writes are always constant size. If reading does not
> return a size that we expect then we assert.
>
Then I have no clue; you may want to try simplifying your test until the bug
disappears.
> Steven
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:04 ` Philippe Gerum
2009-03-11 17:07 ` Steven Seeger
@ 2009-03-11 17:17 ` Jan Kiszka
1 sibling, 0 replies; 18+ messages in thread
From: Jan Kiszka @ 2009-03-11 17:17 UTC (permalink / raw)
To: Mark Saiia; +Cc: xenomai-help
Philippe Gerum wrote:
> Mark Saiia wrote:
>> I still get no output when the crash occurs, even over a serial link.
Do you have some klogd running? That nice guy tries to collect kernel
log messages for syslog, but if things go totally wrong, you then do not
see any last words even on serial consoles.
Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:15 ` Philippe Gerum
@ 2009-03-11 17:18 ` Steven Seeger
2009-03-11 17:36 ` Philippe Gerum
0 siblings, 1 reply; 18+ messages in thread
From: Steven Seeger @ 2009-03-11 17:18 UTC (permalink / raw)
To: rpm; +Cc: xenomai-help
On Mar 11, 2009, at 1:15 PM, Philippe Gerum wrote:
> Then I have no clue; you may want to try simplifying your test until
> the bug disappears.
I also suggested this already. :)
We are also always writing the same structure. It is statically
allocated and we use queues as a means of providing a series of
snapshots.
Our app is entirely userspace. Could memory corruption in userspace
cause issues with the /proc entry?
Steven
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:18 ` Steven Seeger
@ 2009-03-11 17:36 ` Philippe Gerum
2009-03-11 17:38 ` Steven Seeger
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Philippe Gerum @ 2009-03-11 17:36 UTC (permalink / raw)
To: Steven Seeger; +Cc: xenomai-help
Steven Seeger wrote:
> On Mar 11, 2009, at 1:15 PM, Philippe Gerum wrote:
>
>> Then I have no clue; you may want to try simplifying your test until
>> the bug disappears.
>
> I also suggested this already. :)
>
> We are also always writing the same structure. It is statically
> allocated and we use queues as a means of providing a series of snapshots.
>
> Our app is entirely userspace. Could memory corruption in userspace
> cause issues with the /proc entry?
>
You are sharing buffers with kernel space using RT_QUEUEs. If anything goes
wrong writing to this address space, you may well end up corrupting some kernel
data, including some internal Xenomai data structures. The fact that looking at
/proc/xenomai crashes the system is telling, since the backing code may attempt
to scan such data structures.
Since you also have your own driver running, this might be another source for
the issue. Non-regression tests have not revealed any issue with RT_QUEUEs so far.
> Steven
>
>
--
Philippe.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:36 ` Philippe Gerum
@ 2009-03-11 17:38 ` Steven Seeger
2009-03-11 17:39 ` Steven Seeger
[not found] ` <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>
2 siblings, 0 replies; 18+ messages in thread
From: Steven Seeger @ 2009-03-11 17:38 UTC (permalink / raw)
To: rpm; +Cc: xenomai-help
> You are sharing buffers with kernel space using RT_QUEUEs. If
> anything goes wrong writing to this address space, you may well end
> up corrupting some kernel data, including some internal Xenomai data
> structures. The fact that looking at /proc/xenomai crashes the
> system is telling, since the backing code may attempt to scan such
> data structures.
>
> Since you also have your own driver running, this might be another
> source for the issue. Non-reg tests did not reveal any issue with
> RT_QUEUEs so far.
Actually we have disabled the driver and do not even load it until the
FPU issue is resolved.
Mark is taking Jan's advice and disabling klogd now.
Steven
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
2009-03-11 17:36 ` Philippe Gerum
2009-03-11 17:38 ` Steven Seeger
@ 2009-03-11 17:39 ` Steven Seeger
[not found] ` <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>
2 siblings, 0 replies; 18+ messages in thread
From: Steven Seeger @ 2009-03-11 17:39 UTC (permalink / raw)
To: rpm; +Cc: xenomai-help
BTW, we have seen sporadic hanging with the queues before but never
pinpointed it because it happened so infrequently. Since moving to
Xenomai 2.4.7 it's very repeatable at the same state in the system.
Steven
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Xenomai-help] Fwd: rt_queue_write return 1, with no receiver
[not found] ` <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>
@ 2009-03-11 17:45 ` Mark Saiia
0 siblings, 0 replies; 18+ messages in thread
From: Mark Saiia @ 2009-03-11 17:45 UTC (permalink / raw)
To: xenomai-help
Forgot to cc the list. Sorry.
---------- Forwarded message ----------
From: Mark Saiia <mark.saiia@domain.hid>
Date: Wed, Mar 11, 2009 at 10:43 AM
Subject: Re: [Xenomai-help] rt_queue_write return 1, with no receiver
To: rpm@xenomai.org
Even with klogd disabled, there is no debugging information displayed
over the serial console. It looks like it is time to start writing
test code.
Mark
On Wed, Mar 11, 2009 at 10:36 AM, Philippe Gerum <rpm@xenomai.org> wrote:
> Steven Seeger wrote:
>>
>> On Mar 11, 2009, at 1:15 PM, Philippe Gerum wrote:
>>
>>> Then I have no clue; you may want to try simplifying your test until the
>>> bug disappears.
>>
>> I also suggested this already. :)
>>
>> We are also always writing the same structure. It is statically allocated
>> and we use queues as a means of providing a series of snapshots.
>>
>> Our app is entirely userspace. Could memory corruption in userspace cause
>> issues with the /proc entry?
>>
>
> You are sharing buffers with kernel space using RT_QUEUEs. If anything goes
> wrong writing to this address space, you may well end up corrupting some
> kernel data, including some internal Xenomai data structures. The fact that
> looking at /proc/xenomai crashes the system is telling, since the backing
> code may attempt to scan such data structures.
>
> Since you also have your own driver running, this might be another source
> for the issue. Non-reg tests did not reveal any issue with RT_QUEUEs so far.
>
>> Steven
>>
>>
>
>
> --
> Philippe.
>
^ permalink raw reply [flat|nested] 18+ messages in thread
Thread overview: 18+ messages
2009-03-10 0:27 [Xenomai-help] rt_queue_write return 1, with no receiver Mark Saiia
2009-03-10 9:37 ` Philippe Gerum
2009-03-10 18:21 ` Steven Seeger
2009-03-10 18:57 ` Steven Seeger
2009-03-10 19:56 ` Philippe Gerum
[not found] ` <67b6b3430903101403r183d6d4cwe100619a293abae2@domain.hid>
2009-03-10 21:07 ` Philippe Gerum
2009-03-10 22:16 ` Mark Saiia
2009-03-10 22:24 ` Philippe Gerum
[not found] ` <67b6b3430903101552u37244233s587898c4d0f9ef3d@domain.hid>
2009-03-11 9:14 ` Philippe Gerum
[not found] ` <67b6b3430903110951m71679f89ud83859654f04aabb@domain.hid>
2009-03-11 17:04 ` Philippe Gerum
2009-03-11 17:07 ` Steven Seeger
2009-03-11 17:15 ` Philippe Gerum
2009-03-11 17:18 ` Steven Seeger
2009-03-11 17:36 ` Philippe Gerum
2009-03-11 17:38 ` Steven Seeger
2009-03-11 17:39 ` Steven Seeger
[not found] ` <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>
2009-03-11 17:45 ` [Xenomai-help] Fwd: " Mark Saiia
2009-03-11 17:17 ` [Xenomai-help] " Jan Kiszka