* [Xenomai-help] rt_queue_write return 1, with no receiver
From: Mark Saiia @ 2009-03-10 0:27 UTC
To: xenomai

Hello all,

Running Xenomai 2.4.7 w/ linux 2.6.27.19 on Geode GX1.

We are writing to a queue. This write is returning a 1, and we believe that there is no corresponding read pending. Are the API docs correct in regards to the return value of rt_queue_write? Has anyone seen anything related to this issue?

Thanks,

Mark

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-10 9:37 UTC
To: Mark Saiia; +Cc: xenomai

Mark Saiia wrote:
> Hello all,
>
> Running Xenomai 2.4.7 w/ linux 2.6.27.19 on Geode GX1.
>
> We are writing to a queue. This write is returning a 1, and we
> believe that there is no corresponding read pending. Are the API docs
> correct in regards to the return value of rt_queue_write?

Yes, the docs are correct, and the code looks sane as well. You may want to double-check your findings using rt_queue_inquire() before calling rt_queue_write(), even if this won't be 100% reliable in case your reader is polling the queue.

> Has anyone seen anything related to this issue?
>
> Thanks,
>
> Mark

--
Philippe.
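
The cross-check suggested above (sample the waiter count with rt_queue_inquire(), then compare it with what rt_queue_write() returns) can be sketched as a small helper. This is only an illustration: the enum and the function name are hypothetical and not part of Xenomai, the caller is assumed to pass in the nwaiters value it read just before the write, and the sketch assumes the documented convention that a positive return is the number of receivers woken, zero means the message was queued with nobody waiting, and a negative value is an error.

```c
#include <assert.h>

/* Hypothetical result categories for this sketch; not part of Xenomai. */
enum write_check {
    WRITE_ERROR,        /* negative return: the write failed */
    WRITE_QUEUED,       /* 0: message queued, no receiver was waiting */
    WRITE_WOKE_WAITER,  /* >0 and a waiter had been seen beforehand */
    WRITE_SUSPICIOUS    /* >0 although no waiter had been seen beforehand */
};

/*
 * nwaiters_before: waiter count sampled via rt_queue_inquire() right
 *                  before the write (sampling is left to the caller).
 * ret:             value returned by rt_queue_write().
 */
static enum write_check check_write_result(int nwaiters_before, int ret)
{
    if (ret < 0)
        return WRITE_ERROR;
    if (ret == 0)
        return WRITE_QUEUED;
    if (nwaiters_before == 0)
        return WRITE_SUSPICIOUS;
    return WRITE_WOKE_WAITER;
}
```

A WRITE_SUSPICIOUS result is only a hint, since a reader may start waiting between the inquiry and the write; that is the same caveat Philippe raises for a polling reader.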

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-10 18:21 UTC
To: rpm; +Cc: xenomai

The reader waits forever (TM_INFINITE) on the queue. We used values from rt_timer_read() to timestamp some logging messages that go out another queue to confirm that the reader thread is busy and not reading from the queue. We get a log at time x1 before we do the write, and then another time value at x2 before we do the read. The higher priority thread is doing the writing.

Hopefully it's a problem with our testing.

Steven

On Mar 10, 2009, at 5:37 AM, Philippe Gerum wrote:
> Yes, the docs are correct, and the code looks sane as well. You may
> want to double-check your findings using rt_queue_inquire() before
> calling rt_queue_write(), even if this won't be 100% reliable in case
> your reader is polling the queue.

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-10 18:57 UTC
To: rpm; +Cc: xenomai

> Yes, the docs are correct, and the code looks sane as well. You may
> want to double-check your findings using rt_queue_inquire() before
> calling rt_queue_write(), even if this won't be 100% reliable in case
> your reader is polling the queue.

Philippe,

We took your advice and tried rt_queue_inquire(). If we use a timeout on the read, it seems there are always 3 waiters, which is strange because we have only one thread reading from the queue. If we remove the timeout, there are either 1 or 2 waiters. It got 0 waiters once and still hung. Very strange.

Thanks,
Steven

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-10 19:56 UTC
To: Steven Seeger; +Cc: xenomai

Steven Seeger wrote:
> Philippe,
>
> We took your advice and tried rt_queue_inquire(). If we use a timeout
> on the read, it seems there are always 3 waiters, which is strange
> because we have only one thread reading from the queue. If we remove
> the timeout, there are either 1 or 2 waiters. It got 0 waiters once
> and still hung. Very strange.

/proc/xenomai/registry/native/queues/* will tell you which threads are pending on the queue.

> Thanks,
> Steven

--
Philippe.
[parent not found: <67b6b3430903101403r183d6d4cwe100619a293abae2@domain.hid>]

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-10 21:07 UTC
To: Mark Saiia; +Cc: xenomai-help

Mark Saiia wrote:
> When our app's log shows the number of waiters is 1 (via
> rt_queue_inquire), catting the proc entry shows the queue info, and a
> + on the next line. However, when the log shows the number of waiters
> is 2, catting the proc entry crashes the system hard, which
> necessitates a reboot. This behavior is completely reproducible.

This is evidence that some internal data structures are terminally broken. You may want to enable CONFIG_XENO_OPT_DEBUG, CONFIG_XENO_OPT_DEBUG_NUCLEUS and CONFIG_XENO_OPT_DEBUG_QUEUES in your kernel config. The nucleus will pull the brake when a corruption is detected at runtime.

> Mark

--
Philippe.
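
For reference, the three debug switches named above would appear in the kernel .config roughly as follows once enabled. This is a sketch that assumes all three are plain boolean options; where they sit in the configuration menus is not stated in the thread and may vary between Xenomai releases.

```
CONFIG_XENO_OPT_DEBUG=y
CONFIG_XENO_OPT_DEBUG_NUCLEUS=y
CONFIG_XENO_OPT_DEBUG_QUEUES=y
```

The kernel has to be rebuilt and rebooted for the checks to take effect.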

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Mark Saiia @ 2009-03-10 22:16 UTC
To: rpm; +Cc: xenomai-help

With the debugging options enabled, at the point where the application previously shifted from 1 waiter to 2 waiters, the system hard-locks, just as when I was catting the queue proc entry.

On 3/10/09, Philippe Gerum <rpm@xenomai.org> wrote:
> This is evidence that some internal data structures are terminally
> broken. You may want to enable CONFIG_XENO_OPT_DEBUG,
> CONFIG_XENO_OPT_DEBUG_NUCLEUS and CONFIG_XENO_OPT_DEBUG_QUEUES in your
> kernel config. The nucleus will pull the brake when a corruption is
> detected at runtime.

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-10 22:24 UTC
To: Mark Saiia; +Cc: xenomai-help

Mark Saiia wrote:
> With the debugging options enabled, at the point when the application
> has previously shifted from 1 waiter to 2 waiters, the system
> hardlocks, just as when I was catting the queue proc entry.

That is expected, but what does the kernel log say at that point?

--
Philippe.
[parent not found: <67b6b3430903101552u37244233s587898c4d0f9ef3d@domain.hid>]

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-11 9:14 UTC
To: Mark Saiia; +Cc: xenomai-help

Mark Saiia wrote:
> I am unable to examine the kernel log. When I say hard crash I mean
> that everything locks up, including the OS.

When the system detects a corruption, it first dumps a report to the console, then halts the CPU. So what you need is a way to get the console output over a serial link, or maybe over a netconsole.

> Therefore, I cannot do a logread. I modified syslogd to output to
> file. Klogd is outputting to syslog (the version being used does not
> have the option to output directly to file). When I examine the log
> on disk after reboot, there is nothing relevant in there.

The report can't be synced to disk, so you can't find it after the next boot anyway.

> The last log message is prior to the crash.

--
Philippe.
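
As an illustration of the serial-link and netconsole routes mentioned above, the kernel boot parameters might look like the fragment below. Everything specific in it is a placeholder assumption: the port ttyS0, the 115200 baud rate, the interface name eth0, both IP addresses, both UDP ports and the MAC address are made up for the sketch and must be replaced with the values of the actual target and logging host. The netconsole line also assumes netconsole support is built into the kernel.

```
console=ttyS0,115200 console=tty0
netconsole=6665@192.168.0.10/eth0,6666@192.168.0.2/00:11:22:33:44:55
```

The first line keeps the local console and mirrors kernel messages to the serial port; the second sends them as UDP packets from the target (192.168.0.10 here) to a listener on the logging host (192.168.0.2 here).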
[parent not found: <67b6b3430903110951m71679f89ud83859654f04aabb@domain.hid>]

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-11 17:04 UTC
To: Mark Saiia; +Cc: xenomai-help

Mark Saiia wrote:
> I still get no output when the crash occurs, even over a serial link.
> I also do not see any output when running the app over telnet. Right
> now the app is not running in graphics mode, so I should see output
> even when I was running it locally.

Not if the kernel is terminally broken due to this bug. Wild guess: I would suggest checking how message queue buffers are used by your application, particularly to detect out-of-bound writes.

> Mark

--
Philippe.

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-11 17:07 UTC
To: rpm; +Cc: xenomai-help

> Not if the kernel is terminally broken due to this bug. Wild guess:
> I would suggest to check how message queue buffers are used by your
> application, particularly to detect out-of-bound writes.

This particular queue is always writing a structure of a constant size. So our reads and writes are always constant size. If reading does not return a size that we expect then we assert.

Steven

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-11 17:15 UTC
To: Steven Seeger; +Cc: xenomai-help

Steven Seeger wrote:
> This particular queue is always writing a structure of a constant
> size. So our reads and writes are always constant size. If reading
> does not return a size that we expect then we assert.

Then I have no clue; you may want to try simplifying your test until the bug disappears.

> Steven

--
Philippe.

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-11 17:18 UTC
To: rpm; +Cc: xenomai-help

On Mar 11, 2009, at 1:15 PM, Philippe Gerum wrote:
> Then I have no clue; you may want to try simplifying your test until
> the bug disappears.

I also suggested this already. :)

We are also always writing the same structure. It is statically allocated and we use queues as a means of providing a series of snapshots.

Our app is entirely userspace. Could memory corruption in userspace cause issues with the /proc entry?

Steven

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Philippe Gerum @ 2009-03-11 17:36 UTC
To: Steven Seeger; +Cc: xenomai-help

Steven Seeger wrote:
> We are also always writing the same structure. It is statically
> allocated and we use queues as a means of providing a series of
> snapshots.
>
> Our app is entirely userspace. Could memory corruption in userspace
> cause issues with the /proc entry?

You are sharing buffers with kernel space using RT_QUEUEs. If anything goes wrong writing to this address space, you may well end up corrupting some kernel data, including some internal Xenomai data structures. The fact that looking at /proc/xenomai crashes the system is telling, since the backing code may attempt to scan such data structures.

Since you also have your own driver running, this might be another source for the issue. Non-regression tests did not reveal any issue with RT_QUEUEs so far.

> Steven

--
Philippe.
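
One generic way to look for the out-of-bound writes discussed above is to surround an application buffer with guard bytes and check them periodically. The sketch below is an assumption-laden illustration, not something taken from the thread or from Xenomai: the names guarded_buf, guarded_init, guarded_payload and guarded_intact are hypothetical, the sizes are arbitrary, and it only covers a buffer the application owns (such as the statically allocated snapshot structure), not memory handed out by the queue itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_LEN   8     /* bytes of guard on each side (arbitrary) */
#define PAYLOAD_LEN 32    /* usable payload size (arbitrary) */
#define GUARD_BYTE  0xA5  /* pattern expected in the guard regions */

/* Hypothetical wrapper: payload sandwiched between two guard regions. */
struct guarded_buf {
    unsigned char raw[GUARD_LEN + PAYLOAD_LEN + GUARD_LEN];
};

/* Fill both guard regions with the pattern and zero the payload. */
static void guarded_init(struct guarded_buf *g)
{
    memset(g->raw, GUARD_BYTE, sizeof(g->raw));
    memset(g->raw + GUARD_LEN, 0, PAYLOAD_LEN);
}

/* Pointer to the area the application is allowed to write. */
static unsigned char *guarded_payload(struct guarded_buf *g)
{
    return g->raw + GUARD_LEN;
}

/* Returns 1 if both guard regions are intact, 0 if either was touched. */
static int guarded_intact(const struct guarded_buf *g)
{
    size_t i;

    for (i = 0; i < GUARD_LEN; i++) {
        if (g->raw[i] != GUARD_BYTE)
            return 0;
        if (g->raw[GUARD_LEN + PAYLOAD_LEN + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guarded_intact() before and after each copy into the payload narrows down which operation overran the buffer. It cannot catch a write that jumps past the guard region entirely, so it complements rather than replaces the kernel-side debug checks.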

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-11 17:38 UTC
To: rpm; +Cc: xenomai-help

> You are sharing buffers with kernel space using RT_QUEUEs. If
> anything goes wrong writing to this address space, you may well end
> up corrupting some kernel data, including some internal Xenomai data
> structures. The fact that looking at /proc/xenomai crashes the
> system is telling, since the backing code may attempt to scan such
> data structures.
>
> Since you also have your own driver running, this might be another
> source for the issue. Non-reg tests did not reveal any issue with
> RT_QUEUEs so far.

Actually we have disabled the driver and do not even load it until the FPU issue is resolved.

Mark is taking Jan's advice and disabling klogd now.

Steven

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Steven Seeger @ 2009-03-11 17:39 UTC
To: rpm; +Cc: xenomai-help

BTW, we have seen sporadic hanging with the queues before but never pinpointed it because it happened so infrequently. Since moving to Xenomai 2.4.7 it's very repeatable at the same state in the system.

Steven
[parent not found: <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>]

* [Xenomai-help] Fwd: rt_queue_write return 1, with no receiver
From: Mark Saiia @ 2009-03-11 17:45 UTC
To: xenomai-help

Forgot to cc the list. Sorry.

---------- Forwarded message ----------
From: Mark Saiia <mark.saiia@domain.hid>
Date: Wed, Mar 11, 2009 at 10:43 AM
Subject: Re: [Xenomai-help] rt_queue_write return 1, with no receiver
To: rpm@xenomai.org

Even with klogd disabled, there is no debugging information displayed over the serial console. It looks like it is time to start writing test code.

Mark

* Re: [Xenomai-help] rt_queue_write return 1, with no receiver
From: Jan Kiszka @ 2009-03-11 17:17 UTC
To: Mark Saiia; +Cc: xenomai-help

Philippe Gerum wrote:
> Mark Saiia wrote:
>> I still get no output when the crash occurs, even over a serial link.

Do you have some klogd running? That nice guy tries to collect kernel log messages for syslog, but if things go totally wrong, you then do not see any last words even on serial consoles.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

Thread overview: 18+ messages
2009-03-10 0:27 [Xenomai-help] rt_queue_write return 1, with no receiver Mark Saiia
2009-03-10 9:37 ` Philippe Gerum
2009-03-10 18:21 ` Steven Seeger
2009-03-10 18:57 ` Steven Seeger
2009-03-10 19:56 ` Philippe Gerum
[not found] ` <67b6b3430903101403r183d6d4cwe100619a293abae2@domain.hid>
2009-03-10 21:07 ` Philippe Gerum
2009-03-10 22:16 ` Mark Saiia
2009-03-10 22:24 ` Philippe Gerum
[not found] ` <67b6b3430903101552u37244233s587898c4d0f9ef3d@domain.hid>
2009-03-11 9:14 ` Philippe Gerum
[not found] ` <67b6b3430903110951m71679f89ud83859654f04aabb@domain.hid>
2009-03-11 17:04 ` Philippe Gerum
2009-03-11 17:07 ` Steven Seeger
2009-03-11 17:15 ` Philippe Gerum
2009-03-11 17:18 ` Steven Seeger
2009-03-11 17:36 ` Philippe Gerum
2009-03-11 17:38 ` Steven Seeger
2009-03-11 17:39 ` Steven Seeger
[not found] ` <67b6b3430903111043s628c5c64m89cc8c726bdf6ff9@domain.hid>
2009-03-11 17:45 ` [Xenomai-help] Fwd: " Mark Saiia
2009-03-11 17:17 ` [Xenomai-help] " Jan Kiszka