* [Xenomai-help] Deadlock with test apps in 2.5.0
@ 2010-01-04 10:12 Henri Roosen
  2010-01-04 15:01 ` Philippe Gerum
  0 siblings, 1 reply; 6+ messages in thread
From: Henri Roosen @ 2010-01-04 10:12 UTC
To: xenomai

I just had a quick look at the newly released Xenomai 2.5.0 version.
The release notes show some interesting performance optimizations, so
we are evaluating whether we can use this version for our future
development.

I compiled a 2.6.30.10 x86 kernel with this version and, as always, I
started the latency test to give me a first impression of whether the
system behaves as expected. Then I played around with the new test-suite
programs that come with this new release.

The latency app got into a deadlock when running
mutex-torture-native/posix (in a while loop) at the same time. I did
not look into it in any detail, but these apps should not lock each
other, right? Is this expected behaviour, or is this a bug?

Thanks,
Henri

To reproduce run:
latency
while true; do mutex-torture-native ; done

* Re: [Xenomai-help] Deadlock with test apps in 2.5.0
  2010-01-04 10:12 [Xenomai-help] Deadlock with test apps in 2.5.0 Henri Roosen
@ 2010-01-04 15:01 ` Philippe Gerum
  2010-01-04 15:19   ` Philippe Gerum
  0 siblings, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2010-01-04 15:01 UTC
To: Henri Roosen; +Cc: xenomai

On Mon, 2010-01-04 at 11:12 +0100, Henri Roosen wrote:
> I just had a quick look at the newly released Xenomai 2.5.0 version.
> The release notes show some interesting performance optimizations, so
> we are evaluating whether we can use this version for our future
> development.
>
> I compiled a 2.6.30.10 x86 kernel with this version and as always I
> started the latency test to give me a first impression whether the
> system behaves as expected. Then played around with the new test-suite
> programs that come with this new release.
>
> The latency app got into a deadlock when running the
> mutex-torture-native/posix (in a while loop) at the same time. I did
> not look into it in any detail, but these apps should not lock each
> other, right? Is this expected behaviour or this is a bug?

At least, this is not expected. Could you enable all debug options from
the Xenomai nucleus and try again? TIA,

> Thanks,
> Henri
>
> To reproduce run:
> latency
> while true; do mutex-torture-native ; done

--
Philippe.

* Re: [Xenomai-help] Deadlock with test apps in 2.5.0
  2010-01-04 15:01 ` Philippe Gerum
@ 2010-01-04 15:19   ` Philippe Gerum
  2010-01-04 15:25     ` Henri Roosen
  0 siblings, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2010-01-04 15:19 UTC
To: Henri Roosen; +Cc: xenomai

On Mon, 2010-01-04 at 16:01 +0100, Philippe Gerum wrote:
> At least, this is not expected. Could you enable all debug options from
> the Xenomai nucleus and try again? TIA,

Don't bother. It's reproducible here as well: the board does not
actually lock up; rather, the apps hang while the rest of the system
remains responsive. Is that what you get on your side as well?

--
Philippe.

* Re: [Xenomai-help] Deadlock with test apps in 2.5.0
  2010-01-04 15:19   ` Philippe Gerum
@ 2010-01-04 15:25     ` Henri Roosen
  2010-01-07 15:48       ` Gilles Chanteperdrix
  0 siblings, 1 reply; 6+ messages in thread
From: Henri Roosen @ 2010-01-04 15:25 UTC
To: Philippe Gerum; +Cc: xenomai

Correct, that is the same behavior I get. When I then kill the latency
app, the other process is released and can go on.

On Mon, Jan 4, 2010 at 4:19 PM, Philippe Gerum <rpm@xenomai.org> wrote:
> Don't bother. It's reproducible here as well, the board does not
> actually lock up, but the apps rather hang, while the rest of the system
> remains responsive. Is it what you get on your side as well?

* Re: [Xenomai-help] Deadlock with test apps in 2.5.0
  2010-01-04 15:25     ` Henri Roosen
@ 2010-01-07 15:48       ` Gilles Chanteperdrix
  2010-01-07 16:46         ` Henri Roosen
  0 siblings, 1 reply; 6+ messages in thread
From: Gilles Chanteperdrix @ 2010-01-07 15:48 UTC
To: Henri Roosen; +Cc: xenomai

Henri Roosen wrote:
> Correct, that is the same behavior I get. When I then kill the latency
> app, the other process is released and can go on.

Ok. I may have found an issue. Here is a patch, could you try it?

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 22bac6f..5de5227 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -174,6 +174,29 @@ static inline void set_switch_lock_owner(struct task_struct *p)
 
 #define rpi_p(t)	((t)->rpi != NULL)
 
+static inline struct xnthread *rpi_next(struct xnsched *sched)
+{
+	struct xnthread *thread;
+	spl_t s;
+
+	thread = xnsched_peek_rpi(sched);
+	while (thread &&
+	       xnthread_user_task(thread)->state != TASK_RUNNING &&
+	       !xnthread_test_info(thread, XNATOMIC)) {
+		xnsched_pop_rpi(thread);
+		thread->rpi = NULL;
+		xnlock_put_irqrestore(&sched->rpilock, s);
+		/* Do NOT nest the rpilock and nklock locks. */
+		xnlock_get_irqsave(&nklock, s);
+		xnsched_suspend_rpi(thread);
+		xnlock_put_irqrestore(&nklock, s);
+		xnlock_get_irqsave(&sched->rpilock, s);
+		thread = xnsched_peek_rpi(sched);
+	}
+
+	return thread;
+}
+
 static void rpi_push(struct xnsched *sched, struct xnthread *thread)
 {
 	struct xnsched_class *sched_class;
@@ -236,7 +259,7 @@ static void rpi_pop(struct xnthread *thread)
 		return;
 	}
 
-	top = xnsched_peek_rpi(sched);
+	top = rpi_next(sched);
 	if (likely(top == NULL)) {
 		prio = XNSCHED_IDLE_PRIO;
 		sched_class = &xnsched_class_idle;
@@ -310,7 +333,7 @@ static void rpi_clear_remote(struct xnthread *thread)
 	xnsched_pop_rpi(thread);
 	thread->rpi = NULL;
 
-	if (xnsched_peek_rpi(rpi) == NULL)
+	if (rpi_next(rpi) == NULL)
 		rcpu = xnsched_cpu(rpi);
 
 	xnlock_put_irqrestore(&rpi->rpilock, s);
@@ -407,7 +430,7 @@ static inline void rpi_switch(struct task_struct *next_task)
 	    xnthread_test_state(next, XNRPIOFF)) {
 		xnlock_get_irqsave(&sched->rpilock, s);
 
-		top = xnsched_peek_rpi(sched);
+		top = rpi_next(sched);
 		if (top) {
 			newprio = top->cprio;
 			newclass = top->sched_class;
@@ -495,7 +518,7 @@ void xnshadow_rpi_check(void)
 	struct xnthread *top;
 
 	xnlock_get(&sched->rpilock);
-	top = xnsched_peek_rpi(sched);
+	top = rpi_next(sched);
 	xnlock_put(&sched->rpilock);
 
 	if (top == NULL && xnsched_root_class(sched) != &xnsched_class_idle)

--
Gilles.
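
Note on the patch above: as far as can be read from the diff, rpi_next()
pops entries from the head of the RPI queue while the corresponding Linux
task is not in TASK_RUNNING state and XNATOMIC is not set, and it releases
rpilock before taking nklock so that the two locks are never held together.
The following is only a minimal userspace sketch of that drop-and-reacquire
pattern using POSIX mutexes; it is not Xenomai code. All names in it
(struct item, struct queue, big_lock, queue_next) are hypothetical
stand-ins, and the single "runnable" flag is a simplifying assumption that
replaces the two-part condition in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in Xenomai. */
struct item {
	bool runnable;		/* assumption: stands in for the state test in the patch */
	bool suspended;		/* only written while big_lock is held */
	struct item *next;
};

struct queue {
	pthread_mutex_t lock;	/* plays the role of the per-scheduler rpilock */
	struct item *head;
};

/* Plays the role of the global nklock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return the first runnable item, discarding stale heads on the way.
 * Must be called with q->lock held; returns with q->lock held.
 */
static struct item *queue_next(struct queue *q)
{
	struct item *it;

	assert(q != NULL);
	it = q->head;

	while (it != NULL && !it->runnable) {
		/* Pop the stale head while still holding the queue lock. */
		q->head = it->next;
		it->next = NULL;

		/* Drop the queue lock first: the two locks are never nested. */
		pthread_mutex_unlock(&q->lock);
		pthread_mutex_lock(&big_lock);
		it->suspended = true;
		pthread_mutex_unlock(&big_lock);
		pthread_mutex_lock(&q->lock);

		/* Re-read the head: the queue may have changed meanwhile. */
		it = q->head;
	}

	return it;
}
```

A caller would take q->lock, call queue_next(), use the result, and then
release q->lock. Re-reading the head after the lock is reacquired is the
important step, since another thread may have modified the queue while the
lock was released.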

* Re: [Xenomai-help] Deadlock with test apps in 2.5.0
  2010-01-07 15:48       ` Gilles Chanteperdrix
@ 2010-01-07 16:46         ` Henri Roosen
  0 siblings, 0 replies; 6+ messages in thread
From: Henri Roosen @ 2010-01-07 16:46 UTC
To: Gilles Chanteperdrix; +Cc: xenomai

Hi Gilles,

Yes, this patch seems to solve the issue. It was reproducible within a
few seconds, but now I haven't seen the lock-up anymore after running
for half an hour.

Great job!

Thanks,
Henri.

On Thu, Jan 7, 2010 at 4:48 PM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
> Henri Roosen wrote:
>> Correct, that is the same behavior I get. When I then kill the latency
>> app, the other process is released and can go on.
>
> Ok. I may have found an issue. Here is a patch, could you try it?