M. Koehrer wrote:
> Hi everybody,
>
> I noticed a sporadic freeze of my PC using Xenomai 2.3.1 and kernel 2.6.20.4 on a Pentium D.
> adeos-ipipe-2.6.20-i386-1.8-01.patch.
>
> The freeze happened sporadically on one of our systems, occasionally it took up to 6 hours to get it.
> Using a PCI Post Code board and writing POST codes to it, I was able to locate the code that was causing
> the issue. And finally I was able to extract it to a very simple program that shows the same behaviour!!
>
> Here is my simple test program:
> **************************************** BEGIN *****************
> #include
> #include
>
> #include
> #include
>
>
> RT_TASK taska_desc;
>
> void mytaska(void *cookie)
> {
>     int i;
>
>     for (i=0; i < 5; i++)
>     {
>         rt_task_sleep(5000000);
>     }
> }
>
> int main(void)
> {
>     int i;
>     int j;
>     mlockall(MCL_CURRENT|MCL_FUTURE);
>
>     for (j=0; j < 100; j++)
>         for (i=10; i < 15000; i++)
>         {
>             rt_task_create(&taska_desc, "mytaska", 0, 81, T_JOINABLE | T_FPU | T_CPU(1));
>             rt_task_start(&taska_desc, &mytaska, NULL);
>             usleep(1500);
>
>             rt_task_join(&taska_desc);
>             if ( i % 100 == 0)
>                 printf("Loop %i\n", i);
>         }
>
>     return 0;
> }
> *************************************** END ***********************************
> It is important to know, that I started the kernel with isolcpus=1, i.e. all non-realtime tasks
> are running on CPU 0.
> Somehow it seems to have to do with the usleep() that is following the rt_task_start.
> usleep() is executed on CPU 0 and rt_task_start starts a task on CPU 1...
> Can this be as the begin of usleep() is executed before the task is started but the end of
> usleep() is when the task has already started. Could this be a cause for a race condition?
>
> I leave the program running for a while and somehow it freezes the PC (only reset works).
>
> Any feedback on this is welcome!

Maybe you are seeing the same bug that this test exposes:

#include
#include
#include

void func(void *arg)
{
    rt_task_set_periodic(NULL, TM_NOW, 1000000000LL);
    while(1)
        rt_task_wait_period(NULL);
}

main()
{
    RT_TASK task;
    cpu_set_t set;

    mlockall(MCL_CURRENT|MCL_FUTURE);

    printf("rt_task_spawn=%d\n",
           rt_task_spawn(&task, "Receiver", 0, 10, 0, func, NULL));

    CPU_ZERO(&set);
    CPU_SET(1, &set);
    printf("sched_setaffinity=%d\n",
           sched_setaffinity(0, sizeof(cpu_set_t), &set));

    sleep(1);

    printf("rt_task_delete=%d\n", rt_task_delete(&task));
}

This test doesn't hard-lock, though; it just stalls the process in some
zombie state. The bug is already scheduled for closer examination, stay
tuned.

In the meantime, is it possible to check whether

a) my demo code happens to lock up hard for you?
b) the behaviour of your test case changes with the latest
   xeno-2.3.2/ipipe-1.8-05?

Thanks for reporting,
Jan