* [Xenomai-help] Unexpected switch to secondary mode
@ 2007-08-02 9:47 Johan Borkhuis
2007-08-02 10:12 ` Philippe Gerum
0 siblings, 1 reply; 20+ messages in thread
From: Johan Borkhuis @ 2007-08-02 9:47 UTC (permalink / raw)
To: Xenomai-help
I am experiencing an unexpected switch to secondary mode in a
rt_timer_tsc2ns call from userspace.
The following code causes a switch:
SRTIME timeStamp;
timeStamp = rt_timer_tsc2ns(rt_timer_tsc());
while((rt_timer_tsc2ns(rt_timer_tsc()) - timeStamp) < (usecs*1000)) {}
while the following code does not:
RTIME timeStamp;
timeStamp = rt_timer_tsc();
while((rt_timer_tsc() - timeStamp) < (usecs*1000)) {}
(I know that the second example causes a longer timeout, but this was to
show the testcase).
When splitting up the 2nd line in the first example I see that the
rt_timer_tsc() call does not cause a switch, but the rt_timer_tsc2ns
does. What am I doing wrong here?
I am using Xenomai-2.3.2.
Kind regards,
Johan Borkhuis
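A side note on the second loop in the message above: it compares raw TSC ticks against a nanosecond count, which is why its timeout differs from the first loop. A minimal sketch of converting the timeout to ticks once, before the loop, so that no conversion call runs inside it. The helper `us_to_ticks` and the fixed `tick_hz` rate are hypothetical names used for illustration only, not Xenomai API, and this does not by itself address the mode switch discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a timeout in microseconds to timer ticks for a counter that
 * runs at tick_hz ticks per second (hypothetical parameter; the real
 * rate is platform specific). Whole seconds and the sub-second
 * remainder are converted separately to keep the intermediate
 * product small. */
uint64_t us_to_ticks(uint64_t usecs, uint64_t tick_hz)
{
    uint64_t secs = usecs / 1000000u;
    uint64_t rem_us = usecs % 1000000u;

    assert(tick_hz > 0);
    return secs * tick_hz + (rem_us * tick_hz) / 1000000u;
}
```

The busy-wait would then compute the limit once and compare the tick delta against it in the loop condition.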
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Philippe Gerum @ 2007-08-02 10:12 UTC
To: Johan Borkhuis; +Cc: Xenomai-help
On Thu, 2007-08-02 at 11:47 +0200, Johan Borkhuis wrote:
> I am experiencing an unexpected switch to secondary mode in a
> rt_timer_tsc2ns call from userspace.
> [...]
> I am using Xenomai-2.3.2.
- Which CPU architecture, and which Adeos patch release are you using?
- How do you notice the switches, SIGXCPU or /proc/xenomai/stat?
--
Philippe.
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Johan Borkhuis @ 2007-08-02 10:54 UTC
To: rpm; +Cc: Xenomai-help
Philippe Gerum wrote:
> - Which CPU architecture, and which Adeos patch release are you using?
Processor: ppc 85xx, patch 1.5-03, Xenomai 2.3.2
> - How do you notice the switches, SIGXCPU or /proc/xenomai/stat?
SIGXCPU
Kind regards,
Johan Borkhuis
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Philippe Gerum @ 2007-08-02 11:23 UTC
To: Johan Borkhuis; +Cc: Xenomai-help
On Thu, 2007-08-02 at 12:54 +0200, Johan Borkhuis wrote:
> Processor: ppc 85xx, patch 1.5-03, Xenomai 2.3.2
> [...]
> SIGXCPU
So it's unfortunately possible to experience such switches on ppc hw for
now. It is likely happening when the ns value is copied back to a user
variable for which there is a PTE miss to handle in Linux kernel space
first. Check the mail archive about "pte", "cow" (copy-on-write) and
such things.
--
Philippe.
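The usual first line of defence against faults on user variables is to lock memory and touch the buffers a real-time section will write to before entering it. A minimal sketch follows, using standard `mlockall`; the helper names are hypothetical, and as the thread goes on to explain, this alone did not prevent the PTE misses seen on ppc.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Touch one byte per page_size stride of buf so that each page is
 * faulted in (and any copy-on-write is resolved) ahead of time.
 * Contents are preserved. Returns the number of stride touches. */
size_t prefault_buffer(volatile unsigned char *buf, size_t len,
                       size_t page_size)
{
    size_t off, touched = 0;

    for (off = 0; off < len; off += page_size) {
        buf[off] = buf[off];    /* read, then write back the same value */
        touched++;
    }
    if (len > 0)
        buf[len - 1] = buf[len - 1];    /* covers an unaligned tail page */
    return touched;
}

/* Lock current and future mappings, then pre-fault buf.
 * Returns the mlockall() result (-1 e.g. on insufficient privileges);
 * the buffer is pre-faulted either way. */
int lock_and_prefault(void *buf, size_t len, size_t page_size)
{
    int rc = mlockall(MCL_CURRENT | MCL_FUTURE);

    prefault_buffer((volatile unsigned char *)buf, len, page_size);
    return rc;
}
```

Writing the value back, rather than only reading it, is what forces a private writable copy of the page to exist before the time-critical code runs.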
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Gilles Chanteperdrix @ 2007-08-02 17:55 UTC
To: rpm; +Cc: Xenomai-help
Philippe Gerum wrote:
> So it's unfortunately possible to experience such switches on ppc hw for
> now. It is likely happening when the ns value is copied back to a user
> variable for which there is a PTE miss to handle in Linux kernel space
> first. Check the mail archive about "pte", "cow" (copy-on-write) and
> such things.
Err. ppcs have a tsc accessible in user-space, don't they? So, I would
expect the conversion to be done in user-space.
--
Gilles Chanteperdrix.
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Philippe Gerum @ 2007-08-02 18:01 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai-help
On Thu, 2007-08-02 at 19:55 +0200, Gilles Chanteperdrix wrote:
> Err. ppcs have a tsc accessible in user-space, don't they? So, I would
> expect the conversion to be done in user-space.
The important issue is not that such memory is being written from kernel
space, but that you don't have the PTE mapped when writing down the
conversion results. And this may happen from pure user-space too in case
DIRECT_TSC is available, despite the memlock.
--
Philippe.
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Johan Borkhuis @ 2007-08-03 8:05 UTC
To: rpm; +Cc: Xenomai-help
Philippe,
(BTW: is this something that should be discussed in the Xenomai group,
or would it be better to move this discussion to the Adeos mailing list?)
I do have a question on this. We already discussed this problem earlier,
and I managed to find a very dirty workaround for my application: I
added a lot of "dummy" functions to my application, which are spread
over the whole application. By calling these after the mlockall, but
before I switch to RT-mode, I manage to eliminate most of the switches.
Philippe Gerum wrote:
> So it's unfortunately possible to experience such switches on ppc hw for
> now. It is likely happening when the ns value is copied back to a user
> variable for which there is a PTE miss to handle in Linux kernel space
> first. Check the mail archive about "pte", "cow" (copy-on-write) and
> such things.
As far as I know (but I can be incorrect here) an interrupt is generated
due to a PTE-miss. This causes the Nucleus to switch to secondary mode,
to allow Linux to process the PTEs. The total latency caused by this
would not be very bad, as you also indicated earlier.
From the earlier discussion:
> In such a case, you have likely hit an illustration of the latter issue
> which the I-pipe/ppc implementation still suffers from: some page table
> entries are missed during real-time operations. As a consequence of
> this, the nucleus catches page faults on behalf of RT threads in primary
> mode, then switches these threads back to secondary in order to process
> the faults, and eventually wire the missing PTEs in. This is something
> calling mlockall() does not prevent the application from (like COW).
Now for my question.
When looking at a situation where you have a system running multiple
RT-tasks, when one of them hits an (unexpected) switch, it is possible
that this task will never be switched in again, as the other RT-tasks
might consume all the processor time.
Would it be possible to have the Nucleus, after the page-fault is
processed and the "problem" is fixed, automagically switch the system
back to primary mode?
It is not the real solution to the problem, but it would be an acceptable
workaround for the moment, until a real solution is available.
Also another question on this issue: at this moment I am using kernel
2.6.14(ppc), but I saw that there is now an adeos-patch for
2.6.20(powerpc). Could using this version give an improvement in this
area, or does this not make a difference?
Kind regards,
Johan Borkhuis
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Philippe Gerum @ 2007-08-05 17:22 UTC
To: Johan Borkhuis; +Cc: Xenomai-help
On Fri, 2007-08-03 at 10:05 +0200, Johan Borkhuis wrote:
> (BTW: is this something that should be discussed in the Xenomai group,
> or would it be better to move this discussion to the Adeos mailing list?)
This list is OK; we are discussing how RT threads cope with the
underlying MMU, which is first and foremost a Xenomai issue (even though
Adeos must be the one to provide some support to fix it).
> I do have a question on this. We already discussed this problem earlier,
> and I managed to find a very dirty workaround for my application: I
> added a lot of "dummy" functions to my application, which are spread
> over the whole application. By calling these after the mlockall, but
> before I switch to RT-mode, I manage to eliminate most of the switches.
Yeah, not pretty, but clearly illustrates the point.
> As far as I know (but I can be incorrect here) an interrupt is generated
> due to a PTE-miss. This causes the Nucleus to switch to secondary mode,
> to allow Linux to process the PTEs. The total latency caused by this
> would not be very bad, as you also indicated earlier.
Correct, as experience shows. Still, I don't like the idea of leaving
this window open for latency.
> Now for my question.
> When looking at a situation where you have a system running multiple
> RT-tasks, when one of them hits an (unexpected) switch, it is possible
> that this task will never be switched in again, as the other RT-tasks
> might consume all the processor time.
Correct, though not the most probable case, since threads usually manage
to call a blocking Xenomai service which switches them back to primary
mode. Still, the switch-delayed-by-preemption scenario is perfectly
possible.
> Would it be possible to have the Nucleus, after the page-fault is
> processed and the "problem" is fixed, automagically switch the system
> back to primary mode?
Yes, I think so. This would be restricted to minor VM faults in order to
prevent any misuse of this feature, but I'm going to implement this.
> It is not the real solution to the problem, but it would be an acceptable
> workaround for the moment, until a real solution is available.
Ack.
> Also another question on this issue: at this moment I am using kernel
> 2.6.14(ppc), but I saw that there is now an adeos-patch for
> 2.6.20(powerpc). Could using this version give an improvement in this
> area, or does this not make a difference?
No difference, yet.
--
Philippe.
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Gilles Chanteperdrix @ 2007-08-06 9:08 UTC
To: Johan Borkhuis; +Cc: Xenomai-help
On 8/3/07, Johan Borkhuis <j.borkhuis@domain.hid> wrote:
> I do have a question on this. We already discussed this problem earlier,
> and I managed to find a very dirty workaround for my application: I
> added a lot of "dummy" functions to my application, which are spread
> over the whole application. By calling these after the mlockall, but
> before I switch to RT-mode, I manage to eliminate most of the switches.
Before implementing the nocow patch, I opened /proc/self/maps and
caused a fault on every writable page. Even if ugly, this looks
simpler than spreading dummy functions over the whole application.
But I have another question: since the nocow patch is platform
independent, why not integrate it into the I-pipe patch for PowerPC?
--
Gilles Chanteperdrix
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Gilles Chanteperdrix @ 2007-08-06 11:41 UTC
To: Johan Borkhuis, rpm, Xenomai-help
Gilles Chanteperdrix wrote:
> Before implementing the nocow patch, I opened /proc/self/maps and
> caused a fault on every writable page. Even if ugly, this looks
> simpler than spreading dummy functions over the whole application.
I posted the piece of code here:
https://mail.gna.org/public/xenomai-help/2006-12/msg00168.html
--
Gilles Chanteperdrix.
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Johan Borkhuis @ 2007-08-07 11:06 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai-help
Gilles,
Gilles Chanteperdrix wrote:
> I posted the piece of code here:
> https://mail.gna.org/public/xenomai-help/2006-12/msg00168.html
Thank you for this. However, it does not seem to work: when I execute
this code it does not make any difference, whether executed before I
start an RT-task or even within an RT-task. The results are identical to
the results when no dummy functions are used. Only the pages that can be
written are touched, but the code pages that cause the problem (afaik)
are marked readonly, so they are not touched. I modified the linking, to
mark the text-section as writable, but that also did not make any
difference.
The fact that this does not work could be caused by the fact that you
are accessing the pages as data instead of code.
Next I tried something different, more like a combination of my
original approach and your approach: I tried to find a "return from
subroutine" in each text-page, and if found call this location. It is
extremely dirty, but it does seem to work quite well. Below is the code
that I used:
==========
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define RET_CODE 0x4e800020 /* blr */
typedef int (*TestFunc)(void);
static void fault_vm(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    unsigned begin, end, pagesize = getpagesize();
    char buffer[128];
    unsigned i;
    volatile int tmp;
    TestFunc testFunc;
    if (!maps) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    while (fscanf(maps, "%x-%x", &begin, &end) == 2) {
        fgets(buffer, 128, maps);
        for (; begin != end; begin += pagesize) {
            if (buffer[2] == 'w') {
                /* Data section: rewrite the first word of the page */
                *(volatile int *) begin = *(volatile int *) begin;
            } else if (buffer[1] == 'r') {
                /* Text section: call the first "blr" found in the page */
                for (i = 0; i < pagesize; i += 4) {
                    testFunc = (void *)(begin + i);
                    tmp = *(volatile int *) (begin + i);
                    if (tmp == RET_CODE) {
                        testFunc();
                        break;
                    }
                }
            }
        }
    }
    fclose(maps);
}
==========
Kind regards,
Johan Borkhuis
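The page-touching idea discussed in this subthread can be sketched in a somewhat more defensive form. This is an illustration under assumptions, not the code from the linked post: it parses addresses as `unsigned long` (so it is not limited to 32-bit address spaces), checks the permission field explicitly, and touches only private read-write mappings. The function names are hypothetical, and as reported above, touching data pages did not cure the faults on text pages.

```c
#include <stdio.h>
#include <unistd.h>

/* Parse the address range and permission string from one line of
 * /proc/self/maps, e.g.
 * "08048000-0804c000 r-xp 00000000 03:01 1234 /bin/x".
 * Returns 1 on success, 0 if the line does not match. */
int parse_maps_line(const char *line, unsigned long *begin,
                    unsigned long *end, char perms[5])
{
    return sscanf(line, "%lx-%lx %4s", begin, end, perms) == 3;
}

/* Rewrite the first word of every page in every private read-write
 * mapping. Returns the number of pages touched, or -1 if the maps
 * file cannot be opened (e.g. on a non-Linux system). */
long prefault_writable_pages(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    long page = sysconf(_SC_PAGESIZE), touched = 0;
    unsigned long begin, end, addr;
    char line[512], perms[5];

    if (!maps)
        return -1;
    while (fgets(line, sizeof(line), maps)) {
        if (!parse_maps_line(line, &begin, &end, perms))
            continue;
        if (perms[0] != 'r' || perms[1] != 'w' || perms[3] != 'p')
            continue;
        for (addr = begin; addr < end; addr += (unsigned long)page) {
            volatile int *p = (volatile int *)addr;
            *p = *p;    /* forces a private copy of the page */
            touched++;
        }
    }
    fclose(maps);
    return touched;
}
```

A caller would typically run `prefault_writable_pages()` once, after `mlockall()` and before starting any real-time task.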
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Gilles Chanteperdrix @ 2007-08-07 12:49 UTC
To: Johan Borkhuis; +Cc: Xenomai-help
On 8/7/07, Johan Borkhuis <j.borkhuis@domain.hid> wrote:
> Thank you for this. However, it does not seem to work: when I execute
> this code it does not make any difference, whether executed before I
> start an RT-task or even within an RT-task. [...] Only the pages that
> can be written are touched, but the code pages that cause the problem
> (afaik) are marked readonly, so they are not touched.
> [...]
> Next I tried something different, more like a combination of my
> original approach and your approach: I tried to find a "return from
> subroutine" in each text-page, and if found call this location. It is
> extremely dirty, but it does seem to work quite well.
The code I sent only accesses writable pages, and apparently you also
need to access read-only pages.
Why not do something like:
else if(buffer[1] == 'r') {
    static volatile int dummy;
    dummy = *(volatile int *)begin;
}
--
Gilles Chanteperdrix
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Johan Borkhuis @ 2007-08-07 14:13 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai-help
Gilles Chanteperdrix wrote:
> The code I sent only accesses writable pages, and apparently you also
> need to access read-only pages.
>
> Why not do something like:
> else if(buffer[1] == 'r') {
>     static volatile int dummy;
>     dummy = *(volatile int *)begin;
> }
I tried this code first, but it did not make any difference. The problem
is (at least, that is what I expect) that this accesses the page as
data, as opposed to code, which is what happens when you execute
something in the page.
Kind regards,
Johan Borkhuis
* Re: [Xenomai-help] Unexpected switch to secondary mode
From: Philippe Gerum @ 2007-08-06 11:54 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai-help
On Mon, 2007-08-06 at 11:08 +0200, Gilles Chanteperdrix wrote:
> But I have another question: since the nocow patch is platform
> independent, why not integrating it in the I-pipe patch for power pc ?
It is merged into the latest patches against the powerpc/ tree.
--
Philippe.
[parent not found: <b647ffbd0708070758t22f01577wd3a5397a53249459@domain.hid>]
* [Xenomai-help] Unexpected switch to secondary mode
From: Dmitry Adamushko @ 2007-08-08 7:40 UTC
To: Xenomai help
[ forgot to add the list : ]
---------- Forwarded message ----------
On 06/08/07, Philippe Gerum <rpm@xenomai.org> wrote:
> > [ ... ]
> > But I have another question: since the nocow patch is platform
> > independent, why not integrating it in the I-pipe patch for power pc ?
>
> It is merged into the latest patches against the powerpc/ tree.
[ Mainly, just out of curiosity ]
I've actually tried 'google'ing for an overview of the MMU on ppc but
it doesn't seem to be easily available on the net.
Do I understand it right that in the PPC arch. a CPU doesn't have full
access to a task's page tables.. that said, if a TLB miss takes place
(or some analogous lookup mechanism), the CPU needs assistance from the
OS and e.g. raises an exception.
[*] A corresponding OS handler gets a 'fault address' and looks for it
in the task's 'page tables' .. and if found, updates the TLB accordingly.
This would be similar to what happens in MIPS :
TLB miss --> TLB-miss exception --> OS-dependent TLB-miss handler.
Now, Xenomai is not able (at the moment) to do [*] on its own.. thus,
there is a switch to the secondary mode so that Linux is able to take
care of it.
Unless there is a way to reserve a set of TLB entries for a real-time
task (or other mechanism to have 'virtual -> physical' conversion
entirely in the CPU -- I mean, not involving the OS) + the working set
of the rt task fits into this 'reserved' set --- 'TLB-miss' exception
gonna happen.. e.g. every time after the RT task relinquishes the CPU
and something else trashes the TLB tables.
e.g. on MIPS, the area used for kernel modules also requires
virtual->physical translation.. so even a kernel-mode task (and
actually, interrupt handlers inside the kernel modules) cause TLB-miss
exceptions. Sure, it's not the case if it's linked against the kernel
itself.
errr.. ok, too many words :-) does it sound like something taking place
here? A link to the MMU overview for ppc would be highly appreciated
as well.
TIA,
--
Best regards,
Dmitry Adamushko
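The miss, exception, refill sequence described in the message above can be illustrated with a toy model. This is a simulation written for this purpose only, not a description of any real PowerPC or MIPS MMU: the structure names, the 4-entry TLB, and the round-robin replacement policy are all assumptions. It shows how a working set larger than the TLB leads to repeated software refills once other accesses have evicted an entry.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define TLB_ENTRIES 4   /* deliberately tiny */
#define NUM_PAGES   16  /* size of the toy page table */

struct tlb_entry { int valid; uint32_t vpn, pfn; };

struct toy_mmu {
    struct tlb_entry tlb[TLB_ENTRIES];
    uint32_t page_table[NUM_PAGES]; /* vpn -> pfn, maintained by the "OS" */
    unsigned next_victim;           /* round-robin replacement index */
    unsigned long misses;           /* number of software refills */
};

/* The "OS" miss handler: look the page up and install it in the TLB. */
static struct tlb_entry *tlb_refill(struct toy_mmu *m, uint32_t vpn)
{
    struct tlb_entry *e = &m->tlb[m->next_victim];

    m->next_victim = (m->next_victim + 1) % TLB_ENTRIES;
    e->valid = 1;
    e->vpn = vpn;
    e->pfn = m->page_table[vpn];
    m->misses++;
    return e;
}

/* Translate vaddr, which must lie within the first NUM_PAGES pages. */
uint32_t translate(struct toy_mmu *m, uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t off = vaddr & ((1u << PAGE_SHIFT) - 1);
    struct tlb_entry *e = NULL;
    unsigned i;

    for (i = 0; i < TLB_ENTRIES; i++) {
        if (m->tlb[i].valid && m->tlb[i].vpn == vpn) {
            e = &m->tlb[i];     /* TLB hit: no OS involvement */
            break;
        }
    }
    if (!e)
        e = tlb_refill(m, vpn); /* TLB miss: the "exception" path */
    return (e->pfn << PAGE_SHIFT) | off;
}
```

In this model the cost of a miss is just a counter increment; on real hardware it is the exception path that, per the discussion above, Xenomai could not handle in primary mode at the time.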
* Re: [Xenomai-help] Unexpected switch to secondary mode 2007-08-08 7:40 ` Dmitry Adamushko @ 2007-08-08 8:00 ` Heikki Lindholm 2007-08-08 8:14 ` Wolfgang Grandegger 1 sibling, 0 replies; 20+ messages in thread From: Heikki Lindholm @ 2007-08-08 8:00 UTC (permalink / raw) To: Dmitry Adamushko; +Cc: Xenomai help Dmitry Adamushko kirjoitti: > [ forgot to add the list : ] > > ---------- Forwarded message ---------- > > On 06/08/07, Philippe Gerum <rpm@xenomai.org> wrote: > >>>[ ... ] >>>But I have another question: since the nocow patch is platform >>>independent, why not integrating it in the I-pipe patch for power pc ? >>> >> >>It is merged into the latest patches against the powerpc/ tree. > > > [ Mainly, just out of curiosity ] > > I've actually tried 'google'ing for an overview of the MMU on ppc but > it doesn't seem to be easily > available on the net. > > Do I understand it right that in the PPC arch. a CPU doesn't have full > access to task's page tables.. that said, if a TLB miss takes place > (or some analogue lookup mechanism), the CPU needs assistance from the > OS and e.g. rises an exception. > > [*] A corresponding OS handler gets a 'fault address' and looks for it > in the task's 'page tables' .. and if found, updates TLB accordingly. > > This would be similar to what happens in MIPS : > TLB miss --> TLB-miss exception --> OS-dependent TLB-miss handler. > > Now, Xenomai is not able (at the moment) to do [*] on its own.. thus, > there is a switch to the secondary mode so that Linux is able to take > care of it. > > Unless there is a way to reserve a set of TLB entries for a real-time > task (or other mechanism to have 'virtual -> physical' conversion > entirely in the CPU -- I mean, not involving the OS) + the working set > of the rt task fits into this 'reserved' set --- 'TLB-miss' exception > gonna happen.. e.g. every time after the RT relinquish a CPU and > something else trashes the TLB tables. > > e.g. 
> on MIPS, the area used for kernel modules also requires
> virtual->physical translation.. so even a kernel-mode task (and
> actually, interrupt handlers inside the kernel modules) cause TLB-miss
> exceptions. Sure, it's not a case if it's linked against the kernel
> itself.
>
> errr.. ok, to many words :-) does it sound like smth taking place
> here? A link to the MMU overview for ppc would be highly appreciated
> as well.

See the PowerPC architecture book at:
http://www.ibm.com/developerworks/eserver/articles/archguide.html
It gives the generics. Also see the Book-E (~embedded CPUs) architecture; those parts have different MMUs:
http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600682CC7
And then there are the model-specific manuals (405, 440, 5xx, etc.), which should be available at the IBM/Freescale/AMCC sites.

-- Heikki Lindholm
* Re: [Xenomai-help] Unexpected switch to secondary mode
2007-08-08 7:40 ` Dmitry Adamushko
2007-08-08 8:00 ` Heikki Lindholm
@ 2007-08-08 8:14 ` Wolfgang Grandegger
2007-08-08 9:12 ` Dmitry Adamushko
1 sibling, 1 reply; 20+ messages in thread
From: Wolfgang Grandegger @ 2007-08-08 8:14 UTC
To: Dmitry Adamushko; +Cc: Xenomai help

Dmitry Adamushko wrote:
> [ forgot to add the list : ]
>
> ---------- Forwarded message ----------
>
> On 06/08/07, Philippe Gerum <rpm@xenomai.org> wrote:
>>> [ ... ]
>>> But I have another question: since the nocow patch is platform
>>> independent, why not integrating it in the I-pipe patch for power pc ?
>>>
>> It is merged into the latest patches against the powerpc/ tree.
>
> [ Mainly, just out of curiosity ]
>
> I've actually tried 'google'ing for an overview of the MMU on ppc but
> it doesn't seem to be easily available on the net.

Unfortunately, there is no common MMU implementation for the PowerPC processors. It is usually described in the User Manual for the IBM, Freescale or AMCC processor, e.g.:

http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC5200&fpsp=1&tab=Documentation_Tab

>
> Do I understand it right that in the PPC arch. a CPU doesn't have full
> access to task's page tables.. that said, if a TLB miss takes place
> (or some analogue lookup mechanism), the CPU needs assistance from the
> OS and e.g. rises an exception.

Yes.

> [*] A corresponding OS handler gets a 'fault address' and looks for it
> in the task's 'page tables' .. and if found, updates TLB accordingly.

Yes. High-end PowerPC processors have better hardware support for virtual-to-physical address translation than low-end processors. The latter usually have just 64 TLB entries or even fewer (like the MPC 8xx).

> This would be similar to what happens in MIPS :
> TLB miss --> TLB-miss exception --> OS-dependent TLB-miss handler.

Yes.

> Now, Xenomai is not able (at the moment) to do [*] on its own..
> thus, there is a switch to the secondary mode so that Linux is able to
> take care of it.

But normal TLB misses happen frequently and do not force a switch to secondary mode. I think the problem is with the do_page_fault trap. Can somebody confirm that?

> Unless there is a way to reserve a set of TLB entries for a real-time
> task (or other mechanism to have 'virtual -> physical' conversion
> entirely in the CPU -- I mean, not involving the OS) + the working set
> of the rt task fits into this 'reserved' set --- 'TLB-miss' exception
> gonna happen.. e.g. every time after the RT relinquish a CPU and
> something else trashes the TLB tables.
>
> e.g. on MIPS, the area used for kernel modules also requires
> virtual->physical translation.. so even a kernel-mode task (and
> actually, interrupt handlers inside the kernel modules) cause TLB-miss
> exceptions. Sure, it's not a case if it's linked against the kernel
> itself.

Even kernel code may cause a TLB miss. TLB pinning has been abandoned on some PowerPC archs because it reduces overall system performance.

> errr.. ok, to many words :-) does it sound like smth taking place
> here? A link to the MMU overview for ppc would be highly appreciated
> as well.

See above. Unfortunately, I only have a limited, global view of the MMU implementation for PowerPC.

Wolfgang.
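A minimal sketch of the distinction drawn above between an ordinary TLB miss and the do_page_fault trap, written as a self-contained C toy model rather than real PowerPC or Xenomai code. The names `translate`, `struct mmu`, the `XLATE_*` results and the table sizes are all made up for illustration. The only behaviour taken from the thread is that a miss with a valid page-table entry is a cheap refill, while a miss without one falls through to the page-fault path, which is the case suspected of forcing the switch to secondary mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not PowerPC or Xenomai code. */
#define PAGE_SHIFT  12u
#define TLB_ENTRIES 4u   /* deliberately tiny; the thread mentions 64 or fewer */
#define PT_ENTRIES  16u  /* size of the toy single-level page table */

enum xlate_result { XLATE_TLB_HIT, XLATE_TLB_REFILL, XLATE_PAGE_FAULT };

struct pte       { int present; uint32_t pfn; };
struct tlb_entry { int valid; uint32_t vpn; uint32_t pfn; };

struct mmu {
    struct tlb_entry tlb[TLB_ENTRIES];
    struct pte       page_table[PT_ENTRIES];
    unsigned         next_victim;  /* round-robin TLB replacement */
};

/* Translate vaddr and report which of the three paths was taken. */
enum xlate_result translate(struct mmu *m, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t off = vaddr & ((1u << PAGE_SHIFT) - 1u);
    size_t i;

    assert(m != NULL && paddr != NULL);

    /* 1. Lookup in the TLB: no software is involved on a hit. */
    for (i = 0; i < TLB_ENTRIES; i++) {
        if (m->tlb[i].valid && m->tlb[i].vpn == vpn) {
            *paddr = (m->tlb[i].pfn << PAGE_SHIFT) | off;
            return XLATE_TLB_HIT;
        }
    }

    /* 2. TLB miss, fast path: a valid PTE exists, so just refill the TLB. */
    if (vpn < PT_ENTRIES && m->page_table[vpn].present) {
        struct tlb_entry *e = &m->tlb[m->next_victim];

        m->next_victim = (m->next_victim + 1u) % TLB_ENTRIES;
        e->valid = 1;
        e->vpn = vpn;
        e->pfn = m->page_table[vpn].pfn;
        *paddr = (e->pfn << PAGE_SHIFT) | off;
        return XLATE_TLB_REFILL;
    }

    /* 3. No valid PTE: slow path, the analogue of the do_page_fault trap. */
    return XLATE_PAGE_FAULT;
}
```

In this model only the third outcome needs the full memory-management code of the host OS; the first two can be resolved from data that is already in place.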
* Re: [Xenomai-help] Unexpected switch to secondary mode
2007-08-08 8:14 ` Wolfgang Grandegger
@ 2007-08-08 9:12 ` Dmitry Adamushko
2007-08-08 10:13 ` Wolfgang Grandegger
0 siblings, 1 reply; 20+ messages in thread
From: Dmitry Adamushko @ 2007-08-08 9:12 UTC
To: Wolfgang Grandegger; +Cc: Xenomai help

Heikki and Wolfgang, thanks for the links.

> > Now, Xenomai is not able (at the moment) to do [*] on its own.. thus,
> > there is a switch to the secondary mode so that Linux is able to take
> > care of it.
>
> But normal TLB misses happen frequently and do not force a switch to
> secondary mode. I think the problem is with the do_page_fault trap.

This would make sense indeed. I presume there must be some sort of inter-domain synchronization in place so that the fast path of a TLB-miss handler can safely access the 'page tables' while running in primary mode (chores: so go and read the code.. ah well :-)

Then the remaining part is do_page_fault(), and one of the reasons can indeed be 'cow'.

As a side note (not PPC related so one may skip this part),

on MIPS do_page_fault() may be caused by accessing a valid address allocated by a kernel module (say, vmalloc()). Say,

- task_1 calls ioctl() of some driver which does obj = vmalloc(). As a result, task_1 :: page_tables are updated accordingly + the so-called 'master_table' gets updated;

- task_2 calls ioctl() to the same driver which, in turn, accesses 'obj'. Now, if there is no TLB record in place at the moment --> TLB miss --> TLB-miss handler looks at task_2 :: page_tables but there is no record! --> do_page_fault() --> verifies the 'fault address' against the 'master_table' and, if it's ok, copies a record from 'master_table' to task_2 :: page_tables --> update TLB, etc.

Although it's kind of a 'cow' mechanism. Given that do_page_fault() is not handled in primary mode, to avoid it, I guess, one would need to sync RT_task :: page_tables immediately upon any vmalloc() in the system.
Sure, in theory 'master_table' could always have been used for translating the kernel-space vmalloc()'ed areas.. but that somewhat complicates the TLB-miss handler: size, speed, synchronization, I guess (humm.. interesting why it's not actually done like that.. maybe something else).

> > e.g. on MIPS, the area used for kernel modules also requires
> > virtual->physical translation.. so even a kernel-mode task (and
> > actually, interrupt handlers inside the kernel modules) cause TLB-miss
> > exceptions. Sure, it's not a case if it's linked against the kernel
> > itself.
>
> Even kernel code may cause a TLB miss. TLB pinning has been abandoned on
> some PowerPC archs because it does reduce overall system performance.

Humm.. interesting. AFAIK, on MIPS [ 0x80000000, 0xa0000000 ) and [ 0xa0000000, 0xc0000000 ) are just directly mapped onto physical memory, in cached and non-cached mode respectively. This avoids the need for the TLB.

>
> See above. Unfortunately, I do only a limited, global view of the MMU
> implementation for PowerPC.

Thanks indeed!

>
> Wolfgang.
>

--
Best regards,
Dmitry Adamushko
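The lazy 'master_table' scheme described in this message can be sketched as a small C toy model. This is an illustration under stated assumptions, not MIPS or Linux kernel code: `toy_vmalloc_map`, `toy_fault_fixup`, `toy_sync_all`, `struct mm` and the slot count are hypothetical names, and a "slot" stands in for one vmalloc-area page-table entry.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the scheme described above; all names are hypothetical. */
#define KSLOTS 8u  /* number of kernel (vmalloc-area) slots in the model */

struct kpte { int present; unsigned long pfn; };
struct mm   { struct kpte kslot[KSLOTS]; };

static struct mm master_table;  /* reference copy of the kernel mappings */

/* vmalloc()-like step: update master_table and the caller's tables only. */
void toy_vmalloc_map(struct mm *caller, unsigned slot, unsigned long pfn)
{
    assert(caller != NULL && slot < KSLOTS);
    master_table.kslot[slot].present = 1;
    master_table.kslot[slot].pfn = pfn;
    caller->kslot[slot] = master_table.kslot[slot];
}

/*
 * Fault path: if master_table knows the slot, copy the entry lazily.
 * Returns 1 when the fault was fixed up, 0 when the address is invalid.
 */
int toy_fault_fixup(struct mm *task, unsigned slot)
{
    assert(task != NULL);
    if (slot >= KSLOTS || !master_table.kslot[slot].present)
        return 0;
    task->kslot[slot] = master_table.kslot[slot];
    return 1;
}

/*
 * Eager alternative: copy every kernel slot that exists right now, so the
 * task does not take the lazy fault for mappings made before this call.
 */
void toy_sync_all(struct mm *task)
{
    unsigned i;

    assert(task != NULL);
    for (i = 0; i < KSLOTS; i++)
        task->kslot[i] = master_table.kslot[i];
}
```

The eager variant corresponds to the "sync immediately upon any vmalloc()" idea: it trades extra work at mapping time for the absence of a fault later, which is the property a real-time task would care about.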
* Re: [Xenomai-help] Unexpected switch to secondary mode
2007-08-08 9:12 ` Dmitry Adamushko
@ 2007-08-08 10:13 ` Wolfgang Grandegger
0 siblings, 0 replies; 20+ messages in thread
From: Wolfgang Grandegger @ 2007-08-08 10:13 UTC
To: Dmitry Adamushko; +Cc: Xenomai help

Dmitry Adamushko wrote:
> Heikki and Wolfgang, thanks for the links.
>
>>> Now, Xenomai is not able (at the moment) to do [*] on its own.. thus,
>>> there is a switch to the secondary mode so that Linux is able to take
>>> care of it.
>> But normal TLB misses happen frequently and do not force a switch to
>> secondary mode. I think the problem is with the do_page_fault trap.
>
> This would make sense indeed. I presume, there must be some sort of
> inter-domain synchronization in place to safely access the 'page
> tables' for the fast-path of a TLB-miss handler to run in the primary
> mode (chores: so go and read the code.. ah well :-)
>
> Then a remaining part is do_page_fault() and one of the reasons can be
> 'cow' indeed.
>
> As a side note (not PPC related so one may skip this part),
>
> on MIPS do_page_fault() may be caused by accessing a valid address
> allocated by a kernel module (say, vmalloc()). Say,
>
> - task_1 calls ioctl() of some driver which does obj = vmalloc() . As a result,
> task_1 :: page_tables are updated accordingly + so called
> 'master_table' gets updated;
>
> - task_2 calls ioctl() to the same driver which, in turn, accesses 'obj' .
> Now, if there is no TLB record in place at the moment --> TLB miss
> --> TLB-miss handler looks at task_2 :: page_tables but there is no
> record! --> do_page_fault() --> verifies a 'fault address' against the
> 'master_table' and if it's ok, copies a record from 'master_table' to
> task_2 :: page_tables --> update TLB, etc.
>
> although, it's kind of a 'cow' mech. Provided, do_page_fault() is not
> handled in the primary mode.. so to avoid it, I guess, one would need
> to immediately sync.
> RT_task :: page_tables upon any vmalloc() in the
> system. Sure, in theory, 'master_table' could have been always used
> for translation of the kernel space vmalloc()'ed areas.. but that
> somewhat complicates a TLB-miss handler : maybe size, speed,
> sychronization, I guess (humm.. interesting why it's not actually like
> that.. maybe smth else).
>
>>> e.g. on MIPS, the area used for kernel modules also requires
>>> virtual->physical translation.. so even a kernel-mode task (and
>>> actually, interrupt handlers inside the kernel modules) cause TLB-miss
>>> exceptions. Sure, it's not a case if it's linked against the kernel
>>> itself.
>> Even kernel code may cause a TLB miss. TLB pinning has been abandoned on
>> some PowerPC archs because it does reduce overall system performance.
>
> Humm.. interesting. AFAIK, on MIPS [ 0x80000000, 0xa0000000 ] and [
> 0xa0000000, 0xc0000000 ] are just directly mapped onto the physical
> memory, cached and non-cached mode respectively. This avoids a need
> for TLB.

This is done in a similar way for PowerPC CPUs with a "standard" MMU. The kernel address space is then directly mapped with a Block Address Translation (BAT) register. My statement applies to systems with just 64 TLB entries or fewer (4xx, 8xx), where pinning one or two of them has an impact on user-space performance.

>> See above. Unfortunately, I do only a limited, global view of the MMU
>> implementation for PowerPC.
>
> Thanks indeed!

Just found the following article comparing 6xx/7xx with 4xx:

http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256997006EB430/$file/4xx_6xx_an.pdf

It also highlights the difference between (Data/Instruction) TLB miss and Data (Load/Store) Translation miss.

Wolfgang.
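The direct mapping mentioned for MIPS can be sketched in a few lines of C. On 32-bit MIPS the two windows are usually called kseg0 (cached) and kseg1 (uncached), and the physical address is obtained by masking off the top three bits. The ranges are half-open, so 0xa0000000 belongs to the second window only. The helper names `classify_vaddr` and `unmapped_to_phys` are hypothetical; the PowerPC BAT mechanism is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit MIPS unmapped kernel segments (half-open ranges). */
#define KSEG0_BASE     0x80000000u  /* unmapped, cached */
#define KSEG1_BASE     0xa0000000u  /* unmapped, uncached */
#define KSEG2_BASE     0xc0000000u  /* mapped again: goes through the TLB */
#define KSEG_PHYS_MASK 0x1fffffffu  /* keeps the low 29 address bits */

enum seg_kind { SEG_MAPPED, SEG_KSEG0_CACHED, SEG_KSEG1_UNCACHED };

/* Tell which kind of segment a virtual address falls into. */
enum seg_kind classify_vaddr(uint32_t vaddr)
{
    if (vaddr >= KSEG0_BASE && vaddr < KSEG1_BASE)
        return SEG_KSEG0_CACHED;
    if (vaddr >= KSEG1_BASE && vaddr < KSEG2_BASE)
        return SEG_KSEG1_UNCACHED;
    return SEG_MAPPED;
}

/*
 * For kseg0/kseg1 addresses, compute the physical address without any
 * TLB lookup and return true. Return false for TLB-mapped addresses.
 */
bool unmapped_to_phys(uint32_t vaddr, uint32_t *paddr)
{
    assert(paddr != NULL);
    if (classify_vaddr(vaddr) == SEG_MAPPED)
        return false;
    *paddr = vaddr & KSEG_PHYS_MASK;
    return true;
}
```

Both windows alias the same low 512 MB of physical memory, which is why code linked into the kernel image never needs a TLB entry while vmalloc()'ed module memory, living above 0xc0000000, does.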
* Re: [Xenomai-help] Unexpected switch to secondary mode
2007-08-02 10:54 ` Johan Borkhuis
2007-08-02 11:23 ` Philippe Gerum
@ 2007-08-04 12:30 ` Wolfgang Grandegger
1 sibling, 0 replies; 20+ messages in thread
From: Wolfgang Grandegger @ 2007-08-04 12:30 UTC
To: Johan Borkhuis; +Cc: Xenomai-help

Johan Borkhuis wrote:
> Philippe Gerum wrote:
>> On Thu, 2007-08-02 at 11:47 +0200, Johan Borkhuis wrote:
>>
>>> I am experiencing an unexpected switch to secondary mode in a
>>> rt_timer_tsc2ns call from userspace.
>>>
>>> The following code give a switch:
>>> SRTIME timeStamp;
>>> timeStamp = rt_timer_tsc2ns(rt_timer_tsc());
>>> while((rt_timer_tsc2ns(rt_timer_tsc()) - timeStamp) < (usecs*1000)) {}
>>>
>>> while the following code does not:
>>> RTIME timeStamp;
>>> timeStamp = rt_timer_tsc();
>>> while((rt_timer_tsc() - timeStamp) < (usecs*1000)) {}
>>>
>>> (I know that the second example causes a longer timeout, but this was to
>>> show the testcase).
>>>
>>> When splitting up the 2nd line in the first example I see that the
>>> rt_timer_tsc() call does not cause a switch, but the rt_timer_tsc2ns
>>> does. What am I doing wrong here?
>>>
>>> I am using Xenomai-2.3.2.
>>>
>>>
>> - Which CPU architecture, and which Adeos patch release are you using?
>>
> Processor: ppc 85xx, patch 1.5-03, Xenomai 2.3.2
>
>> - How do you notice the switches, SIGXCPU or /proc/xenomai/stat?
>>
> SIGXCPU

And what does /proc/xenomai/faults show right before and after the SIGXCPU?

Wolfgang.
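One way to answer the question above is to take a text snapshot of /proc/xenomai/faults before and after the suspect code and compare the two. The sketch below uses plain C stdio only. `snapshot_file` and `snapshot_changed` are hypothetical helper names, and the file content is treated as opaque text because its exact layout is not shown in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read up to len-1 bytes of the file at path into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 if the file cannot be opened
 * or the buffer has no room at all.
 */
long snapshot_file(const char *path, char *buf, size_t len)
{
    FILE *f;
    size_t n;

    assert(path != NULL && buf != NULL);
    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, len - 1, f);
    buf[n] = '\0';
    fclose(f);
    return (long)n;
}

/* Returns 1 if the two snapshots differ, 0 if they are identical. */
int snapshot_changed(const char *before, const char *after)
{
    return strcmp(before, after) != 0;
}
```

A caller would fill one buffer with `snapshot_file("/proc/xenomai/faults", before, sizeof before)` ahead of the busy-wait loop, a second buffer after it, and print both when `snapshot_changed` reports a difference. Reading a /proc file is an ordinary Linux file access, so in a Xenomai task it is best done outside the time-critical section being examined.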
Thread overview: 20+ messages
2007-08-02 9:47 [Xenomai-help] Unexpected switch to secondary mode Johan Borkhuis
2007-08-02 10:12 ` Philippe Gerum
2007-08-02 10:54 ` Johan Borkhuis
2007-08-02 11:23 ` Philippe Gerum
2007-08-02 17:55 ` Gilles Chanteperdrix
2007-08-02 18:01 ` Philippe Gerum
2007-08-03 8:05 ` Johan Borkhuis
2007-08-05 17:22 ` Philippe Gerum
2007-08-06 9:08 ` Gilles Chanteperdrix
2007-08-06 11:41 ` Gilles Chanteperdrix
2007-08-07 11:06 ` Johan Borkhuis
2007-08-07 12:49 ` Gilles Chanteperdrix
2007-08-07 14:13 ` Johan Borkhuis
2007-08-06 11:54 ` Philippe Gerum
[not found] ` <b647ffbd0708070758t22f01577wd3a5397a53249459@domain.hid>
2007-08-08 7:40 ` Dmitry Adamushko
2007-08-08 8:00 ` Heikki Lindholm
2007-08-08 8:14 ` Wolfgang Grandegger
2007-08-08 9:12 ` Dmitry Adamushko
2007-08-08 10:13 ` Wolfgang Grandegger
2007-08-04 12:30 ` Wolfgang Grandegger