* IBM test question
@ 2008-02-07 13:49 Matthieu CASTET
  2008-02-07 15:34 ` Sébastien Dugué
  0 siblings, 1 reply; 6+ messages in thread
From: Matthieu CASTET @ 2008-02-07 13:49 UTC
  To: linux-rt-users

Hi,

I am trying to run some of the IBM RT tests on ARM.


I defined atomic_add as:
assert(i==1);
return ++(v->counter);

That's a bit ugly, but it should work for my needs.

But I have a problem with the sched_latency test.
On my platform thread creation is quite slow (25 ms), so with the
default values I get a PERIOD MISSED.

I wonder why the test accounts for the thread creation time instead of
computing the start time at the beginning of the thread?

Also, my CPU is quite slow (compared to the latest Intel Core or PowerPC). For
example, a sched_jitter run takes 6 s.
Couldn't there be some static or runtime configuration to tune the test
according to the CPU speed?


Matthieu


* Re: IBM test question
  2008-02-07 13:49 IBM test question Matthieu CASTET
@ 2008-02-07 15:34 ` Sébastien Dugué
  2008-02-07 16:27   ` Matthieu CASTET
  0 siblings, 1 reply; 6+ messages in thread
From: Sébastien Dugué @ 2008-02-07 15:34 UTC
  To: Matthieu CASTET; +Cc: linux-rt-users


  Hello Matthieu,

On Thu, 07 Feb 2008 14:49:07 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:

> hi,
> 
> I am trying to use some IBM rt test on arm.
> 
> 
> I define atomic_add to
> assert(i==1);
> return ++(v->counter);
> 
> That's a bit ugly, but that should work for my need.

  That would be a poor man's atomic_inc(), and I'm not sure it really does
what you think it does ;). A plain ++ is a separate load, add and store, so it
is not atomic. Just for the record, pre-ARMv6 cores have no support
for userland atomic operations (aside from swapping).
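
To make the difference concrete, here is a minimal sketch (it is not the IBM/LTP test code). It assumes a GCC- or Clang-style compiler that provides the `__atomic` builtins; the `atomic_t` layout and both function names are made up for illustration. On a single thread the two versions return the same value; they only differ when another thread can run between the load and the store of the plain `++`.

```c
#include <assert.h>

/* Assumed layout, modelled on the v->counter access quoted above. */
typedef struct { volatile int counter; } atomic_t;

/* The definition from the original mail: a plain read-modify-write.
 * Two threads can both load the same old value, so one update may be lost. */
int poor_mans_atomic_add(int i, atomic_t *v)
{
	assert(i == 1);
	return ++(v->counter);
}

/* Atomic increment via the compiler builtin; returns the new value. */
int atomic_inc_return(atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, 1, __ATOMIC_SEQ_CST);
}
```

Called from a single thread on a counter that starts at 0, the first function returns 1 and the second then returns 2.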

> 
> But I have a problem with the sched_latency test.
> On my platform the thread creation is quite slow (25ms), so with the 
> default value, I got a PERIOD MISSED.

  The IBM RT tests have been integrated into the LTP, and I recently
sent some updates to those testcases. Notably, one of the patches improved
the thread start-up time. Other patches touched this particular test too.

  Could you try the latest release (from LTP) and tell me whether things
have improved for you?

  Also, the PASS/FAIL criteria are quite arbitrary. They happen to be fine
for most recent PC-class hardware but surely not for embedded systems and
should be tuned according to your RT requirements.

> 
> I wonder why the test account thread creation time and not compute start 
> at the beginning of the thread ?

  Yes, maybe this should be fixed.
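
The effect of where the start timestamp is taken can be sketched as follows. This is a simplified model, not the actual sched_latency code: the type `nsec_t`, the function names and the 5 ms period used in the example are assumptions, and in a real test the timestamps would come from something like `clock_gettime(CLOCK_MONOTONIC, ...)`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t nsec_t;            /* timestamps in nanoseconds (assumed) */

#define NS_PER_MS 1000000ULL

/* Deadline of the first period, counted from whichever timestamp is
 * used as the start of the run. */
nsec_t first_deadline(nsec_t start, nsec_t period)
{
	return start + period;
}

/* Returns 1 when the thread first runs after its first deadline. */
int first_period_missed(nsec_t start, nsec_t period, nsec_t first_run)
{
	assert(first_run >= start);     /* monotonic clock assumed */
	return first_run > first_deadline(start, period);
}
```

With a 25 ms thread creation time and a 5 ms period, taking `start` before the thread is created reports a missed period, while taking it at the first instruction of the thread does not.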

> 
> Also my cpu is quite slow (compared to last intel core or powerpc). For 
> example a sched_jitter run take 6s.

  Ouch! What's your CPU (core type, clock speed)?

> Couldn't be some static or runtime configuration to configure the test 
> according to the cpu speed ?
> 

  Well, that's not the goal here. The objective is to tune the criteria
according to what kind of latencies your RT application can tolerate, not
the other way around.
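
One way to keep the criterion tied to the application rather than to the CPU is to pass it in as a parameter. The sketch below is only an illustration under that assumption: the function names, the microsecond unit and the default of 20 are hypothetical and are not taken from the LTP sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_MAX_LATENCY_US 20L  /* hypothetical default criterion */

/* Parse a pass/fail criterion given as a decimal string (for example a
 * command-line argument). Falls back to the default on missing or bad input. */
long pass_criterion_us(const char *arg)
{
	char *end;
	long val;

	if (arg == NULL || *arg == '\0')
		return DEFAULT_MAX_LATENCY_US;

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno != 0 || *end != '\0' || val <= 0)
		return DEFAULT_MAX_LATENCY_US;
	return val;
}

/* PASS when the worst measured latency is within the criterion. */
int latency_ok(long max_measured_us, long criterion_us)
{
	assert(criterion_us > 0);
	return max_measured_us <= criterion_us;
}
```

An application that tolerates 500 us would then run the test with a criterion of 500, whatever the speed of the board.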

  Hope this helps,

  Sebastien.


* Re: IBM test question
  2008-02-07 15:34 ` Sébastien Dugué
@ 2008-02-07 16:27   ` Matthieu CASTET
  2008-02-08  9:06     ` Sébastien Dugué
  0 siblings, 1 reply; 6+ messages in thread
From: Matthieu CASTET @ 2008-02-07 16:27 UTC
  To: Sébastien Dugué; +Cc: linux-rt-users

Hi Sébastien,

Sébastien Dugué wrote:
>   Hello Matthieu,
> 
> On Thu, 07 Feb 2008 14:49:07 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:
> 
>> hi,
>>
>> I am trying to use some IBM rt test on arm.
>>
>>
>> I define atomic_add to
>> assert(i==1);
>> return ++(v->counter);
>>
>> That's a bit ugly, but that should work for my need.
> 
>   That would be the poor man's atomic_inc() and not sure it really does
> what you think it does ;). Just for the record, pre-armv6 cores have no support
> for userland atomic operations (aside from swapping).
I can, if I use a kernel helper :) [1]

BTW, what should atomic_add return?
On i386 it does the atomic add and returns the value that was in memory
before the add (Exchange and Add).
On PowerPC, it seems to do the atomic add and return the new value.
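
The two conventions can be put side by side in a small sketch. It assumes the GCC/Clang `__atomic` builtins; the `atomic_t` layout and the function names are chosen here for illustration and do not come from the test suite.

```c
#include <assert.h>

typedef struct { int counter; } atomic_t;   /* assumed layout */

/* "Exchange and add" convention: returns the value before the add. */
int atomic_add_return_old(int i, atomic_t *v)
{
	return __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* "Add and return" convention: returns the value after the add. */
int atomic_add_return_new(int i, atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* The two results differ by exactly i, so either can be derived
 * from the other by the caller. */
void demo_conventions(void)
{
	atomic_t v = { 5 };

	assert(atomic_add_return_old(2, &v) == 5);   /* counter is now 7 */
	assert(atomic_add_return_new(2, &v) == 9);   /* counter is now 9 */
}
```

Code that is ported between architectures has to settle on one of the two, since a caller that compares the return value against an expected count will be off by `i` otherwise.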


> 
>> But I have a problem with the sched_latency test.
>> On my platform the thread creation is quite slow (25ms), so with the 
>> default value, I got a PERIOD MISSED.
> 
>   The IBM RT tests have been integrated into the LTP and I recently
> sent some updates to those testcases. Notably one the patches did improve
> the thread starting time. Other patches did touch this particular test too.
> 
>   Could you try the latest release (from LTP) and tell me if things
> have improved for you.
Ok I will try them.
> 
>   Also, the PASS/FAIL criteria are quite arbitrary. They happen to be fine
> for most recent PC-class hardware but surely not for embedded systems and
> should be tuned according to your RT requirements.
Yes I saw that.


> 
>> Also my cpu is quite slow (compared to last intel core or powerpc). For 
>> example a sched_jitter run take 6s.
> 
>   Ouch! What's your CPU (core type, clock speed)?
ARM926, ~104.65 MHz

Thanks,

Matthieu

[1]
#define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \
   ({ register __typeof (oldval) a_oldval asm ("r0"); \
      register __typeof (oldval) a_newval asm ("r1") = (newval); \
      register __typeof (mem) a_ptr asm ("r2") = (mem); \
      register __typeof (oldval) a_tmp asm ("r3"); \
      register __typeof (oldval) a_oldval2 asm ("r4") = (oldval); \
      __asm__ __volatile__ \
              ("0:\tldr\t%[tmp],[%[ptr]]\n\t" \
               "cmp\t%[tmp], %[old2]\n\t" \
               "bne\t1f\n\t" \
               "mov\t%[old], %[old2]\n\t" \
               "mov\t%[tmp], #0xffff0fff\n\t" \
               "mov\tlr, pc\n\t" \
               "add\tpc, %[tmp], #(0xffff0fc0 - 0xffff0fff)\n\t" \
               "bcc\t0b\n\t" \
               "mov\t%[tmp], %[old2]\n\t" \
               "1:" \
               : [old] "=&r" (a_oldval), [tmp] "=&r" (a_tmp) \
               : [new] "r" (a_newval), [ptr] "r" (a_ptr), \
                 [old2] "r" (a_oldval2) \
               : "ip", "lr", "cc", "memory"); \
      a_tmp; })
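	/* Body of atomic_add(i, v): retry until the compare-and-exchange succeeds. */
	int oldval, ret;	/* declared outside the loop so the while condition can see them */
	do {
		oldval = v->counter;
		ret = __arch_compare_and_exchange_val_32_acq(&v->counter, oldval+i, oldval);
	} while (ret != oldval);
	return oldval;	/* value before the add */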


* Re: IBM test question
  2008-02-07 16:27   ` Matthieu CASTET
@ 2008-02-08  9:06     ` Sébastien Dugué
  2008-02-12 10:19       ` Esben Nielsen
  0 siblings, 1 reply; 6+ messages in thread
From: Sébastien Dugué @ 2008-02-08  9:06 UTC
  To: Matthieu CASTET; +Cc: linux-rt-users

On Thu, 07 Feb 2008 17:27:53 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:

> Hi Sébastien,
> 
> Sébastien Dugué wrote:
> >   Hello Matthieu,
> > 
> > On Thu, 07 Feb 2008 14:49:07 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:
> > 
> >> hi,
> >>
> >> I am trying to use some IBM rt test on arm.
> >>
> >>
> >> I define atomic_add to
> >> assert(i==1);
> >> return ++(v->counter);
> >>
> >> That's a bit ugly, but that should work for my need.
> > 
> >   That would be the poor man's atomic_inc() and not sure it really does
> > what you think it does ;). Just for the record, pre-armv6 cores have no support
> > for userland atomic operations (aside from swapping).
> I can, if I use a kernel helper :) [1]

  Yep, but much slower.

> 
> BTW what should do the atomic_add.
> On i386 it does the atomic add and return the value in memory before the 
> add (Exchange and Add).

  Looking at the kernel and glibc, i386's atomic_add seems to be a void
function (unless I missed something).

> On powerpc, it seems to do the atomic add and return the new value.

  Yes, both for kernel and glibc implementations.

> 
> 
> > 
> >> But I have a problem with the sched_latency test.
> >> On my platform the thread creation is quite slow (25ms), so with the 
> >> default value, I got a PERIOD MISSED.
> > 
> >   The IBM RT tests have been integrated into the LTP and I recently
> > sent some updates to those testcases. Notably one the patches did improve
> > the thread starting time. Other patches did touch this particular test too.
> > 
> >   Could you try the latest release (from LTP) and tell me if things
> > have improved for you.
> Ok I will try them.
> > 
> >   Also, the PASS/FAIL criteria are quite arbitrary. They happen to be fine
> > for most recent PC-class hardware but surely not for embedded systems and
> > should be tuned according to your RT requirements.
> Yes I saw that.
> 
> 
> > 
> >> Also my cpu is quite slow (compared to last intel core or powerpc). For 
> >> example a sched_jitter run take 6s.
> > 
> >   Ouch! What's your CPU (core type, clock speed)?
> Arm926 ~104.65 Mhz

  An ARMv5 core, then. You'll need the kernel helper to be truly atomic.

  Sebastien.


* Re: IBM test question
  2008-02-08  9:06     ` Sébastien Dugué
@ 2008-02-12 10:19       ` Esben Nielsen
  2008-02-12 10:57         ` Matthieu CASTET
  0 siblings, 1 reply; 6+ messages in thread
From: Esben Nielsen @ 2008-02-12 10:19 UTC
  To: Sébastien Dugué; +Cc: Matthieu CASTET, linux-rt-users

On Fri, 8 Feb 2008, Sébastien Dugué wrote:

> On Thu, 07 Feb 2008 17:27:53 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:
>
>> Hi Sébastien,
>>
>> Sébastien Dugué wrote:
>>>   Hello Matthieu,
>>>
>>> On Thu, 07 Feb 2008 14:49:07 +0100 Matthieu CASTET <matthieu.castet@parrot.com> wrote:
>>>
>>>> hi,
>>>>
>>>> I am trying to use some IBM rt test on arm.
>>>>
>>>>
>>>> I define atomic_add to
>>>> assert(i==1);
>>>> return ++(v->counter);
>>>>
>>>> That's a bit ugly, but that should work for my need.
>>>
>>>   That would be the poor man's atomic_inc() and not sure it really does
>>> what you think it does ;). Just for the record, pre-armv6 cores have no support
>>> for userland atomic operations (aside from swapping).
>> I can, if I use a kernel helper :) [1]
>
>  Yep, but much slower.
>

I worked with an ARMv4 at my former job and wanted to run Linux on
it, so I gave this problem some thought. I came up with the following idea:
make a user-space preempt-disable counter, just like the in-kernel one. This
can be done by registering, per thread, a user-space address that tells the
kernel where to find the counter. When the kernel wants to schedule, it checks
whether the counter is non-zero. If it is (the very rare case), it doesn't
reschedule but sets up a timer with some configurable timeout (say 1 ms, or
whatever you need). If the counter is not back to 0 when the timer
expires, the kernel schedules anyway and signals the thread to let it know
that an atomic operation has failed. Note that this can only happen because of
an error in the program: you must always be able to finish your atomic
operations within 1 ms.

(There are a lot of details to this, of course. For instance, when
the kernel wanted to schedule and set up the timer, the user-space program
needs to know about it so that it can disable the timer and reschedule as soon
as the counter reaches 0. And there is the problem of keeping the page
where the counter is stored from being swapped out...)
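
The idea can be modelled in a few lines of plain C. Everything in this sketch is hypothetical: no such kernel interface is described in this thread beyond the outline above, the struct and function names are invented, the kernel side is only simulated by an ordinary function, and a uniprocessor is assumed (only the owning thread writes the counter).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread area whose address would be registered
 * with the kernel. */
struct upreempt {
	volatile int count;            /* non-zero: inside an atomic section */
	volatile bool resched_wanted;  /* set when a reschedule was deferred */
};

/* User side: enter an atomic section. */
void upreempt_disable(struct upreempt *p)
{
	p->count++;
}

/* User side: leave an atomic section. Returns true when a deferred
 * reschedule is pending and the thread should yield right away. */
bool upreempt_enable(struct upreempt *p)
{
	assert(p->count > 0);
	p->count--;
	if (p->count == 0 && p->resched_wanted) {
		p->resched_wanted = false;
		return true;
	}
	return false;
}

/* Simulated kernel decision at a preemption point: switch now, or defer.
 * A real implementation would also arm the grace timer when deferring. */
bool kernel_may_preempt(struct upreempt *p)
{
	if (p->count == 0)
		return true;
	p->resched_wanted = true;
	return false;
}
```

In this model the cost of an atomic section is two plain memory writes in the common case, and the timer path is only taken when a preemption request actually arrives inside a section.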

Esben






* Re: IBM test question
  2008-02-12 10:19       ` Esben Nielsen
@ 2008-02-12 10:57         ` Matthieu CASTET
  0 siblings, 0 replies; 6+ messages in thread
From: Matthieu CASTET @ 2008-02-12 10:57 UTC
  To: Esben Nielsen; +Cc: Sébastien Dugué, linux-rt-users

Hi,

Esben Nielsen wrote:
> 
>>>> for userland atomic operations (aside from swapping).
>>> I can, if I use a kernel helper :) [1]
>>
>>  Yep, but much slower.
>>
> 
> I worked with an ARMv4 at my former job and wanted to run Linux on it. I 
> thus gave this problem a thought. I got the following idea:
> Make a user space preemt-disable counter just like the in-kernel one. 
> This can be done by registering a address in userspace per thread 
> pointing where to find the counter. When the kernel wants to schedule it 
> checks if the counter is non-zero. If it is (the very rare case), it 
> doesn't reschedule but sets up a timer of some configurable time (say 1 
> ms or
> whatever you need). If the counter is not back to 0 after the timer has 
> expired we schedule anyway and signals the thread to let it know that an
> atomic operation have failed. Notice, that this can only happen due to an
> error in the program: You must always be able finish your atomic 
> operations in 1 ms.
> 
> (There are a lot of details to this, ofcourse.  Forinstance. in the case 
> the kernel wanted to schedule and sets up he timer, the user space 
> program needs to know it so it can disable the timer reschedule as soon 
> as the counter reaches 0. And there is the problem of not swapping out 
> the page where the counter is stored....)
> 

The kernel helper is not slow on ARMv5. Thanks to some magic there is no
userspace->kernel switch: it is just 15 instructions instead of one.
The kernel helper lives at a special address. When a context switch occurs,
the kernel checks whether the thread was in the helper and either finishes
the atomic operation or sets a flag.
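
From C, the helper at 0xffff0fc0 (the address used in the assembly in [1]) can be called through a function pointer, along the lines of the example in the kernel's ARM user-helper documentation. The sketch below assumes a 32-bit ARM Linux kernel that provides the helper; the names `kuser_cmpxchg` and `atomic_add_return` are chosen here, and the `#else` branch is only a stand-in, built on the compiler's `__sync` builtin, so that the code also compiles on other hosts.

```c
#include <assert.h>

#if defined(__arm__) && defined(__linux__)
/* Kernel-provided helper: returns zero if *ptr was changed to newval. */
typedef int (kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
#define kuser_cmpxchg (*(kuser_cmpxchg_t *)0xffff0fc0)
#else
/* Stand-in with the same return convention (an assumption, not the helper). */
static int kuser_cmpxchg(int oldval, int newval, volatile int *ptr)
{
	return !__sync_bool_compare_and_swap(ptr, oldval, newval);
}
#endif

/* Atomic add built on compare-and-exchange; returns the new value. */
int atomic_add_return(int i, volatile int *ptr)
{
	int old, newval;

	do {
		old = *ptr;
		newval = old + i;
	} while (kuser_cmpxchg(old, newval, ptr) != 0);

	return newval;
}

/* Usage: two increments on a counter that starts at 40. */
void demo_atomic_add(void)
{
	volatile int counter = 40;

	assert(atomic_add_return(1, &counter) == 41);
	assert(atomic_add_return(1, &counter) == 42);
}
```

This version returns the value after the add; returning `old` instead gives the exchange-and-add convention used by the loop in [1].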


Matthieu
