* [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-25 6:37 UTC (permalink / raw)
To: parisc-linux
pa,
Any thoughts about how one should implement some type of lightweight
syscall for our glibc to use?
We already have the makings of a simple system for SET_THREAD_SELF (e.g.
setting cr27, the thread register, from userspace), and I want to extend
this to:
exchange_and_add (volatile uint32_t *mem, int val)
atomic_add (volatile uint32_t *mem, int val)
compare_and_swap (volatile long int *p, long int oldval, long int newval)
o---> libc
--> exchange_and_add
==> Params into kernel
==> disable interrupts on the current processor
==> take a semaphore to keep other cpu's out
==> do work
==> release semaphore
==> reenable i-bit
--> back into userspace and done.
This _must_ be very very fast, and appear atomic to userspace.
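For reference, the semantics these three operations need can be modelled in portable C. This is only a userspace sketch under stated assumptions: a single C11 atomic_flag spinlock (lws_lock, lws_enter and lws_exit are made-up names) stands in for the interrupt disabling and locking that would happen on the gateway page, and the nonzero-on-success return of compare_and_swap is an assumed convention.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel-side lock; the real path would
 * also disable interrupts on the current CPU before taking it. */
static atomic_flag lws_lock = ATOMIC_FLAG_INIT;

static void lws_enter(void)
{
	while (atomic_flag_test_and_set_explicit(&lws_lock, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void lws_exit(void)
{
	atomic_flag_clear_explicit(&lws_lock, memory_order_release);
}

/* Add val to *mem and return the value *mem held before the add. */
uint32_t exchange_and_add(volatile uint32_t *mem, int val)
{
	uint32_t old;

	lws_enter();
	old = *mem;
	*mem = old + (uint32_t)val;	/* unsigned add, wraps mod 2^32 */
	lws_exit();
	return old;
}

/* Add val to *mem; nothing is returned. */
void atomic_add(volatile uint32_t *mem, int val)
{
	(void)exchange_and_add(mem, val);
}

/* Store newval in *p only if *p == oldval; return nonzero on success
 * (assumed convention for this sketch). */
int compare_and_swap(volatile long int *p, long int oldval, long int newval)
{
	int swapped;

	lws_enter();
	swapped = (*p == oldval);
	if (swapped)
		*p = newval;
	lws_exit();
	return swapped;
}
```

The point of the model is only that each operation is a short read-modify-write inside one critical section, which is what makes it a candidate for a very short privileged path.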
Perhaps adding other pages instead of 0xE0 for set_thread_self might be
the simplest way to do this? The area after the linux gateway page
perhaps? Which seems to be the start of the next 4k page? Would this be
possible? e.g. branch to 0x1000 (not 0x100 which is the current syscall
branch).
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Matthew Wilcox @ 2003-07-25 11:37 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: parisc-linux
On Fri, Jul 25, 2003 at 02:37:40AM -0400, Carlos O'Donell wrote:
> Any thoughts about how one should implement some type of lightweight
> syscall for our glibc to use?
I have lots of thoughts ;-)
> We already have the makings of a simple system for SET_THREAD_SELF (e.g.
> setting cr27, the thread register, from userspace), and I want to extend
> this to:
>
> exchange_and_add (volatile uint32_t *mem, int val)
> atomic_add (volatile uint32_t *mem, int val)
> compare_and_swap (volatile long int *p, long int oldval, long int newval)
Sure. Sounds like a great idea.
> o---> libc
> --> exchange_and_add
> ==> Params into kernel
> ==> disable interrupts on the current processor
> ==> take a semaphore to keep other cpu's out
> ==> do work
> ==> release semaphore
> ==> reenable i-bit
> --> back into userspace and done.
>
> This _must_ be very very fast, and appear atomic to userspace.
I'd say a spinlock rather than a semaphore. And likely a special-cased
one too.
> Perhaps adding other pages instead of 0xE0 for set_thread_self might be
> the simplest way to do this? The area after the linux gateway page
> perhaps? Which seems to be the start of the next 4k page? Would this be
> possible? e.g. branch to 0x1000 (not 0x100 which is the current syscall
> branch).
I'd say we should keep doing stuff on our existing gateway page until we
exhaust it. We've got plenty of space -- 248 instruction slots left before
0xE0, and a lot of space left after the syscall handler.
On a related subject, fast gettimeofday is always a popular idea. I'm not
sure of all the ramifications of, for example, mapping a user-read-only,
system-writable data page after the gateway page (can't put the data
on the existing gateway page; a page that can do privilege promotion
isn't readable/writable). If we have only one CPU update the data on
that page, time shouldn't go backwards ... right?
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-26 17:48 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
> > Any thoughts about how one should implement some type of lightweight
> > syscall for our glibc to use?
>
> I have lots of thoughts ;-)
Spill the beans!
> > o---> libc
> > --> exchange_and_add
> > ==> Params into kernel
> > ==> disable interrupts on the current processor
> > ==> take a semaphore to keep other cpu's out
> > ==> do work
> > ==> release semaphore
> > ==> reenable i-bit
> > --> back into userspace and done.
> >
> > This _must_ be very very fast, and appear atomic to userspace.
>
> I'd say a spinlock rather than a semaphore. And likely a special-cased
> one too.
Okay, we talked about this: pick 4 bits from the incoming address and
hash them to select one of 16 spinlocks that keep the other CPUs out.
This, as you indicated, should scale to more than 4 CPUs and make
LaMont happy.
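A minimal sketch of the lock selection, assuming the 4 bits are simply bits 4..7 of the address. That choice, and the names LWS_NLOCKS and lws_lock_index, are placeholders for illustration; which bits to use is still open.

```c
#include <stdint.h>

#define LWS_NLOCKS 16	/* 4 bits of hash select one of 16 locks */

/* Map an address to a lock index in 0..15. Bits 4..7 are an assumption
 * of this sketch: the low bits of an aligned word are always zero and
 * would not spread addresses across the locks. */
static unsigned int lws_lock_index(const volatile void *addr)
{
	return (unsigned int)(((uintptr_t)addr >> 4) & (LWS_NLOCKS - 1));
}
```

Two operations on the same word always pick the same lock, which is all correctness needs; operations on unrelated words only contend when their addresses happen to hash to the same index.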
> I'd say we should keep doing stuff on our existing gateway page until we
> exhaust it. We've got plenty of space -- 248 instruction slots left before
> 0xE0, and a lot of space left after the syscall handler.
I'll see if I can fit _all_ the operations into jumps in that area. If I
can't, then I'll measure the performance of implementing just "one"
operation in the kernel and using that to do the rest atomically.
> On a related subject, fast gettimeofday is always a popular idea. I'm not
> sure of all the ramifications of, for example, mapping a user-read-only,
> system-writable data page after the gateway page (can't put the data
> on the existing gateway page; a page that can do privilege promotion
> isn't readable/writable). If we have only one CPU update the data on
> that page, time shouldn't go backwards ... right?
If you map the page and make CPU 0 update the data, then I'll write the
userspace interface for gettimeofday.
I'm not sure how we would do the check for 'do we see fast gettimeofday'
but it might be that we include a magic value there and check for it?
Other arches must have solved this.
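A sketch of what the magic-value check might look like. The page layout, the field names and the FAST_GTOD_MAGIC value are all invented for illustration; nothing like this exists in the kernel yet.

```c
#include <stdint.h>

#define FAST_GTOD_MAGIC 0x50415449u	/* hypothetical marker value */

/* Hypothetical layout of the user-read-only, kernel-written time page. */
struct fast_gtod_page {
	uint32_t magic;		/* FAST_GTOD_MAGIC if the kernel maintains the page */
	uint32_t tv_sec;
	uint32_t tv_usec;
};

/* Nonzero if the page pointer is set and carries the expected marker. */
static int fast_gtod_available(const volatile struct fast_gtod_page *page)
{
	return page != 0 && page->magic == FAST_GTOD_MAGIC;
}
```

This only works if the page is guaranteed to be mapped; on a kernel that does not map it at all, the read itself would fault, so userspace would need some other way to learn whether the mapping exists.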
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-26 18:00 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: parisc-linux
>
> If you map the page and make CPU 0 update the data, then I'll write the
> userspace interface for gettimeofday.
>
> I'm not sure how we would do the check for 'do we see fast gettimeofday'
> but it might be that we include a magic value there and check for it?
> Other arches must have solved this.
Talked to Rik Van Riel about fast gettimeofday and he indicated that
it's not doable since you can't guarantee your process won't get
scheduled on another CPU whose clock is out of sync by more than X and
get negative time. Though I imagine you were talking about having one
CPU update one page with time on it... and then other CPU's read this?
LaMont notes that there is no requirement from the PA design that CPU's
clock at _exactly_ the same frequency or have monotonically incrementing
clocks at the right rate. Could you explain the idea you have a bit
more?
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Grant Grundler @ 2003-07-27 12:27 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: Matthew Wilcox, parisc-linux
On Sat, Jul 26, 2003 at 02:00:32PM -0400, Carlos O'Donell wrote:
> Talked to Rik Van Riel about fast gettimeofday and he indicated that
> it's not doable since you can't guarantee your process won't get
> scheduled on another CPU whose clock is out of sync by more than X and
> get negative time.
Yes we can. We can sync CR16 across CPUs within a few CPU cycles.
I've described this before on parisc-linux.
"sync" means figuring out the difference between CR16 on each of the
CPUs, using CPU 0 as the reference.
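A rough sketch of that bookkeeping, assuming all CPUs run from one clock source so the per-CPU difference stays constant once measured. MAX_CPUS, cr16_offset and both function names are hypothetical, and how the two CR16 values get sampled at nearly the same instant is not shown.

```c
#include <stdint.h>

#define MAX_CPUS 4	/* hypothetical limit for this sketch */

/* cr16_offset[n] = CR16 on CPU 0 minus CR16 on CPU n, sampled at
 * (nearly) the same instant. Filling this table is the "sync" step. */
static uint32_t cr16_offset[MAX_CPUS];

static void cr16_record_offset(unsigned int cpu, uint32_t cr16_cpu0,
			       uint32_t cr16_cpun)
{
	cr16_offset[cpu] = cr16_cpu0 - cr16_cpun;	/* wraps mod 2^32 */
}

/* Convert a CR16 value read on `cpu` into CPU 0's timebase. */
static uint32_t cr16_normalize(unsigned int cpu, uint32_t cr16)
{
	return cr16 + cr16_offset[cpu];
}
```

Because everything is unsigned 32-bit arithmetic, the conversion still gives the right answer when one of the counters has wrapped and the other has not.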
> Though I imagine you were talking about having one
> CPU update one page with time on it... and then other CPU's read this?
> LaMont notes that there is no requirement from the PA design that CPU's
> clock at _exactly_ the same frequency or have monotonically incrementing
> clocks at the right rate.
Correct. IIRC 9000/870 machines have separate clock sources.
But all the boxes we support to date have exactly one clock source.
The multi-cell boxes (like superdome) will have multiple sources
and I don't know how to handle those - maybe a "not quite so fast"
gettimeofday().
hth,
grant
* Re: [parisc-linux] Generic light-weight syscall.
From: Michael S.Zick @ 2003-07-27 20:43 UTC (permalink / raw)
To: parisc-linux
On Saturday 26 July 2003 01:00 pm, Carlos O'Donell wrote:
> > If you map the page and make CPU 0 update the date, then I'll write the
> > userspace interface for gettimeofday.
> >
> > I'm not sure how we would do the check for 'do we see fast gettimeofday'
> > but it might be that we include a magic value there and check for it?
> > Other arches must have solved this.
>
> Talked to Rik Van Riel about fast gettimeofday and he indicated that
> it's not doable since you can't guarantee your process won't get
> scheduled on another CPU whose clock is out of sync by more than X and
> get negative time. Though I imagine you were talking about having one
> CPU update one page with time on it... and then other CPU's read this?
> LaMont notes that there is no requirement from the PA design that CPU's
> clock at _exactly_ the same frequency or have monotonically incrementing
> clocks at the right rate. Could you explain the idea you have a bit
> more?
Note: some multi-CPU hardware intentionally clocks the CPUs at
slightly different rates to limit RFI generation.
Not sure if any HP PA-RISC machines do such clocking.
Mike
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-28 15:57 UTC (permalink / raw)
To: Grant Grundler; +Cc: Matthew Wilcox, parisc-linux
> Yes we can. We can sync CR16 across CPUs within a few CPU cycles.
> I've described this before on parisc-linux.
It might be too costly to do the sync'ing all the time, and too costly
for a fast gettimeofday to do a sync at the polling point.
> But all the boxes we support to date have exactly one clock source.
> The multi-cell boxes (like superdome) will have multiple sources
> and I don't know how to handle those - maybe a "not quite so fast"
> gettimeofday().
The whole point behind fast gettimeofday is that userspace apps that
want to do timestamping at a _very_ fine granularity (e.g.
nanoseconds) can get monotonically incrementing numbers on each
gettimeofday. Do we even have such a fast clock on PA? What is the
fastest clock across the most boxes?
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Matthew Wilcox @ 2003-07-28 17:45 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: Grant Grundler, Matthew Wilcox, parisc-linux
On Mon, Jul 28, 2003 at 11:57:04AM -0400, Carlos O'Donell wrote:
> > Yes we can. We can sync CR16 across CPUs within a few CPU cycles.
> > I've described this before on parisc-linux.
>
> It might be too costly to do the sync'ing all the time, and too costly
> for a fast gettimeofday to do a sync at the polling point.
>
> > But all the boxes we support to date have exactly one clock source.
> > The multi-cell boxes (like superdome) will have multiple sources
> > and I don't know how to handle those - maybe a "not quite so fast"
> > gettimeofday().
>
>
> The whole point behind fast gettimeofday is that userspace apps that
> want to do timestamping on a _very_ accurate granularity (e.g.
> nanoseconds) can get monotonically incrementing numbers on each
> gettimeofday. Do we even have such a fast clock on PA? What is the
> fastest clock across the most boxes?
You know, you don't even need kernel help for this. According to page
2-5 of the Kane book, the Interval Timer is accessible by non-privileged
instructions.
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-28 19:04 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Grant Grundler, parisc-linux
> You know, you don't even need kernel help for this. According to page
> 2-5 of the Kane book, the Interval Timer is accessible by non-privileged
> instructions.
Isn't this going to be different for all CPU's? Which means if you get
scheduled around you might see negative moving time?
c.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [parisc-linux] Generic light-weight syscall.
2003-07-28 19:04 ` Carlos O'Donell
@ 2003-07-28 19:14 ` Matthew Wilcox
2003-07-28 21:10 ` Richard Hirst
0 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2003-07-28 19:14 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: Matthew Wilcox, Grant Grundler, parisc-linux
On Mon, Jul 28, 2003 at 03:04:55PM -0400, Carlos O'Donell wrote:
> > You know, you don't even need kernel help for this. According to page
> > 2-5 of the Kane book, the Interval Timer is accessible by non-privileged
> > instructions.
>
> Isn't this going to be different for all CPU's? Which means if you get
> scheduled around you might see negative moving time?
We discussed this in the Black Thorn ... record what value you last
returned to the user and never return less than that.
* Re: [parisc-linux] Generic light-weight syscall.
From: Randolph Chung @ 2003-07-28 20:30 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Carlos O'Donell, parisc-linux
> I'd say we should keep doing stuff on our existing gateway page until we
> exhaust it. We've got plenty of space -- 248 instruction slots left before
> 0xE0, and a lot of space left after the syscall handler.
>
> On a related subject, fast gettimeofday is always a popular idea. I'm not
Why not add a flag to syscall() which indicates whether this is a "fast"
syscall or a "slow" syscall, and based on this, decide whether to do all
the register spilling, etc. when entering the kernel? Then we can
implement the atomic ops as additional "syscalls"....
I would think that there is at least some amount of logic that needs to
be there every time you enter/exit the kernel, regardless of whether
you are doing a "fast syscall" (i.e. no need to save the processor
state, etc.) or a regular one... I would hope we don't need to have two
copies of that logic.
Carlos had some concerns that this means fast syscalls (or regular ones
perhaps) will always incur a mispredicted branch and/or extra stack
manipulations that may not be needed..... but I'm not yet convinced that
there is enough overhead to make this a problem. What do others think?
thanks,
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/
* Re: [parisc-linux] Generic light-weight syscall.
From: Matthew Wilcox @ 2003-07-28 20:37 UTC (permalink / raw)
To: Randolph Chung; +Cc: Matthew Wilcox, Carlos O'Donell, parisc-linux
On Mon, Jul 28, 2003 at 01:30:41PM -0700, Randolph Chung wrote:
> Why not add a flag to syscall() which indicates whether this is a "fast"
> syscall or a "slow" syscall, and based on this, decide whether to do all
> the register spilling, etc. when entering the kernel? Then we can
> implement the atomic ops as additional "syscalls"....
>
> I would think that there is at least some amount of logic that needs to
> be there every time you enter/exit the kernel, regardless of whether
> you are doing a "fast syscall" (i.e. no need to save the processor
> state, etc.) or a regular one... I would hope we don't need to have two
> copies of that logic.
>
> Carlos had some concerns that this means fast syscalls (or regular ones
> perhaps) will always incur a mispredicted branch and/or extra stack
> manipulations that may not be needed..... but I'm not yet convinced that
> there is enough overhead to make this a problem. What do others think?
I don't think enough syscalls are "fast" to make this worth doing.
Most of the things people are talking about for lightweight syscalls are
things that could/should be done in userspace ... except that you need to
be privileged to use them. So the optimum way to solve them is to have a
special calling convention and to do the absolute minimum amount of work.
Setting cr27 is a great example because we would do it in userspace if
the architecture let us.
Does anyone have a better definition for lightweight syscall? How about
"Cannot sleep, called frequently, must be fast"?
* Re: [parisc-linux] Generic light-weight syscall.
From: Richard Hirst @ 2003-07-28 21:10 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Carlos O'Donell, Grant Grundler, parisc-linux
On Mon, Jul 28, 2003 at 08:14:20PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 28, 2003 at 03:04:55PM -0400, Carlos O'Donell wrote:
> > > You know, you don't even need kernel help for this. According to page
> > > 2-5 of the Kane book, the Interval Timer is accessible by non-privileged
> > > instructions.
> >
> > Isn't this going to be different for all CPU's? Which means if you get
> > scheduled around you might see negative moving time?
>
> We discussed this in the Black Thorn ... record what value you last
> returned to the user and never return less than that.
Not very nice if the difference between CPUs is significant. You could
find yourself checking time on CPU A, doing some work, getting moved to
CPU B, checking time, and finding you apparently did all that in 0ns.
Richard
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-29 17:50 UTC (permalink / raw)
To: Richard Hirst; +Cc: Matthew Wilcox, Grant Grundler, parisc-linux
> > We discussed this in the Black Thorn ... record what value you last
> > returned to the user and never return less than that.
>
> Not very nice if the difference between CPUs is significant. You could
> find yourself checking time on CPU A, doing some work, getting moved to
> CPU B, checking time, and finding you apparently did all that in 0ns.
Willy and I talked about this: you just return 1ns in the case
where you know that _something_ must have taken _some_ amount of time
(e.g. the insns that make up fast gettimeofday) so you couldn't have done
it in zero time :)
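Combining "never return less than the last value" with the 1ns bump could look roughly like this. It is a single-threaded sketch; last_ns and monotonic_ns are made-up names, and a real version would need the last value kept per process or updated atomically.

```c
#include <stdint.h>

static uint64_t last_ns;	/* last timestamp handed back to the caller */

/* Return raw_ns if it moved forward, otherwise the previous value plus
 * 1ns, so successive calls are strictly increasing. */
static uint64_t monotonic_ns(uint64_t raw_ns)
{
	if (raw_ns <= last_ns)
		raw_ns = last_ns + 1;
	last_ns = raw_ns;
	return raw_ns;
}
```

This hides backwards jumps but does not make the interval meaningful: after a move to a CPU that is far behind, every call returns last+1 until the raw clock catches up.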
I also just noticed (thanks to jejb) that cr16 is readable from
userspace since we leave PSW-S cleared. So the following works:
unsigned long cr16;
/* mfctl copies control register 16 (the interval timer) into a GPR. */
asm volatile("mfctl %%cr16, %0" : "=r" (cr16));
printf("cr16=%lu\n", cr16);
---
carlos@firin:~$ ./test_cr16_read
cr16=3544734914
carlos@firin:~$ ./test_cr16_read
cr16=4230470557
carlos@firin:~$ ./test_cr16_read
cr16=2337868642
carlos@firin:~$
---
It wraps pretty fast though. Userspace could do the translation and
hold on to a 'last_tick' value and then return a really small value if
we get scheduled onto a negatively-drifted CPU? Do all CPU's update cr16?
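As long as fewer than 2^32 ticks pass between two reads, the wrap itself is harmless if the delta is computed in unsigned 32-bit arithmetic. A small sketch (cr16_delta is a made-up helper); assuming cr16 ticks at the CPU clock rate, a 650MHz part wraps roughly every 2^32 / 650e6 = 6.6 seconds, so the reads must be closer together than that.

```c
#include <stdint.h>

/* Ticks elapsed from `then` to `now`; correct across a single 32-bit
 * wrap because unsigned subtraction is done modulo 2^32. */
static uint32_t cr16_delta(uint32_t then, uint32_t now)
{
	return now - then;
}
```

Anything that has to span longer intervals needs extra state (for example a kernel-maintained count of wraps), which this helper does not attempt.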
Willy, I just realized that the following trail of function calls
doesn't really work under smp:
glibc ->
gettimeofday ->
syscall + syscall table lookup ->
sys32_gettimeofday ->
do_gettimeofday ->
gettimeoffset ( returns 0 in smp)
With the comment I don't quite understand:
linux-2.4/arch/parisc/kernel/time.c
/*
 * FIXME: This won't work on smp because jiffies are updated by cpu 0.
 * Once parisc-linux learns the cr16 difference between processors,
 * this could be made to work.
 */
Does that mean that our time granularity drops drastically in the SMP
case? We lose usec! (2.4 and 2.5 kernels)
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Grant Grundler @ 2003-07-29 18:51 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Carlos O'Donell, parisc-linux
On Mon, Jul 28, 2003 at 06:45:51PM +0100, Matthew Wilcox wrote:
...
> > It might be too costly to do the sync'ing all the time, and too costly
> > for a fast gettimeofday to do a sync at the polling point.
If they all use the same clock source they won't drift.
> > gettimeofday. Do we even have such a fast clock on PA? What is the
> > fastest clock across the most boxes?
cr16
> You know, you don't even need kernel help for this. According to page
> 2-5 of the Kane book, the Interval Timer is accessible by non-privileged
> instructions.
One needs kernel help in determining the difference between CPUs
and handling CR16 rollover (mostly a problem for 32-bit machines).
But then user space can read CR16 and "normalize" the time.
grant
* Re: [parisc-linux] Generic light-weight syscall.
From: Grant Grundler @ 2003-07-29 18:55 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Richard Hirst, Matthew Wilcox, Grant Grundler, parisc-linux
On Tue, Jul 29, 2003 at 01:50:58PM -0400, Carlos O'Donell wrote:
> unsigned long cr16;
> asm("mfctl %%cr16, %0" : "=r" (cr16) : );
> printf("cr16=%lu\n",cr16);
This only gets you the CR16 for *that* CPU.
Not useful since CR16 isn't synchronized at powerup between CPUs.
At least I don't think that's the case...maybe Kirk Bresniker knows
how that works. We've talked about this before on parisc-linux.
...
> With the comment I don't quite understand:
> linux-2.4/arch/parisc/kernel/time.c
> /*
>  * FIXME: This won't work on smp because jiffies are updated by cpu 0.
>  * Once parisc-linux learns the cr16 difference between processors,
>  * this could be made to work.
>  */
This is what I've been trying to explain.
> Does that mean that our time granularity drops drastically in the SMP
> case? We lose usec! (2.4 and 2.5 kernels)
Not by much.
grant
* Re: [parisc-linux] Generic light-weight syscall.
From: Richard Hirst @ 2003-07-29 21:06 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: Matthew Wilcox, Grant Grundler, parisc-linux
On Tue, Jul 29, 2003 at 01:50:58PM -0400, Carlos O'Donell wrote:
> > > We discussed this in the Black Thorn ... record what value you last
> > > returned to the user and never return less than that.
> >
> > Not very nice if the difference between CPUs is significant. You could
> > find yourself checking time on CPU A, doing some work, getting moved to
> > CPU B, checking time, and finding you apparently did all that in 0ns.
>
> Willy and myself talked about this, you just return 1ns in the case
> where you know that _something_ must have taken _some_ amount of time
> (e.g. the insns that make up fast gettimeofday) so you couldn't have done
> it in zero time :)
Depends what people want to use it for. I couldn't use it to time how
long some syscall took, for example. But if we zap the microseconds
part on smp anyway, that's irrelevant.
Richard
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-29 23:36 UTC (permalink / raw)
To: Richard Hirst; +Cc: Matthew Wilcox, Grant Grundler, parisc-linux
> Depends what people want to use it for. I couldn't use it to time how
> long some syscall took, for example. But if we zap the microseconds
> part on smp anyway, that's irrelevant.
Well, there was talk on IRC of exporting the CPU# through cr26.
- Read cr16
- Read cr26
- Read cr16
If you weren't rescheduled the delta should be a minimal number of
ticks (not taking into account nearness to unsigned overflow). You then
use this tick value to calculate a delta via a table of CPU specific
offsets. If the tick is far out from the last read, then you assume a
reschedule and loop. Perhaps terminating on the third try with a default
delta?
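The read/read/read loop could be sketched as below. The register reads are passed in as function pointers so the logic can be tried off-target; on PA they would be "mfctl %cr16" and "mfctl %cr26". MAX_DELTA, MAX_TRIES, struct tick_sample and the fake_* helpers are all invented for this sketch, and the threshold of 64 ticks is a guess.

```c
#include <stdint.h>

#define MAX_DELTA 64	/* hypothetical: a larger gap is treated as a reschedule */
#define MAX_TRIES 3

struct tick_sample {
	uint32_t tick;	/* second cr16 read */
	uint32_t cpu;	/* CPU# read between the two cr16 reads */
	int valid;	/* 0 if every attempt looked like a reschedule */
};

static struct tick_sample sample_tick(uint32_t (*read_cr16)(void),
				      uint32_t (*read_cpu)(void))
{
	struct tick_sample s = { 0, 0, 0 };
	int attempt;

	for (attempt = 0; attempt < MAX_TRIES; attempt++) {
		uint32_t a = read_cr16();
		uint32_t cpu = read_cpu();
		uint32_t b = read_cr16();

		if ((uint32_t)(b - a) <= MAX_DELTA) {	/* wrap-safe compare */
			s.tick = b;
			s.cpu = cpu;
			s.valid = 1;
			break;
		}
	}
	return s;	/* caller falls back to a default delta if !valid */
}

/* Fake registers: each cr16 read advances the counter by 12 ticks. */
static uint32_t fake_cr16, fake_cpu;
static uint32_t fake_read_cr16(void) { return fake_cr16 += 12; }
static uint32_t fake_read_cpu(void) { return fake_cpu; }
```

One known hole: a reschedule onto a CPU whose cr16 happens to sit within MAX_DELTA of the first read passes the check, so this is a heuristic rather than a guarantee.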
I mean the easiest way, as willy notes is to jump into the kernel with a
fast syscall, disable interrupts, get the CPU# and index into a table of
cpu vs. last known good tick.
However, since we _can_ read cr16 (as willy wrote in an email I totally
failed to read, sorry willy!) from userspace on most systems (not on
705's and 710's but they aren't SMP anyway and we can change the method
there) we are trying to make good use of that.
So there are a variety of ways:
1. Fast-syscall similar to set_thread_register, clears interrupt bit,
indexes into cpu# table to get last good known tick and returns it
to userspace (after cleaning up the mess).
2. Userspace does triple read and loop until it looks like we (in a
lockless fashion) have both the right CPU# and latest tick which we
can use to update our CPU/tick table.
3. Export a page with '(tick_val & mask) | CPU#' or 'tick_val xor CPU#'.
You then use this to determine the CPU and tick atomically in a
single read.
We know #1 works. We don't know if #2 is faster than #1 (or stable);
anyone wishing to comment, please do :) Number 3 would lose resolution,
either by lopping off bits or, in the xor case, by having nearby tick
values that overlap so that you can't recover the CPU# from the xor'd
quantity.
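To make the resolution loss in #3 concrete, here is a sketch of the mask-and-or variant, assuming 2 low bits are given up for the CPU#. CPU_BITS and the function names are placeholders.

```c
#include <stdint.h>

#define CPU_BITS 2			/* hypothetical: room for 4 CPUs */
#define CPU_MASK ((1u << CPU_BITS) - 1)

/* Kernel side: drop the low tick bits and store the CPU# in their place. */
static uint32_t pack_tick_cpu(uint32_t tick, uint32_t cpu)
{
	return (tick & ~CPU_MASK) | (cpu & CPU_MASK);
}

/* User side: a single 32-bit load yields both fields. */
static uint32_t packed_cpu(uint32_t word)
{
	return word & CPU_MASK;
}

static uint32_t packed_tick(uint32_t word)
{
	return word & ~CPU_MASK;
}
```

With 2 bits taken, the tick value is only good to 4 ticks; every extra bit of CPU# halves the resolution again.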
To add a datapoint to #2, on a PA8700 650MHz I see:
Anyone wishing to run this test on another box, please do...
---
#include <stdio.h>
#include <stdlib.h>	/* exit() */

#define LOOPS 1000

int main(void)
{
	double avg_diff = 0.0;
	unsigned long cr16a, cr16b, cr26;
	int i = LOOPS;

	while (i > 0) {
		/* Read cr16, cr26, cr16 back to back. Letting the compiler
		 * pick the registers avoids clobbering r24-r26 behind its back. */
		asm volatile("mfctl %%cr16, %0\n\t"
			     "mfctl %%cr26, %1\n\t"
			     "mfctl %%cr16, %2"
			     : "=r" (cr16a), "=r" (cr26), "=r" (cr16b));
		printf("cr26=%lu\ncr16a=%lu\ncr16b=%lu\ndiff=%lu\n",
		       cr26, cr16a, cr16b, cr16b - cr16a);
		/* avg_diff holds the running sum; it is divided at the end. */
		avg_diff += (double)(cr16b - cr16a);
		printf("avg_diff=%f\n", avg_diff);
		i--;
	}
	/* LOOPS is an int, so print it with %d rather than %lu. */
	printf("Average ticks per back/back cr16 read (%d loops) = %f\n",
	       LOOPS, avg_diff / (double)LOOPS);
	exit(0);
}
---
<snip>
cr26=4294967295
cr16a=3348684869
cr16b=3348684881
diff=12
avg_diff=11435.000000
Average ticks per back/back cr16 read (1000 loops) = 11.435000
carlos@firin:/mnt/flaire/src/linux-2.5/arch/parisc/kernel$
---
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Carlos O'Donell @ 2003-07-29 23:38 UTC (permalink / raw)
To: Richard Hirst; +Cc: Matthew Wilcox, Grant Grundler, parisc-linux
> Depends what people want to use it for. I couldn't use it to time how
> long some syscall took, for example. But if we zap the microseconds
> part on smp anyway, that's irrelevant.
useconds gets zapped in gettimeofday on SMP systems. I'm working on
trying to fix this :)
c.
* Re: [parisc-linux] Generic light-weight syscall.
From: Thibaut VARENE @ 2003-07-30 16:37 UTC (permalink / raw)
To: Carlos O'Donell
Cc: Richard Hirst, Matthew Wilcox, Grant Grundler, parisc-linux
On Tue, 29 Jul 2003 19:36:25 -0400
Carlos O'Donell <carlos@baldric.uwo.ca> wrote:
> To add a datapoint to #2, on a PA8700 650MHz I see:
> Anyone wishing to run this test on another box, please do...
A500-5X (PA8600@550MHz):
cr26=4294934527
cr16a=4175244475
cr16b=4175244488
diff=13
avg_diff=12993.000000
Average ticks per back/back cr16 read (1000 loops) = 12.993000
HTH,
Thibaut VARENE
The PA/Linux ESIEE Team
http://pateam.esiee.fr/