* pte bit spin lock
@ 2004-11-19 6:56 Nick Piggin
2004-11-19 18:46 ` Luck, Tony
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Nick Piggin @ 2004-11-19 6:56 UTC (permalink / raw)
To: linux-ia64
Hi list,
I was wondering if it might be possible to change arch/ia64/kernel/ivt.S
routines that modify pte access bits, to first take a "spin lock bit" in
the pte before any other modifications to it, and clear the lock bit when
done?
And second question, a pte's memory doesn't ever get updated transparently
by the hardware on ia64, does it?
I have been helping Christoph to look at some ways to reduce page_table_lock
locking. It appears that the ptl can be entirely removed by using per-pte
locks, however this can only be efficient if *all* updates to the pte obey
the lock (if not, then all accesses, and the pte-unlock have to be atomic so
the dirty bit doesn't get lost).
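
A minimal user-space sketch of the per-pte lock idea, assuming C11 atomics. The bit positions, the pte_t typedef and the helper names (pte_lock, pte_unlock, pte_mark_dirty) are hypothetical and chosen only for illustration; they are not the ia64 pte layout or any existing kernel interface.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not the ia64 pte format). */
#define PTE_LOCK   (UINT64_C(1) << 0)   /* assumed spare software bit */
#define PTE_DIRTY  (UINT64_C(1) << 1)

typedef _Atomic uint64_t pte_t;

/* Spin until this caller is the one that flips the lock bit from 0 to 1. */
static void pte_lock(pte_t *pte)
{
    while (atomic_fetch_or_explicit(pte, PTE_LOCK, memory_order_acquire) & PTE_LOCK)
        ;   /* another updater holds the lock; retry */
}

/* If every updater obeys the lock, nobody else can have changed the pte
 * since the holder loaded it, so the unlock can be a load plus a store
 * rather than an atomic read-modify-write. */
static void pte_unlock(pte_t *pte)
{
    uint64_t v = atomic_load_explicit(pte, memory_order_relaxed);
    atomic_store_explicit(pte, v & ~PTE_LOCK, memory_order_release);
}

/* Example updater: sets the dirty bit without an atomic RMW of its own. */
static void pte_mark_dirty(pte_t *pte)
{
    pte_lock(pte);
    uint64_t v = atomic_load_explicit(pte, memory_order_relaxed);
    atomic_store_explicit(pte, v | PTE_DIRTY, memory_order_relaxed);
    pte_unlock(pte);
}
```

If some path updated the pte without taking the lock bit, the plain store in pte_unlock could overwrite that update, which is the lost dirty bit case mentioned above.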
And my last question... I wonder if someone might be able to help me do the
assembly for the locking in ivt.S provided it is a small job and I give the
specification? Sorry, I have no idea about ia64 assembly :(
Thanks,
Nick
* RE: pte bit spin lock
2004-11-19 6:56 pte bit spin lock Nick Piggin
@ 2004-11-19 18:46 ` Luck, Tony
2004-11-19 19:49 ` Christoph Lameter
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Luck, Tony @ 2004-11-19 18:46 UTC (permalink / raw)
To: linux-ia64
>I was wondering if it might be possible to change arch/ia64/kernel/ivt.S
>routines that modify pte access bits, to first take a "spin
>lock bit" in the pte before any other modifications to it, and clear the
>lock bit when done?
Possible? Anything is possible. Is it a good idea? I don't know, you'd
need some benchmark data to show which applications win, and which lose.
Obviously this spin-lock-bit will make some operations that are now cheap
become much more expensive ... whether you have an overall win would
depend on a lot of factors.
>And second question, a pte's memory doesn't ever get updated
>transparently by the hardware on ia64, does it?
No. The h/w VHPT walker on ia64 only _reads_ page tables, all updates
(setting dirty bits) are done by s/w in ivt.S or elsewhere in Linux.
-Tony
* Re: pte bit spin lock
2004-11-19 6:56 pte bit spin lock Nick Piggin
2004-11-19 18:46 ` Luck, Tony
@ 2004-11-19 19:49 ` Christoph Lameter
2004-11-20 0:56 ` Nick Piggin
2004-11-22 18:44 ` Luck, Tony
3 siblings, 0 replies; 5+ messages in thread
From: Christoph Lameter @ 2004-11-19 19:49 UTC (permalink / raw)
To: linux-ia64
On Fri, 19 Nov 2004, Nick Piggin wrote:
> I was wondering if it might be possible to change arch/ia64/kernel/ivt.S
> routines that modify pte access bits, to first take a "spin lock bit" in
> the pte before any other modifications to it, and clear the lock bit when
> done?
Sure.
> And second question, a pte's memory doesn't ever get updated transparently
> by the hardware on ia64, does it?
Correct.
> And my last question... I wonder if someone might be able to help me do the
> assembly for the locking in ivt.S provided it is a small job and I give the
> specification? Sorry, I have no idea about ia64 assembly :(
I could take a look at it next week.
* Re: pte bit spin lock
2004-11-19 6:56 pte bit spin lock Nick Piggin
2004-11-19 18:46 ` Luck, Tony
2004-11-19 19:49 ` Christoph Lameter
@ 2004-11-20 0:56 ` Nick Piggin
2004-11-22 18:44 ` Luck, Tony
3 siblings, 0 replies; 5+ messages in thread
From: Nick Piggin @ 2004-11-20 0:56 UTC (permalink / raw)
To: linux-ia64
Luck, Tony wrote:
>>I was wondering if it might be possible to change arch/ia64/kernel/ivt.S
>>routines that modify pte access bits, to first take a "spin
>>lock bit" in the pte before any other modifications to it, and clear the
>>lock bit when done?
>
>
> Possible? Anything is possible. Is it a good idea? I don't know, you'd
OK that's a good start.
> need some benchmark data to show which applications win, and which lose.
> Obviously this spin-lock-bit will make some operations that are now cheap
> become much more expensive ... whether you have an overall win would
> depend on a lot of factors.
>
Yes it will, you're right. It adds an extra atomic rmw operation to pte
manipulation in place of the page table lock. So in practice, it won't
be much different on single-pte operations like page faults, but the
batched operation 'copy_page_range' will suffer.
zap_pte_range, while being a batched operation, must currently also do an
atomic operation per-pte (so it doesn't lose the dirty bit), so this
doesn't suffer any extra atomic ops.
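
To illustrate why that per-pte operation has to be atomic, here is a rough sketch using C11 atomics. The helper name pte_get_and_clear and the dirty bit position are assumptions made for this example, not code taken from the kernel.

```c
#include <stdatomic.h>
#include <stdint.h>

#define PTE_DIRTY (UINT64_C(1) << 1)   /* hypothetical bit position */

/* Fetch the old pte value and clear the entry in one atomic step.
 * A separate load followed by a separate store would leave a window
 * in which a concurrent fault handler could set the dirty bit and
 * have it overwritten without anyone seeing it. */
static uint64_t pte_get_and_clear(_Atomic uint64_t *pte)
{
    return atomic_exchange(pte, 0);
}
```

The caller inspects the returned value for PTE_DIRTY and can then mark the underlying page dirty before freeing the mapping.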
But the copy_page_range issue seems to cost about 7% on lmbench fork
which is fairly significant (with i386 using pte cmpxchg; pte locks
shouldn't be worse than cmpxchg, hopefully cheaper if anything).
I don't think you will see significant contention on the pte lock, so the
cost to ivt.S should be essentially an extra atomic op. But this could
mean that subsequent modification of the pte accessed bits need not be
atomic RMWs as seems to be the case there now.
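
For comparison, the lockless style of update can be sketched as a compare-and-exchange loop. This is only an illustration in C11; the bit positions and the function name are hypothetical, and it is not a transcription of what ivt.S does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_ACCESSED (UINT64_C(1) << 5)

/* Set the accessed bit with an atomic read-modify-write, retrying if
 * the pte changed underneath us. Returns false if the pte is found
 * not present, in which case nothing is written. */
static bool pte_set_accessed_cmpxchg(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load(pte);
    do {
        if (!(old & PTE_PRESENT))
            return false;
        /* on failure, 'old' is reloaded with the current pte value */
    } while (!atomic_compare_exchange_weak(pte, &old, old | PTE_ACCESSED));
    return true;
}
```

With a lock bit held around the update, the body would reduce to a plain load, test and store, and the only atomic RMW left would be the one that takes the lock.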
But anyway it is not very productive to try to extrapolate results to ia64,
so yes it would need to be carefully tested.
>
>>And second question, a pte's memory doesn't ever get updated
>>transparently by the hardware on ia64, does it?
>
>
> No. The h/w VHPT walker on ia64 only _reads_ page tables, all updates
> (setting dirty bits) are done by s/w in ivt.S or elsewhere in Linux.
>
Thanks.
* RE: pte bit spin lock
2004-11-19 6:56 pte bit spin lock Nick Piggin
` (2 preceding siblings ...)
2004-11-20 0:56 ` Nick Piggin
@ 2004-11-22 18:44 ` Luck, Tony
3 siblings, 0 replies; 5+ messages in thread
From: Luck, Tony @ 2004-11-22 18:44 UTC (permalink / raw)
To: linux-ia64
>But the copy_page_range issue seems to cost about 7% on lmbench fork
>which is fairly significant
Significant to lmbench perhaps ... but there is considerable difference
of opinion about the benefits of micro-benchmarks like this. On the one
hand they make it easy to see the cost of a particular change, on the
other, too many people give too much weight to these numbers (a 7% change
in fork performance is not going to be measurable for almost all macro
benchmarks).
-Tony