* [parisc-linux] Branch Prediction
@ 2002-10-25 17:17 John David Anglin
2002-10-26 5:11 ` Grant Grundler
0 siblings, 1 reply; 6+ messages in thread
From: John David Anglin @ 2002-10-25 17:17 UTC (permalink / raw)
To: parisc-linux
Does anyone know which PA processors if any implement the BTS? I
was also wondering about the ITLB P bit and what parisc-linux does
with it. This came up in regard to accelerating branches for calls
and returns.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [parisc-linux] Branch Prediction
2002-10-25 17:17 [parisc-linux] Branch Prediction John David Anglin
@ 2002-10-26 5:11 ` Grant Grundler
2002-10-26 17:05 ` John David Anglin
0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2002-10-26 5:11 UTC (permalink / raw)
To: John David Anglin; +Cc: parisc-linux
"John David Anglin" wrote:
> Does anyone know which PA processors if any implement the BTS?
No clue. I had to read the PA2.0 arch book (page 6-15, "Branch
Target Stack") to learn what this is and what it does.
Anyway, it would be interesting to know if HPUX's acc uses it.
i.e., did its usage ever get validated for both systems that do
and don't implement BTS?
> I was also wondering about the ITLB P bit and what parisc-linux does
> with it. This came up in regard to accelerating branches for calls
> and returns.
It sounds like parisc-linux does nothing with P-bit.
We could just always enable it if you think that's the right
thing to do for now. Looks like it could be done by adding one more
insn to itlb_miss_common_20w in entry.S.
But to answer both questions, remind me to dig this up if
no answer gets posted by next Wednesday or so.
grant
* Re: [parisc-linux] Branch Prediction
2002-10-26 5:11 ` Grant Grundler
@ 2002-10-26 17:05 ` John David Anglin
2002-10-27 1:23 ` Grant Grundler
0 siblings, 1 reply; 6+ messages in thread
From: John David Anglin @ 2002-10-26 17:05 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
> "John David Anglin" wrote:
> > Does anyone know which PA processors if any implement the BTS?
>
> No clue. I had to read the PA2.0 arch book (page 6-15, "Branch
> Target Stack") to learn what this is and what it does.
I did a quick hack to gcc yesterday to try it on the a500. It
didn't seem to make any difference, so I think the PA-8500 doesn't
have a branch target stack. I wonder if the PA-8700 in the rp2470
has it? I suppose it also could be an add-on chip.
It's fairly easy to implement and the assembler already supports
the feature. However, there might be issues with software compiled
without the feature not inter-operating with software compiled
with it. For call-return acceleration, the safe solution of
pushing the return in the callee results in reduced performance.
You have to do the push in the call. Maybe I will add this
as an option if there is actually some gear that has the option.
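The push-at-call, pop-at-return idea can be sketched as a small software model. This is only an illustration of the general technique, not the PA2.0 hardware: the depth `BTS_DEPTH` and the names `bts_push` and `bts_pop` are assumptions made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical depth; the thread does not say how deep a real BTS would be. */
#define BTS_DEPTH 8

struct bts {
    uint64_t slot[BTS_DEPTH];
    size_t   count;              /* number of valid entries */
};

static void bts_init(struct bts *s)
{
    s->count = 0;
}

/* Done at the call: remember the address the callee will return to. */
static void bts_push(struct bts *s, uint64_t return_addr)
{
    if (s->count == BTS_DEPTH) {
        /* Full: drop the oldest entry so the most recent calls stay predictable. */
        for (size_t i = 1; i < BTS_DEPTH; i++)
            s->slot[i - 1] = s->slot[i];
        s->count--;
    }
    s->slot[s->count++] = return_addr;
}

/* Done at the return: yields 1 and sets *predicted if a prediction exists. */
static int bts_pop(struct bts *s, uint64_t *predicted)
{
    if (s->count == 0)
        return 0;                /* nothing was pushed, so no prediction */
    *predicted = s->slot[--s->count];
    return 1;
}
```

In this model, a caller built without the push followed by a callee that pops leaves the stack one entry short, so later returns get a stale prediction or none at all. That is the inter-operation concern in miniature.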
> Anyway, it would be interesting to know if HPUX's acc uses it.
> i.e., did its usage ever get validated for both systems that do
> and don't implement BTS?
>
> > I was also wondering about the ITLB P bit and what parisc-linux does
> > with it. This came up in regard to accelerating branches for calls
> > and returns.
>
> It sounds like parisc-linux does nothing with P-bit.
> We could just always enable it if you think that's the right
> thing to do for now. Looks like it could be done by adding one more
> insn to itlb_miss_common_20w in entry.S.
I am guessing but I think setting it would help on machines with
dynamic prediction hardware. There is probably a paper somewhere
on this on the HP site. I tried to find info on the branch target
stack but didn't have any success.
My understanding is that pc-relative branches can be predicted
from examination of the code. Indirect branches (e.g., call
returns) can't. I don't know how the dynamic prediction hardware
works but I would think it wouldn't be there if it didn't
improve branch prediction.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Branch Prediction
2002-10-26 17:05 ` John David Anglin
@ 2002-10-27 1:23 ` Grant Grundler
2002-10-27 10:20 ` N.Leymann
2002-11-01 21:30 ` John David Anglin
0 siblings, 2 replies; 6+ messages in thread
From: Grant Grundler @ 2002-10-27 1:23 UTC (permalink / raw)
To: John David Anglin; +Cc: parisc-linux
"John David Anglin" wrote:
> I wonder if the PA-8700 in the rp2470 has it?
gsyprf11 is a 650MHz PA8700. Please try it.
> I suppose it also could be an add-on chip.
Think so?
After reading the description I had the impression BTS has to be on chip
and integrated in order to get the speed. But I'm just a SW engineer...
> I am guessing but I think setting it would help on machines with
> dynamic prediction hardware. There is probably a paper somewhere
> on this on the HP site.
> My understanding is that pc-relative branches can be predicted
> from examination of the code. Indirect branches (e.g., call
> returns) can't.
yes. IIRC, branches forward tend to not be taken and branches backwards
tend to be loops. Or something along that line. But with PBO, the
static hints are better. And HPUX has a very cool "driver" called
"flipper" that will flip the static hints to match the performance
path at run time.
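The forward/backward rule of thumb can be written down in a few lines. This is a minimal sketch of the generic heuristic only; the name `predict_static` is made up here, and nothing below reflects how PA-RISC encodes static hints or what PBO or "flipper" actually do.

```c
#include <stdint.h>

/*
 * Generic static heuristic: a branch whose target is at or below its own
 * address is assumed to close a loop and is predicted taken; a forward
 * branch is predicted not taken.
 * Returns 1 for "predict taken", 0 for "predict not taken".
 */
static int predict_static(uint64_t branch_addr, uint64_t target_addr)
{
    return target_addr <= branch_addr;
}
```

For example, a branch at 0x1040 back to 0x1000 is predicted taken, while a branch at 0x1040 forward to 0x1080 is predicted not taken.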
> I don't know how the dynamic prediction hardware
> works but I would think it wouldn't be there if it didn't
> improve branch prediction.
yeah - I'll see what I can learn about it this week.
thanks,
grant
* Re: [parisc-linux] Branch Prediction
2002-10-27 1:23 ` Grant Grundler
@ 2002-10-27 10:20 ` N.Leymann
2002-11-01 21:30 ` John David Anglin
1 sibling, 0 replies; 6+ messages in thread
From: N.Leymann @ 2002-10-27 10:20 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux, dave
Hi,
Sunday, October 27, 2002, 2:23:03 AM, you wrote:
> "John David Anglin" wrote:
>> I wonder if the PA-8700 in the rp2470 has it?
> gsyprf11 is a 650MHz PA8700. Please try it.
It's been quite a while since I worked at the assembly level with HPPA, so
I might be wrong. But as far as I know, most HPPA processors (at least the
8200 and 8700) implement both static and dynamic branch prediction. Which
scheme is used is controlled on a per-page basis by the P-bit in the ITLB.
With the HP C/C++ compilers you can control this behaviour using the
+O[no]static_prediction flags.
> Think so?
> After reading the description I had the impression BTS has to be on chip
> and integrated in order to get the speed. But I'm just a SW engineer...
Yep. If it's implemented it has to be on chip.
> yes. IIRC, branches forward tend to not be taken and branches backwards
> tend to be loops. Or something along that line. But with PBO, the
> static hints are better. And HPUX has a very cool "driver" called
> "flipper" that will flip the static hints to match the performance
> path at run time.
>> I don't know how the dynamic prediction hardware
>> works but I would think it wouldn't be there if it didn't
>> improve branch prediction.
In this case a branch history table is used, which records the outcomes of
the most recent branches. HPPA uses a three-bit shift register (256 entries).
The fetch unit checks this register and predicts the branch according to its
contents, e.g., if the branch was taken two times before, it is predicted
that it will be taken again.
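A rough software model of such a table might look like the following. The 256 entries and three history bits come from the description above; the indexing by low-order address bits and the "at least two of the last three" majority rule are assumptions made for this sketch, not documented PA-RISC behaviour.

```c
#include <stdint.h>

#define BHT_ENTRIES  256         /* entry count quoted above */
#define HISTORY_MASK 0x7u        /* three bits of history per entry */

struct bht {
    uint8_t history[BHT_ENTRIES];
};

/* Assumption: index with low-order bits of the word-aligned branch address. */
static unsigned bht_index(uint64_t branch_addr)
{
    return (unsigned)((branch_addr >> 2) % BHT_ENTRIES);
}

/* Predict taken when at least two of the last three outcomes were taken. */
static int bht_predict(const struct bht *t, uint64_t branch_addr)
{
    uint8_t h = t->history[bht_index(branch_addr)];
    int taken = (h & 1) + ((h >> 1) & 1) + ((h >> 2) & 1);
    return taken >= 2;
}

/* Shift the newest outcome into the entry, discarding the oldest one. */
static void bht_update(struct bht *t, uint64_t branch_addr, int was_taken)
{
    unsigned i = bht_index(branch_addr);
    unsigned h = ((unsigned)t->history[i] << 1) | (was_taken ? 1u : 0u);
    t->history[i] = (uint8_t)(h & HISTORY_MASK);
}
```

With only 256 entries, two branches whose addresses map to the same index share one history in this model, so their outcomes can interfere with each other's predictions.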
If you are interested in more details I can check tomorrow when I'm back in
the
office. I should have a paper somewhere which compares static and dynamic
branch prediction on PA2.0.
hope that helps
Nic
* Re: [parisc-linux] Branch Prediction
2002-10-27 1:23 ` Grant Grundler
2002-10-27 10:20 ` N.Leymann
@ 2002-11-01 21:30 ` John David Anglin
1 sibling, 0 replies; 6+ messages in thread
From: John David Anglin @ 2002-11-01 21:30 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
> "John David Anglin" wrote:
> > I wonder if the PA-8700 in the rp2470 has it?
>
> gsyprf11 is a 650MHz PA8700. Please try it.
I gave it a whirl. As far as I can tell based on a small amount of testing,
there isn't any difference in performance using the branch target stack to
"accelerate" call returns.
So it may be that this feature, like quad precision, is not implemented.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)