From: Joel Soete <soete.joel@tiscali.be>
To: Grant Grundler <grundler@parisc-linux.org>
Cc: parisc-linux <parisc-linux@lists.parisc-linux.org>
Subject: Re: copy_user_page_asm suggested 64bit improvment [Was: [parisc-linux] clear user page test]
Date: Mon, 27 Dec 2004 10:40:49 +0000 [thread overview]
Message-ID: <41CFE6B1.6010707@tiscali.be> (raw)
In-Reply-To: <20041227073654.GI29492@colo.lackof.org>
Grant Grundler wrote:
> On Tue, Dec 21, 2004 at 02:37:47PM +0100, Joel Soete wrote:
>
>>Hello all,
>
>
> Joel,
> I trim your postings to only include the parts I need to respond to.
> Could you please do the same?
>
Apologies, I would just like to be as detailed as possible for the others who didn't follow our previous mail exchange before :-(
> I hate having to scroll down pages of stuff to get to your comment.
> That's probably why no one else responded.
>
I understand that make stuff too noisy
>
>
>>As promised, here is a cleaner (?) patch:
>>--- arch/parisc/kernel/pacache.S.Orig 2004-12-20 08:28:23.000000000 +0100
>>+++ arch/parisc/kernel/pacache.S 2004-12-20 14:49:35.000000000 +0100
>>@@ -295,7 +295,52 @@
>> .callinfo NO_CALLS
>> .entry
>>
>>- ldi 64, %r1
>>+ pdtlb 0(%r25)
>>+ pdtlb 0(%r26)
>
>
> Sorry - I missed why the pdtlb needs to be added.
> Could you explain?
Sorry no, that was a question of mine:
the previous inplementation of copy_user_page_asm() (between #if 0 ... #endif below in the code) started with:
[...]
/* Purge any old translations */
pdtlb 0(%r28)
pdtlb 0(%r29)
ldi 64, %r1
[...]
and we do the same in __clear_user_page_asm()
[...]
/* Purge any old translation */
pdtlb 0(%r28)
[...]
>
> Won't the pdtlb guarantee at least one trap per page copied?
> I would hope we guarantee the D-TLB is "clean" when calling this function.
>
Should be why it was removed but as far as I didn't find any explanation (that's obvious: that's nearly impossible to explain all
details of implementation ;-)
>
>>+#ifdef __LP64__
>>+
>>+ ldi 32, %r1 /* PAGE_SIZE/128 == 32 */
>>+
>>+1: ldd 0(%r25), %r19
>>+ ldd 8(%r25), %r20
>>+ ldd 16(%r25), %r21
>>+ ldd 24(%r25), %r22
>>+ std %r19, 0(%r26)
>>+ std %r20, 8(%r26)
[...]
>
> This looks good.
>
> PA2.0 can retire 2 loads and 2 stores per cycle IFF there are no dependencies.
> can be executed in one cycle.
>
> That means we want something like this:
>
> +1: ldd 0(%r25), %r19
> + ldd 8(%r25), %r20
> + ldd 16(%r25), %r21
> + ldd 24(%r25), %r22
> + std %r19, 0(%r26)
> + std %r20, 8(%r26)
> + ldd 32(%r25), %r19
> + ldd 40(%r25), %r20
[...]
> + ldo 128(%r25), %r25
> + std %r21, 112(%r26)
> + std %r22, 120(%r26)
> + ADDIB> -1, %r1, 1b
> + ldo 128(%r26), %r26
> ...
>
> [ Note that I've moved the "ldo" around as well!]
>
> More distance between the "ldd %rX" and the corresponding
> "std %rX" is generally a good thing.
> This routine could use more registers in the loop to get more "distance".
Ok that was another possibility: I trust that we can use r23, r24 as far as:
r23-r26: these are arg3-arg0, i.e. you can use them if you
don't care about the values that were passed in anymore.
but not more of r3-r18 because:
r3-r18,r27,r30 need to be saved and restored. r3-r18 are just
general purpose registers. [...]
>
> It costs us 1 cycle to save two registers on the stack.
> Once the data is in L1-Cache, IFF the CPU needs more than one cycle
> to retire successive loads, we gain several cycles assuming additional
> register pairs are used multiply times per loop.
Well that (cache management) is still far beyond my skill :-(
[...]
>>- extrd,u %r26,56,32, %r26 /* convert phys addr to tlb insert format */
...
>>+ extrd,u %r26,56,32, %r26 /* convert phys addr to tlb insert format */
>
> Please post white space changes as seperate patches.
>
oops my bad (apologies)
>
[...]
>>* with original 2.6.10-rc3-pa8 running kernel
>># grep "^user" k-loop1
>
> Please use "^sys" or "^real".
> "user" time is only number that should NOT change with this patch.
>
I will try to recover those info
>
[...]
>
>>So the main interest is to reduce the number of clock ticks :-)
>
>
> Yes. :^)
>
Thanks for your patience and relevant remarks, I will come back we more material soon ;-)
Joel
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
next prev parent reply other threads:[~2004-12-27 10:40 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <418A80E8000124B5@mail-6-bnl.tiscali.it>
2004-12-27 7:36 ` copy_user_page_asm suggested 64bit improvment [Was: [parisc-linux] clear user page test] Grant Grundler
2004-12-27 10:40 ` Joel Soete [this message]
2004-12-27 15:08 ` James Bottomley
2004-12-31 20:26 ` Michael S. Zick
2004-12-31 20:56 ` Grant Grundler
2004-12-31 21:35 ` Michael S. Zick
[not found] ` <20041231225447.GC23592@colo.lackof.org>
2004-12-31 23:56 ` Michael S. Zick
2005-01-12 13:52 ` Michael S. Zick
2005-01-12 15:32 ` Joel Soete
2004-12-31 21:21 ` James Bottomley
2004-12-27 17:34 ` Joel Soete
2004-12-27 18:32 ` Joel Soete
2004-12-28 16:25 ` [parisc-linux] Re: copy_user_page_asm suggested 64bit improvment (Test case) Joel Soete
2004-12-29 5:46 ` Grant Grundler
2004-12-29 11:36 ` Joel Soete
2004-12-30 8:10 ` copy_user_page_asm suggested 64bit improvment [Was: [parisc-linux] clear user page test] Grant Grundler
2004-12-30 17:04 ` [parisc-linux] Re: copy_user_page_asm suggested 64bit improvment [Was: [parisc-l John David Anglin
[not found] <20041210190333.GC6653@colo.lackof.org>
[not found] ` <418A811700010466@mail-8-bnl.mail.tiscali.sys>
[not found] ` <20041213180758.GA8705@colo.lackof.org>
[not found] ` <41C34C56.4080508@tiscali.be>
[not found] ` <20041218073036.GA29003@colo.lackof.org>
[not found] ` <41C440A3.6060708@tiscali.be>
[not found] ` <41C4872D.6010705@tiscali.be>
[not found] ` <41C4A35A.7010003@tiscali.be>
[not found] ` <20041219042528.GB15282@colo.lackof.org>
[not found] ` <41C5D761.4030004@tiscali.be>
2004-12-19 20:27 ` copy_user_page_asm suggested 64bit improvment [Was: [parisc-linux] clear user page test] Joel Soete
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=41CFE6B1.6010707@tiscali.be \
--to=soete.joel@tiscali.be \
--cc=grundler@parisc-linux.org \
--cc=parisc-linux@lists.parisc-linux.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox