Linux PARISC architecture development
From: Joel Soete <soete.joel@tiscali.be>
To: Grant Grundler <grundler@parisc-linux.org>
Cc: parisc-linux <parisc-linux@lists.parisc-linux.org>
Subject: Re: copy_user_page_asm suggested 64bit improvment [Was: [parisc-linux] clear user page test]
Date: Mon, 27 Dec 2004 10:40:49 +0000
Message-ID: <41CFE6B1.6010707@tiscali.be>
In-Reply-To: <20041227073654.GI29492@colo.lackof.org>



Grant Grundler wrote:
> On Tue, Dec 21, 2004 at 02:37:47PM +0100, Joel Soete wrote:
> 
>>Hello all,
> 
> 
> Joel,
> I trim your postings to only include the parts I need to respond to.
> Could you please do the same?
> 
Apologies, I just wanted to give as much detail as possible for those who didn't follow our previous mail exchange :-(

> I hate having to scroll down pages of stuff to get to your comment.
> That's probably why no one else responded.
> 
I understand, it makes things too noisy.

> 
> 
>>As promised, here is a cleaner (?)  patch:
>>--- arch/parisc/kernel/pacache.S.Orig	2004-12-20 08:28:23.000000000 +0100
>>+++ arch/parisc/kernel/pacache.S	2004-12-20 14:49:35.000000000 +0100
>>@@ -295,7 +295,52 @@
>> 	.callinfo NO_CALLS
>> 	.entry
>>
>>-	ldi		64, %r1
>>+	pdtlb		0(%r25)
>>+	pdtlb		0(%r26)
> 
> 
> Sorry - I missed why the pdtlb needs to be added.
> Could you explain?

Sorry, no: that was actually a question of mine.
The previous implementation of copy_user_page_asm() (between #if 0 ... #endif below in the code) started with:
[...]
         /* Purge any old translations */

         pdtlb           0(%r28)
         pdtlb           0(%r29)

         ldi             64, %r1
[...]

and we do the same in __clear_user_page_asm()
[...]
         /* Purge any old translation */

         pdtlb           0(%r28)

[...]
> 
> Won't the pdtlb guarantee at least one trap per page copied?
> I would hope we guarantee the D-TLB is "clean" when calling this function.
> 
That is probably why it was removed, but I didn't find any explanation (understandably: it is nearly impossible to document
every implementation detail ;-)

> 
>>+#ifdef __LP64__
>>+
>>+	ldi		32, %r1			/* PAGE_SIZE/128 == 32 */
>>+
>>+1:	ldd		0(%r25), %r19
>>+	ldd		8(%r25), %r20
>>+	ldd		16(%r25), %r21
>>+	ldd		24(%r25), %r22
>>+	std		%r19, 0(%r26)
>>+	std		%r20, 8(%r26)
[...]
> 
> This looks good.
> 
> PA2.0 can retire 2 loads and 2 stores per cycle IFF there are no dependencies.
> 
> That means we want something like this:
> 
> +1:	ldd		0(%r25), %r19
> +	ldd		8(%r25), %r20
> +	ldd		16(%r25), %r21
> +	ldd		24(%r25), %r22
> +	std		%r19, 0(%r26)
> +	std		%r20, 8(%r26)
> +	ldd		32(%r25), %r19
> +	ldd		40(%r25), %r20
[...]
> +	ldo		128(%r25), %r25
> +	std		%r21, 112(%r26)
> +	std		%r22, 120(%r26)
> +	ADDIB>		-1, %r1, 1b
> +	ldo		128(%r26), %r26
> ...
> 
> [ Note that I've moved the "ldo" around as well!]
> 
> More distance between the "ldd %rX" and the corresponding
> "std %rX" is generally a good thing.
> This routine could use more registers in the loop to get more "distance".
OK, that was another possibility: I believe we can use r23 and r24, since:
     r23-r26: these are arg3-arg0, i.e. you can use them if you
         don't care about the values that were passed in anymore.

but not any of r3-r18 without saving them first, because:
r3-r18,r27,r30 need to be saved and restored. r3-r18 are just
     general purpose registers. [...]
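
To make the interleaving discussed above concrete, here is a rough C rendering of the same load/store ordering. It is only a sketch, not the kernel code: the names copy_page_sketch, SKETCH_PAGE_SIZE and SKETCH_CHUNK are made up for illustration, a 4 KiB page is assumed, and a C compiler is free to reschedule these accesses, so the actual instruction order is only guaranteed in the assembly version.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */
#define SKETCH_CHUNK      128u  /* bytes copied per loop iteration */

/*
 * Copy one page as 32 chunks of 16 doublewords each.
 * Four temporaries (standing in for r19-r22) are loaded ahead of time,
 * so each store uses a value loaded at least two accesses earlier.
 */
static void copy_page_sketch(uint64_t *dst, const uint64_t *src)
{
    unsigned int chunks = SKETCH_PAGE_SIZE / SKETCH_CHUNK; /* 4096/128 == 32 */

    while (chunks--) {
        /* Prime the pipeline: four loads before the first store. */
        uint64_t a = src[0], b = src[1], c = src[2], d = src[3];
        size_t i;

        for (i = 0; i < 12; i += 4) {
            dst[i]     = a;          /* store the older pair ...        */
            dst[i + 1] = b;
            a = src[i + 4];          /* ... then refill it from further */
            b = src[i + 5];          /*     ahead in the source         */
            dst[i + 2] = c;
            dst[i + 3] = d;
            c = src[i + 6];
            d = src[i + 7];
        }

        /* Drain: the last four doublewords of the chunk. */
        dst[12] = a;
        dst[13] = b;
        dst[14] = c;
        dst[15] = d;

        src += 16;                   /* 16 doublewords == 128 bytes */
        dst += 16;
    }
}
```

Using r23 and r24 as well would correspond to two more temporaries here, which widens the gap between each load and the store that consumes it.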

> 
> It costs us 1 cycle to save two registers on the stack.
> Once the data is in L1-Cache, IFF the CPU needs more than one cycle
> to retire successive loads, we gain several cycles assuming additional
> register pairs are used multiple times per loop.
Well, that (cache management) is still far beyond my skills :-(

[...]
>>-	extrd,u		%r26,56,32, %r26		/* convert phys addr to tlb insert format */
...
>>+	extrd,u		%r26,56,32, %r26	/* convert phys addr to tlb insert format */
> 
> Please post white space changes as separate patches.
> 
oops my bad (apologies)
> 
[...]
>>* with original 2.6.10-rc3-pa8 running kernel
>># grep "^user" k-loop1
> 
> Please use "^sys" or "^real".
> "user" time is the only number that should NOT change with this patch.
> 
I will try to recover that information.
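
As an illustration of the user/sys distinction, a test program can also read both figures itself with getrusage(2). The following is a minimal sketch under stated assumptions, not the test used for the numbers in this thread: the helper names (measure_cpu_split, touch_pages, struct touch_arg) are hypothetical, and touching one byte per 4096 bytes assumes a 4 KiB page size.

```c
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* CPU time consumed by a workload, split the way time(1) reports it. */
struct cpu_split {
    double user_s;  /* time spent executing in user mode */
    double sys_s;   /* time spent in the kernel on our behalf */
};

/* Hypothetical workload argument: a buffer to touch page by page. */
struct touch_arg {
    unsigned char *buf;
    size_t len;
};

static double tv_to_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Write one byte per assumed 4 KiB page so that each page gets faulted in. */
static void touch_pages(void *p)
{
    struct touch_arg *arg = p;
    size_t off;

    for (off = 0; off < arg->len; off += 4096)
        arg->buf[off] = 1;
}

/* Run work(arg) and report the user/system CPU time it consumed. */
static int measure_cpu_split(void (*work)(void *), void *arg,
                             struct cpu_split *out)
{
    struct rusage before, after;

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;
    work(arg);
    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    out->user_s = tv_to_seconds(after.ru_utime) - tv_to_seconds(before.ru_utime);
    out->sys_s  = tv_to_seconds(after.ru_stime) - tv_to_seconds(before.ru_stime);
    return 0;
}
```

Work done by the kernel while handling page faults shows up in sys_s rather than user_s, which is why the "sys" and "real" lines are the ones to compare.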
> 
[...]
> 
>>So the main interest is to reduce the number of clock ticks :-)
> 
> 
> Yes. :^)
> 
Thanks for your patience and relevant remarks, I will come back with more material soon ;-)

Joel
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
