* Yet another inline asm worry: mtsp() macro (and may be other)?
@ 2008-06-19 11:40 Joel Soete
2008-06-19 16:11 ` Grant Grundler
0 siblings, 1 reply; 7+ messages in thread
From: Joel Soete @ 2008-06-19 11:40 UTC (permalink / raw)
To: linux-parisc; +Cc: Helge Deller, grundler, kyle
Hello all,
Grant, in a first approach I would like to address this post to you because
all start from ccio-dma (again yes) when I figure out that in fact gcc-4.2 rob
a part this driver code???
My first worry was about the form takes the resulting code of <ccio_io_pdir_entry>
for remind:
0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
4: 34 1c 00 00 ldi 0,ret0
8: 00 1c 58 20 mtsp ret0,sr1
c: 22 60 0e 01 ldil L%-10000000,r19
10: 0a 78 0a 13 add,l r24,r19,r19
14: d6 77 0c 14 depw r23,31,12,r19
18: 0f 53 12 88 stw r19,4(r26)
1c: 07 00 53 1c lci r0(sr1,r24),ret0
20: d3 9c 1a 74 extrw,u ret0,19,12,ret0
24: d7 3c 0e 14 depw ret0,15,12,r25
28: 0f 59 12 80 stw r25,0(r26)
2c: 07 40 12 80 fdc r0(r26)
30: 00 00 04 00 sync
34: e8 40 c0 02 bv,n r0(rp)
38: 03 ff e0 1f break 1f,1fff
3c: e8 1f 1f f7 b,l,n 3c <ccio_io_pdir_entry+0x3c>,r0
The question was why sr1 is always set to 0 when it's actualy a function
parameter (iirc 2d arg i.e. r25?)
Fwiw removing BUG_ON() seems to fix this issue?
btw as far as it must be KERNEL_SPACE, couldn't we simplify stuff: removing
sid parameter and replace mtsp macro call with a simple asm volatile("mtsp
%%r0,%%sr1":::"memory")?
Then I look for all the call to function (ccio_io_pdir_entry) even thought
it's inligned with those mtsp and lci it would be easy to find it.
The C code learn me that it would call in ccio_map_single() and thanks with
iommu_helper.h in ccio_map_sg(). But objdump -D ccio-dma.o shows me only one
place where this code is used
00000000 <ccio_map_single>:
[snip]
84: 34 14 00 00 ldi 0,r20
88: 00 14 58 20 mtsp r20,sr1
8c: 0a a3 0a 13 add,l r3,r21,r19
90: d6 65 0c 14 depw r5,31,12,r19
94: 0f 53 12 88 stw r19,4(r26)
98: 04 60 53 1c lci r0(sr1,r3),ret0
9c: d3 9c 1a 74 extrw,u ret0,19,12,ret0
a0: d6 9c 0e 14 depw ret0,15,12,r20
a4: 0f 54 12 80 stw r20,0(r26)
a8: 07 40 12 80 fdc r0(r26)
ac: 00 00 04 00 sync
[snip]
and not in ccio_map_sg() (even not a call to ccio_io_pdir_entry() in case he
didn't reach to inline it).
And the wonder is that my system d380 is even booting and running despite this
lake???
That said with gcc-4.3, I find well this code iin the 2 functions.
And here commes my worry about mtsp() macro: here is the resulting code with
gcc-4.3:
00000000 <ccio_map_sg>:
[snip]
__ 1a0: 34 1a 00 00 ldi 0,r26__
1a4: 20 a0 02 00 ldil L%10000000,r5
1a8: 8e 80 61 28 cmpib,> 0,r20,244 <ccio_map_sg+0x244>
1ac: 20 40 0e 01 ldil L%-10000000,rp
1b0: 86 c0 21 88 cmpib,= 0,r22,27c <ccio_map_sg+0x27c>
1b4: 34 19 00 00 ldi 0,r25
1b8: 0f e0 10 9c ldw 0(r31),ret0
1bc: 0d 80 10 94 ldw 0(r12),r20
1c0: d7 80 1c 1e depwi 0,31,2,ret0
1c4: 4b b3 00 20 ldw 10(ret1),r19
1c8: 0a 9c 04 1c sub ret0,r20,ret0
1cc: 0f f0 10 95 ldw 8(r31),r21
1d0: d3 9c 1f 45 extrw,s ret0,26,27,ret0
1d4: 0a b3 0a 13 add,l r19,r21,r19
1d8: d7 9c 09 8c depw,z ret0,19,20,ret0
1dc: 0f e8 10 94 ldw 4(r31),r20
1e0: 08 bc 0a 1c add,l ret0,r5,ret0
1e4: 6b b3 00 20 stw r19,10(ret1)
1e8: 0a b9 0a 15 add,l r25,r21,r21
1ec: 0a 9c 0a 14 add,l ret0,r20,r20
__ 1f0: 00 1a 58 20 mtsp r26,sr1 __
1f4: 08 54 0a 1c add,l r20,rp,ret0
1f8: d7 8e 0c 14 depw r14,31,12,ret0
1fc: 0e dc 12 88 stw ret0,4(r22)
200: 06 80 53 13 lci r0(sr1,r20),r19
204: d2 73 1a 74 extrw,u r19,19,12,r19
208: 08 1a 02 5c copy r26,ret0
20c: d7 93 0e 14 depw r19,15,12,ret0
210: 0e dc 12 80 stw ret0,0(r22)
214: 06 c0 12 80 fdc r0(r22)
218: 00 00 04 00 sync
[snip]
My worry is the number of insn between the ldi 0,r26 and the mtsp r26,sr1.
Here, I supposed it's harmless but may be different in other critical path?
Wouldn't it be interesting (if possible) to insure that the load of the reg
and mtsp stay close together? (or may be simply detect the case of a const == 0?)
What your opinion?
Tx again,
J.
PS: I didn't find back gcc-4.0 to check if my previous perf test could failed
for this reason?
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
2008-06-19 11:40 Yet another inline asm worry: mtsp() macro (and may be other)? Joel Soete
@ 2008-06-19 16:11 ` Grant Grundler
2008-06-19 20:01 ` Joel Soete
0 siblings, 1 reply; 7+ messages in thread
From: Grant Grundler @ 2008-06-19 16:11 UTC (permalink / raw)
To: Joel Soete; +Cc: linux-parisc, Helge Deller, grundler, kyle
On Thu, Jun 19, 2008 at 12:40:30PM +0100, Joel Soete wrote:
> Hello all,
>
> Grant, in a first approach I would like to address this post to you because
> all start from ccio-dma (again yes) when I figure out that in fact gcc-4.2 rob
> a part this driver code???
..
>
> My first worry was about the form takes the resulting code of <ccio_io_pdir_entry>
> for remind:
> 0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
> 4: 34 1c 00 00 ldi 0,ret0
...
> The question was why sr1 is always set to 0 when it's actualy a function
> parameter (iirc 2d arg i.e. r25?)
>
> Fwiw removing BUG_ON() seems to fix this issue?
Yes. I just replied to your previous mail.
> btw as far as it must be KERNEL_SPACE, couldn't we simplify stuff: removing
> sid parameter and replace mtsp macro call with a simple asm volatile("mtsp
> %%r0,%%sr1":::"memory")?
We could. But I thought there are cases were we might want to DMA to/from
user space directly and that's why Space ID is a parameter. My guess is
we are creating a kernel alias for user space and using the alias instead.
If that's correct and is the long term plan, we could remove the "sid"
parameter.
hth,
grant
> Then I look for all the call to function (ccio_io_pdir_entry) even thought
> it's inligned with those mtsp and lci it would be easy to find it.
> The C code learn me that it would call in ccio_map_single() and thanks with
> iommu_helper.h in ccio_map_sg(). But objdump -D ccio-dma.o shows me only one
> place where this code is used
> 00000000 <ccio_map_single>:
> [snip]
> 84: 34 14 00 00 ldi 0,r20
> 88: 00 14 58 20 mtsp r20,sr1
> 8c: 0a a3 0a 13 add,l r3,r21,r19
> 90: d6 65 0c 14 depw r5,31,12,r19
> 94: 0f 53 12 88 stw r19,4(r26)
> 98: 04 60 53 1c lci r0(sr1,r3),ret0
> 9c: d3 9c 1a 74 extrw,u ret0,19,12,ret0
> a0: d6 9c 0e 14 depw ret0,15,12,r20
> a4: 0f 54 12 80 stw r20,0(r26)
> a8: 07 40 12 80 fdc r0(r26)
> ac: 00 00 04 00 sync
> [snip]
>
> and not in ccio_map_sg() (even not a call to ccio_io_pdir_entry() in case he
> didn't reach to inline it).
>
> And the wonder is that my system d380 is even booting and running despite this
> lake???
>
> That said with gcc-4.3, I find well this code iin the 2 functions.
> And here commes my worry about mtsp() macro: here is the resulting code with
> gcc-4.3:
> 00000000 <ccio_map_sg>:
> [snip]
> __ 1a0: 34 1a 00 00 ldi 0,r26__
> 1a4: 20 a0 02 00 ldil L%10000000,r5
> 1a8: 8e 80 61 28 cmpib,> 0,r20,244 <ccio_map_sg+0x244>
> 1ac: 20 40 0e 01 ldil L%-10000000,rp
> 1b0: 86 c0 21 88 cmpib,= 0,r22,27c <ccio_map_sg+0x27c>
> 1b4: 34 19 00 00 ldi 0,r25
> 1b8: 0f e0 10 9c ldw 0(r31),ret0
> 1bc: 0d 80 10 94 ldw 0(r12),r20
> 1c0: d7 80 1c 1e depwi 0,31,2,ret0
> 1c4: 4b b3 00 20 ldw 10(ret1),r19
> 1c8: 0a 9c 04 1c sub ret0,r20,ret0
> 1cc: 0f f0 10 95 ldw 8(r31),r21
> 1d0: d3 9c 1f 45 extrw,s ret0,26,27,ret0
> 1d4: 0a b3 0a 13 add,l r19,r21,r19
> 1d8: d7 9c 09 8c depw,z ret0,19,20,ret0
> 1dc: 0f e8 10 94 ldw 4(r31),r20
> 1e0: 08 bc 0a 1c add,l ret0,r5,ret0
> 1e4: 6b b3 00 20 stw r19,10(ret1)
> 1e8: 0a b9 0a 15 add,l r25,r21,r21
> 1ec: 0a 9c 0a 14 add,l ret0,r20,r20
> __ 1f0: 00 1a 58 20 mtsp r26,sr1 __
> 1f4: 08 54 0a 1c add,l r20,rp,ret0
> 1f8: d7 8e 0c 14 depw r14,31,12,ret0
> 1fc: 0e dc 12 88 stw ret0,4(r22)
> 200: 06 80 53 13 lci r0(sr1,r20),r19
> 204: d2 73 1a 74 extrw,u r19,19,12,r19
> 208: 08 1a 02 5c copy r26,ret0
> 20c: d7 93 0e 14 depw r19,15,12,ret0
> 210: 0e dc 12 80 stw ret0,0(r22)
> 214: 06 c0 12 80 fdc r0(r22)
> 218: 00 00 04 00 sync
> [snip]
>
> My worry is the number of insn between the ldi 0,r26 and the mtsp r26,sr1.
>
> Here, I supposed it's harmless but may be different in other critical path?
>
> Wouldn't it be interesting (if possible) to insure that the load of the reg
> and mtsp stay close together? (or may be simply detect the case of a const == 0?)
>
> What your opinion?
>
> Tx again,
> J.
>
> PS: I didn't find back gcc-4.0 to check if my previous perf test could failed
> for this reason?
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
2008-06-19 16:11 ` Grant Grundler
@ 2008-06-19 20:01 ` Joel Soete
2008-06-22 17:15 ` Grant Grundler
0 siblings, 1 reply; 7+ messages in thread
From: Joel Soete @ 2008-06-19 20:01 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-parisc, Helge Deller, kyle
Grant Grundler wrote:
> On Thu, Jun 19, 2008 at 12:40:30PM +0100, Joel Soete wrote:
>> Hello all,
>>
>> Grant, in a first approach I would like to address this post to you because
>> all start from ccio-dma (again yes) when I figure out that in fact gcc-4.2 rob
>> a part this driver code???
> ..
>> My first worry was about the form takes the resulting code of <ccio_io_pdir_entry>
>> for remind:
>> 0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
>> 4: 34 1c 00 00 ldi 0,ret0
> ...
>> The question was why sr1 is always set to 0 when it's actualy a function
>> parameter (iirc 2d arg i.e. r25?)
>>
>> Fwiw removing BUG_ON() seems to fix this issue?
>
> Yes. I just replied to your previous mail.
>
yes.
>> btw as far as it must be KERNEL_SPACE, couldn't we simplify stuff: removing
>> sid parameter and replace mtsp macro call with a simple asm volatile("mtsp
>> %%r0,%%sr1":::"memory")?
>
> We could. But I thought there are cases were we might want to DMA to/from
> user space directly and that's why Space ID is a parameter.
Ok but it was a BUG if temporary space reg is not 0, so???
> My guess is
> we are creating a kernel alias for user space and using the alias instead.
> If that's correct and is the long term plan, we could remove the "sid"
> parameter.
>
Nice to me.
btw system seems to boot too with asm volatile("mtsp %%r0,%%sr1":::"memory")
may be the cpp can catch in the mtsp(gr, cr) macro if gr==0 then I can use this insn if not the original macro???
Tx again,
j.
> hth,
> grant
>
>> Then I look for all the call to function (ccio_io_pdir_entry) even thought
>> it's inligned with those mtsp and lci it would be easy to find it.
>> The C code learn me that it would call in ccio_map_single() and thanks with
>> iommu_helper.h in ccio_map_sg(). But objdump -D ccio-dma.o shows me only one
>> place where this code is used
>> 00000000 <ccio_map_single>:
>> [snip]
>> 84: 34 14 00 00 ldi 0,r20
>> 88: 00 14 58 20 mtsp r20,sr1
>> 8c: 0a a3 0a 13 add,l r3,r21,r19
>> 90: d6 65 0c 14 depw r5,31,12,r19
>> 94: 0f 53 12 88 stw r19,4(r26)
>> 98: 04 60 53 1c lci r0(sr1,r3),ret0
>> 9c: d3 9c 1a 74 extrw,u ret0,19,12,ret0
>> a0: d6 9c 0e 14 depw ret0,15,12,r20
>> a4: 0f 54 12 80 stw r20,0(r26)
>> a8: 07 40 12 80 fdc r0(r26)
>> ac: 00 00 04 00 sync
>> [snip]
>>
>> and not in ccio_map_sg() (even not a call to ccio_io_pdir_entry() in case he
>> didn't reach to inline it).
>>
>> And the wonder is that my system d380 is even booting and running despite this
>> lake???
>>
>> That said with gcc-4.3, I find well this code iin the 2 functions.
>> And here commes my worry about mtsp() macro: here is the resulting code with
>> gcc-4.3:
>> 00000000 <ccio_map_sg>:
>> [snip]
>> __ 1a0: 34 1a 00 00 ldi 0,r26__
>> 1a4: 20 a0 02 00 ldil L%10000000,r5
>> 1a8: 8e 80 61 28 cmpib,> 0,r20,244 <ccio_map_sg+0x244>
>> 1ac: 20 40 0e 01 ldil L%-10000000,rp
>> 1b0: 86 c0 21 88 cmpib,= 0,r22,27c <ccio_map_sg+0x27c>
>> 1b4: 34 19 00 00 ldi 0,r25
>> 1b8: 0f e0 10 9c ldw 0(r31),ret0
>> 1bc: 0d 80 10 94 ldw 0(r12),r20
>> 1c0: d7 80 1c 1e depwi 0,31,2,ret0
>> 1c4: 4b b3 00 20 ldw 10(ret1),r19
>> 1c8: 0a 9c 04 1c sub ret0,r20,ret0
>> 1cc: 0f f0 10 95 ldw 8(r31),r21
>> 1d0: d3 9c 1f 45 extrw,s ret0,26,27,ret0
>> 1d4: 0a b3 0a 13 add,l r19,r21,r19
>> 1d8: d7 9c 09 8c depw,z ret0,19,20,ret0
>> 1dc: 0f e8 10 94 ldw 4(r31),r20
>> 1e0: 08 bc 0a 1c add,l ret0,r5,ret0
>> 1e4: 6b b3 00 20 stw r19,10(ret1)
>> 1e8: 0a b9 0a 15 add,l r25,r21,r21
>> 1ec: 0a 9c 0a 14 add,l ret0,r20,r20
>> __ 1f0: 00 1a 58 20 mtsp r26,sr1 __
>> 1f4: 08 54 0a 1c add,l r20,rp,ret0
>> 1f8: d7 8e 0c 14 depw r14,31,12,ret0
>> 1fc: 0e dc 12 88 stw ret0,4(r22)
>> 200: 06 80 53 13 lci r0(sr1,r20),r19
>> 204: d2 73 1a 74 extrw,u r19,19,12,r19
>> 208: 08 1a 02 5c copy r26,ret0
>> 20c: d7 93 0e 14 depw r19,15,12,ret0
>> 210: 0e dc 12 80 stw ret0,0(r22)
>> 214: 06 c0 12 80 fdc r0(r22)
>> 218: 00 00 04 00 sync
>> [snip]
>>
>> My worry is the number of insn between the ldi 0,r26 and the mtsp r26,sr1.
>>
>> Here, I supposed it's harmless but may be different in other critical path?
>>
>> Wouldn't it be interesting (if possible) to insure that the load of the reg
>> and mtsp stay close together? (or may be simply detect the case of a const == 0?)
>>
>> What your opinion?
>>
>> Tx again,
>> J.
>>
>> PS: I didn't find back gcc-4.0 to check if my previous perf test could failed
>> for this reason?
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
2008-06-19 20:01 ` Joel Soete
@ 2008-06-22 17:15 ` Grant Grundler
0 siblings, 0 replies; 7+ messages in thread
From: Grant Grundler @ 2008-06-22 17:15 UTC (permalink / raw)
To: Joel Soete; +Cc: Grant Grundler, linux-parisc, Helge Deller, kyle
On Thu, Jun 19, 2008 at 08:01:02PM +0000, Joel Soete wrote:
...
>> We could. But I thought there are cases were we might want to DMA to/from
>> user space directly and that's why Space ID is a parameter.
>
> Ok but it was a BUG if temporary space reg is not 0, so???
It was a BUG in 2.4 kernels since 2.4 didn't have the VM support
to do DMA to/from user space.
I thought 2.6 kernels could DMA to/from user space (related to Direct IO).
> btw system seems to boot too with asm volatile("mtsp
> %%r0,%%sr1":::"memory")
> may be the cpp can catch in the mtsp(gr, cr) macro if gr==0 then I can use
> this insn if not the original macro???
Yes. I like willy's proposed patch for this.
thanks,
grant
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
@ 2008-06-20 14:01 Joel Soete
2008-06-20 17:11 ` Matthew Wilcox
0 siblings, 1 reply; 7+ messages in thread
From: Joel Soete @ 2008-06-20 14:01 UTC (permalink / raw)
To: grundler; +Cc: linux-parisc, deller, grundler, kyle
> On Thu, Jun 19, 2008 at 12:40:30PM +0100, Joel Soete wrote:
> > Hello all,
> >
[snip]
> > And the wonder is that my system d380 is even booting and running despite this
> > lake???
> >
> > That said with gcc-4.3, I find well this code iin the 2 functions.
> > And here commes my worry about mtsp() macro: here is the resulting code with
> > gcc-4.3:
Well I was hopping that would be the fix of my perf test but not ;_(
(after a bit less then an hour of tar/rm test the fs became again corrupted)
> > 00000000 <ccio_map_sg>:
> > [snip]
> > __ 1a0: 34 1a 00 00 ldi 0,r26__
> > 1a4: 20 a0 02 00 ldil L%10000000,r5
> > 1a8: 8e 80 61 28 cmpib,> 0,r20,244 <ccio_map_sg+0x244>
> > 1ac: 20 40 0e 01 ldil L%-10000000,rp
> > 1b0: 86 c0 21 88 cmpib,= 0,r22,27c <ccio_map_sg+0x27c>
> > 1b4: 34 19 00 00 ldi 0,r25
> > 1b8: 0f e0 10 9c ldw 0(r31),ret0
> > 1bc: 0d 80 10 94 ldw 0(r12),r20
> > 1c0: d7 80 1c 1e depwi 0,31,2,ret0
> > 1c4: 4b b3 00 20 ldw 10(ret1),r19
> > 1c8: 0a 9c 04 1c sub ret0,r20,ret0
> > 1cc: 0f f0 10 95 ldw 8(r31),r21
> > 1d0: d3 9c 1f 45 extrw,s ret0,26,27,ret0
> > 1d4: 0a b3 0a 13 add,l r19,r21,r19
> > 1d8: d7 9c 09 8c depw,z ret0,19,20,ret0
> > 1dc: 0f e8 10 94 ldw 4(r31),r20
> > 1e0: 08 bc 0a 1c add,l ret0,r5,ret0
> > 1e4: 6b b3 00 20 stw r19,10(ret1)
> > 1e8: 0a b9 0a 15 add,l r25,r21,r21
> > 1ec: 0a 9c 0a 14 add,l ret0,r20,r20
> > __ 1f0: 00 1a 58 20 mtsp r26,sr1 __
> > 1f4: 08 54 0a 1c add,l r20,rp,ret0
> > 1f8: d7 8e 0c 14 depw r14,31,12,ret0
> > 1fc: 0e dc 12 88 stw ret0,4(r22)
> > 200: 06 80 53 13 lci r0(sr1,r20),r19
> > 204: d2 73 1a 74 extrw,u r19,19,12,r19
> > 208: 08 1a 02 5c copy r26,ret0
> > 20c: d7 93 0e 14 depw r19,15,12,ret0
> > 210: 0e dc 12 80 stw ret0,0(r22)
> > 214: 06 c0 12 80 fdc r0(r22)
> > 218: 00 00 04 00 sync
> > [snip]
> >
> > My worry is the number of insn between the ldi 0,r26 and the mtsp r26,sr1.
> >
> > Here, I supposed it's harmless but may be different in other critical path?
> >
> > Wouldn't it be interesting (if possible) to insure that the load of the reg
> > and mtsp stay close together? (or may be simply detect the case of a const
== 0?)
> >
Well I tried something:
@@ -103,10 +108,23 @@
cr; \
})
-#define mtsp(gr, cr) \
- __asm__ __volatile__("mtsp %0,%1" \
- : /* no outputs */ \
- : "r" (gr), "i" (cr) : "memory")
+#define mtsp(lval, i_sr) \
+{ \
+ if (lval) { \
+ unsigned long reg = (unsigned long)(lval); \
+ __asm__ __volatile__( \
+ "mtsp %0, %%sr%1" \
+ : /* no outputs */ \
+ : "r" (reg), "i" (i_sr) \
+ : "memory"); \
+ } else { \
+ __asm__ __volatile__( \
+ "mtsp %%r0, %%sr%0" \
+ : /* no outputs */ \
+ : "i" (i_sr) \
+ : "memory"); \
+ }; \
+}
that does well the drill for ccio-dma driver but else where (memcpy or cache)
it looks worse then the original, so get rid of this idea.
I will go back to sleep on this issue some more month ;<)
Tx again to all for help,
J.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
2008-06-20 14:01 Joel Soete
@ 2008-06-20 17:11 ` Matthew Wilcox
2008-06-21 19:17 ` rubisher
0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2008-06-20 17:11 UTC (permalink / raw)
To: Joel Soete; +Cc: grundler, linux-parisc, deller, kyle
On Fri, Jun 20, 2008 at 03:01:43PM +0100, Joel Soete wrote:
> Well I tried something:
> @@ -103,10 +108,23 @@
> cr; \
> })
>
> -#define mtsp(gr, cr) \
> - __asm__ __volatile__("mtsp %0,%1" \
> - : /* no outputs */ \
> - : "r" (gr), "i" (cr) : "memory")
> +#define mtsp(lval, i_sr) \
> +{ \
> + if (lval) { \
> + unsigned long reg = (unsigned long)(lval); \
> + __asm__ __volatile__( \
> + "mtsp %0, %%sr%1" \
> + : /* no outputs */ \
> + : "r" (reg), "i" (i_sr) \
> + : "memory"); \
> + } else { \
> + __asm__ __volatile__( \
> + "mtsp %%r0, %%sr%0" \
> + : /* no outputs */ \
> + : "i" (i_sr) \
> + : "memory"); \
> + }; \
> +}
>
> that does well the drill for ccio-dma driver but else where (memcpy or cache)
> it looks worse then the original, so get rid of this idea.
Try this instead:
#define mtsp(space, cr) { \
if (__builtin_constant_p(space) && (space == 0)) { \
__asm__ __volatile__("mtsp %%r0, %0" : \
/* no outputs */ : "i" (cr) : "memory"); \
} else { \
__asm__ __volatile__("mtsp %0, %1" : \
/* no outputs */ : "r" (space), "i" (cr) : \
"memory"); \
} \
}
--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Yet another inline asm worry: mtsp() macro (and may be other)?
2008-06-20 17:11 ` Matthew Wilcox
@ 2008-06-21 19:17 ` rubisher
0 siblings, 0 replies; 7+ messages in thread
From: rubisher @ 2008-06-21 19:17 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Joel Soete, grundler, linux-parisc, deller, kyle
Matthew Wilcox wrote:
> On Fri, Jun 20, 2008 at 03:01:43PM +0100, Joel Soete wrote:
>> Well I tried something:
>> @@ -103,10 +108,23 @@
>> cr; \
>> })
>>
>> -#define mtsp(gr, cr) \
>> - __asm__ __volatile__("mtsp %0,%1" \
>> - : /* no outputs */ \
>> - : "r" (gr), "i" (cr) : "memory")
>> +#define mtsp(lval, i_sr) \
>> +{ \
>> + if (lval) { \
>> + unsigned long reg = (unsigned long)(lval); \
>> + __asm__ __volatile__( \
>> + "mtsp %0, %%sr%1" \
>> + : /* no outputs */ \
>> + : "r" (reg), "i" (i_sr) \
>> + : "memory"); \
>> + } else { \
>> + __asm__ __volatile__( \
>> + "mtsp %%r0, %%sr%0" \
>> + : /* no outputs */ \
>> + : "i" (i_sr) \
>> + : "memory"); \
>> + }; \
>> +}
>>
>> that does well the drill for ccio-dma driver but else where (memcpy or cache)
>> it looks worse then the original, so get rid of this idea.
>
> Try this instead:
>
> #define mtsp(space, cr) { \
> if (__builtin_constant_p(space) && (space == 0)) { \
> __asm__ __volatile__("mtsp %%r0, %0" : \
> /* no outputs */ : "i" (cr) : "memory"); \
> } else { \
> __asm__ __volatile__("mtsp %0, %1" : \
> /* no outputs */ : "r" (space), "i" (cr) : \
> "memory"); \
> } \
> }
>
This does the same job for ccio-dma driver, no change for cache but for memory it's a bit mixed, I let you appreciate:
[snip]
00000000 <copy_to_user>: 00000000 <copy_to_user>:
0: 6b c2 3f d9 stw rp,-14(sp) 0: 6b c2 3f d9 stw rp,-14(sp)
4: 34 1c 00 00 ldi 0,ret0 | 4: 00 00 58 20 mtsp r0,sr1
8: 00 1c 58 20 mtsp ret0,sr1 | 8: 03 c0 08 bc mfctl tr6,ret0
c: 03 c0 08 b3 mfctl tr6,r19 | c: 0f 98 10 93 ldw c(ret0),r19
10: 0e 78 10 9c ldw c(r19),ret0 | 10: 86 60 20 28 cmpib,= 0,r19,2c <copy_to_user+0x2c>
14: 93 80 20 00 cmpiclr,= 0,ret0,r0 | 14: 34 1c 00 00 ldi 0,ret0
18: 00 00 c4 bc mfsp sr3,ret0 18: 00 00 c4 bc mfsp sr3,ret0
1c: 00 1c 98 20 mtsp ret0,sr2 1c: 00 1c 98 20 mtsp ret0,sr2
20: 4b c2 3f d9 ldw -14(sp),rp 20: 4b c2 3f d9 ldw -14(sp),rp
24: e8 00 00 00 b,l 2c <copy_to_user+0x2c>,r0 24: e8 00 00 00 b,l 2c <copy_to_user+0x2c>,r0
28: 08 00 02 40 nop 28: 08 00 02 40 nop
2c: 08 00 02 40 nop | 2c: 00 1c 98 20 mtsp ret0,sr2
> 30: 4b c2 3f d9 ldw -14(sp),rp
> 34: e8 00 00 00 b,l 3c <copy_to_user+0x3c>,r0
> 38: 08 00 02 40 nop
> 3c: 08 00 02 40 nop
[snip]
00000000 <copy_from_user>: 00000000 <copy_from_user>:
0: 6b c2 3f d9 stw rp,-14(sp) 0: 6b c2 3f d9 stw rp,-14(sp)
4: 03 c0 08 bc mfctl tr6,ret0 4: 03 c0 08 bc mfctl tr6,ret0
8: 0f 98 10 93 ldw c(ret0),r19 8: 0f 98 10 93 ldw c(ret0),r19
c: 86 60 20 38 cmpib,= 0,r19,30 <copy_from_user+0x30 | c: 86 60 20 30 cmpib,= 0,r19,2c <copy_from_user+0x2c
10: 34 1c 00 00 ldi 0,ret0 10: 34 1c 00 00 ldi 0,ret0
14: 00 00 c4 bc mfsp sr3,ret0 14: 00 00 c4 bc mfsp sr3,ret0
18: 00 1c 58 20 mtsp ret0,sr1 18: 00 1c 58 20 mtsp ret0,sr1
1c: 34 1c 00 00 ldi 0,ret0 | 1c: 00 00 98 20 mtsp r0,sr2
20: 00 1c 98 20 mtsp ret0,sr2 | 20: 4b c2 3f d9 ldw -14(sp),rp
24: 4b c2 3f d9 ldw -14(sp),rp | 24: e8 00 00 00 b,l 2c <copy_from_user+0x2c>,r0
28: e8 00 00 00 b,l 30 <copy_from_user+0x30>,r0 | 28: 08 00 02 40 nop
2c: 08 00 02 40 nop | 2c: 00 1c 58 20 mtsp ret0,sr1
30: 00 1c 58 20 mtsp ret0,sr1 | 30: 00 00 98 20 mtsp r0,sr2
34: 34 1c 00 00 ldi 0,ret0 | 34: 4b c2 3f d9 ldw -14(sp),rp
38: 00 1c 98 20 mtsp ret0,sr2 | 38: e8 00 00 00 b,l 40 <copy_from_user+0x40>,r0
3c: 4b c2 3f d9 ldw -14(sp),rp | 3c: 08 00 02 40 nop
40: e8 00 00 00 b,l 48 <copy_from_user+0x48>,r0 | 40: 08 00 02 40 nop
44: 08 00 02 40 nop <
48: 08 00 02 40 nop <
Disassembly of section .text.copy_in_user: Disassembly of section .text.copy_in_user:
[snip]
00000000 <memcpy>: 00000000 <memcpy>:
0: 6b c2 3f d9 stw rp,-14(sp) 0: 6b c2 3f d9 stw rp,-14(sp)
4: 34 1c 00 00 ldi 0,ret0 | 4: 6f c4 00 80 stw,ma r4,40(sp)
8: 6f c4 00 80 stw,ma r4,40(sp) | 8: 08 1a 02 44 copy r26,r4
c: 08 1a 02 44 copy r26,r4 | c: 00 00 58 20 mtsp r0,sr1
10: 00 1c 58 20 mtsp ret0,sr1 | 10: 00 00 98 20 mtsp r0,sr2
14: 34 13 00 00 ldi 0,r19 | 14: e8 40 00 00 b,l 1c <memcpy+0x1c>,rp
18: 00 13 98 20 mtsp r19,sr2 | 18: 08 00 02 40 nop
1c: e8 40 00 00 b,l 24 <memcpy+0x24>,rp | 1c: 4b c2 3f 59 ldw -54(sp),rp
20: 08 00 02 40 nop | 20: 08 04 02 5c copy r4,ret0
24: 4b c2 3f 59 ldw -54(sp),rp | 24: e8 40 c0 00 bv r0(rp)
28: 08 04 02 5c copy r4,ret0 | 28: 4f c4 3f 81 ldw,mb -40(sp),r4
2c: e8 40 c0 00 bv r0(rp) <
30: 4f c4 3f 81 ldw,mb -40(sp),r4 <
[snip]
Anyway, it's far well better than mine.
Tx,
J.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-06-22 17:15 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-19 11:40 Yet another inline asm worry: mtsp() macro (and may be other)? Joel Soete
2008-06-19 16:11 ` Grant Grundler
2008-06-19 20:01 ` Joel Soete
2008-06-22 17:15 ` Grant Grundler
-- strict thread matches above, loose matches on Subject: below --
2008-06-20 14:01 Joel Soete
2008-06-20 17:11 ` Matthew Wilcox
2008-06-21 19:17 ` rubisher
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox