* in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-15 12:37 rubisher
2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
0 siblings, 2 replies; 16+ messages in thread
From: rubisher @ 2008-06-15 12:37 UTC (permalink / raw)
To: linux-parisc
Hello all,
looking at this hunk:
void CCIO_INLINE
ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
unsigned long hints)
{
register unsigned long pa;
register unsigned long ci; /* coherent index */
/* We currently only support kernel addresses */
BUG_ON(sid != KERNEL_SPACE);
mtsp(sid,1);
[snip]
pa = virt_to_phys(vba);
asm volatile("depw %1,31,12,%0" : "+r" (pa) : "r" (hints));
((u32 *)pdir_ptr)[1] = (u32) pa;
[snip]
pa = 0;
[snip]
asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
((u32 *)pdir_ptr)[0] = (u32) pa;
[snip]
asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr));
asm volatile("sync");
}
(I just remove comments and 64bit stuff)
and I noticed that resulting code looks like:
0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
4: 34 1c 00 00 ldi 0,ret0
8: 00 1c 58 20 mtsp ret0,sr1
c: 22 60 0e 01 ldil L%-10000000,r19
10: 0a 78 0a 13 add,l r24,r19,r19
14: d6 77 0c 14 depw r23,31,12,r19
18: 0f 53 12 88 stw r19,4(r26)
1c: 07 00 53 1c lci r0(sr1,r24),ret0
20: d3 9c 1a 74 extrw,u ret0,19,12,ret0
24: d7 3c 0e 14 depw ret0,15,12,r25
28: 0f 59 12 80 stw r25,0(r26)
2c: 07 40 12 80 fdc r0(r26)
30: 00 00 04 00 sync
34: e8 40 c0 02 bv,n r0(rp)
38: 03 ff e0 1f break 1f,1fff
3c: e8 1f 1f f7 b,l,n 3c <ccio_io_pdir_entry+0x3c>,r0
Disassembly of section .text.ccio_proc_bitmap_open:
And my worry was about lines 4: and 8:.
According to the C code, I don't understand why optimization want to initialize sr1 to 0 while it should be set to r25 (i.e.
arg1)?
Otoh, the sba botherhood code didn't showing the same behaviour:
0: 22 a0 0e 01 ldil L%-10000000,r21
4: 34 1c 00 00 ldi 0,ret0
8: 34 1d 20 01 ldi -1000,ret1
c: 0a b8 0a 15 add,l r24,r21,r21
10: 08 15 02 56 copy r21,r22
14: 34 15 00 00 ldi 0,r21
18: 0b 95 02 15 and r21,ret0,r21
1c: 0b b6 02 16 and r22,ret1,r22
20: 00 19 58 20 mtsp r25,sr1
24: 07 00 53 13 lci r0(sr1,r24),r19
28: d2 73 1a 6c extrw,u r19,19,20,r19
2c: 23 80 00 01 ldil L%-80000000,ret0
30: 34 1d 00 00 ldi 0,ret1
but didn't start with BUG_ON(), I simply try to remove this from ccio code and get a better result:
00000000 <ccio_io_pdir_entry>:
0: 00 19 58 20 mtsp r25,sr1
4: 23 80 0e 01 ldil L%-10000000,ret0
8: 0b 98 0a 1c add,l r24,ret0,ret0
c: d7 97 0c 14 depw r23,31,12,ret0
10: 0f 5c 12 88 stw ret0,4(r26)
14: 07 00 53 18 lci r0(sr1,r24),r24
18: d3 18 1a 74 extrw,u r24,19,12,r24
1c: 34 1c 00 00 ldi 0,ret0
20: d7 98 0e 14 depw r24,15,12,ret0
24: 0f 5c 12 80 stw ret0,0(r26)
28: 07 40 12 80 fdc r0(r26)
2c: 00 00 04 00 sync
30: e8 40 c0 02 bv,n r0(rp)
Disassembly of section .init.text:
But this time, it seems not consider assembly:
asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
as a 'volatile' block and insert line 1c:
This could may be solved by re-write as an one 'volatile' asm block like:
asm volatile (
"lci %%r0(%%sr1, %1), %1"
"\textru %1,19,12,%1\n"
"\tdepw %1,15,12,%0\n"
: "=r" (pa)
: "r" (vba));
and even add a clobber 'memory'
asm volatile (
"lci %%r0(%%sr1, %1), %1"
"\textru %1,19,12,%1\n"
"\tdepw %1,15,12,%0\n"
: "=r" (pa)
: "r" (vba)
: "memory");
But I have no clue how to restore BUG_ON() and avoid wrong optimization?
Any idea?
Tia,
r.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
@ 2008-06-16 11:37 ` rubisher
2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
1 sibling, 0 replies; 16+ messages in thread
From: rubisher @ 2008-06-16 11:37 UTC (permalink / raw)
To: linux-parisc
wierd, though.
rubisher <rubisher <at> scarlet.be> writes:
>
> Hello all,
>
> looking at this hunk:
> void CCIO_INLINE
> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
> unsigned long hints)
> {
> register unsigned long pa;
> register unsigned long ci; /* coherent index */
>
> /* We currently only support kernel addresses */
> BUG_ON(sid != KERNEL_SPACE);
>
no pb to get rid of this BUG_ON(): each place I see it called sid == KERNEL_SPACE
the optimized produced code looks so Ok
[snip]
>
> But this time, it seems not consider assembly:
> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
>
> as a 'volatile' block and insert line 1c:
> This could may be solved by re-write as an one 'volatile' asm block like:
> asm volatile (
> "lci %%r0(%%sr1, %1), %1"
> "\textru %1,19,12,%1\n"
> "\tdepw %1,15,12,%0\n"
> : "=r" (pa)
> : "r" (vba));
>
> and even add a clobber 'memory'
> asm volatile (
> "lci %%r0(%%sr1, %1), %1"
> "\textru %1,19,12,%1\n"
> "\tdepw %1,15,12,%0\n"
> : "=r" (pa)
> : "r" (vba)
> : "memory");
>
> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
>
But here is what is very wierd, I tested severall re-writen on this code amoung
latest:
asm volatile (
"lci %%r0(%%sr1, %1), %%r19"
"\textru %%r19,19,12,%%r19\n"
"\tdepw %%r19,15,12,%0\n"
: "=r" (pa)
: "r" (vba)
: "r19" );
but this system failed to boot on ncr_attach: pim info seems to point out to
this place:
PROCESSOR PIM INFORMATION
----------------- Processor 0 HPMC Information ------------------
Timestamp = Mon Jun 16 09:50:32 GMT 2008 (20:08:06:16:09:50:32)
HPMC Chassis Codes = 0xcbf0 0x5002 0x7d03 0x5402 0x5508 0xcbf2
0xcbfc
General Registers 0 - 31
00-03 0000000000000000 18711a0000001194 0000000010514bf8 000000001fd48000
04-07 0000000000000003 0000000010520308 000000001fdf2400 000000001fc20414
08-11 000000001049f3d8 000000001fc20088 000000001051ce20 00000000104e03cc
12-15 000000001fd4802c 000000001fdf2704 0000000010425000 0000000010425000
16-19 0000000010403b78 0000000010403ba8 0000000010403bb8 0000000000024800
20-23 00000000000a142c 00000000000f4240 000000001fd48000 0000000000001458
24-27 0000000000024800 0000000000000000 000000001fd48000 000000001047de20
28-31 0000000000000001 61c4680000004650 000000001fc20540 000000001029bfdc
Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 00000000000000c0 000000000000001d
12-15 0000000000000000 0000000000000000 0000000000110000 00000000e1000000
16-19 000000248ca1a027 0000000000000000 0000000010514c6c 00000000d39c1bfd
20-23 000000009227ffc4 000000000102482c 000000ff0004f90f 0000000080000000
24-27 00000000004c2000 00000000004c2000 00000000aaaaaaaa 00000000aaaaaaaa
28-31 00000000000003d0 0000000011111111 000000001fc20000 0000000011111111
Space Registers 0 - 7
00-03 00000000 00000000 00000000 00000000
04-07 00000000 00000000 00000000 00000000
IIA Space = 0x0000000000000000
IIA Offset = 0x0000000010514c70
Check Type = 0x20000000
CPU State = 0x9e000004
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00305004
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x000000fff1004817
System Requestor Address = 0xfffffffffffa0000
Check Summary = 0x8000000810284000
Available Memory = 0x0000000010000000
CPU Diagnose Register 2 = 0x0301000004000004
CPU Status Register 0 = 0x1440020000000000
CPU Status Register 1 = 0x8000000800000000
SADD LOG = 0x0000000100000001
Read Short LOG = 0xc10010fff1004817
----------------- Processor 0 LPMC Information ------------------
Check Type = 0x00000000
IC Parity Info = 0x00000000
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00000000
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x0000000000000000
System Requestor Address = 0x0000000000000000
----------------- Processor 0 TOC Information -------------------
General Registers 0 - 31
00-03 0000000000000000 000000000aba9500 00000000f005d12c 0000002d110b5065
04-07 0000000000000000 fffffffff0101120 00000000000f4240 0000000000000005
08-11 00000000f0102cd4 000000001fc20088 000000001051ce20 00000000104e03cc
12-15 000000001fd4802c 000000001fdf2704 0000000010425000 0000000010425000
16-19 0000000010403b78 0000000010403ba8 0000000010403bb8 fffffffff0160007
20-23 0000000000000001 00000000000f4240 000000001fd48000 0000000000000000
24-27 0000000000000000 0000000000000000 0000002d13066c03 fffffffff0102400
28-31 0000002d1bc5e565 0301000004000004 fffffffff0102e30 0301000004000004
Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 00000000000000c0 000000000000001d
12-15 0000000000000000 0000000000000000 000000fff0010000 0000000000000000
16-19 0000002d130a7a0d 0000000000000000 00000000f005d154 000000000804025c
20-23 000000009227c3c0 c000000040902e44 0000000000291008 0000000080000000
24-27 00000000004c2000 00000000004c2000 00000000aaaaaaaa 00000000aaaaaaaa
28-31 00000000000003d0 0000000011111111 000000001fc20000 0000000011111111
Space Registers 0 - 7
00-03 00000000 00000000 00000000 00000000
04-07 00000000 00000000 00000000 00000000
IIA Space = 0x0000000000000000
IIA Offset = 0x00000000f005d134
CPU State = 0x9e000001
Memory Error Log Information:
Timestamp = Mon Jun 16 09:50:32 GMT 2008 (20:08:06:16:09:50:32)
Trans Addr Central Bus
Status Requestor id par CP AD DV Address/Data
------ ---------- ----- ---- -- -- -- ---------------------
0x14 0xfffa0000 0x00 0x0 0 0 0 0x00000000 0x00000000
Memory
ECC Reg Address Data
---------- ---------- ----------
0x00000000 0x00000000 0x00000000
0x00000000
0x00000000
0x00000000
0x00000000
I/O Module Error Log Information:
Timestamp = Mon Jun 16 09:50:32 GMT 2008 (20:08:06:16:09:50:32)
Bus HPA Module Type Path Slt Md Sev Estat Requestor Responder
--- ---------- ---------------- -------- -- -- ---- ----- ---------- ----------
0 0xfff88000 I/O Adapter 8 2 0 he 0x04 0xf1004000 0x000a1420
0 0xfff8a000 I/O Adapter 10 2 2 he 0x0d 0x00000000 0x00000000
IO Bus Converter Log
--------------------
IOA0 HPA = 0xfff88000
IOA0 UBC_IO_CONTROL = 0x00020080
GSC1 HPA = 0xf1000001
IOA0 IO_IO_LO = 0xf0800000
IOA0 IO_IO_HI = 0xf17f0000
IOA0 IO_IO_LO_HV = 0xfc000000
IOA0 IO_IO_HI_HV = 0xfff80000
IOA1 HPA = 0xfff8a000
IOA1 UBC_IO_CONTROL = 0x00020080
GSC2 HPA = 0xf1800001
IOA1 IO_IO_LO = 0xf1800000
IOA1 IO_IO_HI = 0xf2000000
IOA1 IO_IO_LO_HV = 0x00000000
IOA1 IO_IO_HI_HV = 0x00000000
while IAOQ point to
IIA Offset = 0x0000000010514c70
...
e.g. IAOQ = 0x0000000010514c70
Parse IAOQ = 0x0000000010514c70 for CPU[0]
Func: ncr_attach, Off: 0xc4c, Addr: 0x10514c70
...
10514c60: 80 95 20 28 cmpb,= r21,r4,10514c7c <.L1517+0x660>
...
10514c70: 87 80 3f d7 cmpib,=,n 0,ret0,10514c60 <.L1517+0x644>
10514c74: 34 84 00 02 ldo 1(r4),r4
10514c78: 08 03 02 5a copy r3,r26
10514c7c: 34 19 00 c8 ldi 64,r25
Any idea what I missed in this chg?
Tia,
r.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
@ 2008-06-19 16:04 ` Grant Grundler
2008-06-19 19:44 ` Joel Soete
1 sibling, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-19 16:04 UTC (permalink / raw)
To: rubisher; +Cc: linux-parisc
On Sun, Jun 15, 2008 at 12:37:25PM +0000, rubisher wrote:
> Hello all,
>
> looking at this hunk:
> void CCIO_INLINE
> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
> unsigned long hints)
> {
> register unsigned long pa;
> register unsigned long ci; /* coherent index */
>
> /* We currently only support kernel addresses */
> BUG_ON(sid != KERNEL_SPACE);
...
> and I noticed that resulting code looks like:
> 0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
> 4: 34 1c 00 00 ldi 0,ret0
The BUG_ON is causing the movb to be inserted. And then the compiler knows
the value is zero and can either copy from a register or "load immediate 0".
It probably chose the "ldi 0" because it avoids register interlocks and
can always be executed.
The movb will either branch to +0x38 (and nullifies the delay slot)
or execute the ldi. So it looks right to me.
BTW, I think the BUG_ON can go away. It's good for debugging but doesn't
need to be in every kernel.
...
> And my worry was about lines 4: and 8:.
> According to the C code, I don't understand why optimization want to
> initialize sr1 to 0 while it should be set to r25 (i.e. arg1)?
Does the BUG_ON explaination make sense to you?
>
> Otoh, the sba botherhood code didn't showing the same behaviour:
> 0: 22 a0 0e 01 ldil L%-10000000,r21
> 4: 34 1c 00 00 ldi 0,ret0
> 8: 34 1d 20 01 ldi -1000,ret1
> c: 0a b8 0a 15 add,l r24,r21,r21
> 10: 08 15 02 56 copy r21,r22
> 14: 34 15 00 00 ldi 0,r21
> 18: 0b 95 02 15 and r21,ret0,r21
> 1c: 0b b6 02 16 and r22,ret1,r22
> 20: 00 19 58 20 mtsp r25,sr1
> 24: 07 00 53 13 lci r0(sr1,r24),r19
> 28: d2 73 1a 6c extrw,u r19,19,20,r19
> 2c: 23 80 00 01 ldil L%-80000000,ret0
> 30: 34 1d 00 00 ldi 0,ret1
>
> but didn't start with BUG_ON(),
Right. That should be a clue. :)
hth,
grant
> I simply try to remove this from ccio code
> and get a better result:
> 00000000 <ccio_io_pdir_entry>:
> 0: 00 19 58 20 mtsp r25,sr1
> 4: 23 80 0e 01 ldil L%-10000000,ret0
> 8: 0b 98 0a 1c add,l r24,ret0,ret0
> c: d7 97 0c 14 depw r23,31,12,ret0
> 10: 0f 5c 12 88 stw ret0,4(r26)
> 14: 07 00 53 18 lci r0(sr1,r24),r24
> 18: d3 18 1a 74 extrw,u r24,19,12,r24
> 1c: 34 1c 00 00 ldi 0,ret0
> 20: d7 98 0e 14 depw r24,15,12,ret0
> 24: 0f 5c 12 80 stw ret0,0(r26)
> 28: 07 40 12 80 fdc r0(r26)
> 2c: 00 00 04 00 sync
> 30: e8 40 c0 02 bv,n r0(rp)
> Disassembly of section .init.text:
>
> But this time, it seems not consider assembly:
> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
>
> as a 'volatile' block and insert line 1c:
> This could may be solved by re-write as an one 'volatile' asm block like:
> asm volatile (
> "lci %%r0(%%sr1, %1), %1"
> "\textru %1,19,12,%1\n"
> "\tdepw %1,15,12,%0\n"
> : "=r" (pa)
> : "r" (vba));
>
> and even add a clobber 'memory'
> asm volatile (
> "lci %%r0(%%sr1, %1), %1"
> "\textru %1,19,12,%1\n"
> "\tdepw %1,15,12,%0\n"
> : "=r" (pa)
> : "r" (vba)
> : "memory");
>
> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
>
> Any idea?
>
> Tia,
> r.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
@ 2008-06-19 19:44 ` Joel Soete
2008-06-19 22:48 ` John David Anglin
0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-19 19:44 UTC (permalink / raw)
To: Grant Grundler; +Cc: rubisher, linux-parisc
Grant Grundler wrote:
> On Sun, Jun 15, 2008 at 12:37:25PM +0000, rubisher wrote:
>> Hello all,
>>
>> looking at this hunk:
>> void CCIO_INLINE
>> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
>> unsigned long hints)
>> {
>> register unsigned long pa;
>> register unsigned long ci; /* coherent index */
>>
>> /* We currently only support kernel addresses */
>> BUG_ON(sid != KERNEL_SPACE);
> ...
>> and I noticed that resulting code looks like:
>> 0: cb 39 a0 60 movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
>> 4: 34 1c 00 00 ldi 0,ret0
>
> The BUG_ON is causing the movb to be inserted. And then the compiler knows
> the value is zero and can either copy from a register or "load immediate 0".
> It probably chose the "ldi 0" because it avoids register interlocks and
> can always be executed.
>
> The movb will either branch to +0x38 (and nullifies the delay slot)
> or execute the ldi. So it looks right to me.
>
> BTW, I think the BUG_ON can go away. It's good for debugging but doesn't
> need to be in every kernel.
>
> ...
>> And my worry was about lines 4: and 8:.
>> According to the C code, I don't understand why optimization want to
>> initialize sr1 to 0 while it should be set to r25 (i.e. arg1)?
>
> Does the BUG_ON explaination make sense to you?
>
Well I nerver imagine that a compiler can make proof of this kind of induction spirit ;-) but that's give the sense, tx.
>> Otoh, the sba botherhood code didn't showing the same behaviour:
>> 0: 22 a0 0e 01 ldil L%-10000000,r21
>> 4: 34 1c 00 00 ldi 0,ret0
>> 8: 34 1d 20 01 ldi -1000,ret1
>> c: 0a b8 0a 15 add,l r24,r21,r21
>> 10: 08 15 02 56 copy r21,r22
>> 14: 34 15 00 00 ldi 0,r21
>> 18: 0b 95 02 15 and r21,ret0,r21
>> 1c: 0b b6 02 16 and r22,ret1,r22
>> 20: 00 19 58 20 mtsp r25,sr1
>> 24: 07 00 53 13 lci r0(sr1,r24),r19
>> 28: d2 73 1a 6c extrw,u r19,19,20,r19
>> 2c: 23 80 00 01 ldil L%-80000000,ret0
>> 30: 34 1d 00 00 ldi 0,ret1
>>
>> but didn't start with BUG_ON(),
>
> Right. That should be a clue. :)
>
> hth,
> grant
>
Tx again,
J.
ps: ccio_io_pdir_entry() seems to works fine without BUG_ON().
>> I simply try to remove this from ccio code
>> and get a better result:
>> 00000000 <ccio_io_pdir_entry>:
>> 0: 00 19 58 20 mtsp r25,sr1
>> 4: 23 80 0e 01 ldil L%-10000000,ret0
>> 8: 0b 98 0a 1c add,l r24,ret0,ret0
>> c: d7 97 0c 14 depw r23,31,12,ret0
>> 10: 0f 5c 12 88 stw ret0,4(r26)
>> 14: 07 00 53 18 lci r0(sr1,r24),r24
>> 18: d3 18 1a 74 extrw,u r24,19,12,r24
>> 1c: 34 1c 00 00 ldi 0,ret0
>> 20: d7 98 0e 14 depw r24,15,12,ret0
>> 24: 0f 5c 12 80 stw ret0,0(r26)
>> 28: 07 40 12 80 fdc r0(r26)
>> 2c: 00 00 04 00 sync
>> 30: e8 40 c0 02 bv,n r0(rp)
>> Disassembly of section .init.text:
>>
>> But this time, it seems not consider assembly:
>> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
>> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
>>
>> as a 'volatile' block and insert line 1c:
>> This could may be solved by re-write as an one 'volatile' asm block like:
>> asm volatile (
>> "lci %%r0(%%sr1, %1), %1"
>> "\textru %1,19,12,%1\n"
>> "\tdepw %1,15,12,%0\n"
>> : "=r" (pa)
>> : "r" (vba));
>>
>> and even add a clobber 'memory'
>> asm volatile (
>> "lci %%r0(%%sr1, %1), %1"
>> "\textru %1,19,12,%1\n"
>> "\tdepw %1,15,12,%0\n"
>> : "=r" (pa)
>> : "r" (vba)
>> : "memory");
>>
>> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
>>
>> Any idea?
>>
>> Tia,
>> r.
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
2008-06-19 19:44 ` Joel Soete
@ 2008-06-19 22:48 ` John David Anglin
0 siblings, 0 replies; 16+ messages in thread
From: John David Anglin @ 2008-06-19 22:48 UTC (permalink / raw)
To: Joel Soete; +Cc: grundler, rubisher, linux-parisc
> > Does the BUG_ON explaination make sense to you?
> >
> Well I nerver imagine that a compiler can make proof of this kind of induction spirit ;-) but that's give the sense, tx.
Yes, the current compiler eliminates a lot of dead code based on induction.
This occurs a lot when functions are inlined, etc. Makes debugging a real
problem.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-20 6:37 Joel Soete
0 siblings, 0 replies; 16+ messages in thread
From: Joel Soete @ 2008-06-20 6:37 UTC (permalink / raw)
To: dave; +Cc: grundler, rubisher, linux-parisc
> > > Does the BUG_ON explaination make sense to you?
> > >
> > Well I nerver imagine that a compiler can make proof of this kind of
induction spirit ;-) but that's give the sense, tx.
>
> Yes, the current compiler eliminates a lot of dead code based on induction.
> This occurs a lot when functions are inlined, etc. Makes debugging a real
> problem.
>
Thanks Dave (after a while away from this parisc hobby, it's nice to learn
fresh info).
Cheers,
J.
> Dave
> --
> J. David Anglin dave.anglin@nrc-cnrc.gc.ca
> National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-26 6:28 Joel Soete
2008-06-28 20:23 ` Grant Grundler
0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-26 6:28 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-parisc
Just for remind:
Joel Soete wrote:
>
>
[snip]
>>> But this time, it seems not consider assembly:
>>> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
>>> (vba));
>>> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>>> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
>>>
>>> as a 'volatile' block and insert line 1c:
This seems to make better what I want:
__asm__ __volatile__ (
"lci %%r0(%%sr1, %2), %0\n"
"\textru %0,19,12,%0\n"
"\tdepw %0,15,12,%1\n"
: "+r" (ci), "+r" (pa)
: "r" (vba)
: "memory"
);
in <ccio_map_sg>
200: 06 80 53 13 lci r0(sr1,r20),r19 | 200: 08 1a 02 54 copy r26,r20
204: d2 73 1a 74 extrw,u r19,19,12,r19 | 204: 06 a0 53 14 lci
r0(sr1,r21),r20
208: 08 1a 02 5c copy r26,ret0 | 208: d2 94 1a 74 extrw,u
r20,19,12,r20
20c: d7 93 0e 14 depw r19,15,12,ret0 | 20c: d7 94 0e 14 depw
r20,15,12,ret0
210: 0e dc 12 80 stw ret0,0(r22) 210: 0e dc 12 80 stw ret0,0(r22)
214: 06 c0 12 80 fdc r0(r22) 214: 06 c0 12 80 fdc r0(r22)
218: 00 00 04 00 sync 218: 00 00 04 00 sync
J.
PS: I don't yet understand why kernel panicing when I try to get rid to
reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
produced object seems ok but resulting kernel panicing (even after a full
rebuild after a make distclean???)
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-26 6:28 in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization? Joel Soete
@ 2008-06-28 20:23 ` Grant Grundler
2008-06-28 22:26 ` Joel Soete
0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-28 20:23 UTC (permalink / raw)
To: Joel Soete; +Cc: Grant Grundler, linux-parisc
On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
> Just for remind:
>
> Joel Soete wrote:
> >
> >
> [snip]
> >>> But this time, it seems not consider assembly:
> >>> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
> >>> (vba));
> >>> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
> >>> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
> >>>
> >>> as a 'volatile' block and insert line 1c:
>
> This seems to make better what I want:
> __asm__ __volatile__ (
> "lci %%r0(%%sr1, %2), %0\n"
> "\textru %0,19,12,%0\n"
> "\tdepw %0,15,12,%1\n"
> : "+r" (ci), "+r" (pa)
> : "r" (vba)
> : "memory"
Why the "memory"? This asm code isn't modifying memory at all.
> );
>
>
> in <ccio_map_sg>
> 200: 06 80 53 13 lci r0(sr1,r20),r19 | 200: 08 1a 02 54 copy r26,r20
> 204: d2 73 1a 74 extrw,u r19,19,12,r19 | 204: 06 a0 53 14 lci
> r0(sr1,r21),r20
> 208: 08 1a 02 5c copy r26,ret0 | 208: d2 94 1a 74 extrw,u
> r20,19,12,r20
> 20c: d7 93 0e 14 depw r19,15,12,ret0 | 20c: d7 94 0e 14 depw
> r20,15,12,ret0
> 210: 0e dc 12 80 stw ret0,0(r22) 210: 0e dc 12 80 stw ret0,0(r22)
> 214: 06 c0 12 80 fdc r0(r22) 214: 06 c0 12 80 fdc r0(r22)
> 218: 00 00 04 00 sync 218: 00 00 04 00 sync
>
> J.
>
> PS: I don't yet understand why kernel panicing when I try to get rid to
> reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
> produced object seems ok but resulting kernel panicing (even after a full
> rebuild after a make distclean???)
What's wrong with "ci" variable? It's just another register.
C compiler will allocate it and then we pass it to the asm().
And are you sure it's ok to use r19?
hth,
grant
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-28 20:23 ` Grant Grundler
@ 2008-06-28 22:26 ` Joel Soete
2008-06-28 22:45 ` John David Anglin
2008-06-29 20:52 ` Grant Grundler
0 siblings, 2 replies; 16+ messages in thread
From: Joel Soete @ 2008-06-28 22:26 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-parisc
Hello Grant,
Grant Grundler wrote:
> On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
>> Just for remind:
>>
>> Joel Soete wrote:
>> >
>> >
>> [snip]
>> >>> But this time, it seems not consider assembly:
>> >>> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
>> >>> (vba));
>> >>> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>> >>> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
>> >>>
>> >>> as a 'volatile' block and insert line 1c:
>>
>> This seems to make better what I want:
>> __asm__ __volatile__ (
>> "lci %%r0(%%sr1, %2), %0\n"
>> "\textru %0,19,12,%0\n"
>> "\tdepw %0,15,12,%1\n"
>> : "+r" (ci), "+r" (pa)
>> : "r" (vba)
>> : "memory"
>
> Why the "memory"? This asm code isn't modifying memory at all.
>
well I certainly still have wrongly understand what "memory" clobber means but for me "pa" is a memory location modified
(the plus sign in '"+r" (pa)' and that either after 64bit computation or 32bit initialization pa=0).
But your comment make me fill I am again wrong (well definitively C is not my language ;-) .)
>> );
>>
>>
>> in <ccio_map_sg>
>> 200: 06 80 53 13 lci r0(sr1,r20),r19 | 200: 08 1a 02 54 copy r26,r20
>> 204: d2 73 1a 74 extrw,u r19,19,12,r19 | 204: 06 a0 53 14 lci
>> r0(sr1,r21),r20
>> 208: 08 1a 02 5c copy r26,ret0 | 208: d2 94 1a 74 extrw,u
>> r20,19,12,r20
>> 20c: d7 93 0e 14 depw r19,15,12,ret0 | 20c: d7 94 0e 14 depw
>> r20,15,12,ret0
>> 210: 0e dc 12 80 stw ret0,0(r22) 210: 0e dc 12 80 stw ret0,0(r22)
>> 214: 06 c0 12 80 fdc r0(r22) 214: 06 c0 12 80 fdc r0(r22)
>> 218: 00 00 04 00 sync 218: 00 00 04 00 sync
>>
>> J.
>>
>> PS: I don't yet understand why kernel panicing when I try to get rid to
>> reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
>> produced object seems ok but resulting kernel panicing (even after a full
>> rebuild after a make distclean???)
>
> What's wrong with "ci" variable? It's just another register.
well the idea is that "ci" variable is just there as a tmp variable just used in this asm code, right.
so why not simply get rid of its declaration (btw save a line of code ;-)) and just use a common temporay reg as r19?
but what I show above is just a 'diff -y' between a objdump's hunk of ccio-dma.o:
on the left side what gcc produce with the original code (3 different asm lines) (this kernel was well booting fine)
on the right side what same compile produce with new code (1 asm line).
Otoh, based on my few knowldge on parisc asm, it seems to me that something like:
__asm__ __volatile__ (
"lci %%r0(%%sr1, %1), %%r19\n"
"\textru %%r19,19,12,%%r19\n"
"\tdepw %%r19,15,12,%0\n"
: "+r" (pa)
: "r" (vba)
: "r19")
would do the same job, but all my test results was a panicing kernel???
(even thought if I am diffing 2 objdump of the only ccio-dma.o, it looks like gr are just used differently. but may be did I
missed another wrong stuff).
That said, nothing of that help to fixe ccio-dma driver issue ;-(
My latest hope is relayfs to grab more debug info (I just need a bit more free time).
> C compiler will allocate it and then we pass it to the asm().
>
> And are you sure it's ok to use r19?
>
yes that was the only lines, I was working on.
Tx,
J.
> hth,
> grant
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-28 22:26 ` Joel Soete
@ 2008-06-28 22:45 ` John David Anglin
2008-06-29 20:52 ` Grant Grundler
1 sibling, 0 replies; 16+ messages in thread
From: John David Anglin @ 2008-06-28 22:45 UTC (permalink / raw)
To: Joel Soete; +Cc: grundler, linux-parisc
> Grant Grundler wrote:
> > On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
> >> Just for remind:
> >>
> >> Joel Soete wrote:
> >> >
> >> >
> >> [snip]
> >> >>> But this time, it seems not consider assembly:
> >> >>> asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
> >> >>> (vba));
> >> >>> asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
> >> >>> asm volatile ("depw %1,15,12,%0" : "+r" (pa) : "r" (ci));
> >> >>>
> >> >>> as a 'volatile' block and insert line 1c:
> >>
> >> This seems to make better what I want:
> >> __asm__ __volatile__ (
> >> "lci %%r0(%%sr1, %2), %0\n"
> >> "\textru %0,19,12,%0\n"
> >> "\tdepw %0,15,12,%1\n"
> >> : "+r" (ci), "+r" (pa)
> >> : "r" (vba)
> >> : "memory"
> >
> > Why the "memory"? This asm code isn't modifying memory at all.
> >
> well I certainly still have wrongly understand what "memory" clobber means but for me "pa" is a memory location modified
> (the plus sign in '"+r" (pa)' and that either after 64bit computation or 32bit initialization pa=0).
The "+r" constraint indicates a register operand that is both read and written.
However, "ci" and "pa" are not read, so the constraints for them should be
"=r". None of the instructions modify memory, so the "memory" clobber
is unnecessary.
> so why not simply get rid of its declaration (btw save a line of code ;-)) and just use a common temporay reg as r19?
r19 is the pic register in 64-bit mode and can't be used without saving
and restoring it.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-28 22:26 ` Joel Soete
2008-06-28 22:45 ` John David Anglin
@ 2008-06-29 20:52 ` Grant Grundler
2008-06-30 18:28 ` Joel Soete
1 sibling, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-29 20:52 UTC (permalink / raw)
To: Joel Soete; +Cc: linux-parisc
On Sat, Jun 28, 2008 at 10:26:25PM +0000, Joel Soete wrote:
...
>> What's wrong with "ci" variable? It's just another register.
> well the idea is that "ci" variable is just there as a tmp variable just
> used in this asm code, right.
> so why not simply get rid of its declaration (btw save a line of code ;-))
> and just use a common temporay reg as r19?
No - don't do that. Let the compiler do the work of allocating registers.
It's really good at it. :) Much better than you and I combined. :)
All we need to do is tell the compiler we need a register.
cheers,
grant
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-29 20:52 ` Grant Grundler
@ 2008-06-30 18:28 ` Joel Soete
2008-07-02 4:28 ` Grant Grundler
0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-30 18:28 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-parisc
Hello Grant,
Grant Grundler wrote:
> On Sat, Jun 28, 2008 at 10:26:25PM +0000, Joel Soete wrote:
> ...
>>> What's wrong with "ci" variable? It's just another register.
>> well the idea is that "ci" variable is just there as a tmp variable just
>> used in this asm code, right.
>> so why not simply get rid of its declaration (btw save a line of code ;-))
>> and just use a common temporay reg as r19?
>
> No - don't do that. Let the compiler do the work of allocating registers.
> It's really good at it. :) Much better than you and I combined. :)
You and gcc already prove it to me ;-)
> All we need to do is tell the compiler we need a register.
>
Ok
Taken also into account of David remarks, here is the latest stuff tested with success:
__asm__ __volatile__ (
"lci %%r0(%%sr1, %2), %0\n"
"\textru %0,19,12,%0\n"
"\tdepw %0,15,12,%1\n"
: "=r" (ci), "=r" (pa)
: "r" (vba)
);
Tx to all for kind comments, they learn me a lot ;-)
J.
> cheers,
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-06-30 18:28 ` Joel Soete
@ 2008-07-02 4:28 ` Grant Grundler
2008-07-02 18:01 ` Joel Soete
0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-07-02 4:28 UTC (permalink / raw)
To: Joel Soete; +Cc: Grant Grundler, linux-parisc
On Mon, Jun 30, 2008 at 06:28:36PM +0000, Joel Soete wrote:
...
> Taken also into account of David remarks, here is the latest stuff tested
> with success:
>
> __asm__ __volatile__ (
> "lci %%r0(%%sr1, %2), %0\n"
> "\textru %0,19,12,%0\n"
> "\tdepw %0,15,12,%1\n"
> : "=r" (ci), "=r" (pa)
> : "r" (vba)
> );
So can you remind me (and Kyle) why this is better than what is
there now?
> Tx to all for kind comments, they learn me a lot ;-)
welcome! :)
grant
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-07-02 4:28 ` Grant Grundler
@ 2008-07-02 18:01 ` Joel Soete
2008-07-07 15:28 ` Grant Grundler
0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-07-02 18:01 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-parisc
Grant Grundler wrote:
> On Mon, Jun 30, 2008 at 06:28:36PM +0000, Joel Soete wrote:
> ...
>> Taken also into account of David remarks, here is the latest stuff tested
>> with success:
>>
>> __asm__ __volatile__ (
>> "lci %%r0(%%sr1, %2), %0\n"
>> "\textru %0,19,12,%0\n"
>> "\tdepw %0,15,12,%1\n"
>> : "=r" (ci), "=r" (pa)
>> : "r" (vba)
>> );
>
> So can you remind me (and Kyle) why this is better than what is
> there now?
>
mmm may be because gcc consider it now as only 1 'volatile' instruction:
in <ccio_map_sg>
-----------< Original code > --------------+------------< New stuff >-----------------
200: 06 80 53 13 lci r0(sr1,r20),r19 | 200: 08 1a 02 54 copy r26,r20
204: d2 73 1a 74 extrw,u r19,19,12,r19 | 204: 06 a0 53 14 lci r0(sr1,r21),r20
208: 08 1a 02 5c copy r26,ret0 | 208: d2 94 1a 74 extrw,u r20,19,12,r20
20c: d7 93 0e 14 depw r19,15,12,ret0 | 20c: d7 94 0e 14 depw r20,15,12,ret0
210: 0e dc 12 80 stw ret0,0(r22) 210: 0e dc 12 80 stw ret0,0(r22)
214: 06 c0 12 80 fdc r0(r22) 214: 06 c0 12 80 fdc r0(r22)
218: 00 00 04 00 sync 218: 00 00 04 00 sync
thought?
>> Tx to all for kind comments, they learn me a lot ;-)
>
> welcome! :)
>
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
2008-07-02 18:01 ` Joel Soete
@ 2008-07-07 15:28 ` Grant Grundler
0 siblings, 0 replies; 16+ messages in thread
From: Grant Grundler @ 2008-07-07 15:28 UTC (permalink / raw)
To: Joel Soete; +Cc: Grant Grundler, linux-parisc
On Wed, Jul 02, 2008 at 06:01:45PM +0000, Joel Soete wrote:
>> So can you remind me (and Kyle) why this is better than what is
>> there now?
> mmm may be because gcc consider it now as only 1 'volatile' instruction:
>
> in <ccio_map_sg>
> -----------< Original code > --------------+------------< New stuff
> >-----------------
> 200: 06 80 53 13 lci r0(sr1,r20),r19 | 200: 08 1a 02 54 copy r26,r20
> 204: d2 73 1a 74 extrw,u r19,19,12,r19 | 204: 06 a0 53 14 lci r0(sr1,r21),r20
> 208: 08 1a 02 5c copy r26,ret0 | 208: d2 94 1a 74 extrw,u r20,19,12,r20
> 20c: d7 93 0e 14 depw r19,15,12,ret0 | 20c: d7 94 0e 14 depw r20,15,12,ret0
> 210: 0e dc 12 80 stw ret0,0(r22) 210: 0e dc 12 80 stw ret0,0(r22)
> 214: 06 c0 12 80 fdc r0(r22) 214: 06 c0 12 80 fdc r0(r22)
> 218: 00 00 04 00 sync 218: 00 00 04 00 sync
>
> thought?
I'm not comfortable "New stuff" version is correct.
In "New Stuff", r20 gets clobbered and values copied from
r26 are lost. That's not the case in the original code.
"depw" only replaces 12 bits of "ret0" in the original code.
thanks,
grant
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-07-08 9:04 Joel Soete
0 siblings, 0 replies; 16+ messages in thread
From: Joel Soete @ 2008-07-08 9:04 UTC (permalink / raw)
To: grundler; +Cc: grundler, linux-parisc
[...]
>
> In "New Stuff", r20 gets clobbered and values copied from
> r26 are lost. That's not the case in the original code.
> "depw" only replaces 12 bits of "ret0" in the original code.
>
Good catch, in fact I would just have to save the read+write constraint on pa;
i.e this seems to make the drill:
__asm__ __volatile__ (
"lci %%r0(%%sr1, %2), %0\n"
"\textru %0,19,12,%0\n"
"\tdepw %0,15,12,%1\n"
: "=r" (ci), "+r" (pa)
: "r" (vba)
);
comparing produced objdump:
00000000 <ccio_io_pdir_entry>: 00000000 <ccio_io_pdir_entry>:
0: 00 19 58 20 mtsp r25,sr1 0: 00 19 58 20 mtsp r25,sr1
4: 23 80 0e 01 ldil L%-10000000,ret0 4: 23 80 0e 01 ldil L%-10000000,ret0
8: 0b 98 0a 1c add,l r24,ret0,ret0 8: 0b 98 0a 1c add,l r24,ret0,ret0
c: d7 97 0c 14 depw r23,31,12,ret0 c: d7 97 0c 14 depw r23,31,12,ret0
10: 0f 5c 12 88 stw ret0,4(r26) | 10: 34 13 00 00 ldi 0,r19
14: 07 00 53 18 lci r0(sr1,r24),r24 | 14: 0f 5c 12 88 stw ret0,4(r26)
18: d3 18 1a 74 extrw,u r24,19,12,r24 | 18: 07 00 53 1c lci r0(sr1,r24),ret0
1c: 34 1c 00 00 ldi 0,ret0 | 1c: d3 9c 1a 74 extrw,u ret0,19,12,ret0
20: d7 98 0e 14 depw r24,15,12,ret0 | 20: d6 7c 0e 14 depw ret0,15,12,r19
24: 0f 5c 12 80 stw ret0,0(r26) | 24: 0f 53 12 80 stw r19,0(r26)
28: 07 40 12 80 fdc r0(r26) 28: 07 40 12 80 fdc r0(r26)
2c: 00 00 04 00 sync 2c: 00 00 04 00 sync
30: e8 40 c0 02 bv,n r0(rp) 30: e8 40 c0 02 bv,n r0(rp)
Disassembly of section .text.ccio_free_consistent:Disassembly of section ...
This time r19 is well 'initialized' with 0 (as per C code pa = 0);
the coherent index in r28 (ret0) and after extrw, depw replaces 12 bits of r19.
Many thanks for re-reading ;-)
J.
> thanks,
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2008-07-08 9:04 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-26 6:28 in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization? Joel Soete
2008-06-28 20:23 ` Grant Grundler
2008-06-28 22:26 ` Joel Soete
2008-06-28 22:45 ` John David Anglin
2008-06-29 20:52 ` Grant Grundler
2008-06-30 18:28 ` Joel Soete
2008-07-02 4:28 ` Grant Grundler
2008-07-02 18:01 ` Joel Soete
2008-07-07 15:28 ` Grant Grundler
-- strict thread matches above, loose matches on Subject: below --
2008-07-08 9:04 Joel Soete
2008-06-20 6:37 Joel Soete
2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
2008-06-19 19:44 ` Joel Soete
2008-06-19 22:48 ` John David Anglin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox