Linux PARISC architecture development
 help / color / mirror / Atom feed
* in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-15 12:37 rubisher
  2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
  2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
  0 siblings, 2 replies; 16+ messages in thread
From: rubisher @ 2008-06-15 12:37 UTC (permalink / raw)
  To: linux-parisc

Hello all,

looking at this hunk:
void CCIO_INLINE
ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
                    unsigned long hints)
{
         register unsigned long pa;
         register unsigned long ci; /* coherent index */

         /* We currently only support kernel addresses */
         BUG_ON(sid != KERNEL_SPACE);

         mtsp(sid,1);

[snip]
         pa = virt_to_phys(vba);
         asm volatile("depw  %1,31,12,%0" : "+r" (pa) : "r" (hints));
         ((u32 *)pdir_ptr)[1] = (u32) pa;

[snip]
         pa = 0;
[snip]
         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));

         ((u32 *)pdir_ptr)[0] = (u32) pa;

[snip]
         asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr));
         asm volatile("sync");
}
(I just remove comments and 64bit stuff)

and I noticed that resulting code looks like:
    0:   cb 39 a0 60     movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
    4:   34 1c 00 00     ldi 0,ret0
    8:   00 1c 58 20     mtsp ret0,sr1
    c:   22 60 0e 01     ldil L%-10000000,r19
   10:   0a 78 0a 13     add,l r24,r19,r19
   14:   d6 77 0c 14     depw r23,31,12,r19
   18:   0f 53 12 88     stw r19,4(r26)
   1c:   07 00 53 1c     lci r0(sr1,r24),ret0
   20:   d3 9c 1a 74     extrw,u ret0,19,12,ret0
   24:   d7 3c 0e 14     depw ret0,15,12,r25
   28:   0f 59 12 80     stw r25,0(r26)
   2c:   07 40 12 80     fdc r0(r26)
   30:   00 00 04 00     sync
   34:   e8 40 c0 02     bv,n r0(rp)
   38:   03 ff e0 1f     break 1f,1fff
   3c:   e8 1f 1f f7     b,l,n 3c <ccio_io_pdir_entry+0x3c>,r0
Disassembly of section .text.ccio_proc_bitmap_open:

And my worry was about lines 4: and 8:.
According to the C code, I don't understand why optimization want to initialize sr1 to 0 while it should be set to r25 (i.e. 
arg1)?

Otoh, the sba botherhood code didn't showing the same behaviour:
    0:   22 a0 0e 01     ldil L%-10000000,r21
    4:   34 1c 00 00     ldi 0,ret0
    8:   34 1d 20 01     ldi -1000,ret1
    c:   0a b8 0a 15     add,l r24,r21,r21
   10:   08 15 02 56     copy r21,r22
   14:   34 15 00 00     ldi 0,r21
   18:   0b 95 02 15     and r21,ret0,r21
   1c:   0b b6 02 16     and r22,ret1,r22
   20:   00 19 58 20     mtsp r25,sr1
   24:   07 00 53 13     lci r0(sr1,r24),r19
   28:   d2 73 1a 6c     extrw,u r19,19,20,r19
   2c:   23 80 00 01     ldil L%-80000000,ret0
   30:   34 1d 00 00     ldi 0,ret1

but didn't start with BUG_ON(), I simply try to remove this from ccio code and get a better result:
00000000 <ccio_io_pdir_entry>:
    0:   00 19 58 20     mtsp r25,sr1
    4:   23 80 0e 01     ldil L%-10000000,ret0
    8:   0b 98 0a 1c     add,l r24,ret0,ret0
    c:   d7 97 0c 14     depw r23,31,12,ret0
   10:   0f 5c 12 88     stw ret0,4(r26)
   14:   07 00 53 18     lci r0(sr1,r24),r24
   18:   d3 18 1a 74     extrw,u r24,19,12,r24
   1c:   34 1c 00 00     ldi 0,ret0
   20:   d7 98 0e 14     depw r24,15,12,ret0
   24:   0f 5c 12 80     stw ret0,0(r26)
   28:   07 40 12 80     fdc r0(r26)
   2c:   00 00 04 00     sync
   30:   e8 40 c0 02     bv,n r0(rp)
Disassembly of section .init.text:

But this time, it seems not consider assembly:
         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));

as a 'volatile' block and insert line 1c:
This could may be solved by re-write as an one 'volatile' asm block like:
	asm volatile (
	"lci %%r0(%%sr1, %1), %1"
	"\textru        %1,19,12,%1\n"
	"\tdepw         %1,15,12,%0\n"
	: "=r" (pa)
	: "r" (vba));

and even add a clobber 'memory'
	asm volatile (
	"lci %%r0(%%sr1, %1), %1"
	"\textru        %1,19,12,%1\n"
	"\tdepw         %1,15,12,%0\n"
	: "=r" (pa)
	: "r" (vba)
	: "memory");

But I have no clue how to restore BUG_ON() and avoid wrong optimization?

Any idea?

Tia,
	r.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
@ 2008-06-16 11:37 ` rubisher
  2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
  1 sibling, 0 replies; 16+ messages in thread
From: rubisher @ 2008-06-16 11:37 UTC (permalink / raw)
  To: linux-parisc

wierd, though.

rubisher <rubisher <at> scarlet.be> writes:

> 
> Hello all,
> 
> looking at this hunk:
> void CCIO_INLINE
> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
>                     unsigned long hints)
> {
>          register unsigned long pa;
>          register unsigned long ci; /* coherent index */
> 
>          /* We currently only support kernel addresses */
>          BUG_ON(sid != KERNEL_SPACE);
> 
no pb to get rid of this BUG_ON(): each place I see it called sid == KERNEL_SPACE
the optimized produced code looks so Ok

[snip]
> 
> But this time, it seems not consider assembly:
>          asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
>          asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>          asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
> 
> as a 'volatile' block and insert line 1c:
> This could may be solved by re-write as an one 'volatile' asm block like:
> 	asm volatile (
> 	"lci %%r0(%%sr1, %1), %1"
> 	"\textru        %1,19,12,%1\n"
> 	"\tdepw         %1,15,12,%0\n"
> 	: "=r" (pa)
> 	: "r" (vba));
> 
> and even add a clobber 'memory'
> 	asm volatile (
> 	"lci %%r0(%%sr1, %1), %1"
> 	"\textru        %1,19,12,%1\n"
> 	"\tdepw         %1,15,12,%0\n"
> 	: "=r" (pa)
> 	: "r" (vba)
> 	: "memory");
> 
> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
> 
But here is what is very wierd, I tested severall re-writen on this code amoung
latest:
	asm volatile (
	"lci %%r0(%%sr1, %1), %%r19"
	"\textru        %%r19,19,12,%%r19\n"
	"\tdepw         %%r19,15,12,%0\n"
	: "=r" (pa)
	: "r" (vba)
	: "r19" );

but this system failed to boot on ncr_attach: pim info seems to point out to
this place:
PROCESSOR PIM INFORMATION

-----------------  Processor 0 HPMC Information ------------------

Timestamp =   Mon Jun  16 09:50:32 GMT 2008    (20:08:06:16:09:50:32)

HPMC Chassis Codes = 0xcbf0  0x5002  0x7d03  0x5402  0x5508  0xcbf2  
                     0xcbfc  

General Registers 0 - 31
00-03  0000000000000000  18711a0000001194  0000000010514bf8  000000001fd48000
04-07  0000000000000003  0000000010520308  000000001fdf2400  000000001fc20414
08-11  000000001049f3d8  000000001fc20088  000000001051ce20  00000000104e03cc
12-15  000000001fd4802c  000000001fdf2704  0000000010425000  0000000010425000
16-19  0000000010403b78  0000000010403ba8  0000000010403bb8  0000000000024800
20-23  00000000000a142c  00000000000f4240  000000001fd48000  0000000000001458
24-27  0000000000024800  0000000000000000  000000001fd48000  000000001047de20
28-31  0000000000000001  61c4680000004650  000000001fc20540  000000001029bfdc

Control Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  0000000000000000  0000000000000000  00000000000000c0  000000000000001d
12-15  0000000000000000  0000000000000000  0000000000110000  00000000e1000000
16-19  000000248ca1a027  0000000000000000  0000000010514c6c  00000000d39c1bfd
20-23  000000009227ffc4  000000000102482c  000000ff0004f90f  0000000080000000
24-27  00000000004c2000  00000000004c2000  00000000aaaaaaaa  00000000aaaaaaaa
28-31  00000000000003d0  0000000011111111  000000001fc20000  0000000011111111

Space Registers 0 - 7
00-03  00000000          00000000          00000000          00000000
04-07  00000000          00000000          00000000          00000000

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x0000000010514c70
Check Type                   = 0x20000000
CPU State                    = 0x9e000004
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x00305004
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x000000fff1004817
System Requestor Address     = 0xfffffffffffa0000
Check Summary                = 0x8000000810284000
Available Memory             = 0x0000000010000000
CPU Diagnose Register 2      = 0x0301000004000004
CPU Status Register 0        = 0x1440020000000000
CPU Status Register 1        = 0x8000000800000000
SADD LOG                     = 0x0000000100000001
Read Short LOG               = 0xc10010fff1004817

-----------------  Processor 0 LPMC Information ------------------

Check Type                   = 0x00000000
IC Parity Info               = 0x00000000
Cache Check                  = 0x00000000
TLB Check                    = 0x00000000
Bus Check                    = 0x00000000
Assists Check                = 0x00000000
Assist State                 = 0x00000000
Path Info                    = 0x00000000
System Responder Address     = 0x0000000000000000
System Requestor Address     = 0x0000000000000000

-----------------  Processor 0 TOC Information -------------------

General Registers 0 - 31
00-03  0000000000000000  000000000aba9500  00000000f005d12c  0000002d110b5065
04-07  0000000000000000  fffffffff0101120  00000000000f4240  0000000000000005
08-11  00000000f0102cd4  000000001fc20088  000000001051ce20  00000000104e03cc
12-15  000000001fd4802c  000000001fdf2704  0000000010425000  0000000010425000
16-19  0000000010403b78  0000000010403ba8  0000000010403bb8  fffffffff0160007
20-23  0000000000000001  00000000000f4240  000000001fd48000  0000000000000000
24-27  0000000000000000  0000000000000000  0000002d13066c03  fffffffff0102400
28-31  0000002d1bc5e565  0301000004000004  fffffffff0102e30  0301000004000004

Control Registers 0 - 31
00-03  0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07  0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11  0000000000000000  0000000000000000  00000000000000c0  000000000000001d
12-15  0000000000000000  0000000000000000  000000fff0010000  0000000000000000
16-19  0000002d130a7a0d  0000000000000000  00000000f005d154  000000000804025c
20-23  000000009227c3c0  c000000040902e44  0000000000291008  0000000080000000
24-27  00000000004c2000  00000000004c2000  00000000aaaaaaaa  00000000aaaaaaaa
28-31  00000000000003d0  0000000011111111  000000001fc20000  0000000011111111

Space Registers 0 - 7
00-03  00000000          00000000          00000000          00000000
04-07  00000000          00000000          00000000          00000000

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x00000000f005d134
CPU State                    = 0x9e000001


Memory Error Log Information:

Timestamp =   Mon Jun  16 09:50:32 GMT 2008    (20:08:06:16:09:50:32)

                    Trans  Addr                   Central Bus
Status  Requestor     id    par  CP  AD  DV       Address/Data
------  ----------  -----  ----  --  --  --  ---------------------

 0x14   0xfffa0000   0x00   0x0   0   0   0  0x00000000 0x00000000

                                                    Memory
                                  ECC Reg     Address      Data
                                 ----------  ---------- ----------

                                 0x00000000  0x00000000 0x00000000
                                                        0x00000000
                                                        0x00000000
                                                        0x00000000
                                                        0x00000000

I/O Module Error Log Information:

Timestamp =   Mon Jun  16 09:50:32 GMT 2008    (20:08:06:16:09:50:32)

Bus    HPA       Module Type      Path  Slt Md Sev  Estat Requestor  Responder
--- ---------- ---------------- -------- -- -- ---- ----- ---------- ----------
 0  0xfff88000 I/O Adapter      8         2  0  he   0x04 0xf1004000 0x000a1420
 0  0xfff8a000 I/O Adapter      10        2  2  he   0x0d 0x00000000 0x00000000

IO Bus Converter Log
--------------------

IOA0 HPA            = 0xfff88000
IOA0 UBC_IO_CONTROL = 0x00020080
GSC1 HPA            = 0xf1000001
IOA0 IO_IO_LO       = 0xf0800000
IOA0 IO_IO_HI       = 0xf17f0000
IOA0 IO_IO_LO_HV    = 0xfc000000
IOA0 IO_IO_HI_HV    = 0xfff80000

IOA1 HPA            = 0xfff8a000
IOA1 UBC_IO_CONTROL = 0x00020080
GSC2 HPA            = 0xf1800001
IOA1 IO_IO_LO       = 0xf1800000
IOA1 IO_IO_HI       = 0xf2000000
IOA1 IO_IO_LO_HV    = 0x00000000
IOA1 IO_IO_HI_HV    = 0x00000000

while IAOQ point to
IIA Offset                   = 0x0000000010514c70
...

e.g. IAOQ = 0x0000000010514c70

Parse IAOQ = 0x0000000010514c70 for CPU[0]

Func: ncr_attach, Off: 0xc4c, Addr: 0x10514c70

	...
10514c60:	80 95 20 28 	cmpb,= r21,r4,10514c7c <.L1517+0x660>

	...
10514c70:	87 80 3f d7 	cmpib,=,n 0,ret0,10514c60 <.L1517+0x644>
10514c74:	34 84 00 02 	ldo 1(r4),r4
10514c78:	08 03 02 5a 	copy r3,r26
10514c7c:	34 19 00 c8 	ldi 64,r25

Any idea what I missed in this chg?

Tia,
r.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
  2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
@ 2008-06-19 16:04 ` Grant Grundler
  2008-06-19 19:44   ` Joel Soete
  1 sibling, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-19 16:04 UTC (permalink / raw)
  To: rubisher; +Cc: linux-parisc

On Sun, Jun 15, 2008 at 12:37:25PM +0000, rubisher wrote:
> Hello all,
>
> looking at this hunk:
> void CCIO_INLINE
> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
>                    unsigned long hints)
> {
>         register unsigned long pa;
>         register unsigned long ci; /* coherent index */
>
>         /* We currently only support kernel addresses */
>         BUG_ON(sid != KERNEL_SPACE);
...
> and I noticed that resulting code looks like:
>    0:   cb 39 a0 60     movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
>    4:   34 1c 00 00     ldi 0,ret0

The BUG_ON is causing the movb to be inserted. And then the compiler knows
the value is zero and can either copy from a register or "load immediate 0".
It probably chose the "ldi 0" because it avoids register interlocks and
can always be executed.

The movb will either branch to +0x38 (and nullifies the delay slot)
or execute the ldi. So it looks right to me.

BTW, I think the BUG_ON can go away. It's good for debugging but doesn't
need to be in every kernel.

...
> And my worry was about lines 4: and 8:.
> According to the C code, I don't understand why optimization want to 
> initialize sr1 to 0 while it should be set to r25 (i.e. arg1)?

Does the BUG_ON explaination make sense to you?

>
> Otoh, the sba botherhood code didn't showing the same behaviour:
>    0:   22 a0 0e 01     ldil L%-10000000,r21
>    4:   34 1c 00 00     ldi 0,ret0
>    8:   34 1d 20 01     ldi -1000,ret1
>    c:   0a b8 0a 15     add,l r24,r21,r21
>   10:   08 15 02 56     copy r21,r22
>   14:   34 15 00 00     ldi 0,r21
>   18:   0b 95 02 15     and r21,ret0,r21
>   1c:   0b b6 02 16     and r22,ret1,r22
>   20:   00 19 58 20     mtsp r25,sr1
>   24:   07 00 53 13     lci r0(sr1,r24),r19
>   28:   d2 73 1a 6c     extrw,u r19,19,20,r19
>   2c:   23 80 00 01     ldil L%-80000000,ret0
>   30:   34 1d 00 00     ldi 0,ret1
>
> but didn't start with BUG_ON(),

Right. That should be a clue. :)

hth,
grant

> I simply try to remove this from ccio code 
> and get a better result:
> 00000000 <ccio_io_pdir_entry>:
>    0:   00 19 58 20     mtsp r25,sr1
>    4:   23 80 0e 01     ldil L%-10000000,ret0
>    8:   0b 98 0a 1c     add,l r24,ret0,ret0
>    c:   d7 97 0c 14     depw r23,31,12,ret0
>   10:   0f 5c 12 88     stw ret0,4(r26)
>   14:   07 00 53 18     lci r0(sr1,r24),r24
>   18:   d3 18 1a 74     extrw,u r24,19,12,r24
>   1c:   34 1c 00 00     ldi 0,ret0
>   20:   d7 98 0e 14     depw r24,15,12,ret0
>   24:   0f 5c 12 80     stw ret0,0(r26)
>   28:   07 40 12 80     fdc r0(r26)
>   2c:   00 00 04 00     sync
>   30:   e8 40 c0 02     bv,n r0(rp)
> Disassembly of section .init.text:
>
> But this time, it seems not consider assembly:
>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
>
> as a 'volatile' block and insert line 1c:
> This could may be solved by re-write as an one 'volatile' asm block like:
> 	asm volatile (
> 	"lci %%r0(%%sr1, %1), %1"
> 	"\textru        %1,19,12,%1\n"
> 	"\tdepw         %1,15,12,%0\n"
> 	: "=r" (pa)
> 	: "r" (vba));
>
> and even add a clobber 'memory'
> 	asm volatile (
> 	"lci %%r0(%%sr1, %1), %1"
> 	"\textru        %1,19,12,%1\n"
> 	"\tdepw         %1,15,12,%0\n"
> 	: "=r" (pa)
> 	: "r" (vba)
> 	: "memory");
>
> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
>
> Any idea?
>
> Tia,
> 	r.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
@ 2008-06-19 19:44   ` Joel Soete
  2008-06-19 22:48     ` John David Anglin
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-19 19:44 UTC (permalink / raw)
  To: Grant Grundler; +Cc: rubisher, linux-parisc



Grant Grundler wrote:
> On Sun, Jun 15, 2008 at 12:37:25PM +0000, rubisher wrote:
>> Hello all,
>>
>> looking at this hunk:
>> void CCIO_INLINE
>> ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba,
>>                    unsigned long hints)
>> {
>>         register unsigned long pa;
>>         register unsigned long ci; /* coherent index */
>>
>>         /* We currently only support kernel addresses */
>>         BUG_ON(sid != KERNEL_SPACE);
> ...
>> and I noticed that resulting code looks like:
>>    0:   cb 39 a0 60     movb,<> r25,r25,38 <ccio_io_pdir_entry+0x38>
>>    4:   34 1c 00 00     ldi 0,ret0
> 
> The BUG_ON is causing the movb to be inserted. And then the compiler knows
> the value is zero and can either copy from a register or "load immediate 0".
> It probably chose the "ldi 0" because it avoids register interlocks and
> can always be executed.
> 
> The movb will either branch to +0x38 (and nullifies the delay slot)
> or execute the ldi. So it looks right to me.
> 
> BTW, I think the BUG_ON can go away. It's good for debugging but doesn't
> need to be in every kernel.
> 
> ...
>> And my worry was about lines 4: and 8:.
>> According to the C code, I don't understand why optimization want to 
>> initialize sr1 to 0 while it should be set to r25 (i.e. arg1)?
> 
> Does the BUG_ON explaination make sense to you?
> 
Well I nerver imagine that a compiler can make proof of this kind of induction spirit ;-) but that's give the sense, tx.

>> Otoh, the sba botherhood code didn't showing the same behaviour:
>>    0:   22 a0 0e 01     ldil L%-10000000,r21
>>    4:   34 1c 00 00     ldi 0,ret0
>>    8:   34 1d 20 01     ldi -1000,ret1
>>    c:   0a b8 0a 15     add,l r24,r21,r21
>>   10:   08 15 02 56     copy r21,r22
>>   14:   34 15 00 00     ldi 0,r21
>>   18:   0b 95 02 15     and r21,ret0,r21
>>   1c:   0b b6 02 16     and r22,ret1,r22
>>   20:   00 19 58 20     mtsp r25,sr1
>>   24:   07 00 53 13     lci r0(sr1,r24),r19
>>   28:   d2 73 1a 6c     extrw,u r19,19,20,r19
>>   2c:   23 80 00 01     ldil L%-80000000,ret0
>>   30:   34 1d 00 00     ldi 0,ret1
>>
>> but didn't start with BUG_ON(),
> 
> Right. That should be a clue. :)
> 
> hth,
> grant
> 
Tx again,
	J.

ps: ccio_io_pdir_entry() seems to works fine without BUG_ON().
>> I simply try to remove this from ccio code 
>> and get a better result:
>> 00000000 <ccio_io_pdir_entry>:
>>    0:   00 19 58 20     mtsp r25,sr1
>>    4:   23 80 0e 01     ldil L%-10000000,ret0
>>    8:   0b 98 0a 1c     add,l r24,ret0,ret0
>>    c:   d7 97 0c 14     depw r23,31,12,ret0
>>   10:   0f 5c 12 88     stw ret0,4(r26)
>>   14:   07 00 53 18     lci r0(sr1,r24),r24
>>   18:   d3 18 1a 74     extrw,u r24,19,12,r24
>>   1c:   34 1c 00 00     ldi 0,ret0
>>   20:   d7 98 0e 14     depw r24,15,12,ret0
>>   24:   0f 5c 12 80     stw ret0,0(r26)
>>   28:   07 40 12 80     fdc r0(r26)
>>   2c:   00 00 04 00     sync
>>   30:   e8 40 c0 02     bv,n r0(rp)
>> Disassembly of section .init.text:
>>
>> But this time, it seems not consider assembly:
>>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r" (vba));
>>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
>>
>> as a 'volatile' block and insert line 1c:
>> This could may be solved by re-write as an one 'volatile' asm block like:
>> 	asm volatile (
>> 	"lci %%r0(%%sr1, %1), %1"
>> 	"\textru        %1,19,12,%1\n"
>> 	"\tdepw         %1,15,12,%0\n"
>> 	: "=r" (pa)
>> 	: "r" (vba));
>>
>> and even add a clobber 'memory'
>> 	asm volatile (
>> 	"lci %%r0(%%sr1, %1), %1"
>> 	"\textru        %1,19,12,%1\n"
>> 	"\tdepw         %1,15,12,%0\n"
>> 	: "=r" (pa)
>> 	: "r" (vba)
>> 	: "memory");
>>
>> But I have no clue how to restore BUG_ON() and avoid wrong optimization?
>>
>> Any idea?
>>
>> Tia,
>> 	r.
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(), BUG_ON() seems to break gcc-4.2    optimization?
  2008-06-19 19:44   ` Joel Soete
@ 2008-06-19 22:48     ` John David Anglin
  0 siblings, 0 replies; 16+ messages in thread
From: John David Anglin @ 2008-06-19 22:48 UTC (permalink / raw)
  To: Joel Soete; +Cc: grundler, rubisher, linux-parisc

> > Does the BUG_ON explaination make sense to you?
> > 
> Well I nerver imagine that a compiler can make proof of this kind of induction spirit ;-) but that's give the sense, tx.

Yes, the current compiler eliminates a lot of dead code based on induction.
This occurs a lot when functions are inlined, etc.  Makes debugging a real
problem.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-20  6:37 Joel Soete
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Soete @ 2008-06-20  6:37 UTC (permalink / raw)
  To: dave; +Cc: grundler, rubisher, linux-parisc

> > > Does the BUG_ON explaination make sense to you?
> > > 
> > Well I nerver imagine that a compiler can make proof of this kind of
induction spirit ;-) but that's give the sense, tx.
> 
> Yes, the current compiler eliminates a lot of dead code based on induction.
> This occurs a lot when functions are inlined, etc.  Makes debugging a real
> problem.
> 
Thanks Dave (after a while away from this parisc hobby, it's nice to learn
fresh info).

Cheers,
    J.

> Dave
> -- 
> J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
> National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-06-26  6:28 Joel Soete
  2008-06-28 20:23 ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-26  6:28 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

Just for remind:

Joel Soete wrote:
 >
 >
[snip]
 >>> But this time, it seems not consider assembly:
 >>>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
 >>> (vba));
 >>>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
 >>>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
 >>>
 >>> as a 'volatile' block and insert line 1c:

This seems to make better what I want:
        __asm__ __volatile__ (
        "lci    %%r0(%%sr1, %2), %0\n"
        "\textru        %0,19,12,%0\n"
        "\tdepw         %0,15,12,%1\n"
        : "+r" (ci), "+r" (pa)
        : "r" (vba)
        : "memory"
        );


in <ccio_map_sg>
  200:	06 80 53 13 	lci r0(sr1,r20),r19		      |	 200:	08 1a 02 54 	copy r26,r20
  204:	d2 73 1a 74 	extrw,u r19,19,12,r19		      |	 204:	06 a0 53 14 	lci
r0(sr1,r21),r20
  208:	08 1a 02 5c 	copy r26,ret0			      |	 208:	d2 94 1a 74 	extrw,u
r20,19,12,r20
  20c:	d7 93 0e 14 	depw r19,15,12,ret0		      |	 20c:	d7 94 0e 14 	depw
r20,15,12,ret0
  210:	0e dc 12 80 	stw ret0,0(r22)				 210:	0e dc 12 80 	stw ret0,0(r22)
  214:	06 c0 12 80 	fdc r0(r22)				 214:	06 c0 12 80 	fdc r0(r22)
  218:	00 00 04 00 	sync					 218:	00 00 04 00 	sync

J.

PS: I don't yet understand why kernel panicing when I try to get rid to
reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
produced object seems ok but resulting kernel panicing (even after a full
rebuild after a make distclean???)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-26  6:28 in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization? Joel Soete
@ 2008-06-28 20:23 ` Grant Grundler
  2008-06-28 22:26   ` Joel Soete
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-28 20:23 UTC (permalink / raw)
  To: Joel Soete; +Cc: Grant Grundler, linux-parisc

On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
> Just for remind:
> 
> Joel Soete wrote:
>  >
>  >
> [snip]
>  >>> But this time, it seems not consider assembly:
>  >>>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
>  >>> (vba));
>  >>>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>  >>>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
>  >>>
>  >>> as a 'volatile' block and insert line 1c:
> 
> This seems to make better what I want:
>         __asm__ __volatile__ (
>         "lci    %%r0(%%sr1, %2), %0\n"
>         "\textru        %0,19,12,%0\n"
>         "\tdepw         %0,15,12,%1\n"
>         : "+r" (ci), "+r" (pa)
>         : "r" (vba)
>         : "memory"

Why the "memory"? This asm code isn't modifying memory at all.

>         );
> 
> 
> in <ccio_map_sg>
>   200:	06 80 53 13 	lci r0(sr1,r20),r19		      |	 200:	08 1a 02 54 	copy r26,r20
>   204:	d2 73 1a 74 	extrw,u r19,19,12,r19		      |	 204:	06 a0 53 14 	lci
> r0(sr1,r21),r20
>   208:	08 1a 02 5c 	copy r26,ret0			      |	 208:	d2 94 1a 74 	extrw,u
> r20,19,12,r20
>   20c:	d7 93 0e 14 	depw r19,15,12,ret0		      |	 20c:	d7 94 0e 14 	depw
> r20,15,12,ret0
>   210:	0e dc 12 80 	stw ret0,0(r22)				 210:	0e dc 12 80 	stw ret0,0(r22)
>   214:	06 c0 12 80 	fdc r0(r22)				 214:	06 c0 12 80 	fdc r0(r22)
>   218:	00 00 04 00 	sync					 218:	00 00 04 00 	sync
> 
> J.
> 
> PS: I don't yet understand why kernel panicing when I try to get rid to
> reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
> produced object seems ok but resulting kernel panicing (even after a full
> rebuild after a make distclean???)

What's wrong with "ci" variable? It's just another register.
C compiler will allocate it and then we pass it to the asm().

And are you sure it's ok to use r19?

hth,
grant

> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-28 20:23 ` Grant Grundler
@ 2008-06-28 22:26   ` Joel Soete
  2008-06-28 22:45     ` John David Anglin
  2008-06-29 20:52     ` Grant Grundler
  0 siblings, 2 replies; 16+ messages in thread
From: Joel Soete @ 2008-06-28 22:26 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

Hello Grant,

Grant Grundler wrote:
> On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
>> Just for remind:
>>
>> Joel Soete wrote:
>>  >
>>  >
>> [snip]
>>  >>> But this time, it seems not consider assembly:
>>  >>>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
>>  >>> (vba));
>>  >>>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
>>  >>>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
>>  >>>
>>  >>> as a 'volatile' block and insert line 1c:
>>
>> This seems to make better what I want:
>>         __asm__ __volatile__ (
>>         "lci    %%r0(%%sr1, %2), %0\n"
>>         "\textru        %0,19,12,%0\n"
>>         "\tdepw         %0,15,12,%1\n"
>>         : "+r" (ci), "+r" (pa)
>>         : "r" (vba)
>>         : "memory"
> 
> Why the "memory"? This asm code isn't modifying memory at all.
> 
well I certainly still have wrongly understand what "memory" clobber means but for me "pa" is a memory location modified 
(the plus sign in '"+r" (pa)' and that either after 64bit computation or 32bit initialization pa=0).
But your comment make me fill I am again wrong (well definitively C is not my language ;-) .)

>>         );
>>
>>
>> in <ccio_map_sg>
>>   200:	06 80 53 13 	lci r0(sr1,r20),r19		      |	 200:	08 1a 02 54 	copy r26,r20
>>   204:	d2 73 1a 74 	extrw,u r19,19,12,r19		      |	 204:	06 a0 53 14 	lci
>> r0(sr1,r21),r20
>>   208:	08 1a 02 5c 	copy r26,ret0			      |	 208:	d2 94 1a 74 	extrw,u
>> r20,19,12,r20
>>   20c:	d7 93 0e 14 	depw r19,15,12,ret0		      |	 20c:	d7 94 0e 14 	depw
>> r20,15,12,ret0
>>   210:	0e dc 12 80 	stw ret0,0(r22)				 210:	0e dc 12 80 	stw ret0,0(r22)
>>   214:	06 c0 12 80 	fdc r0(r22)				 214:	06 c0 12 80 	fdc r0(r22)
>>   218:	00 00 04 00 	sync					 218:	00 00 04 00 	sync
>>
>> J.
>>
>> PS: I don't yet understand why kernel panicing when I try to get rid to
>> reserve 'ci' variable and use in place a gr as r19 (even if clobbered), the
>> produced object seems ok but resulting kernel panicing (even after a full
>> rebuild after a make distclean???)
> 
> What's wrong with "ci" variable? It's just another register.
well the idea is that "ci" variable is just there as a tmp variable just used in this asm code, right.
so why not simply get rid of its declaration (btw save a line of code ;-)) and just use a common temporay reg as r19?

but what I show above is just a 'diff -y' between a objdump's hunk of ccio-dma.o:
on the left side what gcc produce with the original code (3 different asm lines) (this kernel was well booting fine)
on the right side what same compile produce with new code (1 asm line).

Otoh, based on my few knowldge on parisc asm, it seems to me that something like:
          __asm__ __volatile__ (
          "lci    %%r0(%%sr1, %1), %%r19\n"
          "\textru        %%r19,19,12,%%r19\n"
          "\tdepw         %%r19,15,12,%0\n"
          : "+r" (pa)
          : "r" (vba)
          : "r19")

would do the same job, but all my test results was a panicing kernel???
(even thought if I am diffing 2 objdump of the only ccio-dma.o, it looks like gr are just used differently. but may be did I 
missed another wrong stuff).

That said, nothing of that help to fixe ccio-dma driver issue ;-(

My latest hope is relayfs to grab more debug info (I just need a bit more free time).

> C compiler will allocate it and then we pass it to the asm().
> 
> And are you sure it's ok to use r19?
> 
yes that was the only lines, I was working on.

Tx,
	J.

> hth,
> grant
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2     optimization?
  2008-06-28 22:26   ` Joel Soete
@ 2008-06-28 22:45     ` John David Anglin
  2008-06-29 20:52     ` Grant Grundler
  1 sibling, 0 replies; 16+ messages in thread
From: John David Anglin @ 2008-06-28 22:45 UTC (permalink / raw)
  To: Joel Soete; +Cc: grundler, linux-parisc

> Grant Grundler wrote:
> > On Thu, Jun 26, 2008 at 07:28:03AM +0100, Joel Soete wrote:
> >> Just for remind:
> >>
> >> Joel Soete wrote:
> >>  >
> >>  >
> >> [snip]
> >>  >>> But this time, it seems not consider assembly:
> >>  >>>         asm volatile ("lci %%r0(%%sr1, %1), %0" : "=r" (ci) : "r"
> >>  >>> (vba));
> >>  >>>         asm volatile ("extru %1,19,12,%0" : "+r" (ci) : "r" (ci));
> >>  >>>         asm volatile ("depw  %1,15,12,%0" : "+r" (pa) : "r" (ci));
> >>  >>>
> >>  >>> as a 'volatile' block and insert line 1c:
> >>
> >> This seems to make better what I want:
> >>         __asm__ __volatile__ (
> >>         "lci    %%r0(%%sr1, %2), %0\n"
> >>         "\textru        %0,19,12,%0\n"
> >>         "\tdepw         %0,15,12,%1\n"
> >>         : "+r" (ci), "+r" (pa)
> >>         : "r" (vba)
> >>         : "memory"
> > 
> > Why the "memory"? This asm code isn't modifying memory at all.
> > 
> well I certainly still have wrongly understand what "memory" clobber means but for me "pa" is a memory location modified 
> (the plus sign in '"+r" (pa)' and that either after 64bit computation or 32bit initialization pa=0).

The "+r" constraint indicates a register operand that is both read and written.
However, "ci" and "pa" are not read, so the constraints for them should be
"=r".  None of the instructions modify memory, so the "memory" clobber
is unnecessary.

> so why not simply get rid of its declaration (btw save a line of code ;-)) and just use a common temporay reg as r19?

r19 is the pic register in 64-bit mode and can't be used without saving
and restoring it.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-28 22:26   ` Joel Soete
  2008-06-28 22:45     ` John David Anglin
@ 2008-06-29 20:52     ` Grant Grundler
  2008-06-30 18:28       ` Joel Soete
  1 sibling, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-06-29 20:52 UTC (permalink / raw)
  To: Joel Soete; +Cc: linux-parisc

On Sat, Jun 28, 2008 at 10:26:25PM +0000, Joel Soete wrote:
...
>> What's wrong with "ci" variable? It's just another register.
> well the idea is that "ci" variable is just there as a tmp variable just 
> used in this asm code, right.
> so why not simply get rid of its declaration (btw save a line of code ;-)) 
> and just use a common temporay reg as r19?

No - don't do that. Let the compiler do the work of allocating registers.
It's really good at it. :)  Much better than you and I combined. :)
All we need to do is tell the compiler we need a register.

cheers,
grant

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-29 20:52     ` Grant Grundler
@ 2008-06-30 18:28       ` Joel Soete
  2008-07-02  4:28         ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-06-30 18:28 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

Hello Grant,

Grant Grundler wrote:
> On Sat, Jun 28, 2008 at 10:26:25PM +0000, Joel Soete wrote:
> ...
>>> What's wrong with "ci" variable? It's just another register.
>> well the idea is that "ci" variable is just there as a tmp variable just 
>> used in this asm code, right.
>> so why not simply get rid of its declaration (btw save a line of code ;-)) 
>> and just use a common temporay reg as r19?
> 
> No - don't do that. Let the compiler do the work of allocating registers.
> It's really good at it. :)  Much better than you and I combined. :)

You and gcc already prove it to me ;-)

> All we need to do is tell the compiler we need a register.
> 
Ok

Taken also into account of David remarks, here is the latest stuff tested with success:

         __asm__ __volatile__ (
         "lci    %%r0(%%sr1, %2), %0\n"
         "\textru        %0,19,12,%0\n"
         "\tdepw         %0,15,12,%1\n"
         : "=r" (ci), "=r" (pa)
         : "r" (vba)
         );

Tx to all for kind comments, they learn me a lot ;-)

J.

> cheers,
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-06-30 18:28       ` Joel Soete
@ 2008-07-02  4:28         ` Grant Grundler
  2008-07-02 18:01           ` Joel Soete
  0 siblings, 1 reply; 16+ messages in thread
From: Grant Grundler @ 2008-07-02  4:28 UTC (permalink / raw)
  To: Joel Soete; +Cc: Grant Grundler, linux-parisc

On Mon, Jun 30, 2008 at 06:28:36PM +0000, Joel Soete wrote:
...
> Taken also into account of David remarks, here is the latest stuff tested 
> with success:
>
>         __asm__ __volatile__ (
>         "lci    %%r0(%%sr1, %2), %0\n"
>         "\textru        %0,19,12,%0\n"
>         "\tdepw         %0,15,12,%1\n"
>         : "=r" (ci), "=r" (pa)
>         : "r" (vba)
>         );

So can you remind me (and Kyle) why this is better than what is
there now?

> Tx to all for kind comments, they learn me a lot ;-)

welcome! :)

grant

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-07-02  4:28         ` Grant Grundler
@ 2008-07-02 18:01           ` Joel Soete
  2008-07-07 15:28             ` Grant Grundler
  0 siblings, 1 reply; 16+ messages in thread
From: Joel Soete @ 2008-07-02 18:01 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc



Grant Grundler wrote:
> On Mon, Jun 30, 2008 at 06:28:36PM +0000, Joel Soete wrote:
> ...
>> Taken also into account of David remarks, here is the latest stuff tested 
>> with success:
>>
>>         __asm__ __volatile__ (
>>         "lci    %%r0(%%sr1, %2), %0\n"
>>         "\textru        %0,19,12,%0\n"
>>         "\tdepw         %0,15,12,%1\n"
>>         : "=r" (ci), "=r" (pa)
>>         : "r" (vba)
>>         );
> 
> So can you remind me (and Kyle) why this is better than what is
> there now?
> 
mmm may be because gcc consider it now as only 1 'volatile' instruction:

in <ccio_map_sg>
-----------< Original code > --------------+------------< New stuff >-----------------
200: 06 80 53 13  lci r0(sr1,r20),r19      |  200: 08 1a 02 54  copy r26,r20
204: d2 73 1a 74  extrw,u r19,19,12,r19    |  204: 06 a0 53 14  lci r0(sr1,r21),r20
208: 08 1a 02 5c  copy r26,ret0            |  208: d2 94 1a 74  extrw,u r20,19,12,r20
20c: d7 93 0e 14  depw r19,15,12,ret0      |  20c: d7 94 0e 14  depw r20,15,12,ret0
210: 0e dc 12 80  stw ret0,0(r22)             210: 0e dc 12 80  stw ret0,0(r22)
214: 06 c0 12 80  fdc r0(r22)                 214: 06 c0 12 80  fdc r0(r22)
218: 00 00 04 00  sync                        218: 00 00 04 00  sync

thought?

>> Tx to all for kind comments, they learn me a lot ;-)
> 
> welcome! :)
> 
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
  2008-07-02 18:01           ` Joel Soete
@ 2008-07-07 15:28             ` Grant Grundler
  0 siblings, 0 replies; 16+ messages in thread
From: Grant Grundler @ 2008-07-07 15:28 UTC (permalink / raw)
  To: Joel Soete; +Cc: Grant Grundler, linux-parisc

On Wed, Jul 02, 2008 at 06:01:45PM +0000, Joel Soete wrote:
>> So can you remind me (and Kyle) why this is better than what is
>> there now?
> mmm may be because gcc consider it now as only 1 'volatile' instruction:
>
> in <ccio_map_sg>
> -----------< Original code > --------------+------------< New stuff 
> >-----------------
> 200: 06 80 53 13  lci r0(sr1,r20),r19      |  200: 08 1a 02 54  copy r26,r20
> 204: d2 73 1a 74  extrw,u r19,19,12,r19    |  204: 06 a0 53 14  lci r0(sr1,r21),r20
> 208: 08 1a 02 5c  copy r26,ret0            |  208: d2 94 1a 74  extrw,u r20,19,12,r20
> 20c: d7 93 0e 14  depw r19,15,12,ret0      |  20c: d7 94 0e 14  depw r20,15,12,ret0
> 210: 0e dc 12 80  stw ret0,0(r22)             210: 0e dc 12 80  stw ret0,0(r22)
> 214: 06 c0 12 80  fdc r0(r22)                 214: 06 c0 12 80  fdc r0(r22)
> 218: 00 00 04 00  sync                        218: 00 00 04 00  sync
>
> thought?

I'm not comfortable "New stuff" version is correct.

In "New Stuff", r20 gets clobbered and values copied from
r26 are lost. That's not the case in the original code.
"depw" only replaces 12 bits of "ret0" in the original code.

thanks,
grant

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization?
@ 2008-07-08  9:04 Joel Soete
  0 siblings, 0 replies; 16+ messages in thread
From: Joel Soete @ 2008-07-08  9:04 UTC (permalink / raw)
  To: grundler; +Cc: grundler, linux-parisc

[...]
> 
> In "New Stuff", r20 gets clobbered and values copied from
> r26 are lost. That's not the case in the original code.
> "depw" only replaces 12 bits of "ret0" in the original code.
> 
Good catch, in fact I would just have to save the read+write constraint on pa;
i.e this seems to make the drill:

        __asm__ __volatile__ (
        "lci    %%r0(%%sr1, %2), %0\n"
        "\textru        %0,19,12,%0\n"
        "\tdepw         %0,15,12,%1\n"
        : "=r" (ci), "+r" (pa)
        : "r" (vba)
        );

comparing produced objdump:
00000000 <ccio_io_pdir_entry>:           00000000 <ccio_io_pdir_entry>:
 0: 00 19 58 20  mtsp r25,sr1             0: 00 19 58 20  mtsp r25,sr1
 4: 23 80 0e 01  ldil L%-10000000,ret0    4: 23 80 0e 01  ldil L%-10000000,ret0
 8: 0b 98 0a 1c  add,l r24,ret0,ret0      8: 0b 98 0a 1c  add,l r24,ret0,ret0
 c: d7 97 0c 14  depw r23,31,12,ret0      c: d7 97 0c 14  depw r23,31,12,ret0
10: 0f 5c 12 88  stw ret0,4(r26)       | 10: 34 13 00 00  ldi 0,r19
14: 07 00 53 18  lci r0(sr1,r24),r24   | 14: 0f 5c 12 88  stw ret0,4(r26)
18: d3 18 1a 74  extrw,u r24,19,12,r24 | 18: 07 00 53 1c  lci r0(sr1,r24),ret0
1c: 34 1c 00 00  ldi 0,ret0            | 1c: d3 9c 1a 74  extrw,u ret0,19,12,ret0
20: d7 98 0e 14  depw r24,15,12,ret0   | 20: d6 7c 0e 14  depw ret0,15,12,r19
24: 0f 5c 12 80  stw ret0,0(r26)       | 24: 0f 53 12 80  stw r19,0(r26)
28: 07 40 12 80  fdc r0(r26)             28: 07 40 12 80  fdc r0(r26)
2c: 00 00 04 00  sync                    2c: 00 00 04 00  sync
30: e8 40 c0 02  bv,n r0(rp)             30: e8 40 c0 02  bv,n r0(rp)
Disassembly of section .text.ccio_free_consistent:Disassembly of section ...

This time r19 is well 'initialized' with 0 (as per C code pa = 0);
the coherent index in r28 (ret0) and after extrw, depw replaces 12 bits of r19.

Many thanks for re-reading ;-)
    J.

> thanks,
> grant
> --
> To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2008-07-08  9:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-26  6:28 in ccio_io_pdir_entry(),BUG_ON() seems to break gcc-4.2 optimization? Joel Soete
2008-06-28 20:23 ` Grant Grundler
2008-06-28 22:26   ` Joel Soete
2008-06-28 22:45     ` John David Anglin
2008-06-29 20:52     ` Grant Grundler
2008-06-30 18:28       ` Joel Soete
2008-07-02  4:28         ` Grant Grundler
2008-07-02 18:01           ` Joel Soete
2008-07-07 15:28             ` Grant Grundler
  -- strict thread matches above, loose matches on Subject: below --
2008-07-08  9:04 Joel Soete
2008-06-20  6:37 Joel Soete
2008-06-15 12:37 in ccio_io_pdir_entry(), BUG_ON() " rubisher
2008-06-16 11:37 ` in ccio_io_pdir_entry(),BUG_ON() " rubisher
2008-06-19 16:04 ` in ccio_io_pdir_entry(), BUG_ON() " Grant Grundler
2008-06-19 19:44   ` Joel Soete
2008-06-19 22:48     ` John David Anglin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox