public inbox for linux-mtd@lists.infradead.org
* XIP & GCC optimization & halting
@ 2005-10-17 10:35 Korolev, Alexey
  2005-10-17 20:49 ` Nicolas Pitre
  0 siblings, 1 reply; 6+ messages in thread
From: Korolev, Alexey @ 2005-10-17 10:35 UTC (permalink / raw)
  To: linux-mtd

Hi all,

I have run into several issues with XIP on kernel 2.6.11.
I am using a Mainstone board and NOR flash with a single partition (a
non-RWW device).
The first XIP issue I found is related to GCC optimization.
With the default -Os compile option, the function do_write_buffer was
inlined into its caller and as a result lost its __xipram attribute.
I lowered the optimization level to -O1 and that helped.

After resolving this issue I found another one: the system halts.
It halts with fairly high probability during write and erase
operations (a good way to reproduce it is to download a file of about
1 MB via tftp onto a JFFS2 volume).
I used a BDI debugger to investigate and found that after the halt the
flash device was in READ_STATUS mode and the write operation had not
completed. The PC register points into the virtual memory region mapped
to flash.
I disabled FIQs just in case, but it didn't help.
I am really puzzled about what code could interrupt the write operation
when IRQs and FIQs are disabled.
I saw some improvement when I invalidated all cache regions before
disabling interrupts, but a halt is still highly probable.

Also, from time to time I see another issue.
I get a kernel panic like the one below.
I added a debug print just before BUG() was called.
It looks very strange: according to the code, chips_per_word =
wordwidth * cfi_interleave / map_bankwidth. There is no way to get 6!

2005-10-15 17:49:58 | cfi_interleave = 2 worwidth = 4 map_bankwidth = 4 chips_per_word = 6 cmd = 0x80.
2005-10-15 17:49:58 | kernel BUG at include/linux/mtd/cfi.h:304!
2005-10-15 17:49:58 | Unable to handle kernel NULL pointer dereference at virtual address 00000000
2005-10-15 17:49:58 | pgd = c0004000
2005-10-15 17:49:58 | [00000000] *pgd=00000000
2005-10-15 17:49:58 | Internal error: Oops: 817 [#1]
2005-10-15 17:49:58 | Modules linked in:
2005-10-15 17:49:58 | CPU: 0
2005-10-15 17:49:58 | PC is at __bug+0x48/0x5c
2005-10-15 17:49:58 | LR is at 0x1
2005-10-15 17:49:58 | pc : [<bf097a94>]    lr : [<00000001>]    Tainted: P
2005-10-15 17:49:58 | sp : c02c5d30  ip : 00000000  fp : c02c5d40
2005-10-15 17:49:58 | r10: 00000001  r9 : 01500000  r8 : c0025390
2005-10-15 17:49:58 | r7 : c0025390  r6 : 00000002  r5 : 00000006  r4 : 00000000
2005-10-15 17:49:58 | r3 : 00000000  r2 : 00000000  r1 : 000012b7  r0 : 00000001
2005-10-15 17:49:58 | Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  Segment kernel
2005-10-15 17:49:58 | Control: 397F  Table: A3AB4000  DAC: 00000017
2005-10-15 17:49:58 | Process swapper (pid: 684, stack limit = 0xc02c41a4)
2005-10-15 17:49:58 | Stack: (0xc02c5d30 to 0xc02c6000)
2005-10-15 17:49:58 | 5d20:                                     000000d0 c02c5e10 c02c5d44 c00212a8
2005-10-15 17:49:58 | 5d40: bf097a58 00000000 00c000c0 00000012 00000004 00800080 c01655c0 c026c4a0
2005-10-15 17:49:58 | 5d60: 00001e89 c026c4d8 00000000 00800080 00800080 00000000 00000000 00700070
2005-10-15 17:49:58 | 5d80: 00000000 00ff00ff c02c5e10 c02c5d98 60000013 00400040 00c000c0 00400040
2005-10-15 17:49:58 | 5da0: 00800080 00800080 00800080 00c000c0 00800080 00700070 00b000b0 00800080
2005-10-15 17:49:58 | 5dc0: 00000000 c02bad00 bf0a54a0 00000000 00000000 00000000 c02bad00 bf0a54a0
2005-10-15 17:49:58 | 5de0: 00000000 00000000 c02c5e70 00000002 c026c4d8 c0025390 00011789 c0025390
2005-10-15 17:49:58 | 5e00: 01500000 c02c5ed8 c02c5e14 c0024444 c00207f8 00800080 c0113cd8 00800080
2005-10-15 17:49:58 | 5e20: 00000000 00000000 00000003 00800080 00000000 c026c4a0 00040000 c026c4a0
2005-10-15 17:49:58 | 5e40: c3e51dec 014c0000 c026c4d8 003a003a 00700070 00020002 00800080 00020002
2005-10-15 17:49:58 | 5e60: 00000000 00800080 00800080 00000000 00000000 00d000d0 00200020 00500050
2005-10-15 17:49:58 | 5e80: 00800080 00000000 00000000 c02bad00 bf0a54a0 00000000 00000000 00000000
2005-10-15 17:49:58 | 5ea0: c02bad00 bf0a54a0 00000000 00000000 00040000 01500000 00000000 00000001
2005-10-15 17:49:58 | 5ec0: 01500000 00000000 c0272440 c02c5f18 c02c5edc bf19d194 c0023980 00000000
2005-10-15 17:49:58 | 5ee0: 00000001 c026c4a0 c0025390 c0023974 c02c5fa4 c02c5f64 c02c5f48 c02c5fa4
2005-10-15 17:49:58 | 5f00: c01e11a8 00000000 00000000 c02c5f34 c02c5f1c bf19fd00 bf19cf60 00040000
2005-10-15 17:49:58 | 5f20: 00000000 c02bba00 c02c5f44 c02c5f38 bf19466c bf19fcdc c02c5f94 c02c5f48
2005-10-15 17:49:58 | 5f40: bf198348 bf194620 c02c5f48 c02c5f48 00000000 c02bad00 bf0a54a0 00000000
2005-10-15 17:49:58 | 5f60: 00000000 00000000 c02bad00 bf0a54a0 00000000 00000000 c0373d80 00000001
2005-10-15 17:49:58 | 5f80: c0373d60 c0373d80 c02c5ff4 c02c5f98 bf1972d8 bf1982cc c02c5fa0 c3e51ddc
2005-10-15 17:49:58 | 5fa0: 00000010 c02bba00 01500000 00040000 ffffffff 00000000 00000000 00000000
2005-10-15 17:49:58 | 5fc0: 00000000 bf19829c c02c5f48 00000000 00000000 00000000 00000000 00000000
2005-10-15 17:49:58 | 5fe0: 00000000 00000000 00000000 c02c5ff8 bf0ab1b4 bf197084 f098cb36 4f7d3d2b
2005-10-15 17:49:58 | Backtrace:
2005-10-15 17:49:58 | [<bf097a4c>] (__bug+0x0/0x5c) from [<c00212a8>] (0xc00212a8)
2005-10-15 17:49:58 |  r4 = 000000D0
2005-10-15 17:49:58 | [<c00207ec>] (0xc00207ec) from [<c0024444>] (0xc0024444)
2005-10-15 17:49:58 | [<c0023974>] (0xc0023974) from [<bf19d194>] (cfi_varsize_frob+0x240/0x2cc)
2005-10-15 17:49:58 | [<bf19cf54>] (cfi_varsize_frob+0x0/0x2cc) from [<bf19fd00>] (cfi_intelext_erase_varsize+0x30/0x58)
2005-10-15 17:49:58 | [<bf19fcd0>] (cfi_intelext_erase_varsize+0x0/0x58) from [<bf19466c>] (part_erase+0x58/0x5c)
2005-10-15 17:49:58 |  r4 = C02BBA00
2005-10-15 17:49:58 | Code: eb004527 e59f0014 eb004525 e3a03000 (e5833000)


I really suspect that the kernel panic and the halt have the same root
cause. When the flash is in READ_ARRAY mode and an exception occurs for
some unknown reason, we get a kernel panic; in the other case we get a
halt.

Has anyone seen something like this?
What could cause an access to flash while a write or erase is executing
with interrupts disabled?

Platform: Mainstone 2 with a PXA270 CPU and various NOR flash devices
with one partition.


* Re: XIP & GCC optimization & halting
  2005-10-17 10:35 Korolev, Alexey
@ 2005-10-17 20:49 ` Nicolas Pitre
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Pitre @ 2005-10-17 20:49 UTC (permalink / raw)
  To: Korolev, Alexey; +Cc: linux-mtd

On Mon, 17 Oct 2005, Korolev, Alexey wrote:

> Hi all,
> 
> I have run into several issues with XIP on kernel 2.6.11.
> I am using a Mainstone board and NOR flash with a single partition (a
> non-RWW device).
> The first XIP issue I found is related to GCC optimization.
> With the default -Os compile option, the function do_write_buffer was
> inlined into its caller and as a result lost its __xipram attribute.
> I lowered the optimization level to -O1 and that helped.

Please apply this patch to your tree:

diff --git a/include/linux/mtd/xip.h b/include/linux/mtd/xip.h
index 7b7deef..46ddb14 100644
--- a/include/linux/mtd/xip.h
+++ b/include/linux/mtd/xip.h
@@ -27,7 +27,7 @@
  * obviously not be running from flash.  The __xipram is therefore marking
  * those functions so they get relocated to ram.
  */
-#define __xipram __attribute__ ((__section__ (".data")))
+#define __xipram noinline __attribute__ ((__section__ (".data")))
 
 /*
  * We really don't want gcc to guess anything.

> After resolving this issue I found another one: the system halts.
> It halts with fairly high probability during write and erase
> operations (a good way to reproduce it is to download a file of about
> 1 MB via tftp onto a JFFS2 volume).
> I used a BDI debugger to investigate and found that after the halt the
> flash device was in READ_STATUS mode and the write operation had not
> completed. The PC register points into the virtual memory region
> mapped to flash.
> I disabled FIQs just in case, but it didn't help.

The default FIQ exception handler lives in ram anyway.

> I am really puzzled about what code could interrupt the write
> operation when IRQs and FIQs are disabled.

A data abort, I'd guess.

If you could set your BDI to trap any data abort when XIP is disabled 
and provide the content of the FAR (fault address register) when it 
happens that might be extremely helpful.

> Also, from time to time I see another issue.
> I get a kernel panic like the one below.
> I added a debug print just before BUG() was called.
> It looks very strange: according to the code, chips_per_word =
> wordwidth * cfi_interleave / map_bankwidth. There is no way to get 6!
> 
> 2005-10-15 17:49:58 | cfi_interleave = 2 worwidth = 4 map_bankwidth = 4 chips_per_word = 6 cmd = 0x80.

Probably a side effect of the above bug.


Nicolas


* RE: XIP & GCC optimization & halting
       [not found] <F92C2EFE7C3BA745855BF7036E1F3F9A014CE48D@NNSMSX401>
@ 2005-10-19 15:00 ` Nicolas Pitre
  2005-10-19 16:47   ` Vitaly Wool
  0 siblings, 1 reply; 6+ messages in thread
From: Nicolas Pitre @ 2005-10-19 15:00 UTC (permalink / raw)
  To: Korolev, Alexey; +Cc: linux-mtd

On Wed, 19 Oct 2005, Korolev, Alexey wrote:

> Disassembling cfi_cmdset001.o showed that the division operator ("/")
> is compiled into calls to __divsi3 and __udivsi3 from arch/arm/lib.

Ahhhh.

> Now I'm not sure what the solution to the problem is.

Select only one CFI buswidth and only one chip interleave when 
configuring your kernel.  That will allow gcc to precompute everything 
at compile time and avoid the runtime divide.

I'll think of a way to enforce that in Kconfig.


Nicolas


* Re: XIP & GCC optimization & halting
  2005-10-19 15:00 ` XIP & GCC optimization & halting Nicolas Pitre
@ 2005-10-19 16:47   ` Vitaly Wool
  2005-10-19 17:03     ` Nicolas Pitre
  0 siblings, 1 reply; 6+ messages in thread
From: Vitaly Wool @ 2005-10-19 16:47 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Korolev, Alexey, linux-mtd

Hi,

>> Now I'm not sure what the solution to the problem is.
>
> Select only one CFI buswidth and only one chip interleave when
> configuring your kernel.  That will allow gcc to precompute everything
> at compile time and avoid the runtime divide.

Would it make sense to lock the runtime divide functions in the i-cache
while the flash is not in linear mode?

Vitaly


* Re: XIP & GCC optimization & halting
  2005-10-19 16:47   ` Vitaly Wool
@ 2005-10-19 17:03     ` Nicolas Pitre
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Pitre @ 2005-10-19 17:03 UTC (permalink / raw)
  To: Vitaly Wool; +Cc: Korolev, Alexey, linux-mtd

On Wed, 19 Oct 2005, Vitaly Wool wrote:

> Hi,
> 
> > > Now I'm not sure what the solution to the problem is.
> > 
> > Select only one CFI buswidth and only one chip interleave when configuring
> > your kernel.  That will allow gcc to precompute everything at compile time
> > and avoid the runtime divide.
> 
> Would it make sense to lock the runtime divide functions in the i-cache
> while the flash is not in linear mode?

No.

That _could_ be done of course.  But that is a completely non-generic 
(non-portable) solution adding yet more complexity and dependencies to a 
setup which is already fragile.  Better to aim for simplicity instead, 
which means making the XIP-off code paths as short, simple and 
verifiable as possible.  Going with a single buswidth and chip 
interleave should achieve that.


Nicolas


* RE: XIP & GCC optimization & halting
       [not found] <F92C2EFE7C3BA745855BF7036E1F3F9A014CEB18@NNSMSX401>
@ 2005-10-20 16:11 ` Nicolas Pitre
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Pitre @ 2005-10-20 16:11 UTC (permalink / raw)
  To: Korolev, Alexey; +Cc: linux-mtd, Vitaly Wool

On Thu, 20 Oct 2005, Korolev, Alexey wrote:

> Nicolas,
> 
> Selecting only one CFI buswidth and one chip interleave resolved this
> issue.
> cfi_cmdset001.o no longer contains any reference to __udivsi3, __divsi3
> or __ashrdi3 in the data segment.
> 
> I ran several tests; no halts or kernel panics appeared.

Great!

> P.S.
> 1. I also tried to relocate some /arch/arm/lib code into RAM. It was a
> very bad move because it caused a kernel data abort at the very
> beginning of the kernel boot process. It looks like some code uses
> __divsi3 operations before memory is initialized.

Although I don't encourage it, that should have worked since enough of 
the kernel data should always be available even when memory is partially 
initialized.

> 2. I have a question about XIP and the icache:
> 
> 	a. Kernel instructions execute from flash. Icache lines start
>          filling with instructions from flash.
> 	b. do_write_buffer is called. The flash switches to READ_STATUS.
> 	c. The icache line fill from flash finishes.
>          (Part of the icache is corrupted.)

No, that can't happen since do_write_buffer() is marked __xipram and 
therefore lives in ram.  Also, the call to xip_enable() ensures that the 
instruction prefetch is recycled with xip_iprefetch() after the flash 
has been turned into data mode again.

> 	d. Return from do_write_buffer.
> 	e. An instruction from flash that was previously stored in the
>          icache is executed.
> -> Data abort or kernel panic.

Should not happen (see above).


Nicolas

