* Decompression failure in an inexplicable case
@ 2010-04-29 15:27 Michael Cashwell
2010-04-29 15:40 ` Catalin Marinas
2010-04-29 16:34 ` Catalin Marinas
0 siblings, 2 replies; 8+ messages in thread
From: Michael Cashwell @ 2010-04-29 15:27 UTC
To: linux-arm-kernel
Greetings,
I'm working on a custom 2.6.33.2 port to Gumstix Verdex XL6P (PXA270) that is exhibiting odd behavior.
If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel builds and runs seemingly fine.
But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support, then the kernel builds OK (including full mrproper cleanings in between):
...
LD vmlinux
SYSMAP System.map
SYSMAP .tmp_System.map
OBJCOPY arch/arm/boot/Image
Kernel: arch/arm/boot/Image is ready
AS arch/arm/boot/compressed/head.o
GZIP arch/arm/boot/compressed/piggy.gzip
CC arch/arm/boot/compressed/misc.o
AS arch/arm/boot/compressed/head-xscale.o
SHIPPED arch/arm/boot/compressed/lib1funcs.S
AS arch/arm/boot/compressed/lib1funcs.o
AS arch/arm/boot/compressed/piggy.gzip.o
LD arch/arm/boot/compressed/vmlinux
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
#### Exporting linux-2.6.33.2-arm-gum to netboot.
Image Name: 2.6.33.2-gum1
Created: Wed Apr 28 16:12:14 2010
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1736780 Bytes = 1696.07 kB = 1.66 MB
Load Address: 0xA1000000
Entry Point: 0xA1000000
but fails to run:
TFTP of 'GUM1/boot/uImage' from server 10.18.1.11; our IP address is 10.18.17.1 to address 0xa2000000
Loading: #################################################################
######################################################
Bytes transferred = 1736844 (1a808c hex)
## Booting image at a2000000 ...
Image Name: 2.6.33.2-gum1
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1736780 Bytes = 1.7 MB
Load Address: a1000000
Entry Point: a1000000
Verifying Checksum ... OK
Starting kernel at address a1000000 ...
Uncompressing Linux...
uncompression error
-- System halted
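As an aside on the numbers in this log: the 64-byte gap between "Bytes transferred" (1736844) and "Data Size" (1736780) matches the size of a legacy U-Boot image header, and "(uncompressed)" presumably describes the uImage wrapper rather than the self-extracting zImage inside it. A rough Python sketch for repeating the header and checksum checks on the build host follows. It assumes the legacy (non-FIT) uImage layout; the function and field names are illustrative and not taken from U-Boot or from this thread.

```python
import struct
import zlib
from collections import namedtuple

UIMAGE_MAGIC = 0x27051956
# Legacy header, big-endian: 7 x u32, 4 x u8, 32-byte name.
HEADER_FMT = ">7I4B32s"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 64 bytes

Header = namedtuple(
    "Header", "magic hcrc time size load ep dcrc os arch type comp name"
)


def parse_uimage(blob):
    """Return (header, header_crc_ok, data_crc_ok) for a legacy uImage."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("too short to hold a uImage header")
    hdr = Header(*struct.unpack(HEADER_FMT, blob[:HEADER_SIZE]))
    if hdr.magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")
    # The header CRC is computed with the hcrc field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    header_ok = (zlib.crc32(zeroed) & 0xFFFFFFFF) == hdr.hcrc
    # The data CRC covers exactly hdr.size bytes after the header.
    data = blob[HEADER_SIZE:HEADER_SIZE + hdr.size]
    data_ok = (
        len(data) == hdr.size
        and (zlib.crc32(data) & 0xFFFFFFFF) == hdr.dcrc
    )
    return hdr, header_ok, data_ok
```

Reading arch/arm/boot/uImage in binary mode and passing the bytes to parse_uimage() should report the same size, load address and entry point that U-Boot prints. Since U-Boot already says "Verifying Checksum ... OK", this mainly confirms that the image reaches the decompressor intact.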
As noted on another thread, I'm working on cpufreq-pxa2xx.c but I can't see how anything done there could directly cause the decompression to fail. Surely decompression is way too early for any code in cpu-freq to be called. (Yes?)
My hunch is that some subtle image/map linker issue causes the non-cpufreq-debug build to lay the image out in just such a way that the decompressor gets confused, but the cause and effect seem so random that I'm not sure where to look.
I've seen this sort of thing before on boards where SDRAM is flaky, but this is an unmodified commercial Gumstix board (in fact, several of them), so that seems unlikely.
I can leave cpufreq debugging compiled in and not pass a "cpufreq.debug=" kernel arg, but I don't want to sweep this under the rug.
Anyone have any ideas? Is there any way to get more info from the decompressor or tests I can conduct against the zImage to determine what's wrong?
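One test that can be run against the zImage on the build host, sketched in Python under a few assumptions: the compressed kernel is embedded in the zImage as an ordinary gzip stream (as the GZIP piggy.gzip step in the build log suggests), and the helper name below is hypothetical. It scans for the gzip magic bytes and tries to inflate from each candidate offset. zlib validates the CRC32 and length in the gzip trailer when it reaches the end of the stream, so a clean result means the payload itself is intact.

```python
import zlib

# ID1, ID2 and CM=8 (deflate), as defined for the gzip format in RFC 1952.
GZIP_MAGIC = b"\x1f\x8b\x08"


def extract_gzip_payload(blob):
    """Return (offset, decompressed_bytes) for the first gzip stream in
    blob that inflates cleanly, or None if no candidate does."""
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        # MAX_WBITS | 16 makes zlib expect a gzip header and trailer.
        inflater = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
        try:
            out = inflater.decompress(blob[pos:])
        except zlib.error:
            out = None  # false-positive magic or a corrupt stream
        # eof is only set once the trailer has been read and checked;
        # bytes after the stream end up in inflater.unused_data.
        if out is not None and inflater.eof:
            return pos, out
        pos = blob.find(GZIP_MAGIC, pos + 1)
    return None
```

Feeding it the bytes of arch/arm/boot/zImage and comparing the returned data with arch/arm/boot/Image would show whether the host can decompress what the target cannot. If it can, the image is probably fine and the problem lies in the decompressor or its runtime environment (stack, heap, memory layout).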
-Mike
* Decompression failure in an inexplicable case
From: Catalin Marinas @ 2010-04-29 15:40 UTC
To: linux-arm-kernel
On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both
> it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel
> builds and runs seemingly fine.
>
> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support
> then kernel builds OK (including full mrproper cleanings in between):
I had some issues a few weeks ago with the decompressing:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/73476
That seemed to have to do with the size of the Image file, and randomly removing
parts of it made it work. Unfortunately, I couldn't reproduce it in a form that
others could try.
--
Catalin
* Decompression failure in an inexplicable case
From: Albin Tonnerre @ 2010-04-29 16:09 UTC
To: linux-arm-kernel
On Thu, Apr 29, 2010 at 5:40 PM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
>> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both
>> it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel
>> builds and runs seemingly fine.
>>
>> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support
>> then kernel builds OK (including full mrproper cleanings in between):
>
> I had some issues a few weeks ago with the decompressing:
>
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/73476
>
> That seemed to do with the size of the Image file and randomly removing
> parts of it made it work. Unfortunately, I couldn't reproduce it so that
> others can try.
You might want to try using LZO instead of gzip; that one should work correctly.
Cheers,
Albin
* Decompression failure in an inexplicable case
From: Catalin Marinas @ 2010-04-29 16:34 UTC
To: linux-arm-kernel
On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both
> it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel
> builds and runs seemingly fine.
>
> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support
> then kernel builds OK (including full mrproper cleanings in between):
BTW, one of the suggestions at the time was to try this:
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 3dc2cf8..e12e382 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -316,7 +316,7 @@ LC0: .word LC0 @ r1
                .word   _start                  @ r5
                .word   _got_start              @ r6
                .word   _got_end                @ ip
-               .word   user_stack+4096         @ sp
+               .word   user_stack+8192         @ sp
 LC1:           .word   reloc_end - reloc_start
                .size   LC0, . - LC0
 
@@ -1086,4 +1086,4 @@ reloc_end:
 
                .align
                .section ".stack", "w"
-user_stack:    .space  4096
+user_stack:    .space  8192
--
Catalin
* Decompression failure in an inexplicable case
From: Russell King - ARM Linux @ 2010-04-29 16:39 UTC
To: linux-arm-kernel
On Thu, Apr 29, 2010 at 05:34:46PM +0100, Catalin Marinas wrote:
> On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
> > If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both
> > it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel
> > builds and runs seemingly fine.
> >
> > But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support
> > then kernel builds OK (including full mrproper cleanings in between):
>
> BTW, one of the suggestions at the time was to try this:
Another thing is to ask which kernel version. There are a number of fixes
that recently went into both -stable and -rc to fix problems with
the decompressor.
* Decompression failure in an inexplicable case
From: Michael Cashwell @ 2010-04-29 17:16 UTC
To: linux-arm-kernel
On Apr 29, 2010, at 11:40 AM, Catalin Marinas wrote:
> On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
>> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel builds and runs seemingly fine.
>>
>> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support then kernel builds OK (including full mrproper cleanings in between):
>
> I had some issues a few weeks ago with the decompressing:
>
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/73476
>
> That seemed to do with the size of the Image file and randomly removing parts of it made it work. Unfortunately, I couldn't reproduce it so that others can try.
OK, thank you! (I was wondering what happened to the ... dots during decompression.)
It was the sense that random size/layout changes were involved that worried me. Contrary to that thread, I'm seeing this with a kernel of about 1.66MB.
-rwxr-xr-x 1 cashwell cashwell 1813956 2010-04-29 12:45 arch/arm/boot/compressed/vmlinux
-rw-rw-r-- 1 cashwell cashwell 1736932 2010-04-29 12:45 arch/arm/boot/uImage
-rwxr-xr-x 1 cashwell cashwell 1736868 2010-04-29 12:45 arch/arm/boot/zImage
-rwxr-xr-x 1 cashwell cashwell 36840120 2010-04-29 12:31 vmlinux
And for me the *smaller* kernel (without cpufreq debugging) fails.
I tried applying Uwe's patches from 2010-02-03 09:41:07 GMT and got this during the build:
arch/arm/boot/compressed/misc.c: In function 'decompress_kernel':
arch/arm/boot/compressed/misc.c:308: warning: passing argument 4 of 'gunzip' from incompatible pointer type
arch/arm/boot/compressed/misc.c:297: warning: unused variable 'tmp'
I fixed those (deleted the tmp variable, changed the first arg of the flush function from char * to void *, and defined buff by casting the first arg).
That built but I still get:
Bytes transferred = 1736932 (1a80e4 hex)
## Booting image at a2000000 ...
Image Name: 2.6.33.2-gum1
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1736868 Bytes = 1.7 MB
Load Address: a1000000
Entry Point: a1000000
Verifying Checksum ... OK
Starting kernel at address a1000000 ...
Uncompressing Linux..............................................................................................................
uncompression error
-- System halted
I've also defined DEBUG in arch/arm/boot/compressed/head.S, as Uwe also advised. The output and result were unchanged, but the dots progressed much more slowly.
Lastly, I just tried using LZO compression instead, as the thread also mentioned. After a quick "sudo yum install lzop" it built. Interestingly, it's about 8% larger than with GZIP (1883256 vs. 1736868 bytes), but it seems to work better:
TFTP of 'cashwell/netboot/GUM1/boot/uImage' from server 10.18.1.11; our IP address is 10.18.17.1 to address 0xa2000000
Loading: #################################################################
################################################################
Bytes transferred = 1883320 (1cbcb8 hex)
## Booting image at a2000000 ...
Image Name: 2.6.33.2-gum1
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1883256 Bytes = 1.8 MB
Load Address: a1000000
Entry Point: a1000000
Verifying Checksum ... OK
Starting kernel at address a1000000 ...
Uncompressing Linux................. done, booting the kernel.
Linux version 2.6.33.2 (cashwell at mec-fedora12.argon.local) (gcc version 4.3.4 (GCC) ) #1 PREEMPT Thu Apr 29 12:31:38 EDT 2010
CPU: XScale-PXA270 [69054117] revision 7 (ARMv5TE), cr=0000397f
CPU: VIVT data cache, VIVT instruction cache
Machine: Gumstix Verdex PRO
Memory policy: ECC disabled, Data cache writeback
On node 0 totalpages: 32768
free_area_init_node: node 0, pgdat c0358ff0, node_mem_map c0376000
Normal zone: 256 pages used for memmap
Normal zone: 0 pages reserved
Normal zone: 32512 pages, LIFO batch:7
CCSR: 30000310, MDREFR: 201fc031
...
Fewer dots is interesting. But at least this helps me chalk this up to GZIP code being broken somehow.
LZO seems to be the winner.
Thanks!
-Mike
* Decompression failure in an inexplicable case
From: Michael Cashwell @ 2010-04-29 17:29 UTC
To: linux-arm-kernel
On Apr 29, 2010, at 12:39 PM, Russell King - ARM Linux wrote:
> On Thu, Apr 29, 2010 at 05:34:46PM +0100, Catalin Marinas wrote:
>> On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
>>> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both
>>> it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel
>>> builds and runs seemingly fine.
>>>
>>> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support
>>> then kernel builds OK (including full mrproper cleanings in between):
>>
>> BTW, one of the suggestions at the time was to try this:
>
> Another thing is to ask which kernel version. There's a number of fixes which recently went in to both -stable and -rc to fix problems with the decompressor.
From the first post (sorry to hear about your Internet connectivity woes, Russell!):
> I'm working on a custom 2.6.33.2 port to Gumstix Verdex XL6P (PXA270) ...
The custom part is vanilla board support (missing stuff we don't care about like frame buffers and Bluetooth).
That's why I was so rattled by the decompression problems. I haven't been working anywhere near that.
I've not tried GZIP with the 4K -> 8K stack space Catalin mentioned.
Since I presently seem to have this Heisenbug somewhat cornered and reproducible, I'm happy to try that and report the results if it would be helpful.
-Mike
* Decompression failure in an inexplicable case
From: Michael Cashwell @ 2010-04-29 18:59 UTC
To: linux-arm-kernel
On Apr 29, 2010, at 12:34 PM, Catalin Marinas wrote:
> On Thu, 2010-04-29 at 16:27 +0100, Michael Cashwell wrote:
>> If I omit CPUFreq support [CONFIG_CPU_FREQ is not set], or enable both it and its debugging support [CONFIG_CPU_FREQ_DEBUG=y], the kernel builds and runs seemingly fine.
>>
>> But if I have CONFIG_CPU_FREQ enabled but NOT its debugging support then kernel builds OK (including full mrproper cleanings in between):
>
> BTW, one of the suggestions at the time was to try this:
>
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index 3dc2cf8..e12e382 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -316,7 +316,7 @@ LC0: .word LC0 @ r1
> .word _start @ r5
> .word _got_start @ r6
> .word _got_end @ ip
> - .word user_stack+4096 @ sp
> + .word user_stack+8192 @ sp
> LC1: .word reloc_end - reloc_start
> .size LC0, . - LC0
>
> @@ -1086,4 +1086,4 @@ reloc_end:
>
> .align
> .section ".stack", "w"
> -user_stack: .space 4096
> +user_stack: .space 8192
OK, I tried this (with Uwe's other malloc heap size and decompress_flush() callback patches) and sadly my results were unchanged:
Bytes transferred = 1736932 (1a80e4 hex)
## Booting image at a2000000 ...
Image Name: 2.6.33.2-gum1
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 1736868 Bytes = 1.7 MB
Load Address: a1000000
Entry Point: a1000000
Verifying Checksum ... OK
No initrd
Starting kernel at address a1000000 ...
Uncompressing Linux..............................................................................................................
uncompression error
-- System halted
So there still seem to be some data-dependent issues in the kernel's gunzip decompressor.
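Since the same source tree boots with LZO, one way to narrow this down might be to confirm on the build host that piggy.gzip itself is sound, which would point at the in-kernel gunzip path rather than at the payload. A minimal Python sketch, assuming arch/arm/boot/compressed/piggy.gzip is a single plain gzip member made from arch/arm/boot/Image with nothing appended after it (the helper names are made up for illustration):

```python
import struct
import zlib


def gzip_trailer(gz):
    """Return (crc32, isize) from the last 8 bytes of a gzip member.

    RFC 1952 stores both as little-endian 32-bit values: the CRC32 of
    the uncompressed data, then its length modulo 2**32.
    """
    if len(gz) < 18:  # 10-byte header plus 8-byte trailer at minimum
        raise ValueError("too short to be a gzip member")
    return struct.unpack("<II", gz[-8:])


def trailer_matches(gz, image):
    """True if the gzip trailer agrees with the uncompressed image."""
    crc, isize = gzip_trailer(gz)
    return (
        crc == (zlib.crc32(image) & 0xFFFFFFFF)
        and isize == (len(image) & 0xFFFFFFFF)
    )
```

If trailer_matches() returns True for the bytes of piggy.gzip and Image, the compressed data at least describes the right kernel. Comparing the isize value across the working and failing builds might also show whether the failure tracks a particular uncompressed size.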
-Mike