Linux PARISC architecture development
* bad code with gcc-4.3 and lzma-utils ?
@ 2009-01-03 12:45 Mike Frysinger
  2009-01-03 13:40 ` Matthew Wilcox
  2009-01-03 17:21 ` John David Anglin
  0 siblings, 2 replies; 14+ messages in thread
From: Mike Frysinger @ 2009-01-03 12:45 UTC (permalink / raw)
  To: linux-parisc


[-- Attachment #1.1: Type: text/plain, Size: 2632 bytes --]

if you build lzma-4.32.7 with gcc-4.3 and -march=2.0, then lzma segfaults when 
trying to do anything useful.  gcc-4.1 has no problem here.

to reproduce, just download lzma-4.32.7.tar.gz and do:
tar xf lzma-4.32.7.tar.gz
cd lzma-4.32.7
CXXFLAGS='-O1 -march=2.0 -g' ./configure
make
make check

Guy Martin narrowed it down to the code in src/sdk/7zip/Compress/LZMA/ ... if
we build the encoder/decoder without -march=2.0, then the tests pass and life
is peachy.  with a little bit of patience, i think i narrowed it down a bit
further to the function CDecoder::SetDecoderProperties2().  we can build the
rest of the file with -march=2.0, but when we build this function with
-march=2.0, then it craps out.  the code in question has a bit of funky casting
from a byte array up to uint32's, but i don't think this is an alignment issue.
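
For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of reading a 32-bit value out of a byte array without any pointer cast, so that alignment cannot be a factor. It assumes the usual 5-byte LZMA properties layout (one packed lc/lp/pb byte followed by a little-endian dictionary size). The struct and function names are hypothetical, and this is not the code from LZMADecoder.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; not the lzma-utils source.
struct LzmaProps {
    unsigned lc;              // literal context bits
    unsigned lp;              // literal position bits
    unsigned pb;              // position bits
    std::uint32_t dict_size;  // dictionary size in bytes
};

// Decode a 5-byte properties header byte by byte.  Nothing here casts the
// byte pointer to a wider type, so no alignment assumption is made.
inline bool decode_lzma_props(const unsigned char* p, std::size_t n,
                              LzmaProps& out) {
    if (p == nullptr || n < 5) return false;
    unsigned d = p[0];
    if (d >= 9 * 5 * 5) return false;  // packed byte out of range
    out.lc = d % 9;
    d /= 9;
    out.lp = d % 5;
    out.pb = d / 5;
    assert(out.lc < 9 && out.lp < 5 && out.pb < 5);

    // Assemble the little-endian 32-bit dictionary size with shifts.
    out.dict_size = 0;
    for (int i = 0; i < 4; ++i)
        out.dict_size |= static_cast<std::uint32_t>(p[1 + i]) << (8 * i);
    return true;
}
```

With the example header bytes 5D 00 00 80 00 this yields lc=3, lp=0, pb=2 and a dictionary size of 8388608 bytes.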

i'm attaching the preprocessed file which can then be compared:
g++ -O1 -c LZMADecoder.i -march=2.0 -o LZMADecoder.bad.o
g++ -O1 -c LZMADecoder.i -o LZMADecoder.good.o

the kernel logs the fault like so:
do_page_fault() pid=15903 command='lzma' type=15 address=0x00000063

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001000000000000001111 Not tainted
r00-03  0004000f 0001b800 0001c217 00000005
r04-07  000320e8 00356e06 00000019 00040000
r08-11  000320b0 fb12c810 fb12c6c4 00000000
r12-15  fb12c6c0 00000000 00030790 fb12c010
r16-19  00030794 00029d74 0002a8a8 00000063
r20-23  00000000 00000400 00000003 fb12c9c8
r24-27  00100000 4008f008 00000003 0002fcbc
r28-31  00000060 00000000 fb12c980 00000000
sr00-03  00000006 00000000 00000000 00000006
sr04-07  00000006 00000006 00000006 00000006

      VZOUICununcqcqcqcqcqcrmunTDVZOUI
FPSR: 00000000000000000000000000000000
FPER1: 00000000
fr00-03  0000000000000000 0000000000000000 0000000000000000 0000000000000000
fr04-07  1267d000cfed0968 00000020101683f0 106238101061f810 bff0000000000000
fr08-11  fffff0001267d000 0000000200000003 0000000012729840 ffffff9c00000002
fr12-15  fb6ab02c00000001 000cc542101737c8 10101a281061f810 126840883b9aca00
fr16-19  104aed9c106238a8 fffffff412684208 105902f6105902f7 000000000000000b
fr20-23  1055d8100000000f 1055d810101687b4 0000000800000002 00001c2c00000000
fr24-27  0000000000000000 000000004ccd4eed fce2fc640b19f33d 8c4f289cb1314a9a
fr28-31  0701fb1163036696 8a9012eac57c0709 0701fb1100000228 0a4ed1f910111908

IASQ: 00000006 00000006 IAOQ: 0001b9f3 0001b9f7
 IIR: 0e751280    ISR: 00000006  IOR: 00000063
 CPU:        0   CR30: 93704000 CR31: 10600000
 ORIG_R28: 00000000
 IAOQ[0]: 0x1b9f0
 IAOQ[1]: 0x1b9f4
 RP(r2): 0x1c214
-mike

[-- Attachment #1.2: LZMADecoder.i.bz2 --]
[-- Type: application/x-bzip, Size: 9475 bytes --]

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-03 12:45 bad code with gcc-4.3 and lzma-utils ? Mike Frysinger
@ 2009-01-03 13:40 ` Matthew Wilcox
  2009-01-03 17:21 ` John David Anglin
  1 sibling, 0 replies; 14+ messages in thread
From: Matthew Wilcox @ 2009-01-03 13:40 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: linux-parisc

On Sat, Jan 03, 2009 at 07:45:01AM -0500, Mike Frysinger wrote:
> the kernel logs the fault like so:
> do_page_fault() pid=15903 command='lzma' type=15 address=0x00000063

That's a null pointer dereference ...

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-03 12:45 bad code with gcc-4.3 and lzma-utils ? Mike Frysinger
  2009-01-03 13:40 ` Matthew Wilcox
@ 2009-01-03 17:21 ` John David Anglin
  2009-01-04 12:08   ` Guy Martin
  1 sibling, 1 reply; 14+ messages in thread
From: John David Anglin @ 2009-01-03 17:21 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: linux-parisc

> to reproduce, just download lzma-4.32.7.tar.gz and do:
> tar xf lzma-4.32.7.tar.gz
> cd lzma-4.32.7
> CXXFLAGS='-O1 -march=2.0 -g' ./configure
> make
> make check

Built as above.  I see a segv compressing InBuffer.o in build directory.

Program received signal SIGSEGV, Segmentation fault.
0x00024638 in NCompress::NLZMA::CEncoder::SetStreams (this=0x4158f000, 
    inStream=0x0, outStream=0x3, inSize=0x12, outSize=0x0)
    at ../../../../../../src/sdk/7zip/Compress/LZMA/LZMAEncoder.cpp:1251
1251	  RINOK(Create());

There is a problem with memory allocation in MyAlloc:

Breakpoint 2, MyAlloc (size=72355848) at ../../../src/sdk/Common/Alloc.cpp:24
24	  if (size == 0)

This fails:

(gdb) 
NBT4::CMatchFinderBinTree::Create (this=0x32100, historySize=8388608, 
    keepAddBufferBefore=<value optimized out>, 
    matchMaxLen=<value optimized out>, keepAddBufferAfter=419)
    at ../../../../../../src/sdk/7zip/Compress/LZMA/../LZ/BinTree/BinTreeMain.h:97
97	    if (_hash != 0)
(gdb) 
101	  return E_OUTOFMEMORY;

There isn't a check for out of memory at 1251, so segv.  Don't have time
now to look at memory allocation problem.
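
The failure mode described above, an allocation that fails and a caller that carries on anyway, can be sketched as follows. This is only an illustration of one conventional guard: the status codes, the RETURN_IF_ERROR macro, the MatchFinder struct and the max_entries cap are all hypothetical, loosely modelled on the HRESULT style visible in the gdb output, and are not the 7-zip definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical status codes; not the 7-zip HRESULT values.
enum Status { kOk = 0, kOutOfMemory = 1 };

// Hypothetical helper: propagate any non-OK status to the caller.
#define RETURN_IF_ERROR(expr)               \
    do {                                    \
        Status status_ = (expr);            \
        if (status_ != kOk) return status_; \
    } while (0)

struct MatchFinder {
    unsigned* hash = nullptr;
    ~MatchFinder() { std::free(hash); }

    // Allocation can fail; report it rather than leaving a null pointer
    // for the caller to trip over.  max_entries is an artificial cap so
    // that the failure path is easy to exercise.
    Status Create(std::size_t entries, std::size_t max_entries) {
        assert(hash == nullptr);
        if (entries == 0 || entries > max_entries) return kOutOfMemory;
        hash = static_cast<unsigned*>(std::calloc(entries, sizeof(unsigned)));
        return hash != nullptr ? kOk : kOutOfMemory;
    }
};

// Caller: bail out before touching hash if Create() failed.
inline Status SetStreams(MatchFinder& mf, std::size_t entries,
                         std::size_t max_entries) {
    RETURN_IF_ERROR(mf.Create(entries, max_entries));
    mf.hash[0] = 1;  // only reached when the allocation succeeded
    return kOk;
}
```

Calling SetStreams() with a request larger than the cap returns kOutOfMemory and never dereferences the null hash pointer.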

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-03 17:21 ` John David Anglin
@ 2009-01-04 12:08   ` Guy Martin
  2009-01-04 16:41     ` John David Anglin
  2009-01-04 20:16     ` John David Anglin
  0 siblings, 2 replies; 14+ messages in thread
From: Guy Martin @ 2009-01-04 12:08 UTC (permalink / raw)
  To: John David Anglin; +Cc: Mike Frysinger, linux-parisc


Hi Dave,

Could you do the very same test with -fno-delayed-branch ?

Compiling without that optimization on my system makes the problem go
away. It is really a problem with that part of the code and delayed
branching.

This has been tested with different versions of gcc (4.1.2, 4.2.4,
4.3.1). Both 4.2 and 4.3 exhibit the issue once you compile with
-march=2.0 and -fdelayed-branch.

Also, compiling with -march=2.0 -O0 -fdelayed-branch doesn't produce a
segv but produces the following error message :
hope tests # ../src/lzma/lzma -c < in > out 
../src/lzma/lzma: SetCoderProperties() error

There is really a compiler issue here.


For reference, the Gentoo bug can be found here :
http://bugs.gentoo.org/show_bug.cgi?id=228287


Also, not sure if you noticed, but the values printed in the gdb backtrace
are completely wrong. But that is a different issue, as binaries
generated with any compiler show the same thing.


Regards,
  Guy

On Sat, 3 Jan 2009 12:21:25 -0500 (EST)
"John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:

> > to reproduce, just download lzma-4.32.7.tar.gz and do:
> > tar xf lzma-4.32.7.tar.gz
> > cd lzma-4.32.7
> > CXXFLAGS='-O1 -march=2.0 -g' ./configure
> > make
> > make check
> 
> Built as above.  I see a segv compressing InBuffer.o in build
> directory.
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x00024638 in NCompress::NLZMA::CEncoder::SetStreams (this=0x4158f000,
>     inStream=0x0, outStream=0x3, inSize=0x12, outSize=0x0)
>     at ../../../../../../src/sdk/7zip/Compress/LZMA/LZMAEncoder.cpp:1251
> 1251	  RINOK(Create());
> 
> There is a problem with memory allocation in MyAlloc:
> 
> Breakpoint 2, MyAlloc (size=72355848) at ../../../src/sdk/Common/Alloc.cpp:24
> 24	  if (size == 0)
> 
> This fails:
> 
> (gdb) 
> NBT4::CMatchFinderBinTree::Create (this=0x32100, historySize=8388608, 
>     keepAddBufferBefore=<value optimized out>, 
>     matchMaxLen=<value optimized out>, keepAddBufferAfter=419)
>     at ../../../../../../src/sdk/7zip/Compress/LZMA/../LZ/BinTree/BinTreeMain.h:97
> 97	    if (_hash != 0)
> (gdb) 
> 101	  return E_OUTOFMEMORY;
> 
> There isn't a check for out of memory at 1251, so segv.  Don't have
> time now to look at memory allocation problem.
> 
> Dave


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 12:08   ` Guy Martin
@ 2009-01-04 16:41     ` John David Anglin
  2009-01-04 18:21       ` John David Anglin
  2009-01-04 20:16     ` John David Anglin
  1 sibling, 1 reply; 14+ messages in thread
From: John David Anglin @ 2009-01-04 16:41 UTC (permalink / raw)
  To: Guy Martin; +Cc: vapier, linux-parisc

> Could you do the very same test with -fno-delayed-branch ?
> 
> Compiling without that optimization on my system makes the problem go
> away. It is really a problem with that part of the code and delayed
> branching.

That's useful information as it narrows the problem considerably.  The
problem might be in the PA backend delay slot handling for PA 2.0 32-bit
calls, or it might be a reorg problem.  The latter are very tricky.

> This has been tested with different versions of gcc (4.1.2, 4.2.4,
> 4.3.1). Both 4.2 and 4.3 exhibit the issue once you compile with
> -march=2.0 and -fdelayed-branch.
> 
> Also, compiling with -march=2.0 -O0 -fdelayed-branch doesn't produce a
> segv but produces the following error message :
> hope tests # ../src/lzma/lzma -c < in > out 
> ../src/lzma/lzma: SetCoderProperties() error
> 
> There is really a compiler issue here.

We need to find, by debugging, an example of code that is miscompiled.

> Also, not sure if you noticed, but the values printed in the gdb backtrace
> are completely wrong. But that is a different issue, as binaries
> generated with any compiler show the same thing.

Yes, it's a major problem for debugging and it has gotten worse with
each new GCC version.  There have been some discussions on the list
about how to improve debugging information.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 16:41     ` John David Anglin
@ 2009-01-04 18:21       ` John David Anglin
  0 siblings, 0 replies; 14+ messages in thread
From: John David Anglin @ 2009-01-04 18:21 UTC (permalink / raw)
  To: John David Anglin; +Cc: gmsoft, vapier, linux-parisc

> > Also, compiling with -march=2.0 -O0 -fdelayed-branch doesn't produce a
> > segv but produces the following error message :
> > hope tests # ../src/lzma/lzma -c < in > out 
> > ../src/lzma/lzma: SetCoderProperties() error

With this case, I see the following miscompilation:

0x0002e860 <_ZN9NCompress5NLZMA8CEncoder18SetCoderPropertiesEPKjPK14tagPROPVARIANTj+496>:	ldil L%29000,r1
0x0002e864 <_ZN9NCompress5NLZMA8CEncoder18SetCoderPropertiesEPKjPK14tagPROPVARIANTj+500>:	be,l 554(sr4,r1),sr0,r31
0x0002e868 <_ZN9NCompress5NLZMA8CEncoder18SetCoderPropertiesEPKjPK14tagPROPVARIANTj+504>:	copy r31,rp
0x0002e86c <_ZN9NCompress5NLZMA8CEncoder18SetCoderPropertiesEPKjPK14tagPROPVARIANTj+508>:	copy ret0,r26

The delay slot for the branch is filled twice.  So for now, don't use
-march=2.0 in 32-bit compilations.  4.1 is probably broken too.  The
second copy instruction was supposed to have been moved before the call.
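
To make the consequence concrete, here is a toy model of the sequence in the disassembly above. It is a sketch under simplifying assumptions: only ret0 and r26 are modelled, the callee is reduced to "whatever r26 holds on entry", and the type and function names are hypothetical. It shows that when the second copy lands after the delay slot instead of before the call, the callee sees a stale r26.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: only the two registers that matter for the argument.
struct Regs {
    std::uint32_t ret0;  // result of the previous call
    std::uint32_t r26;   // first argument register for the next call
};

enum class CopyPlacement { BeforeCall, AfterDelaySlot };

// Returns the r26 value that the callee observes on entry.
inline std::uint32_t arg_seen_by_callee(Regs regs, CopyPlacement where) {
    assert(regs.r26 != regs.ret0);  // otherwise both cases look identical

    if (where == CopyPlacement::BeforeCall)
        regs.r26 = regs.ret0;             // copy ret0,r26 (intended position)

    // be,l ...       branch and link
    // copy r31,rp    the single delay slot, executed before the callee
    const std::uint32_t seen = regs.r26;  // callee starts running here

    if (where == CopyPlacement::AfterDelaySlot)
        regs.r26 = regs.ret0;             // runs only after the callee returns
    return seen;
}
```

Starting from ret0=42 and a stale r26=7, the callee sees 42 with the copy placed before the call and 7 with the copy placed after the delay slot.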

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 12:08   ` Guy Martin
  2009-01-04 16:41     ` John David Anglin
@ 2009-01-04 20:16     ` John David Anglin
  2009-01-04 23:21       ` Guy Martin
  1 sibling, 1 reply; 14+ messages in thread
From: John David Anglin @ 2009-01-04 20:16 UTC (permalink / raw)
  To: Guy Martin; +Cc: vapier, linux-parisc

> Also, compiling with -march=2.0 -O0 -fdelayed-branch doesn't produce a
> segv but produces the following error message :
> hope tests # ../src/lzma/lzma -c < in > out 
> ../src/lzma/lzma: SetCoderProperties() error
> 
> There is really a compiler issue here.

The problem was introduced in revision 68677 on June 29, 2003 when
a change was made to allow ble to be used for local calls.  So, the
problem is present in all versions back to at least 3.4.

The following change should fix the bug.  I'm currently doing a full
build to test.

Thanks for reporting the problem.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c	(revision 143056)
+++ config/pa/pa.c	(working copy)
@@ -7547,7 +7547,9 @@
 	  if (seq_length != 0
 	      && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
 	      && !sibcall
-	      && (!TARGET_PA_20 || indirect_call))
+	      && (!TARGET_PA_20
+		  || indirect_call
+		  || (TARGET_LONG_ABS_CALL || local_call) && !flag_pic))
 	    {
 	      /* A non-jump insn in the delay slot.  By definition we can
 		 emit this insn before the call (and in fact before argument

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 20:16     ` John David Anglin
@ 2009-01-04 23:21       ` Guy Martin
  2009-01-04 23:26         ` John David Anglin
  2009-01-04 23:42         ` John David Anglin
  0 siblings, 2 replies; 14+ messages in thread
From: Guy Martin @ 2009-01-04 23:21 UTC (permalink / raw)
  To: John David Anglin; +Cc: vapier, linux-parisc

On Sun, 4 Jan 2009 15:16:40 -0500 (EST)
"John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:


> The problem was introduced in revision 68677 on June 29, 2003 when
> a change was made to allow ble to be used for local calls.  So, the
> problem is present in all versions back to at least 3.4.
> 
> The following change should fix the bug.  I'm currently doing a full
> build to test.

I just tested this patch with 4.2.4 and it worked fine !

Thanks for the fix.

Cheers,
  Guy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 23:21       ` Guy Martin
@ 2009-01-04 23:26         ` John David Anglin
  2009-01-04 23:42         ` John David Anglin
  1 sibling, 0 replies; 14+ messages in thread
From: John David Anglin @ 2009-01-04 23:26 UTC (permalink / raw)
  To: Guy Martin; +Cc: vapier, linux-parisc

> I just tested this patch with 4.2.4 and it worked fine !

Sigh, I messed up the parens, but it's the right idea.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 23:21       ` Guy Martin
  2009-01-04 23:26         ` John David Anglin
@ 2009-01-04 23:42         ` John David Anglin
  2009-01-05  9:54           ` Guy Martin
  1 sibling, 1 reply; 14+ messages in thread
From: John David Anglin @ 2009-01-04 23:42 UTC (permalink / raw)
  To: Guy Martin; +Cc: vapier, linux-parisc

Guy,

> I just tested this patch with 4.2.4 and it worked fine !

Would you please try this.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c	(revision 143062)
+++ config/pa/pa.c	(working copy)
@@ -7547,7 +7547,9 @@
 	  if (seq_length != 0
 	      && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
 	      && !sibcall
-	      && (!TARGET_PA_20 || indirect_call))
+	      && (!TARGET_PA_20
+		  || indirect_call
+		  || ((TARGET_LONG_ABS_CALL || local_call) && !flag_pic)))
 	    {
 	      /* A non-jump insn in the delay slot.  By definition we can
 		 emit this insn before the call (and in fact before argument
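
The two versions of the patch differ only in one extra pair of parentheses. Since && binds tighter than || in C and C++, both conditions should group the same way, and the explicit parentheses mainly spell out the intent and keep GCC's -Wparentheses warning quiet. Whether that warning is what the earlier "messed up the parens" remark referred to is an assumption. The sketch below uses plain bools as hypothetical stand-ins for the macros and flags in pa.c and compares the two forms exhaustively.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
// Condition as written in the first patch: relies on && binding
// tighter than ||.
inline bool first_form(bool pa20, bool indirect, bool long_abs, bool local,
                       bool pic) {
    return !pa20 || indirect || (long_abs || local) && !pic;
}
#pragma GCC diagnostic pop

// Condition as written in the second patch: same grouping, spelled out.
inline bool second_form(bool pa20, bool indirect, bool long_abs, bool local,
                        bool pic) {
    return !pa20 || indirect || ((long_abs || local) && !pic);
}

// Compare both forms over all 32 combinations of the five inputs.
inline bool forms_agree() {
    int checked = 0;
    for (int bits = 0; bits < 32; ++bits) {
        const bool pa20 = (bits & 1) != 0;
        const bool indirect = (bits & 2) != 0;
        const bool long_abs = (bits & 4) != 0;
        const bool local = (bits & 8) != 0;
        const bool pic = (bits & 16) != 0;
        if (first_form(pa20, indirect, long_abs, local, pic) !=
            second_form(pa20, indirect, long_abs, local, pic))
            return false;
        ++checked;
    }
    assert(checked == 32);
    return true;
}
```

forms_agree() returns true, which is consistent with both patches having tested fine in the thread.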

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-04 23:42         ` John David Anglin
@ 2009-01-05  9:54           ` Guy Martin
  2009-01-06  4:40             ` John David Anglin
  0 siblings, 1 reply; 14+ messages in thread
From: Guy Martin @ 2009-01-05  9:54 UTC (permalink / raw)
  To: John David Anglin; +Cc: vapier, linux-parisc

On Sun, 4 Jan 2009 18:42:02 -0500 (EST)
"John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:

> Would you please try this.

It did work too.

  Guy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-05  9:54           ` Guy Martin
@ 2009-01-06  4:40             ` John David Anglin
  2009-01-06  6:16               ` Mike Frysinger
  0 siblings, 1 reply; 14+ messages in thread
From: John David Anglin @ 2009-01-06  4:40 UTC (permalink / raw)
  To: Guy Martin; +Cc: vapier, linux-parisc

> On Sun, 4 Jan 2009 18:42:02 -0500 (EST)
> "John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:
> 
> > Would you please try this.
> 
> It did work too.

I installed the patch on the gcc trunk (4.4).  However, there are still
issues.  The patch didn't fix the -O0 -fdelayed-branch error.  I have
traced this to a middle-end bug:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38740

There's also a nagging issue.  There would appear to be a problem
with the branch distance calculation.  The "be,l" instruction should
not have been used.  A "b,l" is supposed to be ok if the branch
distance is less than 7600000 bytes.  The code in this file is not
that big.  This may be the reason for the apparent regression.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-06  4:40             ` John David Anglin
@ 2009-01-06  6:16               ` Mike Frysinger
  2009-01-06 14:35                 ` John David Anglin
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Frysinger @ 2009-01-06  6:16 UTC (permalink / raw)
  To: John David Anglin; +Cc: Guy Martin, linux-parisc

[-- Attachment #1: Type: text/plain, Size: 488 bytes --]

On Monday 05 January 2009 23:40:57 John David Anglin wrote:
> > On Sun, 4 Jan 2009 18:42:02 -0500 (EST) John David Anglin wrote:
> > > Would you please try this.
> >
> > It did work too.
>
> I installed the patch on the gcc trunk (4.4).  However, there are still
> issues.  The patch didn't fix the -O0 -fdelayed-branch error.  I have
> traced this to a middle-end bug:

thanks a lot for looking into this.  do you plan on adding the patch to the 
4.3 branch as well ?
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: bad code with gcc-4.3 and lzma-utils ?
  2009-01-06  6:16               ` Mike Frysinger
@ 2009-01-06 14:35                 ` John David Anglin
  0 siblings, 0 replies; 14+ messages in thread
From: John David Anglin @ 2009-01-06 14:35 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: gmsoft, linux-parisc

> do you plan on adding the patch to the 4.3 branch as well ?

Yes, but I want to study why this branch sequence is actually
being used in this situation.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-01-06 14:35 UTC | newest]

Thread overview: 14+ messages
2009-01-03 12:45 bad code with gcc-4.3 and lzma-utils ? Mike Frysinger
2009-01-03 13:40 ` Matthew Wilcox
2009-01-03 17:21 ` John David Anglin
2009-01-04 12:08   ` Guy Martin
2009-01-04 16:41     ` John David Anglin
2009-01-04 18:21       ` John David Anglin
2009-01-04 20:16     ` John David Anglin
2009-01-04 23:21       ` Guy Martin
2009-01-04 23:26         ` John David Anglin
2009-01-04 23:42         ` John David Anglin
2009-01-05  9:54           ` Guy Martin
2009-01-06  4:40             ` John David Anglin
2009-01-06  6:16               ` Mike Frysinger
2009-01-06 14:35                 ` John David Anglin
