public inbox for linux-kernel@vger.kernel.org
* [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved)
       [not found]   ` <1162497112.12781.51.camel@localhost.localdomain>
@ 2006-11-03 10:09     ` Enrico Scholz
  2006-11-13 21:56       ` Anton Vorontsov
  2006-11-14 22:15       ` Anton Vorontsov
  0 siblings, 2 replies; 3+ messages in thread
From: Enrico Scholz @ 2006-11-03 10:09 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: linux-kernel, rpurdie


[CC lkml; original issue at
 http://article.gmane.org/gmane.linux.ports.arm.kernel/28068]

rpurdie@rpsys.net (Richard Purdie) writes:

>> > I have a problem with JFFS2 filesystem and kernel 2.6.18. When
>> > starting a program which uses a certain library (libutil.so.1 in
>> > my case), the .got section of the library can be initialized
>> > wrongly when the used memory is uninitialized.
>> 
>> Problem seems to be caused by
>> 
>> | [PATCH] zlib_inflate: Upgrade library code to a recent version
>> 
>> (4f3865fb57a04db7cca068fed1c15badc064a302)
>> 
>> After reverting this (and related patches), things seem to work.
>> 
>> I don't have an idea yet, which changes in this complex patch are
>> really responsible....
>
> I'm the author of the above change. I just ran your test program
> on a device (ARM PXA255 with 2.6.19-rc4 kernel, 2.3.5ish glibc,
> gcc 3.4.4, libraries on jffs2) and I can't reproduce the
> problem.

I can reproduce it 100% with:

$ git checkout -b test v2.6.17.14
$ git-am -3 tmp/000[1-8]*
  (see https://www.cvg.de/people/ensc/libutil/ for patches and
   used .config (config.txt); the physmap patches are from 2.6.18)
$ make tftp

--> 'fillmem ; test' sequences work without errors


$ git-cherry-pick 4f3865fb57a04db7cca068fed1c15badc064a302
$ make tftp

--> 'fillmem ; test' sequences stop with a segfault

I compiled the kernel with both gcc-3.4.6 and gcc-4.1.1 and got the
same results.


I get the same results with a recent 2.6.18.1 kernel when reverting
all patches which modified lib/zlib_*.

I see segfaults with 2.6.19-rc4 too, but have not yet checked
whether removing the zlib patch fixes them.


Things get even stranger when using the glibc-2.5
dynamic loader:

| # ... copying ld-2.5.so and libc-2.5.so ...
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| Inconsistency detected by ld.so: dynamic-link.h: 169: elf_get_dynamic_info: Assertion `info[19]->d_un.d_val == sizeof (Elf32_Rel)' failed!
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| Segmentation fault
| # LD_LIBRARY_PATH=`pwd` ./ld-2.5.so /bin/test2
| #
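
For readers unfamiliar with the assertion above: index 19 of the
dynamic info array is DT_RELENT, the declared size of one REL
relocation entry, and on 32-bit ELF it has to equal
sizeof (Elf32_Rel), i.e. 8 bytes. A failing assertion therefore
suggests that ld.so saw a damaged .dynamic section. The following is
only a minimal sketch of that check on raw .dynamic bytes, not the
glibc code. The function names are hypothetical, little-endian data
is assumed by default, and a missing DT_RELENT is treated as
acceptable (a simplification).

```python
import struct

DT_NULL = 0          # tag that terminates the dynamic section
DT_RELENT = 19       # tag holding the size of one REL relocation entry
ELF32_REL_SIZE = 8   # sizeof(Elf32_Rel): r_offset + r_info, 4 bytes each


def parse_dynamic32(data, little_endian=True):
    """Return {d_tag: d_val} for an ELF32 .dynamic section given as bytes."""
    # Elf32_Dyn is a signed 32-bit tag followed by a 32-bit value.
    fmt = "<iI" if little_endian else ">iI"
    entries = {}
    for offset in range(0, len(data) - 7, 8):
        tag, val = struct.unpack_from(fmt, data, offset)
        if tag == DT_NULL:
            break
        entries.setdefault(tag, val)   # keep the first occurrence of a tag
    return entries


def relent_ok(data, little_endian=True):
    """True if DT_RELENT is absent or equals sizeof(Elf32_Rel)."""
    entries = parse_dynamic32(data, little_endian)
    return entries.get(DT_RELENT, ELF32_REL_SIZE) == ELF32_REL_SIZE
```

Feeding it the .dynamic bytes as seen through the mapping, rather
than as returned by read(), would show whether the loader's view of
the file differs from what md5sum sees.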


> You mentioned elsewhere that reading the lib from flash gives
> consistent md5sums. There is only one inflation code path and
> if the md5sum is always consistent, I can't see how the
> inflation code is at fault. I therefore strongly suspect this
> is some userspace issue when handling the got.

Issue:

* seems to be triggered by the zlib kernel patch
* seems to be triggered by my 'libutil.so' (I can not see it with
  other libraries)
* can be reproduced on two different PXA270 platforms (same
  userspace, but different module vendors and different memory
  timing setups)


I see the following reasons:

* new zlib code has side effects (overflows?)
* new zlib code is so fast that it triggers a race somewhere else
* libutil.so's .init section is buggy (likely, but why does the
  error not occur when libutil.so is on tmpfs or NFS?)
* new zlib code requires more/less memory bandwidth, changes
  power consumption of CPU/memory which is causing random errors
  (unlikely because only same part of .got table is affected and
  it happens on two different platforms)
* some DCACHE issue


> Which other related patches did you remove?

For 2.6.18 tests, I reverted only the patches which changed
lib/zlib_* after 2.6.17:

| 31925c8857ba17c11129b766a980ff7c87780301 [PATCH] Fix ppc32 zImage inflate
| b762450e84e20a179ee5993b065caaad99a65fbf [PATCH] zlib inflate: fix function definitions
| 0ecbf4b5fc38479ba29149455d56c11a23b131c0 move acknowledgment for Mark Adler to CREDITS
| 4f3865fb57a04db7cca068fed1c15badc064a302 [PATCH] zlib_inflate: Upgrade library code to a recent version




Enrico



* Re: [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved)
  2006-11-03 10:09     ` [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved) Enrico Scholz
@ 2006-11-13 21:56       ` Anton Vorontsov
  2006-11-14 22:15       ` Anton Vorontsov
  1 sibling, 0 replies; 3+ messages in thread
From: Anton Vorontsov @ 2006-11-13 21:56 UTC (permalink / raw)
  To: Enrico Scholz; +Cc: linux-arm-kernel, linux-kernel, rpurdie

Hello Richard, Enrico!

On Fri, Nov 03, 2006 at 11:09:35AM +0100, Enrico Scholz wrote:
> [CC lkml; original issue at
>  http://article.gmane.org/gmane.linux.ports.arm.kernel/28068]
> 
> rpurdie@rpsys.net (Richard Purdie) writes:
> 
> >> > I have a problem with JFFS2 filesystem and kernel 2.6.18. When
> >> > starting a program which uses a certain library (libutil.so.1 in
> >> > my case), the .got section of the library can be initialized
> >> > wrongly when the used memory is uninitialized.
> >> 
> >> Problem seems to be caused by
> >> 
> >> | [PATCH] zlib_inflate: Upgrade library code to a recent version
> >> 
> >> (4f3865fb57a04db7cca068fed1c15badc064a302)
> >> 
> >> After reverting this (and related patches), things seem to work.
> >> 
> >> I don't have an idea yet, which changes in this complex patch are
> >> really responsible....
> >
> > I'm the author of the above change. I just ran your test program
> > on a device (ARM PXA255 with 2.6.19-rc4 kernel, 2.3.5ish glibc,
> > gcc 3.4.4, libraries on jffs2) and I can't reproduce the
> > problem.
> 
> I can reproduce it 100% with:

Same here. I can reproduce exactly the same problem, and reverting the
zlib changes fixes it. I'm testing on ARM PXA270 + binutils-2.17 +
glibc-2.5 + gcc-4.1.1 (old ABI).

> > Which other related patches did you remove?
> 
> For 2.6.18 tests, I reverted only the patches which changed
> lib/zlib_* after 2.6.17:
> 
> | 31925c8857ba17c11129b766a980ff7c87780301 [PATCH] Fix ppc32 zImage inflate
> | b762450e84e20a179ee5993b065caaad99a65fbf [PATCH] zlib inflate: fix function definitions
> | 0ecbf4b5fc38479ba29149455d56c11a23b131c0 move acknowledgment for Mark Adler to CREDITS
> | 4f3865fb57a04db7cca068fed1c15badc064a302 [PATCH] zlib_inflate: Upgrade library code to a recent version

Indeed. Reverting these patches fixes all these pesky issues with
jffs2/libutil/openpty.

> Enrico

Thanks,

-- Anton (irc: bd2)


* Re: [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved)
  2006-11-03 10:09     ` [ARM] Corrupted .got section with 2.6.18 and JFFS2 (solved) Enrico Scholz
  2006-11-13 21:56       ` Anton Vorontsov
@ 2006-11-14 22:15       ` Anton Vorontsov
  1 sibling, 0 replies; 3+ messages in thread
From: Anton Vorontsov @ 2006-11-14 22:15 UTC (permalink / raw)
  To: Enrico Scholz; +Cc: linux-arm-kernel, linux-kernel, rpurdie

Hi all,

On Fri, Nov 03, 2006 at 11:09:35AM +0100, Enrico Scholz wrote:
> [CC lkml; original issue at
>  http://article.gmane.org/gmane.linux.ports.arm.kernel/28068]
> 
> rpurdie@rpsys.net (Richard Purdie) writes:
> 
> >> > I have a problem with JFFS2 filesystem and kernel 2.6.18. When
> >> > starting a program which uses a certain library (libutil.so.1 in
> >> > my case), the .got section of the library can be initialized
> >> > wrongly when the used memory is uninitialized.
> >> 
> >> Problem seems to be caused by
> >> 
> >> | [PATCH] zlib_inflate: Upgrade library code to a recent version
> >> 
> >> (4f3865fb57a04db7cca068fed1c15badc064a302)
> >> 
> >> After reverting this (and related patches), things seem to work.
> >> 
> >> I don't have an idea yet, which changes in this complex patch are
> >> really responsible....
> >
> > I'm the author of the above change. I just ran your test program
> > on a device (ARM PXA255 with 2.6.19-rc4 kernel, 2.3.5ish glibc,
> > gcc 3.4.4, libraries on jffs2) and I can't reproduce the
> > problem.
> 
> I can reproduce it 100% with:

As I said before (though that message was not delivered to
linux-arm-kernel), I can reproduce it too, using glibc-2.4 or
glibc-2.5. I can't reproduce it using glibc-2.3.5.

My further investigation shows that reading libutil.so.1
(cat /lib/libutil.so.1 > /dev/null) before using it eliminates the
segfault. That, I suppose, means that glibc can operate on the
cached file without trouble, but fails to initially "read" it from
disk properly.

Quoting Richard Purdie:

"The file is read ok from the disk when copying and when read with
md5sum. I therefore wonder if the dynamic linker is doing something it
shouldn't."

However, it may be either a glibc or a JFFS2 issue. Unlike cat, cp or
md5sum, glibc does not use the read() call; it uses a read-only mmap
call (which is supported by JFFS2, if I understood the code correctly)
on the libraries ld-linux wants to load.
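
One way to test that theory would be to compare what the two paths
return for the same file. The following is only a rough sketch: the
function names are hypothetical, SHA-256 stands in for md5sum (any
digest would do), and Python's ACCESS_READ creates a shared read-only
mapping, whereas the dynamic loader may map the file differently. The
mapping is hashed first so that it is the mmap path, not read(), that
pulls the data in on a cold page cache.

```python
import hashlib
import mmap


def digest_via_read(path, chunk_size=65536):
    """Digest of the file obtained with ordinary read() calls (like cat)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_via_mmap(path):
    """Digest of the same file obtained through a read-only mapping."""
    with open(path, "rb") as f:
        if f.seek(0, 2) == 0:          # an empty file cannot be mapped
            return hashlib.sha256(b"").hexdigest()
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return hashlib.sha256(m).hexdigest()


def mappings_agree(path):
    """True if the mmap view and the read() view have the same content."""
    mapped = digest_via_mmap(path)     # touch the file via mmap first
    return mapped == digest_via_read(path)
```

For the result to mean anything it would have to be run on the
target, against the library on JFFS2, before anything else has read
the file (for example right after boot). A mismatch would point at
the kernel's mmap/readpage path rather than at glibc.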


I hope these findings of mine will shed some light on the issue, and
that someone will guess where the real bug is. ;-)

> 
> Enrico

-- Anton (irc: bd2)

