linuxppc-dev.lists.ozlabs.org archive mirror
* Re: network stack oops 2.4.1/gcc 2.95.3
@ 2001-01-29 23:54 Iain Sandoe
  2001-01-30 12:39 ` Franz Sirl
  0 siblings, 1 reply; 9+ messages in thread
From: Iain Sandoe @ 2001-01-29 23:54 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Franz Sirl, linuxppc-dev


Hi David,

Mon, Jan 29, 2001, David Edelsohn wrote:
>>> "Iain Sandoe" writes:
> Iain> It is not a regression test.  It would not be particularly easy, either
> Iain> to put 2.95.2 back up and wind back the system to the build conditions...
> Iain> but the version used was 2.95.3-test2. and the bk pull was 2.4.1-pre10.
>
>  A regression is different from a regression test.  I never called
> it a regression test.

RTFM(ail) properly Iain ;)

>  I do not mean rewinding all of the way to gcc-2.95.2, but to
> Franz's patched version of GCC prior to the gcc-2.95.3 changes.

OK. It might be possible - I have a build timestamp on the kernel (I keep
the last four or five builds) and I guess I could somehow track down which
revs to get from bk.  I still have the previous glibc & gcc rpms.

However, (see below) I will only do this if someone really believes they
will get useful info... I'm mid-way through a fairly large chunk of work
ATM.

>  I am just trying to narrow down whether this is something new.

I believe so.  Because I really think I'd built that particular pull before
I upgraded the tool-chain.  BUT because it only happens under fairly unusual
circumstances (for my set-up) I can't be 100% sure.

Also it involves a depressingly large amount of system context: kernel, X,
inetd, da-da-da...

If it happens again - I'll see if I really have a genuine Illegal
Instruction in the code stream (or I'm just trying to execute a format
string ;)

ciao,
Iain.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

* Re: network stack oops 2.4.1/gcc 2.95.3
@ 2001-02-01  0:53 Iain Sandoe
  0 siblings, 0 replies; 9+ messages in thread
From: Iain Sandoe @ 2001-02-01  0:53 UTC (permalink / raw)
  To: David Edelsohn, Franz Sirl; +Cc: linuxppc-dev



> Franz> I would rather think this is a new kernel bug...

Franz is quite right: it is.

=====

It is a particularly nasty bug - because it is one of those where something
drops bombs in memory.

====

The code generated by gcc 2.95.3-t2 is fine - from asm through to vmlinux.

=====

However, every now and then - at run time - something trashes the location
which throws up the Illegal Instruction.

As I said - nasty - it could be *anything* and I'm not really sure how/where
to start looking...

Much though I hate to admit it (being a firm believer in low-tech printk
debugging) - this is one of those cases where an ICE (in-circuit emulator)
would really triumph...

ciao,
Iain.

* Re: network stack oops 2.4.1/gcc 2.95.3
@ 2001-01-30 15:45 Iain Sandoe
  0 siblings, 0 replies; 9+ messages in thread
From: Iain Sandoe @ 2001-01-30 15:45 UTC (permalink / raw)
  To: Franz Sirl; +Cc: David Edelsohn, linuxppc-dev


Hi Franz,

You are probably right - it's likely to be a red herring.

Mind you - an Oops is an Oops ;)

- the compiler issue may be a red herring, but the oops was definitely real.

On Tue, Jan 30, 2001,  Franz Sirl wrote:
> At least it would be nice if you could tell me the version of the RPM you
> were using before. I haven't added new patches since early December and
> most of my RPM patches are in test2 (even more in test3) now.

the 'fault' was with test2 (2.95.3-t2) + glibc 2.1.3-15g

the previous version (I have been using for some considerable time) was:

2.95.2-1f + glibc 2.1.3-4a

> Another datapoint:
> [fsirl@entropy:~]$ cat /proc/version
> Linux version 2.4.0 (trini@entropy.crashing.org) (gcc version 2.95.3
> 20010111 (prerelease/franzo/20010111)) #1 Sun Jan 14 15:10:21 MST 2001
> [fsirl@entropy:~]$ uptime
>    5:21am  up 11 days, 20:50,  2 users,  load average: 0.00, 0.00, 0.00

FWIW: once it's up, mine stays there as well... but I'm rebooting a lot at
the moment because of what I'm doing.  Also I'm using bk 2.4.1.

I have a _hunch_ that it is a failure/timeout in connecting to the DNS that
triggers the effect.

> But this machine has a fairly weak net connection, so maybe this is never
> triggered (define heavy network load?).

I have a combined ethernet bridge/NAT/ISDN modem - this occurred when it was
operating at full capacity downloading binary data (i.e. quite likely to
drop packets).

> I would rather think this is a new kernel bug...

yeah, probably... I assume program error first, compiler bug second...

It was just a little unusual... I'll try and find some more hard facts
(otherwise I guess we should forget it unless someone else sees something).

I haven't had time, yet, to look if any of the network code changed around
that time.

sorry if I wasted time here..
Iain.

* Re: network stack oops 2.4.1/gcc 2.95.3
@ 2001-01-29 19:12 Iain Sandoe
  2001-01-29 19:18 ` David Edelsohn
  0 siblings, 1 reply; 9+ messages in thread
From: Iain Sandoe @ 2001-01-29 19:12 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Franz Sirl, linuxppc-dev


>  Has there been any follow up to this problem report?  Is this a
> regression from gcc-2.95.2 behavior?  Franz's pre-releases do not exactly
> correspond to the GCC development pre-releases, so I have no idea when
> this problem began or whether it is local to the linuxppc branch.

It is not a regression test.  It would not be particularly easy, either to
put 2.95.2 back up and wind back the system to the build conditions...

but the version used was 2.95.3-test2. and the bk pull was 2.4.1-pre10.

The only reason I associated gcc at all (and copied this to Franz) was that
I don't recall ever seeing an Illegal Instruction oops before.

Unfortunately, (or fortunately depending on your POV) it is not reproducible
to order.  It seems to depend on network load - maybe one time in 20 with a
heavily loaded network?

The system boots OK.

The oops occurs in the transition from single-user mode to init 5 (where
portmap & named are launched in my set-up).

Iain.

* Re: network stack oops 2.4.1/gcc 2.95.3
@ 2001-01-29 19:05 David Edelsohn
  0 siblings, 0 replies; 9+ messages in thread
From: David Edelsohn @ 2001-01-29 19:05 UTC (permalink / raw)
  To: iain; +Cc: Franz Sirl, linuxppc-dev


	Has there been any follow up to this problem report?  Is this a
regression from gcc-2.95.2 behavior?  Franz's pre-releases do not exactly
correspond to the GCC development pre-releases, so I have no idea when
this problem began or whether it is local to the linuxppc branch.

David


* network stack oops 2.4.1/gcc 2.95.3
@ 2001-01-27 13:36 Iain Sandoe
  0 siblings, 0 replies; 9+ messages in thread
From: Iain Sandoe @ 2001-01-27 13:36 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Franz Sirl


Hi,

This is not a completely repeatable scenario.

Yesterday, when my network connection was very busy portmap & named timed
out.  Today they both oops'ed in the same place - system came up OK apart
from dead network stack.

Iain.

----

Linux version 2.4.1-pre10-iain1 (iain-s@athena)
(gcc version 2.95.3 20010111 (prerelease/franzo/20010111)) #1 Fri Jan 26

NOTE1:       ^ ^ ^ ^

Oops: Exception in kernel mode, sig: 4
NIP: C017AF80 XER: 20000000 LR: C017AF74 SP: CF52BD00 REGS: cf52bc50 TRAP: 0700

NOTE2: 0700 is Illegal Instruction.

MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK = cf52a000[227] 'portmap' Last syscall: 102
last math cf390000 last altivec 00000000
GPR00: 00000010 CF52BD00 CF52A000 00000038 00001032 00000000 CF54D458 00000000
GPR08: C099C000 00000001 0000000E C028DC60 22444849 1001E9E8 00000000 100B5390
GPR16: 100CEF50 00000000 00000000 00000000 00009032 0F52BE80 00000000 C00043F0
GPR24: C0004120 7FFFF738 10017948 00002260 CF52BD68 00000000 CF550CA8 CF52BD68
Call backtrace:
C017AF74 C0145E3C C0146D5C C0147578 C000417C 10017B28 00000709
0FF7F248 0FF7FAB0 10001980 0FED2734 00000000
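
A minimal sketch, not part of the original report, of how the MSR value in
the dump above (00089032) can be decoded. The bit masks are an assumption
based on the classic 32-bit PowerPC MSR/SRR1 layout, and the helper names
decode_msr and program_reasons are hypothetical. Trap 0700 is the program
exception in general; on these CPUs the saved MSR carries an extra reason bit
that separates an illegal instruction from a trap, privilege or FP cause.

```python
# Bit masks for the classic 32-bit PowerPC MSR (assumed from the
# architecture documentation, not taken from this thread).
MSR_BITS = {
    "EE": 0x8000,  # external interrupts enabled
    "PR": 0x4000,  # problem (user) state
    "FP": 0x2000,  # floating point available
    "ME": 0x1000,  # machine check enabled
    "IR": 0x0020,  # instruction address translation on
    "DR": 0x0010,  # data address translation on
}

# Reason bits saved alongside the MSR for a program exception (trap 0x700).
# Also an assumption from the same architecture documentation.
PROGRAM_REASONS = {
    0x00100000: "floating-point enabled exception",
    0x00080000: "illegal instruction",
    0x00040000: "privileged instruction",
    0x00020000: "trap instruction",
}


def decode_msr(msr):
    """Return each named MSR flag as 0 or 1."""
    return {name: int(bool(msr & mask)) for name, mask in MSR_BITS.items()}


def program_reasons(msr):
    """Return the program-exception reasons whose bits are set."""
    return [text for mask, text in PROGRAM_REASONS.items() if msr & mask]


if __name__ == "__main__":
    msr = 0x00089032  # value from the "MSR:" line of the oops
    print(decode_msr(msr))
    print(program_reasons(msr))
```

Under those assumptions the flags agree with the "EE: 1 PR: 0 FP: 0 ME: 1
IR/DR: 11" summary printed by the kernel, and the only reason bit set is the
illegal-instruction one.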

from System.map:

c017af38 T inet_recvmsg
c017af98 T inet_sendmsg

c0145de4 T sock_recvmsg
c0145ee8 t sock_lseek

c0146cbc T sys_recvfrom
c0146dc0 T sys_recv

c01473fc T sys_socketcall
c01475dc T sock_register
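
The lookup above can be sketched in a few lines. This is only an
illustration using the System.map excerpt quoted in this mail; the helper
names load_symbols and resolve are hypothetical. With a partial map, any
address above the last listed symbol would be attributed to that symbol, so
a real run should load the complete System.map for the kernel that oopsed.

```python
import bisect

# System.map excerpt quoted in the report (not the full map).
SYSTEM_MAP = """\
c017af38 T inet_recvmsg
c017af98 T inet_sendmsg
c0145de4 T sock_recvmsg
c0145ee8 t sock_lseek
c0146cbc T sys_recvfrom
c0146dc0 T sys_recv
c01473fc T sys_socketcall
c01475dc T sock_register
"""


def load_symbols(text):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address lies below every symbol we know about
    start, name = symbols[i]
    return f"{name}+0x{addr - start:x}"


if __name__ == "__main__":
    symbols = load_symbols(SYSTEM_MAP)
    # NIP followed by the first kernel addresses of the call backtrace.
    for addr in (0xC017AF80, 0xC017AF74, 0xC0145E3C, 0xC0146D5C, 0xC0147578):
        print(f"{addr:08X} {resolve(symbols, addr)}")
```

With the excerpt above, NIP C017AF80 falls inside inet_recvmsg, and the next
frames fall inside sock_recvmsg, sys_recvfrom and sys_socketcall.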

======================================================


Thread overview: 9+ messages
2001-01-29 23:54 network stack oops 2.4.1/gcc 2.95.3 Iain Sandoe
2001-01-30 12:39 ` Franz Sirl
2001-01-30 17:52   ` David Edelsohn
  -- strict thread matches above, loose matches on Subject: below --
2001-02-01  0:53 Iain Sandoe
2001-01-30 15:45 Iain Sandoe
2001-01-29 19:12 Iain Sandoe
2001-01-29 19:18 ` David Edelsohn
2001-01-29 19:05 David Edelsohn
2001-01-27 13:36 Iain Sandoe
