From: ChristopherHuhn <c.huhn@gsi.de>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: linux-smp <linux-smp@vger.kernel.org>,
	linux-kernel@vger.kernel.org, Walter Schoen <w.schoen@GSI.de>,
	support-gsi@credativ.de
Subject: Re: Kernel Bug at spinlock.h ?!
Date: Mon, 10 Mar 2003 09:52:04 +0100
Message-ID: <3E6C5234.8090505@GSI.de>
In-Reply-To: <Pine.LNX.4.50.0303071043580.18716-100000@montezuma.mastecende.com>

Zwane Mwaikambo wrote:

>On Thu, 6 Mar 2003, ChristopherHuhn wrote:
>
>>Hi again,
>>
>>>It looks like a possible race with rpc_execute and possibly the timer,
>>>although i can't be certain where the other cpus are. Do the other oopses
>>>look somewhat similar? Could you supply them?
>>>
>>below are some oopses I gathered yesterday and today, all on different
>>machines.
>>I'd like to remark that we experience massive NFS problems at the moment
>>that seem to be caused by our mixed potato 2.2 / woody 2.4 environment,
>>i.e. linking apps on a woody system with the sources mounted via NFS
>>from a potato box leads to obscure I/O failures like "no space left on
>>device" (this never happens with woody only). So this might be a clue
>>here as well.
>>
>>The oopses are all written down from the screen; I hopefully made few
>>"transmission" errors.
>>
>
>Some of these are a bit worrying seeing as they are bit flips. Also, they
>all appear to come from a UP machine(?), which would change things with
>respect to my previous comment about races. Regarding the weird I/O failures,
>are you mounting with the 'soft' option?
>
>	Zwane
>
The machines are all DP Xeons. Our SP machines run the same kernel, but
these oopses only occur on DP machines under heavy load.
The machines are recognized as SMP:
# uname -a
Linux lxb000 2.4.20 #2 SMP Tue Dec 17 10:43:29 CET 2002 i686 unknown

but the E7500 chipset does not seem to be fully supported:

Jan 27 15:26:34 lxb000 kernel: found SMP MP-table at 000f6710
Jan 27 15:26:34 lxb000 kernel: hm, page 000f6000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000f7000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 0009f000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000a0000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: On node 0 totalpages: 262016
Jan 27 15:26:34 lxb000 kernel: zone(0): 4096 pages.
Jan 27 15:26:34 lxb000 kernel: zone(1): 225280 pages.
Jan 27 15:26:34 lxb000 kernel: zone(2): 32640 pages.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: System description tables not found
Jan 27 15:26:34 lxb000 kernel: Intel MultiProcessor Specification v1.4
Jan 27 15:26:34 lxb000 kernel:     Virtual Wire compatibility mode.
Jan 27 15:26:34 lxb000 kernel: OEM ID:   Product ID: Kings Canyon APIC at: 0xFEE00000
Jan 27 15:26:34 lxb000 kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #6 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #7 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #3 Version 32 at 0xFEC80000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #4 Version 32 at 0xFEC80400.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #5 Version 32 at 0xFEC81000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #8 Version 32 at 0xFEC81400.
Jan 27 15:26:34 lxb000 kernel: Processors: 4
...
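
A boot log like the one above can be cross-checked with a small script. What follows is only a minimal Python sketch, assuming the syslog line format shown here; the function name summarize_cpus and the embedded sample are illustrative and not part of any kernel tooling. It collects the APIC IDs from the "Processor #N" lines and compares their count with the "Processors:" summary line.

```python
import re

# Sample lines taken from the boot log in this message (wrapped lines joined).
BOOT_LOG = """\
Jan 27 15:26:34 lxb000 kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #6 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #7 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processors: 4
"""


def summarize_cpus(log_text):
    """Return (APIC IDs listed in the log, reported processor count or None)."""
    # "Processor #N" lines: one per CPU found in the MP table.
    apic_ids = [int(n) for n in re.findall(r"kernel: Processor #(\d+) ", log_text)]
    # "Processors: N" line: the total the kernel reports.
    total = re.search(r"kernel: Processors: (\d+)", log_text)
    return apic_ids, (int(total.group(1)) if total else None)


apic_ids, reported = summarize_cpus(BOOT_LOG)
print("APIC IDs:", apic_ids)
print("reported:", reported, "consistent:", reported == len(apic_ids))
```

Four logical processors on a dual-processor board would be consistent with Hyper-Threading being enabled, though the log excerpt itself does not say so.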

There might be (are) severe flaws in our NFS configuration and network 
performance, but that should not crash the box, should it?

BTW: I just received a link to a bug report, including a fix, that sounds
similar to our problem: http://marc.theaimsgroup.com/?l=linux-nfs&m=104716581307294&w=2

With kind regards,

Christopher


