From: ChristopherHuhn <c.huhn@gsi.de>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: linux-smp <linux-smp@vger.kernel.org>,
	linux-kernel@vger.kernel.org, Walter Schoen <w.schoen@GSI.de>,
	support-gsi@credativ.de
Subject: Re: Kernel Bug at spinlock.h ?!
Date: Mon, 10 Mar 2003 09:52:04 +0100
Message-ID: <3E6C5234.8090505@GSI.de>
In-Reply-To: <Pine.LNX.4.50.0303071043580.18716-100000@montezuma.mastecende.com>

Zwane Mwaikambo wrote:

>On Thu, 6 Mar 2003, ChristopherHuhn wrote:
>
>>Hi again,
>>
>>>It looks like a possible race with rpc_execute and possibly the timer, 
>>>although i can't be certain where the other cpus are. Do the other oopses 
>>>look somewhat similar? Could you supply them?
>>
>>below are some oopses I gathered yesterday and today, all on different 
>>machines.
>>I'd like to remark that we experience massive NFS problems at the moment 
>>that seem to be caused by our mixed potato 2.2/ woody 2.4 environment, 
>>i.e. linking apps on a woody system with the sources mounted via nfs 
>>from a potato box leads to obscure IO failures like "no space left on 
>>device" (This never happens with woody only). So this might be a clue 
>>here as well.
>>
>>The oopses are all written down from the screen, I hopefully made little 
>>"transmission" errors.
>
>Some of these are a bit worrying seeing as they are bit flips, also they 
>all appear to come from a UP machine(?) this would change things with 
>respect to my previous comment about races. Regarding weird io failures 
>are you mounting with the 'soft' option?
>
>	Zwane

The machines are all DP Xeons. Our SP machines run the same kernel, but 
these oopses only occur on DP machines under heavy load.
The machines are recognized as SMP:
# uname -a
Linux lxb000 2.4.20 #2 SMP Tue Dec 17 10:43:29 CET 2002 i686 unknown

but the e7500 chipset seems not to be supported 100%:

Jan 27 15:26:34 lxb000 kernel: found SMP MP-table at 000f6710
Jan 27 15:26:34 lxb000 kernel: hm, page 000f6000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000f7000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 0009f000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: hm, page 000a0000 reserved twice.
Jan 27 15:26:34 lxb000 kernel: On node 0 totalpages: 262016
Jan 27 15:26:34 lxb000 kernel: zone(0): 4096 pages.
Jan 27 15:26:34 lxb000 kernel: zone(1): 225280 pages.
Jan 27 15:26:34 lxb000 kernel: zone(2): 32640 pages.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: Searched entire block, no RSDP was found.
Jan 27 15:26:34 lxb000 kernel: ACPI: System description tables not found
Jan 27 15:26:34 lxb000 kernel: Intel MultiProcessor Specification v1.4
Jan 27 15:26:34 lxb000 kernel:     Virtual Wire compatibility mode.
Jan 27 15:26:34 lxb000 kernel: OEM ID:   Product ID: Kings Canyon APIC at: 0xFEE00000
Jan 27 15:26:34 lxb000 kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #6 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: Processor #7 Pentium 4(tm) XEON(tm) APIC version 20
Jan 27 15:26:34 lxb000 kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #3 Version 32 at 0xFEC80000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #4 Version 32 at 0xFEC80400.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #5 Version 32 at 0xFEC81000.
Jan 27 15:26:34 lxb000 kernel: I/O APIC #8 Version 32 at 0xFEC81400.
Jan 27 15:26:34 lxb000 kernel: Processors: 4
...

There might be (are) severe flaws in our NFS configuration and network 
performance, but that should not crash the box, should it?

BTW: I just received a link to a bug report, including a fix, that sounds 
similar to our problem: http://marc.theaimsgroup.com/?l=linux-nfs&m=104716581307294&w=2

With kind regards,

Christopher



