From: Tim Weippert <weiti@security.tds.de>
Cc: akpm@osdl.org, davej@codemonkey.org.uk, discuss@x86-64.org,
	cpufreq@lists.linux.org.uk, linux-kernel@vger.kernel.org,
	Daniel Drake <dsd@gentoo.org>
Subject: Re: Bad page state on AMD Opteron Dual System with kernel 2.6.13-rc6-git13
Date: Mon, 29 Aug 2005 12:28:31 +0200
Message-ID: <20050829102830.GA7604@pbkg4>
In-Reply-To: <20050829052454.GA8172@pbkg4>


Hi, 

On Mon, Aug 29, 2005 at 07:24:54AM +0200, Tim Weippert wrote:
> On Sun, Aug 28, 2005 at 01:20:51AM +0100, Daniel Drake wrote:

> > 
> > Seems to be an identical problem as was filed here:
> > 
> > 	http://bugs.gentoo.org/show_bug.cgi?id=103497
> > 
> > This bug report seems to suggest that the ondemand scaling governor may be 
> > at fault. Does your setup use this too?
> > 
> > (CC'ing some extra people to make sure problem is known)
> > 
> 
> As this is a server, I don't even use cpufreq on this machine, so I
> think this isn't the same problem ...

Update: with stable 2.6.13 I get nearly the same behavior.

One new oops:

swap_free: Bad swap file entry c000007fffff802f
swap_free: Bad swap file entry c800007fffff802f
swap_free: Bad swap file entry d000007fffff802f
swap_free: Bad swap file entry d800007fffff802f
swap_free: Bad swap file entry e000007fffff802f
swap_free: Bad swap file entry 4000000000000000
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "mm/rmap.c":493
invalid operand: 0000 [1] SMP 
CPU 1 
Modules linked in: autofs4 floppy i2c_amd756 i2c_core hw_random ohci_hcd tg3 tsdev evdev evbug psmouse genrtc unix
Pid: 9014, comm: sh Not tainted 2.6.13
RIP: 0010:[<ffffffff8016e9ab>] <ffffffff8016e9ab>{page_remove_rmap+43}
RSP: 0018:ffff8100481c3da0  EFLAGS: 00010286
RAX: 00000000ffffffff RBX: ffff81004a5fc420 RCX: ffff81000000d000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8100011a69c8
RBP: 0000000000484000 R08: 0000000000000001 R09: 000000000000000f
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000078bfbff
R13: ffff810040e133e0 R14: ffff8100011a69c8 R15: 0000000000000000
FS:  00000000457ff970(0000) GS:ffffffff8056f880(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaabd000 CR3: 0000000048205000 CR4: 00000000000006e0
Process sh (pid: 9014, threadinfo ffff8100481c2000, task ffff810048e7e270)
Stack: ffffffff801663f4 0000000000497000 ffff81004937f010 0000000000497000
       0000000000497000 0000000000496fff ffff8100497dd000 0000000000497000
       ffffffff801666ab 0000000000000000
Call Trace:<ffffffff801663f4>{zap_pte_range+436} <ffffffff801666ab>{unmap_page_range+507}
       <ffffffff80166815>{unmap_vmas+293} <ffffffff8016c4d2>{exit_mmap+162}
       <ffffffff801318b1>{mmput+49} <ffffffff80136d3a>{do_exit+442}
       <ffffffff801370c0>{sys_exit_group+0} <ffffffff8010db7a>{system_call+126}
       

Code: 0f 0b a3 b4 5b 3f 80 ff ff ff ff c2 ed 01 66 66 66 90 66 66 
RIP <ffffffff8016e9ab>{page_remove_rmap+43} RSP <ffff8100481c3da0>
 <1>Fixing recursive fault but reboot is needed!

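For anyone comparing this trace against other reports, here is a minimal sketch that pulls the frames out of the `<address>{symbol+offset}` notation used in the oops above. The helper name `parse_call_trace` is hypothetical, and treating the offset as decimal is an assumption (it is consistent with the addresses shown, since subtracting it yields 16-byte-aligned function starts).

```python
import re

# One frame in the oops above looks like <ffffffff801663f4>{zap_pte_range+436}.
# Assumption: the offset after '+' is decimal.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)\}")


def parse_call_trace(text):
    """Return (address, symbol, offset) tuples in the order they appear."""
    return [(int(addr, 16), sym, int(off))
            for addr, sym, off in FRAME_RE.findall(text)]


# Call trace copied from the oops, including the mail line wrapping.
TRACE = """\
Call Trace:<ffffffff801663f4>{zap_pte_range+436}
<ffffffff801666ab>{unmap_page_range+507}
       <ffffffff80166815>{unmap_vmas+293}
<ffffffff8016c4d2>{exit_mmap+162}
       <ffffffff801318b1>{mmput+49} <ffffffff80136d3a>{do_exit+442}
       <ffffffff801370c0>{sys_exit_group+0}
<ffffffff8010db7a>{system_call+126}
"""

frames = parse_call_trace(TRACE)
for addr, sym, off in frames:
    # addr - off is the presumed start address of the function.
    print(f"{sym:<18} +{off:<4} start={addr - off:#018x}")
```

Because the regular expression ignores whitespace and line breaks between frames, it works on the trace whether or not the mail client wrapped it.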

With this I get a hanging [sh] process that can't be killed; it can only
be cleared by rebooting:

www-data  7701  0.0  0.3 74448 6452 ?        S    11:56   0:00 /usr/sbin/cactid 0 93
www-data  7721  0.0  0.5 56296 10504 ?       S    11:56   0:00  \_ /usr/bin/php /usr/share/cacti/site/script_server.php cactid 0
www-data  9014  0.0  0.0     0    0 ?        D    11:56   0:00  \_ [sh]
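
The stuck process is the one whose STAT column shows `D` (uninterruptible sleep). As a rough sketch, rows like this can be picked out of a process listing automatically. The function name `uninterruptible` is hypothetical, and the column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) is an assumption based on the listing above; the exact `ps` invocation is not given in this mail.

```python
def uninterruptible(ps_lines):
    """Return (pid, command) for rows whose STAT column starts with 'D'."""
    hits = []
    for line in ps_lines:
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) == 11 and fields[1].isdigit() and fields[7].startswith("D"):
            hits.append((int(fields[1]), fields[10]))
    return hits


# Rows from the listing above, with the mail line wrapping undone.
PS_OUTPUT = [
    "www-data  7701  0.0  0.3 74448 6452 ?        S    11:56   0:00 /usr/sbin/cactid 0 93",
    "www-data  7721  0.0  0.5 56296 10504 ?       S    11:56   0:00  \\_ /usr/bin/php /usr/share/cacti/site/script_server.php cactid 0",
    "www-data  9014  0.0  0.0     0    0 ?        D    11:56   0:00  \\_ [sh]",
]

for pid, command in uninterruptible(PS_OUTPUT):
    print(pid, command)
```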


The machine is a Cacti system with generally high load ... it seems the
kernel only has problems under higher load.

HTH, 

    weiti

-- 

The punctuation and orthography of this email are freely invented.
Any correspondence with current or former rules
would be purely coincidental and is not intended.

Tim Weippert <weiti@topf-sicret.org>
http://www.topf-sicret.org/
