linuxppc-dev.lists.ozlabs.org archive mirror
* swap_dup: Bad swap file entry 00480020
@ 2005-07-21 15:29 bogdan antonovici
  2005-07-21 17:59 ` Dan Malek
  0 siblings, 1 reply; 10+ messages in thread
From: bogdan antonovici @ 2005-07-21 15:29 UTC (permalink / raw)
  To: linuxppc-embedded; +Cc: linuxppc-dev, ppckernel

Hi,

I have an MPC860 system, 16M flash, 16M SDRAM, no disk, just kernel
support for ramdisk. root fs is nfs mounted.
I get messages like the ones below:

swap_dup: Bad swap file entry 00480020
VM: killing process chat
swap_free: Bad swap file entry 00480020

and after a while I get an oops:

Oops: kernel access of bad area, sig: 11
NIP: C0045154 XER: 20000000 LR: C0045464 SP: C0CBBE60 REGS: c0cbbdb0
TRAP: 0300    Not tainted
MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 00E9A01C, DSISR: C0000000
TASK = c0cba000[36] 'pppd' Last syscall: 142
last math 00000000 last altivec 00000000
GPR00: C0045464 C0CBBE60 C0CBA000 C0CBBE88 C01DE8C0 00000000 C010E2B8 0000001F
GPR08: C0CBA000 00000000 C0C4E03C 00000001 84444448 100B0C5C 10040000 10040000
GPR16: 10040000 10040000 10040000 10040000 C0CBBEF8 00000000 00000001 7FFFFFFF
GPR24: 00000004 00000001 00000145 C0CBBED8 00000004 C0E9A000 C0E9A008 00E9A008
Call backtrace:
C0087D84 C0045464 C0045858 C0009448 C00041DC 1001DB84 10006F24 10006A8C 0FE4BDA0 00000000

At the time of the swap messages I was running a proprietary driver, my
application, and a few daemons.

ps
  PID  Uid     VmSize Stat Command
    1 root        584 S   init
    2 root            SW  [keventd]
    3 root            SWN [ksoftirqd_CPU0]
    4 root            SW  [kswapd]
    5 root            SW  [bdflush]
    6 root            SW  [kupdated]
    7 root            SW  [rpciod]
    9 root        760 S   -ash
   21 root        616 S   sectionmond
   22 root        608 S   sectionmond
   23 root        608 S   sectionmond
   24 root        608 S   sectionmond
   25 root        612 S   sectionmond
   26 root        608 S   sectionmond
   30 root       1304 S   /ppcnetsnmp/sbin/snmpd
   39 root        916 S   pppd /dev/ttyS1 19200 defaultroute demand idle 5 192.
  109 root        680 R   ps


I looked on the net for some clues, but it's quite confusing. I noticed
many emails about swap_dup/swap_free error messages, but I couldn't figure
out what I should search for.
I tested the RAM with mtest from pppboot2.0.


Does any of you have an idea what this is all about?

Thank you
Bogdan


* Re: swap_dup: Bad swap file entry 00480020
  2005-07-21 15:29 swap_dup: Bad swap file entry 00480020 bogdan antonovici
@ 2005-07-21 17:59 ` Dan Malek
  2005-07-21 18:14   ` Bogdan Antonovici
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Dan Malek @ 2005-07-21 17:59 UTC (permalink / raw)
  To: bogdan antonovici; +Cc: linuxppc-dev, ppckernel, linuxppc-embedded


On Jul 21, 2005, at 11:29 AM, bogdan antonovici wrote:

> At the time of swap messages i was running a proprietary driver, my
> application and few daemons.

Looks like your driver may have written over some of the page
tables in the kernel space.

> I look on the net for some clues but it's quite confusing, i noticed
> many emails on swap_dup/swap_free error messages but i couldn't figure
> out what should i search for.

Those messages are likely due to a bug with swapping to disk
that has been in some 2.4 kernels, but I don't believe that is
the case here, since you don't have a disk or swapping enabled.


	-- Dan


* Re: swap_dup: Bad swap file entry 00480020
  2005-07-21 17:59 ` Dan Malek
@ 2005-07-21 18:14   ` Bogdan Antonovici
  2005-07-22 15:46   ` bogdan antonovici
  2005-08-05 19:12   ` Bogdan Antonovici
  2 siblings, 0 replies; 10+ messages in thread
From: Bogdan Antonovici @ 2005-07-21 18:14 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-dev, ppckernel, linuxppc-embedded

I wasn't so worried about the driver because the same driver worked with a different application without producing these kinds of messages or oopses.
Dan, from your answer I understand that the swap code discovers corruption in the page tables, but why is the swap code run when swap wasn't activated?
I will have a look at the driver code.

Bogdan

* Re: swap_dup: Bad swap file entry 00480020
  2005-07-22 15:46   ` bogdan antonovici
@ 2005-07-22 12:57     ` Marcelo Tosatti
  2005-07-25 13:16       ` Bogdan Antonovici
  0 siblings, 1 reply; 10+ messages in thread
From: Marcelo Tosatti @ 2005-07-22 12:57 UTC (permalink / raw)
  To: bogdan antonovici; +Cc: linuxppc-dev, linuxppc-embedded, ppckernel

On Fri, Jul 22, 2005 at 10:46:38AM -0500, bogdan antonovici wrote:
> Hi Dan,
> 
> I checked the driver code. I found a pointer that was in my opinion
> initialized too late and i corrected that but other than that i haven't
> found anything.
> I ran the driver alone, enabling the interrupts and the interrupt
> routine doesn't cause any trouble.
> I started my application and i haven't seen any sign of trouble.
> But once i started also the snmpd after few interrupts i got the
> message:
> 
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> VM: killing process sectionmond

That's a different problem: you ran out of memory and the VM can't swap
out any data.

So it's likely that the pagetable corruption is gone (it was indeed a bug
in the driver as Dan suspected).

> sectionmond being my application.
> My read and write driver operation are requesting a page for a buffer
> but they also release it. Should i declare the buffer pointer with
> volatile attribute?
> Do you know what may cause that message?

Out of memory condition. 
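
For anyone decoding the message by hand: "0-order" means a single page
was requested, and the gfp value is a bitmask of allocation flags. Below
is a minimal Python sketch that splits 0x1d2 into flag names. The flag
values in the table are an assumption, written from memory of a
2.4-series include/linux/mm.h; they changed between kernel versions, so
check them against the tree you are actually running. The helper name
decode_gfp is hypothetical.

```python
# ASSUMPTION: bit values as recalled from a 2.4-series include/linux/mm.h.
# Verify against your own kernel tree before relying on them.
GFP_FLAGS_24 = {
    0x001: "__GFP_DMA",
    0x002: "__GFP_HIGHMEM",
    0x010: "__GFP_WAIT",
    0x020: "__GFP_HIGH",
    0x040: "__GFP_IO",
    0x080: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def decode_gfp(mask):
    """Return (list of flag names set in mask, leftover bits not in the table)."""
    names = [name for bit, name in sorted(GFP_FLAGS_24.items()) if mask & bit]
    known = sum(GFP_FLAGS_24)          # OR of all known bits (keys are disjoint)
    unknown = mask & ~known
    return names, unknown


if __name__ == "__main__":
    names, unknown = decode_gfp(0x1D2)
    print(" | ".join(names), "unknown bits: 0x%x" % unknown)
```

If those values match your tree, 0x1d2 carries __GFP_WAIT and
__GFP_HIGHMEM but not __GFP_HIGH, which would suggest an ordinary
sleeping allocation of a user page rather than an atomic request made
from the driver's interrupt routine.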


* Re: swap_dup: Bad swap file entry 00480020
  2005-07-21 17:59 ` Dan Malek
  2005-07-21 18:14   ` Bogdan Antonovici
@ 2005-07-22 15:46   ` bogdan antonovici
  2005-07-22 12:57     ` Marcelo Tosatti
  2005-08-05 19:12   ` Bogdan Antonovici
  2 siblings, 1 reply; 10+ messages in thread
From: bogdan antonovici @ 2005-07-22 15:46 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-dev, ppckernel, linuxppc-embedded

Hi Dan,

I checked the driver code. I found a pointer that was, in my opinion,
initialized too late, and I corrected that, but other than that I haven't
found anything.
I ran the driver alone with interrupts enabled, and the interrupt
routine doesn't cause any trouble.
I started my application and I haven't seen any sign of trouble.
But once I also started snmpd, after a few interrupts I got the
message:

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process sectionmond

sectionmond being my application.
My driver's read and write operations each request a page for a buffer,
but they also release it. Should I declare the buffer pointer with the
volatile attribute?
Do you know what may cause that message?
Thanks
Bogdan




* Re: swap_dup: Bad swap file entry 00480020
  2005-07-22 12:57     ` Marcelo Tosatti
@ 2005-07-25 13:16       ` Bogdan Antonovici
  2005-07-25 17:36         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 10+ messages in thread
From: Bogdan Antonovici @ 2005-07-25 13:16 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linuxppc-dev, linuxppc-embedded, ppckernel

Yes, it was. After some research on the previous message I realized that it ran out of memory, so I set mem=8M and it hasn't crashed since then.
I would still like to understand why that swap code was run when swapping wasn't activated at all.
Thank you.
Bogdan

* Re: swap_dup: Bad swap file entry 00480020
  2005-07-25 13:16       ` Bogdan Antonovici
@ 2005-07-25 17:36         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2005-07-25 17:36 UTC (permalink / raw)
  To: Bogdan Antonovici; +Cc: linuxppc-dev, ppckernel, linuxppc-embedded

On Mon, 2005-07-25 at 08:16 -0500, Bogdan Antonovici wrote:
> Yes, it was. After some research on the previous message i realized
> that it run out of memory so i did mem=8M and it hasn't been crashed
> since then.
> I still would like to understand why that swap code was run when the
> swapping wasn't activated at all.

The "swap" code also kicks in for things like mmap of a file; it is
the same code path.
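
One way to picture the shared path: when the kernel walks a page table
(for example while copying or tearing down mappings), an entry that is
non-zero but not marked present is taken to be a swap entry, whether or
not any swap is configured. A minimal Python sketch of that decision
follows. The bit layout (present bit 0x001, type in bits 1-6, offset
from bit 8 up) is an illustrative assumption modelled on the 2.4 macros
as I remember them, not the exact 8xx definitions, and the helper name
classify_pte is hypothetical.

```python
# ASSUMPTION: illustrative PTE layout, not the exact MPC8xx definitions.
PAGE_PRESENT = 0x001   # assumed position of the "present" bit


def classify_pte(pte, nr_swapfiles=0):
    """Classify a raw page-table entry the way a 2.4-style walker might."""
    if pte == 0:
        return "none"                      # empty slot, nothing mapped
    if pte & PAGE_PRESENT:
        return "present"                   # normal mapping to a page frame
    # Non-zero and not present: interpreted as a swap entry.
    swap_type = (pte >> 1) & 0x3F          # assumed: which swap area
    swap_offset = pte >> 8                 # assumed: slot within that area
    detail = "(type=%d, offset=0x%x)" % (swap_type, swap_offset)
    if swap_type >= nr_swapfiles:
        return "bad swap entry " + detail  # no such swap area is active
    return "swap entry " + detail


if __name__ == "__main__":
    print(classify_pte(0x00480020))
```

With no swap area active, nr_swapfiles is 0, so any stray non-present
value such as 00480020 fails the type check. That would be consistent
with a driver scribbling over a page table, as Dan suggested.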

Ben.


* Re: swap_dup: Bad swap file entry 00480020
  2005-07-21 17:59 ` Dan Malek
  2005-07-21 18:14   ` Bogdan Antonovici
  2005-07-22 15:46   ` bogdan antonovici
@ 2005-08-05 19:12   ` Bogdan Antonovici
  2005-08-05 19:29     ` Dan Malek
  2005-08-05 21:12     ` Geoff Levand
  2 siblings, 2 replies; 10+ messages in thread
From: Bogdan Antonovici @ 2005-08-05 19:12 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-dev, ppckernel, linuxppc-embedded

Dan,

It hasn't crashed since I corrected that pointer initialization in the driver. You were right.
Thank you.
Bogdan

* Re: swap_dup: Bad swap file entry 00480020
  2005-08-05 19:12   ` Bogdan Antonovici
@ 2005-08-05 19:29     ` Dan Malek
  2005-08-05 21:12     ` Geoff Levand
  1 sibling, 0 replies; 10+ messages in thread
From: Dan Malek @ 2005-08-05 19:29 UTC (permalink / raw)
  To: Bogdan Antonovici; +Cc: linuxppc-dev, linuxppc-embedded, ppckernel


On Aug 5, 2005, at 3:12 PM, Bogdan Antonovici wrote:

> It hasn't crashed since i corrected that pointer initialization in 
> driver. You were right.
> Thank you.

Glad to hear it.  I wasn't looking forward to debugging such a problem
in the VM subsystem at this time :-)

Have fun!


	-- Dan


* Re: swap_dup: Bad swap file entry 00480020
  2005-08-05 19:12   ` Bogdan Antonovici
  2005-08-05 19:29     ` Dan Malek
@ 2005-08-05 21:12     ` Geoff Levand
  1 sibling, 0 replies; 10+ messages in thread
From: Geoff Levand @ 2005-08-05 21:12 UTC (permalink / raw)
  To: Bogdan Antonovici; +Cc: linuxppc-dev, linuxppc-embedded, ppckernel

BTW, I posted a fix for a bug in the 2.6 page table attribute
settings for PPC440 that causes corruption when swapping.  I
don't know if it is the same for 2.4.

http://patchwork.ozlabs.org/linuxppc/patch?id=1458

-Geoff

