* 2.4.24 Paging Fault, Cache tries to swap with no swap partition
@ 2004-02-14 17:06 Ross Dickson
2004-02-14 20:33 ` vda
0 siblings, 1 reply; 5+ messages in thread
From: Ross Dickson @ 2004-02-14 17:06 UTC (permalink / raw)
To: linux-kernel
Greetings,
I have an imaging system writing files to removable hard drives.
It boots from Compact Flash and runs from ram drives, so I usually have no
swap partition or file.
Recently I upgraded the kernel from 2.4.20 to 2.4.24.
System has "mem=460M" (512M ram fitted) and starts with about
400M free. After recording for a while the Cached ram acquires all
but about 4Mb MemFree.
On a hot 38C day it started Oopsing with paging faults. It runs the
same 2 programs all day, gathering and compressing images.
Sorry, I have no detail on the Oops at the moment; the computer is in a vehicle
and does not normally have a screen. From memory, it couldn't allocate a virtual
page.
I found if I put in a 16Mb ram drive as swap then it would grab
roughly 1.4Mb of it on occasion and keep it until recording stopped
for a while. SwapCached is either 0Kb or 1024Kb, not anything else.
Is this behaviour expected - to require a swap file?
Can the paging cache be tuned in /proc or somewhere to prevent it being so
greedy as to want more memory than the machine has?
Is the quickest fix to give it more ram? I read in another posting that with
more than 512Mb the cache won't grab any more.
Regards
Ross.
* Re: 2.4.24 Paging Fault, Cache tries to swap with no swap partition
2004-02-14 17:06 2.4.24 Paging Fault, Cache tries to swap with no swap partition Ross Dickson
@ 2004-02-14 20:33 ` vda
2004-02-15 13:49 ` Ross Dickson
0 siblings, 1 reply; 5+ messages in thread
From: vda @ 2004-02-14 20:33 UTC (permalink / raw)
To: ross, linux-kernel
On Saturday 14 February 2004 19:06, Ross Dickson wrote:
> I have an imaging system writing files to removable hard drives.
> Compact Flash boot with ram drives so I usually have no swap partition or
> file.
>
> Recently I upgraded kernel from 2.4.20 to 2.4.24.
>
> System has "mem=460M" (512M ram fitted) and starts with about
> 400M free. After recording for a while the Cached ram acquires all
> but about 4Mb MemFree.
>
> On a hot 38C day it started Oops'ing re paging memory. It runs the
Too vague.
Do you have any logging? At least a circular buffer? Anything?
> same 2 programs all day gathering and compressing images.
> Sorry I have no detail on the Oops at the moment, computer is in a vehicle
> and does not normally have a screen. From memory it couldn't allocate a
> virtual page.
>
> I found if I put in a 16Mb ram drive as swap then it would grab
> roughly 1.4Mb of it on occasion and keep it until recording stopped
> for a while. SwapCached is either 0Kb or 1024Kb, not anything else.
If swap is active, some of it may be used even when box is not
heavily loaded. That's normal.
> Is this behaviour expected - to require a swap file?
No.
> Can the paging cache be tuned in /proc or somewhere to prevent it being so
> greedy as to want more memory than the machine has?
Maybe. But you should concentrate on finding where exactly it oopsed.
> Is the quickest fix to give it more ram. I read on another posting that
> with greater than 512Mb the cache won't grab any more?
Please don't succumb to 'add more RAM' syndrome. 460 megs should be fine
for you. I'd say better find the root of the problem.
--
vda
* Re: 2.4.24 Paging Fault, Cache tries to swap with no swap partition
2004-02-14 20:33 ` vda
@ 2004-02-15 13:49 ` Ross Dickson
2004-02-16 10:25 ` Ross Dickson
0 siblings, 1 reply; 5+ messages in thread
From: Ross Dickson @ 2004-02-15 13:49 UTC (permalink / raw)
To: linux-kernel; +Cc: vda
On Sunday 15 February 2004 06:33, you wrote:
> On Saturday 14 February 2004 19:06, Ross Dickson wrote:
> > I have an imaging system writing files to removable hard drives.
> > Compact Flash boot with ram drives so I usually have no swap partition or
> > file.
> >
> > Recently I upgraded kernel from 2.4.20 to 2.4.24.
> >
> > System has "mem=460M" (512M ram fitted) and starts with about
> > 400M free. After recording for a while the Cached ram acquires all
> > but about 4Mb MemFree.
> >
> > On a hot 38C day it started Oops'ing re paging memory. It runs the
>
> Too vague.
> Do you have any logging? At least a circular buffer? Anything?
Unfortunately not much, I was hot, tired, not at my best - should have
grabbed it when I had the chance. All I grabbed was a partial code string
at the bottom of the Oops which I doubt is of any benefit without the rest.
8b 5f 04 8d 77 08 83 eb 18 8b
Oh yeah, it killed init too.
The system defaults to not logging to permanent storage: the flash would die
over time from the writes, and the info would be meaningless to the customer on
their removable hard drive. Note to self - must change that - I'm sure the
customer could spare some space.
I assumed I could reproduce the fault today, but it wouldn't fault.
First misbehaved Friday 13th, reproduced Sat 14th.
Self cured??? Sunday 15th. Doh!!!
>
> > same 2 programs all day gathering and compressing images.
> > Sorry I have no detail on the Oops at the moment, computer is in a vehicle
> > and does not normally have a screen. From memory it couldn't allocate a
> > virtual page.
> >
> > I found if I put in a 16Mb ram drive as swap then it would grab
> > roughly 1.4Mb of it on occasion and keep it until recording stopped
> > for a while. SwapCached is either 0Kb or 1024Kb, not anything else.
>
> If swap is active, some of it may be used even when box is not
> heavily loaded. That's normal.
>
> > Is this behaviour expected - to require a swap file?
>
> No.
>
> > Can the paging cache be tuned in /proc or somewhere to prevent it being so
> > greedy as to want more memory than the machine has?
>
> Maybe. But you should concentrate on finding where exactly it oopsed.
I note MemFree stabilises at around 4Mb when running OK. Given it only wanted
an extra 1Mb of swap, can I write something to /proc/sys/vm/????? to force
it to stabilise at around 10Mb or 20Mb? Otherwise, can I change a constant
and recompile the kernel to achieve the same? It might give more headroom when
the event occurs.
>
> > Is the quickest fix to give it more ram. I read on another posting that
> > with greater than 512Mb the cache won't grab any more?
>
> Please don't succumb to 'add more RAM' syndrome. 460 megs should be fine
> for you. I'd say better find the root of the problem.
I admit it: I have since tried the 'add more RAM' approach, but a couple of the
capture card devices did not like more than about 800Mb, so I pulled the stick
back out. It ran quite well in 256Mb with the old kernel for about a year, so
it is puzzling.
> vda
>
Thanks for the response
Regards
Ross.
* Re: 2.4.24 Paging Fault, Cache tries to swap with no swap partition
2004-02-15 13:49 ` Ross Dickson
@ 2004-02-16 10:25 ` Ross Dickson
2004-02-16 13:45 ` 2.4.24 Paging Fault, Source Line located in slab.c, kmem_cache_reap() Ross Dickson
0 siblings, 1 reply; 5+ messages in thread
From: Ross Dickson @ 2004-02-16 10:25 UTC (permalink / raw)
To: linux-kernel; +Cc: vda
On Sunday 15 February 2004 23:49, Ross Dickson wrote:
> On Sunday 15 February 2004 06:33, you wrote:
> > On Saturday 14 February 2004 19:06, Ross Dickson wrote:
> > > I have an imaging system writing files to removable hard drives.
> > > Compact Flash boot with ram drives so I usually have no swap partition or
> > > file.
> > >
> > > Recently I upgraded kernel from 2.4.20 to 2.4.24.
The board is a KM18G Pro (nforce2, dual memory mode) with an AMD 2400XP. The
kernel is patched with Preempt, Low latency, 64Bit jiffies and 1000Hz.
I found some articles about memory overcommitment, checked the source and saw
strict overcommit in use for arm systems (no swap), so this time I thought I
would try
echo 1 > /proc/sys/vm/overcommit_memory
I got another oops under equivalent circumstances to earlier (no swap).
I ran the oops through ksymoops on another machine with the same kernel; results
are below.
At this point I think the trigger may be a slow (bad blocks?) 80Gb hard drive that
the files are being written to. The PCI bus is quite busy with imaging from 3
cameras on two capture cards (bttv and meteor II mc).
> > > Can the paging cache be tuned in /proc or somewhere to prevent it being so
> > > greedy as to want more memory than the machine has?
> >
> > Maybe. But you should concentrate on finding where exactly it oopsed.
...snip...
> ......I added a 16mb ram drive swap (see earlier posting)
> I note memfree stabilises at around 4Mb when running OK, given it only wanted
> an extra 1Mb cache swap, can I cat something to /proc/sys/vm/????? to force
> it to stabilise at around 10Mb or 20Mb? Otherwise can I change a constant
> and recompile kernel to achieve same? It might help give more headroom when
> the event occurs.
>
> >
> > > Is the quickest fix to give it more ram. I read on another posting that
> > > with greater than 512Mb the cache won't grab any more?
> >
> > Please don't succumb to 'add more RAM' syndrome. 460 megs should be fine
> > for you. I'd say better find the root of the problem.
...snip...
> > vda
> >
Thanks for the response
Regards
Ross.
The system was running with "mem=450" when this Oops occurred.
ksymoops 2.4.8 on i686 2.4.24-rd. Options used
-V (default)
-k /proc/ksyms (default)
-L (specified)
-o /lib/modules/2.4.24-rd/ (default)
-m /boot/System.map (specified)
Unable to handle kernel paging request at virtual address 6a65656a
c0133f20
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0133f20>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010883
eax: 6a65656a ebx: 000000f5 ecx: c14dfdd4 edx: c14dfde4
esi: 00000000 edi: 00000008 ebp: c14dfe40 esp: dc16df38
ds: 0018 es: 0018 ss: 0018
Process kswapd (pid: 4, stackpage=dc16d000)
Stack: 000001d0 00000001 c14dfdd4 00000000 00000005 00000005 00000020 000001d0
c032c4d8 c032c4d8 c0134ebc dc16df84 000001d0 0000003c 00000020 c0134f58
dc16df84 dc16c000 00000000 00000000 c032c4d8 dc16c000 c032c400 00000000
Call Trace: [<c0134ebc>] [<c0134f58>] [<c01350f6>] [<c0135169>] [<c013529d>]
[<c0135210>] [<c0105000>] [<c01057db>] [<c0135210>]
Code: 8b 00 43 39 d0 75 f9 8b 44 24 08 89 da 8b 70 24 8b 40 44 89
>>EIP; c0133f20 <kmem_cache_reap+80/1f0> <=====
>>ecx; c14dfdd4 <_end+1115228/1e4db4b4>
>>edx; c14dfde4 <_end+1115238/1e4db4b4>
>>ebp; c14dfe40 <_end+1115294/1e4db4b4>
>>esp; dc16df38 <_end+1bda338c/1e4db4b4>
Trace; c0134ebc <shrink_caches+1c/60>
Trace; c0134f58 <try_to_free_pages_zone+58/e0>
Trace; c01350f6 <kswapd_balance_pgdat+56/b0>
Trace; c0135169 <kswapd_balance+19/30>
Trace; c013529d <kswapd+8d/b0>
Trace; c0135210 <kswapd+0/b0>
Trace; c0105000 <_stext+0/0>
Trace; c01057db <arch_kernel_thread+2b/40>
Trace; c0135210 <kswapd+0/b0>
Code; c0133f20 <kmem_cache_reap+80/1f0>
00000000 <_EIP>:
Code; c0133f20 <kmem_cache_reap+80/1f0> <=====
0: 8b 00 mov (%eax),%eax <=====
Code; c0133f22 <kmem_cache_reap+82/1f0>
2: 43 inc %ebx
Code; c0133f23 <kmem_cache_reap+83/1f0>
3: 39 d0 cmp %edx,%eax
Code; c0133f25 <kmem_cache_reap+85/1f0>
5: 75 f9 jne 0 <_EIP>
Code; c0133f27 <kmem_cache_reap+87/1f0>
7: 8b 44 24 08 mov 0x8(%esp,1),%eax
Code; c0133f2b <kmem_cache_reap+8b/1f0>
b: 89 da mov %ebx,%edx
Code; c0133f2d <kmem_cache_reap+8d/1f0>
d: 8b 70 24 mov 0x24(%eax),%esi
Code; c0133f30 <kmem_cache_reap+90/1f0>
10: 8b 40 44 mov 0x44(%eax),%eax
Code; c0133f33 <kmem_cache_reap+93/1f0>
13: 89 00 mov %eax,(%eax)
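One detail of the register dump above may be worth a second look: the faulting address in eax, 6a65656a, is made up entirely of byte values in the printable ASCII range, which kernel pointers such as c14dfdd4 are not. The sketch below is only an illustration (the helper name `bytes_as_ascii` is hypothetical, and it assumes little-endian byte order as on i386); it does not prove where the value came from.

```c
#include <stdint.h>

/* Split a 32-bit value into its four bytes, lowest address first
 * (little-endian, as on i386), and render them as text in out[0..3].
 * Bytes outside the printable ASCII range 0x20..0x7e are shown as '.'.
 * Returns how many of the four bytes were printable. */
static int bytes_as_ascii(uint32_t value, char out[5])
{
    int printable = 0;

    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xff);

        if (b >= 0x20 && b <= 0x7e) {
            out[i] = (char)b;
            printable++;
        } else {
            out[i] = '.';
        }
    }
    out[4] = '\0';
    return printable;
}
```

Called with 0x6a65656a this yields the text "jeej" with all four bytes printable, whereas a typical kernel address such as 0xc14dfde4 yields mostly dots. That pattern is at least consistent with text or file data having been written over a pointer, though other explanations are possible.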
Memory looks like this just after the programs have been started.
The system was running with "mem=460" for these readings.
           total:      used:      free:    shared:   buffers:    cached:
Mem:    473899008   22151168  451747840          0     327680    7737344
Swap:           0          0          0
MemTotal:     462792 kB
MemFree:      441160 kB
MemShared:         0 kB
Buffers:         320 kB
Cached:         7556 kB
SwapCached:        0 kB
Active:         2320 kB
Inactive:       7072 kB
HighTotal:         0 kB
HighFree:          0 kB
LowTotal:     462792 kB
LowFree:      441160 kB
SwapTotal:         0 kB
SwapFree:          0 kB
And it looks like this close to the time of the Oops:
           total:      used:      free:    shared:   buffers:    cached:
Mem:    473899008  469348352    4550656          0     831488  420454400
Swap:           0          0          0
MemTotal:     462792 kB
MemFree:        4444 kB
MemShared:         0 kB
Buffers:         812 kB
Cached:       410600 kB
SwapCached:        0 kB
Active:         5064 kB
Inactive:     410968 kB
HighTotal:         0 kB
HighFree:          0 kB
LowTotal:     462792 kB
LowFree:        4444 kB
SwapTotal:         0 kB
SwapFree:          0 kB
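To put numbers on the two snapshots above, here is a minimal sketch of pulling individual fields out of /proc/meminfo-style text. The function name `meminfo_kb` is hypothetical, and the code works on a text buffer rather than opening /proc itself, so it can be tried against the readings posted here.

```c
#include <stdio.h>
#include <string.h>

/* Look up one "Key:   value kB" line in a /proc/meminfo-style buffer.
 * The key must begin a line, so "Cached:" does not match "SwapCached:".
 * Returns the value in kB, or -1 if the key is missing or malformed. */
static long meminfo_kb(const char *text, const char *key)
{
    size_t keylen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, keylen) == 0) {
            long kb;

            if (sscanf(line + keylen, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;     /* step past the newline to the next line */
    }
    return -1;
}
```

With the second snapshot, `meminfo_kb(text, "Cached:")` gives 410600 and `meminfo_kb(text, "MemTotal:")` gives 462792, so roughly 88% of memory is page cache near Oops time. Page cache is normally reclaimable, so a small MemFree next to a large Cached figure is not by itself a sign of trouble.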
* Re: 2.4.24 Paging Fault, Source Line located in slab.c, kmem_cache_reap()
2004-02-16 10:25 ` Ross Dickson
@ 2004-02-16 13:45 ` Ross Dickson
0 siblings, 0 replies; 5+ messages in thread
From: Ross Dickson @ 2004-02-16 13:45 UTC (permalink / raw)
To: linux-kernel; +Cc: vda
I tracked the Oops down to a source line in kmem_cache_reap().
...............................................................
The marked line is number 1784 in my copy of slab.c:
	full_free = 0;
	p = searchp->slabs_free.next;
	while (p != &searchp->slabs_free) {
#if DEBUG
		slabp = list_entry(p, slab_t, list);
		if (slabp->inuse)
			BUG();
#endif
		full_free++;
		p = p->next;	<====== Oops here
	}
............................
objdump output for the above code section follows.
The Oops identified the offending instruction at kmem_cache_reap+80, which is
at af0 in this listing.
............................
 add:   8b 4c 24 08     mov    0x8(%esp,1),%ecx
 ae1:   31 db           xor    %ebx,%ebx
 ae3:   8b 41 10        mov    0x10(%ecx),%eax
 ae6:   89 ca           mov    %ecx,%edx
 ae8:   83 c2 10        add    $0x10,%edx
 aeb:   39 d0           cmp    %edx,%eax
 aed:   74 08           je     af7 <kmem_cache_reap+0x87>
 aef:   90              nop
 af0:   8b 00           mov    (%eax),%eax      <===== Oops
 af2:   43              inc    %ebx
 af3:   39 d0           cmp    %edx,%eax
 af5:   75 f9           jne    af0 <kmem_cache_reap+0x80>
 af7:   8b 44 24 08     mov    0x8(%esp,1),%eax
...............................
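As a cross-check on matching the Oops to the listing, the arithmetic and the byte comparison can be sketched as below. The helper names are hypothetical. The object-file start of 0xa70 is not stated in the post; it is derived from the objdump annotation that labels af0 as kmem_cache_reap+0x80.

```c
#include <stdint.h>
#include <string.h>

/* Start address of a symbol, given an address reported as symbol+offset. */
static uint32_t symbol_start(uint32_t address, uint32_t offset)
{
    return address - offset;
}

/* First bytes of the "Code:" line of the Oops, beginning at the faulting EIP. */
static const unsigned char oops_code[] = {
    0x8b, 0x00, 0x43, 0x39, 0xd0, 0x75, 0xf9, 0x8b, 0x44, 0x24, 0x08
};

/* Bytes copied from the objdump listing, offsets af0 through afa. */
static const unsigned char objdump_code[] = {
    0x8b, 0x00,             /* af0: mov (%eax),%eax      */
    0x43,                   /* af2: inc %ebx             */
    0x39, 0xd0,             /* af3: cmp %edx,%eax        */
    0x75, 0xf9,             /* af5: jne af0              */
    0x8b, 0x44, 0x24, 0x08  /* af7: mov 0x8(%esp,1),%eax */
};

/* Returns 1 if the Oops bytes and the objdump bytes are identical. */
static int code_matches(void)
{
    return sizeof(oops_code) == sizeof(objdump_code) &&
           memcmp(oops_code, objdump_code, sizeof(oops_code)) == 0;
}
```

Here `symbol_start(0xc0133f20, 0x80)` gives 0xc0133ea0 for the running kernel and `symbol_start(0xaf0, 0x80)` gives 0xa70 for the object file. The eleven bytes agree, which supports the identification of af0 as the faulting instruction.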
I do not know how this part of the kernel works.
Have we walked off the end of a list or something?
Can anyone help with a theory or still better a fix?
Hopefully thanks in advance,
Ross.
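On the question of walking off the end of the list: the loop quoted above stops only when p comes back round to &searchp->slabs_free, so on an intact circular list there is no end to walk off. In the register dump, edx (c14dfde4) equals ecx + 0x10, which matches the list head computed by the `add $0x10,%edx` instruction, ebx is f5 (245 entries counted) and eax holds 6a65656a. That looks more like a next pointer in one of the entries having been overwritten than like an overrun, though this is only a reading of the dump. The user-space sketch below uses a stand-in `struct list_head` and a hypothetical `count_entries` helper to show the same walk with a guard added; it is not the 2.4.24 source and not a proposed fix.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (an assumption for
 * illustration only; this is user-space code). */
struct list_head {
    struct list_head *next, *prev;
};

/* Make an empty circular list: the head points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Count entries the way the quoted loop does, but give up on a NULL link
 * or after `limit` steps instead of following bad pointers indefinitely.
 * Returns the count, or -1 if the walk never gets back to head. */
static long count_entries(const struct list_head *head, long limit)
{
    long n = 0;
    const struct list_head *p = head->next;

    while (p != head) {
        if (p == NULL || n >= limit)
            return -1;
        n++;
        p = p->next;
    }
    return n;
}
```

A guard like this can only catch NULL links and loops that never return to the head. A wild value such as 6a65656a would still fault when dereferenced, which is what the Oops shows.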