* [MM BUG report] 2.4.20-ac2 mm/filemap.c truncate_complete_page BUG()
@ 2003-01-07 14:27 Lionel Bouton
From: Lionel Bouton @ 2003-01-07 14:27 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4999 bytes --]

This has happened 3 times, each time some (1-3) days after bootup.

The ksymoops output is attached.

## Hardware/Kernel Config

The config was just upgraded from the following (where everything worked 
nicely for weeks between reboots):

Dual Celeron 433 on Abit BP6 with 256 MB RAM and the latest RedHat 7.3 kernel 
(at that time 2.4.18-18)

to this one:

Single Athlon XP 1700+ on ECS K7S5A (SiS735 chipset) with 128 MB RAM and 
2.4.20-ac2 kernel.

This was a drop-in replacement (same disks, same software).

## Modules loaded (yes there are many of them)

Here are the modules loaded on the current config:

Module                  Size  Used by    Not tainted
smbfs                  37664   1 (autoclean)
nfsd                   75872   8 (autoclean)
lockd                  55936   1 (autoclean) [nfsd]
sunrpc                 74804   1 (autoclean) [nfsd lockd]
parport_pc             28904   1 (autoclean)
lp                      7552   0 (autoclean)
parport                35104   1 (autoclean) [parport_pc lp]
ppp_deflate             3904   2 (autoclean)
zlib_inflate           21664   0 (autoclean) [ppp_deflate]
zlib_deflate           20864   0 (autoclean) [ppp_deflate]
bsd_comp                5088   0 (autoclean)
it87                    8256   0
i2c-proc                8192   0 [it87]
i2c-isa                 1892   0 (unused)
i2c-core               20800   0 [it87 i2c-proc i2c-isa]
cls_u32                 5572   1 (autoclean)
cls_fw                  3072   1 (autoclean)
sch_sfq                 4416   2 (autoclean)
sch_cbq                13952   1 (autoclean)
ipt_MARK                1408  18 (autoclean)
iptable_mangle          2816   1 (autoclean)
ipt_TCPMSS              3040   1 (autoclean)
ipt_limit               1536   2 (autoclean)
ipt_REJECT              3584   2 (autoclean)
n_hdlc                  7136   1
ppp_synctty             6272   1
ipt_state               1088  19 (autoclean)
ipt_unclean             7552   1 (autoclean)
ipt_LOG                 4160   5 (autoclean)
ip_nat_ftp              3776   0 (unused)
ip_conntrack_ftp        4864   1
iptable_nat            18868   2 (autoclean) [ip_nat_ftp]
ip_conntrack           24748   3 (autoclean) [ipt_state ip_nat_ftp ip_conntrack_ftp iptable_nat]
ppp_async               8032   1
ppp_generic            19244   6 [ppp_deflate bsd_comp ppp_synctty ppp_async]
iptable_filter          2336   1 (autoclean)
slhc                    6412   1 [ppp_generic]
ip_tables              13568  12 [ipt_MARK iptable_mangle ipt_TCPMSS ipt_limit ipt_REJECT ipt_state ipt_unclean ipt_LOG iptable_nat iptable_filter]
sis900                 14756   1
3c59x                  27368   1
af_packet              13800   2 (autoclean)
ide-cd                 31776   0 (autoclean)
cdrom                  31968   0 (autoclean) [ide-cd]
md                     48768   0 (unused)
loop                   10128   0 (autoclean)
lvm-mod                49664   8 (autoclean)
usb-ohci               20192   0 (unused)
usbcore                71424   1 [usb-ohci]

lm_sensors (the it87 module here) was already in use on the previous config 
and has been upgraded to v2.7.0 with i2c-2.7.0.
I can prevent the lm_sensors and i2c modules from loading if needed.

The box is under heavy memory pressure with nearly constant swap-in and 
swap-out (personal NAT box with numerous services running: nfs, squid, 
apache ...). Current vmstat output is:

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  63888   6036   1628  27112  28  36   174   227   37    33   8   4  88

-> si = 28, so = 36

Performance is not stellar, but that is expected under this kind of 
memory pressure (a 512 MB stick had already been ordered when the first 
oops came). What is not expected is the attached oops.
The system does not crash, but the process involved (always rrdtool 
launched from crontab, for all 3 oopses) is stuck in D state, and every 
subsequent process trying to access the same files gets stuck in D state 
too (as rrdtool is called each minute to update its databases and draw 
its graphs, this leads to the load average jumping to 200 in a matter of 
3 hours...).

The code involved in filemap.c truncate_complete_page is the very first 
lines of the function:

[...]
        /* Page has already been removed from processes, by vmtruncate()  */
        if (page->pte_chain)
                BUG();
[...]


For the record (probably not important), last time I checked, the rrdtool 
processes stuck in D state were the ones that graph the load average 
(so the file that caused the problem was probably not the database but 
one of the resulting png files).

Question: how do I get the current syscall of a process stuck in D state 
remotely? (SysRq can do it, but there is no screen attached, so getting 
the info remotely would spare me temporarily relocating a screen...)
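
One possible partial approach to the question above, as a minimal sketch and 
not a confirmed answer: /proc/<pid>/stat exposes the process state as 
field 3 and the wait channel (wchan) as field 35, per proc(5). On a 2.4 
kernel wchan is expected to be the kernel address where the task sleeps, 
which can be looked up by hand in System.map. It names the sleeping kernel 
function, not the syscall itself. The names parse_stat, list_d_state and 
DSTATE_DEMO_MAIN are hypothetical, chosen for this illustration only.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one /proc/<pid>/stat line (hypothetical helper).
 * Extracts pid (field 1), comm (field 2), state (field 3) and
 * wchan (field 35, per proc(5)). comm_len must be at least 1.
 * Returns 0 on success, -1 if the line is malformed or too short.
 */
int parse_stat(const char *line, int *pid, char *comm, size_t comm_len,
               char *state, unsigned long *wchan)
{
    const char *open = strchr(line, '(');
    const char *close = strrchr(line, ')'); /* comm may itself contain ')' */
    size_t n;
    const char *p;
    int i;

    if (!open || !close || close < open || close[1] != ' ' || close[2] == '\0')
        return -1;

    *pid = atoi(line);
    n = (size_t)(close - open - 1);
    if (n >= comm_len)
        n = comm_len - 1;
    memcpy(comm, open + 1, n);
    comm[n] = '\0';
    *state = close[2];

    /* Walk from field 3 (state) to field 35 (wchan): 32 separators. */
    p = close + 2;
    for (i = 0; i < 32; i++) {
        p = strchr(p, ' ');
        if (!p)
            return -1;
        p++;
    }
    *wchan = strtoul(p, NULL, 10); /* the kernel prints this field in decimal */
    return 0;
}

/*
 * Print every process currently in D state with its wchan address.
 * Returns the number of such processes, or -1 if /proc cannot be read.
 */
int list_d_state(FILE *out)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int count = 0;

    if (!proc)
        return -1;
    while ((de = readdir(proc)) != NULL) {
        char path[300], line[1024], comm[64], state;
        unsigned long wchan;
        int pid;
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue; /* only numeric entries are processes */
        snprintf(path, sizeof path, "/proc/%s/stat", de->d_name);
        f = fopen(path, "r");
        if (!f)
            continue; /* process may have exited meanwhile */
        if (fgets(line, sizeof line, f) &&
            parse_stat(line, &pid, comm, sizeof comm, &state, &wchan) == 0 &&
            state == 'D') {
            fprintf(out, "%d %s wchan=0x%08lx\n", pid, comm, wchan);
            count++;
        }
        fclose(f);
    }
    closedir(proc);
    return count;
}

#ifdef DSTATE_DEMO_MAIN
int main(void)
{
    return list_d_state(stdout) < 0 ? 1 : 0;
}
#endif
```

Built with -DDSTATE_DEMO_MAIN and run over ssh, this prints one line per 
D-state process. The printed address would then be matched against the 
nearest lower symbol in the System.map of the running kernel. Whether this 
is precise enough for the hang reported here is an open question.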

I have a stock 2.4.20 kernel already compiled for testing (it will be used 
on the next reboot, and the lm_sensors and i2c modules won't be loaded).

LB.

[-- Attachment #2: ksymoops.out --]
[-- Type: text/plain, Size: 3113 bytes --]

ksymoops 2.4.4 on i686 2.4.20-ac2.  Options used
     -v vmlinux (specified)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-ac2/ (default)
     -m /boot/System.map-2.4.20-ac2 (default)

Jan  6 01:13:04 twins kernel: kernel BUG at filemap.c:242!
Jan  6 01:13:04 twins kernel: invalid operand: 0000
Jan  6 01:13:04 twins kernel: CPU:    0
Jan  6 01:13:04 twins kernel: EIP:    0010:[<c01250ac>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Jan  6 01:13:04 twins kernel: EFLAGS: 00010286
Jan  6 01:13:04 twins kernel: eax: c5563dd8   ebx: c10b2968   ecx: c10b2968   edx: c4b257e8
Jan  6 01:13:04 twins kernel: esi: c54be1a8   edi: 00000000   ebp: 00000000   esp: c5563d78
Jan  6 01:13:04 twins kernel: ds: 0018   es: 0018   ss: 0018
Jan  6 01:13:04 twins kernel: Process rrdtool (pid: 25745, stackpage=c5563000)
Jan  6 01:13:04 twins kernel: Stack: c10b2968 c012523a c10b2968 00000000 00000001 c5563dd8 00000000 40400000
Jan  6 01:13:04 twins kernel:        40016000 c3564404 c0e3a000 c1bc5874 00000000 00000081 c307b300 00000000
Jan  6 01:13:04 twins kernel:        c307b35c c01bfefb 40015000 00000000 c5563dd8 00000000 c54be1a8 c01252db
Jan  6 01:13:04 twins kernel: Call Trace:    [<c012523a>] [<c01bfefb>] [<c01252db>] [<c0123058>] [<c0147f64>]
Jan  6 01:13:04 twins kernel:   [<c015923b>] [<c0126d8b>] [<c0126db9>] [<c01480d4>] [<c0132ffd>] [<c0132d5c>]
Jan  6 01:13:04 twins kernel:   [<c01336b6>] [<c013eca4>] [<c0113111>] [<c01bfa0e>] [<c01344b4>] [<c01347e1>]
Jan  6 01:13:04 twins kernel:   [<c0106c8b>]
Jan  6 01:13:04 twins kernel: Code: 0f 0b f2 00 be a4 20 c0 8b 43 30 85 c0 74 0e 6a 00 53 e8 ad

>>EIP; c01250ac <truncate_complete_page+c/60>   <=====
Trace; c012523a <truncate_list_pages+13a/1a0>
Trace; c01bfefb <kfree_skbmem+b/60>
Trace; c01252db <truncate_inode_pages+3b/70>
Trace; c0123058 <vmtruncate+98/130>
Trace; c0147f64 <inode_setattr+24/d0>
Trace; c015923b <ext3_setattr+13b/180>
Trace; c0126d8b <filemap_nopage+bb/200>
Trace; c0126db9 <filemap_nopage+e9/200>
Trace; c01480d4 <notify_change+54/100>
Trace; c0132ffd <pte_chain_alloc+d/30>
Trace; c0132d5c <page_add_rmap+2c/80>
Trace; c01336b6 <do_truncate+46/60>
Trace; c013eca4 <open_namei+3d4/4f0>
Trace; c0113111 <__wake_up+41/60>
Trace; c01bfa0e <sock_def_write_space+2e/70>
Trace; c01344b4 <filp_open+34/60>
Trace; c01347e1 <sys_open+31/80>
Trace; c0106c8b <system_call+33/38>
Code;  c01250ac <truncate_complete_page+c/60>
00000000 <_EIP>:
Code;  c01250ac <truncate_complete_page+c/60>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01250ae <truncate_complete_page+e/60>
   2:   f2 00 be a4 20 c0 8b      repnz add %bh,0x8bc020a4(%esi)
Code;  c01250b5 <truncate_complete_page+15/60>
   9:   43                        inc    %ebx
Code;  c01250b6 <truncate_complete_page+16/60>
   a:   30 85 c0 74 0e 6a         xor    %al,0x6a0e74c0(%ebp)
Code;  c01250bc <truncate_complete_page+1c/60>
  10:   00 53 e8                  add    %dl,0xffffffe8(%ebx)
Code;  c01250bf <truncate_complete_page+1f/60>
  13:   ad                        lods   %ds:(%esi),%eax
