* [parisc-linux] SLAB bug SMP 64bit / XFS mess
@ 2005-04-03 1:34 Thibaut VARENE
2005-04-06 12:17 ` Joel Soete
0 siblings, 1 reply; 6+ messages in thread
From: Thibaut VARENE @ 2005-04-03 1:34 UTC (permalink / raw)
To: parisc-linux
Hi pa-ckers
I tried 2.6.12-rc1-pa9 64bit SMP (gcc 3.3.5) with DEBUG_SLAB enabled and got
the following BUG:
LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
kernel BUG at mm/slab.c:1495!
Backtrace:
[<0000000010114480>] show_stack+0x60/0xf0
[<0000000010301f7c>] $$divoI+0x30c/0x460
[<0000000010107074>] intr_return+0x0/0x24
[<00000000101df19c>] page_put_link+0x14c/0x258
[<000000001011daac>] cpu_idle+0x34/0x40
[<000000001013f548>] activate_task+0x98/0x108
[<0000000010164ffc>] do_sigaction+0x1e4/0x2b8
[<00000000101720ec>] kthread+0x144/0x150
[<000000001010647c>] ret_from_kernel_thread+0x24/0x40
Kernel panic - not syncing: BUG!
Trying once more gave essentially the same result (apart from the second line
of the backtrace):
LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
kernel BUG at mm/slab.c:1495!
Backtrace:
[<0000000010114480>] show_stack+0x60/0xf0
[<000000001010ab04>] $$remoI+0x278/0x1414
[<0000000010107074>] intr_return+0x0/0x24
[<000000001011daac>] cpu_idle+0x34/0x40
[<000000001013f548>] activate_task+0x98/0x108
[<0000000010164ffc>] do_sigaction+0x1e4/0x2b8
[<00000000101720ec>] kthread+0x144/0x150
[<000000001010647c>] ret_from_kernel_thread+0x24/0x40
Now the exact same kernel with DEBUG_SLAB disabled:
LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
SCSI subsystem initialized
unwind_init: start = 0x104bfec0, end = 0x104e7e90, entries = 10237
Performance monitoring counters enabled for Duet W+
SGI XFS with large block/inode numbers, no debug enabled
SuperIO: Found NS87560 Legacy I/O device at 0000:00:0e.1 (IRQ 20)
Unfortunately things weren't meant to last long:
Checking root file system...
fsck 1.35 (28-Feb-2004)
Backtrace:
[<000000001027e400>] xfs_alloc_fix_freelist+0x450/0x470
[<000000001027f060>] xfs_free_extent+0xb8/0x120
[<000000001028ff6c>] xfs_bmap_finish+0x17c/0x210
[<00000000102b75e0>] xfs_itruncate_finish+0x188/0x388
[<00000000102d5b40>] xfs_setattr+0x8d8/0x1068
[<00000000102e80f4>] linvfs_setattr+0x174/0x228
[<00000000101eb64c>] notify_change+0x204/0x2b0
[<00000000101b8e50>] do_truncate+0x80/0x110
[<00000000101d92a0>] may_open+0x270/0x2d8
[<00000000101d9424>] open_namei+0x11c/0xc10
[<00000000101ba1f8>] filp_open+0x48/0x88
[<00000000101baae0>] sys_open+0x98/0xf0
[<0000000010107fb4>] syscall_exit+0x0/0x14
Kernel Fault: Code=15 regs=00000000ff71ce90 (Addr=0000000566adeb00)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001111 Not tainted
r00-03 0000000000000000 0000000466c87700 000000001027e884 0000000566adeb00
r04-07 000000001060d840 000000001027e400 00000000ff0a5738 00000000ff71cce0
r08-11 0000000000000001 0000000000000000 0000000010488a80 0000000000000002
r12-15 00000000ff71c960 0000000000000000 0000000000000001 00000000ff71c950
r16-19 0008000000000001 0000000000000000 0000000000000001 00000000ffe57400
r20-23 0000000000000000 00000000ff1b3b40 0000000000000080 00000000ff2fd200
r24-27 00000000ffe7a000 000000000007e434 00000000ff0a5738 000000001060d840
r28-31 0000000000000000 00000000ff71cf20 00000000ff71ce90 00000000ffd459a8
sr0-3 000000000002a800 0000000000000000 0000000000000000 000000000002a800
sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001027e99c 000000001027e9a0
IIR: 0c601014 ISR: 0000000000000000 IOR: 0000000566adeb00
CPU: 1 CR30: 00000000ff71c000 CR31: 000000005001c08c
ORIG_R28: 000000001060d840
IAOQ[0]: xfs_alloc_read_agf+0x1bc/0x240
IAOQ[1]: xfs_alloc_read_agf+0x1c0/0x240
RP(r2): xfs_alloc_read_agf+0xa4/0x240
Kernel panic - not syncing: Kernel Fault
Truth be told, it's 3:30 AM and I had a bunch of important things on that
machine; I'd rather not know how badly broken my FS is right now :(
HTH
T-Bone
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* RE: [parisc-linux] SLAB bug SMP 64bit / XFS mess
2005-04-03 1:34 Thibaut VARENE
@ 2005-04-06 12:17 ` Joel Soete
0 siblings, 0 replies; 6+ messages in thread
From: Joel Soete @ 2005-04-06 12:17 UTC (permalink / raw)
To: parisc-linux
Hello all,
> Hi pa-ckers
>
> tried 2.6.12-rc1-pa9 64bit SMP (gcc 3.3.5) with DEBUG_SLAB enabled, got
> the following BUG:
> LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
> kernel BUG at mm/slab.c:1495!
> Backtrace:
> [<0000000010114480>] show_stack+0x60/0xf0
One thought about show_stack() (i.e. dump_stack()):
its code looks like:
void show_stack(struct task_struct *task, unsigned long *s)
{
        struct unwind_frame_info info;
        if (!task) {
                unsigned long sp;
                struct pt_regs *r;
HERE:
                asm volatile ("copy %%r30, %0" : "=r"(sp));
                r = (struct pt_regs *)kmalloc(sizeof(struct pt_regs), GFP_KERNEL);
                if (!r)
                        return;
...
while, under #if DEBUG, the slab side of kmalloc will do:
static inline kmem_cache_t *kmem_find_general_cachep(size_t size, int gfpflags)
{
        struct cache_sizes *csizep = malloc_sizes;
#if DEBUG
        /* This happens if someone tries to call
         * kmem_cache_create(), or __kmalloc(), before
         * the generic caches are initialized.
         */
        BUG_ON(csizep->cs_cachep == NULL);
#endif
...
(This special case happened to me when I tried to replace BUG_ON() with WARN_ON()
in kernel/posix-cpu-timers.c;
see <http://cvs.parisc-linux.org/linux-2.6/kernel/posix-cpu-timers.c?rev=1.3&view=markup>,
with the following result:
Badness in run_posix_cpu_timers at /CAD/linux-2.6.12-rc1-pa9-050401/kernel/posi5
Backtrace:
[<00000000101124e8>] dump_stack+0x18/0x28
[<000000001015ed20>] run_posix_cpu_timers+0x168/0x1d0
[<0000000010113648>] timer_interrupt+0xc8/0x178
[<00000000101170b8>] real32_call+0xe0/0x110
[<0000000010116624>] pdc_iodc_putc+0x9c/0x150
[<00000000101686b4>] __do_IRQ+0x94/0x1e0
[<0000000010116054>] pdc_tod_read+0xac/0x100
[<0000000010116050>] pdc_tod_read+0xa8/0x100
[<00000000103e6890>] proc_dodebug+0x0/0x2d0
[<00000000103e64f0>] rpc_proc_register+0x0/0xb0
[<00000000103e5de0>] rpc_unlink+0x0/0x1a8
[<00000000103e5240>] rpc_depopulate+0x0/0x1f0
[<00000000103e4cb0>] rpc_show_info+0x0/0xe8
[<00000000103e4618>] rpc_alloc_inode+0x0/0x30
[<00000000103e3ed0>] content_open+0x0/0xb0
[<00000000103e35e0>] qword_addhex+0x0/0x110
Kernel Fault: Code=26 regs=000000001057cbe0 (Addr=0000000000000000)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001000000000000001110 Not tainted
r00-03 0000000000000000 0000000010481e38 0000000010112680 000000000800000f
r04-07 00000000105c0b40 00000000101124e8 000000001057cb60 000000001057c2e0
r08-11 0000000000000000 00000000105c9b40 0000000000000000 00000000105d86a4
r12-15 0000000000000000 00000000ffffffff 0000000000000000 00000000f0400004
r16-19 000000001057c2e0 00000000f000017c 00000000f0000174 0000000010566230
r20-23 000000000800000e 00000000000003ea 00000000000003ea 000000000062977c
r24-27 ffffffffffffffff 00000000000000d0 0000000000000000 00000000105c0b40
r28-31 0000000000000066 000000001057cbb0 000000001057cbe0 0000000000000060
sr0-3 0000000000000000 0000000000000000 0000000000000000 0000000000000000
sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000010176d20 0000000010176d24
IIR: 0f4010d5 ISR: 0000000000000000 IOR: 0000000000000000
CPU: 0 CR30: 000000001057c000 CR31: 0000000010580000
ORIG_R28: 0000000000000000
IAOQ[0]: kmem_cache_alloc+0x18/0x60
IAOQ[1]: kmem_cache_alloc+0x1c/0x60
RP(r2): show_stack+0x80/0xf0
Kernel panic - not syncing: Kernel Fault
An objdump of slab shows:
0000000000000000 <kmem_cache_alloc>:
0: 0f c2 12 c1 std rp,-10(,sp)
4: 73 c4 01 08 std,ma r4,80(sp)
8: 73 c3 3f 11 std r3,-78(sp)
c: db 39 0f e0 extrd,s r25,63,32,r25
10: 00 01 0e 63 rsm 1,r3
14: 37 dd 3f a1 ldo -30(sp),ret1
18: 0f 40 10 d5 ldd 0(,r26),r21 <====
1c: 0e a0 10 94 ldw 0(,r21),r20
20: 86 80 20 58 cmpib,= 0,r20,54 <kmem_cache_alloc+0x54>
)
Hth,
Joel
* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
[not found] <200504090100.j3910CW0019308@hiauly1.hia.nrc.ca>
@ 2005-04-09 13:25 ` Joel Soete
0 siblings, 0 replies; 6+ messages in thread
From: Joel Soete @ 2005-04-09 13:25 UTC (permalink / raw)
To: John David Anglin; +Cc: James.Bottomley, parisc-linux, Thibaut VARENE
John David Anglin wrote:
>>It's also quite surprising the 32bit kernel is unaffected.
>
Well, nothing new since: <http://lists.parisc-linux.org/pipermail/parisc-linux/2005-March/026055.html>
>
> As a plug for my put_user patch, I recall that there is at least one
> xfs ioctl that involves putting a long int to userspace.
>
Yes, IIRC that was (in 2.4) one of the reasons I asked Randolph for help introducing put_user_asm64() for the 32-bit kernel.
Joel
* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
[not found] ` <20050409021049.0d0a851d@Tatooine.r3z0>
@ 2005-04-11 4:57 ` Grant Grundler
2005-04-11 5:15 ` Grant Grundler
0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2005-04-11 4:57 UTC (permalink / raw)
To: Thibaut VARENE; +Cc: James Bottomley, PARISC list
On Sat, Apr 09, 2005 at 02:10:49AM +0200, Thibaut VARENE wrote:
> Just for the records:
>
> while trying to debug the XFS issue on my J6k, here what i noticed:
>
> 32bit kernel: no bug
Sorry, though my j6k did not panic with 32-bit kernel, there is
definitely some XFS bug(s?). Test output and details follow.
grundler <511>uname -a
Linux gggj6k 2.6.12-rc2-pa1 #5 SMP Sun Apr 10 14:13:59 PDT 2005 parisc GNU/Linux
That's despite sym2 using I/O Port space:
CONFIG_SCSI_SYM53C8XX_IOMAPPED=y
I got a lot (> 500 lines of scrollback) of the following on the console:
0x0: ed 41 00 00 00 10 00 00 a0 c4 59 42 4c ab 26 42
Filesystem "sdb3": XFS internal error xfs_da_do_buf(2) at line 2271 of file fs/4
Backtrace:
[<1027d514>] xfs_da_do_buf+0x354/0x754
[<1027d974>] xfs_da_read_buf+0x2c/0x38
[<1028157c>] xfs_dir2_block_lookup_int+0x74/0x1fc
[<1028146c>] xfs_dir2_block_lookup+0x1c/0xb8
[<1027fd30>] xfs_dir2_lookup+0xc8/0x140
[<102ad50c>] xfs_dir_lookup_int+0x50/0x118
[<102b26a8>] xfs_lookup+0x68/0xac
[<102c0894>] linvfs_lookup+0x60/0xa4
[<1019e8d8>] __lookup_hash+0xc0/0xf8
[<1019fa58>] lookup_create+0x68/0xcc
[<1019ff1c>] sys_mkdir+0x78/0x134
[<1010e178>] syscall_exit+0x0/0x14
And then the last console output was:
0x0: 58 44 32 44 0e 58 01 a8 07 c8 00 20 09 50 00 20
Filesystem "sdb3": XFS internal error xfs_dir2_block_addname at line 128 of fil4
Backtrace:
[<10281124>] xfs_dir2_block_addname+0x6c0/0x6d4
[<1027fc14>] xfs_dir2_createname+0x124/0x178
[<102b3ee8>] xfs_mkdir+0x484/0x6e4
[<102c07b8>] linvfs_mknod+0x1d0/0x210
[<1019fe78>] vfs_mkdir+0x94/0xc0
[<1019ff68>] sys_mkdir+0xc4/0x134
[<1010e178>] syscall_exit+0x0/0x14
xfs_force_shutdown(sdb3,0x8) called from line 1091 of file fs/xfs/xfs_trans.c. c
Filesystem "sdb3": Corruption of in-memory data detected. Shutting down filesy3
Please umount the filesystem, and rectify the problem(s)
SMP CALL FUNCTION TIMED OUT! (cpu=1), try 1
My test was "time rsync -aWx --stats / /mnt" with two disks:
grundler@gggj6k:~$ lsscsi
[1:0:5:0] disk HP 18.2GB C 80-D94N D94N /dev/sda
[1:0:6:0] disk SEAGATE ST318203LC 0001 /dev/sdb
grundler@gggj6k:~$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3             16911576  13625088   2427416  85% /
tmpfs                  1812796         0   1812796   0% /dev/shm
/dev/sda1                85528     28487     52625  36% /boot
/dev/sdb3             17171264  13671156   3500108  80% /mnt
Ext3 on sda3 and freshly made "mkfs.xfs -f /dev/sdb3":
/dev/sda3 on / type ext3 (rw,errors=remount-ro)
The rsync complained *a lot* (> 500 lines of scrollback) about:
rsync: stat "/mnt/home/tftpboot/pa8800/var/spool/postfix/deferred/8" failed: Input/output error (5)
rsync stats output:
Number of files: 394227
Number of files transferred: 349625
Total file size: 12845404686 bytes
Total transferred file size: 12845269080 bytes
Literal data: 12845327899 bytes
Matched data: 0 bytes
File list size: 7360585
Total bytes sent: 12869390640
Total bytes received: 6992520
sent 12869390640 bytes received 6992520 bytes 5379729.75 bytes/sec
total size is 12845404686 speedup is 1.00
rsync error: some files could not be transferred (code 23) at main.c(702)
real 39m52.694s
user 5m52.069s
sys 7m34.079s
hth,
grant
* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
2005-04-11 4:57 ` Grant Grundler
@ 2005-04-11 5:15 ` Grant Grundler
2005-04-11 5:49 ` Grant Grundler
0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2005-04-11 5:15 UTC (permalink / raw)
To: Thibaut VARENE; +Cc: James Bottomley, PARISC list
On Sun, Apr 10, 2005 at 10:57:47PM -0600, Grant Grundler wrote:
> Sorry, though my j6k did not panic with 32-bit kernel, there is
> definitely some XFS bug(s?). Test output and details follow.
Just for comparison, I ran the same workload on ext3 and it worked fine.
Started with "mke2fs /dev/sdb3; mount /dev/sdb3 /mnt" and then ran
my pet test-of-the-week again:
root@gggj6k:~# time rsync -aWx --stats / /mnt
Number of files: 394228
Number of files transferred: 349626
Total file size: 12845690938 bytes
Total transferred file size: 12845555332 bytes
Literal data: 12845560371 bytes
Matched data: 0 bytes
File list size: 7360607
Total bytes sent: 12869623206
Total bytes received: 6992540
sent 12869623206 bytes received 6992540 bytes 9908900.15 bytes/sec
total size is 12845690938 speedup is 1.00
real 21m39.937s
user 5m47.908s
sys 5m50.986s
root@gggj6k:~#
"iostat 10" on another login reported pretty impressive numbers at
times, such as during this 30-second period (rsync says the average is 1/4th of this):
avg-cpu:  %user   %nice    %sys %iowait   %idle
          27.54    0.00   24.04   28.49   19.94
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             208.60     43289.60       100.80     432896       1008
sdb              46.40        22.40     40007.20        224     400072
avg-cpu:  %user   %nice    %sys %iowait   %idle
          28.96    0.00   24.06   40.77    6.20
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             212.30     45693.60        68.80     456936        688
sdb              48.40         4.00     35515.20         40     355152
avg-cpu:  %user   %nice    %sys %iowait   %idle
          30.70    0.00   23.45   37.30    8.55
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             180.20     48866.40        51.20     488664        512
sdb              55.40        23.20     37864.80        232     378648
hth,
grant
* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
2005-04-11 5:15 ` Grant Grundler
@ 2005-04-11 5:49 ` Grant Grundler
0 siblings, 0 replies; 6+ messages in thread
From: Grant Grundler @ 2005-04-11 5:49 UTC (permalink / raw)
To: PARISC list
On Sun, Apr 10, 2005 at 11:15:17PM -0600, Grant Grundler wrote:
> Just for comparison, I ran the same workload on ext3 and it worked fine.
Sorry, I meant ext2.
Here is the correct rsync output for ext3 on /dev/sdb3:
root@gggj6k:~# time rsync -aWx --stats / /mnt
Number of files: 394228
Number of files transferred: 349626
Total file size: 12845701275 bytes
Total transferred file size: 12845565669 bytes
Literal data: 12845570528 bytes
Matched data: 0 bytes
File list size: 7360229
Total bytes sent: 12869632989
Total bytes received: 6992540
sent 12869632989 bytes received 6992540 bytes 8178231.52 bytes/sec
total size is 12845701275 speedup is 1.00
real 26m14.078s
user 5m47.767s
sys 7m11.995s
root@gggj6k:~#
grant