* [parisc-linux] SLAB bug SMP 64bit / XFS mess
@ 2005-04-03  1:34 Thibaut VARENE
  2005-04-06 12:17 ` Joel Soete
  0 siblings, 1 reply; 6+ messages in thread
From: Thibaut VARENE @ 2005-04-03  1:34 UTC (permalink / raw)
  To: parisc-linux

Hi pa-ckers

tried 2.6.12-rc1-pa9 64bit SMP (gcc 3.3.5) with DEBUG_SLAB enabled, got
the following BUG:
LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
kernel BUG at mm/slab.c:1495!
Backtrace:
 [<0000000010114480>] show_stack+0x60/0xf0
 [<0000000010301f7c>] $$divoI+0x30c/0x460
 [<0000000010107074>] intr_return+0x0/0x24
 [<00000000101df19c>] page_put_link+0x14c/0x258
 [<000000001011daac>] cpu_idle+0x34/0x40
 [<000000001013f548>] activate_task+0x98/0x108
 [<0000000010164ffc>] do_sigaction+0x1e4/0x2b8
 [<00000000101720ec>] kthread+0x144/0x150
 [<000000001010647c>] ret_from_kernel_thread+0x24/0x40
Kernel panic - not syncing: BUG!

Trying one more time gave essentially the same result (except for the second
line in the backtrace):

LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
kernel BUG at mm/slab.c:1495!
Backtrace:
 [<0000000010114480>] show_stack+0x60/0xf0
 [<000000001010ab04>] $$remoI+0x278/0x1414
 [<0000000010107074>] intr_return+0x0/0x24
 [<000000001011daac>] cpu_idle+0x34/0x40
 [<000000001013f548>] activate_task+0x98/0x108
 [<0000000010164ffc>] do_sigaction+0x1e4/0x2b8
 [<00000000101720ec>] kthread+0x144/0x150
 [<000000001010647c>] ret_from_kernel_thread+0x24/0x40

Now the exact same kernel with DEBUG_SLAB disabled:

LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
SCSI subsystem initialized
unwind_init: start = 0x104bfec0, end = 0x104e7e90, entries = 10237
Performance monitoring counters enabled for Duet W+
SGI XFS with large block/inode numbers, no debug enabled
SuperIO: Found NS87560 Legacy I/O device at 0000:00:0e.1 (IRQ 20)

Unfortunately things weren't meant to last long:

Checking root file system...
fsck 1.35 (28-Feb-2004)
Backtrace:
 [<000000001027e400>] xfs_alloc_fix_freelist+0x450/0x470
 [<000000001027f060>] xfs_free_extent+0xb8/0x120
 [<000000001028ff6c>] xfs_bmap_finish+0x17c/0x210
 [<00000000102b75e0>] xfs_itruncate_finish+0x188/0x388
 [<00000000102d5b40>] xfs_setattr+0x8d8/0x1068
 [<00000000102e80f4>] linvfs_setattr+0x174/0x228
 [<00000000101eb64c>] notify_change+0x204/0x2b0
 [<00000000101b8e50>] do_truncate+0x80/0x110
 [<00000000101d92a0>] may_open+0x270/0x2d8
 [<00000000101d9424>] open_namei+0x11c/0xc10
 [<00000000101ba1f8>] filp_open+0x48/0x88
 [<00000000101baae0>] sys_open+0x98/0xf0
 [<0000000010107fb4>] syscall_exit+0x0/0x14


Kernel Fault: Code=15 regs=00000000ff71ce90 (Addr=0000000566adeb00)

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001111 Not tainted
r00-03  0000000000000000 0000000466c87700 000000001027e884 0000000566adeb00
r04-07  000000001060d840 000000001027e400 00000000ff0a5738 00000000ff71cce0
r08-11  0000000000000001 0000000000000000 0000000010488a80 0000000000000002
r12-15  00000000ff71c960 0000000000000000 0000000000000001 00000000ff71c950
r16-19  0008000000000001 0000000000000000 0000000000000001 00000000ffe57400
r20-23  0000000000000000 00000000ff1b3b40 0000000000000080 00000000ff2fd200
r24-27  00000000ffe7a000 000000000007e434 00000000ff0a5738 000000001060d840
r28-31  0000000000000000 00000000ff71cf20 00000000ff71ce90 00000000ffd459a8
sr0-3   000000000002a800 0000000000000000 0000000000000000 000000000002a800
sr4-7   0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001027e99c 000000001027e9a0
 IIR: 0c601014    ISR: 0000000000000000  IOR: 0000000566adeb00
 CPU:        1   CR30: 00000000ff71c000 CR31: 000000005001c08c
 ORIG_R28: 000000001060d840
 IAOQ[0]: xfs_alloc_read_agf+0x1bc/0x240
 IAOQ[1]: xfs_alloc_read_agf+0x1c0/0x240
 RP(r2): xfs_alloc_read_agf+0xa4/0x240
Kernel panic - not syncing: Kernel Fault
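
One way to read the fault report above is to pull the register numbers out of the IIR word and compare the base register with the IOR. The sketch below is only that, a sketch: it assumes the PA-RISC load layout with a 6-bit opcode at the top, a 5-bit base register after it and a 5-bit target register in the low bits, and the names decode_load() and base_matches_ior() are made up for illustration. With a zero displacement, the base register should hold the faulting address.

```c
#include <assert.h>
#include <stdint.h>

/* Register fields of a PA-RISC memory-load instruction word.
 * Assumed layout: 6-bit opcode, 5-bit base register, ..., 5-bit target
 * register in the low bits. Only meaningful for loads such as ldd/ldw. */
struct load_fields {
    unsigned opcode;
    unsigned base;
    unsigned target;
};

static struct load_fields decode_load(uint32_t iir)
{
    struct load_fields f;

    f.opcode = (iir >> 26) & 0x3f;
    f.base   = (iir >> 21) & 0x1f;
    f.target = iir & 0x1f;
    return f;
}

/* Returns 1 when the base register named in the instruction holds the
 * address reported in IOR (i.e. a zero-displacement load through it). */
static int base_matches_ior(uint32_t iir, const uint64_t gr[32], uint64_t ior)
{
    struct load_fields f = decode_load(iir);

    assert(f.base < 32);
    return gr[f.base] == ior;
}
```

Fed with IIR 0c601014 from the dump, decode_load() names r3 as the base register, and r3 in the register dump is 0000000566adeb00, the same value as IOR.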

Truth be told, it's 3:30 AM and I had a bunch of important things on that
machine; I'd rather not know how badly broken my FS is right now :(

HTH

T-Bone
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux


* RE: [parisc-linux] SLAB bug SMP 64bit / XFS mess
  2005-04-03  1:34 Thibaut VARENE
@ 2005-04-06 12:17 ` Joel Soete
  0 siblings, 0 replies; 6+ messages in thread
From: Joel Soete @ 2005-04-06 12:17 UTC (permalink / raw)
  To: parisc-linux

Hello all,

> Hi pa-ckers
> 
> tried 2.6.12-rc1-pa9 64bit SMP (gcc 3.3.5) with DEBUG_SLAB enabled, got
> the following BUG:
> LBA version TR4.0 (0x5) found at 0xfffffffffed3c000
> kernel BUG at mm/slab.c:1495!
> Backtrace:
>  [<0000000010114480>] show_stack+0x60/0xf0

One thought about show_stack() (i.e. dump_stack()):

its code looks like:
void show_stack(struct task_struct *task, unsigned long *s)
{
        struct unwind_frame_info info;

        if (!task) {
                unsigned long sp;
                struct pt_regs *r;

HERE:
                asm volatile ("copy %%r30, %0" : "=r"(sp));
                r = (struct pt_regs *)kmalloc(sizeof(struct pt_regs), GFP_KERNEL);
                if (!r)
                        return;
...

while, with #if DEBUG set in slab, kmalloc will go through:
static inline kmem_cache_t *kmem_find_general_cachep(size_t size, int gfpflags)
{
        struct cache_sizes *csizep = malloc_sizes;

#if DEBUG
        /* This happens if someone tries to call
         * kmem_cache_create(), or __kmalloc(), before
         * the generic caches are initialized.
         */
        BUG_ON(csizep->cs_cachep == NULL);
#endif
...

(this special case happened to me when I tried to replace BUG_ON() with WARN_ON()
in kernel/posix-cpu-timers.c,
see <http://cvs.parisc-linux.org/linux-2.6/kernel/posix-cpu-timers.c?rev=1.3&view=markup>,
with the following:
Badness in run_posix_cpu_timers at /CAD/linux-2.6.12-rc1-pa9-050401/kernel/posi5
Backtrace:
 [<00000000101124e8>] dump_stack+0x18/0x28
 [<000000001015ed20>] run_posix_cpu_timers+0x168/0x1d0
 [<0000000010113648>] timer_interrupt+0xc8/0x178
 [<00000000101170b8>] real32_call+0xe0/0x110
 [<0000000010116624>] pdc_iodc_putc+0x9c/0x150
 [<00000000101686b4>] __do_IRQ+0x94/0x1e0
 [<0000000010116054>] pdc_tod_read+0xac/0x100
 [<0000000010116050>] pdc_tod_read+0xa8/0x100
 [<00000000103e6890>] proc_dodebug+0x0/0x2d0
 [<00000000103e64f0>] rpc_proc_register+0x0/0xb0
 [<00000000103e5de0>] rpc_unlink+0x0/0x1a8
 [<00000000103e5240>] rpc_depopulate+0x0/0x1f0
 [<00000000103e4cb0>] rpc_show_info+0x0/0xe8
 [<00000000103e4618>] rpc_alloc_inode+0x0/0x30
 [<00000000103e3ed0>] content_open+0x0/0xb0
 [<00000000103e35e0>] qword_addhex+0x0/0x110


Kernel Fault: Code=26 regs=000000001057cbe0 (Addr=0000000000000000)

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001000000000000001110 Not tainted
r00-03  0000000000000000 0000000010481e38 0000000010112680 000000000800000f
r04-07  00000000105c0b40 00000000101124e8 000000001057cb60 000000001057c2e0
r08-11  0000000000000000 00000000105c9b40 0000000000000000 00000000105d86a4
r12-15  0000000000000000 00000000ffffffff 0000000000000000 00000000f0400004
r16-19  000000001057c2e0 00000000f000017c 00000000f0000174 0000000010566230
r20-23  000000000800000e 00000000000003ea 00000000000003ea 000000000062977c
r24-27  ffffffffffffffff 00000000000000d0 0000000000000000 00000000105c0b40
r28-31  0000000000000066 000000001057cbb0 000000001057cbe0 0000000000000060
sr0-3   0000000000000000 0000000000000000 0000000000000000 0000000000000000
sr4-7   0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000010176d20 0000000010176d24
 IIR: 0f4010d5    ISR: 0000000000000000  IOR: 0000000000000000
 CPU:        0   CR30: 000000001057c000 CR31: 0000000010580000
 ORIG_R28: 0000000000000000   
 IAOQ[0]: kmem_cache_alloc+0x18/0x60
 IAOQ[1]: kmem_cache_alloc+0x1c/0x60
 RP(r2): show_stack+0x80/0xf0 
Kernel panic - not syncing: Kernel Fault

an objdump of slab shows:
0000000000000000 <kmem_cache_alloc>:
   0:   0f c2 12 c1     std rp,-10(,sp)
   4:   73 c4 01 08     std,ma r4,80(sp)
   8:   73 c3 3f 11     std r3,-78(sp)
   c:   db 39 0f e0     extrd,s r25,63,32,r25
  10:   00 01 0e 63     rsm 1,r3
  14:   37 dd 3f a1     ldo -30(sp),ret1 
  18:   0f 40 10 d5     ldd 0(,r26),r21    <====
  1c:   0e a0 10 94     ldw 0(,r21),r20
  20:   86 80 20 58     cmpib,= 0,r20,54 <kmem_cache_alloc+0x54>
)
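
The idea above can be modelled in userspace. The following is a toy sketch, not kernel code: every toy_* name is hypothetical, there are only two size classes, and it assumes the failure really is a kmalloc()-style call made before the general caches exist. It shows the two outcomes seen in this thread: with the DEBUG check the lookup hits the BUG_ON(), without it a NULL cache pointer is handed on and the first load through it faults (compare the marked ldd 0(,r26),r21 with r26 and IOR both zero in the dump).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the slab structures; NOT the kernel definitions. */
struct toy_cache { size_t objsize; };
struct toy_cache_sizes {
    size_t cs_size;
    struct toy_cache *cs_cachep;
};

static struct toy_cache toy_caches[2] = { { 64 }, { 512 } };

/* Size classes whose cache pointers are still NULL, as before slab init. */
static struct toy_cache_sizes toy_malloc_sizes[2] = {
    { 64, NULL }, { 512, NULL }
};

/* Plays the role of the boot-time step that creates the general caches. */
static void toy_init_caches(void)
{
    toy_malloc_sizes[0].cs_cachep = &toy_caches[0];
    toy_malloc_sizes[1].cs_cachep = &toy_caches[1];
}

enum toy_outcome { TOY_OK, TOY_BUG, TOY_NULL_DEREF };

/* What a kmalloc-style call would run into, with or without the DEBUG check. */
static enum toy_outcome toy_kmalloc_outcome(size_t size, int debug)
{
    struct toy_cache_sizes *csizep = toy_malloc_sizes;

    assert(size <= 512);                 /* toy model has only two classes */
    if (debug && csizep->cs_cachep == NULL)
        return TOY_BUG;                  /* BUG_ON(csizep->cs_cachep == NULL) */
    while (size > csizep->cs_size)
        csizep++;                        /* first class that fits */
    if (csizep->cs_cachep == NULL)
        return TOY_NULL_DEREF;           /* allocator would load through NULL */
    return TOY_OK;
}
```

Before toy_init_caches() runs, toy_kmalloc_outcome(200, 1) reports TOY_BUG and toy_kmalloc_outcome(200, 0) reports TOY_NULL_DEREF; afterwards both report TOY_OK.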

Hth,
    Joel



* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
       [not found] <200504090100.j3910CW0019308@hiauly1.hia.nrc.ca>
@ 2005-04-09 13:25 ` Joel Soete
  0 siblings, 0 replies; 6+ messages in thread
From: Joel Soete @ 2005-04-09 13:25 UTC (permalink / raw)
  To: John David Anglin; +Cc: James.Bottomley, parisc-linux, Thibaut VARENE



John David Anglin wrote:
>>It's also quite surprising the 32bit kernel is unaffected.
> 
well nothing new since: <http://lists.parisc-linux.org/pipermail/parisc-linux/2005-March/026055.html>

> 
> As a plug for my put_user patch, I recall that there is at least one
> xfs ioctl that involves putting a long int to userspace.
> 
Yes, IIRC that was (in 2.4) one of the reasons I asked Randolph for help introducing put_user_asm64() for the 32bit kernel.

Joel

* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
       [not found]     ` <20050409021049.0d0a851d@Tatooine.r3z0>
@ 2005-04-11  4:57       ` Grant Grundler
  2005-04-11  5:15         ` Grant Grundler
  0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2005-04-11  4:57 UTC (permalink / raw)
  To: Thibaut VARENE; +Cc: James Bottomley, PARISC list

On Sat, Apr 09, 2005 at 02:10:49AM +0200, Thibaut VARENE wrote:
> Just for the records:
>
> while trying to debug the XFS issue on my J6k, here what i noticed:
>
> 32bit kernel: no bug

Sorry, though my j6k did not panic with 32-bit kernel, there is
definitely some XFS bug(s?).  Test output and details follow.

grundler <511>uname -a
Linux gggj6k 2.6.12-rc2-pa1 #5 SMP Sun Apr 10 14:13:59 PDT 2005 parisc GNU/Linux

That's despite sym2 using I/O Port space:
	CONFIG_SCSI_SYM53C8XX_IOMAPPED=y

I got a lot (> 500 lines of scrollback) of the following on the console:

0x0: ed 41 00 00 00 10 00 00 a0 c4 59 42 4c ab 26 42
Filesystem "sdb3": XFS internal error xfs_da_do_buf(2) at line 2271 of file fs/4
Backtrace:
 [<1027d514>] xfs_da_do_buf+0x354/0x754
 [<1027d974>] xfs_da_read_buf+0x2c/0x38
 [<1028157c>] xfs_dir2_block_lookup_int+0x74/0x1fc
 [<1028146c>] xfs_dir2_block_lookup+0x1c/0xb8
 [<1027fd30>] xfs_dir2_lookup+0xc8/0x140
 [<102ad50c>] xfs_dir_lookup_int+0x50/0x118
 [<102b26a8>] xfs_lookup+0x68/0xac
 [<102c0894>] linvfs_lookup+0x60/0xa4
 [<1019e8d8>] __lookup_hash+0xc0/0xf8
 [<1019fa58>] lookup_create+0x68/0xcc
 [<1019ff1c>] sys_mkdir+0x78/0x134
 [<1010e178>] syscall_exit+0x0/0x14

And then the last console output was:
0x0: 58 44 32 44 0e 58 01 a8 07 c8 00 20 09 50 00 20
Filesystem "sdb3": XFS internal error xfs_dir2_block_addname at line 128 of fil4
Backtrace:
 [<10281124>] xfs_dir2_block_addname+0x6c0/0x6d4
 [<1027fc14>] xfs_dir2_createname+0x124/0x178
 [<102b3ee8>] xfs_mkdir+0x484/0x6e4
 [<102c07b8>] linvfs_mknod+0x1d0/0x210
 [<1019fe78>] vfs_mkdir+0x94/0xc0
 [<1019ff68>] sys_mkdir+0xc4/0x134
 [<1010e178>] syscall_exit+0x0/0x14

xfs_force_shutdown(sdb3,0x8) called from line 1091 of file fs/xfs/xfs_trans.c. c
Filesystem "sdb3": Corruption of in-memory data detected.  Shutting down filesy3
Please umount the filesystem, and rectify the problem(s)
SMP CALL FUNCTION TIMED OUT! (cpu=1), try 1


My test was "time rsync -aWx --stats / /mnt" with two disks:

grundler@gggj6k:~$ lsscsi
[1:0:5:0]    disk    HP       18.2GB C 80-D94N D94N  /dev/sda
[1:0:6:0]    disk    SEAGATE  ST318203LC       0001  /dev/sdb
grundler@gggj6k:~$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3             16911576  13625088   2427416  85% /
tmpfs                  1812796         0   1812796   0% /dev/shm
/dev/sda1                85528     28487     52625  36% /boot
/dev/sdb3             17171264  13671156   3500108  80% /mnt

Ext3 on sda3 and freshly made "mkfs.xfs -f /dev/sdb3":
	/dev/sda3 on / type ext3 (rw,errors=remount-ro)


The rsync complained *a lot* (> 500 lines of scrollback) about:
rsync: stat "/mnt/home/tftpboot/pa8800/var/spool/postfix/deferred/8" failed: Input/output error (5)

rsync stats output:

Number of files: 394227
Number of files transferred: 349625
Total file size: 12845404686 bytes
Total transferred file size: 12845269080 bytes
Literal data: 12845327899 bytes
Matched data: 0 bytes
File list size: 7360585
Total bytes sent: 12869390640
Total bytes received: 6992520

sent 12869390640 bytes  received 6992520 bytes  5379729.75 bytes/sec
total size is 12845404686  speedup is 1.00
rsync error: some files could not be transferred (code 23) at main.c(702)

real    39m52.694s
user    5m52.069s
sys     7m34.079s


hth,
grant

* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
  2005-04-11  4:57       ` Grant Grundler
@ 2005-04-11  5:15         ` Grant Grundler
  2005-04-11  5:49           ` Grant Grundler
  0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2005-04-11  5:15 UTC (permalink / raw)
  To: Thibaut VARENE; +Cc: James Bottomley, PARISC list

On Sun, Apr 10, 2005 at 10:57:47PM -0600, Grant Grundler wrote:
> Sorry, though my j6k did not panic with 32-bit kernel, there is
> definitely some XFS bug(s?).  Test output and details follow.

Just for comparison, I ran the same workload on ext3 and it worked fine.
Started with "mke2fs /dev/sdb3; mount /dev/sdb3 /mnt" and then ran
my pet test-of-the-week again:

root@gggj6k:~# time rsync -aWx --stats / /mnt

Number of files: 394228
Number of files transferred: 349626
Total file size: 12845690938 bytes
Total transferred file size: 12845555332 bytes
Literal data: 12845560371 bytes
Matched data: 0 bytes
File list size: 7360607
Total bytes sent: 12869623206
Total bytes received: 6992540

sent 12869623206 bytes  received 6992540 bytes  9908900.15 bytes/sec
total size is 12845690938  speedup is 1.00

real    21m39.937s
user    5m47.908s
sys     5m50.986s
root@gggj6k:~# 


"iostat 10" on another login reported pretty impressive numbers at
times, such as for this 30-second period (rsync says avg is 1/4th of this):

avg-cpu:  %user   %nice    %sys %iowait   %idle
          27.54    0.00   24.04   28.49   19.94

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             208.60     43289.60       100.80     432896       1008
sdb              46.40        22.40     40007.20        224     400072

avg-cpu:  %user   %nice    %sys %iowait   %idle
          28.96    0.00   24.06   40.77    6.20

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             212.30     45693.60        68.80     456936        688
sdb              48.40         4.00     35515.20         40     355152

avg-cpu:  %user   %nice    %sys %iowait   %idle
          30.70    0.00   23.45   37.30    8.55

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             180.20     48866.40        51.20     488664        512
sdb              55.40        23.20     37864.80        232     378648
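
To put the iostat figures and the rsync average in the same units, something like the following can be used. It is a rough sketch under two assumptions that are not stated in the output itself: that an iostat "block" is 512 bytes, and that the shell's "real" time is close enough to the elapsed time rsync divides by. The function names are made up for illustration.

```c
#include <assert.h>

/* Average rate the way rsync reports it: bytes on the wire (sent plus
 * received) divided by the elapsed time in seconds. */
static double rsync_rate(double sent, double received, double seconds)
{
    assert(seconds > 0.0);
    return (sent + received) / seconds;
}

/* Convert an iostat Blk_wrtn/s (or Blk_read/s) figure to bytes per second.
 * block_size is an assumption; 512 is the usual sector-sized block. */
static double iostat_rate(double blocks_per_sec, double block_size)
{
    assert(block_size > 0.0);
    return blocks_per_sec * block_size;
}
```

For this run, rsync_rate(12869623206.0, 6992540.0, 21 * 60 + 39.937) gives the average over the whole copy, and iostat_rate(40007.20, 512.0) gives the write rate to sdb during the first 10-second sample.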



hth,
grant

* Re: [parisc-linux] SLAB bug SMP 64bit / XFS mess
  2005-04-11  5:15         ` Grant Grundler
@ 2005-04-11  5:49           ` Grant Grundler
  0 siblings, 0 replies; 6+ messages in thread
From: Grant Grundler @ 2005-04-11  5:49 UTC (permalink / raw)
  To: PARISC list

On Sun, Apr 10, 2005 at 11:15:17PM -0600, Grant Grundler wrote:
> Just for comparison, I ran the same workload on ext3 and it worked fine.

Sorry, I meant ext2.


Here is the correct rsync output for ext3 on /dev/sdb3:

root@gggj6k:~# time rsync -aWx --stats / /mnt

Number of files: 394228
Number of files transferred: 349626
Total file size: 12845701275 bytes
Total transferred file size: 12845565669 bytes
Literal data: 12845570528 bytes
Matched data: 0 bytes
File list size: 7360229
Total bytes sent: 12869632989
Total bytes received: 6992540

sent 12869632989 bytes  received 6992540 bytes  8178231.52 bytes/sec
total size is 12845701275  speedup is 1.00

real    26m14.078s
user    5m47.767s
sys     7m11.995s
root@gggj6k:~# 


grant
