* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  [not found] <20090107165218.GA11132@dth.net>
@ 2009-01-07 18:02 ` Christoph Hellwig
  2009-01-07 18:24 ` Danny ter Haar
  2009-01-14 19:44 ` Tino Keitel
  1 sibling, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:02 UTC (permalink / raw)
  To: dth; +Cc: xfs

On Wed, Jan 07, 2009 at 05:52:18PM +0100, dth wrote:
> Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs)
> server. (uses about 20 watts when both storage drives are in powersave mode)
>
> OS = debian lenny
>
> Primary drive (60gig 2.5" pata disk using libata = sda)
> 512MB ext3 partition as /boot
> swap, rest of the drive is XFS root file system.
> 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus)
> sdb & sdc which have 1 big XFS partition.
>
> Any kernel i try to boot after 2.6.28-git2 (i tried git4-9)
> sooner or later gives me an XFS error:

The xlog_recover_process_iunlinks code changed recently.  Do you still
have the exact kernel tree and config around, so you can use gdb to see
which line of code the oops is at?

> [ 21.535820] XFS mounting filesystem sdb1
> [ 21.646126] Starting XFS recovery on filesystem: sdb1 (logdev: internal)
> [ 21.747772] XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file
> fs/xfs/xfs_btree.c.
Caller 0xc01e9d71 > [ 21.747916] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1 > [ 21.748033] Call Trace: > [ 21.748130] [<c01e98b9>] xfs_btree_delrec+0x5b9/0xa50 > [ 21.748247] [<c01e9d71>] xfs_btree_delete+0x21/0x70 > [ 21.748339] [<c01e72ff>] xfs_btree_read_buf_block+0x4f/0x70 > [ 21.748462] [<c01e5dda>] xfs_bmbt_init_key_from_rec+0xa/0x20 > [ 21.748556] [<c01e66d5>] xfs_lookup_get_search_key+0x25/0x40 > [ 21.748680] [<c01e9d71>] xfs_btree_delete+0x21/0x70 > [ 21.748771] [<c01e2ad5>] xfs_bmap_del_extent+0x2f5/0x960 > [ 21.748894] [<c01e383e>] xfs_bunmapi+0x5be/0x980 > [ 21.749000] [<c01fba82>] xfs_itruncate_finish+0x1c2/0x2f0 > [ 21.749137] [<c0210852>] xfs_inactive+0x1d2/0x3d0 > [ 21.749228] [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0 > [ 21.749354] [<c01775bc>] clear_inode+0x5c/0xb0 > [ 21.749443] [<c0177a8c>] generic_delete_inode+0x6c/0xc0 > [ 21.749559] [<c0177157>] iput+0x47/0x50 > [ 21.749644] [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0 > [ 21.749768] [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0 > [ 21.749863] [<c020584f>] xlog_recover_finish+0x3f/0x90 > [ 21.749980] [<c0209390>] xfs_mountfs+0x450/0x550 > [ 21.750126] [<c02122e6>] kmem_alloc+0x56/0xb0 > [ 21.750216] [<c020995d>] xfs_mru_cache_create+0xdd/0x120 > [ 21.750344] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0 > [ 21.750441] [<c016b608>] get_sb_bdev+0xe8/0x130 > [ 21.750556] [<c0219602>] xfs_fs_get_sb+0x12/0x20 > [ 21.750646] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0 > [ 21.750760] [<c016a949>] vfs_kern_mount+0x39/0x80 > [ 21.750850] [<c016a9cf>] do_kern_mount+0x2f/0xc0 > [ 21.750972] [<c017b3e5>] do_mount+0x5a5/0x5f0 > [ 21.751061] [<c016c35f>] sys_stat64+0xf/0x30 > [ 21.751186] [<c0150271>] __get_free_pages+0x11/0x30 > [ 21.751280] [<c017b496>] sys_mount+0x66/0xa0 > [ 21.751394] [<c0102c72>] syscall_call+0x7/0xb > [ 21.751494] Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c. 
Caller 0xc021086b > [ 21.751514] > [ 21.751718] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1 > [ 21.751799] Call Trace: > [ 21.751893] [<c020b586>] xfs_trans_cancel+0x46/0xd0 > [ 21.751986] [<c021086b>] xfs_inactive+0x1eb/0x3d0 > [ 21.752102] [<c021086b>] xfs_inactive+0x1eb/0x3d0 > [ 21.752192] [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0 > [ 21.752307] [<c01775bc>] clear_inode+0x5c/0xb0 > [ 21.752396] [<c0177a8c>] generic_delete_inode+0x6c/0xc0 > [ 21.752512] [<c0177157>] iput+0x47/0x50 > [ 21.752596] [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0 > [ 21.752719] [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0 > [ 21.752815] [<c020584f>] xlog_recover_finish+0x3f/0x90 > [ 21.752930] [<c0209390>] xfs_mountfs+0x450/0x550 > [ 21.753022] [<c02122e6>] kmem_alloc+0x56/0xb0 > [ 21.753136] [<c020995d>] xfs_mru_cache_create+0xdd/0x120 > [ 21.753232] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0 > [ 21.753348] [<c016b608>] get_sb_bdev+0xe8/0x130 > [ 21.753439] [<c0219602>] xfs_fs_get_sb+0x12/0x20 > [ 21.753555] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0 > [ 21.753645] [<c016a949>] vfs_kern_mount+0x39/0x80 > [ 21.753760] [<c016a9cf>] do_kern_mount+0x2f/0xc0 > [ 21.753852] [<c017b3e5>] do_mount+0x5a5/0x5f0 > [ 21.753965] [<c016c35f>] sys_stat64+0xf/0x30 > [ 21.754054] [<c0150271>] __get_free_pages+0x11/0x30 > [ 21.754172] [<c017b496>] sys_mount+0x66/0xa0 > [ 21.754258] [<c0102c72>] syscall_call+0x7/0xb > [ 21.754378] xfs_force_shutdown(sdb1,0x8) called from line 1165 of file fs/xfs/xfs_trans.c. Return address = 0xc020b59c > [ 21.754522] Filesystem "sdb1": Corruption of in-memory data detected. 
Shutting down filesystem: sdb1 > [ 21.754661] Please umount the filesystem, and rectify the problem(s) > [ 21.754814] BUG: unable to handle kernel NULL pointer dereference at 00000054 > [ 21.754990] IP: [<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0 > [ 21.755137] *pde = 00000000 > [ 21.755263] Oops: 0000 [#1] > [ 21.755385] last sysfs file: /sys/class/net/lo/operstate > [ 21.755490] Modules linked in: i2c_viapro i2c_core via_agp agpgart sata_promise sd_mod > [ 21.755906] > [ 21.755976] Pid: 1392, comm: mount Not tainted (2.6.28-git8 #1) EPIA > [ 21.756091] EIP: 0060:[<c02057bf>] EFLAGS: 00010296 CPU: 0 > [ 21.756181] EIP is at xlog_recover_process_iunlinks+0x8f/0xe0 > [ 21.756293] EAX: 00000000 EBX: de827f00 ECX: 00000005 EDX: df231000 > [ 21.756384] ESI: ffffffff EDI: df231000 EBP: 00000026 ESP: df383e14 > [ 21.756501] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 > [ 21.756592] Process mount (pid: 1392, ti=df382000 task=deb37b60 task.ti=df382000) > [ 21.756712] Stack: > [ 21.756792] df383e24 00000026 00000003 00000000 00000000 df2572e0 00000000 00000003 > [ 21.757192] df231000 c020584f 00000003 00000000 00000000 c0209390 00000017 00000000 > [ 21.757691] 08000004 00000000 00000000 00000001 00000058 00000000 c02122e6 00000001 > [ 21.758243] Call Trace: > [ 21.758327] [<c020584f>] xlog_recover_finish+0x3f/0x90 > [ 21.758476] [<c0209390>] xfs_mountfs+0x450/0x550 > [ 21.758615] [<c02122e6>] kmem_alloc+0x56/0xb0 > [ 21.758756] [<c020995d>] xfs_mru_cache_create+0xdd/0x120 > [ 21.758908] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0 > [ 21.759059] [<c016b608>] get_sb_bdev+0xe8/0x130 > [ 21.759200] [<c0219602>] xfs_fs_get_sb+0x12/0x20 > [ 21.759350] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0 > [ 21.759500] [<c016a949>] vfs_kern_mount+0x39/0x80 > [ 21.759647] [<c016a9cf>] do_kern_mount+0x2f/0xc0 > [ 21.759794] [<c017b3e5>] do_mount+0x5a5/0x5f0 > [ 21.759934] [<c016c35f>] sys_stat64+0xf/0x30 > [ 21.760103] [<c0150271>] __get_free_pages+0x11/0x30 > [ 21.760255] [<c017b496>] 
sys_mount+0x66/0xa0
> [ 21.760396] [<c0102c72>] syscall_call+0x7/0xb
> [ 21.760535] Code: e8 e7 f6 00 00 55 89 f1 8b 54 24 04 89 f8 e8 a9 fe ff ff 89 c6 8d 44 24 0c 50 8b 4c 24 08 31 d2 [ 27.297839] NET: Registered protocol family 10
> [ 38.240089] eth0: no IPv6 routers present
> [ 38.995204] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> [ 38.995468] NFSD: starting 90-second grace period
>
>
> When i reboot to git2 again and run a xfs_check and/or xfs_repair nothing shows up.
> It just works without problems. So i think it's not actually an XFS problem but
> something "underneath"
> It's slow hardware and i have been reading about locking situations, which could be
> triggered easier on my (slow) hardware than any modern (faster) hardware.
>
> Anyone else bitten by the same problem ?
>
> Danny
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
---end quoted text---

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
@ 2009-01-07 18:24 ` Danny ter Haar
  2009-01-07 18:31 ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> > Any kernel i try to boot after 2.6.28-git2 (i tried git4-9)
> > sooner or later gives me an XFS error:
> The xlog_recover_process_iunlinks code changed recently.  Do you still
> have the exact kernel tree and config around, so you can use gdb to see
> which line of code the oops is at?

Since compiling a kernel on the native hardware takes "forever"
i compile them on my laptop (ubuntu 32bits) and move the kernel
to the NAS.
This morning i compiled and tried git9 (still an error)

I have copied the config & System.map file to the same dir on
my server ( http://www.dth.net/kernel/c3/ )

I'm not familiar with the debugging, do you have a pointer where
i could find how to do this ?
In the mean time i'll try and google some info on how...

I could copy the whole source tree over to the machine and give you
(root) access to the machine so you can take a look yourself (if needed).

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:24 ` Danny ter Haar
@ 2009-01-07 18:31 ` Christoph Hellwig
  2009-01-07 18:44 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:31 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:24:15PM +0100, Danny ter Haar wrote:
> Since compiling a kernel on the native hardware takes "forever"
> i compile them on my laptop (ubuntu 32bits) and move the kernel
> to the NAS.
> This morning i compiled and tried git9 (still an error)

An error when compiling the kernel, or did you get the same error again?

In case you can actually reproduce this error, I would be very
interested in a metadump of the filesystem that causes this error.

> I have copied the config & System.map file to the same dir on
> my server ( http://www.dth.net/kernel/c3/ )
>
>
> I'm not familiar with the debugging, do you have a pointer where
> i could find how to do this ?
> In the mean time i'll try and google some info on how...
>
> I could copy the whole source tree over to the machine and give you
> (root) access to the machine so you can take a look yourself (if needed).

The debugging should be really easy as long as you actually have a tree /
config with which the oops happened, so that the oops has the same
addresses.  Compared to your config we would need CONFIG_DEBUG_INFO,
but that's something we could turn on after the fact as it shouldn't
change the reported address.  After that we can just do a trivial
command in gdb to find the source lines for the addresses reported
(this can easily be done on the system where you compile the kernel,
doesn't have to be on the NAS box).

Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
in case you can reproduce this issue.  Just make sure to turn it
off again when you want to use the box for a real workload.
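A minimal sketch of the gdb lookup described above, assuming a vmlinux built from the same tree and config as the kernel that oopsed, with CONFIG_DEBUG_INFO enabled. The helper name oops_to_gdb and the file names oops.txt and oops.gdb are made up for illustration; the only gdb features relied on are the "list *ADDRESS" form and batch mode. The helper turns every backtrace entry of the form "[<address>] symbol+0xoff/0xlen" into one gdb command.

```shell
#!/bin/sh
# oops_to_gdb (hypothetical helper): read oops/backtrace text on stdin and
# print one gdb "list *ADDRESS" command per "[<addr>] sym+0xoff/0xlen" entry.
oops_to_gdb() {
    sed -n 's|.*\[<\([0-9a-f]*\)>\] [A-Za-z0-9_.]*+0x[0-9a-f]*/0x[0-9a-f]*.*|list *0x\1|p'
}

# Example with the faulting instruction pointer from the first report.
printf '%s\n' '[ 21.754990] IP: [<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0' | oops_to_gdb
# prints: list *0xc02057bf

# On the build machine (not run here, needs the matching vmlinux):
#   oops_to_gdb < oops.txt > oops.gdb
#   gdb -batch -x oops.gdb vmlinux
```

gdb then prints the source file and line around each address, which only means something if the addresses in the oops come from that exact build.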
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:31 ` Christoph Hellwig
@ 2009-01-07 18:44 ` Danny ter Haar
  2009-01-07 18:52 ` Christoph Hellwig
  2009-01-07 18:56 ` Christoph Hellwig
  0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:44 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> An error when compiling the kernel, or did you get the same error again?

same error while trying to boot (this time on sda (pata with libata))
root filesystem of the OS partition.

> In case you can actually reproduce this error, I would be very
> interested in a metadump of the filesystem that causes this error.

See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt

> The debugging should be really easy as long as you actually have a tree /
> config

kernel tree is still available on my laptop.

> with which the oops happened, so that the oops has the same
> addresses.
> Compared to your config we would need CONFIG_DEBUG_INFO,
> but that's something we could turn on after the fact as it shouldn't
> change the reported address.  After that we can just do a trivial
> command in gdb to find the source lines for the addresses reported
> (this can easily be done on the system where you compile the kernel,
> doesn't have to be on the NAS box).

ok, pointer to a howto by any chance ?
still looking/googling. [think i found something]

> Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
> in case you can reproduce this issue.  Just make sure to turn it
> off again when you want to use the box for a real workload.

Will compile a new kernel in the mean time with those 2 options:
CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:44 ` Danny ter Haar
@ 2009-01-07 18:52 ` Christoph Hellwig
  2009-01-07 22:09 ` Danny ter Haar
  2009-01-08 0:38 ` Danny ter Haar
  2009-01-07 18:56 ` Christoph Hellwig
  1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:52 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Will compile a new kernel in the mean time with those 2 options:
> CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO

Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
be useful if you could reproduce it, but it does change the addresses,
so it's harmful for the case of getting back to the source line.
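Before starting a 45-minute build it may be worth checking that the .config really has only the one option switched on. The following is a small sketch of such a check; the function name check_debug_config is hypothetical, and it relies only on the usual .config syntax (CONFIG_FOO=y for enabled options).

```shell
#!/bin/sh
# check_debug_config (hypothetical helper): succeed only if the given kernel
# config enables CONFIG_DEBUG_INFO and leaves CONFIG_XFS_DEBUG off, so the
# rebuilt vmlinux keeps the addresses reported in the original oops.
check_debug_config() {
    cfg=$1
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg"
        return 2
    fi
    if ! grep -q '^CONFIG_DEBUG_INFO=y$' "$cfg"; then
        echo "CONFIG_DEBUG_INFO is not enabled"
        return 1
    fi
    if grep -q '^CONFIG_XFS_DEBUG=y$' "$cfg"; then
        echo "CONFIG_XFS_DEBUG is enabled: addresses will not match the oops"
        return 1
    fi
    echo "ok: CONFIG_DEBUG_INFO only"
}

# Usage, from the top of the kernel tree:
#   check_debug_config .config && make
```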
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:52 ` Christoph Hellwig
@ 2009-01-07 22:09 ` Danny ter Haar
  2009-01-08 0:38 ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 22:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.

Hmm, one bug is never alone...
During compile my laptop started smelling because of heat buildup and
switched itself off.
I re-compiled the kernel on a debian lenny32 box i have access to.
I only added the "CONFIG_DEBUG_INFO" option.
Rebooted the nas, and so far the machine is running without a problem :-(

Am now gonna compile a kernel on the machine itself, to force a lot of
disk activity.

Will keep you posted.
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:52 ` Christoph Hellwig
  2009-01-07 22:09 ` Danny ter Haar
@ 2009-01-08 0:38 ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08 0:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> > Will compile a new kernel in the mean time with those 2 options:
> > CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
>
> Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.
>

Okay, it barfed again:

Let me first explain that i've set the sleep timer for the storage
harddrives to 24 (x5 = 120 sec) in hdparm.conf:

/dev/sdb {
        spindown_time = 24
}

/dev/sdc {
        spindown_time = 24
}

This is the error (snapped using netconsole)
http://www.dth.net/kernel/c3/netconsole_2.6.28-git9d-error.txt

I rebooted again with the git2 kernel and xfs_check gave no errors:

filer1:~# xfs_check /dev/sdb1
filer1:~#
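The "24 (x5 = 120 sec)" arithmetic above follows the encoding of the hdparm -S standby value, which spindown_time in hdparm.conf is assumed to map to: 0 disables the timeout, 1 to 240 count units of 5 seconds, and 241 to 251 count units of 30 minutes. A small sketch of that conversion (the function name spindown_seconds is made up, and the vendor-specific values above 251 are not decoded):

```shell
#!/bin/sh
# spindown_seconds (hypothetical helper): convert an hdparm -S style standby
# value into seconds.  Only the ranges 0, 1-240 and 241-251 are decoded.
spindown_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo "disabled"
    elif [ "$v" -le 240 ]; then
        echo $((v * 5))                # units of 5 seconds
    elif [ "$v" -le 251 ]; then
        echo $(((v - 240) * 1800))     # units of 30 minutes
    else
        echo "special"
    fi
}

spindown_seconds 24
# prints: 120
```

So both storage drives go to standby after two minutes without I/O, which means they spin up and down often on a lightly used NAS.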
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:44 ` Danny ter Haar
  2009-01-07 18:52 ` Christoph Hellwig
@ 2009-01-07 18:56 ` Christoph Hellwig
  2009-01-07 19:01 ` Danny ter Haar
  2009-01-08 21:56 ` Danny ter Haar
  1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:56 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > An error when compiling the kernel, or did you get the same error again?
>
> same error while trying to boot (this time on sda (pata with libata))
> root filesystem of the OS partition.
>
> > In case you can actually reproduce this error, I would be very
> > interested in a metadump of the filesystem that causes this error.
>
> See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt

This looks like pretty much the same corruption you saw in your first
report.  Did you run xfs_repair before this one?
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:56 ` Christoph Hellwig
@ 2009-01-07 19:01 ` Danny ter Haar
  2009-01-08 21:56 ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 19:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> This looks like pretty much the same corruption you saw in your first
> report.  Did you run xfs_repair before this one?

Yes! and with git2: no problem.

The git8 error is on the storage drive(s) sdb1/sdc1 while the git9
is on the /dev/sda6 partition (containing the root filesystem)

But it's completely repeatable.

Compiling takes about 45 minutes on my laptop, will report later
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:56 ` Christoph Hellwig
  2009-01-07 19:01 ` Danny ter Haar
@ 2009-01-08 21:56 ` Danny ter Haar
  2009-01-09 0:46 ` Dave Chinner
  1 sibling, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08 21:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt

I re-compiled git2 (with debug support) and <knock wood> so far it is stable:

# procinfo
Memory:        Total        Used        Free     Buffers
RAM:          507116      258048      249068         428
Swap:         497972           0      497972

Bootup: Thu Jan 8 12:00:04 2009    Load average: 0.10 0.37 0.19 2/104 2767

user  :   00:02:55.96    2.6%    page in :   183385
nice  :   00:00:00.00    0.0%    page out:   497130
system:   00:00:47.36    0.7%    page act:    23771
IOwait:   00:00:28.21    0.4%    page dea:        0
hw irq:   00:00:02.10    0.0%    page flt:   698907
sw irq:   00:00:03.37    0.0%    swap in :        0
idle  :   01:48:06.90   96.2%    swap out:        0
uptime:   01:52:23.93            context :   323529

irq  0:   293258  timer          irq 10:   112849  eth0
irq  1:        8  i8042          irq 11:    42028  sata_promise
irq  2:        0  cascade        irq 12:        0  uhci_hcd:usb1, uh
irq  5:        0  acpi           irq 14:     7629  pata_via
irq  7:        1  parport0       irq 15:        0  pata_via

sda     2740r    3576w           sdb     1037r     325w
sda1      70r       2w           sdb1    1024r     325w
sda2       2r       0w           sdc     2615r   14257w
sda5      25r       0w           sdc1    2602r   14257w
sda6    2636r    3574w

lo      TX 5.68MiB    RX 5.68MiB     eth0    TX 86.38MiB    RX 10.32MiB

-------------------

So (in my case) something while going from git2 -> git3 didn't go positive.

If i could get some explanation of how to use gdb on the vmlinux image to
determine in what function the git3 barfed, i would be able to give more
info/feedback.

I looked in the Documentation folder in the kernel source, i looked at the
examples given there but i really don't understand how/what.

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-08 21:56 ` Danny ter Haar
@ 2009-01-09 0:46 ` Dave Chinner
  2009-01-09 1:26 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09 0:46 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Thu, Jan 08, 2009 at 10:56:02PM +0100, Danny ter Haar wrote:
>
> I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
> It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt

Looking at this, I think there are two possibilities in terms of the
problem being detected. We are modifying the inode BMBT here,
so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
trigger has occurred because a xfs_btree_increment() call has
returned a zero status. This means we failed here:

1324         /* Fail if we just went off the right edge of the tree. */
1325         xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
1326         if (xfs_btree_ptr_is_null(cur, &ptr))
1327                 goto out0;

or here:

1351         /*
1352          * If we went off the root then we are either seriously
1353          * confused or have the tree root in an inode.
1354          */
1355         if (lev == cur->bc_nlevels) {
1356                 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
1357                         goto out0;
1358                 ASSERT(0);

i.e. we either fell off the right edge of the tree or went over the top
of it.

I can't really see how we've done either of those things unless the
tree has been corrupted by a prior operation.

Given that each time it is aptitude that is causing the problem, can you
prevent aptitude from running automatically on boot and run it manually?
If you can reproduce the problem manually then we can move on to the
next step....

> So (in my case) something while going from git2 -> git3 didn't go positive.

That would have been when Linus did the XFS pull...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 0:46 ` Dave Chinner
@ 2009-01-09 1:26 ` Danny ter Haar
  2009-01-09 2:08 ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 1:26 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> Looking at this, I think there are two possibilities in terms of the
> problem being detected. We are modifying the inode BMBT here,
> so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
> trigger has occurred because a xfs_btree_increment() call has
> returned a zero status. This means we failed here:
>
> 1324         /* Fail if we just went off the right edge of the tree. */
> 1325         xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
> 1326         if (xfs_btree_ptr_is_null(cur, &ptr))
> 1327                 goto out0;
>
> or here:
>
> 1351         /*
> 1352          * If we went off the root then we are either seriously
> 1353          * confused or have the tree root in an inode.
> 1354          */
> 1355         if (lev == cur->bc_nlevels) {
> 1356                 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
> 1357                         goto out0;
> 1358                 ASSERT(0);
>
> i.e. we either fell off the right edge of the tree or went over the top
> of it.

> I can't really see how we've done either of those things unless the
> tree has been corrupted by a prior operation.

sounds logical.
First time when it happened i moved the primary hd to sec ide connector,
connected a separate hard drive as new master, installed a fresh debian
lenny on that harddrive, ran xfs_repair on all xfs filesystems: no errors

> Given that each time it is aptitude that is causing the problem, can you
> prevent aptitude from running automatically on boot and run it manually?
> If you can reproduce the problem manually then we can move on to the
> next step....

I wasn't clear (obviously).
Besides being my NAS, this machine is also my apt-cacher-ng server for all
my other machines here at home.
The easiest way to trigger the error is often by running a simple
"aptitude update; aptitude -d dist-upgrade"
So when it barfed i did the aptitude by hand.
And it checks everything from the cache at /var/cache/apt-cacher-ng
which is on sda6 (root filesystem on XFS)

So it doesn't "barf" right on boot, it takes a few minutes or even hours:

filer1:~# last -20 reboot
reboot   system boot  2.6.28-git2-d  Thu Jan 8 12:00 - current(05:18)
reboot   system boot  2.6.28-git3-d  Thu Jan 8 11:31 - 11:59 (00:27)
reboot   system boot  2.6.28-git3-d  Thu Jan 8 10:56 - 11:59 (01:02)
reboot   system boot  2.6.28-git3-d  Thu Jan 8 10:44 - 10:54 (00:10)
reboot   system boot  2.6.28-git3-d  Thu Jan 8 10:30 - 10:43 (00:12)
reboot   system boot  2.6.28-git2    Wed Jan 7 15:08 - 10:28 (19:19)
reboot   system boot  2.6.28-git9-d  Wed Jan 7 12:29 - 14:58 (02:29)
reboot   system boot  2.6.28-git2    Wed Jan 7 10:08 - 12:27 (02:19)
reboot   system boot  2.6.28-git9    Wed Jan 7 09:21 - 10:06 (00:45)
reboot   system boot  2.6.28-git9    Wed Jan 7 08:42 - 10:06 (01:24)
reboot   system boot  2.6.28-git2    Tue Jan 6 21:45 - 08:40 (10:55)
reboot   system boot  2.6.28-git4    Tue Jan 6 21:27 - 08:40 (11:13)
reboot   system boot  2.6.28-git4    Tue Jan 6 21:22 - 08:40 (11:18)

Sometimes the kernel barfs while accessing /dev/sdb1 or /dev/sdc1, which
are only accessed using samba.

I can once more install the "other" debian lenny harddrive, boot from
there and then manually do an xfs_repair on the xfs filesystems.
I can then boot a kernel that is known to barf and try to get it to barf.

> > So (in my case) something while going from git2 -> git3 didn't go positive.
> That would have been when Linus did the XFS pull...

Do you want me to figure out what patch from git2->git3 is the culprit ?
I'll have to compile/reboot for a while.

Tell me what else i can do to resolve this.

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 1:26 ` Danny ter Haar
@ 2009-01-09 2:08 ` Dave Chinner
  2009-01-09 6:10 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09 2:08 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 02:26:10AM +0100, Danny ter Haar wrote:
> Do you want me to figure out what patch from git2->git3 is the culprit ?
> I'll have to compile/reboot for a while.

If you can do that, it would be *greatly* appreciated.

> Tell me what else i can do to resolve this.

If you can isolate the commit that introduced the bug, then
code review is probably the step after that....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 2:08 ` Dave Chinner
@ 2009-01-09 6:10 ` Danny ter Haar
  2009-01-09 19:44 ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 6:10 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> > Do you want me to figure out what patch from git2->git3 is the culprit ?
> > I'll have to compile/reboot for a while.
> If you can do that, it would be *greatly* appreciated.

Need a little bit of help here.
(i'm a hardware guy with just the bare essential software knowhow)

I found
http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
(used that one before) but it doesn't seem to know the "-gitXX" kernel trees.

Q: how do i get a list of available kernel versions (including -gitXX branches) ?

My intention is to say something like:

# git bisect bad v2.6.28-git3
# git bisect good v2.6.28-git2

and take it from there.

I've been googling now for nearly 2 hours..

Any help appreciated

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 6:10 ` Danny ter Haar
@ 2009-01-09 19:44 ` Christoph Hellwig
  2009-01-09 19:51 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:44 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> # git bisect bad v2.6.28-git3
> # git bisect good v2.6.28-git2
>
> and take it from there.

Yes.  Then it gives you a revision to test.  Once you have tested it,
mark it with git bisect good or git bisect bad to get the next one, or
use git bisect skip if the current version can't be tested for some
other reason (doesn't boot / compile).
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:44 ` Christoph Hellwig
@ 2009-01-09 19:51 ` Danny ter Haar
  2009-01-09 19:58 ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 19:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> Yes. Then it gives you a revision to test and you can use

I'm still not making myself clear (sorry for that ;-) )
i'm not able to "access" the 2.6.28-gitXX repository for some reason.
in other words:

how do i "include" the 2.6.28-git repository in git ?

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:51 ` Danny ter Haar
@ 2009-01-09 19:58 ` Christoph Hellwig
  2009-01-09 21:42 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:58 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 08:51:44PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> > Yes. Then it gives you a revision to test and you can use
>
> I'm still not making myself clear (sorry for that ;-) )
> i'm not able to "access" the 2.6.28-gitXX repository for some reason.
> in other words:
>
> how do i "include" the 2.6.28-git repository in git ?

Oh, sorry.  There is no specific 2.6.28-git repository, just a main
linux 2.6 one:

	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

The git bisect command makes sure you will get the right revisions
checked out to bisect.

>
> Danny
---end quoted text---
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:58 ` Christoph Hellwig
@ 2009-01-09 21:42 ` Danny ter Haar
  2009-01-09 22:01 ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 21:42 UTC
To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> There is no specific 2.6.28-git repository, just a main linux 2.6 one:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> The git bisect command makes sure you will get the right revisions
> checked out to bisect.

Still not making myself clear, I guess..

lenny32:/usr/src/linux-2.6# git tag|grep 2.6.28
v2.6.28
v2.6.28-rc1
v2.6.28-rc2
v2.6.28-rc3
v2.6.28-rc4
v2.6.28-rc5
v2.6.28-rc6
v2.6.28-rc7
v2.6.28-rc8
v2.6.28-rc9

So I cannot use any "standard" git command to also include Linus's -git
snapshots?! That seems odd. He committed them somehow in git, didn't he?

I could of course compile/patch 2 kernel trees by hand.
In fact I already have:

lenny32:/usr/src# ls -l
total 16
drwxr-sr-x  2 root src   146 2009-01-09 08:29 archive
drwxr-sr-x  2 root src  4096 2009-01-08 11:33 configs
drwxr-sr-x 23 root src  4096 2009-01-08 19:56 linux-2.6
drwxr-xr-x 24 root root 4096 2009-01-08 11:53 linux-2.6.28-git2-d
drwxr-xr-x 24 root root 4096 2009-01-08 10:21 linux-2.6.28-git3-d

But I'm not familiar with how to instruct git to find the differences
between the 2 source trees, or how to use git to find out what the
culprit is.

Any other suggestions?
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 21:42 ` Danny ter Haar
@ 2009-01-09 22:01 ` Christoph Hellwig
  2009-01-09 22:23 ` Danny ter Haar
  2009-01-13 20:04 ` Danny ter Haar
  0 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 22:01 UTC
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 10:42:06PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > There is no specific 2.6.28-git repository, just a main linux 2.6 one:
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > The git bisect command makes sure you will get the right revisions
> > checked out to bisect.
>
> Still not making myself clear, I guess..

Or me being stupid, sorry :)

The -gitN thingies aren't official tags, but unofficial snapshots done
by Dave Woodhouse. He has a mapping on

    http://www.kernel.org/pub/linux/kernel/people/dwmw2/snapshot-tags/

from these names to actual commit IDs. So 2.6.28-git2 would be
3c92ec8ae91ecf59d88c798301833d7cf83f2179 and 2.6.28-git3
6a94cb73064c952255336cc57731904174b2c58f, so use

    git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
    git bisect bad 6a94cb73064c952255336cc57731904174b2c58f

Sorry for the confusion..
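[Editor's sketch] The good/bad marking described above can be tried out safely on a throwaway repository before spending hours on kernel builds. The script below is only a toy under stated assumptions: the repository, the file names `state` and `counter`, and the "regression" in commit 6 are all made up for illustration. It uses `git bisect run` because the toy test is a one-line check; a kernel that has to be built, booted and exercised by hand would instead be marked step by step with `git bisect good` or `git bisect bad`, as in the thread.

```shell
#!/bin/sh
# Toy demonstration: bisect a throwaway repository, not the kernel tree.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "bisect@example.invalid"
git config user.name "bisect demo"

# Create 8 commits; the (hypothetical) regression appears in commit 6.
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 6 ]; then echo broken > state; else echo ok > state; fi
    echo "$i" > counter          # guarantees every commit changes something
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)   # the very first commit
bad=$(git rev-parse HEAD)                   # the newest commit

git bisect start
git bisect bad "$bad"
git bisect good "$good"

# Exit status 0 marks the checked-out commit good, 1 marks it bad.
git bisect run sh -c 'grep -q ok state'

# After the run, refs/bisect/bad points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$first_bad"

git bisect reset
```

On a real tree the two commit IDs from the message above would take the place of `$good` and `$bad`; `git bisect log` shows the decisions made so far and `git bisect reset` returns to the original checkout.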
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 22:01 ` Christoph Hellwig
@ 2009-01-09 22:23 ` Danny ter Haar
  2009-01-13 20:04 ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 22:23 UTC
To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
> git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
> Sorry for the confusion..

*THANKS*

Bisecting: 955 revisions left to test after this

;-)

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2 2009-01-09 22:01 ` Christoph Hellwig 2009-01-09 22:23 ` Danny ter Haar @ 2009-01-13 20:04 ` Danny ter Haar 2009-01-16 20:43 ` Danny ter Haar 1 sibling, 1 reply; 27+ messages in thread From: Danny ter Haar @ 2009-01-13 20:04 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs I'm stuck in trying to bisect the problem. I restarted from scratch and same result: Here is what i did: lenny32:/usr/src# git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git Initialized empty Git repository in /usr/src/linux-git/.git/ remote: Counting objects: 1057520, done. remote: Compressing objects: 100% (173057/173057), done. Resolving deltas: 100% (881801/881801), done. Checking out files: 100% (26544/26544), done. # git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179 # git bisect bad 6a94cb73064c952255336cc57731904174b2c58f Bisecting: 955 revisions left to test after this [5ed1836814d908f45cafde0e79cb85314ab9d41d] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 The EXTRAVERSION in the makefile at this point was "plain" 2.6.28 (which seems odd to me) I renamed it to "-a1" and compiled kernel. Installed/rebooted. 
after 16 hours (overnight) of no troubles i think it seemed stable: reboot system boot 2.6.28-a1 Fri Jan 9 16:54 - 09:29 (16:34) # git bisect good Bisecting: 477 revisions left to test after this M scripts/package/Makefile D scripts/package/builddeb [7b2cd079ec8dcc65cdca6621245cfa5e30a8ef9f] V4L/DVB (10007): gspca - m5602: Refactor the error handling in the s5k83a version "-a2" was branded "ok" by me after 3 hours of heavy use: reboot system boot 2.6.28-a2 Sat Jan 10 09:31 - 12:37 (03:06) # git bisect good Bisecting: 262 revisions left to test after this [6094c85a935f7eadb4c607c6dc6d86c0a9f09a4b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Version "-a3" also got my blessing: reboot system boot 2.6.28-a3 Sat Jan 10 12:39 - 17:55 (05:15) # git bisect good Bisecting: 131 revisions left to test after this [a1941895034cda2bffa23ba845607c82138ccf52] [XFS] remove dead code for old inode item recovery This time the Makefile EXTRAVERSION changed to "-RC6" I was/am under the impression that i'm testing between 2.6.28-git2 and 2.6.28-git3 So why is the Makefile going this way back ? dazzled & confused. "-a4" barfed lenny32:/usr/src/linux-git# git bisect bad Bisecting: 65 revisions left to test after this [6441e549157b749bae003cce70b4c8b62e4801fa] [XFS] factor xfs_iget_core() into hit and miss cases EXTRAVERSION went baack to "-RC2" ?! "-a5" totally froze the machine (didn't catch anything on the netconsole) So i interpreted that as a "fail" # git bisect bad Bisecting: 32 revisions left to test after this [fd6bcc5b63051392ba709a8fd33173b263669e0a] [XFS] kill xfs_bmbt_log_block and xfs_bmbt_log_recs Now "-a6" doesn't want to compile: # make-kpkg kernel-image --initrd exec debian/rules DEBIAN_REVISION=2.6.28-a6-10.00.Custom INITRD=YES kernel-image ====== making target debian/stamp/build/kernel [new prereqs: conf.vars]====== This is kernel package version 11.015. test ! 
-f scripts/package/builddeb.kpkg-dist || mv -f scripts/package/builddeb.kpkg-dist scripts/package/builddeb test ! -f scripts/package/Makefile.kpkg-dist || mv -f scripts/package/Makefile.kpkg-dist scripts/package/Makefile /usr/bin/make -j2 ARCH=i386 \ bzImage make[1]: Entering directory `/usr/src/linux-git' CHK include/linux/version.h CHK include/linux/utsrelease.h CALL scripts/checksyscalls.sh CHK include/linux/compile.h CC fs/xfs/xfs_alloc_btree.o fs/xfs/xfs_alloc_btree.c:38:29: error: xfs_btree_trace.h: No such file or directory fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_alloc_block’: fs/xfs/xfs_alloc_btree.c:84: error: implicit declaration of function ‘XFS_BTREE_TRACE_CURSOR’ fs/xfs/xfs_alloc_btree.c:84: error: ‘XBT_ENTRY’ undeclared (first use in this function) fs/xfs/xfs_alloc_btree.c:84: error: (Each undeclared identifier is reported only once fs/xfs/xfs_alloc_btree.c:84: error: for each function it appears in.) fs/xfs/xfs_alloc_btree.c:90: error: ‘XBT_ERROR’ undeclared (first use in this function) fs/xfs/xfs_alloc_btree.c:95: error: ‘XBT_EXIT’ undeclared (first use in this function) fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_kill_root’: fs/xfs/xfs_alloc_btree.c:291: error: ‘XBT_ENTRY’ undeclared (first use in this function) fs/xfs/xfs_alloc_btree.c:301: error: ‘XBT_ERROR’ undeclared (first use in this function) fs/xfs/xfs_alloc_btree.c:310: error: ‘XBT_EXIT’ undeclared (first use in this function) make[3]: *** [fs/xfs/xfs_alloc_btree.o] Error 1 make[2]: *** [fs/xfs] Error 2 make[1]: *** [fs] Error 2 make[1]: *** Waiting for unfinished jobs.... 
make[1]: Leaving directory `/usr/src/linux-git' make: *** [debian/stamp/build/kernel] Error 2 In the mean time i compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed: Al netconsole loggins are in my directory: http://www.dth.net/kernel/c3/ [F1] ;-) -- _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 27+ messages in thread
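[Editor's sketch] A likely explanation for the EXTRAVERSION jumping back to "-rc6" and "-rc2" during the bisect: the range between the two snapshots contains merges, and the commits that came in through a merge were written on branches that forked from older release candidates. The Makefile at such a commit still carries the version of the point where the branch started, even though the commit only reached mainline after 2.6.28. The toy repository below illustrates this; the `VERSION` file stands in for the kernel Makefile, and all names and contents are hypothetical.

```shell
#!/bin/sh
# Toy demonstration: why a bisect step can show an "older" version string.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "bisect@example.invalid"
git config user.name "bisect demo"

# Mainline at an early release candidate.
echo "2.6.28-rc2" > VERSION
git add VERSION
git commit -q -m "mainline at rc2"
main=$(git symbolic-ref --short HEAD)

# A topic branch forks here and never touches VERSION.
git checkout -q -b topic
echo "btree rework" > xfs.c
git add xfs.c
git commit -q -m "topic work, based on rc2"
topic=$(git rev-parse HEAD)

# Mainline moves on to the release.
git checkout -q "$main"
echo "2.6.28" > VERSION
git commit -q -a -m "mainline release"
good=$(git rev-parse HEAD)

# The topic branch is merged only after the release.
git merge -q --no-ff -m "merge topic" topic
bad=$(git rev-parse HEAD)

# The topic commit lies between good and bad, so bisect may check it out...
git rev-list "$good..$bad"

# ...and the tree at that commit still carries the old version string.
topic_version=$(git show "$topic:VERSION")
echo "$topic_version"
```

So a tree that reports an -rc version in the middle of a bisect is not a sign that the bisect went outside the requested range.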
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-13 20:04 ` Danny ter Haar
@ 2009-01-16 20:43 ` Danny ter Haar
  2009-01-17  7:38 ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-16 20:43 UTC
To: Christoph Hellwig; +Cc: xfs

Quoting Danny ter Haar (dth@dth.net):
> I'm stuck in trying to bisect the problem.
> I restarted from scratch and same result:

[SNIP for bravery]

> In the mean time i compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed:
>
> Al netconsole loggins are in my directory: http://www.dth.net/kernel/c3/
>
> [F1] ;-)

This was my way of saying "HELP".
I'm stuck..

Does anyone have some tips/hints/suggestions (like find another hobby ;-) )?

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-16 20:43 ` Danny ter Haar
@ 2009-01-17  7:38 ` Dave Chinner
  2009-01-17 23:25 ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-17 7:38 UTC
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 16, 2009 at 09:43:46PM +0100, Danny ter Haar wrote:
> Quoting Danny ter Haar (dth@dth.net):
> > I'm stuck in trying to bisect the problem.
> > I restarted from scratch and same result:
>
> [SNIP for bravery]
>
> > In the mean time i compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed:
> >
> > Al netconsole loggins are in my directory: http://www.dth.net/kernel/c3/
> >
> > [F1] ;-)
>
> This was to my way to say "HELP"

Sorry for not getting back to you sooner.

I think that Alexander tripped over this same problem during his bisect.
If you follow the thread from here:

    http://oss.sgi.com/archives/xfs/2009-01/msg00496.html

You'll see that Alexander had the same problem and managed
to continue the bisect once he copied the xfs_btree_trace.h
header file from top-of-tree back into the broken commits.

I hope this helps (and I hope that the bisect lands on the
same commit that it did for Alexander).

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
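[Editor's sketch] The workaround Dave describes (borrowing a header from a newer commit so that an otherwise unbuildable bisect step compiles) can be done with `git show <commit>:<path>`. The script below demonstrates the idea on a throwaway repository; the file names mirror the ones in the thread, but the contents are placeholders and the two-commit history is an assumption made purely for illustration.

```shell
#!/bin/sh
# Toy demonstration in a throwaway repository, not the kernel tree.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "bisect@example.invalid"
git config user.name "bisect demo"
mkdir -p fs/xfs

# Commit 1: a source file includes a header that is not in the tree yet.
echo '#include "xfs_btree_trace.h"' > fs/xfs/xfs_alloc_btree.c
git add fs/xfs/xfs_alloc_btree.c
git commit -q -m "use trace header (header missing, build breaks)"
broken=$(git rev-parse HEAD)

# Commit 2: the header finally shows up.
echo '/* placeholder trace header */' > fs/xfs/xfs_btree_trace.h
git add fs/xfs/xfs_btree_trace.h
git commit -q -m "add trace header"
tip=$(git rev-parse HEAD)

# Check out the broken commit, as a bisect step would.
git checkout -q "$broken"

# Borrow the header from the newer commit without touching history.
git show "$tip:fs/xfs/xfs_btree_trace.h" > fs/xfs/xfs_btree_trace.h
borrowed=$(cat fs/xfs/xfs_btree_trace.h)

# ... build and test here ...

# Remove the borrowed file again so the next checkout is not blocked by
# an untracked file that the target commit wants to create.
rm fs/xfs/xfs_btree_trace.h
status=$(git status --porcelain)
```

In the kernel tree the equivalent would presumably be `git show origin/master:fs/xfs/xfs_btree_trace.h > fs/xfs/xfs_btree_trace.h` at each step that fails to build. Where borrowing a file is not enough, `git bisect skip` tells git that the current commit cannot be tested and lets it pick a neighbouring one.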
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-17  7:38 ` Dave Chinner
@ 2009-01-17 23:25 ` Danny ter Haar
  2009-01-18  2:50 ` Danny ter Haar
  2009-01-19  3:17 ` Dave Chinner
  0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-17 23:25 UTC
To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> Sorry for not getting back to you sooner.

No problem. I initially posted to LKML and got redirected by Christoph to this
list. I'm so stupid that I didn't check the other messages on this list.
Sorry.

> I think that Alexander tripped over this same problem during his bisect.
> If you follow the thread from here:
> http://oss.sgi.com/archives/xfs/2009-01/msg00496.html

Yep! [cheer] I'm not alone! :-)
But why only the two of us? There must be thousands of users out there using
XFS. Why did it bite us? A large filesystem together with slow hardware?

> You'll see that Alexander had the same problem and managed
> to continue the bisect once he copied the xfs_btree_trace.h
> header file from top-of-tree back into the broken commits.

Great.

> I hope this helps (and I hope that the bisect lands on the
> same commit that it did for Alexander).

Do you still want me to try it?
I think you have already figured out where the culprit is?!

I saw changes in the announcement of 2.6.29-rc2 and took the plunge:

# procinfo
Memory:        Total        Used        Free     Buffers
RAM:          506940      447868       59072          84
Swap:         497972           0      497972

Bootup: Sat Jan 17 10:28:27 2009    Load average: 0.03 0.11 0.09 2/104 5259

user  :   00:09:30.12   3.2%    page in :   417582
nice  :   00:00:00.00   0.0%    page out:  1220260
system:   00:02:01.76   0.7%    page act:    41134
IOwait:   00:03:34.28   1.2%    page dea:    13444
hw irq:   00:00:01.94   0.0%    page flt:  1531395
sw irq:   00:00:04.50   0.0%    swap in :        0
idle  :   04:39:41.90  94.8%    swap out:        0
uptime:   04:54:55.09           context :   892623

irq  0:    799012  timer        irq 10:    101465  eth0
irq  1:         8  i8042        irq 11:     46893  sata_promise
irq  2:         0  cascade      irq 12:         0  uhci_hcd:usb1, uh
irq  5:         0  acpi         irq 14:     50059  pata_via
irq  7:         1  parport0     irq 15:         0  pata_via

sda     15445r  22597w          sdb      2820r  23372w
sda1      555r    284w          sdb1     2750r  23372w
sda2        2r      0w          sdc        63r      3w
sda5      136r      0w          sdc1       50r      3w
sda6    14659r  22313w

lo     TX 59.65KiB    RX 59.65KiB
eth0   TX 13.40MiB    RX 23.78MiB

That is over 4 hours of uptime with moderate usage, so I'm not 100%
convinced, but this one looks good (so far).

Let me know if I should pursue this some more.
Thanks for all the help.

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-17 23:25 ` Danny ter Haar
@ 2009-01-18  2:50 ` Danny ter Haar
  2009-01-19  3:17 ` Dave Chinner
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-18 2:50 UTC
To: Christoph Hellwig, xfs

> I saw changes in the announcement of 2.6.29-rc2 and took the plunge:

> over 4 hours of uptime and moderate usage, so i'm not 100% convinced but this one
> looks good (so far)

reboot   system boot  2.6.29-rc2   Sat Jan 17 10:28 - crash  (08:20)

OK, not stable (for me).

Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2 2009-01-17 23:25 ` Danny ter Haar 2009-01-18 2:50 ` Danny ter Haar @ 2009-01-19 3:17 ` Dave Chinner 1 sibling, 0 replies; 27+ messages in thread From: Dave Chinner @ 2009-01-19 3:17 UTC (permalink / raw) To: Danny ter Haar; +Cc: Christoph Hellwig, xfs On Sun, Jan 18, 2009 at 12:25:11AM +0100, Danny ter Haar wrote: > Quoting Dave Chinner (david@fromorbit.com): > > Sorry for not getting back to you sooner. > > No problem. I initally posted to LKLM, git redirected by Christoph to this > list. I'm so stupid that i didn't check the other messages from this list. > Sorry. > > > I think that Alexander tripped over this same problem during his bisect. > > If you follow the thread from here: > > http://oss.sgi.com/archives/xfs/2009-01/msg00496.html > > Yep! [cheer] i'm not alone! :-) > But why only us two ? there must be thousands of users out there using > XFS. Why did it bite us ? large filesystem together with slow hardware ? No idea - I can't reproduce it either so there's some state that your filesystem is getting into that trips over it. > > You'll see that Alexander had the same problem and managed > > to continue the bisect once he copied the xfs_btree_trace.h > > header file from top-of-tree back into the broken commits. > > Grwat. > > > I hope this helps (and I hope that the bisect lands on the > > same commit that it did for Alexander). > > Do you want me to still try it ? > I think you allready figured out where the culprit is ?! Yes, i think we have, but it wasn't totally conclusive. Can you continue your bisect to see if it narrows down to the same commit on your machine? I'm still trying to reproduce it but I haven't worked out what the initial state is. One thing that might be useful is to put a printk into the kernel on the failure path that prints the inode number out (e.g. at the goto that the WANT_CORRUPTED_GOTO jumps to). 
Then we can use xfs_db to find the file that is causing the problem and then use xfs_db or xfs_bmap to look at the extent tree prior to the corruption. That might help me set up the initial state needed to trip the problem..... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 27+ messages in thread
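[Editor's sketch] Once a printk on the failure path has reported an inode number, the inspection Dave outlines could look roughly like the commands below. The script only prints the commands instead of running them, because `xfs_db` needs the real block device, suitable privileges and ideally an unmounted filesystem. The device name is taken from the original oops report; the mount point and the inode number are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run helper: prints the inspection commands rather than executing them.
dev=/dev/sdb1       # assumption: the filesystem named in the oops report
mnt=/mnt/storage    # hypothetical mount point
ino=123456          # hypothetical inode number reported by the added printk

inspect_cmds() {
    # $1 = block device, $2 = mount point, $3 = inode number
    # xfs_db -r opens the device read-only; "inode N" selects the inode,
    # "print" dumps its fields and "bmap" lists its extent mappings.
    printf 'xfs_db -r -c "inode %s" -c "print" %s\n' "$3" "$1"
    printf 'xfs_db -r -c "inode %s" -c "bmap" %s\n' "$3" "$1"
    # With the filesystem mounted, map the inode number back to a path,
    # then show that file's extent list.
    printf 'find %s -xdev -inum %s\n' "$2" "$3"
    printf 'xfs_bmap -v <path printed by find>\n'
}

inspect_cmds "$dev" "$mnt" "$ino"
```

Capturing the extent list of the affected file on a known-good kernel, before the corruption is triggered, is what would give the "initial state" mentioned above.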
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2 [not found] <20090107165218.GA11132@dth.net> 2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig @ 2009-01-14 19:44 ` Tino Keitel 1 sibling, 0 replies; 27+ messages in thread From: Tino Keitel @ 2009-01-14 19:44 UTC (permalink / raw) To: linux-kernel; +Cc: xfs [added CC:] On Wed, Jan 07, 2009 at 17:52:18 +0100, dth wrote: > Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs) > server. (uses about 20 watts when both storage drives are in powersave mode) > > OS = debian lenny > > Pimairy drive (60gig 2.5" pata disk using libata = sda) > 512MB ext3 partition as /boot > swap, rest of the drive is XFS root file system. > > storage: > 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus) > sdb & sdc which have 1 big XFS partition. > > Any kernel i try to boot after 2.6.28-git2 (i tried git4-9) > sooner or later gives me an XFS error: [...] > Anyone else bitten by the same problem ? y Hi, I get the same on a Mac mini Core Duo using 2.6.29-rc1. I just booted, started firefox and my /home died. XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file linux-2.6/fs/xfs/xfs_btree.c. 
Caller 0xc022e9c6 Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11 Call Trace: [<c022e6d9>] xfs_btree_delrec+0xbd9/0xea0 [<c022e9c6>] xfs_btree_delete+0x26/0x90 [<c022b355>] xfs_btree_lookup_get_block+0xa5/0xe0 [<c022933a>] xfs_bmbt_init_key_from_rec+0xa/0x20 [<c022c6ba>] xfs_btree_lookup+0x21a/0x460 [<c022e9c6>] xfs_btree_delete+0x26/0x90 [<c02255ff>] xfs_bmap_del_extent+0x64f/0xb70 [<c02262ea>] xfs_bunmapi+0x65a/0xb30 [<c0244f6b>] xfs_itruncate_finish+0x1ab/0x3e0 [<c026115d>] xfs_inactive+0x3cd/0x4f0 [<c019113c>] clear_inode+0x5c/0x100 [<c01917fe>] generic_delete_inode+0xce/0xe0 [<c0190b24>] iput+0x44/0x50 [<c01898a0>] do_unlinkat+0xe0/0x160 [<c0188180>] lock_rename+0x0/0xa0 [<c01033c5>] sysenter_do_call+0x12/0x25 Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1164 of file linux-2.6/fs/xfs/xfs_trans.c. Caller 0xc0261173 Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11 Call Trace: [<c025a54b>] xfs_trans_cancel+0xcb/0xf0 [<c0261173>] xfs_inactive+0x3e3/0x4f0 [<c0261173>] xfs_inactive+0x3e3/0x4f0 [<c019113c>] clear_inode+0x5c/0x100 [<c01917fe>] generic_delete_inode+0xce/0xe0 [<c0190b24>] iput+0x44/0x50 [<c01898a0>] do_unlinkat+0xe0/0x160 [<c0188180>] lock_rename+0x0/0xa0 [<c01033c5>] sysenter_do_call+0x12/0x25 xfs_force_shutdown(dm-1,0x8) called from line 1165 of file linux-2.6/fs/xfs/xfs_trans.c. Return address = 0xc025a563 Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1 Please umount the filesystem, and rectify the problem(s) Regards, Tino _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 27+ messages in thread
Thread overview: 27+ messages
[not found] <20090107165218.GA11132@dth.net>
2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
2009-01-07 18:24 ` Danny ter Haar
2009-01-07 18:31 ` Christoph Hellwig
2009-01-07 18:44 ` Danny ter Haar
2009-01-07 18:52 ` Christoph Hellwig
2009-01-07 22:09 ` Danny ter Haar
2009-01-08 0:38 ` Danny ter Haar
2009-01-07 18:56 ` Christoph Hellwig
2009-01-07 19:01 ` Danny ter Haar
2009-01-08 21:56 ` Danny ter Haar
2009-01-09 0:46 ` Dave Chinner
2009-01-09 1:26 ` Danny ter Haar
2009-01-09 2:08 ` Dave Chinner
2009-01-09 6:10 ` Danny ter Haar
2009-01-09 19:44 ` Christoph Hellwig
2009-01-09 19:51 ` Danny ter Haar
2009-01-09 19:58 ` Christoph Hellwig
2009-01-09 21:42 ` Danny ter Haar
2009-01-09 22:01 ` Christoph Hellwig
2009-01-09 22:23 ` Danny ter Haar
2009-01-13 20:04 ` Danny ter Haar
2009-01-16 20:43 ` Danny ter Haar
2009-01-17 7:38 ` Dave Chinner
2009-01-17 23:25 ` Danny ter Haar
2009-01-18 2:50 ` Danny ter Haar
2009-01-19 3:17 ` Dave Chinner
2009-01-14 19:44 ` Tino Keitel