* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
[not found] <20090107165218.GA11132@dth.net>
@ 2009-01-07 18:02 ` Christoph Hellwig
2009-01-07 18:24 ` Danny ter Haar
2009-01-14 19:44 ` Tino Keitel
1 sibling, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:02 UTC (permalink / raw)
To: dth; +Cc: xfs
On Wed, Jan 07, 2009 at 05:52:18PM +0100, dth wrote:
> Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs)
> server. (uses about 20 watts when both storage drives are in powersave mode)
>
> OS = debian lenny
>
> Primary drive (60gig 2.5" pata disk using libata = sda)
> 512MB ext3 partition as /boot
> swap, rest of the drive is XFS root file system.
> 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus)
> sdb & sdc which have 1 big XFS partition.
>
> Any kernel i try to boot after 2.6.28-git2 (i tried git4-9)
> sooner or later gives me an XFS error:
The recover_iunlinks code changed recently. Do you still have the
exact kernel tree and config around, so you can take a look using
gdb at which exact line of code the oops is?
> [ 21.535820] XFS mounting filesystem sdb1
> [ 21.646126] Starting XFS recovery on filesystem: sdb1 (logdev: internal)
> [ 21.747772] XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file
> fs/xfs/xfs_btree.c. Caller 0xc01e9d71
> [ 21.747916] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1
> [ 21.748033] Call Trace:
> [ 21.748130] [<c01e98b9>] xfs_btree_delrec+0x5b9/0xa50
> [ 21.748247] [<c01e9d71>] xfs_btree_delete+0x21/0x70
> [ 21.748339] [<c01e72ff>] xfs_btree_read_buf_block+0x4f/0x70
> [ 21.748462] [<c01e5dda>] xfs_bmbt_init_key_from_rec+0xa/0x20
> [ 21.748556] [<c01e66d5>] xfs_lookup_get_search_key+0x25/0x40
> [ 21.748680] [<c01e9d71>] xfs_btree_delete+0x21/0x70
> [ 21.748771] [<c01e2ad5>] xfs_bmap_del_extent+0x2f5/0x960
> [ 21.748894] [<c01e383e>] xfs_bunmapi+0x5be/0x980
> [ 21.749000] [<c01fba82>] xfs_itruncate_finish+0x1c2/0x2f0
> [ 21.749137] [<c0210852>] xfs_inactive+0x1d2/0x3d0
> [ 21.749228] [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0
> [ 21.749354] [<c01775bc>] clear_inode+0x5c/0xb0
> [ 21.749443] [<c0177a8c>] generic_delete_inode+0x6c/0xc0
> [ 21.749559] [<c0177157>] iput+0x47/0x50
> [ 21.749644] [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0
> [ 21.749768] [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0
> [ 21.749863] [<c020584f>] xlog_recover_finish+0x3f/0x90
> [ 21.749980] [<c0209390>] xfs_mountfs+0x450/0x550
> [ 21.750126] [<c02122e6>] kmem_alloc+0x56/0xb0
> [ 21.750216] [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [ 21.750344] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [ 21.750441] [<c016b608>] get_sb_bdev+0xe8/0x130
> [ 21.750556] [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [ 21.750646] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [ 21.750760] [<c016a949>] vfs_kern_mount+0x39/0x80
> [ 21.750850] [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [ 21.750972] [<c017b3e5>] do_mount+0x5a5/0x5f0
> [ 21.751061] [<c016c35f>] sys_stat64+0xf/0x30
> [ 21.751186] [<c0150271>] __get_free_pages+0x11/0x30
> [ 21.751280] [<c017b496>] sys_mount+0x66/0xa0
> [ 21.751394] [<c0102c72>] syscall_call+0x7/0xb
> [ 21.751494] Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c. Caller 0xc021086b
> [ 21.751514]
> [ 21.751718] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1
> [ 21.751799] Call Trace:
> [ 21.751893] [<c020b586>] xfs_trans_cancel+0x46/0xd0
> [ 21.751986] [<c021086b>] xfs_inactive+0x1eb/0x3d0
> [ 21.752102] [<c021086b>] xfs_inactive+0x1eb/0x3d0
> [ 21.752192] [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0
> [ 21.752307] [<c01775bc>] clear_inode+0x5c/0xb0
> [ 21.752396] [<c0177a8c>] generic_delete_inode+0x6c/0xc0
> [ 21.752512] [<c0177157>] iput+0x47/0x50
> [ 21.752596] [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0
> [ 21.752719] [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0
> [ 21.752815] [<c020584f>] xlog_recover_finish+0x3f/0x90
> [ 21.752930] [<c0209390>] xfs_mountfs+0x450/0x550
> [ 21.753022] [<c02122e6>] kmem_alloc+0x56/0xb0
> [ 21.753136] [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [ 21.753232] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [ 21.753348] [<c016b608>] get_sb_bdev+0xe8/0x130
> [ 21.753439] [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [ 21.753555] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [ 21.753645] [<c016a949>] vfs_kern_mount+0x39/0x80
> [ 21.753760] [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [ 21.753852] [<c017b3e5>] do_mount+0x5a5/0x5f0
> [ 21.753965] [<c016c35f>] sys_stat64+0xf/0x30
> [ 21.754054] [<c0150271>] __get_free_pages+0x11/0x30
> [ 21.754172] [<c017b496>] sys_mount+0x66/0xa0
> [ 21.754258] [<c0102c72>] syscall_call+0x7/0xb
> [ 21.754378] xfs_force_shutdown(sdb1,0x8) called from line 1165 of file fs/xfs/xfs_trans.c. Return address = 0xc020b59c
> [ 21.754522] Filesystem "sdb1": Corruption of in-memory data detected. Shutting down filesystem: sdb1
> [ 21.754661] Please umount the filesystem, and rectify the problem(s)
> [ 21.754814] BUG: unable to handle kernel NULL pointer dereference at 00000054
> [ 21.754990] IP: [<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0
> [ 21.755137] *pde = 00000000
> [ 21.755263] Oops: 0000 [#1]
> [ 21.755385] last sysfs file: /sys/class/net/lo/operstate
> [ 21.755490] Modules linked in: i2c_viapro i2c_core via_agp agpgart sata_promise sd_mod
> [ 21.755906]
> [ 21.755976] Pid: 1392, comm: mount Not tainted (2.6.28-git8 #1) EPIA
> [ 21.756091] EIP: 0060:[<c02057bf>] EFLAGS: 00010296 CPU: 0
> [ 21.756181] EIP is at xlog_recover_process_iunlinks+0x8f/0xe0
> [ 21.756293] EAX: 00000000 EBX: de827f00 ECX: 00000005 EDX: df231000
> [ 21.756384] ESI: ffffffff EDI: df231000 EBP: 00000026 ESP: df383e14
> [ 21.756501] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [ 21.756592] Process mount (pid: 1392, ti=df382000 task=deb37b60 task.ti=df382000)
> [ 21.756712] Stack:
> [ 21.756792] df383e24 00000026 00000003 00000000 00000000 df2572e0 00000000 00000003
> [ 21.757192] df231000 c020584f 00000003 00000000 00000000 c0209390 00000017 00000000
> [ 21.757691] 08000004 00000000 00000000 00000001 00000058 00000000 c02122e6 00000001
> [ 21.758243] Call Trace:
> [ 21.758327] [<c020584f>] xlog_recover_finish+0x3f/0x90
> [ 21.758476] [<c0209390>] xfs_mountfs+0x450/0x550
> [ 21.758615] [<c02122e6>] kmem_alloc+0x56/0xb0
> [ 21.758756] [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [ 21.758908] [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [ 21.759059] [<c016b608>] get_sb_bdev+0xe8/0x130
> [ 21.759200] [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [ 21.759350] [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [ 21.759500] [<c016a949>] vfs_kern_mount+0x39/0x80
> [ 21.759647] [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [ 21.759794] [<c017b3e5>] do_mount+0x5a5/0x5f0
> [ 21.759934] [<c016c35f>] sys_stat64+0xf/0x30
> [ 21.760103] [<c0150271>] __get_free_pages+0x11/0x30
> [ 21.760255] [<c017b496>] sys_mount+0x66/0xa0
> [ 21.760396] [<c0102c72>] syscall_call+0x7/0xb
> [ 21.760535] Code: e8 e7 f6 00 00 55 89 f1 8b 54 24 04 89 f8 e8 a9 fe ff ff 89 c6 8d 44 24 0c 50 8b 4c 24 08 31 d2
> [ 27.297839] NET: Registered protocol family 10
> [ 38.240089] eth0: no IPv6 routers present
> [ 38.995204] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> [ 38.995468] NFSD: starting 90-second grace period
>
>
> When i reboot to git2 again and run a xfs_check and/or xfs_repair nothing shows up.
> It just works without problems. So i think it's not actually an XFS problem but
> something "underneath"
> It's slow hardware and i have been reading about locking situations, which could be
> triggered more easily on my (slow) hardware than on any modern (faster) hardware.
>
> Anyone else bitten by the same problem ?
>
> Danny
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
---end quoted text---
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
@ 2009-01-07 18:24 ` Danny ter Haar
2009-01-07 18:31 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:24 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> > Any kernel i try to boot after 2.6.28-git2 (i tried git4-9)
> > sooner or later gives me an XFS error:
> The recover_iunlinks code changed recently. Do you still have the
> exact kernel tree and config around, so you can take a look using
> gdb at which exact line of code the oops is?
Since compiling a kernel on the native hardware takes "forever"
i compile them at my laptop (ubuntu 32bits) and move the kernel
to the NAS.
This morning i compiled and tried git9 (still an error)
I have copied the config & system.map file to the same dir on
my server ( http://www.dth.net/kernel/c3/ )
I'm not familiar with the debugging, do you have a pointer to where
i could find out how to do this ?
In the mean time i'll try and google some info on how...
I could copy the whole source tree over to the machine and give you
(root) access to the machine so you can take a look yourself (if needed).
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:24 ` Danny ter Haar
@ 2009-01-07 18:31 ` Christoph Hellwig
2009-01-07 18:44 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:31 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Wed, Jan 07, 2009 at 07:24:15PM +0100, Danny ter Haar wrote:
> Since compiling a kernel on the native hardware takes "forever"
> i compile them at my laptop (ubuntu 32bits) and move the kernel
> to the NAS.
> This morning i compiled and tried git9 (still an error)
An error when compiling the kernel, or did you get the same error again?
In case you can actually reproduce this error, I would be very
interested in a metadump of the filesystem that causes this error.
> I have copied the config & system.map file to the same dir on
> my server ( http://www.dth.net/kernel/c3/ )
>
>
> I'm not familiar with the debugging, do you have a pointer to where
> i could find out how to do this ?
> In the mean time i'll try and google some info on how...
>
> I could copy the whole source tree over to the machine and give you
> (root) access to the machine so you can take a look yourself (if needed).
The debugging should be really easy as long as you actually have a tree /
config with which the oops happened so that the oops has the same
addresses. Compared to your config we would need CONFIG_DEBUG_INFO,
but that's something we could turn on after the fact as it shouldn't
change the reported address. After that we can just do a trivial
command in gdb to find the source lines for the addresses reported
(this can easily be done on the system where you compile the kernel,
doesn't have to be on the NAS box).
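
For reference, a minimal sketch of that gdb step, assuming an unstripped
vmlinux rebuilt from the same tree and .config with CONFIG_DEBUG_INFO=y.
The helper name oops_location is made up for illustration; the oops line
is the one from the first report. It pulls the symbol+offset out of the
"EIP is at" line and prints the gdb command that would list the matching
source line:

```shell
#!/bin/sh
# Extract "symbol+offset" from the "EIP is at" line of an i386 oops and
# print a gdb command that resolves it to a source line.

oops_location() {
    # Reads oops text on stdin, prints e.g. some_function+0x8f
    sed -n 's/.*EIP is at \([A-Za-z0-9_.]*+0x[0-9a-f]*\)\/0x[0-9a-f]*.*/\1/p'
}

oops_line='[ 21.756181] EIP is at xlog_recover_process_iunlinks+0x8f/0xe0'
loc=$(printf '%s\n' "$oops_line" | oops_location)

# "vmlinux" is assumed to be the image in the top of the build directory.
echo "gdb -batch -ex 'list *($loc)' vmlinux"
```

Running the printed command in the build directory should show the source
around the faulting instruction. The raw address from the oops can be used
the same way (list *0xc02057bf), as long as the image matches the kernel
that produced the oops.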
Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
in case you can reproduce this issue. Just make sure to turn it
off again when you want to use the box for a real workload..
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:31 ` Christoph Hellwig
@ 2009-01-07 18:44 ` Danny ter Haar
2009-01-07 18:52 ` Christoph Hellwig
2009-01-07 18:56 ` Christoph Hellwig
0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:44 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> An error when compiling the kernel, or did you get the same error again?
same error while trying to boot (this time on sda (pata with libata))
root filesystem of the OS partition.
> In case you can actually reproduce this error, I would be very
> interested in a metadump of the filesystem that causes this error.
See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt
> The debugging should be really easy as long as you actually have a tree /
> config
kernel tree is still on my laptop available.
> with which the oops happened so that the oops has the same
> addresses.
> Compared to your config we would need CONFIG_DEBUG_INFO,
> but that's something we could turn on after the fact as it shouldn't
> change the reported address. After that we can just do a trivial
> command in gdb to find the source lines for the addresses reported
> (this can easily be done on the system where you compile the kernel,
> doesn't have to be on the NAS box).
ok, pointer to a howto by any chance ?
still looking/googling. [think i found something]
> Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
> in case you can reproduce this issue. Just make sure to turn it
> off again when you want to use the box for a real workload..
Will compile a new kernel in the mean time with those 2 options:
CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:44 ` Danny ter Haar
@ 2009-01-07 18:52 ` Christoph Hellwig
2009-01-07 22:09 ` Danny ter Haar
2009-01-08 0:38 ` Danny ter Haar
2009-01-07 18:56 ` Christoph Hellwig
1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:52 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Will compile a new kernel in the mean time with those 2 options:
> CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
Please only with CONFIG_DEBUG_INFO in that case. CONFIG_XFS_DEBUG would
be useful if you could reproduce it, but it does change the addresses,
so it's harmful for the case of getting back to the source line.
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:44 ` Danny ter Haar
2009-01-07 18:52 ` Christoph Hellwig
@ 2009-01-07 18:56 ` Christoph Hellwig
2009-01-07 19:01 ` Danny ter Haar
2009-01-08 21:56 ` Danny ter Haar
1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:56 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > An error when compiling the kernel, or did you get the same error again?
>
> same error while trying to boot (this time on sda (pata with libata))
> root filesystem of the OS partition.
>
> > In case you can actually reproduce this error, I would be very
> > interested in a metadump of the filesystem that causes this error.
>
> See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt
This looks like pretty much the same corruption you saw in your first
report. Did you run xfs_repair before this one?
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:56 ` Christoph Hellwig
@ 2009-01-07 19:01 ` Danny ter Haar
2009-01-08 21:56 ` Danny ter Haar
1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 19:01 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> This looks like pretty much the same corruption you saw in your first
> report. Did you run xfs_repair before this one?
Yes!
and with git2: no problem.
The git8 error is on the storage drive(s) sdb1/sdc1 while
the git9 is on the /dev/sda6 partition (containing the root filesystem)
But it's completely repeatable.
Compiling takes about 45 minutes on my laptop, will report later
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:52 ` Christoph Hellwig
@ 2009-01-07 22:09 ` Danny ter Haar
2009-01-08 0:38 ` Danny ter Haar
1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 22:09 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> Please only with CONFIG_DEBUG_INFO in that case. CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.
Hmm, one bug is never alone...
During compile my laptop started smelling because of heat buildup and switched itself
off. I re-compiled the kernel on a debian lenny32 box i have access to.
I only added the "CONFIG_DEBUG_INFO" option.
Rebooted the nas, and so far the machine is running without a problem :-(
Am now gonna compile a kernel on the machine itself, to force a lot
of disk activity.
Will keep you posted.
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:52 ` Christoph Hellwig
2009-01-07 22:09 ` Danny ter Haar
@ 2009-01-08 0:38 ` Danny ter Haar
1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08 0:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> > Will compile a new kernel in the mean time with those 2 options:
> > CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
>
> Please only with CONFIG_DEBUG_INFO in that case. CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.
>
Okay, it barfed again:
Let me first explain that i've set the sleep timer for the storage harddrives
to 24 (x5 = 120 sec) in hdparm.conf:
/dev/sdb {
spindown_time = 24
}
/dev/sdc {
spindown_time = 24
}
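
As a quick check of that arithmetic, a small sketch assuming the encoding
described in the hdparm man page for -S (values 1 to 240 count in units of
5 seconds, 241 to 251 in units of 30 minutes). The function name
spindown_seconds is hypothetical and other values are simply rejected:

```shell
#!/bin/sh
# Convert an hdparm spindown value into seconds.
# Assumption: 1..240 => multiples of 5 s, 241..251 => multiples of 30 min.

spindown_seconds() {
    v=$1
    if [ "$v" -ge 1 ] && [ "$v" -le 240 ]; then
        echo $((v * 5))
    elif [ "$v" -ge 241 ] && [ "$v" -le 251 ]; then
        echo $(((v - 240) * 1800))
    else
        # 0 and the special values above 251 are not handled in this sketch
        echo "unsupported" >&2
        return 1
    fi
}

spindown_seconds 24
```

So spindown_time = 24 corresponds to the drives going to standby after two
minutes of idle time.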
This is the error (snapped using netconsole)
http://www.dth.net/kernel/c3/netconsole_2.6.28-git9d-error.txt
I rebooted again with the git2 kernel and xfs_check gave no errors:
filer1:~# xfs_check /dev/sdb1
filer1:~#
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-07 18:56 ` Christoph Hellwig
2009-01-07 19:01 ` Danny ter Haar
@ 2009-01-08 21:56 ` Danny ter Haar
2009-01-09 0:46 ` Dave Chinner
1 sibling, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08 21:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt
I re-compiled git2 (with debug support) and <knock wood> so far it is stable:
# procinfo
Memory: Total Used Free Buffers
RAM: 507116 258048 249068 428
Swap: 497972 0 497972
Bootup: Thu Jan 8 12:00:04 2009 Load average: 0.10 0.37 0.19 2/104 2767
user : 00:02:55.96 2.6% page in : 183385
nice : 00:00:00.00 0.0% page out: 497130
system: 00:00:47.36 0.7% page act: 23771
IOwait: 00:00:28.21 0.4% page dea: 0
hw irq: 00:00:02.10 0.0% page flt: 698907
sw irq: 00:00:03.37 0.0% swap in : 0
idle : 01:48:06.90 96.2% swap out: 0
uptime: 01:52:23.93 context : 323529
irq 0: 293258 timer irq 10: 112849 eth0
irq 1: 8 i8042 irq 11: 42028 sata_promise
irq 2: 0 cascade irq 12: 0 uhci_hcd:usb1, uh
irq 5: 0 acpi irq 14: 7629 pata_via
irq 7: 1 parport0 irq 15: 0 pata_via
sda 2740r 3576w sdb 1037r 325w
sda1 70r 2w sdb1 1024r 325w
sda2 2r 0w sdc 2615r 14257w
sda5 25r 0w sdc1 2602r 14257w
sda6 2636r 3574w
lo TX 5.68MiB RX 5.68MiB eth0 TX 86.38MiB RX 10.32MiB
-------------------
So (in my case) something while going from git2 -> git3 didn't go positive.
If i could get some explanation howto gdb the vmlinux image to determine in what
function the git3 barfed, i would be able to give more info/feedback.
I looked in the Documentation folder in the kernel source, i looked at the examples
given there but i really don't understand how/what.
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-08 21:56 ` Danny ter Haar
@ 2009-01-09 0:46 ` Dave Chinner
2009-01-09 1:26 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09 0:46 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Thu, Jan 08, 2009 at 10:56:02PM +0100, Danny ter Haar wrote:
>
> I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
> It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt
Looking at this, I think there are two possibilities in terms of the
problem being detected. We are modifying the inode BMBT here,
so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
trigger has occurred because a xfs_btree_increment() call has
returned a zero status. This means we failed here:
1324         /* Fail if we just went off the right edge of the tree. */
1325         xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
1326         if (xfs_btree_ptr_is_null(cur, &ptr))
1327                 goto out0;
or here:
1351         /*
1352          * If we went off the root then we are either seriously
1353          * confused or have the tree root in an inode.
1354          */
1355         if (lev == cur->bc_nlevels) {
1356                 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
1357                         goto out0;
1358                 ASSERT(0);
i.e. we either fell off the right edge of the tree or went over the top
of it.
I can't really see how we've done either of those things unless the
tree has been corrupted by a prior operation.
Given that each time it is aptitude that is causing the problem, can you
prevent aptitude from running automatically on boot and run it manually?
If you can reproduce the problem manually then we can move on to the
next step....
> So (in my case) something while going from git2 -> git3 didn't go positive.
That would have been when Linus did the XFS pull...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 0:46 ` Dave Chinner
@ 2009-01-09 1:26 ` Danny ter Haar
2009-01-09 2:08 ` Dave Chinner
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 1:26 UTC (permalink / raw)
To: Christoph Hellwig, xfs
Quoting Dave Chinner (david@fromorbit.com):
> Looking at this, I think there are two possibilities in terms of the
> problem being detected. We are modifying the inode BMBT here,
> so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
> trigger has occurred because a xfs_btree_increment() call has
> returned a zero status. This means we failed here:
>
> 1324 /* Fail if we just went off the right edge of the tree. */
> 1325 xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
> 1326 if (xfs_btree_ptr_is_null(cur, &ptr))
> 1327 goto out0;
>
> or here:
>
> 1351 /*
> 1352 * If we went off the root then we are either seriously
> 1353 * confused or have the tree root in an inode.
> 1354 */
> 1355 if (lev == cur->bc_nlevels) {
> 1356 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
> 1357 goto out0;
> 1358 ASSERT(0);
>
> i.e. we either fell off the right edge of the tree or went over the top
> of it.
> I can't really see how we've done either of those things unless the
> tree has been corrupted by a prior operation.
sounds logical.
The first time it happened i moved the primary hd to the secondary ide connector, connected
a separate hard drive as new master, installed a fresh debian lenny on that
harddrive, ran xfs_repair on all xfs filesystems: no errors
> Given that each time it is aptitude that is causing the problem, can you
> prevent aptitude from running automatically on boot and run it manually?
> If you can reproduce the problem manually then we can move on to the
> next step....
I wasn't clear (obviously)
This machine is besides my NAS also my apt-cacher-ng server for all my other
machines here at home. The easiest way to trigger the error is often by running
a simple "aptitude update; aptitude -d dist-upgrade"
So when it barfed i did the aptitude by hand.
And it checks everything from the cache at /var/cache/apt-cacher-ng
which is on sda6 (root filesystem on XFS)
So it doesn't "barf" right on boot, it takes a few minutes or even hours:
filer1:~# last -20 reboot
reboot system boot 2.6.28-git2-d Thu Jan 8 12:00 - current(05:18)
reboot system boot 2.6.28-git3-d Thu Jan 8 11:31 - 11:59 (00:27)
reboot system boot 2.6.28-git3-d Thu Jan 8 10:56 - 11:59 (01:02)
reboot system boot 2.6.28-git3-d Thu Jan 8 10:44 - 10:54 (00:10)
reboot system boot 2.6.28-git3-d Thu Jan 8 10:30 - 10:43 (00:12)
reboot system boot 2.6.28-git2 Wed Jan 7 15:08 - 10:28 (19:19)
reboot system boot 2.6.28-git9-d Wed Jan 7 12:29 - 14:58 (02:29)
reboot system boot 2.6.28-git2 Wed Jan 7 10:08 - 12:27 (02:19)
reboot system boot 2.6.28-git9 Wed Jan 7 09:21 - 10:06 (00:45)
reboot system boot 2.6.28-git9 Wed Jan 7 08:42 - 10:06 (01:24)
reboot system boot 2.6.28-git2 Tue Jan 6 21:45 - 08:40 (10:55)
reboot system boot 2.6.28-git4 Tue Jan 6 21:27 - 08:40 (11:13)
reboot system boot 2.6.28-git4 Tue Jan 6 21:22 - 08:40 (11:18)
Sometimes the kernel barfs while accessing /dev/sdb1 or /dev/sdc1
which is only accessed using samba.
I can once more install the "other" debian lenny harddrive, boot from there
and then manually do an xfs_repair on xfs filesystems.
I can then boot a kernel that is known to barf and try to get it to barf.
> > So (in my case) something while going from git2 -> git3 didn't go positive.
> That would have been when Linus did the XFS pull...
Do you want me to figure out what patch from git2->git3 is the culprit ?
I'll have to compile/reboot for a while.
Tell me what else i can do to resolve this.
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 1:26 ` Danny ter Haar
@ 2009-01-09 2:08 ` Dave Chinner
2009-01-09 6:10 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09 2:08 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Fri, Jan 09, 2009 at 02:26:10AM +0100, Danny ter Haar wrote:
> Do you want me to figure out what patch from git2->git3 is the culprit ?
> I'll have to compile/reboot for a while.
If you can do that, it would be *greatly* appreciated.
> Tell me what else i can do to resolve this.
If you can isolate the commit that introduced the bug, then
code review is probably the step after that....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 2:08 ` Dave Chinner
@ 2009-01-09 6:10 ` Danny ter Haar
2009-01-09 19:44 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 6:10 UTC (permalink / raw)
To: Christoph Hellwig, xfs
Quoting Dave Chinner (david@fromorbit.com):
> > Do you want me to figure out what patch from git2->git3 is the culprit ?
> > I'll have to compile/reboot for a while.
> If you can do that, it would be *greatly* appreciated.
Need a little bit of help here.
(i'm a hardware guy with just the bare essential software knowhow)
I found
http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
(used that one before)
but it doesn't seem to know the "-gitXX" kernel trees.
Q:
how do i get a list of available kernel versions (including -gitXX branches)
My intention is to say something like:
# git bisect bad v2.6.28-git3
# git bisect good v2.6.28-git2
and take it from there.
I've been googling now for nearly 2hours..
Any help appreciated
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 6:10 ` Danny ter Haar
@ 2009-01-09 19:44 ` Christoph Hellwig
2009-01-09 19:51 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:44 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> # git bisect bad v2.6.28-git3
> # git bisect good v2.6.28-git2
>
> and take it from there.
Yes. Then it gives you a revision to test and you can use
git bisect next
to try the next one, or
git bisect skip
if the current version is bad due to some other reason (doesn't boot /
compile)
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 19:44 ` Christoph Hellwig
@ 2009-01-09 19:51 ` Danny ter Haar
2009-01-09 19:58 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 19:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> Yes. Then it gives you a revision to test and you can use
I'm still not making myself clear (sorry for that ;-) )
i'm not able to "access" the 2.6.28-gitXX repository for some reason.
In other words:
how do i "include" the 2.6.28-git repository in git ?
Danny
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 19:51 ` Danny ter Haar
@ 2009-01-09 19:58 ` Christoph Hellwig
2009-01-09 21:42 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:58 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Fri, Jan 09, 2009 at 08:51:44PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> > Yes. Then it gives you a revision to test and you can use
>
> I'm still not making myself clear (sorry for that ;-) )
> I'm not able to "access" the 2.6.28-gitXX repository for some reason.
> In other words:
>
> how do I "include" the 2.6.28-git repository in git?
Oh, sorry.
There is no specific 2.6.28-git repository, just the main Linux 2.6 one:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
The git bisect command makes sure you get the right revisions
checked out to bisect.
>
> Danny
---end quoted text---
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 19:58 ` Christoph Hellwig
@ 2009-01-09 21:42 ` Danny ter Haar
2009-01-09 22:01 ` Christoph Hellwig
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 21:42 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> There is no specific 2.6.28-git repository, just the main Linux 2.6 one:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> The git bisect command makes sure you get the right revisions
> checked out to bisect.
still not making myself clear i guess..
lenny32:/usr/src/linux-2.6# git tag|grep 2.6.28
v2.6.28
v2.6.28-rc1
v2.6.28-rc2
v2.6.28-rc3
v2.6.28-rc4
v2.6.28-rc5
v2.6.28-rc6
v2.6.28-rc7
v2.6.28-rc8
v2.6.28-rc9
So I cannot use any "standard" git command to also include
Linus's git tree?! That seems odd.
He committed them somehow in git, didn't he?
I could of course compile/patch 2 kernel trees by hand;
in fact I already have:
lenny32:/usr/src# ls -l
total 16
drwxr-sr-x 2 root src 146 2009-01-09 08:29 archive
drwxr-sr-x 2 root src 4096 2009-01-08 11:33 configs
drwxr-sr-x 23 root src 4096 2009-01-08 19:56 linux-2.6
drwxr-xr-x 24 root root 4096 2009-01-08 11:53 linux-2.6.28-git2-d
drwxr-xr-x 24 root root 4096 2009-01-08 10:21 linux-2.6.28-git3-d
But I'm not familiar with how to instruct git to find the differences
between the 2 source trees and how to use git to find out what
the culprit is.
Any other suggestions?
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 21:42 ` Danny ter Haar
@ 2009-01-09 22:01 ` Christoph Hellwig
2009-01-09 22:23 ` Danny ter Haar
2009-01-13 20:04 ` Danny ter Haar
0 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 22:01 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Fri, Jan 09, 2009 at 10:42:06PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > There is no specific 2.6.28-git repository, just the main Linux 2.6 one:
> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > The git bisect command makes sure you get the right revisions
> > checked out to bisect.
>
> still not making myself clear i guess..
Or me being stupid, sorry :) The -gitN thingies aren't
official tags, but unofficial snapshots done by Dave Woodhouse.
He has a mapping on
http://www.kernel.org/pub/linux/kernel/people/dwmw2/snapshot-tags/ from
these names to actual commit IDs. So 2.6.28-git2 would be
3c92ec8ae91ecf59d88c798301833d7cf83f2179
and 2.6.28-git3
6a94cb73064c952255336cc57731904174b2c58f
so use
git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
Sorry for the confusion..
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 22:01 ` Christoph Hellwig
@ 2009-01-09 22:23 ` Danny ter Haar
2009-01-13 20:04 ` Danny ter Haar
1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 22:23 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Christoph Hellwig (hch@infradead.org):
> git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
> git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
> Sorry for the confusion..
*THANKS*
Bisecting: 955 revisions left to test after this
;-)
Danny
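As a rough guide to how long this will take: each tested build halves the number of candidates, so the count of remaining builds grows only logarithmically. A minimal sketch of that arithmetic, assuming an ideal halving at every step (real histories with merges and skipped commits will deviate somewhat); `estimate_steps` is an illustrative name.

```python
def estimate_steps(revisions_left: int) -> int:
    """Estimate how many more builds are needed after the current one.

    Assumes every test cuts the remaining candidates in half.
    """
    steps = 0
    while revisions_left > 0:
        revisions_left //= 2   # one more build/boot/test cycle
        steps += 1
    return steps
```

For the 955 revisions reported above this gives 10, so roughly ten more kernel builds after the first one.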
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-09 22:01 ` Christoph Hellwig
2009-01-09 22:23 ` Danny ter Haar
@ 2009-01-13 20:04 ` Danny ter Haar
2009-01-16 20:43 ` Danny ter Haar
1 sibling, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-13 20:04 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
I'm stuck in trying to bisect the problem.
I restarted from scratch and same result:
Here is what i did:
lenny32:/usr/src# git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git
Initialized empty Git repository in /usr/src/linux-git/.git/
remote: Counting objects: 1057520, done.
remote: Compressing objects: 100% (173057/173057), done.
Resolving deltas: 100% (881801/881801), done.
Checking out files: 100% (26544/26544), done.
# git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
# git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
Bisecting: 955 revisions left to test after this
[5ed1836814d908f45cafde0e79cb85314ab9d41d] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
The EXTRAVERSION in the makefile at this point was "plain" 2.6.28 (which seems odd to me)
I renamed it to "-a1" and compiled kernel.
Installed/rebooted.
after 16 hours (overnight) of no troubles i think it seemed stable:
reboot system boot 2.6.28-a1 Fri Jan 9 16:54 - 09:29 (16:34)
# git bisect good
Bisecting: 477 revisions left to test after this
M scripts/package/Makefile
D scripts/package/builddeb
[7b2cd079ec8dcc65cdca6621245cfa5e30a8ef9f] V4L/DVB (10007): gspca - m5602: Refactor the error handling in the s5k83a
version "-a2" was branded "ok" by me after 3 hours of heavy use:
reboot system boot 2.6.28-a2 Sat Jan 10 09:31 - 12:37 (03:06)
# git bisect good
Bisecting: 262 revisions left to test after this
[6094c85a935f7eadb4c607c6dc6d86c0a9f09a4b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Version "-a3" also got my blessing:
reboot system boot 2.6.28-a3 Sat Jan 10 12:39 - 17:55 (05:15)
# git bisect good
Bisecting: 131 revisions left to test after this
[a1941895034cda2bffa23ba845607c82138ccf52] [XFS] remove dead code for old inode item recovery
This time the Makefile EXTRAVERSION changed to "-rc6".
I was/am under the impression that I'm testing between 2.6.28-git2 and 2.6.28-git3,
so why is the Makefile going this far back? Dazed & confused.
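A likely explanation for the version appearing to go backwards: the range between the two snapshots contains merges of topic branches (XFS among them), and those branches were forked from an older -rc release. When bisect checks out a commit on such a branch, the top-level Makefile still carries the version of the branch point, even though the commit only reached mainline after 2.6.28. The sketch below just reads the version fields out of a Makefile so each bisect step can be labelled; `kernel_version` is a made-up helper, and the field layout is assumed to be the usual four assignments at the top of the file.

```python
import re

# Matches the four version assignments at the top of the kernel Makefile.
# [ \t]* is used instead of \s* so an empty EXTRAVERSION cannot swallow
# the following line.
_FIELD = re.compile(
    r"^(VERSION|PATCHLEVEL|SUBLEVEL|EXTRAVERSION)[ \t]*=[ \t]*(.*)$",
    re.MULTILINE,
)


def kernel_version(makefile_text: str) -> str:
    """Build a version string such as '2.6.28-rc6' from Makefile text."""
    fields = {}
    for name, value in _FIELD.findall(makefile_text):
        fields.setdefault(name, value.strip())   # keep the first assignment
    return "{}.{}.{}{}".format(
        fields.get("VERSION", ""),
        fields.get("PATCHLEVEL", ""),
        fields.get("SUBLEVEL", ""),
        fields.get("EXTRAVERSION", ""),
    )
```

The string in the Makefile says where the branch started, not where the commit sits between the good and bad endpoints, so it is not a sign that the bisect has left the git2..git3 range.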
"-a4" barfed
lenny32:/usr/src/linux-git# git bisect bad
Bisecting: 65 revisions left to test after this
[6441e549157b749bae003cce70b4c8b62e4801fa] [XFS] factor xfs_iget_core() into hit and miss cases
EXTRAVERSION went back to "-rc2"?!
"-a5" totally froze the machine (didn't catch anything on the netconsole)
So i interpreted that as a "fail"
# git bisect bad
Bisecting: 32 revisions left to test after this
[fd6bcc5b63051392ba709a8fd33173b263669e0a] [XFS] kill xfs_bmbt_log_block and xfs_bmbt_log_recs
Now "-a6" doesn't want to compile:
# make-kpkg kernel-image --initrd
exec debian/rules DEBIAN_REVISION=2.6.28-a6-10.00.Custom INITRD=YES kernel-image
====== making target debian/stamp/build/kernel [new prereqs: conf.vars]======
This is kernel package version 11.015.
test ! -f scripts/package/builddeb.kpkg-dist || mv -f scripts/package/builddeb.kpkg-dist scripts/package/builddeb
test ! -f scripts/package/Makefile.kpkg-dist || mv -f scripts/package/Makefile.kpkg-dist scripts/package/Makefile
/usr/bin/make -j2 ARCH=i386 \
bzImage
make[1]: Entering directory `/usr/src/linux-git'
CHK include/linux/version.h
CHK include/linux/utsrelease.h
CALL scripts/checksyscalls.sh
CHK include/linux/compile.h
CC fs/xfs/xfs_alloc_btree.o
fs/xfs/xfs_alloc_btree.c:38:29: error: xfs_btree_trace.h: No such file or directory
fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_alloc_block’:
fs/xfs/xfs_alloc_btree.c:84: error: implicit declaration of function ‘XFS_BTREE_TRACE_CURSOR’
fs/xfs/xfs_alloc_btree.c:84: error: ‘XBT_ENTRY’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:84: error: (Each undeclared identifier is reported only once
fs/xfs/xfs_alloc_btree.c:84: error: for each function it appears in.)
fs/xfs/xfs_alloc_btree.c:90: error: ‘XBT_ERROR’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:95: error: ‘XBT_EXIT’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_kill_root’:
fs/xfs/xfs_alloc_btree.c:291: error: ‘XBT_ENTRY’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:301: error: ‘XBT_ERROR’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:310: error: ‘XBT_EXIT’ undeclared (first use in this function)
make[3]: *** [fs/xfs/xfs_alloc_btree.o] Error 1
make[2]: *** [fs/xfs] Error 2
make[1]: *** [fs] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/usr/src/linux-git'
make: *** [debian/stamp/build/kernel] Error 2
In the meantime I compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed.
All netconsole logs are in my directory: http://www.dth.net/kernel/c3/
[F1] ;-)
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
[not found] <20090107165218.GA11132@dth.net>
2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
@ 2009-01-14 19:44 ` Tino Keitel
1 sibling, 0 replies; 27+ messages in thread
From: Tino Keitel @ 2009-01-14 19:44 UTC (permalink / raw)
To: linux-kernel; +Cc: xfs
[added CC:]
On Wed, Jan 07, 2009 at 17:52:18 +0100, dth wrote:
> Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs)
> server. (uses about 20 watts when both storage drives are in powersave mode)
>
> OS = debian lenny
>
> Pimairy drive (60gig 2.5" pata disk using libata = sda)
> 512MB ext3 partition as /boot
> swap, rest of the drive is XFS root file system.
>
> storage:
> 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus)
> sdb & sdc which have 1 big XFS partition.
>
> Any kernel i try to boot after 2.6.28-git2 (i tried git4-9)
> sooner or later gives me an XFS error:
[...]
> Anyone else bitten by the same problem ?
Hi,
I get the same on a Mac mini Core Duo using 2.6.29-rc1. I just booted,
started firefox and my /home died.
XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file linux-2.6/fs/xfs/xfs_btree.c. Caller 0xc022e9c6
Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11
Call Trace:
[<c022e6d9>] xfs_btree_delrec+0xbd9/0xea0
[<c022e9c6>] xfs_btree_delete+0x26/0x90
[<c022b355>] xfs_btree_lookup_get_block+0xa5/0xe0
[<c022933a>] xfs_bmbt_init_key_from_rec+0xa/0x20
[<c022c6ba>] xfs_btree_lookup+0x21a/0x460
[<c022e9c6>] xfs_btree_delete+0x26/0x90
[<c02255ff>] xfs_bmap_del_extent+0x64f/0xb70
[<c02262ea>] xfs_bunmapi+0x65a/0xb30
[<c0244f6b>] xfs_itruncate_finish+0x1ab/0x3e0
[<c026115d>] xfs_inactive+0x3cd/0x4f0
[<c019113c>] clear_inode+0x5c/0x100
[<c01917fe>] generic_delete_inode+0xce/0xe0
[<c0190b24>] iput+0x44/0x50
[<c01898a0>] do_unlinkat+0xe0/0x160
[<c0188180>] lock_rename+0x0/0xa0
[<c01033c5>] sysenter_do_call+0x12/0x25
Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1164 of file linux-2.6/fs/xfs/xfs_trans.c.
Caller 0xc0261173
Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11
Call Trace:
[<c025a54b>] xfs_trans_cancel+0xcb/0xf0
[<c0261173>] xfs_inactive+0x3e3/0x4f0
[<c0261173>] xfs_inactive+0x3e3/0x4f0
[<c019113c>] clear_inode+0x5c/0x100
[<c01917fe>] generic_delete_inode+0xce/0xe0
[<c0190b24>] iput+0x44/0x50
[<c01898a0>] do_unlinkat+0xe0/0x160
[<c0188180>] lock_rename+0x0/0xa0
[<c01033c5>] sysenter_do_call+0x12/0x25
xfs_force_shutdown(dm-1,0x8) called from line 1165 of file linux-2.6/fs/xfs/xfs_trans.c. Return address = 0xc025a563
Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1
Please umount the filesystem, and rectify the problem(s)
Regards,
Tino
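Each frame in the call traces above has the form `[<address>] symbol+0xoffset/0xsize`. To map a frame to a source line with gdb, as asked earlier in the thread, the symbol and offset are the useful parts. The sketch below pulls them out of a trace line and builds the corresponding gdb `list` command. It assumes a `vmlinux` built with debug info from exactly the same tree and config; `parse_frame` and `gdb_list_command` are illustrative names.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: int   # absolute address printed in [<...>]
    symbol: str    # function name
    offset: int    # offset into the function
    size: int      # total size of the function


_FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one call-trace line; return None if it is not a frame."""
    m = _FRAME.search(line)
    if m is None:
        return None
    return Frame(int(m.group(1), 16), m.group(2),
                 int(m.group(3), 16), int(m.group(4), 16))


def gdb_list_command(frame: Frame) -> str:
    """Command to type at the gdb prompt after running 'gdb vmlinux'."""
    return f"list *({frame.symbol}+{frame.offset:#x})"
```

For the top frame of the trace above this yields `list *(xfs_btree_delrec+0xbd9)`. For frames further down, the offset is a return address, so the line shown is the one just after the call.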
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-13 20:04 ` Danny ter Haar
@ 2009-01-16 20:43 ` Danny ter Haar
2009-01-17 7:38 ` Dave Chinner
0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-16 20:43 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
Quoting Danny ter Haar (dth@dth.net):
> I'm stuck in trying to bisect the problem.
> I restarted from scratch and same result:
[SNIP for bravery]
> In the meantime I compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed.
>
> All netconsole logs are in my directory: http://www.dth.net/kernel/c3/
>
> [F1] ;-)
This was my way of saying "HELP".
I'm stuck..
Anyone have some tips/hints/suggestions (like find another hobby ;-) )?
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-16 20:43 ` Danny ter Haar
@ 2009-01-17 7:38 ` Dave Chinner
2009-01-17 23:25 ` Danny ter Haar
0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-17 7:38 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Fri, Jan 16, 2009 at 09:43:46PM +0100, Danny ter Haar wrote:
> Quoting Danny ter Haar (dth@dth.net):
> > I'm stuck in trying to bisect the problem.
> > I restarted from scratch and same result:
>
> [SNIP for bravery]
>
> > In the meantime I compiled/ran 2.6.29-rc1-git3 but it (as expected) barfed.
> >
> > All netconsole logs are in my directory: http://www.dth.net/kernel/c3/
> >
> > [F1] ;-)
>
> This was my way of saying "HELP".
Sorry for not getting back to you sooner. I think that Alexander
tripped over this same problem during his bisect. If you follow the
thread from here:
http://oss.sgi.com/archives/xfs/2009-01/msg00496.html
You'll see that Alexander had the same problem and managed
to continue the bisect once he copied the xfs_btree_trace.h
header file from top-of-tree back into the broken commits.
I hope this helps (and I hope that the bisect lands on the
same commit that it did for Alexander).
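One way to do the header copy without leaving the bisect is `git show <rev>:<path>`, which prints a file as it exists in another revision. The sketch below wraps that in Python. It assumes `origin/master` stands in for "top-of-tree" and that the header lives at `fs/xfs/xfs_btree_trace.h`, as the compiler error suggests; the function names are made up.

```python
import subprocess
from pathlib import Path
from typing import List


def git_show_argv(rev: str, path: str) -> List[str]:
    """Command that prints `path` as it exists in revision `rev`."""
    return ["git", "show", f"{rev}:{path}"]


def restore_file(repo: str, rev: str, path: str) -> None:
    """Write the version of `path` from `rev` into the working tree of `repo`."""
    result = subprocess.run(git_show_argv(rev, path), cwd=repo,
                            check=True, capture_output=True)
    (Path(repo) / path).write_bytes(result.stdout)


# Example (assumed revision name and repository location):
# restore_file("/usr/src/linux-git", "origin/master", "fs/xfs/xfs_btree_trace.h")
```

The copied file is untracked in the broken commits, so it probably needs to be deleted again before marking the revision, otherwise the next checkout may refuse to overwrite it.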
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-17 7:38 ` Dave Chinner
@ 2009-01-17 23:25 ` Danny ter Haar
2009-01-18 2:50 ` Danny ter Haar
2009-01-19 3:17 ` Dave Chinner
0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-17 23:25 UTC (permalink / raw)
To: Christoph Hellwig, xfs
Quoting Dave Chinner (david@fromorbit.com):
> Sorry for not getting back to you sooner.
No problem. I initially posted to LKML, got redirected by Christoph to this
list. I'm so stupid that I didn't check the other messages from this list.
Sorry.
> I think that Alexander tripped over this same problem during his bisect.
> If you follow the thread from here:
> http://oss.sgi.com/archives/xfs/2009-01/msg00496.html
Yep! [cheer] i'm not alone! :-)
But why only us two ? there must be thousands of users out there using
XFS. Why did it bite us ? large filesystem together with slow hardware ?
> You'll see that Alexander had the same problem and managed
> to continue the bisect once he copied the xfs_btree_trace.h
> header file from top-of-tree back into the broken commits.
Great.
> I hope this helps (and I hope that the bisect lands on the
> same commit that it did for Alexander).
Do you still want me to try it?
I think you already figured out where the culprit is?!
I saw changes in the announcement of 2.6.29-rc2 and took the plunge:
# procinfo
Memory: Total Used Free Buffers
RAM: 506940 447868 59072 84
Swap: 497972 0 497972
Bootup: Sat Jan 17 10:28:27 2009 Load average: 0.03 0.11 0.09 2/104 5259
user : 00:09:30.12 3.2% page in : 417582
nice : 00:00:00.00 0.0% page out: 1220260
system: 00:02:01.76 0.7% page act: 41134
IOwait: 00:03:34.28 1.2% page dea: 13444
hw irq: 00:00:01.94 0.0% page flt: 1531395
sw irq: 00:00:04.50 0.0% swap in : 0
idle : 04:39:41.90 94.8% swap out: 0
uptime: 04:54:55.09 context : 892623
irq 0: 799012 timer irq 10: 101465 eth0
irq 1: 8 i8042 irq 11: 46893 sata_promise
irq 2: 0 cascade irq 12: 0 uhci_hcd:usb1, uh
irq 5: 0 acpi irq 14: 50059 pata_via
irq 7: 1 parport0 irq 15: 0 pata_via
sda 15445r 22597w sdb 2820r 23372w
sda1 555r 284w sdb1 2750r 23372w
sda2 2r 0w sdc 63r 3w
sda5 136r 0w sdc1 50r 3w
sda6 14659r 22313w
lo TX 59.65KiB RX 59.65KiB eth0 TX 13.40MiB RX 23.78MiB
Over 4 hours of uptime and moderate usage, so I'm not 100% convinced, but this one
looks good (so far).
Let me know if I should pursue this some more.
Thanks for all the help.
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-17 23:25 ` Danny ter Haar
@ 2009-01-18 2:50 ` Danny ter Haar
2009-01-19 3:17 ` Dave Chinner
1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-18 2:50 UTC (permalink / raw)
To: Christoph Hellwig, xfs
> I saw changes in the announcement of 2.6.29-rc2 and took the plunge:
> over 4 hours of uptime and moderate usage, so i'm not 100% convinced but this one
> looks good (so far)
reboot system boot 2.6.29-rc2 Sat Jan 17 10:28 - crash (08:20)
ok, not stable (for me)
Danny
--
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
2009-01-17 23:25 ` Danny ter Haar
2009-01-18 2:50 ` Danny ter Haar
@ 2009-01-19 3:17 ` Dave Chinner
1 sibling, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2009-01-19 3:17 UTC (permalink / raw)
To: Danny ter Haar; +Cc: Christoph Hellwig, xfs
On Sun, Jan 18, 2009 at 12:25:11AM +0100, Danny ter Haar wrote:
> Quoting Dave Chinner (david@fromorbit.com):
> > Sorry for not getting back to you sooner.
>
> No problem. I initially posted to LKML, got redirected by Christoph to this
> list. I'm so stupid that I didn't check the other messages from this list.
> Sorry.
>
> > I think that Alexander tripped over this same problem during his bisect.
> > If you follow the thread from here:
> > http://oss.sgi.com/archives/xfs/2009-01/msg00496.html
>
> Yep! [cheer] i'm not alone! :-)
> But why only us two ? there must be thousands of users out there using
> XFS. Why did it bite us ? large filesystem together with slow hardware ?
No idea - I can't reproduce it either so there's some state
that your filesystem is getting into that trips over it.
> > You'll see that Alexander had the same problem and managed
> > to continue the bisect once he copied the xfs_btree_trace.h
> > header file from top-of-tree back into the broken commits.
>
> Great.
>
> > I hope this helps (and I hope that the bisect lands on the
> > same commit that it did for Alexander).
>
> Do you still want me to try it?
> I think you already figured out where the culprit is?!
Yes, I think we have, but it wasn't totally conclusive. Can you
continue your bisect to see if it narrows down to the same commit
on your machine?
I'm still trying to reproduce it but I haven't worked out what the
initial state is. One thing that might be useful is to put a printk
into the kernel on the failure path that prints the inode number
out (e.g. at the goto that the WANT_CORRUPTED_GOTO jumps to). Then
we can use xfs_db to find the file that is causing the problem and
then use xfs_db or xfs_bmap to look at the extent tree prior to
the corruption. That might help me set up the initial state needed
to trip the problem.....
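Once a printk has produced an inode number, the lookup steps could look roughly like the sketch below. It only builds the command lines; the device `/dev/sdb1`, the mount point and the inode number are placeholders, and the helper names are made up. `xfs_db -r` opens the device read-only, `inode N` selects the inode and `bmap` prints its block map; `find -inum` is one simple way to get from the inode number to a path on a mounted filesystem.

```python
from typing import List


def xfs_db_bmap_argv(device: str, inode: int) -> List[str]:
    """xfs_db invocation that prints the extent map of one inode (read-only)."""
    return ["xfs_db", "-r", "-c", f"inode {inode}", "-c", "bmap", device]


def find_by_inode_argv(mountpoint: str, inode: int) -> List[str]:
    """find invocation that prints the path(s) belonging to an inode number."""
    return ["find", mountpoint, "-xdev", "-inum", str(inode)]


# Example with placeholder values:
# print(" ".join(xfs_db_bmap_argv("/dev/sdb1", 12345)))
# print(" ".join(find_by_inode_argv("/mnt/storage", 12345)))
```

With the path in hand, `xfs_bmap -v <file>` on a kernel that does not trip the error should show the extent layout before the corruption.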
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com