public inbox for linux-xfs@vger.kernel.org
* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
       [not found] <20090107165218.GA11132@dth.net>
@ 2009-01-07 18:02 ` Christoph Hellwig
  2009-01-07 18:24   ` Danny ter Haar
  2009-01-14 19:44 ` Tino Keitel
  1 sibling, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:02 UTC (permalink / raw)
  To: dth; +Cc: xfs

On Wed, Jan 07, 2009 at 05:52:18PM +0100, dth wrote:
> Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs)
> server. (uses about 20 watts when both storage drives are in powersave mode)
> 
> OS = debian lenny
> 
> Primary drive (60gig 2.5" pata disk using libata = sda)
>  512MB ext3 partition as /boot
>  swap, rest of the drive is XFS root file system.
> 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus)
> sdb & sdc which have 1 big XFS partition.
> 
> Any kernel i try to boot after 2.6.28-git2 (i tried git4-9) 
> sooner or later gives me an XFS error:

The iunlinks recovery code changed recently.  Do you still have the
exact kernel tree and config around, so you can use gdb to find the
exact line of code where the oops happens?

> [   21.535820] XFS mounting filesystem sdb1
> [   21.646126] Starting XFS recovery on filesystem: sdb1 (logdev: internal)
> [   21.747772] XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file 
>                fs/xfs/xfs_btree.c.  Caller 0xc01e9d71
> [   21.747916] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1
> [   21.748033] Call Trace:
> [   21.748130]  [<c01e98b9>] xfs_btree_delrec+0x5b9/0xa50
> [   21.748247]  [<c01e9d71>] xfs_btree_delete+0x21/0x70
> [   21.748339]  [<c01e72ff>] xfs_btree_read_buf_block+0x4f/0x70
> [   21.748462]  [<c01e5dda>] xfs_bmbt_init_key_from_rec+0xa/0x20
> [   21.748556]  [<c01e66d5>] xfs_lookup_get_search_key+0x25/0x40
> [   21.748680]  [<c01e9d71>] xfs_btree_delete+0x21/0x70
> [   21.748771]  [<c01e2ad5>] xfs_bmap_del_extent+0x2f5/0x960
> [   21.748894]  [<c01e383e>] xfs_bunmapi+0x5be/0x980
> [   21.749000]  [<c01fba82>] xfs_itruncate_finish+0x1c2/0x2f0
> [   21.749137]  [<c0210852>] xfs_inactive+0x1d2/0x3d0
> [   21.749228]  [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0
> [   21.749354]  [<c01775bc>] clear_inode+0x5c/0xb0
> [   21.749443]  [<c0177a8c>] generic_delete_inode+0x6c/0xc0
> [   21.749559]  [<c0177157>] iput+0x47/0x50
> [   21.749644]  [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0
> [   21.749768]  [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0
> [   21.749863]  [<c020584f>] xlog_recover_finish+0x3f/0x90
> [   21.749980]  [<c0209390>] xfs_mountfs+0x450/0x550
> [   21.750126]  [<c02122e6>] kmem_alloc+0x56/0xb0
> [   21.750216]  [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [   21.750344]  [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [   21.750441]  [<c016b608>] get_sb_bdev+0xe8/0x130
> [   21.750556]  [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [   21.750646]  [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [   21.750760]  [<c016a949>] vfs_kern_mount+0x39/0x80
> [   21.750850]  [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [   21.750972]  [<c017b3e5>] do_mount+0x5a5/0x5f0
> [   21.751061]  [<c016c35f>] sys_stat64+0xf/0x30
> [   21.751186]  [<c0150271>] __get_free_pages+0x11/0x30
> [   21.751280]  [<c017b496>] sys_mount+0x66/0xa0
> [   21.751394]  [<c0102c72>] syscall_call+0x7/0xb
> [   21.751494] Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c.  Caller 0xc021086b
> [   21.751514] 
> [   21.751718] Pid: 1392, comm: mount Not tainted 2.6.28-git8 #1
> [   21.751799] Call Trace:
> [   21.751893]  [<c020b586>] xfs_trans_cancel+0x46/0xd0
> [   21.751986]  [<c021086b>] xfs_inactive+0x1eb/0x3d0
> [   21.752102]  [<c021086b>] xfs_inactive+0x1eb/0x3d0
> [   21.752192]  [<c01fc33d>] xfs_imap_to_bp+0x5d/0xd0
> [   21.752307]  [<c01775bc>] clear_inode+0x5c/0xb0
> [   21.752396]  [<c0177a8c>] generic_delete_inode+0x6c/0xc0
> [   21.752512]  [<c0177157>] iput+0x47/0x50
> [   21.752596]  [<c02056fe>] xlog_recover_process_one_iunlink+0xae/0xe0
> [   21.752719]  [<c02057a7>] xlog_recover_process_iunlinks+0x77/0xe0
> [   21.752815]  [<c020584f>] xlog_recover_finish+0x3f/0x90
> [   21.752930]  [<c0209390>] xfs_mountfs+0x450/0x550
> [   21.753022]  [<c02122e6>] kmem_alloc+0x56/0xb0
> [   21.753136]  [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [   21.753232]  [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [   21.753348]  [<c016b608>] get_sb_bdev+0xe8/0x130
> [   21.753439]  [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [   21.753555]  [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [   21.753645]  [<c016a949>] vfs_kern_mount+0x39/0x80
> [   21.753760]  [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [   21.753852]  [<c017b3e5>] do_mount+0x5a5/0x5f0
> [   21.753965]  [<c016c35f>] sys_stat64+0xf/0x30
> [   21.754054]  [<c0150271>] __get_free_pages+0x11/0x30
> [   21.754172]  [<c017b496>] sys_mount+0x66/0xa0
> [   21.754258]  [<c0102c72>] syscall_call+0x7/0xb
> [   21.754378] xfs_force_shutdown(sdb1,0x8) called from line 1165 of file fs/xfs/xfs_trans.c.  Return address = 0xc020b59c
> [   21.754522] Filesystem "sdb1": Corruption of in-memory data detected.  Shutting down filesystem: sdb1
> [   21.754661] Please umount the filesystem, and rectify the problem(s)
> [   21.754814] BUG: unable to handle kernel NULL pointer dereference at 00000054
> [   21.754990] IP: [<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0
> [   21.755137] *pde = 00000000 
> [   21.755263] Oops: 0000 [#1] 
> [   21.755385] last sysfs file: /sys/class/net/lo/operstate
> [   21.755490] Modules linked in: i2c_viapro i2c_core via_agp agpgart sata_promise sd_mod
> [   21.755906] 
> [   21.755976] Pid: 1392, comm: mount Not tainted (2.6.28-git8 #1) EPIA
> [   21.756091] EIP: 0060:[<c02057bf>] EFLAGS: 00010296 CPU: 0
> [   21.756181] EIP is at xlog_recover_process_iunlinks+0x8f/0xe0
> [   21.756293] EAX: 00000000 EBX: de827f00 ECX: 00000005 EDX: df231000
> [   21.756384] ESI: ffffffff EDI: df231000 EBP: 00000026 ESP: df383e14
> [   21.756501]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [   21.756592] Process mount (pid: 1392, ti=df382000 task=deb37b60 task.ti=df382000)
> [   21.756712] Stack:
> [   21.756792]  df383e24 00000026 00000003 00000000 00000000 df2572e0 00000000 00000003
> [   21.757192]  df231000 c020584f 00000003 00000000 00000000 c0209390 00000017 00000000
> [   21.757691]  08000004 00000000 00000000 00000001 00000058 00000000 c02122e6 00000001
> [   21.758243] Call Trace:
> [   21.758327]  [<c020584f>] xlog_recover_finish+0x3f/0x90
> [   21.758476]  [<c0209390>] xfs_mountfs+0x450/0x550
> [   21.758615]  [<c02122e6>] kmem_alloc+0x56/0xb0
> [   21.758756]  [<c020995d>] xfs_mru_cache_create+0xdd/0x120
> [   21.758908]  [<c021ac92>] xfs_fs_fill_super+0x152/0x2a0
> [   21.759059]  [<c016b608>] get_sb_bdev+0xe8/0x130
> [   21.759200]  [<c0219602>] xfs_fs_get_sb+0x12/0x20
> [   21.759350]  [<c021ab40>] xfs_fs_fill_super+0x0/0x2a0
> [   21.759500]  [<c016a949>] vfs_kern_mount+0x39/0x80
> [   21.759647]  [<c016a9cf>] do_kern_mount+0x2f/0xc0
> [   21.759794]  [<c017b3e5>] do_mount+0x5a5/0x5f0
> [   21.759934]  [<c016c35f>] sys_stat64+0xf/0x30
> [   21.760103]  [<c0150271>] __get_free_pages+0x11/0x30
> [   21.760255]  [<c017b496>] sys_mount+0x66/0xa0
> [   21.760396]  [<c0102c72>] syscall_call+0x7/0xb
> [   21.760535] Code: e8 e7 f6 00 00 55 89 f1 8b 54 24 04 89 f8 e8 a9 fe ff ff 89 c6 8d 44 24 0c 50 8b 4c 24 08 31 d2
> [   27.297839] NET: Registered protocol family 10
> [   38.240089] eth0: no IPv6 routers present
> [   38.995204] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> [   38.995468] NFSD: starting 90-second grace period
> 
> 
> When i reboot to git2 again and run a xfs_check and/or xfs_repair nothing shows up.
> It just works without problems. So i think it's not actually an XFS problem but
> something "underneath"
> It's slow hardware and i have been reading about locking situations, which could be
> triggered more easily on my (slow) hardware than on any modern (faster) hardware.
> 
> Anyone else bitten by the same problem ?
> 
> Danny
---end quoted text---

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
@ 2009-01-07 18:24   ` Danny ter Haar
  2009-01-07 18:31     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> > Any kernel i try to boot after 2.6.28-git2 (i tried git4-9) 
> > sooner or later gives me an XFS error:

> The iunlinks recovery code changed recently.  Do you still have the
> exact kernel tree and config around, so you can use gdb to find the
> exact line of code where the oops happens?

Since compiling a kernel on the native hardware takes "forever"
i compile them at my laptop (ubuntu 32bits) and move the kernel
to the NAS.
This morning i compiled and tried git9 (still an error).
I have copied the config & system.map file to the same dir on
my server ( http://www.dth.net/kernel/c3/ )


I'm not familiar with this kind of debugging; do you have a pointer to where
i could find out how to do this?
In the meantime i'll try to google some info on how...

I could copy the whole source tree over to the machine and give you 
(root) access to the machine so you can take a look yourself (if needed).

Danny

-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:24   ` Danny ter Haar
@ 2009-01-07 18:31     ` Christoph Hellwig
  2009-01-07 18:44       ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:31 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:24:15PM +0100, Danny ter Haar wrote:
> Since compiling a kernel on the native hardware takes "forever"
> i compile them at my laptop (ubuntu 32bits) and move the kernel
> to the NAS.
> This morning i compiled and tried git9 (still an error).

An error when compiling the kernel, or did you get the same error again?
In case you can actually reproduce this error, I would be very
interested in a metadump of the filesystem that causes this error.

> I have copied the config & system.map file to the same dir on
> my server ( http://www.dth.net/kernel/c3/ )
> 
> 
> I'm not familiar with this kind of debugging; do you have a pointer to where
> i could find out how to do this?
> In the meantime i'll try to google some info on how...
> 
> I could copy the whole source tree over to the machine and give you 
> (root) access to the machine so you can take a look yourself (if needed).

The debugging should be really easy as long as you actually have a tree /
config with which the oops happened so that the oops has the same
addresses.  Compared to your config we would need CONFIG_DEBUG_INFO,
but that's something we could turn on after the fact as it shouldn't
change the reported address.  After that we can just do a trivial
command in gdb to find the source lines for the addresses reported
(this can easily be done on the system where you compile the kernel,
doesn't have to be on the NAS box).

Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
in case you can reproduce this issue.  Just make sure to turn it
off again when you want to use the box for a real workload..
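
The address lookup described above can be sketched as follows. This is a
minimal illustration, not part of the original exchange: the Python helper
gdb_commands and its regular expression are hypothetical names introduced
here. It pulls symbol+offset pairs out of oops text and turns each one into a
gdb "list *(symbol+offset)" command, assuming the vmlinux image was built with
CONFIG_DEBUG_INFO from the same tree and config that produced the oops.

```python
import re

# Matches call-trace entries such as
# "[<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0"
TRACE_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")


def gdb_commands(oops_text):
    """Return one gdb 'list' command per symbol+offset found in the oops text."""
    commands = []
    for _addr, symbol, offset, _size in TRACE_RE.findall(oops_text):
        # gdb resolves *(symbol+offset) to a source file and line
        commands.append(f"list *({symbol}+{offset})")
    return commands


sample = "IP: [<c02057bf>] xlog_recover_process_iunlinks+0x8f/0xe0"
for cmd in gdb_commands(sample):
    print(cmd)
```

The printed commands can be typed into a "gdb vmlinux" session on the machine
where the kernel was compiled; nothing needs to run on the NAS itself.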


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:31     ` Christoph Hellwig
@ 2009-01-07 18:44       ` Danny ter Haar
  2009-01-07 18:52         ` Christoph Hellwig
  2009-01-07 18:56         ` Christoph Hellwig
  0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 18:44 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> An error when compiling the kernel, or did you get the same error again?

same error while trying to boot (this time on sda (pata with libata))
root filesystem of the OS partition.

> In case you can actually reproduce this error, I would be very
> interested in a metadump of the filesystem that causes this error.

See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt

> The debugging should be really easy as long as you actually have a tree /
> config 

kernel tree is still on my laptop available.

> with which the oops happened so that the oops has the same
> addresses.
> Compared to your config we would need CONFIG_DEBUG_INFO,
> but that's something we could turn on after the fact as it shouldn't
> change the reported address.  After that we can just do a trivial
> command in gdb to find the source lines for the addresses reported
> (this can easily be done on the system where you compile the kernel,
> doesn't have to be on the NAS box).

ok, pointer to a howto by any chance?
still looking/googling. [think i found something]

> Btw, turning on CONFIG_XFS_DEBUG would be useful to see more output
> in case you can reproduce this issue.  Just make sure to turn it
> off again when you want to use the box for a real workload..

Will compile a new kernel in the mean time with those 2 options:
CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO


-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:44       ` Danny ter Haar
@ 2009-01-07 18:52         ` Christoph Hellwig
  2009-01-07 22:09           ` Danny ter Haar
  2009-01-08  0:38           ` Danny ter Haar
  2009-01-07 18:56         ` Christoph Hellwig
  1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:52 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Will compile a new kernel in the mean time with those 2 options:
> CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO

Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
be useful if you could reproduce it, but it does change the addresses,
so it's harmful for the case of getting back to the source line.


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:44       ` Danny ter Haar
  2009-01-07 18:52         ` Christoph Hellwig
@ 2009-01-07 18:56         ` Christoph Hellwig
  2009-01-07 19:01           ` Danny ter Haar
  2009-01-08 21:56           ` Danny ter Haar
  1 sibling, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-07 18:56 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > An error when compiling the kernel, or did you get the same error again?
> 
> same error while trying to boot (this time on sda (pata with libata))
> root filesystem of the OS partition.
> 
> > In case you can actually reproduce this error, I would be very
> > interested in a metadump of the filesystem that causes this error.
> 
> See the errorlog on: http://www.dth.net/kernel/c3/netconsole_2.6.28-git9-error.txt

This looks like pretty much the same corruption you saw in your first
report.  Did you run xfs_repair before this one?


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:56         ` Christoph Hellwig
@ 2009-01-07 19:01           ` Danny ter Haar
  2009-01-08 21:56           ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 19:01 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> This looks like pretty much the same corruption you saw in your first
> report.  Did you run xfs_repair before this one?

Yes!
and with git2: no problem.
The git8 error is on the storage drive(s) sdb1/sdc1 while
the git9 is on the /dev/sda6 partition (containing the root filesystem)

But it's completely repeatable.

Compiling takes about 45 minutes on my laptop; will report later.

-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:52         ` Christoph Hellwig
@ 2009-01-07 22:09           ` Danny ter Haar
  2009-01-08  0:38           ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-07 22:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.

Hmm, one bug is never alone...

During compile my laptop started smelling because of heat buildup and switched itself
off. I re-compiled the kernel on a debian lenny32 box i have access to.
I only added the "CONFIG_DEBUG_INFO" option.

Rebooted the nas, and so far the machine is running without a problem :-(
Am now gonna compile a kernel on the machine itself, to force a lot
of disk activity.

Will keep you posted.



-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:52         ` Christoph Hellwig
  2009-01-07 22:09           ` Danny ter Haar
@ 2009-01-08  0:38           ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08  0:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> On Wed, Jan 07, 2009 at 07:44:20PM +0100, Danny ter Haar wrote:
> > Will compile a new kernel in the mean time with those 2 options:
> > CONFIG_XFS_DEBUG & CONFIG_DEBUG_INFO
> 
> Please only with CONFIG_DEBUG_INFO in that case.  CONFIG_XFS_DEBUG would
> be useful if you could reproduce it, but it does change the addresses,
> so it's harmful for the case of getting back to the source line.
> 

Okay, it barfed again:

Let me first explain that i've set the sleep timer for the storage hard drives
to 24 (x 5 = 120 sec) in hdparm.conf:

/dev/sdb {
        spindown_time = 24
        }

/dev/sdc {
        spindown_time = 24
        }
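
The arithmetic behind the value 24 giving 120 seconds can be sketched as
below. This assumes spindown_time uses the same encoding as hdparm -S, where
values 1 to 240 are multiples of 5 seconds and values 241 to 251 are multiples
of 30 minutes. The function name spindown_seconds is made up for this sketch,
and the special values 252 to 255 are deliberately not modelled.

```python
def spindown_seconds(value):
    """Convert an hdparm -S style spindown value to seconds.

    Returns None when the value disables the standby timeout.
    """
    if value == 0:
        return None                       # 0 disables the timeout
    if 1 <= value <= 240:
        return value * 5                  # multiples of 5 seconds
    if 241 <= value <= 251:
        return (value - 240) * 30 * 60    # 1..11 units of 30 minutes
    raise ValueError(f"special or invalid value not modelled: {value}")


print(spindown_seconds(24))
```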

This is the error (snapped using netconsole)

http://www.dth.net/kernel/c3/netconsole_2.6.28-git9d-error.txt

I rebooted again with the git2 kernel and xfs_check gave no errors:

filer1:~# xfs_check /dev/sdb1 
filer1:~# 

-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-07 18:56         ` Christoph Hellwig
  2009-01-07 19:01           ` Danny ter Haar
@ 2009-01-08 21:56           ` Danny ter Haar
  2009-01-09  0:46             ` Dave Chinner
  1 sibling, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-08 21:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs


I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt

I re-compiled git2 (with debug support) and <knock wood> so far it is stable:

# procinfo
Memory:        Total        Used        Free     Buffers                       
RAM:          507116      258048      249068         428                       
Swap:         497972           0      497972                                   

Bootup: Thu Jan  8 12:00:04 2009   Load average: 0.10 0.37 0.19 2/104 2767     

user  :   00:02:55.96   2.6%  page in :           183385                       
nice  :   00:00:00.00   0.0%  page out:           497130                       
system:   00:00:47.36   0.7%  page act:            23771                       
IOwait:   00:00:28.21   0.4%  page dea:                0                       
hw irq:   00:00:02.10   0.0%  page flt:           698907                       
sw irq:   00:00:03.37   0.0%  swap in :                0                       
idle  :   01:48:06.90  96.2%  swap out:                0                       
uptime:   01:52:23.93         context :           323529                       

irq   0:     293258  timer               irq  10:     112849  eth0             
irq   1:          8  i8042               irq  11:      42028  sata_promise     
irq   2:          0  cascade             irq  12:          0  uhci_hcd:usb1, uh
irq   5:          0  acpi                irq  14:       7629  pata_via         
irq   7:          1  parport0            irq  15:          0  pata_via         

sda             2740r            3576w   sdb             1037r             325w
sda1              70r               2w   sdb1            1024r             325w
sda2               2r               0w   sdc             2615r           14257w
sda5              25r               0w   sdc1            2602r           14257w
sda6            2636r            3574w                                         

lo          TX 5.68MiB       RX 5.68MiB       eth0        TX 86.38MiB      RX 10.32MiB     
-------------------
So (in my case) something while going from git2 -> git3 didn't go positive.
If i could get some explanation of how to use gdb on the vmlinux image to determine in which
function git3 barfed, i would be able to give more info/feedback.
I looked in the Documentation folder in the kernel source and at the examples
given there, but i really don't understand how/what.


Danny
-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-08 21:56           ` Danny ter Haar
@ 2009-01-09  0:46             ` Dave Chinner
  2009-01-09  1:26               ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09  0:46 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Thu, Jan 08, 2009 at 10:56:02PM +0100, Danny ter Haar wrote:
> 
> I needed the parallel port driver so i compiled 2.6.28-git3 with debug info.
> It barfed: http://www.dth.net/kernel/c3/netconsole_2.6.28-git3-d.txt

Looking at this, I think there are two possibilities in terms of the
problem being detected. We are modifying the inode BMBT here,
so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
trigger has occurred because a xfs_btree_increment() call has
returned a zero status. This means we failed here:

1324         /* Fail if we just went off the right edge of the tree. */
1325         xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
1326         if (xfs_btree_ptr_is_null(cur, &ptr))
1327                 goto out0;

or here:

1351         /*
1352          * If we went off the root then we are either seriously
1353          * confused or have the tree root in an inode.
1354          */
1355         if (lev == cur->bc_nlevels) {
1356                 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
1357                         goto out0;
1358                 ASSERT(0);

i.e. we either fell off the right edge of the tree or went over the top
of it.

I can't really see how we've done either of those things unless the
tree has been corrupted by a prior operation.
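
The first of the two exits above ("fell off the right edge") can be
illustrated with a toy model. This is not the kernel code and makes strong
simplifying assumptions: only one level of non-empty blocks is modelled, the
cursor steps straight into the right sibling (the real xfs_btree_increment
walks up and down through parent levels), and the second exit (going over the
top with the root in the inode) is not modelled at all. The names Block and
increment are hypothetical.

```python
class Block:
    """A toy leaf block: some records plus a link to the right sibling."""

    def __init__(self, records, right=None):
        self.records = records   # records held in this block (non-empty)
        self.right = right       # right sibling Block, or None at the right edge


def increment(block, index):
    """Advance a (block, index) cursor by one record.

    Returns (stat, block, index). stat is 1 on success and 0 when the
    cursor was already on the last record of the rightmost block.
    """
    if index + 1 < len(block.records):
        return 1, block, index + 1        # next record in the same block
    if block.right is None:
        return 0, block, index            # right sibling is null: the "out0" case
    return 1, block.right, 0              # first record of the right sibling


right = Block(["c", "d"])
left = Block(["a", "b"], right)
print(increment(left, 1)[0])    # steps into the right sibling
print(increment(right, 1)[0])   # nothing further to the right
```

In the trace discussed here, the caller expected a status of 1 from the
increment, so a status of 0 was reported as corruption.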

Given that each time it is aptitude that is causing the problem, can you
prevent aptitude from running automatically on boot and run it manually?
If you can reproduce the problem manually then we can move on to the
next step....

> So (in my case) something while going from git2 -> git3 didn't go positive.

That would have been when Linus did the XFS pull...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09  0:46             ` Dave Chinner
@ 2009-01-09  1:26               ` Danny ter Haar
  2009-01-09  2:08                 ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09  1:26 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> Looking at this, I think there are two possibilities in terms of the
> problem being detected. We are modifying the inode BMBT here,
> so that means we have XFS_BTREE_ROOT_IN_INODE set. The corruption
> trigger has occurred because a xfs_btree_increment() call has
> returned a zero status. This means we failed here:
> 
> 1324         /* Fail if we just went off the right edge of the tree. */
> 1325         xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
> 1326         if (xfs_btree_ptr_is_null(cur, &ptr))
> 1327                 goto out0;
> 
> or here:
> 
> 1351         /*
> 1352          * If we went off the root then we are either seriously
> 1353          * confused or have the tree root in an inode.
> 1354          */
> 1355         if (lev == cur->bc_nlevels) {
> 1356                 if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
> 1357                         goto out0;
> 1358                 ASSERT(0);
> 
> i.e. we either fell off the right edge of the tree or went over the top
> of it.

> I can't really see how we've done either of those things unless the
> tree has been corrupted by a prior operation.
sounds logical.

The first time it happened i moved the primary hd to the secondary ide connector, connected
a separate hard drive as the new master, installed a fresh debian lenny on that
hard drive, and ran xfs_repair on all xfs filesystems: no errors.

> Given that each time it is aptitude that is causing the problem, can you
> prevent aptitude from running automatically on boot and run it manually?
> If you can reproduce the problem manually then we can move on to the
> next step....

I wasn't clear (obviously).
Besides being my NAS, this machine is also the apt-cacher-ng server for all my other
machines here at home. The easiest way to trigger the error is often by running
a simple "aptitude update; aptitude -d dist-upgrade" 
So when it barfed i did the aptitude by hand.
And it checks everything from the cache at /var/cache/apt-cacher-ng
which is on sda6 (root filesystem on XFS)

So it doesn't "barf" right on boot, it takes a few minutes or even hours:

filer1:~# last -20 reboot
reboot   system boot  2.6.28-git2-d    Thu Jan  8 12:00 - current(05:18)    
reboot   system boot  2.6.28-git3-d    Thu Jan  8 11:31 - 11:59  (00:27)    
reboot   system boot  2.6.28-git3-d    Thu Jan  8 10:56 - 11:59  (01:02)    
reboot   system boot  2.6.28-git3-d    Thu Jan  8 10:44 - 10:54  (00:10)    
reboot   system boot  2.6.28-git3-d    Thu Jan  8 10:30 - 10:43  (00:12)    
reboot   system boot  2.6.28-git2      Wed Jan  7 15:08 - 10:28  (19:19)    
reboot   system boot  2.6.28-git9-d    Wed Jan  7 12:29 - 14:58  (02:29)    
reboot   system boot  2.6.28-git2      Wed Jan  7 10:08 - 12:27  (02:19)    
reboot   system boot  2.6.28-git9      Wed Jan  7 09:21 - 10:06  (00:45)    
reboot   system boot  2.6.28-git9      Wed Jan  7 08:42 - 10:06  (01:24)    
reboot   system boot  2.6.28-git2      Tue Jan  6 21:45 - 08:40  (10:55)    
reboot   system boot  2.6.28-git4      Tue Jan  6 21:27 - 08:40  (11:13)    
reboot   system boot  2.6.28-git4      Tue Jan  6 21:22 - 08:40  (11:18)

Sometimes the kernel barfs while accessing /dev/sdb1 or /dev/sdc1,
which are only accessed using samba.

I can once more install the "other" debian lenny harddrive, boot from there
and then manually do an xfs_repair on the xfs filesystems.
I can then boot a kernel that is known to barf and try to get it to barf.

> > So (in my case) something while going from git2 -> git3 didn't go positive.
> That would have been when Linus did the XFS pull...

Do you want me to figure out what patch from git2->git3 is the culprit ?
I'll have to compile/reboot for a while.

Tell me what else i can do to resolve this.

Danny
-- 


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09  1:26               ` Danny ter Haar
@ 2009-01-09  2:08                 ` Dave Chinner
  2009-01-09  6:10                   ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-09  2:08 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 02:26:10AM +0100, Danny ter Haar wrote:
> Do you want me to figure out what patch from git2->git3 is the culprit ?
> I'll have to compile/reboot for a while.

If you can do that, it would be *greatly* appreciated.

> Tell me what else i can do to resolve this.

If you can isolate the commit that introduced the bug, then
code review is probably the step after that....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09  2:08                 ` Dave Chinner
@ 2009-01-09  6:10                   ` Danny ter Haar
  2009-01-09 19:44                     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09  6:10 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> > Do you want me to figure out what patch from git2->git3 is the culprit ?
> > I'll have to compile/reboot for a while.
> If you can do that, it would be *greatly* appreciated.

Need a little bit of help here. 
(i'm a hardware guy with just the bare essential software knowhow)
I found 
http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
(used that one before)
but it doesn't seem to know the "-gitXX" kernel trees.

Q: 
how do i get a list of available kernel versions (including -gitXX branches)

My intention is to say something like:

# git bisect bad v2.6.28-git3
# git bisect good v2.6.28-git2

and take it from there.
I've been googling now for nearly 2 hours.
Any help appreciated

Danny
-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09  6:10                   ` Danny ter Haar
@ 2009-01-09 19:44                     ` Christoph Hellwig
  2009-01-09 19:51                       ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:44 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> # git bisect bad v2.6.28-git3
> # git bisect good v2.6.28-git2
> 
> and take it from there.

Yes.  Then it gives you a revision to test and you can use

	git bisect next

to try the next one, or

	git bisect skip

if the current revision cannot be tested for some other reason (doesn't
boot / compile).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:44                     ` Christoph Hellwig
@ 2009-01-09 19:51                       ` Danny ter Haar
  2009-01-09 19:58                         ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 19:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> Yes.  Then it gives you a revision to test and you can use

I'm still not making myself clear (sorry for that ;-) ).
I'm not able to "access" the 2.6.28-gitXX repository for some reason.
In other words:

How do I "include" the 2.6.28-git repository in git?

Danny

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:51                       ` Danny ter Haar
@ 2009-01-09 19:58                         ` Christoph Hellwig
  2009-01-09 21:42                           ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 19:58 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 08:51:44PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > On Fri, Jan 09, 2009 at 07:10:44AM +0100, Danny ter Haar wrote:
> > Yes.  Then it gives you a revision to test and you can use
> 
> I'm still not making myself clear (sorry for that ;-) ).
> I'm not able to "access" the 2.6.28-gitXX repository for some reason.
> In other words:
>
> How do I "include" the 2.6.28-git repository in git?

Oh, sorry.

There is no specific 2.6.28-git repository, just a main linux 2.6 one:

	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

The git bisect command makes sure you get the right revisions
checked out to bisect.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 19:58                         ` Christoph Hellwig
@ 2009-01-09 21:42                           ` Danny ter Haar
  2009-01-09 22:01                             ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 21:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> There is no specific 2.6.28-git repository, just a main linux 2.6 one:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> The git bisect command makes sure you get the right revisions
> checked out to bisect.

Still not making myself clear, I guess...

lenny32:/usr/src/linux-2.6# git tag|grep 2.6.28
v2.6.28
v2.6.28-rc1
v2.6.28-rc2
v2.6.28-rc3
v2.6.28-rc4
v2.6.28-rc5
v2.6.28-rc6
v2.6.28-rc7
v2.6.28-rc8
v2.6.28-rc9

So I cannot use any "standard" git command to also include
Linus's git tree?! That seems odd.
He committed them to git somehow, didn't he?


I could of course compile/patch 2 kernel trees by hand;
in fact I already have:
lenny32:/usr/src# ls -l
total 16
drwxr-sr-x  2 root src   146 2009-01-09 08:29 archive
drwxr-sr-x  2 root src  4096 2009-01-08 11:33 configs
drwxr-sr-x 23 root src  4096 2009-01-08 19:56 linux-2.6
drwxr-xr-x 24 root root 4096 2009-01-08 11:53 linux-2.6.28-git2-d
drwxr-xr-x 24 root root 4096 2009-01-08 10:21 linux-2.6.28-git3-d

But I'm not familiar with how to instruct git to find the differences
between the 2 source trees, or how to use git to find out what
the culprit is.
Any other suggestions?

-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 21:42                           ` Danny ter Haar
@ 2009-01-09 22:01                             ` Christoph Hellwig
  2009-01-09 22:23                               ` Danny ter Haar
  2009-01-13 20:04                               ` Danny ter Haar
  0 siblings, 2 replies; 27+ messages in thread
From: Christoph Hellwig @ 2009-01-09 22:01 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 09, 2009 at 10:42:06PM +0100, Danny ter Haar wrote:
> Quoting Christoph Hellwig (hch@infradead.org):
> > There is no specific 2.6.28-git repository, just a main linux 2.6 one:
> > 	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> > The git bisect command makes sure you get the right revisions
> > checked out to bisect.
> 
> Still not making myself clear, I guess...

Or me being stupid, sorry :)  The -gitN thingies aren't
official tags, but unofficial snapshots done by Dave Woodhouse.

He has a mapping at
http://www.kernel.org/pub/linux/kernel/people/dwmw2/snapshot-tags/ from
these names to actual commit IDs.  So 2.6.28-git2 would be

	3c92ec8ae91ecf59d88c798301833d7cf83f2179

and 2.6.28-git3

	6a94cb73064c952255336cc57731904174b2c58f

so use

git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
git bisect bad 6a94cb73064c952255336cc57731904174b2c58f


Sorry for the confusion..
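
Put together, and assuming a clone of the linux-2.6.git tree, the session could be started as sketched below. The `snapshot_commit` helper is hypothetical and only knows the two IDs quoted above. `git bisect start` is included because git normally expects a bisect session to be opened first. The script only prints the commands, so they can be reviewed before being run inside the clone.

```shell
#!/bin/sh
# Hypothetical helper: map a snapshot name to the commit ID given in this mail.
snapshot_commit() {
    case "$1" in
        2.6.28-git2) echo 3c92ec8ae91ecf59d88c798301833d7cf83f2179 ;;
        2.6.28-git3) echo 6a94cb73064c952255336cc57731904174b2c58f ;;
        *) echo "unknown snapshot: $1" >&2; return 1 ;;
    esac
}

good=$(snapshot_commit 2.6.28-git2)   # last snapshot known to work
bad=$(snapshot_commit 2.6.28-git3)    # first snapshot known to fail

# Commands to run inside the kernel clone (printed, not executed here).
echo "git bisect start"
echo "git bisect good $good"
echo "git bisect bad $bad"
```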

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 22:01                             ` Christoph Hellwig
@ 2009-01-09 22:23                               ` Danny ter Haar
  2009-01-13 20:04                               ` Danny ter Haar
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-09 22:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Christoph Hellwig (hch@infradead.org):
> git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
> git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
> Sorry for the confusion..

*THANKS*

Bisecting: 955 revisions left to test after this

;-)

Danny

-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-09 22:01                             ` Christoph Hellwig
  2009-01-09 22:23                               ` Danny ter Haar
@ 2009-01-13 20:04                               ` Danny ter Haar
  2009-01-16 20:43                                 ` Danny ter Haar
  1 sibling, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-13 20:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

I'm stuck trying to bisect the problem.
I restarted from scratch and got the same result.

Here is what I did:
lenny32:/usr/src# git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git
Initialized empty Git repository in /usr/src/linux-git/.git/
remote: Counting objects: 1057520, done.
remote: Compressing objects: 100% (173057/173057), done.
Resolving deltas: 100% (881801/881801), done.
Checking out files: 100% (26544/26544), done.


# git bisect good 3c92ec8ae91ecf59d88c798301833d7cf83f2179
# git bisect bad 6a94cb73064c952255336cc57731904174b2c58f
Bisecting: 955 revisions left to test after this
[5ed1836814d908f45cafde0e79cb85314ab9d41d] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

The EXTRAVERSION in the Makefile at this point was "plain" 2.6.28 (which seems odd to me).

I changed it to "-a1" and compiled the kernel.
Installed/rebooted.
After 16 hours (overnight) without trouble, it seemed stable to me:
reboot   system boot  2.6.28-a1        Fri Jan  9 16:54 - 09:29  (16:34)

# git bisect good
Bisecting: 477 revisions left to test after this
M       scripts/package/Makefile
D       scripts/package/builddeb
[7b2cd079ec8dcc65cdca6621245cfa5e30a8ef9f] V4L/DVB (10007): gspca - m5602: Refactor the error handling in the s5k83a

Version "-a2" was branded "ok" by me after 3 hours of heavy use:
reboot   system boot  2.6.28-a2        Sat Jan 10 09:31 - 12:37  (03:06) 

# git bisect good
Bisecting: 262 revisions left to test after this
[6094c85a935f7eadb4c607c6dc6d86c0a9f09a4b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Version "-a3" also got my blessing:
reboot   system boot  2.6.28-a3        Sat Jan 10 12:39 - 17:55  (05:15)    

# git bisect good
Bisecting: 131 revisions left to test after this
[a1941895034cda2bffa23ba845607c82138ccf52] [XFS] remove dead code for old inode item recovery

This time the Makefile EXTRAVERSION changed to "-RC6".
I was/am under the impression that I'm testing between 2.6.28-git2 and 2.6.28-git3,
so why is the Makefile going this far back? Dazed & confused.
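
A likely explanation (not confirmed anywhere in this thread): between the two snapshots, branches were merged that had been developed on top of older -rc kernels, and bisect also walks the commits of those branches. Their Makefile still carries the version of the point they were forked from. A toy repository with made-up file names illustrates the effect:

```shell
#!/bin/sh
# Scratch repo: a topic branch forked at "rc2" gets merged after the release.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q . 2>/dev/null
git config user.email "bisect-demo@example.invalid"
git config user.name "Bisect Demo"

echo "2.6.28-rc2" > VERSION
git add VERSION
git commit -q -m "mainline at rc2"
main_branch=$(git symbolic-ref --short HEAD)

# Topic work starts from rc2 and never touches VERSION.
git checkout -q -b topic
echo "btree cleanup" > xfs.c
git add xfs.c
git commit -q -m "topic work based on rc2"
topic_commit=$(git rev-parse HEAD)

# Mainline moves on to the release: this plays the role of the "good" snapshot.
git checkout -q "$main_branch"
echo "2.6.28" > VERSION
git commit -q -a -m "mainline release"
good=$(git rev-parse HEAD)

# Merging the topic branch plays the role of the "bad" snapshot.
git merge -q --no-edit topic
bad=$(git rev-parse HEAD)

# Commits that bisect has to consider between good and bad.
candidates=$(git rev-list "$good..$bad")
# The topic commit is one of them, yet its tree still says rc2.
version_in_topic=$(git show "$topic_commit:VERSION")
echo "candidate $topic_commit has VERSION $version_in_topic"
```

Under that assumption, an older EXTRAVERSION during the bisect does not mean that git has left the good..bad range.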

"-a4" barfed

lenny32:/usr/src/linux-git# git bisect bad
Bisecting: 65 revisions left to test after this
[6441e549157b749bae003cce70b4c8b62e4801fa] [XFS] factor xfs_iget_core() into hit and miss cases
EXTRAVERSION went back to "-RC2"?!

"-a5" totally froze the machine (didn't catch anything on the netconsole).

So I interpreted that as a "fail".

# git bisect bad
Bisecting: 32 revisions left to test after this
[fd6bcc5b63051392ba709a8fd33173b263669e0a] [XFS] kill xfs_bmbt_log_block and xfs_bmbt_log_recs

Now "-a6" doesn't want to compile:

# make-kpkg kernel-image --initrd
exec debian/rules  DEBIAN_REVISION=2.6.28-a6-10.00.Custom  INITRD=YES  kernel-image 
====== making target debian/stamp/build/kernel [new prereqs: conf.vars]======
This is kernel package version 11.015.
test ! -f scripts/package/builddeb.kpkg-dist || mv -f scripts/package/builddeb.kpkg-dist scripts/package/builddeb
test ! -f scripts/package/Makefile.kpkg-dist || mv -f scripts/package/Makefile.kpkg-dist scripts/package/Makefile
/usr/bin/make -j2   ARCH=i386 \
                             bzImage
make[1]: Entering directory `/usr/src/linux-git'
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  CALL    scripts/checksyscalls.sh
  CHK     include/linux/compile.h
  CC      fs/xfs/xfs_alloc_btree.o
fs/xfs/xfs_alloc_btree.c:38:29: error: xfs_btree_trace.h: No such file or directory
fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_alloc_block’:
fs/xfs/xfs_alloc_btree.c:84: error: implicit declaration of function ‘XFS_BTREE_TRACE_CURSOR’
fs/xfs/xfs_alloc_btree.c:84: error: ‘XBT_ENTRY’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:84: error: (Each undeclared identifier is reported only once
fs/xfs/xfs_alloc_btree.c:84: error: for each function it appears in.)
fs/xfs/xfs_alloc_btree.c:90: error: ‘XBT_ERROR’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:95: error: ‘XBT_EXIT’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c: In function ‘xfs_allocbt_kill_root’:
fs/xfs/xfs_alloc_btree.c:291: error: ‘XBT_ENTRY’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:301: error: ‘XBT_ERROR’ undeclared (first use in this function)
fs/xfs/xfs_alloc_btree.c:310: error: ‘XBT_EXIT’ undeclared (first use in this function)
make[3]: *** [fs/xfs/xfs_alloc_btree.o] Error 1
make[2]: *** [fs/xfs] Error 2
make[1]: *** [fs] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/usr/src/linux-git'
make: *** [debian/stamp/build/kernel] Error 2


In the meantime I compiled/ran 2.6.29-rc1-git3, but it (as expected) barfed.

All netconsole logs are in my directory: http://www.dth.net/kernel/c3/

[F1] ;-)

-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
       [not found] <20090107165218.GA11132@dth.net>
  2009-01-07 18:02 ` problems showing up as XFS problems on kernels after 2.6.28-git2 Christoph Hellwig
@ 2009-01-14 19:44 ` Tino Keitel
  1 sibling, 0 replies; 27+ messages in thread
From: Tino Keitel @ 2009-01-14 19:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: xfs

[added CC:]

On Wed, Jan 07, 2009 at 17:52:18 +0100, dth wrote:
> Am running low-power motherboard (via epia5000 c3) as a storage (samba/nfs)
> server. (uses about 20 watts when both storage drives are in powersave mode)
> 
> OS = debian lenny
> 
> Primary drive (60gig 2.5" pata disk using libata = sda)
>  512MB ext3 partition as /boot
>  swap, rest of the drive is XFS root file system.
> 
> storage:
> 2 x 750GB drives on a sata controller (FastTrak S150 TX2plus)
> sdb & sdc which have 1 big XFS partition.
> 
> Any kernel i try to boot after 2.6.28-git2 (i tried git4-9) 
> sooner or later gives me an XFS error:

[...]

> Anyone else bitten by the same problem?

Hi,

I get the same on a Mac mini Core Duo using 2.6.29-rc1. I just booted,
started firefox and my /home died.

XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file linux-2.6/fs/xfs/xfs_btree.c.  Caller 0xc022e9c6
Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11
Call Trace:
 [<c022e6d9>] xfs_btree_delrec+0xbd9/0xea0
 [<c022e9c6>] xfs_btree_delete+0x26/0x90
 [<c022b355>] xfs_btree_lookup_get_block+0xa5/0xe0
 [<c022933a>] xfs_bmbt_init_key_from_rec+0xa/0x20
 [<c022c6ba>] xfs_btree_lookup+0x21a/0x460
 [<c022e9c6>] xfs_btree_delete+0x26/0x90
 [<c02255ff>] xfs_bmap_del_extent+0x64f/0xb70
 [<c02262ea>] xfs_bunmapi+0x65a/0xb30
 [<c0244f6b>] xfs_itruncate_finish+0x1ab/0x3e0
 [<c026115d>] xfs_inactive+0x3cd/0x4f0
 [<c019113c>] clear_inode+0x5c/0x100
 [<c01917fe>] generic_delete_inode+0xce/0xe0
 [<c0190b24>] iput+0x44/0x50
 [<c01898a0>] do_unlinkat+0xe0/0x160
 [<c0188180>] lock_rename+0x0/0xa0
 [<c01033c5>] sysenter_do_call+0x12/0x25
Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1164 of file linux-2.6/fs/xfs/xfs_trans.c.  Caller 0xc0261173

Pid: 4937, comm: firefox-bin Not tainted 2.6.29-rc1-00224-ga652504 #11
Call Trace:
 [<c025a54b>] xfs_trans_cancel+0xcb/0xf0
 [<c0261173>] xfs_inactive+0x3e3/0x4f0
 [<c0261173>] xfs_inactive+0x3e3/0x4f0
 [<c019113c>] clear_inode+0x5c/0x100
 [<c01917fe>] generic_delete_inode+0xce/0xe0
 [<c0190b24>] iput+0x44/0x50
 [<c01898a0>] do_unlinkat+0xe0/0x160
 [<c0188180>] lock_rename+0x0/0xa0
 [<c01033c5>] sysenter_do_call+0x12/0x25
xfs_force_shutdown(dm-1,0x8) called from line 1165 of file linux-2.6/fs/xfs/xfs_trans.c.  Return address = 0xc025a563
Filesystem "dm-1": Corruption of in-memory data detected.  Shutting down filesystem: dm-1
Please umount the filesystem, and rectify the problem(s)

Regards,
Tino

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-13 20:04                               ` Danny ter Haar
@ 2009-01-16 20:43                                 ` Danny ter Haar
  2009-01-17  7:38                                   ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Danny ter Haar @ 2009-01-16 20:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

Quoting Danny ter Haar (dth@dth.net):
> I'm stuck trying to bisect the problem.
> I restarted from scratch and got the same result.

[SNIP for bravery] 

> In the meantime I compiled/ran 2.6.29-rc1-git3, but it (as expected) barfed.
>
> All netconsole logs are in my directory: http://www.dth.net/kernel/c3/
> 
> [F1] ;-)

This was my way of saying "HELP".

I'm stuck...

Does anyone have some tips/hints/suggestions (like finding another hobby ;-) )?

Danny
-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-16 20:43                                 ` Danny ter Haar
@ 2009-01-17  7:38                                   ` Dave Chinner
  2009-01-17 23:25                                     ` Danny ter Haar
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2009-01-17  7:38 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Fri, Jan 16, 2009 at 09:43:46PM +0100, Danny ter Haar wrote:
> Quoting Danny ter Haar (dth@dth.net):
> > I'm stuck trying to bisect the problem.
> > I restarted from scratch and got the same result.
> 
> [SNIP for bravery] 
> 
> > In the meantime I compiled/ran 2.6.29-rc1-git3, but it (as expected) barfed.
> >
> > All netconsole logs are in my directory: http://www.dth.net/kernel/c3/
> > 
> > [F1] ;-)
> 
> This was my way of saying "HELP".

Sorry for not getting back to you sooner. I think that Alexander
tripped over this same problem during his bisect.  If you follow the
thread from here:

http://oss.sgi.com/archives/xfs/2009-01/msg00496.html

you'll see that Alexander had the same problem and managed
to continue the bisect once he copied the xfs_btree_trace.h
header file from top-of-tree back into the broken commits.
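
The workaround described above could look roughly like the sketch below, shown on a scratch repository with made-up file names. In the kernel tree the equivalent would presumably be `git show origin/master:fs/xfs/xfs_btree_trace.h > fs/xfs/xfs_btree_trace.h`, where `origin/master` is an assumption about how the clone names the top of tree.

```shell
#!/bin/sh
# Scratch repo: one commit lacks a header that a later commit adds.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q . 2>/dev/null
git config user.email "bisect-demo@example.invalid"
git config user.name "Bisect Demo"

# Commit 1: a source file that includes a header which is not in the tree yet.
printf '#include "trace.h"\n' > btree.c
git add btree.c
git commit -q -m "use trace.h (header missing, does not build)"
broken=$(git rev-parse HEAD)

# Commit 2: the header finally shows up.
printf '#define TRACE_CURSOR(c) do { } while (0)\n' > trace.h
git add trace.h
git commit -q -m "add trace.h"
tip=$(git rev-parse HEAD)

# Bisect would check out the broken commit like this (detached HEAD).
git checkout -q "$broken"
test ! -e trace.h

# Borrow the header from the newer commit without moving HEAD.
git show "$tip:trace.h" > trace.h
restored=$(cat trace.h)

# ... build, boot and test here, then mark the revision good or bad ...

# Remove the borrowed, untracked file so the next checkout starts clean.
rm -f trace.h
status_after=$(git status --porcelain)
```

Removing the borrowed file before the next step matters: git refuses to check out a revision that tracks a file when an untracked copy with different content is in the way.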

I hope this helps (and I hope that the bisect lands on the
same commit that it did for Alexander).

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-17  7:38                                   ` Dave Chinner
@ 2009-01-17 23:25                                     ` Danny ter Haar
  2009-01-18  2:50                                       ` Danny ter Haar
  2009-01-19  3:17                                       ` Dave Chinner
  0 siblings, 2 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-17 23:25 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

Quoting Dave Chinner (david@fromorbit.com):
> Sorry for not getting back to you sooner.

No problem. I initially posted to LKML and got redirected by Christoph to this
list. I was so stupid that I didn't check the other messages on this list.
Sorry.

> I think that Alexander tripped over this same problem during his bisect.
> If you follow the thread from here:
> http://oss.sgi.com/archives/xfs/2009-01/msg00496.html

Yep! [cheer] I'm not alone! :-)
But why only the two of us? There must be thousands of users out there using
XFS. Why did it bite us? Large filesystem together with slow hardware?

> You'll see that Alexander had the same problem and managed
> to continue the bisect once he copied the xfs_btree_trace.h
> header file from top-of-tree back into the broken commits.

Great.

> I hope this helps (and I hope that the bisect lands on the
> same commit that it did for Alexander).

Do you still want me to try it?
I think you have already figured out where the culprit is?!

I saw changes in the announcement of 2.6.29-rc2 and took the plunge:

# procinfo
Memory:        Total        Used        Free     Buffers                       
RAM:          506940      447868       59072          84                       
Swap:         497972           0      497972                                   

Bootup: Sat Jan 17 10:28:27 2009   Load average: 0.03 0.11 0.09 2/104 5259     

user  :   00:09:30.12   3.2%  page in :           417582                       
nice  :   00:00:00.00   0.0%  page out:          1220260                       
system:   00:02:01.76   0.7%  page act:            41134                       
IOwait:   00:03:34.28   1.2%  page dea:            13444                       
hw irq:   00:00:01.94   0.0%  page flt:          1531395                       
sw irq:   00:00:04.50   0.0%  swap in :                0                       
idle  :   04:39:41.90  94.8%  swap out:                0                       
uptime:   04:54:55.09         context :           892623                       

irq   0:     799012  timer               irq  10:     101465  eth0             
irq   1:          8  i8042               irq  11:      46893  sata_promise     
irq   2:          0  cascade             irq  12:          0  uhci_hcd:usb1, uh
irq   5:          0  acpi                irq  14:      50059  pata_via         
irq   7:          1  parport0            irq  15:          0  pata_via         

sda            15445r           22597w   sdb             2820r           23372w
sda1             555r             284w   sdb1            2750r           23372w
sda2               2r               0w   sdc               63r               3w
sda5             136r               0w   sdc1              50r               3w
sda6           14659r           22313w                                         

lo          TX 59.65KiB      RX 59.65KiB      eth0        TX 13.40MiB      RX 23.78MiB     

Over 4 hours of uptime and moderate usage, so I'm not 100% convinced, but this one
looks good (so far).

Let me know if I should pursue this some more.

Thanks for all the help.

Danny
-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-17 23:25                                     ` Danny ter Haar
@ 2009-01-18  2:50                                       ` Danny ter Haar
  2009-01-19  3:17                                       ` Dave Chinner
  1 sibling, 0 replies; 27+ messages in thread
From: Danny ter Haar @ 2009-01-18  2:50 UTC (permalink / raw)
  To: Christoph Hellwig, xfs

> I saw changes in the announcement of 2.6.29-rc2 and took the plunge:
> over 4 hours of uptime and moderate usage, so I'm not 100% convinced, but this one
> looks good (so far)

reboot   system boot  2.6.29-rc2       Sat Jan 17 10:28 - crash  (08:20)

OK, not stable (for me).

Danny
-- 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: problems showing up as XFS problems on kernels after 2.6.28-git2
  2009-01-17 23:25                                     ` Danny ter Haar
  2009-01-18  2:50                                       ` Danny ter Haar
@ 2009-01-19  3:17                                       ` Dave Chinner
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2009-01-19  3:17 UTC (permalink / raw)
  To: Danny ter Haar; +Cc: Christoph Hellwig, xfs

On Sun, Jan 18, 2009 at 12:25:11AM +0100, Danny ter Haar wrote:
> Quoting Dave Chinner (david@fromorbit.com):
> > Sorry for not getting back to you sooner.
> 
> No problem. I initially posted to LKML and got redirected by Christoph to this
> list. I was so stupid that I didn't check the other messages on this list.
> Sorry.
> 
> > I think that Alexander tripped over this same problem during his bisect.
> > If you follow the thread from here:
> > http://oss.sgi.com/archives/xfs/2009-01/msg00496.html
> 
> Yep! [cheer] I'm not alone! :-)
> But why only the two of us? There must be thousands of users out there using
> XFS. Why did it bite us? Large filesystem together with slow hardware?

No idea - I can't reproduce it either so there's some state
that your filesystem is getting into that trips over it.

> > You'll see that Alexander had the same problem and managed
> > to continue the bisect once he copied the xfs_btree_trace.h
> > header file from top-of-tree back into the broken commits.
> 
> Great.
> 
> > I hope this helps (and I hope that the bisect lands on the
> > same commit that it did for Alexander).
> 
> Do you still want me to try it?
> I think you have already figured out where the culprit is?!

Yes, I think we have, but it wasn't totally conclusive. Can you
continue your bisect to see if it narrows down to the same commit
on your machine?

I'm still trying to reproduce it but I haven't worked out what the
initial state is. One thing that might be useful is to put a printk
into the kernel on the failure path that prints the inode number
out (e.g. at the goto that the WANT_CORRUPTED_GOTO jumps to). Then
we can use xfs_db to find the file that is causing the problem and
then use xfs_db or xfs_bmap to look at the extent tree prior to
the corruption. That might help me set up the initial state needed
to trip the problem.....
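
Once a printk has reported an inode number, the follow-up inspection might be sketched as below. The inode number, device and mount point are placeholders, not values from this report, and the script only prints the commands. `xfs_db -r` opens the device read-only; its output is most trustworthy when the filesystem is unmounted or otherwise idle.

```shell
#!/bin/sh
# Placeholders only: substitute the inode number from the printk and the
# device and mount point of the affected filesystem.
ino=12345
dev=/dev/sdX1
mnt=/mnt/storage

# 1. Find the path that belongs to the inode (run while mounted).
find_cmd="find $mnt -xdev -inum $ino"
# 2. Show the file's extent map through the mounted filesystem.
bmap_cmd="xfs_bmap -v <path printed by step 1>"
# 3. Or look at the inode and its block map directly on the device.
db_cmd="xfs_db -r -c 'inode $ino' -c 'print' -c 'bmap' $dev"

# Print the commands for review instead of executing them.
printf '%s\n' "$find_cmd" "$bmap_cmd" "$db_cmd"
```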

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread
