* xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18
@ 2009-06-18 6:56 Wolfram Schlich
From: Wolfram Schlich @ 2009-06-18 6:56 UTC (permalink / raw)
To: linux-xfs
Hi!
I'm currently using LVM snapshots to create full system backups
of a bunch of Xen-based virtual machines (so-called domUs).
Those domUs all run Xen kernel 2.6.18 from the Xen 3.2.0 release
(32bit domU on 32bit dom0, I can post the .config if needed).
All domUs are using XFS on their LVM logical volumes.
The backup of all mounted snapshot volumes is made using
rsnapshot/rsync. This has been running smoothly for some
weeks now on 5 domUs.
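For reference, the per-volume flow described above looks roughly like the following sketch (VG/LV names, sizes, and paths are hypothetical; the explicit xfs_freeze calls are only needed if lvcreate does not freeze the filesystem itself, which depends on the kernel and LVM versions in use):

```shell
# Freeze the filesystem so the snapshot is consistent (on setups where
# lvcreate does not do this automatically); names/paths are made up.
xfs_freeze -f /var
lvcreate --snapshot --size 1G --name var-snap /dev/vg0/var
xfs_freeze -u /var

# Mount the snapshot read-only; nouuid avoids the duplicate-UUID check,
# norecovery skips log replay on the read-only snapshot.
mount -t xfs -o ro,nouuid,norecovery /dev/vg0/var-snap /mnt/snap

# Back it up, then tear the snapshot down again.
rsync -a /mnt/snap/ /backup/var/
umount /mnt/snap
lvremove -f /dev/vg0/var-snap
```

Note that the --size given to lvcreate bounds how much the origin volume may change while the backup runs; if the copy-on-write area fills up, the snapshot becomes unusable.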
Yesterday this happened during the backup on 1 domU:
--8<--
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x604d68 ("xfs_trans_read_buf") error 5 buf count 4096
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x66c5a0 ("xfs_trans_read_buf") error 5 buf count 4096
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x202f70 ("xfs_trans_read_buf") error 5 buf count 4096
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x2701f8 ("xfs_trans_read_buf") error 5 buf count 4096
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x6a78 ("xfs_trans_read_buf") error 5 buf count 4096
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600500 ("xfs_trans_read_buf") error 5 buf count 8192
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600520 ("xfs_trans_read_buf") error 5 buf count 8192
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600520 ("xfs_trans_read_buf") error 5 buf count 8192
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192
kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x4055d0 ("xfs_trans_read_buf") error 5 buf count 8192
[...many more of such messages...]
kernel: xfs_force_shutdown(dm-21,0x1) called from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xc02b1cbb
kernel: Filesystem "dm-21": I/O Error Detected. Shutting down filesystem: dm-21
kernel: Please umount the filesystem, and rectify the problem(s)
kernel: xfs_force_shutdown(dm-21,0x1) called from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xc02b1cbb
--8<--
The rsync process was then terminated by SIGBUS (exit code 135 = 128 + 7).
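(That status is the usual shell convention for a death by signal: 128 plus the signal number, and SIGBUS is signal 7 on Linux/x86. The arithmetic can be reproduced like this:)

```shell
# A shell reports a child killed by signal N as exit status 128 + N;
# SIGBUS is signal 7 on Linux/x86, so the status is 135.
sh -c 'kill -s BUS $$'
echo $?    # prints 135
```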
The device dm-21 was the snapshot of the /var filesystem and
was mounted using nouuid,norecovery.
Is it possible that the LVM snapshot operation (which should use
xfs_freeze/xfs_unfreeze internally) created an inconsistent/damaged
snapshot, and that norecovery then kept it from being repaired?
Any other ideas?
--
Regards,
Wolfram Schlich <wschlich@gentoo.org>
Gentoo Linux * http://dev.gentoo.org/~wschlich/
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18
From: Eric Sandeen @ 2009-06-18 13:57 UTC (permalink / raw)
To: Wolfram Schlich; +Cc: linux-xfs
Wolfram Schlich wrote:
> Hi!
>
> I'm currently using LVM snapshots to create full system backups
> of a bunch of Xen-based virtual machines (so-called domUs).
> Those domUs all run Xen kernel 2.6.18 from the Xen 3.2.0 release
> (32bit domU on 32bit dom0, I can post the .config if needed).
> All domUs are using XFS on their LVM logical volumes.
> The backup of all mounted snapshot volumes is made using
> rsnapshot/rsync. This has been running smoothly for some
> weeks now on 5 domUs.
>
> Yesterday this happened during the backup on 1 domU:
> --8<--
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x604d68 ("xfs_trans_read_buf") error 5 buf count 4096
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x66c5a0 ("xfs_trans_read_buf") error 5 buf count 4096
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x202f70 ("xfs_trans_read_buf") error 5 buf count 4096
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x2701f8 ("xfs_trans_read_buf") error 5 buf count 4096
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x6a78 ("xfs_trans_read_buf") error 5 buf count 4096
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600500 ("xfs_trans_read_buf") error 5 buf count 8192
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600520 ("xfs_trans_read_buf") error 5 buf count 8192
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x600520 ("xfs_trans_read_buf") error 5 buf count 8192
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x4055d0 ("xfs_trans_read_buf") error 5 buf count 8192
> [...many more of such messages...]
Well these are all I/O errors happening -to- xfs, so xfs is unlikely to
be at fault here. Any block layer messages before that?
> kernel: xfs_force_shutdown(dm-21,0x1) called from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xc02b1cbb
> kernel: Filesystem "dm-21": I/O Error Detected. Shutting down filesystem: dm-21
> kernel: Please umount the filesystem, and rectify the problem(s)
> kernel: xfs_force_shutdown(dm-21,0x1) called from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xc02b1cbb
> --8<--
> The rsync process was then terminated by SIGBUS (exit code 135 = 128 + 7).
>
> The device dm-21 was the snapshot of the /var filesystem and
> was mounted using nouuid,norecovery.
>
> Is it possible that the LVM snapshot operation (which should use
> xfs_freeze/xfs_unfreeze internally) created an inconsistent/damaged
> snapshot, and that norecovery then kept it from being repaired?
> Any other ideas?
If it was a proper snapshot, norecovery shouldn't matter, as the fs
should already be clean (well, hopefully; 2.6.18 was a long time ago,
but this is true today, anyway).
I suppose it's possible that the snapshot was not consistent, and you're
hitting problems there, but things like:
> kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192

look like failures to read perfectly normal blocks, not out of bounds
or anything, so I'd most likely point to problems outside xfs.
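A quick way to check this theory would be to read one of the failing blocks straight from the device, bypassing XFS entirely; the block numbers in these messages are 512-byte basic blocks (the dd invocation below is an illustration, not from the original thread):

```shell
# Try to read the failing metadata block directly from the snapshot
# device. An I/O error here confirms the problem is in the block layer
# or device-mapper, not in XFS. XFS daddr values are 512-byte units,
# so skip is given in 512-byte sectors.
dd if=/dev/dm-21 of=/dev/null bs=512 skip=$((0xdd0)) count=16
```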
-Eric
* Re: xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18
From: Wolfram Schlich @ 2009-06-18 15:03 UTC (permalink / raw)
To: xfs
* Eric Sandeen <sandeen@sandeen.net> [2009-06-18 16:09]:
> Wolfram Schlich wrote:
> > Hi!
> >
> > I'm currently using LVM snapshots to create full system backups
> > of a bunch of Xen-based virtual machines (so-called domUs).
> > Those domUs all run Xen kernel 2.6.18 from the Xen 3.2.0 release
> > (32bit domU on 32bit dom0, I can post the .config if needed).
> > All domUs are using XFS on their LVM logical volumes.
> > The backup of all mounted snapshot volumes is made using
> > rsnapshot/rsync. This has been running smoothly for some
> > weeks now on 5 domUs.
> >
> > Yesterday this happened during the backup on 1 domU:
> > --8<--
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x604d68 ("xfs_trans_read_buf") error 5 buf count 4096
> [...]
> > [...many more of such messages...]
>
> Well these are all I/O errors happening -to- xfs, so xfs is unlikely to
> be at fault here. Any block layer messages before that?
Unfortunately not a single one :(
> > Is it possible that the LVM snapshot operation (which should use
> > xfs_freeze/xfs_unfreeze internally) created an inconsistent/damaged
> > snapshot, and that norecovery then kept it from being repaired?
> > Any other ideas?
>
> If it was a proper snapshot, norecovery shouldn't matter, as the fs
> should already be clean (well, hopefully; 2.6.18 was a long time ago,
> but this is true today, anyway).
Ok.
> I suppose it's possible that the snapshot was not consistent, and you're
> hitting problems there, but things like:
>
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192
>
> look like failures to read perfectly normal blocks, not out of bounds
> or anything, so I'd most likely point to problems outside xfs.
I've now traced it back to LVM. It seems that the LVM snapshot
volume we were backing up at the time ran out of space and was
therefore automatically removed (so the block device the XFS
was on vanished).
Stupid LVM does not log ANYTHING when it just drops a snapshot
that runs out of space :( I've now activated dmeventd, which
*does* log such events *sigh*
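With dmeventd in place, snapshot usage can also be watched and monitored explicitly; a rough sketch (VG/LV names are hypothetical, and the lvm.conf fragment shows the relevant activation-section knob):

```shell
# Watch how full the copy-on-write area of each snapshot is
# (snap_percent is the fill level of the CoW store):
lvs -o lv_name,origin,snap_percent vg0

# Ensure a specific snapshot is monitored by dmeventd, so an
# overfilled (and therefore invalidated) snapshot at least gets logged:
lvchange --monitor y vg0/var-snap

# lvm.conf fragment to have new LVs monitored automatically:
#   activation {
#       monitoring = 1
#   }
```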
Thanks!
--
Regards,
Wolfram Schlich <wschlich@gentoo.org>
Gentoo Linux * http://dev.gentoo.org/~wschlich/