From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: util-linux-owner@vger.kernel.org
Received: from brisi.sourcepole.ch ([5.9.57.43]:60562 "EHLO brisi.sourcepole.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758143Ab3CFOJ1 (ORCPT ); Wed, 6 Mar 2013 09:09:27 -0500
Received: from brisi-br ([192.168.138.1] helo=sourcepole.ch) by vilan-mail with esmtpsa (TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from ) id 1UDER4-0007yz-9o for util-linux@vger.kernel.org; Wed, 06 Mar 2013 14:31:13 +0100
To:
Subject: making users aware of losetup rw setups?
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Date: Wed, 06 Mar 2013 14:31:03 +0100
From: Tomas Pospisek
Message-ID:
Sender: util-linux-owner@vger.kernel.org
List-ID:

Hello,

The loopback block device is extremely handy and nice to use. However,
there are many allusions on the internets [1] [2] that using a loopback
block device on a disk image in a rw manner is unreliable, i.e. the
loopback device *might eat your data* (!).

This is a kernel/driver issue for sure, but I would like it to be
documented in the losetup man page so that users are made aware of the
fact and don't lose their data.

We seem to have been bitten by this problem recently. In our case we
have an LVM partition that contains a disk image. We make an LVM
snapshot of that LVM partition in order to back it up.
Since the LVM partition contains a disk image with its own partition, we
mount the respective disk partition with an offset like this:

    mount -o offset=1048576 /dev/vg_disks/server_image /mnt/snapshot

I think it is while mounting that we see this:

    [65162.184173] Buffer I/O error on device loop0, logical block 3932144
    [65162.184297] Buffer I/O error on device loop0, logical block 3932144
    [65162.184431] Buffer I/O error on device loop0, logical block 3932158
    [65162.184555] Buffer I/O error on device loop0, logical block 3932158
    [65162.210044] Buffer I/O error on device loop0, logical block 3932144
    [65162.210281] Buffer I/O error on device loop0, logical block 3932158
    [65162.210323] Buffer I/O error on device loop0, logical block 3932159
    [65162.210373] Buffer I/O error on device loop0, logical block 3932159
    [65162.210422] Buffer I/O error on device loop0, logical block 3932159
    [65162.210588] Buffer I/O error on device loop0, logical block 3932159
    [65162.299517] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 81
    [65162.299531] EXT4-fs (loop0): 1 orphan inode deleted
    [65162.299550] EXT4-fs (loop0): recovery complete
    [65162.367689] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
    [65255.139406] quiet_error: 35 callbacks suppressed
    [65255.139447] Buffer I/O error on device loop0, logical block 7864304
    [65255.139509] Buffer I/O error on device loop0, logical block 7864304
    [65255.139578] Buffer I/O error on device loop0, logical block 7864318
    [65255.139635] Buffer I/O error on device loop0, logical block 7864318
    [65255.141001] Buffer I/O error on device loop0, logical block 7864304
    [65255.152388] Buffer I/O error on device loop0, logical block 7864318
    [65255.153897] Buffer I/O error on device loop0, logical block 7864319
    [65255.155149] Buffer I/O error on device loop0, logical block 7864319
    [65255.156447] Buffer I/O error on device loop0, logical block 7864319
    [65255.157690] Buffer I/O error on device loop0, logical block 7864319
    [65255.270004] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 92
    [65255.270019] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 91
    [65255.270023] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 90
    [65255.270026] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 89
    [65255.270030] EXT4-fs (loop0): ext4_orphan_cleanup: deleting unreferenced inode 86
    [65255.270033] EXT4-fs (loop0): 5 orphan inodes deleted
    [65255.270504] EXT4-fs (loop0): recovery complete
    [65255.395442] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)

In short - somehow the FS gets errors from the underlying loop device.
However, the layers below the loop device are *not* reporting any
errors, so one would think it is the loop device itself that is breaking
here.

The backup runs every night and we have been seeing this occasionally -
say once a week. A few times it hung the whole machine.

I get the impression that it has been known for at least 5 years that
the loopback block device is broken wrt writing back to the underlying
layer in edge cases, yet there is no clear, up-front documentation of
that fact.
Thus, until (if) this is fixed in the kernel, I would suggest putting a
big warning into the losetup man page.

?

Thanks,
*t

[1] http://www.drbd.org/users-guide/ch-configure.html
[2] https://bugs.launchpad.net/wubi/+bug/204133
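
PS: For readers wondering where the offset in the mount command comes from: the value 1048576 is consistent with a partition that starts at sector 2048 of an image with 512-byte sectors. Both of those numbers are assumptions for illustration; the message above only gives the final byte offset. A minimal shell sketch of the arithmetic:

```shell
# Assumed values, not taken from the message: a partition starting at
# sector 2048 with 512-byte sectors. Their product happens to match the
# offset used in the mount command.
start_sector=2048
sector_size=512

# Byte offset of the partition inside the disk image.
offset=$((start_sector * sector_size))

# Build (but do not run) the mount command with the computed offset.
cmd="mount -o offset=${offset} /dev/vg_disks/server_image /mnt/snapshot"
echo "$cmd"
```

The start sector of the real image would have to be read from its partition table rather than assumed as it is here.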