From: Chris J Arges
Subject: regression in ext4_ind_remove_space
Date: Fri, 30 Jan 2015 12:56:57 -0600
Message-ID: <54CBD3F9.3030301@canonical.com>
To: Lukas Czerner, Theodore Ts'o, Jan Kara
Cc: linux-fsdevel@vger.kernel.org

Hi,

Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 with
CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using
fallocate with the FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE flags.

This seems to be a regression in ext4_ind_remove_space introduced in
4f579ae7, whereas commit 77ea2a4b passes the test case below.

To reproduce this issue:

1) Set up an ext4 ^extents filesystem, or an ext3 filesystem with
   CONFIG_EXT4_USE_FOR_EXT23=y
2) Create and install a VM using a qcow2 image, and store the image file
   on that filesystem
3) Snapshot the image with qemu-img
4) Boot the image and do some disk operations (fio, etc.)
5) Shut down the VM and delete the snapshot
6) Repeat steps 3-5 until the VM no longer boots due to image corruption.
   This generally takes a few iterations, depending on the disk
   operations.

In addition, I've tested this with a single vCPU and a single host CPU,
and the problem persists. Running the same test on ext4 with extents
exhibits no failures.

Any ideas for bug hunting here? Commit 4f579ae7 completely rewrites
ext4_ind_remove_space. Reverting that commit on master fixes the issue,
but I'm unsure whether that un-fixes other things.

I'm happy to continue debugging or run any tests as necessary.

Thanks,
--chris j arges
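
For reference, the fallocate call named in the report can be sketched in
isolation as below. This is only an illustration of the
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE operation with a read-back
check; it is not the qcow2 reproducer described above and is not known to
trigger the corruption. The function name punch_and_verify, the file
size, the hole offsets, and the 0xAB fill pattern are all hypothetical
choices made for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes fallocate() in <fcntl.h> */
#endif
#include <fcntl.h>
#include <linux/falloc.h>      /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sizes chosen for illustration only. */
enum {
    FILE_SIZE = 1 << 20,       /* 1 MiB test file */
    HOLE_OFF  = 256 * 1024,    /* hole starts at 256 KiB */
    HOLE_LEN  = 512 * 1024     /* hole is 512 KiB long */
};

/*
 * Write a pattern, punch a hole in the middle, then read the file back.
 * Returns 0 if the contents and size are as expected, 1 if a mismatch is
 * found, and -1 on any I/O error (including a filesystem that does not
 * support hole punching).
 */
int punch_and_verify(const char *path)
{
    static unsigned char buf[FILE_SIZE];
    struct stat st;
    int result = -1;
    long i;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Fill the whole file with a non-zero pattern and flush it. */
    memset(buf, 0xAB, sizeof buf);
    if (pwrite(fd, buf, sizeof buf, 0) != (ssize_t)sizeof buf)
        goto out;
    if (fsync(fd) != 0)
        goto out;

    /* Deallocate the middle of the file without changing its size. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  HOLE_OFF, HOLE_LEN) != 0)
        goto out;

    /* KEEP_SIZE means the file length must be unchanged. */
    if (fstat(fd, &st) != 0)
        goto out;
    if (st.st_size != FILE_SIZE) {
        result = 1;
        goto out;
    }

    /* The hole must read back as zeros; everything else must be intact. */
    if (pread(fd, buf, sizeof buf, 0) != (ssize_t)sizeof buf)
        goto out;
    for (i = 0; i < FILE_SIZE; i++) {
        int in_hole = (i >= HOLE_OFF && i < HOLE_OFF + HOLE_LEN);
        unsigned char want = in_hole ? 0x00 : 0xAB;
        if (buf[i] != want) {
            result = 1;
            goto out;
        }
    }
    result = 0;

out:
    close(fd);
    unlink(path);
    return result;
}
```

Calling punch_and_verify() with a path on the filesystem under test and
checking for a return value of 1 gives a quick sanity check of the punch
path. A real reproducer would likely need many punches at varying offsets
across indirect-block boundaries, as the qcow2 snapshot workload does.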