From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 8 Oct 2020 11:24:09 +0200
From: Johannes Hirte
To: Qu Wenruo
Cc: linux-btrfs@vger.kernel.org, Jens Axboe
Subject: Re: failed to read block groups: Operation not permitted
Message-ID: <20201008092409.GB387879@latitude>
References: <20201006090918.GA269054@latitude> <9cd7f2d0-4256-7311-483e-b1169e4c3655@gmx.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <9cd7f2d0-4256-7311-483e-b1169e4c3655@gmx.com>
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

On 2020 Oct 06, Qu Wenruo wrote:
>
>
> On 2020/10/6 5:09 PM, Johannes Hirte wrote:
> > I recently encountered filesystem damage on a VM. During normal
> > operation, the filesystem was suddenly remounted ro. Dmesg showed me
> > some errors about parent transid verify failed. I forced off the VM
> > and tried to mount the image on the host, but it failed with:
> >
> > [ 340.702391] BTRFS info (device loop0p1): disk space caching is enabled
> > [ 340.702393] BTRFS info (device loop0p1): has skinny extents
> > [ 341.815890] BTRFS error (device loop0p1): parent transid verify failed on 152064327680 wanted 323984 found 323888
> > [ 341.831183] BTRFS error (device loop0p1): parent transid verify failed on 152064327680 wanted 323984 found 323888
>
> Your extent tree is corrupted. Metadata CoW is broken.
>
> I don't believe only the extent tree got corrupted; other parts of your fs
> can also be corrupted.
>
> > [ 341.831194] BTRFS error (device loop0p1): failed to read block groups: -5
> > [ 341.851954] BTRFS error (device loop0p1): open_ctree failed
> >
> > A btrfs check resulted in:
> >
> > btrfs check /dev/loop0p1
> > Opening filesystem to check...
> > parent transid verify failed on 152064327680 wanted 323984 found 323888
> > parent transid verify failed on 152064327680 wanted 323984 found 323888
> > parent transid verify failed on 152064327680 wanted 323984 found 323888
> > Ignoring transid failure
> > leaf parent key incorrect 152064327680
> > ERROR: failed to read block groups: Operation not permitted
> > ERROR: cannot open file system
> >
> > The host is running libvirt with kvm, btrfs with RAID1. The VMs are raw
> > images, with btrfs too. I've switched this VM from io=native to
> > io=io_uring, and suspect that this caused the damage. All machines are
> > running kernel 5.8.13.
>
> I'm not sure about the io_uring setup. IIRC as long as you're not using
> cache=unsafe, it should be safe.
>
> Does io_uring ignore the flush?

Putting someone with more knowledge into cc.

For another VM, I've found several errors in the log of the host
machine:

BTRFS warning (device sda1): direct IO failed ino 5988432 rw 1,2131969 sector 0x123ab840 len 32768 err no 10

The VM was remounted ro too, like the first one. But in this case the
filesystem was ok after a check. For the first VM with the heavily
damaged filesystem there aren't any log entries.

--
Regards,
  Johannes Hirte
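
For anyone comparing logs: the "parent transid verify failed" lines report the generation the parent node expects ("wanted") and the generation actually found in the block on disk ("found"). The following is only a rough sketch for pulling those numbers out of dmesg or btrfs check output and computing how far the on-disk block lags behind. The function name parse_transid_failures and the "lag" field are made up for illustration; they are not part of btrfs-progs or any kernel interface.

```python
import re

# Pattern for the message format quoted in this thread. The field names
# (bytenr, wanted, found) are labels chosen for this sketch.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_failures(log_text):
    """Return one dict per distinct (bytenr, wanted, found) triple.

    Hypothetical helper: repeated identical messages are collapsed,
    and "lag" is simply wanted - found.
    """
    seen = set()
    results = []
    for match in TRANSID_RE.finditer(log_text):
        key = (
            int(match.group("bytenr")),
            int(match.group("wanted")),
            int(match.group("found")),
        )
        if key in seen:
            continue  # same block reported again, skip the duplicate
        seen.add(key)
        bytenr, wanted, found = key
        results.append(
            {"bytenr": bytenr, "wanted": wanted, "found": found, "lag": wanted - found}
        )
    return results


# The two kernel messages from the report above.
SAMPLE = """\
[ 341.815890] BTRFS error (device loop0p1): parent transid verify failed on 152064327680 wanted 323984 found 323888
[ 341.831183] BTRFS error (device loop0p1): parent transid verify failed on 152064327680 wanted 323984 found 323888
"""

if __name__ == "__main__":
    for failure in parse_transid_failures(SAMPLE):
        print(failure)
```

With the messages from this report, the single affected block at 152064327680 is 96 generations behind what its parent expects. That would be consistent with a metadata write that never reached the disk, though the number alone does not show where the write was lost.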