From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC742C433E0 for ; Sat, 27 Feb 2021 01:30:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 68E5C64EF0 for ; Sat, 27 Feb 2021 01:30:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230063AbhB0Bah (ORCPT ); Fri, 26 Feb 2021 20:30:37 -0500 Received: from outgoing-auth-1.mit.edu ([18.9.28.11]:57861 "EHLO outgoing.mit.edu" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S229999AbhB0Bag (ORCPT ); Fri, 26 Feb 2021 20:30:36 -0500 Received: from cwcc.thunk.org (pool-72-74-133-215.bstnma.fios.verizon.net [72.74.133.215]) (authenticated bits=0) (User authenticated as tytso@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 11R1Tj2H003419 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 26 Feb 2021 20:29:45 -0500 Received: by cwcc.thunk.org (Postfix, from userid 15806) id 20FCB15C39D2; Fri, 26 Feb 2021 20:29:45 -0500 (EST) Date: Fri, 26 Feb 2021 20:29:45 -0500 From: "Theodore Ts'o" To: Amy Parker Cc: bugzilla-daemon@bugzilla.kernel.org, Ext4 Developers List Subject: Re: [Bug 211971] New: Incorrect fix by e2fsck for blocks_count corruption Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Fri, Feb 26, 2021 at 04:58:23PM -0800, Amy Parker wrote: > Can you replicate this on modern 5.4 from kernel.org? -generic kernels > are from Canonical and are sometimes broken compared to upstream. If > you can't replicate this on mainline, you'll need to contact > Canonical. We can't do anything if the problem only persists on > distribution kernels. This has nothing to do with the kernel. What the user is complaining about is that e2fsck trusts the blocks count field in the superblock as to be a source of truth. If that field is artificially changed to be a smaller value, e2fsck will assume the file system size indicated by that changed size. That's an intentional design choice of e2fsck. Given that with modern ext4 file systems, we have metadata checksums, if the superblock has been accidentally corrupted, the checksum will fail, and then e2fsck will try using the backup superblock instead. For older file systems that don't have metadata checksums enabled, we could check to see if certain "fundamental constants" in the primary superblock is different from the secondary superblock, but... > > debugfs -w image > > debugfs: ssv blocks_count 4000 > > debugfs: q This will update the blocks_count in the primary and all secondary backups. So that's not going to really help the user. Effectively, the complaint is "I pointed the gun at my foot, and pulled the triggered, and now my foot hurts!" > > # Expected that e2fsck would fix the blocks count corruption instead of > > changing other fields (e.g.,free blocks_count) The problem is that e2fsck can't really determine that the blocks count field has been corrupted. We could warn the user if the blocks_count is smaller than the reported size of the device, but.... that's actually something that can happen in real life, and it's not necessarily a file system "corruption", but rather an intentional choice by the system administrator. If we were to give a warning, or worse, assume that blocks count should be adjusted to be the size of the deivce, we'd be getting complaints from users who deliberately chose to set the file system size to be something smaller than the block device. So this is a case of e2fsck is working as intended. Cheers, - Ted