Date: Thu, 20 Feb 2020 10:50:22 -0500
From: "Theodore Y. Ts'o"
To: Jean-Louis Dupond
Cc: linux-ext4@vger.kernel.org
Subject: Re: Filesystem corruption after unreachable storage
Message-ID: <20200220155022.GA532518@mit.edu>
References: <20200124203725.GH147870@mit.edu> <3a7bc899-31d9-51f2-1ea9-b3bef2a98913@dupond.be>
In-Reply-To: <3a7bc899-31d9-51f2-1ea9-b3bef2a98913@dupond.be>

On Thu, Feb 20, 2020 at 10:08:44AM +0100, Jean-Louis Dupond wrote:
> dumpe2fs -> see attachment

Looking at the dumpe2fs output, it's interesting that it was "clean
with errors", without any error information being logged in the
superblock. What version of the kernel are you using? I'm guessing
it's a fairly old one?

> Fsck:
> # e2fsck -fy /dev/mapper/vg01-root
> e2fsck 1.44.5 (15-Dec-2018)

And that's an old version of e2fsck as well. Is this some kind of
stable/enterprise linux distro?

> Pass 1: Checking inodes, blocks, and sizes
> Inodes that were part of a corrupted orphan linked list found.  Fix? yes
>
> Inode 165708 was part of the orphaned inode list.  FIXED.

OK, this and the rest looks like it's relating to a file truncation or
deletion at the time of the disconnection.

> > > On KVM for example there is a unlimited timeout (afaik) until the
> > > storage is
> > > back, and the VM just continues running after storage recovery.
> > Well, you can adjust the SCSI timeout, if you want to give that a try....
> It has some other disadvantages? Or is it quite safe to increment the SCSI
> timeout?

It should be pretty safe.

Can you reliably reproduce the problem by disconnecting the machine
from the SAN?

- Ted
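
For reference, the SCSI timeout discussed above is exposed per disk in
sysfs as /sys/block/<device>/device/timeout, in seconds. The following is
only a minimal sketch of reading and raising it. The helper names, the
device name "sda" and the value 180 are illustrative assumptions and do
not come from this thread. Writing the real file needs root, and the
change does not survive a reboot.

```python
from pathlib import Path


def scsi_timeout_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file holding the command timeout (seconds) of a SCSI disk.

    `device` is a block device name such as "sda" (an assumed example).
    `sysfs_root` is a hypothetical parameter so the helpers can be tried
    against a scratch directory instead of the live /sys tree.
    """
    return Path(sysfs_root) / "block" / device / "device" / "timeout"


def get_scsi_timeout(device: str, sysfs_root: str = "/sys") -> int:
    """Read the current timeout in seconds."""
    return int(scsi_timeout_path(device, sysfs_root).read_text().strip())


def set_scsi_timeout(device: str, seconds: int, sysfs_root: str = "/sys") -> int:
    """Write a new timeout in seconds and return the previous value."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    path = scsi_timeout_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{seconds}\n")
    return previous


# Example (as root, on a machine that has a SCSI disk named sda):
#   old = set_scsi_timeout("sda", 180)
#   print(f"timeout raised from {old}s to {get_scsi_timeout('sda')}s")
```

Returning the previous value makes it easy to restore the old setting
after a disconnection test against the SAN.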