From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 149D1C47254 for ; Tue, 5 May 2020 11:58:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E1A8B206A5 for ; Tue, 5 May 2020 11:58:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HGPkD9sk" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728180AbgEEL6R (ORCPT ); Tue, 5 May 2020 07:58:17 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:58643 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727090AbgEEL6R (ORCPT ); Tue, 5 May 2020 07:58:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1588679895; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=7lyvXJ5DuoenPOxI+FBP/XOYDK6e/RtCSUuPqVfZI80=; b=HGPkD9skbtf+F3Oe57d0uQGcgfrq2hggDRBPrCYyNwnGj0fWMebD7aU/dnDGX2ed67OtJg ereWtiZGw0UDGu3wlLWcSfhDPI+jnk+64k1KfMgetICy6U7AVBw+PMyxLwo7ZZYENygrUd Iq5lBsEnlhVLRJoKph3RtSeuzYWe+Ko= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-169-vzLK3MBxMnGA-O7voMpEWg-1; Tue, 05 May 2020 07:58:13 -0400 X-MC-Unique: vzLK3MBxMnGA-O7voMpEWg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 00D1D107ACCA; Tue, 5 May 2020 11:58:13 +0000 (UTC) Received: from bfoster (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTPS id AB09260E1C; Tue, 5 May 2020 11:58:12 +0000 (UTC) Date: Tue, 5 May 2020 07:58:10 -0400 From: Brian Foster To: Dave Chinner Cc: linux-xfs@vger.kernel.org Subject: Re: [PATCH v4 00/17] xfs: flush related error handling cleanups Message-ID: <20200505115810.GA60048@bfoster> References: <20200504141154.55887-1-bfoster@redhat.com> <20200504215307.GL2040@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200504215307.GL2040@dread.disaster.area> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org On Tue, May 05, 2020 at 07:53:07AM +1000, Dave Chinner wrote: > On Mon, May 04, 2020 at 10:11:37AM -0400, Brian Foster wrote: > > Hi all, > > > > I think everything has been reviewed to this point. Only minor changes > > noted below in this release. A git repo is available here[1]. > > > > The only outstanding feedback that I'm aware of is Dave's comment on > > patch 7 of v3 [2] regarding the shutdown assert check. I'm not aware of > > any means to get through xfs_wait_buftarg() with a dirty buffer that > > hasn't undergone the permanant error sequence and thus shut down the fs. > > # echo 0 > /sys/fs/xfs//fail_at_unmount > > And now any error with a "retry forever" config (the default) will > be collected by xfs_buftarg_wait() without a preceeding shutdown as > xfs_buf_iodone_callback_error() will not treat it as a permanent > error during unmount. i.e. this doesn't trigger: > > /* At unmount we may treat errors differently */ > if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount) > goto permanent_error; > > and so the error handling just marks it with a write error and lets > it go for a write retry in future. These are then collected in > xfs_buftarg_wait() as nothing is going to retry them once unmount > gets to this point... > That doesn't accurately describe the behavior of that configuration, though. "Retry forever" means that dirty buffers are going to cycle through submission retries and the unmount is going to hang indefinitely (on pushing the AIL). Indeed, preventing this unmount hang is the original purpose of the fail at unmount knob (commit here[1]). IOW, we don't get to xfs_wait_buftarg() in that scenario until all dirty buffers are either written back successfully or the error configuration changes to process the failures as permanent errors and shuts down the fs. This can be confirmed easily with the buffer I/O error injection patch (use a value of 1 to simulate persistent errors). Brian [1] e6b3bb78962e6 ("xfs: add "fail at unmount" error handling configuration") > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com >