Subject: Re: [PATCH 5/5] btrfs: restart snapshot delete if we have to end the transaction
To: linux-btrfs
From: Holger Hoffstätte
Organization: Applied Asynchrony, Inc.
Message-ID: <9058dbb4-8889-854e-007c-33356bf9f55d@applied-asynchrony.com>
Date: Fri, 20 Nov 2020 09:48:38 +0100
X-Mailing-List: linux-btrfs@vger.kernel.org

Trying again out-of-thread, but I haven't seen any answers to this yet..

On 2020-03-20 19:34, Josef Bacik wrote:
> This is to fully fix the deadlock described in
>
> btrfs: do not resolve backrefs for roots that are being deleted
>
> Holding write locks on our deleted snapshot across trans handles will
> just lead to sadness, and our backref lookup code is going to want to
> still process dropped snapshots for things like qgroup accounting.
>
> Fix this by simply dropping our path before we restart our transaction,
> and picking back up from our drop_progress key. This is less efficient
> obviously, but it also doesn't deadlock, so it feels like a reasonable
> trade off.
>
> Signed-off-by: Josef Bacik
> ---
>  fs/btrfs/extent-tree.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 2925b3ad77a1..bfb413747283 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5257,6 +5257,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  	 * already dropped.
>  	 */
>  	set_bit(BTRFS_ROOT_DELETING, &root->state);
> +again:
>  	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>  		level = btrfs_header_level(root->node);
>  		path->nodes[level] = btrfs_lock_root_node(root);
> @@ -5269,7 +5270,9 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  		btrfs_disk_key_to_cpu(&key, &root_item->drop_progress);
>  		memcpy(&wc->update_progress, &key,
>  		       sizeof(wc->update_progress));
> +		memcpy(&wc->drop_progress, &key, sizeof(key));
>
> +		wc->drop_level = root_item->drop_level;
>  		level = root_item->drop_level;
>  		BUG_ON(level == 0);
>  		path->lowest_level = level;
> @@ -5362,6 +5365,18 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  				goto out_end_trans;
>  			}
>
> +			/*
> +			 * We used to keep the path open until we completed the
> +			 * snapshot delete. However this can deadlock with
> +			 * things like backref walking that may want to resolve
> +			 * references that still point to this deleted root. We
> +			 * already have the ability to restart snapshot
> +			 * deletions on mount, so just clear our walk_control,
> +			 * drop the path, and go to the beginning and re-lookup
> +			 * our drop_progress key and continue from there.
> +			 */
> +			memset(wc, 0, sizeof(*wc));
> +			btrfs_release_path(path);
>  			btrfs_end_transaction_throttle(trans);
>  			if (!for_reloc && btrfs_need_cleaner_sleep(fs_info)) {
>  				btrfs_debug(fs_info,
> @@ -5377,6 +5392,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  			}
>  			if (block_rsv)
>  				trans->block_rsv = block_rsv;
> +			goto again;
>  		}
>  	}
>  	btrfs_release_path(path);
>

Josef,

the above fix still seems to be missing, apparently since Dave couldn't
merge it properly at the time (see [1]). Is this still needed? There were
several long discussions about balance loops and it would be great to get
this fixed once and for all.

It applies and (seems to?) work fine in 5.9 (at least it hasn't eaten
anything here so far) but if it's not needed anymore then all the better.

thanks
Holger

[1] https://lore.kernel.org/linux-btrfs/20200320193927.GH12659@twin.jikos.cz/
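
For anyone skimming the thread, the restart pattern the quoted patch describes (drop everything held in memory when the transaction ends, then re-lookup from the saved drop_progress key) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct walk_state, struct drop_job, lookup_from_progress() and drop_all() are hypothetical names, not btrfs APIs, and a sorted array of positive keys stands in for the tree being walked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are btrfs APIs. */
struct walk_state {
	size_t cursor;      /* in-memory position, loosely analogous to the held path */
	int    have_cursor; /* nonzero while the position is held */
};

struct drop_job {
	const int *keys;         /* strictly increasing, positive keys to drop */
	size_t     nr_keys;
	int        progress_key; /* saved resume point; 0 means "start from the top" */
	size_t     dropped;      /* number of keys processed so far */
	size_t     restarts;     /* number of times the progress key was re-looked-up */
};

/* Find the first key strictly greater than the saved progress key. */
static size_t lookup_from_progress(const struct drop_job *job)
{
	size_t i = 0;

	while (i < job->nr_keys && job->keys[i] <= job->progress_key)
		i++;
	return i;
}

/*
 * Process every key, but hold the in-memory cursor for at most
 * budget_per_trans keys at a time. When the budget is used up, forget
 * the cursor entirely and resume from the saved progress key.
 */
static void drop_all(struct drop_job *job, size_t budget_per_trans)
{
	struct walk_state ws;
	size_t used;

again:
	memset(&ws, 0, sizeof(ws)); /* start from a clean walk state */
	ws.cursor = lookup_from_progress(job);
	ws.have_cursor = 1;
	used = 0;

	while (ws.cursor < job->nr_keys) {
		/* Record progress before moving on, so a restart can resume here. */
		job->progress_key = job->keys[ws.cursor];
		job->dropped++;
		ws.cursor++;

		if (++used == budget_per_trans && ws.cursor < job->nr_keys) {
			ws.have_cursor = 0; /* "release the path" before the restart */
			job->restarts++;
			goto again;
		}
	}
}
```

With keys {10, 20, 30, 40, 50} and a budget of 2, drop_all() processes all five keys and re-looks-up the progress key twice. The repeated lookup is the "less efficient" part of the trade-off mentioned in the commit message: nothing is held across a restart, at the price of searching for the resume point again each time.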