From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78FC6C77B75 for ; Wed, 17 May 2023 16:38:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229793AbjEQQiA (ORCPT ); Wed, 17 May 2023 12:38:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229791AbjEQQh7 (ORCPT ); Wed, 17 May 2023 12:37:59 -0400 Received: from verein.lst.de (verein.lst.de [213.95.11.211]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E5D6493C8 for ; Wed, 17 May 2023 09:37:52 -0700 (PDT) Received: by verein.lst.de (Postfix, from userid 2407) id 8865F68B05; Wed, 17 May 2023 18:37:48 +0200 (CEST) Date: Wed, 17 May 2023 18:37:48 +0200 From: Christoph Hellwig To: Filipe Manana Cc: Christoph Hellwig , Chris Mason , Josef Bacik , David Sterba , linux-btrfs@vger.kernel.org Subject: Re: [PATCH 1/6] btrfs: use a linked list for tracking per-transaction/log dirty buffers Message-ID: <20230517163748.GA5784@lst.de> References: <20230515192256.29006-1-hch@lst.de> <20230515192256.29006-2-hch@lst.de> <20230517122150.GA17334@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.17 (2007-11-01) Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Wed, May 17, 2023 at 05:24:59PM +0100, Filipe Manana wrote: > Probably what you ran into is the case where a tree block is allocated > and freed in the same transaction before being written, in that case > it gets added > back to the free space cache (this happens at btrfs_free_tree_block()) > and can be allocated again in the same transaction. > If so, in that case you can remove it from the list and you shouldn't > run into that issue anymore. > We currently don't remove it from the io tree in that case, because > that may need splitting an existing extent state record and therefore > requiring allocating memory. Yes. I did in fact run into exactly that when I resurrected the old idea of the embedded list_head and add a lot of instrumentation. With the list removal in btrfs_clear_buffer_dirty it works fine. > > At least in theory yes. But we also save a whole lot of lookups > > by going directly to the object instead of indirecting through the > > pages xarray and then again through the buffers array for the > > sub-block case. > > I think you are talking about further changes in the patchset, not > about this particular patch? Yes. This one is the enabler. > I'm talking about the data structure used to represent a range, not pages. > The io tree is good to merge adjacent ranges and reduce the total > structure size and used memory Yes. Assuming you actually want to track ranges. But that's now what we ultimatively want to track here, though. In the end everything is about the actual buffers. Patches 3 and 6 are the other key ones that completely removed the need for any ranges. I've tried to keep them separate as Dave tends to prefer split out patches, but maybe I should just fold them back to make it more obvious what is going on here. > fs_mark (as you did already), dbench, fio for random writes and periodic fsync, > anything that is not too short, runtimes of several minutes at least. Ok. If you have any particular workloads you care about and have scripts for them, feel free to send them my way.