From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6655CC7618A for ; Mon, 20 Mar 2023 05:46:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229735AbjCTFqL (ORCPT ); Mon, 20 Mar 2023 01:46:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229700AbjCTFqI (ORCPT ); Mon, 20 Mar 2023 01:46:08 -0400 Received: from verein.lst.de (verein.lst.de [213.95.11.211]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66C6918AB4 for ; Sun, 19 Mar 2023 22:46:07 -0700 (PDT) Received: by verein.lst.de (Postfix, from userid 2407) id F23C668AFE; Mon, 20 Mar 2023 06:46:01 +0100 (CET) Date: Mon, 20 Mar 2023 06:46:00 +0100 From: Christoph Hellwig To: David Sterba Cc: Johannes Thumshirn , Christoph Hellwig , Chris Mason , Josef Bacik , David Sterba , "linux-btrfs@vger.kernel.org" Subject: Re: [PATCH 04/20] btrfs: always read the entire extent_buffer Message-ID: <20230320054600.GA18708@lst.de> References: <20230309090526.332550-1-hch@lst.de> <20230309090526.332550-5-hch@lst.de> <20230317231609.GE10580@twin.jikos.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230317231609.GE10580@twin.jikos.cz> User-Agent: Mutt/1.5.17 (2007-11-01) Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Sat, Mar 18, 2023 at 12:16:09AM +0100, David Sterba wrote: > I remember one bug that was solved by splitting the first loop into two, > one locking all pages first and then checking the uptodate status, the > comment is explaining that. > > 2571e739677f ("Btrfs: fix memory leak in reading btree blocks") > > As you can see in the changelog it's a bit convoluted race, the number > of loops should not matter if they're there for correctness reasons. I > haven't checked the final code but I doubt it's equivalent and may > introduce subtle bugs. The patch we're replying to would have actually fixed that above bug on it's own, and thus have also fixed the above issue. The problem it solves was that num_pages could be smaller than the actual number of pages in the extent_buffer. With this change, we always read all pages and thus can't have a wrong ->io_pages. In fact a later patch removes ->io_pages entirely. Even better at the end of the series the locking between different reads moves to the extent_buffer level, so the above race is fundamentally fixed at a higher level and there won't be an extra read at all.