Date: Thu, 1 Jun 2023 06:40:34 +0200
From: Christoph Hellwig
To: Qu Wenruo
Cc: Christoph Hellwig, Johannes Thumshirn, Naohiro Aota,
	"linux-btrfs@vger.kernel.org"
Subject: Re: new scrub code vs zoned file systems
Message-ID: <20230601044034.GA21827@lst.de>
References: <20230531125224.GB27468@lst.de>
	<546fad79-f436-c561-8b9b-0d9a7db09522@wdc.com>
	<20230531132032.GA30016@lst.de>
	<821003e3-b457-90ba-e733-8c2fdd0c3b3c@wdc.com>
	<20230531133038.GA30855@lst.de>
	<20230531141739.GA2160@lst.de>
	<134e56ed-1139-a71c-54d7-b4cbc27834a9@gmx.com>
In-Reply-To: <134e56ed-1139-a71c-54d7-b4cbc27834a9@gmx.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

On Thu, Jun 01, 2023 at 10:09:24AM +0800, Qu Wenruo wrote:
> So far the various wrapper around the write operations work as expected,
> and hide the detailed well enough that most of us didn't even notice.
>
> E.g. all the zoned code is already handled in scrub_write_sectors().
>
> The crash itself is caused by the fact that end io part is relying on
> the inode pointer, that itself is a simple fix.

But the reason why it is relying on the inode pointer is that it needs
to record the actual written LBA after I/O completion.  So it's not just
a case of adding a NULL check; it needs a way to adjust the
logical-to-physical mapping from the dummy one added before the I/O.

> But I'm more concerned about why we have a full zone before that crash.

I think this is happening because we can't account for the zone filling
without the proper context.

>> b) don't create a new relocation thread per zone, but run it from
>> the scrub context.
>>
>
> That's a little too complex, the problem is that relocation is a
> completely different beast, too different from the scrub code.
>
> But I agree the repair part for zoned needs some rework, it's not
> working from the day 1 of zoned support, but shouldn't need that a huge
> change.
>
> E.g. we just record that we need to relocate the bg, then after the
> scrub of that bg is fully finished, queue a relocation for it.

Yes.  That's what the read repair already does, and also the scrub code,
although in a somewhat sub-optimal way.

>
> Thanks,
> Qu
---end quoted text---
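
To illustrate the point about recording the written LBA at completion,
here is a minimal userspace sketch.  It is not btrfs code: struct zone,
struct extent_map_entry, zone_append(), submit_prepare() and
end_io_record() are all hypothetical names made up for this example, and
the model assumes a zone-append style write where the device picks the
location and only reports it when the I/O completes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; none of these names are btrfs symbols. */
#define PHYS_UNKNOWN UINT64_MAX

struct zone {
	uint64_t start;		/* first byte of the zone on the device */
	uint64_t capacity;	/* usable bytes in the zone */
	uint64_t wp;		/* write pointer, as an offset from start */
};

struct extent_map_entry {
	uint64_t logical;
	uint64_t physical;	/* PHYS_UNKNOWN until the write completes */
	uint64_t len;
};

/*
 * Simulated zone append: the device chooses the location and returns it.
 * Returns PHYS_UNKNOWN if the zone has no room left.
 */
static uint64_t zone_append(struct zone *z, uint64_t len)
{
	uint64_t written;

	if (len > z->capacity - z->wp)
		return PHYS_UNKNOWN;
	written = z->start + z->wp;
	z->wp += len;
	return written;
}

/* Submission side: only a placeholder mapping can be set up here. */
static void submit_prepare(struct extent_map_entry *em, uint64_t logical,
			   uint64_t len)
{
	em->logical = logical;
	em->physical = PHYS_UNKNOWN;
	em->len = len;
}

/*
 * Completion side: replace the placeholder with the reported location.
 * Without a context (em == NULL) there is nowhere to record it, which is
 * why a bare NULL check would lose the mapping update.
 */
static int end_io_record(struct extent_map_entry *em, uint64_t written)
{
	if (em == NULL || written == PHYS_UNKNOWN)
		return -1;
	em->physical = written;
	return 0;
}
```

If two writes are prepared for logical 0 and 4096 and the second one
completes first, it ends up at the start of the zone and the first one
behind it, so the physical address cannot be known at submission time in
this model.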