From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 36AC7481C0 for ; Wed, 14 Aug 2024 04:59:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.44 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723611570; cv=none; b=KJKCLtDBkkqLkZXe1ZBxr9X7qjpXB+/OYxarYSYqSTzpNw2sXJm6eGLEOBilYU0j899XozktSVLlAtIuPoY5f6SeS4/RgdrG7Uj74tErMbW4jnHO9CGR8PYtmR3ycIE2hZkIjDsTCCJ4FG89YpKKBtxpmBHeOjBWtQzsSVUZcEI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723611570; c=relaxed/simple; bh=N0sUmJT0wjmV/m+WHFZMycnAPET6uKJCQrUyePt9ayY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=qtEA5heyP2iSKezZO+R98RzVxLn7G87QVciYwlp6sPn09NFekgnWKh/2yOjf++VJffNl2MHtHsRq4e7H/bC63LcUcrs1mzW08GhCnwtO315NkoIj3ciBg54wFY33vTpAdazHU2kPQiIRxffIqg7VQxY6+STpQYSl1D7FJkuvUf4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com; spf=pass smtp.mailfrom=fromorbit.com; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b=oMiWrhK3; arc=none smtp.client-ip=209.85.216.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b="oMiWrhK3" Received: by mail-pj1-f44.google.com with SMTP id 98e67ed59e1d1-2cf93dc11c6so4859109a91.1 for ; Tue, 13 Aug 2024 21:59:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1723611566; x=1724216366; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=q9hs9yPPIjAFKNvPKQSLvy6Q7Cm9t0oU+bjgW2/ohow=; b=oMiWrhK3D/eiaMcT9CdZKFxRdCxGstjQQLDGII1fBnwx+Be3R1NsdqfGqX3DpTqALV LhkGsspSa86/eIQ7t8/DUlE988rS3LlIMZZl/dfn6sakXLFfOZBFmEMx/5zcdp6ITYpc 5TcFYjl+N0NjiRjkc6prIx4w0Pe3h1oPi61QT6g7lEtbPYRJhwwZ461NtwepwQ3y+T2y e9ose6GlNtsdNelhJh46ANBHjQMOorq+v4VS4/fJGghjz2sQX/OJ+7/ui4cnXAbJZGx+ 5G5PpVNgMSCEZwDgwtRIkPGOw2+u7vpuNXR8i+fcPLwINr7E6cSSTym0iDJqkF+zdniU u7lA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723611566; x=1724216366; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=q9hs9yPPIjAFKNvPKQSLvy6Q7Cm9t0oU+bjgW2/ohow=; b=FWrdCriAwvOgAA4h9QCK/oGswuHHOzTtQ2p9IaTO2Y68xp9Bx8dviCOfInyvGQ0vep MucX8DYZzEz0ukTBFSWuPT9gQk9FQHWnnv/JpWM08UOyFocxZbqfTeK+0OSwHfGy1yQP moUILqzbZMNLmz94C3PoDWIdlw3edJ/lld6EnFlESi83MJTXLOUa2PX+36Xn+yH5abB9 fF4BYJtcB0mjSlEcsmhShA9KnmtmfAGxJDgY18g3JRQ7AQhRHWUyc8d0OKufqWuywdri xYrHzQaEYKxBFF8MSFVbNLLqod5h/pzh8nLAxFqo4DJNU4bo+ieehou0oJSz9NzjpS1N iPMw== X-Forwarded-Encrypted: i=1; AJvYcCUFSIksMuAavEeF1ntD/ZN4Zu/Kc0pEQ1Xzi9XwQQwqreNjTiWVv+d0jpOV72+BxZV2zejz/vpUGFgaEpk=@vger.kernel.org X-Gm-Message-State: AOJu0YwiWLPev9jhMO0V5F68IFCZYwQ41OmBXrQs8bf8HuV4W/SLvG37 UU9UbO6EFBfEJYQuGDSG3Dl5GSXcqArG1yEc3NWOoOaivDVMz1c55QI086lhO+I= X-Google-Smtp-Source: AGHT+IERVX43gOXYdcf3KbRYaCL9u9fdteGqPIVuHakd/aJAOk7ZsngokahozzcvU6RwEvf5ewwQoA== X-Received: by 2002:a17:90a:c909:b0:2ca:a625:f9e9 with SMTP id 98e67ed59e1d1-2d3aabad914mr1819772a91.42.1723611566502; Tue, 13 Aug 2024 21:59:26 -0700 (PDT) Received: from dread.disaster.area (pa49-181-47-239.pa.nsw.optusnet.com.au. [49.181.47.239]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2d3ac856c64sm539189a91.55.2024.08.13.21.59.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Aug 2024 21:59:26 -0700 (PDT) Received: from dave by dread.disaster.area with local (Exim 4.96) (envelope-from ) id 1se66V-00GQUx-2q; Wed, 14 Aug 2024 14:59:23 +1000 Date: Wed, 14 Aug 2024 14:59:23 +1000 From: Dave Chinner To: Christoph Hellwig Cc: Chandan Babu R , Matthew Wilcox , "Darrick J. Wong" , Andrew Morton , linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH 2/3] xfs: convert perag lookup to xarray Message-ID: References: <20240812063143.3806677-1-hch@lst.de> <20240812063143.3806677-3-hch@lst.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240812063143.3806677-3-hch@lst.de> On Mon, Aug 12, 2024 at 08:31:01AM +0200, Christoph Hellwig wrote: > Convert the perag lookup from the legacy radix tree to the xarray, > which allows for much nicer iteration and bulk lookup semantics. > > Note that this removes the helpers for tagged get and grab and the > for_each* wrappers built around them and instead uses the xa_for_each* > iteration helpers directly in xfs_icache.c, which simplifies the code > nicely. Can we split the implementation change and the API change into two separate patches, please? I have no problems with the xarray conversion, I have reservations about the API changes. I hav eno idea what the new iteration method that is needed looks like, but I'd prefer not to be exposing all the perag locking and reference counting semantics all over the code - the current iterators were introduced to remove all that stuff from existing iterators. This patch makes iteration go back to this model: rcu_read_lock() xa_for_each....() { /* get active or passive ref count */ .... rcu_read_unlock(); /* do work on AG */ /* put/rele perag */ /* take RCU lock for next perag lookup */ rcu_read_lock(); } rcu_read_unlock(); And that feels like a step backward from an API perspective, not an improvement.... So what's the overall plan for avoiding this sort of mess everywhere? Can we re-implement the existing iterators more efficiently with xarray iterators, or does xarray-based iteration require going back to the old way of open coding all iterations? > @@ -1493,21 +1497,32 @@ xfs_blockgc_flush_all( > struct xfs_mount *mp) > { > struct xfs_perag *pag; > - xfs_agnumber_t agno; > + unsigned long index = 0; > > trace_xfs_blockgc_flush_all(mp, __return_address); > > /* > - * For each blockgc worker, move its queue time up to now. If it > - * wasn't queued, it will not be requeued. Then flush whatever's > - * left. > + * For each blockgc worker, move its queue time up to now. If it wasn't > + * queued, it will not be requeued. Then flush whatever is left. > */ > - for_each_perag_tag(mp, agno, pag, XFS_ICI_BLOCKGC_TAG) > - mod_delayed_work(pag->pag_mount->m_blockgc_wq, > - &pag->pag_blockgc_work, 0); > + rcu_read_lock(); > + xa_for_each_marked(&mp->m_perags, index, pag, XFS_ICI_BLOCKGC_TAG) > + mod_delayed_work(mp->m_blockgc_wq, &pag->pag_blockgc_work, 0); > + rcu_read_unlock(); > > + index = 0; > + rcu_read_lock(); > + xa_for_each_marked(&mp->m_perags, index, pag, XFS_ICI_BLOCKGC_TAG) { > + if (!atomic_inc_not_zero(&pag->pag_active_ref)) > + continue; > + rcu_read_unlock(); > > - for_each_perag_tag(mp, agno, pag, XFS_ICI_BLOCKGC_TAG) > flush_delayed_work(&pag->pag_blockgc_work); > + xfs_perag_rele(pag); > + > + rcu_read_lock(); > + } > + rcu_read_unlock(); And this is the whole problem with open coding iterators. The first iterator accesses perag structures and potentially queues them for work without holding a valid reference to the perag. The second iteration takes reference counts, so can access the perag safely. Why are these two iterations different? What makes the first, non-reference counted iteration safe? > > return xfs_inodegc_flush(mp); > } > @@ -1755,18 +1770,26 @@ xfs_icwalk( > struct xfs_perag *pag; > int error = 0; > int last_error = 0; > - xfs_agnumber_t agno; > + unsigned long index = 0; > + > + rcu_read_lock(); > + xa_for_each_marked(&mp->m_perags, index, pag, goal) { > + if (!atomic_inc_not_zero(&pag->pag_active_ref)) > + continue; > + rcu_read_unlock(); > > - for_each_perag_tag(mp, agno, pag, goal) { > error = xfs_icwalk_ag(pag, goal, icw); > + xfs_perag_rele(pag); > + > + rcu_read_lock(); > if (error) { > last_error = error; > - if (error == -EFSCORRUPTED) { > - xfs_perag_rele(pag); > + if (error == -EFSCORRUPTED) > break; > - } > } > } > + rcu_read_unlock(); > + And there's the open coded pattern I talked about earlier that we introduced the for_each_perag iterators to avoid. Like I said, converting to xarray - no problems with that. Changing the iterator API - doesn't seem like a step forwards right now. -Dave. -- Dave Chinner david@fromorbit.com