From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 687C8C83F01 for ; Wed, 30 Aug 2023 22:03:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244239AbjH3WDD (ORCPT ); Wed, 30 Aug 2023 18:03:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244215AbjH3WDC (ORCPT ); Wed, 30 Aug 2023 18:03:02 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 31777FB for ; Wed, 30 Aug 2023 15:02:32 -0700 (PDT) Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id B22115C0003; Wed, 30 Aug 2023 18:01:22 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Wed, 30 Aug 2023 18:01:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm2; t=1693432882; x=1693519282; bh=/U e9cgPRlNmDI3ITBwzEr9+BZRw7qwTzHxBLNEpviPQ=; b=AhYT4K1Lrn3wQ29lcF UyscQLhY30N+Ly0u9sqdRoEXMJkA8THZRaGqlhExIRvJUCEI3Ko4bx/BMUo/SVfy irK+3667/i5KdfQDyIROO/USlW2s3VGBKvhr+1AVIzUCn7ajuWLa8/jrHaDq9mlz X9C9SlBB3MqeSfjFEUcD8NejJLTR5h/Y16NiDJkfXTFpn0hJ7bZsJpcuwFtitDUz 4WqKp16oo+RfDOCMlQxPwMkf1HeuOWuFbaE0+yInfHq5jbDHo6srgiRE5Ddh3Luv ZbrGkoddZOW5NoSLh44mevNE8RX7rX4C/BnUmqhlyDftwu09/exb05SK4grbjeej lNPA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1693432882; x=1693519282; bh=/Ue9cgPRlNmDI 3ITBwzEr9+BZRw7qwTzHxBLNEpviPQ=; b=lo3N72K+YspR++F3Qw2/Vq/6PsDXv GeVS3gQp//8H5LqgItc+9gnXuu/dS4wpUoNH+qgefbOtnd+dQZH+kXb1dxaG5Yy1 IMenRvb8Y2Z6jHB2jd+ZGdqOvVIBlxuAfuBVaZZUtAeDqPpTPgV5fkuvGTfSlT5O nbXcF2psovZEh4MpzoYuQ9L3yTrSgxfUs9gqAoZ9FW/V5vnTxmPoCC5O0um7eTaK BGuTd35it5EHKBzLzT+/C/vK9/WrnME5nuWokx91NedjTVR6o2XdWIPw8MhOEBw/ c666hRJdeqCpyJ02XJjD53YdP8j3XQyXfcwqqK7SLxm8Np1i0lyVQBuNQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrudefledgtdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesthdtredttddtvdenucfhrhhomhepuehorhhi shcuuehurhhkohhvuceosghorhhishessghurhdrihhoqeenucggtffrrghtthgvrhhnpe ekvdekffejleelhfevhedvjeduhfejtdfhvdevieeiiedugfeugfdtjefgfeeljeenucev lhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegsohhrihhsse gsuhhrrdhioh X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 30 Aug 2023 18:01:22 -0400 (EDT) Date: Wed, 30 Aug 2023 14:58:48 -0700 From: Boris Burkov To: Qu Wenruo Cc: linux-btrfs@vger.kernel.org Subject: Re: [PATCH 1/5] btrfs: qgroup: iterate qgroups without memory allocation for qgroup_reserve() Message-ID: <20230830215848.GA834639@zen> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Wed, Aug 30, 2023 at 06:37:23PM +0800, Qu Wenruo wrote: > Qgroup heavily relies on ulist to go through all the involved > qgroups, but since we're using ulist inside fs_info->qgroup_lock > spinlock, this means we're doing a lot of GFP_ATOMIC allocation. > > This patch would reduce the GFP_ATOMIC usage for qgroup_reserve() by > eliminating the memory allocation completely. > > This is done by moving the needed memory to btrfs_qgroup::iterator > list_head, so that we can put all the involved qgroup into a on-stack list, thus > eliminate the need to allocate memory holding spinlock. > > The only cost is the slightly higher memory usage, but considering the > reduce GFP_ATOMIC during a hot path, it should still be acceptable. > > Function qgroup_reserve() is the perfect start point for this > conversion. This looks great, thanks! I like it more than my array/ulist hybrid since it never allocates :) Reviewed-by: Boris Burkov > > Signed-off-by: Qu Wenruo > --- > fs/btrfs/qgroup.c | 68 +++++++++++++++++++++++------------------------ > fs/btrfs/qgroup.h | 9 +++++++ > 2 files changed, 42 insertions(+), 35 deletions(-) > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > index 74244b4bb0e9..de34e2aef710 100644 > --- a/fs/btrfs/qgroup.c > +++ b/fs/btrfs/qgroup.c > @@ -216,6 +216,7 @@ static struct btrfs_qgroup *add_qgroup_rb(struct btrfs_fs_info *fs_info, > INIT_LIST_HEAD(&qgroup->groups); > INIT_LIST_HEAD(&qgroup->members); > INIT_LIST_HEAD(&qgroup->dirty); > + INIT_LIST_HEAD(&qgroup->iterator); > > rb_link_node(&qgroup->node, parent, p); > rb_insert_color(&qgroup->node, &fs_info->qgroup_tree); > @@ -1367,6 +1368,25 @@ static void qgroup_dirty(struct btrfs_fs_info *fs_info, > list_add(&qgroup->dirty, &fs_info->dirty_qgroups); > } > > +static void qgroup_iterator_add(struct list_head *head, struct btrfs_qgroup *qgroup) > +{ > + if (!list_empty(&qgroup->iterator)) > + return; > + > + list_add_tail(&qgroup->iterator, head); > +} > + > +static void qgroup_iterator_clean(struct list_head *head) > +{ > + > + while (!list_empty(head)) { > + struct btrfs_qgroup *qgroup; > + > + qgroup = list_first_entry(head, struct btrfs_qgroup, iterator); > + list_del_init(&qgroup->iterator); > + } > +} > + > /* > * The easy accounting, we're updating qgroup relationship whose child qgroup > * only has exclusive extents. > @@ -3154,12 +3174,11 @@ static bool qgroup_check_limits(const struct btrfs_qgroup *qg, u64 num_bytes) > static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce, > enum btrfs_qgroup_rsv_type type) > { > - struct btrfs_qgroup *qgroup; > + struct btrfs_qgroup *cur; > struct btrfs_fs_info *fs_info = root->fs_info; > u64 ref_root = root->root_key.objectid; > int ret = 0; > - struct ulist_node *unode; > - struct ulist_iterator uiter; > + LIST_HEAD(qgroup_list); > > if (!is_fstree(ref_root)) > return 0; > @@ -3175,53 +3194,32 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce, > if (!fs_info->quota_root) > goto out; > > - qgroup = find_qgroup_rb(fs_info, ref_root); > - if (!qgroup) > + cur = find_qgroup_rb(fs_info, ref_root); > + if (!cur) > goto out; > > - /* > - * in a first step, we check all affected qgroups if any limits would > - * be exceeded > - */ > - ulist_reinit(fs_info->qgroup_ulist); > - ret = ulist_add(fs_info->qgroup_ulist, qgroup->qgroupid, > - qgroup_to_aux(qgroup), GFP_ATOMIC); > - if (ret < 0) > - goto out; > - ULIST_ITER_INIT(&uiter); > - while ((unode = ulist_next(fs_info->qgroup_ulist, &uiter))) { > - struct btrfs_qgroup *qg; > + qgroup_iterator_add(&qgroup_list, cur); > + list_for_each_entry(cur, &qgroup_list, iterator) { > struct btrfs_qgroup_list *glist; > > - qg = unode_aux_to_qgroup(unode); > - > - if (enforce && !qgroup_check_limits(qg, num_bytes)) { > + if (enforce && !qgroup_check_limits(cur, num_bytes)) { > ret = -EDQUOT; > goto out; > } > > - list_for_each_entry(glist, &qg->groups, next_group) { > - ret = ulist_add(fs_info->qgroup_ulist, > - glist->group->qgroupid, > - qgroup_to_aux(glist->group), GFP_ATOMIC); > - if (ret < 0) > - goto out; > - } > + list_for_each_entry(glist, &cur->groups, next_group) > + qgroup_iterator_add(&qgroup_list, glist->group); > } > + > ret = 0; > /* > * no limits exceeded, now record the reservation into all qgroups > */ > - ULIST_ITER_INIT(&uiter); > - while ((unode = ulist_next(fs_info->qgroup_ulist, &uiter))) { > - struct btrfs_qgroup *qg; > - > - qg = unode_aux_to_qgroup(unode); > - > - qgroup_rsv_add(fs_info, qg, num_bytes, type); > - } > + list_for_each_entry(cur, &qgroup_list, iterator) > + qgroup_rsv_add(fs_info, cur, num_bytes, type); > > out: > + qgroup_iterator_clean(&qgroup_list); > spin_unlock(&fs_info->qgroup_lock); > return ret; > } > diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h > index 7bffa10589d6..5dc0583622c3 100644 > --- a/fs/btrfs/qgroup.h > +++ b/fs/btrfs/qgroup.h > @@ -220,6 +220,15 @@ struct btrfs_qgroup { > struct list_head groups; /* groups this group is member of */ > struct list_head members; /* groups that are members of this group */ > struct list_head dirty; /* dirty groups */ > + > + /* > + * For qgroup iteration usage. > + * > + * The iteration list should always be empty until > + * qgroup_iterator_add() is called. > + * And should be reset to empty after the iteration is finished. Can you also add a note that this relies on global exclusion under the qgroup spin lock? > + */ > + struct list_head iterator; > struct rb_node node; /* tree of qgroups */ > > /* > -- > 2.41.0 >