From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Lai Siyao <lai.siyao@whamcloud.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 02/24] lustre: lmv: always space-balance r-r directories
Date: Mon,  5 Sep 2022 21:55:15 -0400
Message-ID: <1662429337-18737-3-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1662429337-18737-1-git-send-email-jsimmons@infradead.org>

From: Lai Siyao <lai.siyao@whamcloud.com>

If the MDT free space is imbalanced, use QOS space balancing for
round-robin subdirectory creation, regardless of the depth
of the directory tree.  Otherwise, new subdirectories created
in parents with round-robin default layout may suddenly become
"sticky" on the parent MDT and upset the space balancing and
load distribution.

Fixes: a8948860e4 ("lustre: lmv: improve MDT QOS space balance")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15850
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47578
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/lmv/lmv_obd.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c
index 6c0eb03..0988b1a 100644
--- a/fs/lustre/lmv/lmv_obd.c
+++ b/fs/lustre/lmv/lmv_obd.c
@@ -55,6 +55,7 @@
 #include "lmv_internal.h"
 
 static int lmv_check_connect(struct obd_device *obd);
+static inline bool lmv_op_default_rr_mkdir(const struct md_op_data *op_data);
 
 void lmv_activate_target(struct lmv_obd *lmv, struct lmv_tgt_desc *tgt,
 			 int activate)
@@ -1446,8 +1447,8 @@ static int lmv_close(struct obd_export *exp, struct md_op_data *op_data,
 	return md_close(tgt->ltd_exp, op_data, mod, request);
 }
 
-static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv, u32 mdt,
-					      unsigned short dir_depth)
+static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv,
+					      struct md_op_data *op_data)
 {
 	struct lu_tgt_desc *tgt, *cur = NULL;
 	u64 total_avail = 0;
@@ -1481,23 +1482,31 @@ static struct lu_tgt_desc *lmv_locate_tgt_qos(struct lmv_obd *lmv, u32 mdt,
 
 		tgt->ltd_qos.ltq_usable = 1;
 		lu_tgt_qos_weight_calc(tgt);
-		if (tgt->ltd_index == mdt)
+		if (tgt->ltd_index == op_data->op_mds)
 			cur = tgt;
 		total_avail += tgt->ltd_qos.ltq_avail;
 		total_weight += tgt->ltd_qos.ltq_weight;
 		total_usable++;
 	}
 
-	/* if current MDT has above-average space, within range of the QOS
-	 * threshold, stay on the same MDT to avoid creating needless remote
-	 * MDT directories. It's more likely for low level directories
-	 * "16 / (dir_depth + 10)" is the factor to make it more unlikely for
-	 * top level directories, while more likely for low levels.
+	/* If current MDT has above-average space and dir is not already using
+	 * round-robin to spread across more MDTs, stay on the parent MDT
+	 * to avoid creating needless remote MDT directories.  Remote dirs
+	 * close to the root balance space more effectively than bottom dirs,
+	 * so prefer to create remote dirs at top level of directory tree.
+	 * "16 / (dir_depth + 10)" is the factor to make it less likely
+	 * for top-level directories to stay local unless they have more than
+	 * average free space, while deep dirs prefer local until more full.
+	 *    depth=0 -> 160%, depth=3 -> 123%, depth=6 -> 100%,
+	 *    depth=9 -> 84%, depth=12 -> 73%, depth=15 -> 64%
 	 */
-	rand = total_avail * 16 / (total_usable * (dir_depth + 10));
-	if (cur && cur->ltd_qos.ltq_avail >= rand) {
-		tgt = cur;
-		goto unlock;
+	if (!lmv_op_default_rr_mkdir(op_data)) {
+		rand = total_avail * 16 /
+			(total_usable * (op_data->op_dir_depth + 10));
+		if (cur && cur->ltd_qos.ltq_avail >= rand) {
+			tgt = cur;
+			goto unlock;
+		}
 	}
 
 	rand = lu_prandom_u64_max(total_weight);
@@ -1836,9 +1845,6 @@ static inline bool lmv_op_default_rr_mkdir(const struct md_op_data *op_data)
 {
 	const struct lmv_stripe_md *lsm = op_data->op_default_mea1;
 
-	if (!lmv_op_default_qos_mkdir(op_data))
-		return false;
-
 	return (op_data->op_flags & MF_RR_MKDIR) ||
 	       (lsm && lsm->lsm_md_max_inherit_rr != LMV_INHERIT_RR_NONE) ||
 	       fid_is_root(&op_data->op_fid1);
@@ -1873,7 +1879,7 @@ static struct lu_tgt_desc *lmv_locate_tgt_by_space(struct lmv_obd *lmv,
 {
 	struct lmv_tgt_desc *tmp = tgt;
 
-	tgt = lmv_locate_tgt_qos(lmv, op_data->op_mds, op_data->op_dir_depth);
+	tgt = lmv_locate_tgt_qos(lmv, op_data);
 	if (tgt == ERR_PTR(-EAGAIN)) {
 		if (ltd_qos_is_balanced(&lmv->lmv_mdt_descs) &&
 		    !lmv_op_default_rr_mkdir(op_data) &&
-- 
1.8.3.1
