From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
	Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>,
	Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 01/12] lustre: llite: do not take mod rpc slot for getxattr
Date: Sun, 12 Dec 2021 10:07:52 -0500
Message-ID: <1639321683-22909-2-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1639321683-22909-1-git-send-email-jsimmons@infradead.org>

From: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>

The following scenario may lead to client eviction:
clientA                clientB                  MDS
threadA1: write to file F1, get
and hold DoM MDC LDLM lock L1:
   ->cl_io_loop()
    ->cl_io_lock()
     :
     ->mdc_lock_granted()
      ->lock->l_writers++
     [hold ref until write done]

threadA2-A8: create files F2-F8:
   ->ll_file_open()
    ->mdc_enqueue_base()
     ->ldlm_cli_enqueue()
      ->ptlrpc_get_mod_rpc_slot()
      ->ptlrpc_queue_wait()
      [hold RPC slot until create done]

                                                OST(s) in recovery.
                                                MDS waiting on OST(s) to
                                                precreate new objects.

threadA1:
    -> cl_io_start()
     -> __generic_file_aio_write()
      -> file_remove_suid()
       -> ll_xattr_cache_refill()
        -> mdc_xattr_common()
         -> ptlrpc_get_mod_rpc_slot()
         [blocked waiting for RPC slot]

                        threadB1: write file F1,
                    enqueue DoM MDC lock L1

                                                MDS sends blocking AST
                                                to clientA for lock L1

ldlm_threadA3: cannot cancel busy lock L1:
   -> ldlm_handle_bl_callback()
   ["Lock L1 referenced, will be cancelled later"]

                                                MDS evicts clientA for
                                                not cancelling lock L1

threadA1: never completes write:
  ->cl_io_end()
   ->cl_io_unlock()
    ->osc_lock_cancel()
     ->lock->l_writers--;
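
The slot exhaustion at the heart of this scenario can be sketched as a
toy model. This is only an illustration, not the ptlrpc code: the
sketch_* names are hypothetical, the blocking wait is replaced by a
non-blocking "try", and the limit of 7 is an assumption chosen to match
threads A2-A8 above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a bounded pool of modify-RPC slots. */
struct sketch_slot_pool {
	int max_slots;	/* assumed limit, not taken from the Lustre source */
	int in_flight;	/* slots currently held by senders */
};

/* Non-blocking stand-in for taking a slot; false means "would block". */
static bool sketch_try_get_slot(struct sketch_slot_pool *pool)
{
	if (pool->in_flight >= pool->max_slots)
		return false;
	pool->in_flight++;
	return true;
}

/* Release one slot, as happens when a modifying RPC completes. */
static void sketch_put_slot(struct sketch_slot_pool *pool)
{
	if (pool->in_flight > 0)
		pool->in_flight--;
}

/* An RPC can be sent now if it needs no slot or a slot is free. */
static bool sketch_can_send(struct sketch_slot_pool *pool, bool needs_slot)
{
	return !needs_slot || sketch_try_get_slot(pool);
}

/* Replays the scenario; returns 0 if the model behaves as described. */
int sketch_run_scenario(void)
{
	struct sketch_slot_pool pool = { .max_slots = 7, .in_flight = 0 };
	int i;

	/* threadA2-A8: seven creates each take a slot and hold it. */
	for (i = 0; i < 7; i++)
		assert(sketch_can_send(&pool, true));

	/* threadA1: a getxattr that needs a slot would block here. */
	assert(!sketch_can_send(&pool, true));

	/* The same getxattr proceeds if it does not need a slot. */
	assert(sketch_can_send(&pool, false));

	/* Once one create finishes, a slot becomes available again. */
	sketch_put_slot(&pool);
	assert(sketch_can_send(&pool, true));
	return 0;
}
```

In the model, the write path on threadA1 only gets stuck when its
getxattr competes for the same pool as the stalled creates.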

The fix is to add IT_GETXATTR to the list of operations that do not
need a mod RPC slot.
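
A standalone sketch of the resulting check, for readers without the
tree at hand. The SK_* constants and the struct are hypothetical
stand-ins (the real definitions and values live in the Lustre headers);
only the shape of the predicate follows mdc_skip_mod_rpc_slot() as
changed by this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intent opcodes; values are illustrative only. */
enum sketch_it_op {
	SK_IT_OPEN	= 1 << 0,
	SK_IT_CREAT	= 1 << 1,
	SK_IT_GETATTR	= 1 << 2,
	SK_IT_LOOKUP	= 1 << 3,
	SK_IT_READDIR	= 1 << 4,
	SK_IT_LAYOUT	= 1 << 5,
	SK_IT_GETXATTR	= 1 << 6,
};

/* Hypothetical stand-in for the write open-mode flag. */
#define SK_FMODE_WRITE	0x2u

/* Hypothetical, trimmed-down stand-in for struct lookup_intent. */
struct sketch_intent {
	int		it_op;
	unsigned int	it_flags;
};

/* True if the intent does not modify metadata and can skip the slot. */
static bool sketch_skip_mod_rpc_slot(const struct sketch_intent *it)
{
	if (!it)
		return false;

	switch (it->it_op) {
	case SK_IT_GETATTR:
	case SK_IT_LOOKUP:
	case SK_IT_READDIR:
	case SK_IT_GETXATTR:	/* the case added by this patch */
		return true;
	case SK_IT_LAYOUT:
		/* a layout intent skips the slot only when not opened for write */
		return !(it->it_flags & SK_FMODE_WRITE);
	default:
		return false;
	}
}
```

With this check, a getxattr intent is treated like the other read-only
intents, while open and create intents still take a slot.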

Tests illustrating the issue are added.

wait_for_function(): the total sleep time (wait) now equals max when
1 is returned.

HPE-bug-id: LUS-7271
WC-bug-id: https://jira.whamcloud.com/browse/LU-12347
Lustre-commit: eb64594e4473af85 ("LU-12347 llite: do not take mod rpc slot for getxattr")
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44151
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd_support.h | 1 +
 fs/lustre/llite/xattr_cache.c   | 2 ++
 fs/lustre/mdc/mdc_locks.c       | 2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 540e1e0..d57c25c 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -484,6 +484,7 @@
 #define OBD_FAIL_LLITE_RACE_MOUNT			0x1417
 #define OBD_FAIL_LLITE_PAGE_ALLOC			0x1418
 #define OBD_FAIL_LLITE_OPEN_DELAY			0x1419
+#define OBD_FAIL_LLITE_XATTR_PAUSE			0x1420
 
 #define OBD_FAIL_FID_INDIR				0x1501
 #define OBD_FAIL_FID_INLMA				0x1502
diff --git a/fs/lustre/llite/xattr_cache.c b/fs/lustre/llite/xattr_cache.c
index b044c89..7c1f5b7 100644
--- a/fs/lustre/llite/xattr_cache.c
+++ b/fs/lustre/llite/xattr_cache.c
@@ -396,6 +396,8 @@ static int ll_xattr_cache_refill(struct inode *inode)
 	u32 *xsizes;
 	int rc, i;
 
+	CFS_FAIL_TIMEOUT(OBD_FAIL_LLITE_XATTR_PAUSE, cfs_fail_val ?: 2);
+
 	rc = ll_xattr_find_get_lock(inode, &oit, &req);
 	if (rc)
 		goto err_req;
diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c
index 66f0039..2c344d7 100644
--- a/fs/lustre/mdc/mdc_locks.c
+++ b/fs/lustre/mdc/mdc_locks.c
@@ -886,7 +886,7 @@ static inline bool mdc_skip_mod_rpc_slot(const struct lookup_intent *it)
 {
 	if (it &&
 	    (it->it_op == IT_GETATTR || it->it_op == IT_LOOKUP ||
-	     it->it_op == IT_READDIR ||
+	     it->it_op == IT_READDIR || it->it_op == IT_GETXATTR ||
 	     (it->it_op == IT_LAYOUT && !(it->it_flags & MDS_FMODE_WRITE))))
 		return true;
 	return false;
-- 
1.8.3.1

_______________________________________________
lustre-devel mailing list
lustre-devel@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
