lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: James Simmons <jsimmons@infradead.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 3/8] lustre: fld: resend seq lookup RPC if it is on LWP
Date: Wed, 24 Jul 2019 22:44:02 -0400	[thread overview]
Message-ID: <1564022647-17351-4-git-send-email-jsimmons@infradead.org> (raw)
In-Reply-To: <1564022647-17351-1-git-send-email-jsimmons@infradead.org>

From: wang di <di.wang@intel.com>

Because Light Weight connection might be evicted after
restart, then cause inflight RPC fails, to avoid this,
we need resend seq lookup RPC.

remove "-f" from "stop mdt" in sanity 17m, so umount can
keep the the connection, and otherwise the OSP might be
evicted.

WC-bug-id: https://jira.whamcloud.com/browse/LU-4571
Lustre-commit: cf7f66d87e52293535cde6e8cc7386e6c1bdfa46
Signed-off-by: wang di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/9106
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
---
 fs/lustre/fld/fld_request.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/fld/fld_request.c b/fs/lustre/fld/fld_request.c
index 248fffa..ec45ea6 100644
--- a/fs/lustre/fld/fld_request.c
+++ b/fs/lustre/fld/fld_request.c
@@ -314,6 +314,7 @@ int fld_client_rpc(struct obd_export *exp,
 
 	LASSERT(exp);
 
+again:
 	imp = class_exp2cliimp(exp);
 	switch (fld_op) {
 	case FLD_QUERY:
@@ -329,8 +330,15 @@ int fld_client_rpc(struct obd_export *exp,
 		op = req_capsule_client_get(&req->rq_pill, &RMF_FLD_OPC);
 		*op = FLD_LOOKUP;
 
-		if (imp->imp_connect_flags_orig & OBD_CONNECT_MDS_MDS)
+		/* For MDS_MDS seq lookup, it will always use LWP connection,
+		 * but LWP will be evicted after restart, so cause the error.
+		 * so we will set no_delay for seq lookup request, once the
+		 * request fails because of the eviction. always retry here
+		 */
+		if (imp->imp_connect_flags_orig & OBD_CONNECT_MDS_MDS) {
 			req->rq_allow_replay = 1;
+			req->rq_no_delay = 1;
+		}
 		break;
 	case FLD_READ:
 		req = ptlrpc_request_alloc_pack(imp, &RQF_FLD_READ,
@@ -358,8 +366,19 @@ int fld_client_rpc(struct obd_export *exp,
 	obd_get_request_slot(&exp->exp_obd->u.cli);
 	rc = ptlrpc_queue_wait(req);
 	obd_put_request_slot(&exp->exp_obd->u.cli);
-	if (rc)
+	if (rc != 0) {
+		if (rc == -EWOULDBLOCK) {
+			/* For no_delay req(see above), EWOULDBLOCK means the
+			 * connection is being evicted, but this seq lookup
+			 * should not return error, since it would cause
+			 * unecessary failure of the application, instead
+			 * it should retry here
+			 */
+			ptlrpc_req_finished(req);
+			goto again;
+		}
 		goto out_req;
+	}
 
 	if (fld_op == FLD_QUERY) {
 		prange = req_capsule_server_get(&req->rq_pill, &RMF_FLD_MDFLD);
-- 
1.8.3.1

  parent reply	other threads:[~2019-07-25  2:44 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-25  2:43 [lustre-devel] [PATCH 0/8] lustre: some old patches from whamcloud tree James Simmons
2019-07-25  2:44 ` [lustre-devel] [PATCH 1/8] lustre: seq: make seq_proc_write_common() safer James Simmons
2019-07-25 23:55   ` NeilBrown
2019-07-26  3:31     ` James Simmons
2019-07-25  2:44 ` [lustre-devel] [PATCH 2/8] lustre: ptlrpc: Fix an rq_no_reply assertion failure James Simmons
2019-08-14 16:58   ` Andreas Dilger
2019-07-25  2:44 ` James Simmons [this message]
2019-08-14 16:58   ` [lustre-devel] [PATCH 3/8] lustre: fld: resend seq lookup RPC if it is on LWP Andreas Dilger
2019-07-25  2:44 ` [lustre-devel] [PATCH 4/8] lustre: fld: retry fld rpc even for ESHUTDOWN James Simmons
2019-08-14 16:58   ` Andreas Dilger
2019-08-14 16:58   ` Andreas Dilger
2019-07-25  2:44 ` [lustre-devel] [PATCH 5/8] lustre: fld: retry fld rpc until the import is closed James Simmons
2019-08-14 16:58   ` Andreas Dilger
2019-07-25  2:44 ` [lustre-devel] [PATCH 6/8] lustre: fld: fld client lookup should retry James Simmons
2019-08-14 16:58   ` Andreas Dilger
2019-07-25  2:44 ` [lustre-devel] [PATCH 7/8] lustre: tests: testcases for multiple modify RPCs feature James Simmons
2019-08-14 16:58   ` Andreas Dilger
2019-07-25  2:44 ` [lustre-devel] [PATCH 8/8] lustre: ldlm: Don't check opcode with NULL rq_reqmsg James Simmons
2019-08-14 16:58   ` Andreas Dilger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1564022647-17351-4-git-send-email-jsimmons@infradead.org \
    --to=jsimmons@infradead.org \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).