From: Oleg Drokin <green@linuxhacker.ru>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-kernel@vger.kernel.org, devel@driverdev.osuosl.org
Cc: "Alexander.Boyko" <alexander_boyko@xyratex.com>,
Oleg Drokin <oleg.drokin@intel.com>
Subject: [PATCH 08/18] staging/lustre/ptlrpc: race at req processing
Date: Sun, 22 Jun 2014 21:32:12 -0400
Message-ID: <1403487142-4880-9-git-send-email-green@linuxhacker.ru>
In-Reply-To: <1403487142-4880-1-git-send-email-green@linuxhacker.ru>
From: "Alexander.Boyko" <alexander_boyko@xyratex.com>
This fixes a race between ptlrpc_resend_req() and ptlrpc_check_set():
  Thread 1 runs ptlrpc_check_set()->after_reply().
  Thread 2 runs ptlrpc_resend_req().
The result is a request with rq_resend = 1 and the MSG_REPLAY flag set.
When such a request reaches the server, it causes client eviction.
This patch skips the ptlrpc_resend_req() logic if rq_replied is set,
and clears the rq_resend flag in reply_in_callback() when the client
receives a reply.
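
For readers without the Lustre tree at hand, the locking pattern described
above can be sketched in userspace roughly as follows. This is only an
illustration, not the kernel code: struct sketch_req and every sketch_*
function are hypothetical stand-ins, a pthread mutex plays the role of
rq_lock, and the sketch assumes that the reply path updates the flags under
the same lock as the resend path.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct ptlrpc_request. */
struct sketch_req {
	pthread_mutex_t lock;	/* plays the role of rq_lock */
	bool replied;		/* models rq_replied */
	bool resend;		/* models rq_resend */
	int status;		/* models rq_status */
};

static void sketch_req_init(struct sketch_req *req)
{
	pthread_mutex_init(&req->lock, NULL);
	req->replied = false;
	req->resend = false;
	req->status = 0;
}

/* Reply path: a real reply arrived, so any pending resend is cancelled. */
static void sketch_reply_in(struct sketch_req *req)
{
	pthread_mutex_lock(&req->lock);
	req->replied = true;
	req->resend = false;
	pthread_mutex_unlock(&req->lock);
}

/*
 * Resend path: the replied check and the flag update happen under one
 * lock hold, so a concurrent reply cannot slip in between them.
 * Returns true if the request was marked for resend, false if skipped.
 */
static bool sketch_resend(struct sketch_req *req)
{
	bool marked = false;

	pthread_mutex_lock(&req->lock);
	if (!req->replied) {
		req->status = -EAGAIN;
		req->resend = true;
		marked = true;
	}
	pthread_mutex_unlock(&req->lock);
	return marked;
}

/* Send-time sanity check: a request must never be both replied and resend. */
static void sketch_check_before_send(struct sketch_req *req)
{
	pthread_mutex_lock(&req->lock);
	assert(!(req->replied && req->resend));
	pthread_mutex_unlock(&req->lock);
}
```

Whichever thread takes the lock first, the request ends up either marked for
resend with no reply recorded, or replied with the resend flag clear. The
mixed state described above should not be reachable in this model.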
Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Xyratex-bug-id: MRP-1888
Reviewed-on: http://review.whamcloud.com/10471
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5116
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
---
drivers/staging/lustre/lustre/ptlrpc/client.c | 11 ++++++++++-
drivers/staging/lustre/lustre/ptlrpc/events.c | 2 ++
drivers/staging/lustre/lustre/ptlrpc/niobuf.c | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 7246e8c..d806257 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2530,10 +2530,19 @@ EXPORT_SYMBOL(ptlrpc_cleanup_client);
 void ptlrpc_resend_req(struct ptlrpc_request *req)
 {
 	DEBUG_REQ(D_HA, req, "going to resend");
+	spin_lock(&req->rq_lock);
+
+	/* Request got reply but linked to the import list still.
+	   Let ptlrpc_check_set() to process it. */
+	if (ptlrpc_client_replied(req)) {
+		spin_unlock(&req->rq_lock);
+		DEBUG_REQ(D_HA, req, "it has reply, so skip it");
+		return;
+	}
+
 	lustre_msg_set_handle(req->rq_reqmsg, &(struct lustre_handle){ 0 });
 	req->rq_status = -EAGAIN;
-	spin_lock(&req->rq_lock);
 	req->rq_resend = 1;
 	req->rq_net_err = 0;
 	req->rq_timedout = 0;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index aa85239..9f9b8d1 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -145,6 +145,8 @@ void reply_in_callback(lnet_event_t *ev)
 		/* Real reply */
 		req->rq_rep_swab_mask = 0;
 		req->rq_replied = 1;
+		/* Got reply, no resend required */
+		req->rq_resend = 0;
 		req->rq_reply_off = ev->offset;
 		req->rq_nob_received = ev->mlength;
 		/* LNetMDUnlink can't be called under the LNET_LOCK,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index ef18639..f760504 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -505,6 +505,8 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 	/* If this is a re-transmit, we're required to have disengaged
 	 * cleanly from the previous attempt */
 	LASSERT(!request->rq_receiving_reply);
+	LASSERT(!((lustre_msg_get_flags(request->rq_reqmsg) & MSG_REPLAY) &&
+		(request->rq_import->imp_state == LUSTRE_IMP_FULL)));
 	if (unlikely(obd != NULL && obd->obd_fail)) {
 		CDEBUG(D_HA, "muting rpc for failed imp obd %s\n",
--
1.9.0
Thread overview: 19+ messages
2014-06-23 1:32 [PATCH 00/18] Lustre fixes Oleg Drokin
2014-06-23 1:32 ` [PATCH 01/18] staging/lustre/libcfs: revert changes to libcfs_sock_ioctl Oleg Drokin
2014-06-23 1:32 ` [PATCH 02/18] staging/lustre/ptlrpc: Protect request buffer changing Oleg Drokin
2014-06-23 1:32 ` [PATCH 03/18] staging/lustre/llite: Only kill SGID/SUID bits Oleg Drokin
2014-06-23 1:32 ` [PATCH 04/18] staging/lustre: fix frong ldlm flags type used Oleg Drokin
2014-06-23 1:32 ` [PATCH 05/18] staging/lustre/ptlrpc: fix NULL pointer dereference of {exp,imp}_obd Oleg Drokin
2014-06-23 1:32 ` [PATCH 06/18] staging/lustre/mgc: mgc import reconnect race Oleg Drokin
2014-06-23 1:32 ` [PATCH 07/18] staging/lustre/osc: get rid of old checksum initial value Oleg Drokin
2014-06-23 1:32 ` Oleg Drokin [this message]
2014-06-23 1:32 ` [PATCH 09/18] staging/lustre/mgc: replace hard-coded MGC_ENQUEUE_LIMIT value Oleg Drokin
2014-06-23 1:32 ` [PATCH 10/18] staging/lustre/ptlrpc: Add schedule point to ptlrpc_check_set() Oleg Drokin
2014-06-23 1:32 ` [PATCH 11/18] staging/lustre/obdclass: Fix uninitialized variables Oleg Drokin
2014-06-23 1:32 ` [PATCH 12/18] staging/lustre/osc: osc_extent_truncate()) ASSERTION( !ext->oe_urgent ) failed Oleg Drokin
2014-06-23 1:32 ` [PATCH 13/18] staging/lustre/llite: Fix uninitialized variable Oleg Drokin
2014-06-23 1:32 ` [PATCH 14/18] staging/lustre/ptlrpc: unlink request buffer correctly Oleg Drokin
2014-06-23 1:32 ` [PATCH 15/18] staging/lustre/obdclass: runtime load lustre client when needed Oleg Drokin
2014-06-23 1:32 ` [PATCH 16/18] staging/lustre/vvp: release mmap_sem in error case Oleg Drokin
2014-06-23 1:32 ` [PATCH 17/18] staging/lustre/llite: fix a flag bug of vvp_io_kernel_fault() Oleg Drokin
2014-06-23 1:32 ` [PATCH 18/18] staging/lustre/lnet: abort messages whose MD has been unlinked Oleg Drokin