linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] cxl: Use call_rcu to reduce latency when releasing the afu fd
@ 2015-05-08 12:55 Ian Munsie
From: Ian Munsie @ 2015-05-08 12:55 UTC
  To: mpe
  Cc: mikey, Brian Allison, linux-kernel, linuxppc-dev, Ian Munsie,
	Fei K Chen

From: Ian Munsie <imunsie@au1.ibm.com>

The AFU fd release path was identified as a significant bottleneck in
the overall performance of cxl. While an optimal AFU design would
minimise the need to close & reopen the AFU fd, doing so is not always
practical to avoid.

The bottleneck appears to be the call to synchronize_rcu(), which blocks
until a full RCU grace period has elapsed, i.e. until every RCU
read-side critical section that was already running has completed.
Replace it with call_rcu() so that the context structures are freed
later from a callback and we can return to the application sooner.

This reduces the time spent in the fd release path from 13356 usec to
13.3 usec - about a 1000x speed up.

Reported-by: Fei K Chen <uchen@cn.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 drivers/misc/cxl/context.c | 15 ++++++++++-----
 drivers/misc/cxl/cxl.h     |  2 ++
 2 files changed, 12 insertions(+), 5 deletions(-)
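
Note for readers unfamiliar with the pattern: the following is a minimal
userspace sketch of the deferred-free idea used below, not kernel code
and not part of the patch. It assumes a toy callback queue in place of
real RCU, so it provides none of RCU's guarantees. The names demo_ctx,
demo_reclaim, demo_ctx_free, mock_call_rcu, mock_grace_period and
run_demo are hypothetical; only the embedded rcu_head and the
container_of() lookup mirror what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's rcu_head and container_of(). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical context; only the embedded rcu_head mirrors the patch. */
struct demo_ctx {
	int pe;
	struct rcu_head rcu;
};

static struct rcu_head *pending;	/* callbacks awaiting the mock grace period */
static int reclaimed;			/* number of contexts freed so far */

/* Mock of call_rcu(): queue the callback and return immediately. */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	assert(func != NULL);
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Mock grace period: run every queued callback. Real RCU does this
 * asynchronously once pre-existing readers have finished. */
static void mock_grace_period(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Callback: recover the enclosing object from its rcu_head and free it. */
static void demo_reclaim(struct rcu_head *rcu)
{
	struct demo_ctx *ctx = container_of(rcu, struct demo_ctx, rcu);

	free(ctx);
	reclaimed++;
}

/* Analogue of the new cxl_context_free(): unpublish, then defer the free. */
static void demo_ctx_free(struct demo_ctx *ctx)
{
	/* The real code removes ctx from the IDR under a mutex here. */
	mock_call_rcu(&ctx->rcu, demo_reclaim);
}

/* Returns 1 if the free happened only after the mock grace period. */
int run_demo(void)
{
	struct demo_ctx *ctx = malloc(sizeof(*ctx));
	int before;

	if (!ctx)
		return -1;
	ctx->pe = 1;
	before = reclaimed;
	demo_ctx_free(ctx);		/* returns without freeing */
	if (reclaimed != before)
		return 0;		/* would mean the free was not deferred */
	mock_grace_period();
	return reclaimed - before;
}
```

Calling run_demo() from a small main() exercises the sketch. The point
it illustrates is that the release path only pays for queueing the
callback, while the actual free happens later.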

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 22eb338..cea299e 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -243,12 +243,9 @@ void cxl_context_detach_all(struct cxl_afu *afu)
 	mutex_unlock(&afu->contexts_lock);
 }
 
-void cxl_context_free(struct cxl_context *ctx)
+static void reclaim_ctx(struct rcu_head *rcu)
 {
-	mutex_lock(&ctx->afu->contexts_lock);
-	idr_remove(&ctx->afu->contexts_idr, ctx->pe);
-	mutex_unlock(&ctx->afu->contexts_lock);
-	synchronize_rcu();
+	struct cxl_context *ctx = container_of(rcu, struct cxl_context, rcu);
 
 	free_page((u64)ctx->sstp);
 	ctx->sstp = NULL;
@@ -256,3 +253,11 @@ void cxl_context_free(struct cxl_context *ctx)
 	put_pid(ctx->pid);
 	kfree(ctx);
 }
+
+void cxl_context_free(struct cxl_context *ctx)
+{
+	mutex_lock(&ctx->afu->contexts_lock);
+	idr_remove(&ctx->afu->contexts_idr, ctx->pe);
+	mutex_unlock(&ctx->afu->contexts_lock);
+	call_rcu(&ctx->rcu, reclaim_ctx);
+}
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 47f655f..ebd2e0d 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -460,6 +460,8 @@ struct cxl_context {
 	bool pending_irq;
 	bool pending_fault;
 	bool pending_afu_err;
+
+	struct rcu_head rcu;
 };
 
 struct cxl {
-- 
2.1.4
