[PATCH v2 4/4] rsockets: acknowledge completion queue events in batch

public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch
@ 2014-09-05 13:23 Sreedhar Kodali
       [not found] ` <c144abfe27bd0bf36ff1e3c9c912d7c3-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Sreedhar Kodali @ 2014-09-05 13:23 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
	pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

 From: Sreedhar Kodali <srkodali-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>

     Expose '/unackcqe_default' variable so completion event
     acknowledgment can be done in batch instead of singles
     to minimize locking overheads.

     The default value of this variable is 1 so the existing
     functionality is preserved.  Having a higher value delays
     acknowledging until the specified limit is reached for
     retrieved completion events.

     Signed-off-by: Sreedhar Kodali <srkodali-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
     ---

diff --git a/src/rsocket.c b/src/rsocket.c
index ffea0ca..3c39da7 100644
--- a/src/rsocket.c
+++ b/src/rsocket.c
@@ -118,6 +118,7 @@ static uint32_t polling_time = 10;
  static uint16_t restart_onintr = 0;
  static uint16_t next_comp_vector = 0;
  static uint64_t comp_vector_mask = 0;
+static uint32_t def_unackcqe = 1;

  /*
   * Immediate data format is determined by the upper bits
@@ -384,6 +385,7 @@ struct rsocket {
  	dlist_entry	  iomap_list;
  	dlist_entry	  iomap_queue;
  	int		  iomap_pending;
+	int	      unackcqe;
  };

  #define DS_UDP_TAG 0x55555555
@@ -581,6 +583,14 @@ void rs_configure(void)
  			}
  		}
  	}
+
+	if ((f = fopen(RS_CONF_DIR "/unackcqe_default", "r"))) {
+		(void) fscanf(f, "%u", &def_unackcqe);
+		fclose(f);
+		if (def_unackcqe < 1) {
+			def_unackcqe = 1;
+		}
+	}
  	init = 1;
  out:
  	pthread_mutex_unlock(&mut);
@@ -638,6 +648,7 @@ static struct rsocket *rs_alloc(struct rsocket 
*inherited_rs, int type)
  			rs->target_iomap_size = def_iomap_size;
  		}
  	}
+	rs->unackcqe = 0;
  	fastlock_init(&rs->slock);
  	fastlock_init(&rs->rlock);
  	fastlock_init(&rs->cq_lock);
@@ -1044,8 +1055,11 @@ static void rs_free(struct rsocket *rs)

  	if (rs->cm_id) {
  		rs_free_iomappings(rs);
-		if (rs->cm_id->qp)
+		if (rs->cm_id->qp) {
+			if (rs->unackcqe > 0)
+				ibv_ack_cq_events(rs->cm_id->recv_cq, rs->unackcqe);
  			rdma_destroy_qp(rs->cm_id);
+		}
  		rdma_destroy_id(rs->cm_id);
  	}

@@ -2026,7 +2040,11 @@ static int rs_get_cq_event(struct rsocket *rs)
  resume_get_cq_event:
  	ret = ibv_get_cq_event(rs->cm_id->recv_cq_channel, &cq, &context);
  	if (!ret) {
-		ibv_ack_cq_events(rs->cm_id->recv_cq, 1);
+		rs->unackcqe += 1;
+		if (rs->unackcqe == def_unackcqe) {
+			ibv_ack_cq_events(rs->cm_id->recv_cq, rs->unackcqe);
+			rs->unackcqe = 0;
+		}
  		rs->cq_armed = 0;
  	} else if (restart_onintr == 1 && errno == EINTR) {
  		errno = 0;

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* RE: [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch
       [not found] ` <c144abfe27bd0bf36ff1e3c9c912d7c3-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
@ 2014-09-05 18:01   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A8237399DC1B64-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Hefty, Sean @ 2014-09-05 18:01 UTC (permalink / raw)
  To: Sreedhar Kodali,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org

>      Expose '/unackcqe_default' variable so completion event
>      acknowledgment can be done in batch instead of singles
>      to minimize locking overheads.
> 
>      The default value of this variable is 1 so the existing
>      functionality is preserved.  Having a higher value delays
>      acknowledging until the specified limit is reached for
>      retrieved completion events.

I would rather just change the default behavior, rather than introduce more checks and options into the code.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch
       [not found]     ` <1828884A29C6694DAF28B7E6B8A8237399DC1B64-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-09-06  2:22       ` Sreedhar Kodali
       [not found]         ` <d52658b2f68bc014103e655857096b4a-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Sreedhar Kodali @ 2014-09-06  2:22 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 2014-09-05 23:31, Hefty, Sean wrote:
>>      Expose '/unackcqe_default' variable so completion event
>>      acknowledgment can be done in batch instead of singles
>>      to minimize locking overheads.
>> 
>>      The default value of this variable is 1 so the existing
>>      functionality is preserved.  Having a higher value delays
>>      acknowledging until the specified limit is reached for
>>      retrieved completion events.
> 
> I would rather just change the default behavior, rather than introduce
> more checks and options into the code.

Hi Sean,

That's indeed a good approach.  But, we need to have an upper limit
before a batch of events is flushed.  Can we decide limit based on
completion event queue size?

Thank You.

- Sreedhar

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch
       [not found]         ` <d52658b2f68bc014103e655857096b4a-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
@ 2014-09-06 16:37           ` Hefty, Sean
       [not found]             ` <1828884A29C6694DAF28B7E6B8A8237399DC36C7-Q3cL8pyY+6s64kNsxIetb7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Hefty, Sean @ 2014-09-06 16:37 UTC (permalink / raw)
  To: Sreedhar Kodali
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org

> That's indeed a good approach.  But, we need to have an upper limit
> before a batch of events is flushed.  Can we decide limit based on
> completion event queue size?

I think that's fine.  We only need to ack before wrapping the 32-bit value, plus ensure that everything is acked when closing the rsocket.

But, does this change really make any difference?  At the point where ack is called, there's no contention, so the overhead seems small considering that we in a part of the code where we likely just blocked the thread.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch
       [not found]             ` <1828884A29C6694DAF28B7E6B8A8237399DC36C7-Q3cL8pyY+6s64kNsxIetb7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2014-09-07  3:01               ` Sreedhar Kodali
  0 siblings, 0 replies; 5+ messages in thread
From: Sreedhar Kodali @ 2014-09-07  3:01 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

On 2014-09-06 22:07, Hefty, Sean wrote:
>> That's indeed a good approach.  But, we need to have an upper limit
>> before a batch of events is flushed.  Can we decide limit based on
>> completion event queue size?
> 
> I think that's fine.  We only need to ack before wrapping the 32-bit
> value, plus ensure that everything is acked when closing the rsocket.
> 

Sure.  I will revert back to you with modified patch.

> But, does this change really make any difference?  At the point where
> ack is called, there's no contention, so the overhead seems small
> considering that we in a part of the code where we likely just blocked
> the thread.

Yes.  It indeed makes a small improvement when all cores are fully
saturated in a scale-up environments.

Thank You.

- Sreedhar

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-09-07  3:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-05 13:23 [PATCH v2 4/4] rsockets: acknowledge completion queue events in batch Sreedhar Kodali
     [not found] ` <c144abfe27bd0bf36ff1e3c9c912d7c3-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
2014-09-05 18:01   ` Hefty, Sean
     [not found]     ` <1828884A29C6694DAF28B7E6B8A8237399DC1B64-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-06  2:22       ` Sreedhar Kodali
     [not found]         ` <d52658b2f68bc014103e655857096b4a-FJGp5E75HVmZamtmwQBW5tBPR1lH4CV8@public.gmane.org>
2014-09-06 16:37           ` Hefty, Sean
     [not found]             ` <1828884A29C6694DAF28B7E6B8A8237399DC36C7-Q3cL8pyY+6s64kNsxIetb7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-07  3:01               ` Sreedhar Kodali

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox