public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ibacm: acmp retry issues
@ 2017-11-10 22:28 Michael J. Ruhl
       [not found] ` <20171110222658.10387.16845.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Michael J. Ruhl @ 2017-11-10 22:28 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

While testing the retry mechanism for the acmp provider, I observed
that the retry event_wait() did not appear to be working correctly.
After studying the issue a bit more, I discovered that there were a
couple of issues.

This patch set addresses those issues.

---

Michael J. Ruhl (3):
      ibacm: Fix an incorrect expiration check for the retry timer
      ibacm: Calculate correct tv_nsec value in event_wait()
      ibacm: Fix a retry loop calculation race condition


 ibacm/linux/osd.h          |    5 +++++
 ibacm/prov/acmp/src/acmp.c |   12 +++++++-----
 2 files changed, 12 insertions(+), 5 deletions(-)

-- 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] ibacm: Fix an incorrect expiration check for the retry timer
       [not found] ` <20171110222658.10387.16845.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
@ 2017-11-10 22:28   ` Michael J. Ruhl
  2017-11-10 22:29   ` [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait() Michael J. Ruhl
  2017-11-10 22:29   ` [PATCH 3/3] ibacm: Fix a retry loop calculation race condition Michael J. Ruhl
  2 siblings, 0 replies; 6+ messages in thread
From: Michael J. Ruhl @ 2017-11-10 22:28 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The acmp_process_wait_queue() checks to see if a message expiration
time has passed.

Because the check is for less than (<), if the timeout expires matches
the current time, the check will result in a timeout value of 0, and
the wait loop will spin until the next millisecond has passed.

Using example values to demonstrate the issue, we can see:

With '<':
wait = -2106577636 (no work)
wait = 2510        (message wait)
(process spins)
wait = 0           (expires - current time == 0)
wait = 0
wait = 0
...                (1 ms of output)
wait = 0
wait = -2106580147 (retry complete)
wait = 2512

With '<=':
wait = -2106688780 (no work)
wait = 2512        ( message wait)
(process sleeps)
wait = -2106691293 (retry complete)
wait = 2512

Expire the message if the expires is less than or equal.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 ibacm/prov/acmp/src/acmp.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c
index 78d9a29..d707b8e 100644
--- a/ibacm/prov/acmp/src/acmp.c
+++ b/ibacm/prov/acmp/src/acmp.c
@@ -1507,7 +1507,7 @@ static void acmp_process_wait_queue(struct acmp_ep *ep, uint64_t *next_expire)
 	struct ibv_send_wr *bad_wr;
 
 	list_for_each_safe(&ep->wait_queue, msg, next, entry) {
-		if (msg->expires < time_stamp_ms()) {
+		if (msg->expires <= time_stamp_ms()) {
 			list_del(&msg->entry);
 			(void) atomic_dec(&wait_cnt);
 			if (--msg->tries) {

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait()
       [not found] ` <20171110222658.10387.16845.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
  2017-11-10 22:28   ` [PATCH 1/3] ibacm: Fix an incorrect expiration check for the retry timer Michael J. Ruhl
@ 2017-11-10 22:29   ` Michael J. Ruhl
       [not found]     ` <20171110222853.10387.81730.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
  2017-11-10 22:29   ` [PATCH 3/3] ibacm: Fix a retry loop calculation race condition Michael J. Ruhl
  2 siblings, 1 reply; 6+ messages in thread
From: Michael J. Ruhl @ 2017-11-10 22:29 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The event_wait() function calculates a tv_nsec value based on the
given timeout.  If the tv_nsec value calculation ends ups larger
than 1 second, the pthread_cond_timedwait() will return EINVAL,
and will not wait.

This causes the retry loop to spin (busy wait) until the actual
timeout occurs.

Ensure that the tv_nsec value is less than 1 second.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 ibacm/linux/osd.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/ibacm/linux/osd.h b/ibacm/linux/osd.h
index 95713e6..d6cbb8f 100644
--- a/ibacm/linux/osd.h
+++ b/ibacm/linux/osd.h
@@ -92,6 +92,7 @@ static inline void event_init(event_t *e)
 	pthread_mutex_init(&e->mutex, NULL);
 }
 #define event_signal(e)	pthread_cond_signal(&(e)->cond)
+#define ONE_SEC_IN_NSEC  1000000000
 static inline int event_wait(event_t *e, int timeout)
 {
 	struct timeval curtime;
@@ -101,6 +102,10 @@ static inline int event_wait(event_t *e, int timeout)
 	gettimeofday(&curtime, NULL);
 	wait.tv_sec = curtime.tv_sec + ((unsigned) timeout) / 1000;
 	wait.tv_nsec = (curtime.tv_usec + (((unsigned) timeout) % 1000) * 1000) * 1000;
+	if (wait.tv_nsec > ONE_SEC_IN_NSEC) {
+		wait.tv_sec++;
+		wait.tv_nsec -= ONE_SEC_IN_NSEC;
+	}
 	pthread_mutex_lock(&e->mutex);
 	ret = pthread_cond_timedwait(&e->cond, &e->mutex, &wait);
 	pthread_mutex_unlock(&e->mutex);

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] ibacm: Fix a retry loop calculation race condition
       [not found] ` <20171110222658.10387.16845.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
  2017-11-10 22:28   ` [PATCH 1/3] ibacm: Fix an incorrect expiration check for the retry timer Michael J. Ruhl
  2017-11-10 22:29   ` [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait() Michael J. Ruhl
@ 2017-11-10 22:29   ` Michael J. Ruhl
  2 siblings, 0 replies; 6+ messages in thread
From: Michael J. Ruhl @ 2017-11-10 22:29 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

From: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The retry loop calculation uses a conversion to int of an unsigned
64 bit number (next_expire) minus the current time to decide if
event_wait() should be called.  This calculation works correctly as
long as the next_expire value is not the default value (-1).

If the next_expire is the default value, periodically this subtraction
can result in a very large postive timeout value (days rather than
milliseconds).

For example:
next_expire  = 0xFFFFFFFFFFFFFFFF  (-1)
current_ms = 0x15f7db52146  (today's ms since 1970)

max_delay_ms = (int) next_expire - future_ms

future_ms  = 0x15f80000000  = max_delay_ms 2147483647
future_ms  = 0x16080000000  = max_delay_ms 2147483647

Converting max_delay_ms to days:
2147483647 / 1000 / 60 / 60 / 24 == 24 days

0xxx180000000 - 0xxx080000000 = 4294967296

every 48 days, this issue repeats

This calculation can occur if a wait_cnt is incremented and a message
expiration is handled so that next_expire is not updated.  If wait_cnt
is incremented before the wait calculation is done (the race condition),
event_wait() can be called with the potentially very large value.

If next_expire is not updated, do not do the wait calculation and
avoid the race condition.

Reported-by: Morys Grzegorz <grzegorz.morys-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 ibacm/prov/acmp/src/acmp.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/ibacm/prov/acmp/src/acmp.c b/ibacm/prov/acmp/src/acmp.c
index d707b8e..884fc48 100644
--- a/ibacm/prov/acmp/src/acmp.c
+++ b/ibacm/prov/acmp/src/acmp.c
@@ -1579,10 +1579,12 @@ static void *acmp_retry_handler(void *context)
 		pthread_mutex_unlock(&acmp_dev_lock);
 
 		acmp_process_timeouts();
-		wait = (int) (next_expire - time_stamp_ms());
-		if (wait > 0 && atomic_get(&wait_cnt)) {
-			pthread_testcancel();
-			event_wait(&timeout_event, wait);
+		if (next_expire != -1) {
+			wait = (int) (next_expire - time_stamp_ms());
+			if (wait > 0 && atomic_get(&wait_cnt)) {
+				pthread_testcancel();
+				event_wait(&timeout_event, wait);
+			}
 		}
 	}
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait()
       [not found]     ` <20171110222853.10387.81730.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
@ 2017-11-11 16:26       ` Jason Gunthorpe
       [not found]         ` <20171111162610.GL17451-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Gunthorpe @ 2017-11-11 16:26 UTC (permalink / raw)
  To: Michael J. Ruhl; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Fri, Nov 10, 2017 at 05:29:02PM -0500, Michael J. Ruhl wrote:
>  	gettimeofday(&curtime, NULL);

Ugh, the use of gettimeofday for this kind of stuff is also a serious bug.

This needs to use clock_gettime(CLOCK_MONOTONIC) and
pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait()
       [not found]         ` <20171111162610.GL17451-uk2M96/98Pc@public.gmane.org>
@ 2017-11-13 13:30           ` Ruhl, Michael J
  0 siblings, 0 replies; 6+ messages in thread
From: Ruhl, Michael J @ 2017-11-13 13:30 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg-uk2M96/98Pc@public.gmane.org]
> Sent: Saturday, November 11, 2017 11:26 AM
> To: Ruhl, Michael J <michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: Re: [PATCH 2/3] ibacm: Calculate correct tv_nsec value in
> event_wait()
> 
> On Fri, Nov 10, 2017 at 05:29:02PM -0500, Michael J. Ruhl wrote:
> >  	gettimeofday(&curtime, NULL);
> 
> Ugh, the use of gettimeofday for this kind of stuff is also a serious bug.
> 
> This needs to use clock_gettime(CLOCK_MONOTONIC) and
> pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);

Yeah, that one is a lot cleaner.  I will get a new patch out shortly.

Thanks,

Mike
 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-11-13 13:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-10 22:28 [PATCH 0/3] ibacm: acmp retry issues Michael J. Ruhl
     [not found] ` <20171110222658.10387.16845.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
2017-11-10 22:28   ` [PATCH 1/3] ibacm: Fix an incorrect expiration check for the retry timer Michael J. Ruhl
2017-11-10 22:29   ` [PATCH 2/3] ibacm: Calculate correct tv_nsec value in event_wait() Michael J. Ruhl
     [not found]     ` <20171110222853.10387.81730.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
2017-11-11 16:26       ` Jason Gunthorpe
     [not found]         ` <20171111162610.GL17451-uk2M96/98Pc@public.gmane.org>
2017-11-13 13:30           ` Ruhl, Michael J
2017-11-10 22:29   ` [PATCH 3/3] ibacm: Fix a retry loop calculation race condition Michael J. Ruhl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox