public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Roland Dreier <rdreier@cisco.com>
To: linux-kernel@vger.kernel.org, mingo@elte.hu
Cc: Eli Cohen <eli@dev.mellanox.co.il>, general@lists.openfabrics.org
Subject: wait_for_completion_timeout() spurious failure under heavy load?
Date: Thu, 19 Jun 2008 15:04:07 -0700	[thread overview]
Message-ID: <ada63s5w088.fsf@cisco.com> (raw)

It seems that the current implementaton of wait_for_completion_timeout()
has a small problem under very high load for the common pattern:

	if (!wait_for_completion_timeout(&done, timeout))
		/* handle failure */

because the implementation very roughly does (lots of code deleted to
show the basic flow):

	static inline long __sched
	do_wait_for_common(struct completion *x, long timeout, int state)
	{
		if (x->done)
			return timeout;

		do {
			timeout = schedule_timeout(timeout);
	
			if (!timeout)
				return timeout;
	
		} while (!x->done);
	
		return timeout;
	}

so if the system is very busy and x->done is not set when
do_wait_for_common() is entered, it is possible that the first call to
schedule_timeout() returns 0 because the task doing wait_for_completion
doesn't get rescheduled for a long time, even if it is woken up early
enough.  In this case, wait_for_completion_timeout() returns 0 without
even checking x->done again, and the code above falls into its failure
case purely for scheduler reasons, even if the hardware event or
whatever was being waited for happened early enough.

So would it make sense to add an extra test to do_wait_for() in the
timeout case and, say, return 1 if x->done is actually set?  Something
like the patch below?

A quick audit (not exhaustive) of wait_for_completion_timeout() callers
seems to indicate that no one actually cares about the return value in
the success case -- they just test for 0 (timed out) versus non-zero
(wait succeeded).

Thanks,
  Roland

diff --git a/kernel/sched.c b/kernel/sched.c
index eaf6751..3d04ec1 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4405,7 +4405,12 @@ do_wait_for_common(struct completion *x, long timeout, int state)
 			spin_lock_irq(&x->wait.lock);
 			if (!timeout) {
 				__remove_wait_queue(&x->wait, &wait);
-				return timeout;
+				if (x->done) {
+					x->done--;
+					return 1;
+				} else {
+					return 0;
+				}
 			}
 		} while (!x->done);
 		__remove_wait_queue(&x->wait, &wait);

             reply	other threads:[~2008-06-19 22:04 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-19 22:04 Roland Dreier [this message]
2008-06-20  6:40 ` wait_for_completion_timeout() spurious failure under heavy load? Jiri Slaby
2008-06-20 11:20   ` Ingo Molnar
2008-06-20 11:30     ` Peter Zijlstra
2008-06-20 14:14     ` Oleg Nesterov
2008-06-20 14:32       ` Oleg Nesterov
2008-06-20 15:21         ` Ingo Molnar
2008-06-20 14:33     ` Roland Dreier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ada63s5w088.fsf@cisco.com \
    --to=rdreier@cisco.com \
    --cc=eli@dev.mellanox.co.il \
    --cc=general@lists.openfabrics.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox