From: Rusty Russell <rusty@rustcorp.com.au>
To: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Zachary Amsden <zach@vmware.com>
Subject: Re: [PATCH] stopmachine: add stopmachine_timeout
Date: Tue, 15 Jul 2008 11:14:58 +1000
Message-ID: <200807151114.59562.rusty@rustcorp.com.au>
In-Reply-To: <20080714212026.GA6705@osiris.boeblingen.de.ibm.com>

On Tuesday 15 July 2008 07:20:26 Heiko Carstens wrote:
> On Mon, Jul 14, 2008 at 11:56:18AM -0700, Jeremy Fitzhardinge wrote:
> > Rusty Russell wrote:
> > > On Monday 14 July 2008 21:51:25 Christian Borntraeger wrote:
> > >> On Monday, 14 July 2008, Hidetoshi Seto wrote:
> > >>> +	/* Wait all others come to life */
> > >>> +	while (cpus_weight(prepared_cpus) != num_online_cpus() - 1) {
> > >>> +		if (time_is_before_jiffies(limit))
> > >>> +			goto timeout;
> > >>> +		cpu_relax();
> > >>> +	}
> > >>> +
> > >>
> > >> Hmm. I think this could become interesting on virtual machines. The
> > >> hypervisor might be too busy to schedule a specific cpu at certain load
> > >> scenarios. This would cause a failure even if the cpu is not really
> > >> locked up. We had similar problems with the soft lockup daemon on
> > >> s390.
> > >
> > > 5 seconds is a fairly long time.  If all else fails we could have a
> > > config option to simply disable this code.
>
> Hmm.. probably a stupid question: but what could happen that a real cpu
> (not virtual) becomes unresponsive so that it won't schedule a
> MAX_RT_PRIO-1 prioritized task for 5 seconds?

Yes.  That's exactly what we're trying to detect.  Currently the entire 
machine will wedge.  With this patch we can often limp along.

Hidetoshi's original problem was a client whose machine had one CPU die, then 
got wedged as the emergency backup tried to load a module.

Along these lines, I found VMware's relaxed co-scheduling interesting, BTW:
http://communities.vmware.com/docs/DOC-4960

> cpu_relax() translates to a hypervisor yield on s390. Probably makes sense
> if other architectures would do the same.

Yes, I think so too.  Actually, doing a random yield-to-other-VCPU on 
cpu_relax() is arguably the right semantic (in Linux it's used for spinning, 
almost exclusively to wait for other cpus).
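
To make the shape of that wait concrete, here is a rough userspace sketch of
a spin loop that gives up after a deadline and yields on every pass.  It is
not the patch and not kernel code: relax_yield(), now_ms() and
wait_for_arrivals() are names made up for illustration, sched_yield() merely
stands in for a hypervisor yield, and a monotonic clock stands in for
jiffies.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for cpu_relax(): hand the CPU to someone else. */
static inline void relax_yield(void)
{
	sched_yield();
}

/* Monotonic time in milliseconds (userspace substitute for jiffies). */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Spin until *arrived equals expected, yielding on every iteration.
 * Returns true if the count was reached, false if timeout_ms elapsed
 * first, so the caller can back out instead of spinning forever.
 */
static bool wait_for_arrivals(atomic_int *arrived, int expected,
			      long long timeout_ms)
{
	long long limit = now_ms() + timeout_ms;

	while (atomic_load(arrived) != expected) {
		if (now_ms() > limit)
			return false;	/* someone never showed up */
		relax_yield();
	}
	return true;
}
```

A caller would pass the number of other participants as expected and treat
a false return as "give up and limp along" rather than wedging.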

Cheers,
Rusty.

