All of lore.kernel.org
 help / color / mirror / Atom feed
From: john stultz <johnstul@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] i386 Time: Avoid PIT SMP lockups
Date: Wed, 11 Oct 2006 15:48:31 -0700	[thread overview]
Message-ID: <1160606911.5973.36.camel@localhost.localdomain> (raw)
In-Reply-To: <20061011142646.eb41fac3.akpm@osdl.org>

On Wed, 2006-10-11 at 14:26 -0700, Andrew Morton wrote:
> On Wed, 11 Oct 2006 12:54:21 -0700
> john stultz <johnstul@us.ibm.com> wrote:
> > Andrew: I think this is 2.6.19 material, but probably should go through an -mm or two.
> > 
> > 
> > This patch avoids possible PIT livelock issues seen on SMP systems (and
> > reported by Andi), by not allowing it as a clocksource on SMP boxes.
> > 
> > However, since the PIT may no longer be present, we have to properly
> > handle the cases where SMP systems have TSC skew and fall back from the
> > TSC. Since the PIT isn't there, it would "fall back" to the TSC again.
> > So this changes the jiffies rating to 1, and the TSC-bad rating value to
> > 0.
> > 
> > Thus you will get the following behavior priority on i386 systems:
> > 
> > tsc		[if present & stable]
> > hpet		[if present]
> > cyclone		[if present]
> > acpi_pm		[if present]
> > pit		[if UP]
> > jiffies
> > 
> > Rather then the current more complicated:
> > tsc		[if present & stable]
> > hpet		[if present]
> > cyclone		[if present]
> > acpi_pm		[if present]
> > pit		[if cpus < 4]
> 
> Actually <=4, and that matters: there are a lot of 4-ways.

Yes, you're right.

> 
> > tsc		[if present & unstable]
> > jiffies
> > 
> 
> So this patch has the potential to screw up people who have 2-way or 4-way,
> no hpet/pm-timer and dodgy TSCs.

Indeed. Although I believe the number of TSC screwy SMP boxes w/o an
ACPI PM or HPET is quite small (NUMAQ being the one I'm familiar with,
but it already was using jiffies in multi-node configs).

And by "screw up", it would mean pretty course(tick) granular time.
Otherwise the systems should still function correctly, but I recognize
the loss in granularity could be seen as a regression. The trade off
being that a hang (which would be seen otherwise) isn't acceptable
either.

> Wouldn't it be better to fix the livelock?  What's causing it?

I spent a few days trying to narrow this down, and I haven't been able
to do so to my satisfaction.

At this point, my suspicion is that because the PIT io-read is very slow
(~18us), and done while holding a lock. It would be possible that one
cpu calling gettimeofday would do the following:

grab xtime sequence read lock
grab i8253 spin lock
do port io (very slow)
release i8253 spin lock
realize xtime has been grabed and repeat

While another cpu does the following after in a timer interrupt:
Grabs xtime sequence write lock
spins trying to grab i8253 spin lock

Assuming the first thread can reacquire the i8253 lock before the
second, you could have both threads potentially spinning forever.

I've not yet been able to prove this. alt-sysrq-t usually doesn't have
*any* threads in gettimeofday/timer_interrupt, but removing the
expensive port io in pit_read() causes the hang to go away.

I'm open to digging on this more, but I wanted to get the fix out so
others didn't hit the hang in the meantime.

thanks
-john


  reply	other threads:[~2006-10-11 22:48 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-10-11 19:54 [PATCH] i386 Time: Avoid PIT SMP lockups john stultz
2006-10-11 21:26 ` Andrew Morton
2006-10-11 22:48   ` john stultz [this message]
2006-10-11 23:03     ` Andrew Morton
2006-10-16 13:48       ` Andi Kleen
2006-10-16 18:39         ` Andrew Morton
2006-10-16 18:42           ` Andi Kleen
2006-10-16 19:09             ` john stultz
2006-10-18 16:16   ` Maciej W. Rozycki
2006-10-18 16:27     ` Andi Kleen
2006-10-18 16:45       ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1160606911.5973.36.camel@localhost.localdomain \
    --to=johnstul@us.ibm.com \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.