From: mingo@elte.hu (Ingo Molnar)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG] "sched: Remove rq->lock from the first half of ttwu()" locks up on ARM
Date: Fri, 27 May 2011 14:06:29 +0200
Message-ID: <20110527120629.GA32617@elte.hu>
In-Reply-To: <BANLkTinUZ7EwN_nBCi_RQ9u8-LBcr_A74g@mail.gmail.com>


* Catalin Marinas <catalin.marinas@arm.com> wrote:

> > How much time does that take on contemporary ARM hardware, 
> > typically (and worst-case)?
> 
> On newer ARMv6 and ARMv7 hardware, we no longer flush the caches at 
> context switch as we got VIPT (or PIPT-like) caches.
> 
> But modern ARM processors use something called ASID to tag the TLB 
> entries and we are limited to 256. The switch_mm() code checks for 
> whether we ran out of them to restart the counting. This ASID 
> roll-over event needs to be broadcast to the other CPUs and issuing 
> IPIs with the IRQs disabled isn't always safe. Of course, we could 
> briefly re-enable them at the ASID roll-over time but I'm not sure 
> what the expectations of the code calling switch_mm() are.
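
[Illustration: a minimal user-space sketch of the ASID scheme described in
the quoted text, assuming an 8-bit tag with a generation counter kept in
the upper bits. The names assign_asid(), flush_all_tlbs() and struct mm
are hypothetical and are not the kernel's actual code; the broadcast to
other CPUs is reduced to a counter.]

```c
#include <assert.h>
#include <stdint.h>

#define ASID_BITS 8
#define ASID_MASK ((1u << ASID_BITS) - 1)   /* 256 hardware tags */

/* Hypothetical per-address-space state: generation in the upper
 * bits, hardware ASID in the low 8 bits. 0 means "never assigned". */
struct mm {
    uint32_t context_id;
};

/* Last value handed out; starts at generation 1, ASID 0 (reserved). */
static uint32_t last_id = 1u << ASID_BITS;

/* Stand-in for the roll-over broadcast plus TLB flush on every CPU. */
static unsigned int tlb_flushes;

static void flush_all_tlbs(void)
{
    tlb_flushes++;
}

/* Called on the switch_mm() path in this sketch. */
static void assign_asid(struct mm *mm)
{
    /* Same generation: the ASID this mm already holds is still valid. */
    if (((mm->context_id ^ last_id) >> ASID_BITS) == 0)
        return;

    last_id++;
    if ((last_id & ASID_MASK) == 0) {
        /* Ran out of tags: a new generation begins. Skip the
         * reserved ASID 0 and invalidate all stale TLB entries. */
        last_id++;
        flush_all_tlbs();
    }
    mm->context_id = last_id;
}
```

[In this sketch the roll-over branch is the only place that needs to reach
other CPUs, which is why it is the part that conflicts with running under
disabled IRQs.]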

The expectations are to have irqs off (we are holding the runqueue 
lock if !__ARCH_WANT_INTERRUPTS_ON_CTXSW), so that's not workable I 
suspect.

But in theory we could drop the rq lock and restart the scheduler 
task-pick and balancing sequence when the ARM TLB tag rolls over. So 
instead of this fragile and asymmetric method we'd have a 
straightforward retry-in-rare-cases method.

That means some modifications to switch_mm() but should be solvable.
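
[Illustration: a rough user-space sketch of the retry idea above, under
the assumption that switch_mm() could report a roll-over to its caller
instead of handling it with irqs off. Every name here (try_switch_mm(),
handle_asid_rollover(), schedule_once(), struct rq) is a hypothetical
stand-in and not an existing kernel interface; locking is modelled with a
plain flag.]

```c
#include <assert.h>
#include <stdbool.h>

#define NR_ASIDS 256   /* 8-bit ASID space, as described in the thread */

/* Hypothetical stand-in for a runqueue: a lock flag and a task counter. */
struct rq {
    bool locked;
    int next_task;
};

static unsigned int next_asid = 1;
static unsigned int rollovers;

static void rq_lock(struct rq *rq)   { assert(!rq->locked); rq->locked = true; }
static void rq_unlock(struct rq *rq) { assert(rq->locked);  rq->locked = false; }

/* Stand-in for the task-pick and balancing sequence. */
static int pick_next_task(struct rq *rq)
{
    return rq->next_task++;
}

/* Sketch of a switch_mm() that signals 'retry task pickup':
 * returns false when the tag space is exhausted. */
static bool try_switch_mm(void)
{
    if (next_asid == NR_ASIDS)
        return false;
    next_asid++;
    return true;
}

/* Runs with the rq lock dropped, where irqs could be enabled
 * and the roll-over could be broadcast safely. */
static void handle_asid_rollover(struct rq *rq)
{
    assert(!rq->locked);
    next_asid = 1;
    rollovers++;
}

/* Returns the task that was finally switched to. */
static int schedule_once(struct rq *rq)
{
    for (;;) {
        rq_lock(rq);
        int task = pick_next_task(rq);
        if (try_switch_mm()) {
            rq_unlock(rq);
            return task;
        }
        /* Rare case: drop the lock, handle the roll-over,
         * then restart the pick from scratch. */
        rq_unlock(rq);
        handle_asid_rollover(rq);
    }
}
```

[The common path takes the lock once and never retries; only the rare
roll-over pays for a second pick.]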

That would make ARM special only in so far that it's one of the few 
architectures that signal 'retry task pickup' via switch_mm() - it 
would use the stock scheduler otherwise and we could remove 
__ARCH_WANT_INTERRUPTS_ON_CTXSW and perhaps even 
__ARCH_WANT_UNLOCKED_CTXSW altogether.

I'd suggest doing this once modern ARM chips get so widespread that 
you can realistically induce a ~700 usec irqs-off delay on old, 
virtual-cache ARM chips. Old chips would likely use old kernels 
anyway, right?

Thanks,

	Ingo
