public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: Erich Focht @ 2002-01-31 10:45 UTC
  To: linux-ia64

> 
> Is it really necessary to flush all TLBs when doing a context switch on
> one?

I think the global TLB flush is not necessary at all when the mmu_context
counter wraps around (in wrap_mmu_context). Please tell me if I'm wrong...

I commented it out, and the lockups with Ingo's scheduler went away.

Regards,
Erich




* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: David Mosberger @ 2002-07-11 21:50 UTC
  To: linux-ia64

>>>>> On Wed, 10 Jul 2002 19:27:20 +0200, Erich Focht <efocht@ess.nec.de> said:

  Erich> Hi David, the attached patch fixes the switch_mm ->
  Erich> wrap_mmu_context deadlock which can occur in 2.5.18+ia64. We
  Erich> discussed this in January/February...

The patch basically looks fine to me and I put it in my 2.5 tree with
a few small changes to take advantage of the new per-CPU scheme
(avoids clutter in asm/processor.h).  Also, I think it's sufficient to
call delayed_flush_tlb() in get_new_mmu_context(), as we just need to
make sure that the TLB does get flushed before.  Or did this not work
for some reason?

	--david



* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: Erich Focht @ 2002-07-12 16:47 UTC
  To: linux-ia64

> Also, I think it's sufficient to
> call delayed_flush_tlb() in get_new_mmu_context(), as we just need to
> make sure that the TLB does get flushed before.  Or did this not work
> for some reason?

But then other processors would continue using the old TLBs maybe of
tasks which died. The problem is that we need to do a TLB flush on all
CPUs when wrapping around the context. Otherwise we might reuse the context
number of a recently exited process and on other CPUs some of its TLB entries
may survive until the new task switches CPUs. Hard to debug, probably...

Regards,
Erich





* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: David Mosberger @ 2002-07-12 17:02 UTC
  To: linux-ia64

>>>>> On Fri, 12 Jul 2002 18:47:58 +0200, Erich Focht <efocht@ess.nec.de> said:

  Erich> But then other processors would continue using the old TLBs
  Erich> maybe of tasks which died. The problem is that we need to do
  Erich> a TLB flush on all CPUs when wrapping around the
  Erich> context. Otherwise we might reuse the context number of a
  Erich> recently exited process and on other CPUs some of its TLB
  Erich> entries may survive until the new task switches CPUs. Hard to
  Erich> debug, probably...

But the only time you might reuse an old context number is when you
allocate one.  Yes, you will have old entries hanging around in the
TLB for a while, but you _know_ that the corresponding tasks already
died and you also know that you'll flush the TLB before re-using one
of those context numbers.

	--david
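
For illustration, a minimal user-space sketch of the allocation path being
discussed. This is not the ia64 kernel code: the tiny context space
(MAX_CTX), the mm table, and the names get_new_ctx() and ctx_in_use() are
all assumptions made up for the example. It only shows that a context
number can be handed out a second time solely inside the allocator, after
the counter has wrapped, and that the wrap is therefore the point where a
TLB flush has to be arranged.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CTX 4   /* hypothetical, tiny on purpose so wrap-around is easy to hit */
#define MAX_MM  8

/* Toy address space; ctx == 0 means "no context number assigned yet". */
struct mm {
    bool     live;
    unsigned ctx;
};

static struct mm mms[MAX_MM];
static unsigned next_ctx = 1;   /* global counter shared by all CPUs */
static unsigned wraps;          /* number of wrap-arounds seen so far */

/* Is this context number still owned by a live address space? */
static bool ctx_in_use(unsigned ctx)
{
    for (int i = 0; i < MAX_MM; i++)
        if (mms[i].live && mms[i].ctx == ctx)
            return true;
    return false;
}

/*
 * Hand out the next free context number.  Numbers of exited address
 * spaces become reusable only after the counter wraps.
 * Returns 0 if every number is held by a live address space.
 */
static unsigned get_new_ctx(struct mm *mm)
{
    assert(mm);
    for (unsigned tries = 0; tries < MAX_CTX; tries++) {
        if (next_ctx > MAX_CTX) {
            next_ctx = 1;       /* wrap-around: stale TLB entries may now match */
            wraps++;
        }
        unsigned c = next_ctx++;
        if (!ctx_in_use(c)) {
            mm->live = true;
            mm->ctx = c;
            return c;
        }
    }
    return 0;
}
```

With four address spaces allocated and the second one exited, the next
allocation wraps and hands out the exited one's number again.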



* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: Erich Focht @ 2002-07-12 17:26 UTC
  To: linux-ia64

> But the only time you might reuse an old context number is when you
> allocate one.  Yes, you will have old entries hanging around in the
> TLB for a while, but you _know_ that the corresponding tasks already
> died and you also know that you'll flush the TLB before re-using one
> of those context numbers.

Context numbers are global, used by all CPUs in common. Suppose
task 123 is newly created and we want a context number for it. Unfortunately
it's time to wrap around context numbers and we reuse the context number
of task 100 which died a short while ago and left over TLB entries on all
CPUs of the machine.

get_new_mmu_context is called on the CPU where task 123 is scheduled first.
We can flush TLB there. But on the other CPUs the stale entries of task 100
survive. If no new context is needed before task 123 gets migrated to
another CPU, it might reuse TLB entries of task 100, which are wrong.
I'm not worried about taking an old context number but of using old TLB
entries when my newly created task with reused context number switches
CPUs. flush_tlb_all() was flushing all TLB entries by sending an IPI but
this can lead to a deadlock... Do I misunderstand something?

Regards,
Erich
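
The scenario above can be replayed in a small user-space simulation. This
is only a sketch under stated assumptions: the two-CPU toy TLB, the context
number 7, and every function name here are hypothetical and not taken from
the kernel. Task 100 leaves entries tagged with context 7 on both CPUs,
task 123 is then given context 7 on CPU 0 and later migrates to CPU 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS   2
#define TLB_SLOTS 4

/* Toy TLB entry, tagged with the context number that installed it. */
struct tlb_entry {
    bool     valid;
    unsigned ctx;    /* context number (address-space tag) */
    unsigned task;   /* task whose translation this really is */
};

static struct tlb_entry tlb[NR_CPUS][TLB_SLOTS];

static void tlb_insert(int cpu, int slot, unsigned ctx, unsigned task)
{
    assert(cpu >= 0 && cpu < NR_CPUS && slot >= 0 && slot < TLB_SLOTS);
    tlb[cpu][slot] = (struct tlb_entry){ .valid = true, .ctx = ctx, .task = task };
}

/* Drop every entry of one CPU's TLB. */
static void local_flush_tlb(int cpu)
{
    memset(tlb[cpu], 0, sizeof tlb[cpu]);
}

/* Return the task owning the first entry that matches ctx, or 0 on a miss. */
static unsigned tlb_lookup(int cpu, unsigned ctx)
{
    for (int s = 0; s < TLB_SLOTS; s++)
        if (tlb[cpu][s].valid && tlb[cpu][s].ctx == ctx)
            return tlb[cpu][s].task;
    return 0;
}

/*
 * Task 100 ran with context 7 on both CPUs and exited.  After wrap-around,
 * task 123 gets context 7 on CPU 0 and later migrates to CPU 1.
 * Returns the owner of the entry that task 123 hits on CPU 1 (0 = miss).
 */
static unsigned hit_after_migration(bool flush_all_cpus)
{
    memset(tlb, 0, sizeof tlb);
    tlb_insert(0, 0, 7, 100);
    tlb_insert(1, 0, 7, 100);   /* task 100 exits; its entries stay behind */

    local_flush_tlb(0);         /* flush on the CPU that allocates the context */
    if (flush_all_cpus)
        local_flush_tlb(1);     /* what a flush on all CPUs would add */

    return tlb_lookup(1, 7);    /* task 123, context 7, now running on CPU 1 */
}
```

Flushing only the allocating CPU lets task 123 hit a translation that
belongs to task 100; flushing every CPU turns the lookup into a miss.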
 



* [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: David Mosberger @ 2002-07-12 17:37 UTC
  To: linux-ia64

>>>>> On Fri, 12 Jul 2002 19:26:07 +0200, Erich Focht <efocht@ess.nec.de> said:

  Erich> get_new_mmu_context is called on the CPU where task 123 is
  Erich> scheduled first.  We can flush TLB there. But on the other
  Erich> CPUs the stale entries of task 100 survive. If no new context
  Erich> is needed before task 123 gets migrated to another CPU, it
  Erich> might reuse TLB entries of task 100, which are wrong.  I'm
  Erich> not worried about taking an old context number but of using
  Erich> old TLB entries when my newly created task with reused
  Erich> context number switches CPUs. flush_tlb_all() was flushing
  Erich> all TLB entries by sending an IPI but this can lead to a
  Erich> deadlock... Do I misunderstand something?

Ah, I see now what you mean.  Yes, you're right it could happen that
we allocate a context number for a task on one CPU, task gets
rescheduled, and then we have to flush the TLB when activating the
context number.

	--david



* Re: [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: Grant Grundler @ 2002-07-12 18:02 UTC
  To: linux-ia64

David Mosberger wrote:
> Ah, I see now what you mean.  Yes, you're right it could happen that
> we allocate a context number for a task on one CPU, task gets
> rescheduled, and then we have to flush the TLB when activating the
> context number.

David,
Would "lazy TLB flushing" be possible?
ie flush TLB of given taskid first time task needs to run on a CPU.
Deal with the TLB flushing when running the task,
not when creating or rescheduling to a different CPU.

I'm thinking broadcasting TLB flushes (or IPI for that effect) is, uhm,
less than optimal for scalability. But I'm no expert on CPU TLB issues.
There might be lots of evils with this approach. I need to read
the section on process creation/scheduling in the IA64 Linux book.
(ie you don't need to explain what the book already says).

thanks,
grant



* Re: [Linux-ia64] Re: switch_mm race condition with Ingo's scheduler
From: David Mosberger @ 2002-07-12 18:47 UTC
  To: linux-ia64

>>>>> On Fri, 12 Jul 2002 11:02:25 -0700, Grant Grundler <grundler@cup.hp.com> said:

  Grant> Would "lazy TLB flushing" be possible?
  Grant> ie flush TLB of given taskid first time task needs to run on a CPU.
  Grant> Deal with the TLB flushing when running the task,
  Grant> not when creating or rescheduling to a different CPU.

There is no good way to distinguish between "running a task for the
first time on a CPU" and "rescheduling".  It could be done, but I
don't think it's worth the complexity.  The overhead of checking for a
delayed flush is just one load from a CPU-local variable and a test.
That's in the noise compared to a context-switch.

  Grant> I'm thinking broadcasting TLB flushes (or IPI for that
  Grant> effect) is, uhm, less than optimal for scalability. But I'm
  Grant> no expert on CPU TLB issues.  There might be lots of evils
  Grant> with this approach. I need to read the section on process
  Grant> creation/scheduling in the IA64 Linux book.  (ie you don't
  Grant> need to explain what the book already says).

With Erich's patch, there are no IPIs on wrap-around.  That was the
whole point of the patch (to avoid deadlock conditions).

	--david
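
A rough sketch of the delayed-flush idea described above, written as a
user-space model rather than kernel code. The array need_tlb_flush[] stands
in for a per-CPU variable, and wrap_ctx(), delayed_flush_check() and the
flush counter are hypothetical names invented for the example; the actual
patch may differ in detail, and a real implementation also has to deal
with locking and memory ordering, which this ignores. On wrap-around the
wrapping CPU flushes itself and only marks the others, so no IPI is sent;
each other CPU pays one load and one test on its next context switch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool     need_tlb_flush[NR_CPUS];  /* stand-in for a per-CPU flag */
static unsigned local_flushes[NR_CPUS];   /* counts flushes, for inspection only */

/* Model of a purely local TLB flush: no other CPU is involved. */
static void local_flush_tlb(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    local_flushes[cpu]++;
}

/*
 * Called on the CPU that wraps the context counter: flush locally right
 * away and leave a note for every other CPU instead of interrupting it.
 */
static void wrap_ctx(int this_cpu)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu == this_cpu)
            local_flush_tlb(cpu);
        else
            need_tlb_flush[cpu] = true;
    }
}

/*
 * Called on every context switch, before the new context is activated.
 * In the common case this is one load and one test.
 */
static void delayed_flush_check(int cpu)
{
    if (need_tlb_flush[cpu]) {
        local_flush_tlb(cpu);
        need_tlb_flush[cpu] = false;
    }
}
```

After wrap_ctx(0), CPU 0 has flushed once and CPUs 1 to 3 are merely
marked; the first delayed_flush_check() on each of them performs the flush
and later calls do nothing.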


