From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] sched/mm: add finish_switch_mm function
Date: Wed, 13 Nov 2013 17:05:56 +0100
Message-ID: <20131113170556.7e170e89@mschwide>
In-Reply-To: <20131113121909.GA18837@arm.com>

On Wed, 13 Nov 2013 12:19:09 +0000
Catalin Marinas <catalin.marinas@arm.com> wrote:

> On Wed, Nov 13, 2013 at 11:41:43AM +0000, Peter Zijlstra wrote:
> > On Wed, Nov 13, 2013 at 09:16:13AM +0100, Martin Schwidefsky wrote:
> > > The switch_mm function is called with the task_lock and/or with
> > > request queue lock. Add finish_switch_mm to allow an architecture
> > > to execute some code after the mm has been switched but without
> > > any locks held. One use case is the s390 architecture which will
> > > use this to wait for the completion of TLB flush operations.
> 
> We have similar needs on arm and arm64 (full cache flushing where we
> want interrupts enabled or some IPIs for TLB tagging synchronisation).

On s390 we need to wait for the completion of a TLB flush.

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 1deccd7..89409cb 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -32,7 +32,7 @@
> > >  #include <linux/init.h>
> > >  #include <linux/uaccess.h>
> > >  #include <linux/highmem.h>
> > > -#include <asm/mmu_context.h>
> > > +#include <linux/mmu_context.h>
> > >  #include <linux/interrupt.h>
> > >  #include <linux/capability.h>
> > >  #include <linux/completion.h>
> > > @@ -1996,6 +1996,7 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
> > >  	perf_event_task_sched_in(prev, current);
> > >  	finish_lock_switch(rq, prev);
> > >  	finish_arch_post_lock_switch();
> > > +	finish_switch_mm(current->mm, current);
> 
> This could use the same hook.

Yes.
 
> > >  
> > >  	fire_sched_in_preempt_notifiers(current);
> > >  	if (mm)
> > > @@ -4140,8 +4141,10 @@ void idle_task_exit(void)
> > >  
> > >  	BUG_ON(cpu_online(smp_processor_id()));
> > >  
> > > -	if (mm != &init_mm)
> > > +	if (mm != &init_mm) {
> > >  		switch_mm(mm, &init_mm, current);
> > > +		finish_switch_mm(&init_mm, current);
> > > +	}
> > >  	mmdrop(mm);
> > >  }
> 
> Here finish_switch_mm() is called in the same context with switch_mm().
> What we have on ARM via switch_mm() is to check for irqs_disabled() and
> if yes, defer the actual switching via a flag until the
> finish_arch_post_lock_switch() hook. But on ARM we only cared about the
> interrupts being enabled.

The guarantee s390 needs is that the rq-lock is not held. What I have
seen with the wait loop in switch_mm is a deadlock: CPU #0 was looping
in switch_mm, with its rq-lock held, waiting for the TLB flush of CPU #1.
CPU #1 got an interrupt that tried to wake up a task which happened to
be on the run-queue of CPU #0, so it needed that same rq-lock and never
got around to completing the flush.
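
The ordering that avoids this can be modelled in plain userspace C. The
sketch below is only an illustration, not the s390 implementation: the
model_* names, the rq_lock_held flag and the pending_flushes counter are
all made up for the example, and the "wait" is a simple countdown that
stands in for the remote CPU finishing its flush. The point it shows is
that the switch step only records the new mm while the lock is held, and
the waiting is done by the finish step after the lock has been dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_mm {
	int pending_flushes;	/* flushes other CPUs still have to finish */
};

struct model_cpu {
	bool rq_lock_held;		/* stands in for the run-queue lock */
	struct model_mm *active_mm;	/* mm currently loaded on this CPU */
	int waits_under_lock;		/* waits attempted with the lock held */
};

/* Runs with the lock held: record the new mm, never wait here. */
static void model_switch_mm(struct model_cpu *cpu, struct model_mm *next)
{
	cpu->active_mm = next;
}

/* Runs after the lock is dropped: waiting is safe at this point. */
static void model_finish_switch_mm(struct model_cpu *cpu)
{
	struct model_mm *mm = cpu->active_mm;

	if (cpu->rq_lock_held)
		cpu->waits_under_lock++;	/* this would be the deadlock case */

	while (mm->pending_flushes > 0)
		mm->pending_flushes--;	/* stand-in for the remote flush completing */
}

/* Same ordering as in the patch: switch, drop the lock, then the hook. */
static void model_context_switch(struct model_cpu *cpu, struct model_mm *next)
{
	cpu->rq_lock_held = true;
	model_switch_mm(cpu, next);
	cpu->rq_lock_held = false;	/* corresponds to finish_lock_switch() */
	model_finish_switch_mm(cpu);
}
```

Calling model_context_switch() for an mm that still has pending flushes
leaves waits_under_lock at zero and pending_flushes at zero, because the
wait only ever runs once the lock flag has been cleared.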

> > > diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> > > index 8a8cd02..11b3d47 100644
> > > --- a/mm/mmu_context.c
> > > +++ b/mm/mmu_context.c
> > > @@ -8,8 +8,6 @@
> > >  #include <linux/export.h>
> > >  #include <linux/sched.h>
> > >  
> > > -#include <asm/mmu_context.h>
> > > -
> > >  /*
> > >   * use_mm
> > >   *	Makes the calling kernel thread take on the specified
> > > @@ -31,6 +29,7 @@ void use_mm(struct mm_struct *mm)
> > >  	tsk->mm = mm;
> > >  	switch_mm(active_mm, mm, tsk);
> > >  	task_unlock(tsk);
> > > +	finish_switch_mm(mm, tsk);
> 
> As above, for ARM we only care about interrupts being enabled, so it
> didn't require a hook.
> 
> Is s390 switch_mm() ok with only interrupts being enabled but some locks
> held?

Whether interrupts are on or off is not the problem for s390.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


