All of lore.kernel.org
 help / color / mirror / Atom feed
From: George Dunlap <george.dunlap@citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>,
	xen-devel@lists.xenproject.org
Cc: Varun.Swara@arm.com, Julien Grall <julien.grall@arm.com>,
	Steve Capper <Steve.Capper@arm.com>,
	Wei Liu <wei.liu2@citrix.com>
Subject: Re: [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata.
Date: Wed, 4 May 2016 16:25:03 +0100	[thread overview]
Message-ID: <572A144F.10903@citrix.com> (raw)
In-Reply-To: <146231200265.25631.11092489140094592368.stgit@Solace.fritz.box>

On 03/05/16 22:46, Dario Faggioli wrote:
> commit 64269d9365 "sched: implement .init_pdata in Credit,
> Credit2 and RTDS" helped fixing Credit2 runqueues, and
> the races we had in switching scheduler for pCPUs, but
> introduced another issue. In fact, if CPU bringup fails
> during __cpu_up() (and, more precisely, after CPU_UP_PREPARE,
> but before CPU_STARTING) the CPU_UP_CANCELED notifier
> would be executed, which calls the free_pdata hook.
> 
> Such hook does, right now, two things: (1) undo the
> initialization done inside the init_pdata hook and (2)
> free the memory allocated by the alloc_pdata hook.
> 
> However, in the failure path just described, it is possible
> that only alloc_pdata were called, and this is potentially
> an issue (depending on how actually free_pdata does).
> 
> In fact, for Credit1 (the only scheduler that actually
> allocates per-pCPU data), this result in calling kill_timer()
> on a timer that had not yet been initialized, which causes
> the following:
> 
> (XEN) Xen call trace:
> (XEN)    [<000000000022e304>] timer.c#active_timer+0x8/0x24 (PC)
> (XEN)    [<000000000022f624>] kill_timer+0x108/0x2e0 (LR)
> (XEN)    [<00000000002208c0>] sched_credit.c#csched_free_pdata+0xd8/0x114
> (XEN)    [<0000000000227a18>] schedule.c#cpu_schedule_callback+0xc0/0x12c
> (XEN)    [<0000000000219944>] notifier_call_chain+0x78/0x9c
> (XEN)    [<00000000002015fc>] cpu_up+0x104/0x130
> (XEN)    [<000000000028f7c0>] start_xen+0xaf8/0xce0
> (XEN)    [<00000000810021d8>] 00000000810021d8
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'timer->status >= TIMER_STATUS_inactive' failed at timer.c:279
> (XEN) ****************************************
> 
> Solve this by making the scheduler hooks API symmetric again,
> i.e., by adding a deinit_pdata hook and making it responsible
> of undoing what init_pdata did, rather than asking to free_pdata
> to do everything.
> 
> This is cleaner and, in the case at hand, makes it possible to
> only call free_pdata (which is the right thing to do) as only
> allocation and no initialization was performed.
> 
> Reported-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Looks good, thanks!

Reviewed-by: George Dunlap <george.dunlap@citrix.com>


> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Varun.Swara@arm.com
> Cc: Steve Capper <Steve.Capper@arm.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_credit.c  |   31 +++++++++++++++++++++++++++----
>  xen/common/sched_credit2.c |   16 +++++++++++++---
>  xen/common/schedule.c      |   38 +++++++++++++++++++++++++++++++++++++-
>  xen/include/xen/sched-if.h |    1 +
>  4 files changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index bc36837..db4d42a 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -485,11 +485,35 @@ static void
>  csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
>      struct csched_private *prv = CSCHED_PRIV(ops);
> +
> +    /*
> +     * pcpu either points to a valid struct csched_pcpu, or is NULL, if we're
> +     * beeing called from CPU_UP_CANCELLED, because bringing up a pCPU failed
> +     * very early. xfree() does not really mind, but we want to be sure that,
> +     * when we get here, either init_pdata has never been called, or
> +     * deinit_pdata has been called already.
> +     */
> +    ASSERT(!cpumask_test_cpu(cpu, prv->cpus));
> +
> +    xfree(pcpu);
> +}
> +
> +static void
> +csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> +{
> +    struct csched_private *prv = CSCHED_PRIV(ops);
>      struct csched_pcpu *spc = pcpu;
>      unsigned long flags;
>  
> -    if ( spc == NULL )
> -        return;
> +    /*
> +     * Scheduler specific data for this pCPU must still be there and and be
> +     * valid. In fact, if we are here:
> +     *  1. alloc_pdata must have been called for this cpu, and free_pdata
> +     *     must not have been called on it before us,
> +     *  2. init_pdata must have been called on this cpu, and deinit_pdata
> +     *     (us!) must not have been called on it already.
> +     */
> +    ASSERT(spc && cpumask_test_cpu(cpu, prv->cpus));
>  
>      spin_lock_irqsave(&prv->lock, flags);
>  
> @@ -507,8 +531,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>          kill_timer(&prv->master_ticker);
>  
>      spin_unlock_irqrestore(&prv->lock, flags);
> -
> -    xfree(spc);
>  }
>  
>  static void *
> @@ -2091,6 +2113,7 @@ static const struct scheduler sched_credit_def = {
>      .free_vdata     = csched_free_vdata,
>      .alloc_pdata    = csched_alloc_pdata,
>      .init_pdata     = csched_init_pdata,
> +    .deinit_pdata   = csched_deinit_pdata,
>      .free_pdata     = csched_free_pdata,
>      .switch_sched   = csched_switch_sched,
>      .alloc_domdata  = csched_alloc_domdata,
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 50c1b9d..803cc44 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2201,6 +2201,12 @@ csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>      unsigned long flags;
>      unsigned rqi;
>  
> +    /*
> +     * pdata contains what alloc_pdata returned. But since we don't (need to)
> +     * implement alloc_pdata, either that's NULL, or something is very wrong!
> +     */
> +    ASSERT(!pdata);
> +
>      spin_lock_irqsave(&prv->lock, flags);
>      old_lock = pcpu_schedule_lock(cpu);
>  
> @@ -2261,7 +2267,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>  }
>  
>  static void
> -csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> +csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
>      unsigned long flags;
>      struct csched2_private *prv = CSCHED2_PRIV(ops);
> @@ -2270,7 +2276,11 @@ csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  
>      spin_lock_irqsave(&prv->lock, flags);
>  
> -    ASSERT(cpumask_test_cpu(cpu, &prv->initialized));
> +    /*
> +     * alloc_pdata is not implemented, so pcpu must be NULL. On the other
> +     * hand, init_pdata must have been called for this pCPU.
> +     */
> +    ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized));
>      
>      /* Find the old runqueue and remove this cpu from it */
>      rqi = prv->runq_map[cpu];
> @@ -2387,7 +2397,7 @@ static const struct scheduler sched_credit2_def = {
>      .alloc_vdata    = csched2_alloc_vdata,
>      .free_vdata     = csched2_free_vdata,
>      .init_pdata     = csched2_init_pdata,
> -    .free_pdata     = csched2_free_pdata,
> +    .deinit_pdata   = csched2_deinit_pdata,
>      .switch_sched   = csched2_switch_sched,
>      .alloc_domdata  = csched2_alloc_domdata,
>      .free_domdata   = csched2_free_domdata,
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 5546999..80fea39 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1546,6 +1546,38 @@ static int cpu_schedule_callback(
>      struct schedule_data *sd = &per_cpu(schedule_data, cpu);
>      int rc = 0;
>  
> +    /*
> +     * From the scheduler perspective, bringing up a pCPU requires
> +     * allocating and initializing the per-pCPU scheduler specific data,
> +     * as well as "registering" this pCPU to the scheduler (which may
> +     * involve modifying some scheduler wide data structures).
> +     * This happens by calling the alloc_pdata and init_pdata hooks, in
> +     * this order. A scheduler that does not need to allocate any per-pCPU
> +     * data can avoid implementing alloc_pdata. init_pdata may, however, be
> +     * necessary/useful in this case too (e.g., it can contain the "register
> +     * the pCPU to the scheduler" part). alloc_pdata (if present) is called
> +     * during CPU_UP_PREPARE. init_pdata (if present) is called during
> +     * CPU_STARTING.
> +     *
> +     * On the other hand, at teardown, we need to reverse what has been done
> +     * during initialization, and then free the per-pCPU specific data. This
> +     * happens by calling the deinit_pdata and free_pdata hooks, in this
> +     * order. If no per-pCPU memory was allocated, there is no need to
> +     * provide an implementation of free_pdata. deinit_pdata may, however,
> +     * be necessary/useful in this case too (e.g., it can undo something done
> +     * on scheduler wide data structure during init_pdata). Both deinit_pdata
> +     * and free_pdata are called during CPU_DEAD.
> +     *
> +     * If someting goes wrong during bringup, we go to CPU_UP_CANCELLED
> +     * *before* having called init_pdata. In this case, as there is no
> +     * initialization needing undoing, only free_pdata should be called.
> +     * This means it is possible to call free_pdata just after alloc_pdata,
> +     * without a init_pdata/deinit_pdata "cycle" in between the two.
> +     *
> +     * So, in summary, the usage pattern should look either
> +     *  - alloc_pdata-->init_pdata-->deinit_pdata-->free_pdata, or
> +     *  - alloc_pdata-->free_pdata.
> +     */
>      switch ( action )
>      {
>      case CPU_STARTING:
> @@ -1554,8 +1586,10 @@ static int cpu_schedule_callback(
>      case CPU_UP_PREPARE:
>          rc = cpu_schedule_up(cpu);
>          break;
> -    case CPU_UP_CANCELED:
>      case CPU_DEAD:
> +        SCHED_OP(sched, deinit_pdata, sd->sched_priv, cpu);
> +        /* Fallthrough */
> +    case CPU_UP_CANCELED:
>          cpu_schedule_down(cpu);
>          break;
>      default:
> @@ -1713,6 +1747,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  
>      SCHED_OP(new_ops, tick_resume, cpu);
>  
> +    SCHED_OP(old_ops, deinit_pdata, ppriv_old, cpu);
> +
>      SCHED_OP(old_ops, free_vdata, vpriv_old);
>      SCHED_OP(old_ops, free_pdata, ppriv_old, cpu);
>  
> diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
> index 1db7c8d..bc0e794 100644
> --- a/xen/include/xen/sched-if.h
> +++ b/xen/include/xen/sched-if.h
> @@ -138,6 +138,7 @@ struct scheduler {
>      void         (*free_pdata)     (const struct scheduler *, void *, int);
>      void *       (*alloc_pdata)    (const struct scheduler *, int);
>      void         (*init_pdata)     (const struct scheduler *, void *, int);
> +    void         (*deinit_pdata)   (const struct scheduler *, void *, int);
>      void         (*free_domdata)   (const struct scheduler *, void *);
>      void *       (*alloc_domdata)  (const struct scheduler *, struct domain *);
>  
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-05-04 15:25 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-03 21:46 [PATCH for 4.7 0/4] Assorted scheduling fixes Dario Faggioli
2016-05-03 21:46 ` [PATCH for 4.7 1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched() Dario Faggioli
2016-05-04  8:48   ` Jan Beulich
2016-05-04  9:08     ` Dario Faggioli
2016-05-04 15:11   ` George Dunlap
2016-05-04 15:58     ` Dario Faggioli
2016-05-04 17:05       ` George Dunlap
2016-05-04 17:21         ` Dario Faggioli
2016-05-04 17:34           ` George Dunlap
2016-05-06 13:21             ` Dario Faggioli
2016-05-06 13:48               ` Wei Liu
2016-05-09 14:42               ` George Dunlap
2016-05-03 21:46 ` [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata Dario Faggioli
2016-05-04 15:25   ` George Dunlap [this message]
2016-05-03 21:46 ` [PATCH for 4.7 3/4] xen: credit2: fix 2 (minor) issues in load tracking logic Dario Faggioli
2016-05-04 15:38   ` George Dunlap
2016-05-03 21:46 ` [PATCH for 4.7 4/4] xen: adopt .deinit_pdata and improve timer handling Dario Faggioli
2016-05-04 15:51   ` George Dunlap
2016-05-04 15:53     ` Meng Xu
2016-05-06 23:05       ` Dario Faggioli
2016-05-07 21:19   ` Meng Xu
2016-05-08  3:12     ` Meng Xu
2016-05-09  8:07       ` Juergen Gross
2016-05-09 13:22     ` Dario Faggioli
2016-05-09 14:08       ` Meng Xu
2016-05-09 14:52         ` Dario Faggioli
2016-05-09 14:58           ` Meng Xu
2016-05-09 14:46     ` George Dunlap
2016-05-09 14:58       ` Wei Liu
2016-05-09 15:35         ` George Dunlap
2016-05-04  1:26 ` [PATCH for 4.7 0/4] Assorted scheduling fixes Konrad Rzeszutek Wilk
2016-05-04  9:06   ` Dario Faggioli
2016-05-05 12:00     ` Julien Grall
2016-05-05 12:38       ` Dario Faggioli
2016-05-04 15:53 ` George Dunlap
2016-05-04 16:04 ` Wei Liu
2016-05-07 21:23   ` Meng Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=572A144F.10903@citrix.com \
    --to=george.dunlap@citrix.com \
    --cc=Steve.Capper@arm.com \
    --cc=Varun.Swara@arm.com \
    --cc=dario.faggioli@citrix.com \
    --cc=julien.grall@arm.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.