Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Viresh Kumar <viresh.kumar@linaro.org>
Cc: Michael Wang <wangyun@linux.vnet.ibm.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Jiri Kosina <jkosina@suse.cz>, Tony Luck <tony.luck@intel.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux PM mailing list <linux-pm@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2
Date: Mon, 20 May 2013 15:01:47 +0530
Message-ID: <5199ED83.5040804@linux.vnet.ibm.com>
In-Reply-To: <CAFTL4hwm4Cox=ynVcO-oLG=Jt_1Lro+hbcpEE5hiAYEJGipaLA@mail.gmail.com>

On 05/20/2013 01:40 PM, Frederic Weisbecker wrote:
> 2013/5/20 Borislav Petkov <bp@alien8.de>:
>> On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
>>> I suppose the reason is that the cpu we passed to
>>> mod_delayed_work_on() has a chance to become offline before we
>>> disabled irq, what about check it before send resched ipi? like:
>>
>> I think this is only addressing the symptoms - what we should be doing
>> instead is asking ourselves why are we even scheduling work on a cpu if
>> the machine goes offline?
>>
>> I don't know though who should be responsible for killing all that
>> work - the workqueue itself or the guy who created it, i.e. cpufreq
>> governor...
>>
>> Hmmm.
> 
> Let's look at this portion of cpu_down():
> 
> 	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
> 	if (err) {
> 		/* CPU didn't die: tell everyone.  Can't complain. */
> 		smpboot_unpark_threads(cpu);
> 		cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> 		goto out_release;
> 	}
> 	BUG_ON(cpu_online(cpu));
> 
> 	/*
> 	 * The migration_call() CPU_DYING callback will have removed all
> 	 * runnable tasks from the cpu, there's only the idle task left now
> 	 * that the migration thread is done doing the stop_machine thing.
> 	 *
> 	 * Wait for the stop thread to go away.
> 	 */
> 	while (!idle_cpu(cpu))
> 		cpu_relax();
> 	/* This actually kills the CPU. */
> 	__cpu_die(cpu);
> 
> 	/* CPU is completely dead: tell everyone.  Too late to complain. */
> 	cpu_notify_nofail(CPU_DEAD | mod, hcpu);
> 
> 	check_for_tasks(cpu);
> 
> The CPU is considered offline after the take_cpu_down stop machine job
> completes. But the struct timer_list timers are migrated later through
> CPU_DEAD notification. Only once that's completed we check for illegal
> residual tasks in the CPU.  So there is a little window between the
> stop machine thing and __cpu_die() where a timer can fire with
> cpu_online(cpu) == 1.
> 

Nope, the dying CPU is removed from the cpu_online_mask in the very first
stages of stop_machine(), specifically in the __cpu_disable() function.
__cpu_die() is just a dummy.
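
To make the ordering concrete, here is a minimal userspace sketch of it. It is
not kernel code: every model_* name is hypothetical, and the bitmask is only a
stand-in for cpu_online_mask. It assumes, as described above, that the online
bit is cleared in the stop_machine() stage, so anything that tries to send a
reschedule IPI to that CPU afterwards trips the offline check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4

/* Toy stand-in for cpu_online_mask: one bit per CPU, all online at start. */
static unsigned long model_online_mask = (1UL << MODEL_NR_CPUS) - 1;

static bool model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return (model_online_mask & (1UL << cpu)) != 0;
}

/*
 * Models the effect of __cpu_disable() as reached from take_cpu_down():
 * the CPU leaves the online mask during the stop_machine() stage.
 */
static void model_cpu_disable(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_online_mask &= ~(1UL << cpu);
}

/*
 * Models the offline check in native_smp_send_reschedule():
 * warn and bail out instead of sending the IPI.
 */
static bool model_send_reschedule(int cpu)
{
	if (!model_cpu_online(cpu)) {
		fprintf(stderr, "WARNING: resched IPI to offline CPU %d\n", cpu);
		return false;
	}
	return true;
}
```

In this model, calling model_cpu_disable(3) and then model_send_reschedule(3)
hits the warning path, which corresponds to a timer or work item still
targeting the CPU after the stop_machine() stage but before CPU_DEAD
processing has migrated it.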

> Now concerning the workqueue I don't know. I guess the per cpu ones
> are not migrated due to their affinity. Apparently they can still wake
> up and execute works due to the timers...

The interesting thing is that the cpufreq governor actually _cancels_ the
queued work in CPU_DOWN_PREPARE stage, as far as I understand.

cpufreq_cpu_callback()
  -> __cpufreq_remove_dev()
    -> __cpufreq_governor(data, CPUFREQ_GOV_STOP);
      -> od_cpufreq_governor_dbs()
        -> cpufreq_governor_dbs(), which has the following case statement:


        case CPUFREQ_GOV_STOP:
                if (dbs_data->cdata->governor == GOV_CONSERVATIVE)
                        cs_dbs_info->enable = 0;

                gov_cancel_work(dbs_data, policy);

                mutex_lock(&dbs_data->mutex);
                mutex_destroy(&cpu_cdbs->timer_mutex);

                mutex_unlock(&dbs_data->mutex);

                break;


But recently I removed the call to __cpufreq_remove_dev() in the suspend/resume
path (tasks frozen), in commit a66b2e503 (cpufreq: Preserve sysfs files across
suspend/resume). So I'm curious to know whether that change has any effect here.

So Boris, do you see the warnings during regular hotplug also (via sysfs) or
only during suspend/shutdown? [Actually shutdown doesn't freeze tasks, so that is
already a hint that this warning can be triggered via sysfs also, but it would
be good to get a confirmation.]

And Viresh, in the regular hotplug paths, the call to gov_cancel_work() is
supposed to kill any pending workqueue functions pertaining to offline CPUs,
right? Could there be a synchronization bug somewhere that prevents this
from happening properly?
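
As an illustration of the kind of synchronization bug I have in mind, here is
a purely hypothetical userspace model. It is not a diagnosis, none of the
model_* names exist in the kernel, and it does not reproduce the guarantees of
the real workqueue cancel primitives. It only shows how work that was cancelled
can end up pending again if some other path re-queues it without checking that
a stop is in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one delayed work item. */
struct model_work {
	bool pending;   /* queued to run later */
	bool stopping;  /* set by the cancel path */
};

/*
 * Some path that (re-)queues the work. With honor_stop == false it
 * queues unconditionally; with honor_stop == true it backs off once
 * the cancel path has marked the work as stopping.
 */
static void model_requeue(struct model_work *w, bool honor_stop)
{
	assert(w);
	if (honor_stop && w->stopping)
		return;
	w->pending = true;
}

/* Cancel path: mark the work as stopping and drop any queued instance. */
static void model_cancel(struct model_work *w)
{
	assert(w);
	w->stopping = true;
	w->pending = false;
}

/*
 * One interleaving: the cancel runs first, then a late requeue arrives.
 * Returns true if the work is still pending afterwards.
 */
static bool model_race(bool honor_stop)
{
	struct model_work w = { .pending = true, .stopping = false };

	model_cancel(&w);
	model_requeue(&w, honor_stop);
	return w.pending;
}
```

Here model_race(false) leaves the work pending after the cancel, while
model_race(true) does not. Whether anything like this actually happens in the
governor code is exactly the open question.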

Regards,
Srivatsa S. Bhat
