From: Daniel Thompson <daniel.thompson@linaro.org>
To: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
linux-kernel@vger.kernel.org, patches@linaro.org,
linaro-kernel@lists.linaro.org, Mike Travis <travis@sgi.com>,
Randy Dunlap <rdunlap@infradead.org>,
Dimitri Sivanich <sivanich@sgi.com>,
Andrew Morton <akpm@linux-foundation.org>,
Borislav Petkov <bp@suse.de>,
kgdb-bugreport@lists.sourceforge.net,
Ingo Molnar <mingo@kernel.org>
Subject: [RESEND PATCH v2 3.17rc1] kgdb: Timeout if secondary CPUs ignore the roundup
Date: Mon, 18 Aug 2014 16:01:50 +0100 [thread overview]
Message-ID: <1408374110-19656-1-git-send-email-daniel.thompson@linaro.org> (raw)
In-Reply-To: <1404310379-30228-1-git-send-email-daniel.thompson@linaro.org>
Currently if an active CPU fails to respond to a roundup request the
CPU that requested the roundup will become stuck. This needlessly
reduces the robustness of the debugger.
This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors. It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: Ingo Molnar <mingo@kernel.org>
---
Notes:
Changes since v1:
- Set CATASTROPHIC if the system contains unresponsive processors
(Jason Wessel)
kernel/debug/debug_core.c | 9 +++++++--
kernel/debug/kdb/kdb_debugger.c | 4 ++++
kernel/debug/kdb/kdb_main.c | 4 +++-
3 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1adf62b..acd7497 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -471,6 +471,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
int cpu;
int trace_on = 0;
int online_cpus = num_online_cpus();
+ u64 time_left;
kgdb_info[ks->cpu].enter_kgdb++;
kgdb_info[ks->cpu].exception_state |= exception_state;
@@ -595,9 +596,13 @@ return_normal:
/*
* Wait for the other CPUs to be notified and be waiting for us:
*/
- while (kgdb_do_roundup && (atomic_read(&masters_in_kgdb) +
- atomic_read(&slaves_in_kgdb)) != online_cpus)
+ time_left = loops_per_jiffy * HZ;
+ while (kgdb_do_roundup && --time_left &&
+ (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
+ online_cpus)
cpu_relax();
+ if (!time_left)
+ pr_crit("KGDB: Timed out waiting for secondary CPUs.\n");
/*
* At this point the primary processor is completely
diff --git a/kernel/debug/kdb/kdb_debugger.c b/kernel/debug/kdb/kdb_debugger.c
index 8859ca3..15e1a7a 100644
--- a/kernel/debug/kdb/kdb_debugger.c
+++ b/kernel/debug/kdb/kdb_debugger.c
@@ -129,6 +129,10 @@ int kdb_stub(struct kgdb_state *ks)
ks->pass_exception = 1;
KDB_FLAG_SET(CATASTROPHIC);
}
+ /* set CATASTROPHIC if the system contains unresponsive processors */
+ for_each_online_cpu(i)
+ if (!kgdb_info[i].enter_kgdb)
+ KDB_FLAG_SET(CATASTROPHIC);
if (KDB_STATE(SSBPT) && reason == KDB_REASON_SSTEP) {
KDB_STATE_CLEAR(SSBPT);
KDB_STATE_CLEAR(DOING_SS);
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 379650b..0633e78 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -2157,6 +2157,8 @@ static void kdb_cpu_status(void)
for (start_cpu = -1, i = 0; i < NR_CPUS; i++) {
if (!cpu_online(i)) {
state = 'F'; /* cpu is offline */
+ } else if (!kgdb_info[i].enter_kgdb) {
+ state = 'D'; /* cpu is online but unresponsive */
} else {
state = ' '; /* cpu is responding to kdb */
if (kdb_task_state_char(KDB_TSK(i)) == 'I')
@@ -2210,7 +2212,7 @@ static int kdb_cpu(int argc, const char **argv)
/*
* Validate cpunum
*/
- if ((cpunum > NR_CPUS) || !cpu_online(cpunum))
+ if ((cpunum > NR_CPUS) || !kgdb_info[cpunum].enter_kgdb)
return KDB_BADCPUNUM;
dbg_switch_cpu = cpunum;
--
1.9.3
next prev parent reply other threads:[~2014-08-18 15:04 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-01 14:16 [RFC PATCH] kgdb: Timeout if secondary CPUs ignore the roundup Daniel Thompson
2014-07-01 14:22 ` Jason Wessel
2014-07-02 14:12 ` [PATCH v2] " Daniel Thompson
2014-08-18 15:01 ` Daniel Thompson [this message]
2014-11-11 17:50 ` [PATCH v3] " Daniel Thompson
2014-12-10 20:46 ` Daniel Thompson
2015-01-07 16:15 ` [RESEND PATCH v3 3.19-rc2] " Daniel Thompson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1408374110-19656-1-git-send-email-daniel.thompson@linaro.org \
--to=daniel.thompson@linaro.org \
--cc=akpm@linux-foundation.org \
--cc=bp@suse.de \
--cc=jason.wessel@windriver.com \
--cc=kgdb-bugreport@lists.sourceforge.net \
--cc=linaro-kernel@lists.linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=patches@linaro.org \
--cc=rdunlap@infradead.org \
--cc=sivanich@sgi.com \
--cc=travis@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).