From: Johannes Weiner <hannes@saeurebad.de>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Pekka Enberg <penberg@cs.helsinki.fi>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] kmemcheck: SMP support
Date: Fri, 23 May 2008 18:09:08 +0200
Message-ID: <87od6xngt7.fsf@saeurebad.de>
In-Reply-To: <20080523141759.GA1833@damson.getinternet.no> (Vegard Nossum's message of "Fri, 23 May 2008 16:17:59 +0200")
Hi Vegard,
Vegard Nossum <vegard.nossum@gmail.com> writes:
> From: Vegard Nossum <vegard.nossum@gmail.com>
> Date: Fri, 23 May 2008 15:53:03 +0200
> Subject: [PATCH] kmemcheck: SMP support
>
> This patch adds SMP support to kmemcheck, that is, the ability to boot
> more than one processor even when kmemcheck is enabled. (Previously,
> only one CPU would be booted even if more were present in the system.)
>
> On page fault, kmemcheck needs to pause all the other CPUs in the system
> in order to guarantee that no other CPU will modify the same memory
> location (which will otherwise be unprotected after we set the present
> bit of the PTE).
>
> Since the page fault can be taken with any irq state (i.e. enabled or
> disabled), we can't send a normal IPI broadcast since this can deadlock.
>
> Instead, we send an NMI. This is guaranteed to be received _except_ if
> the processor is already inside the NMI handler.
>
> This is of course not very efficient, and booting with maxcpus=1 is
> recommended, however, this allows the kernel to be configured with
> CONFIG_KMEMCHECK=y with close to zero overhead when kmemcheck is
> disabled. (It can still be runtime-enabled at any time, though.)
>
> The patch has been tested on real hardware.
>
> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
> ---
> arch/x86/kernel/kmemcheck.c | 108 +++++++++++++++++++++++++++++++++++++++----
> 1 files changed, 98 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/kmemcheck.c b/arch/x86/kernel/kmemcheck.c
> index c0045e8..fdf8acb 100644
> --- a/arch/x86/kernel/kmemcheck.c
> +++ b/arch/x86/kernel/kmemcheck.c
> @@ -16,6 +16,7 @@
> #include <linux/kmemcheck.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> +#include <linux/notifier.h>
> #include <linux/page-flags.h>
> #include <linux/percpu.h>
> #include <linux/stacktrace.h>
> @@ -23,9 +24,12 @@
> #include <asm/cacheflush.h>
> #include <asm/kmemcheck.h>
> #include <asm/pgtable.h>
> +#include <asm/smp.h>
> #include <asm/string.h>
> #include <asm/tlbflush.h>
>
> +#include <mach_ipi.h>
> +
> enum shadow {
> SHADOW_UNALLOCATED,
> SHADOW_UNINITIALIZED,
> @@ -240,18 +244,91 @@ static void do_wakeup(unsigned long data)
> }
> }
>
> +static atomic_t nmi_wait;
> +static atomic_t nmi_resume;
> +static atomic_t started;
> +static atomic_t finished;
> +
> +static int nmi_notifier(struct notifier_block *self,
> + unsigned long val, void *data)
> +{
> + struct die_args *args = (struct die_args *) data;
> +
> + if (val != DIE_NMI_IPI || !atomic_read(&nmi_wait))
> + return NOTIFY_DONE;
> +
> + atomic_inc(&started);
> +
> + /* Pause until the fault has been handled */
> + while (!atomic_read(&nmi_resume))
> + cpu_relax();
> +
> + atomic_inc(&finished);
> +
> + return NOTIFY_STOP;
> +}
> +
> +static void
> +pause_allbutself(void)
> +{
> +#ifdef CONFIG_SMP
> + static spinlock_t nmi_spinlock;
> +
> + int cpus;
> + cpumask_t mask = cpu_online_map;
> +
> + spin_lock(&nmi_spinlock);
> +
> + cpus = num_online_cpus() - 1;
> +
> + atomic_set(&started, 0);
> + atomic_set(&finished, 0);
> + atomic_set(&nmi_wait, 1);
> + atomic_set(&nmi_resume, 0);
> +
> + cpu_clear(safe_smp_processor_id(), mask);
> + if (!cpus_empty(mask))
> + send_IPI_mask(mask, NMI_VECTOR);
> +
> + while (atomic_read(&started) != cpus)
> + cpu_relax();
> +
> + atomic_set(&nmi_wait, 0);
> +
> + spin_unlock(&nmi_spinlock);
> +#endif
> +}
> +
> +static void
> +resume(void)
> +{
> +#ifdef CONFIG_SMP
> + int cpus;
> +
> + cpus = num_online_cpus() - 1;
> +
> + atomic_set(&nmi_resume, 1);
> +
> + while (atomic_read(&finished) != cpus)
> + cpu_relax();
> +#endif
> +}
How about merging `finished' and `started' into one counter, say `paused'?
The notifier increments `paused' before the waiting loop and decrements
it again afterwards.
pause_allbutself() sends the IPIs and waits until `paused' reaches the
number of other online CPUs.
resume() just waits until `paused' drops back to zero.
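To make the idea concrete, here is a rough userspace model of the single-counter scheme. It is only a sketch under loud assumptions: pthreads stand in for the other CPUs, pthread_create() stands in for send_IPI_mask(), and C11 atomics stand in for atomic_t. The names fake_nmi_handler() and simulate() are made up for illustration and are not part of the patch. It says nothing about NMI delivery or CPU hotplug.

```c
/* Userspace model only: threads play the role of the other CPUs. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_OTHER_CPUS 16

static atomic_int nmi_wait;
static atomic_int nmi_resume;
static atomic_int paused;	/* replaces both `started' and `finished' */

/* Hypothetical stand-in for nmi_notifier(), run once per "other CPU". */
static void *fake_nmi_handler(void *unused)
{
	(void)unused;

	if (!atomic_load(&nmi_wait))
		return NULL;

	atomic_fetch_add(&paused, 1);

	/* Pause until the fault has been handled */
	while (!atomic_load(&nmi_resume))
		sched_yield();

	atomic_fetch_sub(&paused, 1);
	return NULL;
}

/*
 * Hypothetical driver: one pause/resume cycle with `others' fake CPUs.
 * Returns how many CPUs were paused while the fault would be handled.
 */
static int simulate(int others)
{
	pthread_t cpu[MAX_OTHER_CPUS];
	int i, seen;

	assert(others >= 0 && others <= MAX_OTHER_CPUS);

	/* pause_allbutself(): "send the NMIs", wait for everyone to check in */
	atomic_store(&paused, 0);
	atomic_store(&nmi_resume, 0);
	atomic_store(&nmi_wait, 1);

	for (i = 0; i < others; i++)
		if (pthread_create(&cpu[i], NULL, fake_nmi_handler, NULL) != 0)
			abort();

	while (atomic_load(&paused) != others)
		sched_yield();

	atomic_store(&nmi_wait, 0);
	seen = atomic_load(&paused);

	/* ... the page fault would be handled here ... */

	/* resume(): release the others, wait until `paused' is zero again */
	atomic_store(&nmi_resume, 1);
	while (atomic_load(&paused) != 0)
		sched_yield();

	for (i = 0; i < others; i++)
		pthread_join(cpu[i], NULL);

	return seen;
}
```

Calling simulate(3) runs one cycle with three fake CPUs. Because resume() does not return before `paused' is back at zero, the next pause_allbutself() can safely clear nmi_resume again, which is the job `finished' does in the patch.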
Would this work? Will the NMI handler finish even when the CPU is
removed while the handler runs?
Hannes