From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oliver.upton@linux.dev>,
Kristina Martsenko <kristina.martsenko@arm.com>
Cc: isaku.yamahata@intel.com, seanjc@google.com, pbonzini@redhat.com,
kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
James Morse <james.morse@arm.com>
Subject: Re: KVM CPU hotplug notifier triggers BUG_ON on arm64
Date: Mon, 03 Jul 2023 10:45:26 +0100
Message-ID: <867crhxr9l.wl-maz@kernel.org>
In-Reply-To: <ZKBlhJwl9YD5FHvs@linux.dev>

On Sat, 01 Jul 2023 18:42:28 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> Hi Kristina,
>
> Thanks for the bug report.
>
> On Sat, Jul 01, 2023 at 01:50:52PM +0100, Kristina Martsenko wrote:
> > Hi,
> >
> > When I try to online a CPU on arm64 while a KVM guest is running, I hit a
> > BUG_ON(preemptible()) (as well as a WARN_ON). See below for the full log.
> >
> > This is on kvmarm/next, but seems to have been broken since 6.3. Bisecting it
> > points at commit:
> >
> > 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
>
> Makes sense. We were using a spinlock before, which implicitly disables
> preemption.
>
> Well, one way to hack around the problem would be to just cram
> preempt_{disable,enable}() into kvm_arch_hardware_disable(), but that's
> kinda gross in the context of cpuhp which isn't migratable in the first
> place. Let me have a look...

An alternative would be to replace the preemptible() checks with a check
that looks at the migration state, but I'm not sure that's much better
(it certainly looks more costly).
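
To make that alternative concrete, here is a minimal userspace sketch of
the two kinds of check. It is only a model: struct task_model and the
model_* helpers are hypothetical names invented for illustration, not
kernel APIs, and the fields merely stand in for the state the real
checks would consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the scheduling state of one task. */
struct task_model {
	int preempt_count;      /* > 0 means preemption is disabled */
	bool irqs_disabled;     /* interrupts masked on this CPU */
	int migration_disabled; /* > 0 means the task is pinned to its CPU */
};

/* Strict check: true when nothing currently prevents preemption. */
static bool model_preemptible(const struct task_model *t)
{
	return t->preempt_count == 0 && !t->irqs_disabled;
}

/*
 * Hypothetical relaxed check: the task may still be preempted, but it
 * cannot move to another CPU, so the per-CPU state it touches remains
 * the state of the CPU it started on.
 */
static bool model_cpu_stable(const struct task_model *t)
{
	return !model_preemptible(t) || t->migration_disabled > 0;
}
```

In this model, a CPU hotplug callback (preemptible, but with migration
disabled) would fail the strict check and pass the relaxed one.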

There is also the fact that most of our per-CPU accessors are already
using preemption disabling, and this code has a bunch of them. So I'm
not sure there is a lot to be gained from not disabling preemption
upfront.
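
The nesting argument can be sketched in userspace as follows. This is a
toy model under stated assumptions, not kernel code: every model_* name
is hypothetical, a plain counter stands in for the preemption count, and
a single int stands in for a per-CPU variable.

```c
#include <assert.h>

static int model_preempt_count; /* stand-in for the preemption count */
static int model_percpu_flag;   /* stand-in for a per-CPU variable */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Accessor that protects itself by disabling preemption internally. */
static int model_percpu_read(void)
{
	int v;

	model_preempt_disable();
	v = model_percpu_flag;
	model_preempt_enable();
	return v;
}

/* Helper that insists on the caller having disabled preemption. */
static int model_strict_helper(void)
{
	assert(model_preempt_count > 0);
	return model_percpu_flag;
}

/* Caller that disables preemption once, upfront, around everything. */
static int model_hardware_enable(void)
{
	int was_enabled;

	model_preempt_disable();
	was_enabled = model_percpu_read(); /* nests: count 1 -> 2 -> 1 */
	model_strict_helper();             /* satisfied by the outer disable */
	model_percpu_flag = 1;
	model_preempt_enable();
	return was_enabled;
}
```

The outer disable adds one increment and one decrement to a path whose
accessors already perform the same pair, and it is what keeps the strict
helper from asserting.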

Anyway, as I was able to reproduce the issue under NV, I tested the
hack below. If anything, I expect it to be a reasonable fix for
6.3/6.4, at least until we come up with a better approach.

Thanks,

	M.

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index aaeae1145359..a28c4ffe4932 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1894,8 +1894,17 @@ static void _kvm_arch_hardware_enable(void *discard)
 
 int kvm_arch_hardware_enable(void)
 {
-	int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
+	int was_enabled;
 
+	/*
+	 * Most calls to this function are made with migration
+	 * disabled, but not with preemption disabled. The former is
+	 * enough to ensure correctness, but most of the helpers
+	 * expect the latter and will throw a tantrum otherwise.
+	 */
+	preempt_disable();
+
+	was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
 	_kvm_arch_hardware_enable(NULL);
 
 	if (!was_enabled) {
@@ -1903,6 +1912,8 @@ int kvm_arch_hardware_enable(void)
 		kvm_timer_cpu_up();
 	}
 
+	preempt_enable();
+
 	return 0;
 }
 
--
Without deviation from the norm, progress is not possible.