From: Marc Zyngier <maz@kernel.org>
To: Zenghui Yu <yuzenghui@huawei.com>
Cc: drjones@redhat.com, kvm@vger.kernel.org, suzuki.poulose@arm.com,
James Morse <james.morse@arm.com>,
linux-arm-kernel@lists.infradead.org,
"Wanghaibin \(D\)" <wanghaibin.wang@huawei.com>,
kvmarm@lists.cs.columbia.edu, julien.thierry.kdev@gmail.com
Subject: Re: kvm-unit-tests: psci_cpu_on_test FAILed
Date: Sat, 3 Aug 2019 11:10:47 +0100 [thread overview]
Message-ID: <20190803111047.11493907@why> (raw)
In-Reply-To: <f58de1d7-a6ca-bd6d-8423-01d27326e078@huawei.com>
On Sat, 3 Aug 2019 17:27:41 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:
> Hi Marc,
>
> On 2019/8/2 23:56, Marc Zyngier wrote:
> > On 02/08/2019 11:56, Zenghui Yu wrote:
> >> Hi folks,
> >>
> >> Running kvm-unit-tests with Linux 5.3.0-rc2 on Kunpeng 920, we will get
> >> the following fail info:
> >>
> >> [...]
> >> FAIL psci (4 tests, 1 unexpected failures)
> >> [...]
> >> and
> >> [...]
> >> INFO: unexpected cpu_on return value: caller=CPU9, ret=-2
> >> FAIL: cpu-on
> >> SUMMARY: 4 tests, 1 unexpected failures
> >>
> >>
> >> I think this is an issue had been fixed once by commit 6c7a5dce22b3
> >> ("KVM: arm/arm64: fix races in kvm_psci_vcpu_on"), which makes use of
> >> kvm->lock mutex to fix the race between two PSCI_CPU_ON calls - one
> >> does reset on the MPIDR register whilst another reads it.
> >>
> >> But commit 358b28f09f0 ("arm/arm64: KVM: Allow a VCPU to fully reset
> >> itself") later moves the reset work into check_vcpu_requests(), by
> >> making a KVM_REQ_VCPU_RESET request in PSCI code. Thus the reset work
> >> has not been protected by kvm->lock mutex anymore, and the race shows up
> >> again...
> >>
> >> Do we need a fix for this issue? At least achieve a mutex execution
> >> between the reset of MPIDR and kvm_mpidr_to_vcpu()?
> >
> > The thing is that the way we reset registers is marginally insane.
> > Yes, it catches most reset bugs. It also introduces many more in
> > the rest of the paths.
> >
> > The fun part is that there is hardly a need for resetting MPIDR.
> > It has already been set when we've created the vcpu. It is the
>
> (That means we can let reset_mpidr() do nothing?)
It should ever be only written once, as MPIDR is a constant from the
guest perspective. So it is not that it can do nothing. It is just that
there should never be any other value written to it.
>
> > poisoning of the sysreg array that creates a situation where
> > the MPIDR is temporarily invalid.
> >
> > So instead of poisoning the array, how about we just keep
> > track of the registers for which we've called a reset function?
> > It should be enough to track the most obvious bugs... I've
>
> The reset of DBG{BCR,BVR,WVR,WCR}n_EL1 registers will also be tracked.
> It may affect our judgment?
How so?
>
> > cobbled the following patch together, which seems to fix the
> > issue on my TX2 with 64 vcpus.
> >
> > Thoughts?
> >
> > M.
> >
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index f26e181d881c..17f46ee7dc83 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -2254,13 +2254,17 @@ static int emulate_sys_reg(struct kvm_vcpu *vcpu,
> > }
> > > static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
> > - const struct sys_reg_desc *table, size_t num)
> > + const struct sys_reg_desc *table, size_t num,
> > + unsigned long *bmap)
> > {
> > unsigned long i;
> > > for (i = 0; i < num; i++)
> > - if (table[i].reset)
> > + if (table[i].reset) {
> > table[i].reset(vcpu, &table[i]);
> > + if (bmap)
> > + set_bit(i, bmap);
>
> I think this should be:
> set_bit(table[i].reg, bmap);
>
> Am I wrong?
No, you're absolutely right.
>
> > + }
> > }
> > > /**
> > @@ -2772,21 +2776,23 @@ void kvm_sys_reg_table_init(void)
> > */
> > void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
> > {
> > + unsigned long *bmap;
> > size_t num;
> > const struct sys_reg_desc *table;
> > > - /* Catch someone adding a register without putting in reset entry. */
> > - memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
> > + bmap = bitmap_alloc(NR_SYS_REGS, GFP_KERNEL);
>
> LOCKDEP kernel will be not happy with this bitmap_alloc:
>
> " BUG: sleeping function called from invalid context at mm/slab.h:501
> in_atomic(): 1, irqs_disabled(): 0, pid: 8710, name: qemu-system-aar "
Well spotted. I guess GFP_ATOMIC is in order.
>
> > > /* Generic chip reset first (so target could override). */
> > - reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> > + reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
> > > table = get_target_table(vcpu->arch.target, true, &num);
> > - reset_sys_reg_descs(vcpu, table, num);
> > + reset_sys_reg_descs(vcpu, table, num, bmap);
> > > for (num = 1; num < NR_SYS_REGS; num++) {
> > - if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
> > + if (WARN(bmap && !test_bit(num, bmap),
> > "Didn't reset __vcpu_sys_reg(%zi)\n", num))
> > break;
> > }
> > +
> > + kfree(bmap);
> > }
> >
> >
>
> Some other minor questions about the sys reg resetting:
> 1. Pointer Authentication Registers haven't have reset entry yet,
> do they need? The same for ACTLR_EL1.
Pointer auth registers definitely have a reset function, set to
reset_unknown. So does ACTLR_EL1, which resets to the host's value.
> 2. Why does PMCR_EL0 register have no "reg" field, in sys_reg_descs[]?
This looks like a (very minor) bug. reset_pmcr writes directly to the
PMCR_EL0 shadow register without using r->reg as the register number.
But in the light of the reset tracking we want to add, this needs
fixing.
> I will test this patch with kvm-unit-tests next week!
Well, wait until I repost something a bit less buggy...
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2019-08-03 10:11 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-02 10:56 kvm-unit-tests: psci_cpu_on_test FAILed Zenghui Yu
2019-08-02 11:32 ` Andrew Jones
2019-08-02 15:56 ` Marc Zyngier
2019-08-03 9:27 ` Zenghui Yu
2019-08-03 10:10 ` Marc Zyngier [this message]
2019-08-05 2:38 ` Zenghui Yu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190803111047.11493907@why \
--to=maz@kernel.org \
--cc=drjones@redhat.com \
--cc=james.morse@arm.com \
--cc=julien.thierry.kdev@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=suzuki.poulose@arm.com \
--cc=wanghaibin.wang@huawei.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).