From: "André Przywara" <andre.przywara@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>,
James Morse <james.morse@arm.com>,
kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org,
Suzuki K Poulose <suzuki.poulose@arm.com>
Subject: Re: [PATCH] KVM: arm64: arch_timer shouldn't assume the vcpu is loaded
Date: Thu, 9 Apr 2020 17:04:37 +0100 [thread overview]
Message-ID: <e8e5ce87-2f50-7adb-e6cd-51dfedcd6462@arm.com> (raw)
In-Reply-To: <20200409090834.7d655d8f@why>
On 09/04/2020 09:08, Marc Zyngier wrote:
Hi Marc,
> On Wed, 8 Apr 2020 17:50:09 +0100
> André Przywara <andre.przywara@arm.com> wrote:
>
>> On 08/04/2020 15:19, Marc Zyngier wrote:
>>
>> Hi Marc,
>>
>>> On 2020-04-08 13:13, André Przywara wrote:
>>>> On 08/04/2020 11:07, Marc Zyngier wrote:
>>>>
>>>> Hi Marc,
>>>>
>>>>> Hi James,
>>>>>
>>>>> Thanks for looking into this.
>>>>>
>>>>> On Mon, 6 Apr 2020 16:03:55 +0100
>>>>> James Morse <james.morse@arm.com> wrote:
>>>>>
>>>>>> kvm_arch_timer_get_input_level() needs to get the arch_timer_context
>>>>>> for
>>>>>> a particular vcpu, and uses kvm_get_running_vcpu() to find it.
>>>>>>
>>>>>> kvm_arch_timer_get_input_level() may be called to handle a user-space
>>>>>> write to the redistributor, where the vcpu is not loaded. This causes
>>>>>> kvm_get_running_vcpu() to return NULL:
>>>>>> | Unable to handle kernel paging request at virtual address
>>>>>> 0000000000001ec0
>>>>>> | Mem abort info:
>>>>>> | ESR = 0x96000004
>>>>>> | EC = 0x25: DABT (current EL), IL = 32 bits
>>>>>> | SET = 0, FnV = 0
>>>>>> | EA = 0, S1PTW = 0
>>>>>> | Data abort info:
>>>>>> | ISV = 0, ISS = 0x00000004
>>>>>> | CM = 0, WnR = 0
>>>>>> | user pgtable: 4k pages, 48-bit VAs, pgdp=000000003cbf9000
>>>>>> | [0000000000001ec0] pgd=0000000000000000
>>>>>> | Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>>>>> | Modules linked in: r8169 realtek efivarfs ip_tables x_tables
>>>>>> | CPU: 1 PID: 2615 Comm: qemu-system-aar Not tainted 5.6.0-rc7 #30
>>>>>> | Hardware name: Marvell mvebu_armada-37xx/mvebu_armada-37xx, BIOS
>>>>>> 2018.03-devel-18.12.3-gc9aa92c-armbian 02/20/2019
>>>>>> | pstate: 00000085 (nzcv daIf -PAN -UAO)
>>>>>> | pc : kvm_arch_timer_get_input_level+0x1c/0x68
>>>>>> | lr : kvm_arch_timer_get_input_level+0x1c/0x68
>>>>>>
>>>>>> | Call trace:
>>>>>> | kvm_arch_timer_get_input_level+0x1c/0x68
>>>>>> | vgic_get_phys_line_level+0x3c/0x90
>>>>>> | vgic_mmio_write_senable+0xe4/0x130
>>>>>> | vgic_uaccess+0xe0/0x100
>>>>>> | vgic_v3_redist_uaccess+0x5c/0x80
>>>>>> | vgic_v3_attr_regs_access+0xf0/0x200
>>>>>> | nvgic_v3_set_attr+0x234/0x250
>>>>>> | kvm_device_ioctl_attr+0xa4/0xf8
>>>>>> | kvm_device_ioctl+0x7c/0xc0
>>>>>> | ksys_ioctl+0x1fc/0xc18
>>>>>> | __arm64_sys_ioctl+0x24/0x30
>>>>>> | do_el0_svc+0x7c/0x148
>>>>>> | el0_sync_handler+0x138/0x258
>>>>>> | el0_sync+0x140/0x180
>>>>>> | Code: 910003fd f9000bf3 2a0003f3 97ff650c (b95ec001)
>>>>>> | ---[ end trace 81287612d93f1e70 ]---
>>>>>> | note: qemu-system-aar[2615] exited with preempt_count 1
>>>>>>
>>>>>> Loading the vcpu doesn't make a lot of sense for handling a device
>>>>>> ioctl(),
>>>>>> so instead pass the vcpu through to
>>>>>> kvm_arch_timer_get_input_level(). Its
>>>>>> not clear that an intid makes much sense without the paired vcpu.
>>>>>
>>>>> I don't fully agree with the analysis, Remember we are looking at the
>>>>> state of the physical interrupt associated with a virtual interrupt, so
>>>>> the vcpu doesn't quite make sense here if it isn't loaded.
>>>>
>>>> But wasn't it that this function is meant to specifically deal with this
>>>> *without* going to the hardware (which is costly, hence this
>>>> optimisation)? Because for the timer we *can* work out the logical IRQ
>>>> line state by examining our saved state? And this is what we do in
>>>> kvm_timer_should_fire(), when timer_ctx->loaded is false.
>>>
>>> Yes, but that's just a specialization of a more generic interface, which is
>>> "inspect the state of this *physical* intid". The fact that we are able
>>> to do
>>> it in a special way for the timer doesn't change the nature of the
>>> interface.
>>
>>
>>>
>>>> Which for me this sounds like the right thing to do in this situation:
>>>> the VCPU (and the timer) is not loaded, so we check our saved state and
>>>> construct the logical line level. We just need a valid VCPU struct to
>>>> achieve this, and hope for the virtual timer to be already initialised.
>>>>
>>>> Do I miss something here?
>>>
>>> Yes. You are missing that the *interface* is generic, and you can replace
>>> it with anything you want. Case in point, what we do when get_input_level
>>> is NULL.
>>>
>>>> Also to me it sound like the interface for this function is slightly
>>>> lacking, because just an intid is not enough to uniquely identify an
>>>> IRQ. It was just fine so far because of this special use case.
>>>
>>> This is a *physical* intid.
>>
>> Wait, I am confused, the type declaration in struct vgic_irq says:
>> ...
>> bool (*get_input_level)(int vintid);
>> ^^^
>> Also in vgic.c:vgic_get_phys_line_level() we call
>> irq->get_input_level(irq->intid), which is the virtual intid.
>
> Yeah, that's not great indeed. It is a cunning shortcut to get to the
> timer, but that really should be the host irq.
>> But I see that the physical intid makes more sense here (in the spirit
>> of: provide a shortcut for poking the GIC for the associated hwirq), but
>> shouldn't we then pass at least irq->hwintid (which just happens to be
>> the same in the arch timer case)?
>
> hwintid isn't really something we should consider, as it is an
> implementation detail of the physical GIC and list registers. It is
> too low-level to be generally useful. The host_irq field, on the other
> hand, is a better information source, and the timer already has this
> stashed.
That sounds indeed more logical, given that we use that number for the
default handling (irq_get_irqchip_state()).
> Overall, we could just pass the pointer to the vgic_irq, and let the
> helper do whatever it needs to sort it out. After all, it is supposed
> to be faster than going to the GIC, so we can have a bit of leeway here.
That sounds tempting, but didn't we consider struct vgic_irq a data
structure private to the VGIC? I remember we jumped through some hoops
to get rid of internal VGIC data in arch_timer.c, for instance.
A quick grep for vgic_irq returns only hits in the virt/kvm/arm/vgic
directory.
So maybe we should not try to sell this get_input_level() as a generic
solution, but demote it to what it really is: a "cunning shortcut"(TM)
to optimise the arch timer?
What about we change the interface to use the Linux IRQ number, rewrite
the implementation in arch_timer.c to match that against
host_[vp]timer_irq and adjust the comments accordingly?
Then the next user (shall there be one) can extend this interface as needed.
Plus your code to provide user space accessors.
Cheers,
Andre
> Not a big deal, as this isn't the part that is broken ATM.
>
> Thanks,
>
> M.
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
prev parent reply other threads:[~2020-04-09 16:05 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-06 15:03 [PATCH] KVM: arm64: arch_timer shouldn't assume the vcpu is loaded James Morse
2020-04-08 10:07 ` Marc Zyngier
2020-04-08 11:16 ` James Morse
2020-04-09 8:27 ` Marc Zyngier
2020-04-09 16:53 ` James Morse
2020-04-08 12:13 ` André Przywara
2020-04-08 14:19 ` Marc Zyngier
2020-04-08 16:50 ` André Przywara
2020-04-09 8:08 ` Marc Zyngier
2020-04-09 16:04 ` André Przywara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e8e5ce87-2f50-7adb-e6cd-51dfedcd6462@arm.com \
--to=andre.przywara@arm.com \
--cc=james.morse@arm.com \
--cc=julien.thierry.kdev@gmail.com \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=suzuki.poulose@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).