From: "Alex Bennée" <alex.bennee@linaro.org>
To: Peter Xu <peterx@redhat.com>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Philippe Mathieu-Daudé" <philippe.mathieu-daude@linaro.org>,
"Peter Maydell" <peter.maydell@linaro.org>
Subject: Re: should ioapic_service really be modelling cpu writes?
Date: Fri, 11 Nov 2022 12:26:21 +0000 [thread overview]
Message-ID: <87r0y9d623.fsf@linaro.org> (raw)
In-Reply-To: <Y21+VFqKpF6LGz2C@x1n>
Peter Xu <peterx@redhat.com> writes:
> Hi, Alex,
>
> On Thu, Nov 10, 2022 at 05:55:51PM +0000, Alex Bennée wrote:
>>
>> Alex Bennée <alex.bennee@linaro.org> writes:
>>
>> > Hi,
>> >
>> > I've been trying to remove current_cpu hacks from our hw/ emulation and
>> > replace them with an explicit cpu_index derived from MemTxAttrs. So far
>> > this has been going mostly ok but I've run into problems on x86 due to
>> > the way the apic/ioapic are modelled. It comes down to the function
>> > ioapic_service() which eventually does:
>> >
>> > /* No matter whether IR is enabled, we translate
>> > * the IOAPIC message into a MSI one, and its
>> > * address space will decide whether we need a
>> > * translation. */
>> > stl_le_phys(ioapic_as, info.addr, info.data);
>> >
>> > Which essentially calls the memory API to simulate a memory write.
>> > However to generate a properly formed MemTxAttrs we need to know what
>> > CPU we are running on. In the case of ioapic_service() we may well be in
>> > the main thread either for an expiring timer:
>> >
>> > hpet_timer->qemu_set_irq->ioapic_set_irq
>> >
>> > or for reset handling:
>> >
>> > pc_machine_reset->hpet_reset->qemu_set_irq->ioapic_set_irq
>> >
>> > neither of which can get a associated CPU. I assume if the actual writes
>> > are triggered we never actually actioned stuff because we had:
>> >
>> > DeviceState *cpu_get_current_apic(void)
>> > {
>> > if (current_cpu) {
>> > X86CPU *cpu = X86_CPU(current_cpu);
>> > return cpu->apic_state;
>> > } else {
>> > return NULL;
>> > }
>> > }
>> >
>> > which basically causes the updates to be dropped on the floor.
>
> I think it shouldn't? Normally the irq will be in MSI format (IOAPIC will
> translate to an MSI in QEMU, per ioapic_entry_parse()).
>
> I had a feeling that it'll just go the shortcut here (MSI always starts
> with 0xfeeXXXXX so definitely bigger than 0xfff):
>
> if (addr > 0xfff || !index) {
> /* MSI and MMIO APIC are at the same memory location,
> * but actually not on the global bus: MSI is on PCI bus
> * APIC is connected directly to the CPU.
> * Mapping them on the global bus happens to work because
> * MSI registers are reserved in APIC MMIO and vice versa. */
> MSIMessage msi = { .address = addr, .data = val };
> apic_send_msi(&msi);
> return;
> }
>
> apic_send_msi() doesn't need a cpu context.
Ahh so yes maybe my changes where too quick to reject the MMIO. So I've
made the following tweak to ioapic_service():
/*
* No matter whether IR is enabled, we translate
* the IOAPIC message into a MSI one, and its
* address space will decide whether we need a
* translation.
*
* As it is an access from something other than the
* CPU or a PCI device we set its source as machine
* specific.
*/
{
MemTxAttrs attrs = MEMTXATTRS_MACHINE(MEMTX_IOAPIC);
MemTxResult res;
address_space_stl_le(ioapic_as, info.addr, info.data,
attrs, &res);
if (res != MEMTX_ERROR) {
qemu_log_mask(LOG_GUEST_ERROR,
"%s: couldn't write to %"PRIx32"\n", __func__, info.addr);
}
}
and I had already tweaked the pci_msi_trigger so:
static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
{
MemTxAttrs attrs = {
.requester_type = MTRT_PCI,
.requester_id = pci_requester_id(dev)
};
address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
attrs, NULL);
}
> No expert on that, but per my understanding ioapic isn't really bound to
> any apic, so it doesn't need any cpu context. As a quick reference of
> that, one can look at Intel SDM Vol 3 Chap 10, figure 10.3 will be a
> generic modern x86_64 system APIC structure.
>
> In hardware there should have a 3-wire apic bus that take care of all the
> messaging, including at least not only ioapic irqs to any cores, or IPIs
> between the cores. The messages coming from the ioapic should not require
> any apic too as it can come from devices, just like what we do with qemu
> when the device does things like pci_set_irq(), iiuc.
<snip>
>
> AFAICT apic_mem_write() doesn't mean that this cpu will take this IRQ. The
> target core to respond to the IRQ will be defined in the dest ID field of
> either an MSI message or embeded in the IOAPIC entry being setup by the
> guest driver. E.g. MSI message format can also be found in SDM Vol 3 chap
> 10.11.1, in QEMU we can refer to "dest" field of apic_send_msi().
So now the start of apic_mem_write checks and validates MSIs first:
static MemTxResult apic_mem_write(void *opaque, hwaddr addr, uint64_t val,
unsigned int size, MemTxAttrs attrs)
{
DeviceState *dev;
APICCommonState *s;
int index = (addr >> 4) & 0xff;
if (size < 4) {
return MEMTX_ERROR;
}
/*
* MSI and MMIO APIC are at the same memory location, but actually
* not on the global bus: MSI is on PCI bus APIC is connected
* directly to the CPU.
*
* We can check the MemTxAttrs to check they are coming from where
* we expect. Even though the MSI registers are reserved in APIC
* MMIO and vice versa they shouldn't respond to CPU writes.
*/
if (addr > 0xfff || !index) {
switch (attrs.requester_type) {
case MTRT_MACHINE:
/* should be from the directly wired IOPIC */
if (attrs.requester_id != MEMTX_IOAPIC) {
qemu_log_mask(LOG_GUEST_ERROR,
"%s: rejecting machine write from something other that IOPIC (%x)",
__func__, attrs.requester_id);
return MEMTX_ACCESS_ERROR;
}
break;
case MTRT_PCI:
/* PCI signalled MSI */
break;
case MTRT_UNSPECIFIED:
qemu_log_mask(LOG_GUEST_ERROR,
"%s: rejecting unspecified write", __func__);
return MEMTX_ACCESS_ERROR;
case MTRT_CPU:
/* can CPU directly trigger MSIs? */
break;
}
MSIMessage msi = { .address = addr, .data = val };
apic_send_msi(&msi);
return MEMTX_OK;
}
if (attrs.requester_type != MTRT_CPU) {
return MEMTX_ACCESS_ERROR;
}
dev = cpu_get_current_apic(attrs.requester_id);
s = APIC(dev);
trace_apic_mem_writel(addr, val);
which at least gets things booting properly. Does this seem like a
better modelling of the APIC behaviour?
--
Alex Bennée
next prev parent reply other threads:[~2022-11-11 12:33 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-10 17:01 should ioapic_service really be modelling cpu writes? Alex Bennée
2022-11-10 17:55 ` Alex Bennée
2022-11-10 22:42 ` Peter Xu
2022-11-11 11:08 ` Paolo Bonzini
2022-11-11 12:26 ` Alex Bennée [this message]
2022-11-11 13:14 ` Paolo Bonzini
2022-11-11 14:00 ` Alex Bennée
2022-11-11 15:57 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r0y9d623.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=peterx@redhat.com \
--cc=philippe.mathieu-daude@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.