qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Peter Xu <peterx@redhat.com>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philippe.mathieu-daude@linaro.org>,
	"Peter Maydell" <peter.maydell@linaro.org>
Subject: Re: should ioapic_service really be modelling cpu writes?
Date: Fri, 11 Nov 2022 12:26:21 +0000	[thread overview]
Message-ID: <87r0y9d623.fsf@linaro.org> (raw)
In-Reply-To: <Y21+VFqKpF6LGz2C@x1n>


Peter Xu <peterx@redhat.com> writes:

> Hi, Alex,
>
> On Thu, Nov 10, 2022 at 05:55:51PM +0000, Alex Bennée wrote:
>> 
>> Alex Bennée <alex.bennee@linaro.org> writes:
>> 
>> > Hi,
>> >
>> > I've been trying to remove current_cpu hacks from our hw/ emulation and
>> > replace them with an explicit cpu_index derived from MemTxAttrs. So far
>> > this has been going mostly ok but I've run into problems on x86 due to
>> > the way the apic/ioapic are modelled. It comes down to the function
>> > ioapic_service() which eventually does:
>> >
>> >    /* No matter whether IR is enabled, we translate
>> >     * the IOAPIC message into a MSI one, and its
>> >     * address space will decide whether we need a
>> >     * translation. */
>> >    stl_le_phys(ioapic_as, info.addr, info.data);
>> >
>> > Which essentially calls the memory API to simulate a memory write.
>> > However to generate a properly formed MemTxAttrs we need to know what
>> > CPU we are running on. In the case of ioapic_service() we may well be in
>> > the main thread either for an expiring timer:
>> >
>> >  hpet_timer->qemu_set_irq->ioapic_set_irq
>> >
>> > or for reset handling:
>> >
>> >  pc_machine_reset->hpet_reset->qemu_set_irq->ioapic_set_irq
>> >
>> > neither of which can get a associated CPU. I assume if the actual writes
>> > are triggered we never actually actioned stuff because we had:
>> >
>> >   DeviceState *cpu_get_current_apic(void)
>> >   {
>> >       if (current_cpu) {
>> >           X86CPU *cpu = X86_CPU(current_cpu);
>> >           return cpu->apic_state;
>> >       } else {
>> >           return NULL;
>> >       }
>> >   }
>> >
>> > which basically causes the updates to be dropped on the floor.
>
> I think it shouldn't?  Normally the irq will be in MSI format (IOAPIC will
> translate to an MSI in QEMU, per ioapic_entry_parse()).
>
> I had a feeling that it'll just go the shortcut here (MSI always starts
> with 0xfeeXXXXX so definitely bigger than 0xfff):
>
>     if (addr > 0xfff || !index) {
>         /* MSI and MMIO APIC are at the same memory location,
>          * but actually not on the global bus: MSI is on PCI bus
>          * APIC is connected directly to the CPU.
>          * Mapping them on the global bus happens to work because
>          * MSI registers are reserved in APIC MMIO and vice versa. */
>         MSIMessage msi = { .address = addr, .data = val };
>         apic_send_msi(&msi);
>         return;
>     }
>
> apic_send_msi() doesn't need a cpu context.

Ahh so yes maybe my changes where too quick to reject the MMIO. So I've
made the following tweak to ioapic_service():

                /*
                 * No matter whether IR is enabled, we translate
                 * the IOAPIC message into a MSI one, and its
                 * address space will decide whether we need a
                 * translation.
                 *
                 * As it is an access from something other than the
                 * CPU or a PCI device we set its source as machine
                 * specific.
                 */
                {
                    MemTxAttrs attrs = MEMTXATTRS_MACHINE(MEMTX_IOAPIC);
                    MemTxResult res;

                    address_space_stl_le(ioapic_as, info.addr, info.data,
                                         attrs, &res);
                    if (res != MEMTX_ERROR) {
                        qemu_log_mask(LOG_GUEST_ERROR,
                                      "%s: couldn't write to %"PRIx32"\n", __func__, info.addr);
                    }
                }

and I had already tweaked the pci_msi_trigger so:

  static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
  {
      MemTxAttrs attrs = {
          .requester_type = MTRT_PCI,
          .requester_id = pci_requester_id(dev)
      };
      address_space_stl_le(&dev->bus_master_as, msg.address, msg.data,
                           attrs, NULL);
  }

> No expert on that, but per my understanding ioapic isn't really bound to
> any apic, so it doesn't need any cpu context.  As a quick reference of
> that, one can look at Intel SDM Vol 3 Chap 10, figure 10.3 will be a
> generic modern x86_64 system APIC structure.
>
> In hardware there should have a 3-wire apic bus that take care of all the
> messaging, including at least not only ioapic irqs to any cores, or IPIs
> between the cores.  The messages coming from the ioapic should not require
> any apic too as it can come from devices, just like what we do with qemu
> when the device does things like pci_set_irq(), iiuc.
<snip>
>
> AFAICT apic_mem_write() doesn't mean that this cpu will take this IRQ.  The
> target core to respond to the IRQ will be defined in the dest ID field of
> either an MSI message or embeded in the IOAPIC entry being setup by the
> guest driver.  E.g. MSI message format can also be found in SDM Vol 3 chap
> 10.11.1, in QEMU we can refer to "dest" field of apic_send_msi().


So now the start of apic_mem_write checks and validates MSIs first:

  static MemTxResult apic_mem_write(void *opaque, hwaddr addr, uint64_t val,
                                    unsigned int size, MemTxAttrs attrs)
  {
      DeviceState *dev;
      APICCommonState *s;
      int index = (addr >> 4) & 0xff;

      if (size < 4) {
          return MEMTX_ERROR;
      }

      /*
       * MSI and MMIO APIC are at the same memory location, but actually
       * not on the global bus: MSI is on PCI bus APIC is connected
       * directly to the CPU.
       *
       * We can check the MemTxAttrs to check they are coming from where
       * we expect. Even though the MSI registers are reserved in APIC
       * MMIO and vice versa they shouldn't respond to CPU writes.
       */
      if (addr > 0xfff || !index) {
          switch (attrs.requester_type) {
          case MTRT_MACHINE:
              /* should be from the directly wired IOPIC */
              if (attrs.requester_id != MEMTX_IOAPIC) {
                  qemu_log_mask(LOG_GUEST_ERROR,
                                "%s: rejecting machine write from something other that IOPIC (%x)",
                                __func__, attrs.requester_id);
                  return MEMTX_ACCESS_ERROR;
              }
              break;
          case MTRT_PCI:
              /* PCI signalled MSI */
              break;
          case MTRT_UNSPECIFIED:
              qemu_log_mask(LOG_GUEST_ERROR,
                            "%s: rejecting unspecified write", __func__);
              return MEMTX_ACCESS_ERROR;
          case MTRT_CPU:
              /* can CPU directly trigger MSIs? */
              break;
          }
          MSIMessage msi = { .address = addr, .data = val };
          apic_send_msi(&msi);
          return MEMTX_OK;
      }

      if (attrs.requester_type != MTRT_CPU) {
          return MEMTX_ACCESS_ERROR;
      }
      dev = cpu_get_current_apic(attrs.requester_id);
      s = APIC(dev);

      trace_apic_mem_writel(addr, val);

which at least gets things booting properly. Does this seem like a
better modelling of the APIC behaviour?

-- 
Alex Bennée


  parent reply	other threads:[~2022-11-11 12:33 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-10 17:01 should ioapic_service really be modelling cpu writes? Alex Bennée
2022-11-10 17:55 ` Alex Bennée
2022-11-10 22:42   ` Peter Xu
2022-11-11 11:08     ` Paolo Bonzini
2022-11-11 12:26     ` Alex Bennée [this message]
2022-11-11 13:14       ` Paolo Bonzini
2022-11-11 14:00         ` Alex Bennée
2022-11-11 15:57           ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r0y9d623.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=peterx@redhat.com \
    --cc=philippe.mathieu-daude@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).