From: Jan Kiszka <jan.kiszka@siemens.com>
To: Alexander Graf <agraf@suse.de>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Isaku Yamahata <yamahata@valinux.co.jp>
Subject: Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
Date: Tue, 14 Jun 2011 15:08:46 +0200 [thread overview]
Message-ID: <4DF75D5E.3000003@siemens.com> (raw)
In-Reply-To: <1D2C3E16-193F-4EEC-B7EF-AE038B4F2EE9@suse.de>
On 2011-06-14 14:30, Alexander Graf wrote:
>
> Am 14.06.2011 um 14:17 schrieb Ian Campbell <Ian.Campbell@citrix.com>:
>
>> On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
>>> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>>>
>>>> On Sat, 28 May 2011, Alexander Graf wrote:
>>>>>
>>>>> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>>>>>
>>>>>> xen: fix interrupt routing
>>>>>>
>>>>>> - remove i440FX-xen and i440fx_write_config_xen
>>>>>> we don't need to intercept pci config writes to i440FX anymore;
>>>>>
>>>>> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
>>>>
>>>> Nothing changed, I think it was a genuine mistake in the original patch
>>>> series: the intention was to catch the pci config writes to 0x60-0x63 to
>>>> reprogram the xen pci link routes accordingly (see
>>>> xen_piix_pci_write_config_client). If I am not mistaken a similar thing
>>>> is done by non-Xen Qemu in piix3_update_irq_levels.
>>>>
>>>>
>>>>>> - introduce PIIX3-xen and piix3_write_config_xen
>>>>>> we do need to intercept pci config write to the PCI-ISA bridge to update
>>>>>> the PCI link routing;
>>>>>>
>>>>>> - set the number of PIIX3-xen interrupts lines to 128;
>>>>>>
>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>>>>>
>>>>>> diff --git a/hw/pc.h b/hw/pc.h
>>>>>> index 0dcbee7..6d5730b 100644
>>>>>> --- a/hw/pc.h
>>>>>> +++ b/hw/pc.h
>>>>>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>>>>>> typedef struct PCII440FXState PCII440FXState;
>>>>>>
>>>>>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>>>>>
>>>>>> /* piix4.c */
>>>>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>>>>> index 9a22a8a..ba198de 100644
>>>>>> --- a/hw/pc_piix.c
>>>>>> +++ b/hw/pc_piix.c
>>>>>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>>>>> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>>>>>
>>>>>> if (pci_enabled) {
>>>>>> - if (!xen_enabled()) {
>>>>>> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> - } else {
>>>>>> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> - }
>>>>>> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> } else {
>>>>>> pci_bus = NULL;
>>>>>> i440fx_state = NULL;
>>>>>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>>>>>> index 7f1c4cc..3d73a42 100644
>>>>>> --- a/hw/piix_pci.c
>>>>>> +++ b/hw/piix_pci.c
>>>>>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>>>>>
>>>>>> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
>>>>>> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
>>>>>> +#define XEN_PIIX_NUM_PIRQS 128ULL
>>>>>> #define PIIX_PIRQC 0x60
>>>>>>
>>>>>> typedef struct PIIX3State {
>>>>>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>>>>>> #define I440FX_SMRAM 0x72
>>>>>>
>>>>>> static void piix3_set_irq(void *opaque, int pirq, int level);
>>>>>> +static void piix3_write_config_xen(PCIDevice *dev,
>>>>>> + uint32_t address, uint32_t val, int len);
>>>>>>
>>>>>> /* return the global irq number corresponding to a given device irq
>>>>>> pin. We could also use the bus number to have a more precise
>>>>>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> -static void i440fx_write_config_xen(PCIDevice *dev,
>>>>>> - uint32_t address, uint32_t val, int len)
>>>>>> -{
>>>>>> - xen_piix_pci_write_config_client(address, val, len);
>>>>>> - i440fx_write_config(dev, address, val, len);
>>>>>> -}
>>>>>> -
>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>> {
>>>>>> PCII440FXState *d = opaque;
>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>> d = pci_create_simple(b, 0, device_name);
>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>
>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>> + if (xen_enabled()) {
>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>>
>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>>
>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>> IO-APIC.
>>>> The four pins of each PCI device on the bus not only are routed to the
>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>> they are connected to the IO-APIC directly.
>>>> These additional routes can only be discovered through ACPI, so you need
>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>
>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>> printf("Name(PRTA, Package() {\n");
>>>> for ( dev = 1; dev < 32; dev++ )
>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>> printf("})\n");
>>>>
>>>
>>> Interesting concept, but completely non-standard and very much
>>> different from real hardware. Please at least add a comment there to
>>> show readers that Xen is doing a hack which is not at all related to
>>> how the PIIX really works.
>>
>> Isn't this more a function of the "wires" on the motherboard than the
>> PIIX specifically? i.e. this just encodes the permutation of the wires
>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>
> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>
> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
>
> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
>
> Did this really give you actual performance/latency/scalability gains? I still think for devices that matter, we should go with MSI rather than deriving from real hw.
I bet the motivation is to have an IRQ route that is independent of
QEMU, thus can be discovered inside the Xen kernel and then remains
stable - or is simply hard-wired. Device assignment? Direct legacy IRQ
injection from the kernel?
KVM (qemu-kvm) has that issue as well and uses a simple hack to get a
notification on routing changes. That works as long as there is only one
hub (the PIIX3) involved. We need something smarter on the long term,
though.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
next prev parent reply other threads:[~2011-06-14 13:09 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-26 15:48 [Qemu-devel] [PATCH] xen: fix interrupt routing Stefano Stabellini
2011-05-27 23:17 ` Alexander Graf
2011-05-31 11:05 ` Stefano Stabellini
2011-06-14 9:25 ` Alexander Graf
2011-06-14 12:17 ` [Qemu-devel] [Xen-devel] " Ian Campbell
2011-06-14 12:30 ` Alexander Graf
2011-06-14 13:08 ` Jan Kiszka [this message]
2011-06-14 13:31 ` [Qemu-devel] " Stefano Stabellini
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2011-06-14 15:30 ` [Qemu-devel] " Jan Kiszka
2011-06-14 18:18 ` Stefano Stabellini
2011-06-15 8:24 ` Alexander Graf
2011-06-15 16:34 ` [Qemu-devel] [Xen-devel] " Avi Kivity
2011-06-15 16:54 ` Alexander Graf
2011-06-15 8:16 ` Alexander Graf
2011-06-15 16:32 ` Stefano Stabellini
-- strict thread matches above, loose matches on Subject: below --
2011-05-18 17:53 [Qemu-devel] " stefano.stabellini
2011-05-19 2:14 ` Isaku Yamahata
2011-05-19 19:16 ` Stefano Stabellini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DF75D5E.3000003@siemens.com \
--to=jan.kiszka@siemens.com \
--cc=Ian.Campbell@citrix.com \
--cc=Stefano.Stabellini@eu.citrix.com \
--cc=agraf@suse.de \
--cc=qemu-devel@nongnu.org \
--cc=xen-devel@lists.xensource.com \
--cc=yamahata@valinux.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).