From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
"Andreas Färber" <afaerber@suse.de>
Subject: Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics
Date: Sat, 12 Apr 2014 02:30:02 +1000
Message-ID: <5348188A.5040202@ozlabs.ru>
In-Reply-To: <53481512.5020503@suse.de>
On 04/12/2014 02:15 AM, Alexander Graf wrote:
>
> On 11.04.14 18:01, Alexey Kardashevskiy wrote:
>> On 04/12/2014 01:38 AM, Alexander Graf wrote:
>>> On 11.04.14 17:27, Alexey Kardashevskiy wrote:
>>>> On 04/12/2014 12:58 AM, Alexander Graf wrote:
>>>>> On 11.04.14 16:50, Alexey Kardashevskiy wrote:
>>>>>> On 04/11/2014 11:58 PM, Alexander Graf wrote:
>>>>>>> On 11.04.2014, at 14:38, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>>>>
>>>>>>>> On 04/11/2014 07:24 PM, Alexander Graf wrote:
>>>>>>>>> On 10.04.14 16:43, Alexey Kardashevskiy wrote:
>>>>>>>>>> On 04/10/2014 11:26 PM, Alexander Graf wrote:
>>>>>>>>>>> On 10.04.14 15:24, Alexey Kardashevskiy wrote:
>>>>>>>>>>>> On 04/10/2014 10:51 PM, Alexander Graf wrote:
>>>>>>>>>>>>> On 14.03.14 05:18, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>>> The current allocator returns IRQ numbers from a pool and
>>>>>>>>>>>>>> does not
>>>>>>>>>>>>>> support IRQs reuse in any form as it did not keep track of
>>>>>>>>>>>>>> what it
>>>>>>>>>>>>>> previously returned, it only had the last returned IRQ.
>>>>>>>>>>>>>> However migration may change interrupts for devices depending on
>>>>>>>>>>>>>> their order in the command line.
>>>>>>>>>>>>> Wtf? Nonono, this sounds very bogus and wrong. Migration
>>>>>>>>>>>>> shouldn't
>>>>>>>>>>>>> change
>>>>>>>>>>>>> anything.
>>>>>>>>>>>> I put wrong commit message. By change I meant that the default
>>>>>>>>>>>> state
>>>>>>>>>>>> before
>>>>>>>>>>>> the destination guest started accepting migration is different
>>>>>>>>>>>> from
>>>>>>>>>>>> what
>>>>>>>>>>>> the destination guest became after migration finished. And
>>>>>>>>>>>> migration
>>>>>>>>>>>> cannot
>>>>>>>>>>>> avoid changing this default state.
>>>>>>>>>>> Ok, why is the IRQ configuration different?
>>>>>>>>>> Because QEMU creates devices in the order as in the command line,
>>>>>>>>>> and
>>>>>>>>>> libvirt changes this order - the XML used to create the guest and
>>>>>>>>>> the
>>>>>>>>>> XML
>>>>>>>>>> which it sends during migration are different. libvirt thinks it
>>>>>>>>>> is ok
>>>>>>>>>> while it keeps @reg property for (for example) spapr-vscsi devices
>>>>>>>>>> but it
>>>>>>>>>> is not because since the order is different, devices call IRQ
>>>>>>>>>> allocator in
>>>>>>>>>> different order and get different IRQs.
>>>>>>>>> So your patch migrates the current IRQ configuration, but once you
>>>>>>>>> restart
>>>>>>>>> the virtual machine on the destination host it will have different
>>>>>>>>> IRQ
>>>>>>>>> numbering again, right?
>>>>>>>> No, why? IRQs are assigned at init time from realize() callbacks (and
>>>>>>>> survive reset) or as a part of ibm,change-msi rtas call which
>>>>>>>> happens in
>>>>>>>> the same order as it only depends on pci addresses and we do not
>>>>>>>> change
>>>>>>>> this either.
>>>>>>> Ok, let me rephrase. If I shut the machine down because I'm doing
>>>>>>> on-disk hibernate and then boot it back up, will the guest find the
>>>>>>> same
>>>>>>> configuration?
>>>>>> I do not understand what you mean by this. Hibernation by the guest OS
>>>>>> itself or by QEMU? If this involves QEMU exit and QEMU start - then yes,
>>>>> by the guest OS. The host will only see a genuine "shutdown" event. The
>>>>> guest OS will expect the machine to look *the exact same* as before the
>>>>> shutdown.
>>>> Ok. So. I have to implement "irq" property everywhere (PHB is missing
>>>> INTA/B/C/D now) and check if they did not change during migration via
>>>> those
>>> Hrm. Not sure. Maybe it'd make sense to join next week's call on platform
>>> device creation. The problem seems pretty closely related.
>> What are those platform devices and what are you going to discuss exactly?
>
> Devices that don't have a unified interrupt routing scheme like PCI where
> you just link lines A/B/C/D to your controller and you're good to go.
Ah. VIO in my case.
>>>> VMSTATE.*EQUAL. Correct?
>>> Why would you need this? I think we already said a couple dozen times that
>>> configuration matching is a bigger problem, no?
>> For debug! It is not needed in general, yes.
>>
>>
>>>> If so (more or less), I still would like to keep patches 1..7.
>>>> In fact, the first one is independent and we need it anyway.
>>>> Yes/no?
>>> Why?
>> IOMMUs do not migrate correctly - they only have a class name and
>> instance_id and this instance_id depends on command line arguments order.
>> The #1 patch makes it classname + liobn.
>
> Why do we need a bus for that?
For BusClass::get_dev_path callback to get a unique name.
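
Roughly, as a standalone sketch of the naming problem (this is not QEMU
code: struct iommu_table, key_by_instance() and key_by_liobn() are made-up
names for illustration, and the key format is an assumption):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an IOMMU table object; not QEMU's real struct. */
struct iommu_table {
    uint32_t liobn;    /* logical I/O bus number, fixed by the machine config */
    int instance_id;   /* creation index, depends on command line order */
};

/* Order-dependent key: class name plus creation index. */
static void key_by_instance(const struct iommu_table *t, char *buf, size_t len)
{
    assert(t != NULL && buf != NULL && len > 0);
    snprintf(buf, len, "spapr_iommu/%d", t->instance_id);
}

/* Order-independent key: class name plus LIOBN (assumed format). */
static void key_by_liobn(const struct iommu_table *t, char *buf, size_t len)
{
    assert(t != NULL && buf != NULL && len > 0);
    snprintf(buf, len, "spapr_iommu/liobn@%08x", (unsigned)t->liobn);
}
```

If the same table is created first on the source and second on the
destination, key_by_instance() gives two different names for it while
key_by_liobn() gives the same one on both sides.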
>>>>>> config may be different. If it is "migrate to file" and then "migrate
>>>>>> from
>>>>>> file" (do not know what you call it when migration goes to a pipe
>>>>>> which is
>>>>>> "tar") - then config will be the same.
>>>>>>
>>>>>>
>>>>>>>>> I'm not sure that's a good solution to the problem. I guess we should
>>>>>>>>> rather aim to make sure that we can make IRQ allocation explicit.
>>>>>>>>> Fundamentally the problem sounds very similar to the PCI slot
>>>>>>>>> allocation
>>>>>>>>> which eventually got solved by libvirt specifying the slots manually.
>>>>>>>> We can do that too. Who decides? :)
>>>>>>> The better solution wins :)
>>>>>> We both know who decides ;) I posted series, I need heads up if it is
>>>>>> going
>>>>>> the right way or not.
>>>>> It's not :). If a guest may not have different IRQ allocation after
>>>>> migration, it also must not have different IRQ allocation after
>>>>> shutdown +
>>>>> restart.
>>>> Ok. That's good answer, thanks. How does x86 work then? IRQs are hardcoded
>>>> (some are for sure but I do not know about MSI)? Or in order to support
>>> Non-PCI IRQs are hardcoded, yes. PCI IRQs are mapped to one of the 4 PCI
>>> interrupts which again are hardcoded to IOAPIC interrupt lines after some
>>> PCI line swizzling.
>> This is what I meant - I need to have a way to tell PHB IRQ numbers for
>> INTA/B/C/D.
>
> Yes, just like platform devices ;).
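
To make sure we mean the same thing, here is a toy model of the two
approaches (not the actual spapr/xics code: IRQ_BASE, IRQ_NUM, struct
irq_pool, irq_alloc_next() and irq_claim() are all invented for this
sketch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQ_BASE 16   /* hypothetical first allocatable IRQ */
#define IRQ_NUM  8    /* hypothetical pool size */

struct irq_pool {
    bool used[IRQ_NUM];
    int next;          /* cursor for the sequential allocator */
};

/* Sequential allocation: the result depends on how many callers came before. */
static int irq_alloc_next(struct irq_pool *p)
{
    assert(p != NULL);
    /* Skip entries that were already claimed explicitly. */
    while (p->next < IRQ_NUM && p->used[p->next]) {
        p->next++;
    }
    if (p->next >= IRQ_NUM) {
        return -1;
    }
    p->used[p->next] = true;
    return IRQ_BASE + p->next++;
}

/* Explicit allocation: the caller names the IRQ; fails if invalid or taken. */
static int irq_claim(struct irq_pool *p, int irq)
{
    assert(p != NULL);
    int idx = irq - IRQ_BASE;
    if (idx < 0 || idx >= IRQ_NUM || p->used[idx]) {
        return -1;
    }
    p->used[idx] = true;
    return irq;
}
```

With irq_alloc_next() two devices swap IRQs as soon as their creation order
is swapped; with irq_claim() each device gets the number it asks for in
either order, which is what an "irq" property would rely on.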
--
Alexey