From: Alexander Graf <agraf@suse.de>
To: Alexey Kardashevskiy <aik@ozlabs.ru>, qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Date: Thu, 22 May 2014 12:57:51 +0200 [thread overview]
Message-ID: <537DD82F.4020402@suse.de> (raw)
In-Reply-To: <1400756006-26297-1-git-send-email-aik@ozlabs.ru>
On 22.05.14 12:53, Alexey Kardashevskiy wrote:
> On 05/22/2014 05:16 PM, Alexander Graf wrote:>
>>> Am 22.05.2014 um 08:53 schrieb Alexey Kardashevskiy <aik@ozlabs.ru>:
>>>
>>>> On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote:
>>>>> On 05/21/2014 08:35 PM, Alexander Graf wrote:
>>>>>
>>>>>> On 21.05.14 12:13, Alexey Kardashevskiy wrote:
>>>>>>> On 05/21/2014 07:50 PM, Alexander Graf wrote:
>>>>>>>> On 21.05.14 11:33, Alexey Kardashevskiy wrote:
>>>>>>>>> On 05/21/2014 07:13 PM, Alexander Graf wrote:
>>>>>>>>>> On 21.05.14 11:11, Michael S. Tsirkin wrote:
>>>>>>>>>>> On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote:
>>>>>>>>>>>> On 21.05.14 10:52, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>> On 05/21/2014 06:40 PM, Alexander Graf wrote:
>>>>>>>>>>>>>> On 15.05.14 11:59, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>>> Currently SPAPR PHB keeps track of all allocated MSI/MISX
>>>>>>>>>>>>>> interrupt as
>>>>>>>>>>>>>> XICS used to be unable to reuse interrupts which becomes a
>>>>>>>>>>>>>> problem for
>>>>>>>>>>>>>> dynamic MSI reconfiguration which is happening on guest driver
>>>>>>>>>>>>>> reload or
>>>>>>>>>>>>>> PCI hot (un)plug. Another problem is that PHB has a limit of devices
>>>>>>>>>>>>>> supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good
>>>>>>>>>>>>>> reason
>>>>>>>>>>>>>> for that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This makes use of new XICS ability to reuse interrupts.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This removes cached MSI configuration from SPAPR PHB so the first
>>>>>>>>>>>>>> IRQ
>>>>>>>>>>>>>> number
>>>>>>>>>>>>>> of a device is stored in MSI/MSIX config space so there is no
>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>> store
>>>>>>>>>>>>>> this anywhere else. From now on, SPAPR PHB only keeps flags telling
>>>>>>>>>>>>>> what
>>>>>>>>>>>>>> type
>>>>>>>>>>>>>> of interrupt for which device it has configured in order to return
>>>>>>>>>>>>>> error if
>>>>>>>>>>>>>> (for example) MSIX was enabled and the guest is trying to disable
>>>>>>>>>>>>>> MSI
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>> it has not enabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This removes a limit for the maximum number of MSIX-enabled devices
>>>>>>>>>>>>>> per PHB,
>>>>>>>>>>>>>> now XICS and PCI bus capacity are the only limitation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This changes migration stream as it fixes
>>>>>>>>>>>>>> vmstate_spapr_pci_msi::name
>>>>>>>>>>>>>> which was
>>>>>>>>>>>>>> wrong since the beginning.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This fixed traces to be more informative.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In reality either MSIX or MSI is enabled, never both. So I could
>>>>>>>>>>>>>> remove
>>>>>>>>>>>>>> msi/msix
>>>>>>>>>>>>>> bitmaps from this patch, would it make sense?
>>>>>>>>>>>>> Is this a hard requirement? Does a device have to choose between
>>>>>>>>>>>>> MSIX and
>>>>>>>>>>>>> MSI or could it theoretically have both enabled? Is this a PCI
>>>>>>>>>>>>> limitation,
>>>>>>>>>>>>> a PAPR/XICS limitation or just a limitation of your implementation?
>>>>>>>>>>>> My implementation does not have this limitation, I asked if I can
>>>>>>>>>>>> simplify
>>>>>>>>>>>> code by introducing one :)
>>>>>>>>>>>>
>>>>>>>>>>>> I cannot see any reason why PCI cannot have both MSI and MSIX enabled
>>>>>>>>>>>> but
>>>>>>>>>>>> it does not seem to be used by anyone => cannot debug and confirm.
>>>>>>>>>>>>
>>>>>>>>>>>> PAPR spec assumes that if the guest tries enabling MSIX when MSI is
>>>>>>>>>>>> already
>>>>>>>>>>>> enabled, this is a "change", not enabling both types. But it also
>>>>>>>>>>>> says MSI
>>>>>>>>>>>> and MSIX vector numbers are not shared. Hm.
>>>>>>>>>>> Yeah, I'm not aware of any limitation on hardware here and I'd
>>>>>>>>>>> rather not impose one.
>>>>>>>>>>>
>>>>>>>>>>> Michael, do you know of any hardware that uses MSI *and* MSI-X at
>>>>>>>>>>> the same time?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Alex
>>>>>>>>>> No, and the PCI spec says:
>>>>>>>>>> A function is permitted to implement both MSI and MSI-X, but
>>>>>>>>>> system
>>>>>>>>>> software is
>>>>>>>>>> prohibited from enabling both at the same time. If system software
>>>>>>>>>> enables both at the same time, the result is undefined.
>>>>>>>>> Ah, cool. So yes Alexey, feel free to impose it :).
>>>>>>>> Heh. This solves just half of the problem - I still have to keep track of
>>>>>>>> what device got MSI/MSIX configured via that ibm,change-msi interface. I
>>>>>>>> was hoping I can store such flag somewhere in a device PCI config space
>>>>>>>> but
>>>>>>>> MSI/MSIX enable bit is not good as it is not set when those calls are
>>>>>>>> made.
>>>>>>>> And I cannot rely on address/data fields much as the guest can change them
>>>>>>>> (I already use them to store IRQ numbers and btw it is missing checks when
>>>>>>>> I read them back for disposal, I'll fix in next round).
>>>>>>>>
>>>>>>>> Or on "enable" event I could put IRQ numbers to .data of MSI config space
>>>>>>>> and on "disable" check if it is not zero, then configuration took place,
>>>>>>>> then I can remove my msi[]/msix[] flag arrays. If the guest did any change
>>>>>>>> to MSI/MSIX config space (it does not on SPAPR except weird selftest
>>>>>>>> cases), I compare .data with what ICS can possibly have and either reject
>>>>>>>> "disable" or handle it and if it breaks XICS - that's too bad for the
>>>>>>>> stupid guest. Would that be acceptable?
>>>>>>> Can't you prohibit the guest from writing to the MSI configuration
>>>>>>> registers itself? Then you don't need to do sanity checks.
>>>>>> I could for emulated devices but VFIO uses the same code. For example,
>>>>>> there is an IBM SCSI IPR card which does a "self test". For that, it saves
>>>>>> MSIX BAR content, does reboot via some backdoor interface and restores MSIX
>>>>>> BAR. It has been solved for VFIO in the host kernel by restoring MSIX data
>>>>>> from cached values when guest is trying to restore it with what it thinks
>>>>>> is actual MSIX data (it is virtualized because of x86). But there is cache
>>>>> We already have a cache because we don't access the real PCI registers with
>>>>> msi_set_message(), no?
>>>>
>>>> For emulated devices there is no cache. And in any case the guest is
>>>> allowed to write to it... Who knows what AIX does? I do not.
>>>
>>> Tried GHashTable for keeping bus:dev.fn <-> (irq, num), more or less ok but
>>> how to migrate such thing? Temporary cache somewhere and then unpack it? Or
>>> use old style migration callbacks?
>> Could you try to introduce a new vmstate type that just serializes and
>> deserializes hash tables? Maybe there is already a serialization
>> function for it in glib?
> I have not found any (most likely I do not know how to search there),
> I added mine, here are VMSTATE_HASH + its use for SPAPR.
>
>
> Is this a movement to right direction? I need to put key/value sizes
> into VMSTATE definition somehow but do not really want to touch
> VMStateField.
I'm not sure. Juan?
Alex
next prev parent reply other threads:[~2014-05-22 10:58 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-15 9:59 [Qemu-devel] [PATCH v2 0/8] Move interrupts from spapr to xics Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 1/8] xics: Add flags for interrupts Alexey Kardashevskiy
2014-05-21 6:30 ` Alexey Kardashevskiy
2014-05-21 8:40 ` Alexander Graf
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 2/8] xics: Add xics_find_source() Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 3/8] xics: Disable flags reset on xics reset Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 4/8] spapr: Move interrupt allocator to xics Alexey Kardashevskiy
2014-05-21 8:34 ` Alexander Graf
2014-05-21 8:46 ` Alexey Kardashevskiy
2014-05-21 8:47 ` Alexander Graf
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 5/8] xics: Remove obsolete xics_set_irq_type() Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 6/8] spapr: Remove @next_irq Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 7/8] xics: Implement xics_ics_free() Alexey Kardashevskiy
2014-05-15 9:59 ` [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB Alexey Kardashevskiy
2014-05-21 8:40 ` Alexander Graf
2014-05-21 8:52 ` Alexey Kardashevskiy
2014-05-21 9:06 ` Alexander Graf
2014-05-21 9:11 ` Michael S. Tsirkin
2014-05-21 9:13 ` Alexander Graf
2014-05-21 9:33 ` Alexey Kardashevskiy
2014-05-21 9:38 ` Michael S. Tsirkin
2014-05-21 10:03 ` Alexey Kardashevskiy
2014-05-21 9:50 ` Alexander Graf
2014-05-21 10:13 ` Alexey Kardashevskiy
2014-05-21 10:35 ` Alexander Graf
2014-05-21 12:42 ` Alexey Kardashevskiy
2014-05-22 6:53 ` Alexey Kardashevskiy
2014-05-22 7:16 ` Alexander Graf
2014-05-22 10:53 ` Alexey Kardashevskiy
2014-05-22 10:53 ` [Qemu-devel] [PATCH 1/2] vmstate: Add helper to enable GHashTable migration Alexey Kardashevskiy
2014-05-22 10:57 ` Alexander Graf
2014-05-22 11:04 ` Alexey Kardashevskiy
2014-05-22 10:53 ` [Qemu-devel] [PATCH 2/2] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB Alexey Kardashevskiy
2014-05-22 10:57 ` Alexander Graf [this message]
2014-05-22 14:25 ` [Qemu-devel] [PATCH v2 8/8] " Alexey Kardashevskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=537DD82F.4020402@suse.de \
--to=agraf@suse.de \
--cc=aik@ozlabs.ru \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).