From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: chien.yen@oracle.com, joe.jin@oracle.com,
zhenzhong.duan@oracle.com, yuval.shaia@oracle.com,
xen-devel@lists.xen.org
Subject: Is: Telling QEMU to re-use PIRQ value Was: Re: [PATCH 1/2] xen, libxc: init msix addr/data with value from qemu via hypercall
Date: Fri, 10 May 2013 15:03:03 -0400 [thread overview]
Message-ID: <20130510190303.GF2562@phenom.dumpdata.com> (raw)
In-Reply-To: <518CC42202000078000D4FFA@nat28.tlf.novell.com>
On Fri, May 10, 2013 at 08:55:46AM +0100, Jan Beulich wrote:
> >>> On 10.05.13 at 09:39, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>
> > On 2013-05-10 14:37, Jan Beulich wrote:
> >>>>> On 10.05.13 at 04:49, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
> >>> On 2013-05-10 03:05, Jan Beulich wrote:
> >>>>>>> Zhenzhong Duan <zhenzhong.duan@oracle.com> 05/09/13 5:02 AM >>>
> >>>>> On 2013/5/8 20:03, Jan Beulich wrote:
> >>>>>> But of course I still don't really understand why all of the sudden
> >>>>>> this needs to be passed in rather than being under the full control
> >>>>>> of the hypervisor at all times. Perhaps this is related to me not
> >>>>>> understanding why the kernel would read these values at all:
> >>>>>> There's no other place in the kernel where the message would
> >>>>>> be read before first getting written (in fact, apart from the
> >>>>>> use of __read_msi_msg() by the Xen code, there's only one
> >>>>>> other user under arch/powerpc/, and there - according to the
> >>>>>> accompanying comment - this is just to save away the data for
> >>>>>> later use during resume).
> >>>>> There is a bug if msi_ad is not passed in.
> >>>>>
> >>>>> when driver first load,
> >>>>>
> >>>>> kernel.__read_msi_msg()
> >>>>> (got all zero)
> >>>> But you don't even comment on the apparently bogus use of the function here.
> >>> This pattern is used only when hvm_pirq is enabled. kernel need to check
> >>> XEN_PIRQ_MSI_DATA.
> >>> It's not a issue if data is 0 at first driver load, kernel will call
> >>> __write_msi_msg with pirq and XEN_PIRQ_MSI_DATA set.
> >> But this doesn't make the use of __read_msi_msg() less bogus. It's
> >> not clear on what basis this mechanism got invented in the first
> >> place.
> > It's there since hvm_irq introduced. But it works indeed.
>
> But that doesn't in any way mean the concept is sound.
>
> >>>>> kernel.__write_msi_msg(pirq)
> >>>>> (ioreq passed to qemu as no msixtbl_entry established yet)
> >>>>> qemu.pt_msi_update_one()
> >>>>> xc_domain_update_msi_irq()
> >>>>> (msixtbl_entry dynamicly allocated with msi_ad all zero)
> >>>>>
> >>>>> then driver unload,
> >>>>> ...
> >>>>> driver load again,
> >>>>>
> >>>>> kernel.__read_msi_msg()
> >>>>> (got all zero from xen as accelerated entry just established with all zero)
> >>>> If all zeroes get returned, why would the flow here be different then above?
> >>> Because pirq and related mapping and binding are not freed between
> >>> driver load-unload-load. They are freed when device detach.
> >>> We should try to use the last pirq.
> >> But then you need to solve the problem generically, i.e. not just
> >> for the driver reload case, but also for e.g. the kexec one (where
> >> __read_msi_msg() returning other than all zeros wouldn't help you
> >> as xen_irq_from_pirq() would then return -1, and you'd be back to
> >> the same problem.
> > No, not only kexec ones, it's driver unload that makes xen_irq_from_pirq
> > return -1. So there is also a bug in kernel side.
> > I have sent a patch about kernel. I think you miss it.
> > http://www.gossamer-threads.com/lists/xen/devel/281498
> >> IOW I think the prior IRQ needs to be freed
> >> anyway rather than an attempt be made to reuse it.
> > I have ever thought about this idea, but when to free the pirq is a problem.
> > When driver unload? qemu has no idea of if driver unloaded.
>
> But the kernel does, and hence could deal with this. As much as
> the setup is being done when the driver gets loaded, cleanup
> should be done when the driver gets unloaded. _If_ there
> already is such an odd protocol between kernel and qemu, then
> if that can't be dropped, it surely can be leveraged to also deal
> with the cleanup side of things? No need to fiddle with the
I don't know if such thing exists. Stefano, is there a way
to tell QEMU to re-use the PIRQ? Writting zero to the MSI?
> hypervisor interfaces for something that it's not supposed to
> know about anyway.
>
> > When msix entry masked? kernel mask and unmask msix entry
> > intermittently, especially when irqbalance enabled.
> >
> > So based on above, I think it's better to reuse same pirq, only free it
> > when device detached.
>
> I continue to disagree. Also from a theoretical perspective - if you
> have a lot of devices that no driver is loaded for, you'd keep a lot
> of IRQs allocated without any need.
The guest has to use PHYSDEVOP_get_free_pirq to allocate it. And
in this case we don't have a 'free_pirq' hypercall to release it.
The Linux Xen<->IRQ drivers drops all of the information it has on
the PIRQ once the driver is unloaded (rightly so - the driver after
does not need the IRQ anymore and the PIRQ<->events connection has
been broken).
I wrote a tiny patch that needs improvements that would cache the
last seen BDF and PIRQ (that part is missing). That would allow us
to re-use the PIRQ and not call PHYSDEVOP_get_free_pirq until we
exhaust the allocation we have.
In other words - this can be fixed in the kernel.
But if there is a 'magic' value that can be written to QEMU to tell
it to re-use the PIRQ.. that would good too.
>
> Jan
prev parent reply other threads:[~2013-05-10 19:03 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-08 8:17 [PATCH 1/2] xen, libxc: init msix addr/data with value from qemu via hypercall Zhenzhong Duan
2013-05-08 9:39 ` Jan Beulich
2013-05-08 10:00 ` Zhenzhong Duan
2013-05-08 12:03 ` Jan Beulich
2013-05-09 3:02 ` Zhenzhong Duan
2013-05-09 19:05 ` Jan Beulich
2013-05-10 2:49 ` Zhenzhong Duan
2013-05-10 6:37 ` Jan Beulich
2013-05-10 7:39 ` Zhenzhong Duan
2013-05-10 7:55 ` Jan Beulich
2013-05-10 8:22 ` Zhenzhong Duan
2013-05-10 19:03 ` Konrad Rzeszutek Wilk [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130510190303.GF2562@phenom.dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=JBeulich@suse.com \
--cc=chien.yen@oracle.com \
--cc=joe.jin@oracle.com \
--cc=stefano.stabellini@eu.citrix.com \
--cc=xen-devel@lists.xen.org \
--cc=yuval.shaia@oracle.com \
--cc=zhenzhong.duan@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.