From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>, kvm <kvm@vger.kernel.org>,
Alex Williamson <alex.williamson@redhat.com>,
Jesse Barnes <jbarnes@virtuousgeek.org>
Subject: Re: [PATCH] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
Date: Tue, 10 Jan 2012 20:31:43 +0200
Message-ID: <20120110183143.GG17105@redhat.com>
In-Reply-To: <4F0C818D.9@siemens.com>
On Tue, Jan 10, 2012 at 07:21:01PM +0100, Jan Kiszka wrote:
> > ATM writes to msi/msix mask bit have no effect for assigned
> > devices. For virtio, they are implemented by deassigning irqfd
> > which is a very slow operation (rcu write side).
> >
> > Instead, when the guest writes to the mask, qemu can set/clear it by
> > calling this ioctl.
>
> Isn't that effort better invested in proper in-kernel mask emulation for
> MSI-X?
This gives us a working implementation for free. Whether MSI-X mask
writes are worth accelerating in the kernel I'm not 100% sure. But IMO
this shows it is a more generic interface.
> >
> >>>
> >>>> As long as the
> >>>> +guest masks the legacy INTx, the kernel will refrain from unmasking it at
> >>>> +hardware level and will not assert the guest's IRQ line. User space is still
> >>>> +responsible for applying this state to the assigned device's real config space.
> >>>
> >>> Can this be made more explicit? You mean writing into 1st
> >>> byte of PCI control, right?
> >>
> >> For sure, I can state this.
> >>
> >>>
> >>>> +To avoid that the kernel overwrites the state user space wants to set,
> >>>> +KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
> >>>
> >>> This looks like a strange requirement, could you explain how
> >>> this helps avoid races?
> >>
> >> By declaring the target state of the INTx bit first to the kernel,
> >> concurrent changes of the kernel while user space performs a
> >> read-modify-write will not lead to an old mask state being written.
> >
> > I note you don't require KVM_ASSIGN_SET_INTX_MASK before read though.
> > Further, userspace might cache the control byte. If we require
> > it not to do it, we probably need to be explicit?
>
> User space can do with the control byte what it wants - kernel can't
> help this anyway. I should just tell the kernel ahead of time what the
> next INTx mask state will be. That particularly avoids that the kernel
> sets the mask when user space wants it cleared. The other way around is
> actually unproblematic as we check KVM_ASSIGN_SET_INTX_MASK before
> delivering the IRQ to the guest.
>
> >
> >>> This also raises questions about
> >>> what should be done to write a bit unrelated to masking.
> >>
> >> Just write it, using the INTx state user space maintains. In the worst
> >> case, some masking done by the kernel in the meantime will be
> >> overwritten, leading to a single spurious but harmless IRQ. That event
> >> won't be delivered to the guest unless it is ready to receive it - as we
> >> updated the mask state prior to writing to the config space. The point
> >> is that the kernel mechanism has to deal with crazy user space clearing
> >> the mask for whatever reason again.
> >
> > I guess the point is that what we need to avoid is this:
> >
> > kernel masks bit
> > read
> > kernel unmasks bit
> > write
> >
> > I'm not sure I understand how the text above suggests
> > doing this in a race free manner.
>
> User space must not write INTx as read from the hardware but according
> to its own view. Then the above is harmless.
>
> >
> >
> > A simple way would be to ask userspace to always clear
> > this bit on writes. What do you think?
>
> That or - sounds more consistent - writing the state that user space
> exposes to the guest anyway. That (in addition to the ordering
> requirement) should be clearly stated in the doc, I agree.
>
> Jan
Yes, I agree it all works, it just needs clear documentation.
In summary, userspace must ignore the value of the INTx mask bit
it reads from the device and write its own view of that bit instead.
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
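
To make the rule concrete, here is a minimal sketch in Python that models
the sequence discussed above. It is an illustration only, not the kernel or
qemu code: ModelDevice, command_to_write and update_command are hypothetical
names, and set_intx_mask merely stands in for KVM_ASSIGN_SET_INTX_MASK. The
one hard fact assumed is that the INTx disable bit is bit 10 (0x0400) of the
PCI command register.

```python
PCI_COMMAND_INTX_DISABLE = 0x0400  # bit 10 of the PCI command register


def command_to_write(hw_command: int, guest_intx_masked: bool) -> int:
    """Rebuild the INTx disable bit from user space's view, not from hardware."""
    value = hw_command & 0xFFFF & ~PCI_COMMAND_INTX_DISABLE
    if guest_intx_masked:
        value |= PCI_COMMAND_INTX_DISABLE
    return value


class ModelDevice:
    """Hypothetical model: device command register plus kernel-side mask state."""

    def __init__(self) -> None:
        self.command = 0x0007       # I/O space, memory space, bus master
        self.guest_masked = False   # last state declared by user space

    def set_intx_mask(self, masked: bool) -> None:
        # Stands in for KVM_ASSIGN_SET_INTX_MASK; only records the target state.
        self.guest_masked = masked

    def kernel_mask(self) -> None:
        # The kernel masks INTx at the device after an interrupt fired.
        self.command |= PCI_COMMAND_INTX_DISABLE

    def kernel_unmask(self) -> None:
        # The kernel refrains from unmasking while the guest masks INTx.
        if not self.guest_masked:
            self.command &= ~PCI_COMMAND_INTX_DISABLE


def update_command(dev: ModelDevice, guest_masked: bool, set_bits: int = 0) -> None:
    """User-space config write: declare the mask state first, then write."""
    dev.set_intx_mask(guest_masked)              # 1. tell the kernel
    hw = dev.command                             # 2. read (mask bit is untrusted)
    dev.command = command_to_write(hw | set_bits, guest_masked)  # 3. write
```

In the problematic interleaving (kernel masks, user space reads, kernel
unmasks, user space writes), writing back the value as read would restore a
stale mask. With command_to_write the stale bit is dropped, so the worst
case is the single spurious IRQ Jan describes.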