From: "Michael S. Tsirkin" <mst@redhat.com>
To: Gleb Natapov <gleb@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] kvm: enable irq injection from interrupt context
Date: Thu, 16 Sep 2010 15:51:37 +0200 [thread overview]
Message-ID: <20100916135137.GB24850@redhat.com> (raw)
In-Reply-To: <20100916131823.GH3008@redhat.com>
On Thu, Sep 16, 2010 at 03:18:23PM +0200, Gleb Natapov wrote:
> On Thu, Sep 16, 2010 at 02:57:17PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Sep 16, 2010 at 02:33:01PM +0200, Gleb Natapov wrote:
> > > On Thu, Sep 16, 2010 at 02:13:38PM +0200, Michael S. Tsirkin wrote:
> > > > > > We haver two users: qemu does deasserts, vhost-net does asserts.
> > > > > Well this is broken. You want KVM to track level for you and this is
> > > > > wrong. KVM does this anyway because it can't relay on devise model
> > > > > to behave correctly [0], but in your case it is designed to behave
> > > > > incorrectly.
> > > > >
> > > > > Interrupt type is a device property. PCI devices just happen to be level
> > > > > triggered according to PCI spec. What if you want to use vhost-net to
> > > > > implement network device which has active-low interrupt line? [1]
> > > >
> > > > The polarity would have to be reversed in gsi (irq line can be shared,
> > > > all devices must be active high or low consistently).
> > > >
> > > There are gsi dedicated to PCI. They can be shared only between PCI
> > > devices.
> > >
> > > > > If you want to split parts that asserts irq and de-asserts it then we
> > > > > should have irqfd that tracks line status and knows interrupt line
> > > > > polarity.
> > > >
> > > > Yes, it can know about polarity even though I think it's cleaner to do this
> > > > per gsi. But it can not track line status as line is shared with
> > > > other devices.
> > > It should track only device's line status.
> >
> > There is no such thing as device's line status on real hardware, either.
> > Devices do not drive INT# high: they drive it low (all the time)
> > or do not drive it at all.
> Same thing, other naming. Device either drive it low (irq_set(1)) or
> not (irq_set(0)).
Well, if I drive it low any number of times it should hae no effect.
> >
> > Or consider express, the spec explicitly says:
> > "Note: Duplicate Assert_INTx/Deassert_INTx Messages have no effect, but
> > are not errors."
> >
> > > >
> > > > > > Another application is out of process virtio (sandboxing!).
> > > > > It will still assert and de-assert irq at the same code, so it will be
> > > > > able to track irq line status.
> > > > >
> > > > > > Again, pci stuff needs to stay in qemu.
> > > > > >
> > > > >
> > > > > Nothing to do with PCI whatsoever.
> > > > >
> > > > > [0] most qemu devices behave incorrectly and trigger level irq more then
> > > > > needed.
> > > >
> > > > Which devices?
> > > Most of them. They just call update_irq_status() or something and
> > > re-assert interrupt regardless of what previous status was.
> >
> > At least for PCI devices, these calls do nothing if status does not change.
> I am not sure what code are you locking at. e1000 device emulation
> doesn't check previous line status before calling qemu_set_irq().
Right. If you dig through useless levels of indirection, you will
see that each PCI device has 4 pin levels, when one of these
changes this makes it up level to the parent bus, and so on.
> >
> > > > pci core tracks line status and will never assert the same
> > > > line multiple times.
> > > That's good if pci core does this, but device shouldn't even try it.
> >
> > I disagree. We don't want to duplicate a ton of code all over
> > the codebase.
> >
> So abstract it into a function. It shouldn't be part of PCI emulation.
I don't know what you mean by this, send a patch and we can discuss?
Note that when I patches PCI interrupt handling for compliance
I made it mimic hardware as closely as possible: devices
can send any # of assert/deassert messages, bus discards duplicates.
> > > >
> > > > > [1] this is how correct PCI device should behave but we override
> > > > > polarity in ACPI, but now incorrect behaviour is deeply designed
> > > > > into vhost-net.
> > > >
> > > > Not really, vhost net signals an eventfd. What happens then is
> > > > up to kvm.
> > > >
> > > That is what current broken design does and it works, but if you want to
> > > save unneeded calls into kvm fix design.
> >
> > The interface seems clean enough: vhost handles virtio ring, qemu/kvm handle pci.
> > Making vhost aware of pci breaks this, I would not call that fixing the
> > design.
> >
> Once again. Nothing to do with PCI, everything to do with device
> emulation. And I propose to abstract interrupt assertion part into
> irqfd, not into vhost.
>
> --
> Gleb.
This could work. KVM would need to find all irqfd
objects mapped to gsi and notify them on deassert.
No idea how hard this is.
--
MST
next prev parent reply other threads:[~2010-09-16 13:57 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-09-15 18:54 [PATCH RFC] kvm: enable irq injection from interrupt context Michael S. Tsirkin
2010-09-16 9:02 ` Gleb Natapov
2010-09-16 9:10 ` Michael S. Tsirkin
2010-09-16 9:25 ` Gleb Natapov
2010-09-16 9:43 ` Michael S. Tsirkin
2010-09-16 9:46 ` Avi Kivity
2010-09-16 9:53 ` Michael S. Tsirkin
2010-09-16 10:13 ` Gleb Natapov
2010-09-16 10:13 ` Michael S. Tsirkin
2010-09-16 10:20 ` Gleb Natapov
2010-09-16 10:44 ` Michael S. Tsirkin
2010-09-16 10:54 ` Gleb Natapov
2010-09-16 10:53 ` Michael S. Tsirkin
2010-09-16 11:17 ` Gleb Natapov
2010-09-16 12:13 ` Michael S. Tsirkin
2010-09-16 12:33 ` Gleb Natapov
2010-09-16 12:57 ` Michael S. Tsirkin
2010-09-16 13:14 ` Avi Kivity
2010-09-16 13:38 ` Michael S. Tsirkin
2010-09-16 13:50 ` Avi Kivity
2010-09-16 13:55 ` Gleb Natapov
2010-09-16 13:18 ` Gleb Natapov
2010-09-16 13:51 ` Michael S. Tsirkin [this message]
2010-09-16 14:06 ` Gleb Natapov
2010-09-16 14:23 ` Michael S. Tsirkin
2010-09-16 14:51 ` Gleb Natapov
2010-09-16 15:24 ` Michael S. Tsirkin
2010-09-16 15:43 ` Gleb Natapov
2010-09-16 22:07 ` Michael S. Tsirkin
2010-09-17 7:59 ` Gleb Natapov
2010-09-19 10:45 ` Michael S. Tsirkin
2010-09-19 10:56 ` Avi Kivity
2010-09-19 10:55 ` Michael S. Tsirkin
2010-09-19 11:05 ` Avi Kivity
2010-09-19 11:23 ` Michael S. Tsirkin
2010-09-19 11:18 ` Gleb Natapov
2010-09-19 11:21 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100916135137.GB24850@redhat.com \
--to=mst@redhat.com \
--cc=avi@redhat.com \
--cc=gleb@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).