kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* assign-dev: Purpose of interrupt_work
@ 2009-10-12  7:03 Jan Kiszka
  2009-10-12  7:13 ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  7:03 UTC (permalink / raw)
  To: Gleb Natapov, Avi Kivity; +Cc: kvm-devel

[-- Attachment #1: Type: text/plain, Size: 329 bytes --]

Hi,

I was starring at the IRQ delivery path of assigned devices for a while,
wondering why we have a work queue there. Now, after looking at some
prehistoric versions, I think the reason is that there once was a mutex
involved while we now use RCU. Am I right that we could actually drop
this indirection today?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:03 assign-dev: Purpose of interrupt_work Jan Kiszka
@ 2009-10-12  7:13 ` Gleb Natapov
  2009-10-12  7:27   ` Jan Kiszka
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  7:13 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm-devel

On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
> Hi,
> 
> I was starring at the IRQ delivery path of assigned devices for a while,
> wondering why we have a work queue there. Now, after looking at some
> prehistoric versions, I think the reason is that there once was a mutex
> involved while we now use RCU. Am I right that we could actually drop
> this indirection today?
> 
ioapic/pic path still has mutex. If MSIX is used (like it should) we can
drop work queue.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:13 ` Gleb Natapov
@ 2009-10-12  7:27   ` Jan Kiszka
  2009-10-12  7:45     ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  7:27 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 620 bytes --]

Gleb Natapov wrote:
> On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
>> Hi,
>>
>> I was starring at the IRQ delivery path of assigned devices for a while,
>> wondering why we have a work queue there. Now, after looking at some
>> prehistoric versions, I think the reason is that there once was a mutex
>> involved while we now use RCU. Am I right that we could actually drop
>> this indirection today?
>>
> ioapic/pic path still has mutex. If MSIX is used (like it should) we can
> drop work queue.

I see. Wouldn't it be feasible to convert the fast paths of [io]apic to
spinlocks?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:27   ` Jan Kiszka
@ 2009-10-12  7:45     ` Gleb Natapov
  2009-10-12  7:50       ` Jan Kiszka
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  7:45 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm-devel

On Mon, Oct 12, 2009 at 09:27:19AM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
> >> Hi,
> >>
> >> I was starring at the IRQ delivery path of assigned devices for a while,
> >> wondering why we have a work queue there. Now, after looking at some
> >> prehistoric versions, I think the reason is that there once was a mutex
> >> involved while we now use RCU. Am I right that we could actually drop
> >> this indirection today?
> >>
> > ioapic/pic path still has mutex. If MSIX is used (like it should) we can
> > drop work queue.
> 
> I see. Wouldn't it be feasible to convert the fast paths of [io]apic to
> spinlocks?
> 
Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
prefers mutexes. Theoretically it is possible to make them lockless,
but code will be complex and eventually more slow, since more then two
atomic operation will be used on irq injection path.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:45     ` Gleb Natapov
@ 2009-10-12  7:50       ` Jan Kiszka
  2009-10-12  7:57         ` Gleb Natapov
  2009-10-12  8:39         ` Avi Kivity
  0 siblings, 2 replies; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  7:50 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1286 bytes --]

Gleb Natapov wrote:
> On Mon, Oct 12, 2009 at 09:27:19AM +0200, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> I was starring at the IRQ delivery path of assigned devices for a while,
>>>> wondering why we have a work queue there. Now, after looking at some
>>>> prehistoric versions, I think the reason is that there once was a mutex
>>>> involved while we now use RCU. Am I right that we could actually drop
>>>> this indirection today?
>>>>
>>> ioapic/pic path still has mutex. If MSIX is used (like it should) we can
>>> drop work queue.
>> I see. Wouldn't it be feasible to convert the fast paths of [io]apic to
>> spinlocks?
>>
> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> prefers mutexes. Theoretically it is possible to make them lockless,
> but code will be complex and eventually more slow, since more then two
> atomic operation will be used on irq injection path.

Well, lockless is another thing.

But also converting to spinlocks would indeed add some overhead:
irqsave/restore. But I wonder if this isn't worth it, at least when
looking at the (supposed to be fast) device passthrough scenario which
would be simpler and faster.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:50       ` Jan Kiszka
@ 2009-10-12  7:57         ` Gleb Natapov
  2009-10-12  8:16           ` Jan Kiszka
  2009-10-12  8:39         ` Avi Kivity
  1 sibling, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  7:57 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm-devel

On Mon, Oct 12, 2009 at 09:50:58AM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Mon, Oct 12, 2009 at 09:27:19AM +0200, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
> >>>> Hi,
> >>>>
> >>>> I was starring at the IRQ delivery path of assigned devices for a while,
> >>>> wondering why we have a work queue there. Now, after looking at some
> >>>> prehistoric versions, I think the reason is that there once was a mutex
> >>>> involved while we now use RCU. Am I right that we could actually drop
> >>>> this indirection today?
> >>>>
> >>> ioapic/pic path still has mutex. If MSIX is used (like it should) we can
> >>> drop work queue.
> >> I see. Wouldn't it be feasible to convert the fast paths of [io]apic to
> >> spinlocks?
> >>
> > Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> > prefers mutexes. Theoretically it is possible to make them lockless,
> > but code will be complex and eventually more slow, since more then two
> > atomic operation will be used on irq injection path.
> 
> Well, lockless is another thing.
> 
> But also converting to spinlocks would indeed add some overhead:
> irqsave/restore. But I wonder if this isn't worth it, at least when
> looking at the (supposed to be fast) device passthrough scenario which
> would be simpler and faster.
> 
Avi's point in favor of mutex is: they are as fast as spinlocks when
congested and allows preemption when held.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:57         ` Gleb Natapov
@ 2009-10-12  8:16           ` Jan Kiszka
  2009-10-12  8:39             ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  8:16 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1663 bytes --]

Gleb Natapov wrote:
> On Mon, Oct 12, 2009 at 09:50:58AM +0200, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Mon, Oct 12, 2009 at 09:27:19AM +0200, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Mon, Oct 12, 2009 at 09:03:18AM +0200, Jan Kiszka wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I was starring at the IRQ delivery path of assigned devices for a while,
>>>>>> wondering why we have a work queue there. Now, after looking at some
>>>>>> prehistoric versions, I think the reason is that there once was a mutex
>>>>>> involved while we now use RCU. Am I right that we could actually drop
>>>>>> this indirection today?
>>>>>>
>>>>> ioapic/pic path still has mutex. If MSIX is used (like it should) we can
>>>>> drop work queue.
>>>> I see. Wouldn't it be feasible to convert the fast paths of [io]apic to
>>>> spinlocks?
>>>>
>>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
>>> prefers mutexes. Theoretically it is possible to make them lockless,
>>> but code will be complex and eventually more slow, since more then two
>>> atomic operation will be used on irq injection path.
>> Well, lockless is another thing.
>>
>> But also converting to spinlocks would indeed add some overhead:
>> irqsave/restore. But I wonder if this isn't worth it, at least when
>> looking at the (supposed to be fast) device passthrough scenario which
>> would be simpler and faster.
>>
> Avi's point in favor of mutex is: they are as fast as spinlocks when
> congested and allows preemption when held.

...but require scheduler activity in case of dev-passthrough, which is
surely playing in a different league.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  7:50       ` Jan Kiszka
  2009-10-12  7:57         ` Gleb Natapov
@ 2009-10-12  8:39         ` Avi Kivity
  2009-10-12  9:05           ` Gleb Natapov
  2009-10-12  9:07           ` Jan Kiszka
  1 sibling, 2 replies; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  8:39 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm-devel

On 10/12/2009 09:50 AM, Jan Kiszka wrote:
>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
>> prefers mutexes. Theoretically it is possible to make them lockless,
>> but code will be complex and eventually more slow, since more then two
>> atomic operation will be used on irq injection path.
>>      
> Well, lockless is another thing.
>
> But also converting to spinlocks would indeed add some overhead:
> irqsave/restore. But I wonder if this isn't worth it, at least when
> looking at the (supposed to be fast) device passthrough scenario which
> would be simpler and faster.
>    

I'm worried about disabling irqs for non-device-assignment cases.  It 
would be more palatable if ioapic was completely O(1) (there are some 
per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but we want 
to scale).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  8:16           ` Jan Kiszka
@ 2009-10-12  8:39             ` Gleb Natapov
  2009-10-12  9:04               ` Jan Kiszka
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  8:39 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm-devel

On Mon, Oct 12, 2009 at 10:16:56AM +0200, Jan Kiszka wrote:
> >>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> >>> prefers mutexes. Theoretically it is possible to make them lockless,
> >>> but code will be complex and eventually more slow, since more then two
> >>> atomic operation will be used on irq injection path.
> >> Well, lockless is another thing.
> >>
> >> But also converting to spinlocks would indeed add some overhead:
> >> irqsave/restore. But I wonder if this isn't worth it, at least when
> >> looking at the (supposed to be fast) device passthrough scenario which
> >> would be simpler and faster.
> >>
> > Avi's point in favor of mutex is: they are as fast as spinlocks when
> > congested and allows preemption when held.
> 
> ...but require scheduler activity in case of dev-passthrough, which is
> surely playing in a different league.
> 
I'd rather remove dev-passthrough completely than continue adding hack upon hack
upon hack to make is some times kinda sorta work :)

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  8:39             ` Gleb Natapov
@ 2009-10-12  9:04               ` Jan Kiszka
  2009-10-12  9:14                 ` Avi Kivity
  2009-10-12  9:14                 ` Gleb Natapov
  0 siblings, 2 replies; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  9:04 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1159 bytes --]

Gleb Natapov wrote:
> On Mon, Oct 12, 2009 at 10:16:56AM +0200, Jan Kiszka wrote:
>>>>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
>>>>> prefers mutexes. Theoretically it is possible to make them lockless,
>>>>> but code will be complex and eventually more slow, since more then two
>>>>> atomic operation will be used on irq injection path.
>>>> Well, lockless is another thing.
>>>>
>>>> But also converting to spinlocks would indeed add some overhead:
>>>> irqsave/restore. But I wonder if this isn't worth it, at least when
>>>> looking at the (supposed to be fast) device passthrough scenario which
>>>> would be simpler and faster.
>>>>
>>> Avi's point in favor of mutex is: they are as fast as spinlocks when
>>> congested and allows preemption when held.
>> ...but require scheduler activity in case of dev-passthrough, which is
>> surely playing in a different league.
>>
> I'd rather remove dev-passthrough completely than continue adding hack upon hack
> upon hack to make is some times kinda sorta work :)

Hmm, is this code not needed for the VT-d & Co. case? Or what is the
alternative?

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  8:39         ` Avi Kivity
@ 2009-10-12  9:05           ` Gleb Natapov
  2009-10-12  9:07             ` Avi Kivity
  2009-10-12  9:07           ` Jan Kiszka
  1 sibling, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  9:05 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, kvm-devel

On Mon, Oct 12, 2009 at 10:39:09AM +0200, Avi Kivity wrote:
> On 10/12/2009 09:50 AM, Jan Kiszka wrote:
> >>Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> >>prefers mutexes. Theoretically it is possible to make them lockless,
> >>but code will be complex and eventually more slow, since more then two
> >>atomic operation will be used on irq injection path.
> >Well, lockless is another thing.
> >
> >But also converting to spinlocks would indeed add some overhead:
> >irqsave/restore. But I wonder if this isn't worth it, at least when
> >looking at the (supposed to be fast) device passthrough scenario which
> >would be simpler and faster.
> 
> I'm worried about disabling irqs for non-device-assignment cases.
> It would be more palatable if ioapic was completely O(1) (there are
> some per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but
> we want to scale).
> 
We can change it to hash for directed irqs (still not O(1)), but for
broadcast irq the loop will be required.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:05           ` Gleb Natapov
@ 2009-10-12  9:07             ` Avi Kivity
  0 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  9:07 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Jan Kiszka, kvm-devel

On 10/12/2009 11:05 AM, Gleb Natapov wrote:
>
>> I'm worried about disabling irqs for non-device-assignment cases.
>> It would be more palatable if ioapic was completely O(1) (there are
>> some per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but
>> we want to scale).
>>
>>      
> We can change it to hash for directed irqs (still not O(1)),

Or cache the vcpu pointer in the redirection table entry.

> but for
> broadcast irq the loop will be required.
>    

That's fine, those are rare.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  8:39         ` Avi Kivity
  2009-10-12  9:05           ` Gleb Natapov
@ 2009-10-12  9:07           ` Jan Kiszka
  2009-10-12  9:16             ` Avi Kivity
  2009-10-12 17:36             ` Marcelo Tosatti
  1 sibling, 2 replies; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  9:07 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gleb Natapov, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1159 bytes --]

Avi Kivity wrote:
> On 10/12/2009 09:50 AM, Jan Kiszka wrote:
>>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
>>> prefers mutexes. Theoretically it is possible to make them lockless,
>>> but code will be complex and eventually more slow, since more then two
>>> atomic operation will be used on irq injection path.
>>>      
>> Well, lockless is another thing.
>>
>> But also converting to spinlocks would indeed add some overhead:
>> irqsave/restore. But I wonder if this isn't worth it, at least when
>> looking at the (supposed to be fast) device passthrough scenario which
>> would be simpler and faster.
>>    
> 
> I'm worried about disabling irqs for non-device-assignment cases.  It
> would be more palatable if ioapic was completely O(1) (there are some
> per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but we want
> to scale).

Yeah, what a pity. That's likely not solvable in a generic way, given
that the guest finally decided how many VCPUs may listen to a line.

OK, but dropping interrupt_work from the MSI path is still worthwhile,
and probably more future-proof anyway.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:04               ` Jan Kiszka
@ 2009-10-12  9:14                 ` Avi Kivity
  2009-10-12  9:25                   ` Jan Kiszka
  2009-10-12  9:14                 ` Gleb Natapov
  1 sibling, 1 reply; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  9:14 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm-devel

On 10/12/2009 11:04 AM, Jan Kiszka wrote:
>> I'd rather remove dev-passthrough completely than continue adding hack upon hack
>> upon hack to make is some times kinda sorta work :)
>>      
> Hmm, is this code not needed for the VT-d&  Co. case?

It is needed.

> Or what is the
> alternative?
>    

irqfd (which only supports edge-triggered interrupts now).  Note irqfd 
needs the same love, it uses a workqueue as well.

The theory is:
  fd1 = eventfd()
  give fd1 to kvm as irqfd, vhost-net as trigger fd
  fd2 = eventfd()
  give fd2 to kvm as irqfd, uio as trigger fd
  fd3 = evenfd()
  give fd3 to kvm as irqfd, another kvm as ioeventfd

One interface, multiple users (in the kernel, userspace, or other processes)

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:04               ` Jan Kiszka
  2009-10-12  9:14                 ` Avi Kivity
@ 2009-10-12  9:14                 ` Gleb Natapov
  1 sibling, 0 replies; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  9:14 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, kvm-devel

On Mon, Oct 12, 2009 at 11:04:23AM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Mon, Oct 12, 2009 at 10:16:56AM +0200, Jan Kiszka wrote:
> >>>>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> >>>>> prefers mutexes. Theoretically it is possible to make them lockless,
> >>>>> but code will be complex and eventually more slow, since more then two
> >>>>> atomic operation will be used on irq injection path.
> >>>> Well, lockless is another thing.
> >>>>
> >>>> But also converting to spinlocks would indeed add some overhead:
> >>>> irqsave/restore. But I wonder if this isn't worth it, at least when
> >>>> looking at the (supposed to be fast) device passthrough scenario which
> >>>> would be simpler and faster.
> >>>>
> >>> Avi's point in favor of mutex is: they are as fast as spinlocks when
> >>> congested and allows preemption when held.
> >> ...but require scheduler activity in case of dev-passthrough, which is
> >> surely playing in a different league.
> >>
> > I'd rather remove dev-passthrough completely than continue adding hack upon hack
> > upon hack to make is some times kinda sorta work :)
> 
> Hmm, is this code not needed for the VT-d & Co. case? Or what is the
> alternative?
> 
The smile there is for a reason. My dislike of dev-passthrough will not
make it go away. Alternatives? I don't know what HW makers plans are, but
with interrupt remapping + APIC vitalization in HW theoretically it is
possible to inject interrupts directly from device to virtual machine.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:07           ` Jan Kiszka
@ 2009-10-12  9:16             ` Avi Kivity
  2009-10-12 17:36             ` Marcelo Tosatti
  1 sibling, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  9:16 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm-devel

On 10/12/2009 11:07 AM, Jan Kiszka wrote:
>> I'm worried about disabling irqs for non-device-assignment cases.  It
>> would be more palatable if ioapic was completely O(1) (there are some
>> per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but we want
>> to scale).
>>      
> Yeah, what a pity. That's likely not solvable in a generic way, given
> that the guest finally decided how many VCPUs may listen to a line.
>
> OK, but dropping interrupt_work from the MSI path is still worthwhile,
> and probably more future-proof anyway.
>    

Yes - we can extend the irq routing table to contain this information.  
It means we need to invalidate it when the guest reprograms the 
interrupt controllers, but I think it should work out fine.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:14                 ` Avi Kivity
@ 2009-10-12  9:25                   ` Jan Kiszka
  2009-10-12  9:30                     ` Avi Kivity
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12  9:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gleb Natapov, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

Avi Kivity wrote:
> On 10/12/2009 11:04 AM, Jan Kiszka wrote:
>>> I'd rather remove dev-passthrough completely than continue adding
>>> hack upon hack
>>> upon hack to make is some times kinda sorta work :)
>>>      
>> Hmm, is this code not needed for the VT-d&  Co. case?
> 
> It is needed.
> 
>> Or what is the
>> alternative?
>>    
> 
> irqfd (which only supports edge-triggered interrupts now).  Note irqfd
> needs the same love, it uses a workqueue as well.
> 
> The theory is:
>  fd1 = eventfd()
>  give fd1 to kvm as irqfd, vhost-net as trigger fd
>  fd2 = eventfd()
>  give fd2 to kvm as irqfd, uio as trigger fd
>  fd3 = evenfd()
>  give fd3 to kvm as irqfd, another kvm as ioeventfd
> 
> One interface, multiple users (in the kernel, userspace, or other
> processes)

I see to overall gain, but likely only worsens my objective (low
latency), at least it doesn't improve it.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:25                   ` Jan Kiszka
@ 2009-10-12  9:30                     ` Avi Kivity
  2009-10-12  9:38                       ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  9:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm-devel

On 10/12/2009 11:25 AM, Jan Kiszka wrote:
>>> Or what is the
>>> alternative?
>>>
>>>        
>> irqfd (which only supports edge-triggered interrupts now).  Note irqfd
>> needs the same love, it uses a workqueue as well.
>>
>> The theory is:
>>   fd1 = eventfd()
>>   give fd1 to kvm as irqfd, vhost-net as trigger fd
>>   fd2 = eventfd()
>>   give fd2 to kvm as irqfd, uio as trigger fd
>>   fd3 = evenfd()
>>   give fd3 to kvm as irqfd, another kvm as ioeventfd
>>
>> One interface, multiple users (in the kernel, userspace, or other
>> processes)
>>      
> I see to overall gain, but likely only worsens my objective (low
> latency), at least it doesn't improve it.
>    

irqfd can work from interrupt context, so if we only use the workqueue 
for the many-vcpu cases, we should be fine.

I think we can do it cleanly, too: in ioapic code, if we're talking to 
one vcpu, wake it immediately; otherwise schedule work to continue in 
process context.  This was the common code can always work in interrupt 
context and not worry about the work queue.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:30                     ` Avi Kivity
@ 2009-10-12  9:38                       ` Gleb Natapov
  2009-10-12  9:40                         ` Avi Kivity
  0 siblings, 1 reply; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  9:38 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, kvm-devel

On Mon, Oct 12, 2009 at 11:30:10AM +0200, Avi Kivity wrote:
> On 10/12/2009 11:25 AM, Jan Kiszka wrote:
> >>>Or what is the
> >>>alternative?
> >>>
> >>irqfd (which only supports edge-triggered interrupts now).  Note irqfd
> >>needs the same love, it uses a workqueue as well.
> >>
> >>The theory is:
> >>  fd1 = eventfd()
> >>  give fd1 to kvm as irqfd, vhost-net as trigger fd
> >>  fd2 = eventfd()
> >>  give fd2 to kvm as irqfd, uio as trigger fd
> >>  fd3 = evenfd()
> >>  give fd3 to kvm as irqfd, another kvm as ioeventfd
> >>
> >>One interface, multiple users (in the kernel, userspace, or other
> >>processes)
> >I see to overall gain, but likely only worsens my objective (low
> >latency), at least it doesn't improve it.
> 
> irqfd can work from interrupt context, so if we only use the
> workqueue for the many-vcpu cases, we should be fine.
> 
> I think we can do it cleanly, too: in ioapic code, if we're talking
> to one vcpu, wake it immediately; otherwise schedule work to
> continue in process context.  This was the common code can always
> work in interrupt context and not worry about the work queue.
> 
Even if we are talking to one cpu locking is still needed, so unless we
make it spinlock we can't inject from irq context. If we change mutex to
spinlock what is the point of work queue?

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:38                       ` Gleb Natapov
@ 2009-10-12  9:40                         ` Avi Kivity
  2009-10-12  9:42                           ` Gleb Natapov
  0 siblings, 1 reply; 23+ messages in thread
From: Avi Kivity @ 2009-10-12  9:40 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Jan Kiszka, kvm-devel

On 10/12/2009 11:38 AM, Gleb Natapov wrote:
>
>> irqfd can work from interrupt context, so if we only use the
>> workqueue for the many-vcpu cases, we should be fine.
>>
>> I think we can do it cleanly, too: in ioapic code, if we're talking
>> to one vcpu, wake it immediately; otherwise schedule work to
>> continue in process context.  This was the common code can always
>> work in interrupt context and not worry about the work queue.
>>
>>      
> Even if we are talking to one cpu locking is still needed, so unless we
> make it spinlock we can't inject from irq context. If we change mutex to
> spinlock what is the point of work queue?
>    

Not to loop over 4096 vcpus with interrupts disabled, triggered by guest 
action.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:40                         ` Avi Kivity
@ 2009-10-12  9:42                           ` Gleb Natapov
  0 siblings, 0 replies; 23+ messages in thread
From: Gleb Natapov @ 2009-10-12  9:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Jan Kiszka, kvm-devel

On Mon, Oct 12, 2009 at 11:40:59AM +0200, Avi Kivity wrote:
> On 10/12/2009 11:38 AM, Gleb Natapov wrote:
> >
> >>irqfd can work from interrupt context, so if we only use the
> >>workqueue for the many-vcpu cases, we should be fine.
> >>
> >>I think we can do it cleanly, too: in ioapic code, if we're talking
> >>to one vcpu, wake it immediately; otherwise schedule work to
> >>continue in process context.  This was the common code can always
> >>work in interrupt context and not worry about the work queue.
> >>
> >Even if we are talking to one cpu locking is still needed, so unless we
> >make it spinlock we can't inject from irq context. If we change mutex to
> >spinlock what is the point of work queue?
> 
> Not to loop over 4096 vcpus with interrupts disabled, triggered by
> guest action.
> 
But in a work queue you'll do this anyway. You can't access ioapic
without the lock.

--
			Gleb.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12  9:07           ` Jan Kiszka
  2009-10-12  9:16             ` Avi Kivity
@ 2009-10-12 17:36             ` Marcelo Tosatti
  2009-10-12 20:44               ` Jan Kiszka
  1 sibling, 1 reply; 23+ messages in thread
From: Marcelo Tosatti @ 2009-10-12 17:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Avi Kivity, Gleb Natapov, kvm-devel

On Mon, Oct 12, 2009 at 11:07:28AM +0200, Jan Kiszka wrote:
> Avi Kivity wrote:
> > On 10/12/2009 09:50 AM, Jan Kiszka wrote:
> >>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
> >>> prefers mutexes. Theoretically it is possible to make them lockless,
> >>> but code will be complex and eventually more slow, since more then two
> >>> atomic operation will be used on irq injection path.
> >>>      
> >> Well, lockless is another thing.
> >>
> >> But also converting to spinlocks would indeed add some overhead:
> >> irqsave/restore. But I wonder if this isn't worth it, at least when
> >> looking at the (supposed to be fast) device passthrough scenario which
> >> would be simpler and faster.
> >>    
> > 
> > I'm worried about disabling irqs for non-device-assignment cases.  It
> > would be more palatable if ioapic was completely O(1) (there are some
> > per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but we want
> > to scale).
> 
> Yeah, what a pity. That's likely not solvable in a generic way, given
> that the guest finally decided how many VCPUs may listen to a line.
> 
> OK, but dropping interrupt_work from the MSI path is still worthwhile,
> and probably more future-proof anyway.

Seems appropriate to convert the process context work to threaded
interrupt (instead of workqueue). That should help latency.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: assign-dev: Purpose of interrupt_work
  2009-10-12 17:36             ` Marcelo Tosatti
@ 2009-10-12 20:44               ` Jan Kiszka
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Kiszka @ 2009-10-12 20:44 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Avi Kivity, Gleb Natapov, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

Marcelo Tosatti wrote:
> On Mon, Oct 12, 2009 at 11:07:28AM +0200, Jan Kiszka wrote:
>> Avi Kivity wrote:
>>> On 10/12/2009 09:50 AM, Jan Kiszka wrote:
>>>>> Apic is lockless. For ioapic/pic I used spinlocks initially, but Avi
>>>>> prefers mutexes. Theoretically it is possible to make them lockless,
>>>>> but code will be complex and eventually more slow, since more then two
>>>>> atomic operation will be used on irq injection path.
>>>>>      
>>>> Well, lockless is another thing.
>>>>
>>>> But also converting to spinlocks would indeed add some overhead:
>>>> irqsave/restore. But I wonder if this isn't worth it, at least when
>>>> looking at the (supposed to be fast) device passthrough scenario which
>>>> would be simpler and faster.
>>>>    
>>> I'm worried about disabling irqs for non-device-assignment cases.  It
>>> would be more palatable if ioapic was completely O(1) (there are some
>>> per-vcpu loops in there, shouldn't be too bad for 16 vcpus, but we want
>>> to scale).
>> Yeah, what a pity. That's likely not solvable in a generic way, given
>> that the guest finally decided how many VCPUs may listen to a line.
>>
>> OK, but dropping interrupt_work from the MSI path is still worthwhile,
>> and probably more future-proof anyway.
> 
> Seems appropriate to convert the process context work to threaded
> interrupt (instead of workqueue). That should help latency.
> 

Not for the trivial case (I want to avoid scheduling as far as possible).

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2009-10-12 20:47 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-12  7:03 assign-dev: Purpose of interrupt_work Jan Kiszka
2009-10-12  7:13 ` Gleb Natapov
2009-10-12  7:27   ` Jan Kiszka
2009-10-12  7:45     ` Gleb Natapov
2009-10-12  7:50       ` Jan Kiszka
2009-10-12  7:57         ` Gleb Natapov
2009-10-12  8:16           ` Jan Kiszka
2009-10-12  8:39             ` Gleb Natapov
2009-10-12  9:04               ` Jan Kiszka
2009-10-12  9:14                 ` Avi Kivity
2009-10-12  9:25                   ` Jan Kiszka
2009-10-12  9:30                     ` Avi Kivity
2009-10-12  9:38                       ` Gleb Natapov
2009-10-12  9:40                         ` Avi Kivity
2009-10-12  9:42                           ` Gleb Natapov
2009-10-12  9:14                 ` Gleb Natapov
2009-10-12  8:39         ` Avi Kivity
2009-10-12  9:05           ` Gleb Natapov
2009-10-12  9:07             ` Avi Kivity
2009-10-12  9:07           ` Jan Kiszka
2009-10-12  9:16             ` Avi Kivity
2009-10-12 17:36             ` Marcelo Tosatti
2009-10-12 20:44               ` Jan Kiszka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).