From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC 2/2] kvm: set affinity hint for assigned device msi
Date: Mon, 17 Oct 2011 14:07:41 -0200
Message-ID: <20111017160741.GA25476@amt.cnet>
In-Reply-To: <20111017133215.GA6406@redhat.com>

On Mon, Oct 17, 2011 at 03:32:15PM +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 17, 2011 at 08:58:59AM -0200, Marcelo Tosatti wrote:
> > On Sun, Oct 16, 2011 at 03:12:23PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Oct 13, 2011 at 11:54:50AM -0300, Marcelo Tosatti wrote:
> > > > On Tue, Oct 11, 2011 at 08:38:28PM +0200, Michael S. Tsirkin wrote:
> > > > > To forward an interrupt to a vcpu that runs on
> > > > > a host cpu different from the current one,
> > > > > we need an ipi which likely will cost us as much
> > > > > as delivering the interrupt directly to that cpu would.
> > > > >
> > > > > Set irq affinity hint to point there, irq balancer
> > > > > > can then take this into account and balance
> > > > > interrupts accordingly.
> > > > >
> > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > ---
> > > > > virt/kvm/assigned-dev.c | 8 +++++---
> > > > > virt/kvm/irq_comm.c | 17 ++++++++++++++++-
> > > > > 2 files changed, 21 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> > > > > index f89f138..b579777 100644
> > > > > --- a/virt/kvm/assigned-dev.c
> > > > > +++ b/virt/kvm/assigned-dev.c
> > > > > @@ -142,9 +142,11 @@ static void deassign_host_irq(struct kvm *kvm,
> > > > > >  		for (i = 0; i < assigned_dev->entries_nr; i++)
> > > > > >  			disable_irq(assigned_dev->host_msix_entries[i].vector);
> > > > > >
> > > > > > -		for (i = 0; i < assigned_dev->entries_nr; i++)
> > > > > > -			free_irq(assigned_dev->host_msix_entries[i].vector,
> > > > > > -				 (void *)assigned_dev);
> > > > > > +		for (i = 0; i < assigned_dev->entries_nr; i++) {
> > > > > > +			u32 vector = assigned_dev->host_msix_entries[i].vector;
> > > > > > +			irq_set_affinity_hint(vector, NULL);
> > > > > > +			free_irq(vector, (void *)assigned_dev);
> > > > > > +		}
> > > > > >
> > > > > >  		assigned_dev->entries_nr = 0;
> > > > > >  		kfree(assigned_dev->host_msix_entries);
> > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > index ac8b629..68b1f7c 100644
> > > > > --- a/virt/kvm/irq_comm.c
> > > > > +++ b/virt/kvm/irq_comm.c
> > > > > @@ -22,6 +22,7 @@
> > > > >
> > > > > #include <linux/kvm_host.h>
> > > > > #include <linux/slab.h>
> > > > > +#include <linux/interrupt.h>
> > > > > #include <trace/events/kvm.h>
> > > > >
> > > > > #include <asm/msidef.h>
> > > > > @@ -80,6 +81,17 @@ inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
> > > > > #endif
> > > > > }
> > > > >
> > > > > +static void kvm_vcpu_host_irq_hint(struct kvm_vcpu *vcpu, int host_irq)
> > > > > +{
> > > > > > +	const struct cpumask *mask;
> > > > > > +	/* raw_smp_processor_id() is ok here: if we get preempted we can get a
> > > > > > +	 * wrong value but we don't mind much. */
> > > > > > +	if (host_irq >= 0 && unlikely(vcpu->cpu != raw_smp_processor_id())) {
> > > > > > +		mask = get_cpu_mask(vcpu->cpu);
> > > > > > +		irq_set_affinity_hint(host_irq, mask);
> > > > > > +	}
> > > > > +}
> > > >
> > > > > Unsure about the internals of irq_set_affinity_hint, but AFAICS it's
> > > > exported so that irqbalance in userspace can make a decision.
> > >
> > > Yes. Pls note at the moment there's no hint so irqbalance
> > > will likely try to move the irq away from vcpu if that
> > > is doing a lot of work. My patch tries to correct that.
> > >
> > > > If that is the case, then irqbalance update rate should be high enough
> > > > > to catch up with a vcpu migrating between cpus (which initially does
> > > > not appear a sensible arrangement).
> > >
> > > At least for pinned vcpus, that's almost sure to be the case :)
> >
> > What I mean is that the frequency of a vcpu migrating between cpus
> > might be higher than what irqbalance can cope with.
> >
> > > > The decision to have the host interrupt follow the vcpu seems a good
> > > > one, given that it saves an IPI and is potentially more cache friendly
> > > > overall.
> > >
> > > > > And AFAICS it's more intelligent for the device assignment case than
> > > > anything irqbalance can come up with
> > >
> > > Do you just propose overwriting affinity set by userspace then?
> >
> > Yes.
> >
> > > My concern would be to avoid breaking setups some users have,
> > > with carefully manually optimized affinity for vcpus and device irqs.
> >
> > They can disable automatic in-kernel affinity.
>
> This still means code needs to be changed ...
> Anyway, what's the interface for that?
>
> > >
> > > > (note it depends on how the APIC is
> > > > configured, your patch ignores that).
> > >
> > > Could you clarify please? What is meant by 'it' in 'it depends'?
> >
> > "It" means the target vcpu selection. It depends on how the guest
> > APIC is programmed.
> >
> > > Which APIC - host or guest - do you mean, and what are possible APIC
> > > configurations to consider?
> >
> > Guest APIC. Guest APIC programmed with round robin would break the
> > static assignment on your patch.
>
> For round robin we might just want to disable this
> automatic affinity?

OK.

> > Configurations to consider, all common ones used for assigned devices?
>
> I mean, besides round robin, any other modes that
> have an issue? Interrupts can also be multicast,
> I think, but we probably don't care what happens
> to affinity then, as msi interrupts are probably never
> broadcast ...

There is also lowest priority, which can be used with MSI.
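
To make the round robin / lowest priority concern concrete, here is a
rough userspace sketch of the kind of check that could gate the automatic
affinity hint. It is only an illustration, not code from the patch:
msi_has_static_target() is a hypothetical helper, the macro names are
local to the example, and it assumes the x86 MSI address/data layout
(delivery mode in data bits 10:8, DM in address bit 2, RH in address
bit 3, destination ID in address bits 19:12) with the flat logical
destination model.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 MSI message layout; names are local to this sketch. */
#define MSI_ADDR_DEST_MODE_LOGICAL	(1u << 2)	/* DM: 0 = physical, 1 = logical */
#define MSI_ADDR_REDIRECTION_HINT	(1u << 3)	/* RH: lowest-priority style redirection */
#define MSI_ADDR_DEST_ID_SHIFT		12
#define MSI_ADDR_DEST_ID_MASK		0x000ff000u
#define MSI_DATA_DELIVERY_MASK		(7u << 8)
#define MSI_DATA_DELIVERY_FIXED		(0u << 8)

/*
 * Hypothetical helper: return true only when the guest programmed the
 * MSI so that it always lands on one fixed destination. Only in that
 * case does "make the host irq follow the vcpu" have a stable target.
 */
bool msi_has_static_target(uint32_t addr, uint32_t data)
{
	uint32_t dest_id = (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;

	/* Lowest priority (and anything else non-fixed): target may change. */
	if ((data & MSI_DATA_DELIVERY_MASK) != MSI_DATA_DELIVERY_FIXED)
		return false;

	/* Redirection hint set: the chipset may pick among several cpus. */
	if (addr & MSI_ADDR_REDIRECTION_HINT)
		return false;

	if (addr & MSI_ADDR_DEST_MODE_LOGICAL) {
		/* Assumption: flat model, dest_id is a bitmask of cpus.
		 * Static only if exactly one bit is set. */
		return dest_id != 0 && (dest_id & (dest_id - 1)) == 0;
	}

	/* Physical mode: 0xff is the broadcast destination. */
	return dest_id != 0xff;
}
```

With something along these lines, the hint would be set only for fixed,
single-destination interrupts and left alone (or cleared) otherwise.
Cluster logical mode and x2APIC destinations are not handled here.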