From: Gleb Natapov <gleb@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Sasha Levin <levinsasha928@gmail.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Avi Kivity <avi@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
"Tian, Kevin" <kevin.tian@intel.com>
Subject: Re: [PATCH v2] KVM: Implement support for the RH bit
Date: Fri, 2 Sep 2011 17:44:22 +0300
Message-ID: <20110902144422.GH26451@redhat.com>
In-Reply-To: <4E60E9FE.60608@siemens.com>
On Fri, Sep 02, 2011 at 04:36:46PM +0200, Jan Kiszka wrote:
> On 2011-09-02 16:30, Sasha Levin wrote:
> > On Fri, 2011-09-02 at 16:25 +0200, Jan Kiszka wrote:
> >> On 2011-09-02 16:11, Sasha Levin wrote:
> >>> On Fri, 2011-09-02 at 16:00 +0200, Jan Kiszka wrote:
> >>>> On 2011-09-02 15:13, Sasha Levin wrote:
> >>>>> On Fri, 2011-09-02 at 14:11 +0200, Jan Kiszka wrote:
> >>>>>> On 2011-09-02 13:36, Jan Kiszka wrote:
> >>>>>>> On 2011-09-02 13:27, Jan Kiszka wrote:
> >>>>>>>> On 2011-09-02 09:48, Sasha Levin wrote:
> >>>>>>>>> The RH bit exists in the message address register (lower 32 bits of
> >>>>>>>>> the address).
> >>>>>>>>>
> >>>>>>>>> The bit indicates whether the message should go to the processor which was
> >>>>>>>>> indicated in the destination ID bits, or whether it should go to the
> >>>>>>>>> processor running at the lowest priority.
> >>>>>>>>>
> >>>>>>>>> Cc: Avi Kivity <avi@redhat.com>
> >>>>>>>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >>>>>>>>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> >>>>>>>>> ---
> >>>>>>>>> virt/kvm/irq_comm.c | 17 ++++++++++++++++-
> >>>>>>>>> 1 files changed, 16 insertions(+), 1 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>>>>> index 9f614b4..0ba3a3d 100644
> >>>>>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>>>>> @@ -134,7 +134,22 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> >>>>>>>>>  	irq.level = 1;
> >>>>>>>>>  	irq.shorthand = 0;
> >>>>>>>>>
> >>>>>>>>> -	/* TODO Deal with RH bit of MSI message address */
> >>>>>>>>> +	/*
> >>>>>>>>> +	 * If the RH bit is set, we'll deliver to the processor running
> >>>>>>>>> +	 * at the lowest priority.
> >>>>>>>>> +	 */
> >>>>>>>>> +	if (e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI) {
> >>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_LOWPRI;
> >>>>>>>>> +	} else {
> >>>>>>>>> +		/*
> >>>>>>>>> +		 * If the RH bit is not set, we'll deliver to the specific
> >>>>>>>>> +		 * processor mentioned in destination ID, and ignore the DM
> >>>>>>>>> +		 * bit.
> >>>>>>>>> +		 */
> >>>>>>>>> +		irq.dest_mode = MSI_ADDR_DEST_MODE_PHYSICAL;
> >>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_FIXED;
> >>>>>>>>> +	}
> >>>>>>>>> +
> >>>>>>>>>  	return kvm_irq_delivery_to_apic(kvm, NULL, &irq);
> >>>>>>>>>  }
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Do you happen to have a kvm unit test for this? Or how did you validate
> >>>>>>>> the change? It doesn't look incorrect to me, I'd just like to check it in
> >>>>>>>> QEMU as well, which apparently already has the logic above but also some
> >>>>>>>> contradictory comment.
> >>>>>>>
> >>>>>>> Err, no, QEMU does not have this logic, it also ignores RH.
> >>>>>>>
> >>>>>>> But the above bits make "irq.delivery_mode = e->msi.data & 0x700"
> >>>>>>> pointless. And that strongly suggests something is still wrong.
> >>>>>>
> >>>>>> I tend to believe that this is what the spec tries to tell us:
> >>>>>>
> >>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>> index 9f614b4..b72f77a 100644
> >>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>> @@ -128,7 +128,8 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> >>>>>>  			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> >>>>>>  	irq.vector = (e->msi.data &
> >>>>>>  			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
> >>>>>> -	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
> >>>>>> +	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
> >>>>>> +			 (e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
> >>>>>>  	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
> >>>>>>  	irq.delivery_mode = e->msi.data & 0x700;
> >>>>>>  	irq.level = 1;
> >>>>>>
> >>>>>> i.e. the DM flag is only relevant if RH is set, and RH==0 is equivalent
> >>>>>> to RH==1 && DM==0.
> >>>>>
> >>>>> Thing is, the spec specifically states that RH==1 should deliver to
> >>>>> lowest priority - even though it doesn't state what the relationship is
> >>>>> between delivery mode and RH bit.
> >>>>
> >>>> The spec says "When RH is 1 and the physical destination mode is used
> >>>> [DM=0], the Destination ID field must not be set to 0xFF; it must point
> >>>> to a processor that is present and enabled to receive the interrupt."
> >>>>
> >>>
> >>> When RH=1 and DM=0 yes, but what happens when RH=1 and DM=1?
> >>
> >> irq.dest_mode becomes non-zero, and kvm_apic_match_dest uses
> >> kvm_apic_match_logical_addr for filtering out possible target CPUs.
> >>
> >> Mmh, a remaining question is if kvm_irq_delivery_to_apic is then already
> >> doing the right thing, even for delivery_mode != APIC_DM_LOWEST.
> >>
> >
> > The missing part is that when RH=1 we must look for the lowest priority:
> >
> > "Redirection hint indication (RH) - This bit indicates whether the
> > message should be directed to the processor with the lowest interrupt
> > priority among processors that can receive the interrupt."
> >
> > So it's not enough to set dest_mode, we must also make sure that
> > delivery_mode is set to low prio when RH=1.
>
> That's debatable. delivery_mode == APIC_DM_LOWEST includes this target
> selection, but also more. I have a bad feeling when we just overwrite
> delivery_mode as defined by the MSI data field instead of only patching
> kvm_irq_delivery_to_apic or kvm_is_dm_lowest_prio - if required.
>
Patching them how? To behave exactly like delivery_mode == APIC_DM_LOWEST in
case RH bit is set? Then setting delivery_mode to APIC_DM_LOWEST will
achieve the same goal.
> >
> >> Again my question to you: Did you observe unexpected behaviour with some
> >> real guests, or is this just based on code and spec study so far? If we
> >> had a test case, that could also provide valuable hints.
> >
> > Sorry, no test case.
> >
> > I've stumbled on the 'TODO' comment when I was digging into the MSI
> > implementation in KVM and decided to implement it based on specs.
>
> Then we definitely need some blessing by Intel to avoid subtle regressions.
>
Yes, if we are going to pursue that we need Intel to clarify what SDM means.
--
Gleb.
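
For readers following the thread, the two readings of the RH bit debated above can be put side by side in a small standalone C sketch. It is only an illustration under stated assumptions: the DEMO_* macros, struct demo_msi, struct demo_irq and both decode functions are hypothetical names invented here, not the kernel's msidef.h macros or the KVM code. The bit positions (DM at address bit 2, RH at address bit 3, destination ID in address bits 19:12, delivery mode in data bits 10:8) are assumed from the usual x86 MSI layout. Which reading matches the SDM is exactly the open question in this thread, so neither function should be taken as the correct one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch only; bit positions assumed from the
 * x86 MSI address/data layout, not copied from the kernel headers. */
#define DEMO_ADDR_DM_LOGICAL   (1u << 2)   /* DM: 1 = logical destination */
#define DEMO_ADDR_RH           (1u << 3)   /* RH: redirection hint */
#define DEMO_ADDR_DEST_SHIFT   12
#define DEMO_ADDR_DEST_MASK    (0xffu << DEMO_ADDR_DEST_SHIFT)
#define DEMO_DATA_VECTOR_MASK  0xffu
#define DEMO_DATA_DELIV_MASK   0x700u      /* same mask as "& 0x700" above */
#define DEMO_DATA_DELIV_FIXED  (0u << 8)
#define DEMO_DATA_DELIV_LOWPRI (1u << 8)

struct demo_msi {
	uint32_t address_lo;
	uint32_t data;
};

struct demo_irq {
	uint8_t  dest_id;
	uint8_t  vector;
	bool     logical_dest;   /* true = logical, false = physical */
	uint32_t delivery_mode;  /* raw delivery mode bits from data */
};

/* Fields both readings agree on: taken straight from address and data. */
static void demo_decode_common(const struct demo_msi *m, struct demo_irq *irq)
{
	assert(m && irq);
	irq->dest_id = (m->address_lo & DEMO_ADDR_DEST_MASK) >> DEMO_ADDR_DEST_SHIFT;
	irq->vector = m->data & DEMO_DATA_VECTOR_MASK;
	irq->logical_dest = (m->address_lo & DEMO_ADDR_DM_LOGICAL) != 0;
	irq->delivery_mode = m->data & DEMO_DATA_DELIV_MASK;
}

/* Reading A (the posted patch): RH overrides the delivery mode from the
 * data field; with RH clear, DM is ignored and delivery is fixed. */
struct demo_irq demo_decode_rh_overrides(const struct demo_msi *m)
{
	struct demo_irq irq;

	demo_decode_common(m, &irq);
	if (m->address_lo & DEMO_ADDR_RH) {
		irq.delivery_mode = DEMO_DATA_DELIV_LOWPRI;
	} else {
		irq.logical_dest = false;
		irq.delivery_mode = DEMO_DATA_DELIV_FIXED;
	}
	return irq;
}

/* Reading B (Jan's counter-proposal): DM only counts when RH is set, and
 * the delivery mode from the data field is left untouched. */
struct demo_irq demo_decode_rh_gates_dm(const struct demo_msi *m)
{
	struct demo_irq irq;

	demo_decode_common(m, &irq);
	irq.logical_dest = (m->address_lo & DEMO_ADDR_DM_LOGICAL) &&
			   (m->address_lo & DEMO_ADDR_RH);
	return irq;
}
```

The two functions only disagree on dest_mode and delivery_mode. For a message with RH=0, DM=1 and a lowest-priority delivery mode in the data field, reading A yields physical/fixed while reading B yields physical/lowest-priority, which is the kind of case a unit test or a clarification from Intel would have to settle.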