From mboxrd@z Thu Jan 1 00:00:00 1970 From: Suravee Suthikulanit Subject: Re: [PART1 RFC v4 08/11] svm: Add VMEXIT handlers for AVIC Date: Thu, 28 Apr 2016 17:08:17 -0500 Message-ID: <572289D1.5040901@amd.com> References: <1460017232-17429-1-git-send-email-Suravee.Suthikulpanit@amd.com> <1460017232-17429-9-git-send-email-Suravee.Suthikulpanit@amd.com> <20160412162245.GD6762@potion.brq.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: , , , , , , , , To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= Return-path: Received: from mail-by2on0097.outbound.protection.outlook.com ([207.46.100.97]:40661 "EHLO na01-by2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752330AbcD1WI2 (ORCPT ); Thu, 28 Apr 2016 18:08:28 -0400 In-Reply-To: <20160412162245.GD6762@potion.brq.redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: Hi Radim / Paolo, Sorry for delay in response. On 4/12/2016 11:22 AM, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote: > 2016-04-07 03:20-0500, Suravee Suthikulpanit: >> From: Suravee Suthikulpanit >> >> This patch introduces VMEXIT handlers, avic_incomplete_ipi_intercept= ion() >> and avic_unaccelerated_access_interception() along with two trace po= ints >> (trace_kvm_avic_incomplete_ipi and trace_kvm_avic_unaccelerated_acce= ss). >> >> Signed-off-by: Suravee Suthikulpanit >> --- >> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c >> @@ -3515,6 +3515,250 @@ static int mwait_interception(struct vcpu_sv= m *svm) >> +static u32 *avic_get_logical_id_entry(struct kvm_vcpu *vcpu, u8 mda= , bool flat) >> +{ >> + struct kvm_arch *vm_data =3D &vcpu->kvm->arch; >> + int index; >> + u32 *logical_apic_id_table; >> + >> + if (flat) { /* flat */ >> + if (mda > 7) > > Don't you want to check that just one bit it set? > >> + return NULL; >> + index =3D mda; >> + } else { /* cluster */ >> + int apic_id =3D mda & 0xf; >> + int cluster_id =3D (mda & 0xf0) >> 8; > > ">> 4". > >> + >> + if (apic_id > 4 || cluster_id >=3D 0xf) >> + return NULL; >> + index =3D (cluster_id << 2) + apic_id; > > ffs(apic_id), because 'apic_id' must be compacted into 2 bits. > >> + } >> + logical_apic_id_table =3D (u32 *) page_address(vm_data->avic_logic= al_id_table_page); >> + >> + return &logical_apic_id_table[index]; >> +} I have quite messed up in the logic here for handling the logical=20 cluster ID. Sorry for not catching this earlier. I'm rewriting this=20 function altogether to simplify it in the V5. >> [...] >> + lid =3D ffs(dlid) - 1; >> + ret =3D avic_handle_ldr_write(&svm->vcpu, svm->vcpu.vcpu_id, lid)= ; >> + if (ret) >> + return 0; > > OS can actually change LDR, so the old one should be invalidated. > > (No OS does, but that is not an important factor for the hypervisor.) > By validating the old one, are you suggesting that we should disable th= e=20 logical APIC table entry previously used by this vcpu? If so, that mean= s=20 we would need to cached the previous LDR value since the one in vAPIC=20 backing page would already be updated. >> [...] > >> + if (vm_data->ldr_mode !=3D mod) { >> + clear_page(page_address(vm_data->avic_logical_id_table_page)); >> + vm_data->ldr_mode =3D mod; >> + } >> + break; >> + } > > All these cases need to be called on KVM_SET_LAPIC -- the userspace c= an > provide completely new set of APIC registers and AVIC should build it= s > maps with them. Test with save/restore or migration. Hm.. This means we might need to introduce a new hook which is called=20 from the arch/x86/kvm/lapic.c: kvm_apic_post_state_restore(). Probably=20 something like kvm_x86_ops->apic_post_state_restore(), which sets up th= e=20 new physical and logical APIC id tables for AVIC. If this works, I'll add support to handle this and test with the=20 migration stuff in the V5. >> + if (offset >=3D 0x400) { >> + WARN(1, "Unsupported APIC offset %#x\n", offset); > > "printk_ratelimited(KERN_INFO " is the most severe message you could > give. I think that not printing anything is best, > >> + return ret; > > because we should not return, but continue to emulate the access. Then, this would continue as if we are handling the normal fault access= =2E > >> + } >> + >> + if (trap) { >> + /* Handling Trap */ >> + if (!write) /* Trap read should never happens */ >> + BUG(); > > (BUG_ON(!write) is shorter, though I would avoid BUG -- only guests a= re > going to fail, so we don't need to kill the host.) Ok. What about just WARN_ONCE(!write, "svm: Handling trap read.\n"); Thanks, Suravee