From: Paolo Bonzini
Date: Wed, 12 Aug 2015 14:36:58 +0200
Subject: Re: [Qemu-devel] [RFC v4 1/9] exec.c: Add new exclusive bitmap to ram_list
To: alvise rigo
Cc: mttcg@greensocs.com, Claudio Fontana, QEMU Developers, Jani Kokkonen, VirtualOpenSystems Technical Team, Alex Bennée

On 12/08/2015 09:31, alvise rigo wrote:
> I think that tlb_flush_entry is not enough, since in theory another
> vCPU could have a different TLB address referring the same phys
> address.

You're right, this is a TLB so it's virtually-indexed. :(  I'm not sure
what happens on ARM, since it has a virtually indexed (VIVT or VIPT)
cache, but indeed it would be a problem when implementing e.g. CMPXCHG
using the TCG ll/sc ops.

I'm a bit worried about adding such a big bitmap.  It's only used by
TCG, but it is also allocated on KVM, and on KVM you can have hundreds
of VCPUs.  Wasting 200 bits per guest memory page (i.e. ~0.6% of guest
memory) is obviously not a great idea. :(  Perhaps we can use a bytemap
instead:

- 0..253 = TLB_EXCL must be set in all VCPUs except CPU n.  A VCPU that
  loads the TLB for this vaddr does not have to set it.

- 254 = TLB_EXCL must be set in all VCPUs.  A VCPU that loads the TLB
  for this vaddr has to set it.

- 255 = TLB_EXCL not set in at least two VCPUs.

Transitions:

- ll transitions: anything -> 254

- sc transitions: 254 -> current CPU_ID

- TLB_EXCL store transitions: 254 -> current CPU_ID

- tlb_set_page transitions: CPU_ID other than current -> 255

The initial value is 255 on SMP guests, 0 on UP guests.

The algorithms are very similar to yours, just using this approximate
representation.
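For concreteness, here is a minimal C sketch of that representation; the
names (llsc_bytemap, llsc_bytemap_init) are made up for illustration and
are not existing QEMU symbols:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * One byte of LL/SC state per guest RAM page (illustrative only):
 *   0..253: TLB_EXCL must be set in all VCPUs except the CPU with this id
 *   254:    TLB_EXCL must be set in all VCPUs
 *   255:    TLB_EXCL not set in at least two VCPUs
 */
static uint8_t *llsc_bytemap;

static void llsc_bytemap_init(size_t ram_pages, int smp_cpus)
{
    llsc_bytemap = malloc(ram_pages);
    /* Initial value: 255 on SMP guests, 0 on UP guests. */
    memset(llsc_bytemap, smp_cpus > 1 ? 255 : 0, ram_pages);
}

With 4K guest pages that is one byte per 4096, i.e. about 0.02% of guest
memory, instead of ~0.6% for the 200-bit-per-page bitmap.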
ll algorithm:
   llsc_value = bytemap[vaddr]
   if llsc_value == CPU_ID
        do nothing
   elseif llsc_value < 254
        flush TLB of CPU llsc_value
   elseif llsc_value == 255
        flush all TLBs
   set TLB_EXCL
   bytemap[vaddr] = 254
   load

tlb_set_page algorithm:
   llsc_value = bytemap[vaddr]
   if llsc_value == CPU_ID or llsc_value == 255
        do nothing
   else if llsc_value == 254
        set TLB_EXCL
   else
        # two CPUs without TLB_EXCL
        bytemap[vaddr] = 255

TLB_EXCL slow path algorithm:
   if bytemap[vaddr] == 254
        bytemap[vaddr] = CPU_ID
   else
        # two CPUs without TLB_EXCL
        bytemap[vaddr] = 255
   clear TLB_EXCL in this CPU
   store

sc algorithm:
   if bytemap[vaddr] == CPU_ID or bytemap[vaddr] == 254
        bytemap[vaddr] = CPU_ID
        clear TLB_EXCL in this CPU
        store
        succeed
   else
        fail

clear algorithm:
   if bytemap[vaddr] == 254
        bytemap[vaddr] = CPU_ID

The UP case is optimized because bytemap[vaddr] will always be 0 or 254.
The algorithm requires the LL to be cleared e.g. on exceptions.  A rough
C sketch of these transitions is appended below the quoted text.

Paolo

> alvise
>
> On Tue, Aug 11, 2015 at 6:32 PM, Paolo Bonzini wrote:
>>
>>
>> On 11/08/2015 18:11, alvise rigo wrote:
>>>>> Why flush the entire cache (I understand you mean TLB)?
>>> Sorry, I meant the TLB.
>>> If for each removal of an exclusive entry we set also the bit to 1, we
>>> force the following LL to make a tlb_flush() on every vCPU.
>>
>> What if you only flush one entry with tlb_flush_entry (on every vCPU)?
>>
>> Paolo
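And the promised sketch: a rough C rendering of the ll, TLB_EXCL slow path
and sc transitions above.  Everything here is hypothetical glue
(tlb_flush_cpu, tlb_flush_all, set_tlb_excl, clear_tlb_excl, PAGE_SHIFT
standing in for TARGET_PAGE_BITS), not the real cputlb.c interfaces; the
tlb_set_page and clear cases would follow the same pattern.

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                          /* stand-in for TARGET_PAGE_BITS */
#define PAGE_IDX(vaddr) ((vaddr) >> PAGE_SHIFT)

extern uint8_t *llsc_bytemap;                  /* the bytemap sketched earlier */

/* Hypothetical hooks into the TLB code. */
void tlb_flush_cpu(int cpu_id);
void tlb_flush_all(void);
void set_tlb_excl(int cpu_id, uint64_t vaddr);
void clear_tlb_excl(int cpu_id, uint64_t vaddr);

/* ll: make every VCPU trap on stores to this page, then do the load. */
void llsc_ll(int cpu_id, uint64_t vaddr)
{
    uint8_t v = llsc_bytemap[PAGE_IDX(vaddr)];

    if (v == cpu_id) {
        /* do nothing: only our own TLB may lack TLB_EXCL */
    } else if (v < 254) {
        tlb_flush_cpu(v);                      /* flush TLB of CPU v */
    } else if (v == 255) {
        tlb_flush_all();                       /* flush all TLBs */
    }
    set_tlb_excl(cpu_id, vaddr);
    llsc_bytemap[PAGE_IDX(vaddr)] = 254;
    /* ... perform the load ... */
}

/* TLB_EXCL slow path: an ordinary store hit a TLB_EXCL entry. */
void llsc_excl_slow_path(int cpu_id, uint64_t vaddr)
{
    if (llsc_bytemap[PAGE_IDX(vaddr)] == 254) {
        llsc_bytemap[PAGE_IDX(vaddr)] = cpu_id;
    } else {
        llsc_bytemap[PAGE_IDX(vaddr)] = 255;   /* two CPUs without TLB_EXCL */
    }
    clear_tlb_excl(cpu_id, vaddr);
    /* ... perform the store ... */
}

/* sc: succeed only if no other VCPU's store downgraded the entry. */
bool llsc_sc(int cpu_id, uint64_t vaddr)
{
    uint8_t v = llsc_bytemap[PAGE_IDX(vaddr)];

    if (v == cpu_id || v == 254) {
        llsc_bytemap[PAGE_IDX(vaddr)] = cpu_id;
        clear_tlb_excl(cpu_id, vaddr);
        /* ... perform the store ... */
        return true;                           /* succeed */
    }
    return false;                              /* fail */
}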