From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73] helo=mx1.redhat.com)
	by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80)
	(envelope-from )
	id 1fOSoR-0002ze-L1
	for speck@linutronix.de; Thu, 31 May 2018 21:00:40 +0200
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 5A894C12A8
	for ; Thu, 31 May 2018 19:00:33 +0000 (UTC)
Received: from washington.bos.jonmasters.org (ovpn-122-118.rdu2.redhat.com [10.10.122.118])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id 45383111AF01
	for ; Thu, 31 May 2018 19:00:33 +0000 (UTC)
Subject: [MODERATED] Re: [PATCH 1/2] L1TF KVM 1
References: <20180529194214.2600-1-pbonzini@redhat.com> <20180529194240.7F1336110A@crypto-ml.lab.linutronix.de>
From: Jon Masters
Message-ID:
Date: Thu, 31 May 2018 15:00:33 -0400
MIME-Version: 1.0
In-Reply-To:
Content-Type: multipart/mixed; boundary="uRFXy7WCsBBJrOfDblgDASjBUcu24loZx"; protected-headers="v1"
To: speck@linutronix.de
List-ID:

This is an OpenPGP/MIME encrypted message (RFC 4880 and 3156)
--uRFXy7WCsBBJrOfDblgDASjBUcu24loZx
Content-Type: text/plain; charset=windows-1252
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 05/29/2018 06:49 PM, speck for Thomas Gleixner wrote:
> On Tue, 29 May 2018, speck for Paolo Bonzini wrote:
>> +void kvm_l1d_flush(void)
>> +{
>> +	asm volatile(
>> +		"movq %0, %%rax\n\t"
>> +		"leaq 65536(%0), %%rdx\n\t"
>
> Why 64K?
>
>> +		"11: \n\t"
>> +		"movzbl (%%rax), %%ecx\n\t"
>> +		"addq $4096, %%rax\n\t"
>> +		"cmpq %%rax, %%rdx\n\t"
>> +		"jne 11b\n\t"
>> +		"xorl %%eax, %%eax\n\t"
>> +		"cpuid\n\t"

My guess is they're saying that the maximum L1D$ size is 64K, so they
want to stride through it one 4K page at a time to get the prefetchers
going ahead of the next loop...this in theory will make the following
loop "faster".

> What's the cpuid invocation for?
>
>> +		"xorl %%eax, %%eax\n\t"
>> +		"12:\n\t"
>> +		"movzwl %%ax, %%edx\n\t"
>> +		"addl $64, %%eax\n\t"
>> +		"movzbl (%%rdx, %0), %%ecx\n\t"
>> +		"cmpl $65536, %%eax\n\t"

...which then tries to do 64 bytes (Intel cache line) at a time. They use
the CPUID as a serializing instruction to ensure the store has been
observed; others have commented on that.

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop

--uRFXy7WCsBBJrOfDblgDASjBUcu24loZx--
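
For readers following the two loops discussed above, here is a rough portable C sketch of the same access pattern: one load per 4K page across a 64K buffer, then one load per 64-byte line across the same buffer. It is only an illustration. The names (touch_pages, touch_lines, l1d_fill_sketch) and the macros are made up for this sketch and are not from the patch, and the code has no equivalent of the CPUID step, so it does not flush or serialize anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the quoted asm; the macro names are hypothetical. */
#define FLUSH_SIZE  65536u /* 64K span walked by both loops */
#define PAGE_STRIDE 4096u  /* first loop: one load per 4K page */
#define LINE_STRIDE 64u    /* second loop: one load per 64-byte line */

/* First pass: read one byte from each 4K page. Returns the load count. */
static unsigned touch_pages(const volatile uint8_t *buf)
{
	unsigned loads = 0;

	for (size_t off = 0; off < FLUSH_SIZE; off += PAGE_STRIDE) {
		(void)buf[off]; /* volatile read, so the load is kept */
		loads++;
	}
	return loads;
}

/* Second pass: read one byte from each 64-byte line. Returns the load count. */
static unsigned touch_lines(const volatile uint8_t *buf)
{
	unsigned loads = 0;

	for (size_t off = 0; off < FLUSH_SIZE; off += LINE_STRIDE) {
		(void)buf[off];
		loads++;
	}
	return loads;
}

/* Both passes in the order used by the quoted patch (hypothetical helper). */
static unsigned l1d_fill_sketch(const uint8_t *buf)
{
	unsigned loads;

	assert(buf != NULL);
	loads = touch_pages(buf);
	/*
	 * The patch executes CPUID between the two loops as a serializing
	 * instruction. Portable C has no equivalent, so it is omitted here.
	 */
	loads += touch_lines(buf);
	return loads;
}
```

Calling l1d_fill_sketch() on a 64K buffer performs 16 page-stride loads followed by 1024 line-stride loads, which is the arithmetic behind the 4096 and 64 strides in the asm.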