From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53114) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1db0TF-0003tc-9R for qemu-devel@nongnu.org; Fri, 28 Jul 2017 04:18:06 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1db0TA-00088b-8y for qemu-devel@nongnu.org; Fri, 28 Jul 2017 04:18:05 -0400
Received: from mail-wr0-x22d.google.com ([2a00:1450:400c:c0c::22d]:36889) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1db0TA-00085j-2E for qemu-devel@nongnu.org; Fri, 28 Jul 2017 04:18:00 -0400
Received: by mail-wr0-x22d.google.com with SMTP id 33so95601378wrz.4 for ; Fri, 28 Jul 2017 01:17:57 -0700 (PDT)
References: <20170724210306.18428-1-bobby.prani@gmail.com> <98c62747-c020-e0f8-d0ce-c49bd05bb9d9@redhat.com>
From: Alex Bennée
In-reply-to: <98c62747-c020-e0f8-d0ce-c49bd05bb9d9@redhat.com>
Date: Fri, 28 Jul 2017 09:17:54 +0100
Message-ID: <871sp1dlfx.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [RFC PATCH] tcg/softmmu: Increase size of TLB cache
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: Pranith Kumar, qemu-devel@nongnu.org, rth@twiddle.net

Paolo Bonzini writes:

> On 24/07/2017 23:03, Pranith Kumar wrote:
>> This patch increases the number of entries we allow in the TLB. I went
>> over a few architectures to see if increasing it is problematic. Only
>> armv6 seems to have a limitation that only 8 bits can be used for
>> indexing these entries. For other architectures, I increased the
>> number of TLB entries to a 4K-sized cache.
>>
>> Signed-off-by: Pranith Kumar
>
> How did you benchmark this, and can you plot (at least for x86 hosts)
> the results as CPU_TLB_BITS_MAX grows from 8 to 12?

Pranith has some numbers, but what we were seeing is the re-fill path
creeping up the perf profiles. Because it is so expensive to re-compute
the entries, pushing up the TLB size does ameliorate the problem.

That said, I don't think increasing the TLB size is our only solution.
What I've asked for is some idea of the pattern of evictions from the
TLB and of how well the victim cache performs. It may be that tweaking
the locality of that cache would be enough.

One idea I had: with an 8-bit TLB you could afford to have 256
dynamically grown arrays in the victim path - one per entry. Then at
flush time you could simply count up the number of victims in the array
for that slot. That would give you a good idea of whether some regions
are hotter than others.

--
Alex Bennée
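
To make the per-slot victim idea concrete, here is a rough sketch in C.
It is not QEMU code: VictimLog, victim_log_record() and
victim_log_flush() are made-up names, and the 4K page size
(PAGE_BITS = 12) and the simple "page number modulo table size" index
are assumptions for illustration only. The eviction path would call
victim_log_record() for each entry pushed out of the main TLB, and the
flush path would call victim_log_flush() to collect the per-slot counts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TLB_BITS  8                 /* 8-bit index, as in the mail */
#define TLB_SIZE  (1 << TLB_BITS)   /* 256 slots */
#define PAGE_BITS 12                /* assumption: 4K pages */

/* Hypothetical per-slot log: one dynamically grown array per TLB slot. */
typedef struct {
    uint64_t *addrs;   /* virtual addresses evicted from this slot */
    size_t len;        /* number of victims recorded so far */
    size_t cap;        /* allocated capacity, in entries */
} VictimLog;

static VictimLog victim_log[TLB_SIZE];

/* Assumed index function: low TLB_BITS bits of the page number. */
static unsigned slot_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> PAGE_BITS) & (TLB_SIZE - 1));
}

/* Record one eviction; returns 0 on success, -1 if allocation fails. */
static int victim_log_record(uint64_t vaddr)
{
    VictimLog *v = &victim_log[slot_index(vaddr)];

    if (v->len == v->cap) {
        size_t ncap = v->cap ? v->cap * 2 : 4;
        uint64_t *p = realloc(v->addrs, ncap * sizeof(*p));
        if (!p) {
            return -1;
        }
        v->addrs = p;
        v->cap = ncap;
    }
    v->addrs[v->len++] = vaddr;
    return 0;
}

/*
 * At flush time: copy the per-slot victim counts into counts[], reset
 * every log, and return the total number of victims seen since the
 * previous flush.
 */
static size_t victim_log_flush(size_t counts[TLB_SIZE])
{
    size_t total = 0;

    assert(counts != NULL);
    for (int i = 0; i < TLB_SIZE; i++) {
        counts[i] = victim_log[i].len;
        total += victim_log[i].len;
        free(victim_log[i].addrs);
        victim_log[i].addrs = NULL;
        victim_log[i].len = 0;
        victim_log[i].cap = 0;
    }
    return total;
}
```

Slots whose count sits well above the average are the hot regions.
Keeping the addresses, rather than a bare counter per slot, also lets
you look at which pages keep bouncing out of the same slot.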