From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45169)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1gH9ld-0002k5-Bd for qemu-devel@nongnu.org;
	Mon, 29 Oct 2018 11:47:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1gH9lc-00036J-0A for qemu-devel@nongnu.org;
	Mon, 29 Oct 2018 11:47:49 -0400
Date: Mon, 29 Oct 2018 11:47:28 -0400
From: "Emilio G. Cota"
Message-ID: <20181029154728.GA17805@flamenco>
References: <20181025144644.15464-1-cota@braap.org>
	<20181025151103.GA19931@flamenco> <878t2j21ko.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <878t2j21ko.fsf@linaro.org>
Subject: Re: [Qemu-devel] [Qemu-arm] [RFC v4 00/71] per-CPU locks
To: Alex Bennée
Cc: qemu-devel@nongnu.org, Peter Maydell, Chris Wulff, Sagar Karandikar,
	David Hildenbrand, James Hogan, Anthony Green, Palmer Dabbelt,
	Mark Cave-Ayland, Max Filippov, Michael Clark, Guan Xuetao,
	Marek Vasut, Alexander Graf, Christian Borntraeger, Pavel Dovgalyuk,
	Andrzej Zaborowski, Artyom Tarasenko, Eduardo Habkost,
	Richard Henderson, Fabien Chouteau, qemu-s390x@nongnu.org,
	qemu-arm@nongnu.org, Alistair Francis, Stafford Horne, David Gibson,
	Bastian Koppelmann, Cornelia Huck, Laurent Vivier, Michael Walle,
	qemu-ppc@nongnu.org, Aleksandar Markovic, Paolo Bonzini,
	Aurelien Jarno

On Sat, Oct 27, 2018 at 10:14:47 +0100, Alex Bennée wrote:
>
> Emilio G. Cota writes:
>
> > [I forgot to add the cover letter to git send-email; here it is]
> >
> > v3: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg04179.html
> >
> > "Why is this an RFC?" See v3 link above. Also, see comment at
> > the bottom of this message regarding the last patch of this series.
>
> I'm also seeing this hang on check-tcg, specifically qemu-aarch64
> ./tests/linux-test

Thanks for reporting. The last patch in the series is the one that
causes the hang. I didn't test that patch much, since I did not
intend to get it merged.

Over the weekend I had a bit of time to think about an actual fix,
i.e. how to reduce safe work calls for TLB invalidations. The idea
is to check whether the remote invalidation is necessary at all;
we can take the remote TLB's lock, and check whether the address
we want to invalidate has been read by the remote CPU since its
latest flush. In some quick tests booting an aarch64 system I
measured that only up to ~2% of remote invalidations are actually
necessary.

I just did a search on Google Scholar and found a similar approach
to reduce remote TLB shootdowns on ARM, this time for hardware.
This paper
  "TLB Shootdown Mitigation for Low-Power Many-Core Servers with
   L1 Virtual Caches"
  https://dl.acm.org/citation.cfm?id=3202975
addresses the issue by employing bloom filters in hardware to
determine whether an address has been accessed by a TLB before
performing an invalidation (and the corresponding icache flush).

In software, using a per-TLB hash table might be enough. I'll try
to have something ready for v5.

Thanks,

		Emilio
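
For illustration only, the "is this remote invalidation needed?" check
described above could be sketched in C roughly as follows. This is not
code from the series. All names (TlbTracker, tracker_*), the 4 KiB page
size and the 1024-bit filter are assumptions made up for the example,
and a fixed-size single-hash bit filter stands in for the per-TLB hash
table mentioned above. The filter can report false positives (an
unnecessary invalidation) but never false negatives, so skipping an
invalidation on a "no" answer is safe in this model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BITS   12      /* assumption: 4 KiB target pages */
#define FILTER_BITS 1024    /* hypothetical filter size, 2^10 bits */

/* Hypothetical per-vCPU tracking state; not a QEMU structure. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t seen[FILTER_BITS / 64]; /* pages touched since last flush */
} TlbTracker;

/* Map an address to a filter slot via a multiplicative hash of its page. */
static unsigned page_slot(uint64_t addr)
{
    uint64_t page = addr >> PAGE_BITS;
    /* Keep the top 10 bits of the product: a value in [0, FILTER_BITS). */
    return (unsigned)((page * 0x9E3779B97F4A7C15ull) >> 54);
}

static void tracker_init(TlbTracker *t)
{
    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    memset(t->seen, 0, sizeof(t->seen));
}

/* Owner CPU: record a page when a translation for it enters the TLB. */
static void tracker_note_access(TlbTracker *t, uint64_t addr)
{
    unsigned s = page_slot(addr);

    pthread_mutex_lock(&t->lock);
    t->seen[s / 64] |= 1ull << (s % 64);
    pthread_mutex_unlock(&t->lock);
}

/* Owner CPU: a full flush empties the TLB, so forget all recorded pages. */
static void tracker_flush(TlbTracker *t)
{
    pthread_mutex_lock(&t->lock);
    memset(t->seen, 0, sizeof(t->seen));
    pthread_mutex_unlock(&t->lock);
}

/*
 * Requesting CPU: take the remote tracker's lock and ask whether the
 * page may have been accessed since the remote CPU's latest flush.
 * false: definitely not accessed, the remote invalidation can be skipped.
 * true:  possibly accessed, fall back to the usual remote invalidation.
 */
static bool tracker_needs_invalidate(TlbTracker *t, uint64_t addr)
{
    unsigned s = page_slot(addr);
    bool hit;

    pthread_mutex_lock(&t->lock);
    hit = (t->seen[s / 64] >> (s % 64)) & 1;
    pthread_mutex_unlock(&t->lock);
    return hit;
}
```

In this sketch a requesting CPU would call tracker_needs_invalidate()
on each remote vCPU's tracker and queue the safe work only for those
that return true.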