From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751979AbdHRJRq (ORCPT ); Fri, 18 Aug 2017 05:17:46 -0400
Received: from mx1.redhat.com ([209.132.183.28]:43942 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751204AbdHRJRm (ORCPT ); Fri, 18 Aug 2017 05:17:42 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 6FFB5C04B31B
Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com;
        dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com;
        spf=fail smtp.mailfrom=vkuznets@redhat.com
From: Vitaly Kuznetsov
To: Juergen Gross
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Thomas Gleixner,
        Ingo Molnar, "H. Peter Anvin", "Kirill A. Shutemov",
        Peter Zijlstra, Linus Torvalds, Jork Loeser, KY Srinivasan,
        Stephen Hemminger, Steven Rostedt, Boris Ostrovsky,
        Andrew Cooper, Andy Lutomirski, xen-devel@lists.xenproject.org
Subject: Re: [PATCH RFC] x86: enable RCU based table free when PARAVIRT
References: <20170817092057.18920-1-vkuznets@redhat.com>
        <841bbc41-37d5-d9dd-54b7-20de21cc7cb2@suse.com>
Date: Fri, 18 Aug 2017 11:17:36 +0200
In-Reply-To: <841bbc41-37d5-d9dd-54b7-20de21cc7cb2@suse.com> (Juergen Gross's
        message of "Fri, 18 Aug 2017 11:07:39 +0200")
Message-ID: <87inhlz13j.fsf@vitty.brq.redhat.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.31]); Fri, 18 Aug 2017 09:17:42 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Juergen Gross writes:

> On 17/08/17 11:20, Vitaly Kuznetsov wrote:
>> On x86 software page-table walkers depend on the fact that remote TLB flush
>> does an IPI: the walk is performed lockless but with interrupts disabled,
>> and in case the pagetable is freed the freeing CPU will get blocked as
>> remote TLB flush is required. On other architectures which don't require an
>> IPI to do remote TLB flush we have an RCU-based mechanism (see
>> include/asm-generic/tlb.h for more details).
>>
>> In virtualized environments we may want to override the .flush_tlb_others
>> hook in pv_mmu_ops and use a hypercall asking the hypervisor to do remote
>> TLB flush for us. This breaks the assumption about IPI. Xen PV has done
>> this for years and the upcoming remote TLB flush for Hyper-V will do it
>> too. This is not safe: software pagetable walkers may step on an already
>> freed page.
>>
>> Solve the issue by enabling the RCU-based table free mechanism when
>> PARAVIRT is selected in config. Testing with kernbench doesn't show any
>> notable performance impact:
>>
>> 6-CPU host:
>>
>> Average Half load -j 3 Run (std deviation):
>> CURRENT                             HAVE_RCU_TABLE_FREE
>> =======                             ===================
>> Elapsed Time 400.498 (0.179679)     Elapsed Time 399.909 (0.162853)
>> User Time 1098.72 (0.278536)        User Time 1097.59 (0.283894)
>> System Time 100.301 (0.201629)      System Time 99.736 (0.196254)
>> Percent CPU 299 (0)                 Percent CPU 299 (0)
>> Context Switches 5774.1 (69.2121)   Context Switches 5744.4 (79.4162)
>> Sleeps 87621.2 (78.1093)            Sleeps 87586.1 (99.7079)
>>
>> Average Optimal load -j 24 Run (std deviation):
>> CURRENT                             HAVE_RCU_TABLE_FREE
>> =======                             ===================
>> Elapsed Time 219.03 (0.652534)      Elapsed Time 218.959 (0.598674)
>> User Time 1119.51 (21.3284)         User Time 1118.81 (21.7793)
>> System Time 100.499 (0.389308)      System Time 99.8335 (0.251423)
>> Percent CPU 432.5 (136.974)         Percent CPU 432.45 (136.922)
>> Context Switches 81827.4 (78029.5)  Context Switches 81818.5 (78051)
>> Sleeps 97124.8 (9822.4)             Sleeps 97207.9 (9955.04)
>>
>> 6-CPU host:
>
> I guess this is wrong information ...

Oops, it was 16, not 6! :-)

>
>>
>> Average Half load -j 8 Run (std deviation):
>> CURRENT                             HAVE_RCU_TABLE_FREE
>> =======                             ===================
>> Elapsed Time 213.538 (3.7891)       Elapsed Time 212.5 (3.10939)
>> User Time 1306.4 (1.83399)          User Time 1307.65 (1.01364)
>> System Time 194.59 (0.864378)       System Time 195.478 (0.794588)
>> Percent CPU 702.6 (13.5388)         Percent CPU 707 (11.1131)
>> Context Switches 21189.2 (1199.4)   Context Switches 21288.2 (552.388)
>> Sleeps 89390.2 (482.325)            Sleeps 89677 (277.06)
>>
>> Average Optimal load -j 64 Run (std deviation):
>> CURRENT                             HAVE_RCU_TABLE_FREE
>> =======                             ===================
>> Elapsed Time 137.866 (0.787928)     Elapsed Time 138.438 (0.218792)
>> User Time 1488.92 (192.399)         User Time 1489.92 (192.135)
>> System Time 234.981 (42.5806)       System Time 236.09 (42.8138)
>> Percent CPU 1057.1 (373.826)        Percent CPU 1057.1 (369.114)
>
> ... as I suspect more than 100% usage per cpu is rather unlikely. :-)
>
>> Context Switches 187514 (175324)    Context Switches 187358 (175060)
>> Sleeps 112633 (24535.5)             Sleeps 111743 (23297.6)
>>
>> Suggested-by: Peter Zijlstra
>> Signed-off-by: Vitaly Kuznetsov
>
> Acked-by: Juergen Gross
>

Thanks!

-- 
  Vitaly
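
For anyone who wants to check the "no notable performance impact" claim against the quoted kernbench numbers, here is a minimal sketch (not part of the patch) that computes the relative change in mean elapsed time for the four runs. The elapsed times are copied from the tables above; the second host is labelled 16-CPU per the correction in this reply. The names RUNS, relative_change_percent and summarize are made up for illustration.

```python
# Mean elapsed times (seconds) copied from the quoted kernbench tables.
# Tuple layout: (host, load, CURRENT, HAVE_RCU_TABLE_FREE)
# The second host is taken to be the 16-CPU one, per the correction in the reply.
RUNS = [
    ("6-CPU", "half -j 3", 400.498, 399.909),
    ("6-CPU", "optimal -j 24", 219.03, 218.959),
    ("16-CPU", "half -j 8", 213.538, 212.5),
    ("16-CPU", "optimal -j 64", 137.866, 138.438),
]


def relative_change_percent(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


def summarize(runs):
    """Return (host, load, percent change in elapsed time) for each run."""
    return [
        (host, load, relative_change_percent(before, after))
        for host, load, before, after in runs
    ]


if __name__ == "__main__":
    for host, load, pct in summarize(RUNS):
        # Negative values mean the HAVE_RCU_TABLE_FREE build finished faster.
        print(f"{host:7} {load:15} {pct:+.3f}%")
```

On these numbers every elapsed-time change is below 0.5% in magnitude: three runs are slightly faster with HAVE_RCU_TABLE_FREE and one (-j 64) is slightly slower. This only looks at the means; the standard deviations in the tables are not taken into account.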