From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
    Jork Loeser <Jork.Loeser@microsoft.com>,
    KY Srinivasan <kys@microsoft.com>,
    Simon Xiao <sixiao@microsoft.com>,
    Haiyang Zhang <haiyangz@microsoft.com>,
    Stephen Hemminger <sthemmin@microsoft.com>,
    "luto@kernel.org" <luto@kernel.org>,
    "hpa@zytor.com" <hpa@zytor.com>,
    "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
    "rostedt@goodmis.org" <rostedt@goodmis.org>,
    "andy.shevchenko@gmail.com" <andy.shevchenko@gmail.com>,
    "tglx@linutronix.de" <tglx@linutronix.de>,
    "mingo@kernel.org" <mingo@kernel.org>,
    "linux-tip-commits@vger.kernel.org" <linux-tip-commits@vger.kernel.org>,
    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush
Date: Wed, 16 Aug 2017 18:42:47 +0200
Message-ID: <87shgrmpko.fsf@vitty.brq.redhat.com>
In-Reply-To: <87fucup9ou.fsf@vitty.brq.redhat.com> (Vitaly Kuznetsov's message of "Mon, 14 Aug 2017 15:20:49 +0200")
[-- Attachment #1: Type: text/plain, Size: 4164 bytes --]
Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Fri, Aug 11, 2017 at 09:16:29AM -0700, Linus Torvalds wrote:
>>> On Fri, Aug 11, 2017 at 2:03 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>> >
>>> > I'm sure we talked about using HAVE_RCU_TABLE_FREE for x86 (and yes that
>>> > would make it work again), but this was some years ago and I cannot
>>> > readily find those emails.
>>>
>>> I think the only time we really talked about HAVE_RCU_TABLE_FREE for
>>> x86 (at least that I was cc'd on) was not because of RCU freeing, but
>>> because we just wanted to use the generic page table lookup code on
>>> x86 *despite* not using RCU freeing.
>>>
>>> And we just ended up renaming HAVE_GENERIC_RCU_GUP as HAVE_GENERIC_GUP.
>>>
>>> There was only passing mention of maybe making x86 use RCU, but the
>>> discussion was really about why the IF flag meant that x86 didn't need
>>> to, iirc.
>>>
>>> I don't recall us ever discussing *really* making x86 use RCU.
>>
>> Google finds me this:
>>
>> https://lwn.net/Articles/500188/
>>
>> Which includes:
>>
>> http://www.mail-archive.com/kvm@vger.kernel.org/msg72918.html
>>
>> which does as was suggested here, selects HAVE_RCU_TABLE_FREE for
>> PARAVIRT_TLB_FLUSH.
>>
>> But yes, this is very much virt specific nonsense, native would never
>> need this.
>
> In case we decide to go HAVE_RCU_TABLE_FREE for all PARAVIRT-enabled
> kernels (as it seems to be the easiest/fastest way to fix Xen PV) - what
> do you think about the required testing? Any suggestion for a
> specifically crafted micro benchmark in addition to standard
> ebizzy/kernbench/...?

In the meantime, I tested HAVE_RCU_TABLE_FREE with kernbench on bare
metal with PARAVIRT enabled in the config (the enablement patch I used
is attached; I know that it breaks other architectures). The results are:

6-CPU host:

Average Half load -j 3 Run (std deviation):

CURRENT                               HAVE_RCU_TABLE_FREE
=======                               ===================
Elapsed Time 400.498 (0.179679)       Elapsed Time 399.909 (0.162853)
User Time 1098.72 (0.278536)          User Time 1097.59 (0.283894)
System Time 100.301 (0.201629)        System Time 99.736 (0.196254)
Percent CPU 299 (0)                   Percent CPU 299 (0)
Context Switches 5774.1 (69.2121)     Context Switches 5744.4 (79.4162)
Sleeps 87621.2 (78.1093)              Sleeps 87586.1 (99.7079)

Average Optimal load -j 24 Run (std deviation):

CURRENT                               HAVE_RCU_TABLE_FREE
=======                               ===================
Elapsed Time 219.03 (0.652534)        Elapsed Time 218.959 (0.598674)
User Time 1119.51 (21.3284)           User Time 1118.81 (21.7793)
System Time 100.499 (0.389308)        System Time 99.8335 (0.251423)
Percent CPU 432.5 (136.974)           Percent CPU 432.45 (136.922)
Context Switches 81827.4 (78029.5)    Context Switches 81818.5 (78051)
Sleeps 97124.8 (9822.4)               Sleeps 97207.9 (9955.04)

16-CPU host:

Average Half load -j 8 Run (std deviation):

CURRENT                               HAVE_RCU_TABLE_FREE
=======                               ===================
Elapsed Time 213.538 (3.7891)         Elapsed Time 212.5 (3.10939)
User Time 1306.4 (1.83399)            User Time 1307.65 (1.01364)
System Time 194.59 (0.864378)         System Time 195.478 (0.794588)
Percent CPU 702.6 (13.5388)           Percent CPU 707 (11.1131)
Context Switches 21189.2 (1199.4)     Context Switches 21288.2 (552.388)
Sleeps 89390.2 (482.325)              Sleeps 89677 (277.06)

Average Optimal load -j 64 Run (std deviation):

CURRENT                               HAVE_RCU_TABLE_FREE
=======                               ===================
Elapsed Time 137.866 (0.787928)       Elapsed Time 138.438 (0.218792)
User Time 1488.92 (192.399)           User Time 1489.92 (192.135)
System Time 234.981 (42.5806)         System Time 236.09 (42.8138)
Percent CPU 1057.1 (373.826)          Percent CPU 1057.1 (369.114)
Context Switches 187514 (175324)      Context Switches 187358 (175060)
Sleeps 112633 (24535.5)               Sleeps 111743 (23297.6)

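To put the deltas in these tables in perspective, here is a minimal sketch in C that computes the relative change between two means and expresses their difference in units of the larger reported standard deviation. The helper names rel_change_pct() and delta_in_stddevs() are hypothetical, and this is only a rough screen, not a significance test.

```c
#include <assert.h>

/* Relative change of `test` versus `base`, in percent. */
double rel_change_pct(double base, double test)
{
	assert(base != 0.0);
	return (test - base) / base * 100.0;
}

/*
 * Absolute difference of the two means, divided by the larger of the
 * two reported standard deviations. Returns 0 if both deviations are 0.
 * This is a rough screen only, not a statistical test.
 */
double delta_in_stddevs(double base, double base_sd,
			double test, double test_sd)
{
	double sd = base_sd > test_sd ? base_sd : test_sd;
	double delta = test - base;

	if (delta < 0.0)
		delta = -delta;
	return sd > 0.0 ? delta / sd : 0.0;
}
```

For the 16-CPU -j 64 run, rel_change_pct(137.866, 138.438) gives a change in elapsed time of roughly +0.41%, and delta_in_stddevs(137.866, 0.787928, 138.438, 0.218792) is below one standard deviation.
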
As you can see, there's no notable difference. I'll think of a
microbenchmark though.
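
One possible shape for such a microbenchmark, sketched in user-space C. It assumes that repeatedly mapping, sparsely touching, and unmapping a large anonymous region makes the kernel allocate and free page-table pages often enough for the table-freeing path to matter. The function names, the span, and the stride are illustrative choices and do not come from this thread. A 2 MiB stride is meant to land each touch in a different PTE page on x86-64 with 4 KiB pages.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Map `span` bytes of anonymous memory, write one byte every `stride`
 * bytes, then unmap the region. Returns the number of bytes written,
 * or -1 if mmap() or munmap() fails.
 */
long map_touch_unmap(size_t span, size_t stride)
{
	long touched = 0;
	char *p;

	assert(stride > 0);
	p = mmap(NULL, span, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	for (size_t off = 0; off < span; off += stride) {
		p[off] = 1;	/* page fault populates the page tables */
		touched++;
	}

	/* Unmapping tears the mapping down again. */
	if (munmap(p, span) != 0)
		return -1;
	return touched;
}

/*
 * Time `iters` map/touch/unmap cycles.
 * Returns elapsed wall-clock seconds, or -1.0 on failure.
 */
double time_cycles(int iters, size_t span, size_t stride)
{
	struct timespec a, b;

	clock_gettime(CLOCK_MONOTONIC, &a);
	for (int i = 0; i < iters; i++)
		if (map_touch_unmap(span, stride) < 0)
			return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &b);

	return (double)(b.tv_sec - a.tv_sec) +
	       (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}
```

Calling time_cycles() with the same arguments on both kernels, ideally from several threads sharing one address space, would give comparable numbers. Whether page-table pages are actually freed on each munmap() depends on the address-space layout, so that should be verified (for example with tracing) before trusting the results.
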
>
> Additionally, I see another option for us: enable 'rcu table free' on
> boot (e.g. by taking tlb_remove_table to pv_ops and doing boot-time
> patching for it) so bare metal and other hypervisors are not affected
> by the change.

It seems there's no need for that, and we can keep things simple...

--
Vitaly
[-- Attachment #2: 0001-x86-enable-RCU-based-table-free-when-PARAVIRT.patch --]
[-- Type: text/x-patch, Size: 2726 bytes --]
From daf5117706920aebe793d1239fccac2edd4d680c Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Mon, 14 Aug 2017 16:05:05 +0200
Subject: [PATCH] x86: enable RCU based table free when PARAVIRT

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/Kconfig      |  1 +
 arch/x86/mm/pgtable.c | 16 ++++++++++++++++
 mm/memory.c           |  5 +++++
 3 files changed, 22 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 781521b7cf9e..9c1666ea04c9 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -167,6 +167,7 @@ config X86
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RCU_TABLE_FREE if SMP && PARAVIRT
 	select HAVE_RELIABLE_STACKTRACE if X86_64 && FRAME_POINTER && STACK_VALIDATION
 	select HAVE_STACK_VALIDATION if X86_64
 	select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 508a708eb9a6..b6aaab9fb3b8 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -56,7 +56,11 @@ void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	pgtable_page_dtor(pte);
 	paravirt_release_pte(page_to_pfn(pte));
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb_remove_table(tlb, pte);
+#else
 	tlb_remove_page(tlb, pte);
+#endif
 }

 #if CONFIG_PGTABLE_LEVELS > 2
@@ -72,21 +76,33 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 	tlb->need_flush_all = 1;
 #endif
 	pgtable_pmd_page_dtor(page);
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb_remove_table(tlb, page);
+#else
 	tlb_remove_page(tlb, page);
+#endif
 }

 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb_remove_table(tlb, virt_to_page(pud));
+#else
 	tlb_remove_page(tlb, virt_to_page(pud));
+#endif
 }

 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+	tlb_remove_table(tlb, virt_to_page(p4d));
+#else
 	tlb_remove_page(tlb, virt_to_page(p4d));
+#endif
 }
 #endif	/* CONFIG_PGTABLE_LEVELS > 4 */
 #endif	/* CONFIG_PGTABLE_LEVELS > 3 */
diff --git a/mm/memory.c b/mm/memory.c
index e158f7ac6730..18d6671b6ae2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -329,6 +329,11 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_
  * See the comment near struct mmu_table_batch.
  */

+static void __tlb_remove_table(void *table)
+{
+	free_page_and_swap_cache(table);
+}
+
 static void tlb_remove_table_smp_sync(void *arg)
 {
 	/* Simply deliver the interrupt */
--
2.13.4