From mboxrd@z Thu Jan  1 00:00:00 1970
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Andy Lutomirski
Cc: devel@linuxdriverproject.org, Stephen Hemminger, Jork Loeser,
	Haiyang Zhang, X86 ML, linux-kernel@vger.kernel.org,
	Steven Rostedt, Ingo Molnar, "H. Peter Anvin", Thomas Gleixner
Subject: Re: [PATCH v3 08/10] x86/hyper-v: use hypercall for remote TLB flush
Date: Thu, 13 Jul 2017 14:46:20 +0200
Message-ID: <87d194mrmr.fsf@vitty.brq.redhat.com>
In-Reply-To: (Andy Lutomirski's message of "Mon, 26 Jun 2017 18:36:46 -0700")
References: <20170519140953.1167-1-vkuznets@redhat.com>
	<20170519140953.1167-9-vkuznets@redhat.com>
	<95cc1a34-418a-a875-4848-7e297a8b48b7@kernel.org>
	<87zie5tbmm.fsf@vitty.brq.redhat.com>
	<87bmqju4v9.fsf@vitty.brq.redhat.com>
List-ID: <linux-kernel.vger.kernel.org>

Andy Lutomirski writes:

> On Tue, May 23, 2017 at 5:36 AM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> Andy Lutomirski writes:
>>>
>>> Also, can you share the benchmark you used for these patches?
>>
>> I didn't do much while writing the patchset; mostly I was running the
>> attached dumb thrasher (32 pthreads doing mmap/munmap). On a 16-vCPU
>> Hyper-V 2016 guest I get the following (just re-did the test with
>> 4.12-rc1):
>>
>> Before the patchset:
>> # time ./pthread_mmap ./randfile
>>
>> real    3m33.118s
>> user    0m3.698s
>> sys     3m16.624s
>>
>> After the patchset:
>> # time ./pthread_mmap ./randfile
>>
>> real    2m19.920s
>> user    0m2.662s
>> sys     2m9.948s
>>
>> K. Y.'s guys at Microsoft did additional testing for the patchset on
>> different Hyper-V deployments, including Azure; they may share their
>> findings too.
>
> I ran this benchmark on my big TLB patchset, mainly to make sure I
> didn't regress your test. I seem to have sped it up by 30% or so
> instead. I need to study this a little bit to figure out why, to make
> sure that the reason isn't that I'm failing to do flushes I need to
> do.

Got back to this and tested everything in a WS2016 Hyper-V guest (24 vCPUs)
with my slightly modified benchmark. The numbers are:

1) Pre-patch:

real    1m15.775s
user    0m0.850s
sys     1m31.515s

2) Your 'x86/pcid' series (the PCID feature is not passed to the guest, so
this is mainly your lazy TLB optimization):

real    0m55.135s
user    0m1.168s
sys     1m3.810s

3) My 'pv tlb shootdown' patchset on top of your 'x86/pcid' series:

real    0m48.891s
user    0m1.052s
sys     0m52.591s

As far as I understand, I need to add 'setup_clear_cpu_cap(X86_FEATURE_PCID)'
to my series to make things work properly if this feature appears in the
guest. Other than that, there is additional room for optimization:
tlb_single_page_flush_ceiling. I'm not sure the default value of 33 is
optimal with Hyper-V's PV flush, but that investigation can be done
separately.

AFAIU, with your TLB preparatory work which got into 4.13 our series
became untangled and can go through different trees. I'll rebase mine and
send it to K. Y. to push through Greg's char-misc tree. Is there anything
blocking your PCID series from going into 4.14?
It seems to be a huge improvement for some workloads.

-- 
Vitaly