From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <95efd030-27f6-5668-a25e-9fbf210bfa1c@bytedance.com>
Date: Thu, 24 Nov 2022 13:55:27 +0000
Subject: Re: [External] Re: [v2 0/6] KVM: arm64: implement vcpu_is_preempted check
From: Usama Arif
To: Marc Zyngier
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
 linux-doc@vger.kernel.org, virtualization@lists.linux-foundation.org,
 linux@armlinux.org.uk, yezengruan@huawei.com, catalin.marinas@arm.com,
 will@kernel.org, steven.price@arm.com, mark.rutland@arm.com,
 bagasdotme@gmail.com, fam.zheng@bytedance.com, liangma@liangbit.com,
 punit.agrawal@bytedance.com
References: <20221104062105.4119003-1-usama.arif@bytedance.com>
 <87k048f3cm.wl-maz@kernel.org>
 <180b91af-a2aa-2cfd-eb7f-b2825c4e3dbe@bytedance.com>
 <86r0y1nmep.wl-maz@kernel.org>
In-Reply-To: <86r0y1nmep.wl-maz@kernel.org>

On 18/11/2022 00:20, Marc Zyngier wrote:
> On Mon, 07 Nov 2022 12:00:44 +0000,
> Usama Arif wrote:
>>
>> On 06/11/2022 16:35, Marc Zyngier wrote:
>>> On Fri, 04 Nov 2022 06:20:59 +0000,
>>> Usama Arif
wrote:
>>>>
>>>> This patchset adds support for vcpu_is_preempted in arm64, which
>>>> allows the guest to check if a vcpu was scheduled out, which is
>>>> useful to know in case it was holding a lock. vcpu_is_preempted can
>>>> be used to improve performance in locking (see owner_on_cpu usage in
>>>> mutex_spin_on_owner, mutex_can_spin_on_owner, rtmutex_spin_on_owner
>>>> and osq_lock) and scheduling (see available_idle_cpu, which is used
>>>> in several places in kernel/sched/fair.c, e.g. in wake_affine to
>>>> determine which CPU can run soonest):
>>>
>>> [...]
>>>
>>>> pvcy shows a smaller overall improvement (50%) compared to
>>>> vcpu_is_preempted (277%). Host side flamegraph analysis shows that
>>>> ~60% of the host time when using pvcy is spent in kvm_handle_wfx,
>>>> compared with ~1.5% when using vcpu_is_preempted, hence
>>>> vcpu_is_preempted shows a larger improvement.
>>>
>>> And have you worked out *why* we spend so much time handling WFE?
>>>
>>> M.
>>
>> It's from the following change in the pvcy patchset:
>>
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index e778eefcf214..915644816a85 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -118,7 +118,12 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
>>  	}
>>  
>>  	if (esr & ESR_ELx_WFx_ISS_WFE) {
>> -		kvm_vcpu_on_spin(vcpu, vcpu_mode_priv(vcpu));
>> +		int state;
>> +		while ((state = kvm_pvcy_check_state(vcpu)) == 0)
>> +			schedule();
>> +
>> +		if (state == -1)
>> +			kvm_vcpu_on_spin(vcpu, vcpu_mode_priv(vcpu));
>>  	} else {
>>  		if (esr & ESR_ELx_WFx_ISS_WFxT)
>>  			vcpu_set_flag(vcpu, IN_WFIT);
>>
>>
>> If my understanding of the pvcy changes is correct, whenever pvcy
>> returns an unchanged vcpu state, we schedule to another vcpu, and
>> it's this constant scheduling where the time is spent. I guess the
>> effect is much larger when the lock contention is very
>> high.
>> This can be seen from the pvcy host side flamegraph as well
>> (~67% of the time is spent in the schedule() call in kvm_handle_wfx).
>> For reference, I have put the graph at:
>> https://uarif1.github.io/pvlock/perf_host_pvcy_nmi.svg
>
> The real issue here is that we don't try to pick the right vcpu to
> run, and strictly rely on schedule() to eventually pick something that
> can run.
>
> An interesting thing to do would be to try and fit the directed yield
> mechanism there. It would be a lot more interesting than the one-off
> vcpu_is_preempted hack, as it gives us a low-level primitive on which
> to construct things (pvcy is effectively a mwait-like primitive).

We could use kvm_vcpu_yield_to to yield to a specific vcpu, but how
would we determine which vcpu to yield to?

IMO vcpu_is_preempted is very well integrated in a lot of core kernel
code, i.e. mutex, rtmutex, rwsem and osq_lock. It is also used in the
scheduler to better determine which vCPU can run soonest, to select an
idle core, etc. I am not sure whether all of these cases would be
optimized by pvcy. Also, with vcpu_is_preempted, some of the lock-heavy
benchmarks come down from spending around 50% of the time in locking to
less than 1%, so I am not sure how much more room there is for
improvement.

We could also use vcpu_is_preempted to optimize IPI performance (along
with directed yield to the target IPI vCPU), similar to how it's done
on x86
(https://lore.kernel.org/all/1560255830-8656-2-git-send-email-wanpengli@tencent.com/).
This case definitely won't be covered by pvcy.

Considering all of the above, i.e. the core kernel integration already
present and possible future use cases of vcpu_is_preempted, maybe it's
worth making vcpu_is_preempted work on arm64 independently of pvcy?

Thanks,
Usama

>
> M.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel