Date: Thu, 10 Dec 2020 08:45:22 +0000
From: Marc Zyngier
To: Joel Fernandes, Quentin Perret
Cc: Sergey Senozhatsky, yezengruan, Will Deacon, LKML, Suleiman Souhlal,
    kvmarm@lists.cs.columbia.edu,
    "moderated list:ARM64 PORT (AARCH64 ARCHITECTURE)", "Wanghaibin (D)"
Subject: Re: [RFC][PATCH 0/4] arm64:kvm: teach guest sched that VCPUs can be preempted
References: <20200721041742.197354-1-sergey.senozhatsky@gmail.com>
    <20200817020310.GA1210848@jagdpanzerIV.localdomain>
    <20200911085841.GB562@jagdpanzerIV.localdomain>
Message-ID: <78091359dab0d8decfc452f7c5c25971@kernel.org>

On 2020-12-10 01:39, Joel Fernandes wrote:

[...]

>> Quentin and I have discussed potential ways of improving guest
>> scheduling on terminally broken systems (otherwise known as
>> big-little), in the form of a capacity request from the guest to
>> the host. I'm not really keen on the host exposing its own capacity,
>> as that doesn't tell the host what the guest actually needs.
>
> I am not sure how a capacity request could work well. It seems the
> cost of a repeated hypercall could be prohibitive. In this case, a
> lighter approach might be for KVM to restrict vCPU threads to run on
> certain types of cores, and pass the capacity information to the guest
> at guest's boot time.

That seems like a very narrow use case. If you actually pin vcpus to
physical CPU classes, DT is the right place to put things, because
it is completely static. This is effectively creating a virtual
big-little, which is in my opinion a userspace job.

> This would be a one-time cost to pay. And then the guest scheduler
> can handle the scheduling appropriately without any more hypercalls.
> Thoughts?

Anything that is a one-off belongs to firmware configuration, IMO.

The case I'm concerned with is when vcpus are allowed to roam across
the system, and hit random physical CPUs because the host has no idea
of the workload the guest deals with (especially as the AMU counters
are either absent or unusable on any available core).

The cost of a hypercall really depends on where you terminate it.
If it is a shallow exit, that's only a few hundred cycles on any
half-baked CPU. Go all the way to userspace, and the host scheduler
is the limit.

But the frequency of that hypercall obviously matters too. How often
do you expect the capacity request to fire? Probably not on each
and every time slice, right?

Quentin, can you shed some light on this?

Thanks,

        M.
-- 
Jazz is not dead. It just smells funny...
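
For the static case discussed above, where vcpus are pinned to physical CPU classes, the userspace side could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names (`group_cpus_by_capacity`, `pin_threads`) are hypothetical, the capacity values are made up and passed in as a plain dict rather than read from the host, and the VMM is assumed to know the thread IDs of its vcpu threads. `os.sched_setaffinity` is the standard, Linux-only Python wrapper around the affinity syscall.

```python
import os
from collections import defaultdict


def group_cpus_by_capacity(capacities):
    """Group host CPU numbers into classes keyed by capacity.

    `capacities` maps a CPU number to a capacity value. How the values
    are obtained is left to the caller (assumption: the VMM has them).
    """
    classes = defaultdict(set)
    for cpu, capacity in capacities.items():
        classes[capacity].add(cpu)
    return dict(classes)


def pin_threads(thread_ids, cpus):
    """Restrict each vcpu thread to the CPUs of one class (Linux only)."""
    for tid in thread_ids:
        os.sched_setaffinity(tid, cpus)


# Hypothetical 8-CPU host: four small cores and four big ones.
example = {0: 446, 1: 446, 2: 446, 3: 446,
           4: 1024, 5: 1024, 6: 1024, 7: 1024}
classes = group_cpus_by_capacity(example)
big_cpus = classes[max(classes)]        # CPUs with the highest capacity
little_cpus = classes[min(classes)]     # CPUs with the lowest capacity
```

A VMM would then call something like `pin_threads(big_vcpu_tids, big_cpus)` once at startup and describe the matching capacities to the guest statically, which is the one-off firmware-configuration step the reply refers to.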
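
To put rough numbers on the cost-versus-frequency question raised above, here is a back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration (a 2 GHz core, 500 cycles for a shallow exit, one request per 4 ms time slice), not a measurement, and `hypercall_overhead` is a hypothetical helper.

```python
def hypercall_overhead(cycles_per_call, calls_per_second, cpu_hz):
    """Fraction of one CPU's cycles spent on the hypercall round trip."""
    return cycles_per_call * calls_per_second / cpu_hz


CPU_HZ = 2_000_000_000        # assumed 2 GHz core
SHALLOW_EXIT_CYCLES = 500     # assumed cost of a shallow exit, in cycles

# One request per 4 ms time slice (250 per second) versus one per second.
per_slice = hypercall_overhead(SHALLOW_EXIT_CYCLES, 250, CPU_HZ)
per_second = hypercall_overhead(SHALLOW_EXIT_CYCLES, 1, CPU_HZ)

print(f"per time slice: {per_slice:.6%}")
print(f"per second:     {per_second:.6%}")
```

With these assumed numbers, even a request on every time slice costs well under a hundredth of a percent of a core when it terminates in the kernel. An exit that goes all the way to userspace would need a much larger cycle figure and is bounded by the host scheduler, which this sketch does not attempt to model.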