Date: Sat, 07 Jan 2023 21:44:52 +0000
Message-ID: <87r0w6dnor.wl-maz@kernel.org>
From: Marc Zyngier
To: Shivam Kumar
Cc: Sean Christopherson, pbonzini@redhat.com, james.morse@arm.com,
	borntraeger@linux.ibm.com, david@redhat.com, kvm@vger.kernel.org,
	Shaju Abraham, Manish Mishra, Anurag Madnawat
Subject: Re: [PATCH v7 1/4] KVM: Implement dirty quota-based throttling of vcpus
In-Reply-To: <77408d91-655a-6f51-5a3e-258e8ff7c358@nutanix.com>
References: <20221113170507.208810-1-shivam.kumar1@nutanix.com>
	<20221113170507.208810-2-shivam.kumar1@nutanix.com>
	<86zgcpo00m.wl-maz@kernel.org>
	<18b66b42-0bb4-4b32-e92c-3dce61d8e6a4@nutanix.com>
	<86mt8iopb7.wl-maz@kernel.org>
	<86ilinqi3l.wl-maz@kernel.org>
	<874jtifpg0.wl-maz@kernel.org>
	<77408d91-655a-6f51-5a3e-258e8ff7c358@nutanix.com>
X-Mailing-List: kvm@vger.kernel.org

On Sat, 07 Jan 2023 17:24:24 +0000,
Shivam Kumar wrote:
> On 26/12/22 3:37 pm, Marc Zyngier wrote:
> > On Sun, 25 Dec 2022 16:50:04 +0000,
> > Shivam Kumar wrote:
> >>
> >> Hi Marc,
> >> Hi Sean,
> >>
> >> Please let me know if there's any further question or feedback.
> >
> > My earlier comments still stand: the proposed API is not usable as a
> > general purpose memory-tracking API because it counts faults instead
> > of memory, making it inadequate except for the most trivial cases.
> > And I cannot believe you were serious when you mentioned that you were
> > happy to make that the API.
> >
> > This requires some serious work, and this series is not yet near a
> > state where it could be merged.
> >
> > Thanks,
> >
> > 	M.
> >
>
> Hi Marc,
>
> IIUC, in the dirty ring interface too, the dirty_index variable is
> incremented in the mark_page_dirty_in_slot function and it is also
> count-based. At least on x86, I am aware that for dirty tracking we
> have uniform granularity as huge pages (2MB pages) too are broken into
> 4K pages and bitmap is at 4K-granularity. Please let me know if it is
> possible to have multiple page sizes even during dirty logging on
> ARM. And if that is the case, I am wondering how we handle the bitmap
> with different page sizes on ARM.

Easy. It *is* page-size, by the very definition of the API which
explicitly says that a single bit represents one basic page. If you
were to only break 1GB mappings into 2MB blocks, you'd have to mark
512 pages dirty at once, no questions asked.

Your API is different because at no point does it imply any
relationship with any page size. As it stands, it is a useless API. I
understand that you are only concerned with your particular use case,
but that's nowhere near good enough. And it has nothing to do with
ARM. This is equally broken on *any* architecture.

> I agree that the notion of pages dirtied according to our
> pages_dirtied variable depends on how we are handling the bitmap but
> we expect the userspace to use the same granularity at which the dirty
> bitmap is handled. I can capture this in documentation

But what does the bitmap have to do with any of this? This is not what
your API is about.
You are supposed to count dirtied memory, and you are counting page
faults instead. No sane userspace can make any sense of that. You keep
coupling the two, but that's wrong. This thing has to be useful on its
own, not just for your particular, super narrow use case. And that's a
shame because the general idea of a dirty quota is an interesting one.

If your sole intention is to capture in the documentation that the API
is broken, then all I can do is to NAK the whole thing. Until you turn
this page-fault quota into the dirty memory quota that you advertise,
I'll continue to say no to it.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
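
The distinction argued above, between counting write faults and counting
dirtied memory, can be illustrated with a small user-space model. This is
only a sketch and not kernel code: the function name account(), the 4K base
page size, and the assumption that every fault hits a distinct, previously
clean mapping are all made up for the illustration.

```python
# Toy model (not kernel code): compare a fault counter with a
# dirtied-memory counter. All names and sizes here are assumptions.

PAGE_SIZE = 4 * 1024            # assumed base page: one dirty-bitmap bit
BLOCK_2M = 2 * 1024 * 1024      # assumed block mapping size


def account(write_faults):
    """Account for a sequence of write faults.

    write_faults holds one mapping size in bytes per fault. Each fault
    is assumed to hit a distinct mapping that was clean beforehand.
    Returns (fault_count, dirty_bytes, dirty_bitmap_bits).
    """
    fault_count = 0
    dirty_bytes = 0
    dirty_bits = 0
    for mapping_size in write_faults:
        fault_count += 1                         # a fault-based quota sees this
        dirty_bytes += mapping_size              # memory that must be re-sent
        dirty_bits += mapping_size // PAGE_SIZE  # one bit per base page
    return fault_count, dirty_bytes, dirty_bits


small = account([PAGE_SIZE] * 3)   # three faults on 4K mappings
large = account([BLOCK_2M] * 3)    # three faults on 2MB mappings

print(small)
print(large)
```

Both runs report the same fault count, yet the second dirties 512 times as
much memory. Under these assumptions, a quota expressed in faults cannot be
interpreted by userspace unless it already knows the mapping size behind
each fault, whereas a quota expressed in bytes (or in base pages) can.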