From: Waiman Long
To: Michal Koutný
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Juri Lelli, Valentin Schneider, Frederic Weisbecker
Subject: Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" paritition
Date: Tue, 2 May 2023 17:26:17 -0400
References: <1b8d9128-d076-7d37-767d-11d6af314662@redhat.com>
 <9862da55-5f41-24c3-f3bb-4045ccf24b2e@redhat.com>
 <226cb2da-e800-6531-4e57-cbf991022477@redhat.com>
 <60ec12dc-943c-b8f0-8b6f-97c5d332144c@redhat.com>
 <46d26abf-a725-b924-47fa-4419b20bbc02@redhat.com>

On 5/2/23 14:01, Michal Koutný wrote:
> Hello.
>
> The previous thread arrived incomplete to me, so I respond to the last
> message only. Point me to a message URL if it was covered.
>
> On Fri, Apr 14, 2023 at 03:06:27PM -0400, Waiman Long wrote:
>> Below is a draft of the new cpuset.cpus.reserve cgroupfs file:
>>
>>   cpuset.cpus.reserve
>>         A read-write multiple values file which exists on all
>>         cpuset-enabled cgroups.
>>
>>         It lists the reserved CPUs to be used for the creation of
>>         child partitions.  See the section on "cpuset.cpus.partition"
>>         below for more information on cpuset partition.
>>         These reserved CPUs should be a subset of "cpuset.cpus" and
>>         will be mutually exclusive of "cpuset.cpus.effective" when used
>>         since these reserved CPUs cannot be used by tasks in the
>>         current cgroup.
>>
>>         There are two modes for partition CPUs reservation -
>>         auto or manual.  The system starts up in auto mode where
>>         "cpuset.cpus.reserve" will be set automatically when valid
>>         child partitions are created and users don't need to touch the
>>         file at all.  This mode has the limitation that the parent of a
>>         partition must be a partition root itself.  So child partition
>>         has to be created one-by-one from the cgroup root down.
>>
>>         To enable the creation of a partition down in the hierarchy
>>         without the intermediate cgroups to be partition roots,
> Why would be this needed? Owning a CPU (a resource) must logically be
> passed all the way from root to the target cgroup, i.e. this is
> expressed by valid partitioning down to given level.
>
>> one
>>         has to turn on the manual reservation mode by writing directly
>>         to "cpuset.cpus.reserve" with a value different from its
>>         current value.  By distributing the reserve CPUs down the cgroup
>>         hierarchy to the parent of the target cgroup, this target cgroup
>>         can be switched to become a partition root if its "cpuset.cpus"
>>         is a subset of the set of valid reserve CPUs in its parent.
> level n
> `- level n+1
>    cpuset.cpus           // these are actually configured by "owner" of level n
>    cpuset.cpus.partition // similarly here, level n decides if child is a partition
>
> I.e. what would be level n/cpuset.cpus.reserve good for when it can
> directly control level n+1/cpuset.cpus?

In the new scheme, the available CPUs are still directly passed down to
a descendant cgroup. However, isolated CPUs (or more generally CPUs
dedicated to a partition) have to be exclusive.
So what cpuset.cpus.reserve does is identify those exclusive CPUs that
can be excluded from the effective_cpus of the parent cgroups before
they are claimed by a child partition. Currently this is done
automatically when a child partition is created off a parent partition
root. The new scheme will split this into two separate steps, reserving
the CPUs and then claiming them, without the requirement that the parent
of a partition has to be a partition root itself.

Cheers,
Longman
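
To make the two steps concrete, here is a toy model in Python using plain sets. It is only a sketch of the draft semantics described above, not the kernel implementation: the function names reserve_cpus() and can_claim() are hypothetical, and the model ignores everything except the subset and exclusion rules.

```python
def reserve_cpus(cpus, effective, requested):
    """Step 1 (hypothetical helper): set aside `requested` CPUs in a cgroup.

    The reserved CPUs must be a subset of the cgroup's cpuset.cpus and are
    removed from its effective set, since its own tasks can no longer use them.
    Returns (reserve, new_effective).
    """
    if not requested <= cpus:
        raise ValueError("reserved CPUs must be a subset of cpuset.cpus")
    return set(requested), effective - requested


def can_claim(child_cpus, parent_reserve):
    """Step 2 (hypothetical helper): may the child become a partition root?

    Per the draft text, the child's cpuset.cpus has to be a subset of the
    reserve CPUs in its parent.
    """
    return child_cpus <= parent_reserve


parent_cpus = {0, 1, 2, 3}
reserve, effective = reserve_cpus(parent_cpus, set(parent_cpus), {2, 3})

print(sorted(effective))            # CPUs still usable by the parent's tasks
print(can_claim({2, 3}, reserve))   # True: fully inside the reserve
print(can_claim({1, 2}, reserve))   # False: CPU 1 was never reserved
```

Note that nothing in this model requires the parent to be a partition root itself; it only needs to hold the reserved CPUs.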