From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Haitao Huang" <haitao.huang@linux.intel.com>,
<dave.hansen@linux.intel.com>, <kai.huang@intel.com>,
<tj@kernel.org>, <mkoutny@suse.com>, <chenridong@huawei.com>,
<linux-kernel@vger.kernel.org>, <linux-sgx@vger.kernel.org>,
<x86@kernel.org>, <cgroups@vger.kernel.org>, <tglx@linutronix.de>,
<mingo@redhat.com>, <bp@alien8.de>, <hpa@zytor.com>,
<sohil.mehta@intel.com>, <tim.c.chen@linux.intel.com>
Cc: <zhiquan1.li@intel.com>, <kristen@linux.intel.com>,
<seanjc@google.com>, <zhanb@microsoft.com>,
<anakrish@microsoft.com>, <mikko.ylinen@linux.intel.com>,
<yangjie@microsoft.com>, <chrisyan@microsoft.com>
Subject: Re: [PATCH v16 09/16] x86/sgx: Add basic EPC reclamation flow for cgroup
Date: Tue, 27 Aug 2024 21:15:38 +0300
Message-ID: <D3QWDRKVYJ25.2TPQTKQ3CWDDZ@kernel.org>
In-Reply-To: <20240821015404.6038-10-haitao.huang@linux.intel.com>
On Wed Aug 21, 2024 at 4:53 AM EEST, Haitao Huang wrote:
> Currently, in EPC page allocation, the kernel simply fails the
> allocation when the current EPC cgroup fails to charge because its usage
> has reached its limit. This is not ideal. When that happens, a better way is
> to reclaim EPC page(s) from the current EPC cgroup to reduce its usage
> so the new allocation can succeed.
>
> Currently, all EPC pages are tracked in a single global LRU, and the
> "global EPC reclamation" supports the following 3 cases:
>
> 1) On-demand asynchronous reclamation: For allocation requests that
> cannot wait for reclamation but can be retried, an asynchronous
> reclamation is triggered, in which the global reclaimer, ksgxd, keeps
> reclaiming EPC pages until the free page count is above a minimal
> threshold.
>
> 2) On-demand synchronous reclamation: For allocations that can wait for
> reclamation, the EPC page allocator, sgx_alloc_epc_page() reclaims
> EPC page(s) immediately until at least one free page is available for
> allocation.
>
> 3) Preemptive reclamation: For some allocation requests, e.g.,
> allocation for reloading a reclaimed page to change its permissions
> or page type, the kernel invokes sgx_reclaim_direct() to preemptively
> reclaim EPC page(s) as a best effort to minimize on-demand
> reclamation for subsequent allocations.
>
> Similarly, a "per-cgroup reclamation" is needed to support the above 3
> cases as well:
>
> 1) For on-demand asynchronous reclamation, a per-cgroup reclamation
> needs to be invoked to maintain a minimal difference between the
> usage and the limit for each cgroup, analogous to the minimal free
> page threshold maintained by the global reclaimer.
>
> 2) For on-demand synchronous reclamation, sgx_cgroup_try_charge() needs
> to invoke the per-cgroup reclamation until the cgroup usage becomes
> at least one page lower than its limit.
>
> 3) For preemptive reclamation, sgx_reclaim_direct() needs to invoke the
> per-cgroup reclamation to minimize per-cgroup on-demand reclamation
> for subsequent allocations.
>
> To support the per-cgroup reclamation, introduce a "per-cgroup LRU" to
> track all EPC pages belonging to the owner cgroup, so that the existing
> sgx_reclaim_pages() can be reused.
>
> Currently, the global reclamation treats all EPC pages equally as it
> scans all EPC pages in FIFO order in the global LRU. The "per-cgroup
> reclamation" needs to somehow achieve the same fairness of all EPC pages
> that are tracked in the multiple LRUs of the given cgroup and all the
> descendants to reflect the nature of the cgroup.
>
> The idea is to achieve such fairness by scanning "all EPC cgroups" of
> the subtree (the given cgroup and all the descendants) equally in turns,
> and, when scanning each cgroup, applying the existing sgx_reclaim_pages()
> to its LRU. This basic flow is encapsulated in a new function,
> sgx_cgroup_reclaim_pages().
>
> Export sgx_reclaim_pages() for use in sgx_cgroup_reclaim_pages(). Also
> modify sgx_reclaim_pages() to return the number of pages scanned so
> sgx_cgroup_reclaim_pages() can track scanning progress and determine
> whether enough scanning has been done or whether to continue scanning
> with the next descendant.
>
> Whenever reclaiming in a subtree of a given root is needed, start the
> scanning from the next descendant after the one where scanning stopped
> last time. To keep track of the next descendant cgroup to scan, add a new
> field, next_cg, in the sgx_cgroup struct. Create an iterator function,
> sgx_cgroup_next_get(), that atomically returns a valid reference to the
> descendant for the next round of scanning and advances @next_cg to the
> next valid descendant in a preorder walk. This iterator function is used
> in sgx_cgroup_reclaim_pages() to iterate over descendants for scanning.
> Separately, also advance @next_cg to the next valid descendant when the
> cgroup referenced by @next_cg is about to be freed.
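
To make the round-robin walk above concrete, here is a minimal userspace
sketch. It is not the code from the patch: struct cg, cg_next_pre(),
cg_scan_lru(), cg_reclaim_pages() and the NR_TO_SCAN value are all
hypothetical stand-ins, there is no locking or reference counting, and
every scanned page is assumed to be reclaimed. It only illustrates
resuming a preorder walk from a saved cursor and stopping once enough
pages have been scanned.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TO_SCAN 16 /* assumed scan budget per reclaim pass */

/* Toy cgroup node; the real sgx_cgroup looks nothing like this. */
struct cg {
	struct cg *parent, *first_child, *next_sibling;
	struct cg *next_cg;     /* resume cursor, meaningful on the subtree root */
	unsigned int lru_pages; /* stand-in for the length of the per-cgroup LRU */
};

/* Preorder successor of pos inside root's subtree; wraps back to root. */
static struct cg *cg_next_pre(struct cg *pos, struct cg *root)
{
	if (pos->first_child)
		return pos->first_child;
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return root;
}

/* Scan at most NR_TO_SCAN pages of one LRU; return the number scanned. */
static unsigned int cg_scan_lru(struct cg *cg)
{
	unsigned int n = cg->lru_pages < NR_TO_SCAN ? cg->lru_pages : NR_TO_SCAN;

	cg->lru_pages -= n; /* toy assumption: scanned == reclaimed */
	return n;
}

/* One pass over root's subtree, resuming where the previous pass stopped. */
static unsigned int cg_reclaim_pages(struct cg *root)
{
	struct cg *start, *pos;
	unsigned int scanned = 0;

	assert(root != NULL);
	start = root->next_cg ? root->next_cg : root;
	pos = start;

	do {
		scanned += cg_scan_lru(pos);
		pos = cg_next_pre(pos, root);
	} while (scanned < NR_TO_SCAN && pos != start);

	root->next_cg = pos; /* the next pass starts here */
	return scanned;
}
```

In this toy version a pass also ends after one full lap of the subtree,
so an empty subtree cannot spin forever. Whether and how the real code
bounds that case is up to the patch, not this sketch.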
>
> Add support for on-demand synchronous reclamation in
> sgx_cgroup_try_charge(), applying sgx_cgroup_reclaim_pages() iteratively
> until cgroup usage is lower than its limit.
>
> Later patches will reuse sgx_cgroup_reclaim_pages() to add support for
> asynchronous and preemptive reclamation.
>
> Note that all reclaimable EPC pages are still tracked in the global LRU,
> so no per-cgroup reclamation is actually active at the moment: -ENOMEM is
> returned by __sgx_cgroup_try_charge() when LRUs are empty. Per-cgroup
> tracking and reclamation will be turned on at the end, after all
> necessary infrastructure is in place.
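
The synchronous charge path described above (retry the charge, reclaim
in between, give up with -ENOMEM once the LRUs are empty) can be sketched
as a small userspace model. Again, this is an illustration under stated
assumptions and not the patch itself: struct epc_cg, epc_try_charge_once(),
epc_reclaim(), epc_try_charge(), the -EBUSY "retry" signal and the batch
size of 16 are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy accounting state for one cgroup, counted in pages. */
struct epc_cg {
	unsigned long usage;       /* pages currently charged */
	unsigned long limit;       /* configured maximum */
	unsigned long reclaimable; /* charged pages sitting on the LRUs */
};

/* Try to charge one page; on failure say whether a retry could help. */
static int epc_try_charge_once(struct epc_cg *cg)
{
	if (cg->usage < cg->limit) {
		cg->usage++;
		return 0;
	}
	if (cg->reclaimable == 0)
		return -ENOMEM; /* LRUs are empty, reclaim cannot free anything */
	return -EBUSY;          /* over the limit, but reclaim may make room */
}

/* Toy reclaim: evict up to batch pages and lower the usage accordingly. */
static void epc_reclaim(struct epc_cg *cg, unsigned long batch)
{
	unsigned long n = cg->reclaimable < batch ? cg->reclaimable : batch;

	assert(cg->reclaimable <= cg->usage);
	cg->reclaimable -= n;
	cg->usage -= n;
}

/* Charge one page, reclaiming iteratively if the caller may wait. */
static int epc_try_charge(struct epc_cg *cg, bool can_reclaim)
{
	for (;;) {
		int ret = epc_try_charge_once(cg);

		if (ret != -EBUSY)
			return ret;
		if (!can_reclaim)
			return -ENOMEM;
		epc_reclaim(cg, 16);
	}
}
```

The loop terminates because each iteration either succeeds, shrinks the
reclaimable count, or hits the empty-LRU case. With all pages still on
the global LRU, as the note above says, the model corresponds to
reclaimable == 0, so an over-limit charge fails with -ENOMEM right away.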
>
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
BR, Jarkko