From: Haitao Huang <haitao.huang@linux.intel.com>
To: jarkko@kernel.org, dave.hansen@linux.intel.com,
kai.huang@intel.com, tj@kernel.org, mkoutny@suse.com,
linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org,
x86@kernel.org, cgroups@vger.kernel.org, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
sohil.mehta@intel.com, tim.c.chen@linux.intel.com
Cc: zhiquan1.li@intel.com, kristen@linux.intel.com,
seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com,
mikko.ylinen@linux.intel.com, yangjie@microsoft.com,
chrisyan@microsoft.com
Subject: [PATCH v11 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
Date: Wed, 10 Apr 2024 11:25:56 -0700
Message-ID: <20240410182558.41467-13-haitao.huang@linux.intel.com>
In-Reply-To: <20240410182558.41467-1-haitao.huang@linux.intel.com>
From: Kristen Carlson Accardi <kristen@linux.intel.com>
The previous patches implemented all the infrastructure needed for
per-cgroup EPC page tracking and reclamation. However, all reclaimable
EPC pages are still tracked in the global LRU because sgx_lru_list()
returns a hard-coded reference to the global LRU.
Change sgx_lru_list() to return the LRU of the cgroup in which the given
EPC page is allocated.
With this change, all EPC pages are tracked in per-cgroup LRUs, so the
global reclaimer (ksgxd) can no longer reclaim any pages from the global
LRU. However, when limits are over-committed, i.e., the sum of the
cgroup limits is greater than the total capacity, it is possible that no
cgroup ever reaches its own limit and reclaims, while the total usage is
still near the capacity. Global reclamation is therefore still needed in
those cases, and it should be performed from the root cgroup.
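To make the over-commit case concrete, here is a minimal userspace
sketch in C. It is not kernel code: struct cg_model and the three
helpers are hypothetical names introduced only for illustration, and the
page counts in the note that follows are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one EPC cgroup; all values are in pages. */
struct cg_model {
	unsigned long max;	/* configured limit */
	unsigned long usage;	/* pages currently charged */
};

/* Over-committed: the limits add up to more than the capacity. */
static bool limits_overcommitted(const struct cg_model *cgs, size_t n,
				 unsigned long capacity)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += cgs[i].max;
	return sum > capacity;
}

/* True if at least one cgroup has hit its limit and must reclaim locally. */
static bool any_cgroup_at_limit(const struct cg_model *cgs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cgs[i].usage >= cgs[i].max)
			return true;
	return false;
}

/* Pages left once every cgroup's usage is subtracted from the capacity. */
static unsigned long free_pages(const struct cg_model *cgs, size_t n,
				unsigned long capacity)
{
	unsigned long used = 0;

	for (size_t i = 0; i < n; i++)
		used += cgs[i].usage;
	assert(used <= capacity);	/* cannot charge more than exists */
	return capacity - used;
}
```

With a capacity of 1000 pages and two cgroups limited to 600 pages each,
usages of 500 and 490 keep both cgroups under their limits, so neither
reclaims on its own, while only 10 pages remain free. That is the
situation in which reclaiming from the root cgroup is needed.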
Modify sgx_reclaim_pages_global() to reclaim from the root EPC cgroup
when the EPC cgroup is enabled, and from the global LRU otherwise.
Export sgx_cgroup_reclaim_pages() in the header file so it can be reused
for this purpose.
Similarly, modify sgx_can_reclaim() to check whether the LRUs of all
cgroups are empty when the EPC cgroup is enabled, and to check only the
global LRU otherwise. Export sgx_cgroup_lru_empty() so it can be reused
for this purpose.
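The emptiness check over a whole cgroup subtree can be sketched in
userspace C as follows. This is only an illustration under stated
assumptions: struct cg_node, subtree_lru_empty() and can_reclaim_model()
are hypothetical names, and the recursion stands in for the css tree
walk that the kernel helper performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cgroup node: a count of reclaimable pages plus children. */
struct cg_node {
	size_t nr_reclaimable;		/* pages on this cgroup's LRU */
	struct cg_node **children;	/* child cgroups, may be NULL */
	size_t nr_children;
};

/* True only if this cgroup and every descendant have an empty LRU. */
static bool subtree_lru_empty(const struct cg_node *root)
{
	assert(root);
	if (root->nr_reclaimable)
		return false;
	for (size_t i = 0; i < root->nr_children; i++)
		if (!subtree_lru_empty(root->children[i]))
			return false;
	return true;
}

/*
 * Same shape as the patched sgx_can_reclaim(): something is reclaimable
 * if any cgroup LRU under the root, or the global LRU, is non-empty.
 */
static bool can_reclaim_model(const struct cg_node *root,
			      size_t global_reclaimable)
{
	return !subtree_lru_empty(root) || global_reclaimable > 0;
}
```

A single reclaimable page in a grandchild is enough for
can_reclaim_model() to return true, which is why the check has to start
at the root rather than look only at the caller's own cgroup.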
Finally, change sgx_reclaim_direct() to also ensure that there are free
pages at the cgroup level, so that the caller can make forward progress.
Export sgx_cgroup_should_reclaim() for reuse.
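The resulting two-level check in direct reclaim can be modelled roughly
as below. This is a userspace sketch, not the driver code: struct
epc_model, the MODEL_* thresholds and their values are hypothetical, the
counters only record which reclaim pass would run, and the global check
is simplified to a plain free-page watermark.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical thresholds, in pages; the real values live in the driver. */
#define MODEL_CG_MIN_FREE	16UL
#define MODEL_NR_LOW_PAGES	32UL

struct epc_model {
	unsigned long cg_usage;		/* pages charged to the caller's cgroup */
	unsigned long cg_max;		/* limit of the caller's cgroup */
	unsigned long cg_reclaimable;	/* pages on that cgroup's LRU */
	unsigned long nr_free;		/* free pages in the whole system */
	unsigned int cg_reclaims;	/* per-cgroup reclaim passes */
	unsigned int global_reclaims;	/* global reclaim passes */
};

/* Little headroom left under the limit, and something to reclaim. */
static bool model_cg_should_reclaim(const struct epc_model *m)
{
	unsigned long headroom;

	headroom = m->cg_max > m->cg_usage ? m->cg_max - m->cg_usage : 0;
	return headroom < MODEL_CG_MIN_FREE && m->cg_reclaimable > 0;
}

/* Two-level check: first the caller's cgroup, then the global watermark. */
static void model_reclaim_direct(struct epc_model *m)
{
	assert(m);
	if (model_cg_should_reclaim(m))
		m->cg_reclaims++;
	if (m->nr_free < MODEL_NR_LOW_PAGES)
		m->global_reclaims++;
}
```

The two checks are independent: a cgroup close to its own limit triggers
only the per-cgroup pass even when the system has plenty of free pages,
and a nearly full system triggers the global pass even when the caller's
cgroup is far below its limit.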
With these changes, global reclamation and per-cgroup reclamation both
work properly with all pages tracked in per-cgroup LRUs.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
V11:
- Reword the comments for global reclamation for allocation failure
after passing cgroup charging. (Kai)
- Add stub functions to remove ifdefs in c file (Kai)
- Add more detailed comments to clarify each page belongs to one cgroup, or the
root. (Kai)
V10:
- Add comment to clarify each page belongs to one cgroup, or the root by
default. (Kai)
- Merge the changes that expose sgx_cgroup_* functions to this patch.
- Add changes for sgx_reclaim_direct() that was missed previously.
V7:
- Split this out from the big patch, #10 in V6. (Dave, Kai)
---
arch/x86/kernel/cpu/sgx/epc_cgroup.c | 6 ++--
arch/x86/kernel/cpu/sgx/epc_cgroup.h | 27 +++++++++++++++++
arch/x86/kernel/cpu/sgx/main.c | 43 ++++++++++++++++++++++++++--
3 files changed, 71 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
index 2efc33476b0b..16fe0e1574ec 100644
--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
@@ -68,7 +68,7 @@ static inline u64 sgx_cgroup_max_pages_to_root(struct sgx_cgroup *sgx_cg)
*
* Return: %true if all cgroups under the specified root have empty LRU lists.
*/
-static bool sgx_cgroup_lru_empty(struct misc_cg *root)
+bool sgx_cgroup_lru_empty(struct misc_cg *root)
{
struct cgroup_subsys_state *css_root;
struct cgroup_subsys_state *pos;
@@ -116,7 +116,7 @@ static bool sgx_cgroup_lru_empty(struct misc_cg *root)
* the LRUs are recently accessed, i.e., considered "too young" to reclaim, no
* page will actually be reclaimed after walking the whole tree.
*/
-static void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm)
+void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm)
{
struct cgroup_subsys_state *css_root;
struct cgroup_subsys_state *pos;
@@ -157,7 +157,7 @@ static void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *cha
* threshold (%SGX_CG_MIN_FREE_PAGE) and there are reclaimable pages within the
* cgroup.
*/
-static bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
+bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
{
u64 cur, max;
diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.h b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
index 9a2d51a91e5c..963aa19d3c47 100644
--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
@@ -13,6 +13,11 @@
#define MISC_CG_RES_SGX_EPC MISC_CG_RES_TYPES
struct sgx_cgroup;
+static inline struct misc_cg *misc_from_sgx(struct sgx_cgroup *sgx_cg)
+{
+ return NULL;
+}
+
static inline struct sgx_cgroup *sgx_get_current_cg(void)
{
return NULL;
@@ -27,8 +32,22 @@ static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_recl
static inline void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg) { }
+static inline bool sgx_cgroup_lru_empty(struct misc_cg *root)
+{
+ return true;
+}
+
+static inline bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
+{
+ return false;
+}
+
static inline void sgx_cgroup_init(void) { }
+static inline void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm)
+{
+}
+
#else
struct sgx_cgroup {
@@ -37,6 +56,11 @@ struct sgx_cgroup {
struct work_struct reclaim_work;
};
+static inline struct misc_cg *misc_from_sgx(struct sgx_cgroup *sgx_cg)
+{
+ return sgx_cg->cg;
+}
+
static inline struct sgx_cgroup *sgx_cgroup_from_misc_cg(struct misc_cg *cg)
{
return (struct sgx_cgroup *)(cg->res[MISC_CG_RES_SGX_EPC].priv);
@@ -67,6 +91,9 @@ static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg)
int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim reclaim);
void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg);
+bool sgx_cgroup_lru_empty(struct misc_cg *root);
+bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg);
+void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm);
void sgx_cgroup_init(void);
#endif
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 11edbdb06782..e42e4a972752 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -32,9 +32,30 @@ static DEFINE_XARRAY(sgx_epc_address_space);
*/
static struct sgx_epc_lru_list sgx_global_lru;
+/*
+ * Get the per-cgroup or global LRU list that tracks the given reclaimable page.
+ */
static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_page)
{
+#ifdef CONFIG_CGROUP_SGX_EPC
+ /*
+ * epc_page->sgx_cg here is never NULL during a reclaimable epc_page's
+ * life between sgx_alloc_epc_page() and sgx_free_epc_page():
+ *
+ * In sgx_alloc_epc_page(), epc_page->sgx_cg is set to the return from
+ * sgx_get_current_cg() which is the misc cgroup of the current task, or
+ * the root by default even if the misc cgroup is disabled by kernel
+ * command line.
+ *
+ * epc_page->sgx_cg is only unset by sgx_free_epc_page().
+ *
+ * This function is never used before sgx_alloc_epc_page() or after
+ * sgx_free_epc_page().
+ */
+ return &epc_page->sgx_cg->lru;
+#else
return &sgx_global_lru;
+#endif
}
/*
@@ -42,7 +63,8 @@ static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_pag
*/
static inline bool sgx_can_reclaim(void)
{
- return !list_empty(&sgx_global_lru.reclaimable);
+ return !sgx_cgroup_lru_empty(misc_cg_root()) ||
+ !list_empty(&sgx_global_lru.reclaimable);
}
static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0);
@@ -404,7 +426,10 @@ static bool sgx_should_reclaim(unsigned long watermark)
static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
{
- sgx_reclaim_pages(&sgx_global_lru, charge_mm);
+ if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC))
+ sgx_cgroup_reclaim_pages(misc_cg_root(), charge_mm);
+ else
+ sgx_reclaim_pages(&sgx_global_lru, charge_mm);
}
/*
@@ -414,6 +439,14 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
*/
void sgx_reclaim_direct(void)
{
+ struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
+
+ /* Make sure there are some free pages at cgroup level */
+ if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg))
+ sgx_cgroup_reclaim_pages(misc_from_sgx(sgx_cg), current->mm);
+ if (sgx_cg)
+ sgx_put_cg(sgx_cg);
+ /* Make sure there are some free pages at global level */
if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
sgx_reclaim_pages_global(current->mm);
}
@@ -616,6 +649,12 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
break;
}
+ /*
+ * At this point, the usage within this cgroup is under its
+ * limit but there is no physical page left for allocation.
+ * Perform a global reclaim to get some pages released from any
+ * cgroup with reclaimable pages.
+ */
sgx_reclaim_pages_global(current->mm);
cond_resched();
}
--
2.25.1