From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "Haitao Huang" <haitao.huang@linux.intel.com>,
<dave.hansen@linux.intel.com>, <tj@kernel.org>,
<mkoutny@suse.com>, <linux-kernel@vger.kernel.org>,
<linux-sgx@vger.kernel.org>, <x86@kernel.org>,
<cgroups@vger.kernel.org>, <tglx@linutronix.de>,
<mingo@redhat.com>, <bp@alien8.de>, <hpa@zytor.com>,
<sohil.mehta@intel.com>
Cc: <zhiquan1.li@intel.com>, <kristen@linux.intel.com>,
<seanjc@google.com>, <zhanb@microsoft.com>,
<anakrish@microsoft.com>, <mikko.ylinen@linux.intel.com>,
<yangjie@microsoft.com>,
"Sean Christopherson" <sean.j.christopherson@intel.com>
Subject: Re: [PATCH v6 07/12] x86/sgx: Introduce EPC page states
Date: Wed, 15 Nov 2023 22:53:14 +0200
Message-ID: <CWZOMMET6NDV.UVZQZ5VPS3NP@kernel.org>
In-Reply-To: <20231030182013.40086-8-haitao.huang@linux.intel.com>
On Mon Oct 30, 2023 at 8:20 PM EET, Haitao Huang wrote:
> Use the lower 2 bits in the flags field of sgx_epc_page struct to track
> EPC states and define an enum for possible states for EPC pages tracked
> for reclamation.
>
> Add the RECLAIM_IN_PROGRESS state to explicitly indicate a page that is
> identified as a candidate for reclaiming, but has not yet been
> reclaimed, instead of relying on list_empty(&epc_page->list). A later
> patch will replace the array on stack with a temporary list to store the
> candidate pages, so list_empty() should no longer be used for this
> purpose.
>
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Co-developed-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> Cc: Sean Christopherson <seanjc@google.com>
> ---
> V6:
> - Drop UNRECLAIMABLE and use only 2 bits for states (Kai)
> - Combine the patch for RECLAIM_IN_PROGRESS
> - Style fixes (Jarkko and Kai)
> ---
> arch/x86/kernel/cpu/sgx/encl.c | 2 +-
> arch/x86/kernel/cpu/sgx/main.c | 33 +++++++++---------
> arch/x86/kernel/cpu/sgx/sgx.h | 62 +++++++++++++++++++++++++++++++---
> 3 files changed, 76 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..17dc108d3ff7 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -1315,7 +1315,7 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page)
> {
> int ret;
>
> - WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> + WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_STATE_MASK);
>
> ret = __eremove(sgx_get_epc_virt_addr(page));
> if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index d347acd717fd..e27ac73d8843 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -315,13 +315,14 @@ static void sgx_reclaim_pages(void)
> list_del_init(&epc_page->list);
> encl_page = epc_page->owner;
>
> - if (kref_get_unless_zero(&encl_page->encl->refcount) != 0)
> + if (kref_get_unless_zero(&encl_page->encl->refcount) != 0) {
> + sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIM_IN_PROGRESS);
> chunk[cnt++] = epc_page;
> - else
> + } else
> /* The owner is freeing the page. No need to add the
> * page back to the list of reclaimable pages.
> */
> - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
> + sgx_epc_page_reset_state(epc_page);
> }
> spin_unlock(&sgx_global_lru.lock);
>
> @@ -347,6 +348,7 @@ static void sgx_reclaim_pages(void)
>
> skip:
> spin_lock(&sgx_global_lru.lock);
> + sgx_epc_page_set_state(epc_page, SGX_EPC_PAGE_RECLAIMABLE);
> list_add_tail(&epc_page->list, &sgx_global_lru.reclaimable);
> spin_unlock(&sgx_global_lru.lock);
>
> @@ -370,7 +372,7 @@ static void sgx_reclaim_pages(void)
> sgx_reclaimer_write(epc_page, &backing[i]);
>
> kref_put(&encl_page->encl->refcount, sgx_encl_release);
> - epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
> + sgx_epc_page_reset_state(epc_page);
>
> sgx_free_epc_page(epc_page);
> }
> @@ -509,7 +511,8 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void)
> void sgx_mark_page_reclaimable(struct sgx_epc_page *page)
> {
> spin_lock(&sgx_global_lru.lock);
> - page->flags |= SGX_EPC_PAGE_RECLAIMER_TRACKED;
> + WARN_ON_ONCE(sgx_epc_page_reclaimable(page->flags));
> + page->flags |= SGX_EPC_PAGE_RECLAIMABLE;
> list_add_tail(&page->list, &sgx_global_lru.reclaimable);
> spin_unlock(&sgx_global_lru.lock);
> }
> @@ -527,16 +530,13 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page)
> int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
> {
> spin_lock(&sgx_global_lru.lock);
> - if (page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED) {
> - /* The page is being reclaimed. */
> - if (list_empty(&page->list)) {
> - spin_unlock(&sgx_global_lru.lock);
> - return -EBUSY;
> - }
> -
> - list_del(&page->list);
> - page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
> + if (sgx_epc_page_reclaim_in_progress(page->flags)) {
> + spin_unlock(&sgx_global_lru.lock);
> + return -EBUSY;
> }
> +
> + list_del(&page->list);
> + sgx_epc_page_reset_state(page);
> spin_unlock(&sgx_global_lru.lock);
>
> return 0;
> @@ -623,6 +623,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
> struct sgx_epc_section *section = &sgx_epc_sections[page->section];
> struct sgx_numa_node *node = section->node;
>
> + WARN_ON_ONCE(page->flags & (SGX_EPC_PAGE_STATE_MASK));
> if (page->epc_cg) {
> sgx_epc_cgroup_uncharge(page->epc_cg);
> page->epc_cg = NULL;
> @@ -635,7 +636,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
> list_add(&page->list, &node->sgx_poison_page_list);
> else
> list_add_tail(&page->list, &node->free_page_list);
> - page->flags = SGX_EPC_PAGE_IS_FREE;
> + page->flags = SGX_EPC_PAGE_FREE;
>
> spin_unlock(&node->lock);
> atomic_long_inc(&sgx_nr_free_pages);
> @@ -737,7 +738,7 @@ int arch_memory_failure(unsigned long pfn, int flags)
> * If the page is on a free list, move it to the per-node
> * poison page list.
> */
> - if (page->flags & SGX_EPC_PAGE_IS_FREE) {
> + if (page->flags == SGX_EPC_PAGE_FREE) {
> list_move(&page->list, &node->sgx_poison_page_list);
> goto out;
> }
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index 0fbe6a2a159b..dd7ab65b5b27 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -23,11 +23,44 @@
> #define SGX_NR_LOW_PAGES 32
> #define SGX_NR_HIGH_PAGES 64
>
> -/* Pages, which are being tracked by the page reclaimer. */
> -#define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0)
> +enum sgx_epc_page_state {
> + /*
> + * Allocated but not tracked by the reclaimer.
> + *
> + * Pages allocated for virtual EPC which are never tracked by the host
> + * reclaimer; pages just allocated from free list but not yet put in
> + * use; pages just reclaimed, but not yet returned to the free list.
> + * Becomes FREE after sgx_free_epc().
> + * Becomes RECLAIMABLE after sgx_mark_page_reclaimable().
> + */
> + SGX_EPC_PAGE_NOT_TRACKED = 0,
> +
> + /*
> + * Page is in the free list, ready for allocation.
> + *
> + * Becomes NOT_TRACKED after sgx_alloc_epc_page().
> + */
> + SGX_EPC_PAGE_FREE = 1,
> +
> + /*
> + * Page is in use and tracked in a reclaimable LRU list.
> + *
> + * Becomes NOT_TRACKED after sgx_unmark_page_reclaimable().
> + * Becomes RECLAIM_IN_PROGRESS in sgx_reclaim_pages() when identified
> + * for reclaiming.
> + */
> + SGX_EPC_PAGE_RECLAIMABLE = 2,
> +
> + /*
> + * Page is in the middle of reclamation.
> + *
> + * Back to RECLAIMABLE if reclamation fails for any reason.
> + * Becomes NOT_TRACKED if reclaimed successfully.
> + */
> + SGX_EPC_PAGE_RECLAIM_IN_PROGRESS = 3,
> +};
>
> -/* Pages on free list */
> -#define SGX_EPC_PAGE_IS_FREE BIT(1)
> +#define SGX_EPC_PAGE_STATE_MASK GENMASK(1, 0)
>
> struct sgx_epc_cgroup;
>
> @@ -40,6 +73,27 @@ struct sgx_epc_page {
> struct sgx_epc_cgroup *epc_cg;
> };
>
> +static inline void sgx_epc_page_reset_state(struct sgx_epc_page *page)
> +{
> + page->flags &= ~SGX_EPC_PAGE_STATE_MASK;
> +}
> +
> +static inline void sgx_epc_page_set_state(struct sgx_epc_page *page, unsigned long flags)
> +{
> + page->flags &= ~SGX_EPC_PAGE_STATE_MASK;
> + page->flags |= (flags & SGX_EPC_PAGE_STATE_MASK);
> +}
> +
> +static inline bool sgx_epc_page_reclaim_in_progress(unsigned long flags)
> +{
> + return SGX_EPC_PAGE_RECLAIM_IN_PROGRESS == (flags & SGX_EPC_PAGE_STATE_MASK);
> +}
> +
> +static inline bool sgx_epc_page_reclaimable(unsigned long flags)
> +{
> + return SGX_EPC_PAGE_RECLAIMABLE == (flags & SGX_EPC_PAGE_STATE_MASK);
> +}
> +
> /*
> * Contains the tracking data for NUMA nodes having EPC pages. Most importantly,
> * the free page list local to the node is stored here.
Looks pretty good to me. I'll hold my acks until the series looks good as
a whole, but I agree with the general idea in this patch.
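
For anyone following along, here is a minimal user-space sketch of the
2-bit state encoding as I read it from the patch. It is not kernel code
and is only meant to illustrate the idea: the epc_* names are
hypothetical stand-ins for the sgx_* helpers, EPC_PAGE_DEMO_FLAG is an
invented flag that exists only to show that bits above the mask survive
a state change, -1 stands in for -EBUSY, and locking and list handling
are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's GENMASK(1, 0). */
#define EPC_PAGE_STATE_MASK 0x3UL

/* Hypothetical flag above the state bits, for illustration only. */
#define EPC_PAGE_DEMO_FLAG (1UL << 2)

/* Same four values as enum sgx_epc_page_state in the patch. */
enum epc_page_state {
	EPC_PAGE_NOT_TRACKED = 0,
	EPC_PAGE_FREE = 1,
	EPC_PAGE_RECLAIMABLE = 2,
	EPC_PAGE_RECLAIM_IN_PROGRESS = 3,
};

struct epc_page {
	unsigned long flags;
};

/* Clear the state bits, leaving the page NOT_TRACKED. */
static inline void epc_page_reset_state(struct epc_page *page)
{
	page->flags &= ~EPC_PAGE_STATE_MASK;
}

/* Replace the state bits without touching any other flag bits. */
static inline void epc_page_set_state(struct epc_page *page, unsigned long state)
{
	page->flags &= ~EPC_PAGE_STATE_MASK;
	page->flags |= state & EPC_PAGE_STATE_MASK;
}

static inline bool epc_page_reclaim_in_progress(const struct epc_page *page)
{
	return (page->flags & EPC_PAGE_STATE_MASK) == EPC_PAGE_RECLAIM_IN_PROGRESS;
}

/*
 * Follows the control flow of sgx_unmark_page_reclaimable() in the patch,
 * minus the LRU lock and list_del(): refuse while the reclaimer holds the
 * page, otherwise drop it back to NOT_TRACKED.
 */
static inline int epc_unmark_reclaimable(struct epc_page *page)
{
	if (epc_page_reclaim_in_progress(page))
		return -1; /* stands in for -EBUSY */

	epc_page_reset_state(page);
	return 0;
}

/* Walk one page through a successful reclaim and return its final flags. */
static inline unsigned long epc_demo_reclaim(void)
{
	struct epc_page page = { .flags = EPC_PAGE_DEMO_FLAG };

	epc_page_set_state(&page, EPC_PAGE_RECLAIMABLE);
	epc_page_set_state(&page, EPC_PAGE_RECLAIM_IN_PROGRESS);
	assert(epc_unmark_reclaimable(&page) == -1); /* busy while in progress */
	epc_page_reset_state(&page);                 /* reclaimed successfully */

	return page.flags;
}
```

The point of the sketch is that the state is a value in the low two
bits, not a set of independent bit flags, so a comparison against a
state has to go through the mask, as the two predicate helpers in the
patch do.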
BR, Jarkko