public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
From: "Huang, Kai" <kai.huang@intel.com>
To: "chenridong@huawei.com" <chenridong@huawei.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"mkoutny@suse.com" <mkoutny@suse.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"haitao.huang@linux.intel.com" <haitao.huang@linux.intel.com>,
	"tim.c.chen@linux.intel.com" <tim.c.chen@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"tj@kernel.org" <tj@kernel.org>,
	"jarkko@kernel.org" <jarkko@kernel.org>,
	"Mehta, Sohil" <sohil.mehta@intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>, "bp@alien8.de" <bp@alien8.de>,
	"x86@kernel.org" <x86@kernel.org>
Cc: "mikko.ylinen@linux.intel.com" <mikko.ylinen@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"anakrish@microsoft.com" <anakrish@microsoft.com>,
	"Zhang, Bo" <zhanb@microsoft.com>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>,
	"yangjie@microsoft.com" <yangjie@microsoft.com>,
	"Li, Zhiquan1" <zhiquan1.li@intel.com>,
	"chrisyan@microsoft.com" <chrisyan@microsoft.com>
Subject: Re: [PATCH v15 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
Date: Thu, 20 Jun 2024 10:30:16 +0000	[thread overview]
Message-ID: <103f18636f0d65e3bcb0ca5f1008c0c7df0bdfd7.camel@intel.com> (raw)
In-Reply-To: <20240617125321.36658-13-haitao.huang@linux.intel.com>


On 18/06/2024 12:53 am, Huang, Haitao wrote:
> From: Kristen Carlson Accardi <kristen@linux.intel.com>
> 
> Previous patches have implemented all infrastructure needed for
> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC
> pages are still tracked in the global LRU as sgx_epc_page_lru() always
> returns reference to the global LRU.
> 
> Change sgx_epc_page_lru() to return the LRU of the cgroup in which the
> given EPC page is allocated.
> 
> This makes all EPC pages tracked in per-cgroup LRUs and the global
> reclaimer (ksgxd) will not be able to reclaim any pages from the global
> LRU. However, in cases of over-committing, i.e., the sum of cgroup
> limits greater than the total capacity, cgroups may never reclaim but
> the total usage can still be near the capacity. Therefore a global
> reclamation is still needed in those cases and it should be performed
> from the root cgroup.
> 
> Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup
> when cgroup is enabled. Similar to sgx_cgroup_reclaim_pages(), return
> the next cgroup so callers can use it as the new starting node for next
> round of reclamation if needed.
> 
> Also update sgx_can_reclaim_global(), to check emptiness of LRUs of all
> cgroups when EPC cgroup is enabled, otherwise only check the global LRU.
> 
> Finally, change sgx_reclaim_direct(), to check and ensure there are free
> pages at cgroup level so forward progress can be made by the caller.

Reading above, it's not clear how the _new_ global reclaim works with
multiple LRUs.

E.g., the current global reclaim essentially treats all EPC pages equally
when scanning those pages.  From the above, I don't see how this is
achieved in the new global reclaim.

The changelog should:

1) describe the how does existing global reclaim work, and then describe
how to achieve the same beahviour in the new global reclaim which works
with multiple LRUs;

2) If there's any behaviour difference between the "existing" vs the "new"
global reclaim, the changelog should point out the difference, and explain
why such difference is OK.

> 
> With these changes, the global reclamation and per-cgroup reclamation
> both work properly with all pages tracked in per-cgroup LRUs.
> 

[...]

>   
> -static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
> +static struct misc_cg *sgx_reclaim_pages_global(struct misc_cg *next_cg,
> +						struct mm_struct *charge_mm)
>   {
> +	if (IS_ENABLED(CONFIG_CGROUP_MISC))
> +		return sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);
> +
>   	sgx_reclaim_pages(&sgx_global_lru, charge_mm);
> +	return NULL;
>   }
>   
>   /*
> @@ -414,12 +443,35 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
>    */
>   void sgx_reclaim_direct(void)
>   {
> +	struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
> +	struct misc_cg *cg = misc_from_sgx(sgx_cg);

From below @sgx_cg could be NULL.  It's not immediately clear whether calling 
misc_from_sgx(sgx_cg) unconditionally is safe here.

Leave the initiaization of @cg to a later phase where @sgx_cg is
guaranteed not being NULL, or initialize @cg to NULL here and update later.

> +	struct misc_cg *next_cg = NULL;
> +
> +	/*
> +	 * Make sure there are some free pages at both cgroup and global levels.
> +	 * In both cases, only make one attempt of reclamation to avoid lengthy
> +	 * block on the caller.
> +	 */
> +	if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg))
> +		next_cg = sgx_cgroup_reclaim_pages(cg, next_cg, current->mm);

I don't quite follow the logic.

First of all, sgx_cgroup_reclaim_pages() isn't called in a loop, so why
not just do:

	next_cg = sgx_cgroup_reclaim_pages(cg, NULL, current->mm);

And what is the point of set @next_cg here, since ...


> +
> +	if (next_cg != cg)
> +		put_misc_cg(next_cg);
> +
> +	next_cg = NULL;

... here @next_cg is reset to NULL ?

Looks the only reason is you need to do ...

	put_misc_cg(next_cg);

... above?

This piece of code appears repeatedly in this file.  Is there any way we
can get rid of it?

>   	if (sgx_should_reclaim_global(SGX_NR_LOW_PAGES))
> -		sgx_reclaim_pages_global(current->mm);
> +		next_cg = sgx_reclaim_pages_global(next_cg, current->mm);

And this doesn't seems "global reclaim" at all?

Because it essentially equals to:

	next_cg = sgx_reclaim_pages_global(NULL, current->mm);

which always reclaims from the ROOT.

So each call to sgx_reclaim_direct() will always reclaim from the ROOT --
any other LRUs in the hierarchy will unlikely to get any chance to be
reclaimed.

> +
> +	if (next_cg != misc_cg_root())
> +		put_misc_cg(next_cg);
> +
> +	sgx_put_cg(sgx_cg);
>   }
>   
>   static int ksgxd(void *p)
>   {
> +	struct misc_cg *next_cg = NULL;
> +
>   	set_freezable();
>   
>   	/*
> @@ -437,11 +489,15 @@ static int ksgxd(void *p)
>   				     kthread_should_stop() ||
>   				     sgx_should_reclaim_global(SGX_NR_HIGH_PAGES));
>   
> -		if (sgx_should_reclaim_global(SGX_NR_HIGH_PAGES))
> +		while (!kthread_should_stop() && sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) {
>   			/* Indirect reclaim, no mm to charge, so NULL: */
> -			sgx_reclaim_pages_global(NULL);
> +			next_cg = sgx_reclaim_pages_global(next_cg, NULL);
> +			cond_resched();

Should the 'put_misc_cg(next_cg)' be done within the while() loop but not
below?
> +		}
>   
> -		cond_resched();
> +		if (next_cg != misc_cg_root())
> +			put_misc_cg(next_cg);
> +		next_cg = NULL;

Again, it doesn't seems "global reclaim" here, since you always restart
from the ROOT once the target pages have been reclaimed.

AFAICT it's completely different from the existing global reclaim.

>   	}
>   
>   	return 0;
> @@ -583,6 +639,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
>    */
>   struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
>   {
> +	struct misc_cg *next_cg = NULL;
>   	struct sgx_cgroup *sgx_cg;
>   	struct sgx_epc_page *page;
>   	int ret;
> @@ -616,10 +673,19 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
>   			break;
>   		}
>   
> -		sgx_reclaim_pages_global(current->mm);
> +		/*
> +		 * At this point, the usage within this cgroup is under its
> +		 * limit but there is no physical page left for allocation.
> +		 * Perform a global reclaim to get some pages released from any
> +		 * cgroup with reclaimable pages.
> +		 */
> +		next_cg = sgx_reclaim_pages_global(next_cg, current->mm);
>   		cond_resched();
>   	}

Ditto IIUC.




  reply	other threads:[~2024-06-20 10:30 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-17 12:53 [PATCH v15 00/14] Add Cgroup support for SGX EPC memory Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 01/14] x86/sgx: Replace boolean parameters with enums Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 02/14] cgroup/misc: Add per resource callbacks for CSS events Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 03/14] cgroup/misc: Export APIs for SGX driver Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 04/14] cgroup/misc: Add SGX EPC resource type Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 05/14] x86/sgx: Implement basic EPC misc cgroup functionality Huang, Haitao
2024-06-18 11:31   ` Huang, Kai
2024-06-18 12:56     ` Haitao Huang
2024-06-18 23:15       ` Huang, Kai
2024-06-18 23:23         ` Haitao Huang
2024-06-19  2:00           ` Huang, Kai
2024-06-17 12:53 ` [PATCH v15 06/14] x86/sgx: Add sgx_epc_lru_list to encapsulate LRU list Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 07/14] x86/sgx: Abstract tracking reclaimable pages in LRU Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 08/14] x86/sgx: Add basic EPC reclamation flow for cgroup Huang, Haitao
2024-06-20 13:28   ` Huang, Kai
2024-06-20 16:05     ` Haitao Huang
2024-06-20 22:52       ` Huang, Kai
2024-06-17 12:53 ` [PATCH v15 09/14] x86/sgx: Abstract check for global reclaimable pages Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 10/14] x86/sgx: Implement async reclamation for cgroup Huang, Haitao
2024-06-20 15:39   ` Haitao Huang
2024-06-17 12:53 ` [PATCH v15 11/14] x86/sgx: Charge mem_cgroup for per-cgroup reclamation Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 12/14] x86/sgx: Turn on per-cgroup EPC reclamation Huang, Haitao
2024-06-20 10:30   ` Huang, Kai [this message]
2024-06-20 15:06     ` Haitao Huang
2024-06-20 23:53       ` Huang, Kai
2024-06-17 12:53 ` [PATCH v15 13/14] Docs/x86/sgx: Add description for cgroup support Huang, Haitao
2024-06-17 12:53 ` [PATCH v15 14/14] selftests/sgx: Add scripts for EPC cgroup testing Huang, Haitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=103f18636f0d65e3bcb0ca5f1008c0c7df0bdfd7.camel@intel.com \
    --to=kai.huang@intel.com \
    --cc=anakrish@microsoft.com \
    --cc=bp@alien8.de \
    --cc=cgroups@vger.kernel.org \
    --cc=chenridong@huawei.com \
    --cc=chrisyan@microsoft.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=haitao.huang@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=mikko.ylinen@linux.intel.com \
    --cc=mingo@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=seanjc@google.com \
    --cc=sohil.mehta@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=tj@kernel.org \
    --cc=x86@kernel.org \
    --cc=yangjie@microsoft.com \
    --cc=zhanb@microsoft.com \
    --cc=zhiquan1.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox