From: Leonardo Brás
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
Date: Fri, 27 Jan 2023 16:29:37 -0300
Message-ID: <029147be35b5173d5eb10c182e124ac9d2f1f0ba.camel@redhat.com>
References: <20230125073502.743446-1-leobras@redhat.com>
 <9e61ab53e1419a144f774b95230b789244895424.camel@redhat.com>
 <55ac6e3cbb97c7d13c49c3125c1455d8a2c785c3.camel@redhat.com>
 <15c605f27f87d732e80e294f13fd9513697b65e3.camel@redhat.com>
To: Michal Hocko
Cc: Roman Gushchin, Marcelo Tosatti, Johannes Weiner, Shakeel Butt,
 Muchun Song, Andrew Morton,
 cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > [...]
> > > > > I'd rather opt out of stock draining for isolated cpus: it might slightly reduce
> > > > > the accuracy of memory limits and slightly increase the memory footprint (all
> > > > > those dying memcgs...), but the impact will be limited. Actually it is limited
> > > > > by the number of cpus.
> > > >
> > > > I was discussing this same idea with Marcelo yesterday morning.
> > > >
> > > > The questions had in the topic were:
> > > > a - About how many pages the pcp cache will hold before draining them itself?
> > >
> > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification. The cache
> > > doesn't really hold any pages. It is a mere counter of how many charges
> > > have been accounted for the memcg page counter. So it is not really
> > > consuming proportional amount of resources. It just pins the
> > > corresponding memcg. Have a look at consume_stock and refill_stock
> >
> > I see. Thanks for pointing that out!
> >
> > So in worst case scenario the memcg would have reserved 64 pages * (numcpus - 1)
>
> s@numcpus@num_isolated_cpus@

I was thinking of the worst-case scenario as (ncpus - 1) CPUs being isolated.

>
> > that are not getting used, and may cause an 'earlier' OOM if this amount is
> > needed but can't be freed.
>
> s@OOM@memcg OOM@

> > In the wave of worst case, supposing a big powerpc machine, 256 CPUs, each
> > holding 64k * 64 pages => 1GB memory - 4MB (one cpu using resources).
> > It's starting to get too big, but still ok for a machine this size.
>
> It is more about the memcg limit rather than the size of the machine.
> Again, let's focus on actual usecase. What is the usual memcg setup with
> those isolcpus

I understand it's about the limit, not the memory actually allocated. When I
point to the machine size, I mean what a user of that machine would be expected
to find acceptable.
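
For reference, the worst-case figure quoted above can be reproduced with a
small back-of-the-envelope sketch. It assumes each isolated CPU's stock is
capped at MEMCG_CHARGE_BATCH (64, as stated earlier in the thread) and that
every isolated CPU sits at that cap for the same memcg; stranded_bytes() and
the constants are illustrative names only, not kernel code.

```python
# Rough upper bound on charges parked in undrained per-CPU memcg stocks.
# Assumption: each stock holds at most MEMCG_CHARGE_BATCH page charges.
MEMCG_CHARGE_BATCH = 64  # value quoted in the thread

KIB = 1024
MIB = 1024 * KIB


def stranded_bytes(isolated_cpus: int, page_size: int,
                   batch: int = MEMCG_CHARGE_BATCH) -> int:
    """Bytes charged to the memcg but sitting unused in per-CPU stocks."""
    return isolated_cpus * batch * page_size


# 256-CPU powerpc machine with 64 KiB pages, all CPUs but one isolated.
worst_64k = stranded_bytes(isolated_cpus=255, page_size=64 * KIB)
# Same CPU count with 4 KiB pages, for comparison.
worst_4k = stranded_bytes(isolated_cpus=255, page_size=4 * KIB)

print(worst_64k // MIB, "MiB")  # 1020 MiB, i.e. 1 GiB - 4 MiB
print(worst_4k / MIB, "MiB")
```

The number only matters relative to the memcg limit: the same bound is
negligible under a large limit and significant under a limit of a few GB.
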
>
> > The thing is that it can present an odd behavior:
> > You have a cgroup created before, now empty, and try to run given application,
> > and hits OOM.
>
> The application would either consume those cached charges or flush them
> if it is running in a different memcg. Or what do you have in mind?

1 - Create a memcg with a VM inside, with multiple vcpus pinned to isolated cpus.
2 - Run a multi-cpu task inside the VM; it allocates memory on every CPU and
    keeps the pcp caches.
3 - Try to run a single-cpu task (pinned?) inside the VM that uses almost all
    the available memory.
4 - memcg OOM.

Does that make sense?

>
> > You then restart the cgroup, run the same application without an issue.
> >
> > Even though it looks a good possibility, this can be perceived by user as
> > instability.
> >
> > >
> > > > b - Would it cache any kind of bigger page, or huge page in this same aspect?
> > >
> > > The above should answer this as well as those following up I hope. If
> > > not let me know.
> >
> > IIUC we are talking normal pages, is that it?
>
> We are talking about memcg charges and those have page granularity.
>

Thanks for the info!

Also, thanks for the feedback!
Leo