From: Michal Hocko <mhocko@suse.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, patches@lists.linux.dev,
Tejun Heo <tj@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
mathieu.tortuyaux@gmail.com
Subject: Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
Date: Thu, 21 Sep 2023 09:52:17 +0200 [thread overview]
Message-ID: <ZQv2MXOynlEPW/bX@dhcp22.suse.cz> (raw)
In-Reply-To: <CALvZod5DSMoEGY0CwGz=P-2=Opbr4SmMfwHhZRROBx7yCaBdDA@mail.gmail.com>
On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> [...]
> > > have a strong opinion against it. Also just to be clear we are not
> > > talking about full revert of 58056f77502f but just the returning of
> > > EOPNOTSUPP, right?
> >
> > If we allow the limit to be set without returning a failure then we
> > still have options 2 and 3 on how to deal with that. One of them is to
> > enforce the limit.
> >
>
> Option 3 is a partial revert of 58056f77502f where we keep the no
> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> with option 3. In addition, let's add pr_warn_once on the read of
> kmem.limit_in_bytes as well.
How about this?
---
From 81ae0797d8da1b9cfbf357b4be4787a5bbf46bb4 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result. The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.
Now, reverting alone 86327e8eb94c is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.
[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
Documentation/admin-guide/cgroup-v1/memory.rst | 7 +++++++
mm/memcontrol.c | 12 ++++++++++++
2 files changed, 19 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
memory.oom_control set/show oom controls.
memory.numa_stat show the number of memory usage per numa
node
+ memory.kmem.limit_in_bytes Deprecated knob to set and read the kernel
+ memory hard limit. Kernel hard limit is not
+ supported since 5.16. Writing any value to
+ do file will not have any effect same as if
+ nokmem kernel parameter was specified.
+ Kernel memory is still charged and reported
+ by memory.kmem.usage_in_bytes.
memory.kmem.usage_in_bytes show current kernel memory allocation
memory.kmem.failcnt show the number of kernel memory usage
hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..ac7f14b2338d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
unsigned int nr_pages)
{
+ struct page_counter *counter;
struct mem_cgroup *memcg;
int ret;
@@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
goto out;
memcg_account_kmem(memcg, nr_pages);
+
+ /* There is no way to set up kmem hard limit so this operation cannot fail */
+ if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+ WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
out:
css_put(&memcg->css);
@@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
case _MEMSWAP:
ret = mem_cgroup_resize_max(memcg, nr_pages, true);
break;
+ case _KMEM:
+ pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+ "Writing any value to this file has no effect. "
+ "Please report your usecase to linux-mm@kvack.org if you "
+ "depend on this functionality.\n");
+ ret = 0;
+ break;
case _TCP:
ret = memcg_update_tcp_max(memcg, nr_pages);
break;
--
2.30.2
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2023-09-21 21:03 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
2023-09-17 20:47 ` SeongJae Park
2023-09-18 5:34 ` Takeshi Ogasawara
2023-09-18 6:42 ` Bagas Sanjaya
2023-09-18 11:24 ` Conor Dooley
2023-09-18 12:08 ` Ron Economos
2023-09-18 12:48 ` Jon Hunter
2023-09-18 18:34 ` Florian Fainelli
2023-09-18 18:41 ` Guenter Roeck
2023-09-18 20:56 ` Naresh Kamboju
2023-09-18 22:21 ` Shuah Khan
[not found] ` <20230917191042.204185566@linuxfoundation.org>
2023-09-20 8:11 ` [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Jeremi Piotrowski
2023-09-20 8:43 ` Michal Hocko
2023-09-20 9:25 ` Greg Kroah-Hartman
2023-09-20 10:21 ` Jeremi Piotrowski
2023-09-20 10:45 ` Greg Kroah-Hartman
2023-09-20 11:08 ` Michal Hocko
2023-09-20 11:16 ` Greg Kroah-Hartman
2023-09-20 10:04 ` Jeremi Piotrowski
2023-09-20 11:07 ` Michal Hocko
2023-09-20 13:25 ` Jeremi Piotrowski
2023-09-20 13:47 ` Michal Hocko
2023-09-20 15:32 ` Shakeel Butt
2023-09-20 16:55 ` Michal Hocko
2023-09-20 19:46 ` Shakeel Butt
2023-09-20 20:08 ` Michal Hocko
2023-09-20 21:46 ` Shakeel Butt
2023-09-21 7:52 ` Michal Hocko [this message]
2023-09-21 10:43 ` Jeremi Piotrowski
2023-09-21 11:21 ` Michal Hocko
2023-09-21 17:25 ` Shakeel Butt
2023-09-21 19:50 ` Michal Hocko
2023-09-22 13:30 ` Johannes Weiner
2023-09-25 7:40 ` Michal Hocko
2023-09-22 23:00 ` Roman Gushchin
2023-09-25 7:41 ` Michal Hocko
2023-09-26 2:49 ` Roman Gushchin
2023-09-22 11:14 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-09-21 13:04 ` [PATCH 6.1 000/219] 6.1.54-rc1 review Conor Dooley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQv2MXOynlEPW/bX@dhcp22.suse.cz \
--to=mhocko@suse.com \
--cc=akpm@linux-foundation.org \
--cc=gregkh@linuxfoundation.org \
--cc=hannes@cmpxchg.org \
--cc=jpiotrowski@linux.microsoft.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.tortuyaux@gmail.com \
--cc=muchun.song@linux.dev \
--cc=patches@lists.linux.dev \
--cc=regressions@lists.linux.dev \
--cc=roman.gushchin@linux.dev \
--cc=shakeelb@google.com \
--cc=stable@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox