[PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
       [not found] <186d5b5b-a082-3814-9963-bf57dfe08511@openvz.org>
@ 2022-06-27  2:12 ` Vasily Averin
  2022-06-27  3:33   ` Muchun Song
                     ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Vasily Averin @ 2022-06-27  2:12 UTC (permalink / raw)
  To: Shakeel Butt, Roman Gushchin, Michal Koutný, Michal Hocko,
	Tejun Heo, Zefan Li, Johannes Weiner
  Cc: kernel, linux-kernel, Andrew Morton, linux-mm, Vlastimil Babka,
	Muchun Song, cgroups

When cgroup_mkdir reaches the limits of the cgroup hierarchy, it should
not return -EAGAIN, but instead react similarly to reaching the global
limit.

Signed-off-by: Vasily Averin <vvs@openvz.org>
---
 kernel/cgroup/cgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1be0f81fe8e1..243239553ea3 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5495,7 +5495,7 @@ int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode)
 		return -ENODEV;
 
 	if (!cgroup_check_hierarchy_limits(parent)) {
-		ret = -EAGAIN;
+		ret = -ENOSPC;
 		goto out_unlock;
 	}
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-27  2:12 ` [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached Vasily Averin
@ 2022-06-27  3:33   ` Muchun Song
  2022-06-27  9:07   ` Tejun Heo
  2022-06-28  0:44   ` Roman Gushchin
  2 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-06-27  3:33 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Shakeel Butt, Roman Gushchin, Michal Koutný, Michal Hocko,
	Tejun Heo, Zefan Li, Johannes Weiner, kernel, LKML, Andrew Morton,
	Linux Memory Management List, Vlastimil Babka, Cgroups

On Mon, Jun 27, 2022 at 10:12 AM Vasily Averin <vvs@openvz.org> wrote:
>
> When cgroup_mkdir reaches the limits of the cgroup hierarchy, it should
> not return -EAGAIN, but instead react similarly to reaching the global
> limit.
>
> Signed-off-by: Vasily Averin <vvs@openvz.org>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-27  2:12 ` [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached Vasily Averin
  2022-06-27  3:33   ` Muchun Song
@ 2022-06-27  9:07   ` Tejun Heo
  2022-06-28  0:44   ` Roman Gushchin
  2 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2022-06-27  9:07 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Shakeel Butt, Roman Gushchin, Michal Koutný, Michal Hocko,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On Mon, Jun 27, 2022 at 05:12:55AM +0300, Vasily Averin wrote:
> When cgroup_mkdir reaches the limits of the cgroup hierarchy, it should
> not return -EAGAIN, but instead react similarly to reaching the global
> limit.

While I'm not necessarily against this change, I find the rationale to
be somewhat lacking. Can you please elaborate why -ENOSPC is the right
one while -EAGAIN is incorrect?

Thanks.

-- 
tejun


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-27  2:12 ` [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached Vasily Averin
  2022-06-27  3:33   ` Muchun Song
  2022-06-27  9:07   ` Tejun Heo
@ 2022-06-28  0:44   ` Roman Gushchin
  2022-06-28  3:59     ` Vasily Averin
  2 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2022-06-28  0:44 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Shakeel Butt, Michal Koutný, Michal Hocko, Tejun Heo,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On Mon, Jun 27, 2022 at 05:12:55AM +0300, Vasily Averin wrote:
> When cgroup_mkdir reaches the limits of the cgroup hierarchy, it should
> not return -EAGAIN, but instead react similarly to reaching the global
> limit.
> 
> Signed-off-by: Vasily Averin <vvs@openvz.org>
> ---
>  kernel/cgroup/cgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 1be0f81fe8e1..243239553ea3 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -5495,7 +5495,7 @@ int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode)
>  		return -ENODEV;
>  
>  	if (!cgroup_check_hierarchy_limits(parent)) {
> -		ret = -EAGAIN;
> +		ret = -ENOSPC;

I'd not argue whether ENOSPC is better or worse here, but I don't think we need
to change it now. It's been in this state for a long time and is a part of ABI.
EAGAIN is pretty unique as a mkdir() result, so systemd can handle it well.

Thanks!


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-28  0:44   ` Roman Gushchin
@ 2022-06-28  3:59     ` Vasily Averin
  2022-06-28  9:16       ` Michal Koutný
  0 siblings, 1 reply; 10+ messages in thread
From: Vasily Averin @ 2022-06-28  3:59 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Shakeel Butt, Michal Koutný, Michal Hocko, Tejun Heo,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On 6/28/22 03:44, Roman Gushchin wrote:
> On Mon, Jun 27, 2022 at 05:12:55AM +0300, Vasily Averin wrote:
>> When cgroup_mkdir reaches the limits of the cgroup hierarchy, it should
>> not return -EAGAIN, but instead react similarly to reaching the global
>> limit.
>>
>> Signed-off-by: Vasily Averin <vvs@openvz.org>
>> ---
>>  kernel/cgroup/cgroup.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index 1be0f81fe8e1..243239553ea3 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -5495,7 +5495,7 @@ int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode)
>>  		return -ENODEV;
>>  
>>  	if (!cgroup_check_hierarchy_limits(parent)) {
>> -		ret = -EAGAIN;
>> +		ret = -ENOSPC;
> 
> I'd not argue whether ENOSPC is better or worse here, but I don't think we need
> to change it now. It's been in this state for a long time and is a part of ABI.
> EAGAIN is pretty unique as a mkdir() result, so systemd can handle it well.

I would agree with you, however in my opinion EAGAIN is used to restart an
interrupted system call. Thus, I worry its return can loop the user space without
any chance of continuation.

However, maybe I'm confusing something?

Thank you,
	Vasily Averin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-28  3:59     ` Vasily Averin
@ 2022-06-28  9:16       ` Michal Koutný
  2022-06-28  9:22         ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Koutný @ 2022-06-28  9:16 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Roman Gushchin, Shakeel Butt, Michal Hocko, Tejun Heo, Zefan Li,
	Johannes Weiner, kernel, linux-kernel, Andrew Morton, linux-mm,
	Vlastimil Babka, Muchun Song, cgroups

On Tue, Jun 28, 2022 at 06:59:06AM +0300, Vasily Averin <vvs@openvz.org> wrote:
> I would agree with you, however in my opinion EAGAIN is used to restart an
> interrupted system call. Thus, I worry its return can loop the user space without
> any chance of continuation.
> 
> However, maybe I'm confusing something?

The mkdir(2) manpage doesn't list EAGAIN at all. ENOSPC makes better
sense here. (And I suspect the dependency on this particular value won't
be very wide spread.)

0.02€
Michal


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-28  9:16       ` Michal Koutný
@ 2022-06-28  9:22         ` Tejun Heo
  2022-06-29  6:13           ` Vasily Averin
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2022-06-28  9:22 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Vasily Averin, Roman Gushchin, Shakeel Butt, Michal Hocko,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On Tue, Jun 28, 2022 at 11:16:48AM +0200, Michal Koutný wrote:
> The mkdir(2) manpage doesn't list EAGAIN at all. ENOSPC makes better
> sense here. (And I suspect the dependency on this particular value won't
> be very wide spread.)

Given how we use these system calls as triggers for random kernel
operations, I don't think adhering to posix standard is necessary or
possible. Using an error code which isn't listed in the man page isn't
particularly high in the list of discrepancies.

Again, I'm not against changing it but I'd like to see better
rationales. On one side, we have "it's been this way for a long time
and there's nothing particularly broken about it". I'm not sure the
arguments we have for the other side is strong enough yet.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-28  9:22         ` Tejun Heo
@ 2022-06-29  6:13           ` Vasily Averin
  2022-06-29 19:25             ` Tejun Heo
  0 siblings, 1 reply; 10+ messages in thread
From: Vasily Averin @ 2022-06-29  6:13 UTC (permalink / raw)
  To: Tejun Heo, Michal Koutný
  Cc: Roman Gushchin, Shakeel Butt, Michal Hocko, Zefan Li,
	Johannes Weiner, kernel, linux-kernel, Andrew Morton, linux-mm,
	Vlastimil Babka, Muchun Song, cgroups

On 6/28/22 12:22, Tejun Heo wrote:
> On Tue, Jun 28, 2022 at 11:16:48AM +0200, Michal Koutný wrote:
>> The mkdir(2) manpage doesn't list EAGAIN at all. ENOSPC makes better
>> sense here. (And I suspect the dependency on this particular value won't
>> be very wide spread.)
> 
> Given how we use these system calls as triggers for random kernel
> operations, I don't think adhering to posix standard is necessary or
> possible. Using an error code which isn't listed in the man page isn't
> particularly high in the list of discrepancies.
> 
> Again, I'm not against changing it but I'd like to see better
> rationales. On one side, we have "it's been this way for a long time
> and there's nothing particularly broken about it". I'm not sure the
> arguments we have for the other side is strong enough yet.

I would like to recall this patch.

I experimented on fedora36 node with LXC and centos stream 9 container.
and I did not noticed any critical systemd troubles with original -EAGAIN.
When cgroup's limit is reached systemd cannot start new services, 
for example lxc-attach generates following output:

[root@fc34-vvs ~]# lxc-attach c9s
lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 Resource temporarily unavailable - Failed to create leaf cgroup ".lxc"
lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 Resource temporarily unavailable - Failed to attach to cgroup fd 11
lxc-attach: c9s: attach.c: lxc_attach: 1679 Resource temporarily unavailable - Failed to attach cgroup
lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container

I did not found any loop in userspace caused by EAGAIN.
Messages looks unclear, however situation with the patched kernel is not much better:

[root@fc34-vvs ~]# lxc-attach c9s
lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 No space left on device - Failed to create leaf cgroup ".lxc"
lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 No space left on device - Failed to attach to cgroup fd 11
lxc-attach: c9s: attach.c: lxc_attach: 1679 No space left on device - Failed to attach cgroup
lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container

Thank you,
	Vasily Averin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-29  6:13           ` Vasily Averin
@ 2022-06-29 19:25             ` Tejun Heo
  2022-07-01  2:42               ` Roman Gushchin
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2022-06-29 19:25 UTC (permalink / raw)
  To: Vasily Averin
  Cc: Michal Koutný, Roman Gushchin, Shakeel Butt, Michal Hocko,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On Wed, Jun 29, 2022 at 09:13:02AM +0300, Vasily Averin wrote:
> I experimented on fedora36 node with LXC and centos stream 9 container.
> and I did not noticed any critical systemd troubles with original -EAGAIN.
> When cgroup's limit is reached systemd cannot start new services, 
> for example lxc-attach generates following output:
> 
> [root@fc34-vvs ~]# lxc-attach c9s
> lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 Resource temporarily unavailable - Failed to create leaf cgroup ".lxc"
> lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 Resource temporarily unavailable - Failed to attach to cgroup fd 11
> lxc-attach: c9s: attach.c: lxc_attach: 1679 Resource temporarily unavailable - Failed to attach cgroup
> lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container
> 
> I did not found any loop in userspace caused by EAGAIN.
> Messages looks unclear, however situation with the patched kernel is not much better:
> 
> [root@fc34-vvs ~]# lxc-attach c9s
> lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 No space left on device - Failed to create leaf cgroup ".lxc"
> lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 No space left on device - Failed to attach to cgroup fd 11
> lxc-attach: c9s: attach.c: lxc_attach: 1679 No space left on device - Failed to attach cgroup
> lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container

I'd say "resource temporarily unavailable" is better fitting than "no
space left on device" and the syscall restart thing isn't handled by
-EAGAIN return value. Grep restart_block for that.

Thanks.

-- 
tejun


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached
  2022-06-29 19:25             ` Tejun Heo
@ 2022-07-01  2:42               ` Roman Gushchin
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Gushchin @ 2022-07-01  2:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Vasily Averin, Michal Koutný, Shakeel Butt, Michal Hocko,
	Zefan Li, Johannes Weiner, kernel, linux-kernel, Andrew Morton,
	linux-mm, Vlastimil Babka, Muchun Song, cgroups

On Thu, Jun 30, 2022 at 04:25:57AM +0900, Tejun Heo wrote:
> On Wed, Jun 29, 2022 at 09:13:02AM +0300, Vasily Averin wrote:
> > I experimented on fedora36 node with LXC and centos stream 9 container.
> > and I did not noticed any critical systemd troubles with original -EAGAIN.
> > When cgroup's limit is reached systemd cannot start new services, 
> > for example lxc-attach generates following output:
> > 
> > [root@fc34-vvs ~]# lxc-attach c9s
> > lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 Resource temporarily unavailable - Failed to create leaf cgroup ".lxc"
> > lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 Resource temporarily unavailable - Failed to attach to cgroup fd 11
> > lxc-attach: c9s: attach.c: lxc_attach: 1679 Resource temporarily unavailable - Failed to attach cgroup
> > lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> > lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container
> > 
> > I did not found any loop in userspace caused by EAGAIN.
> > Messages looks unclear, however situation with the patched kernel is not much better:
> > 
> > [root@fc34-vvs ~]# lxc-attach c9s
> > lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 No space left on device - Failed to create leaf cgroup ".lxc"
> > lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 No space left on device - Failed to attach to cgroup fd 11
> > lxc-attach: c9s: attach.c: lxc_attach: 1679 No space left on device - Failed to attach cgroup
> > lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> > lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container
> 
> I'd say "resource temporarily unavailable" is better fitting than "no
> space left on device"

+1

Thanks!


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-07-01  2:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <186d5b5b-a082-3814-9963-bf57dfe08511@openvz.org>
2022-06-27  2:12 ` [PATCH cgroup] cgroup: set the correct return code if hierarchy limits are reached Vasily Averin
2022-06-27  3:33   ` Muchun Song
2022-06-27  9:07   ` Tejun Heo
2022-06-28  0:44   ` Roman Gushchin
2022-06-28  3:59     ` Vasily Averin
2022-06-28  9:16       ` Michal Koutný
2022-06-28  9:22         ` Tejun Heo
2022-06-29  6:13           ` Vasily Averin
2022-06-29 19:25             ` Tejun Heo
2022-07-01  2:42               ` Roman Gushchin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).