[Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato

* [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
@ 2022-06-27  6:50 ` Zhang Qiao
  0 siblings, 0 replies; 12+ messages in thread
From: Zhang Qiao @ 2022-06-27  6:50 UTC (permalink / raw)
  To: Tejun Heo, mingo, peterz, Juri Lelli, Vincent Guittot
  Cc: lizefan.x, hannes, cgroups, lkml, vschneid, dietmar.eggemann,
	bristot, bsegall, Steven Rostedt, mgorman

Hi all,

I'm debugging a problem.
The test case performs the following operations:
1) create a test task cgroup and set cpu.cfs_quota_us=2000 and cpu.cfs_period_us=100000.
2) run 20 test_fork[1] test processes in the test task cgroup.
3) create 100 new containers:
   for i in {1..100}; do docker run -itd  --health-cmd="ls" --health-interval=1s ubuntu:latest  bash; done
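
Step 1 can be sketched in C as below. This is only an illustration: it assumes a cgroup v1 cpu controller whose group directory already exists (for example a hypothetical /sys/fs/cgroup/cpu/test), and the helper names write_cgroup_value() and set_test_cpu_limit() are made up for this sketch.

```c
#include <stdio.h>

/*
 * Write one integer setting into a control file below cgroup_dir.
 * Returns 0 on success, -1 on failure.
 */
int write_cgroup_value(const char *cgroup_dir, const char *file, long value)
{
    char path[4096];
    FILE *fp;
    int n, ok;

    n = snprintf(path, sizeof(path), "%s/%s", cgroup_dir, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path would not fit in the buffer */

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    ok = fprintf(fp, "%ld\n", value) > 0;
    if (fclose(fp) != 0)        /* a rejected write may only show up here */
        ok = 0;
    return ok ? 0 : -1;
}

/* Apply the limits from step 1: 2000us of CPU time per 100000us period. */
int set_test_cpu_limit(const char *cgroup_dir)
{
    if (write_cgroup_value(cgroup_dir, "cpu.cfs_period_us", 100000) != 0)
        return -1;
    return write_cgroup_value(cgroup_dir, "cpu.cfs_quota_us", 2000);
}
```

Calling set_test_cpu_limit() with the group's directory (as root) would apply both values; the test_fork processes still have to be moved into the group separately.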

These operations are expected to succeed, creating all 100 containers. However, while the containers are being created,
the system gets stuck and container creation fails.

After debugging this, I found that the test_fork process frequently sleeps in freezer_fork()->mutex_lock()->might_sleep()
while holding the cgroup_threadgroup_rw_sem lock, as follows:

copy_process():
	cgroup_can_fork()			---> lock cgroup_threadgroup_rw_sem
	sched_cgroup_fork()
	  ->task_fork_fair() {
	      ->update_curr() {
		  ->__account_cfs_rq_runtime() {
			resched_curr();		---> the quota is used up, so TIF_NEED_RESCHED is set on current
		  }
	      }
	  }
	cgroup_post_fork()
	  ->freezer_fork()
	      ->mutex_lock() {
		  ->might_sleep()		---> schedule(); the current task is throttled for a long time
	      }
	  ->cgroup_css_set_put_fork()		---> unlock cgroup_threadgroup_rw_sem


Because the task cgroup's cpu.cfs_quota_us is very small and test_fork's load is very heavy, test_fork
may be throttled for a long time. The cgroup_threadgroup_rw_sem read lock is therefore held for a long time, and other
processes get stuck waiting for the lock:

1) a task forking a child waits at copy_process()->cgroup_can_fork();

2) an exiting task waits at exit_signals();

3) a task writing the cgroup.procs file waits at cgroup_file_write()->__cgroup1_procs_write();
...

Eventually the whole system can get stuck.
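
To put the quota from the test case into numbers, here is a rough back-of-the-envelope sketch. It is a simplified model of my own, not how the scheduler accounts runtime: it assumes the lock holder gets the group's entire quota in every period, which is optimistic with 20 test_fork processes competing. The function names are hypothetical.

```c
#include <stdint.h>

/* Share of one CPU the group may use, in percent. */
double quota_cpu_percent(uint64_t quota_us, uint64_t period_us)
{
    return 100.0 * (double)quota_us / (double)period_us;
}

/*
 * Minimum number of periods needed to accumulate work_us of CPU time
 * when at most quota_us is available per period (ceiling division).
 */
uint64_t min_periods_for_work(uint64_t work_us, uint64_t quota_us)
{
    return (work_us + quota_us - 1) / quota_us;
}
```

With quota_us=2000 and period_us=100000 the group is limited to 2% of one CPU. Under this model, a lock holder that still needs 10000us of CPU time to leave its critical section needs at least 5 periods to get it, and everything waiting on the lock waits along with it.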

Does anyone know how to solve this, other than by changing cpu.cfs_quota_us?


[1] test_fork.c

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid;
    int count = 20;

    /* Fork and reap `count` short-lived children, then sleep one second. */
    while (1) {
        for (int i = 0; i < count; i++) {
            if ((pid = fork()) < 0) {
                perror("fork");
                return 1;
            } else if (pid == 0) {
                /* Child: exit immediately. */
                exit(0);
            }
        }

        /* Parent: reap every child before starting the next round. */
        for (int i = 0; i < count; i++) {
            wait(NULL);
        }
        sleep(1);
    }
    return 0;
}

Thanks a lot.
-Qiao
-

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
  2022-06-27  6:50 ` Zhang Qiao
@ 2022-06-27  8:32     ` Tejun Heo
  -1 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2022-06-27  8:32 UTC (permalink / raw)
  To: Zhang Qiao
  Cc: mingo, peterz, Juri Lelli, Vincent Guittot, lizefan.x, hannes,
	cgroups, lkml, vschneid, dietmar.eggemann, bristot, bsegall,
	Steven Rostedt, mgorman

Hello,

On Mon, Jun 27, 2022 at 02:50:25PM +0800, Zhang Qiao wrote:
> Because the task cgroup's cpu.cfs_quota_us is very small and
> test_fork's load is very heavy, the test_fork may be throttled long
> time, therefore, the cgroup_threadgroup_rw_sem read lock is held for
> a long time, other processes will get stuck waiting for the lock:

Yeah, this is a known problem and can happen with other locks too. The
solution prolly is only throttling while in or when about to return to
userspace. There is one really important and wide-spread assumption in
the kernel:

  If things get blocked on some shared resource, whatever is holding
  the resource ends up using more of the system to exit the critical
  section faster and thus unblocks others ASAP. IOW, things running in
  kernel are work-conserving.

The cpu bw controller gives the userspace a rather easy way to break
this assumption and thus is rather fundamentally broken. This is
basically the same problem we had with the old cgroup freezer
implementation which trapped threads in random locations in the
kernel.

So, right now, it's rather broken and can easily be used as a DoS
attack vector.
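
The idea of throttling only in, or on the way back to, userspace can be illustrated with a toy userspace model. This is a sketch under stated assumptions and not kernel code: struct toy_task, its fields, and both functions are invented here purely to show the decision being described.

```c
#include <stdbool.h>

/* Toy task state; the field names are illustrative, not kernel structures. */
struct toy_task {
    bool in_kernel;         /* currently executing kernel code */
    bool throttle_pending;  /* quota ran out while in the kernel */
    bool throttled;         /* actually taken off the CPU */
};

/* Called when the task's group runs out of quota. */
void toy_quota_exhausted(struct toy_task *t)
{
    if (t->in_kernel)
        t->throttle_pending = true; /* defer: it may hold kernel locks */
    else
        t->throttled = true;        /* in userspace: stop it right away */
}

/* Called on the return-to-userspace path. */
void toy_return_to_user(struct toy_task *t)
{
    t->in_kernel = false;
    if (t->throttle_pending) {
        /* No kernel locks are expected to be held at this point. */
        t->throttle_pending = false;
        t->throttled = true;
    }
}
```

In this model a task that exhausts its quota inside copy_process() would keep running until it leaves the kernel, so the time it holds a lock such as cgroup_threadgroup_rw_sem no longer depends on the quota.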

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
  2022-06-27  8:32     ` Tejun Heo
@ 2022-07-01  7:34         ` Zhang Qiao
  -1 siblings, 0 replies; 12+ messages in thread
From: Zhang Qiao @ 2022-07-01  7:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: mingo, peterz, Juri Lelli, Vincent Guittot, lizefan.x, hannes,
	cgroups, lkml, vschneid, dietmar.eggemann, bristot, bsegall,
	Steven Rostedt, mgorman


Hi Tejun,

Thanks for your reply.

On 2022/6/27 16:32, Tejun Heo wrote:
> Hello,
> 
> On Mon, Jun 27, 2022 at 02:50:25PM +0800, Zhang Qiao wrote:
>> Because the task cgroup's cpu.cfs_quota_us is very small and
>> test_fork's load is very heavy, the test_fork may be throttled long
>> time, therefore, the cgroup_threadgroup_rw_sem read lock is held for
>> a long time, other processes will get stuck waiting for the lock:
> 
> Yeah, this is a known problem and can happen with other locks too. The
> solution prolly is only throttling while in or when about to return to
> userspace. There is one really important and wide-spread assumption in
> the kernel:
> 
>   If things get blocked on some shared resource, whatever is holding
>   the resource ends up using more of the system to exit the critical
>   section faster and thus unblocks others ASAP. IOW, things running in
>   kernel are work-conserving.
> 
> The cpu bw controller gives the userspace a rather easy way to break
> this assumption and thus is rather fundamentally broken. This is
> basically the same problem we had with the old cgroup freezer
> implementation which trapped threads in random locations in the
> kernel.
> 

So, if we want to solve this problem completely, is the best way to
change the CFS bandwidth controller's throttle mechanism, for example
to throttle tasks only at a safe location?

Thanks.
    Qiao

> So, right now, it's rather broken and can easily be used as a DoS
> attack vector.
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
  2022-07-01  7:34         ` Zhang Qiao
@ 2022-07-01 20:08             ` Benjamin Segall
  -1 siblings, 0 replies; 12+ messages in thread
From: Benjamin Segall @ 2022-07-01 20:08 UTC (permalink / raw)
  To: Zhang Qiao
  Cc: Tejun Heo, mingo, peterz, Juri Lelli, Vincent Guittot, lizefan.x,
	hannes, cgroups, lkml, vschneid, dietmar.eggemann, bristot,
	Steven Rostedt, mgorman

Zhang Qiao <zhangqiao22@huawei.com> writes:

> Hi, tejun
>
> Thanks for your reply.
>
> On 2022/6/27 16:32, Tejun Heo wrote:
>> Hello,
>> 
>> On Mon, Jun 27, 2022 at 02:50:25PM +0800, Zhang Qiao wrote:
>>> Because the task cgroup's cpu.cfs_quota_us is very small and
>>> test_fork's load is very heavy, the test_fork may be throttled long
>>> time, therefore, the cgroup_threadgroup_rw_sem read lock is held for
>>> a long time, other processes will get stuck waiting for the lock:
>> 
>> Yeah, this is a known problem and can happen with other locks too. The
>> solution prolly is only throttling while in or when about to return to
>> userspace. There is one really important and wide-spread assumption in
>> the kernel:
>> 
>>   If things get blocked on some shared resource, whatever is holding
>>   the resource ends up using more of the system to exit the critical
>>   section faster and thus unblocks others ASAP. IOW, things running in
>>   kernel are work-conserving.
>> 
>> The cpu bw controller gives the userspace a rather easy way to break
>> this assumption and thus is rather fundamentally broken. This is
>> basically the same problem we had with the old cgroup freezer
>> implementation which trapped threads in random locations in the
>> kernel.
>> 
>
> so, if we want to completely solve this problem, is the best way to
> change the cfs bw controller throttle mechanism? for example, throttle
> tasks in a safe location.

Yes, fixing (kernel) priority inversion due to CFS_BANDWIDTH requires a
serious reworking of how it works, because it would need to dequeue
tasks individually rather than doing the entire cfs_rq at a time (and
would require some effort to avoid pinging every throttling task to get
it into the kernel).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
  2022-07-01 20:08             ` Benjamin Segall
@ 2022-07-01 20:15                 ` Tejun Heo
  -1 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2022-07-01 20:15 UTC (permalink / raw)
  To: Benjamin Segall
  Cc: Zhang Qiao, mingo, peterz, Juri Lelli, Vincent Guittot, lizefan.x,
	hannes, cgroups, lkml, vschneid, dietmar.eggemann, bristot,
	Steven Rostedt, mgorman

On Fri, Jul 01, 2022 at 01:08:21PM -0700, Benjamin Segall wrote:
> Yes, fixing (kernel) priority inversion due to CFS_BANDWIDTH requires a
> serious reworking of how it works, because it would need to dequeue
> tasks individually rather than doing the entire cfs_rq at a time (and
> would require some effort to avoid pinging every throttling task to get
> it into the kernel).

Right, I don't have a good idea on evolving the current implementation
into something correct. As you pointed out, we need to account along
the sched_group tree but conditionally enforce on each thread.
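
One way to picture "account along the tree, enforce per thread" is the toy model below. It is only an illustration of the separation being described, not a proposal for the scheduler: struct toy_group and all three functions are hypothetical and do not correspond to the kernel's task_group or cfs_rq.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy group node; NOT the kernel's task_group or cfs_rq. */
struct toy_group {
    struct toy_group *parent;   /* NULL for the root */
    uint64_t quota_us;          /* 0 means "no limit" in this sketch */
    uint64_t used_us;           /* runtime charged in the current period */
};

/* Accounting: charge runtime to the group and to every ancestor. */
void toy_charge(struct toy_group *g, uint64_t delta_us)
{
    for (; g != NULL; g = g->parent)
        g->used_us += delta_us;
}

/* True if the group or any of its ancestors has used up its quota. */
bool toy_over_quota(const struct toy_group *g)
{
    for (; g != NULL; g = g->parent)
        if (g->quota_us != 0 && g->used_us >= g->quota_us)
            return true;
    return false;
}

/* Enforcement: decided per thread, and only while it is outside the kernel. */
bool toy_should_throttle(const struct toy_group *g, bool thread_in_kernel)
{
    return toy_over_quota(g) && !thread_in_kernel;
}
```

Here the charge always propagates up the hierarchy, while the throttle decision is taken for one thread at a time, so a thread that is still inside the kernel keeps running even though its group is over quota.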

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
  2022-07-01 20:15                 ` Tejun Heo
@ 2022-07-07  6:59                     ` Zhang Qiao
  -1 siblings, 0 replies; 12+ messages in thread
From: Zhang Qiao @ 2022-07-07  6:59 UTC (permalink / raw)
  To: Tejun Heo, Benjamin Segall
  Cc: mingo, peterz, Juri Lelli, Vincent Guittot, lizefan.x, hannes,
	cgroups, lkml, vschneid, dietmar.eggemann, bristot,
	Steven Rostedt, mgorman



On 2022/7/2 4:15, Tejun Heo wrote:
> On Fri, Jul 01, 2022 at 01:08:21PM -0700, Benjamin Segall wrote:
>> Yes, fixing (kernel) priority inversion due to CFS_BANDWIDTH requires a
>> serious reworking of how it works, because it would need to dequeue
>> tasks individually rather than doing the entire cfs_rq at a time (and
>> would require some effort to avoid pinging every throttling task to get
>> it into the kernel).
> 
> Right, I don't have a good idea on evolving the current implementation
> into something correct. As you pointed out, we need to account along
> the sched_group tree but conditionally enforce on each thread.
> 
> Thanks.
> 

Understood. Thanks for your detailed explanation.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread
