The Linux Kernel Mailing List
* [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low
@ 2022-06-27  6:50 Zhang Qiao
  2022-06-27  8:32 ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: Zhang Qiao @ 2022-06-27  6:50 UTC (permalink / raw)
  To: Tejun Heo, mingo, peterz, Juri Lelli, Vincent Guittot
  Cc: lizefan.x, hannes, cgroups, lkml, vschneid, dietmar.eggemann,
	bristot, bsegall, Steven Rostedt, mgorman

Hi all,

I'm debugging a problem.
The test case performs the following operations:
1) Create a test task cgroup and set cpu.cfs_quota_us=2000 and cpu.cfs_period_us=100000.
2) Run 20 test_fork[1] processes in the test task cgroup.
3) Create 100 new containers:
   for i in {1..100}; do docker run -itd  --health-cmd="ls" --health-interval=1s ubuntu:latest  bash; done
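
For reference, step 1 could be sketched in C roughly as below. This is only a minimal sketch, assuming a cgroup v1 cpu controller; the helper names and the directory you pass in (for example something like /sys/fs/cgroup/cpu/test) are hypothetical and not taken from my actual setup. It also shows what the two values mean: 2000us of quota per 100000us period is 2% of one CPU.

```c
#include <stdio.h>

/* Hypothetical helper: write a decimal value into <dir>/<file>.
 * Returns 0 on success, -1 on failure. */
int cg_write_long(const char *dir, const char *file, long value)
{
    char path[4096];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s", dir, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fprintf(f, "%ld\n", value) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Apply the limits from step 1 to an already created cgroup directory. */
int cg_set_cfs_bandwidth(const char *dir, long quota_us, long period_us)
{
    if (cg_write_long(dir, "cpu.cfs_period_us", period_us) != 0)
        return -1;
    return cg_write_long(dir, "cpu.cfs_quota_us", quota_us);
}

/* Fraction of one CPU the group may use in each period. */
double cfs_cpu_fraction(long quota_us, long period_us)
{
    return (double)quota_us / (double)period_us;
}
```

Calling cg_set_cfs_bandwidth(dir, 2000, 100000) on the test cgroup directory would apply the limits above, and cfs_cpu_fraction(2000, 100000) gives 0.02.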

These operations are expected to succeed, with all 100 containers created. However, while the containers are
being created, the system gets stuck and container creation fails.

After debugging, I found that the test_fork processes frequently sleep in freezer_fork()->mutex_lock()->might_sleep()
while holding the cgroup_threadgroup_rw_sem lock, as follows:

copy_process():
	cgroup_can_fork()			---> lock cgroup_threadgroup_rw_sem
	sched_cgroup_fork();
	  ->task_fork_fair(){
	      ->update_curr(){
		  ->__account_cfs_rq_runtime() {
			resched_curr();		---> the quota is used up, so TIF_NEED_RESCHED is set on current
		   }
	cgroup_post_fork();   		
	  ->freezer_fork()
	      ->mutex_lock() {	
		  ->might_sleep()  		---> calls schedule(); the current task is throttled for a long time.

	  ->cgroup_css_set_put_fork()    	---> unlock cgroup_threadgroup_rw_sem
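
To get a feeling for how long the lock can stay held in the path above, here is a rough back-of-the-envelope model in C. It is only a sketch under simplifying assumptions that are mine and not taken from the kernel code: the group gets exactly quota_us of CPU time per period_us, and pending_us is the CPU time the group still has to consume before the lock holder reaches the unlock. The function name and parameters are hypothetical.

```c
/* Rough estimate, in microseconds of wall-clock time, of how long a group
 * limited to quota_us per period_us needs to consume pending_us of CPU time.
 * Returns -1 for invalid arguments. */
long cfs_wall_time_estimate_us(long pending_us, long quota_us, long period_us)
{
    long periods;

    if (pending_us < 0 || quota_us <= 0 || period_us <= 0)
        return -1;

    /* Number of quota refills needed, rounded up. */
    periods = (pending_us + quota_us - 1) / quota_us;
    return periods * period_us;
}
```

With the values from my test case, 10ms of pending CPU work in the group (pending_us=10000) needs 5 periods, so the estimate is about 0.5s of wall-clock time during which the lock may stay held.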


Because the task cgroup's cpu.cfs_quota_us is very small and the test_fork load is very heavy, a test_fork
task may be throttled for a long time. As a result, the cgroup_threadgroup_rw_sem read lock is held for a long
time, and other processes get stuck waiting for the lock:

1) a task forking a child waits at copy_process()->cgroup_can_fork();

2) an exiting task waits at exit_signals();

3) a task writing the cgroup.procs file waits at cgroup_file_write()->__cgroup1_procs_write();
...

Eventually, the whole system can get stuck.
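
One way to confirm that the test cgroup really is being throttled is to look at its cpu.stat file, which on cgroup v1 reports nr_periods, nr_throttled and throttled_time (in nanoseconds). Below is a minimal sketch of a parser for that text in C. The struct and function names are hypothetical, and reading the file from the cgroup directory is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three cgroup v1 cpu.stat fields. */
struct cfs_throttle_stat {
    long long nr_periods;
    long long nr_throttled;
    long long throttled_time_ns;
};

/* Parse the text of a cgroup v1 cpu.stat file.
 * Returns 0 if all three fields were found, -1 otherwise. */
int parse_cpu_stat(const char *text, struct cfs_throttle_stat *out)
{
    const char *line = text;
    int found = 0;

    while (line && *line) {
        char key[64];
        long long val;

        /* Each line has the form "<key> <value>". */
        if (sscanf(line, "%63s %lld", key, &val) == 2) {
            if (strcmp(key, "nr_periods") == 0) {
                out->nr_periods = val;
                found |= 1;
            } else if (strcmp(key, "nr_throttled") == 0) {
                out->nr_throttled = val;
                found |= 2;
            } else if (strcmp(key, "throttled_time") == 0) {
                out->throttled_time_ns = val;
                found |= 4;
            }
        }

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return found == 7 ? 0 : -1;
}
```

If nr_throttled is close to nr_periods, the group runs out of quota in almost every period, which matches the behaviour described above.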

Does anyone know how to solve this, other than by changing cpu.cfs_quota_us?


[1] test_fork.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid;
    int count = 20;

    while (1) {
        /* Fork a batch of children that exit immediately. */
        for (int i = 0; i < count; i++) {
            if ((pid = fork()) < 0) {
                perror("fork");
                return 1;
            } else if (pid == 0) {
                exit(0);
            }
        }

        /* Reap the whole batch, then pause before the next round. */
        for (int i = 0; i < count; i++) {
            wait(NULL);
        }
        sleep(1);
    }
    return 0;
}

Thanks a lot.
-Qiao


end of thread, other threads:[~2022-07-07  7:00 UTC | newest]

Thread overview: 6+ messages
2022-06-27  6:50 [Question] The system may be stuck if there is a cpu cgroup cpu.cfs_quato_us is very low Zhang Qiao
2022-06-27  8:32 ` Tejun Heo
2022-07-01  7:34   ` Zhang Qiao
2022-07-01 20:08     ` Benjamin Segall
2022-07-01 20:15       ` Tejun Heo
2022-07-07  6:59         ` Zhang Qiao
