From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754358AbcBHNkz (ORCPT ); Mon, 8 Feb 2016 08:40:55 -0500
Received: from mx1.redhat.com ([209.132.183.28]:54463 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753597AbcBHNkw (ORCPT ); Mon, 8 Feb 2016 08:40:52 -0500
Subject: Re: [BUG] scheduler doesn't balance thread to idle cpu for 3 seconds
To: Peter Zijlstra
References: <56A8D994.6050205@redhat.com> <56AA39D6.4070509@redhat.com>
 <20160128174903.GV6356@twins.programming.kicks-ass.net>
 <333246323.13611103.1454006593261.JavaMail.zimbra@redhat.com>
 <20160129101522.GF6357@twins.programming.kicks-ass.net>
 <654964868.14006956.1454063625314.JavaMail.zimbra@redhat.com>
Cc: alex shi , guz fnst , mingo@redhat.com, jolsa@redhat.com, riel@redhat.com, linux-kernel@vger.kernel.org
From: Jan Stancek
Message-ID: <56B89AE0.9090603@redhat.com>
Date: Mon, 8 Feb 2016 14:40:48 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <654964868.14006956.1454063625314.JavaMail.zimbra@redhat.com>
Content-Type: multipart/mixed; boundary="------------050005050204090301040905"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------050005050204090301040905
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 01/29/2016 11:33 AM, Jan Stancek wrote:
>>
>> Also note that I don't think failing this test is a bug per se.
>> Undesirable maybe, but within spec, since SIGALRM is process wide, so it
>> being delivered to the SCHED_OTHER task is accepted, and SCHED_OTHER has
>> no timeliness guarantees.
>>
>> That said; if I could reliably reproduce I'd have a go at fixing this, I
>> suspect there's a 'fun' problem at the bottom of this.
>
> Thanks for trying, I'll see if I can find some more reliable way.

I think I have found a more reliable way, however it requires an older
stable kernel: 3.12.53 up to 4.1.17.

Consider the following scenario:
- all tasks on the system have RT sched class
- main thread of the reproducer becomes the only SCHED_OTHER task on the
  system
- when alarm(2) expires, main thread is woken up on a cpu that is occupied
  by a busy looping RT thread (low_priority_thread)
- because main thread was sleeping for 2 seconds, its load has decayed to 0
- the only chance for main thread to run is if it gets balanced to an
  idle CPU
- task_tick_fair() doesn't run, because there is an RT task running on
  this CPU
- main thread is on the cfs run queue but its load stays 0
- load balancer never sees this CPU (group) as busy

Attached are the reproducer and a script, which tries to trigger the
scenario above. I can reproduce it with 4.1.17 on baremetal 4 CPU x86_64
with about 1:50 chance. In this setup the failure state persists for a
long time, perhaps indefinitely. I tried extending RUNTIME to 10 minutes,
main thread still wouldn't run.

One more clue: I could work around this issue if I forced an
update_entity_load_avg() on sched_entities that have not been updated
for some time, as part of the periodic rebalance_domains() call.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c7c1d28..1b5fe80 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5264,6 +5264,7 @@ static void update_blocked_averages(int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq;
 	unsigned long flags;
+	struct rb_node *rb;
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
 	update_rq_clock(rq);
@@ -5281,6 +5282,19 @@ static void update_blocked_averages(int cpu)
 	}
 
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
+
+	cfs_rq = &(cpu_rq(cpu)->cfs);
+	for (rb = rb_first_postorder(&cfs_rq->tasks_timeline); rb; rb = rb_next_postorder(rb)) {
+		struct sched_entity *se = rb_entry(rb, struct sched_entity, run_node);
+
+		// Task on rq has not been updated for 500ms :-(
+		if ((cfs_rq_clock_task(cfs_rq) - se->avg.last_runnable_update) > 500L * (1 << 20))
+			update_entity_load_avg(se, 1);
+	}
 }
 
 /*

Regards,
Jan

--------------050005050204090301040905
Content-Type: text/plain; charset=UTF-8; name="pthread_cond_wait_1_v3.c"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="pthread_cond_wait_1_v3.c"

LyoKICogcmVwcm9kdWNlciB2MyBmb3I6CiAqIFtCVUddIHNjaGVkdWxlciBkb2Vzbid0IGJh bGFuY2UgdGhyZWFkIHRvIGlkbGUgY3B1IGZvciAzIHNlY29uZHMKICoKICogQmFzZWQgb24g TFRQJ3MgcHRocmVhZF9jb25kX3dhaXRfMS5jCiAqCiAqLwoKI2RlZmluZSBfR05VX1NPVVJD RQojaW5jbHVkZSA8cHRocmVhZC5oPgojaW5jbHVkZSA8c2NoZWQuaD4KI2luY2x1ZGUgPHN0 ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1ZGUgPHNpZ25hbC5oPgojaW5jbHVk ZSA8aW50dHlwZXMuaD4KI2luY2x1ZGUgPHVuaXN0ZC5oPgojaW5jbHVkZSA8dGltZS5oPgoj aW5jbHVkZSA8c3lzL3RpbWUuaD4KI2luY2x1ZGUgPHN5cy9yZXNvdXJjZS5oPgoKI2RlZmlu ZSBFUlJPUl9QUkVGSVggInVuZXhwZWN0ZWQgZXJyb3I6ICIKCiNkZWZpbmUgSElHSF9QUklP UklUWSAxMAojZGVmaW5lIExPV19QUklPUklUWSAgNQojZGVmaW5lIFJVTlRJTUUgICAgICAg NQojZGVmaW5lIFBPTElDWSAgICAgICAgU0NIRURfUlIKCiNkZWZpbmUgUFRTX1BBU1MgMAoj ZGVmaW5lIFBUU19GQUlMIDEKI2RlZmluZSBQVFNfVU5SRVNPTFZFRCAyCgpwdGhyZWFkX211 dGV4X3QgbXV0ZXggPSBQVEhSRUFEX01VVEVYX0lOSVRJQUxJWkVSOwpwdGhyZWFkX2NvbmRf
dCBjb25kID0gUFRIUkVBRF9DT05EX0lOSVRJQUxJWkVSOwoKLyogRmxhZ3MgdGhhdCB0aGUg dGhyZWFkcyB1c2UgdG8gaW5kaWNhdGUgZXZlbnRzICovCnZvbGF0aWxlIGludCB3b2tlbl91 cCA9IDA7CnZvbGF0aWxlIGludCBsb3dfZG9uZSA9IDA7CgovKiBTaWduYWwgaGFuZGxlciB0 aGF0IGhhbmRsZSB0aGUgQUxSTSBhbmQgd2FrZXMgdXAKICogdGhlIGhpZ2ggcHJpb3JpdHkg dGhyZWFkCiAqLwp2b2lkIHNpZ25hbF9oYW5kbGVyKGludCBzaWcpCnsKCSh2b2lkKSBzaWc7 CglpZiAocHRocmVhZF9jb25kX3NpZ25hbCgmY29uZCkgIT0gMCkgewoJCXByaW50ZihFUlJP Ul9QUkVGSVggInB0aHJlYWRfY29uZF9zaWduYWxcbiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xW RUQpOwoJfQp9CgovKiBVdGlsaXR5IGZ1bmN0aW9uIHRvIGZpbmQgZGlmZmVyZW5jZSBiZXR3 ZWVuIHR3byB0aW1lIHZhbHVlcyAqLwpmbG9hdCB0aW1lZGlmZihzdHJ1Y3QgdGltZXNwZWMg dDIsIHN0cnVjdCB0aW1lc3BlYyB0MSkKewoJZmxvYXQgZGlmZiA9IHQyLnR2X3NlYyAtIHQx LnR2X3NlYzsKCWRpZmYgKz0gKHQyLnR2X25zZWMgLSB0MS50dl9uc2VjKSAvIDEwMDAwMDAw MDAuMDsKCXJldHVybiBkaWZmOwp9Cgp2b2lkICpoaV9wcmlvcml0eV90aHJlYWQodm9pZCAq dG1wKQp7CglzdHJ1Y3Qgc2NoZWRfcGFyYW0gcGFyYW07CglpbnQgcG9saWN5OwoJaW50IHJj ID0gMDsKCgkodm9pZCkgdG1wOwoJcGFyYW0uc2NoZWRfcHJpb3JpdHkgPSBISUdIX1BSSU9S SVRZOwoKCXJjID0gcHRocmVhZF9zZXRzY2hlZHBhcmFtKHB0aHJlYWRfc2VsZigpLCBQT0xJ Q1ksICZwYXJhbSk7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVggInB0 aHJlYWRfc2V0c2NoZWRwYXJhbVxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZFRCk7Cgl9Cgly YyA9IHB0aHJlYWRfZ2V0c2NoZWRwYXJhbShwdGhyZWFkX3NlbGYoKSwgJnBvbGljeSwgJnBh cmFtKTsKCWlmIChyYyAhPSAwKSB7CgkJcHJpbnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9n ZXRzY2hlZHBhcmFtXG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCWlmICgocG9s aWN5ICE9IFBPTElDWSkgfHwgKHBhcmFtLnNjaGVkX3ByaW9yaXR5ICE9IEhJR0hfUFJJT1JJ VFkpKSB7CgkJcHJpbnRmKCJFcnJvcjogdGhlIHBvbGljeSBvciBwcmlvcml0eSBub3QgY29y cmVjdFxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZFRCk7Cgl9CgoJLyogSW5zdGFsbCBhIHNp Z25hbCBoYW5kbGVyIGZvciBBTFJNICovCglpZiAoc2lnbmFsKFNJR0FMUk0sIHNpZ25hbF9o YW5kbGVyKSAhPSAwKSB7CgkJcGVycm9yKEVSUk9SX1BSRUZJWCAic2lnbmFsOiIpOwoJCWV4 aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoKCS8qIGFjcXVpcmUgdGhlIG11dGV4ICovCglyYyA9 IHB0aHJlYWRfbXV0ZXhfbG9jaygmbXV0ZXgpOwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYo 
RVJST1JfUFJFRklYICJwdGhyZWFkX211dGV4X2xvY2tcbiIpOwoJCWV4aXQoUFRTX1VOUkVT T0xWRUQpOwoJfQoKCS8qIFNldHVwIGFuIGFsYXJtIHRvIGdvIG9mZiBpbiAyIHNlY29uZHMg Ki8KCWFsYXJtKDIpOwoKCS8qIEJsb2NrLCB0byBiZSB3b2tlbiB1cCBieSB0aGUgc2lnbmFs IGhhbmRsZXIgKi8KCXJjID0gcHRocmVhZF9jb25kX3dhaXQoJmNvbmQsICZtdXRleCk7Cglp ZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfY29uZF93YWl0 XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCgkvKiBUaGlzIHZhcmlhYmxlIGlz IHVucHJvdGVjdGVkIGJlY2F1c2UgdGhlIHNjaGVkdWxpbmcgcmVtb3ZlcwoJICogdGhlIGNv bnRlbnRpb24KCSAqLwoJaWYgKGxvd19kb25lICE9IDEpCgkJd29rZW5fdXAgPSAxOwoKCXJj ID0gcHRocmVhZF9tdXRleF91bmxvY2soJm11dGV4KTsKCWlmIChyYyAhPSAwKSB7CgkJcHJp bnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9tdXRleF91bmxvY2tcbiIpOwoJCWV4aXQoUFRT X1VOUkVTT0xWRUQpOwoJfQoJcmV0dXJuIE5VTEw7Cn0KCnZvaWQgKmxvd19wcmlvcml0eV90 aHJlYWQodm9pZCAqdG1wKQp7CglzdHJ1Y3QgdGltZXNwZWMgc3RhcnRfdGltZSwgY3VycmVu dF90aW1lOwoJc3RydWN0IHNjaGVkX3BhcmFtIHBhcmFtOwoJaW50IHBvbGljeTsKCWNwdV9z ZXRfdCBjcHVzZXQ7CglpbnQgcmMgPSAwLCBzbGVwdF90aW1lcyA9IDA7CglmbG9hdCBzbGVw dF9mb3IgPSAwOwoJdWludHB0cl90IHRudW0gPSAodWludHB0cl90KXRtcDsKCglwYXJhbS5z Y2hlZF9wcmlvcml0eSA9IExPV19QUklPUklUWTsKCglyYyA9IHB0aHJlYWRfc2V0c2NoZWRw YXJhbShwdGhyZWFkX3NlbGYoKSwgUE9MSUNZLCAmcGFyYW0pOwoJaWYgKHJjICE9IDApIHsK CQlwcmludGYoRVJST1JfUFJFRklYICJwdGhyZWFkX3NldHNjaGVkcGFyYW1cbiIpOwoJCWV4 aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoJcmMgPSBwdGhyZWFkX2dldHNjaGVkcGFyYW0ocHRo cmVhZF9zZWxmKCksICZwb2xpY3ksICZwYXJhbSk7CglpZiAocmMgIT0gMCkgewoJCXByaW50 ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfZ2V0c2NoZWRwYXJhbVxuIik7CgkJZXhpdChQVFNf VU5SRVNPTFZFRCk7Cgl9CglpZiAoKHBvbGljeSAhPSBQT0xJQ1kpIHx8IChwYXJhbS5zY2hl ZF9wcmlvcml0eSAhPSBMT1dfUFJJT1JJVFkpKSB7CgkJcHJpbnRmKCJFcnJvcjogdGhlIHBv bGljeSBvciBwcmlvcml0eSBub3QgY29ycmVjdFxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZF RCk7Cgl9CgoJQ1BVX1pFUk8oJmNwdXNldCk7CglDUFVfU0VUKHRudW0sICZjcHVzZXQpOwoK CXJjID0gcHRocmVhZF9zZXRhZmZpbml0eV9ucChwdGhyZWFkX3NlbGYoKSwgc2l6ZW9mKGNw dV9zZXRfdCksICZjcHVzZXQpOwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJST1JfUFJF 
RklYICJwdGhyZWFkX3NldGFmZmluaXR5X25wXG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVE KTsKCX0KCgkvKiBncmFiIHRoZSBzdGFydCB0aW1lIGFuZCBidXN5IGxvb3AgZm9yIDUgc2Vj b25kcyAqLwoJY2xvY2tfZ2V0dGltZShDTE9DS19SRUFMVElNRSwgJnN0YXJ0X3RpbWUpOwoJ d2hpbGUgKCF3b2tlbl91cCAmJiAhbG93X2RvbmUpIHsKCQljbG9ja19nZXR0aW1lKENMT0NL X1JFQUxUSU1FLCAmY3VycmVudF90aW1lKTsKCQlpZiAodGltZWRpZmYoY3VycmVudF90aW1l LCBzdGFydF90aW1lKSA+IFJVTlRJTUUpCgkJCWJyZWFrOwoJfQoKCWxvd19kb25lID0gMTsK CXJldHVybiBOVUxMOwp9CgppbnQgbWFpbigpCnsKCXB0aHJlYWRfdCBoaWdoX2lkLCAqbG93 X2lkLCBwYXVzZWRfaWQ7CglzdHJ1Y3Qgc2NoZWRfcGFyYW0gcGFyYW07CglpbnQgcmMgPSAw OwoJaW50IGksIG5jcHVzID0gc3lzY29uZihfU0NfTlBST0NFU1NPUlNfT05MTik7CgoJbG93 X2lkID0gbWFsbG9jKG5jcHVzICogc2l6ZW9mKHB0aHJlYWRfdCkpOwoKCS8qIGhpZ2ggcHJp byB0aHJlYWQgKi8KCXJjID0gcHRocmVhZF9jcmVhdGUoJmhpZ2hfaWQsIE5VTEwsIGhpX3By aW9yaXR5X3RocmVhZCwgTlVMTCk7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9Q UkVGSVggInB0aHJlYWRfY3JlYXRlXG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0K CgkvKiBsb3cgcHJpbyB0aHJlYWQgb24gZWFjaCBjcHUgZXhjZXB0IGxhc3Qgb25lICovCglm b3IgKGkgPSAwOyBpIDwgbmNwdXMgLSAxOyBpKyspIHsKCQl1aW50cHRyX3QgdG51bSA9IGk7 CgkJcmMgPSBwdGhyZWFkX2NyZWF0ZSgmbG93X2lkW2ldLCBOVUxMLCBsb3dfcHJpb3JpdHlf dGhyZWFkLCAodm9pZCAqKXRudW0pOwoJCWlmIChyYyAhPSAwKSB7CgkJCXByaW50ZihFUlJP Ul9QUkVGSVggInB0aHJlYWRfY3JlYXRlXG4iKTsKCQkJZXhpdChQVFNfVU5SRVNPTFZFRCk7 CgkJfQoJfQoKCXBhcmFtLnNjaGVkX3ByaW9yaXR5ID0gMDsKCXJjID0gcHRocmVhZF9zZXRz Y2hlZHBhcmFtKHB0aHJlYWRfc2VsZigpLCBTQ0hFRF9PVEhFUiwgJnBhcmFtKTsKCWlmIChy YyAhPSAwKSB7CgkJcHJpbnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9zZXRzY2hlZHBhcmFt XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCgkvKiBXYWl0IGZvciB0aGUgdGhy ZWFkcyB0byBleGl0ICovCglyYyA9IHB0aHJlYWRfam9pbihoaWdoX2lkLCBOVUxMKTsKCWlm IChyYyAhPSAwKSB7CgkJcHJpbnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9qb2luXG4iKTsK CQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCglmb3IgKGkgPSAwOyBpIDwgbmNwdXMgLSAx OyBpKyspIHsKCQlyYyA9IHB0aHJlYWRfam9pbihsb3dfaWRbaV0sIE5VTEwpOwoJCWlmIChy YyAhPSAwKSB7CgkJCXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfam9pblxuIik7CgkJ 
CWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJCX0KCX0KCglpZiAod29rZW5fdXAgPT0gMCkgewoJ CXByaW50ZigiVGVzdCBGQUlMRUQ6IGhpZ2ggcHJpb3JpdHkgd2FzIG5vdCB3b2tlbiB1cFxu Iik7CgkJZXhpdChQVFNfRkFJTCk7Cgl9CgoJcHJpbnRmKCJUZXN0IFBBU1NFRFxuIik7Cgll eGl0KFBUU19QQVNTKTsKfQo= --------------050005050204090301040905 Content-Type: application/x-shellscript; name="reproduce_v3.sh" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="reproduce_v3.sh" IyEvYmluL2Jhc2gKCmk9MApmYWlsZWQ9MAoKZ2NjIC1PMiAtcHRocmVhZCBwdGhyZWFkX2Nv bmRfd2FpdF8xX3YzLmMgfHwgZXhpdCAxCgp3aGlsZSBbIFRydWUgXTsgZG8KCXBzIC1lTCB8 IGF3ayAne3ByaW50ICQxfScgfCB4YXJncyAtaXt9IGNocnQgLWEgLXAgLS1yciA5MCB7fSAy PiAvZGV2L251bGwKCWk9JCgoaSsxKSkKCXRpbWUgLi9hLm91dCB8fCBmYWlsZWQ9JCgoZmFp bGVkKzEpKQoJZWNobyAiYWxsL2ZhaWxlZDogJGkvJGZhaWxlZCIKZG9uZQo= --------------050005050204090301040905--