From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S967614AbcA1PzK (ORCPT ); Thu, 28 Jan 2016 10:55:10 -0500
Received: from mx1.redhat.com ([209.132.183.28]:54599 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965872AbcA1PzH (ORCPT ); Thu, 28 Jan 2016 10:55:07 -0500
Subject: Re: [BUG] scheduler doesn't balance thread to idle cpu for 3 seconds
To: alex.shi@intel.com, guz.fnst@cn.fujitsu.com, peterz@infradead.org, mingo@redhat.com, jolsa@redhat.com, riel@redhat.com, linux-kernel@vger.kernel.org
References: <56A8D994.6050205@redhat.com>
Cc: jstancek@redhat.com
From: Jan Stancek
Message-ID: <56AA39D6.4070509@redhat.com>
Date: Thu, 28 Jan 2016 16:55:02 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <56A8D994.6050205@redhat.com>
Content-Type: multipart/mixed; boundary="------------000205050006040309040000"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------000205050006040309040000
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 01/27/2016 03:52 PM, Jan Stancek wrote:
> Hello,
>
> pthread_cond_wait_1/2 [1] is rarely failing for me on 4.5.0-rc1,
> on x86_64 KVM guest with 2 CPUs.
>
> This test [1]:
> - spawns 2 SCHED_RR threads
> - first thread with higher priority sets alarm for 2 seconds and blocks on condition
> - second thread with lower priority is busy looping for 5 seconds
> - after 2 seconds alarm signal arrives and handler signals condition
> - high priority thread should resume running

I have slightly modified the testcase so that it finishes as soon as the
high-priority thread is done, and so that it compiles outside of the
openposix testsuite. The testcase is attached.
I'm running it in the following way:

  gcc -O2 -pthread pthread_cond_wait_1.c
  while [ True ]; do
      time ./a.out
      sleep 1
  done

for a couple of thousand iterations. About half of those were on a system
booted with init=/bin/bash.

>
> But rarely I see that high priority thread doesn't resume running until
> low priority thread completes its 5 second busy loop.
>
> Looking at traces (short version attached, long version at [2]),
> I see that after 2 seconds scheduler tries to wake up main thread, but it
> appears to do that on same CPU where SCHED_RR low prio thread is running,
> so nothing happens. Then scheduler makes numerous balance attempts,
> but main thread is not balanced to idle CPU.
>
> My guess is this started with following commit, which changed weighted_cpuload():
>   commit b92486cbf2aa230d00f160664858495c81d2b37b
>   Author: Alex Shi
>   Date:   Thu Jun 20 10:18:50 2013 +0800
>     sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task

Here are some numbers gathered from kernels with HEAD at b92486c and at the
previous commit, 83dfd52. The system is a 2-CPU KVM guest. Each iteration
measures how long the testcase took to finish. Ideally it should take
about 2 seconds.

1. HEAD at 83dfd52 sched: Update cpu load after task_tick

finish time [s] | iterations
----------------------------------
[  2, 2.2]      | 3134
[2.2, 2.5]      |   18
[2.5,   3]      |    0
[  3,   4]      |    0
[  4,   5]      |    0
[  5, 999]      |    0

2. HEAD at b92486c sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task

finish time [s] | iterations
----------------------------------
[  2, 2.2]      | 1617
[2.2, 2.5]      |   38
[2.5,   3]      |  727
[  3,   4]      |  399
[  4,   5]      |   17
[  5, 999]      |   11

Regards,
Jan

>
> I could reproduce it with HEAD set at above commit, I couldn't reproduce it
> with 3.10 kernel so far.
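
For reference, a minimal sketch of how per-iteration finish times could be bucketed into the ranges used in the tables above. It is not the script that produced those numbers. The function name bucket_times and the input format (one elapsed time in seconds per line on stdin) are assumptions, and so is the choice of half-open ranges [lo, hi), since the tables do not say which bucket a boundary value falls into.

```shell
#!/bin/sh
# Hypothetical helper: read one finish time (seconds) per line from stdin
# and print how many fall into each range used in the tables.
# Assumption: ranges are half-open, [lo, hi). Values below 2 or at/above 999
# are not counted.
bucket_times() {
    awk '
        BEGIN {
            # Bucket edges taken from the tables: 2, 2.2, 2.5, 3, 4, 5, 999
            n = split("2 2.2 2.5 3 4 5 999", edge, " ")
        }
        {
            t = $1 + 0
            for (i = 1; i < n; i++) {
                if (t >= edge[i] + 0 && t < edge[i + 1] + 0) {
                    count[i]++
                    break
                }
            }
        }
        END {
            print "finish time [s] | iterations"
            for (i = 1; i < n; i++)
                printf "[%s, %s] | %d\n", edge[i], edge[i + 1], count[i] + 0
        }
    '
}
```

Example use, assuming the elapsed time of each run was saved to a hypothetical file times.txt: `bucket_times < times.txt`.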
> > Regards, > Jan > > [1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/functional/threads/condvar/pthread_cond_wait_1.c > [2] http://jan.stancek.eu/tmp/pthread_cond_wait_failure/sched-trace1.tar.bz2 > --------------000205050006040309040000 Content-Type: text/plain; charset=UTF-8; name="pthread_cond_wait_1.c" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="pthread_cond_wait_1.c" LyoKICogQ29weXJpZ2h0IChjKSAyMDA0LCBRVUFMQ09NTSBJbmMuIEFsbCByaWdodHMgcmVz ZXJ2ZWQuCiAqIENyZWF0ZWQgYnk6ICBhYmlzYWluIFJFTU9WRS1USElTIEFUIHF1YWxjb21t IERPVCBjb20KICogVGhpcyBmaWxlIGlzIGxpY2Vuc2VkIHVuZGVyIHRoZSBHUEwgbGljZW5z ZS4gIEZvciB0aGUgZnVsbCBjb250ZW50CiAqIG9mIHRoaXMgbGljZW5zZSwgc2VlIHRoZSBD T1BZSU5HIGZpbGUgYXQgdGhlIHRvcCBsZXZlbCBvZiB0aGlzCiAqIHNvdXJjZSB0cmVlLgoK ICogVGVzdCB0aGF0IHB0aHJlYWRfY29uZF9zaWduYWwoKQogKiAgIHNoYWxsIHdha2V1cCBh IGhpZ2ggcHJpb3JpdHkgdGhyZWFkIGV2ZW4gd2hlbiBhIGxvdyBwcmlvcml0eSB0aHJlYWQK ICogICBpcyBydW5uaW5nCgogKiBTdGVwczoKICogMS4gQ3JlYXRlIGEgY29uZGl0aW9uIHZh cmlhYmxlCiAqIDIuIENyZWF0ZSBhIGhpZ2ggcHJpb3JpdHkgdGhyZWFkIGFuZCBtYWtlIGl0 IHdhaXQgb24gdGhlIGNvbmQKICogMy4gQ3JlYXRlIGEgbG93IHByaW9yaXR5IHRocmVhZCBh bmQgbGV0IGl0IGJ1c3ktbG9vcAogKiA0LiBTaWduYWwgdGhlIGNvbmQgaW4gYSBzaWduYWwg aGFuZGxlciBhbmQgY2hlY2sgdGhhdCBoaWdoCiAqICAgIHByaW9yaXR5IHRocmVhZCBnb3Qg d29rZW4gdXAKICoKICovCgojaW5jbHVkZSA8cHRocmVhZC5oPgojaW5jbHVkZSA8c3RkaW8u aD4KI2luY2x1ZGUgPHN0ZGxpYi5oPgojaW5jbHVkZSA8c2lnbmFsLmg+CiNpbmNsdWRlIDx1 bmlzdGQuaD4KI2luY2x1ZGUgPHRpbWUuaD4KCiNkZWZpbmUgVEVTVCAiNS0xIgojZGVmaW5l IEFSRUEgInNjaGVkdWxlciIKI2RlZmluZSBFUlJPUl9QUkVGSVggInVuZXhwZWN0ZWQgZXJy b3I6ICIgQVJFQSAiICIgVEVTVCAiOiAiCgojZGVmaW5lIEhJR0hfUFJJT1JJVFkgMTAKI2Rl ZmluZSBMT1dfUFJJT1JJVFkgIDUKI2RlZmluZSBSVU5USU1FICAgICAgIDUKI2RlZmluZSBQ T0xJQ1kgICAgICAgIFNDSEVEX1JSCgojZGVmaW5lIFBUU19QQVNTIDAKI2RlZmluZSBQVFNf RkFJTCAxCiNkZWZpbmUgUFRTX1VOUkVTT0xWRUQgMgoKCi8qIG11dGV4IHJlcXVpcmVkIGJ5 IHRoZSBjb25kIHZhcmlhYmxlICovCnB0aHJlYWRfbXV0ZXhfdCBtdXRleCA9IFBUSFJFQURf 
TVVURVhfSU5JVElBTElaRVI7Ci8qIGNvbmRpdGlvbiB2YXJpYWJsZSB0aGF0IHRocmVhZHMg YmxvY2sgb24qLwpwdGhyZWFkX2NvbmRfdCBjb25kID0gUFRIUkVBRF9DT05EX0lOSVRJQUxJ WkVSOwoKLyogRmxhZ3MgdGhhdCB0aGUgdGhyZWFkcyB1c2UgdG8gaW5kaWNhdGUgZXZlbnRz ICovCnZvbGF0aWxlIGludCB3b2tlbl91cCA9IDA7CnZvbGF0aWxlIGludCBsb3dfZG9uZSA9 IDA7CgovKiBTaWduYWwgaGFuZGxlciB0aGF0IGhhbmRsZSB0aGUgQUxSTSBhbmQgd2FrZXMg dXAKICogdGhlIGhpZ2ggcHJpb3JpdHkgdGhyZWFkCiAqLwp2b2lkIHNpZ25hbF9oYW5kbGVy KGludCBzaWcpCnsKCSh2b2lkKSBzaWc7CglpZiAocHRocmVhZF9jb25kX3NpZ25hbCgmY29u ZCkgIT0gMCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfY29uZF9zaWduYWxc biIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQp9CgovKiBVdGlsaXR5IGZ1bmN0aW9u IHRvIGZpbmQgZGlmZmVyZW5jZSBiZXR3ZWVuIHR3byB0aW1lIHZhbHVlcyAqLwpmbG9hdCB0 aW1lZGlmZihzdHJ1Y3QgdGltZXNwZWMgdDIsIHN0cnVjdCB0aW1lc3BlYyB0MSkKewoJZmxv YXQgZGlmZiA9IHQyLnR2X3NlYyAtIHQxLnR2X3NlYzsKCWRpZmYgKz0gKHQyLnR2X25zZWMg LSB0MS50dl9uc2VjKSAvIDEwMDAwMDAwMDAuMDsKCXJldHVybiBkaWZmOwp9Cgp2b2lkICpo aV9wcmlvcml0eV90aHJlYWQodm9pZCAqdG1wKQp7CglzdHJ1Y3Qgc2NoZWRfcGFyYW0gcGFy YW07CglpbnQgcG9saWN5OwoJaW50IHJjID0gMDsKCgkodm9pZCkgdG1wOwoJcGFyYW0uc2No ZWRfcHJpb3JpdHkgPSBISUdIX1BSSU9SSVRZOwoKCXJjID0gcHRocmVhZF9zZXRzY2hlZHBh cmFtKHB0aHJlYWRfc2VsZigpLCBQT0xJQ1ksICZwYXJhbSk7CglpZiAocmMgIT0gMCkgewoJ CXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfc2V0c2NoZWRwYXJhbVxuIik7CgkJZXhp dChQVFNfVU5SRVNPTFZFRCk7Cgl9CglyYyA9IHB0aHJlYWRfZ2V0c2NoZWRwYXJhbShwdGhy ZWFkX3NlbGYoKSwgJnBvbGljeSwgJnBhcmFtKTsKCWlmIChyYyAhPSAwKSB7CgkJcHJpbnRm KEVSUk9SX1BSRUZJWCAicHRocmVhZF9nZXRzY2hlZHBhcmFtXG4iKTsKCQlleGl0KFBUU19V TlJFU09MVkVEKTsKCX0KCWlmICgocG9saWN5ICE9IFBPTElDWSkgfHwgKHBhcmFtLnNjaGVk X3ByaW9yaXR5ICE9IEhJR0hfUFJJT1JJVFkpKSB7CgkJcHJpbnRmKCJFcnJvcjogdGhlIHBv bGljeSBvciBwcmlvcml0eSBub3QgY29ycmVjdFxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZF RCk7Cgl9CgoJLyogSW5zdGFsbCBhIHNpZ25hbCBoYW5kbGVyIGZvciBBTFJNICovCglpZiAo c2lnbmFsKFNJR0FMUk0sIHNpZ25hbF9oYW5kbGVyKSAhPSAwKSB7CgkJcGVycm9yKEVSUk9S X1BSRUZJWCAic2lnbmFsOiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoKCS8qIGFj 
cXVpcmUgdGhlIG11dGV4ICovCglyYyA9IHB0aHJlYWRfbXV0ZXhfbG9jaygmbXV0ZXgpOwoJ aWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJST1JfUFJFRklYICJwdGhyZWFkX211dGV4X2xv Y2tcbiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoKCS8qIFNldHVwIGFuIGFsYXJt IHRvIGdvIG9mZiBpbiAyIHNlY29uZHMgKi8KCWFsYXJtKDIpOwoKCS8qIEJsb2NrLCB0byBi ZSB3b2tlbiB1cCBieSB0aGUgc2lnbmFsIGhhbmRsZXIgKi8KCXJjID0gcHRocmVhZF9jb25k X3dhaXQoJmNvbmQsICZtdXRleCk7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9Q UkVGSVggInB0aHJlYWRfY29uZF93YWl0XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsK CX0KCgkvKiBUaGlzIHZhcmlhYmxlIGlzIHVucHJvdGVjdGVkIGJlY2F1c2UgdGhlIHNjaGVk dWxpbmcgcmVtb3ZlcwoJICogdGhlIGNvbnRlbnRpb24KCSAqLwoJaWYgKGxvd19kb25lICE9 IDEpCgkJd29rZW5fdXAgPSAxOwoKCXJjID0gcHRocmVhZF9tdXRleF91bmxvY2soJm11dGV4 KTsKCWlmIChyYyAhPSAwKSB7CgkJcHJpbnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9tdXRl eF91bmxvY2tcbiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoJcmV0dXJuIE5VTEw7 Cn0KCnZvaWQgKmxvd19wcmlvcml0eV90aHJlYWQodm9pZCAqdG1wKQp7CglzdHJ1Y3QgdGlt ZXNwZWMgc3RhcnRfdGltZSwgY3VycmVudF90aW1lOwoJc3RydWN0IHNjaGVkX3BhcmFtIHBh cmFtOwoJaW50IHBvbGljeTsKCWludCByYyA9IDA7CgoJKHZvaWQpIHRtcDsKCXBhcmFtLnNj aGVkX3ByaW9yaXR5ID0gTE9XX1BSSU9SSVRZOwoKCXJjID0gcHRocmVhZF9zZXRzY2hlZHBh cmFtKHB0aHJlYWRfc2VsZigpLCBQT0xJQ1ksICZwYXJhbSk7CglpZiAocmMgIT0gMCkgewoJ CXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfc2V0c2NoZWRwYXJhbVxuIik7CgkJZXhp dChQVFNfVU5SRVNPTFZFRCk7Cgl9CglyYyA9IHB0aHJlYWRfZ2V0c2NoZWRwYXJhbShwdGhy ZWFkX3NlbGYoKSwgJnBvbGljeSwgJnBhcmFtKTsKCWlmIChyYyAhPSAwKSB7CgkJcHJpbnRm KEVSUk9SX1BSRUZJWCAicHRocmVhZF9nZXRzY2hlZHBhcmFtXG4iKTsKCQlleGl0KFBUU19V TlJFU09MVkVEKTsKCX0KCWlmICgocG9saWN5ICE9IFBPTElDWSkgfHwgKHBhcmFtLnNjaGVk X3ByaW9yaXR5ICE9IExPV19QUklPUklUWSkpIHsKCQlwcmludGYoIkVycm9yOiB0aGUgcG9s aWN5IG9yIHByaW9yaXR5IG5vdCBjb3JyZWN0XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVE KTsKCX0KCgkvKiBncmFiIHRoZSBzdGFydCB0aW1lIGFuZCBidXN5IGxvb3AgZm9yIDUgc2Vj b25kcyAqLwoJY2xvY2tfZ2V0dGltZShDTE9DS19SRUFMVElNRSwgJnN0YXJ0X3RpbWUpOwoJ d2hpbGUgKDEgJiYgIXdva2VuX3VwKSB7CgkJY2xvY2tfZ2V0dGltZShDTE9DS19SRUFMVElN 
RSwgJmN1cnJlbnRfdGltZSk7CgkJaWYgKHRpbWVkaWZmKGN1cnJlbnRfdGltZSwgc3RhcnRf dGltZSkgPiBSVU5USU1FKQoJCQlicmVhazsKCX0KCWxvd19kb25lID0gMTsKCXJldHVybiBO VUxMOwp9CgppbnQgbWFpbigpCnsKCXB0aHJlYWRfdCBoaWdoX2lkLCBsb3dfaWQ7CglwdGhy ZWFkX2F0dHJfdCBoaWdoX2F0dHIsIGxvd19hdHRyOwoJc3RydWN0IHNjaGVkX3BhcmFtIHBh cmFtOwoJaW50IHJjID0gMDsKCgkvKiBDcmVhdGUgdGhlIGhpZ2hlciBwcmlvcml0eSB0aHJl YWQgKi8KCXJjID0gcHRocmVhZF9hdHRyX2luaXQoJmhpZ2hfYXR0cik7CglpZiAocmMgIT0g MCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRfYXR0cl9pbml0XG4iKTsKCQll eGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCglyYyA9IHB0aHJlYWRfYXR0cl9zZXRzY2hlZHBv bGljeSgmaGlnaF9hdHRyLCBQT0xJQ1kpOwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJS T1JfUFJFRklYICJwdGhyZWFkX2F0dHJfc2V0c2NoZWRwb2xpY3lcbiIpOwoJCWV4aXQoUFRT X1VOUkVTT0xWRUQpOwoJfQoJcGFyYW0uc2NoZWRfcHJpb3JpdHkgPSBISUdIX1BSSU9SSVRZ OwoJcmMgPSBwdGhyZWFkX2F0dHJfc2V0c2NoZWRwYXJhbSgmaGlnaF9hdHRyLCAmcGFyYW0p OwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJST1JfUFJFRklYICJwdGhyZWFkX2F0dHJf c2V0c2NoZWRwYXJhbVxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZFRCk7Cgl9CglyYyA9IHB0 aHJlYWRfY3JlYXRlKCZoaWdoX2lkLCAmaGlnaF9hdHRyLCBoaV9wcmlvcml0eV90aHJlYWQs IE5VTEwpOwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJST1JfUFJFRklYICJwdGhyZWFk X2NyZWF0ZVxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZFRCk7Cgl9CgoJLyogQ3JlYXRlIHRo ZSBsb3cgcHJpb3JpdHkgdGhyZWFkICovCglyYyA9IHB0aHJlYWRfYXR0cl9pbml0KCZsb3df YXR0cik7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVggInB0aHJlYWRf YXR0cl9pbml0XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCXJjID0gcHRocmVh ZF9hdHRyX3NldHNjaGVkcG9saWN5KCZsb3dfYXR0ciwgUE9MSUNZKTsKCWlmIChyYyAhPSAw KSB7CgkJcHJpbnRmKEVSUk9SX1BSRUZJWCAicHRocmVhZF9hdHRyX3NldHNjaGVkcG9saWN5 XG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVEKTsKCX0KCXBhcmFtLnNjaGVkX3ByaW9yaXR5 ID0gTE9XX1BSSU9SSVRZOwoJcmMgPSBwdGhyZWFkX2F0dHJfc2V0c2NoZWRwYXJhbSgmbG93 X2F0dHIsICZwYXJhbSk7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihFUlJPUl9QUkVGSVgg InB0aHJlYWRfYXR0cl9zZXRzY2hlZHBhcmFtXG4iKTsKCQlleGl0KFBUU19VTlJFU09MVkVE KTsKCX0KCXJjID0gcHRocmVhZF9jcmVhdGUoJmxvd19pZCwgJmxvd19hdHRyLCBsb3dfcHJp 
b3JpdHlfdGhyZWFkLCBOVUxMKTsKCWlmIChyYyAhPSAwKSB7CgkJcHJpbnRmKEVSUk9SX1BS RUZJWCAicHRocmVhZF9jcmVhdGVcbiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoK CS8qIFdhaXQgZm9yIHRoZSB0aHJlYWRzIHRvIGV4aXQgKi8KCXJjID0gcHRocmVhZF9qb2lu KGhpZ2hfaWQsIE5VTEwpOwoJaWYgKHJjICE9IDApIHsKCQlwcmludGYoRVJST1JfUFJFRklY ICJwdGhyZWFkX2pvaW5cbiIpOwoJCWV4aXQoUFRTX1VOUkVTT0xWRUQpOwoJfQoKCXJjID0g cHRocmVhZF9qb2luKGxvd19pZCwgTlVMTCk7CglpZiAocmMgIT0gMCkgewoJCXByaW50ZihF UlJPUl9QUkVGSVggInB0aHJlYWRfam9pblxuIik7CgkJZXhpdChQVFNfVU5SRVNPTFZFRCk7 Cgl9CgoJLyogQ2hlY2sgdGhlIHJlc3VsdCAqLwoJaWYgKHdva2VuX3VwID09IDApIHsKCQlw cmludGYoIlRlc3QgRkFJTEVEOiBoaWdoIHByaW9yaXR5IHdhcyBub3Qgd29rZW4gdXBcXG4i KTsKCQlleGl0KFBUU19GQUlMKTsKCX0KCglwcmludGYoIlRlc3QgUEFTU0VEXG4iKTsKCWV4 aXQoUFRTX1BBU1MpOwp9Cg== --------------000205050006040309040000--