From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752537AbcAUXCt (ORCPT ); Thu, 21 Jan 2016 18:02:49 -0500
Received: from g1t6225.austin.hp.com ([15.73.96.126]:40250 "EHLO
        g1t6225.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751197AbcAUXCk (ORCPT );
        Thu, 21 Jan 2016 18:02:40 -0500
Message-ID: <56A1638A.7050202@hpe.com>
Date: Thu, 21 Jan 2016 18:02:34 -0500
From: Waiman Long
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
MIME-Version: 1.0
To: Ding Tianhong
CC: Peter Zijlstra, Ingo Molnar,
        "linux-kernel@vger.kernel.org", Davidlohr Bueso,
        Linus Torvalds, "Paul E. McKenney", Thomas Gleixner,
        Will Deacon, Jason Low, Tim Chen, Waiman Long
Subject: Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
References: <56A0A4ED.3070308@huawei.com>
In-Reply-To: <56A0A4ED.3070308@huawei.com>
Content-Type: multipart/mixed;
        boundary="------------010801020104040900040709"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------010801020104040900040709
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 01/21/2016 04:29 AM, Ding Tianhong wrote:
> I build a script to create several process for ioctl loop calling,
> the ioctl will calling the kernel function just like:
> xx_ioctl {
>     ...
>     rtnl_lock();
>     function();
>     rtnl_unlock();
>     ...
> }
> The function may sleep several ms, but will not halt, at the same time
> another user service may calling ifconfig to change the state of the
> ethernet, and after several hours, the hung task thread report this problem:
>
> ========================================================================
> [149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
> [149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
> [149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
> [149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
> [149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
> [149738.042290] Call Trace:
> [149738.042300] [] schedule_preempt_disabled+0x29/0x70
> [149738.042303] [] __mutex_lock_slowpath+0xc5/0x1c0
> [149738.042305] [] mutex_lock+0x1f/0x2f
> [149738.042309] [] rtnl_lock+0x15/0x20
> [149738.042311] [] dev_ioctl+0xda/0x590
> [149738.042314] [] ? __do_page_fault+0x21c/0x560
> [149738.042318] [] sock_do_ioctl+0x45/0x50
> [149738.042320] [] sock_ioctl+0x1f0/0x2c0
> [149738.042324] [] do_vfs_ioctl+0x2e5/0x4c0
> [149738.042327] [] ? fget_light+0xa0/0xd0
>
> ================================ cut here ================================
>
> I got the vmcore and found that the ifconfig is already in the wait_list of the
> rtnl_lock for 120 second, but my process could get and release the rtnl_lock
> normally several times in one second, so it means that my process jump the
> queue and the ifconfig couldn't get the rtnl all the time, I check the mutex lock
> slow path and found that the mutex may spin on owner ignore whether the wait list
> is empty, it will cause the task in the wait list always be cut in line, so add
> test for wait list in the mutex_can_spin_on_owner and avoid this problem.
>
> Signed-off-by: Ding Tianhong
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Davidlohr Bueso
> Cc: Linus Torvalds
> Cc: Paul E. McKenney
> Cc: Thomas Gleixner
> Cc: Will Deacon
> Cc: Jason Low
> Cc: Tim Chen
> Cc: Waiman Long
> ---
>  kernel/locking/mutex.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 0551c21..596b341 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>  	struct task_struct *owner;
>  	int retval = 1;
>
> -	if (need_resched())
> +	if (need_resched() || atomic_read(&lock->count) == -1)
>  		return 0;
>
>  	rcu_read_lock();
> @@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
>  /*
>   * Optimistic spinning.
>   *
> - * We try to spin for acquisition when we find that the lock owner
> - * is currently running on a (different) CPU and while we don't
> - * need to reschedule. The rationale is that if the lock owner is
> - * running, it is likely to release the lock soon.
> + * We try to spin for acquisition when we find that there are no
> + * pending waiters and the lock owner is currently running on a
> + * (different) CPU and while we don't need to reschedule. The
> + * rationale is that if the lock owner is running, it is likely
> + * to release the lock soon.
>   *
>   * Since this needs the lock owner, and this mutex implementation
>   * doesn't track the owner atomically in the lock field, we need to

This patch will largely defeat the performance benefit of optimistic
spinning. I have an alternative solution to this live-lock problem.
Would you mind trying out the attached patch to see if it can fix your
problem?
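
To make the idea behind the attached patch concrete, here is a minimal
user-space sketch, not the kernel code itself. It assumes C11 atomics as
a stand-in for the kernel's atomic_t, READ_ONCE and WRITE_ONCE, and the
names toy_mutex, toy_init, toy_spinner_trylock, toy_wlh_trylock and
toy_unlock are hypothetical. The count encoding (1 unlocked, 0 locked,
-1 locked with waiters) follows the convention the quoted diff relies
on. The woken wait-list head raises a flag, and optimistic spinners
decline to take the lock while that flag is set:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct mutex; every name here is hypothetical. */
struct toy_mutex {
	atomic_int  count;        /* 1: unlocked, 0: locked, -1: locked, waiters */
	atomic_bool wlh_spinning; /* set while the wait-list head is spinning */
};

static void toy_init(struct toy_mutex *m)
{
	atomic_init(&m->count, 1);
	atomic_init(&m->wlh_spinning, false);
}

/* Optimistic spinner: back off while the wait-list head is spinning. */
static bool toy_spinner_trylock(struct toy_mutex *m)
{
	int unlocked = 1;

	if (atomic_load_explicit(&m->wlh_spinning, memory_order_relaxed))
		return false;
	return atomic_compare_exchange_strong_explicit(&m->count, &unlocked, 0,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Wait-list head: announce itself, then try to take the lock as -1.
 * The flag stays set on failure (the caller is expected to keep
 * spinning) and is cleared once the lock has been taken.
 */
static bool toy_wlh_trylock(struct toy_mutex *m)
{
	int unlocked = 1;
	bool got;

	atomic_store_explicit(&m->wlh_spinning, true, memory_order_relaxed);
	got = atomic_compare_exchange_strong_explicit(&m->count, &unlocked, -1,
			memory_order_acquire, memory_order_relaxed);
	if (got)
		atomic_store_explicit(&m->wlh_spinning, false,
				      memory_order_relaxed);
	return got;
}

/* Release the lock; the caller must currently hold it. */
static void toy_unlock(struct toy_mutex *m)
{
	assert(atomic_load(&m->count) != 1);
	atomic_store_explicit(&m->count, 1, memory_order_release);
}
```

Walking through it single-threaded: once the head has called
toy_wlh_trylock() and failed, a newcomer calling toy_spinner_trylock()
after the owner's toy_unlock() is turned away, and the head's next
attempt succeeds. Spinning is still allowed whenever no wait-list head
is active, which is the part the quoted change would give up.
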
Cheers, Longman --------------010801020104040900040709 Content-Type: text/plain; name="0001-locking-mutex-Enable-optimistic-spinning-of-woken-ta.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0001-locking-mutex-Enable-optimistic-spinning-of-woken-ta.pa"; filename*1="tch" RnJvbSAxYmJiNWE0NDM0ZDM5NWY0ODE2M2FiYzU0MzVjNWM3MjBhMTVkMzI3IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBXYWltYW4gTG9uZyA8V2FpbWFuLkxvbmdAaHBlLmNv bT4KRGF0ZTogVGh1LCAyMSBKYW4gMjAxNiAxNzo1MzoxNCAtMDUwMApTdWJqZWN0OiBbUEFU Q0hdIGxvY2tpbmcvbXV0ZXg6IEVuYWJsZSBvcHRpbWlzdGljIHNwaW5uaW5nIG9mIHdva2Vu IHRhc2sgaW4gd2FpdCBsaXN0CgpEaW5nIFRpYW5ob25nIHJlcG9ydGVkIGEgbGl2ZS1sb2Nr IHNpdHVhdGlvbiB3aGVyZSBhIGNvbnN0YW50IHN0cmVhbQpvZiBpbmNvbWluZyBvcHRpbWlz dGljIHNwaW5uZXJzIGJsb2NrZWQgYSB0YXNrIGluIHRoZSB3YWl0IGxpc3QgZnJvbQpnZXR0 aW5nIHRoZSBtdXRleC4KClRoaXMgcGF0Y2ggYXR0ZW1wdHMgdG8gZml4IHRoaXMgbGl2ZS1s b2NrIGNvbmRpdGlvbiBieSBlbmFibGluZyB0aGUKYSB3b2tlbiB0YXNrIGluIHRoZSB3YWl0 IGxpc3QgdG8gZW50ZXIgb3B0aW1pc3RpYyBzcGlubmluZyBsb29wIGl0c2VsZgp3aXRoIHBy ZWNlZGVuY2Ugb3ZlciB0aGUgb25lcyBpbiB0aGUgT1NRLiBUaGlzIHNob3VsZCBwcmV2ZW50 IHRoZQpsaXZlLWxvY2sKY29uZGl0aW9uIGZyb20gaGFwcGVuaW5nLgoKU2lnbmVkLW9mZi1i eTogV2FpbWFuIExvbmcgPFdhaW1hbi5Mb25nQGhwZS5jb20+Ci0tLQogaW5jbHVkZS9saW51 eC9tdXRleC5oICB8ICAgIDIgKwoga2VybmVsL2xvY2tpbmcvbXV0ZXguYyB8ICAgOTUgKysr KysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKystCiAyIGZpbGVz IGNoYW5nZWQsIDk1IGluc2VydGlvbnMoKyksIDIgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0 IGEvaW5jbHVkZS9saW51eC9tdXRleC5oIGIvaW5jbHVkZS9saW51eC9tdXRleC5oCmluZGV4 IDJjYjc1MzEuLjJjNTVlY2QgMTAwNjQ0Ci0tLSBhL2luY2x1ZGUvbGludXgvbXV0ZXguaAor KysgYi9pbmNsdWRlL2xpbnV4L211dGV4LmgKQEAgLTU3LDYgKzU3LDggQEAgc3RydWN0IG11 dGV4IHsKICNlbmRpZgogI2lmZGVmIENPTkZJR19NVVRFWF9TUElOX09OX09XTkVSCiAJc3Ry dWN0IG9wdGltaXN0aWNfc3Bpbl9xdWV1ZSBvc3E7IC8qIFNwaW5uZXIgTUNTIGxvY2sgKi8K KwkvKiBTZXQgaWYgd2FpdCBsaXN0IGhlYWQgYWN0aXZlbHkgc3Bpbm5pbmcgKi8KKwlpbnQJ CQl3bGhfc3Bpbm5pbmc7CiAjZW5kaWYKICNpZmRlZiBDT05GSUdfREVCVUdfTVVURVhFUwog 
CXZvaWQJCQkqbWFnaWM7CmRpZmYgLS1naXQgYS9rZXJuZWwvbG9ja2luZy9tdXRleC5jIGIv a2VybmVsL2xvY2tpbmcvbXV0ZXguYwppbmRleCAwNTUxYzIxLi44YjI3YjAzIDEwMDY0NAot LS0gYS9rZXJuZWwvbG9ja2luZy9tdXRleC5jCisrKyBiL2tlcm5lbC9sb2NraW5nL211dGV4 LmMKQEAgLTU1LDYgKzU1LDcgQEAgX19tdXRleF9pbml0KHN0cnVjdCBtdXRleCAqbG9jaywg Y29uc3QgY2hhciAqbmFtZSwgc3RydWN0IGxvY2tfY2xhc3Nfa2V5ICprZXkpCiAJbXV0ZXhf Y2xlYXJfb3duZXIobG9jayk7CiAjaWZkZWYgQ09ORklHX01VVEVYX1NQSU5fT05fT1dORVIK IAlvc3FfbG9ja19pbml0KCZsb2NrLT5vc3EpOworCWxvY2stPndsaF9zcGlubmluZyA9IGZh bHNlOwogI2VuZGlmCiAKIAlkZWJ1Z19tdXRleF9pbml0KGxvY2ssIG5hbWUsIGtleSk7CkBA IC0zNDYsOCArMzQ3LDEyIEBAIHN0YXRpYyBib29sIG11dGV4X29wdGltaXN0aWNfc3Bpbihz dHJ1Y3QgbXV0ZXggKmxvY2ssCiAJCWlmIChvd25lciAmJiAhbXV0ZXhfc3Bpbl9vbl9vd25l cihsb2NrLCBvd25lcikpCiAJCQlicmVhazsKIAotCQkvKiBUcnkgdG8gYWNxdWlyZSB0aGUg bXV0ZXggaWYgaXQgaXMgdW5sb2NrZWQuICovCi0JCWlmIChtdXRleF90cnlfdG9fYWNxdWly ZShsb2NrKSkgeworCQkvKgorCQkgKiBUcnkgdG8gYWNxdWlyZSB0aGUgbXV0ZXggaWYgaXQg aXMgdW5sb2NrZWQgYW5kIHRoZSB3YWl0CisJCSAqIGxpc3QgaGVhZCBpc24ndCBzcGlubmlu ZyBvbiB0aGUgbG9jay4KKwkJICovCisJCWlmICghUkVBRF9PTkNFKGxvY2stPndsaF9zcGlu bmluZykgJiYKKwkJICAgIG11dGV4X3RyeV90b19hY3F1aXJlKGxvY2spKSB7CiAJCQlsb2Nr X2FjcXVpcmVkKCZsb2NrLT5kZXBfbWFwLCBpcCk7CiAKIAkJCWlmICh1c2Vfd3dfY3R4KSB7 CkBAIC0zOTgsMTIgKzQwMyw5MSBAQCBkb25lOgogCiAJcmV0dXJuIGZhbHNlOwogfQorCisv KgorICogV2FpdCBsaXN0IGhlYWQgb3B0aW1pc3RpYyBzcGlubmluZworICoKKyAqIFRoZSB3 YWl0IGxpc3QgaGVhZCwgd2hlbiB3b2tlbiB1cCwgd2lsbCB0cnkgdG8gc3BpbiBvbiB0aGUg bG9jayBpZiB0aGUKKyAqIGxvY2sgb3duZXIgaXMgYWN0aXZlLiBJdCB3aWxsIGFsc28gc2V0 IHRoZSB3bGhfc3Bpbm5pbmcgZmxhZyB0byBnaXZlCisgKiBpdHNlbGYgYSBoaWdoZXIgY2hh bmNlIG9mIGdldHRpbmcgdGhlIGxvY2sgdGhhbiB0aGUgb3RoZXIgb3B0aW1pc2ljYWxseQor ICogc3Bpbm5pbmcgbG9ja2VyIGluIHRoZSBPU1EuCisgKi8KK3N0YXRpYyBib29sIG11dGV4 X3dsaF9vcHRfc3BpbihzdHJ1Y3QgbXV0ZXggKmxvY2ssCisJCQkgICAgICAgc3RydWN0IHd3 X2FjcXVpcmVfY3R4ICp3d19jdHgsIGNvbnN0IGJvb2wgdXNlX3d3X2N0eCkKK3sKKwlzdHJ1 Y3QgdGFza19zdHJ1Y3QgKm93bmVyLCAqdGFzayA9IGN1cnJlbnQ7CisJaW50IGdvdGxvY2sg 
PSBmYWxzZTsKKworCVdSSVRFX09OQ0UobG9jay0+d2xoX3NwaW5uaW5nLCB0cnVlKTsKKwl3 aGlsZSAodHJ1ZSkgeworCQlpZiAodXNlX3d3X2N0eCAmJiB3d19jdHgtPmFjcXVpcmVkID4g MCkgeworCQkJc3RydWN0IHd3X211dGV4ICp3dzsKKworCQkJd3cgPSBjb250YWluZXJfb2Yo bG9jaywgc3RydWN0IHd3X211dGV4LCBiYXNlKTsKKwkJCS8qCisJCQkgKiBJZiB3dy0+Y3R4 IGlzIHNldCB0aGUgY29udGVudHMgYXJlIHVuZGVmaW5lZCwgb25seQorCQkJICogYnkgYWNx dWlyaW5nIHdhaXRfbG9jayB0aGVyZSBpcyBhIGd1YXJhbnRlZSB0aGF0CisJCQkgKiB0aGV5 IGFyZSBub3QgaW52YWxpZCB3aGVuIHJlYWRpbmcuCisJCQkgKgorCQkJICogQXMgc3VjaCwg d2hlbiBkZWFkbG9jayBkZXRlY3Rpb24gbmVlZHMgdG8gYmUKKwkJCSAqIHBlcmZvcm1lZCB0 aGUgb3B0aW1pc3RpYyBzcGlubmluZyBjYW5ub3QgYmUgZG9uZS4KKwkJCSAqLworCQkJaWYg KFJFQURfT05DRSh3dy0+Y3R4KSkKKwkJCQlicmVhazsKKwkJfQorCisJCS8qCisJCSAqIElm IHRoZXJlJ3MgYW4gb3duZXIsIHdhaXQgZm9yIGl0IHRvIGVpdGhlcgorCQkgKiByZWxlYXNl IHRoZSBsb2NrIG9yIGdvIHRvIHNsZWVwLgorCQkgKi8KKwkJb3duZXIgPSBSRUFEX09OQ0Uo bG9jay0+b3duZXIpOworCQlpZiAob3duZXIgJiYgIW11dGV4X3NwaW5fb25fb3duZXIobG9j aywgb3duZXIpKQorCQkJYnJlYWs7CisKKwkJLyoKKwkJICogVHJ5IHRvIGFjcXVpcmUgdGhl IG11dGV4IGlmIGl0IGlzIHVubG9ja2VkLiBUaGUgbXV0ZXgKKwkJICogdmFsdWUgaXMgc2V0 IHRvIC0xIHdoaWNoIHdpbGwgYmUgY2hhbmdlZCB0byAwIGxhdGVyIG9uCisJCSAqIGlmIHRo ZSB3YWl0IGxpc3QgYmVjb21lcyBlbXB0eS4KKwkJICovCisJCWlmICghbXV0ZXhfaXNfbG9j a2VkKGxvY2spICYmCisJCSAgIChhdG9taWNfY21weGNoZ19hY3F1aXJlKCZsb2NrLT5jb3Vu dCwgMSwgLTEpID09IDEpKSB7CisJCQlnb3Rsb2NrID0gdHJ1ZTsKKwkJCWJyZWFrOworCQl9 CisKKwkJLyoKKwkJICogV2hlbiB0aGVyZSdzIG5vIG93bmVyLCB3ZSBtaWdodCBoYXZlIHBy ZWVtcHRlZCBiZXR3ZWVuIHRoZQorCQkgKiBvd25lciBhY3F1aXJpbmcgdGhlIGxvY2sgYW5k IHNldHRpbmcgdGhlIG93bmVyIGZpZWxkLiBJZgorCQkgKiB3ZSdyZSBhbiBSVCB0YXNrIHRo YXQgd2lsbCBsaXZlLWxvY2sgYmVjYXVzZSB3ZSB3b24ndCBsZXQKKwkJICogdGhlIG93bmVy IGNvbXBsZXRlLgorCQkgKi8KKwkJaWYgKCFvd25lciAmJiAobmVlZF9yZXNjaGVkKCkgfHwg cnRfdGFzayh0YXNrKSkpCisJCQlicmVhazsKKworCQkvKgorCQkgKiBUaGUgY3B1X3JlbGF4 KCkgY2FsbCBpcyBhIGNvbXBpbGVyIGJhcnJpZXIgd2hpY2ggZm9yY2VzCisJCSAqIGV2ZXJ5 dGhpbmcgaW4gdGhpcyBsb29wIHRvIGJlIHJlLWxvYWRlZC4gV2UgZG9uJ3QgbmVlZAorCQkg 
KiBtZW1vcnkgYmFycmllcnMgYXMgd2UnbGwgZXZlbnR1YWxseSBvYnNlcnZlIHRoZSByaWdo dAorCQkgKiB2YWx1ZXMgYXQgdGhlIGNvc3Qgb2YgYSBmZXcgZXh0cmEgc3BpbnMuCisJCSAq LworCQljcHVfcmVsYXhfbG93bGF0ZW5jeSgpOworCisJfQorCVdSSVRFX09OQ0UobG9jay0+ d2xoX3NwaW5uaW5nLCBmYWxzZSk7CisJcmV0dXJuIGdvdGxvY2s7Cit9CiAjZWxzZQogc3Rh dGljIGJvb2wgbXV0ZXhfb3B0aW1pc3RpY19zcGluKHN0cnVjdCBtdXRleCAqbG9jaywKIAkJ CQkgIHN0cnVjdCB3d19hY3F1aXJlX2N0eCAqd3dfY3R4LCBjb25zdCBib29sIHVzZV93d19j dHgpCiB7CiAJcmV0dXJuIGZhbHNlOwogfQorCitzdGF0aWMgYm9vbCBtdXRleF93bGhfb3B0 X3NwaW4oc3RydWN0IG11dGV4ICpsb2NrLAorCQkJICAgICAgIHN0cnVjdCB3d19hY3F1aXJl X2N0eCAqd3dfY3R4LCBjb25zdCBib29sIHVzZV93d19jdHgpCit7CisJcmV0dXJuIGZhbHNl OworfQogI2VuZGlmCiAKIF9fdmlzaWJsZSBfX3VzZWQgbm9pbmxpbmUKQEAgLTU0Myw2ICs2 MjcsOCBAQCBfX211dGV4X2xvY2tfY29tbW9uKHN0cnVjdCBtdXRleCAqbG9jaywgbG9uZyBz dGF0ZSwgdW5zaWduZWQgaW50IHN1YmNsYXNzLAogCWxvY2tfY29udGVuZGVkKCZsb2NrLT5k ZXBfbWFwLCBpcCk7CiAKIAlmb3IgKDs7KSB7CisJCWludCBnb3Rsb2NrOworCiAJCS8qCiAJ CSAqIExldHMgdHJ5IHRvIHRha2UgdGhlIGxvY2sgYWdhaW4gLSB0aGlzIGlzIG5lZWRlZCBl dmVuIGlmCiAJCSAqIHdlIGdldCBoZXJlIGZvciB0aGUgZmlyc3QgdGltZSAoc2hvcnRseSBh ZnRlciBmYWlsaW5nIHRvCkBAIC01NzcsNyArNjYzLDEyIEBAIF9fbXV0ZXhfbG9ja19jb21t b24oc3RydWN0IG11dGV4ICpsb2NrLCBsb25nIHN0YXRlLCB1bnNpZ25lZCBpbnQgc3ViY2xh c3MsCiAJCS8qIGRpZG4ndCBnZXQgdGhlIGxvY2ssIGdvIHRvIHNsZWVwOiAqLwogCQlzcGlu X3VubG9ja19tdXRleCgmbG9jay0+d2FpdF9sb2NrLCBmbGFncyk7CiAJCXNjaGVkdWxlX3By ZWVtcHRfZGlzYWJsZWQoKTsKKworCQkvKiBvcHRpbWlzdGljYWxseSBzcGlubmluZyBvbiB0 aGUgbXV0ZXggd2l0aG91dCB0aGUgd2FpdCBsb2NrICovCisJCWdvdGxvY2sgPSBtdXRleF93 bGhfb3B0X3NwaW4obG9jaywgd3dfY3R4LCB1c2Vfd3dfY3R4KTsKIAkJc3Bpbl9sb2NrX211 dGV4KCZsb2NrLT53YWl0X2xvY2ssIGZsYWdzKTsKKwkJaWYgKGdvdGxvY2spCisJCQlicmVh azsKIAl9CiAJX19zZXRfdGFza19zdGF0ZSh0YXNrLCBUQVNLX1JVTk5JTkcpOwogCi0tIAox LjcuMQoK --------------010801020104040900040709--