From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Weinberger
Subject: Re: [uml-devel] BUG: soft lockup for a user mode linux image
Date: Thu, 03 Oct 2013 21:20:52 +0200
Message-ID: <524DC394.6030406@nod.at>
References: <524C6643.2040209@gmx.de> <524DBD5D.1040203@gmx.de> <524DBFBB.1050002@nod.at> <524DC278.3020106@gmx.de>
In-Reply-To: <524DC278.3020106@gmx.de>
Sender: trinity-owner@vger.kernel.org
Content-Type: text/plain; charset="utf-8"
To: Toralf Förster
Cc: Richard Weinberger, trinity@vger.kernel.org, UML devel

On 03.10.2013 21:16, Toralf Förster wrote:
> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>> On 03.10.2013 20:54, Toralf Förster wrote:
>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster wrote:
>>>>> Running trinity (1 process, no victim files, just "$> trinity -C1") for a longer time
>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>> yields this console message:
>>>>>
>>>>>  * Starting local
>>>>> net.core.warnings = 0                                        [ ok ]
>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>
>>>>>
>>>>> and on the host, one of the "linux" processes eats all CPU cycles of one CPU core.
>>>>> 2 subsequent made back traces made with >>>>> >>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex = bt >>>>> >>>>> shows nearly a similar position around __get_user_pages() - both = are attached. >>>>> >>>>> I'm not surprised that trinity harms a systems - I'm just wonderi= ng whether this particular picture is >>>>> expected or if it points to an issue. >>>>> >>>>> >>>>> FWIW the last lines of trinity log were : >>>>> >>>>> >>>>> [2031] [94] setsid() =3D 2031 >>>>> [2031] [95] setresgid(rgid=3D0xffff33e3, egid=3D0xffffff93, sgid=3D= 0x22000040) =3D -1 (Operation not permitted) >>>>> [2031] [96] vmsplice(fd=3D5, iov=3D0x85501e0, nr_segs=3D300, flag= s=3D9) =3D 0x3000 >>>>> [2031] [97] setresuid(ruid=3D0x80549193, euid=3D0xc61041e0, suid=3D= 0xff19b6fa) =3D -1 (Operation not permitted) >>>>> [2031] [98] setpriority(which=3D0xff010000, who=3D0xf3737373, nic= eval=3D0x8088960c) =3D -1 (Invalid argument) >>>>> [2031] [99] socketcall(call=3D1, args=3D0x8550200) =3D -1 (Addres= s family not supported by protocol) >>>>> [2031] [100] access(filename=3D"=EF=BF=BD", mode=3D2017) =3D -1 (= Invalid argument) >>>>> [2031] [101] getgroups(gidsetsize=3D0, grouplist=3D0x80d0000[page= _rand]) =3D 3 >>>>> [2031] [102] msync(start=3D0xc0100220, len=3D0, flags=3D3) =3D -1= (Invalid argument) >>>>> [2031] [103] sigpending(set=3D0x40025000) =3D 0 >>>>> [2031] [104] signalfd4(ufd=3D383, user_mask=3D1, sizemask=3D0xa42= 00000, flags=3D0x80800) =3D -1 (Invalid argument) >>>>> [2031] [105] sendfile(out_fd=3D383, in_fd=3D382, offset=3D0, coun= t=3D4096) =3D -1 (Invalid argument) >>>>> [2031] [106] fanotify_mark(fanotify_fd=3D382, flags=3D5, mask=3D0= x8000023, dfd=3D382, pathname=3D"/proc/1092/task/1092/fdinfo/68") =3D -= 1 (Invalid argument) >>>>> [2031] [107] wait4(upid=3D1, stat_addr=3D4, options=3D0xd761979b,= ru=3D8) =3D -1 (Invalid argument) >>>>> [2031] [108] sigpending(set=3D0x80ca000[page_zeros]) =3D 0 >>>>> [2031] [109] setresuid(ruid=3D0xefffd6fc, 
euid=3D0x1bf4c92f, suid= =3D0xffff2e33) =3D -1 (Operation not permitted) >>>>> [2031] [110] munlock(addr=3D0x40025000, len=3D34) =3D 0 >>>>> [2031] [111] timer_delete(timer_id=3D0xffffffdc) =3D -1 (Invalid = argument) >>>>> [2031] [112] sched_get_priority_max(policy=3D0x10000040) =3D -1 (= Invalid argument) >>>>> [2031] [113] syslog(type=3D0xc1000000, buf=3D1, len=3D0x82a5) =3D= -1 (Operation not permitted) >>>>> [2031] [114] setpriority(which=3D0xc4c806c6, who=3D0xffffff01, ni= ceval=3D0xffff0682) =3D -1 (Invalid argument) >>>>> [2031] [115] getgroups16(gidsetsize=3D0xfffe, grouplist=3D1) =3D = -1 (Bad address) >>>>> [2031] [116] rename(oldname=3D4, newname=3D8) =3D -1 (Bad address= ) >>>>> [2031] [117] inotify_init() =3D 654 >>>>> [2031] [118] getgid() =3D 100 >>>>> [2031] [119] fstatat64(dfd=3D382, filename=3D"/sys/devices/virtua= l/net/sit0/duplex", statbuf=3D0, flag=3D0xb545d727) =3D -1 (Invalid arg= ument) >>>>> [2031] [120] unlinkat(dfd=3D382, pathname=3D"/proc/sys/net/ipv4/n= eigh/default/retrans_time", flag=3D0xc00ef76) =3D -1 (Invalid argument) >>>>> [2031] [121] timerfd_create(clockid=3D0, flags=3D0) =3D 655 >>>>> [2031] [122] munlock(addr=3D4, len=3D0x3fff) =3D -1 (Cannot alloc= ate memory) >>>>> [2031] [123] fremovexattr(fd=3D382, name=3D0) =3D -1 (Bad address= ) >>>>> [2031] [124] sched_get_priority_min(policy=3D0xff58bfef) =3D -1 (= Invalid argument) >>>>> [2031] [125] mq_timedreceive(mqdes=3D397, u_msg_ptr=3D4, msg_len=3D= 5245, u_msg_prio=3D0xc0100220, u_abs_timeout=3D0xc0100220) =3D -1 (Bad = address) >>>>> [2031] [126] chdir(filename=3D"/proc/116/net/ptype") =3D -1 (Not = a directory) >>>>> [2031] [127] ssetmask(newmask=3D0x88000092) =3D 0 >>>>> [2031] [128] statfs(pathname=3D"/proc/6/mounts", buf=3D0) =3D -1 = (Bad address) >>>>> [2031] [129] fchown16(fd=3D397, user=3D104, group=3D0x94100000) =3D= -1 (Operation not permitted) >>>>> [2031] [130] fchdir(fd=3D397) =3D -1 (Not a directory) >>>>> [2031] [131] 
mkdir(pathname=3D"/proc/1092/task/1092/fdinfo/316", = mode=3D525) =3D -1 (File exists) >>>>> [2031] [132] fsetxattr(fd=3D386, name=3D0x856f158, value=3D0x8571= 160, size=3D0, flags=3D0) =3D -1 (Numerical result out of range) >>>>> [2031] [133] io_setup(nr_events=3D4095, ctxp=3D0x40266000) ^CKill= ed by signal 2. >>>> >>>> Reading your gdb backtraces show that schedule_timeout() got calle= d >>>> with a negative value. >>>> Looks like an integer overflow. >>>> The soft-lockup might also origin from that (very big integer whic= h >>>> did not overflow jet) >>>> >>> >>> If the culprit is solved by this patch I'd like to send it out. But= I'm >>> unsure whether it catches the culprit or if it just covers the root= cause. >> >> I fear your Patch will not fix the issue. >> >> Does the issue only trigger on 32bit UMLs? > No diea, I do only have a 32 bit system here (both host and client). >=20 >> How long does it take till trinity hits it? > a command like >=20 > $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4" >=20 > usually needs 10 till 15 min to trigger the issue. With just 1 trinit= y > task (-C1) however it needs often a hour or more. That's good. :-) You can place some printk()s into balance_dirty_pages() and observe the= values of period, max_pause, min_pause, etc... Maybe this will give us a clue. So far the issue looks not really UML specific. But maybe it is more likely to happen on UML because of the slow page f= aults... Thanks, //richard
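
To illustrate the integer-overflow suspicion mentioned in the thread, here is a minimal userspace sketch. It is not kernel code and is not taken from the thread. It assumes the timeout is a signed 32-bit value (as "long" is on a 32-bit build), and the helper names as_signed_timeout(), timeout_is_negative() and report_pause() are hypothetical. It only shows how a very large unsigned pause value becomes negative once it is interpreted as signed, which is the kind of value a debug printk() of period, max_pause and min_pause might reveal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace illustration only; none of these helpers exist in the kernel.
 * Assumption: the timeout parameter is a signed 32-bit value, as "long"
 * is on a 32-bit build.
 */

/* Reinterpret an unsigned 32-bit pause as a signed 32-bit timeout.
 * Values above INT32_MAX wrap around to negative numbers. */
int32_t as_signed_timeout(uint32_t pause)
{
    if (pause <= (uint32_t)INT32_MAX)
        return (int32_t)pause;
    /* Compute the two's-complement value explicitly instead of relying
     * on an implementation-defined narrowing conversion. */
    return (int32_t)(-1 - (int64_t)(UINT32_MAX - pause));
}

/* Returns 1 if the pause would arrive as a negative timeout, else 0. */
int timeout_is_negative(uint32_t pause)
{
    return as_signed_timeout(pause) < 0;
}

/* Print one value in the spirit of a debug printk() line. */
void report_pause(const char *name, uint32_t pause)
{
    assert(name != NULL);
    printf("%s: unsigned=%lu signed=%ld%s\n",
           name, (unsigned long)pause, (long)as_signed_timeout(pause),
           timeout_is_negative(pause) ? " (negative timeout!)" : "");
}
```

Calling report_pause("pause", 0x7fffffffu) reports a value that is still positive, while report_pause("pause", 0x80000000u) is flagged as negative. That matches the two cases named above: a very big value that has not overflowed yet, and one that has.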