* 3.4. sunrpc oops during shutdown
@ 2012-05-21 17:14 Dave Jones
  2012-05-21 18:03 ` Myklebust, Trond
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Jones @ 2012-05-21 17:14 UTC
  To: bfields; +Cc: linux-nfs, Linux Kernel

Tried to shutdown a machine, got this, and a bunch of hung processes.
There was one NFS mount mounted at the time.

	Dave

BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
PGD 1434c4067 PUD 144964067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU 4
Modules linked in: ip6table_filter(-) ip6_tables nfsd nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6

Pid: 6946, comm: ntpd Not tainted 3.4.0+ #13
RIP: 0010:[<ffffffffa01191df>]  [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
RSP: 0018:ffff880143c65c48  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff880142cd41a0 RCX: 0000000000000006
RDX: 0000000000000040 RSI: ffff880143105028 RDI: ffff880142cd41a0
RBP: ffff880143c65c58 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013bc5a148
R13: ffff880140981658 R14: ffff880142cd41a0 R15: ffff880146c88000
FS:  00007fdc0382a740(0000) GS:ffff880149400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000036cbb000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ntpd (pid: 6946, threadinfo ffff880143c64000, task ffff880143104940)
Stack:
 ffff880140981660 ffff88013bc5a148 ffff880143c65c88 ffffffffa01193a6
 0000000000000000 ffff88013e566020 ffff88013e565f28 ffff880146ee6ac0
 ffff880143c65ca8 ffffffffa024f403 ffff880143c65ca8 ffff880143d3a4f8
Call Trace:
 [<ffffffffa01193a6>] svc_exit_thread+0xa6/0xb0 [sunrpc]
 [<ffffffffa024f403>] nfs_callback_down+0x53/0x90 [nfs]
 [<ffffffffa021642e>] nfs_free_client+0xfe/0x120 [nfs]
 [<ffffffffa02185df>] nfs_put_client+0x29f/0x420 [nfs]
 [<ffffffffa02184e0>] ? nfs_put_client+0x1a0/0x420 [nfs]
 [<ffffffffa021962f>] nfs_free_server+0x16f/0x2e0 [nfs]
 [<ffffffffa02194e3>] ? nfs_free_server+0x23/0x2e0 [nfs]
 [<ffffffffa022363c>] nfs4_kill_super+0x3c/0x50 [nfs]
 [<ffffffff811ad67c>] deactivate_locked_super+0x3c/0xa0
 [<ffffffff811ae29e>] deactivate_super+0x4e/0x70
 [<ffffffff811ccba4>] mntput_no_expire+0xb4/0x100
 [<ffffffff811ccc16>] mntput+0x26/0x40
 [<ffffffff811cd597>] release_mounts+0x77/0x90
 [<ffffffff811cefc6>] put_mnt_ns+0x66/0x80
 [<ffffffff81078dff>] free_nsproxy+0x1f/0xb0
 [<ffffffff8107905e>] switch_task_namespaces+0x5e/0x70
 [<ffffffff81079080>] exit_task_namespaces+0x10/0x20
 [<ffffffff8104e90e>] do_exit+0x4ee/0xb80
 [<ffffffff81639c0a>] ? retint_swapgs+0xe/0x13
 [<ffffffff8104f2ef>] do_group_exit+0x4f/0xc0
 [<ffffffff8104f377>] sys_exit_group+0x17/0x20
 [<ffffffff81641352>] system_call_fastpath+0x16/0x1b
Code: 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 90 55 48 89 e5 41 54 53 66 66 66 66 90 65 48 8b 04 25 80 ba 00 00 48 8b 80 50 05 00 00 48 89 fb <4c> 8b 60 28 8b 47 58 85 c0 0f 84 ec 00 00 00 83 e8 01 85 c0 89
* Re: 3.4. sunrpc oops during shutdown
  2012-05-21 17:14 3.4. sunrpc oops during shutdown Dave Jones
@ 2012-05-21 18:03 ` Myklebust, Trond
  2012-05-21 21:34   ` bfields
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Myklebust, Trond @ 2012-05-21 18:03 UTC
  To: Dave Jones; +Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> Tried to shutdown a machine, got this, and a bunch of hung processes.
> There was one NFS mount mounted at the time.
>
> 	Dave
>
> [...]

Aside from the fact that the current net_namespace is not guaranteed to
exist when we are called from free_nsproxy, svc_destroy() looks
seriously broken:

      * On the one hand it is trying to free struct svc_serv (and
        presumably all structures owned by struct svc_serv).
      * On the other hand, it tries to pass a parameter to
        svc_close_net() saying "please don't free structures on my
        sv_tempsocks, or sv_permsocks list unless they match this net
        namespace".

Bruce, how is this supposed to be working?

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
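One way to read the oops against Trond's first remark: the faulting bytes in the Code: line, `<4c> 8b 60 28`, decode to `mov 0x28(%rax),%r12`, RAX is 0 in the register dump, and CR2 is 0x28. That is what a load of `nsproxy->net_ns` through a NULL `nsproxy` pointer would look like, assuming svc_destroy() reads the net namespace from `current->nsproxy` and assuming the field order below matches struct nsproxy in 3.4. Both are assumptions, not something confirmed in this thread; `struct nsproxy_sketch` and `net_ns_load_address()` are hypothetical names for a user-space illustration.

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins for the kernel namespace types; only pointers are stored. */
struct uts_namespace;
struct ipc_namespace;
struct mnt_namespace;
struct pid_namespace;
struct net;

/* ASSUMPTION: same field order as struct nsproxy in a 3.4-era kernel. */
struct nsproxy_sketch {
	int count;                      /* atomic_t in the kernel */
	struct uts_namespace *uts_ns;
	struct ipc_namespace *ipc_ns;
	struct mnt_namespace *mnt_ns;
	struct pid_namespace *pid_ns;
	struct net *net_ns;
};

/*
 * Address that a load of nsproxy->net_ns touches for a given base pointer.
 * With a NULL base this is simply the offset of the field.
 */
static uintptr_t net_ns_load_address(uintptr_t nsproxy_base)
{
	return nsproxy_base + offsetof(struct nsproxy_sketch, net_ns);
}
```

On an LP64 target, `net_ns_load_address(0)` evaluates to 0x28, the same value as the fault address reported in the BUG: line.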
* Re: 3.4. sunrpc oops during shutdown
  2012-05-21 18:03 ` Myklebust, Trond
@ 2012-05-21 21:34   ` bfields
  2012-05-24 15:55   ` bfields
  2012-05-25 8:12   ` Stanislav Kinsbursky
  2 siblings, 0 replies; 12+ messages in thread
From: bfields @ 2012-05-21 21:34 UTC
  To: Myklebust, Trond; +Cc: Dave Jones, linux-nfs@vger.kernel.org, Linux Kernel

On Mon, May 21, 2012 at 06:03:43PM +0000, Myklebust, Trond wrote:
> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> > Tried to shutdown a machine, got this, and a bunch of hung processes.
> > There was one NFS mount mounted at the time.
> >
> > 	Dave
> >
> > [...]
>
> Aside from the fact that the current net_namespace is not guaranteed to
> exist when we are called from free_nsproxy, svc_destroy() looks
> seriously broken:
>
>       * On the one hand it is trying to free struct svc_serv (and
>         presumably all structures owned by struct svc_serv).
>       * On the other hand, it tries to pass a parameter to
>         svc_close_net() saying "please don't free structures on my
>         sv_tempsocks, or sv_permsocks list unless they match this net
>         namespace".
>
> Bruce, how is this supposed to be working?

I'm not sure, I'll try to take a look tomorrow....

I notice Stanislav has posted a "[PATCH] NFSd: set nfsd_serv to NULL
after service destruction", but I haven't reviewed it yet.

--b.
* Re: 3.4. sunrpc oops during shutdown
  2012-05-21 18:03 ` Myklebust, Trond
  2012-05-21 21:34   ` bfields
@ 2012-05-24 15:55   ` bfields
  2012-05-24 19:20     ` Myklebust, Trond
  2012-05-25 8:12   ` Stanislav Kinsbursky
  2 siblings, 1 reply; 12+ messages in thread
From: bfields @ 2012-05-24 15:55 UTC
  To: Myklebust, Trond; +Cc: Dave Jones, linux-nfs@vger.kernel.org, Linux Kernel

On Mon, May 21, 2012 at 06:03:43PM +0000, Myklebust, Trond wrote:
> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> > Tried to shutdown a machine, got this, and a bunch of hung processes.
> > There was one NFS mount mounted at the time.
> >
> > 	Dave
> >
> > [...]
>
> Aside from the fact that the current net_namespace is not guaranteed to
> exist when we are called from free_nsproxy, svc_destroy() looks
> seriously broken:
>
>       * On the one hand it is trying to free struct svc_serv (and
>         presumably all structures owned by struct svc_serv).
>       * On the other hand, it tries to pass a parameter to
>         svc_close_net() saying "please don't free structures on my
>         sv_tempsocks, or sv_permsocks list unless they match this net
>         namespace".
>
> Bruce, how is this supposed to be working?

Yeah, I don't know.

For the nfs callback case, it looks like you've just got a single
callback service shared across all namespaces, and all you want to do
is destroy that whole thing on last put; or is it more complicated than
that?

For the other servers at least the per-net and global parts of the
server seem too entangled.

That's unavoidable to some degree since we're sharing threads among the
namespaces.  But maybe separate structures for the per-namespace and
global pieces would help.

At a minimum the per-namespace piece would keep a count of the users in
that namespace.

To make the shutdown race-free I think we also need a way to wait for
all threads processing requests in that namespace, which I don't see
that we have yet.

--b.
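The split Bruce describes (a global piece shared by the threads, plus a per-namespace piece that counts its users) could be sketched roughly as follows. This is a single-threaded user-space illustration with no locking, not kernel code: `struct serv_global`, `struct serv_net`, `struct net_sketch`, `serv_net_get()` and `serv_net_put()` are all hypothetical names, and the "wait for all threads processing requests in that namespace" step is reduced to an assertion that nothing is in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NETS 4

/* Hypothetical stand-in for a network namespace. */
struct net_sketch { int id; };

/* Per-namespace piece: tracks who uses the service in this namespace. */
struct serv_net {
	const struct net_sketch *net;  /* NULL means the slot is free */
	unsigned int users;            /* users (e.g. mounts) in this namespace */
	unsigned int in_flight;        /* requests currently being processed */
	bool sockets_open;
};

/* Global piece: shared across namespaces, like the service threads. */
struct serv_global {
	struct serv_net nets[MAX_NETS];
	unsigned int total_users;
};

static struct serv_net *serv_net_find(struct serv_global *g,
				      const struct net_sketch *net)
{
	for (size_t i = 0; i < MAX_NETS; i++)
		if (g->nets[i].net == net)
			return &g->nets[i];
	return NULL;
}

/* Take a reference on the per-namespace piece, creating it on first use. */
static struct serv_net *serv_net_get(struct serv_global *g,
				     const struct net_sketch *net)
{
	struct serv_net *sn = serv_net_find(g, net);

	if (!sn) {
		sn = serv_net_find(g, NULL);   /* first free slot, if any */
		if (!sn)
			return NULL;
		sn->net = net;
		sn->sockets_open = true;
	}
	sn->users++;
	g->total_users++;
	return sn;
}

/* Drop a reference; returns true if this namespace's piece was torn down. */
static bool serv_net_put(struct serv_global *g, struct serv_net *sn)
{
	assert(sn->users > 0);
	sn->users--;
	g->total_users--;
	if (sn->users > 0)
		return false;
	/* A real implementation would wait here; the sketch only checks. */
	assert(sn->in_flight == 0);
	sn->sockets_open = false;
	sn->net = NULL;
	return true;
}
```

In this arrangement the last put in one namespace closes only that namespace's sockets, and the global piece can be destroyed once `total_users` drops to zero.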
* Re: 3.4. sunrpc oops during shutdown
  2012-05-24 15:55   ` bfields
@ 2012-05-24 19:20     ` Myklebust, Trond
  2012-05-24 20:27       ` bfields
  0 siblings, 1 reply; 12+ messages in thread
From: Myklebust, Trond @ 2012-05-24 19:20 UTC
  To: bfields@fieldses.org; +Cc: Dave Jones, linux-nfs@vger.kernel.org, Linux Kernel

On Thu, 2012-05-24 at 11:55 -0400, bfields@fieldses.org wrote:
> On Mon, May 21, 2012 at 06:03:43PM +0000, Myklebust, Trond wrote:
> > On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> > > Tried to shutdown a machine, got this, and a bunch of hung processes.
> > > There was one NFS mount mounted at the time.
> > >
> > > 	Dave
> > >
> > > [...]
> >
> > Aside from the fact that the current net_namespace is not guaranteed to
> > exist when we are called from free_nsproxy, svc_destroy() looks
> > seriously broken:
> >
> >       * On the one hand it is trying to free struct svc_serv (and
> >         presumably all structures owned by struct svc_serv).
> >       * On the other hand, it tries to pass a parameter to
> >         svc_close_net() saying "please don't free structures on my
> >         sv_tempsocks, or sv_permsocks list unless they match this net
> >         namespace".
> >
> > Bruce, how is this supposed to be working?
>
> Yeah, I don't know.
>
> For the nfs callback case, it looks like you've just got a single
> callback service shared across all namespaces, and all you want to do
> is destroy that whole thing on last put; or is it more complicated than
> that?

For NFSv4, I need to create sockets for the same net namespace as the
struct nfs_client is running in. When all the struct nfs_clients on that
net namespace are destroyed, I would ideally get rid of those sockets.

For NFSv4.1, all I want to do is create a back channel using the same
socket as the struct nfs_client.

> For the other servers at least the per-net and global parts of the
> server seem too entangled.
>
> That's unavoidable to some degree since we're sharing threads among the
> namespaces.  But maybe separate structures for the per-namespace and
> global pieces would help.
>
> At a minimum the per-namespace piece would keep a count of the users in
> that namespace.
>
> To make the shutdown race-free I think we also need a way to wait for
> all threads processing requests in that namespace, which I don't see
> that we have yet.


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: 3.4. sunrpc oops during shutdown
  2012-05-24 19:20     ` Myklebust, Trond
@ 2012-05-24 20:27       ` bfields
  0 siblings, 0 replies; 12+ messages in thread
From: bfields @ 2012-05-24 20:27 UTC
  To: Myklebust, Trond
  Cc: Dave Jones, linux-nfs@vger.kernel.org, Linux Kernel, Stanislav Kinsbursky

On Thu, May 24, 2012 at 07:20:41PM +0000, Myklebust, Trond wrote:
> On Thu, 2012-05-24 at 11:55 -0400, bfields@fieldses.org wrote:
> > On Mon, May 21, 2012 at 06:03:43PM +0000, Myklebust, Trond wrote:
> > > On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> > > > Tried to shutdown a machine, got this, and a bunch of hung processes.
> > > > There was one NFS mount mounted at the time.
> > > >
> > > > 	Dave
> > > >
> > > > [...]
> > >
> > > Aside from the fact that the current net_namespace is not guaranteed to
> > > exist when we are called from free_nsproxy, svc_destroy() looks
> > > seriously broken:
> > >
> > >       * On the one hand it is trying to free struct svc_serv (and
> > >         presumably all structures owned by struct svc_serv).
> > >       * On the other hand, it tries to pass a parameter to
> > >         svc_close_net() saying "please don't free structures on my
> > >         sv_tempsocks, or sv_permsocks list unless they match this net
> > >         namespace".
> > >
> > > Bruce, how is this supposed to be working?
> >
> > Yeah, I don't know.
> >
> > For the nfs callback case, it looks like you've just got a single
> > callback service shared across all namespaces, and all you want to do
> > is destroy that whole thing on last put; or is it more complicated than
> > that?
>
> For NFSv4, I need to create sockets for the same net namespace as the
> struct nfs_client is running in. When all the struct nfs_clients on that
> net namespace are destroyed, I would ideally get rid of those sockets.
>
> For NFSv4.1, all I want to do is create a back channel using the same
> socket as the struct nfs_client.

Thanks, makes sense.

Uh, I meant to cc: Stanislav on that last reply but didn't somehow.

--b.

>
> > For the other servers at least the per-net and global parts of the
> > server seem too entangled.
> >
> > That's unavoidable to some degree since we're sharing threads among the
> > namespaces.  But maybe separate structures for the per-namespace and
> > global pieces would help.
> >
> > At a minimum the per-namespace piece would keep a count of the users in
> > that namespace.
> >
> > To make the shutdown race-free I think we also need a way to wait for
> > all threads processing requests in that namespace, which I don't see
> > that we have yet.
>
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
>
* Re: 3.4. sunrpc oops during shutdown
  2012-05-21 18:03 ` Myklebust, Trond
  2012-05-21 21:34   ` bfields
  2012-05-24 15:55   ` bfields
@ 2012-05-25 8:12   ` Stanislav Kinsbursky
  2012-05-25 13:07     ` Myklebust, Trond
  2 siblings, 1 reply; 12+ messages in thread
From: Stanislav Kinsbursky @ 2012-05-25 8:12 UTC
  To: Myklebust, Trond
  Cc: Dave Jones, bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On 21.05.2012 22:03, Myklebust, Trond wrote:
> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
>> Tried to shutdown a machine, got this, and a bunch of hung processes.
>> There was one NFS mount mounted at the time.
>>
>> 	Dave
>>
>> [...]
>
> Aside from the fact that the current net_namespace is not guaranteed to
> exist when we are called from free_nsproxy, svc_destroy() looks
> seriously broken:

Trond, it looks like you are mistaken here.
Any process holds references to all namespaces it belongs to
(copy_net_ns() increases the usage counter).
And the network namespace is released after the mount namespace in
free_nsproxy.

>
>        * On the one hand it is trying to free struct svc_serv (and
>          presumably all structures owned by struct svc_serv).
>        * On the other hand, it tries to pass a parameter to
>          svc_close_net() saying "please don't free structures on my
>          sv_tempsocks, or sv_permsocks list unless they match this net
>          namespace".
>

I've sent patches which move svc_shutdown_net() out of svc_destroy()
("SUNRPC: separate per-net data creation from service").
With this patch set it's assumed that per-net resources will be created
or released prior to service creation and destruction.

> Bruce, how is this supposed to be working?
>
> Cheers
>    Trond

-- 
Best regards,
Stanislav Kinsbursky
* Re: 3.4. sunrpc oops during shutdown
  2012-05-25 8:12 ` Stanislav Kinsbursky
@ 2012-05-25 13:07   ` Myklebust, Trond
  2012-05-25 13:31     ` Stanislav Kinsbursky
  0 siblings, 1 reply; 12+ messages in thread
From: Myklebust, Trond @ 2012-05-25 13:07 UTC
To: Stanislav Kinsbursky
Cc: Dave Jones, bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On Fri, 2012-05-25 at 12:12 +0400, Stanislav Kinsbursky wrote:
> On 21.05.2012 22:03, Myklebust, Trond wrote:
> > On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> >> Tried to shutdown a machine, got this, and a bunch of hung processes.
> >> There was one NFS mount mounted at the time.
> >>
> >>     Dave
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> >> IP: [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
> >> PGD 1434c4067 PUD 144964067 PMD 0
> >> Oops: 0000 [#1] PREEMPT SMP
> >> CPU 4
> >> Modules linked in: ip6table_filter(-) ip6_tables nfsd nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> >>
> >> Pid: 6946, comm: ntpd Not tainted 3.4.0+ #13
> >> RIP: 0010:[<ffffffffa01191df>] [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
> >> RSP: 0018:ffff880143c65c48 EFLAGS: 00010286
> >> RAX: 0000000000000000 RBX: ffff880142cd41a0 RCX: 0000000000000006
> >> RDX: 0000000000000040 RSI: ffff880143105028 RDI: ffff880142cd41a0
> >> RBP: ffff880143c65c58 R08: 0000000000000000 R09: 0000000000000001
> >> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013bc5a148
> >> R13: ffff880140981658 R14: ffff880142cd41a0 R15: ffff880146c88000
> >> FS: 00007fdc0382a740(0000) GS:ffff880149400000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 0000000000000028 CR3: 0000000036cbb000 CR4: 00000000001407e0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process ntpd (pid: 6946, threadinfo ffff880143c64000, task ffff880143104940)
> >> Stack:
> >>   ffff880140981660 ffff88013bc5a148 ffff880143c65c88 ffffffffa01193a6
> >>   0000000000000000 ffff88013e566020 ffff88013e565f28 ffff880146ee6ac0
> >>   ffff880143c65ca8 ffffffffa024f403 ffff880143c65ca8 ffff880143d3a4f8
> >> Call Trace:
> >>   [<ffffffffa01193a6>] svc_exit_thread+0xa6/0xb0 [sunrpc]
> >>   [<ffffffffa024f403>] nfs_callback_down+0x53/0x90 [nfs]
> >>   [<ffffffffa021642e>] nfs_free_client+0xfe/0x120 [nfs]
> >>   [<ffffffffa02185df>] nfs_put_client+0x29f/0x420 [nfs]
> >>   [<ffffffffa02184e0>] ? nfs_put_client+0x1a0/0x420 [nfs]
> >>   [<ffffffffa021962f>] nfs_free_server+0x16f/0x2e0 [nfs]
> >>   [<ffffffffa02194e3>] ? nfs_free_server+0x23/0x2e0 [nfs]
> >>   [<ffffffffa022363c>] nfs4_kill_super+0x3c/0x50 [nfs]
> >>   [<ffffffff811ad67c>] deactivate_locked_super+0x3c/0xa0
> >>   [<ffffffff811ae29e>] deactivate_super+0x4e/0x70
> >>   [<ffffffff811ccba4>] mntput_no_expire+0xb4/0x100
> >>   [<ffffffff811ccc16>] mntput+0x26/0x40
> >>   [<ffffffff811cd597>] release_mounts+0x77/0x90
> >>   [<ffffffff811cefc6>] put_mnt_ns+0x66/0x80
> >>   [<ffffffff81078dff>] free_nsproxy+0x1f/0xb0
> >>   [<ffffffff8107905e>] switch_task_namespaces+0x5e/0x70
> >>   [<ffffffff81079080>] exit_task_namespaces+0x10/0x20
> >>   [<ffffffff8104e90e>] do_exit+0x4ee/0xb80
> >>   [<ffffffff81639c0a>] ? retint_swapgs+0xe/0x13
> >>   [<ffffffff8104f2ef>] do_group_exit+0x4f/0xc0
> >>   [<ffffffff8104f377>] sys_exit_group+0x17/0x20
> >>   [<ffffffff81641352>] system_call_fastpath+0x16/0x1b
> >> Code: 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 90 55 48 89 e5 41 54 53 66 66 66 66 90 65 48 8b 04 25 80 ba 00 00 48 8b 80 50 05 00 00 48 89 fb<4c> 8b 60 28 8b 47 58 85 c0 0f 84 ec 00 00 00 83 e8 01 85 c0 89
> >
> > Aside from the fact that the current net_namespace is not guaranteed to
> > exist when we are called from free_nsproxy, svc_destroy() looks
> > seriously broken:
>
> Trond, looks like you are mistaken here.
> Any process holds references to all namespaces it belong to (copy_net_ns()
> increase usage counter). And network namespace is released after mount namespace
> in free_nsproxy.

That doesn't help you though. switch_task_namespaces will have already
set current->nsproxy to NULL, which is why we Oops when we try to read
current->nsproxy->net_ns in svc_exit_thread().

> >
> >        * On the one hand it is trying to free struct svc_serv (and
> >          presumably all structures owned by struct svc_serv).
> >        * On the other hand, it tries to pass a parameter to
> >          svc_close_net() saying "please don't free structures on my
> >          sv_tempsocks, or sv_permsocks list unless they match this net
> >          namespace".
> >
>
> I've sent patches, which moves svc_shutdown_net() from svc_destroy() ("SUNRPC:
> separate per-net data creation from service").
> with this patch set it's assumed, that per-net resources will be created or
> released prior to service creation and destruction.

Are those patches appropriate for inclusion in the stable kernel series
so that we can fix 3.4?

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: 3.4. sunrpc oops during shutdown
  2012-05-25 13:07 ` Myklebust, Trond
@ 2012-05-25 13:31   ` Stanislav Kinsbursky
  2012-05-28 23:43     ` Myklebust, Trond
  0 siblings, 1 reply; 12+ messages in thread
From: Stanislav Kinsbursky @ 2012-05-25 13:31 UTC
To: Myklebust, Trond
Cc: Dave Jones, bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On 25.05.2012 17:07, Myklebust, Trond wrote:
> On Fri, 2012-05-25 at 12:12 +0400, Stanislav Kinsbursky wrote:
>> On 21.05.2012 22:03, Myklebust, Trond wrote:
>>> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
>>>> Tried to shutdown a machine, got this, and a bunch of hung processes.
>>>> There was one NFS mount mounted at the time.
>>>>
>>>>     Dave
>>>>
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
>>>> IP: [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
>>>> PGD 1434c4067 PUD 144964067 PMD 0
>>>> Oops: 0000 [#1] PREEMPT SMP
>>>> CPU 4
>>>> Modules linked in: ip6table_filter(-) ip6_tables nfsd nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>>>>
>>>> Pid: 6946, comm: ntpd Not tainted 3.4.0+ #13
>>>> RIP: 0010:[<ffffffffa01191df>] [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
>>>> RSP: 0018:ffff880143c65c48 EFLAGS: 00010286
>>>> RAX: 0000000000000000 RBX: ffff880142cd41a0 RCX: 0000000000000006
>>>> RDX: 0000000000000040 RSI: ffff880143105028 RDI: ffff880142cd41a0
>>>> RBP: ffff880143c65c58 R08: 0000000000000000 R09: 0000000000000001
>>>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013bc5a148
>>>> R13: ffff880140981658 R14: ffff880142cd41a0 R15: ffff880146c88000
>>>> FS: 00007fdc0382a740(0000) GS:ffff880149400000(0000) knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000000000000028 CR3: 0000000036cbb000 CR4: 00000000001407e0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process ntpd (pid: 6946, threadinfo ffff880143c64000, task ffff880143104940)
>>>> Stack:
>>>>  ffff880140981660 ffff88013bc5a148 ffff880143c65c88 ffffffffa01193a6
>>>>  0000000000000000 ffff88013e566020 ffff88013e565f28 ffff880146ee6ac0
>>>>  ffff880143c65ca8 ffffffffa024f403 ffff880143c65ca8 ffff880143d3a4f8
>>>> Call Trace:
>>>>  [<ffffffffa01193a6>] svc_exit_thread+0xa6/0xb0 [sunrpc]
>>>>  [<ffffffffa024f403>] nfs_callback_down+0x53/0x90 [nfs]
>>>>  [<ffffffffa021642e>] nfs_free_client+0xfe/0x120 [nfs]
>>>>  [<ffffffffa02185df>] nfs_put_client+0x29f/0x420 [nfs]
>>>>  [<ffffffffa02184e0>] ? nfs_put_client+0x1a0/0x420 [nfs]
>>>>  [<ffffffffa021962f>] nfs_free_server+0x16f/0x2e0 [nfs]
>>>>  [<ffffffffa02194e3>] ? nfs_free_server+0x23/0x2e0 [nfs]
>>>>  [<ffffffffa022363c>] nfs4_kill_super+0x3c/0x50 [nfs]
>>>>  [<ffffffff811ad67c>] deactivate_locked_super+0x3c/0xa0
>>>>  [<ffffffff811ae29e>] deactivate_super+0x4e/0x70
>>>>  [<ffffffff811ccba4>] mntput_no_expire+0xb4/0x100
>>>>  [<ffffffff811ccc16>] mntput+0x26/0x40
>>>>  [<ffffffff811cd597>] release_mounts+0x77/0x90
>>>>  [<ffffffff811cefc6>] put_mnt_ns+0x66/0x80
>>>>  [<ffffffff81078dff>] free_nsproxy+0x1f/0xb0
>>>>  [<ffffffff8107905e>] switch_task_namespaces+0x5e/0x70
>>>>  [<ffffffff81079080>] exit_task_namespaces+0x10/0x20
>>>>  [<ffffffff8104e90e>] do_exit+0x4ee/0xb80
>>>>  [<ffffffff81639c0a>] ? retint_swapgs+0xe/0x13
>>>>  [<ffffffff8104f2ef>] do_group_exit+0x4f/0xc0
>>>>  [<ffffffff8104f377>] sys_exit_group+0x17/0x20
>>>>  [<ffffffff81641352>] system_call_fastpath+0x16/0x1b
>>>> Code: 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 90 55 48 89 e5 41 54 53 66 66 66 66 90 65 48 8b 04 25 80 ba 00 00 48 8b 80 50 05 00 00 48 89 fb<4c> 8b 60 28 8b 47 58 85 c0 0f 84 ec 00 00 00 83 e8 01 85 c0 89
>>>
>>> Aside from the fact that the current net_namespace is not guaranteed to
>>> exist when we are called from free_nsproxy, svc_destroy() looks
>>> seriously broken:
>>
>> Trond, looks like you are mistaken here.
>> Any process holds references to all namespaces it belong to (copy_net_ns()
>> increase usage counter). And network namespace is released after mount namespace
>> in free_nsproxy.
>
> That doesn't help you though. switch_task_namespaces will have already
> set current->nsproxy to NULL, which is why we Oops when we try to read
> current->nsproxy->net_ns in svc_exit_thread().
>
>>>
>>>        * On the one hand it is trying to free struct svc_serv (and
>>>          presumably all structures owned by struct svc_serv).
>>>        * On the other hand, it tries to pass a parameter to
>>>          svc_close_net() saying "please don't free structures on my
>>>          sv_tempsocks, or sv_permsocks list unless they match this net
>>>          namespace".
>>>
>>
>> I've sent patches, which moves svc_shutdown_net() from svc_destroy() ("SUNRPC:
>> separate per-net data creation from service").
>> with this patch set it's assumed, that per-net resources will be created or
>> released prior to service creation and destruction.
>
> Are those patches appropriate for inclusion in the stable kernel series
> so that we can fix 3.4?
>

Yes. But unfortunately, this won't be enough.
The "NFS: callback threads containerization" patch set is required as well.

As a bugfix, I can suggest the "SUNRPC: separate per-net data creation from
service" patch set plus passing a hard-coded "init_net" to the NFS callback
shutdown routines (instead of current->nsproxy->net_ns). This should work.

-- 
Best regards,
Stanislav Kinsbursky
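The ordering problem discussed above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not kernel code: the *_model types, init_net_model and exit_namespaces_model() are hypothetical stand-ins for the kernel structures and for the exit path seen in the trace. It assumes only what the thread states, namely that the task's nsproxy pointer is cleared before the old nsproxy is torn down, so a lookup through the task during unmount finds NULL (consistent with the reported fault at address 0x28 with RAX equal to zero), whereas a namespace passed in explicitly, such as the suggested hard-coded init_net, does not depend on the task at all.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; they only mimic the shape of the kernel structures. */
struct net_model     { int id; };
struct nsproxy_model { struct net_model *net_ns; };
struct task_model    { struct nsproxy_model *nsproxy; };

/* Hypothetical stand-in for the kernel's init_net. */
struct net_model init_net_model = { 0 };

/*
 * Lookup through the task, as the callback shutdown path does in the trace.
 * The kernel would dereference a NULL nsproxy here; the model returns NULL
 * instead so the failure can be observed without crashing.
 */
struct net_model *net_via_task(const struct task_model *t)
{
    if (t->nsproxy == NULL)
        return NULL;
    return t->nsproxy->net_ns;
}

/* Lookup that ignores the task and uses a namespace chosen by the caller. */
struct net_model *net_via_argument(struct net_model *net)
{
    return net;
}

/*
 * Model of the exit ordering: the task's nsproxy pointer is cleared first,
 * and only then is the old nsproxy torn down, which is when the unmount
 * (and therefore the lookup) runs. Returns the namespace the lookup saw.
 */
struct net_model *exit_namespaces_model(struct task_model *t, int use_task_lookup)
{
    struct nsproxy_model *old = t->nsproxy;

    assert(old != NULL);
    (void)old;              /* old->net_ns is still referenced at this point */
    t->nsproxy = NULL;      /* step 1: detach the namespaces from the task */

    /* step 2: teardown of the old nsproxy runs the lookup */
    return use_task_lookup ? net_via_task(t)
                           : net_via_argument(&init_net_model);
}
```

In this model the first variant returns NULL after the detach step, which is where the kernel would oops, while the second returns the fixed namespace regardless of the task's state. The cost of that stopgap is that it is only correct while the callback service lives in the initial namespace.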
* Re: 3.4. sunrpc oops during shutdown
  2012-05-25 13:31 ` Stanislav Kinsbursky
@ 2012-05-28 23:43   ` Myklebust, Trond
  2012-05-29 8:48     ` Stanislav Kinsbursky
  2012-05-29 11:21     ` bfields
  0 siblings, 2 replies; 12+ messages in thread
From: Myklebust, Trond @ 2012-05-28 23:43 UTC
To: Stanislav Kinsbursky
Cc: Dave Jones, bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On Fri, 2012-05-25 at 17:31 +0400, Stanislav Kinsbursky wrote:
> On 25.05.2012 17:07, Myklebust, Trond wrote:
> > On Fri, 2012-05-25 at 12:12 +0400, Stanislav Kinsbursky wrote:
> >> On 21.05.2012 22:03, Myklebust, Trond wrote:
> >>> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
> >>>> Tried to shutdown a machine, got this, and a bunch of hung processes.
> >>>> There was one NFS mount mounted at the time.
> >>>>
> >>>>     Dave
> >>>>
> >>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> >>>> IP: [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
> >>>> PGD 1434c4067 PUD 144964067 PMD 0
> >>>> Oops: 0000 [#1] PREEMPT SMP
> >>>> CPU 4
> >>>> Modules linked in: ip6table_filter(-) ip6_tables nfsd nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> >>>>
> >>>> Pid: 6946, comm: ntpd Not tainted 3.4.0+ #13
> >>>> RIP: 0010:[<ffffffffa01191df>] [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
> >>>> RSP: 0018:ffff880143c65c48 EFLAGS: 00010286
> >>>> RAX: 0000000000000000 RBX: ffff880142cd41a0 RCX: 0000000000000006
> >>>> RDX: 0000000000000040 RSI: ffff880143105028 RDI: ffff880142cd41a0
> >>>> RBP: ffff880143c65c58 R08: 0000000000000000 R09: 0000000000000001
> >>>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013bc5a148
> >>>> R13: ffff880140981658 R14: ffff880142cd41a0 R15: ffff880146c88000
> >>>> FS: 00007fdc0382a740(0000) GS:ffff880149400000(0000) knlGS:0000000000000000
> >>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> CR2: 0000000000000028 CR3: 0000000036cbb000 CR4: 00000000001407e0
> >>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>> Process ntpd (pid: 6946, threadinfo ffff880143c64000, task ffff880143104940)
> >>>> Stack:
> >>>>    ffff880140981660 ffff88013bc5a148 ffff880143c65c88 ffffffffa01193a6
> >>>>    0000000000000000 ffff88013e566020 ffff88013e565f28 ffff880146ee6ac0
> >>>>    ffff880143c65ca8 ffffffffa024f403 ffff880143c65ca8 ffff880143d3a4f8
> >>>> Call Trace:
> >>>>    [<ffffffffa01193a6>] svc_exit_thread+0xa6/0xb0 [sunrpc]
> >>>>    [<ffffffffa024f403>] nfs_callback_down+0x53/0x90 [nfs]
> >>>>    [<ffffffffa021642e>] nfs_free_client+0xfe/0x120 [nfs]
> >>>>    [<ffffffffa02185df>] nfs_put_client+0x29f/0x420 [nfs]
> >>>>    [<ffffffffa02184e0>] ? nfs_put_client+0x1a0/0x420 [nfs]
> >>>>    [<ffffffffa021962f>] nfs_free_server+0x16f/0x2e0 [nfs]
> >>>>    [<ffffffffa02194e3>] ? nfs_free_server+0x23/0x2e0 [nfs]
> >>>>    [<ffffffffa022363c>] nfs4_kill_super+0x3c/0x50 [nfs]
> >>>>    [<ffffffff811ad67c>] deactivate_locked_super+0x3c/0xa0
> >>>>    [<ffffffff811ae29e>] deactivate_super+0x4e/0x70
> >>>>    [<ffffffff811ccba4>] mntput_no_expire+0xb4/0x100
> >>>>    [<ffffffff811ccc16>] mntput+0x26/0x40
> >>>>    [<ffffffff811cd597>] release_mounts+0x77/0x90
> >>>>    [<ffffffff811cefc6>] put_mnt_ns+0x66/0x80
> >>>>    [<ffffffff81078dff>] free_nsproxy+0x1f/0xb0
> >>>>    [<ffffffff8107905e>] switch_task_namespaces+0x5e/0x70
> >>>>    [<ffffffff81079080>] exit_task_namespaces+0x10/0x20
> >>>>    [<ffffffff8104e90e>] do_exit+0x4ee/0xb80
> >>>>    [<ffffffff81639c0a>] ? retint_swapgs+0xe/0x13
> >>>>    [<ffffffff8104f2ef>] do_group_exit+0x4f/0xc0
> >>>>    [<ffffffff8104f377>] sys_exit_group+0x17/0x20
> >>>>    [<ffffffff81641352>] system_call_fastpath+0x16/0x1b
> >>>> Code: 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 90 55 48 89 e5 41 54 53 66 66 66 66 90 65 48 8b 04 25 80 ba 00 00 48 8b 80 50 05 00 00 48 89 fb<4c> 8b 60 28 8b 47 58 85 c0 0f 84 ec 00 00 00 83 e8 01 85 c0 89
> >>>
> >>> Aside from the fact that the current net_namespace is not guaranteed to
> >>> exist when we are called from free_nsproxy, svc_destroy() looks
> >>> seriously broken:
> >>
> >> Trond, looks like you are mistaken here.
> >> Any process holds references to all namespaces it belong to (copy_net_ns()
> >> increase usage counter). And network namespace is released after mount namespace
> >> in free_nsproxy.
> >
> > That doesn't help you though. switch_task_namespaces will have already
> > set current->nsproxy to NULL, which is why we Oops when we try to read
> > current->nsproxy->net_ns in svc_exit_thread().
> >
> >>>
> >>>         * On the one hand it is trying to free struct svc_serv (and
> >>>           presumably all structures owned by struct svc_serv).
> >>>         * On the other hand, it tries to pass a parameter to
> >>>           svc_close_net() saying "please don't free structures on my
> >>>           sv_tempsocks, or sv_permsocks list unless they match this net
> >>>           namespace".
> >>>
> >>
> >> I've sent patches, which moves svc_shutdown_net() from svc_destroy() ("SUNRPC:
> >> separate per-net data creation from service").
> >> with this patch set it's assumed, that per-net resources will be created or
> >> released prior to service creation and destruction.
> >
> > Are those patches appropriate for inclusion in the stable kernel series
> > so that we can fix 3.4?
> >
>
> Yes. But unfortunately, this won't be enough.
> "NFS: callback threads containerization" patch set is required as well.
>
> A a bugfix, I can suggest "SUNRPC: separate per-net data creation from service"
> patch set + pass hard-coded "init_net" for NFS callback shutdown routines
> (instead of current->nsproxy->net_ns). This should work.

Hi Stanislav,

My question is why should svc_destroy() care about net namespaces at
all? Once an application is calling svc_destroy(), it is trying to close
down the entire service. It really should not matter to which net
namespace a particular socket belongs: they _all_ need to be destroyed.

Cheers,
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: 3.4. sunrpc oops during shutdown
  2012-05-28 23:43 ` Myklebust, Trond
@ 2012-05-29 8:48   ` Stanislav Kinsbursky
  2012-05-29 11:21   ` bfields
  1 sibling, 0 replies; 12+ messages in thread
From: Stanislav Kinsbursky @ 2012-05-29 8:48 UTC
To: Myklebust, Trond
Cc: Dave Jones, bfields@fieldses.org, linux-nfs@vger.kernel.org, Linux Kernel

On 29.05.2012 03:43, Myklebust, Trond wrote:
> On Fri, 2012-05-25 at 17:31 +0400, Stanislav Kinsbursky wrote:
>> On 25.05.2012 17:07, Myklebust, Trond wrote:
>>> On Fri, 2012-05-25 at 12:12 +0400, Stanislav Kinsbursky wrote:
>>>> On 21.05.2012 22:03, Myklebust, Trond wrote:
>>>>> On Mon, 2012-05-21 at 13:14 -0400, Dave Jones wrote:
>>>>>> Tried to shutdown a machine, got this, and a bunch of hung processes.
>>>>>> There was one NFS mount mounted at the time.
>>>>>>
>>>>>>     Dave
>>>>>>
>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
>>>>>> IP: [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
>>>>>> PGD 1434c4067 PUD 144964067 PMD 0
>>>>>> Oops: 0000 [#1] PREEMPT SMP
>>>>>> CPU 4
>>>>>> Modules linked in: ip6table_filter(-) ip6_tables nfsd nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>>>>>>
>>>>>> Pid: 6946, comm: ntpd Not tainted 3.4.0+ #13
>>>>>> RIP: 0010:[<ffffffffa01191df>] [<ffffffffa01191df>] svc_destroy+0x1f/0x140 [sunrpc]
>>>>>> RSP: 0018:ffff880143c65c48 EFLAGS: 00010286
>>>>>> RAX: 0000000000000000 RBX: ffff880142cd41a0 RCX: 0000000000000006
>>>>>> RDX: 0000000000000040 RSI: ffff880143105028 RDI: ffff880142cd41a0
>>>>>> RBP: ffff880143c65c58 R08: 0000000000000000 R09: 0000000000000001
>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013bc5a148
>>>>>> R13: ffff880140981658 R14: ffff880142cd41a0 R15: ffff880146c88000
>>>>>> FS: 00007fdc0382a740(0000) GS:ffff880149400000(0000) knlGS:0000000000000000
>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> CR2: 0000000000000028 CR3: 0000000036cbb000 CR4: 00000000001407e0
>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>> Process ntpd (pid: 6946, threadinfo ffff880143c64000, task ffff880143104940)
>>>>>> Stack:
>>>>>>  ffff880140981660 ffff88013bc5a148 ffff880143c65c88 ffffffffa01193a6
>>>>>>  0000000000000000 ffff88013e566020 ffff88013e565f28 ffff880146ee6ac0
>>>>>>  ffff880143c65ca8 ffffffffa024f403 ffff880143c65ca8 ffff880143d3a4f8
>>>>>> Call Trace:
>>>>>>  [<ffffffffa01193a6>] svc_exit_thread+0xa6/0xb0 [sunrpc]
>>>>>>  [<ffffffffa024f403>] nfs_callback_down+0x53/0x90 [nfs]
>>>>>>  [<ffffffffa021642e>] nfs_free_client+0xfe/0x120 [nfs]
>>>>>>  [<ffffffffa02185df>] nfs_put_client+0x29f/0x420 [nfs]
>>>>>>  [<ffffffffa02184e0>] ? nfs_put_client+0x1a0/0x420 [nfs]
>>>>>>  [<ffffffffa021962f>] nfs_free_server+0x16f/0x2e0 [nfs]
>>>>>>  [<ffffffffa02194e3>] ? nfs_free_server+0x23/0x2e0 [nfs]
>>>>>>  [<ffffffffa022363c>] nfs4_kill_super+0x3c/0x50 [nfs]
>>>>>>  [<ffffffff811ad67c>] deactivate_locked_super+0x3c/0xa0
>>>>>>  [<ffffffff811ae29e>] deactivate_super+0x4e/0x70
>>>>>>  [<ffffffff811ccba4>] mntput_no_expire+0xb4/0x100
>>>>>>  [<ffffffff811ccc16>] mntput+0x26/0x40
>>>>>>  [<ffffffff811cd597>] release_mounts+0x77/0x90
>>>>>>  [<ffffffff811cefc6>] put_mnt_ns+0x66/0x80
>>>>>>  [<ffffffff81078dff>] free_nsproxy+0x1f/0xb0
>>>>>>  [<ffffffff8107905e>] switch_task_namespaces+0x5e/0x70
>>>>>>  [<ffffffff81079080>] exit_task_namespaces+0x10/0x20
>>>>>>  [<ffffffff8104e90e>] do_exit+0x4ee/0xb80
>>>>>>  [<ffffffff81639c0a>] ? retint_swapgs+0xe/0x13
>>>>>>  [<ffffffff8104f2ef>] do_group_exit+0x4f/0xc0
>>>>>>  [<ffffffff8104f377>] sys_exit_group+0x17/0x20
>>>>>>  [<ffffffff81641352>] system_call_fastpath+0x16/0x1b
>>>>>> Code: 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 90 55 48 89 e5 41 54 53 66 66 66 66 90 65 48 8b 04 25 80 ba 00 00 48 8b 80 50 05 00 00 48 89 fb<4c> 8b 60 28 8b 47 58 85 c0 0f 84 ec 00 00 00 83 e8 01 85 c0 89
>>>>>
>>>>> Aside from the fact that the current net_namespace is not guaranteed to
>>>>> exist when we are called from free_nsproxy, svc_destroy() looks
>>>>> seriously broken:
>>>>
>>>> Trond, looks like you are mistaken here.
>>>> Any process holds references to all namespaces it belong to (copy_net_ns()
>>>> increase usage counter). And network namespace is released after mount namespace
>>>> in free_nsproxy.
>>>
>>> That doesn't help you though. switch_task_namespaces will have already
>>> set current->nsproxy to NULL, which is why we Oops when we try to read
>>> current->nsproxy->net_ns in svc_exit_thread().
>>>
>>>>>
>>>>>        * On the one hand it is trying to free struct svc_serv (and
>>>>>          presumably all structures owned by struct svc_serv).
>>>>>        * On the other hand, it tries to pass a parameter to
>>>>>          svc_close_net() saying "please don't free structures on my
>>>>>          sv_tempsocks, or sv_permsocks list unless they match this net
>>>>>          namespace".
>>>>>
>>>>
>>>> I've sent patches, which moves svc_shutdown_net() from svc_destroy() ("SUNRPC:
>>>> separate per-net data creation from service").
>>>> with this patch set it's assumed, that per-net resources will be created or
>>>> released prior to service creation and destruction.
>>>
>>> Are those patches appropriate for inclusion in the stable kernel series
>>> so that we can fix 3.4?
>>>
>>
>> Yes. But unfortunately, this won't be enough.
>> "NFS: callback threads containerization" patch set is required as well.
>>
>> A a bugfix, I can suggest "SUNRPC: separate per-net data creation from service"
>> patch set + pass hard-coded "init_net" for NFS callback shutdown routines
>> (instead of current->nsproxy->net_ns). This should work.
>
> Hi Stanislav,
>
> My question is why should svc_destroy() care about net namespaces at
> all? Once an application is calling svc_destroy(), it is trying to close
> down the entire service. It really should not matter to which net
> namespace a particular socket belongs: they _all_ need to be destroyed.
>

Hi, Trond.
I have to mention that, from my POV, svc_destroy() has to be split into two
functions: svc_put() and __svc_destroy().
Anyway, previously we had one global counter per service, and we used to
destroy the service when the counter reached zero.
Today the situation remains almost the same, except that we have an
additional per-net counter, which is used for managing per-net service
resources.
IOW, when a service starts, it creates per-net resources in the current
network namespace and increases both the current per-net counter and the
global service counter. The next service start request will do the same,
and so on.
It actually means that:
1) when the per-net counter reaches zero, the per-net service resources
   have to be released;
2) when the global counter reaches zero, the current user is the last one,
   and only its resources are left.

Something like this...

-- 
Best regards,
Stanislav Kinsbursky
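The two-counter scheme described above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the sunrpc implementation: struct svc_model, svc_model_up() and svc_model_down() are hypothetical names, namespaces are reduced to small integer indices, and MAX_NETS is an arbitrary bound chosen for the model. It shows only the two rules from the message: per-net resources go away when the per-net counter reaches zero, and the service itself goes away when the global counter reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NETS 4   /* arbitrary bound for this model */

struct svc_model {
    int  global_users;             /* one counter for the whole service */
    int  net_users[MAX_NETS];      /* one counter per network namespace */
    bool net_resources[MAX_NETS];  /* are per-net resources allocated? */
    bool alive;                    /* does the service object exist? */
};

/* A start request from namespace `net`: bump both counters. */
void svc_model_up(struct svc_model *s, int net)
{
    assert(net >= 0 && net < MAX_NETS);
    if (s->net_users[net]++ == 0)
        s->net_resources[net] = true;   /* first user in this namespace */
    if (s->global_users++ == 0)
        s->alive = true;                /* first user overall */
}

/* A stop request from namespace `net`: drop both counters. */
void svc_model_down(struct svc_model *s, int net)
{
    assert(net >= 0 && net < MAX_NETS);
    assert(s->net_users[net] > 0 && s->global_users > 0);
    if (--s->net_users[net] == 0)
        s->net_resources[net] = false;  /* rule 1: release per-net resources */
    if (--s->global_users == 0)
        s->alive = false;               /* rule 2: last user, destroy service */
}
```

With one user in namespace 0 and two in namespace 1, stopping the user in namespace 0 releases only that namespace's resources; the service object survives until the last of the three users stops.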
* Re: 3.4. sunrpc oops during shutdown
  2012-05-28 23:43 ` Myklebust, Trond
  2012-05-29 8:48   ` Stanislav Kinsbursky
@ 2012-05-29 11:21   ` bfields
  1 sibling, 0 replies; 12+ messages in thread
From: bfields @ 2012-05-29 11:21 UTC
To: Myklebust, Trond
Cc: Stanislav Kinsbursky, Dave Jones, linux-nfs@vger.kernel.org, Linux Kernel

On Mon, May 28, 2012 at 11:43:40PM +0000, Myklebust, Trond wrote:
> On Fri, 2012-05-25 at 17:31 +0400, Stanislav Kinsbursky wrote:
> > Yes. But unfortunately, this won't be enough.
> > "NFS: callback threads containerization" patch set is required as well.
> >
> > A a bugfix, I can suggest "SUNRPC: separate per-net data creation from service"
> > patch set + pass hard-coded "init_net" for NFS callback shutdown routines
> > (instead of current->nsproxy->net_ns). This should work.
>
> Hi Stanislav,
>
> My question is why should svc_destroy() care about net namespaces at
> all? Once an application is calling svc_destroy(), it is trying to close
> down the entire service. It really should not matter to which net
> namespace a particular socket belongs: they _all_ need to be destroyed.

Services started in different network namespaces should be
independent--for example, starting nfsd in container A and then again in
container B, then shutting it down in container A, shouldn't also shut
down container B's service.

*But* there is currently only a single global server object, because
we're sharing threads:

	http://marc.info/?l=linux-nfs&m=133405747330055&w=2

	"Having Lockd thread (or NFSd threads) per container looks easy
	to implement on first sight. But kernel threads currently
	supported only in initial pid namespace. I.e. it means that
	per-container kernel thread won't be visible in container, if it
	has it's own pid namespace. And there is no way to put a kernel
	thread into container. In OpenVZ we have per-container kernel
	threads. But integrating this feature to mainline looks hopeless
	(or very difficult) to me. At least for now. So this problem
	with signals remains unsolved.

	"So, as it looks to me, this "one service per all" is the only
	one suitable for now."

so Stanislav is simulating multiple servers by shutting down sockets on
a per-net basis.

But I think it should be possible to share threads between servers while
still behaving in every other way as if the servers are completely
independent.

--b.
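The "single shared server object, per-net shutdown" arrangement described above can be pictured with a small userspace sketch. It is only a model under stated assumptions: net_model, sock_model, close_net_model() and open_count_model() are hypothetical names, and the flat array stands in for whatever socket lists the real server keeps. The point it illustrates is the filtering step: shutting the service down in one namespace closes only the sockets opened in that namespace and leaves the other containers' sockets on the shared server untouched.

```c
#include <assert.h>
#include <stddef.h>

struct net_model { int id; };

struct sock_model {
    const struct net_model *net;  /* namespace the socket was opened in */
    int open;                     /* 1 while the socket is usable */
};

/* Close only the sockets that belong to `net`; return how many were closed. */
size_t close_net_model(struct sock_model *socks, size_t n,
                       const struct net_model *net)
{
    size_t closed = 0;

    assert(socks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (socks[i].open && socks[i].net == net) {
            socks[i].open = 0;
            closed++;
        }
    }
    return closed;
}

/* Count the sockets that are still open on the shared server. */
size_t open_count_model(const struct sock_model *socks, size_t n)
{
    size_t open = 0;

    for (size_t i = 0; i < n; i++)
        open += socks[i].open ? 1 : 0;
    return open;
}
```

Under this model, a full teardown of the kind Trond asks about would simply ignore the namespace filter and close every socket, which is why the two views of svc_destroy() pull in different directions.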