* Regretion on NFS in mainline kernel @ 2012-04-18 11:26 Luis Henriques 2012-04-18 13:28 ` Jeff Layton 0 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-18 11:26 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-nfs Hi, We have a bug reporting a regression in mainline kernel. Basically, the bug reporters are seeing lots of messages: [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed This happens when mounting a user's home directory over NFS. Is this a known issue being addressed at the moment? Is there any information needed to help debugging the issue? The original bug report can be found here: http://bugs.launchpad.net/bugs/974664 And there's also a similar report for Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=811138 Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 11:26 Regretion on NFS in mainline kernel Luis Henriques @ 2012-04-18 13:28 ` Jeff Layton 2012-04-18 13:57 ` Luis Henriques 0 siblings, 1 reply; 21+ messages in thread From: Jeff Layton @ 2012-04-18 13:28 UTC (permalink / raw) To: Luis Henriques; +Cc: Trond Myklebust, linux-nfs On Wed, 18 Apr 2012 12:26:10 +0100 Luis Henriques <luis.henriques@canonical.com> wrote: > Hi, > > We have a bug reporting a regression in mainline kernel. Basically, the > bug reporters are seeing lots of messages: > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > This happens when mounting a user's home directory over NFS. > > Is this a known issue being addressed at the moment? Is there any > information needed to help debugging the issue? > > The original bug report can be found here: > > http://bugs.launchpad.net/bugs/974664 > > And there's also a similar report for Fedora: > > https://bugzilla.redhat.com/show_bug.cgi?id=811138 > > Cheers, This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not that familiar with this code so I could be wrong: -------------------[snip]-------------------- status = ops->recover_open(sp, state); if (status >= 0) { status = nfs4_reclaim_locks(state, ops); if (status >= 0) { spin_lock(&state->state_lock); list_for_each_entry(lock, &state->lock_states, ls_locks) { if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) pr_warn_ratelimited("NFS: " "%s: Lock reclaim " "failed!\n", __func__); } spin_unlock(&state->state_lock); nfs4_put_open_state(state); goto restart; } } -------------------[snip]-------------------- Shouldn't the status check after nfs4_reclaim_locks be reversed? -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 13:28 ` Jeff Layton @ 2012-04-18 13:57 ` Luis Henriques 2012-04-18 14:04 ` Myklebust, Trond 0 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-18 13:57 UTC (permalink / raw) To: Jeff Layton; +Cc: Trond Myklebust, linux-nfs Hi Jeff, On Wed, Apr 18, 2012 at 09:28:22AM -0400, Jeff Layton wrote: > On Wed, 18 Apr 2012 12:26:10 +0100 > Luis Henriques <luis.henriques@canonical.com> wrote: > > > Hi, > > > > We have a bug reporting a regression in mainline kernel. Basically, the > > bug reporters are seeing lots of messages: > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > This happens when mounting a user's home directory over NFS. > > > > Is this a known issue being addressed at the moment? Is there any > > information needed to help debugging the issue? > > > > The original bug report can be found here: > > > > http://bugs.launchpad.net/bugs/974664 > > > > And there's also a similar report for Fedora: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=811138 > > > > Cheers, > > This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not > that familiar with this code so I could be wrong: > > -------------------[snip]-------------------- > status = ops->recover_open(sp, state); > if (status >= 0) { > status = nfs4_reclaim_locks(state, ops); > if (status >= 0) { > spin_lock(&state->state_lock); > list_for_each_entry(lock, &state->lock_states, ls_locks) { > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) > pr_warn_ratelimited("NFS: " > "%s: Lock reclaim " > "failed!\n", __func__); > } > spin_unlock(&state->state_lock); > nfs4_put_open_state(state); > goto restart; > } > } > -------------------[snip]-------------------- > > Shouldn't the status check after nfs4_reclaim_locks be reversed? Thanks a lot for your help. Could you please take a look at the patch below, just to make sure I understood you're suggestion correctly? I will prepare a test kernel so that we can check whether it actually solves the problem or not. Cheers, -- Luis >From a1348f473c157439ac62f502eb45ca48f95e627f Mon Sep 17 00:00:00 2001 From: Luis Henriques <luis.henriques@canonical.com> Date: Wed, 18 Apr 2012 14:50:10 +0100 Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() There have been several bug reports, with the following messages on the logs: [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed This happens, for example, when mounting a user's home directory over NFS. Thanks to Jeff Layton that identified the cause, this patch fixes an incorrect status check on nfs4_reclaim_open_state(). Signed-off-by: Luis Henriques <luis.henriques@canonical.com> --- fs/nfs/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 07354b7..8b6acec 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1180,7 +1180,7 @@ restart: atomic_inc(&state->count); spin_unlock(&sp->so_lock); status = ops->recover_open(sp, state); - if (status >= 0) { + if (status < 0) { status = nfs4_reclaim_locks(state, ops); if (status >= 0) { spin_lock(&state->state_lock); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 13:57 ` Luis Henriques @ 2012-04-18 14:04 ` Myklebust, Trond 2012-04-18 14:13 ` Luis Henriques 0 siblings, 1 reply; 21+ messages in thread From: Myklebust, Trond @ 2012-04-18 14:04 UTC (permalink / raw) To: Luis Henriques; +Cc: Jeff Layton, linux-nfs@vger.kernel.org T24gV2VkLCAyMDEyLTA0LTE4IGF0IDE0OjU3ICswMTAwLCBMdWlzIEhlbnJpcXVlcyB3cm90ZToN Cj4gSGkgSmVmZiwNCj4gDQo+IE9uIFdlZCwgQXByIDE4LCAyMDEyIGF0IDA5OjI4OjIyQU0gLTA0 MDAsIEplZmYgTGF5dG9uIHdyb3RlOg0KPiA+IE9uIFdlZCwgMTggQXByIDIwMTIgMTI6MjY6MTAg KzAxMDANCj4gPiBMdWlzIEhlbnJpcXVlcyA8bHVpcy5oZW5yaXF1ZXNAY2Fub25pY2FsLmNvbT4g d3JvdGU6DQo+ID4gDQo+ID4gPiBIaSwNCj4gPiA+IA0KPiA+ID4gV2UgaGF2ZSBhIGJ1ZyByZXBv cnRpbmcgYSByZWdyZXNzaW9uIGluIG1haW5saW5lIGtlcm5lbC4gIEJhc2ljYWxseSwgdGhlDQo+ ID4gPiBidWcgcmVwb3J0ZXJzIGFyZSBzZWVpbmcgbG90cyBvZiBtZXNzYWdlczoNCj4gPiA+IA0K PiA+ID4gWyA0OC43MDEyMTNdIE5GUzogbmZzNF9yZWNsYWltX29wZW5fc3RhdGU6IExvY2sgcmVj bGFpbSBmYWlsZWQhDQo+ID4gPiBbIDQ4LjcwMTk5MF0gTkZTOiBuZnM0X3JlY2xhaW1fb3Blbl9z dGF0ZTogTG9jayByZWNsYWltIGZhaWxlZCENCj4gPiA+IFsgNTMuNjk2MDc2XSBuZnM0X3JlY2xh aW1fb3Blbl9zdGF0ZTogNjQ0MCBjYWxsYmFja3Mgc3VwcHJlc3NlZA0KPiA+ID4gDQo+ID4gPiBU aGlzIGhhcHBlbnMgd2hlbiBtb3VudGluZyBhIHVzZXIncyBob21lIGRpcmVjdG9yeSBvdmVyIE5G Uy4NCj4gPiA+IA0KPiA+ID4gSXMgdGhpcyBhIGtub3duIGlzc3VlIGJlaW5nIGFkZHJlc3NlZCBh dCB0aGUgbW9tZW50PyAgSXMgdGhlcmUgYW55DQo+ID4gPiBpbmZvcm1hdGlvbiBuZWVkZWQgdG8g aGVscCBkZWJ1Z2dpbmcgdGhlIGlzc3VlPw0KPiA+ID4gDQo+ID4gPiBUaGUgb3JpZ2luYWwgYnVn IHJlcG9ydCBjYW4gYmUgZm91bmQgaGVyZToNCj4gPiA+IA0KPiA+ID4gaHR0cDovL2J1Z3MubGF1 bmNocGFkLm5ldC9idWdzLzk3NDY2NA0KPiA+ID4gDQo+ID4gPiBBbmQgdGhlcmUncyBhbHNvIGEg c2ltaWxhciByZXBvcnQgZm9yIEZlZG9yYToNCj4gPiA+IA0KPiA+ID4gaHR0cHM6Ly9idWd6aWxs YS5yZWRoYXQuY29tL3Nob3dfYnVnLmNnaT9pZD04MTExMzgNCj4gPiA+IA0KPiA+ID4gQ2hlZXJz LA0KPiA+IA0KPiA+IFRoaXMgY29kZSBpbiBuZnM0X3JlY2xhaW1fb3Blbl9zdGF0ZSgpIGxvb2tz IHdyb25nIHRvIG1lLCBidXQgSSdtIG5vdA0KPiA+IHRoYXQgZmFtaWxpYXIgd2l0aCB0aGlzIGNv ZGUgc28gSSBjb3VsZCBiZSB3cm9uZzoNCj4gPiANCj4gPiAtLS0tLS0tLS0tLS0tLS0tLS0tW3Nu aXBdLS0tLS0tLS0tLS0tLS0tLS0tLS0NCj4gPiAgICAgICAgICAgICAgICAgc3RhdHVzID0gb3Bz LT5yZWNvdmVyX29wZW4oc3AsIHN0YXRlKTsNCj4gPiAgICAgICAgICAgICAgICAgaWYgKHN0YXR1 cyA+PSAwKSB7DQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgc3RhdHVzID0gbmZzNF9yZWNs YWltX2xvY2tzKHN0YXRlLCBvcHMpOw0KPiA+ICAgICAgICAgICAgICAgICAgICAgICAgIGlmIChz dGF0dXMgPj0gMCkgew0KPiA+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgc3Bpbl9s b2NrKCZzdGF0ZS0+c3RhdGVfbG9jayk7DQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICBsaXN0X2Zvcl9lYWNoX2VudHJ5KGxvY2ssICZzdGF0ZS0+bG9ja19zdGF0ZXMsIGxzX2xv Y2tzKSB7DQo+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGlmICgh KGxvY2stPmxzX2ZsYWdzICYgTkZTX0xPQ0tfSU5JVElBTElaRUQpKQ0KPiA+ICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHByX3dhcm5fcmF0ZWxpbWl0ZWQo Ik5GUzogIg0KPiA+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgIiVzOiBMb2NrIHJlY2xhaW0gIg0KPiA+ICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgImZhaWxlZCFcbiIsIF9fZnVuY19f KTsNCj4gPiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIH0NCj4gPiAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgIHNwaW5fdW5sb2NrKCZzdGF0ZS0+c3RhdGVfbG9jayk7DQo+ ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBuZnM0X3B1dF9vcGVuX3N0YXRlKHN0 YXRlKTsNCj4gPiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGdvdG8gcmVzdGFydDsN Cj4gPiAgICAgICAgICAgICAgICAgICAgICAgICB9DQo+ID4gICAgICAgICAgICAgICAgIH0NCj4g PiAtLS0tLS0tLS0tLS0tLS0tLS0tW3NuaXBdLS0tLS0tLS0tLS0tLS0tLS0tLS0NCj4gPiANCj4g PiBTaG91bGRuJ3QgdGhlIHN0YXR1cyBjaGVjayBhZnRlciBuZnM0X3JlY2xhaW1fbG9ja3MgYmUg cmV2ZXJzZWQ/DQo+IA0KPiBUaGFua3MgYSBsb3QgZm9yIHlvdXIgaGVscC4gIENvdWxkIHlvdSBw bGVhc2UgdGFrZSBhIGxvb2sgYXQgdGhlIHBhdGNoDQo+IGJlbG93LCBqdXN0IHRvIG1ha2Ugc3Vy ZSBJIHVuZGVyc3Rvb2QgeW91J3JlIHN1Z2dlc3Rpb24gY29ycmVjdGx5PyAgSSB3aWxsDQo+IHBy ZXBhcmUgYSB0ZXN0IGtlcm5lbCBzbyB0aGF0IHdlIGNhbiBjaGVjayB3aGV0aGVyIGl0IGFjdHVh bGx5IHNvbHZlcyB0aGUNCj4gcHJvYmxlbSBvciBub3QuDQo+IA0KPiBDaGVlcnMsDQo+IC0tDQo+ IEx1aXMNCj4gDQo+IA0KPiA+RnJvbSBhMTM0OGY0NzNjMTU3NDM5YWM2MmY1MDJlYjQ1Y2E0OGY5 NWU2MjdmIE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQ0KPiBGcm9tOiBMdWlzIEhlbnJpcXVlcyA8 bHVpcy5oZW5yaXF1ZXNAY2Fub25pY2FsLmNvbT4NCj4gRGF0ZTogV2VkLCAxOCBBcHIgMjAxMiAx NDo1MDoxMCArMDEwMA0KPiBTdWJqZWN0OiBbUEFUQ0ggMS8xXSBORlM6IEZpeCBzdGF0dXMgY2hl Y2sgb24gbmZzNF9yZWNsYWltX29wZW5fc3RhdGUoKQ0KPiANCj4gVGhlcmUgaGF2ZSBiZWVuIHNl dmVyYWwgYnVnIHJlcG9ydHMsIHdpdGggdGhlIGZvbGxvd2luZyBtZXNzYWdlcyBvbiB0aGUNCj4g bG9nczoNCj4gDQo+ICBbIDQ4LjcwMTIxM10gTkZTOiBuZnM0X3JlY2xhaW1fb3Blbl9zdGF0ZTog TG9jayByZWNsYWltIGZhaWxlZCENCj4gIFsgNDguNzAxOTkwXSBORlM6IG5mczRfcmVjbGFpbV9v cGVuX3N0YXRlOiBMb2NrIHJlY2xhaW0gZmFpbGVkIQ0KPiAgWyA1My42OTYwNzZdIG5mczRfcmVj bGFpbV9vcGVuX3N0YXRlOiA2NDQwIGNhbGxiYWNrcyBzdXBwcmVzc2VkDQo+IA0KPiBUaGlzIGhh cHBlbnMsIGZvciBleGFtcGxlLCB3aGVuIG1vdW50aW5nIGEgdXNlcidzIGhvbWUgZGlyZWN0b3J5 IG92ZXIgTkZTLg0KPiANCj4gVGhhbmtzIHRvIEplZmYgTGF5dG9uIHRoYXQgaWRlbnRpZmllZCB0 aGUgY2F1c2UsIHRoaXMgcGF0Y2ggZml4ZXMgYW4NCj4gaW5jb3JyZWN0IHN0YXR1cyBjaGVjayBv biBuZnM0X3JlY2xhaW1fb3Blbl9zdGF0ZSgpLg0KPiANCj4gU2lnbmVkLW9mZi1ieTogTHVpcyBI ZW5yaXF1ZXMgPGx1aXMuaGVucmlxdWVzQGNhbm9uaWNhbC5jb20+DQo+IC0tLQ0KPiAgZnMvbmZz L25mczRzdGF0ZS5jIHwgICAgMiArLQ0KPiAgMSBmaWxlIGNoYW5nZWQsIDEgaW5zZXJ0aW9uKCsp LCAxIGRlbGV0aW9uKC0pDQo+IA0KPiBkaWZmIC0tZ2l0IGEvZnMvbmZzL25mczRzdGF0ZS5jIGIv ZnMvbmZzL25mczRzdGF0ZS5jDQo+IGluZGV4IDA3MzU0YjcuLjhiNmFjZWMgMTAwNjQ0DQo+IC0t LSBhL2ZzL25mcy9uZnM0c3RhdGUuYw0KPiArKysgYi9mcy9uZnMvbmZzNHN0YXRlLmMNCj4gQEAg LTExODAsNyArMTE4MCw3IEBAIHJlc3RhcnQ6DQo+ICAJCWF0b21pY19pbmMoJnN0YXRlLT5jb3Vu dCk7DQo+ICAJCXNwaW5fdW5sb2NrKCZzcC0+c29fbG9jayk7DQo+ICAJCXN0YXR1cyA9IG9wcy0+ cmVjb3Zlcl9vcGVuKHNwLCBzdGF0ZSk7DQo+IC0JCWlmIChzdGF0dXMgPj0gMCkgew0KPiArCQlp ZiAoc3RhdHVzIDwgMCkgew0KPiAgCQkJc3RhdHVzID0gbmZzNF9yZWNsYWltX2xvY2tzKHN0YXRl LCBvcHMpOw0KPiAgCQkJaWYgKHN0YXR1cyA+PSAwKSB7DQo+ICAJCQkJc3Bpbl9sb2NrKCZzdGF0 ZS0+c3RhdGVfbG9jayk7DQoNCkhlbGwgbm8hIA0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGlu dXggTkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0KVHJvbmQuTXlrbGVidXN0QG5ldGFw cC5jb20NCnd3dy5uZXRhcHAuY29tDQoNCg== ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 14:04 ` Myklebust, Trond @ 2012-04-18 14:13 ` Luis Henriques 2012-04-18 14:15 ` Jeff Layton 0 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-18 14:13 UTC (permalink / raw) To: Myklebust, Trond; +Cc: Jeff Layton, linux-nfs@vger.kernel.org On Wed, Apr 18, 2012 at 02:04:26PM +0000, Myklebust, Trond wrote: > On Wed, 2012-04-18 at 14:57 +0100, Luis Henriques wrote: > > Hi Jeff, > > > > On Wed, Apr 18, 2012 at 09:28:22AM -0400, Jeff Layton wrote: > > > On Wed, 18 Apr 2012 12:26:10 +0100 > > > Luis Henriques <luis.henriques@canonical.com> wrote: > > > > > > > Hi, > > > > > > > > We have a bug reporting a regression in mainline kernel. Basically, the > > > > bug reporters are seeing lots of messages: > > > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > > > This happens when mounting a user's home directory over NFS. > > > > > > > > Is this a known issue being addressed at the moment? Is there any > > > > information needed to help debugging the issue? > > > > > > > > The original bug report can be found here: > > > > > > > > http://bugs.launchpad.net/bugs/974664 > > > > > > > > And there's also a similar report for Fedora: > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=811138 > > > > > > > > Cheers, > > > > > > This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not > > > that familiar with this code so I could be wrong: > > > > > > -------------------[snip]-------------------- > > > status = ops->recover_open(sp, state); > > > if (status >= 0) { > > > status = nfs4_reclaim_locks(state, ops); > > > if (status >= 0) { > > > spin_lock(&state->state_lock); > > > list_for_each_entry(lock, &state->lock_states, ls_locks) { > > > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) > > > pr_warn_ratelimited("NFS: " > > > "%s: Lock reclaim " > > > "failed!\n", __func__); > > > } > > > spin_unlock(&state->state_lock); > > > nfs4_put_open_state(state); > > > goto restart; > > > } > > > } > > > -------------------[snip]-------------------- > > > > > > Shouldn't the status check after nfs4_reclaim_locks be reversed? > > > > Thanks a lot for your help. Could you please take a look at the patch > > below, just to make sure I understood you're suggestion correctly? I will > > prepare a test kernel so that we can check whether it actually solves the > > problem or not. > > > > Cheers, > > -- > > Luis > > > > > > >From a1348f473c157439ac62f502eb45ca48f95e627f Mon Sep 17 00:00:00 2001 > > From: Luis Henriques <luis.henriques@canonical.com> > > Date: Wed, 18 Apr 2012 14:50:10 +0100 > > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() > > > > There have been several bug reports, with the following messages on the > > logs: > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > This happens, for example, when mounting a user's home directory over NFS. > > > > Thanks to Jeff Layton that identified the cause, this patch fixes an > > incorrect status check on nfs4_reclaim_open_state(). > > > > Signed-off-by: Luis Henriques <luis.henriques@canonical.com> > > --- > > fs/nfs/nfs4state.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > > index 07354b7..8b6acec 100644 > > --- a/fs/nfs/nfs4state.c > > +++ b/fs/nfs/nfs4state.c > > @@ -1180,7 +1180,7 @@ restart: > > atomic_inc(&state->count); > > spin_unlock(&sp->so_lock); > > status = ops->recover_open(sp, state); > > - if (status >= 0) { > > + if (status < 0) { > > status = nfs4_reclaim_locks(state, ops); > > if (status >= 0) { > > spin_lock(&state->state_lock); > > Hell no! Ouch! Wrong one... >From 005918b5eef853fd4d495743fef5a52ae62f825e Mon Sep 17 00:00:00 2001 From: Luis Henriques <luis.henriques@canonical.com> Date: Wed, 18 Apr 2012 15:08:39 +0100 Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() There have been several bug reports, with the following messages on the logs: [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed This happens, for example, when mounting a user's home directory over NFS. Thanks to Jeff Layton that identified the cause, this patch fixes an incorrect status check on nfs4_reclaim_open_state(). Signed-off-by: Luis Henriques <luis.henriques@canonical.com> --- fs/nfs/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 07354b7..023e09f 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -1182,7 +1182,7 @@ restart: status = ops->recover_open(sp, state); if (status >= 0) { status = nfs4_reclaim_locks(state, ops); - if (status >= 0) { + if (status < 0) { spin_lock(&state->state_lock); list_for_each_entry(lock, &state->lock_states, ls_locks) { if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 14:13 ` Luis Henriques @ 2012-04-18 14:15 ` Jeff Layton 2012-04-18 14:41 ` Myklebust, Trond 0 siblings, 1 reply; 21+ messages in thread From: Jeff Layton @ 2012-04-18 14:15 UTC (permalink / raw) To: Luis Henriques; +Cc: Myklebust, Trond, linux-nfs@vger.kernel.org On Wed, 18 Apr 2012 15:13:13 +0100 Luis Henriques <luis.henriques@canonical.com> wrote: > On Wed, Apr 18, 2012 at 02:04:26PM +0000, Myklebust, Trond wrote: > > On Wed, 2012-04-18 at 14:57 +0100, Luis Henriques wrote: > > > Hi Jeff, > > > > > > On Wed, Apr 18, 2012 at 09:28:22AM -0400, Jeff Layton wrote: > > > > On Wed, 18 Apr 2012 12:26:10 +0100 > > > > Luis Henriques <luis.henriques@canonical.com> wrote: > > > > > > > > > Hi, > > > > > > > > > > We have a bug reporting a regression in mainline kernel. Basically, the > > > > > bug reporters are seeing lots of messages: > > > > > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > > > > > This happens when mounting a user's home directory over NFS. > > > > > > > > > > Is this a known issue being addressed at the moment? Is there any > > > > > information needed to help debugging the issue? > > > > > > > > > > The original bug report can be found here: > > > > > > > > > > http://bugs.launchpad.net/bugs/974664 > > > > > > > > > > And there's also a similar report for Fedora: > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=811138 > > > > > > > > > > Cheers, > > > > > > > > This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not > > > > that familiar with this code so I could be wrong: > > > > > > > > -------------------[snip]-------------------- > > > > status = ops->recover_open(sp, state); > > > > if (status >= 0) { > > > > status = nfs4_reclaim_locks(state, ops); > > > > if (status >= 0) { > > > > spin_lock(&state->state_lock); > > > > list_for_each_entry(lock, &state->lock_states, ls_locks) { > > > > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) > > > > pr_warn_ratelimited("NFS: " > > > > "%s: Lock reclaim " > > > > "failed!\n", __func__); > > > > } > > > > spin_unlock(&state->state_lock); > > > > nfs4_put_open_state(state); > > > > goto restart; > > > > } > > > > } > > > > -------------------[snip]-------------------- > > > > > > > > Shouldn't the status check after nfs4_reclaim_locks be reversed? > > > > > > Thanks a lot for your help. Could you please take a look at the patch > > > below, just to make sure I understood you're suggestion correctly? I will > > > prepare a test kernel so that we can check whether it actually solves the > > > problem or not. > > > > > > Cheers, > > > -- > > > Luis > > > > > > > > > >From a1348f473c157439ac62f502eb45ca48f95e627f Mon Sep 17 00:00:00 2001 > > > From: Luis Henriques <luis.henriques@canonical.com> > > > Date: Wed, 18 Apr 2012 14:50:10 +0100 > > > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() > > > > > > There have been several bug reports, with the following messages on the > > > logs: > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > This happens, for example, when mounting a user's home directory over NFS. > > > > > > Thanks to Jeff Layton that identified the cause, this patch fixes an > > > incorrect status check on nfs4_reclaim_open_state(). > > > > > > Signed-off-by: Luis Henriques <luis.henriques@canonical.com> > > > --- > > > fs/nfs/nfs4state.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > > > index 07354b7..8b6acec 100644 > > > --- a/fs/nfs/nfs4state.c > > > +++ b/fs/nfs/nfs4state.c > > > @@ -1180,7 +1180,7 @@ restart: > > > atomic_inc(&state->count); > > > spin_unlock(&sp->so_lock); > > > status = ops->recover_open(sp, state); > > > - if (status >= 0) { > > > + if (status < 0) { > > > status = nfs4_reclaim_locks(state, ops); > > > if (status >= 0) { > > > spin_lock(&state->state_lock); > > > > Hell no! > > Ouch! Wrong one... > > From 005918b5eef853fd4d495743fef5a52ae62f825e Mon Sep 17 00:00:00 2001 > From: Luis Henriques <luis.henriques@canonical.com> > Date: Wed, 18 Apr 2012 15:08:39 +0100 > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() > > There have been several bug reports, with the following messages on the > logs: > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > This happens, for example, when mounting a user's home directory over NFS. > > Thanks to Jeff Layton that identified the cause, this patch fixes an > incorrect status check on nfs4_reclaim_open_state(). > > Signed-off-by: Luis Henriques <luis.henriques@canonical.com> > --- > fs/nfs/nfs4state.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index 07354b7..023e09f 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -1182,7 +1182,7 @@ restart: > status = ops->recover_open(sp, state); > if (status >= 0) { > status = nfs4_reclaim_locks(state, ops); > - if (status >= 0) { > + if (status < 0) { > spin_lock(&state->state_lock); > list_for_each_entry(lock, &state->lock_states, ls_locks) { > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) Yeah, that looks more reasonable, but again I'm not sure about this either way. This code has been this way for a long time and it's not clear to me why it's only now become a problem if it is wrong. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 14:15 ` Jeff Layton @ 2012-04-18 14:41 ` Myklebust, Trond 2012-04-18 14:51 ` Jeff Layton 0 siblings, 1 reply; 21+ messages in thread From: Myklebust, Trond @ 2012-04-18 14:41 UTC (permalink / raw) To: Jeff Layton; +Cc: Luis Henriques, linux-nfs@vger.kernel.org T24gV2VkLCAyMDEyLTA0LTE4IGF0IDEwOjE1IC0wNDAwLCBKZWZmIExheXRvbiB3cm90ZToNCj4g T24gV2VkLCAxOCBBcHIgMjAxMiAxNToxMzoxMyArMDEwMA0KPiBMdWlzIEhlbnJpcXVlcyA8bHVp cy5oZW5yaXF1ZXNAY2Fub25pY2FsLmNvbT4gd3JvdGU6DQo+IA0KPiA+IE9uIFdlZCwgQXByIDE4 LCAyMDEyIGF0IDAyOjA0OjI2UE0gKzAwMDAsIE15a2xlYnVzdCwgVHJvbmQgd3JvdGU6DQo+ID4g PiBPbiBXZWQsIDIwMTItMDQtMTggYXQgMTQ6NTcgKzAxMDAsIEx1aXMgSGVucmlxdWVzIHdyb3Rl Og0KPiA+ID4gPiBIaSBKZWZmLA0KPiA+ID4gPiANCj4gPiA+ID4gT24gV2VkLCBBcHIgMTgsIDIw MTIgYXQgMDk6Mjg6MjJBTSAtMDQwMCwgSmVmZiBMYXl0b24gd3JvdGU6DQo+ID4gPiA+ID4gT24g V2VkLCAxOCBBcHIgMjAxMiAxMjoyNjoxMCArMDEwMA0KPiA+ID4gPiA+IEx1aXMgSGVucmlxdWVz IDxsdWlzLmhlbnJpcXVlc0BjYW5vbmljYWwuY29tPiB3cm90ZToNCj4gPiA+ID4gPiANCj4gPiA+ ID4gPiA+IEhpLA0KPiA+ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiBXZSBoYXZlIGEgYnVnIHJlcG9y dGluZyBhIHJlZ3Jlc3Npb24gaW4gbWFpbmxpbmUga2VybmVsLiAgQmFzaWNhbGx5LCB0aGUNCj4g PiA+ID4gPiA+IGJ1ZyByZXBvcnRlcnMgYXJlIHNlZWluZyBsb3RzIG9mIG1lc3NhZ2VzOg0KPiA+ ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiBbIDQ4LjcwMTIxM10gTkZTOiBuZnM0X3JlY2xhaW1fb3Bl bl9zdGF0ZTogTG9jayByZWNsYWltIGZhaWxlZCENCj4gPiA+ID4gPiA+IFsgNDguNzAxOTkwXSBO RlM6IG5mczRfcmVjbGFpbV9vcGVuX3N0YXRlOiBMb2NrIHJlY2xhaW0gZmFpbGVkIQ0KPiA+ID4g PiA+ID4gWyA1My42OTYwNzZdIG5mczRfcmVjbGFpbV9vcGVuX3N0YXRlOiA2NDQwIGNhbGxiYWNr cyBzdXBwcmVzc2VkDQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IFRoaXMgaGFwcGVucyB3aGVu IG1vdW50aW5nIGEgdXNlcidzIGhvbWUgZGlyZWN0b3J5IG92ZXIgTkZTLg0KPiA+ID4gPiA+ID4g DQo+ID4gPiA+ID4gPiBJcyB0aGlzIGEga25vd24gaXNzdWUgYmVpbmcgYWRkcmVzc2VkIGF0IHRo ZSBtb21lbnQ/ICBJcyB0aGVyZSBhbnkNCj4gPiA+ID4gPiA+IGluZm9ybWF0aW9uIG5lZWRlZCB0 byBoZWxwIGRlYnVnZ2luZyB0aGUgaXNzdWU/DQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IFRo ZSBvcmlnaW5hbCBidWcgcmVwb3J0IGNhbiBiZSBmb3VuZCBoZXJlOg0KPiA+ID4gPiA+ID4gDQo+ ID4gPiA+ID4gPiBodHRwOi8vYnVncy5sYXVuY2hwYWQubmV0L2J1Z3MvOTc0NjY0DQo+ID4gPiA+ ID4gPiANCj4gPiA+ID4gPiA+IEFuZCB0aGVyZSdzIGFsc28gYSBzaW1pbGFyIHJlcG9ydCBmb3Ig RmVkb3JhOg0KPiA+ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiBodHRwczovL2J1Z3ppbGxhLnJlZGhh dC5jb20vc2hvd19idWcuY2dpP2lkPTgxMTEzOA0KPiA+ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiBD aGVlcnMsDQo+ID4gPiA+ID4gDQo+ID4gPiA+ID4gVGhpcyBjb2RlIGluIG5mczRfcmVjbGFpbV9v cGVuX3N0YXRlKCkgbG9va3Mgd3JvbmcgdG8gbWUsIGJ1dCBJJ20gbm90DQo+ID4gPiA+ID4gdGhh dCBmYW1pbGlhciB3aXRoIHRoaXMgY29kZSBzbyBJIGNvdWxkIGJlIHdyb25nOg0KPiA+ID4gPiA+ IA0KPiA+ID4gPiA+IC0tLS0tLS0tLS0tLS0tLS0tLS1bc25pcF0tLS0tLS0tLS0tLS0tLS0tLS0t LQ0KPiA+ID4gPiA+ICAgICAgICAgICAgICAgICBzdGF0dXMgPSBvcHMtPnJlY292ZXJfb3Blbihz cCwgc3RhdGUpOw0KPiA+ID4gPiA+ICAgICAgICAgICAgICAgICBpZiAoc3RhdHVzID49IDApIHsN Cj4gPiA+ID4gPiAgICAgICAgICAgICAgICAgICAgICAgICBzdGF0dXMgPSBuZnM0X3JlY2xhaW1f bG9ja3Moc3RhdGUsIG9wcyk7DQo+ID4gPiA+ID4gICAgICAgICAgICAgICAgICAgICAgICAgaWYg KHN0YXR1cyA+PSAwKSB7DQo+ID4gPiA+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICBzcGluX2xvY2soJnN0YXRlLT5zdGF0ZV9sb2NrKTsNCj4gPiA+ID4gPiAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgIGxpc3RfZm9yX2VhY2hfZW50cnkobG9jaywgJnN0YXRlLT5sb2Nr X3N0YXRlcywgbHNfbG9ja3MpIHsNCj4gPiA+ID4gPiAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgaWYgKCEobG9jay0+bHNfZmxhZ3MgJiBORlNfTE9DS19JTklUSUFMSVpF RCkpDQo+ID4gPiA+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgcHJfd2Fybl9yYXRlbGltaXRlZCgiTkZTOiAiDQo+ID4gPiA+ID4gICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAiJXM6IExvY2sgcmVj bGFpbSAiDQo+ID4gPiA+ID4gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAiZmFpbGVkIVxuIiwgX19mdW5jX18pOw0KPiA+ID4gPiA+ICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgfQ0KPiA+ID4gPiA+ICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgc3Bpbl91bmxvY2soJnN0YXRlLT5zdGF0ZV9sb2NrKTsNCj4gPiA+ID4g PiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIG5mczRfcHV0X29wZW5fc3RhdGUoc3Rh dGUpOw0KPiA+ID4gPiA+ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgZ290byByZXN0 YXJ0Ow0KPiA+ID4gPiA+ICAgICAgICAgICAgICAgICAgICAgICAgIH0NCj4gPiA+ID4gPiAgICAg ICAgICAgICAgICAgfQ0KPiA+ID4gPiA+IC0tLS0tLS0tLS0tLS0tLS0tLS1bc25pcF0tLS0tLS0t LS0tLS0tLS0tLS0tLQ0KPiA+ID4gPiA+IA0KPiA+ID4gPiA+IFNob3VsZG4ndCB0aGUgc3RhdHVz IGNoZWNrIGFmdGVyIG5mczRfcmVjbGFpbV9sb2NrcyBiZSByZXZlcnNlZD8NCj4gPiA+ID4gDQo+ ID4gPiA+IFRoYW5rcyBhIGxvdCBmb3IgeW91ciBoZWxwLiAgQ291bGQgeW91IHBsZWFzZSB0YWtl IGEgbG9vayBhdCB0aGUgcGF0Y2gNCj4gPiA+ID4gYmVsb3csIGp1c3QgdG8gbWFrZSBzdXJlIEkg dW5kZXJzdG9vZCB5b3UncmUgc3VnZ2VzdGlvbiBjb3JyZWN0bHk/ICBJIHdpbGwNCj4gPiA+ID4g cHJlcGFyZSBhIHRlc3Qga2VybmVsIHNvIHRoYXQgd2UgY2FuIGNoZWNrIHdoZXRoZXIgaXQgYWN0 dWFsbHkgc29sdmVzIHRoZQ0KPiA+ID4gPiBwcm9ibGVtIG9yIG5vdC4NCj4gPiA+ID4gDQo+ID4g PiA+IENoZWVycywNCj4gPiA+ID4gLS0NCj4gPiA+ID4gTHVpcw0KPiA+ID4gPiANCj4gPiA+ID4g DQo+ID4gPiA+ID5Gcm9tIGExMzQ4ZjQ3M2MxNTc0MzlhYzYyZjUwMmViNDVjYTQ4Zjk1ZTYyN2Yg TW9uIFNlcCAxNyAwMDowMDowMCAyMDAxDQo+ID4gPiA+IEZyb206IEx1aXMgSGVucmlxdWVzIDxs dWlzLmhlbnJpcXVlc0BjYW5vbmljYWwuY29tPg0KPiA+ID4gPiBEYXRlOiBXZWQsIDE4IEFwciAy MDEyIDE0OjUwOjEwICswMTAwDQo+ID4gPiA+IFN1YmplY3Q6IFtQQVRDSCAxLzFdIE5GUzogRml4 IHN0YXR1cyBjaGVjayBvbiBuZnM0X3JlY2xhaW1fb3Blbl9zdGF0ZSgpDQo+ID4gPiA+IA0KPiA+ ID4gPiBUaGVyZSBoYXZlIGJlZW4gc2V2ZXJhbCBidWcgcmVwb3J0cywgd2l0aCB0aGUgZm9sbG93 aW5nIG1lc3NhZ2VzIG9uIHRoZQ0KPiA+ID4gPiBsb2dzOg0KPiA+ID4gPiANCj4gPiA+ID4gIFsg NDguNzAxMjEzXSBORlM6IG5mczRfcmVjbGFpbV9vcGVuX3N0YXRlOiBMb2NrIHJlY2xhaW0gZmFp bGVkIQ0KPiA+ID4gPiAgWyA0OC43MDE5OTBdIE5GUzogbmZzNF9yZWNsYWltX29wZW5fc3RhdGU6 IExvY2sgcmVjbGFpbSBmYWlsZWQhDQo+ID4gPiA+ICBbIDUzLjY5NjA3Nl0gbmZzNF9yZWNsYWlt X29wZW5fc3RhdGU6IDY0NDAgY2FsbGJhY2tzIHN1cHByZXNzZWQNCj4gPiA+ID4gDQo+ID4gPiA+ IFRoaXMgaGFwcGVucywgZm9yIGV4YW1wbGUsIHdoZW4gbW91bnRpbmcgYSB1c2VyJ3MgaG9tZSBk aXJlY3Rvcnkgb3ZlciBORlMuDQo+ID4gPiA+IA0KPiA+ID4gPiBUaGFua3MgdG8gSmVmZiBMYXl0 b24gdGhhdCBpZGVudGlmaWVkIHRoZSBjYXVzZSwgdGhpcyBwYXRjaCBmaXhlcyBhbg0KPiA+ID4g PiBpbmNvcnJlY3Qgc3RhdHVzIGNoZWNrIG9uIG5mczRfcmVjbGFpbV9vcGVuX3N0YXRlKCkuDQo+ ID4gPiA+IA0KPiA+ID4gPiBTaWduZWQtb2ZmLWJ5OiBMdWlzIEhlbnJpcXVlcyA8bHVpcy5oZW5y aXF1ZXNAY2Fub25pY2FsLmNvbT4NCj4gPiA+ID4gLS0tDQo+ID4gPiA+ICBmcy9uZnMvbmZzNHN0 YXRlLmMgfCAgICAyICstDQo+ID4gPiA+ICAxIGZpbGUgY2hhbmdlZCwgMSBpbnNlcnRpb24oKyks IDEgZGVsZXRpb24oLSkNCj4gPiA+ID4gDQo+ID4gPiA+IGRpZmYgLS1naXQgYS9mcy9uZnMvbmZz NHN0YXRlLmMgYi9mcy9uZnMvbmZzNHN0YXRlLmMNCj4gPiA+ID4gaW5kZXggMDczNTRiNy4uOGI2 YWNlYyAxMDA2NDQNCj4gPiA+ID4gLS0tIGEvZnMvbmZzL25mczRzdGF0ZS5jDQo+ID4gPiA+ICsr KyBiL2ZzL25mcy9uZnM0c3RhdGUuYw0KPiA+ID4gPiBAQCAtMTE4MCw3ICsxMTgwLDcgQEAgcmVz dGFydDoNCj4gPiA+ID4gIAkJYXRvbWljX2luYygmc3RhdGUtPmNvdW50KTsNCj4gPiA+ID4gIAkJ c3Bpbl91bmxvY2soJnNwLT5zb19sb2NrKTsNCj4gPiA+ID4gIAkJc3RhdHVzID0gb3BzLT5yZWNv dmVyX29wZW4oc3AsIHN0YXRlKTsNCj4gPiA+ID4gLQkJaWYgKHN0YXR1cyA+PSAwKSB7DQo+ID4g PiA+ICsJCWlmIChzdGF0dXMgPCAwKSB7DQo+ID4gPiA+ICAJCQlzdGF0dXMgPSBuZnM0X3JlY2xh aW1fbG9ja3Moc3RhdGUsIG9wcyk7DQo+ID4gPiA+ICAJCQlpZiAoc3RhdHVzID49IDApIHsNCj4g PiA+ID4gIAkJCQlzcGluX2xvY2soJnN0YXRlLT5zdGF0ZV9sb2NrKTsNCj4gPiA+IA0KPiA+ID4g SGVsbCBubyEgDQo+ID4gDQo+ID4gT3VjaCEgIFdyb25nIG9uZS4uLg0KPiA+IA0KPiA+IEZyb20g MDA1OTE4YjVlZWY4NTNmZDRkNDk1NzQzZmVmNWE1MmFlNjJmODI1ZSBNb24gU2VwIDE3IDAwOjAw OjAwIDIwMDENCj4gPiBGcm9tOiBMdWlzIEhlbnJpcXVlcyA8bHVpcy5oZW5yaXF1ZXNAY2Fub25p Y2FsLmNvbT4NCj4gPiBEYXRlOiBXZWQsIDE4IEFwciAyMDEyIDE1OjA4OjM5ICswMTAwDQo+ID4g U3ViamVjdDogW1BBVENIIDEvMV0gTkZTOiBGaXggc3RhdHVzIGNoZWNrIG9uIG5mczRfcmVjbGFp bV9vcGVuX3N0YXRlKCkNCj4gPiANCj4gPiBUaGVyZSBoYXZlIGJlZW4gc2V2ZXJhbCBidWcgcmVw b3J0cywgd2l0aCB0aGUgZm9sbG93aW5nIG1lc3NhZ2VzIG9uIHRoZQ0KPiA+IGxvZ3M6DQo+ID4g DQo+ID4gIFsgNDguNzAxMjEzXSBORlM6IG5mczRfcmVjbGFpbV9vcGVuX3N0YXRlOiBMb2NrIHJl Y2xhaW0gZmFpbGVkIQ0KPiA+ICBbIDQ4LjcwMTk5MF0gTkZTOiBuZnM0X3JlY2xhaW1fb3Blbl9z dGF0ZTogTG9jayByZWNsYWltIGZhaWxlZCENCj4gPiAgWyA1My42OTYwNzZdIG5mczRfcmVjbGFp bV9vcGVuX3N0YXRlOiA2NDQwIGNhbGxiYWNrcyBzdXBwcmVzc2VkDQo+ID4gDQo+ID4gVGhpcyBo YXBwZW5zLCBmb3IgZXhhbXBsZSwgd2hlbiBtb3VudGluZyBhIHVzZXIncyBob21lIGRpcmVjdG9y eSBvdmVyIE5GUy4NCj4gPiANCj4gPiBUaGFua3MgdG8gSmVmZiBMYXl0b24gdGhhdCBpZGVudGlm aWVkIHRoZSBjYXVzZSwgdGhpcyBwYXRjaCBmaXhlcyBhbg0KPiA+IGluY29ycmVjdCBzdGF0dXMg Y2hlY2sgb24gbmZzNF9yZWNsYWltX29wZW5fc3RhdGUoKS4NCj4gPiANCj4gPiBTaWduZWQtb2Zm LWJ5OiBMdWlzIEhlbnJpcXVlcyA8bHVpcy5oZW5yaXF1ZXNAY2Fub25pY2FsLmNvbT4NCj4gPiAt LS0NCj4gPiAgZnMvbmZzL25mczRzdGF0ZS5jIHwgICAgMiArLQ0KPiA+ICAxIGZpbGUgY2hhbmdl ZCwgMSBpbnNlcnRpb24oKyksIDEgZGVsZXRpb24oLSkNCj4gPiANCj4gPiBkaWZmIC0tZ2l0IGEv ZnMvbmZzL25mczRzdGF0ZS5jIGIvZnMvbmZzL25mczRzdGF0ZS5jDQo+ID4gaW5kZXggMDczNTRi Ny4uMDIzZTA5ZiAxMDA2NDQNCj4gPiAtLS0gYS9mcy9uZnMvbmZzNHN0YXRlLmMNCj4gPiArKysg Yi9mcy9uZnMvbmZzNHN0YXRlLmMNCj4gPiBAQCAtMTE4Miw3ICsxMTgyLDcgQEAgcmVzdGFydDoN Cj4gPiAgCQlzdGF0dXMgPSBvcHMtPnJlY292ZXJfb3BlbihzcCwgc3RhdGUpOw0KPiA+ICAJCWlm IChzdGF0dXMgPj0gMCkgew0KPiA+ICAJCQlzdGF0dXMgPSBuZnM0X3JlY2xhaW1fbG9ja3Moc3Rh dGUsIG9wcyk7DQo+ID4gLQkJCWlmIChzdGF0dXMgPj0gMCkgew0KPiA+ICsJCQlpZiAoc3RhdHVz IDwgMCkgew0KPiA+ICAJCQkJc3Bpbl9sb2NrKCZzdGF0ZS0+c3RhdGVfbG9jayk7DQo+ID4gIAkJ CQlsaXN0X2Zvcl9lYWNoX2VudHJ5KGxvY2ssICZzdGF0ZS0+bG9ja19zdGF0ZXMsIGxzX2xvY2tz KSB7DQo+ID4gIAkJCQkJaWYgKCEobG9jay0+bHNfZmxhZ3MgJiBORlNfTE9DS19JTklUSUFMSVpF RCkpDQo+IA0KPiBZZWFoLCB0aGF0IGxvb2tzIG1vcmUgcmVhc29uYWJsZSwgYnV0IGFnYWluIEkn bSBub3Qgc3VyZSBhYm91dCB0aGlzDQo+IGVpdGhlciB3YXkuIFRoaXMgY29kZSBoYXMgYmVlbiB0 aGlzIHdheSBmb3IgYSBsb25nIHRpbWUgYW5kIGl0J3Mgbm90DQo+IGNsZWFyIHRvIG1lIHdoeSBp dCdzIG9ubHkgbm93IGJlY29tZSBhIHByb2JsZW0gaWYgaXQgaXMgd3JvbmcuDQoNClJpZ2h0LiBS YW5kb20gKGFuZCB3cm9uZyEpIGNoYW5nZXMgc3VjaCBhcyB0aGUgYWJvdmUgd29uJ3QgZml4IHRo ZQ0KcHJvYmxlbS4gVGhhdCBjb2RlIGlzIHBlcmZlY3RseSBjb3JyZWN0IChsb29rIGF0IHRoZSBu ZnM0X3JlY2xhaW1fbG9ja3MNCmVycm9yIGNhc2VzIHRvIHNlZSB3aHkpLg0KDQpIYXZlIHlvdSBp bnN0ZWFkIGxvb2tlZCBpbnRvIHdoYXQgdGhlc2UgYXBwbGljYXRpb25zIGFyZSBkb2luZz8gQXJl IHRoZXkNCnBlcmhhcHMgb3BlbmluZyB0aGUgZmlsZSByZWFkIG9ubHksIHRoZW4gdHJ5aW5nIHRv IGFwcGx5IGFuIGV4Y2x1c2l2ZQ0KQlNEIGxvY2sgKHNvbWV0aGluZyB3aGljaCBORlN2NCBjYW5u b3Qgc3VwcG9ydCk/DQoNCklPVzogZG9lcyB0aGUgcHJvYmxlbSBnbyBhd2F5IGlmIHlvdSBtb3Vu dCB3aXRoICdsb2NhbF9sb2NrPWZsb2NrJz8NCg0KLS0gDQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4 IE5GUyBjbGllbnQgbWFpbnRhaW5lcg0KDQpOZXRBcHANClRyb25kLk15a2xlYnVzdEBuZXRhcHAu Y29tDQp3d3cubmV0YXBwLmNvbQ0KDQo= ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 14:41 ` Myklebust, Trond @ 2012-04-18 14:51 ` Jeff Layton 2012-04-18 17:35 ` Luis Henriques ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Jeff Layton @ 2012-04-18 14:51 UTC (permalink / raw) To: Myklebust, Trond; +Cc: Luis Henriques, linux-nfs@vger.kernel.org On Wed, 18 Apr 2012 14:41:45 +0000 "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote: > On Wed, 2012-04-18 at 10:15 -0400, Jeff Layton wrote: > > On Wed, 18 Apr 2012 15:13:13 +0100 > > Luis Henriques <luis.henriques@canonical.com> wrote: > > > > > On Wed, Apr 18, 2012 at 02:04:26PM +0000, Myklebust, Trond wrote: > > > > On Wed, 2012-04-18 at 14:57 +0100, Luis Henriques wrote: > > > > > Hi Jeff, > > > > > > > > > > On Wed, Apr 18, 2012 at 09:28:22AM -0400, Jeff Layton wrote: > > > > > > On Wed, 18 Apr 2012 12:26:10 +0100 > > > > > > Luis Henriques <luis.henriques@canonical.com> wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > We have a bug reporting a regression in mainline kernel. Basically, the > > > > > > > bug reporters are seeing lots of messages: > > > > > > > > > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > > > > > > > > > This happens when mounting a user's home directory over NFS. > > > > > > > > > > > > > > Is this a known issue being addressed at the moment? Is there any > > > > > > > information needed to help debugging the issue? > > > > > > > > > > > > > > The original bug report can be found here: > > > > > > > > > > > > > > http://bugs.launchpad.net/bugs/974664 > > > > > > > > > > > > > > And there's also a similar report for Fedora: > > > > > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=811138 > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not > > > > > > that familiar with this code so I could be wrong: > > > > > > > > > > > > -------------------[snip]-------------------- > > > > > > status = ops->recover_open(sp, state); > > > > > > if (status >= 0) { > > > > > > status = nfs4_reclaim_locks(state, ops); > > > > > > if (status >= 0) { > > > > > > spin_lock(&state->state_lock); > > > > > > list_for_each_entry(lock, &state->lock_states, ls_locks) { > > > > > > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) > > > > > > pr_warn_ratelimited("NFS: " > > > > > > "%s: Lock reclaim " > > > > > > "failed!\n", __func__); > > > > > > } > > > > > > spin_unlock(&state->state_lock); > > > > > > nfs4_put_open_state(state); > > > > > > goto restart; > > > > > > } > > > > > > } > > > > > > -------------------[snip]-------------------- > > > > > > > > > > > > Shouldn't the status check after nfs4_reclaim_locks be reversed? > > > > > > > > > > Thanks a lot for your help. Could you please take a look at the patch > > > > > below, just to make sure I understood you're suggestion correctly? I will > > > > > prepare a test kernel so that we can check whether it actually solves the > > > > > problem or not. > > > > > > > > > > Cheers, > > > > > -- > > > > > Luis > > > > > > > > > > > > > > > >From a1348f473c157439ac62f502eb45ca48f95e627f Mon Sep 17 00:00:00 2001 > > > > > From: Luis Henriques <luis.henriques@canonical.com> > > > > > Date: Wed, 18 Apr 2012 14:50:10 +0100 > > > > > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() > > > > > > > > > > There have been several bug reports, with the following messages on the > > > > > logs: > > > > > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > > > > > This happens, for example, when mounting a user's home directory over NFS. > > > > > > > > > > Thanks to Jeff Layton that identified the cause, this patch fixes an > > > > > incorrect status check on nfs4_reclaim_open_state(). > > > > > > > > > > Signed-off-by: Luis Henriques <luis.henriques@canonical.com> > > > > > --- > > > > > fs/nfs/nfs4state.c | 2 +- > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > > > > > index 07354b7..8b6acec 100644 > > > > > --- a/fs/nfs/nfs4state.c > > > > > +++ b/fs/nfs/nfs4state.c > > > > > @@ -1180,7 +1180,7 @@ restart: > > > > > atomic_inc(&state->count); > > > > > spin_unlock(&sp->so_lock); > > > > > status = ops->recover_open(sp, state); > > > > > - if (status >= 0) { > > > > > + if (status < 0) { > > > > > status = nfs4_reclaim_locks(state, ops); > > > > > if (status >= 0) { > > > > > spin_lock(&state->state_lock); > > > > > > > > Hell no! > > > > > > Ouch! Wrong one... > > > > > > From 005918b5eef853fd4d495743fef5a52ae62f825e Mon Sep 17 00:00:00 2001 > > > From: Luis Henriques <luis.henriques@canonical.com> > > > Date: Wed, 18 Apr 2012 15:08:39 +0100 > > > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state() > > > > > > There have been several bug reports, with the following messages on the > > > logs: > > > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed! > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed > > > > > > This happens, for example, when mounting a user's home directory over NFS. > > > > > > Thanks to Jeff Layton that identified the cause, this patch fixes an > > > incorrect status check on nfs4_reclaim_open_state(). > > > > > > Signed-off-by: Luis Henriques <luis.henriques@canonical.com> > > > --- > > > fs/nfs/nfs4state.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > > > index 07354b7..023e09f 100644 > > > --- a/fs/nfs/nfs4state.c > > > +++ b/fs/nfs/nfs4state.c > > > @@ -1182,7 +1182,7 @@ restart: > > > status = ops->recover_open(sp, state); > > > if (status >= 0) { > > > status = nfs4_reclaim_locks(state, ops); > > > - if (status >= 0) { > > > + if (status < 0) { > > > spin_lock(&state->state_lock); > > > list_for_each_entry(lock, &state->lock_states, ls_locks) { > > > if (!(lock->ls_flags & NFS_LOCK_INITIALIZED)) > > > > Yeah, that looks more reasonable, but again I'm not sure about this > > either way. This code has been this way for a long time and it's not > > clear to me why it's only now become a problem if it is wrong. > > Right. Random (and wrong!) changes such as the above won't fix the > problem. That code is perfectly correct (look at the nfs4_reclaim_locks > error cases to see why). > Ugh, ok I see and that code is correct even if it's a bit hard to follow... We clear the state_flag_bit on the first attempt against that lock so if it returns 0 (meaning a successful reclaim, we'll skip over it on the next pass through the loop. > Have you instead looked into what these applications are doing? Are they > perhaps opening the file read only, then trying to apply an exclusive > BSD lock (something which NFSv4 cannot support)? > > IOW: does the problem go away if you mount with 'local_lock=flock'? > I suspect that that is the trigger here. Sadly common among userspace apps... -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Regretion on NFS in mainline kernel 2012-04-18 14:51 ` Jeff Layton @ 2012-04-18 17:35 ` Luis Henriques 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust ` (2 subsequent siblings) 3 siblings, 0 replies; 21+ messages in thread From: Luis Henriques @ 2012-04-18 17:35 UTC (permalink / raw) To: Jeff Layton; +Cc: Myklebust, Trond, linux-nfs@vger.kernel.org On Wed, Apr 18, 2012 at 10:51:47AM -0400, Jeff Layton wrote: > On Wed, 18 Apr 2012 14:41:45 +0000 > "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote: > > > Right. Random (and wrong!) changes such as the above won't fix the > > problem. That code is perfectly correct (look at the nfs4_reclaim_locks > > error cases to see why). > > > > Ugh, ok I see and that code is correct even if it's a bit hard to > follow... > > We clear the state_flag_bit on the first attempt against that lock so > if it returns 0 (meaning a successful reclaim, we'll skip over it on > the next pass through the loop. > > > Have you instead looked into what these applications are doing? Are they > > perhaps opening the file read only, then trying to apply an exclusive > > BSD lock (something which NFSv4 cannot support)? > > > > IOW: does the problem go away if you mount with 'local_lock=flock'? > > > I suspect that that is the trigger here. Sadly common among userspace > apps... Ok, I'll ask some of the guys reporting the bug to try to use that option to mount. Is there any other useful information that could be collected to help sorting this out? Btw, there was someone reporting that an easy reproducer of this problem is to just run: $ /usr/bin/flock /file/on/nfs echo Fish Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-18 14:51 ` Jeff Layton 2012-04-18 17:35 ` Luis Henriques @ 2012-04-18 17:36 ` Trond Myklebust 2012-04-18 17:42 ` Jeff Layton ` (2 more replies) 2012-04-18 17:36 ` [PATCH 1/2] NFSv4: Ensure that the LOCK code sets exception->inode Trond Myklebust 2012-04-18 17:36 ` [PATCH 2/2] NFSv4: Ensure that we check lock exclusive/shared type against open modes Trond Myklebust 3 siblings, 3 replies; 21+ messages in thread From: Trond Myklebust @ 2012-04-18 17:36 UTC (permalink / raw) To: Jeff Layton; +Cc: Luis Henriques, linux-nfs@vger.kernel.org The first patch addresses the Oops. The second will hopefully address the looping. Trond Myklebust (2): NFSv4: Ensure that the LOCK code sets exception->inode NFSv4: Ensure that we check lock exclusive/shared type against open modes fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- 1 files changed, 21 insertions(+), 2 deletions(-) -- 1.7.7.6 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust @ 2012-04-18 17:42 ` Jeff Layton 2012-04-18 17:50 ` Luis Henriques 2012-04-19 9:30 ` Luis Henriques 2 siblings, 0 replies; 21+ messages in thread From: Jeff Layton @ 2012-04-18 17:42 UTC (permalink / raw) To: Trond Myklebust; +Cc: Luis Henriques, linux-nfs@vger.kernel.org On Wed, 18 Apr 2012 13:36:52 -0400 Trond Myklebust <Trond.Myklebust@netapp.com> wrote: > The first patch addresses the Oops. > The second will hopefully address the looping. > > Trond Myklebust (2): > NFSv4: Ensure that the LOCK code sets exception->inode > NFSv4: Ensure that we check lock exclusive/shared type against open > modes > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > 1 files changed, 21 insertions(+), 2 deletions(-) > ACK. Looks like the best way to handle this... I wonder though whether we should also push 14977489 to stable as well. While we expect NFS4ERR_OPENMODE from these codepaths, it's possible that we could get that in others when we don't expect it (from a misbehaving server perhaps?). That could cause an oops since we're not checking for a NULL inode in nfs4_handle_exception currently in 3.3.x kernels. -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust 2012-04-18 17:42 ` Jeff Layton @ 2012-04-18 17:50 ` Luis Henriques 2012-04-19 9:30 ` Luis Henriques 2 siblings, 0 replies; 21+ messages in thread From: Luis Henriques @ 2012-04-18 17:50 UTC (permalink / raw) To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs@vger.kernel.org On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > The first patch addresses the Oops. > The second will hopefully address the looping. > > Trond Myklebust (2): > NFSv4: Ensure that the LOCK code sets exception->inode > NFSv4: Ensure that we check lock exclusive/shared type against open > modes > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > 1 files changed, 21 insertions(+), 2 deletions(-) > > -- > 1.7.7.6 > Great, thanks a lot. I'll have this tested and let you know of the results. Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust 2012-04-18 17:42 ` Jeff Layton 2012-04-18 17:50 ` Luis Henriques @ 2012-04-19 9:30 ` Luis Henriques 2012-04-19 13:48 ` Luis Henriques 2 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-19 9:30 UTC (permalink / raw) To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs@vger.kernel.org On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > The first patch addresses the Oops. > The second will hopefully address the looping. > > Trond Myklebust (2): > NFSv4: Ensure that the LOCK code sets exception->inode > NFSv4: Ensure that we check lock exclusive/shared type against open > modes > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > 1 files changed, 21 insertions(+), 2 deletions(-) > > -- > 1.7.7.6 > Just to let you know we do have a few successful tests on these patches. You can check the details here: http://bugs.launchpad.net/bugs/974664 Although I haven't tested the patches myself, fell free to add a Tested-by to these patches. Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-19 9:30 ` Luis Henriques @ 2012-04-19 13:48 ` Luis Henriques 2012-04-19 17:29 ` Myklebust, Trond 0 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-19 13:48 UTC (permalink / raw) To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs@vger.kernel.org On Thu, Apr 19, 2012 at 10:30:25AM +0100, Luis Henriques wrote: > On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > > The first patch addresses the Oops. > > The second will hopefully address the looping. > > > > Trond Myklebust (2): > > NFSv4: Ensure that the LOCK code sets exception->inode > > NFSv4: Ensure that we check lock exclusive/shared type against open > > modes > > > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > > 1 files changed, 21 insertions(+), 2 deletions(-) > > > > -- > > 1.7.7.6 > > > Just to let you know we do have a few successful tests on these patches. > You can check the details here: > > http://bugs.launchpad.net/bugs/974664 > > Although I haven't tested the patches myself, fell free to add a Tested-by > to these patches. There's something else a forgot to mention, which is the fact that the bug reports were for kernel 3.2.14. So you may want to update the "Cc: stable@vger.kernel.org" on commit 487790f27df9bb27d3400486bd021dd59edc7589 to include at least this version. There are two another patches we have applied that are present in mainline but haven't made its way into stable: - 14977489ffdb80d4caf5a184ba41b23b02fbacd9 - 96dcadc2fdd111dca90d559f189a30c65394451a Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-19 13:48 ` Luis Henriques @ 2012-04-19 17:29 ` Myklebust, Trond 2012-04-19 20:46 ` Luis Henriques 0 siblings, 1 reply; 21+ messages in thread From: Myklebust, Trond @ 2012-04-19 17:29 UTC (permalink / raw) To: Luis Henriques; +Cc: Jeff Layton, linux-nfs@vger.kernel.org T24gVGh1LCAyMDEyLTA0LTE5IGF0IDE0OjQ4ICswMTAwLCBMdWlzIEhlbnJpcXVlcyB3cm90ZToN Cj4gT24gVGh1LCBBcHIgMTksIDIwMTIgYXQgMTA6MzA6MjVBTSArMDEwMCwgTHVpcyBIZW5yaXF1 ZXMgd3JvdGU6DQo+ID4gT24gV2VkLCBBcHIgMTgsIDIwMTIgYXQgMDE6MzY6NTJQTSAtMDQwMCwg VHJvbmQgTXlrbGVidXN0IHdyb3RlOg0KPiA+ID4gVGhlIGZpcnN0IHBhdGNoIGFkZHJlc3NlcyB0 aGUgT29wcy4NCj4gPiA+IFRoZSBzZWNvbmQgd2lsbCBob3BlZnVsbHkgYWRkcmVzcyB0aGUgbG9v cGluZy4NCj4gPiA+IA0KPiA+ID4gVHJvbmQgTXlrbGVidXN0ICgyKToNCj4gPiA+ICAgTkZTdjQ6 IEVuc3VyZSB0aGF0IHRoZSBMT0NLIGNvZGUgc2V0cyBleGNlcHRpb24tPmlub2RlDQo+ID4gPiAg IE5GU3Y0OiBFbnN1cmUgdGhhdCB3ZSBjaGVjayBsb2NrIGV4Y2x1c2l2ZS9zaGFyZWQgdHlwZSBh Z2FpbnN0IG9wZW4NCj4gPiA+ICAgICBtb2Rlcw0KPiA+ID4gDQo+ID4gPiAgZnMvbmZzL25mczRw cm9jLmMgfCAgIDIzICsrKysrKysrKysrKysrKysrKysrKy0tDQo+ID4gPiAgMSBmaWxlcyBjaGFu Z2VkLCAyMSBpbnNlcnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQ0KPiA+ID4gDQo+ID4gPiAtLSAN Cj4gPiA+IDEuNy43LjYNCj4gPiA+IA0KPiA+IEp1c3QgdG8gbGV0IHlvdSBrbm93IHdlIGRvIGhh dmUgYSBmZXcgc3VjY2Vzc2Z1bCB0ZXN0cyBvbiB0aGVzZSBwYXRjaGVzLg0KPiA+IFlvdSBjYW4g Y2hlY2sgdGhlIGRldGFpbHMgaGVyZToNCj4gPiANCj4gPiBodHRwOi8vYnVncy5sYXVuY2hwYWQu bmV0L2J1Z3MvOTc0NjY0DQo+ID4gDQo+ID4gQWx0aG91Z2ggSSBoYXZlbid0IHRlc3RlZCB0aGUg cGF0Y2hlcyBteXNlbGYsIGZlbGwgZnJlZSB0byBhZGQgYSBUZXN0ZWQtYnkNCj4gPiB0byB0aGVz ZSBwYXRjaGVzLg0KPiANCj4gVGhlcmUncyBzb21ldGhpbmcgZWxzZSBhIGZvcmdvdCB0byBtZW50 aW9uLCB3aGljaCBpcyB0aGUgZmFjdCB0aGF0IHRoZSBidWcNCj4gcmVwb3J0cyB3ZXJlIGZvciBr ZXJuZWwgMy4yLjE0LiAgU28geW91IG1heSB3YW50IHRvIHVwZGF0ZSB0aGUNCj4gIkNjOiBzdGFi bGVAdmdlci5rZXJuZWwub3JnIiBvbiBjb21taXQgNDg3NzkwZjI3ZGY5YmIyN2QzNDAwNDg2YmQw MjFkZDU5ZWRjNzU4OQ0KPiB0byBpbmNsdWRlIGF0IGxlYXN0IHRoaXMgdmVyc2lvbi4NCg0KWWVw LiBJIGFsc28gc2F3IHRoYXQgdGhlcmUgaXMgYSBuZWVkIGZvciBpdCBpbiAzLjAuMjcsIHNvIEkn bGwganVzdA0KcmVtb3ZlIHRoYXQgPj0gMy4zLjEuLi4NCg0KPiBUaGVyZSBhcmUgdHdvIGFub3Ro ZXIgcGF0Y2hlcyB3ZSBoYXZlIGFwcGxpZWQgdGhhdCBhcmUgcHJlc2VudCBpbiBtYWlubGluZQ0K PiBidXQgaGF2ZW4ndCBtYWRlIGl0cyB3YXkgaW50byBzdGFibGU6DQo+ICAtIDE0OTc3NDg5ZmZk YjgwZDRjYWY1YTE4NGJhNDFiMjNiMDJmYmFjZDkNCj4gIC0gOTZkY2FkYzJmZGQxMTFkY2E5MGQ1 NTlmMTg5YTMwYzY1Mzk0NDUxYQ0KDQpJIGRvbid0IHBsYW4gb24gc2VuZGluZyB0aGVzZSAyIGNv bW1pdHMgdG8gc3RhYmxlIHVubGVzcyB3ZSBzZWUgc29tZQ0Kc3BlY2lmaWMgcHJvYmxlbXMgdGhh dCBuZWVkIHRvIGJlIGNvcnJlY3RlZC4gSSB1bmRlcnN0YW5kIHRoYXQgc2VlaW5nIGENCk5GUzRF UlJfT1BFTk1PREUgYXQgdGhlIHdyb25nIHRpbWUgY291bGQgdGhlb3JldGljYWxseSBjYXVzZSBh biBPb3BzDQp3aXRob3V0IHRoZSAxNDk3NzQ4IGNvbW1pdCwgYnV0IGJyb2tlbiBzZXJ2ZXJzIGNh biB3cmVhayBhbGwgc29ydHMgb2YNCmhhdm9jIGFueXdheS4gVGhlcmUgaXNuJ3QgbXVjaCB5b3Ug Y2FuIGRvIHRvIHByb3RlY3QgYWdhaW5zdCB0aGVtLg0KDQpDaGVlcnMNCiAgVHJvbmQNCg0KLS0g DQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lcg0KDQpOZXRBcHAN ClRyb25kLk15a2xlYnVzdEBuZXRhcHAuY29tDQp3d3cubmV0YXBwLmNvbQ0KDQo= ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-19 17:29 ` Myklebust, Trond @ 2012-04-19 20:46 ` Luis Henriques 2013-01-04 10:05 ` Mario Bachmann 0 siblings, 1 reply; 21+ messages in thread From: Luis Henriques @ 2012-04-19 20:46 UTC (permalink / raw) To: Myklebust, Trond; +Cc: Jeff Layton, linux-nfs@vger.kernel.org On Thu, Apr 19, 2012 at 05:29:09PM +0000, Myklebust, Trond wrote: > On Thu, 2012-04-19 at 14:48 +0100, Luis Henriques wrote: > > On Thu, Apr 19, 2012 at 10:30:25AM +0100, Luis Henriques wrote: > > > On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > > > > The first patch addresses the Oops. > > > > The second will hopefully address the looping. > > > > > > > > Trond Myklebust (2): > > > > NFSv4: Ensure that the LOCK code sets exception->inode > > > > NFSv4: Ensure that we check lock exclusive/shared type against open > > > > modes > > > > > > > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > > > > 1 files changed, 21 insertions(+), 2 deletions(-) > > > > > > > > -- > > > > 1.7.7.6 > > > > > > > Just to let you know we do have a few successful tests on these patches. > > > You can check the details here: > > > > > > http://bugs.launchpad.net/bugs/974664 > > > > > > Although I haven't tested the patches myself, fell free to add a Tested-by > > > to these patches. > > > > There's something else a forgot to mention, which is the fact that the bug > > reports were for kernel 3.2.14. So you may want to update the > > "Cc: stable@vger.kernel.org" on commit 487790f27df9bb27d3400486bd021dd59edc7589 > > to include at least this version. > > Yep. I also saw that there is a need for it in 3.0.27, so I'll just > remove that >= 3.3.1... Great, thanks. > > There are two another patches we have applied that are present in mainline > > but haven't made its way into stable: > > - 14977489ffdb80d4caf5a184ba41b23b02fbacd9 > > - 96dcadc2fdd111dca90d559f189a30c65394451a > > I don't plan on sending these 2 commits to stable unless we see some > specific problems that need to be corrected. I understand that seeing a > NFS4ERR_OPENMODE at the wrong time could theoretically cause an Oops > without the 1497748 commit, but broken servers can wreak all sorts of > havoc anyway. There isn't much you can do to protect against them. I'm afraid I can't follow you on the protocol-related details, but the bug report I referred above was caused by an Oops which could be avoided by improving the robustness of the code (which the first commit does by checking for a NULL). Also, I'm not sure the NFS server being used by the bug reporter would fit into the "broken servers" category, as it seems to be working OK. For all of this, I would be glad to see the first commit on stable. Anyway, thanks a lot for your help sorting this out. Cheers, -- Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2012-04-19 20:46 ` Luis Henriques @ 2013-01-04 10:05 ` Mario Bachmann 2013-01-04 14:15 ` Myklebust, Trond 0 siblings, 1 reply; 21+ messages in thread From: Mario Bachmann @ 2013-01-04 10:05 UTC (permalink / raw) To: linux-nfs Luis Henriques <luis.henriques@...> writes: > > On Thu, Apr 19, 2012 at 05:29:09PM +0000, Myklebust, Trond wrote: > > On Thu, 2012-04-19 at 14:48 +0100, Luis Henriques wrote: > > > On Thu, Apr 19, 2012 at 10:30:25AM +0100, Luis Henriques wrote: > > > > On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > > > > > The first patch addresses the Oops. > > > > > The second will hopefully address the looping. > > > > > > > > > > Trond Myklebust (2): > > > > > NFSv4: Ensure that the LOCK code sets exception->inode > > > > > NFSv4: Ensure that we check lock exclusive/shared type against open > > > > > modes > > > > > > > > > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > > > > > 1 files changed, 21 insertions(+), 2 deletions(-) I use kernel 3.7.1 from kernel.org which has the patches included, which should fix the problems (I checked the source code). But: I get "NFS: nfs4_reclaim_open_state: Lock reclaim failed!" on the client, which mounts home over nfs4. I get senseless network traffic (50MB/s on my gigabit network). Finally I get Kernel Ops on the Server sometimes. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2013-01-04 10:05 ` Mario Bachmann @ 2013-01-04 14:15 ` Myklebust, Trond 2013-01-06 10:48 ` Mario Bachmann 0 siblings, 1 reply; 21+ messages in thread From: Myklebust, Trond @ 2013-01-04 14:15 UTC (permalink / raw) To: Mario Bachmann; +Cc: linux-nfs@vger.kernel.org On Fri, 2013-01-04 at 10:05 +0000, Mario Bachmann wrote: > Luis Henriques <luis.henriques@...> writes: > > > > > On Thu, Apr 19, 2012 at 05:29:09PM +0000, Myklebust, Trond wrote: > > > On Thu, 2012-04-19 at 14:48 +0100, Luis Henriques wrote: > > > > On Thu, Apr 19, 2012 at 10:30:25AM +0100, Luis Henriques wrote: > > > > > On Wed, Apr 18, 2012 at 01:36:52PM -0400, Trond Myklebust wrote: > > > > > > The first patch addresses the Oops. > > > > > > The second will hopefully address the looping. > > > > > > > > > > > > Trond Myklebust (2): > > > > > > NFSv4: Ensure that the LOCK code sets exception->inode > > > > > > NFSv4: Ensure that we check lock exclusive/shared type against open > > > > > > modes > > > > > > > > > > > > fs/nfs/nfs4proc.c | 23 +++++++++++++++++++++-- > > > > > > 1 files changed, 21 insertions(+), 2 deletions(-) > I use kernel 3.7.1 from kernel.org which has the patches included, > which should fix the problems (I checked the source code). But: > I get "NFS: nfs4_reclaim_open_state: Lock reclaim failed!" on the > client, which mounts home over nfs4. > I get senseless network traffic (50MB/s on my gigabit network). > Finally I get Kernel Ops on the Server sometimes. Lock reclaiming is not guaranteed to succeed; it all depends on why the lock was lost in the first place. Did the server restart, or did a network partition occur? Was some other process able to take that lock (or even delete the file, doh!), before your client could reclaim it? That said, if the server is Oopsing, then you have a broken server. I'd suggest posting that Oops to this list first. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Fix regression on NFS in mainline kernel 2013-01-04 14:15 ` Myklebust, Trond @ 2013-01-06 10:48 ` Mario Bachmann 0 siblings, 0 replies; 21+ messages in thread From: Mario Bachmann @ 2013-01-06 10:48 UTC (permalink / raw) To: linux-nfs Myklebust, Trond <Trond.Myklebust@...> writes: > > I use kernel 3.7.1 from kernel.org which has the patches included, > > which should fix the problems (I checked the source code). But: > > I get "NFS: nfs4_reclaim_open_state: Lock reclaim failed!" on the > > client, which mounts home over nfs4. > > I get senseless network traffic (50MB/s on my gigabit network). > > Finally I get Kernel Ops on the Server sometimes. > > Lock reclaiming is not guaranteed to succeed; it all depends on why the > lock was lost in the first place. Did the server restart, or did a > network partition occur? Was some other process able to take that lock > (or even delete the file, doh!), before your client could reclaim it? > > That said, if the server is Oopsing, then you have a broken server. I'd > suggest posting that Oops to this list first. > Thanks for the hint, the server was really broken! The power supply of the server had some broken capacitors. With a new power supply there are no oopses any more! So my reported problems with locking are worthless, because it happened with the broken server. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/2] NFSv4: Ensure that the LOCK code sets exception->inode 2012-04-18 14:51 ` Jeff Layton 2012-04-18 17:35 ` Luis Henriques 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust @ 2012-04-18 17:36 ` Trond Myklebust 2012-04-18 17:36 ` [PATCH 2/2] NFSv4: Ensure that we check lock exclusive/shared type against open modes Trond Myklebust 3 siblings, 0 replies; 21+ messages in thread From: Trond Myklebust @ 2012-04-18 17:36 UTC (permalink / raw) To: Jeff Layton; +Cc: Luis Henriques, linux-nfs@vger.kernel.org All callers of nfs4_handle_exception() that need to handle NFS4ERR_OPENMODE correctly should set exception->inode Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>= 3.3.1] --- fs/nfs/nfs4proc.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index f82bde0..3c787d0 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4558,7 +4558,9 @@ static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *f static int nfs4_lock_reclaim(struct nfs4_state *state, struct file_lock *request) { struct nfs_server *server = NFS_SERVER(state->inode); - struct nfs4_exception exception = { }; + struct nfs4_exception exception = { + .inode = state->inode, + }; int err; do { @@ -4576,7 +4578,9 @@ static int nfs4_lock_reclaim(struct nfs4_state *state, struct file_lock *request static int nfs4_lock_expired(struct nfs4_state *state, struct file_lock *request) { struct nfs_server *server = NFS_SERVER(state->inode); - struct nfs4_exception exception = { }; + struct nfs4_exception exception = { + .inode = state->inode, + }; int err; err = nfs4_set_lock_state(state, request); @@ -4676,6 +4680,7 @@ static int nfs4_proc_setlk(struct nfs4_state *state, int cmd, struct file_lock * { struct nfs4_exception exception = { .state = state, + .inode = state->inode, }; int err; -- 1.7.7.6 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/2] NFSv4: Ensure that we check lock exclusive/shared type against open modes 2012-04-18 14:51 ` Jeff Layton ` (2 preceding siblings ...) 2012-04-18 17:36 ` [PATCH 1/2] NFSv4: Ensure that the LOCK code sets exception->inode Trond Myklebust @ 2012-04-18 17:36 ` Trond Myklebust 3 siblings, 0 replies; 21+ messages in thread From: Trond Myklebust @ 2012-04-18 17:36 UTC (permalink / raw) To: Jeff Layton; +Cc: Luis Henriques, linux-nfs@vger.kernel.org Since we may be simulating flock() locks using NFS byte range locks, we can't rely on the VFS having checked the file open mode for us. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org --- fs/nfs/nfs4proc.c | 14 ++++++++++++++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 3c787d0..ba837d9 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4726,6 +4726,20 @@ nfs4_proc_lock(struct file *filp, int cmd, struct file_lock *request) if (state == NULL) return -ENOLCK; + /* + * Don't rely on the VFS having checked the file open mode, + * since it won't do this for flock() locks. + */ + switch (request->fl_type & (F_RDLCK|F_WRLCK|F_UNLCK)) { + case F_RDLCK: + if (!(filp->f_mode & FMODE_READ)) + return -EBADF; + break; + case F_WRLCK: + if (!(filp->f_mode & FMODE_WRITE)) + return -EBADF; + } + do { status = nfs4_proc_setlk(state, cmd, request); if ((status != -EAGAIN) || IS_SETLK(cmd)) -- 1.7.7.6 ^ permalink raw reply related [flat|nested] 21+ messages in thread
end of thread, other threads:[~2013-01-06 10:49 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-04-18 11:26 Regretion on NFS in mainline kernel Luis Henriques 2012-04-18 13:28 ` Jeff Layton 2012-04-18 13:57 ` Luis Henriques 2012-04-18 14:04 ` Myklebust, Trond 2012-04-18 14:13 ` Luis Henriques 2012-04-18 14:15 ` Jeff Layton 2012-04-18 14:41 ` Myklebust, Trond 2012-04-18 14:51 ` Jeff Layton 2012-04-18 17:35 ` Luis Henriques 2012-04-18 17:36 ` [PATCH 0/2] Fix regression " Trond Myklebust 2012-04-18 17:42 ` Jeff Layton 2012-04-18 17:50 ` Luis Henriques 2012-04-19 9:30 ` Luis Henriques 2012-04-19 13:48 ` Luis Henriques 2012-04-19 17:29 ` Myklebust, Trond 2012-04-19 20:46 ` Luis Henriques 2013-01-04 10:05 ` Mario Bachmann 2013-01-04 14:15 ` Myklebust, Trond 2013-01-06 10:48 ` Mario Bachmann 2012-04-18 17:36 ` [PATCH 1/2] NFSv4: Ensure that the LOCK code sets exception->inode Trond Myklebust 2012-04-18 17:36 ` [PATCH 2/2] NFSv4: Ensure that we check lock exclusive/shared type against open modes Trond Myklebust
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).