diff for duplicates of <1507654135.4442.4.camel@primarydata.com> diff --git a/a/1.txt b/N1/1.txt index 2076e36..5ef4831 100644 --- a/a/1.txt +++ b/N1/1.txt @@ -1,47 +1,69 @@ -T24gVHVlLCAyMDE3LTEwLTEwIGF0IDA3OjAzIC0wNzAwLCB0akBrZXJuZWwub3JnIHdyb3RlOg0K -PiBIZWxsbywgVHJvbmQuDQo+IA0KPiBPbiBNb24sIE9jdCAwOSwgMjAxNyBhdCAwNjozMjoxM1BN -ICswMDAwLCBUcm9uZCBNeWtsZWJ1c3Qgd3JvdGU6DQo+ID4gT24gTW9uLCAyMDE3LTEwLTA5IGF0 -IDE5OjE3ICswMTAwLCBMb3JlbnpvIFBpZXJhbGlzaSB3cm90ZToNCj4gPiA+IEkgaGF2ZSBydW4g -aW50byB0aGUgbG9ja2RlcCB3YXJuaW5nIGJlbG93IHdoaWxlIHJ1bm5pbmcgdjQuMTQtDQo+ID4g -PiByYzMvcmM0DQo+ID4gPiBvbiBhbiBBUk02NCBkZWZjb25maWcgSnVubyBkZXYgYm9hcmQgLSBy -ZXBvcnRpbmcgaXQgdG8gY2hlY2sNCj4gPiA+IHdoZXRoZXINCj4gPiA+IGl0IGlzIGEga25vd24v -Z2VudWluZSBpc3N1ZS4NCj4gPiA+IA0KPiA+ID4gUGxlYXNlIGxldCBtZSBrbm93IGlmIHlvdSBu -ZWVkIGZ1cnRoZXIgZGVidWcgZGF0YSBvciBuZWVkIHNvbWUNCj4gPiA+IHNwZWNpZmljIHRlc3Rz -Lg0KPiA+ID4gDQo+ID4gPiBbICAgIDYuMjA5Mzg0XQ0KPiA+ID4gPT09PT09PT09PT09PT09PT09 -PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09DQo+ID4gPiBbICAgIDYuMjE1NTY5 -XSBXQVJOSU5HOiBwb3NzaWJsZSBjaXJjdWxhciBsb2NraW5nIGRlcGVuZGVuY3kNCj4gPiA+IGRl -dGVjdGVkDQo+ID4gPiBbICAgIDYuMjIxNzU1XSA0LjE0LjAtcmM0ICM1NCBOb3QgdGFpbnRlZA0K -PiA+ID4gWyAgICA2LjIyNTUwM10gLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t -LS0tLS0tLS0tLS0tLS0NCj4gPiA+IC0tLS0NCj4gPiA+IFsgICAgNi4yMzE2ODldIGt3b3JrZXIv -NDowSC8zMiBpcyB0cnlpbmcgdG8gYWNxdWlyZSBsb2NrOg0KPiA+ID4gWyAgICA2LjIzNjgzMF0g -ICgoJnRhc2stPnUudGtfd29yaykpeysuKy59LCBhdDoNCj4gPiA+IFs8ZmZmZjAwMDAwODBlNjRj -Yz5dDQo+ID4gPiBwcm9jZXNzX29uZV93b3JrKzB4MWNjLzB4M2YwDQo+ID4gPiBbICAgIDYuMjQ1 -NDcyXSANCj4gPiA+ICAgICAgICAgICAgICAgIGJ1dCB0YXNrIGlzIGFscmVhZHkgaG9sZGluZyBs -b2NrOg0KPiA+ID4gWyAgICA2LjI1MTMwOV0gICgieHBydGlvZCIpeysuKy59LCBhdDogWzxmZmZm -MDAwMDA4MGU2NGNjPl0NCj4gPiA+IHByb2Nlc3Nfb25lX3dvcmsrMHgxY2MvMHgzZjANCj4gPiA+ -IFsgICAgNi4yNTkxNThdIA0KPiA+ID4gICAgICAgICAgICAgICAgd2hpY2ggbG9jayBhbHJlYWR5 -IGRlcGVuZHMgb24gdGhlIG5ldyBsb2NrLg0KPiA+ID4gDQo+ID4gPiBbICAgIDYuMjY3MzQ1XSAN 
-Cj4gPiA+ICAgICAgICAgICAgICAgIHRoZSBleGlzdGluZyBkZXBlbmRlbmN5IGNoYWluIChpbiBy -ZXZlcnNlIG9yZGVyKQ0KPiA+ID4gaXM6DQo+IA0KPiAuLg0KPiA+IEFkZGluZyBUZWp1biBhbmQg -TGFpLCBzaW5jZSB0aGlzIGxvb2tzIGxpa2UgYSB3b3JrcXVldWUgbG9ja2luZw0KPiA+IGlzc3Vl -Lg0KPiANCj4gSXQgbG9va3MgYSBiaXQgY3J5cHRpYyBidXQgaXQncyB3YXJuaW5nIGFnYWluc3Qg -dGhlIGZvbGxvd2luZyBjYXNlLg0KPiANCj4gMS4gTWVtb3J5IHByZXNzdXJlIGlzIGhpZ2ggYW5k -IHJlc2N1ZXIga2lja3MgaW4gZm9yIHRoZSB4cHJ0aW9kDQo+ICAgIHdvcmtxdWV1ZS4gIFRoZXJl -IGFyZSBubyBvdGhlciBrd29ya2VycyBzZXJ2aW5nIHRoZSB3b3JrcXVldWUuDQo+IA0KPiAyLiBU -aGUgcmVzY3VlciBydW5zIHRoZSB4cHRyX2Rlc3Ryb3kgcGF0aCBhbmQgZW5kcyB1cCBjYWxsaW5n -DQo+ICAgIGNhbmNlbF93b3JrX3N5bmMoKSBvbiBhIHdvcmsgaXRlbSB3aGljaCBpcyBxdWV1ZWQg -b24geHBydGlvZC4NCj4gDQo+IDMuIFRoZSB3b3JrIGl0ZW0gaXMgcGVuZGluZyBvbiB0aGUgc2Ft -ZSB3b3JrcXVldWUgYW5kIGFzc3VtaW5nIHRoYXQNCj4gICAgbWVtb3J5IHByZXNzdXJlIGRvZXNu -J3QgbGV0IG9mZiAobGV0J3Mgc2F5IHJlY2xhaW0gaXMgdHJ5aW5nIHRvDQo+ICAgIGtpY2sgb2Zm -IG5mcyBwYWdlcyksIHRoZSBvbmx5IHdheSBpdCBjYW4gZ2V0IGV4ZWN1dGVkIGlzIGJ5IHRoZQ0K -PiAgICByZXNjdWVyIHdoaWNoIGlzIHdhaXRpbmcgZm9yIHRoZSB3b3JrIGl0ZW0gLSBhbiBBLUIt -QSBkZWFkbG9jay4NCj4gDQoNCkhpIFRlanVuLA0KDQpUaGFua3MgZm9yIHRoZSBleHBsYW5hdGlv -bi4gV2hhdCBJJ20gbm90IHJlYWxseSB1bmRlcnN0YW5kaW5nIGhlcmUNCnRob3VnaCwgaXMgaG93 -IHRoZSB3b3JrIGl0ZW0gY291bGQgYmUgcXVldWVkIGF0IGFsbC4gV2UgaGF2ZSBhDQp3YWl0X29u -X2JpdF9sb2NrKCkgaW4geHBydF9kZXN0cm95KCkgdGhhdCBzaG91bGQgbWVhbiB0aGUgeHBydC0N -Cj50YXNrX2NsZWFudXAgd29yayBpdGVtIGhhcyBjb21wbGV0ZWQgcnVubmluZywgYW5kIHRoYXQg -aXQgY2Fubm90IGJlDQpyZXF1ZXVlZC4NCg0KSXMgdGhlcmUgYSBwb3NzaWJpbGl0eSB0aGF0IHRo -ZSBmbHVzaF9xdWV1ZSgpIG1pZ2h0IGJlIHRyaWdnZXJlZA0KZGVzcGl0ZSB0aGUgd29yayBpdGVt -IG5vdCBiZWluZyBxdWV1ZWQ/DQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xp -ZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhDQp0cm9uZC5teWtsZWJ1c3RAcHJpbWFyeWRhdGEu -Y29tDQo= +On Tue, 2017-10-10 at 07:03 -0700, tj@kernel.org wrote: +> Hello, Trond. 
+> +> On Mon, Oct 09, 2017 at 06:32:13PM +0000, Trond Myklebust wrote: +> > On Mon, 2017-10-09 at 19:17 +0100, Lorenzo Pieralisi wrote: +> > > I have run into the lockdep warning below while running v4.14- +> > > rc3/rc4 +> > > on an ARM64 defconfig Juno dev board - reporting it to check +> > > whether +> > > it is a known/genuine issue. +> > > +> > > Please let me know if you need further debug data or need some +> > > specific tests. +> > > +> > > [ 6.209384] +> > > ====================================================== +> > > [ 6.215569] WARNING: possible circular locking dependency +> > > detected +> > > [ 6.221755] 4.14.0-rc4 #54 Not tainted +> > > [ 6.225503] -------------------------------------------------- +> > > ---- +> > > [ 6.231689] kworker/4:0H/32 is trying to acquire lock: +> > > [ 6.236830] ((&task->u.tk_work)){+.+.}, at: +> > > [<ffff0000080e64cc>] +> > > process_one_work+0x1cc/0x3f0 +> > > [ 6.245472] +> > > but task is already holding lock: +> > > [ 6.251309] ("xprtiod"){+.+.}, at: [<ffff0000080e64cc>] +> > > process_one_work+0x1cc/0x3f0 +> > > [ 6.259158] +> > > which lock already depends on the new lock. +> > > +> > > [ 6.267345] +> > > the existing dependency chain (in reverse order) +> > > is: +> +> .. +> > Adding Tejun and Lai, since this looks like a workqueue locking +> > issue. +> +> It looks a bit cryptic but it's warning against the following case. +> +> 1. Memory pressure is high and rescuer kicks in for the xprtiod +> workqueue. There are no other kworkers serving the workqueue. +> +> 2. The rescuer runs the xptr_destroy path and ends up calling +> cancel_work_sync() on a work item which is queued on xprtiod. +> +> 3. The work item is pending on the same workqueue and assuming that +> memory pressure doesn't let off (let's say reclaim is trying to +> kick off nfs pages), the only way it can get executed is by the +> rescuer which is waiting for the work item - an A-B-A deadlock. +> + +Hi Tejun, + +Thanks for the explanation. 
What I'm not really understanding here +though, is how the work item could be queued at all. We have a +wait_on_bit_lock() in xprt_destroy() that should mean the xprt- +>task_cleanup work item has completed running, and that it cannot be +requeued. + +Is there a possibility that the flush_queue() might be triggered +despite the work item not being queued? + +-- +Trond Myklebust +Linux NFS client maintainer, PrimaryData +trond.myklebust@primarydata.com diff --git a/a/content_digest b/N1/content_digest index 66f24c6..8418b4f 100644 --- a/a/content_digest +++ b/N1/content_digest @@ -14,52 +14,74 @@ " anna.schumaker@netapp.com <anna.schumaker@netapp.com>\0" "\00:1\0" "b\0" - "T24gVHVlLCAyMDE3LTEwLTEwIGF0IDA3OjAzIC0wNzAwLCB0akBrZXJuZWwub3JnIHdyb3RlOg0K\n" - "PiBIZWxsbywgVHJvbmQuDQo+IA0KPiBPbiBNb24sIE9jdCAwOSwgMjAxNyBhdCAwNjozMjoxM1BN\n" - "ICswMDAwLCBUcm9uZCBNeWtsZWJ1c3Qgd3JvdGU6DQo+ID4gT24gTW9uLCAyMDE3LTEwLTA5IGF0\n" - "IDE5OjE3ICswMTAwLCBMb3JlbnpvIFBpZXJhbGlzaSB3cm90ZToNCj4gPiA+IEkgaGF2ZSBydW4g\n" - "aW50byB0aGUgbG9ja2RlcCB3YXJuaW5nIGJlbG93IHdoaWxlIHJ1bm5pbmcgdjQuMTQtDQo+ID4g\n" - "PiByYzMvcmM0DQo+ID4gPiBvbiBhbiBBUk02NCBkZWZjb25maWcgSnVubyBkZXYgYm9hcmQgLSBy\n" - "ZXBvcnRpbmcgaXQgdG8gY2hlY2sNCj4gPiA+IHdoZXRoZXINCj4gPiA+IGl0IGlzIGEga25vd24v\n" - "Z2VudWluZSBpc3N1ZS4NCj4gPiA+IA0KPiA+ID4gUGxlYXNlIGxldCBtZSBrbm93IGlmIHlvdSBu\n" - "ZWVkIGZ1cnRoZXIgZGVidWcgZGF0YSBvciBuZWVkIHNvbWUNCj4gPiA+IHNwZWNpZmljIHRlc3Rz\n" - "Lg0KPiA+ID4gDQo+ID4gPiBbICAgIDYuMjA5Mzg0XQ0KPiA+ID4gPT09PT09PT09PT09PT09PT09\n" - "PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09DQo+ID4gPiBbICAgIDYuMjE1NTY5\n" - "XSBXQVJOSU5HOiBwb3NzaWJsZSBjaXJjdWxhciBsb2NraW5nIGRlcGVuZGVuY3kNCj4gPiA+IGRl\n" - "dGVjdGVkDQo+ID4gPiBbICAgIDYuMjIxNzU1XSA0LjE0LjAtcmM0ICM1NCBOb3QgdGFpbnRlZA0K\n" - "PiA+ID4gWyAgICA2LjIyNTUwM10gLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t\n" - "LS0tLS0tLS0tLS0tLS0NCj4gPiA+IC0tLS0NCj4gPiA+IFsgICAgNi4yMzE2ODldIGt3b3JrZXIv\n" - 
"NDowSC8zMiBpcyB0cnlpbmcgdG8gYWNxdWlyZSBsb2NrOg0KPiA+ID4gWyAgICA2LjIzNjgzMF0g\n" - "ICgoJnRhc2stPnUudGtfd29yaykpeysuKy59LCBhdDoNCj4gPiA+IFs8ZmZmZjAwMDAwODBlNjRj\n" - "Yz5dDQo+ID4gPiBwcm9jZXNzX29uZV93b3JrKzB4MWNjLzB4M2YwDQo+ID4gPiBbICAgIDYuMjQ1\n" - "NDcyXSANCj4gPiA+ICAgICAgICAgICAgICAgIGJ1dCB0YXNrIGlzIGFscmVhZHkgaG9sZGluZyBs\n" - "b2NrOg0KPiA+ID4gWyAgICA2LjI1MTMwOV0gICgieHBydGlvZCIpeysuKy59LCBhdDogWzxmZmZm\n" - "MDAwMDA4MGU2NGNjPl0NCj4gPiA+IHByb2Nlc3Nfb25lX3dvcmsrMHgxY2MvMHgzZjANCj4gPiA+\n" - "IFsgICAgNi4yNTkxNThdIA0KPiA+ID4gICAgICAgICAgICAgICAgd2hpY2ggbG9jayBhbHJlYWR5\n" - "IGRlcGVuZHMgb24gdGhlIG5ldyBsb2NrLg0KPiA+ID4gDQo+ID4gPiBbICAgIDYuMjY3MzQ1XSAN\n" - "Cj4gPiA+ICAgICAgICAgICAgICAgIHRoZSBleGlzdGluZyBkZXBlbmRlbmN5IGNoYWluIChpbiBy\n" - "ZXZlcnNlIG9yZGVyKQ0KPiA+ID4gaXM6DQo+IA0KPiAuLg0KPiA+IEFkZGluZyBUZWp1biBhbmQg\n" - "TGFpLCBzaW5jZSB0aGlzIGxvb2tzIGxpa2UgYSB3b3JrcXVldWUgbG9ja2luZw0KPiA+IGlzc3Vl\n" - "Lg0KPiANCj4gSXQgbG9va3MgYSBiaXQgY3J5cHRpYyBidXQgaXQncyB3YXJuaW5nIGFnYWluc3Qg\n" - "dGhlIGZvbGxvd2luZyBjYXNlLg0KPiANCj4gMS4gTWVtb3J5IHByZXNzdXJlIGlzIGhpZ2ggYW5k\n" - "IHJlc2N1ZXIga2lja3MgaW4gZm9yIHRoZSB4cHJ0aW9kDQo+ICAgIHdvcmtxdWV1ZS4gIFRoZXJl\n" - "IGFyZSBubyBvdGhlciBrd29ya2VycyBzZXJ2aW5nIHRoZSB3b3JrcXVldWUuDQo+IA0KPiAyLiBU\n" - "aGUgcmVzY3VlciBydW5zIHRoZSB4cHRyX2Rlc3Ryb3kgcGF0aCBhbmQgZW5kcyB1cCBjYWxsaW5n\n" - "DQo+ICAgIGNhbmNlbF93b3JrX3N5bmMoKSBvbiBhIHdvcmsgaXRlbSB3aGljaCBpcyBxdWV1ZWQg\n" - "b24geHBydGlvZC4NCj4gDQo+IDMuIFRoZSB3b3JrIGl0ZW0gaXMgcGVuZGluZyBvbiB0aGUgc2Ft\n" - "ZSB3b3JrcXVldWUgYW5kIGFzc3VtaW5nIHRoYXQNCj4gICAgbWVtb3J5IHByZXNzdXJlIGRvZXNu\n" - "J3QgbGV0IG9mZiAobGV0J3Mgc2F5IHJlY2xhaW0gaXMgdHJ5aW5nIHRvDQo+ICAgIGtpY2sgb2Zm\n" - "IG5mcyBwYWdlcyksIHRoZSBvbmx5IHdheSBpdCBjYW4gZ2V0IGV4ZWN1dGVkIGlzIGJ5IHRoZQ0K\n" - "PiAgICByZXNjdWVyIHdoaWNoIGlzIHdhaXRpbmcgZm9yIHRoZSB3b3JrIGl0ZW0gLSBhbiBBLUIt\n" - "QSBkZWFkbG9jay4NCj4gDQoNCkhpIFRlanVuLA0KDQpUaGFua3MgZm9yIHRoZSBleHBsYW5hdGlv\n" - "bi4gV2hhdCBJJ20gbm90IHJlYWxseSB1bmRlcnN0YW5kaW5nIGhlcmUNCnRob3VnaCwgaXMgaG93\n" - 
"IHRoZSB3b3JrIGl0ZW0gY291bGQgYmUgcXVldWVkIGF0IGFsbC4gV2UgaGF2ZSBhDQp3YWl0X29u\n" - "X2JpdF9sb2NrKCkgaW4geHBydF9kZXN0cm95KCkgdGhhdCBzaG91bGQgbWVhbiB0aGUgeHBydC0N\n" - "Cj50YXNrX2NsZWFudXAgd29yayBpdGVtIGhhcyBjb21wbGV0ZWQgcnVubmluZywgYW5kIHRoYXQg\n" - "aXQgY2Fubm90IGJlDQpyZXF1ZXVlZC4NCg0KSXMgdGhlcmUgYSBwb3NzaWJpbGl0eSB0aGF0IHRo\n" - "ZSBmbHVzaF9xdWV1ZSgpIG1pZ2h0IGJlIHRyaWdnZXJlZA0KZGVzcGl0ZSB0aGUgd29yayBpdGVt\n" - "IG5vdCBiZWluZyBxdWV1ZWQ/DQoNCi0tIA0KVHJvbmQgTXlrbGVidXN0DQpMaW51eCBORlMgY2xp\n" - "ZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhDQp0cm9uZC5teWtsZWJ1c3RAcHJpbWFyeWRhdGEu\n" - Y29tDQo= + "On Tue, 2017-10-10 at 07:03 -0700, tj@kernel.org wrote:\n" + "> Hello, Trond.\n" + "> \n" + "> On Mon, Oct 09, 2017 at 06:32:13PM +0000, Trond Myklebust wrote:\n" + "> > On Mon, 2017-10-09 at 19:17 +0100, Lorenzo Pieralisi wrote:\n" + "> > > I have run into the lockdep warning below while running v4.14-\n" + "> > > rc3/rc4\n" + "> > > on an ARM64 defconfig Juno dev board - reporting it to check\n" + "> > > whether\n" + "> > > it is a known/genuine issue.\n" + "> > > \n" + "> > > Please let me know if you need further debug data or need some\n" + "> > > specific tests.\n" + "> > > \n" + "> > > [ 6.209384]\n" + "> > > ======================================================\n" + "> > > [ 6.215569] WARNING: possible circular locking dependency\n" + "> > > detected\n" + "> > > [ 6.221755] 4.14.0-rc4 #54 Not tainted\n" + "> > > [ 6.225503] --------------------------------------------------\n" + "> > > ----\n" + "> > > [ 6.231689] kworker/4:0H/32 is trying to acquire lock:\n" + "> > > [ 6.236830] ((&task->u.tk_work)){+.+.}, at:\n" + "> > > [<ffff0000080e64cc>]\n" + "> > > process_one_work+0x1cc/0x3f0\n" + "> > > [ 6.245472] \n" + "> > > but task is already holding lock:\n" + "> > > [ 6.251309] (\"xprtiod\"){+.+.}, at: [<ffff0000080e64cc>]\n" + "> > > process_one_work+0x1cc/0x3f0\n" + "> > > [ 6.259158] \n" + "> > > which lock already depends on the new lock.\n" + "> > > \n" + "> > > [ 
6.267345] \n" + "> > > the existing dependency chain (in reverse order)\n" + "> > > is:\n" + "> \n" + "> ..\n" + "> > Adding Tejun and Lai, since this looks like a workqueue locking\n" + "> > issue.\n" + "> \n" + "> It looks a bit cryptic but it's warning against the following case.\n" + "> \n" + "> 1. Memory pressure is high and rescuer kicks in for the xprtiod\n" + "> workqueue. There are no other kworkers serving the workqueue.\n" + "> \n" + "> 2. The rescuer runs the xptr_destroy path and ends up calling\n" + "> cancel_work_sync() on a work item which is queued on xprtiod.\n" + "> \n" + "> 3. The work item is pending on the same workqueue and assuming that\n" + "> memory pressure doesn't let off (let's say reclaim is trying to\n" + "> kick off nfs pages), the only way it can get executed is by the\n" + "> rescuer which is waiting for the work item - an A-B-A deadlock.\n" + "> \n" + "\n" + "Hi Tejun,\n" + "\n" + "Thanks for the explanation. What I'm not really understanding here\n" + "though, is how the work item could be queued at all. We have a\n" + "wait_on_bit_lock() in xprt_destroy() that should mean the xprt-\n" + ">task_cleanup work item has completed running, and that it cannot be\n" + "requeued.\n" + "\n" + "Is there a possibility that the flush_queue() might be triggered\n" + "despite the work item not being queued?\n" + "\n" + "-- \n" + "Trond Myklebust\n" + "Linux NFS client maintainer, PrimaryData\n" + trond.myklebust@primarydata.com -b43c671a99e90b9aa080ee3741593310071483ac1890f9a7c37671bafe247a5e +c185d557588e5a0254aa859cf254a3979a9a2e6e0815078ba901393a072d82fe