diff for duplicates of <1494508201.3207.5.camel@primarydata.com> diff --git a/a/1.txt b/N1/1.txt index 9207f50..736e96d 100644 --- a/a/1.txt +++ b/N1/1.txt @@ -1,89 +1,131 @@ -T24gVGh1LCAyMDE3LTA1LTExIGF0IDE0OjU2ICswMjAwLCBNaWNoYWwgSG9ja28gd3JvdGU6DQo+ -IE9uIFRodSAxMS0wNS0xNyAxMjo0NTowMCwgVHJvbmQgTXlrbGVidXN0IHdyb3RlOg0KPiA+IE9u -IFRodSwgMjAxNy0wNS0xMSBhdCAxNDoyNiArMDIwMCwgTWljaGFsIEhvY2tvIHdyb3RlOg0KPiA+ -ID4gT24gVGh1IDExLTA1LTE3IDEyOjE2OjM3LCBUcm9uZCBNeWtsZWJ1c3Qgd3JvdGU6DQo+ID4g -PiA+IE9uIFRodSwgMjAxNy0wNS0xMSBhdCAwOTo1OSArMDIwMCwgTWljaGFsIEhvY2tvIHdyb3Rl -Og0KPiA+ID4gPiA+IE9uIFRodSAxMS0wNS0xNyAxMDo1MzoyNywgTmlrb2xheSBCb3Jpc292IHdy -b3RlOg0KPiA+ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IE9uIDEwLjA1LjIw -MTcgMTk6NDcsIFRyb25kIE15a2xlYnVzdCB3cm90ZToNCj4gPiA+ID4gPiANCj4gPiA+ID4gPiBb -Li4uXQ0KPiA+ID4gPiA+ID4gPiAtIENsZWFudXAgYW5kIHJlbW92YWwgb2Ygc29tZSBtZW1vcnkg -ZmFpbHVyZSBwYXRocyBub3cNCj4gPiA+ID4gPiA+ID4gdGhhdA0KPiA+ID4gPiA+ID4gPiDCoCBH -RlBfTk9GUyBpcyBndWFyYW50ZWVkIHRvIG5ldmVyIGZhaWwuDQo+ID4gPiA+ID4gPiANCj4gPiA+ -ID4gPiA+IFdoYXQgZ3VhcmFudGVlcyB0aGF0PyBTaW5jZSBpZiB0aGlzIGlzIHRoZSBjYXNlIHRo -ZW4gdGhpcw0KPiA+ID4gPiA+ID4gY2FuDQo+ID4gPiA+ID4gPiByZXN1bHQgaW4NCj4gPiA+ID4g -PiA+IGEgbG90IG9mIG9wcG9ydHVuaXRpZXMgZm9yIGNsZWFudXAgYWNyb3NzIHRoZSB3aG9sZSBr -ZXJuZWwNCj4gPiA+ID4gPiA+IHRyZWUuDQo+ID4gPiA+ID4gPiBBZnRlcg0KPiA+ID4gPiA+ID4g -ZGlzY3Vzc2luZyB3aXRoIG1ob2NrbyAoY2MnZWQpIGl0IHNlZW1zIHRoYXQgaW4gcHJhY3RpY2UN -Cj4gPiA+ID4gPiA+IGV2ZXJ5dGhpbmcNCj4gPiA+ID4gPiA+IGJlbG93IENPU1RMWV9PUkRFUiB3 -aGljaCBhcmUgbm90IEdGUF9OT1JFVFJZIHdpbGwgbmV2ZXINCj4gPiA+ID4gPiA+IGZhaWwuDQo+ -ID4gPiA+ID4gPiBCdXQNCj4gPiA+ID4gPiA+IHRoaXMNCj4gPiA+ID4gPiA+IHNlbWFudGljIGlz -IG5vdCB0aGUgc2FtZSBhcyBHRlBfTk9GQUlMLiBFLmcuIG5vdGhpbmcNCj4gPiA+ID4gPiA+IGd1 -YXJhbnRlZXMNCj4gPiA+ID4gPiA+IHRoYXQNCj4gPiA+ID4gPiA+IHRoaXMgd2lsbCBzdGF5IGxp -a2UgdGhhdCBpbiB0aGUgZnV0dXJlPw0KPiA+ID4gPiA+IA0KPiA+ID4gPiA+IEluIHByYWN0aWNl -IGl0IGlzIGhhcmQgdG8gY2hhbmdlIHRoZSBzZW1hbnRpYyBvZiBzbWFsbA0KPiA+ID4gPiA+IGFs 
-bG9jYXRpb25zDQo+ID4gPiA+ID4gbmV2ZXINCj4gPiA+ID4gPiBmYWlsIF9wcmFjdGljYWxseV8u -IEJ1dCB0aGlzIGlzIGFic29sdXRlbHkgbm90IGd1YXJhbnRlZWQhDQo+ID4gPiA+ID4gVGhleQ0K -PiA+ID4gPiA+IGNhbg0KPiA+ID4gPiA+IGZhaWwNCj4gPiA+ID4gPiBlLmcuIHdoZW4gdGhlIGFs -bG9jYXRpb24gY29udGV4dCBpcyB0aGUgb29tIHZpY3RpbS4gUmVtb3ZpbmcNCj4gPiA+ID4gPiBl -cnJvcg0KPiA+ID4gPiA+IHBhdGhzDQo+ID4gPiA+ID4gZm9yIGFsbG9jYXRpb24gZmFpbHVyZXMg -aXMganVzdCB3cm9uZy4NCj4gPiA+ID4gDQo+ID4gPiA+IE9LLCB0aGlzIG1ha2VzIG5vIGZ1Y2tp -bmcgc2Vuc2UgYXQgYWxsLg0KPiA+ID4gPiANCj4gPiA+ID4gRWl0aGVyIGFsbG9jYXRpb25zIGNh -biBmYWlsIG9yIHRoZXkgY2FuJ3QuDQo+ID4gPiA+IDEpIElmIHRoZXkgY2FuJ3QgZmFpbCwgdGhl -biB3ZSBkb24ndCBuZWVkIHRoZSBjaGVja3MuDQo+ID4gPiA+IDIpIElmIHRoZXkgY2FuIGZhaWws -IHRoZW4gd2UgZG8gbmVlZCB0aGVtLCBhbmQgdGhpcyBoYW5kDQo+ID4gPiA+IHdyaW5naW5nDQo+ -ID4gPiA+IGluDQo+ID4gPiA+IHRoZSBNTSBjb21tdW5pdHkgYWJvdXQgR0ZQXyogc2VtYW50aWNz -IGFuZCBob3cgd2UgbmVlZCB0bw0KPiA+ID4gPiBwcmV2ZW50DQo+ID4gPiA+IGZhaWx1cmUgaXMg -ZnVja2luZyBwb2ludGxlc3MuDQo+ID4gPiANCj4gPiA+IGV2ZXJ5dGhpbmcgd2hpY2ggaXMgbm90 -IF9fR0ZQX05PRkFJTCBtaWdodCBmYWlsLiBXZSB0cnkgaGFyZCBub3QNCj4gPiA+IHRvDQo+ID4g -PiBmYWlsDQo+ID4gPiBzbWFsbCBhbGxvY2F0aW9ucyByZXF1ZXN0cyBhcyBtdWNoIGFzIHdlIGNh -biBpbiBnZW5lcmFsIGJ1dCB5b3UNCj4gPiA+IF9oYXZlXyB0bw0KPiA+ID4gY2hlY2sgZm9yIGZh -aWx1cmVzLiBUaGVyZSBpcyBzaW1wbHkgbm8gd2F5IHRvIGd1YXJhbnRlZSAibmV2ZXINCj4gPiA+ -IGZhaWwiDQo+ID4gPiBzZW1hbnRpYyBmb3IgYWxsIGFsbG9jYXRpb24gcmVxdWVzdHMuIFRoaXMg -aGFzIGJlZW4gbGlrZSB0aGF0DQo+ID4gPiBiYXNpY2FsbHkNCj4gPiA+IHNpbmNlIHllYXJzLiBB -bmQgZXZlbiB0aGlzIHRyeS10by1iZS1ub2ZhaWxpbmcgZm9yIHNtYWxsDQo+ID4gPiBhbGxvY2F0 -aW9ucw0KPiA+ID4gaGFzDQo+ID4gPiBiZWVuIFBJVEEgZm9yIHNvbWUgY29ybmVyIGNhc2VzLg0K -PiA+IA0KPiA+IEknbGwgdGFrZSB0aGF0IGFzIGEgdm90ZSBmb3IgKDIpLCB0aGVuLg0KPiA+IA0K -PiA+IEkga25vdyB0aGF0IGZhaWx1cmVzIGNvdWxkIG9jY3VyIGluIHRoZSBwYXN0LiBUaGF0J3Mg -d2h5IHRob3NlIGNvZGUNCj4gPiBwYXRocyB3ZXJlIHRoZXJlLiBUaGUgcHJvYmxlbSBpcyB0aGF0 -IHRoZSBNTSBjb21tdW5pdHkgaGFzIGJlZW4NCj4gPiBtYWtpbmcNCj4gPiBsb3RzIG9mIG5vaXNl 
-IG9uIG1haWxpbmcgbGlzdHMsIGNvbmZlcmVuY2VzIGFuZCBMV04gYXJ0aWNsZXMgYWJvdXQNCj4g -PiBob3cNCj4gPiB3ZSBtdXN0IG5vdCBmYWlsIHNtYWxsIGFsbG9jYXRpb25zIGJlY2F1c2UgdGhl -IE1NIGNvbW11bml0eQ0KPiA+IGJlbGlldmVzDQo+ID4gdGhhdCBub2JvZHkgZXhwZWN0cyBpdC4g -VGhpcyBpcyBjb25mdXNpbmcgZXZlcnlvbmUuLi4NCj4gDQo+IEl0IHdhcyBleGFjdGx5IG90aGVy -IHdheSBhcm91bmQuIFdlIHdvdWxkIGxpa2UgdG8gX2dldF9yaWRfb2ZfIHRoaXMNCj4gZG8NCj4g -bm90IGZhaWwgYmVoYXZpb3IgYmVjYXVzZSBpdCBpcyBjYXVzaW5nIGEgbWFqb3IgaGVhZGFjaGVz -IGluIG91dCBvZg0KPiBtZW1vcnkgY29ybmVyIGNhc2VzLiBKdXN0IHRha2UgR0ZQX05PRlMgYXMg -YW4gZXhhbXBsZS4gSXQgaXMgYSB3ZWFrDQo+IHJlY2xhaW0gY29udGV4dCBiZWNhdXNlIHdlIGNh -bm5vdCByZWNsYWltIGZzIG1ldGFkYXRhIGFuZCB0aGF0IG1pZ2h0DQo+IGJlDQo+IGEgbG90IG9m -IG1lbW9yeSBzbyB3ZSBjYW5ub3QgdHJpZ2dlciB0aGUgT09NIGtpbGxlciBhbmQgaGF2ZSB0byBy -ZWx5DQo+IG9uDQo+IGEgZGlmZmVyZW50IGFsbG9jYXRpb24gY29udGV4dCBvciBrc3dhcGQgdG8g -bWFrZSBhIHByb2dyZXNzIG9uIG91cg0KPiBiZWhhbGYuIFdlIHdvdWxkIHJlYWxseSBsaWtlIHRv -IGZhaWwgdGhvc2UgcmVxdWVzdHMgaW5zdGVhZC4gSSd2ZQ0KPiB0cmllZA0KPiB0aGF0IGluIHRo -ZSBwYXN0IGJ1dCBpdCB3YXMgZGVlbWVkIHRvIGRhbmdlcm91cyBiZWNhdXNlIF9hbGxfIGtlcm5l -bA0KPiBwYXRocyB3b3VsZCBoYXZlIHRvIGJlIGNoZWNrZWQgZm9yIGEgc2FuZSBmYWlsdXJlIGJl -aGF2aW9yLiBTbyB3ZSBhcmUNCj4ga2VlcGluZyBzdGF0dXMgcXVvIGluc3RlYWQuDQoNCklmIHdl -IHN1c3BlY3QgdGhlIGV4aXN0ZW5jZSBvZiBhIGxvYWQgb2YgcG90ZW50aWFsIHRpbWUgYm9tYnMg -aW4gdGhlDQprZXJuZWwgZHVlIHRvIG1pc3NpbmcgY2hlY2tzLCB0aGVuIHRoZSBzdGF0dXMgcXVv -IGlzIG5vdCBnb29kIGVub3VnaC4NCldlIHNob3VsZCBiZSB3b3JraW5nIG9uIHRvb2xzIHRvIGlk -ZW50aWZ5IHRoZXNlIGNvZGUgcGF0aHMuDQoNClF1aXRlIGZyYW5rbHksIEknZCBsb3ZlIHRvIHNl -ZSBhIGZ1enplci1saWtlIHRvb2wgdGhhdCBjYW4gcmFuZG9tbHkNCmZhaWwgYWxsb2NhdGlvbnMu -IEkgY2FuIGVhc2lseSBtYWtlIG9uZSBmb3IgdGhlIE5GUyBjb2RlLCBidXQgaWYgdGhlcmUNCmlz -IGEgZ2VuZXJhbCBwcm9ibGVtIGlkZW50aWZ5aW5nIGJ1Z2d5IGNvZGUsIHRoZW4gcGVyaGFwcyBp -dCBzaG91bGQgYmUNCnNvbHZlZCBhdCB0aGUgTU0gbGF5ZXIgaXRzZWxmLg0KDQo+ID4gSXQgY29u -ZnVzZWQgTmVpbCBCcm93biwgd2hvIGNvbnRyaWJ1dGVkIHRoZXNlIHBhdGNoZXMsIGFuZCBpdA0K 
-PiA+IGNvbmZ1c2VkDQo+ID4gbWUgYW5kIGFsbCB0aGUgb3RoZXIgcmV2aWV3ZXJzIG9mIHRoZXNl -IHBhdGNoZXMgb24gdGhlIGxpbnV4LW5mcw0KPiA+IG1haWxpbmcgbGlzdC4NCj4gPiANCj4gPiBT -byBpZiBpbmRlZWQgKDIpIGlzIGNvcnJlY3QsIHRoZW4gcGxlYXNlIGNhbiB3ZSBoYXZlIGEgY2xl -YXINCj4gPiBzdGF0ZW1lbnQNCj4gPiBfd2hlbiBkaXNjdXNzaW5nIGltcHJvdmVtZW50cyB0byBt -ZW1vcnkgYWxsb2NhdGlvbiBzZW1hbnRpY3NfIHRoYXQNCj4gPiBHRlBfKiBzdGlsbCBjYW4gZmFp -bCwgc3RpbGwgd2lsbCBmYWlsLCBhbmQgdGhhdCBjYWxsZXJzIHNob3VsZA0KPiA+IGFzc3VtZQ0K -PiA+IGl0IHdpbGwgZmFpbCBhbmQgc2hvdWxkIHRlc3QgdGhlaXIgY29kZSBwYXRocyBhc3N1bWlu -ZyB0aGUgZmFpbHVyZQ0KPiA+IGNhc2UuDQo+IA0KPiBJIGRvIG5vdCBzZWUgYW55IGV4cGxpY2l0 -IGRvY3VtZW50YXRpb24gd2hpY2ggd291bGQgZW5jb3VyYWdlIHVzZXJzDQo+IHRvDQo+IG5vdCBj -aGVjayBmb3IgdGhlIGFsbG9jYXRpb24gZmFpbHVyZS4gT25seSBfX0dGUF9OT0ZBSUwgaXMgZG9j -dW1lbnRlZA0KPiBpdA0KPiBfbXVzdF8gcmV0cnkgZm9yIGV2ZXIuIE9mIGNvdXJzZSBJIGFtIG9w -ZW4gZm9yIGFueSBkb2N1bWVudGF0aW9uDQo+IGltcHJvdmVtZW50cy4NCg0KQXMgSSBzYWlkLCB0 -aGUgcHJvYmxlbSBoYXMgYmVlbiB0aGUgZGlzY3Vzc2lvbiwgYW5kIGhvdyBpdCBmb2N1c3NlcyBv -bg0KIm11c3Qgbm90IGZhaWwiLg0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNs -aWVudCBtYWludGFpbmVyLCBQcmltYXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRh -LmNvbQ0K +On Thu, 2017-05-11 at 14:56 +0200, Michal Hocko wrote: +> On Thu 11-05-17 12:45:00, Trond Myklebust wrote: +> > On Thu, 2017-05-11 at 14:26 +0200, Michal Hocko wrote: +> > > On Thu 11-05-17 12:16:37, Trond Myklebust wrote: +> > > > On Thu, 2017-05-11 at 09:59 +0200, Michal Hocko wrote: +> > > > > On Thu 11-05-17 10:53:27, Nikolay Borisov wrote: +> > > > > > +> > > > > > +> > > > > > On 10.05.2017 19:47, Trond Myklebust wrote: +> > > > > +> > > > > [...] +> > > > > > > - Cleanup and removal of some memory failure paths now +> > > > > > > that +> > > > > > > GFP_NOFS is guaranteed to never fail. +> > > > > > +> > > > > > What guarantees that? Since if this is the case then this +> > > > > > can +> > > > > > result in +> > > > > > a lot of opportunities for cleanup across the whole kernel +> > > > > > tree. 
+> > > > > > After +> > > > > > discussing with mhocko (cc'ed) it seems that in practice +> > > > > > everything +> > > > > > below COSTLY_ORDER which are not GFP_NORETRY will never +> > > > > > fail. +> > > > > > But +> > > > > > this +> > > > > > semantic is not the same as GFP_NOFAIL. E.g. nothing +> > > > > > guarantees +> > > > > > that +> > > > > > this will stay like that in the future? +> > > > > +> > > > > In practice it is hard to change the semantic of small +> > > > > allocations +> > > > > never +> > > > > fail _practically_. But this is absolutely not guaranteed! +> > > > > They +> > > > > can +> > > > > fail +> > > > > e.g. when the allocation context is the oom victim. Removing +> > > > > error +> > > > > paths +> > > > > for allocation failures is just wrong. +> > > > +> > > > OK, this makes no fucking sense at all. +> > > > +> > > > Either allocations can fail or they can't. +> > > > 1) If they can't fail, then we don't need the checks. +> > > > 2) If they can fail, then we do need them, and this hand +> > > > wringing +> > > > in +> > > > the MM community about GFP_* semantics and how we need to +> > > > prevent +> > > > failure is fucking pointless. +> > > +> > > everything which is not __GFP_NOFAIL might fail. We try hard not +> > > to +> > > fail +> > > small allocations requests as much as we can in general but you +> > > _have_ to +> > > check for failures. There is simply no way to guarantee "never +> > > fail" +> > > semantic for all allocation requests. This has been like that +> > > basically +> > > since years. And even this try-to-be-nofailing for small +> > > allocations +> > > has +> > > been PITA for some corner cases. +> > +> > I'll take that as a vote for (2), then. +> > +> > I know that failures could occur in the past. That's why those code +> > paths were there. 
The problem is that the MM community has been +> > making +> > lots of noise on mailing lists, conferences and LWN articles about +> > how +> > we must not fail small allocations because the MM community +> > believes +> > that nobody expects it. This is confusing everyone... +> +> It was exactly other way around. We would like to _get_rid_of_ this +> do +> not fail behavior because it is causing a major headaches in out of +> memory corner cases. Just take GFP_NOFS as an example. It is a weak +> reclaim context because we cannot reclaim fs metadata and that might +> be +> a lot of memory so we cannot trigger the OOM killer and have to rely +> on +> a different allocation context or kswapd to make a progress on our +> behalf. We would really like to fail those requests instead. I've +> tried +> that in the past but it was deemed to dangerous because _all_ kernel +> paths would have to be checked for a sane failure behavior. So we are +> keeping status quo instead. + +If we suspect the existence of a load of potential time bombs in the +kernel due to missing checks, then the status quo is not good enough. +We should be working on tools to identify these code paths. + +Quite frankly, I'd love to see a fuzzer-like tool that can randomly +fail allocations. I can easily make one for the NFS code, but if there +is a general problem identifying buggy code, then perhaps it should be +solved at the MM layer itself. + +> > It confused Neil Brown, who contributed these patches, and it +> > confused +> > me and all the other reviewers of these patches on the linux-nfs +> > mailing list. +> > +> > So if indeed (2) is correct, then please can we have a clear +> > statement +> > _when discussing improvements to memory allocation semantics_ that +> > GFP_* still can fail, still will fail, and that callers should +> > assume +> > it will fail and should test their code paths assuming the failure +> > case. 
+> +> I do not see any explicit documentation which would encourage users +> to +> not check for the allocation failure. Only __GFP_NOFAIL is documented +> it +> _must_ retry for ever. Of course I am open for any documentation +> improvements. + +As I said, the problem has been the discussion, and how it focusses on +"must not fail". + +-- +Trond Myklebust +Linux NFS client maintainer, PrimaryData +trond.myklebust@primarydata.com diff --git a/a/content_digest b/N1/content_digest index 374824a..6807dab 100644 --- a/a/content_digest +++ b/N1/content_digest @@ -15,94 +15,136 @@ " n.borisov.lkml@gmail.com <n.borisov.lkml@gmail.com>\0" "\00:1\0" "b\0" - "T24gVGh1LCAyMDE3LTA1LTExIGF0IDE0OjU2ICswMjAwLCBNaWNoYWwgSG9ja28gd3JvdGU6DQo+\n" - "IE9uIFRodSAxMS0wNS0xNyAxMjo0NTowMCwgVHJvbmQgTXlrbGVidXN0IHdyb3RlOg0KPiA+IE9u\n" - "IFRodSwgMjAxNy0wNS0xMSBhdCAxNDoyNiArMDIwMCwgTWljaGFsIEhvY2tvIHdyb3RlOg0KPiA+\n" - "ID4gT24gVGh1IDExLTA1LTE3IDEyOjE2OjM3LCBUcm9uZCBNeWtsZWJ1c3Qgd3JvdGU6DQo+ID4g\n" - "PiA+IE9uIFRodSwgMjAxNy0wNS0xMSBhdCAwOTo1OSArMDIwMCwgTWljaGFsIEhvY2tvIHdyb3Rl\n" - "Og0KPiA+ID4gPiA+IE9uIFRodSAxMS0wNS0xNyAxMDo1MzoyNywgTmlrb2xheSBCb3Jpc292IHdy\n" - "b3RlOg0KPiA+ID4gPiA+ID4gDQo+ID4gPiA+ID4gPiANCj4gPiA+ID4gPiA+IE9uIDEwLjA1LjIw\n" - "MTcgMTk6NDcsIFRyb25kIE15a2xlYnVzdCB3cm90ZToNCj4gPiA+ID4gPiANCj4gPiA+ID4gPiBb\n" - "Li4uXQ0KPiA+ID4gPiA+ID4gPiAtIENsZWFudXAgYW5kIHJlbW92YWwgb2Ygc29tZSBtZW1vcnkg\n" - "ZmFpbHVyZSBwYXRocyBub3cNCj4gPiA+ID4gPiA+ID4gdGhhdA0KPiA+ID4gPiA+ID4gPiDCoCBH\n" - "RlBfTk9GUyBpcyBndWFyYW50ZWVkIHRvIG5ldmVyIGZhaWwuDQo+ID4gPiA+ID4gPiANCj4gPiA+\n" - "ID4gPiA+IFdoYXQgZ3VhcmFudGVlcyB0aGF0PyBTaW5jZSBpZiB0aGlzIGlzIHRoZSBjYXNlIHRo\n" - "ZW4gdGhpcw0KPiA+ID4gPiA+ID4gY2FuDQo+ID4gPiA+ID4gPiByZXN1bHQgaW4NCj4gPiA+ID4g\n" - "PiA+IGEgbG90IG9mIG9wcG9ydHVuaXRpZXMgZm9yIGNsZWFudXAgYWNyb3NzIHRoZSB3aG9sZSBr\n" - "ZXJuZWwNCj4gPiA+ID4gPiA+IHRyZWUuDQo+ID4gPiA+ID4gPiBBZnRlcg0KPiA+ID4gPiA+ID4g\n" - "ZGlzY3Vzc2luZyB3aXRoIG1ob2NrbyAoY2MnZWQpIGl0IHNlZW1zIHRoYXQgaW4gcHJhY3RpY2UN\n" - 
"Cj4gPiA+ID4gPiA+IGV2ZXJ5dGhpbmcNCj4gPiA+ID4gPiA+IGJlbG93IENPU1RMWV9PUkRFUiB3\n" - "aGljaCBhcmUgbm90IEdGUF9OT1JFVFJZIHdpbGwgbmV2ZXINCj4gPiA+ID4gPiA+IGZhaWwuDQo+\n" - "ID4gPiA+ID4gPiBCdXQNCj4gPiA+ID4gPiA+IHRoaXMNCj4gPiA+ID4gPiA+IHNlbWFudGljIGlz\n" - "IG5vdCB0aGUgc2FtZSBhcyBHRlBfTk9GQUlMLiBFLmcuIG5vdGhpbmcNCj4gPiA+ID4gPiA+IGd1\n" - "YXJhbnRlZXMNCj4gPiA+ID4gPiA+IHRoYXQNCj4gPiA+ID4gPiA+IHRoaXMgd2lsbCBzdGF5IGxp\n" - "a2UgdGhhdCBpbiB0aGUgZnV0dXJlPw0KPiA+ID4gPiA+IA0KPiA+ID4gPiA+IEluIHByYWN0aWNl\n" - "IGl0IGlzIGhhcmQgdG8gY2hhbmdlIHRoZSBzZW1hbnRpYyBvZiBzbWFsbA0KPiA+ID4gPiA+IGFs\n" - "bG9jYXRpb25zDQo+ID4gPiA+ID4gbmV2ZXINCj4gPiA+ID4gPiBmYWlsIF9wcmFjdGljYWxseV8u\n" - "IEJ1dCB0aGlzIGlzIGFic29sdXRlbHkgbm90IGd1YXJhbnRlZWQhDQo+ID4gPiA+ID4gVGhleQ0K\n" - "PiA+ID4gPiA+IGNhbg0KPiA+ID4gPiA+IGZhaWwNCj4gPiA+ID4gPiBlLmcuIHdoZW4gdGhlIGFs\n" - "bG9jYXRpb24gY29udGV4dCBpcyB0aGUgb29tIHZpY3RpbS4gUmVtb3ZpbmcNCj4gPiA+ID4gPiBl\n" - "cnJvcg0KPiA+ID4gPiA+IHBhdGhzDQo+ID4gPiA+ID4gZm9yIGFsbG9jYXRpb24gZmFpbHVyZXMg\n" - "aXMganVzdCB3cm9uZy4NCj4gPiA+ID4gDQo+ID4gPiA+IE9LLCB0aGlzIG1ha2VzIG5vIGZ1Y2tp\n" - "bmcgc2Vuc2UgYXQgYWxsLg0KPiA+ID4gPiANCj4gPiA+ID4gRWl0aGVyIGFsbG9jYXRpb25zIGNh\n" - "biBmYWlsIG9yIHRoZXkgY2FuJ3QuDQo+ID4gPiA+IDEpIElmIHRoZXkgY2FuJ3QgZmFpbCwgdGhl\n" - "biB3ZSBkb24ndCBuZWVkIHRoZSBjaGVja3MuDQo+ID4gPiA+IDIpIElmIHRoZXkgY2FuIGZhaWws\n" - "IHRoZW4gd2UgZG8gbmVlZCB0aGVtLCBhbmQgdGhpcyBoYW5kDQo+ID4gPiA+IHdyaW5naW5nDQo+\n" - "ID4gPiA+IGluDQo+ID4gPiA+IHRoZSBNTSBjb21tdW5pdHkgYWJvdXQgR0ZQXyogc2VtYW50aWNz\n" - "IGFuZCBob3cgd2UgbmVlZCB0bw0KPiA+ID4gPiBwcmV2ZW50DQo+ID4gPiA+IGZhaWx1cmUgaXMg\n" - "ZnVja2luZyBwb2ludGxlc3MuDQo+ID4gPiANCj4gPiA+IGV2ZXJ5dGhpbmcgd2hpY2ggaXMgbm90\n" - "IF9fR0ZQX05PRkFJTCBtaWdodCBmYWlsLiBXZSB0cnkgaGFyZCBub3QNCj4gPiA+IHRvDQo+ID4g\n" - "PiBmYWlsDQo+ID4gPiBzbWFsbCBhbGxvY2F0aW9ucyByZXF1ZXN0cyBhcyBtdWNoIGFzIHdlIGNh\n" - "biBpbiBnZW5lcmFsIGJ1dCB5b3UNCj4gPiA+IF9oYXZlXyB0bw0KPiA+ID4gY2hlY2sgZm9yIGZh\n" - "aWx1cmVzLiBUaGVyZSBpcyBzaW1wbHkgbm8gd2F5IHRvIGd1YXJhbnRlZSAibmV2ZXINCj4gPiA+\n" - 
"IGZhaWwiDQo+ID4gPiBzZW1hbnRpYyBmb3IgYWxsIGFsbG9jYXRpb24gcmVxdWVzdHMuIFRoaXMg\n" - "aGFzIGJlZW4gbGlrZSB0aGF0DQo+ID4gPiBiYXNpY2FsbHkNCj4gPiA+IHNpbmNlIHllYXJzLiBB\n" - "bmQgZXZlbiB0aGlzIHRyeS10by1iZS1ub2ZhaWxpbmcgZm9yIHNtYWxsDQo+ID4gPiBhbGxvY2F0\n" - "aW9ucw0KPiA+ID4gaGFzDQo+ID4gPiBiZWVuIFBJVEEgZm9yIHNvbWUgY29ybmVyIGNhc2VzLg0K\n" - "PiA+IA0KPiA+IEknbGwgdGFrZSB0aGF0IGFzIGEgdm90ZSBmb3IgKDIpLCB0aGVuLg0KPiA+IA0K\n" - "PiA+IEkga25vdyB0aGF0IGZhaWx1cmVzIGNvdWxkIG9jY3VyIGluIHRoZSBwYXN0LiBUaGF0J3Mg\n" - "d2h5IHRob3NlIGNvZGUNCj4gPiBwYXRocyB3ZXJlIHRoZXJlLiBUaGUgcHJvYmxlbSBpcyB0aGF0\n" - "IHRoZSBNTSBjb21tdW5pdHkgaGFzIGJlZW4NCj4gPiBtYWtpbmcNCj4gPiBsb3RzIG9mIG5vaXNl\n" - "IG9uIG1haWxpbmcgbGlzdHMsIGNvbmZlcmVuY2VzIGFuZCBMV04gYXJ0aWNsZXMgYWJvdXQNCj4g\n" - "PiBob3cNCj4gPiB3ZSBtdXN0IG5vdCBmYWlsIHNtYWxsIGFsbG9jYXRpb25zIGJlY2F1c2UgdGhl\n" - "IE1NIGNvbW11bml0eQ0KPiA+IGJlbGlldmVzDQo+ID4gdGhhdCBub2JvZHkgZXhwZWN0cyBpdC4g\n" - "VGhpcyBpcyBjb25mdXNpbmcgZXZlcnlvbmUuLi4NCj4gDQo+IEl0IHdhcyBleGFjdGx5IG90aGVy\n" - "IHdheSBhcm91bmQuIFdlIHdvdWxkIGxpa2UgdG8gX2dldF9yaWRfb2ZfIHRoaXMNCj4gZG8NCj4g\n" - "bm90IGZhaWwgYmVoYXZpb3IgYmVjYXVzZSBpdCBpcyBjYXVzaW5nIGEgbWFqb3IgaGVhZGFjaGVz\n" - "IGluIG91dCBvZg0KPiBtZW1vcnkgY29ybmVyIGNhc2VzLiBKdXN0IHRha2UgR0ZQX05PRlMgYXMg\n" - "YW4gZXhhbXBsZS4gSXQgaXMgYSB3ZWFrDQo+IHJlY2xhaW0gY29udGV4dCBiZWNhdXNlIHdlIGNh\n" - "bm5vdCByZWNsYWltIGZzIG1ldGFkYXRhIGFuZCB0aGF0IG1pZ2h0DQo+IGJlDQo+IGEgbG90IG9m\n" - "IG1lbW9yeSBzbyB3ZSBjYW5ub3QgdHJpZ2dlciB0aGUgT09NIGtpbGxlciBhbmQgaGF2ZSB0byBy\n" - "ZWx5DQo+IG9uDQo+IGEgZGlmZmVyZW50IGFsbG9jYXRpb24gY29udGV4dCBvciBrc3dhcGQgdG8g\n" - "bWFrZSBhIHByb2dyZXNzIG9uIG91cg0KPiBiZWhhbGYuIFdlIHdvdWxkIHJlYWxseSBsaWtlIHRv\n" - "IGZhaWwgdGhvc2UgcmVxdWVzdHMgaW5zdGVhZC4gSSd2ZQ0KPiB0cmllZA0KPiB0aGF0IGluIHRo\n" - "ZSBwYXN0IGJ1dCBpdCB3YXMgZGVlbWVkIHRvIGRhbmdlcm91cyBiZWNhdXNlIF9hbGxfIGtlcm5l\n" - "bA0KPiBwYXRocyB3b3VsZCBoYXZlIHRvIGJlIGNoZWNrZWQgZm9yIGEgc2FuZSBmYWlsdXJlIGJl\n" - "aGF2aW9yLiBTbyB3ZSBhcmUNCj4ga2VlcGluZyBzdGF0dXMgcXVvIGluc3RlYWQuDQoNCklmIHdl\n" - 
"IHN1c3BlY3QgdGhlIGV4aXN0ZW5jZSBvZiBhIGxvYWQgb2YgcG90ZW50aWFsIHRpbWUgYm9tYnMg\n" - "aW4gdGhlDQprZXJuZWwgZHVlIHRvIG1pc3NpbmcgY2hlY2tzLCB0aGVuIHRoZSBzdGF0dXMgcXVv\n" - "IGlzIG5vdCBnb29kIGVub3VnaC4NCldlIHNob3VsZCBiZSB3b3JraW5nIG9uIHRvb2xzIHRvIGlk\n" - "ZW50aWZ5IHRoZXNlIGNvZGUgcGF0aHMuDQoNClF1aXRlIGZyYW5rbHksIEknZCBsb3ZlIHRvIHNl\n" - "ZSBhIGZ1enplci1saWtlIHRvb2wgdGhhdCBjYW4gcmFuZG9tbHkNCmZhaWwgYWxsb2NhdGlvbnMu\n" - "IEkgY2FuIGVhc2lseSBtYWtlIG9uZSBmb3IgdGhlIE5GUyBjb2RlLCBidXQgaWYgdGhlcmUNCmlz\n" - "IGEgZ2VuZXJhbCBwcm9ibGVtIGlkZW50aWZ5aW5nIGJ1Z2d5IGNvZGUsIHRoZW4gcGVyaGFwcyBp\n" - "dCBzaG91bGQgYmUNCnNvbHZlZCBhdCB0aGUgTU0gbGF5ZXIgaXRzZWxmLg0KDQo+ID4gSXQgY29u\n" - "ZnVzZWQgTmVpbCBCcm93biwgd2hvIGNvbnRyaWJ1dGVkIHRoZXNlIHBhdGNoZXMsIGFuZCBpdA0K\n" - "PiA+IGNvbmZ1c2VkDQo+ID4gbWUgYW5kIGFsbCB0aGUgb3RoZXIgcmV2aWV3ZXJzIG9mIHRoZXNl\n" - "IHBhdGNoZXMgb24gdGhlIGxpbnV4LW5mcw0KPiA+IG1haWxpbmcgbGlzdC4NCj4gPiANCj4gPiBT\n" - "byBpZiBpbmRlZWQgKDIpIGlzIGNvcnJlY3QsIHRoZW4gcGxlYXNlIGNhbiB3ZSBoYXZlIGEgY2xl\n" - "YXINCj4gPiBzdGF0ZW1lbnQNCj4gPiBfd2hlbiBkaXNjdXNzaW5nIGltcHJvdmVtZW50cyB0byBt\n" - "ZW1vcnkgYWxsb2NhdGlvbiBzZW1hbnRpY3NfIHRoYXQNCj4gPiBHRlBfKiBzdGlsbCBjYW4gZmFp\n" - "bCwgc3RpbGwgd2lsbCBmYWlsLCBhbmQgdGhhdCBjYWxsZXJzIHNob3VsZA0KPiA+IGFzc3VtZQ0K\n" - "PiA+IGl0IHdpbGwgZmFpbCBhbmQgc2hvdWxkIHRlc3QgdGhlaXIgY29kZSBwYXRocyBhc3N1bWlu\n" - "ZyB0aGUgZmFpbHVyZQ0KPiA+IGNhc2UuDQo+IA0KPiBJIGRvIG5vdCBzZWUgYW55IGV4cGxpY2l0\n" - "IGRvY3VtZW50YXRpb24gd2hpY2ggd291bGQgZW5jb3VyYWdlIHVzZXJzDQo+IHRvDQo+IG5vdCBj\n" - "aGVjayBmb3IgdGhlIGFsbG9jYXRpb24gZmFpbHVyZS4gT25seSBfX0dGUF9OT0ZBSUwgaXMgZG9j\n" - "dW1lbnRlZA0KPiBpdA0KPiBfbXVzdF8gcmV0cnkgZm9yIGV2ZXIuIE9mIGNvdXJzZSBJIGFtIG9w\n" - "ZW4gZm9yIGFueSBkb2N1bWVudGF0aW9uDQo+IGltcHJvdmVtZW50cy4NCg0KQXMgSSBzYWlkLCB0\n" - "aGUgcHJvYmxlbSBoYXMgYmVlbiB0aGUgZGlzY3Vzc2lvbiwgYW5kIGhvdyBpdCBmb2N1c3NlcyBv\n" - "bg0KIm11c3Qgbm90IGZhaWwiLg0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNs\n" - "aWVudCBtYWludGFpbmVyLCBQcmltYXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRh\n" - 
LmNvbQ0K + "On Thu, 2017-05-11 at 14:56 +0200, Michal Hocko wrote:\n" + "> On Thu 11-05-17 12:45:00, Trond Myklebust wrote:\n" + "> > On Thu, 2017-05-11 at 14:26 +0200, Michal Hocko wrote:\n" + "> > > On Thu 11-05-17 12:16:37, Trond Myklebust wrote:\n" + "> > > > On Thu, 2017-05-11 at 09:59 +0200, Michal Hocko wrote:\n" + "> > > > > On Thu 11-05-17 10:53:27, Nikolay Borisov wrote:\n" + "> > > > > > \n" + "> > > > > > \n" + "> > > > > > On 10.05.2017 19:47, Trond Myklebust wrote:\n" + "> > > > > \n" + "> > > > > [...]\n" + "> > > > > > > - Cleanup and removal of some memory failure paths now\n" + "> > > > > > > that\n" + "> > > > > > > \302\240 GFP_NOFS is guaranteed to never fail.\n" + "> > > > > > \n" + "> > > > > > What guarantees that? Since if this is the case then this\n" + "> > > > > > can\n" + "> > > > > > result in\n" + "> > > > > > a lot of opportunities for cleanup across the whole kernel\n" + "> > > > > > tree.\n" + "> > > > > > After\n" + "> > > > > > discussing with mhocko (cc'ed) it seems that in practice\n" + "> > > > > > everything\n" + "> > > > > > below COSTLY_ORDER which are not GFP_NORETRY will never\n" + "> > > > > > fail.\n" + "> > > > > > But\n" + "> > > > > > this\n" + "> > > > > > semantic is not the same as GFP_NOFAIL. E.g. nothing\n" + "> > > > > > guarantees\n" + "> > > > > > that\n" + "> > > > > > this will stay like that in the future?\n" + "> > > > > \n" + "> > > > > In practice it is hard to change the semantic of small\n" + "> > > > > allocations\n" + "> > > > > never\n" + "> > > > > fail _practically_. But this is absolutely not guaranteed!\n" + "> > > > > They\n" + "> > > > > can\n" + "> > > > > fail\n" + "> > > > > e.g. when the allocation context is the oom victim. 
Removing\n" + "> > > > > error\n" + "> > > > > paths\n" + "> > > > > for allocation failures is just wrong.\n" + "> > > > \n" + "> > > > OK, this makes no fucking sense at all.\n" + "> > > > \n" + "> > > > Either allocations can fail or they can't.\n" + "> > > > 1) If they can't fail, then we don't need the checks.\n" + "> > > > 2) If they can fail, then we do need them, and this hand\n" + "> > > > wringing\n" + "> > > > in\n" + "> > > > the MM community about GFP_* semantics and how we need to\n" + "> > > > prevent\n" + "> > > > failure is fucking pointless.\n" + "> > > \n" + "> > > everything which is not __GFP_NOFAIL might fail. We try hard not\n" + "> > > to\n" + "> > > fail\n" + "> > > small allocations requests as much as we can in general but you\n" + "> > > _have_ to\n" + "> > > check for failures. There is simply no way to guarantee \"never\n" + "> > > fail\"\n" + "> > > semantic for all allocation requests. This has been like that\n" + "> > > basically\n" + "> > > since years. And even this try-to-be-nofailing for small\n" + "> > > allocations\n" + "> > > has\n" + "> > > been PITA for some corner cases.\n" + "> > \n" + "> > I'll take that as a vote for (2), then.\n" + "> > \n" + "> > I know that failures could occur in the past. That's why those code\n" + "> > paths were there. The problem is that the MM community has been\n" + "> > making\n" + "> > lots of noise on mailing lists, conferences and LWN articles about\n" + "> > how\n" + "> > we must not fail small allocations because the MM community\n" + "> > believes\n" + "> > that nobody expects it. This is confusing everyone...\n" + "> \n" + "> It was exactly other way around. We would like to _get_rid_of_ this\n" + "> do\n" + "> not fail behavior because it is causing a major headaches in out of\n" + "> memory corner cases. Just take GFP_NOFS as an example. 
It is a weak\n" + "> reclaim context because we cannot reclaim fs metadata and that might\n" + "> be\n" + "> a lot of memory so we cannot trigger the OOM killer and have to rely\n" + "> on\n" + "> a different allocation context or kswapd to make a progress on our\n" + "> behalf. We would really like to fail those requests instead. I've\n" + "> tried\n" + "> that in the past but it was deemed to dangerous because _all_ kernel\n" + "> paths would have to be checked for a sane failure behavior. So we are\n" + "> keeping status quo instead.\n" + "\n" + "If we suspect the existence of a load of potential time bombs in the\n" + "kernel due to missing checks, then the status quo is not good enough.\n" + "We should be working on tools to identify these code paths.\n" + "\n" + "Quite frankly, I'd love to see a fuzzer-like tool that can randomly\n" + "fail allocations. I can easily make one for the NFS code, but if there\n" + "is a general problem identifying buggy code, then perhaps it should be\n" + "solved at the MM layer itself.\n" + "\n" + "> > It confused Neil Brown, who contributed these patches, and it\n" + "> > confused\n" + "> > me and all the other reviewers of these patches on the linux-nfs\n" + "> > mailing list.\n" + "> > \n" + "> > So if indeed (2) is correct, then please can we have a clear\n" + "> > statement\n" + "> > _when discussing improvements to memory allocation semantics_ that\n" + "> > GFP_* still can fail, still will fail, and that callers should\n" + "> > assume\n" + "> > it will fail and should test their code paths assuming the failure\n" + "> > case.\n" + "> \n" + "> I do not see any explicit documentation which would encourage users\n" + "> to\n" + "> not check for the allocation failure. Only __GFP_NOFAIL is documented\n" + "> it\n" + "> _must_ retry for ever. 
Of course I am open for any documentation\n" + "> improvements.\n" + "\n" + "As I said, the problem has been the discussion, and how it focusses on\n" + "\"must not fail\".\n" + "\n" + "-- \n" + "Trond Myklebust\n" + "Linux NFS client maintainer, PrimaryData\n" + trond.myklebust@primarydata.com -345c9671247ed9f251be645b291b67ba2eceba660d0a53b5756e9b9107f492dc +eba8ee399d1e13a7fbf59667fa873da26857a99b6e1f71518a3a8f58f64842fd