From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from g1t5425.austin.hp.com (g1t5425.austin.hp.com [15.216.225.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id BD2D81A1E16 for ; Mon, 25 Apr 2016 09:15:05 -0700 (PDT) Message-ID: <1461600377.8149.76.camel@hpe.com> Subject: Re: [PATCH v4 1/2] thp, dax: add thp_get_unmapped_area for pmd mappings From: Toshi Kani Date: Mon, 25 Apr 2016 10:06:17 -0600 In-Reply-To: <20160424225057.GA6670@node.shutemov.name> References: <1461370883-7664-1-git-send-email-toshi.kani@hpe.com> <1461370883-7664-2-git-send-email-toshi.kani@hpe.com> <20160424225057.GA6670@node.shutemov.name> Mime-Version: 1.0 List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" To: "Kirill A. Shutemov" , Hugh Dickins Cc: tytso@mit.edu, jack@suse.cz, linux-nvdimm@lists.01.org, david@fromorbit.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, adilger.kernel@dilger.ca, viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org, kirill.shutemov@linux.intel.com, mike.kravetz@oracle.com List-ID: T24gTW9uLCAyMDE2LTA0LTI1IGF0IDAxOjUwICswMzAwLCBLaXJpbGwgQS4gU2h1dGVtb3Ygd3Jv dGU6Cj4gT24gRnJpLCBBcHIgMjIsIDIwMTYgYXQgMDY6MjE6MjJQTSAtMDYwMCwgVG9zaGkgS2Fu aSB3cm90ZToKPiA+IArCoDoKPiA+ICt1bnNpZ25lZCBsb25nIF9fdGhwX2dldF91bm1hcHBlZF9h cmVhKHN0cnVjdCBmaWxlICpmaWxwLCB1bnNpZ25lZCBsb25nCj4gPiBsZW4sCj4gPiArCQlsb2Zm X3Qgb2ZmLCB1bnNpZ25lZCBsb25nIGZsYWdzLCB1bnNpZ25lZCBsb25nIHNpemUpCj4gPiArewo+ ID4gKwl1bnNpZ25lZCBsb25nIGFkZHI7Cj4gPiArCWxvZmZfdCBvZmZfZW5kID0gb2ZmICsgbGVu Owo+ID4gKwlsb2ZmX3Qgb2ZmX2FsaWduID0gcm91bmRfdXAob2ZmLCBzaXplKTsKPiA+ICsJdW5z aWduZWQgbG9uZyBsZW5fcGFkOwo+ID4gKwo+ID4gKwlpZiAob2ZmX2VuZCA8PSBvZmZfYWxpZ24g fHwgKG9mZl9lbmQgLSBvZmZfYWxpZ24pIDwgc2l6ZSkKPiA+ICsJCXJldHVybiAwOwo+ID4gKwo+ ID4gKwlsZW5fcGFkID0gbGVuICsgc2l6ZTsKPiA+ICsJaWYgKGxlbl9wYWQgPCBsZW4gfHwgKG9m ZiArIGxlbl9wYWQpIDwgb2ZmKQo+ID4gKwkJcmV0dXJuIDA7Cj4gPiArCj4gPiArCWFkZHIgPSBj dXJyZW50LT5tbS0+Z2V0X3VubWFwcGVkX2FyZWEoZmlscCwgMCwgbGVuX3BhZCwKPiA+ICsJCQkJ CcKgwqDCoMKgwqDCoG9mZiA+PiBQQUdFX1NISUZULAo+ID4gZmxhZ3MpOwo+ID4gKwlpZiAoSVNf RVJSX1ZBTFVFKGFkZHIpKQo+ID4gKwkJcmV0dXJuIDA7Cj4gPiArCj4gPiArCWFkZHIgKz0gKG9m ZiAtIGFkZHIpICYgKHNpemUgLSAxKTsKPiA+ICsJcmV0dXJuIGFkZHI7Cj4KPiBIdWdoIGhhcyBt b3JlIHNhbml0eSBjaGVja3MgYmVmb3JlIGFuZCBhZnRlciBjYWxsIHRvIGdldF91bm1hcHBlZF9h cmVhKCkuCj4gUGxlYXNlLCBjb25zaWRlciBib3Jyb3dpbmcgdGhlbS4KClRoaXMgZnVuY3Rpb24g b25seSBjaGVja3MgaWYgdGhlIHJlcXVlc3QgaXMgcXVhbGlmaWVkIGZvciBUSFAgbWFwcGluZ3Mu IEl0CnRyaWVzIG5vdCB0byBzdGVwIGludG8gdGhlIGltcGxlbWVudGF0aW9uIG9mIHRoZSBhbGxv Y2F0aW9uIGNvZGUgY3VycmVudC0KPm1tLT5nZXRfdW5tYXBwZWRfYXJlYSgpLCBzdWNoIGFzwqBh cmNoX2dldF91bm1hcHBlZF9hcmVhX3RvcGRvd24oKSBvbiB4ODYuCgpMZXQgbWUgd2FsayB0aHJ1 IEh1Z2gncyBjaGVja3MgdG8gbWFrZSBzdXJlIEkgYW0gbm90IG1pc3Npbmcgc29tZXRoaW5nOgoK LS0tKEh1Z2gncyBjaGVja3MpLS0tCnwgKwlpZiAobGVuID4gVEFTS19TSVpFKQp8ICsJCXJldHVy biAtRU5PTUVNOwoKVGhpcyBjaGVjayBpcyBtYWRlIGJ5IGFyY2hfZ2V0X3VubWFwcGVkX2FyZWFf dG9wZG93bigpLgoKfCArCnwgKwlnZXRfYXJlYSA9IGN1cnJlbnQtPm1tLT5nZXRfdW5tYXBwZWRf YXJlYTsKfCArCWFkZHIgPSBnZXRfYXJlYShmaWxlLCB1YWRkciwgbGVuLCBwZ29mZiwgZmxhZ3Mp Owp8ICsKfCArCWlmICghSVNfRU5BQkxFRChDT05GSUdfVFJBTlNQQVJFTlRfSFVHRVBBR0UpKQp8 ICsJCXJldHVybiBhZGRyOwoKdGhwX2dldF91bm1hcHBlZF9hcmVhKCkgaXMgZGVmaW5lZCB0byBO VUxMIGluIHRoaXMgY2FzZS4KCnwgKwlpZiAoSVNfRVJSX1ZBTFVFKGFkZHIpKQp8ICsJCXJldHVy biBhZGRyOwoKQ2hlY2tlZCBpbiBteSBwYXRjaC4KCnwgKwlpZiAoYWRkciAmIH5QQUdFX01BU0sp CnwgKwkJcmV0dXJuIGFkZHI7CgphcmNoX2dldF91bm1hcHBlZF9hcmVhX3RvcGRvd24oKSBhbGln bnMgJ2FkZHInIHVubGVzcyBNQVBfRklYRUQgaXMgc2V0LiBObwpuZWVkIHRvIGNoZWNrIGluIHRo aXMgZnVuYy4KCnwgKwlpZiAoYWRkciA+IFRBU0tfU0laRSAtIGxlbikKfCArCQlyZXR1cm4gYWRk cjsKClRoZSBhbGxvY2F0aW9uIGNvZGUgbmVlZHMgdG8gYXNzdXJlIHRoaXMgY2FzZS4KCnwgKwlp ZiAoc2htZW1faHVnZSA9PSBTSE1FTV9IVUdFX0RFTlkpCnwgKwkJcmV0dXJuIGFkZHI7CgpUaGlz IGNoZWNrIGlzIHNwZWNpZmljIHRvIEh1Z2gncyBwYXRjaC4KCnwgKwlpZiAobGVuIDwgSFBBR0Vf UE1EX1NJWkUpCnwgKwkJcmV0dXJuIGFkZHI7CgpDaGVja2VkIGluIG15IHBhdGNoLgoKfCArCWlm IChmbGFncyAmIE1BUF9GSVhFRCkKfCArCQlyZXR1cm4gYWRkcjsKCkNoZWNrZWQgYnkgYXJjaF9n ZXRfdW5tYXBwZWRfYXJlYV90b3Bkb3duKCkuCgp8ICsJLyoKfCArCcKgKiBPdXIgcHJpb3JpdHkg aXMgdG8gc3VwcG9ydCBNQVBfU0hBUkVEIG1hcHBlZCBodWdlbHk7CnwgKwnCoCogYW5kIHN1cHBv cnQgTUFQX1BSSVZBVEUgbWFwcGVkIGh1Z2VseSB0b28sIHVudGlsIGl0IGlzIENPV2VkLgp8ICsJ wqAqIEJ1dCBpZiBjYWxsZXIgc3BlY2lmaWVkIGFuIGFkZHJlc3MgaGludCwgcmVzcGVjdCB0aGF0 IGFzCmJlZm9yZS4KfCArCcKgKi8KfCArCWlmICh1YWRkcikKfCArCQlyZXR1cm4gYWRkcjsKCkNo ZWNrZWQgaW4gbXkgcGF0Y2guCgooY3V0KQoKfCArCW9mZnNldCA9IChwZ29mZiA8PCBQQUdFX1NI SUZUKSAmIChIUEFHRV9QTURfU0laRS0xKTsKfCArCWlmIChvZmZzZXQgJiYgb2Zmc2V0ICsgbGVu IDwgMiAqIEhQQUdFX1BNRF9TSVpFKQp8ICsJCXJldHVybiBhZGRyOwoKQ2hlY2tlZCBpbiBteSBw YXRjaC4KCnwgKwlpZiAoKGFkZHIgJiAoSFBBR0VfUE1EX1NJWkUtMSkpID09IG9mZnNldCkKfCAr CQlyZXR1cm4gYWRkcjsKClRoaXMgaXMgYSBsdWNreSBjYXNlLCBpLmUuIHRoZSAxc3QgZ2V0X3Vu bWFwcGVkX2FyZWEoKSBjYWxsIHJldHVybmVkIGFuCmFsaWduZWQgYWRkci4gTm90IGFwcGxpY2Fi bGUgdG8gbXkgcGF0Y2guCgp8ICsKfCArCWluZmxhdGVkX2xlbiA9IGxlbiArIEhQQUdFX1BNRF9T SVpFIC0gUEFHRV9TSVpFOwp8ICsJaWYgKGluZmxhdGVkX2xlbiA+IFRBU0tfU0laRSkKfCArCQly ZXR1cm4gYWRkcjsKCkNoZWNrZWQgYnkgYXJjaF9nZXRfdW5tYXBwZWRfYXJlYV90b3Bkb3duKCku Cgp8ICsJaWYgKGluZmxhdGVkX2xlbiA8IGxlbikKfCArCQlyZXR1cm4gYWRkcjsKCkNoZWNrZWQg aW4gbXkgcGF0Y2guCgp8ICsJaW5mbGF0ZWRfYWRkciA9IGdldF9hcmVhKE5VTEwsIDAsIGluZmxh dGVkX2xlbiwgMCwgZmxhZ3MpOwoKTm90IHN1cmUgd2h5IHBhc3NpbmcgJ2ZpbHAnIGFuZCAnb2Zm JyBhcyBOVUxMIGhlcmUuCgp8ICsJaWYgKElTX0VSUl9WQUxVRShpbmZsYXRlZF9hZGRyKSkKfCAr CQlyZXR1cm4gYWRkcjsKCkNoZWNrZWQgaW4gbXkgcGF0Y2guCgp8ICsJaWYgKGluZmxhdGVkX2Fk ZHIgJiB+UEFHRV9NQVNLKQp8ICsJCXJldHVybiBhZGRyOwoKSG1tLi4uIGlmIHRoaXMgaGFwcGVu cywgaXQgaXMgYSBidWcgaW4gdGhlIGFsbG9jYXRpb24gY29kZS4gSSBkbyBub3QgdGhpbmsKdGhp cyBjaGVjayBpcyBuZWNlc3NhcnkuCgp8ICsJaW5mbGF0ZWRfb2Zmc2V0ID0gaW5mbGF0ZWRfYWRk ciAmIChIUEFHRV9QTURfU0laRS0xKTsKfCArCWluZmxhdGVkX2FkZHIgKz0gb2Zmc2V0IC0gaW5m bGF0ZWRfb2Zmc2V0Owp8ICsJaWYgKGluZmxhdGVkX29mZnNldCA+IG9mZnNldCkKfCArCQlpbmZs YXRlZF9hZGRyICs9IEhQQUdFX1BNRF9TSVpFOwp8ICsKfCArCWlmIChpbmZsYXRlZF9hZGRyID4g VEFTS19TSVpFIC0gbGVuKQp8ICsJCXJldHVybiBhZGRyOwoKVGhlIGFsbG9jYXRpb24gY29kZSBu ZWVkcyB0byBhc3N1cmUgdGhpcy4KCnwgKwlyZXR1cm4gaW5mbGF0ZWRfYWRkcjsKCj4gPiAKPiA+ ICt9Cj4gPiArCj4gPiArdW5zaWduZWQgbG9uZyB0aHBfZ2V0X3VubWFwcGVkX2FyZWEoc3RydWN0 IGZpbGUgKmZpbHAsIHVuc2lnbmVkIGxvbmcKPiA+IGFkZHIsCj4gPiArCQl1bnNpZ25lZCBsb25n IGxlbiwgdW5zaWduZWQgbG9uZyBwZ29mZiwgdW5zaWduZWQgbG9uZwo+ID4gZmxhZ3MpCj4gPiAr ewo+ID4gKwlsb2ZmX3Qgb2ZmID0gKGxvZmZfdClwZ29mZiA8PCBQQUdFX1NISUZUOwo+ID4gKwo+ ID4gKwlpZiAoYWRkcikKPiA+ICsJCWdvdG8gb3V0Owo+Cj4gSSB0aGluayBpdCdzIHRvbyBzdHJv bmcgcmVhY3Rpb24gdG8gaGludCwgaXNuJ3QgaXQ/Cj4gV2UgZGVmaW5hdGVseSBuZWVkIHRoaXMg Zm9yIE1BUF9GSVhFRC4gQnV0IGluIGdlbmVyYWw/IE1heWJlLgoKSXQgY2FsbHMgYXJjaCdzIGdl dF91bm1hcHBlZF9hcmVhKCkgdG8gcHJvY2VlZCB3aXRoIHRoZSBvcmlnaW5hbCBhcmdzIHdoZW4K J2FkZHInIGlzIHBhc3NlZC4gVGhlIGFyY2gncyBnZXRfdW5tYXBwZWRfYXJlKCkgdGhlbiBoYW5k bGVzICdhZGRyJyBhcyBhCmhpbnQgd2hlbiBNQVBfRklYRUQgaXMgbm90IHNldC4gVGhpcyBjYW4g YmUgdXNlZCBhcyBhIGhpbnQgdG8gYXZvaWQgdXNpbmcKVEhQIG1hcHBpbmdzIGlmIGEgbm9uLWFs aWduZWQgYWRkcmVzcyBpcyBwYXNzZWQuIEh1Z2gncyBjb2RlIGhhbmRsZXMgaXQgaW4KdGhlIHNh bWUgd2F5IGFzIHdlbGwuCgpUaGFua3MsCi1Ub3NoaQpfX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fXwpMaW51eC1udmRpbW0gbWFpbGluZyBsaXN0CkxpbnV4LW52 ZGltbUBsaXN0cy4wMS5vcmcKaHR0cHM6Ly9saXN0cy4wMS5vcmcvbWFpbG1hbi9saXN0aW5mby9s aW51eC1udmRpbW0K From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Message-ID: <1461600377.8149.76.camel@hpe.com> Subject: Re: [PATCH v4 1/2] thp, dax: add thp_get_unmapped_area for pmd mappings From: Toshi Kani To: "Kirill A. Shutemov" , Hugh Dickins Cc: akpm@linux-foundation.org, dan.j.williams@intel.com, viro@zeniv.linux.org.uk, willy@linux.intel.com, ross.zwisler@linux.intel.com, kirill.shutemov@linux.intel.com, david@fromorbit.com, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, mike.kravetz@oracle.com, linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Date: Mon, 25 Apr 2016 10:06:17 -0600 In-Reply-To: <20160424225057.GA6670@node.shutemov.name> References: <1461370883-7664-1-git-send-email-toshi.kani@hpe.com> <1461370883-7664-2-git-send-email-toshi.kani@hpe.com> <20160424225057.GA6670@node.shutemov.name> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org List-ID: On Mon, 2016-04-25 at 01:50 +0300, Kirill A. Shutemov wrote: > On Fri, Apr 22, 2016 at 06:21:22PM -0600, Toshi Kani wrote: > >  : > > +unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long > > len, > > + loff_t off, unsigned long flags, unsigned long size) > > +{ > > + unsigned long addr; > > + loff_t off_end = off + len; > > + loff_t off_align = round_up(off, size); > > + unsigned long len_pad; > > + > > + if (off_end <= off_align || (off_end - off_align) < size) > > + return 0; > > + > > + len_pad = len + size; > > + if (len_pad < len || (off + len_pad) < off) > > + return 0; > > + > > + addr = current->mm->get_unmapped_area(filp, 0, len_pad, > > +       off >> PAGE_SHIFT, > > flags); > > + if (IS_ERR_VALUE(addr)) > > + return 0; > > + > > + addr += (off - addr) & (size - 1); > > + return addr; > > Hugh has more sanity checks before and after call to get_unmapped_area(). > Please, consider borrowing them. This function only checks if the request is qualified for THP mappings. It tries not to step into the implementation of the allocation code current- >mm->get_unmapped_area(), such as arch_get_unmapped_area_topdown() on x86. Let me walk thru Hugh's checks to make sure I am not missing something: ---(Hugh's checks)--- | + if (len > TASK_SIZE) | + return -ENOMEM; This check is made by arch_get_unmapped_area_topdown(). | + | + get_area = current->mm->get_unmapped_area; | + addr = get_area(file, uaddr, len, pgoff, flags); | + | + if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) | + return addr; thp_get_unmapped_area() is defined to NULL in this case. | + if (IS_ERR_VALUE(addr)) | + return addr; Checked in my patch. | + if (addr & ~PAGE_MASK) | + return addr; arch_get_unmapped_area_topdown() aligns 'addr' unless MAP_FIXED is set. No need to check in this func. | + if (addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this case. | + if (shmem_huge == SHMEM_HUGE_DENY) | + return addr; This check is specific to Hugh's patch. | + if (len < HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if (flags & MAP_FIXED) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + /* | +  * Our priority is to support MAP_SHARED mapped hugely; | +  * and support MAP_PRIVATE mapped hugely too, until it is COWed. | +  * But if caller specified an address hint, respect that as before. | +  */ | + if (uaddr) | + return addr; Checked in my patch. (cut) | + offset = (pgoff << PAGE_SHIFT) & (HPAGE_PMD_SIZE-1); | + if (offset && offset + len < 2 * HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if ((addr & (HPAGE_PMD_SIZE-1)) == offset) | + return addr; This is a lucky case, i.e. the 1st get_unmapped_area() call returned an aligned addr. Not applicable to my patch. | + | + inflated_len = len + HPAGE_PMD_SIZE - PAGE_SIZE; | + if (inflated_len > TASK_SIZE) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + if (inflated_len < len) | + return addr; Checked in my patch. | + inflated_addr = get_area(NULL, 0, inflated_len, 0, flags); Not sure why passing 'filp' and 'off' as NULL here. | + if (IS_ERR_VALUE(inflated_addr)) | + return addr; Checked in my patch. | + if (inflated_addr & ~PAGE_MASK) | + return addr; Hmm... if this happens, it is a bug in the allocation code. I do not think this check is necessary. | + inflated_offset = inflated_addr & (HPAGE_PMD_SIZE-1); | + inflated_addr += offset - inflated_offset; | + if (inflated_offset > offset) | + inflated_addr += HPAGE_PMD_SIZE; | + | + if (inflated_addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this. | + return inflated_addr; > > > > +} > > + > > +unsigned long thp_get_unmapped_area(struct file *filp, unsigned long > > addr, > > + unsigned long len, unsigned long pgoff, unsigned long > > flags) > > +{ > > + loff_t off = (loff_t)pgoff << PAGE_SHIFT; > > + > > + if (addr) > > + goto out; > > I think it's too strong reaction to hint, isn't it? > We definately need this for MAP_FIXED. But in general? Maybe. It calls arch's get_unmapped_area() to proceed with the original args when 'addr' is passed. The arch's get_unmapped_are() then handles 'addr' as a hint when MAP_FIXED is not set. This can be used as a hint to avoid using THP mappings if a non-aligned address is passed. Hugh's code handles it in the same way as well. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f198.google.com (mail-ob0-f198.google.com [209.85.214.198]) by kanga.kvack.org (Postfix) with ESMTP id A0CF96B007E for ; Mon, 25 Apr 2016 12:15:06 -0400 (EDT) Received: by mail-ob0-f198.google.com with SMTP id n2so243586341obo.1 for ; Mon, 25 Apr 2016 09:15:06 -0700 (PDT) Received: from g1t5425.austin.hp.com (g1t5425.austin.hp.com. [15.216.225.55]) by mx.google.com with ESMTPS id i65si24433984ioi.135.2016.04.25.09.15.05 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 25 Apr 2016 09:15:05 -0700 (PDT) Message-ID: <1461600377.8149.76.camel@hpe.com> Subject: Re: [PATCH v4 1/2] thp, dax: add thp_get_unmapped_area for pmd mappings From: Toshi Kani Date: Mon, 25 Apr 2016 10:06:17 -0600 In-Reply-To: <20160424225057.GA6670@node.shutemov.name> References: <1461370883-7664-1-git-send-email-toshi.kani@hpe.com> <1461370883-7664-2-git-send-email-toshi.kani@hpe.com> <20160424225057.GA6670@node.shutemov.name> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org List-ID: To: "Kirill A. Shutemov" , Hugh Dickins Cc: akpm@linux-foundation.org, dan.j.williams@intel.com, viro@zeniv.linux.org.uk, willy@linux.intel.com, ross.zwisler@linux.intel.com, kirill.shutemov@linux.intel.com, david@fromorbit.com, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, mike.kravetz@oracle.com, linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org On Mon, 2016-04-25 at 01:50 +0300, Kirill A. Shutemov wrote: > On Fri, Apr 22, 2016 at 06:21:22PM -0600, Toshi Kani wrote: > > A : > > +unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long > > len, > > + loff_t off, unsigned long flags, unsigned long size) > > +{ > > + unsigned long addr; > > + loff_t off_end = off + len; > > + loff_t off_align = round_up(off, size); > > + unsigned long len_pad; > > + > > + if (off_end <= off_align || (off_end - off_align) < size) > > + return 0; > > + > > + len_pad = len + size; > > + if (len_pad < len || (off + len_pad) < off) > > + return 0; > > + > > + addr = current->mm->get_unmapped_area(filp, 0, len_pad, > > + A A A A A A off >> PAGE_SHIFT, > > flags); > > + if (IS_ERR_VALUE(addr)) > > + return 0; > > + > > + addr += (off - addr) & (size - 1); > > + return addr; > > Hugh has more sanity checks before and after call to get_unmapped_area(). > Please, consider borrowing them. This function only checks if the request is qualified for THP mappings. It tries not to step into the implementation of the allocation code current- >mm->get_unmapped_area(), such asA arch_get_unmapped_area_topdown() on x86. Let me walk thru Hugh's checks to make sure I am not missing something: ---(Hugh's checks)--- | + if (len > TASK_SIZE) | + return -ENOMEM; This check is made by arch_get_unmapped_area_topdown(). | + | + get_area = current->mm->get_unmapped_area; | + addr = get_area(file, uaddr, len, pgoff, flags); | + | + if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) | + return addr; thp_get_unmapped_area() is defined to NULL in this case. | + if (IS_ERR_VALUE(addr)) | + return addr; Checked in my patch. | + if (addr & ~PAGE_MASK) | + return addr; arch_get_unmapped_area_topdown() aligns 'addr' unless MAP_FIXED is set. No need to check in this func. | + if (addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this case. | + if (shmem_huge == SHMEM_HUGE_DENY) | + return addr; This check is specific to Hugh's patch. | + if (len < HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if (flags & MAP_FIXED) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + /* | + A * Our priority is to support MAP_SHARED mapped hugely; | + A * and support MAP_PRIVATE mapped hugely too, until it is COWed. | + A * But if caller specified an address hint, respect that as before. | + A */ | + if (uaddr) | + return addr; Checked in my patch. (cut) | + offset = (pgoff << PAGE_SHIFT) & (HPAGE_PMD_SIZE-1); | + if (offset && offset + len < 2 * HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if ((addr & (HPAGE_PMD_SIZE-1)) == offset) | + return addr; This is a lucky case, i.e. the 1st get_unmapped_area() call returned an aligned addr. Not applicable to my patch. | + | + inflated_len = len + HPAGE_PMD_SIZE - PAGE_SIZE; | + if (inflated_len > TASK_SIZE) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + if (inflated_len < len) | + return addr; Checked in my patch. | + inflated_addr = get_area(NULL, 0, inflated_len, 0, flags); Not sure why passing 'filp' and 'off' as NULL here. | + if (IS_ERR_VALUE(inflated_addr)) | + return addr; Checked in my patch. | + if (inflated_addr & ~PAGE_MASK) | + return addr; Hmm... if this happens, it is a bug in the allocation code. I do not think this check is necessary. | + inflated_offset = inflated_addr & (HPAGE_PMD_SIZE-1); | + inflated_addr += offset - inflated_offset; | + if (inflated_offset > offset) | + inflated_addr += HPAGE_PMD_SIZE; | + | + if (inflated_addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this. | + return inflated_addr; > > > > +} > > + > > +unsigned long thp_get_unmapped_area(struct file *filp, unsigned long > > addr, > > + unsigned long len, unsigned long pgoff, unsigned long > > flags) > > +{ > > + loff_t off = (loff_t)pgoff << PAGE_SHIFT; > > + > > + if (addr) > > + goto out; > > I think it's too strong reaction to hint, isn't it? > We definately need this for MAP_FIXED. But in general? Maybe. It calls arch's get_unmapped_area() to proceed with the original args when 'addr' is passed. The arch's get_unmapped_are() then handles 'addr' as a hint when MAP_FIXED is not set. This can be used as a hint to avoid using THP mappings if a non-aligned address is passed. Hugh's code handles it in the same way as well. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933111AbcDYQPJ (ORCPT ); Mon, 25 Apr 2016 12:15:09 -0400 Received: from g1t5425.austin.hp.com ([15.216.225.55]:47168 "EHLO g1t5425.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932995AbcDYQPG (ORCPT ); Mon, 25 Apr 2016 12:15:06 -0400 Message-ID: <1461600377.8149.76.camel@hpe.com> Subject: Re: [PATCH v4 1/2] thp, dax: add thp_get_unmapped_area for pmd mappings From: Toshi Kani To: "Kirill A. Shutemov" , Hugh Dickins Cc: akpm@linux-foundation.org, dan.j.williams@intel.com, viro@zeniv.linux.org.uk, willy@linux.intel.com, ross.zwisler@linux.intel.com, kirill.shutemov@linux.intel.com, david@fromorbit.com, jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca, mike.kravetz@oracle.com, linux-nvdimm@ml01.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Date: Mon, 25 Apr 2016 10:06:17 -0600 In-Reply-To: <20160424225057.GA6670@node.shutemov.name> References: <1461370883-7664-1-git-send-email-toshi.kani@hpe.com> <1461370883-7664-2-git-send-email-toshi.kani@hpe.com> <20160424225057.GA6670@node.shutemov.name> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.18.5.2 (3.18.5.2-1.fc23) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2016-04-25 at 01:50 +0300, Kirill A. Shutemov wrote: > On Fri, Apr 22, 2016 at 06:21:22PM -0600, Toshi Kani wrote: > >  : > > +unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long > > len, > > + loff_t off, unsigned long flags, unsigned long size) > > +{ > > + unsigned long addr; > > + loff_t off_end = off + len; > > + loff_t off_align = round_up(off, size); > > + unsigned long len_pad; > > + > > + if (off_end <= off_align || (off_end - off_align) < size) > > + return 0; > > + > > + len_pad = len + size; > > + if (len_pad < len || (off + len_pad) < off) > > + return 0; > > + > > + addr = current->mm->get_unmapped_area(filp, 0, len_pad, > > +       off >> PAGE_SHIFT, > > flags); > > + if (IS_ERR_VALUE(addr)) > > + return 0; > > + > > + addr += (off - addr) & (size - 1); > > + return addr; > > Hugh has more sanity checks before and after call to get_unmapped_area(). > Please, consider borrowing them. This function only checks if the request is qualified for THP mappings. It tries not to step into the implementation of the allocation code current- >mm->get_unmapped_area(), such as arch_get_unmapped_area_topdown() on x86. Let me walk thru Hugh's checks to make sure I am not missing something: ---(Hugh's checks)--- | + if (len > TASK_SIZE) | + return -ENOMEM; This check is made by arch_get_unmapped_area_topdown(). | + | + get_area = current->mm->get_unmapped_area; | + addr = get_area(file, uaddr, len, pgoff, flags); | + | + if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) | + return addr; thp_get_unmapped_area() is defined to NULL in this case. | + if (IS_ERR_VALUE(addr)) | + return addr; Checked in my patch. | + if (addr & ~PAGE_MASK) | + return addr; arch_get_unmapped_area_topdown() aligns 'addr' unless MAP_FIXED is set. No need to check in this func. | + if (addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this case. | + if (shmem_huge == SHMEM_HUGE_DENY) | + return addr; This check is specific to Hugh's patch. | + if (len < HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if (flags & MAP_FIXED) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + /* | +  * Our priority is to support MAP_SHARED mapped hugely; | +  * and support MAP_PRIVATE mapped hugely too, until it is COWed. | +  * But if caller specified an address hint, respect that as before. | +  */ | + if (uaddr) | + return addr; Checked in my patch. (cut) | + offset = (pgoff << PAGE_SHIFT) & (HPAGE_PMD_SIZE-1); | + if (offset && offset + len < 2 * HPAGE_PMD_SIZE) | + return addr; Checked in my patch. | + if ((addr & (HPAGE_PMD_SIZE-1)) == offset) | + return addr; This is a lucky case, i.e. the 1st get_unmapped_area() call returned an aligned addr. Not applicable to my patch. | + | + inflated_len = len + HPAGE_PMD_SIZE - PAGE_SIZE; | + if (inflated_len > TASK_SIZE) | + return addr; Checked by arch_get_unmapped_area_topdown(). | + if (inflated_len < len) | + return addr; Checked in my patch. | + inflated_addr = get_area(NULL, 0, inflated_len, 0, flags); Not sure why passing 'filp' and 'off' as NULL here. | + if (IS_ERR_VALUE(inflated_addr)) | + return addr; Checked in my patch. | + if (inflated_addr & ~PAGE_MASK) | + return addr; Hmm... if this happens, it is a bug in the allocation code. I do not think this check is necessary. | + inflated_offset = inflated_addr & (HPAGE_PMD_SIZE-1); | + inflated_addr += offset - inflated_offset; | + if (inflated_offset > offset) | + inflated_addr += HPAGE_PMD_SIZE; | + | + if (inflated_addr > TASK_SIZE - len) | + return addr; The allocation code needs to assure this. | + return inflated_addr; > > > > +} > > + > > +unsigned long thp_get_unmapped_area(struct file *filp, unsigned long > > addr, > > + unsigned long len, unsigned long pgoff, unsigned long > > flags) > > +{ > > + loff_t off = (loff_t)pgoff << PAGE_SHIFT; > > + > > + if (addr) > > + goto out; > > I think it's too strong reaction to hint, isn't it? > We definately need this for MAP_FIXED. But in general? Maybe. It calls arch's get_unmapped_area() to proceed with the original args when 'addr' is passed. The arch's get_unmapped_are() then handles 'addr' as a hint when MAP_FIXED is not set. This can be used as a hint to avoid using THP mappings if a non-aligned address is passed. Hugh's code handles it in the same way as well. Thanks, -Toshi