From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH 2/9] mm: Provide new get_vaddr_pfns() helper Date: Mon, 4 May 2015 17:14:01 +0200 Message-ID: <20150504151401.GC5805@quack.suse.cz> References: <1426593399-6549-1-git-send-email-jack@suse.cz> <1426593399-6549-3-git-send-email-jack@suse.cz> <20150430155531.GY2449@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15]) by gabe.freedesktop.org (Postfix) with ESMTP id 7860A6E2A6 for ; Mon, 4 May 2015 08:14:03 -0700 (PDT) Content-Disposition: inline In-Reply-To: <20150430155531.GY2449@suse.de> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Mel Gorman Cc: Jan Kara , Mauro Carvalho Chehab , dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Hans Verkuil , linux-media@vger.kernel.org List-Id: dri-devel@lists.freedesktop.org T24gVGh1IDMwLTA0LTE1IDE2OjU1OjMxLCBNZWwgR29ybWFuIHdyb3RlOgo+IE9uIFR1ZSwgTWFy IDE3LCAyMDE1IGF0IDEyOjU2OjMyUE0gKzAxMDAsIEphbiBLYXJhIHdyb3RlOgo+ID4gUHJvdmlk ZSBuZXcgZnVuY3Rpb24gZ2V0X3ZhZGRyX3BmbnMoKS4gIFRoaXMgZnVuY3Rpb24gbWFwcyB2aXJ0 dWFsCj4gPiBhZGRyZXNzZXMgZnJvbSBnaXZlbiBzdGFydCBhbmQgZmlsbHMgZ2l2ZW4gYXJyYXkg d2l0aCBwYWdlIGZyYW1lIG51bWJlcnMgb2YKPiA+IHRoZSBjb3JyZXNwb25kaW5nIHBhZ2VzLiBJ ZiBnaXZlbiBzdGFydCBiZWxvbmdzIHRvIGEgbm9ybWFsIHZtYSwgdGhlIGZ1bmN0aW9uCj4gPiBn cmFicyByZWZlcmVuY2UgdG8gZWFjaCBvZiB0aGUgcGFnZXMgdG8gcGluIHRoZW0gaW4gbWVtb3J5 LiBJZiBzdGFydAo+ID4gYmVsb25ncyB0byBWTV9JTyB8IFZNX1BGTk1BUCB2bWEsIHdlIGRvbid0 IHRvdWNoIHBhZ2Ugc3RydWN0dXJlcy4gQ2FsbGVyCj4gPiBzaG91bGQgbWFrZSBzdXJlIHBmbnMg YXJlbid0IHJldXNlZCBmb3IgYW55dGhpbmcgZWxzZSB3aGlsZSBoZSBpcyB1c2luZwo+ID4gdGhl bS4KPiA+IAo+ID4gVGhpcyBmdW5jdGlvbiBpcyBjcmVhdGVkIGZvciB2YXJpb3VzIGRyaXZlcnMg dG8gc2ltcGxpZnkgaGFuZGxpbmcgb2YKPiA+IHRoZWlyIGJ1ZmZlcnMuCj4gPiAKPiA+IFNpZ25l ZC1vZmYtYnk6IEphbiBLYXJhIDxqYWNrQHN1c2UuY3o+Cj4gPiAtLS0KPiA+ICBpbmNsdWRlL2xp bnV4L21tLmggfCAgMzggKysrKysrKysrKysKPiA+ICBtbS9ndXAuYyAgICAgICAgICAgfCAxODAg KysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysKPiA+ ICAyIGZpbGVzIGNoYW5nZWQsIDIxOCBpbnNlcnRpb25zKCspCj4gPiAKPiA+IGRpZmYgLS1naXQg YS9pbmNsdWRlL2xpbnV4L21tLmggYi9pbmNsdWRlL2xpbnV4L21tLmgKPiA+IGluZGV4IDQ3YTkz OTI4YjkwZi4uYTUwNDVkZjkyNDU0IDEwMDY0NAo+ID4gLS0tIGEvaW5jbHVkZS9saW51eC9tbS5o Cj4gPiArKysgYi9pbmNsdWRlL2xpbnV4L21tLmgKPiA+IEBAIC0xMjc5LDYgKzEyNzksNDQgQEAg bG9uZyBnZXRfdXNlcl9wYWdlc191bmxvY2tlZChzdHJ1Y3QgdGFza19zdHJ1Y3QgKnRzaywgc3Ry dWN0IG1tX3N0cnVjdCAqbW0sCj4gPiAgCQkgICAgaW50IHdyaXRlLCBpbnQgZm9yY2UsIHN0cnVj dCBwYWdlICoqcGFnZXMpOwo+ID4gIGludCBnZXRfdXNlcl9wYWdlc19mYXN0KHVuc2lnbmVkIGxv bmcgc3RhcnQsIGludCBucl9wYWdlcywgaW50IHdyaXRlLAo+ID4gIAkJCXN0cnVjdCBwYWdlICoq cGFnZXMpOwo+ID4gKwo+ID4gKy8qIENvbnRhaW5lciBmb3IgcGlubmVkIHBmbnMgLyBwYWdlcyAq Lwo+ID4gK3N0cnVjdCBwaW5uZWRfcGZucyB7Cj4gPiArCXVuc2lnbmVkIGludCBucl9hbGxvY2F0 ZWRfcGZuczsJLyogTnVtYmVyIG9mIHBmbnMgd2UgaGF2ZSBzcGFjZSBmb3IgKi8KPiA+ICsJdW5z aWduZWQgaW50IG5yX3BmbnM7CQkvKiBOdW1iZXIgb2YgcGZucyBzdG9yZWQgaW4gcGZucyBhcnJh eSAqLwo+ID4gKwl1bnNpZ25lZCBpbnQgZ290X3JlZjoxOwkJLyogRGlkIHdlIHBpbiBwZm5zIGJ5 IGdldHRpbmcgcGFnZSByZWY/ICovCj4gPiArCXVuc2lnbmVkIGludCBpc19wYWdlczoxOwkvKiBE b2VzIGFycmF5IGNvbnRhaW4gcGFnZXMgb3IgcGZucz8gKi8KPiAKPiBUaGUgYml0IGZpZWxkIGlz IHByb2JhYmx5IG92ZXJraWxsIGFzIEkgZXhwZWN0IGl0J2xsIGdldCBwYWRkZWQgb3V0IGZvcgo+ IHBvaW50ZXIgYWxpZ25tZW50IGFueXdheS4gSnVzdCB1c2UgYm9vbHMuCiAgTWFrZXMgc2Vuc2Uu Cgo+IGlzX3BmbnMgaXMgbGVzcyBhbWJpZ3VvdXMgdGhhbiBpc19wYWdlcyBidXQgbm90IHZlcnkg aW1wb3J0YW50Lgo+IAo+IFRoZSBuYW1pbmcgaXMgbm90IGdyZWF0IGluIGdlbmVyYWwuIE9ubHkg c3RydWN0IHBhZ2VzIGFyZSBwaW5uZWQgaW4gdGhlCj4gdHJhZGl0aW9uYWwgbWVhbmluZyBvZiB0 aGUgd29yZC4gVGhlIHJhdyBQRk5zIGFyZSBub3Qgc28gdGhlcmUgaXMgbm8gc3VjaAo+IHRoaW5n IGFzIGEgInBpbm5lZCBwZm5zIi4gSXQgbWlnaHQgYmUgYmV0dGVyIGp1c3QgdG8gY2FsbCBpdCBm cmFtZV92ZWN0b3JzCj4gYW5kIGRvY3VtZW50IHRoYXQgaXQncyBlaXRoZXIgcmF3IFBGTnMgdGhh dCB0aGUgY2FsbGVyIHNob3VsZCBiZSByZXNwb25zaWJsZQo+IGZvciBvciBzdHJ1Y3QgcGFnZXMg dGhhdCBhcmUgcGlubmVkLgogIEdvb2QgcG9pbnQuIEknbGwgdHJ5IHRvIGNvbWUgdXAgd2l0aCBh IGJldHRlciBuYW1lLgoKPHNuaXA+IC0gSSBhZ3JlZSB3aXRoIG1pbm9yIGNvbW1lbnRzIHRoZXJl LgoKPiA+ICAvKioKPiA+ICsgKiBnZXRfdmFkZHJfcGZucygpIC0gbWFwIHZpcnR1YWwgYWRkcmVz c2VzIHRvIHBmbnMKPiA+ICsgKiBAc3RhcnQ6CXN0YXJ0aW5nIHVzZXIgYWRkcmVzcwo+ID4gKyAq IEBucl9wZm5zOgludW1iZXIgb2YgcGZucyBmcm9tIHN0YXJ0IHRvIG1hcAo+ID4gKyAqIEB3cml0 ZToJd2hldGhlciBwYWdlcyB3aWxsIGJlIHdyaXR0ZW4gdG8gYnkgdGhlIGNhbGxlcgo+ID4gKyAq IEBmb3JjZToJd2hldGhlciB0byBmb3JjZSB3cml0ZSBhY2Nlc3MgZXZlbiBpZiB1c2VyIG1hcHBp bmcgaXMKPiA+ICsgKgkJcmVhZG9ubHkuIFRoaXMgd2lsbCByZXN1bHQgaW4gdGhlIHBhZ2UgYmVp bmcgQ09XZWQgZXZlbgo+ID4gKyAqCQlpbiBNQVBfU0hBUkVEIG1hcHBpbmdzLiBZb3UgZG8gbm90 IHdhbnQgdGhpcy4KPiA+ICsgKiBAcGZuczoJc3RydWN0dXJlIHdoaWNoIHJlY2VpdmVzIHBmbnMg b2YgdGhlIHBhZ2VzIG1hcHBlZC4KPiA+ICsgKgkJSXQgc2hvdWxkIGhhdmUgc3BhY2UgZm9yIGF0 IGxlYXN0IG5yX3BmbnMgcGZucy4KPiA+ICsgKgo+ID4gKyAqIFRoaXMgZnVuY3Rpb24gbWFwcyB2 aXJ0dWFsIGFkZHJlc3NlcyBmcm9tIEBzdGFydCBhbmQgZmlsbHMgQHBmbnMgc3RydWN0dXJlCj4g PiArICogd2l0aCBwYWdlIGZyYW1lIG51bWJlcnMgb2YgY29ycmVzcG9uZGluZyBwYWdlcy4gSWYg QHN0YXJ0IGJlbG9uZ3MgdG8gYQo+ID4gKyAqIG5vcm1hbCB2bWEsIHRoZSBmdW5jdGlvbiBncmFi cyByZWZlcmVuY2UgdG8gZWFjaCBvZiB0aGUgcGFnZXMgdG8gcGluIHRoZW0gaW4KPiA+ICsgKiBt ZW1vcnkuIElmIEBzdGFydCBiZWxvbmdzIHRvIFZNX0lPIHwgVk1fUEZOTUFQIHZtYSwgd2UgZG9u J3QgdG91Y2ggcGFnZQo+ID4gKyAqIHN0cnVjdHVyZXMuIENhbGxlciBzaG91bGQgbWFrZSBzdXJl IHBmbnMgYXJlbid0IHJldXNlZCBmb3IgYW55dGhpbmcgZWxzZQo+ID4gKyAqIHdoaWxlIGhlIGlz IHVzaW5nIHRoZW0uCj4gPiArICoKPiA+ICsgKiBUaGlzIGZ1bmN0aW9uIHRha2VzIGNhcmUgb2Yg Z3JhYmJpbmcgbW1hcF9zZW0gYXMgbmVjZXNzYXJ5Lgo+ID4gKyAqLwo+ID4gK2ludCBnZXRfdmFk ZHJfcGZucyh1bnNpZ25lZCBsb25nIHN0YXJ0LCBpbnQgbnJfcGZucywgaW50IHdyaXRlLCBpbnQg Zm9yY2UsCj4gPiArCQkgICBzdHJ1Y3QgcGlubmVkX3BmbnMgKnBmbnMpCj4gPiArewo+ID4gKwlz dHJ1Y3QgbW1fc3RydWN0ICptbSA9IGN1cnJlbnQtPm1tOwo+ID4gKwlzdHJ1Y3Qgdm1fYXJlYV9z dHJ1Y3QgKnZtYTsKPiA+ICsJaW50IHJldCA9IDA7Cj4gPiArCWludCBlcnI7Cj4gPiArCj4gPiAr CWlmIChucl9wZm5zIDw9IDApCj4gPiArCQlyZXR1cm4gMDsKPiA+ICsKPiAKPiBJIGtub3cgSSBz dWdnZXN0ZWQgdGhhdCBucl9wZm5zIHNob3VsZCBiZSB1bnNpZ25lZCBlYXJsaWVyIGFuZCB0aGVu IEkKPiBzYXcgdGhpcy4gSXMgdGhlcmUgYW55IHZhbGlkIHVzZSBvZiB0aGUgQVBJIHRoYXQgd291 bGQgcGFzcyBpbiBhCj4gbmVnYXRpdmUgbnVtYmVyIGhlcmU/CiAgTm8uIEl0IHdhcyB0aGVyZSBq dXN0IGJlY2F1c2UgSSBoYWQgaW50cyBldmVyeXdoZXJlIHdpdGhvdXQgdG9vIG11Y2gKdGhvdWdo dCAoaXQncyBzaG9ydGVyIHRvIHR5cGUgKmdyaW4qKS4gSSdsbCBwdXQgdW5zaWduZWQgdHlwZXMg d2hlcmUgaXQKbWFrZXMgc2Vuc2UgYW5kIGFkZCBhIHRlc3QgZm9yIElOVF9NQVggc28gdGhhdCBn ZXRfdmFkZHJfcGZucygpIGNhbiBzdGlsbApyZXR1cm4gbmVnYXRpdmUgZXJyb3JzLgoKPiA+ICsJ aWYgKG5yX3BmbnMgPiBwZm5zLT5ucl9hbGxvY2F0ZWRfcGZucykKPiA+ICsJCW5yX3BmbnMgPSBw Zm5zLT5ucl9hbGxvY2F0ZWRfcGZuczsKPiA+ICsKPiAKPiBTaG91bGQgdGhpcyBiZSBhIFdBUk5f T05fT05DRT8gWW91IHJlY292ZXIgZnJvbSBpdCBvYnZpb3VzbHkgYnV0IHRoZSByZXR1cm4KPiB2 YWx1ZSBpcyBub3QgZG9jdW1lbnRlZCB0byBzYXkgdGhhdCBpdCBjb3VsZCByZXR1cm4gbGVzcyB0 aGFuIHJlcXVlc3RlZC4KPiBPZiBjb3Vyc2UsIHRoaXMgaXMgaW1wbGllZCBidXQgYSBjYWxsZXIg bWlnaHQgYXNzdW1lIGl0J3MgZHVlIHRvIGEgdHJhbnNpZW50Cj4gZXJyb3IgaW5zdGVhZCBvZiBi cm9rZW4gQVBJIHVzYWdlLgogIFllcCwgZ29vZCBpZGVhLgoKPiA+ICsJZG93bl9yZWFkKCZtbS0+ bW1hcF9zZW0pOwo+ID4gKwl2bWEgPSBmaW5kX3ZtYV9pbnRlcnNlY3Rpb24obW0sIHN0YXJ0LCBz dGFydCArIDEpOwo+ID4gKwlpZiAoIXZtYSkgewo+ID4gKwkJcmV0ID0gLUVGQVVMVDsKPiA+ICsJ CWdvdG8gb3V0Owo+ID4gKwl9Cj4gCj4gUmV0dXJuaW5nIC1FRkFVTFQgbWVhbnMgdGhhdCB0aGUg cmV0dXJuIHZhbHVlIGhhcyB0byBiZSBzaWduZWQgYnV0IHRoZQo+IHN0cnVjdHVyZSBpdHNlbGYg aGFzIHVuc2lnbmVkIGludCBmb3IgY291bnRlcnMgc28gdGhlIG1heGltdW0gbnVtYmVyIG9mCj4g UEZOcyBzdXBwb3J0ZWQgaXMgbm90IGF2YWlsYWJsZS4gV2UnZCBuZXZlciB3YW50IHRoYXQgbnVt YmVyIG9mIFBGTnMgcGlubmVkCj4gYnV0IHN0aWxsIGl0IG1pZ2h0IG1ha2Ugc2Vuc2UgdG8gZW5m b3JjZSB0aGUgbWF4aW11bSBwb3NzaWJsZSBzaXplIG9mIHRoZQo+IHN0cnVjdHVyZSB0aGF0IHRh a2VzIGludG8gYWNjb3VudCB0aGUgdmFsdWVzIG5lZWRlZCBmb3IgcmV0dXJuaW5nIGVycm9ycy4K ICBZZXMuIFNlZSBhYm92ZS4KCj4gPiArCWlmICghKHZtYS0+dm1fZmxhZ3MgJiAoVk1fSU8gfCBW TV9QRk5NQVApKSkgewo+ID4gKwkJcGZucy0+Z290X3JlZiA9IDE7Cj4gPiArCQlwZm5zLT5pc19w YWdlcyA9IDE7Cj4gPiArCQlyZXQgPSBnZXRfdXNlcl9wYWdlcyhjdXJyZW50LCBtbSwgc3RhcnQs IG5yX3BmbnMsIHdyaXRlLCBmb3JjZSwKPiA+ICsJCQkJICAgICBwZm5zX3ZlY3Rvcl9wYWdlcyhw Zm5zKSwgTlVMTCk7Cj4gPiArCQlnb3RvIG91dDsKPiA+ICsJfQo+ID4gKwo+ID4gKwlwZm5zLT5n b3RfcmVmID0gMDsKPiA+ICsJcGZucy0+aXNfcGFnZXMgPSAwOwo+ID4gKwlkbyB7Cj4gPiArCQl1 bnNpZ25lZCBsb25nICpudW1zID0gcGZuc192ZWN0b3JfcGZucyhwZm5zKTsKPiA+ICsKPiA+ICsJ CXdoaWxlIChyZXQgPCBucl9wZm5zICYmIHN0YXJ0ICsgUEFHRV9TSVpFIDw9IHZtYS0+dm1fZW5k KSB7Cj4gPiArCQkJZXJyID0gZm9sbG93X3Bmbih2bWEsIHN0YXJ0LCAmbnVtc1tyZXRdKTsKPiA+ ICsJCQlpZiAoZXJyKSB7Cj4gPiArCQkJCWlmIChyZXQgPT0gMCkKPiA+ICsJCQkJCXJldCA9IGVy cjsKPiA+ICsJCQkJZ290byBvdXQ7Cj4gPiArCQkJfQo+ID4gKwkJCXN0YXJ0ICs9IFBBR0VfU0la RTsKPiA+ICsJCQlyZXQrKzsKPiA+ICsJCX0KPiA+ICsJCS8qCj4gPiArCQkgKiBXZSBzdG9wIGlm IHdlIGhhdmUgZW5vdWdoIHBhZ2VzIG9yIGlmIFZNQSBkb2Vzbid0IGNvbXBsZXRlbHkKPiA+ICsJ CSAqIGNvdmVyIHRoZSB0YWlsIHBhZ2UuCj4gPiArCQkgKi8KPiA+ICsJCWlmIChyZXQgPj0gbnJf cGZucyB8fCBzdGFydCA8IHZtYS0+dm1fZW5kKQo+ID4gKwkJCWJyZWFrOwo+ID4gKwkJdm1hID0g ZmluZF92bWFfaW50ZXJzZWN0aW9uKG1tLCBzdGFydCwgc3RhcnQgKyAxKTsKPiA+ICsJfSB3aGls ZSAodm1hICYmIHZtYS0+dm1fZmxhZ3MgJiAoVk1fSU8gfCBWTV9QRk5NQVApKTsKPiAKPiBEb2N1 bWVudGF0aW9uIHByb2JhYmx5IHNob3VsZCBtZW50aW9uIHRoYXQgYSByYW5nZSB0aGF0IHN0cmlk ZXMgVk1BIHR5cGVzCj4gd2lsbCByZXR1cm4gcGFydGlhbCByZXN1bHRzLgogIEdvb2QgcG9pbnQu IFdpbGwgYWRkIHRoYXQuCgo+ID4gK291dDoKPiA+ICsJdXBfcmVhZCgmbW0tPm1tYXBfc2VtKTsK PiA+ICsJaWYgKCFyZXQpCj4gPiArCQlyZXQgPSAtRUZBVUxUOwo+IAo+IFJlYWxseT8gTWF5YmUg d2UganVzdCBmYWlsZWQgdG8gY2FsbCBnZXRfdXNlcl9wYWdlcyBvbiBhbnkgb2YgdGhlbS4gVGhl Cj4gbWVhbmluZyBvZiAtRUZBVUxUIHByb2JhYmx5IG5lZWRzIGEgY29tbWVudC4KICBTbyB0aGUg b25seSByZWFsIGNhc2Ugd2hlcmUgd2UgY291bGQgaGF2ZSAwIGhlcmUgaXMgd2hlbiBnZXRfdXNl cl9wYWdlcygpCnJldHVybnMgMCBmb3Igd2hhdGV2ZXIgcmVhc29uLiBJdCBzZWVtZWQgYmV0dGVy IHRvIG1lIHRvIHRyYW5zbGF0ZSAwIHRvCi1FRkFVTFQgc2luY2UgYWxsIHRoZSBkcml2ZXJzIEkg c2F3IGRpZCB0aGF0IGFueXdheSAoYXQgbGVhc3QgdGhvc2UgdGhhdApkaWRuJ3QgZm9yZ2V0IGFi b3V0IGNoZWNraW5nIGZvciAwKS4gSXQganVzdCBzZWVtcyBhcyBhIGxlc3MgZXJyb3ItcHJvbmUK aW50ZXJmYWNlIHRvIHJldHVybiBlcnJvciBvciBudW1iZXIgPiAwLiBJIGFncmVlIHRoYXQgLUVG QVVMVCBkb2Vzbid0Cm5lY2Vzc2FyaWx5IGFsd2F5cyBtYWtlIHNlbnNlIHNvIGlmIHlvdSBoYXZl IGlkZWEgZm9yIGEgYmV0dGVyIGVycm9yIEknbQpvcGVuIHRvIHN1Z2dlc3Rpb25zLgoKPiA+ICsJ aWYgKHJldCA+IDApCj4gPiArCQlwZm5zLT5ucl9wZm5zID0gcmV0Owo+ID4gKwlyZXR1cm4gcmV0 Owo+ID4gK30KPiA+ICtFWFBPUlRfU1lNQk9MKGdldF92YWRkcl9wZm5zKTsKPiA+ICsKPiA+ICsv KioKPiA+ICsgKiBwdXRfdmFkZHJfcGZucygpIC0gZHJvcCByZWZlcmVuY2VzIHRvIHBhZ2VzIGlm IGdldF92YWRkcl9wZm5zKCkgYWNxdWlyZWQgdGhlbQo+ID4gKyAqIEBwZm5zOiAgICAgc3RydWN0 dXJlIHdpdGggcGZucyB3ZSBwaW5uZWQKPiA+ICsgKgo+ID4gKyAqIERyb3AgcmVmZXJlbmNlcyB0 byBwYWdlcyBpZiBnZXRfdmFkZHJfcGZucygpIGFjcXVpcmVkIHRoZW0uIFdlIGFsc28KPiA+ICsg KiBpbnZhbGlkYXRlIHRoZSBhcnJheSBvZiBwZm5zIHNvIHRoYXQgaXQgaXMgcHJlcGFyZWQgZm9y IHRoZSBuZXh0IGNhbGwgaW50bwo+ID4gKyAqIGdldF92YWRkcl9wZm5zKCkuCj4gPiArICovCj4g PiArdm9pZCBwdXRfdmFkZHJfcGZucyhzdHJ1Y3QgcGlubmVkX3BmbnMgKnBmbnMpCj4gPiArewo+ ID4gKwlpbnQgaTsKPiA+ICsKPiA+ICsJaWYgKCFwZm5zLT5nb3RfcmVmKQo+ID4gKwkJZ290byBv dXQ7Cj4gPiArCWlmIChwZm5zLT5pc19wYWdlcykgewo+ID4gKwkJc3RydWN0IHBhZ2UgKipwYWdl cyA9IHBmbnNfdmVjdG9yX3BhZ2VzKHBmbnMpOwo+ID4gKwo+ID4gKwkJZm9yIChpID0gMDsgaSA8 IHBmbnMtPm5yX3BmbnM7IGkrKykKPiA+ICsJCQlwdXRfcGFnZShwYWdlc1tpXSk7Cj4gCj4gT2su Cj4gCj4gPiArCX0gZWxzZSB7Cj4gPiArCQl1bnNpZ25lZCBsb25nICpudW1zID0gcGZuc192ZWN0 b3JfcGZucyhwZm5zKTsKPiA+ICsKPiA+ICsJCWZvciAoaSA9IDA7IGkgPCBwZm5zLT5ucl9wZm5z OyBpKyspCj4gPiArCQkJcHV0X3BhZ2UocGZuX3RvX3BhZ2UobnVtc1tpXSkpOwo+ID4gKwl9Cj4g Cj4gSSdtIG1pc3Npbmcgc29tZXRoaW5nIGhlcmUuIFRoZSAhaXNfcGFnZXMgYnJhbmNoIGluIGdl dF92YWRkcl9wZm5zKCkKPiB1c2VkIGZvbGxvd19wZm4oKSBhbmQgaXQgZG9lcyBub3QgdGFrZSBh IHJlZmVyZW5jZSBvbiB0aGUgc3RydWN0IHBhZ2Ugc28KPiBJJ20gbm90IHN1cmUgd2hhdCB0aGlz IGlzIHB1dHRpbmcgZXhhY3RseS4KICBTbyB0aGlzIGNhc2Ugc2hvdWxkIGhhdmUgY292ZXJlZCB0 aGUgc2l0dWF0aW9uIHdoZW4gd2UgaGF2ZSBhIHZlY3RvciBvZgpub3JtYWwgcGFnZXMgKHRodXMg Z290X3JlZiA9PSAxKSBhbmQgY29udmVydCB0aGF0IHRvIGEgdmVjdG9yIG9mIHBmbnMuCk5vb25l IGN1cnJlbnRseSBkb2VzIHRoYXQgYW5kIHRoYXQgZnVuY3Rpb24gZm9yIGNvbnZlcnNpb24gZXZl biBkb2Vzbid0CnNlZW0gdG8gZXhpc3QgYnV0IGl0IHNlZW1zIHRvIG1lIGF0IGxlYXN0IG9tYXAg ZHJpdmVyIG1heSBuZWVkIGl0IGFuZCB0aGUKY29udmVyc2lvbiBJIGRpZCBoYXMgYSBidWcuIEFu eXdheSwgSSdsbCByZWNoZWNrIHRoZSBzaXR1YXRpb24gYW5kIGVpdGhlcgpkcm9wIHRoaXMgb3Ig bWFrZSB0aGUgZnVuY3Rpb24gbW9yZSBjb21wcmVoZW5zaWJsZS4KIAo+ID4gKwlwZm5zLT5nb3Rf cmVmID0gMDsKPiA+ICtvdXQ6Cj4gPiArCXBmbnMtPm5yX3BmbnMgPSAwOwo+ID4gK30KPiA+ICtF WFBPUlRfU1lNQk9MKHB1dF92YWRkcl9wZm5zKTsKPiA+ICsKPiAKPiBFWFBPUlRfU1lNQk9MX0dQ TD8KICBJJ3ZlIGp1c3QgZm9sbG93ZWQgd2hhdCBfX2dldF91c2VyX3BhZ2VzKCkgZG9lcy4uLgoK PiA+ICsvKioKPiA+ICsgKiBwZm5zX3ZlY3Rvcl90b19wYWdlcyAtIGNvbnZlcnQgdmVjdG9yIG9m IHBmbnMgdG8gdmVjdG9yIG9mIHBhZ2UgcG9pbnRlcnMKPiA+ICsgKiBAcGZuczoJc3RydWN0dXJl IHdpdGggcGlubmVkIHBmbnMKPiA+ICsgKgo+ID4gKyAqIENvbnZlcnQgQHBmbnMgdG8gbm90IGNv bnRhaW4gYXJyYXkgb2YgcGZucyBidXQgYXJyYXkgb2YgcGFnZSBwb2ludGVycy4KPiA+ICsgKiBJ ZiB0aGUgY29udmVyc2lvbiBpcyBzdWNjZXNzZnVsLCByZXR1cm4gMC4gT3RoZXJ3aXNlIHJldHVy biBhbiBlcnJvci4KPiA+ICsgKi8KPiA+ICtpbnQgcGZuc192ZWN0b3JfdG9fcGFnZXMoc3RydWN0 IHBpbm5lZF9wZm5zICpwZm5zKQo+ID4gK3sKPiA+ICsJaW50IGk7Cj4gPiArCXVuc2lnbmVkIGxv bmcgKm51bXM7Cj4gPiArCXN0cnVjdCBwYWdlICoqcGFnZXM7Cj4gPiArCj4gPiArCWlmIChwZm5z LT5pc19wYWdlcykKPiA+ICsJCXJldHVybiAwOwo+ID4gKwludW1zID0gcGZuc192ZWN0b3JfcGZu cyhwZm5zKTsKPiA+ICsJZm9yIChpID0gMDsgaSA8IHBmbnMtPm5yX3BmbnM7IGkrKykKPiA+ICsJ CWlmICghcGZuX3ZhbGlkKG51bXNbaV0pKQo+ID4gKwkJCXJldHVybiAtRUlOVkFMOwo+ID4gKwlw YWdlcyA9IChzdHJ1Y3QgcGFnZSAqKiludW1zOwo+ID4gKwlmb3IgKGkgPSAwOyBpIDwgcGZucy0+ bnJfcGZuczsgaSsrKQo+ID4gKwkJcGFnZXNbaV0gPSBwZm5fdG9fcGFnZShudW1zW2ldKTsKPiA+ ICsJcGZucy0+aXNfcGFnZXMgPSAxOwo+ID4gKwlyZXR1cm4gMDsKPiA+ICt9Cj4gPiArRVhQT1JU X1NZTUJPTChwZm5zX3ZlY3Rvcl90b19wYWdlcyk7Cj4gPiArCj4gPiArLyoqCj4gPiArICogcGZu c192ZWN0b3JfY3JlYXRlKCkgLSBhbGxvY2F0ZSAmIGluaXRpYWxpemUgc3RydWN0dXJlIGZvciBw aW5uZWQgcGZucwo+ID4gKyAqIEBucl9wZm5zOgludW1iZXIgb2YgcGZucyBzbG90cyB3ZSBzaG91 bGQgcmVzZXJ2ZQo+ID4gKyAqCj4gPiArICogQWxsb2NhdGUgYW5kIGluaXRpYWxpemUgc3RydWN0 IHBpbm5lZF9wZm5zIHRvIGJlIGFibGUgdG8gaG9sZCBAbnJfcGZucwo+ID4gKyAqIHBmbnMuCj4g PiArICovCj4gPiArc3RydWN0IHBpbm5lZF9wZm5zICpwZm5zX3ZlY3Rvcl9jcmVhdGUoaW50IG5y X3BmbnMpCj4gPiArewo+ID4gKwlzdHJ1Y3QgcGlubmVkX3BmbnMgKnBmbnM7Cj4gPiArCWludCBz aXplID0gc2l6ZW9mKHN0cnVjdCBwaW5uZWRfcGZucykgKyBzaXplb2YodW5zaWduZWQgbG9uZykg KiBucl9wZm5zOwo+ID4gKwo+IAo+IFNob3VsZCBiZSBzaXplb2Yodm9pZCAqKSAKICBPSy4KCj4g PiArCWlmIChXQVJOX09OX09OQ0UobnJfcGZucyA8PSAwKSkKPiA+ICsJCXJldHVybiBOVUxMOwo+ ID4gKwlpZiAoc2l6ZSA8PSBQQUdFX1NJWkUpCj4gPiArCQlwZm5zID0ga21hbGxvYyhzaXplLCBH RlBfS0VSTkVMKTsKPiAKPiBrbWFsbG9jIHN1cHBvcnRzIGxhcmdlciBzaXplcyB0aGFuIHRoaXMg YnV0IEkgc3VzcGVjdCB0aGlzIHdhcwo+IGRlbGliZXJhdGUgdG8gYXZvaWQgaGlnaC1vcmRlciBh bGxvY2F0aW9ucy4KICBFeGFjdGx5LgoKICBUaGFua3MgYSBsb3QgZm9yIHJldmlldyBNZWwhCgoJ CQkJCQkJCUhvbnphCi0tIApKYW4gS2FyYSA8amFja0BzdXNlLmN6PgpTVVNFIExhYnMsIENSCl9f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fCmRyaS1kZXZlbCBt YWlsaW5nIGxpc3QKZHJpLWRldmVsQGxpc3RzLmZyZWVkZXNrdG9wLm9yZwpodHRwOi8vbGlzdHMu ZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vZHJpLWRldmVsCg== From mboxrd@z Thu Jan 1 00:00:00 1970 Return-path: Received: from cantor2.suse.de ([195.135.220.15]:55905 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751895AbbEDPOF (ORCPT ); Mon, 4 May 2015 11:14:05 -0400 Date: Mon, 4 May 2015 17:14:01 +0200 From: Jan Kara To: Mel Gorman Cc: Jan Kara , linux-media@vger.kernel.org, Hans Verkuil , Mauro Carvalho Chehab , linux-mm@kvack.org, dri-devel@lists.freedesktop.org, David Airlie Subject: Re: [PATCH 2/9] mm: Provide new get_vaddr_pfns() helper Message-ID: <20150504151401.GC5805@quack.suse.cz> References: <1426593399-6549-1-git-send-email-jack@suse.cz> <1426593399-6549-3-git-send-email-jack@suse.cz> <20150430155531.GY2449@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150430155531.GY2449@suse.de> Sender: linux-media-owner@vger.kernel.org List-ID: On Thu 30-04-15 16:55:31, Mel Gorman wrote: > On Tue, Mar 17, 2015 at 12:56:32PM +0100, Jan Kara wrote: > > Provide new function get_vaddr_pfns(). This function maps virtual > > addresses from given start and fills given array with page frame numbers of > > the corresponding pages. If given start belongs to a normal vma, the function > > grabs reference to each of the pages to pin them in memory. If start > > belongs to VM_IO | VM_PFNMAP vma, we don't touch page structures. Caller > > should make sure pfns aren't reused for anything else while he is using > > them. > > > > This function is created for various drivers to simplify handling of > > their buffers. > > > > Signed-off-by: Jan Kara > > --- > > include/linux/mm.h | 38 +++++++++++ > > mm/gup.c | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 218 insertions(+) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 47a93928b90f..a5045df92454 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -1279,6 +1279,44 @@ long get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm, > > int write, int force, struct page **pages); > > int get_user_pages_fast(unsigned long start, int nr_pages, int write, > > struct page **pages); > > + > > +/* Container for pinned pfns / pages */ > > +struct pinned_pfns { > > + unsigned int nr_allocated_pfns; /* Number of pfns we have space for */ > > + unsigned int nr_pfns; /* Number of pfns stored in pfns array */ > > + unsigned int got_ref:1; /* Did we pin pfns by getting page ref? */ > > + unsigned int is_pages:1; /* Does array contain pages or pfns? */ > > The bit field is probably overkill as I expect it'll get padded out for > pointer alignment anyway. Just use bools. Makes sense. > is_pfns is less ambiguous than is_pages but not very important. > > The naming is not great in general. Only struct pages are pinned in the > traditional meaning of the word. The raw PFNs are not so there is no such > thing as a "pinned pfns". It might be better just to call it frame_vectors > and document that it's either raw PFNs that the caller should be responsible > for or struct pages that are pinned. Good point. I'll try to come up with a better name. - I agree with minor comments there. > > /** > > + * get_vaddr_pfns() - map virtual addresses to pfns > > + * @start: starting user address > > + * @nr_pfns: number of pfns from start to map > > + * @write: whether pages will be written to by the caller > > + * @force: whether to force write access even if user mapping is > > + * readonly. This will result in the page being COWed even > > + * in MAP_SHARED mappings. You do not want this. > > + * @pfns: structure which receives pfns of the pages mapped. > > + * It should have space for at least nr_pfns pfns. > > + * > > + * This function maps virtual addresses from @start and fills @pfns structure > > + * with page frame numbers of corresponding pages. If @start belongs to a > > + * normal vma, the function grabs reference to each of the pages to pin them in > > + * memory. If @start belongs to VM_IO | VM_PFNMAP vma, we don't touch page > > + * structures. Caller should make sure pfns aren't reused for anything else > > + * while he is using them. > > + * > > + * This function takes care of grabbing mmap_sem as necessary. > > + */ > > +int get_vaddr_pfns(unsigned long start, int nr_pfns, int write, int force, > > + struct pinned_pfns *pfns) > > +{ > > + struct mm_struct *mm = current->mm; > > + struct vm_area_struct *vma; > > + int ret = 0; > > + int err; > > + > > + if (nr_pfns <= 0) > > + return 0; > > + > > I know I suggested that nr_pfns should be unsigned earlier and then I > saw this. Is there any valid use of the API that would pass in a > negative number here? No. It was there just because I had ints everywhere without too much thought (it's shorter to type *grin*). I'll put unsigned types where it makes sense and add a test for INT_MAX so that get_vaddr_pfns() can still return negative errors. > > + if (nr_pfns > pfns->nr_allocated_pfns) > > + nr_pfns = pfns->nr_allocated_pfns; > > + > > Should this be a WARN_ON_ONCE? You recover from it obviously but the return > value is not documented to say that it could return less than requested. > Of course, this is implied but a caller might assume it's due to a transient > error instead of broken API usage. Yep, good idea. > > + down_read(&mm->mmap_sem); > > + vma = find_vma_intersection(mm, start, start + 1); > > + if (!vma) { > > + ret = -EFAULT; > > + goto out; > > + } > > Returning -EFAULT means that the return value has to be signed but the > structure itself has unsigned int for counters so the maximum number of > PFNs supported is not available. We'd never want that number of PFNs pinned > but still it might make sense to enforce the maximum possible size of the > structure that takes into account the values needed for returning errors. Yes. See above. > > + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) { > > + pfns->got_ref = 1; > > + pfns->is_pages = 1; > > + ret = get_user_pages(current, mm, start, nr_pfns, write, force, > > + pfns_vector_pages(pfns), NULL); > > + goto out; > > + } > > + > > + pfns->got_ref = 0; > > + pfns->is_pages = 0; > > + do { > > + unsigned long *nums = pfns_vector_pfns(pfns); > > + > > + while (ret < nr_pfns && start + PAGE_SIZE <= vma->vm_end) { > > + err = follow_pfn(vma, start, &nums[ret]); > > + if (err) { > > + if (ret == 0) > > + ret = err; > > + goto out; > > + } > > + start += PAGE_SIZE; > > + ret++; > > + } > > + /* > > + * We stop if we have enough pages or if VMA doesn't completely > > + * cover the tail page. > > + */ > > + if (ret >= nr_pfns || start < vma->vm_end) > > + break; > > + vma = find_vma_intersection(mm, start, start + 1); > > + } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); > > Documentation probably should mention that a range that strides VMA types > will return partial results. Good point. Will add that. > > +out: > > + up_read(&mm->mmap_sem); > > + if (!ret) > > + ret = -EFAULT; > > Really? Maybe we just failed to call get_user_pages on any of them. The > meaning of -EFAULT probably needs a comment. So the only real case where we could have 0 here is when get_user_pages() returns 0 for whatever reason. It seemed better to me to translate 0 to -EFAULT since all the drivers I saw did that anyway (at least those that didn't forget about checking for 0). It just seems as a less error-prone interface to return error or number > 0. I agree that -EFAULT doesn't necessarily always make sense so if you have idea for a better error I'm open to suggestions. > > + if (ret > 0) > > + pfns->nr_pfns = ret; > > + return ret; > > +} > > +EXPORT_SYMBOL(get_vaddr_pfns); > > + > > +/** > > + * put_vaddr_pfns() - drop references to pages if get_vaddr_pfns() acquired them > > + * @pfns: structure with pfns we pinned > > + * > > + * Drop references to pages if get_vaddr_pfns() acquired them. We also > > + * invalidate the array of pfns so that it is prepared for the next call into > > + * get_vaddr_pfns(). > > + */ > > +void put_vaddr_pfns(struct pinned_pfns *pfns) > > +{ > > + int i; > > + > > + if (!pfns->got_ref) > > + goto out; > > + if (pfns->is_pages) { > > + struct page **pages = pfns_vector_pages(pfns); > > + > > + for (i = 0; i < pfns->nr_pfns; i++) > > + put_page(pages[i]); > > Ok. > > > + } else { > > + unsigned long *nums = pfns_vector_pfns(pfns); > > + > > + for (i = 0; i < pfns->nr_pfns; i++) > > + put_page(pfn_to_page(nums[i])); > > + } > > I'm missing something here. The !is_pages branch in get_vaddr_pfns() > used follow_pfn() and it does not take a reference on the struct page so > I'm not sure what this is putting exactly. So this case should have covered the situation when we have a vector of normal pages (thus got_ref == 1) and convert that to a vector of pfns. Noone currently does that and that function for conversion even doesn't seem to exist but it seems to me at least omap driver may need it and the conversion I did has a bug. Anyway, I'll recheck the situation and either drop this or make the function more comprehensible. > > + pfns->got_ref = 0; > > +out: > > + pfns->nr_pfns = 0; > > +} > > +EXPORT_SYMBOL(put_vaddr_pfns); > > + > > EXPORT_SYMBOL_GPL? I've just followed what __get_user_pages() does... > > +/** > > + * pfns_vector_to_pages - convert vector of pfns to vector of page pointers > > + * @pfns: structure with pinned pfns > > + * > > + * Convert @pfns to not contain array of pfns but array of page pointers. > > + * If the conversion is successful, return 0. Otherwise return an error. > > + */ > > +int pfns_vector_to_pages(struct pinned_pfns *pfns) > > +{ > > + int i; > > + unsigned long *nums; > > + struct page **pages; > > + > > + if (pfns->is_pages) > > + return 0; > > + nums = pfns_vector_pfns(pfns); > > + for (i = 0; i < pfns->nr_pfns; i++) > > + if (!pfn_valid(nums[i])) > > + return -EINVAL; > > + pages = (struct page **)nums; > > + for (i = 0; i < pfns->nr_pfns; i++) > > + pages[i] = pfn_to_page(nums[i]); > > + pfns->is_pages = 1; > > + return 0; > > +} > > +EXPORT_SYMBOL(pfns_vector_to_pages); > > + > > +/** > > + * pfns_vector_create() - allocate & initialize structure for pinned pfns > > + * @nr_pfns: number of pfns slots we should reserve > > + * > > + * Allocate and initialize struct pinned_pfns to be able to hold @nr_pfns > > + * pfns. > > + */ > > +struct pinned_pfns *pfns_vector_create(int nr_pfns) > > +{ > > + struct pinned_pfns *pfns; > > + int size = sizeof(struct pinned_pfns) + sizeof(unsigned long) * nr_pfns; > > + > > Should be sizeof(void *) OK. > > + if (WARN_ON_ONCE(nr_pfns <= 0)) > > + return NULL; > > + if (size <= PAGE_SIZE) > > + pfns = kmalloc(size, GFP_KERNEL); > > kmalloc supports larger sizes than this but I suspect this was > deliberate to avoid high-order allocations. Exactly. Thanks a lot for review Mel! Honza -- Jan Kara SUSE Labs, CR From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f169.google.com (mail-wi0-f169.google.com [209.85.212.169]) by kanga.kvack.org (Postfix) with ESMTP id 9EDC46B0038 for ; Mon, 4 May 2015 11:14:05 -0400 (EDT) Received: by widdi4 with SMTP id di4so125338516wid.0 for ; Mon, 04 May 2015 08:14:05 -0700 (PDT) Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id ho2si23145169wjb.162.2015.05.04.08.14.03 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 04 May 2015 08:14:03 -0700 (PDT) Date: Mon, 4 May 2015 17:14:01 +0200 From: Jan Kara Subject: Re: [PATCH 2/9] mm: Provide new get_vaddr_pfns() helper Message-ID: <20150504151401.GC5805@quack.suse.cz> References: <1426593399-6549-1-git-send-email-jack@suse.cz> <1426593399-6549-3-git-send-email-jack@suse.cz> <20150430155531.GY2449@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150430155531.GY2449@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Mel Gorman Cc: Jan Kara , linux-media@vger.kernel.org, Hans Verkuil , Mauro Carvalho Chehab , linux-mm@kvack.org, dri-devel@lists.freedesktop.org, David Airlie On Thu 30-04-15 16:55:31, Mel Gorman wrote: > On Tue, Mar 17, 2015 at 12:56:32PM +0100, Jan Kara wrote: > > Provide new function get_vaddr_pfns(). This function maps virtual > > addresses from given start and fills given array with page frame numbers of > > the corresponding pages. If given start belongs to a normal vma, the function > > grabs reference to each of the pages to pin them in memory. If start > > belongs to VM_IO | VM_PFNMAP vma, we don't touch page structures. Caller > > should make sure pfns aren't reused for anything else while he is using > > them. > > > > This function is created for various drivers to simplify handling of > > their buffers. > > > > Signed-off-by: Jan Kara > > --- > > include/linux/mm.h | 38 +++++++++++ > > mm/gup.c | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 218 insertions(+) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 47a93928b90f..a5045df92454 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -1279,6 +1279,44 @@ long get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm, > > int write, int force, struct page **pages); > > int get_user_pages_fast(unsigned long start, int nr_pages, int write, > > struct page **pages); > > + > > +/* Container for pinned pfns / pages */ > > +struct pinned_pfns { > > + unsigned int nr_allocated_pfns; /* Number of pfns we have space for */ > > + unsigned int nr_pfns; /* Number of pfns stored in pfns array */ > > + unsigned int got_ref:1; /* Did we pin pfns by getting page ref? */ > > + unsigned int is_pages:1; /* Does array contain pages or pfns? */ > > The bit field is probably overkill as I expect it'll get padded out for > pointer alignment anyway. Just use bools. Makes sense. > is_pfns is less ambiguous than is_pages but not very important. > > The naming is not great in general. Only struct pages are pinned in the > traditional meaning of the word. The raw PFNs are not so there is no such > thing as a "pinned pfns". It might be better just to call it frame_vectors > and document that it's either raw PFNs that the caller should be responsible > for or struct pages that are pinned. Good point. I'll try to come up with a better name. - I agree with minor comments there. > > /** > > + * get_vaddr_pfns() - map virtual addresses to pfns > > + * @start: starting user address > > + * @nr_pfns: number of pfns from start to map > > + * @write: whether pages will be written to by the caller > > + * @force: whether to force write access even if user mapping is > > + * readonly. This will result in the page being COWed even > > + * in MAP_SHARED mappings. You do not want this. > > + * @pfns: structure which receives pfns of the pages mapped. > > + * It should have space for at least nr_pfns pfns. > > + * > > + * This function maps virtual addresses from @start and fills @pfns structure > > + * with page frame numbers of corresponding pages. If @start belongs to a > > + * normal vma, the function grabs reference to each of the pages to pin them in > > + * memory. If @start belongs to VM_IO | VM_PFNMAP vma, we don't touch page > > + * structures. Caller should make sure pfns aren't reused for anything else > > + * while he is using them. > > + * > > + * This function takes care of grabbing mmap_sem as necessary. > > + */ > > +int get_vaddr_pfns(unsigned long start, int nr_pfns, int write, int force, > > + struct pinned_pfns *pfns) > > +{ > > + struct mm_struct *mm = current->mm; > > + struct vm_area_struct *vma; > > + int ret = 0; > > + int err; > > + > > + if (nr_pfns <= 0) > > + return 0; > > + > > I know I suggested that nr_pfns should be unsigned earlier and then I > saw this. Is there any valid use of the API that would pass in a > negative number here? No. It was there just because I had ints everywhere without too much thought (it's shorter to type *grin*). I'll put unsigned types where it makes sense and add a test for INT_MAX so that get_vaddr_pfns() can still return negative errors. > > + if (nr_pfns > pfns->nr_allocated_pfns) > > + nr_pfns = pfns->nr_allocated_pfns; > > + > > Should this be a WARN_ON_ONCE? You recover from it obviously but the return > value is not documented to say that it could return less than requested. > Of course, this is implied but a caller might assume it's due to a transient > error instead of broken API usage. Yep, good idea. > > + down_read(&mm->mmap_sem); > > + vma = find_vma_intersection(mm, start, start + 1); > > + if (!vma) { > > + ret = -EFAULT; > > + goto out; > > + } > > Returning -EFAULT means that the return value has to be signed but the > structure itself has unsigned int for counters so the maximum number of > PFNs supported is not available. We'd never want that number of PFNs pinned > but still it might make sense to enforce the maximum possible size of the > structure that takes into account the values needed for returning errors. Yes. See above. > > + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) { > > + pfns->got_ref = 1; > > + pfns->is_pages = 1; > > + ret = get_user_pages(current, mm, start, nr_pfns, write, force, > > + pfns_vector_pages(pfns), NULL); > > + goto out; > > + } > > + > > + pfns->got_ref = 0; > > + pfns->is_pages = 0; > > + do { > > + unsigned long *nums = pfns_vector_pfns(pfns); > > + > > + while (ret < nr_pfns && start + PAGE_SIZE <= vma->vm_end) { > > + err = follow_pfn(vma, start, &nums[ret]); > > + if (err) { > > + if (ret == 0) > > + ret = err; > > + goto out; > > + } > > + start += PAGE_SIZE; > > + ret++; > > + } > > + /* > > + * We stop if we have enough pages or if VMA doesn't completely > > + * cover the tail page. > > + */ > > + if (ret >= nr_pfns || start < vma->vm_end) > > + break; > > + vma = find_vma_intersection(mm, start, start + 1); > > + } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); > > Documentation probably should mention that a range that strides VMA types > will return partial results. Good point. Will add that. > > +out: > > + up_read(&mm->mmap_sem); > > + if (!ret) > > + ret = -EFAULT; > > Really? Maybe we just failed to call get_user_pages on any of them. The > meaning of -EFAULT probably needs a comment. So the only real case where we could have 0 here is when get_user_pages() returns 0 for whatever reason. It seemed better to me to translate 0 to -EFAULT since all the drivers I saw did that anyway (at least those that didn't forget about checking for 0). It just seems as a less error-prone interface to return error or number > 0. I agree that -EFAULT doesn't necessarily always make sense so if you have idea for a better error I'm open to suggestions. > > + if (ret > 0) > > + pfns->nr_pfns = ret; > > + return ret; > > +} > > +EXPORT_SYMBOL(get_vaddr_pfns); > > + > > +/** > > + * put_vaddr_pfns() - drop references to pages if get_vaddr_pfns() acquired them > > + * @pfns: structure with pfns we pinned > > + * > > + * Drop references to pages if get_vaddr_pfns() acquired them. We also > > + * invalidate the array of pfns so that it is prepared for the next call into > > + * get_vaddr_pfns(). > > + */ > > +void put_vaddr_pfns(struct pinned_pfns *pfns) > > +{ > > + int i; > > + > > + if (!pfns->got_ref) > > + goto out; > > + if (pfns->is_pages) { > > + struct page **pages = pfns_vector_pages(pfns); > > + > > + for (i = 0; i < pfns->nr_pfns; i++) > > + put_page(pages[i]); > > Ok. > > > + } else { > > + unsigned long *nums = pfns_vector_pfns(pfns); > > + > > + for (i = 0; i < pfns->nr_pfns; i++) > > + put_page(pfn_to_page(nums[i])); > > + } > > I'm missing something here. The !is_pages branch in get_vaddr_pfns() > used follow_pfn() and it does not take a reference on the struct page so > I'm not sure what this is putting exactly. So this case should have covered the situation when we have a vector of normal pages (thus got_ref == 1) and convert that to a vector of pfns. Noone currently does that and that function for conversion even doesn't seem to exist but it seems to me at least omap driver may need it and the conversion I did has a bug. Anyway, I'll recheck the situation and either drop this or make the function more comprehensible. > > + pfns->got_ref = 0; > > +out: > > + pfns->nr_pfns = 0; > > +} > > +EXPORT_SYMBOL(put_vaddr_pfns); > > + > > EXPORT_SYMBOL_GPL? I've just followed what __get_user_pages() does... > > +/** > > + * pfns_vector_to_pages - convert vector of pfns to vector of page pointers > > + * @pfns: structure with pinned pfns > > + * > > + * Convert @pfns to not contain array of pfns but array of page pointers. > > + * If the conversion is successful, return 0. Otherwise return an error. > > + */ > > +int pfns_vector_to_pages(struct pinned_pfns *pfns) > > +{ > > + int i; > > + unsigned long *nums; > > + struct page **pages; > > + > > + if (pfns->is_pages) > > + return 0; > > + nums = pfns_vector_pfns(pfns); > > + for (i = 0; i < pfns->nr_pfns; i++) > > + if (!pfn_valid(nums[i])) > > + return -EINVAL; > > + pages = (struct page **)nums; > > + for (i = 0; i < pfns->nr_pfns; i++) > > + pages[i] = pfn_to_page(nums[i]); > > + pfns->is_pages = 1; > > + return 0; > > +} > > +EXPORT_SYMBOL(pfns_vector_to_pages); > > + > > +/** > > + * pfns_vector_create() - allocate & initialize structure for pinned pfns > > + * @nr_pfns: number of pfns slots we should reserve > > + * > > + * Allocate and initialize struct pinned_pfns to be able to hold @nr_pfns > > + * pfns. > > + */ > > +struct pinned_pfns *pfns_vector_create(int nr_pfns) > > +{ > > + struct pinned_pfns *pfns; > > + int size = sizeof(struct pinned_pfns) + sizeof(unsigned long) * nr_pfns; > > + > > Should be sizeof(void *) OK. > > + if (WARN_ON_ONCE(nr_pfns <= 0)) > > + return NULL; > > + if (size <= PAGE_SIZE) > > + pfns = kmalloc(size, GFP_KERNEL); > > kmalloc supports larger sizes than this but I suspect this was > deliberate to avoid high-order allocations. Exactly. Thanks a lot for review Mel! Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org