From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mx1.redhat.com ([209.132.183.28]:42058 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730380AbfAINvC
 (ORCPT ); Wed, 9 Jan 2019 08:51:02 -0500
From: Pankaj Gupta
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
 qemu-devel@nongnu.org, linux-nvdimm@ml01.01.org,
 linux-fsdevel@vger.kernel.org,
 virtualization@lists.linux-foundation.org, linux-acpi@vger.kernel.org,
 linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: jack@suse.cz, stefanha@redhat.com, dan.j.williams@intel.com,
 riel@surriel.com, nilal@redhat.com, kwolf@redhat.com,
 pbonzini@redhat.com, zwisler@kernel.org, vishal.l.verma@intel.com,
 dave.jiang@intel.com, david@redhat.com, jmoyer@redhat.com,
 xiaoguangrong.eric@gmail.com, hch@infradead.org, mst@redhat.com,
 jasowang@redhat.com, lcapitulino@redhat.com, imammedo@redhat.com,
 eblake@redhat.com, willy@infradead.org, tytso@mit.edu,
 adilger.kernel@dilger.ca, darrick.wong@oracle.com, rjw@rjwysocki.net,
 pagupta@redhat.com
Subject: [PATCH v3 0/5] kvm "virtio pmem" device
Date: Wed, 9 Jan 2019 19:20:19 +0530
Message-Id: <20190109135024.14093-1-pagupta@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-ext4-owner@vger.kernel.org

 This patch series implements "virtio pmem".
 "virtio pmem" is fake persistent memory (nvdimm) in the guest,
 which allows the guest page cache to be bypassed. The series also
 implements a VIRTIO based asynchronous flush mechanism.

 Sharing the guest kernel driver in this patchset with the
 changes suggested in v2. Tested with the Qemu side device
 emulation for virtio-pmem [6].

 Details of the project idea for the 'virtio pmem' flushing
 interface are shared in [3] & [4].

 Implementation is divided into two parts:
 a new virtio pmem guest driver, and qemu code changes for the new
 virtio pmem paravirtualized device.

1. Guest virtio-pmem kernel driver
---------------------------------
   - Reads the persistent memory range from the paravirt device and
     registers it with 'nvdimm_bus'.
   - The 'nvdimm/pmem' driver uses this information to allocate a
     persistent memory region and set up filesystem operations
     on the allocated memory.
   - The virtio pmem driver implements an asynchronous flushing
     interface to flush from guest to host.

2. Qemu virtio-pmem device
---------------------------------
   - Creates the virtio pmem device and exposes a memory range to
     the KVM guest.
   - On the host side this is file backed memory which acts as
     persistent memory.
   - The Qemu side flush uses the aio thread pool APIs and virtio
     for asynchronous guest multi request handling.

   David Hildenbrand (CCed) has also posted a modified version [6] of
   the qemu virtio-pmem code, based on the updated Qemu memory device API.

 Virtio-pmem error handling:
 ----------------------------------------
  Checked the behaviour of virtio-pmem for the types of errors below.
  Suggestions on the expected behaviour for handling these errors
  are welcome.

  - Hardware Errors: Uncorrectable recoverable Errors:
  a] virtio-pmem:
    - As per current logic, if the error page belongs to the Qemu process,
      the host MCE handler isolates (hwpoison) that page and sends SIGBUS.
      The Qemu SIGBUS handler injects an exception into the KVM guest.
    - The KVM guest then isolates the page and sends SIGBUS to the guest
      userspace process which has mapped the page.

  b] Existing implementation for ACPI pmem driver:
    - Handles such errors with an MCE notifier and creates a list
      of bad blocks. Read/direct access DAX operations return EIO
      if the accessed memory page falls in the bad block list.
    - It also starts background scrubbing.
    - Can similar functionality be reused in virtio-pmem with an MCE
      notifier but without scrubbing (no ACPI/ARS)? Inputs are needed
      to confirm whether this behaviour is ok or needs any change.

Changes from PATCH v2: [1]
- Disable MAP_SYNC for ext4 & XFS filesystems - [Dan]
- Use name 'virtio pmem' in place of 'fake dax'

Changes from PATCH v1: [2]
- 0-day build test for build dependency on libnvdimm

 Changes suggested by - [Dan Williams]
- Split the driver into two parts virtio & pmem
- Move queuing of async block request to block layer
- Add "sync" parameter in nvdimm_flush function
- Use indirect call for nvdimm_flush
- Don’t move declarations to common global header e.g. nd.h
- nvdimm_flush() returns 0, or -EIO if it fails
- Teach nsio_rw_bytes() that the flush can fail
- Rename nvdimm_flush() to generic_nvdimm_flush()
- Use 'nd_region->provider_data' for long dereferencing
- Remove virtio_pmem_freeze/restore functions
- Replace BSD license text with SPDX license text

- Add might_sleep() in virtio_pmem_flush - [Luiz]
- Make spin_lock_irqsave() narrow

Changes from RFC v3:
- Rebase to latest upstream - Luiz
- Call ndregion->flush in place of nvdimm_flush - Luiz
- kmalloc return check - Luiz
- virtqueue full handling - Stefan
- Don't map entire virtio_pmem_req to device - Stefan
- request leak, correct sizeof req - Stefan
- Move declaration to virtio_pmem.c

Changes from RFC v2:
- Add flush function in the nd_region in place of switching
  on a flag - Dan & Stefan
- Add flush completion function with proper locking and wait
  for host side flush completion - Stefan & Dan
- Keep userspace API in uapi header file - Stefan, MST
- Use LE fields & New device id - MST
- Indentation & spacing suggestions - MST & Eric
- Remove extra header files & add licensing - Stefan

Changes from RFC v1:
- Reuse existing 'pmem' code for registering persistent
  memory and other operations instead of creating an entirely
  new block driver.
- Use VIRTIO driver to register memory information with
  nvdimm_bus and create region_type accordingly.
- Call VIRTIO flush from existing pmem driver.

Pankaj Gupta (5):
   libnvdimm: nd_region flush callback support
   virtio-pmem: Add virtio-pmem guest driver
   libnvdimm: add nd_region buffered dax_dev flag
   ext4: disable map_sync for virtio pmem
   xfs: disable map_sync for virtio pmem

[2] https://lkml.org/lkml/2018/8/31/407
[3] https://www.spinics.net/lists/kvm/msg149761.html
[4] https://www.spinics.net/lists/kvm/msg153095.html
[5] https://lkml.org/lkml/2018/8/31/413
[6] https://marc.info/?l=qemu-devel&m=153555721901824&w=2

 drivers/acpi/nfit/core.c         |    4 -
 drivers/dax/super.c              |   17 +++++
 drivers/nvdimm/claim.c           |    6 +
 drivers/nvdimm/nd.h              |    1
 drivers/nvdimm/pmem.c            |   15 +++-
 drivers/nvdimm/region_devs.c     |   45 +++++++++++++-
 drivers/nvdimm/virtio_pmem.c     |   84 ++++++++++++++++++++++++++
 drivers/virtio/Kconfig           |   10 +++
 drivers/virtio/Makefile          |    1
 drivers/virtio/pmem.c            |  125 +++++++++++++++++++++++++++++++++++++++
 fs/ext4/file.c                   |   11 +++
 fs/xfs/xfs_file.c                |    8 ++
 include/linux/dax.h              |    9 ++
 include/linux/libnvdimm.h        |   11 +++
 include/linux/virtio_pmem.h      |   60 ++++++++++++++++++
 include/uapi/linux/virtio_ids.h  |    1
 include/uapi/linux/virtio_pmem.h |   10 +++
 17 files changed, 406 insertions(+), 12 deletions(-)
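
As a reading aid for the flush-related changelog items (per-region flush
function, indirect call, 0 or -EIO return value), here is a minimal
user-space sketch of the dispatch idea. It is not the code from the
patches. The names nd_region, provider_data, nvdimm_flush(),
generic_nvdimm_flush() and virtio_pmem_flush() are taken from the text
above, but the struct layout, the signatures, the NULL-callback fallback
and the whole 'fake_host' helper are assumptions made for illustration.

```c
/* User-space model only; NOT the kernel code from this series. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct nd_region. */
struct nd_region {
	/* Per-region flush hook; NULL selects the generic flush (assumption). */
	int (*flush)(struct nd_region *nd_region);
	/* Opaque provider pointer, e.g. the virtio device state. */
	void *provider_data;
};

/* Hypothetical host model, used only to simulate success or failure. */
struct fake_host {
	int flush_requests;	/* number of flush requests received */
	int fail_next;		/* non-zero: report the next flush as failed */
};

/* Models the generic path; in this sketch it always succeeds. */
int generic_nvdimm_flush(struct nd_region *nd_region)
{
	(void)nd_region;
	return 0;
}

/* Models the virtio path: forward the flush to the host, return its status. */
int virtio_pmem_flush(struct nd_region *nd_region)
{
	struct fake_host *host = nd_region->provider_data;

	assert(host != NULL);
	host->flush_requests++;
	if (host->fail_next) {
		host->fail_next = 0;	/* fail exactly one request */
		return -EIO;
	}
	return 0;
}

/* Indirect call: use the region's own hook if set, else the generic flush. */
int nvdimm_flush(struct nd_region *nd_region)
{
	if (nd_region->flush)
		return nd_region->flush(nd_region) ? -EIO : 0;
	return generic_nvdimm_flush(nd_region);
}
```

In this model a caller only ever sees 0 or -EIO from nvdimm_flush(),
whichever provider backs the region. That is the property a caller such
as nsio_rw_bytes() needs in order to handle a failed flush.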