From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kani, Toshimitsu"
Subject: Re: [PATCH 0/6] Support DAX for device-mapper dm-linear devices
Date: Wed, 22 Jun 2016 17:44:42 +0000
Message-ID: <1466616868.3504.320.camel@hpe.com>
References: <20160613225756.GA18417@redhat.com>
	<20160620180043.GA21261@redhat.com>
	<1466446861.3504.243.camel@hpe.com>
	<20160620194026.GA21657@redhat.com>
	<20160620195217.GB21657@redhat.com>
	<1466452883.3504.244.camel@hpe.com>
	<1466457467.3504.249.camel@hpe.com>
	<20160620222236.GA22461@redhat.com>
	<20160621134147.GA26392@redhat.com>
	<1466523280.3504.262.camel@hpe.com>
	<20160621181728.GA27821@redhat.com>
In-Reply-To: <20160621181728.GA27821@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Sender: dm-devel-bounces@redhat.com
To: "snitzer@redhat.com"
Cc: "axboe@kernel.dk", "sandeen@redhat.com", "axboe@fb.com",
	"linux-nvdimm@ml01.01.org", "linux-kernel@vger.kernel.org",
	"linux-raid@vger.kernel.org", "dm-devel@redhat.com",
	"viro@zeniv.linux.org.uk", "dan.j.williams@intel.com",
	"ross.zwisler@linux.intel.com", "agk@redhat.com"
List-Id: dm-devel.ids

On Tue, 2016-06-21 at 14:17 -0400, Mike Snitzer wrote:
> On Tue, Jun 21 2016 at 11:44am -0400,
> Kani, Toshimitsu wrote:
> >
> > On Tue, 2016-06-21 at 09:41 -0400, Mike Snitzer wrote:
> > >
> > > On Mon, Jun 20 2016 at  6:22pm -0400,
> > > Mike Snitzer wrote:
 :
> > > I'm now wondering if we'd be better off setting a new QUEUE_FLAG_DAX
> > > rather than establish GENHD_FL_DAX on the genhd?
> > >
> > > It'd be quite a bit easier to allow upper layers (e.g. XFS and ext4) to
> > > check for a queue flag.
> >
> > I think GENHD_FL_DAX is more appropriate since DAX does not use a request
> > queue, except for protecting the underlying device from being disabled
> > while direct_access() is called (b2e0d1625e19).
>
> The devices in question have a request_queue.  All bio-based devices have
> a request_queue.

DAX-capable devices have two operation modes, bio-based and DAX.  I agree that
bio-based operation is associated with a request queue, and its capabilities
should be set on it.  DAX, on the other hand, is rather independent of a
request queue.

> I don't have a big problem with GENHD_FL_DAX.  Just wanted to point out
> that such block device capabilities are generally advertised in terms of
> a QUEUE_FLAG.

I do not have a strong opinion, but it feels a bit odd to associate DAX with a
request queue.

> > About protecting direct_access, this patch assumes that the underlying
> > device cannot be disabled until dtr() is called.  Is this correct?  If
> > not, I will need to call dax_map_atomic().
>
> One of the big design considerations for DM is that a DM device can be
> suspended (with or without flush) and any new IO will be blocked until
> the DM device is resumed.
>
> So ideally DM should be able to have the same capability even if using
> DAX.

Supporting suspend for DAX is challenging since it allows user applications to
access a device directly.  Once a device range is mmap'd, there is no kernel
intervention to access the range, unless we invalidate user mappings.  This
isn't done today even after a driver is unbound from a device.

> But that is different than what commit b2e0d1625e19 is addressing.  For
> DM, I wouldn't think you'd need the extra protections that
> dax_map_atomic() is providing given that the underlying block device
> lifetime is managed via DM core's dm_get_device/dm_put_device (see also:
> dm.c:open_table_device/close_table_device).

I thought so as well.  But I realized that there is (almost) nothing that can
prevent the unbind operation.  It cannot fail, either.  The unbind proceeds
even when a device is in use.  In the case of a pmem device, it is only
protected by pmem_release_queue(), which is called when a pmem device is being
deleted and calls blk_cleanup_queue() to serialize against a critical section
between blk_queue_enter() and blk_queue_exit() per b2e0d1625e19.  This
prevents a kernel DTLB fault, but does not prevent a device from disappearing
while in use.

Protecting DM's underlying device with blk_queue_enter() (or something
similar) requires more thought...  blk_queue_enter() on a DM device cannot be
redirected to its underlying device.  So, this is TBD for now.  But I do not
think this is a blocker, since unbinding an underlying device is quite harmful
no matter what we do - even if it is protected with blk_queue_enter().

Thanks,
-Toshi
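
For illustration only, the enter/exit protection described above can be sketched as a toy user-space model.  This is not kernel code: ToyQueue, its methods, and direct_access() below are hypothetical names that merely mirror the roles of blk_queue_enter(), blk_queue_exit(), and blk_cleanup_queue() as described in this thread, and the real kernel implementation differs.

```python
import threading


class ToyQueue:
    """Hypothetical stand-in for a request queue's usage tracking."""

    def __init__(self):
        self._cond = threading.Condition()
        self._users = 0
        self._dying = False

    def enter(self):
        """Role of blk_queue_enter(): refuse new users once teardown began."""
        with self._cond:
            if self._dying:
                return False
            self._users += 1
            return True

    def exit(self):
        """Role of blk_queue_exit(): drop the reference, wake any cleaner."""
        with self._cond:
            self._users -= 1
            self._cond.notify_all()

    def cleanup(self, timeout=None):
        """Role of blk_cleanup_queue(): mark dying, wait for users to drain.

        Returns True once no user is inside the critical section,
        or False if the timeout expired first.
        """
        with self._cond:
            self._dying = True
            return self._cond.wait_for(lambda: self._users == 0, timeout)


def direct_access(queue, device_data):
    """Touch the device only inside the enter/exit critical section."""
    if not queue.enter():
        return None  # device is going away; the caller must handle this
    try:
        return device_data
    finally:
        queue.exit()


if __name__ == "__main__":
    q = ToyQueue()
    print("access before teardown:", direct_access(q, b"pmem"))
    print("teardown drained:", q.cleanup(timeout=0.01))
    print("access after teardown:", direct_access(q, b"pmem"))
```

The model only covers accesses that pass through the kernel.  A range that user space has already mmap'd never calls enter(), which is why this protection avoids the kernel fault but cannot keep the device from disappearing under a running application.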