From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx2.suse.de ([195.135.220.15]:44668 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750710AbcAKHPg (ORCPT ); Mon, 11 Jan 2016 02:15:36 -0500
Subject: Re: [resend PATCH 1/3] block, fs: reliably communicate bdev end-of-life
To: Dan Williams, Al Viro
References: <20160104181220.24118.96661.stgit@dwillia2-desk3.amr.corp.intel.com> <20160104182005.24118.50361.stgit@dwillia2-desk3.amr.corp.intel.com> <20160109075414.GA5008@ZenIV.linux.org.uk>
Cc: XFS Developers, linux-block@vger.kernel.org, linux-nvdimm, Dave Chinner, Jens Axboe, Jan Kara, linux-fsdevel, Matthew Wilcox, Ross Zwisler
From: Hannes Reinecke
Message-ID: <56935694.1000408@suse.de>
Date: Mon, 11 Jan 2016 08:15:32 +0100
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 01/09/2016 03:17 PM, Dan Williams wrote:
> On Fri, Jan 8, 2016 at 11:54 PM, Al Viro wrote:
>> On Mon, Jan 04, 2016 at 10:20:05AM -0800, Dan Williams wrote:
> [..]
>> Would you mind explaining what the hell is _the_ backing device
>> of a filesystem? What does that translate into in case of e.g. btrfs
>> spanning several disks? Or ext4 with journal on a different device, for
>> that matter?
>>
>> If anything, I would argue that filesystem is out of place here -
>> general situation is "IO on X may require IO on device Y and X needs to do
>> something when Y goes away". Consider e.g. /dev/loop backed by a device
>> that went away. Or by a file on fs that has run down the curtain and joined
>> the bleedin choir invisible. With another fs partially hosted by that
>> loopback device. Or by RAID0 containing said device.
>>
>> You are given Y and attempt to locate the affected X. _Then_
>> you assume that X is a filesystem and has "something to be done" independent
>> from the role Y played for it, so you can pick that action from superblock
>> method.
>>
>> IMO you are placing the burden in the wrong place. _Recipient_
>> knows what it depends upon and what should be done for each source of
>> trouble. So make it recipient's responsibility to request notifications.
>> At which point the superblock method goes away, along with the requirement
>> to handle all sources of trouble the same way, etc.
>>
>> What's more, things like RAID5 (also interested in knowing when
>> a component has been ripped out) might or might not decide to propagate
>> the event further - after all, that's exactly the point of redundancy.
>>
>> I'd look into something along the lines of notifier chain per
>> gendisk, with potential victims registering a callback when they decide
>> that from now on such and such device might screw them over...
>
> Makes sense. I'll drop this series for now and come back after
> re-working it to use notifiers.

Yes please. I need a similar thing for communicating device changes
(resizing, topology changes), so I'd be very much interested in them.

And while you're at it, maybe we can fold the block device event
handling into that, too.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@suse.de                          +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
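
For illustration only, here is a rough userspace sketch of the "notifier chain per gendisk" idea from the quoted text: each disk carries its own list of callbacks, and whoever depends on that disk registers one. This is not the kernel's notifier API and not code from the patch series. Every name in it (struct disk, struct subscriber, disk_register_notifier(), disk_notify(), fs_event(), raid5_event(), and the DISK_GONE / DISK_RESIZED events) is hypothetical and exists only to show the shape of the design.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum disk_event { DISK_GONE, DISK_RESIZED };

/* Something that depends on one or more disks (a filesystem, an array...). */
struct subscriber {
    const char *name;
    int lost;                 /* how many of its disks have gone away */
};

/* One registration: "tell <who> when something happens to this disk". */
struct disk_notifier {
    /* Returns nonzero if the subscriber could not absorb the event. */
    int (*call)(struct subscriber *who, enum disk_event ev, const char *disk);
    struct subscriber *who;
    struct disk_notifier *next;
};

/* Stand-in for a gendisk: a name plus its private notifier chain. */
struct disk {
    const char *name;
    struct disk_notifier *chain;
};

/* The recipient asks for notifications; the disk knows nothing about it. */
static void disk_register_notifier(struct disk *d, struct disk_notifier *n)
{
    assert(n->call != NULL);
    n->next = d->chain;       /* push onto the head of this disk's chain */
    d->chain = n;
}

/* Walk the chain and count subscribers that could not absorb the event. */
static int disk_notify(struct disk *d, enum disk_event ev)
{
    int hurt = 0;

    for (struct disk_notifier *n = d->chain; n != NULL; n = n->next)
        if (n->call(n->who, ev, d->name) != 0)
            hurt++;
    return hurt;
}

/* A filesystem-like subscriber: losing the disk is fatal for it. */
static int fs_event(struct subscriber *who, enum disk_event ev, const char *disk)
{
    if (ev != DISK_GONE)
        return 0;
    who->lost++;
    printf("%s: %s is gone, shutting down\n", who->name, disk);
    return 1;
}

/* A RAID5-like subscriber: the first loss is absorbed, a second is not. */
static int raid5_event(struct subscriber *who, enum disk_event ev, const char *disk)
{
    if (ev != DISK_GONE)
        return 0;
    who->lost++;
    printf("%s: lost %s (%d member(s) missing)\n", who->name, disk, who->lost);
    return who->lost > 1;
}
```

In this sketch each subscriber picks its own reaction per disk, so there is no single superblock method that has to treat every source of trouble the same way. A redundant subscriber can swallow the first DISK_GONE and only report trouble on the second. The DISK_RESIZED event is an assumption standing in for the "device changes" mentioned above; both example callbacks simply ignore it.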