* EXCHANGE_ID with same network address but different server owner @ 2017-05-12 13:27 Stefan Hajnoczi 2017-05-12 14:34 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-12 13:27 UTC (permalink / raw) To: Chuck Lever; +Cc: Linux NFS Mailing List, J. Bruce Fields, Steve Dickson [-- Attachment #1: Type: text/plain, Size: 1170 bytes --] Hi, I've been working on NFS over the AF_VSOCK transport (https://www.spinics.net/lists/linux-nfs/msg60292.html). AF_VSOCK resets established network connections when the virtual machine is migrated to a new host. The NFS client expects file handles and other state to remain valid upon reconnecting. This is not the case after VM live migration since the new host does not have the NFS server state from the old host. Volatile file handles have been suggested as a way to reflect that state does not persist across reconnect, but the Linux NFS client does not support volatile file handles. I saw NFS 4.1 has a way for a new server running with the same network address of an old server to communicate that it is indeed a new server instance. If the server owner/scope in the EXCHANGE_ID response does not match the previous server's values then the server is a new instance. The implications of encountering a new server owner/scope upon reconnect aren't clear to me and I'm not sure to what extent the Linux implementation handles this case. Can anyone explain what happens if the NFS client finds a new server owner/scope after reconnecting? Thanks, Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
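(For context on the check Stefan is asking about: every NFSv4.1 EXCHANGE_ID reply carries an eir_server_owner and an eir_server_scope (RFC 5661), and a client that reconnects can compare the newly returned values with the ones it recorded at mount time; if they differ, it is talking to a different server instance and none of its previous state exists there. The sketch below is illustrative only -- simplified stand-in structures, not the Linux client's actual code.)

    /*
     * Illustrative only: simplified stand-in structures, not the Linux
     * client's real data types.  A client records the server owner and
     * scope from its first EXCHANGE_ID and, after reconnecting, compares
     * the newly returned values against them.
     */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    struct server_ident {
        size_t major_id_len;
        char   major_id[1024];   /* eir_server_owner.so_major_id */
        size_t scope_len;
        char   scope[1024];      /* eir_server_scope */
    };

    static bool same_server_instance(const struct server_ident *old,
                                     const struct server_ident *cur)
    {
        if (old->scope_len != cur->scope_len ||
            memcmp(old->scope, cur->scope, old->scope_len) != 0)
            return false;   /* different scope: state is not shared */

        if (old->major_id_len != cur->major_id_len ||
            memcmp(old->major_id, cur->major_id, old->major_id_len) != 0)
            return false;   /* different owner: a different server instance */

        return true;
    }

    int main(void)
    {
        struct server_ident before = { 5, "hostA", 5, "hostA" };
        struct server_ident after  = { 5, "hostB", 5, "hostB" };

        if (!same_server_instance(&before, &after))
            printf("new server instance: previously held state is gone\n");
        return 0;
    }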
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-12 13:27 EXCHANGE_ID with same network address but different server owner Stefan Hajnoczi @ 2017-05-12 14:34 ` J. Bruce Fields 2017-05-12 15:01 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2017-05-12 14:34 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: Chuck Lever, Linux NFS Mailing List, Steve Dickson On Fri, May 12, 2017 at 09:27:21AM -0400, Stefan Hajnoczi wrote: > Hi, > I've been working on NFS over the AF_VSOCK transport > (https://www.spinics.net/lists/linux-nfs/msg60292.html). AF_VSOCK > resets established network connections when the virtual machine is > migrated to a new host. > > The NFS client expects file handles and other state to remain valid upon > reconnecting. This is not the case after VM live migration since the > new host does not have the NFS server state from the old host. > > Volatile file handles have been suggested as a way to reflect that state > does not persist across reconnect, but the Linux NFS client does not > support volatile file handles. That's unlikely to change; the protocol allows the server to advertise volatile filehandles, but doesn't really give any tools to implement them reliably. > I saw NFS 4.1 has a way for a new server running with the same network > address of an old server to communicate that it is indeed a new server > instance. If the server owner/scope in the EXCHANGE_ID response does > not match the previous server's values then the server is a new > instance. > > The implications of encountering a new server owner/scope upon reconnect > aren't clear to me and I'm not sure to what extent the Linux > implementation handles this case. Can anyone explain what happens if > the NFS client finds a new server owner/scope after reconnecting? I haven't tested it, but if it reconnects to the same IP address and finds out it's no longer talking to the same server, I think the only correct thing it could do would be to just fail all further access. There's no easy solution. To migrate between NFS servers you need some sort of clustered NFS service with shared storage. We can't currently support concurrent access to shared storage from multiple NFS servers, so all that's possible active/passive failover. Also, people that set that up normally depend on a floating IP address--I'm not sure if there's an equivalent for VSOCK. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-12 14:34 ` J. Bruce Fields @ 2017-05-12 15:01 ` Trond Myklebust 2017-05-12 17:00 ` Chuck Lever 0 siblings, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2017-05-12 15:01 UTC (permalink / raw) To: stefanha@redhat.com, bfields@redhat.com Cc: SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com

On Fri, 2017-05-12 at 10:34 -0400, J. Bruce Fields wrote:
> On Fri, May 12, 2017 at 09:27:21AM -0400, Stefan Hajnoczi wrote:
> [...]
> To migrate between NFS servers you need some sort of clustered NFS
> service with shared storage.  We can't currently support concurrent
> access to shared storage from multiple NFS servers, so all that's
> possible active/passive failover.  Also, people that set that up
> normally depend on a floating IP address--I'm not sure if there's an
> equivalent for VSOCK.

Actually, this might be a use case for re-exporting NFS. If the host
could re-export a NFS mount to the guests, then you don't necessarily
need a clustered filesystem.

OTOH, this would not solve the problem of migrating locks, which is not
really easy to support in the current state model for NFSv4.x.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-12 15:01 ` Trond Myklebust @ 2017-05-12 17:00 ` Chuck Lever 2017-05-15 14:43 ` Stefan Hajnoczi 0 siblings, 1 reply; 23+ messages in thread From: Chuck Lever @ 2017-05-12 17:00 UTC (permalink / raw) To: stefanha@redhat.com Cc: J. Bruce Fields, Trond Myklebust, Steve Dickson, Linux NFS Mailing List > On May 12, 2017, at 11:01 AM, Trond Myklebust <trondmy@primarydata.com> wrote: > > On Fri, 2017-05-12 at 10:34 -0400, J. Bruce Fields wrote: >> On Fri, May 12, 2017 at 09:27:21AM -0400, Stefan Hajnoczi wrote: >>> Hi, >>> I've been working on NFS over the AF_VSOCK transport >>> (https://www.spinics.net/lists/linux-nfs/msg60292.html). AF_VSOCK >>> resets established network connections when the virtual machine is >>> migrated to a new host. >>> >>> The NFS client expects file handles and other state to remain valid >>> upon >>> reconnecting. This is not the case after VM live migration since >>> the >>> new host does not have the NFS server state from the old host. >>> >>> Volatile file handles have been suggested as a way to reflect that >>> state >>> does not persist across reconnect, but the Linux NFS client does >>> not >>> support volatile file handles. >> >> That's unlikely to change; the protocol allows the server to >> advertise >> volatile filehandles, but doesn't really give any tools to implement >> them reliably. >> >>> I saw NFS 4.1 has a way for a new server running with the same >>> network >>> address of an old server to communicate that it is indeed a new >>> server >>> instance. If the server owner/scope in the EXCHANGE_ID response >>> does >>> not match the previous server's values then the server is a new >>> instance. >>> >>> The implications of encountering a new server owner/scope upon >>> reconnect >>> aren't clear to me and I'm not sure to what extent the Linux >>> implementation handles this case. Can anyone explain what happens >>> if >>> the NFS client finds a new server owner/scope after reconnecting? >> >> I haven't tested it, but if it reconnects to the same IP address and >> finds out it's no longer talking to the same server, I think the only >> correct thing it could do would be to just fail all further access. >> >> There's no easy solution. >> >> To migrate between NFS servers you need some sort of clustered NFS >> service with shared storage. We can't currently support concurrent >> access to shared storage from multiple NFS servers, so all that's >> possible active/passive failover. Also, people that set that up >> normally depend on a floating IP address--I'm not sure if there's an >> equivalent for VSOCK. >> > > Actually, this might be a use case for re-exporting NFS. If the host > could re-export a NFS mount to the guests, then you don't necessarily > need a clustered filesystem. > > OTOH, this would not solve the problem of migrating locks, which is not > really easy to support in the current state model for NFSv4.x. Some alternatives: - Make the local NFS server's exports read-only, NFSv3 only, and do not support locking. Ensure that the filehandles and namespace are the same on every NFS server. - As Trond suggested, all the local NFS servers accessed via AF_SOCK should re-export NFS filesystems that are located elsewhere and are visible everywhere. - Ensure there is an accompanying NFSv4 FS migration event that moves the client's files (and possibly its open and lock state) from the local NFS server to the destination NFS server concurrent with the live migration. 
- As Trond suggested, all the local NFS servers accessed via AF_VSOCK should re-export NFS filesystems that are located elsewhere and are visible everywhere. - Ensure there is an accompanying NFSv4 FS migration event that moves the client's files (and possibly its open and lock state) from the local NFS server to the destination NFS server concurrent with the live migration. If the client is aware of the FS migration, it will expect the filehandles to be the same, but it can reconstruct the open and lock state on the destination server (if that server allows GRACEful recovery for that client). This is possible in the protocol and implemented in the Linux NFS client, but none of it is implemented in the Linux NFS server. -- Chuck Lever ^ permalink raw reply [flat|nested] 23+ messages in thread
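(The recovery sequence Chuck describes in his third alternative can be summarised in a few steps. The sketch below is a compressed illustration with stub functions standing in for the real RPCs -- it is not how the Linux client structures this code -- showing only the order of operations: establish a clientid and session with the destination server, reclaim each open and its locks while the destination's grace period allows it, then signal that reclaim is complete.)

    /* Compressed sketch of migration recovery; stubs stand in for RPCs. */
    #include <stdio.h>

    struct client_state { int nr_opens; };

    static int exchange_id(const char *srv)         { printf("EXCHANGE_ID -> %s\n", srv); return 0; }
    static int create_session(const char *srv)      { printf("CREATE_SESSION -> %s\n", srv); return 0; }
    static int reclaim_open(const char *srv, int i) { printf("reclaim open %d on %s\n", i, srv); return 0; }
    static int reclaim_complete(const char *srv)    { printf("RECLAIM_COMPLETE -> %s\n", srv); return 0; }

    static int recover_on_destination(const char *dst, const struct client_state *st)
    {
        /* 1. Establish a clientid/session with the destination server. */
        if (exchange_id(dst) || create_session(dst))
            return -1;

        /* 2. Filehandles are expected to be identical on the destination,
         *    so each open (and its locks) is re-established against the
         *    same filehandle under the grace period. */
        for (int i = 0; i < st->nr_opens; i++)
            if (reclaim_open(dst, i))
                return -1;

        /* 3. Tell the destination server reclaim is finished. */
        return reclaim_complete(dst);
    }

    int main(void)
    {
        struct client_state st = { .nr_opens = 2 };
        return recover_on_destination("H2", &st);
    }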
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-12 17:00 ` Chuck Lever @ 2017-05-15 14:43 ` Stefan Hajnoczi 2017-05-15 16:02 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-15 14:43 UTC (permalink / raw) To: Chuck Lever Cc: J. Bruce Fields, Trond Myklebust, Steve Dickson, Linux NFS Mailing List [-- Attachment #1: Type: text/plain, Size: 4212 bytes --] On Fri, May 12, 2017 at 01:00:47PM -0400, Chuck Lever wrote: > > > On May 12, 2017, at 11:01 AM, Trond Myklebust <trondmy@primarydata.com> wrote: > > > > On Fri, 2017-05-12 at 10:34 -0400, J. Bruce Fields wrote: > >> On Fri, May 12, 2017 at 09:27:21AM -0400, Stefan Hajnoczi wrote: > >>> Hi, > >>> I've been working on NFS over the AF_VSOCK transport > >>> (https://www.spinics.net/lists/linux-nfs/msg60292.html). AF_VSOCK > >>> resets established network connections when the virtual machine is > >>> migrated to a new host. > >>> > >>> The NFS client expects file handles and other state to remain valid > >>> upon > >>> reconnecting. This is not the case after VM live migration since > >>> the > >>> new host does not have the NFS server state from the old host. > >>> > >>> Volatile file handles have been suggested as a way to reflect that > >>> state > >>> does not persist across reconnect, but the Linux NFS client does > >>> not > >>> support volatile file handles. > >> > >> That's unlikely to change; the protocol allows the server to > >> advertise > >> volatile filehandles, but doesn't really give any tools to implement > >> them reliably. > >> > >>> I saw NFS 4.1 has a way for a new server running with the same > >>> network > >>> address of an old server to communicate that it is indeed a new > >>> server > >>> instance. If the server owner/scope in the EXCHANGE_ID response > >>> does > >>> not match the previous server's values then the server is a new > >>> instance. > >>> > >>> The implications of encountering a new server owner/scope upon > >>> reconnect > >>> aren't clear to me and I'm not sure to what extent the Linux > >>> implementation handles this case. Can anyone explain what happens > >>> if > >>> the NFS client finds a new server owner/scope after reconnecting? > >> > >> I haven't tested it, but if it reconnects to the same IP address and > >> finds out it's no longer talking to the same server, I think the only > >> correct thing it could do would be to just fail all further access. > >> > >> There's no easy solution. > >> > >> To migrate between NFS servers you need some sort of clustered NFS > >> service with shared storage. We can't currently support concurrent > >> access to shared storage from multiple NFS servers, so all that's > >> possible active/passive failover. Also, people that set that up > >> normally depend on a floating IP address--I'm not sure if there's an > >> equivalent for VSOCK. > >> > > > > Actually, this might be a use case for re-exporting NFS. If the host > > could re-export a NFS mount to the guests, then you don't necessarily > > need a clustered filesystem. > > > > OTOH, this would not solve the problem of migrating locks, which is not > > really easy to support in the current state model for NFSv4.x. > > Some alternatives: > > - Make the local NFS server's exports read-only, NFSv3 > only, and do not support locking. Ensure that the > filehandles and namespace are the same on every NFS > server. 
> > - As Trond suggested, all the local NFS servers accessed > via AF_SOCK should re-export NFS filesystems that > are located elsewhere and are visible everywhere. > > - Ensure there is an accompanying NFSv4 FS migration event > that moves the client's files (and possibly its open and > lock state) from the local NFS server to the destination > NFS server concurrent with the live migration. > > If the client is aware of the FS migration, it will expect > the filehandles to be the same, but it can reconstruct > the open and lock state on the destination server (if that > server allows GRACEful recovery for that client). > > This is possible in the protocol and implemented in the > Linux NFS client, but none of it is implemented in the > Linux NFS server. Great, thanks for the pointers everyone. It's clear to me that AF_VSOCK won't get NFS migration for free. Initially live migration will not be supported. Re-exporting sounds interesting - perhaps the new host could re-export the old host's file systems. I'll look into the spec and code. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-15 14:43 ` Stefan Hajnoczi @ 2017-05-15 16:02 ` J. Bruce Fields 2017-05-16 13:11 ` J. Bruce Fields 2017-05-16 13:33 ` Stefan Hajnoczi 0 siblings, 2 replies; 23+ messages in thread From: J. Bruce Fields @ 2017-05-15 16:02 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List On Mon, May 15, 2017 at 03:43:06PM +0100, Stefan Hajnoczi wrote: > On Fri, May 12, 2017 at 01:00:47PM -0400, Chuck Lever wrote: > > > > > On May 12, 2017, at 11:01 AM, Trond Myklebust <trondmy@primarydata.com> wrote: > > > Actually, this might be a use case for re-exporting NFS. If the host > > > could re-export a NFS mount to the guests, then you don't necessarily > > > need a clustered filesystem. > > > > > > OTOH, this would not solve the problem of migrating locks, which is not > > > really easy to support in the current state model for NFSv4.x. > > > > Some alternatives: > > > > - Make the local NFS server's exports read-only, NFSv3 > > only, and do not support locking. Ensure that the > > filehandles and namespace are the same on every NFS > > server. > > > > - As Trond suggested, all the local NFS servers accessed > > via AF_SOCK should re-export NFS filesystems that > > are located elsewhere and are visible everywhere. > > > > - Ensure there is an accompanying NFSv4 FS migration event > > that moves the client's files (and possibly its open and > > lock state) from the local NFS server to the destination > > NFS server concurrent with the live migration. > > > > If the client is aware of the FS migration, it will expect > > the filehandles to be the same, but it can reconstruct > > the open and lock state on the destination server (if that > > server allows GRACEful recovery for that client). > > > > This is possible in the protocol and implemented in the > > Linux NFS client, but none of it is implemented in the > > Linux NFS server. > > Great, thanks for the pointers everyone. > > It's clear to me that AF_VSOCK won't get NFS migration for free. > Initially live migration will not be supported. > > Re-exporting sounds interesting - perhaps the new host could re-export > the old host's file systems. I'll look into the spec and code. I've since forgotten the limitations of the nfs reexport series. Locking (lock recovery, specifically) seems like the biggest problem to solve to improve clustered nfs service; without that, it might actually be easier than reexporting, I don't know. If there's a use case for clustered nfs service that doesn't support file locking, maybe we should look into it. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-15 16:02 ` J. Bruce Fields @ 2017-05-16 13:11 ` J. Bruce Fields 2017-05-18 13:34 ` Stefan Hajnoczi 2017-05-16 13:33 ` Stefan Hajnoczi 1 sibling, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2017-05-16 13:11 UTC (permalink / raw) To: J. Bruce Fields Cc: Stefan Hajnoczi, Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List I think you explained this before, perhaps you could just offer a pointer: remind us what your requirements or use cases are especially for VM migration? --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-16 13:11 ` J. Bruce Fields @ 2017-05-18 13:34 ` Stefan Hajnoczi 2017-05-18 14:28 ` Chuck Lever 0 siblings, 1 reply; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-18 13:34 UTC (permalink / raw) To: J. Bruce Fields Cc: J. Bruce Fields, Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List [-- Attachment #1: Type: text/plain, Size: 2365 bytes --] On Tue, May 16, 2017 at 09:11:42AM -0400, J. Bruce Fields wrote: > I think you explained this before, perhaps you could just offer a > pointer: remind us what your requirements or use cases are especially > for VM migration? The NFS over AF_VSOCK configuration is: A guest running on host mounts an NFS export from the host. The NFS server may be kernel nfsd or an NFS frontend to a distributed storage system like Ceph. A little more about these cases below. Kernel nfsd is useful for sharing files. For example, the guest may read some files from the host when it launches and/or it may write out result files to the host when it shuts down. The user may also wish to share their home directory between the guest and the host. NFS frontends are a different use case. They hide distributed storage systems from guests in cloud environments. This way guests don't see the details of the Ceph, Gluster, etc nodes. Besides benefiting security it also allows NFS-capable guests to run without installing specific drivers for the distributed storage system. This use case is "filesystem as a service". The reason for using AF_VSOCK instead of TCP/IP is that traditional networking configuration is fragile. Automatically adding a dedicated NIC to the guest and choosing an IP subnet has a high chance of conflicts (subnet collisions, network interface naming, firewall rules, network management tools). AF_VSOCK is a zero-configuration communications channel so it avoids these problems. On to migration. For the most part, guests can be live migrated between hosts without significant downtime or manual steps. PCI passthrough is an example of a feature that makes it very hard to live migrate. I hope we can allow migration with NFS, although some limitations may be necessary to make it feasible. There are two NFS over AF_VSOCK migration scenarios: 1. The files live on host H1 and host H2 cannot access the files directly. There is no way for an NFS server on H2 to access those same files unless the directory is copied along with the guest or H2 proxies to the NFS server on H1. 2. The files are accessible from both host H1 and host H2 because they are on shared storage or distributed storage system. Here the problem is "just" migrating the state from H1's NFS server to H2 so that file handles remain valid. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
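(A note on why AF_VSOCK is zero-configuration: the guest addresses the host by a well-known context ID rather than by an IP address, so there is nothing to assign, route or firewall -- and, as Bruce noted earlier, there is also no obvious analogue of a floating address. The Linux-only sketch below shows the addressing; port 2049 is used purely for illustration.)

    /* Minimal vsock connection sketch: the guest reaches the host via
     * the well-known context ID VMADDR_CID_HOST (2), not an IP address. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <linux/vm_sockets.h>

    int main(void)
    {
        struct sockaddr_vm sa = {
            .svm_family = AF_VSOCK,
            .svm_cid    = VMADDR_CID_HOST,  /* "the host I am running on" */
            .svm_port   = 2049,             /* illustrative port choice */
        };
        int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

        if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
            perror("vsock connect");
            return 1;
        }
        printf("connected to host CID %u, port %u\n", sa.svm_cid, sa.svm_port);
        return 0;
    }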
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 13:34 ` Stefan Hajnoczi @ 2017-05-18 14:28 ` Chuck Lever 2017-05-18 15:04 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: Chuck Lever @ 2017-05-18 14:28 UTC (permalink / raw) To: Stefan Hajnoczi Cc: J. Bruce Fields, J. Bruce Fields, Trond Myklebust, Steve Dickson, Linux NFS Mailing List > On May 18, 2017, at 9:34 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote: > > On Tue, May 16, 2017 at 09:11:42AM -0400, J. Bruce Fields wrote: >> I think you explained this before, perhaps you could just offer a >> pointer: remind us what your requirements or use cases are especially >> for VM migration? > > The NFS over AF_VSOCK configuration is: > > A guest running on host mounts an NFS export from the host. The NFS > server may be kernel nfsd or an NFS frontend to a distributed storage > system like Ceph. A little more about these cases below. > > Kernel nfsd is useful for sharing files. For example, the guest may > read some files from the host when it launches and/or it may write out > result files to the host when it shuts down. The user may also wish to > share their home directory between the guest and the host. > > NFS frontends are a different use case. They hide distributed storage > systems from guests in cloud environments. This way guests don't see > the details of the Ceph, Gluster, etc nodes. Besides benefiting > security it also allows NFS-capable guests to run without installing > specific drivers for the distributed storage system. This use case is > "filesystem as a service". > > The reason for using AF_VSOCK instead of TCP/IP is that traditional > networking configuration is fragile. Automatically adding a dedicated > NIC to the guest and choosing an IP subnet has a high chance of > conflicts (subnet collisions, network interface naming, firewall rules, > network management tools). AF_VSOCK is a zero-configuration > communications channel so it avoids these problems. > > On to migration. For the most part, guests can be live migrated between > hosts without significant downtime or manual steps. PCI passthrough is > an example of a feature that makes it very hard to live migrate. I hope > we can allow migration with NFS, although some limitations may be > necessary to make it feasible. > > There are two NFS over AF_VSOCK migration scenarios: > > 1. The files live on host H1 and host H2 cannot access the files > directly. There is no way for an NFS server on H2 to access those > same files unless the directory is copied along with the guest or H2 > proxies to the NFS server on H1. Having managed (and shared) storage on the physical host is awkward. I know some cloud providers might do this today by copying guest disk images down to the host's local disk, but generally it's not a flexible primary deployment choice. There's no good way to expand or replicate this pool of storage. A backup scheme would need to access all physical hosts. And the files are visible only on specific hosts. IMO you want to treat local storage on each physical host as a cache tier rather than as a back-end tier. > 2. The files are accessible from both host H1 and host H2 because they > are on shared storage or distributed storage system. Here the > problem is "just" migrating the state from H1's NFS server to H2 so > that file handles remain valid. Essentially this is the re-export case, and this makes a lot more sense to me from a storage administration point of view. 
The pool of administered storage is not local to the physical hosts running the guests, which is how I think cloud providers would prefer to operate. User storage would be accessible via an NFS share, but managed in a Ceph object (with redundancy, a common high throughput backup facility, and secure central management of user identities). Each host's NFS server could be configured to expose only the cloud storage resources for the tenants on that host. The back-end storage (ie, Ceph) could operate on a private storage area network for better security. The only missing piece here is support in Linux-based NFS servers for transparent state migration. -- Chuck Lever ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 14:28 ` Chuck Lever @ 2017-05-18 15:04 ` Trond Myklebust 2017-05-18 15:08 ` J. Bruce Fields 0 siblings, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2017-05-18 15:04 UTC (permalink / raw) To: stefanha@redhat.com, chuck.lever@oracle.com Cc: bfields@fieldses.org, bfields@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org

On Thu, 2017-05-18 at 10:28 -0400, Chuck Lever wrote:
> > On May 18, 2017, at 9:34 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> [...]
> Essentially this is the re-export case, and this makes a lot
> more sense to me from a storage administration point of view.
> [...]
> The only missing piece here is support in Linux-based NFS
> servers for transparent state migration.

Not really. In a containerised world, we're going to see more and more
cases where just a single process/application gets migrated from one
NFS client to another (and yes, a re-exporter/proxy of NFS is just
another client as far as the original server is concerned).
IOW: I think we want to allow a client to migrate some parts of its
lock state to another client, without necessarily requiring every
process being migrated to have its own clientid.

I'm in the process of building up a laundry list of problems that I'd
like to see solved as part of the new IETF WG charter. This is one
issue that I think should be on that list.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:04 ` Trond Myklebust @ 2017-05-18 15:08 ` J. Bruce Fields 2017-05-18 15:15 ` Chuck Lever 2017-05-18 15:17 ` Trond Myklebust 0 siblings, 2 replies; 23+ messages in thread From: J. Bruce Fields @ 2017-05-18 15:08 UTC (permalink / raw) To: Trond Myklebust Cc: stefanha@redhat.com, chuck.lever@oracle.com, bfields@fieldses.org, SteveD@redhat.com, linux-nfs@vger.kernel.org On Thu, May 18, 2017 at 03:04:50PM +0000, Trond Myklebust wrote: > On Thu, 2017-05-18 at 10:28 -0400, Chuck Lever wrote: > > > On May 18, 2017, at 9:34 AM, Stefan Hajnoczi <stefanha@redhat.com> > > > wrote: > > > > > > On Tue, May 16, 2017 at 09:11:42AM -0400, J. Bruce Fields wrote: > > > > I think you explained this before, perhaps you could just offer a > > > > pointer: remind us what your requirements or use cases are > > > > especially > > > > for VM migration? > > > > > > The NFS over AF_VSOCK configuration is: > > > > > > A guest running on host mounts an NFS export from the host. The > > > NFS > > > server may be kernel nfsd or an NFS frontend to a distributed > > > storage > > > system like Ceph. A little more about these cases below. > > > > > > Kernel nfsd is useful for sharing files. For example, the guest > > > may > > > read some files from the host when it launches and/or it may write > > > out > > > result files to the host when it shuts down. The user may also > > > wish to > > > share their home directory between the guest and the host. > > > > > > NFS frontends are a different use case. They hide distributed > > > storage > > > systems from guests in cloud environments. This way guests don't > > > see > > > the details of the Ceph, Gluster, etc nodes. Besides benefiting > > > security it also allows NFS-capable guests to run without > > > installing > > > specific drivers for the distributed storage system. This use case > > > is > > > "filesystem as a service". > > > > > > The reason for using AF_VSOCK instead of TCP/IP is that traditional > > > networking configuration is fragile. Automatically adding a > > > dedicated > > > NIC to the guest and choosing an IP subnet has a high chance of > > > conflicts (subnet collisions, network interface naming, firewall > > > rules, > > > network management tools). AF_VSOCK is a zero-configuration > > > communications channel so it avoids these problems. > > > > > > On to migration. For the most part, guests can be live migrated > > > between > > > hosts without significant downtime or manual steps. PCI > > > passthrough is > > > an example of a feature that makes it very hard to live migrate. I > > > hope > > > we can allow migration with NFS, although some limitations may be > > > necessary to make it feasible. > > > > > > There are two NFS over AF_VSOCK migration scenarios: > > > > > > 1. The files live on host H1 and host H2 cannot access the files > > > directly. There is no way for an NFS server on H2 to access > > > those > > > same files unless the directory is copied along with the guest or > > > H2 > > > proxies to the NFS server on H1. > > > > Having managed (and shared) storage on the physical host is > > awkward. I know some cloud providers might do this today by > > copying guest disk images down to the host's local disk, but > > generally it's not a flexible primary deployment choice. > > > > There's no good way to expand or replicate this pool of > > storage. A backup scheme would need to access all physical > > hosts. And the files are visible only on specific hosts. 
> > > > IMO you want to treat local storage on each physical host as > > a cache tier rather than as a back-end tier. > > > > > > > 2. The files are accessible from both host H1 and host H2 because > > > they > > > are on shared storage or distributed storage system. Here the > > > problem is "just" migrating the state from H1's NFS server to H2 > > > so > > > that file handles remain valid. > > > > Essentially this is the re-export case, and this makes a lot > > more sense to me from a storage administration point of view. > > > > The pool of administered storage is not local to the physical > > hosts running the guests, which is how I think cloud providers > > would prefer to operate. > > > > User storage would be accessible via an NFS share, but managed > > in a Ceph object (with redundancy, a common high throughput > > backup facility, and secure central management of user > > identities). > > > > Each host's NFS server could be configured to expose only the > > the cloud storage resources for the tenants on that host. The > > back-end storage (ie, Ceph) could operate on a private storage > > area network for better security. > > > > The only missing piece here is support in Linux-based NFS > > servers for transparent state migration. > > Not really. In a containerised world, we're going to see more and more > cases where just a single process/application gets migrated from one > NFS client to another (and yes, a re-exporter/proxy of NFS is just > another client as far as the original server is concerned). > IOW: I think we want to allow a client to migrate some parts of its > lock state to another client, without necessarily requiring every > process being migrated to have its own clientid. It wouldn't have to be every process, it'd be every container, right? What's the disadvantage of per-container clientids? I guess you lose the chance to share delegations and caches. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:08 ` J. Bruce Fields @ 2017-05-18 15:15 ` Chuck Lever 2017-05-18 15:17 ` Trond Myklebust 2017-05-18 15:17 ` Trond Myklebust 1 sibling, 1 reply; 23+ messages in thread From: Chuck Lever @ 2017-05-18 15:15 UTC (permalink / raw) To: J. Bruce Fields Cc: Trond Myklebust, stefanha@redhat.com, J. Bruce Fields, Steve Dickson, Linux NFS Mailing List > On May 18, 2017, at 11:08 AM, J. Bruce Fields <bfields@redhat.com> wrote: > > On Thu, May 18, 2017 at 03:04:50PM +0000, Trond Myklebust wrote: >> On Thu, 2017-05-18 at 10:28 -0400, Chuck Lever wrote: >>>> On May 18, 2017, at 9:34 AM, Stefan Hajnoczi <stefanha@redhat.com> >>>> wrote: >>>> >>>> On Tue, May 16, 2017 at 09:11:42AM -0400, J. Bruce Fields wrote: >>>>> I think you explained this before, perhaps you could just offer a >>>>> pointer: remind us what your requirements or use cases are >>>>> especially >>>>> for VM migration? >>>> >>>> The NFS over AF_VSOCK configuration is: >>>> >>>> A guest running on host mounts an NFS export from the host. The >>>> NFS >>>> server may be kernel nfsd or an NFS frontend to a distributed >>>> storage >>>> system like Ceph. A little more about these cases below. >>>> >>>> Kernel nfsd is useful for sharing files. For example, the guest >>>> may >>>> read some files from the host when it launches and/or it may write >>>> out >>>> result files to the host when it shuts down. The user may also >>>> wish to >>>> share their home directory between the guest and the host. >>>> >>>> NFS frontends are a different use case. They hide distributed >>>> storage >>>> systems from guests in cloud environments. This way guests don't >>>> see >>>> the details of the Ceph, Gluster, etc nodes. Besides benefiting >>>> security it also allows NFS-capable guests to run without >>>> installing >>>> specific drivers for the distributed storage system. This use case >>>> is >>>> "filesystem as a service". >>>> >>>> The reason for using AF_VSOCK instead of TCP/IP is that traditional >>>> networking configuration is fragile. Automatically adding a >>>> dedicated >>>> NIC to the guest and choosing an IP subnet has a high chance of >>>> conflicts (subnet collisions, network interface naming, firewall >>>> rules, >>>> network management tools). AF_VSOCK is a zero-configuration >>>> communications channel so it avoids these problems. >>>> >>>> On to migration. For the most part, guests can be live migrated >>>> between >>>> hosts without significant downtime or manual steps. PCI >>>> passthrough is >>>> an example of a feature that makes it very hard to live migrate. I >>>> hope >>>> we can allow migration with NFS, although some limitations may be >>>> necessary to make it feasible. >>>> >>>> There are two NFS over AF_VSOCK migration scenarios: >>>> >>>> 1. The files live on host H1 and host H2 cannot access the files >>>> directly. There is no way for an NFS server on H2 to access >>>> those >>>> same files unless the directory is copied along with the guest or >>>> H2 >>>> proxies to the NFS server on H1. >>> >>> Having managed (and shared) storage on the physical host is >>> awkward. I know some cloud providers might do this today by >>> copying guest disk images down to the host's local disk, but >>> generally it's not a flexible primary deployment choice. >>> >>> There's no good way to expand or replicate this pool of >>> storage. A backup scheme would need to access all physical >>> hosts. And the files are visible only on specific hosts. 
>>> >>> IMO you want to treat local storage on each physical host as >>> a cache tier rather than as a back-end tier. >>> >>> >>>> 2. The files are accessible from both host H1 and host H2 because >>>> they >>>> are on shared storage or distributed storage system. Here the >>>> problem is "just" migrating the state from H1's NFS server to H2 >>>> so >>>> that file handles remain valid. >>> >>> Essentially this is the re-export case, and this makes a lot >>> more sense to me from a storage administration point of view. >>> >>> The pool of administered storage is not local to the physical >>> hosts running the guests, which is how I think cloud providers >>> would prefer to operate. >>> >>> User storage would be accessible via an NFS share, but managed >>> in a Ceph object (with redundancy, a common high throughput >>> backup facility, and secure central management of user >>> identities). >>> >>> Each host's NFS server could be configured to expose only the >>> the cloud storage resources for the tenants on that host. The >>> back-end storage (ie, Ceph) could operate on a private storage >>> area network for better security. >>> >>> The only missing piece here is support in Linux-based NFS >>> servers for transparent state migration. >> >> Not really. In a containerised world, we're going to see more and more >> cases where just a single process/application gets migrated from one >> NFS client to another (and yes, a re-exporter/proxy of NFS is just >> another client as far as the original server is concerned). >> IOW: I think we want to allow a client to migrate some parts of its >> lock state to another client, without necessarily requiring every >> process being migrated to have its own clientid. > > It wouldn't have to be every process, it'd be every container, right? > What's the disadvantage of per-container clientids? I guess you lose > the chance to share delegations and caches. Can't each container have it's own net namespace, and each net namespace have its own client ID? (I agree, btw, this class of problems should be considered in the new nfsv4 WG charter. Thanks for doing that, Trond). -- Chuck Lever ^ permalink raw reply [flat|nested] 23+ messages in thread
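(A hypothetical illustration of Chuck's per-namespace suggestion: a process that unshares its network namespace before mounting. The mount command is a placeholder and the program needs root; the point is that Linux keeps NFS client state per network namespace, so a mount made inside the new namespace negotiates its own clientid with the server.)

    /* Hypothetical sketch: give a "container" its own net namespace so
     * that NFS mounts made inside it use a separate client instance. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        if (unshare(CLONE_NEWNET)) {        /* private network namespace */
            perror("unshare(CLONE_NEWNET)");
            return 1;
        }

        /* Placeholder commands: a real container runtime would configure
         * networking here first.  Any NFS mount performed from this point
         * belongs to this namespace's NFS client, and therefore gets its
         * own clientid. */
        execlp("sh", "sh", "-c",
               "ip link set lo up && mount -t nfs -o vers=4.1 server:/export /mnt",
               (char *)NULL);
        perror("execlp");
        return 1;
    }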
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:15 ` Chuck Lever @ 2017-05-18 15:17 ` Trond Myklebust 0 siblings, 0 replies; 23+ messages in thread From: Trond Myklebust @ 2017-05-18 15:17 UTC (permalink / raw) To: bfields@redhat.com, chuck.lever@oracle.com Cc: stefanha@redhat.com, bfields@fieldses.org, SteveD@redhat.com, linux-nfs@vger.kernel.org

On Thu, 2017-05-18 at 11:15 -0400, Chuck Lever wrote:
> > On May 18, 2017, at 11:08 AM, J. Bruce Fields <bfields@redhat.com> wrote:
> [...]
> > It wouldn't have to be every process, it'd be every container, right?
> > What's the disadvantage of per-container clientids?  I guess you lose
> > the chance to share delegations and caches.
>
> Can't each container have it's own net namespace, and each net
> namespace have its own client ID?

Possibly, but that wouldn't cover Stefan's case of a single kvm
process. ☺

> (I agree, btw, this class of problems should be considered in
> the new nfsv4 WG charter. Thanks for doing that, Trond).

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:08 ` J. Bruce Fields 2017-05-18 15:15 ` Chuck Lever @ 2017-05-18 15:17 ` Trond Myklebust 2017-05-18 15:28 ` bfields 1 sibling, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2017-05-18 15:17 UTC (permalink / raw) To: bfields@redhat.com Cc: stefanha@redhat.com, bfields@fieldses.org, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, 2017-05-18 at 11:08 -0400, J. Bruce Fields wrote: > On Thu, May 18, 2017 at 03:04:50PM +0000, Trond Myklebust wrote: > > On Thu, 2017-05-18 at 10:28 -0400, Chuck Lever wrote: > > > > On May 18, 2017, at 9:34 AM, Stefan Hajnoczi <stefanha@redhat.com> > > > > wrote: > > > > > > > > On Tue, May 16, 2017 at 09:11:42AM -0400, J. Bruce Fields > > > > wrote: > > > > > I think you explained this before, perhaps you could just > > > > > offer a > > > > > pointer: remind us what your requirements or use cases are > > > > > especially > > > > > for VM migration? > > > > > > > > The NFS over AF_VSOCK configuration is: > > > > > > > > A guest running on host mounts an NFS export from the > > > > host.  The > > > > NFS > > > > server may be kernel nfsd or an NFS frontend to a distributed > > > > storage > > > > system like Ceph.  A little more about these cases below. > > > > > > > > Kernel nfsd is useful for sharing files.  For example, the > > > > guest > > > > may > > > > read some files from the host when it launches and/or it may > > > > write > > > > out > > > > result files to the host when it shuts down.  The user may also > > > > wish to > > > > share their home directory between the guest and the host. > > > > > > > > NFS frontends are a different use case.  They hide distributed > > > > storage > > > > systems from guests in cloud environments.  This way guests > > > > don't > > > > see > > > > the details of the Ceph, Gluster, etc nodes.  Besides > > > > benefiting > > > > security it also allows NFS-capable guests to run without > > > > installing > > > > specific drivers for the distributed storage system.  This use > > > > case > > > > is > > > > "filesystem as a service". > > > > > > > > The reason for using AF_VSOCK instead of TCP/IP is that > > > > traditional > > > > networking configuration is fragile.  Automatically adding a > > > > dedicated > > > > NIC to the guest and choosing an IP subnet has a high chance of > > > > conflicts (subnet collisions, network interface naming, > > > > firewall > > > > rules, > > > > network management tools).  AF_VSOCK is a zero-configuration > > > > communications channel so it avoids these problems. > > > > > > > > On to migration.  For the most part, guests can be live > > > > migrated > > > > between > > > > hosts without significant downtime or manual steps.  PCI > > > > passthrough is > > > > an example of a feature that makes it very hard to live > > > > migrate.  I > > > > hope > > > > we can allow migration with NFS, although some limitations may > > > > be > > > > necessary to make it feasible. > > > > > > > > There are two NFS over AF_VSOCK migration scenarios: > > > > > > > > 1. The files live on host H1 and host H2 cannot access the > > > > files > > > > directly.  There is no way for an NFS server on H2 to access > > > > those > > > > same files unless the directory is copied along with the > > > > guest or > > > > H2 > > > > proxies to the NFS server on H1. > > > > > > Having managed (and shared) storage on the physical host is > > > awkward. I know some cloud providers might do this today by > > > copying guest disk images down to the host's local disk, but > > > generally it's not a flexible primary deployment choice. > > > > > > There's no good way to expand or replicate this pool of > > > storage. A backup scheme would need to access all physical > > > hosts. And the files are visible only on specific hosts. > > > > > > IMO you want to treat local storage on each physical host as > > > a cache tier rather than as a back-end tier. > > > > > > > > > > 2. The files are accessible from both host H1 and host H2 > > > > because > > > > they > > > > are on shared storage or distributed storage system.  Here > > > > the > > > > problem is "just" migrating the state from H1's NFS server to > > > > H2 > > > > so > > > > that file handles remain valid. > > > > > > Essentially this is the re-export case, and this makes a lot > > > more sense to me from a storage administration point of view. > > > > > > The pool of administered storage is not local to the physical > > > hosts running the guests, which is how I think cloud providers > > > would prefer to operate. > > > > > > User storage would be accessible via an NFS share, but managed > > > in a Ceph object (with redundancy, a common high throughput > > > backup facility, and secure central management of user > > > identities). > > > > > > Each host's NFS server could be configured to expose only the > > > the cloud storage resources for the tenants on that host. The > > > back-end storage (ie, Ceph) could operate on a private storage > > > area network for better security. > > > > > > The only missing piece here is support in Linux-based NFS > > > servers for transparent state migration. > > > > Not really. In a containerised world, we're going to see more and > > more > > cases where just a single process/application gets migrated from > > one > > NFS client to another (and yes, a re-exporter/proxy of NFS is just > > another client as far as the original server is concerned). > > IOW: I think we want to allow a client to migrate some parts of its > > lock state to another client, without necessarily requiring every > > process being migrated to have its own clientid. > > It wouldn't have to be every process, it'd be every container, right? > What's the disadvantage of per-container clientids?  I guess you lose > the chance to share delegations and caches. > For the case that Stefan is discussing (kvm) it would literally be a single process that is being migrated. For lxc and docker/kubernetes-style containers, it would be a collection of processes. The mountpoints used by these containers are often owned by the host; they are typically set up before starting the containerised processes. Furthermore, there is typically no "start container" system call that we can use to identify which set of processes (or cgroups) are containerised, and should share a clientid. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
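A side note for readers who have not followed the AF_VSOCK series: the guest-side mount in the configuration Stefan describes above would look roughly like the sketch below. The proto=vsock option string, the use of the host's well-known vsock CID 2 as the server address, and the "2:/export" device format are assumptions drawn from the experimental patches, not a settled interface.

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* Hypothetical guest-side mount of the host's export over vsock.
	 * CID 2 is the well-known vsock address of the host as seen from
	 * the guest; the option string mimics what mount.nfs would pass
	 * down, with no IP addresses or NICs involved. */
	const char *src  = "2:/export";
	const char *opts = "vers=4.1,proto=vsock,addr=2";

	if (mount(src, "/mnt", "nfs", 0, opts) != 0) {
		perror("mount");
		return 1;
	}
	return 0;
}

In practice the mount would normally go through mount.nfs rather than a raw mount(2) call; the point is only that the guest needs no network configuration at all to reach the host's NFS server.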
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:17 ` Trond Myklebust @ 2017-05-18 15:28 ` bfields 2017-05-18 16:09 ` Trond Myklebust 0 siblings, 1 reply; 23+ messages in thread From: bfields @ 2017-05-18 15:28 UTC (permalink / raw) To: Trond Myklebust Cc: bfields@redhat.com, stefanha@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > For the case that Stefan is discussing (kvm) it would literally be a > single process that is being migrated. For lxc and docker/kubernetes- > style containers, it would be a collection of processes. > > The mountpoints used by these containers are often owned by the host; > they are typically set up before starting the containerised processes. > Furthermore, there is typically no "start container" system call that > we can use to identify which set of processes (or cgroups) are > containerised, and should share a clientid. Is that such a hard problem? In any case, from the protocol point of view these all sound like client implementation details. The only problem I see with multiple client ID's is that you'd like to keep their delegations from conflicting with each other so they can share cache. But, maybe I'm missing something else. --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 15:28 ` bfields @ 2017-05-18 16:09 ` Trond Myklebust 2017-05-18 16:32 ` J. Bruce Fields 2017-05-22 14:25 ` Jeff Layton 0 siblings, 2 replies; 23+ messages in thread From: Trond Myklebust @ 2017-05-18 16:09 UTC (permalink / raw) To: bfields@fieldses.org Cc: stefanha@redhat.com, bfields@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, 2017-05-18 at 11:28 -0400, bfields@fieldses.org wrote: > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > > For the case that Stefan is discussing (kvm) it would literally be > > a > > single process that is being migrated. For lxc and > > docker/kubernetes-style containers, it would be a collection of processes. > > > > The mountpoints used by these containers are often owned by the > > host; > > they are typically set up before starting the containerised > > processes. > > Furthermore, there is typically no "start container" system call that > > we can use to identify which set of processes (or cgroups) are > > containerised, and should share a clientid. > > Is that such a hard problem? > Err, yes... isn't it? How do I identify a container and know where to set the lease boundary? Bear in mind that the definition of "container" is non-existent beyond the obvious "a loose collection of processes". It varies from the docker/lxc/virtuozzo style container, which uses namespaces to bound the processes, to the Google type of "container" that is actually just a set of cgroups and to the kvm/qemu single process. > In any case, from the protocol point of view these all sound like > client > implementation details. If you are seeing an obvious architecture for the client, then please share... > The only problem I see with multiple client ID's is that you'd like > to > keep their delegations from conflicting with each other so they can > share cache. > > But, maybe I'm missing something else. Having to an EXCHANGE_ID + CREATE_SESSION on every call to fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each process destructor? Lease renewal pings from 1000 processes running on 1000 clients? This is what I mean about container boundaries. If they aren't well defined, then we're down to doing precisely the above. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
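To make the overhead Trond is describing concrete, here is a minimal sketch of the state every additional NFSv4.1 client identity drags along. Field sizes follow RFC 5661; the struct and its name are illustrative and are not taken from the Linux client.

#include <stdint.h>

/* Illustrative only: what every extra NFSv4.1 client identity costs.
 * Each instance is created with EXCHANGE_ID + CREATE_SESSION, must send
 * a SEQUENCE ping within every lease period, and is torn down with
 * DESTROY_SESSION + DESTROY_CLIENTID (the operation referred to above as
 * DESTROY_EXCHANGEID). */
struct example_nfs41_client {
	uint64_t clientid;       /* returned by EXCHANGE_ID */
	uint8_t  sessionid[16];  /* sessionid4, returned by CREATE_SESSION */
	uint32_t lease_time;     /* seconds, dictated by the server */
	/* plus slot tables, sequence IDs, and a renewal timer for each
	 * instance: 1000 processes on 1000 clients means 1,000,000 live
	 * leases the server has to track and the clients have to renew */
};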
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 16:09 ` Trond Myklebust @ 2017-05-18 16:32 ` J. Bruce Fields 2017-05-18 17:13 ` Trond Myklebust 2017-05-22 14:25 ` Jeff Layton 1 sibling, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2017-05-18 16:32 UTC (permalink / raw) To: Trond Myklebust Cc: bfields@fieldses.org, stefanha@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, May 18, 2017 at 04:09:10PM +0000, Trond Myklebust wrote: > On Thu, 2017-05-18 at 11:28 -0400, bfields@fieldses.org wrote: > > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > > > For the case that Stefan is discussing (kvm) it would literally be > > > a > > > single process that is being migrated. For lxc and > > > docker/kubernetes- > > > style containers, it would be a collection of processes. > > > > > > The mountpoints used by these containers are often owned by the > > > host; > > > they are typically set up before starting the containerised > > > processes. > > > Furthermore, there is typically no "start container" system call > > > that > > > we can use to identify which set of processes (or cgroups) are > > > containerised, and should share a clientid. > > > > Is that such a hard problem? > > > > Err, yes... isn't it? How do I identify a container and know where to > set the lease boundary? > > Bear in mind that the definition of "container" is non-existent beyond > the obvious "a loose collection of processes". It varies from the > docker/lxc/virtuozzo style container, which uses namespaces to bound > the processes, to the Google type of "container" that is actually just > a set of cgroups and to the kvm/qemu single process. Sure, but, can't we pick *something* to use as the boundary (network namespace?), document it, and let userspace use that to tell us what it wants? > > In any case, from the protocol point of view these all sound like > > client > > implementation details. > > If you are seeing an obvious architecture for the client, then please > share... Make clientids per-network-namespace and store them in nfs_net? (Maybe that's what's already done, I can't tell.) > > The only problem I see with multiple client ID's is that you'd like > > to > > keep their delegations from conflicting with each other so they can > > share cache. > > > > But, maybe I'm missing something else. > > Having to an EXCHANGE_ID + CREATE_SESSION on every call to > fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each process > destructor? Lease renewal pings from 1000 processes running on 1000 > clients? > > This is what I mean about container boundaries. If they aren't well > defined, then we're down to doing precisely the above. Again this sounds like a complaint about the kernel api rather than about the protocol. If the container management system knows what it wants and we give it a way to explain it to us, then we avoid most of that, right? --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
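A rough sketch of what Bruce's per-network-namespace suggestion could look like on the client side, using the kernel's pernet machinery. The struct, its fields and the helper below are illustrative assumptions rather than the actual fs/nfs/netns.h layout.

#include <linux/list.h>
#include <linux/types.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>

static unsigned int nfs_example_net_id;

/* One of these per network namespace; registered with
 * register_pernet_subsys() so it is allocated when the netns is created
 * and torn down when the netns goes away. */
struct nfs_example_net {
	struct list_head nfs_client_list; /* nfs_client instances in this netns */
	u64              clientid;        /* lease/client ID scoped to the netns */
	/* session slots, lease renewal state, ... */
};

static inline struct nfs_example_net *nfs_example_net(struct net *net)
{
	return net_generic(net, nfs_example_net_id);
}

Under a scheme like this, giving a container (or a single kvm process) its own client ID means giving it its own network namespace, which is exactly the granularity question Stefan raises later in the thread.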
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 16:32 ` J. Bruce Fields @ 2017-05-18 17:13 ` Trond Myklebust 2017-05-22 12:45 ` Stefan Hajnoczi 0 siblings, 1 reply; 23+ messages in thread From: Trond Myklebust @ 2017-05-18 17:13 UTC (permalink / raw) To: bfields@redhat.com Cc: bfields@fieldses.org, stefanha@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, 2017-05-18 at 12:32 -0400, J. Bruce Fields wrote: > On Thu, May 18, 2017 at 04:09:10PM +0000, Trond Myklebust wrote: > > On Thu, 2017-05-18 at 11:28 -0400, bfields@fieldses.org wrote: > > > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > > > > For the case that Stefan is discussing (kvm) it would literally > > > > be > > > > a > > > > single process that is being migrated. For lxc and > > > > docker/kubernetes-style containers, it would be a collection of processes. > > > > > > > > The mountpoints used by these containers are often owned by the > > > > host; > > > > they are typically set up before starting the containerised > > > > processes. > > > > Furthermore, there is typically no "start container" system > > > > call > > > > that > > > > we can use to identify which set of processes (or cgroups) are > > > > containerised, and should share a clientid. > > > > > > Is that such a hard problem? > > > > > > > Err, yes... isn't it? How do I identify a container and know where > > to > > set the lease boundary? > > > > Bear in mind that the definition of "container" is non-existent > > beyond > > the obvious "a loose collection of processes". It varies from the > > docker/lxc/virtuozzo style container, which uses namespaces to > > bound > > the processes, to the Google type of "container" that is actually > > just > > a set of cgroups and to the kvm/qemu single process. > > Sure, but, can't we pick *something* to use as the boundary (network > namespace?), document it, and let userspace use that to tell us what > it > wants? > > > > In any case, from the protocol point of view these all sound like > > > client > > > implementation details. > > > > If you are seeing an obvious architecture for the client, then > > please > > share... > > Make clientids per-network-namespace and store them in > nfs_net?  (Maybe > that's what's already done, I can't tell.) > > > > The only problem I see with multiple client ID's is that you'd > > > like > > > to > > > keep their delegations from conflicting with each other so they > > > can > > > share cache. > > > > > > But, maybe I'm missing something else. > > > > Having to an EXCHANGE_ID + CREATE_SESSION on every call to > > fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each > > process > > destructor? Lease renewal pings from 1000 processes running on 1000 > > clients? > > > > This is what I mean about container boundaries. If they aren't well > > defined, then we're down to doing precisely the above. > > Again this sounds like a complaint about the kernel api rather than > about the protocol.  If the container management system knows what it > wants and we give it a way to explain it to us, then we avoid most of > that, right? > OK, so consider the use case that inspired this conversation: namely using nfsd on the server to proxy for a client running in kvm and using the vsock interface. How do I architect knfsd so that it handles that use case? Are you saying that I need to set up a container of knfsd threads just to serve this one kvm instance? Otherwise, the locks created by knfsd for that kvm process will have the same clientid as all the other locks created by knfsd? To me, it seems more flexible to allow a utility like criu (https://criu.org/Main_Page) to specify "I'd like to mark these specific locks as being part of this checkpoint/restore context please" (https://criu.org/File_locks), and allow them to be attempted restored with the process that was migrated. Note that criu also works at the level of the application, not a container, even though it was developed by the container virtualisation community. -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 17:13 ` Trond Myklebust @ 2017-05-22 12:45 ` Stefan Hajnoczi 0 siblings, 0 replies; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-22 12:45 UTC (permalink / raw) To: Trond Myklebust Cc: bfields@redhat.com, bfields@fieldses.org, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com [-- Attachment #1: Type: text/plain, Size: 3853 bytes --] On Thu, May 18, 2017 at 05:13:48PM +0000, Trond Myklebust wrote: > On Thu, 2017-05-18 at 12:32 -0400, J. Bruce Fields wrote: > > On Thu, May 18, 2017 at 04:09:10PM +0000, Trond Myklebust wrote: > > > On Thu, 2017-05-18 at 11:28 -0400, bfields@fieldses.org wrote: > > > > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > > > > > For the case that Stefan is discussing (kvm) it would literally > > > > > be > > > > > a > > > > > single process that is being migrated. For lxc and > > > > > docker/kubernetes- > > > > > style containers, it would be a collection of processes. > > > > > > > > > > The mountpoints used by these containers are often owned by the > > > > > host; > > > > > they are typically set up before starting the containerised > > > > > processes. > > > > > Furthermore, there is typically no "start container" system > > > > > call > > > > > that > > > > > we can use to identify which set of processes (or cgroups) are > > > > > containerised, and should share a clientid. > > > > > > > > Is that such a hard problem? > > > > > > > > > > Err, yes... isn't it? How do I identify a container and know where > > > to > > > set the lease boundary? > > > > > > Bear in mind that the definition of "container" is non-existent > > > beyond > > > the obvious "a loose collection of processes". It varies from the > > > docker/lxc/virtuozzo style container, which uses namespaces to > > > bound > > > the processes, to the Google type of "container" that is actually > > > just > > > a set of cgroups and to the kvm/qemu single process. > > > > Sure, but, can't we pick *something* to use as the boundary (network > > namespace?), document it, and let userspace use that to tell us what > > it > > wants? > > > > > > In any case, from the protocol point of view these all sound like > > > > client > > > > implementation details. > > > > > > If you are seeing an obvious architecture for the client, then > > > please > > > share... > > > > Make clientids per-network-namespace and store them in > > nfs_net? (Maybe > > that's what's already done, I can't tell.) > > > > > > The only problem I see with multiple client ID's is that you'd > > > > like > > > > to > > > > keep their delegations from conflicting with each other so they > > > > can > > > > share cache. > > > > > > > > But, maybe I'm missing something else. > > > > > > Having to an EXCHANGE_ID + CREATE_SESSION on every call to > > > fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each > > > process > > > destructor? Lease renewal pings from 1000 processes running on 1000 > > > clients? > > > > > > This is what I mean about container boundaries. If they aren't well > > > defined, then we're down to doing precisely the above. > > > > Again this sounds like a complaint about the kernel api rather than > > about the protocol. If the container management system knows what it > > wants and we give it a way to explain it to us, then we avoid most of > > that, right? 
> > > > OK, so consider the use case that inspired this conversation: namely > using nfsd on the server to proxy for a client running in kvm and using > the vsock interface. > > How do I architect knfsd so that it handles that use case? Are you > saying that I need to set up a container of knfsd threads just to serve > this one kvm instance? Otherwise, the locks created by knfsd for that > kvm process will have the same clientid as all the other locks created > by knfsd? Another issue with Linux namespaces is that the granularity of the "net" namespace isn't always what you want. The application may need its own NFS client but that requires isolating it from all other services in the network namespace (like the physical network interfaces :)). Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
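A small illustration of the granularity problem Stefan describes, assuming nothing beyond the standard unshare(2) interface: putting an application in its own network namespace (the proposed boundary for a separate client ID) also cuts it off from every interface configured in the parent namespace.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	/* Requires CAP_SYS_ADMIN. */
	if (unshare(CLONE_NEWNET) != 0) {
		perror("unshare(CLONE_NEWNET)");
		return EXIT_FAILURE;
	}
	/* The process now sees only an unconfigured loopback device: an
	 * NFS mount made from here would need its own plumbing (veth
	 * pair, routes, firewall rules) to reach any server, unless it
	 * uses a zero-configuration transport such as AF_VSOCK. */
	if (system("ip addr show") == -1)
		perror("system");
	return EXIT_SUCCESS;
}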
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-18 16:09 ` Trond Myklebust 2017-05-18 16:32 ` J. Bruce Fields @ 2017-05-22 14:25 ` Jeff Layton 1 sibling, 0 replies; 23+ messages in thread From: Jeff Layton @ 2017-05-22 14:25 UTC (permalink / raw) To: Trond Myklebust, bfields@fieldses.org, David Howells Cc: stefanha@redhat.com, bfields@redhat.com, SteveD@redhat.com, linux-nfs@vger.kernel.org, chuck.lever@oracle.com On Thu, 2017-05-18 at 16:09 +0000, Trond Myklebust wrote: > On Thu, 2017-05-18 at 11:28 -0400, bfields@fieldses.org wrote: > > On Thu, May 18, 2017 at 03:17:11PM +0000, Trond Myklebust wrote: > > > For the case that Stefan is discussing (kvm) it would literally be > > > a > > > single process that is being migrated. For lxc and > > > docker/kubernetes- > > > style containers, it would be a collection of processes. > > > > > > The mountpoints used by these containers are often owned by the > > > host; > > > they are typically set up before starting the containerised > > > processes. > > > Furthermore, there is typically no "start container" system call > > > that > > > we can use to identify which set of processes (or cgroups) are > > > containerised, and should share a clientid. > > > > Is that such a hard problem? > > > > Err, yes... isn't it? How do I identify a container and know where to > set the lease boundary? > > Bear in mind that the definition of "container" is non-existent beyond > the obvious "a loose collection of processes". It varies from the > docker/lxc/virtuozzo style container, which uses namespaces to bound > the processes, to the Google type of "container" that is actually just > a set of cgroups and to the kvm/qemu single process. > > > In any case, from the protocol point of view these all sound like > > client > > implementation details. > > If you are seeing an obvious architecture for the client, then please > share... > > > The only problem I see with multiple client ID's is that you'd like > > to > > keep their delegations from conflicting with each other so they can > > share cache. > > > > But, maybe I'm missing something else. > > Having to an EXCHANGE_ID + CREATE_SESSION on every call to > fork()/clone() and a DESTROY_SESSION/DESTROY_EXCHANGEID in each process > destructor? Lease renewal pings from 1000 processes running on 1000 > clients? > > This is what I mean about container boundaries. If they aren't well > defined, then we're down to doing precisely the above. > This is the crux of the problem with containers in general. We've been pretending for a long time that the kernel doesn't really need to understand them and can just worry about namespaces, but that really hasn't worked out well so far. I think we need to consider making a "container" a first-class object in the kernel. Note that that would also help solve the long-standing problem of how to handle usermode helper upcalls in containers. I do happen to know of one kernel developer (cc'ed here) who has been working on something along those lines... -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-15 16:02 ` J. Bruce Fields 2017-05-16 13:11 ` J. Bruce Fields @ 2017-05-16 13:33 ` Stefan Hajnoczi 2017-05-16 13:36 ` J. Bruce Fields 1 sibling, 1 reply; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-16 13:33 UTC (permalink / raw) To: J. Bruce Fields Cc: Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List [-- Attachment #1: Type: text/plain, Size: 2742 bytes --] On Mon, May 15, 2017 at 12:02:48PM -0400, J. Bruce Fields wrote: > On Mon, May 15, 2017 at 03:43:06PM +0100, Stefan Hajnoczi wrote: > > On Fri, May 12, 2017 at 01:00:47PM -0400, Chuck Lever wrote: > > > > > > > On May 12, 2017, at 11:01 AM, Trond Myklebust <trondmy@primarydata.com> wrote: > > > > Actually, this might be a use case for re-exporting NFS. If the host > > > > could re-export a NFS mount to the guests, then you don't necessarily > > > > need a clustered filesystem. > > > > > > > > OTOH, this would not solve the problem of migrating locks, which is not > > > > really easy to support in the current state model for NFSv4.x. > > > > > > Some alternatives: > > > > > > - Make the local NFS server's exports read-only, NFSv3 > > > only, and do not support locking. Ensure that the > > > filehandles and namespace are the same on every NFS > > > server. > > > > > > - As Trond suggested, all the local NFS servers accessed > > > via AF_SOCK should re-export NFS filesystems that > > > are located elsewhere and are visible everywhere. > > > > > > - Ensure there is an accompanying NFSv4 FS migration event > > > that moves the client's files (and possibly its open and > > > lock state) from the local NFS server to the destination > > > NFS server concurrent with the live migration. > > > > > > If the client is aware of the FS migration, it will expect > > > the filehandles to be the same, but it can reconstruct > > > the open and lock state on the destination server (if that > > > server allows GRACEful recovery for that client). > > > > > > This is possible in the protocol and implemented in the > > > Linux NFS client, but none of it is implemented in the > > > Linux NFS server. > > > > Great, thanks for the pointers everyone. > > > > It's clear to me that AF_VSOCK won't get NFS migration for free. > > Initially live migration will not be supported. > > > > Re-exporting sounds interesting - perhaps the new host could re-export > > the old host's file systems. I'll look into the spec and code. > > I've since forgotten the limitations of the nfs reexport series. > > Locking (lock recovery, specifically) seems like the biggest problem to > solve to improve clustered nfs service; without that, it might actually > be easier than reexporting, I don't know. If there's a use case for > clustered nfs service that doesn't support file locking, maybe we should > look into it. I suspect many guests will have a dedicated/private export. The guest will be the only client accessing its export. This could simplify the locking issues. That said, it would be nice to support full clustered operation. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-16 13:33 ` Stefan Hajnoczi @ 2017-05-16 13:36 ` J. Bruce Fields 2017-05-17 14:33 ` Stefan Hajnoczi 0 siblings, 1 reply; 23+ messages in thread From: J. Bruce Fields @ 2017-05-16 13:36 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List On Tue, May 16, 2017 at 02:33:38PM +0100, Stefan Hajnoczi wrote: > I suspect many guests will have a dedicated/private export. The guest > will be the only client accessing its export. This could simplify the > locking issues. So why not migrate filesystem images instead of using NFS? --b. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: EXCHANGE_ID with same network address but different server owner 2017-05-16 13:36 ` J. Bruce Fields @ 2017-05-17 14:33 ` Stefan Hajnoczi 0 siblings, 0 replies; 23+ messages in thread From: Stefan Hajnoczi @ 2017-05-17 14:33 UTC (permalink / raw) To: J. Bruce Fields Cc: Chuck Lever, Trond Myklebust, Steve Dickson, Linux NFS Mailing List [-- Attachment #1: Type: text/plain, Size: 813 bytes --] On Tue, May 16, 2017 at 09:36:03AM -0400, J. Bruce Fields wrote: > On Tue, May 16, 2017 at 02:33:38PM +0100, Stefan Hajnoczi wrote: > > I suspect many guests will have a dedicated/private export. The guest > > will be the only client accessing its export. This could simplify the > > locking issues. > > So why not migrate filesystem images instead of using NFS? Some users consider disk image files inconvenient because they cannot be inspected and manipulated with regular shell utilities. Especially scenarios where many VMs are launched with VM-specific data files can benefit from using files directly instead of building disk images. I'm not saying all users just export per-VM directories, but it's a common case and may be a good starting point if a general solution is very hard. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 455 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2017-05-22 14:25 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-05-12 13:27 EXCHANGE_ID with same network address but different server owner Stefan Hajnoczi 2017-05-12 14:34 ` J. Bruce Fields 2017-05-12 15:01 ` Trond Myklebust 2017-05-12 17:00 ` Chuck Lever 2017-05-15 14:43 ` Stefan Hajnoczi 2017-05-15 16:02 ` J. Bruce Fields 2017-05-16 13:11 ` J. Bruce Fields 2017-05-18 13:34 ` Stefan Hajnoczi 2017-05-18 14:28 ` Chuck Lever 2017-05-18 15:04 ` Trond Myklebust 2017-05-18 15:08 ` J. Bruce Fields 2017-05-18 15:15 ` Chuck Lever 2017-05-18 15:17 ` Trond Myklebust 2017-05-18 15:17 ` Trond Myklebust 2017-05-18 15:28 ` bfields 2017-05-18 16:09 ` Trond Myklebust 2017-05-18 16:32 ` J. Bruce Fields 2017-05-18 17:13 ` Trond Myklebust 2017-05-22 12:45 ` Stefan Hajnoczi 2017-05-22 14:25 ` Jeff Layton 2017-05-16 13:33 ` Stefan Hajnoczi 2017-05-16 13:36 ` J. Bruce Fields 2017-05-17 14:33 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).