From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman) Subject: Re: BUG: Mount ignores mount options Date: Fri, 10 Aug 2018 20:05:44 -0500 Message-ID: <87pnypiufr.fsf@xmission.com> References: <153313703562.13253.5766498657900728120.stgit@warthog.procyon.org.uk> <87d0uqpba5.fsf@xmission.com> <20180810151606.GA6515@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Transfer-Encoding: base64 Return-path: In-Reply-To: <20180810151606.GA6515-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org> (Al Viro's message of "Fri, 10 Aug 2018 16:16:06 +0100") List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: apparmor-bounces-nLRlyDuq1AZFpShjVBNYrg@public.gmane.org Sender: "AppArmor" Content-Type: text/plain; charset="us-ascii" To: Al Viro Cc: Eric Biggers , Tetsuo Handa , David Howells , selinux-+05T5uksL2qpZYMLLGbcSA@public.gmane.org, tomoyo-dev-en-5NWGOfrQmneRv+LV9MX5uooqe+aC9MnS@public.gmane.org, Paul Moore , Miklos Szeredi , Stephen Smalley , fenghua.yu-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org, apparmor-nLRlyDuq1AZFpShjVBNYrg@public.gmane.org, Greg Kroah-Hartman , cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, "Theodore Y. Ts'o" , linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Li Zefan , Johannes Weiner , linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Tejun Heo , torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org QWwgVmlybyA8dmlyb0BaZW5JVi5saW51eC5vcmcudWs+IHdyaXRlczoKCj4gT24gRnJpLCBBdWcg MTAsIDIwMTggYXQgMDk6MDU6MjJBTSAtMDUwMCwgRXJpYyBXLiBCaWVkZXJtYW4gd3JvdGU6Cj4+ IAo+PiBUaGVyZSBpcyBhIHNlcmlvdXMgcHJvYmxlbSB3aXRoIG1vdW50IG9wdGlvbnMgdG9kYXkg dGhhdCBmc29wZW4gZG9lcyBub3QKPj4gYWRkcmVzcy4gIFRoZSBwcm9ibGVtIGlzIHRoYXQgbW91 bnQgb3B0aW9ucyBhcmUgaWdub3JlZCBmb3IgYmxvY2sgYmFzZWQKPj4gZmlsZXN5c3RlbXMsIGFu ZCBhbnkgb3RoZXIgdHlwZSBvZiBmaWxlc3lzdGVtIHRoYXQgZm9sbG93cyB0aGUgc2FtZQo+PiBw YXR0ZXJuLgo+PiAKPj4gVGhlIHNjcmlwdCBiZWxvdyBkZW1vbnN0cmF0ZXMgdGhpcyBidWcuICBT aG93aW5nIHRoaXMgYnVnIGNhbiBjYXVzZSB0aGUKPj4gZXh0NCAiYWNsIiAicXVvdGEiIGFuZCAi dXNlcl94YXR0ciIgb3B0aW9ucyB0byBiZSBzaWxlbnRseSBpZ25vcmVkLgo+PiAKPj4gZnNvcGVu IGhhcyBteSBuYWNrIHVudGlsIGl0IGFkZHJlc3NlcyB0aGlzIGlzc3VlLgo+PiAKPj4gSSBkb24n dCBrbm93IGlmIHdlIGNhbiBmaXggdGhpcyBpbiB0aGUgY29udGV4dCBvZiBzeXNfbW91bnQuICBC dXQgd2UgaWYKPj4gd2UgYXJlIHJlZG9pbmcgdGhlIG9wdGlvbiBwYXJzaW5nIG9mIGhvdyB3ZSBt b3VudCBmaWxlc3lzdGVtcyB0aGlzIG5lZWRzCj4+IHRvIGJlIGZpeGVkIGJlZm9yZSB3ZSBzdGFy dCB3b3JyeWluZyBhYm91dCBidWcgY29tcGF0aWJpbGl0eS4KPj4gCj4+IEhvcGVmdWxseSB0aGlz IHJlcG9ydCBpcyBzaW1wbGUgYW5kIGNsZWFyIGVub3VnaCB0aGF0IHdlIGNhbiBhdCBsZWFzdAo+ PiBhZ3JlZSBvbiB0aGUgcHJvYmxlbS4KPgo+IFN1cmUsIGl0IGlzIHNpbXBsZS4gIFNvJ3MgdGhl IHNvbHV0aW9uOiBNTlRfVVNFUk5TX1NQRUNJQUxfU0VNQU5USUNTIHRoYXQKPiB3b3VsZCBnZXQg cGFzc2VkIHRvIGZpbGVzeXN0ZW1zLCBzbyB0aGF0IEVyaWMgd291bGQgYmUgYWJsZSB0byBpbXBs ZW1lbnQKPiBoaXMgbW91bnQoMiktaW5jb21wYXRpYmxlIGJlaGF2aW91ciBhdCBsZWlzdXJlLCB3 aXRob3V0IHdvcnJ5aW5nIGFib3V0Cj4gY29tcGF0aWJpbGl0eSBpc3N1ZXMuCj4KPiBEb2VzIHRo YXQgYWRkcmVzcyB5b3VyIGNvbXBsYWludD8KCkFic29sdXRlbHkgbm90LgoKTXkgY29tcGxhaW50 IGlzIHRoYXQgdGhlIGN1cnJlbnQgaW1wbGVtZW50ZWQgYmVoYXZpb3Igb2YgcHJhY3RpY2FsbHkK ZXZlcnkgZmlsZXN5c3RlbSBpbiB0aGUga2VybmVsLCBpcyB0aGF0IGl0IHdpbGwgaWdub3JlIG1v dW50IG9wdGlvbnMKd2hlbiBtb3VudGVkIGEgc2Vjb25kIHRpbWUuICAKCkl0IGlzIG5vdCBzb21l IHdlaXJkIHNwZWNpYWwgY2FzZS4KCkl0IGlzIG5vdCBzb21lIGNvbnRhaW5lciB0aGluZy4KCkl0 IGlzIHRoYXQgdGhlIGJlaGF2aW9yIG9mIG1vdW50KDIpIHdpdGggcHJhY3RpY2FsbHkgZXZlcnkg ZmlsZXN5c3RlbQp0eXBlIHdoZW4gdGhhdCBmaWxlc3lzdGVtIGlzIGFscmVhZHkgbW91bnRlZCBz b21ld2hlcmUgZWxzZSBiZWhhdmVzCmluIHdheXMgbm8gb25lIHdvdWxkIGV4cGVjdC4KCldpdGgg dGhlIG5ldyBmc29wZW4gYXBpIHRoZSBlYXN5IHRoaW5nIHRvIGRvIGlzIHNpbXBseSBoYXZlIENN RF9DUkVBVEUKQ01EX0JJTkRfSU5URVJOQUwgYW5kIGJlIGRvbmUgd2l0aCBpdC4gIENNRF9DUkVB VEUgZ3VhcmFudGVlIHRoYXQgYSBuZXcKc3VwZXJibG9jayBpcyBjcmVhdGVkLiAgQ01EX0JJTkRf SU5URVJOQUwgd291bGQgb25seSB3b3JrIHdpdGggYW4KZXhpc3Rpbmcgc3VwZXJibG9jay4gIFRo ZW4gcm9vdCB3b3VsZCBhdCBsZWFzdCBrbm93IHRoYXQgaGUgaXMKY29ubmVjdGluZyB0byBhbiBh bHJlYWR5IG1vdW50ZWQgZmlsZXN5c3RlbSBhbmQgY291bGQgbG9vayBhdCB0aGUKb3B0aW9ucyBl dGMgYW5kIGZhaWwgaWYgaGUgZGlkbid0IGxpa2Ugd2hhdCBoZSBzYXcuICBObyBzdXJwcmlzZXMs IG5vCm11c3MsIG5vIGZ1c3Mgc2ltcGxlLgoKCkJ1dCBJIGhhdmUgYmVlbiB0b2xkIHRoZSBzaW1w bGUgc29sdXRpb24gYWJvdmUgaXMgc29tZWhvdyB1bmFjY2VwdGFibGUuCkFuZCBhbiBvcHRpb24g dG8gY29tcGFyZSB0aGUgbW91bnQgb3B0aW9ucyBhbmQgc2VlIGlmIHRoZXkgYXJlIHRoZSBzYW1l CndhcyBvZmZlcmVkLiAgVGhhdCB3b3VsZCB3aWxsIHdvcmsgdG8uCgpJIGp1c3QgY2FyZSB0aGF0 IHdlIGRlZmluZSB0aGUgc2VtYW50aWNzIGluIHN1Y2ggYSB3YXkgdGhhdCBpdCBpcyBub3QKZWFz eSBmb3Igcm9vdCB0byBnZXQgY29uZnVzZWQgYW5kIGRvIHNvbWV0aGluZyBzdHVwaWQgdGhhdCB3 aWxsIGJpdGUKbGF0ZXIsIGFuZCB0aGF0IHdlIGJ1aWxkIHRoZSBpbmZyYXN0cnVjdHVyZSBzbyB0 aGF0IGFsbCBmaWxlc3lzdGVtcwpjYW4gaW1wbGVtZW50IGl0IGVhc2lseS4KClNvIHllcyB0aGlz IGlzIDEwMCUgYSBxdWVzdGlvbiBhYm91dCBob3cgZmlsZXN5c3RlbXMgc2hvdWxkIGJlaGF2ZSB3 aXRoCnJlc3BlY3QgdG8gdGhlaXIgb3B0aW9uIHdoZW4gbW91bnRlZCBmb3IgYSBzZWNvbmQgdGlt ZS4gIFRoYXQgaXMgd2hhdApEYXZlIEhvd2VsbHMgcGF0Y2hzZXQgaXMgYWRkcmVzc2luZy4KCj4g QmVjYXVzZSBvbmUgdGhpbmcgd2UgYXJlIG5vdCBnb2luZyB0byBkbyBpcyBjaGFuZ2luZyBtb3Vu dCgyKQo+IGJlaGF2aW91ci4KCkkgaGF2ZSBub3QgYXNrZWQgZm9yIHRoYXQuICBJIGhhdmUgYXNr ZWQgdGhhdCB3ZSBnZXQgaXQgcmlnaHQgZm9yCmZzb3Blbi4KCj4gUmVhc29uOiB1c2VybGFuZC12 aXNpYmxlIGJlaGF2aW91ciBvZiBoZWxsIGtub3dzIGhvdyBtYW55IGxvY2FsIHNjcmlwdHMuCgoK Cj4gQW5vdGhlciB0aGluZyB0aGF0Cj4gaXMgZmxhdC1vdXQgbm90IGZlYXNpYmxlIGlzIHNvbWUg a2luZCBvZiBibGFua2V0ICJjb21wYXJlIG9wdGlvbnMiCj4gc3R1ZmY7IGl0ICpjYW4qIGJlIGRv bmUgYXMgaGVscGVycyB0byBiZSB1c2VkIGJ5IGZpbGVzeXN0ZW0gd2hlbgo+IGl0IHNlZXMgdGhh dCBuZXcgZmxhZywgYnV0IGl0J3Mgc2ltcGx5IG5vdCBnb2luZyB0byB3b3JrIGF0IHRoZQo+IGZz LWluZGVwZW5kZW50IGxldmVsLgo+Cj4gVHJpdmlhbCBleGFtcGxlIHdpdGggdGhlIHNhbWUgZXh0 NDoKPiBtb3VudCAvZGV2L3NkYTEgL21udC9hIC1vIGJzZGRmIHZzLiBtb3VudCAvZGV2L3NkYTEg L21udC9iCj4gZXh0NCBjYW4gdGVsbCB0aGF0IHRoZXNlIGFyZSB0aGUgc2FtZS4gIHN5c2NhbGwg aXRzZWxmIGhhcyBubwo+IGNsdWUuICBXaGF0J3MgbW9yZSwgaXQncyBub3QganVzdCBleHBsaWNp dGx5IHNwZWxsZWQgZGVmYXVsdAo+IG9wdGlvbnMgLSBpdCdzIHRoZSBzdHVmZiB0aGF0IGhhcyBt b3JlIHRoYW4gb25lIGZvcm0uICBBbmQgd2hpbGUKPiB3ZSBhcmUgYXQgaXQsIHRoZSB0aGluZ3Mg bGlrZSB0d28gTkZTIG1vdW50cyBvZiBkaWZmZXJlbnQgdHJlZXMKPiBmcm9tIHRoZSBzYW1lIHNl cnZlcjsgdGhleSBtaWdodCBvciBtaWdodCBub3QgZ2V0IHRoZSBzYW1lIHN1cGVyYmxvY2suCj4g RGVwZW5kaW5nIHVwb24gdGhlIG9wdGlvbnMuCj4KPiBDb252ZW5pZW5jZSBoZWxwZXIgdGhhdCB3 b3VsZCBhbGxvdyBleHQ0IHRvIGNvbXBhcmUgb3B0aW9ucyBhbmQgcmVqZWN0Cj4gdGhlIGluY29t cGF0aWJsZSBtb3VudD8gIE5vdCBzdXJlIGhvdyBtdWNoIGV4dDQtc3BlY2lmaWMga25vd2xlZGdl Cj4gd291bGQgaGF2ZSB0byBnbyBpbiBpdCwgYnV0IGlmIHlvdSBjYW4gY29tZSB1cCB3aXRoIG9u ZSAtIG1vcmUgcG93ZXIKPiB0byB5b3UuICBCdXQgdGhlIGRlY2lzaW9uIHRvIHVzZSBpdCAqbXVz dCogYmUgZXh0NC1zcGVjaWZpYy4gIEJlY2F1c2UKPiBmb3IgZS5nLiBORlMgc3VjaCB0aGluZyBh cyAtbyBmc2lkPS4uLiwgd2hpbGUgY2VydGFpbmx5IGEgcGFydCBvZgo+IG9wdGlvbnMsIGhhcyBh IHZlcnkgZGlmZmVyZW50IG1lYW5pbmcgLSBpdCdzICJ1c2UgYSBzZXBhcmF0ZSBmcyBpbnN0YW5j ZSIKPiAoYW5kIGxldCB0aGUgc2VydmVyIGRlYWwgd2l0aCBjb2hlcmVuY3kgaXNzdWVzIG9uIGl0 cyBlbmQpLgo+Cj4gRGVjaXNpb24gdG8gdXNlIHNnZXQoKSAoYW5kIHRoZSB3YXkgaXQncyB1c2Vk KSBpcyB1cCB0byBmaWxlc3lzdGVtLgo+IFdlICpjYW4ndCogbGlmdCB0aGF0IGludG8gc3lzY2Fs bC4gIE5vdCB3aXRob3V0IGJyZWFraW5nIHRoZSBmdWNrIG91dAo+IG9mIGV4aXN0aW5nIGJlaGF2 aW91ci4KCkkgaGF2ZSBuZXZlciBwcm9wb3NlZCB0aGF0LiAgU2VlIGFib3ZlLiAgSSBtYXkgaGF2 ZSB0YWxrZWQgaW4gdGVybXMKb2Ygd2hhdCBzZ2V0IGRvZXMgYW5kIG11ZGRpZWQgdGhlIHdhdGVy cy4gIElmIHNvIEkgYXBvbG9naXplLgoKQWxsIEkgcHJvcG9zZWQgd2FzIHRoYXQgd2UgZGlzdGlu Z3Vpc2ggYmV0d2VlbiBhIGZpcnN0IG1vdW50IGFuZCBhbgphZGRpdGlvbmFsIG1vdW50IHNvIHRo YXQgdXNlcnNwYWNlIGtub3dzIHRoZSBvcHRpb25zIHdpbGwgYmUgaWdub3JlZC4KClRoZW4gdGhl IGNvZGUgdG8gcmVwbGljYXRlIHRoZSBjdXJyZW50IGJlaGF2aW9yIGNhbiBsb29rIGxpa2U6CgoJ ZmQgPSBmc29wZW4oLi4uKTsKCWZzY29uZmlnKGZkLCAuLi4pOwoJZnNjb25maWcoZmQsIC4uLik7 Cglmc2NvbmZpZyhmZCwgLi4uKTsKCWZzY29uZmlnKGZkLCAuLi4pOwoJZnNjb25maWcoZmQsIC4u Lik7Cglmc2NvbmZpZyhmZCwgLi4uKTsKCWZzY29uZmlnKGZkLCAuLi4pOwoJCglpZiAoZnNjb25m aWcoZmQsIENNRF9DUkVBVEUpID09IC1FQlVTWSkgewogICAgICAgIAlmc2NvbmZpZyhmZCwgQ01E X0JJTkRfSU5URVJOQUwpOwogICAgICAgIH0KCkJ1dCB1c2Vyc3BhY2Ugd291bGQgdGhlbiBiZSBm cmVlIHRvIGlzc3VlIGEgd2FybmluZyBvciBkbyBzb21ldGhpbmcKZWxzZSBpZiBDTURfQ1JFQVRF IHJldHVybnMgLUVCVVNZLgoKSSBkb24ndCBrbm93IGhvdyB0aGUgYWJvdmUgd291bmQgdXAgYmVp bmcgY29uc3RydWVkIGFzIGFza2luZyB0aGF0IHRoZQpjb2RlIGNhbGwgc2dldCBkaXJlY3RseSBi dXQgdGhhdCBpcyB3aGF0IGhhcyBoYXBwZW5lZC4KCj4gSGF2aW5nIHNvbWV0aGluZyBsaWtlIGEg c2Vjb25kIGNhbGxiYWNrIGZvciBtb3VudF9iZGV2KCkgdGhhdCB3b3VsZAo+IGJlIGNhbGxlZCB3 aGVuIHdlJ2QgZm91bmQgYW4gZXhpc3RpbmcgaW5zdGFuY2UgZm9yIHRoZSBzYW1lIGJsb2NrCj4g ZGV2aWNlPyAgU3VyZSwgbm8gcHJvYmxlbS4gIEhhdmluZyBhIGhlbHBlciBmb3IgZG9pbmcgc3Vj aCBjb21wYXJpc29uCj4gdGhhdCB3b3VsZCB3b3JrIGluIGVub3VnaCBjYXNlcyB0byBib3RoZXIs IHNvIHRoYXQgZGlmZmVyZW50IGZzCj4gY291bGQgYXZvaWQgYm9pbGVycGxhdGUgaW4gdGhhdCBj YWxsYmFjaz8gIEFnYWluLCBtb3JlIHBvd2VyIHRvIHlvdS4KCk5vcm1hbCBmb3JtcyBldGMuICBJ ZiB3ZSB3YW50IHRvIGRvIHRoYXQgaXQganVzdCByZXF1aXJlcyBhIHdlZSBiaXQgb2YKZGlzY2lw bGluZS4gIEFuZCBpZiBhbGwgb2YgdGhlIG9wdGlvbiBwYXJzaW5nIGlzIGJlaW5nIHJld3JpdHRl biBhbmQKcmV0ZXN0ZWQgYW55d2F5IEkgZG9uJ3Qgc2VlIHdoeSB3ZSBjYW4ndCBkbyBzb21ldGhp bmcgbGlrZSB0aGF0IGFzIHdlbGwuClNvIGl0IGRvZXMgbm90IHNvdW5kIHVucmVhc29uYWJsZSB0 byBtZS4KCkl0IGRvZXMgc291bmQgbGlrZSBtb3JlIHdvcmsgdGhhbiB3aGF0IEkgd2FzIHByb3Bv c2luZy4KCkVyaWMKCi0tIApBcHBBcm1vciBtYWlsaW5nIGxpc3QKQXBwQXJtb3JAbGlzdHMudWJ1 bnR1LmNvbQpNb2RpZnkgc2V0dGluZ3Mgb3IgdW5zdWJzY3JpYmUgYXQ6IGh0dHBzOi8vbGlzdHMu dWJ1bnR1LmNvbS9tYWlsbWFuL2xpc3RpbmZvL2FwcGFybW9yCg== From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm@xmission.com (Eric W. Biederman) Date: Fri, 10 Aug 2018 20:05:44 -0500 Subject: BUG: Mount ignores mount options In-Reply-To: <20180810151606.GA6515@ZenIV.linux.org.uk> (Al Viro's message of "Fri, 10 Aug 2018 16:16:06 +0100") References: <153313703562.13253.5766498657900728120.stgit@warthog.procyon.org.uk> <87d0uqpba5.fsf@xmission.com> <20180810151606.GA6515@ZenIV.linux.org.uk> Message-ID: <87pnypiufr.fsf@xmission.com> To: linux-security-module@vger.kernel.org List-Id: linux-security-module.vger.kernel.org Al Viro writes: > On Fri, Aug 10, 2018 at 09:05:22AM -0500, Eric W. Biederman wrote: >> >> There is a serious problem with mount options today that fsopen does not >> address. The problem is that mount options are ignored for block based >> filesystems, and any other type of filesystem that follows the same >> pattern. >> >> The script below demonstrates this bug. Showing this bug can cause the >> ext4 "acl" "quota" and "user_xattr" options to be silently ignored. >> >> fsopen has my nack until it addresses this issue. >> >> I don't know if we can fix this in the context of sys_mount. But we if >> we are redoing the option parsing of how we mount filesystems this needs >> to be fixed before we start worrying about bug compatibility. >> >> Hopefully this report is simple and clear enough that we can at least >> agree on the problem. > > Sure, it is simple. So's the solution: MNT_USERNS_SPECIAL_SEMANTICS that > would get passed to filesystems, so that Eric would be able to implement > his mount(2)-incompatible behaviour at leisure, without worrying about > compatibility issues. > > Does that address your complaint? Absolutely not. My complaint is that the current implemented behavior of practically every filesystem in the kernel, is that it will ignore mount options when mounted a second time. It is not some weird special case. It is not some container thing. It is that the behavior of mount(2) with practically every filesystem type when that filesystem is already mounted somewhere else behaves in ways no one would expect. With the new fsopen api the easy thing to do is simply have CMD_CREATE CMD_BIND_INTERNAL and be done with it. CMD_CREATE guarantee that a new superblock is created. CMD_BIND_INTERNAL would only work with an existing superblock. Then root would at least know that he is connecting to an already mounted filesystem and could look at the options etc and fail if he didn't like what he saw. No surprises, no muss, no fuss simple. But I have been told the simple solution above is somehow unacceptable. And an option to compare the mount options and see if they are the same was offered. That would will work to. I just care that we define the semantics in such a way that it is not easy for root to get confused and do something stupid that will bite later, and that we build the infrastructure so that all filesystems can implement it easily. So yes this is 100% a question about how filesystems should behave with respect to their option when mounted for a second time. That is what Dave Howells patchset is addressing. > Because one thing we are not going to do is changing mount(2) > behaviour. I have not asked for that. I have asked that we get it right for fsopen. > Reason: userland-visible behaviour of hell knows how many local scripts. > Another thing that > is flat-out not feasible is some kind of blanket "compare options" > stuff; it *can* be done as helpers to be used by filesystem when > it sees that new flag, but it's simply not going to work at the > fs-independent level. > > Trivial example with the same ext4: > mount /dev/sda1 /mnt/a -o bsddf vs. mount /dev/sda1 /mnt/b > ext4 can tell that these are the same. syscall itself has no > clue. What's more, it's not just explicitly spelled default > options - it's the stuff that has more than one form. And while > we are at it, the things like two NFS mounts of different trees > from the same server; they might or might not get the same superblock. > Depending upon the options. > > Convenience helper that would allow ext4 to compare options and reject > the incompatible mount? Not sure how much ext4-specific knowledge > would have to go in it, but if you can come up with one - more power > to you. But the decision to use it *must* be ext4-specific. Because > for e.g. NFS such thing as -o fsid=..., while certainly a part of > options, has a very different meaning - it's "use a separate fs instance" > (and let the server deal with coherency issues on its end). > > Decision to use sget() (and the way it's used) is up to filesystem. > We *can't* lift that into syscall. Not without breaking the fuck out > of existing behaviour. I have never proposed that. See above. I may have talked in terms of what sget does and muddied the waters. If so I apologize. All I proposed was that we distinguish between a first mount and an additional mount so that userspace knows the options will be ignored. Then the code to replicate the current behavior can look like: fd = fsopen(...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); if (fsconfig(fd, CMD_CREATE) == -EBUSY) { fsconfig(fd, CMD_BIND_INTERNAL); } But userspace would then be free to issue a warning or do something else if CMD_CREATE returns -EBUSY. I don't know how the above wound up being construed as asking that the code call sget directly but that is what has happened. > Having something like a second callback for mount_bdev() that would > be called when we'd found an existing instance for the same block > device? Sure, no problem. Having a helper for doing such comparison > that would work in enough cases to bother, so that different fs > could avoid boilerplate in that callback? Again, more power to you. Normal forms etc. If we want to do that it just requires a wee bit of discipline. And if all of the option parsing is being rewritten and retested anyway I don't see why we can't do something like that as well. So it does not sound unreasonable to me. It does sound like more work than what I was proposing. Eric From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm@xmission.com (Eric W. Biederman) To: Al Viro Cc: David Howells , John Johansen , Tejun Heo , selinux@tycho.nsa.gov, Paul Moore , Li Zefan , linux-api@vger.kernel.org, apparmor@lists.ubuntu.com, Casey Schaufler , fenghua.yu@intel.com, Greg Kroah-Hartman , Eric Biggers , linux-security-module@vger.kernel.org, Tetsuo Handa , Johannes Weiner , Stephen Smalley , tomoyo-dev-en@lists.sourceforge.jp, cgroups@vger.kernel.org, torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, "Theodore Y. Ts'o" , Miklos Szeredi References: <153313703562.13253.5766498657900728120.stgit@warthog.procyon.org.uk> <87d0uqpba5.fsf@xmission.com> <20180810151606.GA6515@ZenIV.linux.org.uk> Date: Fri, 10 Aug 2018 20:05:44 -0500 In-Reply-To: <20180810151606.GA6515@ZenIV.linux.org.uk> (Al Viro's message of "Fri, 10 Aug 2018 16:16:06 +0100") Message-ID: <87pnypiufr.fsf@xmission.com> MIME-Version: 1.0 Content-Type: text/plain Subject: Re: BUG: Mount ignores mount options List-Id: "Security-Enhanced Linux \(SELinux\) mailing list" List-Post: List-Help: Al Viro writes: > On Fri, Aug 10, 2018 at 09:05:22AM -0500, Eric W. Biederman wrote: >> >> There is a serious problem with mount options today that fsopen does not >> address. The problem is that mount options are ignored for block based >> filesystems, and any other type of filesystem that follows the same >> pattern. >> >> The script below demonstrates this bug. Showing this bug can cause the >> ext4 "acl" "quota" and "user_xattr" options to be silently ignored. >> >> fsopen has my nack until it addresses this issue. >> >> I don't know if we can fix this in the context of sys_mount. But we if >> we are redoing the option parsing of how we mount filesystems this needs >> to be fixed before we start worrying about bug compatibility. >> >> Hopefully this report is simple and clear enough that we can at least >> agree on the problem. > > Sure, it is simple. So's the solution: MNT_USERNS_SPECIAL_SEMANTICS that > would get passed to filesystems, so that Eric would be able to implement > his mount(2)-incompatible behaviour at leisure, without worrying about > compatibility issues. > > Does that address your complaint? Absolutely not. My complaint is that the current implemented behavior of practically every filesystem in the kernel, is that it will ignore mount options when mounted a second time. It is not some weird special case. It is not some container thing. It is that the behavior of mount(2) with practically every filesystem type when that filesystem is already mounted somewhere else behaves in ways no one would expect. With the new fsopen api the easy thing to do is simply have CMD_CREATE CMD_BIND_INTERNAL and be done with it. CMD_CREATE guarantee that a new superblock is created. CMD_BIND_INTERNAL would only work with an existing superblock. Then root would at least know that he is connecting to an already mounted filesystem and could look at the options etc and fail if he didn't like what he saw. No surprises, no muss, no fuss simple. But I have been told the simple solution above is somehow unacceptable. And an option to compare the mount options and see if they are the same was offered. That would will work to. I just care that we define the semantics in such a way that it is not easy for root to get confused and do something stupid that will bite later, and that we build the infrastructure so that all filesystems can implement it easily. So yes this is 100% a question about how filesystems should behave with respect to their option when mounted for a second time. That is what Dave Howells patchset is addressing. > Because one thing we are not going to do is changing mount(2) > behaviour. I have not asked for that. I have asked that we get it right for fsopen. > Reason: userland-visible behaviour of hell knows how many local scripts. > Another thing that > is flat-out not feasible is some kind of blanket "compare options" > stuff; it *can* be done as helpers to be used by filesystem when > it sees that new flag, but it's simply not going to work at the > fs-independent level. > > Trivial example with the same ext4: > mount /dev/sda1 /mnt/a -o bsddf vs. mount /dev/sda1 /mnt/b > ext4 can tell that these are the same. syscall itself has no > clue. What's more, it's not just explicitly spelled default > options - it's the stuff that has more than one form. And while > we are at it, the things like two NFS mounts of different trees > from the same server; they might or might not get the same superblock. > Depending upon the options. > > Convenience helper that would allow ext4 to compare options and reject > the incompatible mount? Not sure how much ext4-specific knowledge > would have to go in it, but if you can come up with one - more power > to you. But the decision to use it *must* be ext4-specific. Because > for e.g. NFS such thing as -o fsid=..., while certainly a part of > options, has a very different meaning - it's "use a separate fs instance" > (and let the server deal with coherency issues on its end). > > Decision to use sget() (and the way it's used) is up to filesystem. > We *can't* lift that into syscall. Not without breaking the fuck out > of existing behaviour. I have never proposed that. See above. I may have talked in terms of what sget does and muddied the waters. If so I apologize. All I proposed was that we distinguish between a first mount and an additional mount so that userspace knows the options will be ignored. Then the code to replicate the current behavior can look like: fd = fsopen(...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); fsconfig(fd, ...); if (fsconfig(fd, CMD_CREATE) == -EBUSY) { fsconfig(fd, CMD_BIND_INTERNAL); } But userspace would then be free to issue a warning or do something else if CMD_CREATE returns -EBUSY. I don't know how the above wound up being construed as asking that the code call sget directly but that is what has happened. > Having something like a second callback for mount_bdev() that would > be called when we'd found an existing instance for the same block > device? Sure, no problem. Having a helper for doing such comparison > that would work in enough cases to bother, so that different fs > could avoid boilerplate in that callback? Again, more power to you. Normal forms etc. If we want to do that it just requires a wee bit of discipline. And if all of the option parsing is being rewritten and retested anyway I don't see why we can't do something like that as well. So it does not sound unreasonable to me. It does sound like more work than what I was proposing. Eric