From: ebiederm@xmission.com (Eric W. Biederman)
To: James Bottomley
Cc: Josh Triplett, Christoph Hellwig, Amir Goldstein, Djalal Harouni, Chris Mason, Theodore Tso, Andy Lutomirski, Seth Forshee, linux-fsdevel, linux-kernel, LSM List, Dongsu Park, David Herrmann, Miklos Szeredi, Alban Crequy, Al Viro, "Serge E. Hallyn", Phil Estes
Subject: Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount
Date: Mon, 13 Feb 2017 23:15:48 +1300
Message-ID: <87poim75wr.fsf@xmission.com>
In-Reply-To: <1486654467.2616.8.camel@HansenPartnership.com> (James Bottomley's message of "Thu, 09 Feb 2017 07:34:27 -0800")
X-Mailing-List: linux-kernel@vger.kernel.org

James Bottomley writes:

> On Thu, 2017-02-09 at 02:36 -0800, Josh Triplett wrote:
>> On Wed, Feb 08, 2017 at 07:22:45AM -0800, James Bottomley wrote:
>> > On Tue, 2017-02-07 at 17:54 -0800, Josh Triplett wrote:
>> > > On Tue, Feb 07, 2017 at 11:49:33AM -0800, Christoph Hellwig
>> > > wrote:
>> > > > On Tue, Feb 07, 2017 at 11:02:03AM -0800, James Bottomley
>> > > > wrote:
>> > > > > > Another option would be to require something like a
>> > > > > > project as used for project quotas as the root. This would
>> > > > > > also be convenient as it could store the used remapping
>> > > > > > tables.
>> > > > >
>> > > > > So this would be like the current project quota except set on
>> > > > > a subtree? I could see it being done that way but I don't
>> > > > > see what advantage it has over using flags in the subtree
>> > > > > itself (the mapping is known based on the mount namespace, so
>> > > > > there's really only a single bit of information to store).
>> > > >
>> > > > projects (which are the underlying concept for project quotas)
>> > > > are per-subtree in practice - the flag is set on an inode and
>> > > > then all directories and files underneath inherit the project
>> > > > ID, hardlinking outside a project is prohibited.
>> > >
>> > > I'm interested in having a VFS-level way to do more than just a
>> > > shift; I'd like to be able to arbitrarily remap IDs between
>> > > what's on disk and the system IDs.
>> >
>> > OK, so the shift is effectively an arbitrary remap because it
>> > allows multiple ranges to be mapped (although the userns currently
>> > imposes a maximum number of five extents but that limit is a bit
>> > arbitrary just to try to limit the amount of space the
>> > parametrisation takes). See
>> > kernel/user_namespace.c:map_id_up/down()
>> >
>> > > If we're talking about developing a VFS-level solution for
>> > > this, I'd like to avoid limiting it to just a shift. (A
>> > > shift/range would definitely be the simplest solution for many
>> > > common container cases, but not all.)
>> >
>> > I assume the above satisfies you on this point, but raises the
>> > question: do you want an arbitrary shift not parametrised by a user
>> > namespace? If so how many such shifts do you want ... giving some
>> > details of the use case would be helpful.
>>
>> The limit of five extents means this may not work in the most general
>> case, no.
>
> That's not an API limit, so it can be changed if there's a need. The
> problem was merely how to parametrise a mapping without taking too much
> space.
>
>> One use case: given an on-disk filesystem, its name-to-number
>> mapping, and your host name-to-number mapping, mount the filesystem
>> with all the UIDs bidirectionally mapped to those on your host
>> system.
>
> This is pretty much what the s_user_ns does.
>
>> Another use case: given an on-disk filesystem with potentially
>> arbitrary UIDs (not necessarily in a clean contiguous block), and a
>> pile of unprivileged UIDs, mount the filesystem such that every
>> on-disk UID gets a unique unprivileged UID.
>
> So is this. Basically anything that begins by mounting gets a super
> block and can use the s_user_ns to map from the filesystem view to the
> kernel view of ids. Apart from greater sophistication in the
> parametrisation, it sounds like we have all the machinery you need.
> I'm sure the containers people will consider reasonable patches to
> change this.

Yes. And to be clear, we have all of that merged now, and it is mostly
present and hooked up in all filesystems without any shiftfs-like
changes needed.

To use this with a filesystem, a last pass is needed to verify that the
cases where something does not map are handled cleanly.

Eric
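
For anyone following along who has not looked at the extent-based
mapping discussed above, here is a rough userspace model of it in
Python. It is only a sketch and not the kernel code: the field names
(first, lower_first, count) follow struct uid_gid_extent as far as I
recall, the limit of five is the figure quoted in this thread, and
extents_for() is a hypothetical helper invented here to illustrate the
"arbitrary on-disk UIDs" use case. Unmapped IDs come back as None,
which is the case that needs to be handled cleanly.

```python
from typing import Iterable, List, NamedTuple, Optional

MAX_EXTENTS = 5  # limit quoted in the thread; an assumption, not an API constant


class Extent(NamedTuple):
    first: int        # first ID in the filesystem/namespace view
    lower_first: int  # first ID in the kernel (host) view
    count: int        # number of consecutive IDs covered


def map_id_down(extents: List[Extent], id_: int) -> Optional[int]:
    """Filesystem/namespace view -> kernel view; None if nothing maps."""
    for e in extents:
        if e.first <= id_ < e.first + e.count:
            return e.lower_first + (id_ - e.first)
    return None


def map_id_up(extents: List[Extent], id_: int) -> Optional[int]:
    """Kernel view -> filesystem/namespace view; None if nothing maps."""
    for e in extents:
        if e.lower_first <= id_ < e.lower_first + e.count:
            return e.first + (id_ - e.lower_first)
    return None


def extents_for(disk_ids: Iterable[int], base: int) -> List[Extent]:
    """Hypothetical helper: give every on-disk ID a unique ID from `base`
    upwards, merging consecutive on-disk IDs into a single extent."""
    extents: List[Extent] = []
    next_lower = base
    for uid in sorted(set(disk_ids)):
        if extents and uid == extents[-1].first + extents[-1].count:
            last = extents[-1]
            extents[-1] = last._replace(count=last.count + 1)
        else:
            extents.append(Extent(uid, next_lower, 1))
        next_lower += 1
    return extents


if __name__ == "__main__":
    ext = extents_for([0, 1, 2, 1000, 1001, 65534], base=100000)
    print(ext)
    print(map_id_down(ext, 1001), map_id_up(ext, 100004), map_id_down(ext, 500))
    scattered = extents_for(range(0, 12, 2), base=100000)
    print(len(scattered), "extents; over the limit:", len(scattered) > MAX_EXTENTS)
```

The second call at the bottom shows where the limit bites: six on-disk
IDs with gaps between them need six extents, so a filesystem whose
UIDs are not in contiguous blocks can exceed a small fixed number of
extents quickly.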