From: Jeff Liu <jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Cc: jack-AlSwsSmVLrQ@public.gmane.org,
	tytso-3s7WtUTddSA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org,
	hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
	bpm-sJ/iWh9BUns@public.gmane.org,
	christopher.jones-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	tm-d1IQDZat3X0@public.gmane.org,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	chris.mason-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	tinguely-sJ/iWh9BUns@public.gmane.org
Subject: Re: container disk quota
Date: Thu, 31 May 2012 20:31:42 +0800
Message-ID: <4FC764AE.4070404@oracle.com>
In-Reply-To: <4FC731C1.5000903-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

Hi Glauber,

Thanks for your comments!

On 05/31/2012 04:54 PM, Glauber Costa wrote:

> On 05/30/2012 06:58 PM, jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org wrote:
>> Hello All,
>>
>> According to glauber's comments regarding container disk quota, it
>> should be binded to mount
>> namespace rather than cgroup.
>>
>> Per my try out, it works just fine by combining with userland quota
>> utilitly in this way.
> that's great.
> 
> I'll take a look at the patches.
> 
> 
>>
>> * Modify quotactl(2) to examine if the caller is invoked inside
>> container.
>>    implemented by checking the quota device name("rootfs" for lxc
>> guest) or current pid namespace
>>    is not the initial one, then do mount namespace quotactl if
>> required, or goto
>>    the normal quotactl procedure.
> 
> I dislike the use of "lxc" name. There is nothing lxc-specific in this,
> this is namespace-specific. lxc is just one of the container solutions
> out there, so let's keep it generic.

I think I should drop everything LXC-specific and just treat this as a
new quota feature tied to the namespace.

>>
>> * Also, I have not handle a couple of things for now.
>>    . I think the container quota should be isolated to Jan's fs/quota/
>> directory.
>>    . There are a dozens of helper routines at general quota, e.g,
>>      struct if_dqblk<->  struct fs_disk_quota converts.
>>      dquot space and inodes bill up.
>>      They can be refactored as shared routines to some extents.
>>    . quotastats(8) is not teached to aware container for now.
>>
>> Changes in quota userland utility:
>> * Introduce a new quota format string "lxc" to all quota control
>> utility, to
>>    let each utility know that the user want to run container quota
>> control. e.g:
>>    quotacheck -cvugm -F "lxc" /
>>    quotaon -u -F "lxc" /
>>    ....
>>
>> * Currently, I manually created the underlying device(by editing cgroup
>>    device access list and running mknod /dev/sdaX x x) for the rootfs
>>    inside containers to let the cache mount points routine pass for
>>    executing quotacheck against the "/" directory.  Actually, it can be
>>    omitted here.
>>
>> * Add a new quotaio_lxc.c[.h] for container quota IO, it basically
>> same to
>>    VFS quotaio logic, I just hope to isolate container stuff here.
>>
>> Issues:
>> * How to detect quotactl(2) is launched from container in a reasonable
>> way.
> 
> It's a system call. It is always called by a process. The process
> belongs to a namespace. What else is needed?

nothing now. :)

> 
>> * Do we need to let container quota works for cgroup combine with
>> unshare(1)?
>>    Now the patchset is mainly works for lxc guest.  IMHO, it can be
>> used outside
>>    guest if the user desired.  In this case, the quota limits can take
>> effort
>>    among different underlying file systems if they have exported quota
>> billing
>>    routines.
> 
> I still don't understand what is the business of cgroups here. If you
> are attaching it to mount namespace, you can always infer the context
> from the calling process. I still need to look at your patches, but I
> believe that dropping the "feature" of manipulating this from outside of
> the container will save you a lot of trouble.

Yup, I'll just treat it as namespace-specific; there is no need to
consider the cgroup interface.

> 
> Please note that a process can temporarily join a namespace with
> setns(). So you can have a *utility* that does it from the outer world,
> but the kernel has no business with that. As far as we're concerned, I
> believe that you should always get your context from the current
> namespace, and forbid any usage from outside.

I'll investigate that further.
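
As a starting point, a rough sketch of what such an outside utility could do
before issuing quotactl(2): open the target's mount namespace and join it
with setns(). The helper name join_mount_ns() is made up for illustration,
and the sketch assumes /proc/<pid>/ns/mnt is available (Linux 3.8 and later)
and that the caller is single-threaded and privileged enough for setns() to
succeed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Join the mount namespace of `pid`.
 * Returns 0 on success, -1 on failure with errno set by open() or setns().
 */
int join_mount_ns(pid_t pid)
{
	char path[64];
	int fd, len, ret, saved_errno;

	len = snprintf(path, sizeof(path), "/proc/%ld/ns/mnt", (long)pid);
	assert(len > 0 && (size_t)len < sizeof(path));

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* CLONE_NEWNS restricts the call to mount namespace descriptors. */
	ret = setns(fd, CLONE_NEWNS);

	/* Keep the errno from setns() even if close() changes it. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

After a successful join_mount_ns(), any quotactl(2) call made by the utility
is seen by the kernel as coming from inside that namespace, so the kernel
side would not need a separate "from outside" code path.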

> 
>> * The hash table list defines(hash table size)for dquot caching for
>> each type is
>>    referred to kernel/user.c, maybe its better to define an array
>> separatly for
>>    performance optimizations.  Of course, that's all depending on my
>> current
>>    implementation is on the right road. :)
>>
>> * Container quota statistics, should them be calculated and exposed to
>> /proc/fs/quota?  If the underlying file system also enabled with
>> quotas, they will be
>>    mixed up, so how about add a new proc file like "ns_quota" there?
> No, this should be transferred to the process-specific proc and them
> symlinked. Take a look at "/proc/self".
> 
>>
>> * Memory shrinks acquired from kswap.
>>    As all dquot are cached in memory, and if the user executing
>> quotaoff, maybe
>>    I need to handle quota disable but still be kept at memory.
>>    Also, add another routine to disable and remove all quotas from
>> memory to
>>    save memory directly.
> 
> I didn't read your patches yet, so take it with a grain of salt here.
> But I don't understand why you make this distinction of keeping it in
> memory only.
> 
> You could keep quota files outside of the container, and then bind mount
> them to the current location in the setup-phase.

I originally tried to keep the quota files outside the container, but I
changed my mind afterwards, for three reasons at the time:

1) The quota files could be overwritten if the container's rootfs is
located at the root directory of a storage partition and that partition
is mounted with quota limits enabled.

2) To deal with quota files, it looks like I would have to tweak
quota_read()/quota_write(), which on ext4 correspond to
ext4_quota_read()/ext4_quota_write().

3) As a mount namespace can be created and destroyed at any time, it has
no way to remember which inodes are quota files. However, as far as I
remember, the quota tools need to restore a few things from those files,
though I cannot recall all of them right now. :( I'll check to refresh my
memory on this point.

Sure, given that we can bind mount them in the setup phase, the first
concern can be ignored.


Thanks,
-Jeff
