All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Liu <jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Cc: jack-AlSwsSmVLrQ@public.gmane.org,
	tytso-3s7WtUTddSA@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org,
	hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
	bpm-sJ/iWh9BUns@public.gmane.org,
	christopher.jones-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	tm-d1IQDZat3X0@public.gmane.org,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	chris.mason-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	tinguely-sJ/iWh9BUns@public.gmane.org
Subject: Re: container disk quota
Date: Thu, 31 May 2012 20:31:42 +0800	[thread overview]
Message-ID: <4FC764AE.4070404@oracle.com> (raw)
In-Reply-To: <4FC731C1.5000903-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

Hi Glauber,

Thanks for you comments!

On 05/31/2012 04:54 PM, Glauber Costa wrote:

> On 05/30/2012 06:58 PM, jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org wrote:
>> Hello All,
>>
>> According to glauber's comments regarding container disk quota, it
>> should be binded to mount
>> namespace rather than cgroup.
>>
>> Per my try out, it works just fine by combining with userland quota
>> utilitly in this way.
> that's great.
> 
> I'll take a look at the patches.
> 
> 
>>
>> * Modify quotactl(2) to examine if the caller is invoked inside
>> container.
>>    implemented by checking the quota device name("rootfs" for lxc
>> guest) or current pid namespace
>>    is not the initial one, then do mount namespace quotactl if
>> required, or goto
>>    the normal quotactl procedure.
> 
> I dislike the use of "lxc" name. There is nothing lxc-specific in this,
> this is namespace-specific. lxc is just one of the container solutions
> out there, so let's keep it generic.

I think I should forget all things regarding LXC, just treat it as a new
quota feature with regard to namespace.

>>
>> * Also, I have not handle a couple of things for now.
>>    . I think the container quota should be isolated to Jan's fs/quota/
>> directory.
>>    . There are a dozens of helper routines at general quota, e.g,
>>      struct if_dqblk<->  struct fs_disk_quota converts.
>>      dquot space and inodes bill up.
>>      They can be refactored as shared routines to some extents.
>>    . quotastats(8) is not teached to aware container for now.
>>
>> Changes in quota userland utility:
>> * Introduce a new quota format string "lxc" to all quota control
>> utility, to
>>    let each utility know that the user want to run container quota
>> control. e.g:
>>    quotacheck -cvugm -F "lxc" /
>>    quotaon -u -F "lxc" /
>>    ....
>>
>> * Currently, I manually created the underlying device(by editing cgroup
>>    device access list and running mknod /dev/sdaX x x) for the rootfs
>>    inside containers to let the cache mount points routine pass for
>>    executing quotacheck against the "/" directory.  Actually, it can be
>>    omitted here.
>>
>> * Add a new quotaio_lxc.c[.h] for container quota IO, it basically
>> same to
>>    VFS quotaio logic, I just hope to isolate container stuff here.
>>
>> Issues:
>> * How to detect quotactl(2) is launched from container in a reasonable
>> way.
> 
> It's a system call. It is always called by a process. The process
> belongs to a namespace. What else is needed?

nothing now. :)

> 
>> * Do we need to let container quota works for cgroup combine with
>> unshare(1)?
>>    Now the patchset is mainly works for lxc guest.  IMHO, it can be
>> used outside
>>    guest if the user desired.  In this case, the quota limits can take
>> effort
>>    among different underlying file systems if they have exported quota
>> billing
>>    routines.
> 
> I still don't understand what is the business of cgroups here. If you
> are attaching it to mount namespace, you can always infer the context
> from the calling process. I still need to look at your patches, but I
> believe that dropping the "feature" of manipulating this from outside of
> the container will save you a lot of trouble.

Yup, just treat it to be namespace specific, there is nothing need to
consider with cgroup interface.

> 
> Please note that a process can temporarily join a namespace with
> setns(). So you can have a *utility* that does it from the outer world,
> but the kernel has no business with that. As far as we're concerned, I
> believe that you should always get your context from the current
> namespace, and forbid any usage from outside.

I'll more investigation for that.

> 
>> * The hash table list defines(hash table size)for dquot caching for
>> each type is
>>    referred to kernel/user.c, maybe its better to define an array
>> separatly for
>>    performance optimizations.  Of course, that's all depending on my
>> current
>>    implementation is on the right road. :)
>>
>> * Container quota statistics, should them be calculated and exposed to
>> /proc/fs/quota?  If the underlying file system also enabled with
>> quotas, they will be
>>    mixed up, so how about add a new proc file like "ns_quota" there?
> No, this should be transferred to the process-specific proc and them
> symlinked. Take a look at "/proc/self".
> 
>>
>> * Memory shrinks acquired from kswap.
>>    As all dquot are cached in memory, and if the user executing
>> quotaoff, maybe
>>    I need to handle quota disable but still be kept at memory.
>>    Also, add another routine to disable and remove all quotas from
>> memory to
>>    save memory directly.
> 
> I didn't read your patches yet, so take it with a grain of salt here.
> But I don't understand why you make this distinction of keeping it in
> memory only.
> 
> You could keep quota files outside of the container, and then bind mount
> them to the current location in the setup-phase.

I have tried to keep quota files outsides originally, but I changed my
thoughts afterwards, because of three reasons at that time:

1) The quota files could be overwrote if the container's rootfs is
located at the root directory of a storage partition, and this partition
is mounted with quota limits enabled.

2) To deal with quota files, looks I have to tweak up
quota_read()/quota_write(), assuming ext4, which are corresponding to
ext4_quota_read()/ext4_quota_write().

3) As mount namespace could be created and destroyed at any stage,
it has no memory to recall which inodes are quota files. however, quota
tools need to restore a few things from those files I remember.
but can not recalled all of them for now. :( I'll do some check up to
refresh my head in this point.

Sure, considering that we can bind mount them at setup phase, the first
concern could be ignored.


Thanks,
-Jeff

  parent reply	other threads:[~2012-05-31 12:31 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-30 14:58 container disk quota jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
     [not found] ` <1338389946-13711-1-git-send-email-jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-30 14:58   ` [PATCH 01/12] container quota: add kernel configuration for container quota jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
     [not found]     ` <1338389946-13711-2-git-send-email-jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31  9:00       ` Glauber Costa
2012-05-31  9:00         ` Glauber Costa
     [not found]         ` <4FC73318.3020709-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31  9:01           ` Glauber Costa
2012-05-31  9:01         ` Glauber Costa
2012-05-31  9:01           ` Glauber Costa
2012-05-31  9:00       ` Glauber Costa
2012-05-30 14:58   ` [PATCH 02/12] container quota: lock/unlock mount namespace when performing quotactl jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
     [not found]     ` <1338389946-13711-3-git-send-email-jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31  9:04       ` Glauber Costa
2012-05-31  9:04         ` Glauber Costa
     [not found]         ` <4FC73418.1040402-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31 12:40           ` Jeff Liu
2012-05-30 14:58   ` [PATCH 03/12] container quota: introduce container quota format identifier jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:58   ` [PATCH 04/12] container quota: introduce container disk quota data header file jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-31  9:10     ` Glauber Costa
2012-05-31  9:10       ` Glauber Costa
     [not found]       ` <4FC735A2.4040400-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31 12:53         ` Jeff Liu
     [not found]     ` <1338389946-13711-5-git-send-email-jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31  9:10       ` Glauber Costa
2012-05-30 14:58   ` [PATCH 05/12] container quota: bind disk quota stuff on mount namespace jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 06/12] container quota: implementations and header for block/inode bill up jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 07/12] container quota: add quota control source file jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 08/12] container quota: let quotactl(2) works for container jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 09/12] container quota: add container disk quota entry to Makefile jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 10/12] container quota: bill container inodes alloc/free on ext4 jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
     [not found]     ` <1338389946-13711-11-git-send-email-jeff.liu-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-30 15:55       ` Ted Ts'o
     [not found]         ` <20120530155543.GB13236-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-05-31  1:43           ` Jeff Liu
     [not found]             ` <4FC6CCB6.4090908-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31  1:54               ` Ted Ts'o
     [not found]                 ` <20120531015453.GA6759-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2012-05-31  2:37                   ` Jeff Liu
     [not found]                     ` <4FC6D94D.6040106-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31  3:24                       ` Jeff Liu
2012-05-31  9:15       ` Glauber Costa
2012-05-31  9:15         ` Glauber Costa
     [not found]         ` <4FC736AD.2070404-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31 12:58           ` Jeff Liu
     [not found]             ` <4FC76B0D.6020804-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-05-31 13:14               ` Glauber Costa
2012-05-31 13:14               ` Glauber Costa
2012-05-31 13:14                 ` Glauber Costa
     [not found]                 ` <4FC76ECA.3070301-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31 13:43                   ` Jeff Liu
2012-06-05  0:03                   ` Dave Chinner
2012-06-05  0:03                 ` Dave Chinner
2012-05-31  9:15       ` Glauber Costa
2012-05-30 14:59   ` [PATCH 11/11] container quota: bill container disk blocks " jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-30 14:59   ` [PATCH 12/12] container quota: init/destroy container dqinfo on mount namespace jeff.liu-QHcLZuEGTsvQT0dZR+AlfA
2012-05-31  8:54   ` container disk quota Glauber Costa
2012-05-31  8:54     ` Glauber Costa
     [not found]     ` <4FC731C1.5000903-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31  9:19       ` Glauber Costa
2012-05-31  9:19         ` Glauber Costa
     [not found]         ` <4FC7378B.2030707-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-05-31 13:04           ` Jeff Liu
2012-05-31  9:19       ` Glauber Costa
2012-05-31 12:31       ` Jeff Liu [this message]
2012-05-31  8:54   ` Glauber Costa
2012-06-01 15:54   ` Jan Kara
     [not found]     ` <20120601155457.GA30909-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2012-06-01 16:04       ` Serge Hallyn
2012-06-02  5:59         ` Jeff Liu
2012-06-02  6:06           ` Kirill Korotaev
     [not found]             ` <01FED15D-15A3-4542-B95B-1166F0A309E6-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-06-02  6:24               ` Jeff Liu
     [not found]                 ` <4FC9B183.10605-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-02 15:21                   ` Kirill Korotaev
     [not found]                     ` <8660DDAA-D7A7-4C03-8CBB-9DB7E94C80CB-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-06-03  4:23                       ` Jeff Liu
     [not found]                         ` <4FCAE6CB.8060208-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-03  5:47                           ` Kirill Korotaev
     [not found]                             ` <81DE9C10-649B-4D13-86B0-200944AE8767-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-06-03  6:02                               ` Jeff Liu
2012-06-03  9:48                               ` Glauber Costa
2012-06-03  9:48                             ` Glauber Costa
     [not found]           ` <4FC9ABBB.3050303-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-02  6:06             ` Kirill Korotaev
2012-06-04  2:57             ` Serge Hallyn
2012-06-04  2:57           ` Serge Hallyn
2012-06-04  4:46             ` Jeff Liu
     [not found]               ` <4FCC3DB9.40105-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-04  9:42                 ` Jan Kara
2012-06-04  9:42                 ` Jan Kara
     [not found]                   ` <20120604094224.GA7670-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2012-06-04 13:35                     ` Jeff Liu
     [not found]                       ` <4FCCB98A.2030703-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-04 13:56                         ` Jan Kara
2012-06-04 13:56                       ` Jan Kara
     [not found]                         ` <20120604135615.GD11010-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2012-06-04 14:55                           ` Jeff Liu
     [not found]                             ` <4FCCCC64.5060301-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2012-06-04 15:50                               ` Jeff Liu
2012-06-02  5:42       ` Jeff Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC764AE.4070404@oracle.com \
    --to=jeff.liu-qhclzuegtsvqt0dzr+alfa@public.gmane.org \
    --cc=bpm-sJ/iWh9BUns@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=chris.mason-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=christopher.jones-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org \
    --cc=glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    --cc=hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=jack-AlSwsSmVLrQ@public.gmane.org \
    --cc=linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=tinguely-sJ/iWh9BUns@public.gmane.org \
    --cc=tm-d1IQDZat3X0@public.gmane.org \
    --cc=tytso-3s7WtUTddSA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.