Re: [PATCH 6/6] containers: BeanCounters over generic process containers

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Pavel Emelianov <xemul@openvz.org>
To: herbert@13thfloor.at, dev@sw.ru, containers@lists.osdl.org,
	ckrm-tech@lists.sourceforge.net, linux-kernel@vger.kernel.org
Cc: akpm@osdl.org, pj@sgi.com, sekharan@us.ibm.com,
	menage@google.com, serue@us.ibm.com, vatsa@in.ibm.com,
	rohitseth@google.com, winget@google.com
Subject: Re: [PATCH 6/6] containers: BeanCounters over generic process containers
Date: Mon, 25 Dec 2006 13:35:15 +0300	[thread overview]
Message-ID: <458FA963.2050502@openvz.org> (raw)
In-Reply-To: <20061223194955.GA30764@MAIL.13thfloor.at>

Herbert Poetzl wrote:
> On Fri, Dec 22, 2006 at 06:14:48AM -0800, Paul Menage wrote:
>> This patch implements the BeanCounter resource control abstraction
>> over generic process containers. It contains the beancounter core
>> code, plus the numfiles resource counter. It doesn't currently contain
>> any of the memory tracking code or the code for switching beancounter
>> context in interrupts.
> 
> I don't like it, it looks bloated and probably
> adds plenty of overhead (similar to the OVZ
> implementation where this seems to be taken from)

FULL BC patch w/o pages fractions accounting doesn't
add any noticeable overhead to mainstream kernel.
Pages fractions accounting will be optimized as well.
The part you're talking about is only 1/100 of the
complete patch.

> here are some comments/questions:
> 
>> Currently all the beancounters resource counters are lumped into a
>> single hierarchy; ideally it would be possible for each resource
>> counter to be a separate container subsystem, allowing them to be
>> connected to different hierarchies.
>>
>> +static inline void bc_uncharge(struct beancounter *bc, int res_id,
>> +		unsigned long val)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&bc->bc_lock, flags);
>> +	bc_uncharge_locked(bc, res_id, val);
>> +	spin_unlock_irqrestore(&bc->bc_lock, flags);
> 
> why use a spinlock, when we could use atomic
> counters?

Because approach

if (atomic_read(&bc->barrier) > aromic_read(&bc->held))
        atomic_inc(&bc->held);

used in vserver accounting is not atomic ;)

Look at the comment below about charging two resources at once.

> 
>> +int bc_charge_locked(struct beancounter *bc, int res, unsigned long val,
>> +		int strict, unsigned long flags)
>> +{
>> +	struct bc_resource_parm *parm;
>> +	unsigned long new_held;
>> +
>> +	BUG_ON(val > BC_MAXVALUE);
>> +
>> +	parm = &bc->bc_parms[res];
>> +	new_held = parm->held + val;
>> +
>> +	switch (strict) {
>> +	case BC_LIMIT:
>> +		if (new_held > parm->limit)
>> +			break;
>> +		/* fallthrough */
>> +	case BC_BARRIER:
>> +		if (new_held > parm->barrier) {
>> +			if (strict == BC_BARRIER)
>> +				break;
>> +			if (parm->held < parm->barrier &&
>> +					bc_resources[res]->bcr_barrier_hit)
>> +				bc_resources[res]->bcr_barrier_hit(bc);
>> +		}
> 
> why do barrier checks with every accounting? 
> there are probably a few cases where the
> checks could be independant from the accounting

Let's look at

    if (parm->held < parm->barrier &&
             bc_resources[res]->bcr_barrier_hit)
         bc_resources[res]->bcr_barrier_hit(bc);

code one more time.

In case of BC_LIMIT charge BC code informs resource
controller about BARRIER hit to take some actions
before hard resource shortage.

>> +		/* fallthrough */
>> +	case BC_FORCE:
>> +		parm->held = new_held;
>> +		bc_adjust_maxheld(parm);
> 
> in what cases do we want to cross the barrier?
> 
>> +		return 0;
>> +	default:
>> +		BUG();
>> +	}
>> +
>> +	if (bc_resources[res]->bcr_limit_hit)
>> +		return bc_resources[res]->bcr_limit_hit(bc, val, flags);
>> +
>> +	parm->failcnt++;
>> +	return -ENOMEM;
> 
>> +int bc_file_charge(struct file *file)
>> +{
>> +	int sev;
>> +	struct beancounter *bc;
>> +
>> +	task_lock(current);
> 
> why do we lock current? it won't go away that
> easily, and for switching the bc, it might be 
> better to use RCU or a separate lock, no?

This came from containers patches. BC code doesn't take
locks on fast paths.

>> +	bc = task_bc(current);
>> +	css_get_current(&bc->css);
>> +	task_unlock(current);
>> +
>> +	sev = (capable(CAP_SYS_ADMIN) ? BC_LIMIT : BC_BARRIER);
>> +
>> +	if (bc_charge(bc, BC_NUMFILES, 1, sev)) {
>> +		css_put(&bc->css);
>> +		return -EMFILE;
>> +	}
>> +
>> +	file->f_bc = bc;
>> +	return 0;
>> +}
> 
> also note that certain limits are much more
> complicated than the (very simple) file limits
> and the code will be called at higher frequency

We do know it and we have "pre-charges" optimization
for frequent calls. bc->lock we've seen is used to
make two or more resources charge in only one atomic
operation, that is faster than doing atomic_inc()
for each resource as you've proposed above.

> how to handle requests like:
>  try to get as 64 files or as many as available
>  whatever is smaller

I promise, that if Linus will include patch that adds a syscall
to open 64 or "as many as available whatever is smaller" files
at once we'll add this functionality.

> happy xmas,
> Herbert
> 
>

next prev parent reply	other threads:[~2006-12-25 10:35 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-22 14:14 [PATCH 0/6] containers: Generic Process Containers (V6) Paul Menage
2006-12-22 14:14 ` [PATCH 1/6] containers: Generic container system abstracted from cpusets code Paul Menage
2006-12-30 13:10   ` Eric W. Biederman
2006-12-31  5:17     ` Paul Jackson
2007-01-02 22:15     ` Paul Menage
2006-12-22 14:14 ` [PATCH 2/6] containers: Cpusets hooked into containers Paul Menage
2006-12-22 14:14 ` [PATCH 3/6] containers: Add generic multi-subsystem API to containers Paul Menage
2007-01-10 15:56   ` [ckrm-tech] " Balbir Singh
2007-01-11 22:53     ` Paul Menage
2007-01-12  6:29       ` Balbir Singh
2007-01-12  8:10         ` Paul Menage
2007-01-12  8:22           ` Balbir Singh
2007-01-20 17:27           ` Balbir Singh
2006-12-22 14:14 ` [PATCH 4/6] containers: Simple CPU accounting container subsystem Paul Menage
2007-01-10 14:21   ` [ckrm-tech] " Balbir Singh
2007-01-12  0:33     ` Paul Menage
2007-01-12  6:24       ` Balbir Singh
2007-01-12  8:15         ` Paul Menage
2007-01-12  8:26           ` Balbir Singh
2007-01-12 17:32             ` Paul Menage
2007-01-15  9:01       ` [PATCH 0/1] Add mount/umount callbacks to containers (Re: [ckrm-tech] [PATCH 4/6] containers: Simple CPU accounting container subsystem) Balbir Singh
2007-01-15  9:04         ` [PATCH 1/1] Fix a panic while mouting containers on powerpc and some other small cleanups " Balbir Singh
2007-01-15  9:22           ` Paul Menage
2007-01-15  9:51             ` [ckrm-tech] [PATCH 1/1] Fix a panic while mouting containers on powerpc and some other small cleanups (Re: " Balbir Singh
2007-01-15 10:01               ` Paul Menage
2007-01-15 10:10                 ` Balbir Singh
2006-12-22 14:14 ` [PATCH 5/6] containers: Resource Groups over generic containers Paul Menage
2006-12-22 14:14 ` [PATCH 6/6] containers: BeanCounters over generic process containers Paul Menage
2006-12-23 19:49   ` Herbert Poetzl
2006-12-24 11:32     ` Paul Menage
2006-12-25 10:16     ` Kirill Korotaev
2006-12-26  0:54       ` Paul Menage
2006-12-25 10:35     ` Pavel Emelianov [this message]
2007-01-03 14:43 ` [PATCH 0/6] containers: Generic Process Containers (V6) Serge E. Hallyn
2007-01-05  0:25   ` Paul Menage
2007-01-12 18:42     ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=458FA963.2050502@openvz.org \
    --to=xemul@openvz.org \
    --cc=akpm@osdl.org \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=containers@lists.osdl.org \
    --cc=dev@sw.ru \
    --cc=herbert@13thfloor.at \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=pj@sgi.com \
    --cc=rohitseth@google.com \
    --cc=sekharan@us.ibm.com \
    --cc=serue@us.ibm.com \
    --cc=vatsa@in.ibm.com \
    --cc=winget@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox