From: Kirill Korotaev <dev@sw.ru>
To: balbir@in.ibm.com
Cc: Andrew Morton <akpm@osdl.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Pavel Emelianov <xemul@openvz.org>, Andrey Savochkin <saw@sw.ru>,
devel@openvz.org, Rik van Riel <riel@redhat.com>,
Andi Kleen <ak@suse.de>, Oleg Nesterov <oleg@tv-sign.ru>,
Alexey Dobriyan <adobriyan@mail.ru>,
Matt Helsley <matthltc@us.ibm.com>,
CKRM-Tech <ckrm-tech@lists.sourceforge.net>
Subject: Re: [PATCH 6/7] BC: kernel memory (core)
Date: Mon, 04 Sep 2006 16:21:44 +0400 [thread overview]
Message-ID: <44FC1A58.3090506@sw.ru> (raw)
In-Reply-To: <44F48A6A.40501@in.ibm.com>
Balbir Singh wrote:
> Kirill Korotaev wrote:
>
>> Introduce BC_KMEMSIZE resource which accounts kernel
>> objects allocated by task's request.
>>
>> Reference to BC is kept on struct page or slab object.
>> For slabs each struct slab contains a set of pointers
>> corresponding objects are charged to.
>>
>> Allocation charge rules:
>> 1. Pages - if allocation is performed with __GFP_BC flag - page
>> is charged to current's exec_bc.
>> 2. Slabs - kmem_cache may be created with SLAB_BC flag - in this
>> case each allocation is charged. Caches used by kmalloc are
>> created with SLAB_BC | SLAB_BC_NOCHARGE flags. In this case
>> only __GFP_BC allocations are charged.
>>
>
> <snip>
>
>> +#define __GFP_BC_LIMIT ((__force gfp_t)0x100000u) /* Charge against
>> BC limit */
>>
>
> What's _GFP_BC_LIMIT for, could you add the description for that flag?
> The comment is not very clear
>
>> +#ifdef CONFIG_BEANCOUNTERS
>> + union {
>> + struct beancounter *page_bc;
>> + } bc;
>> +#endif
>> };
>>
>> +#define page_bc(page) ((page)->bc.page_bc)
>
>
> Minor comment - page->(bc).page_bc has too many repititions of page and
> bc - see
> the Practice of Programming by Kernighan and Pike
>
> I missed the part of why you wanted to have a union (in struct page for
> bc)?
because this union is used both for kernel memory accounting and user memeory tracking.
>> const char *bc_rnames[] = {
>> + "kmemsize", /* 0 */
>> };
>>
>> static struct hlist_head bc_hash[BC_HASH_SIZE];
>> @@ -221,6 +222,8 @@ static void init_beancounter_syslimits(s
>> { int k;
>>
>> + bc->bc_parms[BC_KMEMSIZE].limit = 32 * 1024 * 1024;
>> +
>
>
> Can't this be configurable CONFIG_XXX or a #defined constant?
This is some arbitraty limited container, just to make sure it is not
created unlimited. User space should initialize limits properly after creation
anyway. So I don't see reasons to make it configurable, do you?
>> --- ./mm/mempool.c.bckmem 2006-04-21 11:59:36.000000000 +0400
>> +++ ./mm/mempool.c 2006-08-28 12:59:28.000000000 +0400
>> @@ -119,6 +119,7 @@ int mempool_resize(mempool_t *pool, int
>> unsigned long flags;
>>
>> BUG_ON(new_min_nr <= 0);
>> + gfp_mask &= ~__GFP_BC;
>>
>> spin_lock_irqsave(&pool->lock, flags);
>> if (new_min_nr <= pool->min_nr) {
>> @@ -212,6 +213,7 @@ void * mempool_alloc(mempool_t *pool, gf
>> gfp_mask |= __GFP_NOMEMALLOC; /* don't allocate emergency
>> reserves */
>> gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */
>> gfp_mask |= __GFP_NOWARN; /* failures are OK */
>> + gfp_mask &= ~__GFP_BC; /* do not charge */
>>
>> gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);
>>
>
> Is there any reasn why mempool_xxxx() functions are not charged? Is it
> because
> mempool functions are mostly used from the I/O path?
yep.
>> --- ./mm/page_alloc.c.bckmem 2006-08-28 12:20:13.000000000 +0400
>> +++ ./mm/page_alloc.c 2006-08-28 12:59:28.000000000 +0400
>> @@ -40,6 +40,8 @@
>> #include <linux/sort.h>
>> #include <linux/pfn.h>
>>
>> +#include <bc/kmem.h>
>> +
>> #include <asm/tlbflush.h>
>> #include <asm/div64.h>
>> #include "internal.h"
>> @@ -516,6 +518,8 @@ static void __free_pages_ok(struct page if
>> (reserved)
>> return;
>>
>> + bc_page_uncharge(page, order);
>> +
>> kernel_map_pages(page, 1 << order, 0);
>> local_irq_save(flags);
>> __count_vm_events(PGFREE, 1 << order);
>> @@ -799,6 +803,8 @@ static void fastcall free_hot_cold_page(
>> if (free_pages_check(page))
>> return;
>>
>> + bc_page_uncharge(page, 0);
>> +
>> kernel_map_pages(page, 1, 0);
>>
>> pcp = &zone_pcp(zone, get_cpu())->pcp[cold];
>> @@ -1188,6 +1194,11 @@ nopage:
>> show_mem();
>> }
>> got_pg:
>> + if ((gfp_mask & __GFP_BC) &&
>> + bc_page_charge(page, order, gfp_mask)) {
>
>
> I wonder if bc_page_charge() should be called bc_page_charge_failed()?
> Does it make sense to atleast partially start reclamation here? I know with
> bean counters we cannot reclaim from a particular container, but for now
> we could kick off kswapd() or call shrink_all_memory() inline (Dave's
> patches do this to shrink memory from the particular cpuset). Or do you
> want to leave this
> slot open for later?
yes. my intention is to account correctly all needed information first.
After we agree on accounting, we can agree on how to do reclamaition.
>> + __free_pages(page, order);
>> + page = NULL;
>> + }
>
>
>
--
VGER BF report: U 0.50051
next prev parent reply other threads:[~2006-09-04 12:18 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-29 14:33 [PATCH] BC: resource beancounters (v3) Kirill Korotaev
2006-08-29 14:49 ` [PATCH 1/7] introduce atomic_dec_and_lock_irqsave() Kirill Korotaev
2006-08-30 9:59 ` Roman Zippel
2006-08-30 14:57 ` Oleg Nesterov
2006-08-30 10:51 ` Roman Zippel
2006-08-30 15:51 ` Oleg Nesterov
2006-08-30 11:54 ` Roman Zippel
2006-08-30 16:58 ` Dipankar Sarma
2006-08-30 17:25 ` Roman Zippel
2006-08-31 22:58 ` Paul E. McKenney
2006-08-29 14:51 ` [PATCH 2/7] BC: kconfig Kirill Korotaev
2006-08-30 4:05 ` Randy.Dunlap
2006-08-29 14:53 ` [PATCH 3/7] BC: beancounters core (API) Kirill Korotaev
2006-08-30 18:58 ` [ckrm-tech] " Chandra Seetharaman
2006-08-30 19:47 ` Andrew Morton
2006-08-29 14:54 ` [PATCH 4/7] BC: context inheriting and changing Kirill Korotaev
2006-08-30 19:11 ` [ckrm-tech] " Chandra Seetharaman
2006-08-29 14:55 ` [PATCH 5/7] BC: user interface (syscalls) Kirill Korotaev
2006-08-30 19:15 ` [ckrm-tech] " Chandra Seetharaman
2006-08-29 14:58 ` [PATCH 6/7] BC: kernel memory (core) Kirill Korotaev
2006-08-29 18:41 ` Balbir Singh
2006-09-04 12:21 ` Kirill Korotaev [this message]
2006-09-04 15:45 ` Balbir Singh
2006-08-29 20:29 ` [ckrm-tech] " Dave Hansen
2006-09-04 12:23 ` Kirill Korotaev
2006-09-05 16:21 ` Dave Hansen
2006-08-30 19:25 ` Chandra Seetharaman
2006-08-29 14:58 ` [PATCH 7/7] BC: kernel memory (marks) Kirill Korotaev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=44FC1A58.3090506@sw.ru \
--to=dev@sw.ru \
--cc=adobriyan@mail.ru \
--cc=ak@suse.de \
--cc=akpm@osdl.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=balbir@in.ibm.com \
--cc=ckrm-tech@lists.sourceforge.net \
--cc=devel@openvz.org \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=matthltc@us.ibm.com \
--cc=oleg@tv-sign.ru \
--cc=riel@redhat.com \
--cc=saw@sw.ru \
--cc=xemul@openvz.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox