From: Nishanth Aravamudan <nacc@us.ibm.com>
To: npiggin@suse.de
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
andi@firstfloor.org, kniht@linux.vnet.ibm.com, abh@cray.com,
wli@holomorphy.com
Subject: Re: [patch 05/18] hugetlb: multiple hstates
Date: Fri, 25 Apr 2008 10:38:27 -0700 [thread overview]
Message-ID: <20080425173827.GC9680@us.ibm.com> (raw)
In-Reply-To: <20080423015430.162027000@nick.local0.net>
On 23.04.2008 [11:53:07 +1000], npiggin@suse.de wrote:
> Add basic support for more than one hstate in hugetlbfs
>
> - Convert hstates to an array
> - Add a first default entry covering the standard huge page size
> - Add functions for architectures to register new hstates
> - Add basic iterators over hstates
>
> Signed-off-by: Andi Kleen <ak@suse.de>
> Signed-off-by: Nick Piggin <npiggin@suse.de>
> ---
> include/linux/hugetlb.h | 11 ++++
> mm/hugetlb.c | 112 +++++++++++++++++++++++++++++++++++++-----------
> 2 files changed, 97 insertions(+), 26 deletions(-)
>
> Index: linux-2.6/mm/hugetlb.c
> ===================================================================
> --- linux-2.6.orig/mm/hugetlb.c
> +++ linux-2.6/mm/hugetlb.c
> @@ -27,7 +27,17 @@ unsigned long sysctl_overcommit_huge_pag
> static gfp_t htlb_alloc_mask = GFP_HIGHUSER;
> unsigned long hugepages_treat_as_movable;
>
> -struct hstate global_hstate;
> +static int max_hstate = 0;
> +
> +static unsigned long default_hstate_resv = 0;
Unnecessary initializations (and whitespace)?
What's the purpose of default_hstate_resv? Isn't it basically just
replacing max_huge_pages? "resv" has a very special meaning in
hugetlb.c, can another name be chosen?
> +struct hstate hstates[HUGE_MAX_HSTATE];
> +
> +/* for command line parsing */
> +struct hstate *parsed_hstate __initdata = NULL;
Unnecessary initialization (checkpatch caught the first two, not this
one). Should this be static? Isn't __initdata traditionally put closer
to the front of the line?
> +#define for_each_hstate(h) \
> + for ((h) = hstates; (h) < &hstates[max_hstate]; (h)++)
>
> /*
> * Protects updates to hugepage_freelists, nr_huge_pages, and free_huge_pages
> @@ -128,9 +138,19 @@ static void update_and_free_page(struct
> __free_pages(page, huge_page_order(h));
> }
>
> +struct hstate *size_to_hstate(unsigned long size)
> +{
> + struct hstate *h;
> + for_each_hstate (h) {
Extraneous space?
> + if (huge_page_size(h) == size)
> + return h;
> + }
> + return NULL;
> +}
Might become annoying if we add many hugepagesizes, but I guess we'll
never have enough to really matter. Just don't want to have to worry
about this loop for performance reasons when only one hugepage size is
in use? Would it make sense to cache the last value used? Probably
overkill for now.
> static void free_huge_page(struct page *page)
> {
> - struct hstate *h = &global_hstate;
> + struct hstate *h = size_to_hstate(PAGE_SIZE << compound_order(page));
Perhaps this could be made a static inline function?
static inline page_hstate(struct page *page)
{
return size_to_hstate(PAGE_SIZE << compound_order(page))
}
I guess I haven't checked yet if it's used anywhere else, but it makes
things a little clearer, perhaps?
And this is only needed to be done actually for the destructor case?
Technically, we have the hstate already in the set_max_huge_pages()
path? Might be worth a cleanup down-the-road.
> int nid = page_to_nid(page);
> struct address_space *mapping;
>
> @@ -495,38 +515,80 @@ static struct page *alloc_huge_page(stru
> return page;
> }
>
> -static int __init hugetlb_init(void)
> +static void __init hugetlb_init_hstate(struct hstate *h)
Could this perhaps be named hugetlb_init_one_hstate()? Makes it harder
for me to go cross-eyed as I go between the functions :)
<snip>
> +static void __init report_hugepages(void)
> +{
> + struct hstate *h;
> +
> + for_each_hstate(h) {
> + printk(KERN_INFO "Total HugeTLB memory allocated, %ld %dMB pages\n",
> + h->free_huge_pages,
> + 1 << (h->order + PAGE_SHIFT - 20));
This will need to be changed for 64K hugepages (which already exist in
mainline). Perhaps we need a hugepage_units() function :)
<snip>
> +/* Should be called on processing a hugepagesz=... option */
> +void __init huge_add_hstate(unsigned order)
> +{
> + struct hstate *h;
> + if (size_to_hstate(PAGE_SIZE << order)) {
> + printk("hugepagesz= specified twice, ignoring\n");
Needs a KERN_ level.
And did we decide whether specifying hugepagesz= multiple times is ok,
or not?
> + return;
> + }
> + BUG_ON(max_hstate >= HUGE_MAX_HSTATE);
> + BUG_ON(order < HPAGE_SHIFT - PAGE_SHIFT);
> + h = &hstates[max_hstate++];
> + h->order = order;
> + h->mask = ~((1ULL << (order + PAGE_SHIFT)) - 1);
> + hugetlb_init_hstate(h);
> + parsed_hstate = h;
> +}
> +
> static int __init hugetlb_setup(char *s)
> {
> - if (sscanf(s, "%lu", &max_huge_pages) <= 0)
> - max_huge_pages = 0;
> + if (sscanf(s, "%lu", &default_hstate_resv) <= 0)
> + default_hstate_resv = 0;
> return 1;
> }
> __setup("hugepages=", hugetlb_setup);
> @@ -544,28 +606,27 @@ static unsigned int cpuset_mems_nr(unsig
>
> #ifdef CONFIG_SYSCTL
> #ifdef CONFIG_HIGHMEM
> -static void try_to_free_low(unsigned long count)
> +static void try_to_free_low(struct hstate *h, unsigned long count)
> {
> - struct hstate *h = &global_hstate;
> int i;
>
> for (i = 0; i < MAX_NUMNODES; ++i) {
> struct page *page, *next;
> struct list_head *freel = &h->hugepage_freelists[i];
> list_for_each_entry_safe(page, next, freel, lru) {
> - if (count >= nr_huge_pages)
> + if (count >= h->nr_huge_pages)
> return;
> if (PageHighMem(page))
> continue;
> list_del(&page->lru);
> - update_and_free_page(page);
> + update_and_free_page(h, page);
> h->free_huge_pages--;
> h->free_huge_pages_node[page_to_nid(page)]--;
> }
> }
> }
> #else
> -static inline void try_to_free_low(unsigned long count)
> +static inline void try_to_free_low(struct hstate *h, unsigned long count)
> {
> }
> #endif
> @@ -625,7 +686,7 @@ static unsigned long set_max_huge_pages(
> */
> min_count = h->resv_huge_pages + h->nr_huge_pages - h->free_huge_pages;
> min_count = max(count, min_count);
> - try_to_free_low(min_count);
> + try_to_free_low(h, min_count);
> while (min_count < persistent_huge_pages(h)) {
> struct page *page = dequeue_huge_page(h);
> if (!page)
> @@ -648,6 +709,7 @@ int hugetlb_sysctl_handler(struct ctl_ta
> {
> proc_doulongvec_minmax(table, write, file, buffer, length, ppos);
> max_huge_pages = set_max_huge_pages(max_huge_pages);
> + global_hstate.max_huge_pages = max_huge_pages;
So this implies the sysctl still only controls the singe state? Perhaps
it would be better if this patch made set_max_huge_pages() take an
hstate? Also, this seems to be the only place where max_huge_pages is
still used, so can't you just do:
global_hstate.max_huge_pages = set_max_huge_pages(max_huge_pages); ?
<snip>
> @@ -1296,7 +1358,7 @@ out:
> int hugetlb_reserve_pages(struct inode *inode, long from, long to)
> {
> long ret, chg;
> - struct hstate *h = &global_hstate;
> + struct hstate *h = hstate_inode(inode);
>
> chg = region_chg(&inode->i_mapping->private_list, from, to);
> if (chg < 0)
> @@ -1315,7 +1377,7 @@ int hugetlb_reserve_pages(struct inode *
>
> void hugetlb_unreserve_pages(struct inode *inode, long offset, long freed)
> {
> - struct hstate *h = &global_hstate;
> + struct hstate *h = hstate_inode(inode);
Couldn't both of these changes have been made in the previous patch?
Thanks,
Nish
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-04-25 17:38 UTC|newest]
Thread overview: 123+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-23 1:53 [patch 00/18] multi size, and giant hugetlb page support, 1GB hugetlb for x86 npiggin
2008-04-23 1:53 ` [patch 01/18] hugetlb: fix lockdep spew npiggin
2008-04-23 13:06 ` KOSAKI Motohiro
2008-04-23 1:53 ` [patch 02/18] hugetlb: factor out huge_new_page npiggin
2008-04-24 23:49 ` Nishanth Aravamudan
2008-04-24 23:54 ` Nishanth Aravamudan
2008-04-24 23:58 ` Nishanth Aravamudan
2008-04-25 7:10 ` Andi Kleen
2008-04-25 16:54 ` Nishanth Aravamudan
2008-04-25 19:13 ` Christoph Lameter
2008-04-25 19:29 ` Nishanth Aravamudan
2008-04-30 19:16 ` Christoph Lameter
2008-04-30 20:44 ` Nishanth Aravamudan
2008-05-01 19:23 ` Christoph Lameter
2008-05-01 20:25 ` Nishanth Aravamudan
2008-05-01 20:34 ` Christoph Lameter
2008-05-01 21:01 ` Nishanth Aravamudan
2008-05-23 5:03 ` Nick Piggin
2008-04-23 1:53 ` [patch 03/18] mm: offset align in alloc_bootmem npiggin, Yinghai Lu
2008-04-23 1:53 ` [patch 04/18] hugetlb: modular state npiggin
2008-04-23 15:21 ` Jon Tollefson
2008-04-23 15:38 ` Nick Piggin
2008-04-25 17:13 ` Nishanth Aravamudan
2008-05-23 5:02 ` Nick Piggin
2008-05-23 20:48 ` Nishanth Aravamudan
2008-04-23 1:53 ` [patch 05/18] hugetlb: multiple hstates npiggin
2008-04-25 17:38 ` Nishanth Aravamudan [this message]
2008-04-25 17:48 ` Nishanth Aravamudan
2008-04-25 17:55 ` Andi Kleen
2008-04-25 17:52 ` Nishanth Aravamudan
2008-04-25 18:10 ` Andi Kleen
2008-04-28 10:13 ` Andy Whitcroft
2008-05-23 5:18 ` Nick Piggin
2008-04-29 17:27 ` Nishanth Aravamudan
2008-05-23 5:19 ` Nick Piggin
2008-04-23 1:53 ` [patch 06/18] hugetlb: multi hstate proc files npiggin
2008-05-02 19:53 ` Nishanth Aravamudan
2008-05-23 5:22 ` Nick Piggin
2008-05-23 20:30 ` Nishanth Aravamudan
2008-04-23 1:53 ` [patch 07/18] hugetlbfs: per mount hstates npiggin
2008-04-25 18:09 ` Nishanth Aravamudan
2008-04-25 20:36 ` Nishanth Aravamudan
2008-04-25 22:39 ` Nishanth Aravamudan
2008-04-28 18:20 ` Adam Litke
2008-04-28 18:46 ` Nishanth Aravamudan
2008-05-23 5:24 ` Nick Piggin
2008-05-23 20:34 ` Nishanth Aravamudan
2008-05-23 22:49 ` Nick Piggin
2008-05-23 23:24 ` Nishanth Aravamudan
2008-04-23 1:53 ` [patch 08/18] hugetlb: multi hstate sysctls npiggin
2008-04-25 18:14 ` Nishanth Aravamudan
2008-05-23 5:25 ` Nick Piggin
2008-05-23 20:27 ` Nishanth Aravamudan
2008-04-25 23:35 ` Nishanth Aravamudan
2008-05-23 5:28 ` Nick Piggin
2008-05-23 10:40 ` Andi Kleen
2008-04-23 1:53 ` [patch 09/18] hugetlb: abstract numa round robin selection npiggin
2008-04-23 1:53 ` [patch 10/18] mm: introduce non panic alloc_bootmem npiggin
2008-04-23 1:53 ` [patch 11/18] mm: export prep_compound_page to mm npiggin
2008-04-23 16:12 ` Andrew Hastings
2008-05-23 5:29 ` Nick Piggin
2008-04-23 1:53 ` [patch 12/18] hugetlbfs: support larger than MAX_ORDER npiggin
2008-04-23 16:15 ` Andrew Hastings
2008-04-23 16:25 ` Andi Kleen
2008-04-25 18:55 ` Nishanth Aravamudan
2008-05-23 5:29 ` Nick Piggin
2008-04-30 21:01 ` Dave Hansen
2008-05-23 5:30 ` Nick Piggin
2008-04-23 1:53 ` [patch 13/18] hugetlb: support boot allocate different sizes npiggin
2008-04-23 16:15 ` Andrew Hastings
2008-04-25 18:40 ` Nishanth Aravamudan
2008-04-25 18:50 ` Andi Kleen
2008-04-25 20:05 ` Nishanth Aravamudan
2008-05-23 5:36 ` Nick Piggin
2008-05-23 6:04 ` Nick Piggin
2008-05-23 20:32 ` Nishanth Aravamudan
2008-05-23 22:45 ` Nick Piggin
2008-05-23 22:53 ` Nishanth Aravamudan
2008-04-23 1:53 ` [patch 14/18] hugetlb: printk cleanup npiggin
2008-04-27 3:32 ` Nishanth Aravamudan
2008-05-23 5:37 ` Nick Piggin
2008-04-23 1:53 ` [patch 15/18] hugetlb: introduce huge_pud npiggin
2008-04-23 1:53 ` [patch 16/18] x86: support GB hugepages on 64-bit npiggin
2008-04-23 1:53 ` [patch 17/18] x86: add hugepagesz option " npiggin
2008-04-30 19:34 ` Nishanth Aravamudan
2008-04-30 19:52 ` Andi Kleen
2008-04-30 20:02 ` Nishanth Aravamudan
2008-04-30 20:19 ` Andi Kleen
2008-04-30 20:23 ` Nishanth Aravamudan
2008-04-30 20:45 ` Andi Kleen
2008-04-30 20:51 ` Nishanth Aravamudan
2008-04-30 20:40 ` Jon Tollefson
2008-04-30 20:48 ` Nishanth Aravamudan
2008-05-23 5:41 ` Nick Piggin
2008-05-23 10:43 ` Andi Kleen
2008-05-23 12:34 ` Nick Piggin
2008-05-23 14:29 ` Andi Kleen
2008-05-23 20:43 ` Nishanth Aravamudan
2008-05-23 20:39 ` Nishanth Aravamudan
2008-05-23 22:52 ` Nick Piggin
2008-04-23 1:53 ` [patch 18/18] hugetlb: my fixes 2 npiggin
2008-04-23 10:48 ` Andi Kleen
2008-04-23 15:36 ` Nick Piggin
2008-04-23 18:49 ` Nishanth Aravamudan
2008-04-23 19:37 ` Andi Kleen
2008-04-23 21:11 ` Nishanth Aravamudan
2008-04-23 21:38 ` Nishanth Aravamudan
2008-04-23 22:06 ` Dave Hansen
2008-04-23 15:20 ` Jon Tollefson
2008-04-23 15:44 ` Nick Piggin
2008-04-23 8:05 ` [patch 00/18] multi size, and giant hugetlb page support, 1GB hugetlb for x86 Andi Kleen
2008-04-23 15:34 ` Nick Piggin
2008-04-23 15:46 ` Andi Kleen
2008-04-23 15:53 ` Nick Piggin
2008-04-23 16:02 ` Andi Kleen
2008-04-23 16:02 ` Nick Piggin
2008-04-23 18:54 ` Nishanth Aravamudan
2008-04-23 18:52 ` Nishanth Aravamudan
2008-04-24 2:08 ` Nick Piggin
2008-04-24 6:43 ` Nishanth Aravamudan
2008-04-24 7:06 ` Nick Piggin
2008-04-24 17:08 ` Nishanth Aravamudan
2008-04-23 18:43 ` Nishanth Aravamudan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080425173827.GC9680@us.ibm.com \
--to=nacc@us.ibm.com \
--cc=abh@cray.com \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=kniht@linux.vnet.ibm.com \
--cc=linux-mm@kvack.org \
--cc=npiggin@suse.de \
--cc=wli@holomorphy.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).