From: Andrew Morton <akpm@linux-foundation.org>
To: Roland Dreier <rdreier@cisco.com>
Cc: linuxppc-dev@ozlabs.org, linux-kernel@vger.kernel.org,
eli@mellanox.co.il
Subject: Re: FW: [PATCH] powerpc/mm: Export HPAGE_SHIFT
Date: Wed, 4 Feb 2009 13:23:13 -0800
Message-ID: <20090204132313.e3047e2f.akpm@linux-foundation.org>
In-Reply-To: <adad4dym2zp.fsf@cisco.com>
On Wed, 04 Feb 2009 11:11:22 -0800
Roland Dreier <rdreier@cisco.com> wrote:
> > > > huge_page_size(page_hstate(page))
>
> > > That would suit. I assume the intention is for that to be usable by
> > > driver modules on any architecture?
>
> > erm, you overestimate the amount of planning and forethought which goes
> > into these things ;)
>
> > The lack of any EXPORT_SYMBOL(size_to_hstate) is a broadish hint.
>
> Heh. Looking into the implementation, it seems that I could actually do
>
> PAGE_SIZE << compound_order(page)
>
> directly (since there's no reason to go from size to hstate and back to
> size). I don't know all the details of these VM internals, but that
> seems to only work on the first (small) page of a giant page? Which
> causes problems for what we're trying to do here...
>
> To summarize the goal, we are mapping user memory to a device that has
> its own page tables, where the device's page tables can also use
> multiple page sizes. Using big pages on the device leads to similar
> efficiencies as hugetlb pages do on the CPU, and in fact if a user has
> used hugetlb pages for the memory they're giving to the device, that's a
> very strong hint that the device should use big pages too.
>
> But one valid situation we have to handle in the driver is if, say,
> userspace has a hugetlb mapped at virtual address 0x200000, and wants to
> map 0x80000 bytes at 0x280000 to the device. In that case, we're going
> to do essentially
>
> get_user_pages(..., 0x280000, 0x80000 / PAGE_SIZE, ...)
>
> and get_user_pages() is going to give us a bunch of normal PAGE_SIZE
> pages starting at offset 0x80000 within the compound page that makes up
> the huge page mapped at 0x200000.
>
> get_user_pages() also gives us the vma back, and we can see from
> is_vm_hugetlb_page() (-- BTW can I just say that a function
> is_xxx_page() that operates on vmas is horribly misnamed --) that these
> pages all come from a hugetlb mapping, but figuring out the size of that
> mapping is I guess a challenge.
compound_head() will convert any page* inside a hugepage into a pointer
to the head page. It should work OK for regular pages as well as
CONFIG_HUGETLB=n.
So..
PAGE_SIZE << compound_order(compound_head(page))
?
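
[Editorial sketch, not part of the original message.] The expression
above can be tried out in userspace with a mock. Everything below is an
assumption made for illustration: struct page here is a two-field
stand-in, not the kernel's layout; compound_head() and compound_order()
are simplified imitations of the kernel helpers of the same names;
backing_page_size() and mock_compound_init() are hypothetical names; and
4 KiB base pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12                    /* assumption: 4 KiB base pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for the kernel's struct page; NOT the real layout. */
struct page {
	struct page *head;   /* NULL on a head page or an ordinary page */
	unsigned int order;  /* meaningful on the head page only, else 0 */
};

/* Mock: map any page inside a compound page to its head page. */
static struct page *compound_head(struct page *page)
{
	return page->head ? page->head : page;
}

/* Mock: the order is recorded on the head page; tail pages report 0. */
static unsigned int compound_order(struct page *page)
{
	return page->order;
}

/* Hypothetical helper wrapping the expression suggested in the mail. */
static unsigned long backing_page_size(struct page *page)
{
	return PAGE_SIZE << compound_order(compound_head(page));
}

/* Hypothetical helper: set up pages[0 .. 2^order - 1] as one compound page. */
static void mock_compound_init(struct page *pages, unsigned int order)
{
	size_t i, n = (size_t)1 << order;

	/* Guard against shifting PAGE_SIZE out of range. */
	assert(order < 8 * sizeof(unsigned long) - PAGE_SHIFT);

	pages[0].head = NULL;
	pages[0].order = order;
	for (i = 1; i < n; i++) {
		pages[i].head = &pages[0];
		pages[i].order = 0;
	}
}
```

With an order-9 mock compound page (2 MiB under the 4 KiB assumption),
the small page at byte offset 0x80000 is pages[128]. Shifting by
compound_order() of that tail page directly yields only PAGE_SIZE in
this mock, while going through compound_head() first yields the full
0x200000, which is the case Roland was worried about.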