From: catalin.marinas@arm.com (Catalin Marinas)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 2/7] Add various hugetlb page table fix
Date: Tue, 7 Feb 2012 14:11:00 +0000	[thread overview]
Message-ID: <20120207141100.GI3351@arm.com> (raw)
In-Reply-To: <CAOMgcGLhGyz5xU1P2C=Av9LWazJF3X=U14riNi=xESZX68mdwQ@mail.gmail.com>

On Tue, Feb 07, 2012 at 01:24:09PM +0000, carson bill wrote:
> 2012/2/7, Catalin Marinas <catalin.marinas@arm.com>:
> > On Tue, Feb 07, 2012 at 01:42:01AM +0000, bill4carson wrote:
> >> On 2012-02-07 00:26, Catalin Marinas wrote:
> >> > On Wed, Feb 01, 2012 at 03:10:21AM +0000, bill4carson wrote:
> >> >> Why is L_PTE_HUGEPAGE needed?
> >> >>
> >> >> hugetlb subsystem will call pte_page to derive the corresponding page
> >> >> struct from a given pte, and pte_pfn is used first to convert pte into
> >> >> a page frame number.
> >> >
> >> > Are you sure the pte_pfn() conversion is right? Does it need to be
> >> > different from the 4K pfn?
> > ...
> >> pte_page is defined as follows to derive the page struct from a given
> >> pte. This macro is used both in generic mm and in the hugetlb
> >> subsystem, so pte_pfn needs a way to tell a huge page based linux pte
> >> apart from a normal page based linux pte; that's what L_PTE_HUGEPAGE
> >> is for.
> >>
> >> #define pte_page(pte)		pfn_to_page(pte_pfn(pte))
> >>
> >> So if L_PTE_HUGEPAGE is *NOT* set, we have a normal page based linux
> >> pte, and linux pte bits[31:12] are the page frame number;
> >
> > I agree.
> >
> >> otherwise, we have a huge page based linux pte, and linux pte
> >> bits[31:20] are the page frame number for a SECTION mapping, and
> >> bits[31:24] are the page frame number for a SUPER-SECTION mapping.
> >
> > Actually it is still 31:12 but with bits 19:12 or 23:12 masked out. So
> > you do the correct shift by PAGE_SHIFT with the additional masking for
> > huge pages (harmless).
> >
> > But do we actually need this masking? Do the huge_pte_offset() or
> > huge_pte_alloc() functions return the Linux pte (pmd) for the huge page?
> > If yes, can we not ensure that bits 19:12 are already zero? This
> > shouldn't be any different from the 4K Linux pte but with an address
> > aligned to 1MB.
> 
> I'm afraid there is some misunderstanding.
> huge_pte_offset() returns the address of the huge linux pte if it exists;
> huge_pte_alloc() allocates a location to store the huge linux pte and
> returns this address;
> none of the above functions returns the huge linux pte *value*.

I agree, huge_pte_offset() returns a pointer to the Linux pte/pmd if it
exists. My point is that the values stored in Linux pte/pmd have bits
20:12 cleared already as the address is at least 2MB aligned (well,
apart from the additional L_PTE_HPAGE_* bits that you declared). Is this
correct? If yes, then you don't need any additional masking for
pte_pfn() even if it is passed a Linux pmd.
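
To make this concrete, here is a minimal user-space sketch, not kernel
code. It assumes a 32-bit pte value, a 2MB huge page and attribute bits
confined to bits 11:0; the sketch_* helpers and SKETCH_HPAGE_SHIFT are
names made up for illustration. With a 2MB-aligned address, the plain
shift by PAGE_SHIFT and the masked variant should give the same pfn:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12
#define SKETCH_HPAGE_SHIFT	21	/* assumption: 2MB huge page */

typedef uint32_t sketch_pte_t;

/* Build a pte from a physical address and attribute bits [11:0] only. */
static inline sketch_pte_t sketch_mk_pte(uint32_t phys, uint32_t attrs)
{
	return (phys & ~UINT32_C(0xfff)) | (attrs & UINT32_C(0xfff));
}

/* Plain conversion, identical to the 4K case. */
static inline uint32_t sketch_pte_pfn(sketch_pte_t pte)
{
	return pte >> PAGE_SHIFT;
}

/* Conversion with the additional huge page masking under discussion. */
static inline uint32_t sketch_pte_pfn_masked(sketch_pte_t pte)
{
	uint32_t hpage_mask = ~((UINT32_C(1) << SKETCH_HPAGE_SHIFT) - 1);

	return (pte & hpage_mask) >> PAGE_SHIFT;
}
```

For example, sketch_mk_pte(0x80200000, 0x45f) yields pfn 0x80200 from
both helpers, because bits 20:12 of a 2MB-aligned address are already
zero.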

> make_huge_pte() will return the huge linux pte for a given page and vma
> protection bits.
> Please notice that pte_mkhuge is used to mark this pte as a huge linux pte
> by setting L_PTE_HUGEPAGE; then set_huge_pte_at() is used to set the huge
> linux pte as well as the huge hardware pte.
> 
> 
> static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
>                            int writable)
> {
>         pte_t entry;
>
>         if (writable) {
>                 entry =
>                     pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot)));
>         } else {
>                 entry = huge_pte_wrprotect(mk_pte(page, vma->vm_page_prot));
>         }
>         entry = pte_mkyoung(entry);
>         entry = pte_mkhuge(entry);
>
>         return entry;
> }
> 
> Hence, a normal linux pte must have L_PTE_HUGEPAGE cleared;
> a huge linux pte must have L_PTE_HUGEPAGE (BIT11) set.
> This could lead to L_PTE_HPAGE_2M (BIT12) or L_PTE_HPAGE_16M (BIT13) being
> set respectively; that's why the masking is needed for pte_pfn.

But if you avoid setting L_PTE_HPAGE_*, then we don't need the masking
for pte_pfn. In which case, we don't need to differentiate between a
normal and a huge pte in pte_pfn(), so no need for L_PTE_HUGEPAGE. The
set_huge_pte_at() function is only called with a huge pte, so it doesn't
need to check the L_PTE_HUGEPAGE bit either.
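
A small user-space sketch of this argument, using the bit positions
quoted above (L_PTE_HUGEPAGE at bit 11, L_PTE_HPAGE_2M at bit 12). The
SKETCH_* names and helpers are hypothetical and the 32-bit value is only
a stand-in for the real pte layout. A flag below PAGE_SHIFT is dropped
by the plain shift, while a flag at bit 12 leaks into the pfn:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12
#define SKETCH_L_PTE_HUGEPAGE	(UINT32_C(1) << 11)	/* below PAGE_SHIFT */
#define SKETCH_L_PTE_HPAGE_2M	(UINT32_C(1) << 12)	/* overlaps the pfn */

/* Unmasked conversion, as used for a normal 4K Linux pte. */
static inline uint32_t sketch_pfn(uint32_t pte)
{
	return pte >> PAGE_SHIFT;
}

/* Huge pte that carries only a flag below PAGE_SHIFT. */
static inline uint32_t sketch_huge_pte(uint32_t phys)
{
	return phys | SKETCH_L_PTE_HUGEPAGE;
}

/* Huge pte that additionally carries a size flag at bit 12. */
static inline uint32_t sketch_huge_pte_sized(uint32_t phys)
{
	return phys | SKETCH_L_PTE_HUGEPAGE | SKETCH_L_PTE_HPAGE_2M;
}
```

With phys = 0x80200000, sketch_pfn(sketch_huge_pte(phys)) is 0x80200,
whereas sketch_pfn(sketch_huge_pte_sized(phys)) is 0x80201, which is the
only case where masking would be required.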

-- 
Catalin
