* [PATCH mmotm] thp: transparent hugepage core fixlet
From: Hugh Dickins @ 2011-01-11 0:55 UTC
To: Andrew Morton; +Cc: Andrea Arcangeli, linux-mm

If you configure THP in addition to HUGETLB_PAGE on x86_32 without PAE,
the p?d-folding works out that munlock_vma_pages_range() can crash to
follow_page()'s pud_huge() BUG_ON(flags & FOLL_GET): it needs the same
VM_HUGETLB check already there on the pmd_huge() line. Conveniently,
openSUSE provides a "blogd" which tests this out at startup!

Signed-off-by: Hugh Dickins <hughd@google.com>
---
This massive rework belongs just after thp-transparent-hugepage-core.patch

 mm/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- mmotm.orig/mm/memory.c	2011-01-10 16:31:29.000000000 -0800
+++ mmotm/mm/memory.c	2011-01-10 16:33:16.000000000 -0800
@@ -1288,7 +1288,7 @@ struct page *follow_page(struct vm_area_
 	pud = pud_offset(pgd, address);
 	if (pud_none(*pud))
 		goto no_page_table;
-	if (pud_huge(*pud)) {
+	if (pud_huge(*pud) && vma->vm_flags & VM_HUGETLB) {
 		BUG_ON(flags & FOLL_GET);
 		page = follow_huge_pud(mm, address, pud, flags & FOLL_WRITE);
 		goto out;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Andrea Arcangeli @ 2011-01-11 1:57 UTC
To: Hugh Dickins; +Cc: Andrew Morton, linux-mm

Hi Hugh,

On Mon, Jan 10, 2011 at 04:55:53PM -0800, Hugh Dickins wrote:
> If you configure THP in addition to HUGETLB_PAGE on x86_32 without PAE,
> the p?d-folding works out that munlock_vma_pages_range() can crash to
> follow_page()'s pud_huge() BUG_ON(flags & FOLL_GET): it needs the same
> VM_HUGETLB check already there on the pmd_huge() line. Conveniently,
> openSUSE provides a "blogd" which tests this out at startup!
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> This massive rework belongs just after thp-transparent-hugepage-core.patch
>
>  mm/memory.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- mmotm.orig/mm/memory.c	2011-01-10 16:31:29.000000000 -0800
> +++ mmotm/mm/memory.c	2011-01-10 16:33:16.000000000 -0800
> @@ -1288,7 +1288,7 @@ struct page *follow_page(struct vm_area_
>  	pud = pud_offset(pgd, address);
>  	if (pud_none(*pud))
>  		goto no_page_table;
> -	if (pud_huge(*pud)) {
> +	if (pud_huge(*pud) && vma->vm_flags & VM_HUGETLB) {
>  		BUG_ON(flags & FOLL_GET);
>  		page = follow_huge_pud(mm, address, pud, flags & FOLL_WRITE);
>  		goto out;

How is THP related to this? pud_trans_huge doesn't exist, if pud_huge
is true, vma is already guaranteed to belong to hugetlbfs without
requiring the additional check.

I added the check to pmd_huge already, there it is needed, but for
pud_huge it isn't as far as I can tell.

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Hugh Dickins @ 2011-01-11 2:29 UTC
To: Andrea Arcangeli; +Cc: Andrew Morton, linux-mm

On Mon, Jan 10, 2011 at 5:57 PM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Mon, Jan 10, 2011 at 04:55:53PM -0800, Hugh Dickins wrote:
>> If you configure THP in addition to HUGETLB_PAGE on x86_32 without PAE,
>> the p?d-folding works out that munlock_vma_pages_range() can crash to
>> follow_page()'s pud_huge() BUG_ON(flags & FOLL_GET): it needs the same
>> VM_HUGETLB check already there on the pmd_huge() line. Conveniently,
>> openSUSE provides a "blogd" which tests this out at startup!
>
> How is THP related to this? pud_trans_huge doesn't exist, if pud_huge
> is true, vma is already guaranteed to belong to hugetlbfs without
> requiring the additional check.

THP puts in pmds that are huge. In this configuration the "folding" is
such that the puds are the pmds. So the pud_huge test passes and
the BUG_ON hits. I hope I've explained that correctly, agreed that
it's confusing!

>
> I added the check to pmd_huge already, there it is needed, but for
> pud_huge it isn't as far as I can tell.

Crashing on that BUG_ON suggests otherwise ;)

Hugh
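
[Editorial illustration, not part of the original mail.] The folding Hugh
describes can be sketched in a small userspace model. This is only a toy
under stated assumptions: every name ending in _model is hypothetical and
the flag values are placeholders, not the kernel's. It shows that when the
pud level is folded onto the pmd, a THP-style huge entry makes the PSE-bit
test succeed at the pud level, and that the extra VM_HUGETLB check keeps a
non-hugetlbfs vma out of the BUG_ON path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model; every *_model name is hypothetical, not a kernel symbol. */
#define PAGE_PSE_MODEL   0x080u /* stands in for _PAGE_PSE */
#define VM_HUGETLB_MODEL 0x001u /* stands in for VM_HUGETLB */
#define FOLL_GET_MODEL   0x004u /* stands in for FOLL_GET */

typedef struct { uint32_t val; } entry_model;

/*
 * With two-level paging the pud level is folded away: "descending"
 * from the pgd hands back the very same entry, which is also the pmd.
 */
static entry_model *pud_offset_model(entry_model *pgd)
{
	return pgd;
}

/* Pre-fix behaviour as described in the thread: just test the PSE bit. */
static bool pud_huge_model(entry_model pud)
{
	return (pud.val & PAGE_PSE_MODEL) != 0;
}

/*
 * Returns true where the modelled follow_page() would reach
 * BUG_ON(flags & FOLL_GET) with FOLL_GET set.
 * check_vm_hugetlb selects between the old test and the patched one.
 */
static bool bug_on_hits_model(entry_model *pgd, unsigned vm_flags,
			      unsigned foll_flags, bool check_vm_hugetlb)
{
	entry_model *pud = pud_offset_model(pgd);
	bool huge = pud_huge_model(*pud);

	if (check_vm_hugetlb)
		huge = huge && (vm_flags & VM_HUGETLB_MODEL) != 0;
	return huge && (foll_flags & FOLL_GET_MODEL) != 0;
}
```

Calling bug_on_hits_model() on an entry with the PSE bit set, vm_flags
without VM_HUGETLB_MODEL and FOLL_GET_MODEL requested reports a hit for
the old test and no hit for the patched one.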

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Andrea Arcangeli @ 2011-01-11 14:04 UTC
To: Hugh Dickins; +Cc: Andrew Morton, linux-mm

On Mon, Jan 10, 2011 at 06:29:29PM -0800, Hugh Dickins wrote:
> On Mon, Jan 10, 2011 at 5:57 PM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Mon, Jan 10, 2011 at 04:55:53PM -0800, Hugh Dickins wrote:
> >> If you configure THP in addition to HUGETLB_PAGE on x86_32 without PAE,
> >> the p?d-folding works out that munlock_vma_pages_range() can crash to
> >> follow_page()'s pud_huge() BUG_ON(flags & FOLL_GET): it needs the same
> >> VM_HUGETLB check already there on the pmd_huge() line. Conveniently,
> >> openSUSE provides a "blogd" which tests this out at startup!
> >
> > How is THP related to this? pud_trans_huge doesn't exist, if pud_huge
> > is true, vma is already guaranteed to belong to hugetlbfs without
> > requiring the additional check.
>
> THP puts in pmds that are huge. In this configuration the "folding" is
> such that the puds are the pmds. So the pud_huge test passes and
> the BUG_ON hits. I hope I've explained that correctly, agreed that
> it's confusing!
>
> >
> > I added the check to pmd_huge already, there it is needed, but for
> > pud_huge it isn't as far as I can tell.
>
> Crashing on that BUG_ON suggests otherwise ;)

I think I see what you mean, pgd=pud=pmd with 2 levels only, but if
pud_huge can return 1 on x86_32 without PAE, that sounds like an
architectural bug to me. Why can't pud_huge simply return 0 for
x86_32? Any other place dealing with hugepages and calling pud_huge on
x86 noPAE would be at risk, otherwise, no?

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Andrea Arcangeli @ 2011-01-11 16:31 UTC
To: Hugh Dickins; +Cc: Andrew Morton, linux-mm

On Tue, Jan 11, 2011 at 03:04:21PM +0100, Andrea Arcangeli wrote:
> architectural bug to me. Why can't pud_huge simply return 0 for
> x86_32? Any other place dealing with hugepages and calling pud_huge on
> x86 noPAE would be at risk, otherwise, no?

Isn't this a better solution?

======
Subject: avoid confusing hugetlbfs code when pmd_trans_huge is set

From: Andrea Arcangeli <aarcange@redhat.com>

If pmd is set huge by THP, pud_huge shouldn't return 1 when pud doesn't exist
and it's just a 1:1 bypass over the pmd (like it happens on 32bit x86 because
there are at most 2 or 3 levels of pagetables). Only pmd_huge can return 1.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -227,7 +227,15 @@ int pmd_huge(pmd_t pmd)
 
 int pud_huge(pud_t pud)
 {
+#ifdef CONFIG_X86_64
 	return !!(pud_val(pud) & _PAGE_PSE);
+#else
+	/*
+	 * pud is a bypass with 2 or 3 level pagetables, only pmd_huge
+	 * can return 1.
+	 */
+	return 0;
+#endif
 }
 
 struct page *
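
[Editorial illustration, not part of the original mail.] The arch-level
approach in Andrea's patch can be modelled the same way. This is a hedged
sketch only: FOUR_LEVEL_MODEL is a hypothetical macro standing in for
CONFIG_X86_64, and the *_model names and flag value are placeholders. When
the macro is not defined, the predicate is a constant, so a caller's huge
pud branch can never be taken and a compiler is free to drop it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model; every *_model name is hypothetical, not a kernel symbol. */
#define PAGE_PSE_MODEL 0x080u /* stands in for _PAGE_PSE */

typedef struct { uint64_t val; } pud_model;

/*
 * FOUR_LEVEL_MODEL plays the role of CONFIG_X86_64. Only when a real
 * pud level exists does the PSE bit mean "huge pud"; with folded
 * levels the answer is always "no".
 */
static bool pud_huge_model(pud_model pud)
{
#ifdef FOUR_LEVEL_MODEL
	return (pud.val & PAGE_PSE_MODEL) != 0;
#else
	(void)pud; /* the entry is really a pmd; leave it to the pmd check */
	return false;
#endif
}

/* Caller in the style of follow_page(): 1 if the huge pud path is taken. */
static int huge_pud_path_taken_model(pud_model pud)
{
	if (pud_huge_model(pud))
		return 1; /* unreachable when the levels are folded */
	return 0;
}
```

Built without FOUR_LEVEL_MODEL, huge_pud_path_taken_model() returns 0 even
for an entry with the PSE bit set, which is the situation a THP pmd creates
on a folded configuration.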

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Hugh Dickins @ 2011-01-11 22:59 UTC
To: Andrea Arcangeli; +Cc: Andi Kleen, Jeremy Fitzhardinge, Andrew Morton, linux-mm

On Tue, 11 Jan 2011, Andrea Arcangeli wrote:
> On Tue, Jan 11, 2011 at 03:04:21PM +0100, Andrea Arcangeli wrote:
> > architectural bug to me. Why can't pud_huge simply return 0 for
> > x86_32? Any other place dealing with hugepages and calling pud_huge on
> > x86 noPAE would be at risk, otherwise, no?
>
> Isn't this a better solution?

[Better solution than my patch to follow_page() in mmotm, to fix crash
with Transparent Huge Pages by duplicating Andrea's pmd_huge VM_HUGETLB
check to the pud_huge line too.]

The truth is, I'm sure one of the solutions is better than the other,
but I'm too confused by p?d folding to know which is which ;)

Certainly I don't oppose your patch as a replacement for mine,
if you're sure yours is better.

There are only two places which are using pud_huge() anyway:
follow_page() and apply_to_pmd_range(). Is the latter's
BUG_ON(pud_huge) safe? Safe in the THP world?

And I never quite understood why we have both pmd_huge and pmd_large,
pud_huge and pud_large.

There are answers to these questions, but it would take me hours and
hours of easily-confused research (across several arches) to decide.

I'm hoping someone else has a surer grasp: Andi introduced pud_huge(),
and Jeremy is the most active in the pagetable layers nowadays -
perhaps they can tell us more quickly.

Hugh

>
> ======
> Subject: avoid confusing hugetlbfs code when pmd_trans_huge is set
>
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> If pmd is set huge by THP, pud_huge shouldn't return 1 when pud doesn't exist
> and it's just a 1:1 bypass over the pmd (like it happens on 32bit x86 because
> there are at most 2 or 3 levels of pagetables). Only pmd_huge can return 1.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -227,7 +227,15 @@ int pmd_huge(pmd_t pmd)
>
>  int pud_huge(pud_t pud)
>  {
> +#ifdef CONFIG_X86_64
>  	return !!(pud_val(pud) & _PAGE_PSE);
> +#else
> +	/*
> +	 * pud is a bypass with 2 or 3 level pagetables, only pmd_huge
> +	 * can return 1.
> +	 */
> +	return 0;
> +#endif
>  }
>
>  struct page *

* Re: [PATCH mmotm] thp: transparent hugepage core fixlet
From: Andrea Arcangeli @ 2011-01-12 2:02 UTC
To: Hugh Dickins; +Cc: Andi Kleen, Jeremy Fitzhardinge, Andrew Morton, linux-mm

On Tue, Jan 11, 2011 at 02:59:43PM -0800, Hugh Dickins wrote:
> On Tue, 11 Jan 2011, Andrea Arcangeli wrote:
> > On Tue, Jan 11, 2011 at 03:04:21PM +0100, Andrea Arcangeli wrote:
> > > architectural bug to me. Why can't pud_huge simply return 0 for
> > > x86_32? Any other place dealing with hugepages and calling pud_huge on
> > > x86 noPAE would be at risk, otherwise, no?
> >
> > Isn't this a better solution?
>
> [Better solution than my patch to follow_page() in mmotm, to fix crash
> with Transparent Huge Pages by duplicating Andrea's pmd_huge VM_HUGETLB
> check to the pud_huge line too.]
>
> The truth is, I'm sure one of the solutions is better than the other,
> but I'm too confused by p?d folding to know which is which ;)
>
> Certainly I don't oppose your patch as a replacement for mine,
> if you're sure yours is better.
>
> There are only two places which are using pud_huge() anyway:
> follow_page() and apply_to_pmd_range(). Is the latter's
> BUG_ON(pud_huge) safe? Safe in the THP world?

The latter BUG_ON should be safe in the THP world: there's a pmd_huge
BUG_ON too, so it can't be a problem in the THP world.

> And I never quite understood why we have both pmd_huge and pmd_large,
> pud_huge and pud_large.
>
> There are answers to these questions, but it would take me hours and
> hours of easily-confused research (across several arches) to decide.
>
> I'm hoping someone else has a surer grasp: Andi introduced pud_huge(),
> and Jeremy is the most active in the pagetable layers nowadays -
> perhaps they can tell us more quickly.

I'd like their opinion too, but for exactly the same reason that you
asked yourself whether the latter BUG_ON is safe, I think my patch
reduces the risk from a practical perspective. When THP uses pmd_mkhuge
it's counterintuitive that pud_huge returns 1, and there's no benefit to
that at all other than risking troubles like this one. In fact a branch
and a block of follow_page are eliminated at compile time by my patch
(as opposed to your patch, which adds one more branch and can't
eliminate a block of code that would run if the second branch were
taken, even though we know it can't be).

I consider this an arch bug, not a common code issue.

This is the THP modification to the code, and I didn't expect to have to
alter the pud_huge check in addition to the one below. I thought this
would be enough if the arch is correct (and optimal).

@@ -1273,11 +1301,32 @@ struct page *follow_page(struct vm_area_
 	pmd = pmd_offset(pud, address);
 	if (pmd_none(*pmd))
 		goto no_page_table;
-	if (pmd_huge(*pmd)) {
+	if (pmd_huge(*pmd) && vma->vm_flags & VM_HUGETLB) {
 		BUG_ON(flags & FOLL_GET);
 		page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
 		goto out;

I think the x86 3level should work ok with follow_huge_pmd (it's
basically identical to follow_huge_pud so it won't notice the
difference), so I hope it doesn't break anything, and it will speed up
follow_page too (even when THP is off).

Across the whole tree, if you grep for pmd_offset you'll find all the
places where you have to care about THP. I'd still like not to have to
care about the result of pud_offset (having to care about pmd_offset is
more than enough ;).

Other archs implementing pud_huge should also return 0 if there are only
3 levels; if they introduce THP, this will have the benefit of
optimizing follow_page for them when THP is off as well.

Your patch is ok if this will not be considered an arch bug (I think
that, to avoid mistakes, pud_huge should be implemented by
pgtable-nopud.h, but that's a little bigger cleanup I haven't done
myself yet).