public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: William Lee Irwin III <wli@holomorphy.com>,
	Andi Kleen <ak@suse.de>, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Zachary Amsden <zach@vmware.com>
Subject: Re: Why preallocate pmd in x86 32-bit PAE?
Date: Fri, 16 Nov 2007 10:30:28 -0800	[thread overview]
Message-ID: <473DE1C4.1050601@goop.org> (raw)
In-Reply-To: <alpine.LFD.0.9999.0711160921510.4260@woody.linux-foundation.org>

Linus Torvalds wrote:
> On Fri, 16 Nov 2007, Jeremy Fitzhardinge wrote:
>   
>>> IIRC, the present bit is ignored in the magic 4-entry PGD.  All entries 
>>> have to be present.
>>>       
>> Hm, do you recall what processors that might affect?  As far as I know,
>> current processors will ignore non-present top-level entries.
>>     
>
> Are you sure?
>   

3.8.5 in vol 3a "Page-Directory and Page-Table Entries With Extended
Addressing Enabled":

    The present flag (bit 0) in the page-directory-pointer-table entries
    can be set to 0 or 1. If the present flag is clear, the remaining
    bits in the page-directory-pointer-table entry are available to the
    operating system. If the present flag is set, the fields of the
    page-directory-pointer-table entry are defined in Figures 3-20 for
    4-KByte pages and Figures 3-21 for 2-MByte pages.

So I would assume this works on all current CPUs, but I can imagine that
some older/off-brand processors might get it wrong.

> Anyway, this is not worth making a distinction for. Just pre-allocate all 
> of them. There really is just 4 PGD entries, and it really *is* different 
> from having a full three-level page table, and of the four PGD entries:
>
>  - one is used for the kernel mapping (assuming the regular 1:3 layout)
>  - AT LEAST two are required by user space anyway
>
> so pre-allocating is never going to waste more than one page.
>   

Yeah, I'm not so concerned about memory saving; I don't think there
would be any in practice.

> And you may feel that pre-allocating is a special case, but it's an 
> *easier* special case than the one that you are apparently thinking about 
> (which is to special-case according to CPU version).
>   

I'm hoping to avoid special-casing anything, if I can help it, aside
from the normal 32/64-bit 2/3/4-level parameterising of the various
pagetable accessors.

> So don't do it. Just preallocate for the magic 4-entry PGD. You can make 
> the special case just be something like
>
> 	/* Preallocate for small PGD's */
> 	#if PTRS_PER_PGD == 4
> 		for (i = 0; i < USER_PTRS_PER_PGD; i++) {
> 			pmd_t *pmd = pmd_alloc();
> 			set_pgd(pgd+i, __pgd(PAGE_PRESENT | __pa(pmd));
> 		}	
> 	#endif
>
> or similar. 
>
> There is absolutely *zero* reason not to do this, and there is also zero 
> reason to make this be a "32-bit vs 64-bit" issue. The code can be there 
> in both, and the #if could even be all in C code (ie there may be reasons 
> to prefer writing it as
>
> 	/* The old-style PAE PGD needs to be preallocated */
> 	if (USER_PTRS_PER_PGD <= 4) {
> 		...
> 	}
>
> and the compiler should even compile it away entirely for all practical 
> cases even without using the preprocessor.
>   

Perhaps.  And there's the corresponding difference between 32 and 64 bit
on freeing a pagetable; 32-bit assumes the pgd destructor will free the
pmd, whereas 64-bit does it separately.  Even in the current 32-bit
code, there's separate handling for PAE and non-PAE.  I think it can all
be collapsed down in a reasonable way.

>> That just means we need to reload cr3 after populating the pgd with a
>> new pmd, right?
>>     
>
> BUT ONLY FOR THIS CASE!
>
> And if you preallocate it, you make *that* special case go away. 
>   

Yes, that is a bit awkward; it means that 32-bit PAE would need a
speparate pgd_populate.  But that seems like a smaller change than 1)
making 32-bit PAE pgd-alloc preallocate the pmd, and 2) making pmd_free
noop on 32-bit PAE, and 3) making pgd_free free the preallocated pmd. 
Perhaps 2 & 3 aren't necessary and can be the same as 64-bit.

I'll need to look into it more carefully.

> So you're going to have special cases regardless. Do the simple and 
> really straightforward one, please! Nothing subtle.
>   

Yep, absolutely.

    J

  reply	other threads:[~2007-11-16 18:31 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-15 21:57 Why preallocate pmd in x86 32-bit PAE? Jeremy Fitzhardinge
2007-11-15 22:12 ` Linus Torvalds
2007-11-15 22:42   ` H. Peter Anvin
2007-11-16  0:40     ` William Lee Irwin III
2007-11-16  0:41       ` H. Peter Anvin
2007-11-16 11:16         ` Andi Kleen
2007-11-16 15:45           ` H. Peter Anvin
2007-11-16 15:53             ` Andi Kleen
2007-11-16 16:10               ` H. Peter Anvin
2007-11-16 17:12   ` Jeremy Fitzhardinge
2007-11-16 17:35     ` Linus Torvalds
2007-11-16 18:30       ` Jeremy Fitzhardinge [this message]
2007-11-16 19:14       ` Jeremy Fitzhardinge
2007-11-16 19:22         ` Linus Torvalds
2007-11-16 19:43           ` Jeremy Fitzhardinge
2007-11-16 17:45     ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=473DE1C4.1050601@goop.org \
    --to=jeremy@goop.org \
    --cc=ak@suse.de \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=nickpiggin@yahoo.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=wli@holomorphy.com \
    --cc=zach@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox