From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: William Lee Irwin III <wli@holomorphy.com>,
	Andi Kleen <ak@suse.de>, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Zachary Amsden <zach@vmware.com>
Subject: Re: Why preallocate pmd in x86 32-bit PAE?
Date: Fri, 16 Nov 2007 10:30:28 -0800
Message-ID: <473DE1C4.1050601@goop.org>
In-Reply-To: <alpine.LFD.0.9999.0711160921510.4260@woody.linux-foundation.org>

Linus Torvalds wrote:
> On Fri, 16 Nov 2007, Jeremy Fitzhardinge wrote:
>   
>>> IIRC, the present bit is ignored in the magic 4-entry PGD.  All entries 
>>> have to be present.
>>>       
>> Hm, do you recall what processors that might affect?  As far as I know,
>> current processors will ignore non-present top-level entries.
>>     
>
> Are you sure?
>   

3.8.5 in vol 3a "Page-Directory and Page-Table Entries With Extended
Addressing Enabled":

    The present flag (bit 0) in the page-directory-pointer-table entries
    can be set to 0 or 1. If the present flag is clear, the remaining
    bits in the page-directory-pointer-table entry are available to the
    operating system. If the present flag is set, the fields of the
    page-directory-pointer-table entry are defined in Figures 3-20 for
    4-KByte pages and Figures 3-21 for 2-MByte pages.

So I would assume this works on all current CPUs, but I can imagine that
some older/off-brand processors might get it wrong.
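
To make the quoted rule concrete, here is a minimal userspace sketch of
testing the present flag in a 4-entry PAE top-level table. It is only a
model: PDPTE_PRESENT, pdpte_present() and pdpt_count_present() are
hypothetical names made up for illustration, not kernel interfaces, and
the only fact taken from the manual text above is that the present flag
is bit 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of a page-directory-pointer-table entry (hypothetical name). */
#define PDPTE_PRESENT 0x1ULL

/* True if the entry is marked present; when it is not, the remaining
 * bits are free for the OS to use, per the manual text quoted above. */
static bool pdpte_present(uint64_t pdpte)
{
	return (pdpte & PDPTE_PRESENT) != 0;
}

/* Count the present entries in a 4-entry PAE top-level table. */
static int pdpt_count_present(const uint64_t pdpt[4])
{
	int n = 0;

	assert(pdpt != NULL);
	for (int i = 0; i < 4; i++)
		if (pdpte_present(pdpt[i]))
			n++;
	return n;
}
```

Whether a given CPU really honours a clear present flag at this level is
exactly the open question in this thread; the sketch only shows what the
documented behaviour would mean for software.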

> Anyway, this is not worth making a distinction for. Just pre-allocate all 
> of them. There really is just 4 PGD entries, and it really *is* different 
> from having a full three-level page table, and of the four PGD entries:
>
>  - one is used for the kernel mapping (assuming the regular 1:3 layout)
>  - AT LEAST two are required by user space anyway
>
> so pre-allocating is never going to waste more than one page.
>   

Yeah, I'm not so concerned about memory saving; I don't think there
would be any in practice.

> And you may feel that pre-allocating is a special case, but it's an 
> *easier* special case than the one that you are apparently thinking about 
> (which is to special-case according to CPU version).
>   

I'm hoping to avoid special-casing anything, if I can help it, aside
from the normal 32/64-bit 2/3/4-level parameterising of the various
pagetable accessors.
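
As an illustration of what that parameterising looks like for the
32-bit PAE case, here is a small userspace sketch that splits a virtual
address into its three table indices. The MODEL_* names and model_*()
helpers are hypothetical; the shift and size values (30/21/12 and
4/512/512) are the usual PAE layout, and other configurations would
only change the constants.

```c
#include <assert.h>
#include <stdint.h>

/* PAE layout: 2 bits of pgd index, 9 of pmd, 9 of pte, 12 of offset. */
#define MODEL_PGDIR_SHIFT   30
#define MODEL_PMD_SHIFT     21
#define MODEL_PAGE_SHIFT    12
#define MODEL_PTRS_PER_PGD  4
#define MODEL_PTRS_PER_PMD  512
#define MODEL_PTRS_PER_PTE  512

/* Index into the 4-entry top-level table. */
static unsigned model_pgd_index(uint32_t addr)
{
	return (addr >> MODEL_PGDIR_SHIFT) & (MODEL_PTRS_PER_PGD - 1);
}

/* Index into the page middle directory. */
static unsigned model_pmd_index(uint32_t addr)
{
	return (addr >> MODEL_PMD_SHIFT) & (MODEL_PTRS_PER_PMD - 1);
}

/* Index into the last-level page table. */
static unsigned model_pte_index(uint32_t addr)
{
	return (addr >> MODEL_PAGE_SHIFT) & (MODEL_PTRS_PER_PTE - 1);
}
```

With the regular split, the first kernel address 0xC0000000 lands in
top-level slot 3, and the last user address 0xBFFFFFFF in slot 2.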

> So don't do it. Just preallocate for the magic 4-entry PGD. You can make 
> the special case just be something like
>
> 	/* Preallocate for small PGD's */
> 	#if PTRS_PER_PGD == 4
> 		for (i = 0; i < USER_PTRS_PER_PGD; i++) {
> 			pmd_t *pmd = pmd_alloc();
> 			set_pgd(pgd+i, __pgd(PAGE_PRESENT | __pa(pmd)));
> 		}	
> 	#endif
>
> or similar. 
>
> There is absolutely *zero* reason not to do this, and there is also zero 
> reason to make this be a "32-bit vs 64-bit" issue. The code can be there 
> in both, and the #if could even be all in C code (ie there may be reasons 
> to prefer writing it as
>
> 	/* The old-style PAE PGD needs to be preallocated */
> 	if (USER_PTRS_PER_PGD <= 4) {
> 		...
> 	}
>
> and the compiler should even compile it away entirely for all practical 
> cases even without using the preprocessor.
>   

Perhaps.  And there's the corresponding difference between 32 and 64 bit
on freeing a pagetable; 32-bit assumes the pgd destructor will free the
pmd, whereas 64-bit does it separately.  Even in the current 32-bit
code, there's separate handling for PAE and non-PAE.  I think it can all
be collapsed down in a reasonable way.

>> That just means we need to reload cr3 after populating the pgd with a
>> new pmd, right?
>>     
>
> BUT ONLY FOR THIS CASE!
>
> And if you preallocate it, you make *that* special case go away. 
>   

Yes, that is a bit awkward; it means that 32-bit PAE would need a
separate pgd_populate.  But that seems like a smaller change than 1)
making 32-bit PAE pgd_alloc preallocate the pmd, 2) making pmd_free a
no-op on 32-bit PAE, and 3) making pgd_free free the preallocated pmd.
Perhaps 2 & 3 aren't necessary and can be the same as 64-bit.
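
For reference, the three-part preallocating scheme could be modelled in
userspace roughly as below. This is a sketch under stated assumptions,
not kernel code: every model_* name is hypothetical, calloc()/free()
stand in for page allocation, and three user slots are assumed from the
regular 1:3 layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PTRS_PER_PGD       4    /* PAE top level has four entries */
#define MODEL_USER_PTRS_PER_PGD  3    /* assumes the 1:3 kernel:user split */
#define MODEL_PTRS_PER_PMD       512
#define MODEL_PRESENT            0x1ULL

typedef struct { uint64_t entry[MODEL_PTRS_PER_PMD]; } model_pmd_t;

typedef struct {
	model_pmd_t *pmd[MODEL_PTRS_PER_PGD];   /* stands in for __pa(pmd) */
	uint64_t flags[MODEL_PTRS_PER_PGD];
} model_pgd_t;

/* (3) pgd_free releases the preallocated pmds along with the pgd. */
static void model_pgd_free(model_pgd_t *pgd)
{
	if (!pgd)
		return;
	for (int i = 0; i < MODEL_PTRS_PER_PGD; i++)
		free(pgd->pmd[i]);              /* free(NULL) is harmless */
	free(pgd);
}

/* (1) pgd_alloc preallocates a pmd for every user slot, so no
 * top-level entry ever has to be populated later. */
static model_pgd_t *model_pgd_alloc(void)
{
	model_pgd_t *pgd = calloc(1, sizeof(*pgd));

	if (!pgd)
		return NULL;
	for (int i = 0; i < MODEL_USER_PTRS_PER_PGD; i++) {
		pgd->pmd[i] = calloc(1, sizeof(model_pmd_t));
		if (!pgd->pmd[i]) {
			model_pgd_free(pgd);    /* unwind a partial allocation */
			return NULL;
		}
		pgd->flags[i] = MODEL_PRESENT;
	}
	return pgd;
}

/* (2) pmd_free is a no-op: the pmd's lifetime is tied to the pgd. */
static void model_pmd_free(model_pmd_t *pmd)
{
	(void)pmd;
}
```

The kernel slot is left empty in this model; in the real thing it would
refer to the shared kernel pmd.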

I'll need to look into it more carefully.

> So you're going to have special cases regardless. Do the simple and 
> really straightforward one, please! Nothing subtle.
>   

Yep, absolutely.

    J
