* Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Tejun Heo @ 2005-10-04 11:59 UTC
To: Andi Kleen; +Cc: lkml

Hello, Andi.

In include/asm-x86_64/page.h, __VIRTUAL_MASK_SHIFT is defined as 48
bits which is the size of virtual address space on current x86_64
machines as used as such. OTOH, __PHYSICAL_MASK_SHIFT is defined as 46
and used as mask shift for physical page address (i.e. physaddr >> 12).

In addition to being a bit confusing due to the similar names but
different meanings, this means that we assume processors can physically
address 58 (46 + 12) bits, but both the amd64 and IA-32e manuals say
that the current architectural limit is 52 bits and that bits 52-62 are
reserved in all page table entries. This doesn't cause any problem
currently (or in the foreseeable future), but it's still a bit weird.
Am I missing something?

Thanks.

--
tejun
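(To make the numbers above concrete, here is a minimal user-space sketch
added alongside the thread, not part of it. Only PAGE_SHIFT and the
__PHYSICAL_MASK definition mirror include/asm-x86_64/page.h; the program
itself is purely an illustration.)

#include <stdio.h>

#define PAGE_SHIFT            12
#define __PHYSICAL_MASK_SHIFT 46
#define __PHYSICAL_MASK       ((1UL << __PHYSICAL_MASK_SHIFT) - 1)

int main(void)
{
	/* A 46-bit mask applied to a pfn (physaddr >> 12) implicitly
	 * allows physical addresses of up to 46 + 12 = 58 bits... */
	printf("implied physical address bits: %d\n",
	       __PHYSICAL_MASK_SHIFT + PAGE_SHIFT);

	/* ...while the amd64/IA-32e manuals put the architectural
	 * limit at 52 bits. */
	printf("mask value applied to pfns: %#lx\n", __PHYSICAL_MASK);
	return 0;
}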
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Tejun Heo @ 2005-10-04 12:02 UTC
To: Andi Kleen; +Cc: lkml

Sorry, corrections.

On Tue, Oct 04, 2005 at 08:59:48PM +0900, Tejun Heo wrote:
>
>   Hello, Andi.
>
>   In include/asm-x86_64/page.h, __VIRTUAL_MASK_SHIFT is defined as 48
> bits which is the size of virtual address space on current x86_64
> machines as used as such. OTOH, __PHYSICAL_MASK_SHIFT is defined as 46

  and used as such.

> and used as mask shift for physical page address (i.e. physaddr >> 12).

  physical page number (i.e. physaddr >> 12).

Sorry about the extra noise; I gotta eat something, feeling dizzy. :-)

--
tejun
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Andi Kleen @ 2005-10-04 17:24 UTC
To: Tejun Heo; +Cc: lkml, discuss

On Tuesday 04 October 2005 13:59, Tejun Heo wrote:
> Hello, Andi.
>
> In include/asm-x86_64/page.h, __VIRTUAL_MASK_SHIFT is defined as 48
> bits which is the size of virtual address space on current x86_64
> machines as used as such. OTOH, __PHYSICAL_MASK_SHIFT is defined as 46
> and used as mask shift for physical page address (i.e. physaddr >> 12).
>
> In addition to being a bit confusing due to similar names but
> different meanings, this means that we assume processors can physically
> address 58 (46 + 12) bits, but both amd64 and IA-32e manuals say that
> current architectural limit is 52 bits and bits 52-62 are reserved in
> all page table entries. This currently (and in foreseeable future)
> doesn't cause any problem but it's still a bit weird.

You're right - PHYSICAL_MASK shouldn't be applied to PFNs, only to full
addresses. Fixed with the appended patch.

The 46-bit limit is because half of the 48-bit virtual space is used for
user space, and the other, 47-bit half is divided into the direct mapping
and other mappings (ioremap, vmalloc etc.). All physical memory has to
fit into the direct mapping, so you end up with a 46-bit limit.

See also Documentation/x86-64/mm.txt

-Andi

---

Don't apply __PHYSICAL_MASK to page frame numbers

Pointed out by Tejun Heo.

Cc: htejun@gmail.com
Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/include/asm-x86_64/pgtable.h
===================================================================
--- linux.orig/include/asm-x86_64/pgtable.h
+++ linux/include/asm-x86_64/pgtable.h
@@ -259,7 +259,7 @@ static inline unsigned long pud_bad(pud_
 #define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))	/* FIXME: is this right? */
 
 #define pte_page(x)	pfn_to_page(pte_pfn(x))
-#define pte_pfn(x)	((pte_val(x) >> PAGE_SHIFT) & __PHYSICAL_MASK)
+#define pte_pfn(x)	((pte_val(x) & __PHYSICAL_MASK) >> PAGE_SHIFT)
 
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
@@ -384,7 +384,7 @@ static inline pud_t *__pud_offset_k(pud_
 #define	pmd_clear(xp)	do { set_pmd(xp, __pmd(0)); } while (0)
 #define pmd_bad(x)	((pmd_val(x) & (~PTE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE )
 #define pfn_pmd(nr,prot) (__pmd(((nr) << PAGE_SHIFT) | pgprot_val(prot)))
-#define pmd_pfn(x)	((pmd_val(x) >> PAGE_SHIFT) & __PHYSICAL_MASK)
+#define pmd_pfn(x)	((pmd_val(x) & __PHYSICAL_MASK) >> PAGE_SHIFT)
 
 #define pte_to_pgoff(pte) ((pte_val(pte) & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT)
 #define pgoff_to_pte(off) ((pte_t) { ((off) << PAGE_SHIFT) | _PAGE_FILE })
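(Why the operand order in the fix matters can be shown with a small
user-space sketch -- an added illustration, not thread material. The pte
value and its flag bits are invented, and bit 52 stands in for one of the
high software-available pte bits; only PAGE_SHIFT and __PHYSICAL_MASK
mirror the kernel headers.)

#include <stdio.h>

#define PAGE_SHIFT      12
#define __PHYSICAL_MASK ((1UL << 46) - 1)

int main(void)
{
	/* Made-up pte: pfn 0x12345, some low flag bits, and a
	 * software-defined bit 52 set. */
	unsigned long pte = (1UL << 52) | (0x12345UL << PAGE_SHIFT) | 0x067;

	/* Old order: shifting first moves bit 52 down to bit 40, which
	 * lies inside the 46-bit mask and corrupts the pfn. */
	unsigned long buggy = (pte >> PAGE_SHIFT) & __PHYSICAL_MASK;

	/* Fixed order: masking the full address first strips bit 52. */
	unsigned long fixed = (pte & __PHYSICAL_MASK) >> PAGE_SHIFT;

	printf("buggy pte_pfn: %#lx\n", buggy); /* 0x10000012345 */
	printf("fixed pte_pfn: %#lx\n", fixed); /* 0x12345 */
	return 0;
}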
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Tejun Heo @ 2005-10-04 18:52 UTC
To: Andi Kleen; +Cc: lkml, discuss

Hello, Andi.

On Tue, Oct 04, 2005 at 07:24:56PM +0200, Andi Kleen wrote:
>
> You're right - PHYSICAL_MASK shouldn't be applied to PFNs, only to full
> addresses. Fixed with appended patch.
>
> The 46-bit limit is because half of the 48-bit virtual space is used
> for user space and the other, 47-bit half is divided into the direct
> mapping and other mappings (ioremap, vmalloc etc.). All physical memory
> has to fit into the direct mapping, so you end up with a 46-bit limit.

__PHYSICAL_MASK is only used to mask out non-pfn bits from page table
entries. I don't really see how it's related to virtual space
construction.

> See also Documentation/x86-64/mm.txt

Thanks. :-)

I think PHYSICAL_PAGE_MASK and PTE_FILE_MAX_BITS should also be updated.
How about the following patch? Compile & boot tested.

This patch cleans up __PHYSICAL_MASK and related constants.

- __PHYSICAL_MASK_SHIFT is changed to 52 to reflect the architectural
  physical address limit.
- PHYSICAL_PAGE_MASK is fixed to properly mask the pfn bits of page
  table entries.
- PTE_FILE_MAX_BITS is fixed to properly reflect the bits available in a
  page table entry (40 bits).
- Macros dealing with page table entries are modified to universally use
  PTE_MASK (which equals PHYSICAL_PAGE_MASK) when extracting the pfn,
  for consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>

diff --git a/include/asm-x86_64/page.h b/include/asm-x86_64/page.h
--- a/include/asm-x86_64/page.h
+++ b/include/asm-x86_64/page.h
@@ -11,7 +11,7 @@
 #define PAGE_SIZE	(1UL << PAGE_SHIFT)
 #endif
 #define PAGE_MASK	(~(PAGE_SIZE-1))
-#define PHYSICAL_PAGE_MASK	(~(PAGE_SIZE-1) & (__PHYSICAL_MASK << PAGE_SHIFT))
+#define PHYSICAL_PAGE_MASK	(PAGE_MASK & __PHYSICAL_MASK)
 
 #define THREAD_ORDER 1
 #ifdef __ASSEMBLY__
@@ -81,7 +81,7 @@ typedef struct { unsigned long pgprot; }
 #define PAGE_ALIGN(addr)	(((addr)+PAGE_SIZE-1)&PAGE_MASK)
 
 /* See Documentation/x86_64/mm.txt for a description of the memory map. */
-#define __PHYSICAL_MASK_SHIFT	46
+#define __PHYSICAL_MASK_SHIFT	52
 #define __PHYSICAL_MASK		((1UL << __PHYSICAL_MASK_SHIFT) - 1)
 #define __VIRTUAL_MASK_SHIFT	48
 #define __VIRTUAL_MASK		((1UL << __VIRTUAL_MASK_SHIFT) - 1)
diff --git a/include/asm-x86_64/pgtable.h b/include/asm-x86_64/pgtable.h
--- a/include/asm-x86_64/pgtable.h
+++ b/include/asm-x86_64/pgtable.h
@@ -101,7 +101,7 @@ static inline void pgd_clear (pgd_t * pg
 }
 
 #define pud_page(pud) \
-((unsigned long) __va(pud_val(pud) & PHYSICAL_PAGE_MASK))
+((unsigned long) __va(pud_val(pud) & PTE_MASK))
 
 #define ptep_get_and_clear(mm,addr,xp)	__pte(xchg(&(xp)->pte, 0))
@@ -245,7 +245,7 @@ static inline unsigned long pud_bad(pud_
 #define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))	/* FIXME: is this right? */
 
 #define pte_page(x)	pfn_to_page(pte_pfn(x))
-#define pte_pfn(x)	((pte_val(x) >> PAGE_SHIFT) & __PHYSICAL_MASK)
+#define pte_pfn(x)	((pte_val(x) & PTE_MASK) >> PAGE_SHIFT)
 
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
@@ -352,11 +352,11 @@ static inline pud_t *__pud_offset_k(pud_
 #define	pmd_clear(xp)	do { set_pmd(xp, __pmd(0)); } while (0)
 #define pmd_bad(x)	((pmd_val(x) & (~PTE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE )
 #define pfn_pmd(nr,prot) (__pmd(((nr) << PAGE_SHIFT) | pgprot_val(prot)))
-#define pmd_pfn(x)	((pmd_val(x) >> PAGE_SHIFT) & __PHYSICAL_MASK)
+#define pmd_pfn(x)	((pmd_val(x) & PTE_MASK) >> PAGE_SHIFT)
 
-#define pte_to_pgoff(pte) ((pte_val(pte) & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT)
+#define pte_to_pgoff(pte) ((pte_val(pte) & PTE_MASK) >> PAGE_SHIFT)
 #define pgoff_to_pte(off) ((pte_t) { ((off) << PAGE_SHIFT) | _PAGE_FILE })
-#define PTE_FILE_MAX_BITS __PHYSICAL_MASK_SHIFT
+#define PTE_FILE_MAX_BITS (__PHYSICAL_MASK_SHIFT - PAGE_SHIFT)
 
 /* PTE - Level 1 access. */
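(A quick sanity check of the identities in Tejun's patch, as a
hypothetical user-space sketch; it assumes the patched value
__PHYSICAL_MASK_SHIFT == 52 and mirrors PAGE_MASK from the header.)

#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE-1))

int main(void)
{
	unsigned long phys_mask = (1UL << 52) - 1; /* patched __PHYSICAL_MASK */

	/* The simplified PHYSICAL_PAGE_MASK keeps exactly pte bits
	 * 12..51 -- the architectural pfn field. */
	assert((PAGE_MASK & phys_mask) == 0x000ffffffffff000UL);

	/* And 52 - 12 = 40 pgoff bits fit into that field. */
	printf("PTE_FILE_MAX_BITS = %d\n", 52 - PAGE_SHIFT);
	return 0;
}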
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Andi Kleen @ 2005-10-04 19:01 UTC
To: Tejun Heo; +Cc: lkml, discuss

On Tuesday 04 October 2005 20:52, Tejun Heo wrote:
> Hello, Andi.
>
> On Tue, Oct 04, 2005 at 07:24:56PM +0200, Andi Kleen wrote:
>> You're right - PHYSICAL_MASK shouldn't be applied to PFNs, only to
>> full addresses. Fixed with appended patch.
>>
>> The 46-bit limit is because half of the 48-bit virtual space is used
>> for user space and the other, 47-bit half is divided into the direct
>> mapping and other mappings (ioremap, vmalloc etc.). All physical
>> memory has to fit into the direct mapping, so you end up with a
>> 46-bit limit.
>
> __PHYSICAL_MASK is only used to mask out non-pfn bits from page table
> entries. I don't really see how it's related to virtual space
> construction.

Any other bits are not needed and should be pte_bad()ed.

Ok, there could in theory be IO mappings beyond 46 bits, but I will worry
about those when they happen. For now it's better to error out, to make
it easier to detect memory corruption in page tables (some x86-64 CPUs
tend to machine check when presented with unmapped physical addresses,
which is nasty to track down).

>> See also Documentation/x86-64/mm.txt
>
> Thanks. :-)
>
> I think PHYSICAL_PAGE_MASK and PTE_FILE_MAX_BITS should also be
> updated. How about the following patch? Compile & boot tested.

No, I think the existing code with my patch is fine.

-Andi
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Tejun Heo @ 2005-10-04 19:19 UTC
To: Andi Kleen; +Cc: lkml, discuss

Andi Kleen wrote:
> On Tuesday 04 October 2005 20:52, Tejun Heo wrote:
>> __PHYSICAL_MASK is only used to mask out non-pfn bits from page table
>> entries. I don't really see how it's related to virtual space
>> construction.
>
> Any other bits are not needed and should be pte_bad()ed.
>
> Ok, there could in theory be IO mappings beyond 46 bits, but I will
> worry about those when they happen. For now it's better to error out,
> to make it easier to detect memory corruption in page tables (some
> x86-64 CPUs tend to machine check when presented with unmapped physical
> addresses, which is nasty to track down).

Ahh.. I see.

>> I think PHYSICAL_PAGE_MASK and PTE_FILE_MAX_BITS should also be
>> updated. How about the following patch? Compile & boot tested.
>
> No, I think the existing code with my patch is fine.

Hmmm.. but, currently:

* PHYSICAL_PAGE_MASK == (~(PAGE_SIZE-1) & (__PHYSICAL_MASK << PAGE_SHIFT))
                     == (0xffffffff_fffff000 & (0x00003fff_ffffffff << 12))
                     == 0x03ffffff_fffff000

  while it actually should be 0x00003fff_fffff000

* PTE_FILE_MAX_BITS == __PHYSICAL_MASK_SHIFT == 46, but only 40 bits are
  available in page table entries.

--
tejun
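(Tejun's hex arithmetic checks out mechanically; the following user-space
sketch is an added illustration, with the two masks copied from the
definitions quoted above.)

#include <stdio.h>

int main(void)
{
	unsigned long page_mask = ~((1UL << 12) - 1); /* ~(PAGE_SIZE-1)   */
	unsigned long phys_mask = (1UL << 46) - 1;    /* __PHYSICAL_MASK  */

	/* Current definition: the 46-bit mask shifted up reaches bit 57. */
	printf("current : %#018lx\n", page_mask & (phys_mask << 12));

	/* Intended: the 46-bit mask with the page offset cleared. */
	printf("intended: %#018lx\n", page_mask & phys_mask);
	return 0;
}

/* prints:
 *   current : 0x03fffffffffff000
 *   intended: 0x00003ffffffff000
 */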
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Andi Kleen @ 2005-10-04 19:27 UTC
To: Tejun Heo; +Cc: lkml, discuss

On Tuesday 04 October 2005 21:19, Tejun Heo wrote:
> Hmmm.. but, currently:
>
> * PHYSICAL_PAGE_MASK == (~(PAGE_SIZE-1) & (__PHYSICAL_MASK << PAGE_SHIFT))
>                      == (0xffffffff_fffff000 & (0x00003fff_ffffffff << 12))
>                      == 0x03ffffff_fffff000
>   while it actually should be 0x00003fff_fffff000

Right. Fixed now.

> * PTE_FILE_MAX_BITS == __PHYSICAL_MASK_SHIFT == 46, but only 40 bits
>   are available in page table entries.

The non-linear mapping format is independent of the MMU; the number is
pretty much arbitrary, but it is consistent to make it the same as for
other ptes for easier sanity checking.

-Andi
* Re: Question regarding x86_64 __PHYSICAL_MASK_SHIFT
From: Tejun Heo @ 2005-10-05 4:01 UTC
To: Andi Kleen; +Cc: lkml, discuss

Hello, Andi.

Andi Kleen wrote:
> On Tuesday 04 October 2005 21:19, Tejun Heo wrote:
>> * PTE_FILE_MAX_BITS == __PHYSICAL_MASK_SHIFT == 46, but only 40 bits
>>   are available in page table entries.
>
> The non-linear mapping format is independent of the MMU; the number is
> pretty much arbitrary, but it is consistent to make it the same as for
> other ptes for easier sanity checking.

Okay, please forgive me if I'm bugging you with something stupid, but I
still don't quite get it.

When using a NONLINEAR mapping, the pgoff is stored in the pte for use
later when faulting the page back in. Storing and reading the pgoff are
done with the following macros.

#define pte_to_pgoff(pte) \
	((pte_val(pte) & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT)

#define pgoff_to_pte(off) \
	((pte_t) { ((off) << PAGE_SHIFT) | _PAGE_FILE })

In pte_to_pgoff we mask the pte value with PHYSICAL_PAGE_MASK, which
gives us 34 bits with the patches applied. This means that if a pgoff
goes through pgoff_to_pte and then pte_to_pgoff, only 34 bits survive.

sys_remap_file_pages() checks whether the required pgoffs fit in ptes
using PTE_FILE_MAX_BITS.

#define PTE_FILE_MAX_BITS	__PHYSICAL_MASK_SHIFT

which is 46 with the patches applied. This means we could end up shoving
a value larger than 34 bits into a pte and losing information when
reading it back (and 2^34 pages is only 64TB!).

So, IMHO, we should either shrink PTE_FILE_MAX_BITS to 34 or change the
pte_to_pgoff/pgoff_to_pte macros to carry more bits (as pte bits 52-62
are available, we can shove in 46 bits easily). No?

--
tejun
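(The round-trip loss described above can be demonstrated with a
hypothetical user-space model of the two macros. PHYSICAL_PAGE_MASK uses
the post-fix value, and _PAGE_FILE == 0x040 is assumed to match the
header; everything else is illustration only.)

#include <stdio.h>

#define PAGE_SHIFT         12
#define _PAGE_FILE         0x040
#define PHYSICAL_PAGE_MASK 0x00003ffffffff000UL /* bits 12..45, post-fix */

static unsigned long pgoff_to_pte(unsigned long off)
{
	return (off << PAGE_SHIFT) | _PAGE_FILE;
}

static unsigned long pte_to_pgoff(unsigned long pte)
{
	return (pte & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT;
}

int main(void)
{
	unsigned long off  = 1UL << 35; /* needs 36 bits > the 34-bit field */
	unsigned long back = pte_to_pgoff(pgoff_to_pte(off));

	/* Bits 34 and up fall outside PHYSICAL_PAGE_MASK and vanish. */
	printf("stored %#lx, read back %#lx\n", off, back);
	return 0;
}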