* Re: x86-64 machine_to_phys vs NX bit
@ 2006-11-11  4:30 John Byrne
  2006-11-11  9:56 ` John Byrne
  0 siblings, 1 reply; 19+ messages in thread
From: John Byrne @ 2006-11-11  4:30 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser wrote:
> On 24/8/06 8:25 pm, "Rik van Riel" <riel@redhat.com> wrote:
> 
>> Say, something like the following?
>> 
>> -    paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
>> +    paddr_t phys = mfn_to_pfn((machine >> PAGE_SHIFT) & PHYSICAL_MASK);
>> 
>> I'm still thinking I may have missed something in the code
>> somewhere, but I've been looking at this for over an hour now
>> and can't seem to find it...
>> 
>> Any ideas?
> 
> Your suggested patch looks reasonable but it'd be good to find out why this
> hasn't caused us problems. For example, perhaps supported_pte_mask doesn't
> include PAGE_NX, so we're never setting the NX bit on 64-bit PTEs? That must
> be worth checking out, possibly also tracing machine_to_phys to find out
> where that bit 63 goes -- I agree that it looks like mfn_to_pfn() shouldn't
> work if bit63 is set in the 'maddr' argument.
> 
>  -- Keir
> 
> 

While trying to debug a migration problem in Xen 3.0.3 I have noticed
this issue. I don't see a fix in xen-unstable. Has this gotten dropped
on the floor?

The suggested patch above is not quite correct or complete. My proposed
patch against xen-unstable changeset 12364:d19deb173503 is attached.
Note that there is also an issue in x86 PAE: machine_to_phys() currently 
will strip the NX bit.

Signed-off-by: John Byrne <john.l.byrne@hp.com>
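
The PAE remark above can be illustrated with a small model. This is only a
sketch under stated assumptions, not code from the patch: the i386
"unsigned long" is modelled with uint32_t, demo_mfn_to_pfn() is a
hypothetical stand-in for the real machine-to-phys table lookup, and
DEMO_PAE_PHYSICAL_PAGE_MASK is assumed to cover address bits 12..35 as a
full 64-bit value.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_NX_BIT     (UINT64_C(1) << 63)
/* Assumption: address bits 12..35 of a PAE PTE, as a 64-bit constant. */
#define DEMO_PAE_PHYSICAL_PAGE_MASK UINT64_C(0x0000000FFFFFF000)

/* Hypothetical stand-in for mfn_to_pfn(); the real one reads a table. */
static inline uint32_t demo_mfn_to_pfn(uint32_t mfn)
{
	return mfn + 0x10;
}

/* Pre-patch shape: only the low 12 flag bits are carried over. */
static inline uint64_t demo_m2p_pae_old(uint64_t machine)
{
	/* The cast models the implicit conversion to a 32-bit ulong. */
	uint64_t phys = demo_mfn_to_pfn((uint32_t)(machine >> DEMO_PAGE_SHIFT));
	return (phys << DEMO_PAGE_SHIFT) | (machine & UINT64_C(0xFFF));
}

/* Patched shape: every bit outside the address field is carried over. */
static inline uint64_t demo_m2p_pae_patched(uint64_t machine)
{
	uint64_t phys = demo_mfn_to_pfn((uint32_t)(machine >> DEMO_PAGE_SHIFT));
	return (phys << DEMO_PAGE_SHIFT) |
	       (machine & ~DEMO_PAE_PHYSICAL_PAGE_MASK);
}
```

With an input that has NX set, the old shape returns a value without bit 63,
while the patched shape keeps it alongside the low flag bits.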

[-- Attachment #2: nxmask.patch --]
[-- Type: text/x-patch, Size: 3583 bytes --]

diff -r d19deb173503 linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/maddr.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/maddr.h	Fri Nov 10 15:27:22 2006 +0000
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/maddr.h	Fri Nov 10 21:37:45 2006 -0600
@@ -127,10 +127,17 @@ static inline maddr_t phys_to_machine(pa
 	machine = (machine << PAGE_SHIFT) | (phys & ~PAGE_MASK);
 	return machine;
 }
+
 static inline paddr_t machine_to_phys(maddr_t machine)
 {
+	/*
+	 * In PAE mode, the NX bit needs to be dealt with in the value
+	 * passed to mfn_to_pfn(). On x86_64, we need to mask it off,
+	 * but for i386 the conversion to ulong for the argument will
+	 * clip it off.
+	 */
 	paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
-	phys = (phys << PAGE_SHIFT) | (machine & ~PAGE_MASK);
+	phys = (phys << PAGE_SHIFT) | (machine & ~PHYSICAL_PAGE_MASK);
 	return phys;
 }
 
diff -r d19deb173503 linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h	Fri Nov 10 15:27:22 2006 +0000
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h	Fri Nov 10 21:37:45 2006 -0600
@@ -5,6 +5,16 @@
 #define PAGE_SHIFT	12
 #define PAGE_SIZE	(1UL << PAGE_SHIFT)
 #define PAGE_MASK	(~(PAGE_SIZE-1))
+
+#ifdef CONFIG_X86_PAE
+#define __PHYSICAL_MASK_SHIFT	36
+#define __PHYSICAL_MASK		((1ULL << __PHYSICAL_MASK_SHIFT) - 1)
+#else
+#define __PHYSICAL_MASK_SHIFT	32
+#define __PHYSICAL_MASK		(~0UL)
+#endif
+
+#define PHYSICAL_PAGE_MASK	(PAGE_MASK & __PHYSICAL_MASK)
 
 #define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
 #define LARGE_PAGE_SIZE (1UL << PMD_SHIFT)
diff -r d19deb173503 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/maddr.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/maddr.h	Fri Nov 10 15:27:22 2006 +0000
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/maddr.h	Fri Nov 10 21:36:35 2006 -0600
@@ -122,8 +122,8 @@ static inline maddr_t phys_to_machine(pa
 
 static inline paddr_t machine_to_phys(maddr_t machine)
 {
-	paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
-	phys = (phys << PAGE_SHIFT) | (machine & ~PAGE_MASK);
+	paddr_t phys = mfn_to_pfn((machine & PHYSICAL_PAGE_MASK) >> PAGE_SHIFT);
+	phys = (phys << PAGE_SHIFT) | (machine & ~PHYSICAL_PAGE_MASK);
 	return phys;
 }
 
diff -r d19deb173503 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/page.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/page.h	Fri Nov 10 15:27:22 2006 +0000
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/page.h	Fri Nov 10 21:36:35 2006 -0600
@@ -33,6 +33,13 @@
 #define PAGE_SIZE	(1UL << PAGE_SHIFT)
 #endif
 #define PAGE_MASK	(~(PAGE_SIZE-1))
+
+/* See Documentation/x86_64/mm.txt for a description of the memory map. */
+#define __PHYSICAL_MASK_SHIFT	46
+#define __PHYSICAL_MASK		((1UL << __PHYSICAL_MASK_SHIFT) - 1)
+#define __VIRTUAL_MASK_SHIFT	48
+#define __VIRTUAL_MASK		((1UL << __VIRTUAL_MASK_SHIFT) - 1)
+
 #define PHYSICAL_PAGE_MASK	(~(PAGE_SIZE-1) & __PHYSICAL_MASK)
 
 #define THREAD_ORDER 1 
@@ -162,12 +169,6 @@ static inline pgd_t __pgd(unsigned long 
 
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr)	(((addr)+PAGE_SIZE-1)&PAGE_MASK)
-
-/* See Documentation/x86_64/mm.txt for a description of the memory map. */
-#define __PHYSICAL_MASK_SHIFT	46
-#define __PHYSICAL_MASK		((1UL << __PHYSICAL_MASK_SHIFT) - 1)
-#define __VIRTUAL_MASK_SHIFT	48
-#define __VIRTUAL_MASK		((1UL << __VIRTUAL_MASK_SHIFT) - 1)
 
 #define KERNEL_TEXT_SIZE  (40UL*1024*1024)
 #define KERNEL_TEXT_START 0xffffffff80000000UL 
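
A point a reader may want to check in the i386 page.h hunk is the width of
PAGE_MASK. This is a reader's sketch, not something stated in the message:
it assumes the i386 "unsigned long" is 32 bits (modelled here with
uint32_t), so PAGE_MASK is a 32-bit value that is zero-extended when it is
ANDed with the 64-bit PAE __PHYSICAL_MASK. All DEMO_ names and both helper
functions are hypothetical.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT        12
/* Models (1UL << PAGE_SHIFT) with a 32-bit unsigned long. */
#define DEMO_I386_PAGE_SIZE    ((uint32_t)1 << DEMO_PAGE_SHIFT)
/* Models ~(PAGE_SIZE-1) evaluated in 32 bits: 0xFFFFF000. */
#define DEMO_I386_PAGE_MASK    ((uint32_t)~(DEMO_I386_PAGE_SIZE - 1))
/* Models the PAE __PHYSICAL_MASK: ((1ULL << 36) - 1). */
#define DEMO_PAE_PHYSICAL_MASK ((UINT64_C(1) << 36) - 1)

/* The AND as written: the 32-bit mask is zero-extended first. */
static inline uint64_t demo_mask_as_written(void)
{
	return DEMO_I386_PAGE_MASK & DEMO_PAE_PHYSICAL_MASK;
}

/* The same AND with the page mask widened to 64 bits beforehand. */
static inline uint64_t demo_mask_widened(void)
{
	return ~(uint64_t)(DEMO_I386_PAGE_SIZE - 1) & DEMO_PAE_PHYSICAL_MASK;
}
```

Under these assumptions the mask as written covers only address bits
12..31, so its complement would let address bits 32..35 through together
with the flag bits. Widening the page mask before the AND covers bits
12..35.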


* RE: x86-64 machine_to_phys vs NX bit
@ 2006-08-25 17:02 Nakajima, Jun
  0 siblings, 0 replies; 19+ messages in thread
From: Nakajima, Jun @ 2006-08-25 17:02 UTC (permalink / raw)
  To: Ian Pratt, Keir Fraser, Rik van Riel; +Cc: xen-devel

Ian Pratt wrote:
>>> No, it gets shifted right 12 and *then* converted to a long. So you
>>> get a 44-bit addressing capability (32+12). But NX bit is bit 63, so
>>> it gets truncated.
>> 
>> Yes. For (on-going) 32-bit PV guests running on the 64-bit Xen, I
>> guess we should fix the convenient optimization now?
> 
> Or we restrict 32b guests to the bottom 16 terabytes of memory. Please
> send me a machine for testing the patch :-)
> 
> Ian
> 

No, we don't :-) It cannot happen on the architecture implementations
that exist today. Maybe ASSERT() would be sufficient so that it can
remind us of the issues in years.
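
One way the ASSERT() idea could look is sketched below. This is only an
illustration under assumptions: the helper names are hypothetical, the
standard C assert() stands in for the hypervisor's ASSERT(), and the NX bit
is cleared explicitly before the check so that only a genuinely oversized
MFN trips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_NX_BIT     (UINT64_C(1) << 63)

/* True if the frame number survives conversion to a 32-bit long. */
static inline bool demo_mfn_fits_in_32_bits(uint64_t mfn)
{
	return mfn <= UINT32_MAX;
}

/* Hypothetical checked form of the argument passed to mfn_to_pfn(). */
static inline uint32_t demo_checked_mfn_argument(uint64_t machine)
{
	uint64_t mfn = (machine & ~DEMO_NX_BIT) >> DEMO_PAGE_SHIFT;

	/* Fires only for machine addresses at or above 2^44 bytes. */
	assert(demo_mfn_fits_in_32_bits(mfn));
	return (uint32_t)mfn;
}
```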

Jun
---
Intel Open Source Technology Center

* RE: x86-64 machine_to_phys vs NX bit
@ 2006-08-25 15:56 Ian Pratt
  0 siblings, 0 replies; 19+ messages in thread
From: Ian Pratt @ 2006-08-25 15:56 UTC (permalink / raw)
  To: Nakajima, Jun, Keir Fraser, Rik van Riel; +Cc: xen-devel

> > No, it gets shifted right 12 and *then* converted to a long. So you
> > get a 44-bit addressing capability (32+12). But NX bit is bit 63, so
> > it gets truncated.
> 
> Yes. For (on-going) 32-bit PV guests running on the 64-bit Xen, I guess
> we should fix the convenient optimization now?

Or we restrict 32b guests to the bottom 16 terabytes of memory. Please
send me a machine for testing the patch :-)

Ian

* RE: x86-64 machine_to_phys vs NX bit
@ 2006-08-25 15:37 Nakajima, Jun
  0 siblings, 0 replies; 19+ messages in thread
From: Nakajima, Jun @ 2006-08-25 15:37 UTC (permalink / raw)
  To: Keir Fraser, Rik van Riel; +Cc: xen-devel

Keir Fraser wrote:
> On 25/8/06 4:19 pm, "Rik van Riel" <riel@redhat.com> wrote:
> 
>>> A long is only 32 bits there, so when we pass the MFN portion the
>>> NX bit is conveniently truncated away!
>> 
>> Which means it'll do the wrong thing for machine addresses > 4GB
>> on PAE, or am I overlooking something?
> 
> No, it gets shifted right 12 and *then* converted to a long. So you
> get a 44-bit addressing capability (32+12). But NX bit is bit 63, so
> it gets truncated.

Yes. For (on-going) 32-bit PV guests running on the 64-bit Xen, I guess
we should fix the convenient optimization now? 
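
The truncation described in the quoted text can be modelled as below. It is
a minimal sketch, assuming the i386 "unsigned long" is 32 bits (modelled
with uint32_t); the function name is hypothetical.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_NX_BIT     (UINT64_C(1) << 63)

/*
 * Models what mfn_to_pfn() receives on i386 PAE: the 64-bit machine
 * address is shifted right by 12 first, and only then narrowed to a
 * 32-bit long. Bit 63 lands on bit 51 after the shift and is dropped
 * by the narrowing.
 */
static inline uint32_t demo_pae_mfn_argument(uint64_t machine)
{
	return (uint32_t)(machine >> DEMO_PAGE_SHIFT);
}
```

A machine address above 4GB keeps its frame number, because 32 bits of MFN
plus 12 bits of page offset give 44 bits of reach. An address at 2^44
(16 terabytes) is the first one whose frame number no longer fits.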

> 
>  -- Keir

Jun
---
Intel Open Source Technology Center

* RE: x86-64 machine_to_phys vs NX bit
@ 2006-08-25 14:46 Nakajima, Jun
  2006-08-25 15:10 ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Nakajima, Jun @ 2006-08-25 14:46 UTC (permalink / raw)
  To: Rik van Riel, Keir Fraser; +Cc: xen-devel

Rik van Riel wrote:
> Keir Fraser wrote:
>> On 24/8/06 8:25 pm, "Rik van Riel" <riel@redhat.com> wrote:
>> 
>>> Say, something like the following?
>>> 
>>> -    paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
>>> +    paddr_t phys = mfn_to_pfn((machine >> PAGE_SHIFT) & PHYSICAL_MASK);
>>> 
>>> I'm still thinking I may have missed something in the code
>>> somewhere, but I've been looking at this for over an hour now
>>> and can't seem to find it...
>>> 
>>> Any ideas?
>> 
>> Your suggested patch looks reasonable but it'd be good to find out
>> why this hasn't caused us problems. For example, perhaps
>> supported_pte_mask doesn't include PAGE_NX, so we're never setting
>> the NX bit on 64-bit PTEs? 
> 
> We do set the NX bit.  Including on vmalloced pages...
> 
>> That must be worth checking out, possibly also tracing
>> machine_to_phys to find out where that bit 63 goes -- I agree that it
>> looks like mfn_to_pfn() shouldn't work if bit63 is set in the
>> 'maddr' argument.
> 
> Absolutely :)

I agree, and I'm wondering why we don't have the same problem on i386?
To me it basically does the same thing.

static inline unsigned long long pte_val(pte_t x)
{
        unsigned long long ret;

        if (x.pte_low) {
                ret = x.pte_low | (unsigned long long)x.pte_high << 32;
                ret = machine_to_phys(ret) | 1;
        } else {
                ret = 0;
        }
        return ret;
}


Jun
---
Intel Open Source Technology Center

* x86-64 machine_to_phys vs NX bit
@ 2006-08-24 19:25 Rik van Riel
  2006-08-25  7:32 ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2006-08-24 19:25 UTC (permalink / raw)
  To: xen-devel

On x86-64 machine_to_phys looks like it should never succeed, yet I'm
guessing it must somehow be lucky...

The problem would be that the NX bit is bit 63 of the pte, meaning that
when pte_val is called we are working with a value 2^63 higher than we
should be.

#define pte_val(x)      (((x).pte & 1) ? machine_to_phys((x).pte) : \
                          (x).pte)

static inline paddr_t machine_to_phys(maddr_t machine)
{
         paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
         phys = (phys << PAGE_SHIFT) | (machine & ~PAGE_MASK);
         return phys;
}

Should we mask the 'machine' variable with PHYSICAL_MASK at
some point so we cut off the NX bit and other reserved bits?

Say, something like the following?

-    paddr_t phys = mfn_to_pfn(machine >> PAGE_SHIFT);
+    paddr_t phys = mfn_to_pfn((machine >> PAGE_SHIFT) & PHYSICAL_MASK);
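
To make the arithmetic concrete, the sketch below compares the index that
would reach mfn_to_pfn() on x86-64 with no mask, with a mask applied after
the shift (as in the line above), and with a page-aligned mask applied
before the shift. It is an illustration only: the DEMO_ names and function
names are hypothetical, and the 46-bit physical mask width is an assumption
taken from the x86-64 page.h definition of __PHYSICAL_MASK_SHIFT.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT         12
#define DEMO_PAGE_MASK          (~((UINT64_C(1) << DEMO_PAGE_SHIFT) - 1))
#define DEMO_PHYSICAL_MASK      ((UINT64_C(1) << 46) - 1)
#define DEMO_PHYSICAL_PAGE_MASK (DEMO_PAGE_MASK & DEMO_PHYSICAL_MASK)
#define DEMO_NX_BIT             (UINT64_C(1) << 63)

/* No masking: bit 63 of the PTE becomes bit 51 of the index. */
static inline uint64_t demo_index_unmasked(uint64_t machine)
{
	return machine >> DEMO_PAGE_SHIFT;
}

/* Mask after the shift: clears index bits 46 and up only. */
static inline uint64_t demo_index_mask_after_shift(uint64_t machine)
{
	return (machine >> DEMO_PAGE_SHIFT) & DEMO_PHYSICAL_MASK;
}

/* Mask before the shift: keeps PTE bits 12..45 only. */
static inline uint64_t demo_index_mask_before_shift(uint64_t machine)
{
	return (machine & DEMO_PHYSICAL_PAGE_MASK) >> DEMO_PAGE_SHIFT;
}
```

Both masked forms drop the NX bit. They differ for PTE bits 46..57, which
the mask-after-shift form lets through into the index.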

I'm still thinking I may have missed something in the code
somewhere, but I've been looking at this for over an hour now
and can't seem to find it...

Any ideas?

-- 
What is important?  What you want to be true, or what is true?


end of thread, other threads:[~2006-11-14  8:17 UTC | newest]

Thread overview: 19+ messages
2006-11-11  4:30 x86-64 machine_to_phys vs NX bit John Byrne
2006-11-11  9:56 ` John Byrne
2006-11-13  8:07   ` Jan Beulich
2006-11-13  8:16     ` Keir Fraser
2006-11-13 20:45       ` John Byrne
2006-11-14  1:15         ` John Byrne
2006-11-14  8:05           ` Jan Beulich
2006-11-14  8:17             ` Keir Fraser
  -- strict thread matches above, loose matches on Subject: below --
2006-08-25 17:02 Nakajima, Jun
2006-08-25 15:56 Ian Pratt
2006-08-25 15:37 Nakajima, Jun
2006-08-25 14:46 Nakajima, Jun
2006-08-25 15:10 ` Keir Fraser
2006-08-25 15:19   ` Rik van Riel
2006-08-25 15:27     ` Keir Fraser
2006-08-24 19:25 Rik van Riel
2006-08-25  7:32 ` Keir Fraser
2006-08-25 13:11   ` Jan Beulich
2006-08-25 13:54   ` Rik van Riel
